How to declare a variable but not assign it? - rust

I want to declare acceptable_set and assign it later. But I have to assign an empty vector to it first, so the compiler warns. How can I declare a variable without assigning it?
let mut acceptable_set: Vec<String> = Vec::new();
if opt.acceptable_set.is_none() {
acceptable_set = crate::builtin_words::ACCEPTABLE
.to_vec()
.iter_mut()
.map(|x| x.to_string())
.collect();
} else {
acceptable_set = get_acceptable_set(opt)
}
warning: value assigned to `acceptable_set` is never read
--> src/basic_function.rs:27:13
|
27 | let mut acceptable_set: Vec<String> = Vec::new();
| ^^^^^^^^^^^^^^
|
= note: `#[warn(unused_assignments)]` on by default
= help: maybe it is overwritten before being read?

Instead of declaring an uninitialised variable like this
let var;
if condition {
var = value1;
} else {
var = value2;
}
you could directly initialise the variable with the alternative.
let var = if condition { value1 } else { value2 };
Your variable does not even need the mut keyword (except if you want to mutate it afterwards).
And since in the example of your question you seem to test against an Option (.is_none()), you could use this form.
fn main() {
let value1 = 12;
let value2 = 34;
let opt = Some(987);
let var = if let Some(opt_value) = opt {
value1 + opt_value // use the data in the option
} else {
value2 // the option contains nothing, use a default value
};
println!("var {:?}", var);
}
See also map_or_else().
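Applied to the code in the question, the same idea might look like the sketch below. Opt and ACCEPTABLE are hypothetical stand-ins here, since the real definitions (and get_acceptable_set) are not shown in the question. Both functions build the vector in a single expression, so no placeholder Vec::new() is needed and the warning has nothing to complain about.

```rust
// Hypothetical stand-ins for the data and options type in the question.
const ACCEPTABLE: &[&str] = &["apple", "grape"];

struct Opt {
    acceptable_set: Option<Vec<String>>,
}

// `if let ... else` is an expression; its value can be returned directly
// or bound with `let set = if let ... else ...;`.
fn acceptable_set(opt: &Opt) -> Vec<String> {
    if let Some(words) = &opt.acceptable_set {
        // A user-supplied set takes precedence.
        words.clone()
    } else {
        // Otherwise fall back to the built-in list.
        ACCEPTABLE.iter().map(|w| w.to_string()).collect()
    }
}

// The same logic written with Option::map_or_else.
fn acceptable_set_alt(opt: &Opt) -> Vec<String> {
    opt.acceptable_set.as_ref().map_or_else(
        || ACCEPTABLE.iter().map(|w| w.to_string()).collect(),
        |words| words.clone(),
    )
}

fn main() {
    let opt = Opt { acceptable_set: None };
    // No `mut` and no dummy initial value.
    let set = acceptable_set(&opt);
    println!("{:?} {:?}", set, acceptable_set_alt(&opt));
}
```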

Related

Confusion about temporary values

I am writing a small function that reads /proc/bus/input/devices line by line and looks for a pattern in each block.
let mut handlers = Vec::<&str>::new();
let entry = RefCell::new(String::new());
let re = Regex::new(r"(?m)(event\d+)").unwrap();
for line in lines {
let l = line.unwrap();
if !l.is_empty() {
entry.borrow_mut().push_str(&l);
continue
}
if entry.borrow().contains("EV=120013") {
if let Some(captures) = re.captures(entry.borrow().as_str().clone()) {
if let Some(m) = captures.get(0) {
&handlers.push(m.as_str());
}
}
}
entry.borrow_mut().clear();
}
However, the build fails with the following error:
error[E0716]: temporary value dropped while borrowed
--> src/linux/keylogger.rs:50:53
|
50 | if let Some(captures) = re.captures(entry.borrow().as_str().clone()) {
| ^^^^^^^^^^^^^^ creates a temporary which is freed while still in use
51 | if let Some(m) = captures.get(0) {
52 | &handlers.push(m.as_str());
| ------------------------- borrow later used here
...
55 | }
| - temporary value is freed at the end of this statement
|
= note: consider using a `let` binding to create a longer lived value
I tried to create such a binding, but I couldn't get it to work ...
entry.borrow() returns a so-called guard (a Ref). You have to keep this guard alive for the entire time you access the content.
As you are only using it inline, it is a temporary that is dropped at the end of the statement, and the borrow ends with it.
You need to store it in a local variable that stays alive for as long as you need the borrowed content.
That's only the first problem, though. The second one is that handlers outlives the contents of entry, which is cleared at the end of every iteration, so the vector can't hold &str slices that point into it. A &str does not keep the content of the string alive; it is only a reference. To store the matches, you need the owning version, String.
use std::cell::RefCell;
use regex::Regex;
fn main() {
let lines = [Some("aaa"), Some("bbb")];
let mut handlers = Vec::<String>::new();
let entry = RefCell::new(String::new());
let re = Regex::new(r"(?m)(event\d+)").unwrap();
for line in lines {
let l = line.unwrap();
if !l.is_empty() {
entry.borrow_mut().push_str(&l);
continue;
}
if entry.borrow().contains("EV=120013") {
let entry_guard = entry.borrow();
if let Some(captures) = re.captures(&entry_guard) {
if let Some(m) = captures.get(0) {
handlers.push(m.as_str().to_string());
}
}
}
entry.borrow_mut().clear();
}
}
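The two points above (keep the guard in a local, store owned Strings) can also be shown without the regex crate. This is only a sketch using the standard library: first_event is a hypothetical helper, and a plain starts_with check stands in for the regular expression.

```rust
use std::cell::RefCell;

// Returns the first whitespace-separated word starting with "event",
// copied into an owned String so it can outlive the RefCell borrow.
fn first_event(entry: &RefCell<String>) -> Option<String> {
    // Bind the guard to a local so the borrow lasts as long as we read from it.
    let guard = entry.borrow();
    let found = guard
        .split_whitespace()
        .find(|word| word.starts_with("event"))
        .map(|word| word.to_string());
    found
} // the guard is dropped here, which ends the borrow

fn main() {
    let entry = RefCell::new(String::from("H: Handlers=sysrq kbd event3 leds"));
    let mut handlers: Vec<String> = Vec::new();
    if let Some(name) = first_event(&entry) {
        handlers.push(name);
    }
    // No guard is alive any more, so a mutable borrow succeeds.
    entry.borrow_mut().clear();
    println!("{:?}", handlers);
}
```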

How do I tackle lifetimes in Rust?

I am having issues with the concept of lifetimes in Rust. I am trying to use the crate bgpkit_parser to read in a bz2 file via a URL and then create a radix trie.
One field extracted from the file is the AS path, which I have named path in my code within the build_routetable function. I don't understand why Rust rejects let origin = clean_path.last(), which takes the last element of the vector.
fn as_parser(element: &BgpElem) -> Vec<u32> {
let x = &element.as_path.as_ref().unwrap().segments[0];
let mut as_vec = &Vec::new();
let mut as_path: Vec<u32> = Vec::new();
if let AsPathSegment::AsSequence(value) = x {
as_vec = value;
}
for i in as_vec {
as_path.push(i.asn);
}
return as_path;
}
fn prefix_parser(element: &BgpElem) -> String {
let subnet_id = element.prefix.prefix.ip().to_string().to_owned();
let prefix_id = element.prefix.prefix.prefix().to_string().to_owned();
let prefix = format!("{}/{}", subnet_id, prefix_id);//.as_str();
return prefix;
}
fn get_aspath(raw_aspath: Vec<u32>) -> Vec<u32> {
let mut as_path = Vec::new();
for i in raw_aspath {
if i < 64511 {
if as_path.contains(&i) {
continue;
}
else {
as_path.push(i);
}
}
else if 65535 < i && i < 4000000000 {
if as_path.contains(&i) {
continue;
}
else {
as_path.push(i);
}
}
}
return as_path;
}
fn build_routetable(mut trie4: Trie<String, Option<&u32>>, mut trie6: Trie<String, Option<&u32>>) {
let url: &str = "http://archive.routeviews.org/route-views.chile/\
bgpdata/2022.06/RIBS/rib.20220601.0000.bz2";
let parser = BgpkitParser::new(url).unwrap();
let mut count = 0;
for elem in parser {
if elem.elem_type == bgpkit_parser::ElemType::ANNOUNCE {
let record_timestamp = &elem.timestamp;
let record_type = "A";
let peer = &elem.peer_ip;
let prefix = prefix_parser(&elem);
let path = as_parser(&elem);
let clean_path = get_aspath(path);
// Issue is on the below line
// `clean_path` does not live long enough
// borrowed value does not live long
// enough rustc E0597
// main.rs(103, 9): `clean_path` dropped
// here while still borrowed
// main.rs(77, 91): let's call the
// lifetime of this reference `'1`
// main.rs(92, 17): argument requires
// that `clean_path` is borrowed for `'1`
let origin = clean_path.last(); //issue line
if prefix.contains(":") {
trie6.insert(prefix, origin);
}
else {
trie4.insert(prefix, origin);
}
count+=1;
if count >= 10000 {
println!("{:?} | {:?} | {:?} | {:?} | {:?}",
record_type, record_timestamp, peer, prefix, path);
count=0
}
};
}
println!("Trie4 size: {:?} prefixes", trie4.len());
println!("Trie6 size: {:?} prefixes", trie6.len());
}
Short answer: you're "inserting" a reference. But what's being referenced doesn't outlive what it's being inserted into.
Longer: The hint is your trie4 argument, the signature of which is this:
mut trie4: Trie<String, Option<&u32>>
So the trie lives beyond the loop body in which the values are declared. This is all in the loop:
let origin = clean_path.last(); //issue line
if prefix.contains(":") {
trie6.insert(prefix, origin);
}
While clean_path is a Vec<u32> and that's fine, origin is an Option<&u32> that borrows from it, and insert stores that reference in the trie as the value of a key/value pair. Here's the problem: the value has to live as long as the collection, but it points at the last element of clean_path, which is dropped at the end of each loop iteration. You can't put something into a container that will not live as long as the container. Rust has just saved you from a dangling reference, as it's supposed to.
Basically, your containers should be Trie<String, Option<u32>>, without the reference, and origin should be copied out of the vector with clean_path.last().copied(). Storing a u32 by value costs nothing extra: a u32 is no larger than a reference, and once alignment is taken into account an Option<u32> will likely take the same space as an Option<&u32> anyway.
Also of note: trie4 and trie6 will both be dropped at the end of this function call, because they were moved into the function (they are taken by value, not by reference or mutable reference). I hope that's what you want.
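A minimal sketch of storing the origin by value follows. A HashMap is used as a stand-in for the Trie type, which is an assumption made only to keep the example self-contained, and insert_origin is a hypothetical helper. It takes the table by mutable reference, so the caller still owns it afterwards.

```rust
use std::collections::HashMap;

// Stand-in for the Trie in the question: a map that owns its values.
fn insert_origin(
    table: &mut HashMap<String, Option<u32>>,
    prefix: String,
    clean_path: &[u32],
) {
    // `last()` yields Option<&u32>; `copied()` turns it into Option<u32>,
    // so nothing stored in the table borrows from `clean_path`.
    let origin = clean_path.last().copied();
    table.insert(prefix, origin);
}

fn main() {
    let mut table = HashMap::new();
    {
        let clean_path: Vec<u32> = vec![3356, 1299, 64496];
        insert_origin(&mut table, "192.0.2.0/24".to_string(), &clean_path);
    } // `clean_path` is dropped here; the table is unaffected
    println!("{:?}", table.get("192.0.2.0/24"));
}
```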

How to fix use of moved value in Rust?

I am trying to convert a yaml file to xml using Rust and I am not able to figure out how to fix this error regarding the use of moved value. I think I understand why this error is coming, but haven't got a clue about what to do next.
Here's the code:
struct Element {
element_name: String,
indentation_count: i16,
}
struct Attribute<'a> {
attribute_name: &'a str,
attribute_value: &'a str,
}
fn convert_yaml_to_xml(content: String, indentation_count: i16) -> String {
let mut xml_elements: Vec<Element> = vec![];
let mut attributes: Vec<Attribute> = vec![];
xml_elements.push(Element {element_name: "xmlRoot".to_string(), indentation_count: -1});
let mut target: Vec<u8> = Vec::new();
let mut xml_data_writer = EmitterConfig::new().perform_indent(true).create_writer(&mut target);
let mut attribute_written_flag = false;
let mut xml_event;
xml_event = XmlEvent::start_element("xmlRoot");
for line in content.lines() {
let current_line = line.trim();
let caps = indentation_count_regex.captures(current_line).unwrap();
let current_indentation_count = caps.get(1).unwrap().as_str().to_string().len() as i16;
if ELEMENT_REGEX.is_match(current_line) {
loop {
let current_attribute_option = attributes.pop();
match current_attribute_option {
Some(current_attribute_option) => {
xml_event.attr(current_attribute_option.attribute_name, current_attribute_option.attribute_value)
},
None => {
break;
},
};
}
xml_data_writer.write(xml_event);
// Checking if the line is an element
let caps = ELEMENT_REGEX.captures(current_line).unwrap();
let element_name = caps.get(2);
let xml_element_struct = Element {
indentation_count: current_indentation_count,
element_name: element_name.unwrap().as_str().to_string(),
};
xml_elements.push(xml_element_struct);
xml_event = XmlEvent::start_element(element_name.unwrap().as_str());
attribute_written_flag = false;
} else if ATTR_REGEX.is_match(current_line) {
// Checking if the line is an attribute
let caps = ATTR_REGEX.captures(current_line).unwrap();
let attr_name = caps.get(2);
let attr_value = caps.get(3);
// Saving attributes to a stack
attributes.push(Attribute{ attribute_name: attr_name.unwrap().as_str(), attribute_value: attr_value.unwrap().as_str() });
// xml_event.attr(attr_name.unwrap().as_str(), attr_value.unwrap().as_str());
}/* else if NEW_ATTR_SET_REGEX.is_match(current_line) {
let caps = NEW_ATTR_SET_REGEX.captures(current_line).unwrap();
let new_attr_set_name = caps.get(2);
let new_attr_set_value = caps.get(3);
current_xml_hash.insert("name".to_string(), new_attr_set_name.unwrap().as_str().to_string());
current_xml_hash.insert("value".to_string(), new_attr_set_value.unwrap().as_str().to_string());
} */
}
if attribute_written_flag {
xml_data_writer.write(xml_event);
}
for item in xml_elements.iter() {
let event = XmlEvent::end_element();
let event_name = item.element_name.to_string();
xml_data_writer.write(event.name(event_name.as_str()));
}
println!("OUTPUT");
println!("{:?}", target);
return "".to_string();
}
And here's the error:
error[E0382]: use of moved value: `xml_event`
--> src/main.rs:77:25
|
65 | let mut xml_event;
| ------------- move occurs because `xml_event` has type `StartElementBuilder<'_>`, which does not implement the `Copy` trait
...
77 | xml_event.attr(current_attribute_option.attribute_name, current_attribute_option.attribute_value)
| ^^^^^^^^^ --------------------------------------------------------------------------------------- `xml_event` moved due to this method call, in previous iteration of loop
|
note: this function takes ownership of the receiver `self`, which moves `xml_event`
--> /Users/defiant/.cargo/registry/src/github.com-1ecc6299db9ec823/xml-rs-0.8.4/src/writer/events.rs:193:24
|
193 | pub fn attr<N>(mut self, name: N, value: &'a str) -> StartElementBuilder<'a>
| ^^^^
From XmlEvent::start_element() documentation we see that it produces a StartElementBuilder<'a>.
From StartElementBuilder<'a>::attr() documentation we see that it consumes the StartElementBuilder<'a> (the first parameter is self, not &mut self) and produces a new StartElementBuilder<'a> (which is probably similar to self but considers the expected effect of .attr()).
This approach is known as the consuming builder pattern, which is used in Rust's standard library (for example std::thread::Builder).
Typical usage consists of chaining the calls, something.a().b().c().d(): something is consumed by a(), its result is consumed by b(), the same goes for c(), and finally d() does something useful with the last result.
The alternative would be to use mutable borrows to modify something in place, but mutable borrows can be awkward to work with in some situations.
In your case, you can just reassign the result of .attr() to xml_event (xml_event = xml_event.attr(...);). Otherwise .attr() would have no effect, because its result is discarded, and xml_event would become unusable because it has been consumed; reassigning makes it usable again afterwards (at least I guess so, I didn't try).
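To illustrate the reassignment without depending on xml-rs, here is a small sketch with a hypothetical TagBuilder whose attr method has the same shape as the one in the error message (it takes self and returns a new builder). It is not the xml-rs API, only a model of the pattern.

```rust
// Hypothetical consuming builder, modelled on the shape of `attr(mut self, ...)`.
struct TagBuilder {
    name: String,
    attrs: Vec<(String, String)>,
}

impl TagBuilder {
    fn new(name: &str) -> Self {
        TagBuilder { name: name.to_string(), attrs: Vec::new() }
    }

    // Consumes the builder and hands back an updated one.
    fn attr(mut self, key: &str, value: &str) -> Self {
        self.attrs.push((key.to_string(), value.to_string()));
        self
    }

    // Consumes the builder and produces the opening tag.
    fn build(self) -> String {
        let mut out = format!("<{}", self.name);
        for (key, value) in &self.attrs {
            out.push_str(&format!(" {}=\"{}\"", key, value));
        }
        out.push('>');
        out
    }
}

fn render(name: &str, attributes: &[(&str, &str)]) -> String {
    let mut builder = TagBuilder::new(name);
    for (key, value) in attributes {
        // The old value is moved into `attr`; the result replaces it,
        // so `builder` is valid again on the next iteration.
        builder = builder.attr(key, value);
    }
    builder.build()
}

fn main() {
    println!("{}", render("item", &[("id", "1"), ("lang", "en")]));
}
```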

Issues with Rust lifetimes and ownership

I am trying to create a HashMap by reading a file. Below is the code that I have written. The twist is that I need to persist subset_description until the next iteration so that I can store it in the HashMap and then finally return the HashMap.
fn myfunction(filename: &Path) -> io::Result<HashMap<&str, &str>> {
let mut SIF = HashMap::new();
let file = File::open(filename).unwrap();
let mut subset_description = "";
for line in BufReader::new(file).lines() {
let thisline = line?;
let line_split: Vec<&str> = thisline.split("=").collect();
subset_description = if thisline.starts_with("a") {
let subset_description = line_split[1].trim();
subset_description
} else {
""
};
let subset_ids = if thisline.starts_with("b") {
let subset_ids = line_split[1].split(",");
let subset_ids = subset_ids.map(|s| s.trim());
subset_ids.collect()
} else {
Vec::new()
};
for k in subset_ids {
SIF.insert(k, subset_description);
println!("");
}
if thisline.starts_with("!dataset_table_begin") {
break;
}
}
Ok(SIF)
}
I am getting the error below and am not able to resolve it:
error[E0515]: cannot return value referencing local variable `thisline`
--> src/main.rs:73:5
|
51 | let line_split: Vec<&str> = thisline.split("=").collect();
| -------- `thisline` is borrowed here
...
73 | Ok(SIF)
| ^^^^^^^ returns a value referencing data owned by the current function
The problem lies in the guarantees Rust makes on your behalf. You are reading a file, processing its content into a HashMap, and trying to return references to the data you read. But each line is stored in thisline, a String that is dropped at the end of its loop iteration, so any &str sliced from it cannot outlive that iteration, let alone the function.
In Rust terms, you keep trying to return references to local variables, which are dropped before the function returns and would effectively leave you with dangling pointers. Here are the changes I made; they may not be the most efficient, but they do compile.
fn myfunction(filename: &Path) -> io::Result<HashMap<String, String>> {
let mut SIF = HashMap::new();
let file = File::open(filename).unwrap();
let mut subset_description = "";
for line in BufReader::new(file).lines() {
let thisline = line?;
let line_split: Vec<String> = thisline.split("=").map(|s| s.to_string()).collect();
subset_description = if thisline.starts_with("a") {
let subset_description = line_split[1].trim();
subset_description
} else {
""
};
let subset_ids = if thisline.starts_with("b") {
let subset_ids = line_split[1].split(",");
let subset_ids = subset_ids.map(|s| s.trim());
subset_ids.map(|s| s.to_string()).collect()
} else {
Vec::new()
};
for k in subset_ids {
SIF.insert(k, subset_description.to_string());
println!("");
}
if thisline.starts_with("!dataset_table_begin") {
break;
}
}
Ok(SIF)
}
As you can see, the return value now owns its strings. This is achieved by changing the return type and calling to_string() to create owned copies that are moved into the HashMap.
There is an argument that to_string() is slow, so you can explore the use of into() or to_owned(), but as I am not proficient with those constructs I cannot assist you with optimization.
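The question also asks for the description to persist until the next iteration. A sketch of one way to do that is to keep it in an owned String declared outside the loop. The line format ("a=<description>" and "b=<id>,<id>") and the parse helper are assumptions made for illustration; the iterator of owned Strings stands in for BufReader::lines().

```rust
use std::collections::HashMap;

// Parses "a=<description>" and "b=<id>,<id>" lines into an owned map.
fn parse(lines: impl IntoIterator<Item = String>) -> HashMap<String, String> {
    let mut map = HashMap::new();
    // Owned, so it survives from one iteration to the next even though
    // each `line` is dropped at the end of its iteration.
    let mut description = String::new();
    for line in lines {
        if let Some(rest) = line.strip_prefix("a=") {
            description = rest.trim().to_owned();
        } else if let Some(rest) = line.strip_prefix("b=") {
            for id in rest.split(',') {
                // Each entry gets its own copy of the current description.
                map.insert(id.trim().to_owned(), description.clone());
            }
        }
    }
    map
}

fn main() {
    let lines = vec![
        "a= first subset".to_string(),
        "b=GSM1, GSM2".to_string(),
    ];
    println!("{:?}", parse(lines));
}
```

For a &str, to_owned() and to_string() both produce a new String, so either can be used here.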

re.captures error: borrowed value does not live long enough

Trying to complete the "Hash Maps" chapter of the Rust book at https://doc.rust-lang.org/book/2018-edition/ch08-03-hash-maps.html , with this code:
extern crate regex;
use std::collections::HashMap;
use std::io;
use regex::Regex;
fn get_command() -> String {
let mut input_cmd = String::new();
io::stdin().read_line(&mut input_cmd)
.expect("Failed to read command");
let input_cmd = input_cmd.trim();
input_cmd.to_string()
}
fn main() {
println!("Add someone by typing e.g. \"Add Sally to Engineering\", list everyone in a department by typing e.g. \"List everyone in Sales\", or list everyone by typing \"List everyone\". To quit, type \"Quit\".");
let mut employees_by_dept: HashMap<&str, Vec<&str>> = HashMap::new();
let add_to_dept_re = Regex::new("^Add ([A-Za-z]+) to ([A-Za-z]+)$").unwrap();
let list_in_dept_re = Regex::new("^List everyone in ([A-Za-z]+)$").unwrap();
let list_all_re = Regex::new("^List everyone$").unwrap();
loop {
let input_cmd = get_command();
let caps = add_to_dept_re.captures(&input_cmd).unwrap();
if add_to_dept_re.is_match(&input_cmd) {
let dept_name = caps.get(2).unwrap().as_str();
let employee_name = caps.get(1).unwrap().as_str();
println!("Adding person");
employees_by_dept.entry(&dept_name)
.or_insert_with(Vec::new)
.push(employee_name);
} else if list_in_dept_re.is_match(&input_cmd) {
println!("Listing people");
} else if list_all_re.is_match(&input_cmd) {
println!("Listing everyone");
} else if input_cmd == "Quit" {
break;
} else {
println!("Invalid command");
break;
}
}
println!("Bye!");
}
But I get this:
error[E0597]: `input_cmd` does not live long enough
--> src/main.rs:28:45
|
28 | let caps = add_to_dept_re.captures(&input_cmd).unwrap();
| ^^^^^^^^^ borrowed value does not live long enough
...
48 | }
| - `input_cmd` dropped here while still borrowed
...
51 | }
| - borrowed value needs to live until here
I have tried .captures(&input_cmd.clone()) and various other things, but nothing helps. Any ideas?
Rust's memory safety rules prevent this approach: your HashMap outlives the items inserted into it.
See the embedded comments below, and especially the Ownership chapter of the book.
fn main() {
let mut employees_by_dept: HashMap<&str, Vec<&str>> = HashMap::new();
let add_to_dept_re = Regex::new("^Add ([A-Za-z]+) to ([A-Za-z]+)$").unwrap();
let list_in_dept_re = Regex::new("^List everyone in ([A-Za-z]+)$").unwrap();
let list_all_re = Regex::new("^List everyone$").unwrap();
loop {
let input_cmd = get_command();
let caps = add_to_dept_re.captures(&input_cmd).unwrap();// <--- input_cmd
//is borrowed here
// ... code for getting dept_name and employee_name references
// and inserting into HashMap omitted
} // <----- The String input_cmd is dropped here (its memory is freed),
// so the dept_name and employee_name references
// would point to deallocated memory
// ... At this point you would have a live employees_by_dept HashMap
// that contains references to deallocated memory
println!("Bye!");
}
Instead, make the HashMap take ownership of its keys and values:
fn main() {
println!("Add someone by typing e.g. \"Add Sally to Engineering\", list everyone in a department by typing e.g. \"List everyone in Sales\", or list everyone by typing \"List everyone\". To quit, type \"Quit\".");
let mut employees_by_dept: HashMap<String, Vec<String>> = HashMap::new();
let add_to_dept_re = Regex::new("^Add ([A-Za-z]+) to ([A-Za-z]+)$").unwrap();
let list_in_dept_re = Regex::new("^List everyone in ([A-Za-z]+)$").unwrap();
let list_all_re = Regex::new("^List everyone$").unwrap();
loop {
let input_cmd = get_command();
// Only extract the captures when the command matches;
// calling unwrap() up front would panic on any other command.
if let Some(caps) = add_to_dept_re.captures(&input_cmd) {
let dept_name = caps.get(2).unwrap().as_str();
let employee_name = caps.get(1).unwrap().as_str();
println!("Adding person");
employees_by_dept
.entry(dept_name.to_string())
.or_insert_with(Vec::new)
.push(employee_name.to_string());
} else if list_in_dept_re.is_match(&input_cmd) {
println!("Listing people");
} else if list_all_re.is_match(&input_cmd) {
println!("Listing everyone");
} else if input_cmd == "Quit" {
break;
} else {
println!("Invalid command");
break;
}
}
println!("Bye!");
}
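The ownership point can be checked without the regex crate. In the sketch below, parse_add is a hypothetical helper that uses only std string methods in place of the regular expression; it returns owned Strings, so the map stays valid after each input_cmd is dropped.

```rust
use std::collections::HashMap;

// Parses "Add <name> to <dept>" and returns owned copies of both parts.
fn parse_add(cmd: &str) -> Option<(String, String)> {
    let rest = cmd.strip_prefix("Add ")?;
    let (name, dept) = rest.split_once(" to ")?;
    Some((name.to_string(), dept.to_string()))
}

fn main() {
    let mut employees_by_dept: HashMap<String, Vec<String>> = HashMap::new();
    for raw in ["Add Sally to Engineering", "Add Amir to Sales", "List everyone"] {
        // Each command lives in a String that is dropped at the end of the iteration.
        let input_cmd = raw.to_string();
        if let Some((name, dept)) = parse_add(&input_cmd) {
            employees_by_dept.entry(dept).or_default().push(name);
        }
    }
    // The map owns its data, so it is still usable here.
    println!("{:?}", employees_by_dept.get("Sales"));
}
```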
