How to know when a borrow ends - rust

I wrote this for simple input parsing:
use std::io;
fn main() {
    let mut line = String::new();
    io::stdin().read_line(&mut line)
        .expect("Cannot read line.");
    let parts = line.split_whitespace();
    for p in parts {
        println!("{}", p);
    }
    line.clear();
    io::stdin().read_line(&mut line)
        .expect("Cannot read line.");
}
The above code creates a String object, reads a line into it, splits it by whitespace and prints the output. Then it tries to do the same again using the same String object. On compilation I get this error:
--> src/main.rs:15:5
|
9 | let parts = line.split_whitespace();
| ---- immutable borrow occurs here
...
15 | line.clear();
| ^^^^ mutable borrow occurs here
...
19 | }
| - immutable borrow ends here
Apparently the String is still borrowed by the iterator. The suggested solution is:
let parts: Vec<String> = line.split_whitespace()
    .map(|s| String::from(s))
    .collect();
I have a few questions here:
I have already consumed the iterator by running a for loop over it. Its borrow should have ended.
How do I know the lifetime of a borrow from a function definition?
If a function borrows an object, how do I know when it releases it? For example, in the solution, using collect() releases the borrow.
I think I am missing an important concept here.

The problem in your code is that you bind the result of line.split_whitespace() to a name (parts). If you write this instead:
io::stdin().read_line(&mut line)
    .expect("Cannot read line.");
for p in line.split_whitespace() { // <-- pass directly into loop
    println!("{}", p);
}
line.clear();
io::stdin().read_line(&mut line)
    .expect("Cannot read line.");
That way it just works. Another possibility is to artificially restrict the scope of parts, like so:
io::stdin().read_line(&mut line)
    .expect("Cannot read line.");
{
    let parts = line.split_whitespace();
    for p in parts {
        println!("{}", p);
    }
}
line.clear();
io::stdin().read_line(&mut line)
    .expect("Cannot read line.");
This also works.
So why is that? This is due to how the compiler worked when this was written, often called "lexical borrows": each non-temporary value that contains a borrow is considered "alive" until the end of its scope. (Compilers with non-lexical lifetimes, the default since the 2018 edition, end the borrow at the last use and accept the original code.)
In your case: since you assign the result of split_whitespace() (which borrows the string) to parts, the borrow is "alive" until the end of the scope of parts, not merely until the last use of parts.
In the first version in this answer, we don't bind a name to the value, so the result of split_whitespace() is only a temporary and the borrow doesn't extend over the whole scope. That's also why your collect() example works: not because of collect(), but because there is never a name bound to something borrowing the string. In my second version, we just restrict the scope.
Note that this is a known shortcoming of lexical borrows. You are right: the borrow is no longer needed, the compiler just doesn't see it.
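The scoping fix can also be wrapped in a small function, which makes the extent of the borrow easy to see. This is only a sketch: the helper name words_then_reuse and its next_input parameter are made up for illustration (they stand in for the second read_line call), and the words are returned rather than printed so the result can be checked.

```rust
/// Hypothetical helper (not from the question): collects the words of `line`,
/// then clears the same buffer and refills it with `next_input`.
fn words_then_reuse(line: &mut String, next_input: &str) -> Vec<String> {
    let mut words = Vec::new();
    {
        // `parts` borrows `line` immutably; the borrow is confined to this block.
        let parts = line.split_whitespace();
        for p in parts {
            // Copy each word out so nothing borrowed survives the block.
            words.push(p.to_string());
        }
    }
    // No borrow is outstanding here, so mutating `line` is accepted.
    line.clear();
    line.push_str(next_input);
    words
}
```

Because the returned Vec holds owned Strings, it stays valid after the buffer has been cleared and reused.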

Related

How do I read a String from a File, split it, and create a Vec<&str> in one statement?

I need help converting file input, read as a string, into a vector.
I tried
let content = fs::read_to_string(file_path).expect("Failed to read input");
let content: Vec<&str> = content.split("\n").collect();
This works, but I wanted to convert it to one statement. Something like
let content: Vec<&str> = fs::read_to_string(file_path)
    .expect("Failed to read input")
    .split("\n")
    .collect();
I tried using
let content: Vec<&str> = match fs::read_to_string(file_path) {
    Ok(value) => value.split("\n").collect(),
    Err(err) => {
        println!("Error Unable to read the file {}", err);
        return ();
    }
};
and
let content: Vec<&str> = match fs::read_to_string(file_path) {
    Ok(value) => value,
    Err(err) => {
        println!("Error Unable to read the file {}", err);
        return ();
    }
}
.split("\n")
.collect();
The compiler says that the borrowed value does not live long enough (1st) and that the value is freed while in use (2nd), which is a problem with borrowing, scope and ownership.
error[E0716]: temporary value dropped while borrowed
--> src/lib.rs:4:26
|
4 | let content: Vec<&str> = fs::read_to_string("")
| __________________________^
5 | | .expect("Failed to read input")
| |___________________________________^ creates a temporary which is freed while still in use
6 | .split("\n")
7 | .collect();
| - temporary value is freed at the end of this statement
8 |
9 | dbg!(content);
| ------- borrow later used here
|
= note: consider using a `let` binding to create a longer lived value
I still lack much understanding about how to fix them.
It is impossible to do this in one expression. Use two expressions with a let, as you already are and as the compiler tells you.
The problem is that split produces string slices (&str) that reference the temporary String. That String is deallocated at the end of the statement, making the references invalid. Rust is preventing you from introducing memory unsafety:
fs::read_to_string(file_path)       // Creates a String
    .expect("Failed to read input")
    .split("\n")                    // Takes references into the String
    .collect();                     // String is dropped, invalidating references
If you didn't need a Vec<&str>, you could have a Vec<String>:
fs::read_to_string(file_path)
    .expect("Failed to read input")
    .split("\n")
    .map(|s| s.to_string()) // Convert &str to String
    .collect();
See also:
Temporary value dropped while borrowed, but I don't want to do a let
Using a `let` binding to increase a values lifetime
"borrowed value does not live long enough" seems to blame the wrong thing
Why does the compiler tell me to consider using a `let` binding" when I already am?
Do I need to use a `let` binding to create a longer lived value?
Why is it legal to borrow a temporary?
You create a String with read_to_string, but it is not bound to any variable because you call split on it right away. That String is the temporary value mentioned in the error. The split function returns references to the contents of the string, but the String is deallocated at the end of the statement because, again, it is not bound to a variable.
If you really need to do it in one line, but a bit less efficiently:
let content: Vec<String> = fs::read_to_string(file_path)
    .expect("Failed to read input")
    .split("\n")
    .map(|line| line.to_string())
    .collect();
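The difference between the two approaches shows up in the function signatures. The following is a minimal sketch, with split_lines and owned_lines as hypothetical helper names that do not appear in the question: the borrowing version ties the returned slices to its argument, while the owning version can consume a temporary.

```rust
/// Borrowing version: the returned slices point into `content`,
/// so the caller must keep the owning String alive in a binding.
fn split_lines(content: &str) -> Vec<&str> {
    content.split('\n').collect()
}

/// Owning version: each line is copied into its own String,
/// so the source may be a temporary that is dropped afterwards.
fn owned_lines(content: String) -> Vec<String> {
    content.split('\n').map(|s| s.to_string()).collect()
}
```

With split_lines, the caller still needs a separate let for the String returned by read_to_string and then passes a reference to it; with owned_lines, the result of read_to_string can be passed in directly at the cost of one allocation per line.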

How to tell the borrow checker that a cleared Vec contains no borrows? [duplicate]

This question already has answers here:
Borrow checker doesn't realize that `clear` drops reference to local variable
I'm processing a massive TSV (tab separated values) file and want to do this as efficiently as possible. To that end, I thought I'd prevent allocation of a new Vec for every line by pre-allocating it before the loop:
let mut line = String::new();
let mut fields = Vec::with_capacity(headers.len());
while reader.read_line(&mut line)? > 0 {
    fields.extend(line.split('\t'));
    // do something with fields
    fields.clear();
}
Naturally, the borrow checker isn't amused, because we're overwriting line while fields may still have references into it:
error[E0502]: cannot borrow `line` as mutable because it is also borrowed as immutable
--> src/main.rs:66:28
|
66 | while reader.read_line(&mut line)? > 0 {
| ^^^^^^^^^ mutable borrow occurs here
67 | fields.extend(line.split('\t'));
| ------ ---- immutable borrow occurs here
| |
| immutable borrow later used here
(Playground)
This isn't actually a problem because fields.clear(); removes all references, so at the start of the loop when read_line(&mut line) is called, fields does not actually borrow anything from line.
But how do I inform the borrow checker of this?
Your problem looks similar to the one described in this post.
In addition to the answers there (lifetime transmutations, refcells), depending on the Complex Operation you commented out, you might not need to store references to line at all. Consider, for example, the following modification of your playground code:
use std::io::BufRead;
fn main() -> Result<(), std::io::Error> {
    let headers = vec![1, 2, 3, 4];
    let mut reader = std::io::BufReader::new(std::fs::File::open("foo.txt")?);
    let mut fields = Vec::with_capacity(headers.len());
    loop {
        let mut line = String::new();
        if reader.read_line(&mut line)? == 0 {
            break;
        }
        // Store byte offsets instead of &str slices, so fields never borrows line.
        fields.push(0);
        fields.extend(line.match_indices('\t').map(|x| x.0 + 1));
        // do something with fields
        // each element of fields starts a field; you can use the next
        // element of fields to find the end of the field.
        // (make sure to account for the \t, and the last field having no
        // 'next' element in fields.)
        fields.clear();
    }
    Ok(())
}
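The comments in the code above describe how to recover a field from the stored offsets. A possible sketch of that lookup follows; field_starts and field are hypothetical helper names, and the sketch assumes the caller has already stripped the trailing newline that read_line leaves in the buffer.

```rust
/// Refill `starts` with the byte offset at which each tab-separated field begins.
/// The Vec holds plain integers, so it never borrows from `line`.
fn field_starts(line: &str, starts: &mut Vec<usize>) {
    starts.clear();
    starts.push(0);
    starts.extend(line.match_indices('\t').map(|(i, _)| i + 1));
}

/// Return field `i` of `line`, using the offsets computed by `field_starts`.
/// Panics if `i` is out of range.
fn field<'a>(line: &'a str, starts: &[usize], i: usize) -> &'a str {
    let begin = starts[i];
    let end = match starts.get(i + 1) {
        // The next field starts one byte after the tab, so step back over it.
        Some(&next) => next - 1,
        // The last field runs to the end of the line.
        None => line.len(),
    };
    &line[begin..end]
}
```

Since starts is a Vec<usize>, the same allocation can be reused for every line without the borrow checker objecting to the next read_line call.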

How to push a value to a Vec and append it to a String at the same time?

I want to write a program that sets the shell for the system's nslookup command line program:
fn main() {
    let mut v: Vec<String> = Vec::new();
    let mut newstr = String::from("nslookup");
    for arg in std::env::args() {
        v.push(arg);
        newstr.push_str(&format!(" {}", arg));
    }
    println!("{:?}", v);
    println!("{}", newstr);
}
error[E0382]: borrow of moved value: `arg`
--> src/main.rs:6:41
|
5 | v.push(arg);
| --- value moved here
6 | newstr.push_str(&format!(" {}", arg));
| ^^^ value borrowed here after move
|
= note: move occurs because `arg` has type `std::string::String`, which does not implement the `Copy` trait
How to correct the code without traversing env::args() again?
Reverse the order of the lines that use arg:
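for arg in std::env::args() {
    // newstr.push_str(&format!(" {}", arg));
    // write! on a String needs `use std::fmt::Write;` at the top of the file.
    write!(&mut newstr, " {}", arg).unwrap(); // writing to a String does not fail
    v.push(arg); // moves arg, so this must come last
}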
Vec::push takes its argument by value, which moves ownership of arg so it can't be used anymore after v.push(arg). format! and related macros implicitly borrow their arguments, so you can use arg again after using it in one of those.
If you really needed to move the same String to two different locations, you would need to add .clone(), which copies the string. But that's not necessary in this case.
Also note that format! creates a new String, which is wasteful when all you want is to add on to the end of an existing String. If you add use std::fmt::Write; to the top of your file, you can use write! instead (as shown above), which is more concise and may be more performant.
See also
What are move semantics in Rust?
error: use of moved value - should I use "&" or "mut" or something else?
Does println! borrow or own the variable?
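Put together, the reordering and the write! suggestion might look like the sketch below. The function name build_command is an assumption made for illustration, and the arguments are taken as a parameter instead of being read from std::env::args() so the behaviour can be checked in isolation.

```rust
use std::fmt::Write;

/// Hypothetical helper: stores every argument in a Vec and also appends it
/// to a command string, using each argument exactly once by value.
fn build_command(args: impl IntoIterator<Item = String>) -> (Vec<String>, String) {
    let mut v = Vec::new();
    let mut newstr = String::from("nslookup");
    for arg in args {
        // write! only borrows `arg`, so it has to run before the move below.
        write!(newstr, " {}", arg).expect("writing to a String does not fail");
        // push takes `arg` by value; it cannot be used after this line.
        v.push(arg);
    }
    (v, newstr)
}
```

Calling build_command(std::env::args()) would reproduce the behaviour of the original program, including the program name that args() yields as its first item.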
You can do it like this:
fn main() {
    let args: Vec<_> = std::env::args().collect();
    let s = args.join(" ");
    println!("{}", s);
}
First, you create the vector, and then you create your string.

Confusion about Rust HashMap and String borrowing

This program accepts an integer N, followed by N lines containing two strings separated by a space. I want to put those lines into a HashMap using the first string as the key and the second string as the value:
use std::collections::HashMap;
use std::io;
fn main() {
    let mut input = String::new();
    io::stdin().read_line(&mut input)
        .expect("unable to read line");
    let desc_num: u32 = match input.trim().parse() {
        Ok(num) => num,
        Err(_) => panic!("unable to parse")
    };
    let mut map = HashMap::<&str, &str>::new();
    for _ in 0..desc_num {
        input.clear();
        io::stdin().read_line(&mut input)
            .expect("unable to read line");
        let data = input.split_whitespace().collect::<Vec<&str>>();
        println!("{:?}", data);
        // map.insert(data[0], data[1]);
    }
}
The program works as intended:
3
a 1
["a", "1"]
b 2
["b", "2"]
c 3
["c", "3"]
When I try to put those parsed strings into a HashMap and uncomment map.insert(data[0], data[1]);, the compilation fails with this error:
error: cannot borrow `input` as mutable because it is also borrowed as immutable [E0502]
input.clear();
^~~~~
note: previous borrow of `input` occurs here; the immutable borrow prevents subsequent moves or mutable borrows of `input` until the borrow ends
let data = input.split_whitespace().collect::<Vec<&str>>();
^~~~~
note: previous borrow ends here
fn main() {
...
}
^
I don't understand why this error would come up, since I think the map.insert() expression doesn't borrow the string input at all.
split_whitespace() doesn't give you two new Strings containing (copies of) the non-whitespace parts of the input. Instead you get two references into the memory managed by input, of type &str. So when you then try to clear input and read the next line of input into it, you try overwriting memory that's still being used by the hash map.
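Why does split_whitespace (and many other string methods, I should add) complicate matters by returning &str? Because a borrowed slice is often enough, and in those cases it avoids unnecessary copies. In this specific case, however, it's probably best to explicitly copy the relevant parts of the string into owned Strings, with the map declared as HashMap<String, String>. Note that calling clone() on a &str only copies the reference, so use to_string() instead:
map.insert(data[0].to_string(), data[1].to_string());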
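To see why owned keys and values remove the conflict, here is a minimal sketch that reuses one buffer across iterations, as the question does. The function name collect_pairs is made up, and the lines come from a slice rather than stdin so the example can be checked without input.

```rust
use std::collections::HashMap;

/// Hypothetical helper: parses "key value" lines into a map of owned Strings
/// while reusing a single input buffer, mirroring the loop in the question.
fn collect_pairs(lines: &[&str]) -> HashMap<String, String> {
    let mut map = HashMap::new();
    let mut input = String::new();
    for raw in lines {
        // Mutating the buffer is fine: the map holds no references into it.
        input.clear();
        input.push_str(raw);
        let mut parts = input.split_whitespace();
        if let (Some(key), Some(value)) = (parts.next(), parts.next()) {
            // to_string() copies the bytes out of `input` into new Strings.
            map.insert(key.to_string(), value.to_string());
        }
    }
    map
}
```

Lines with fewer than two words are skipped here instead of panicking on an out-of-range index; that is a choice made for the sketch, not something the original program does.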

Using a `let` binding to increase a values lifetime

I wrote the following code to read an array of integers from stdin:
use std::io::{self, BufRead};
fn main() {
    let stdin = io::stdin();
    for line in stdin.lock().lines() {
        let xs: Vec<i32> = line.unwrap()
            .trim()
            .split(' ')
            .map(|s| s.parse().unwrap())
            .collect();
        println!("{:?}", xs);
    }
}
This worked fine, however, I felt the let xs line was a bit long, so I split it into two:
use std::io::{self, BufRead};
fn main() {
    let stdin = io::stdin();
    for line in stdin.lock().lines() {
        let ss = line.unwrap().trim().split(' ');
        let xs: Vec<i32> = ss.map(|s| s.parse().unwrap()).collect();
        println!("{:?}", xs);
    }
}
This didn't work! Rust replied with the following error:
error[E0597]: borrowed value does not live long enough
--> src/main.rs:6:18
|
6 | let ss = line.unwrap().trim().split(' ');
| ^^^^^^^^^^^^^ - temporary value dropped here while still borrowed
| |
| temporary value does not live long enough
...
10 | }
| - temporary value needs to live until here
|
= note: consider using a `let` binding to increase its lifetime
This confuses me. Is it line or ss that doesn't live long enough? And how can I use a let binding to increase their lifetime? I thought I was already using a let?
I've read through the lifetime guide, but I still can't quite figure it out. Can anyone give me a hint?
In your second version, the type of ss is Split<'a, char>. The lifetime parameter in the type tells us that the object contains a reference. In order for the assignment to be valid, the reference must point to an object that still exists after that statement. However, unwrap() consumes line; in other words, it moves the Ok variant's data out of the Result object. Therefore, the reference doesn't point inside the original line, but rather to a temporary object.
In your first version, you are done with the temporary by the end of the long expression, through the calls to map and collect. To fix your second version, you need to bind the result of unwrap() to keep the value living long enough:
use std::io::{self, BufRead};
fn main() {
    let stdin = io::stdin();
    for line in stdin.lock().lines() {
        let line = line.unwrap();
        let ss = line.trim().split(' ');
        let xs: Vec<i32> = ss.map(|s| s.parse().unwrap()).collect();
        println!("{:?}", xs);
    }
}
The problem is the unwrap() call: it returns the contained String as a temporary, and the references produced by trim() and split() point into that temporary. They would have to outlive it, but the temporary is dropped at the end of the statement because there is no local binding for it.
If you want to get cleaner code, a very common way to write it is:
use std::io::{self, BufRead};
fn main() {
    let stdin = io::stdin();
    for line in stdin.lock().lines() {
        let xs: Vec<i32> = line.unwrap()
            .trim()
            .split(' ')
            .map(|s| s.parse().unwrap())
            .collect();
        println!("{:?}", xs);
    }
}
If not, you can create the binding to the "unwrapped" result and use it.
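The binding approach can be sketched as a standalone function. The name parse_numbers is hypothetical, and the function takes the io::Result<String> that lines() yields as a parameter so it can be exercised without stdin.

```rust
use std::io;

/// Hypothetical helper: parses one line of space-separated integers.
/// Like the original code, it panics on an I/O error or a malformed number.
fn parse_numbers(line: io::Result<String>) -> Vec<i32> {
    // Binding the unwrapped String keeps it alive until the end of the function,
    // so the slices produced by trim() and split() stay valid.
    let line = line.unwrap();
    let ss = line.trim().split(' ');
    ss.map(|s| s.parse().unwrap()).collect()
}
```

The returned Vec<i32> owns its numbers, so nothing borrowed from line escapes the function.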
