Visibility of set outside of infinite loop filling it [duplicate] - rust

I set myself a little task to acquire some basic Rust knowledge. The task was:
Read some key-value pairs from stdin and put them into a hashmap.
This, however, turned out to be a trickier challenge than expected. Mainly due to the understanding of lifetimes. The following code is what I currently have after a few experiments, but the compiler just doesn't stop yelling at me.
use std::io;
use std::collections::HashMap;
fn main() {
let mut input = io::stdin();
let mut lock = input.lock();
let mut lines_iter = lock.lines();
let mut map = HashMap::new();
for line in lines_iter {
let text = line.ok().unwrap();
let kv_pair: Vec<&str> = text.words().take(2).collect();
map.insert(kv_pair[0], kv_pair[1]);
}
println!("{}", map.len());
}
The compiler basically says:
`text` does not live long enough
As far as I understand, this is because the lifetime of 'text' is limited to the scope of the loop.
The key-value pair that I'm extracting within the loop is therefore also bound to the loops boundaries. Thus, inserting them to the outer map would lead to a dangling pointer since 'text' will be destroyed after each iteration. (Please tell me if I'm wrong)
The big question is: How to solve this issue?
My intuition says:
Make an "owned copy" of the key value pair and "expand" it's lifetime to the outer scope .... but I have no idea how to achieve this.

The lifetime of 'text' is limited to the scope of the loop. The key-value pair that I'm extracting within the loop is therefore also bound to the loops boundaries. Thus, inserting them to the outer map would lead to an dangling pointer since 'text' will be destroyed after each iteration.
Sounds right to me.
Make an "owned copy" of the key value pair.
An owned &str is a String:
map.insert(kv_pair[0].to_string(), kv_pair[1].to_string());
Edit
The original code is below, but I've updated the answer above to be more idiomatic
map.insert(String::from_str(kv_pair[0]), String::from_str(kv_pair[1]));

In Rust 1.1 the function words was marked as deprecated. Now you should use split_whitespace.
Here is an alternative solution which is a bit more functional and idiomatic (works with 1.3).
use std::io::{self, BufRead};
use std::collections::HashMap;
fn main() {
let stdin = io::stdin();
// iterate over all lines, "change" the lines and collect into `HashMap`
let map: HashMap<_, _> = stdin.lock().lines().filter_map(|line_res| {
// convert `Result` to `Option` and map the `Some`-value to a pair of
// `String`s
line_res.ok().map(|line| {
let kv: Vec<_> = line.split_whitespace().take(2).collect();
(kv[0].to_owned(), kv[1].to_owned())
})
}).collect();
println!("{}", map.len());
}

Related

Is there any way to borrow a RefCell immutably and mutably at the same time?

I have a piece of code which needs to operate on a list. This list contains items which come from another source and need to be processed and eventually removed. The list is also passed along to multiple functions which decide whether to add or remove an item. I created an example code which reflects my issue:
use std::{cell::RefCell, rc::Rc};
pub fn foo() {
let list: Rc<RefCell<Vec<Rc<RefCell<String>>>>> = Rc::new(RefCell::new(Vec::new()));
list.borrow_mut()
.push(Rc::new(RefCell::new(String::from("ABC"))));
while list.borrow().len() > 0 {
let list_ref = list.borrow();
let first_item = list_ref[0].borrow_mut();
//item processing, needed as mutable
list.borrow_mut().remove(0);
}
}
This panics at runtime:
thread 'main' panicked at 'already borrowed: BorrowMutError', src/libcore/result.rs:997:5
I think I understand the problem: I have two immutable borrows and then a third which is mutable. According to the Rust docs, this is not allowed: either many immutable borrows or a single mutable one. Is there any way to get around this issue?
I have no idea what you are actually trying to achieve as you have failed to provide a minimal reproducible example, but I think you just mixed up the borrows of the list and the item in your data structure and that confused you in the first place.
Nonetheless the following code (which you can run in the playground) does what you have described above.
use std::{cell::RefCell, rc::Rc};
pub fn foo() {
let list = Rc::new(RefCell::new(Vec::new()));
let mut list = list.borrow_mut();
let item = Rc::new(RefCell::new(String::from("ABC")));
list.push(item);
println!("list: {:?}", list);
while let Some(item) = list.pop() {
println!("item: {:?}", item);
item.borrow_mut().push_str("DEF");
println!("item: {:?}", item);
}
println!("list: {:?}", list);
}
fn main() {
foo();
}
There are two tricks which I used here.
I borrowed the list only once and that borrow was a mutable one, which allowed me to add and remove items from it.
Because your description said you want to remove the items from the list anyway, I was able to iterate over the Vec with the pop or the remove methods (depending on the order you wish to get the items from the list). This means I didn't have to borrow the Vec for the scope of the loop (which you would otherwise do if you would iterate over it).
There are other ways to remove an element based on some predicate. For example: Removing elements from a Vec based on some condition.
To actually answer your original question: there is no way to have an immutable and a mutable borrow at the same time safely. That is one of the core principles of Rust which makes it memory safe. Think about it, what kind of guarantee would immutability be if at the same time, under the hood, the data could actually change?

How do I convert a Vec<String> to Vec<&str>?

I can convert Vec<String> to Vec<&str> this way:
let mut items = Vec::<&str>::new();
for item in &another_items {
items.push(item);
}
Are there better alternatives?
There are quite a few ways to do it, some have disadvantages, others simply are more readable to some people.
This dereferences s (which is of type &String) to a String "right hand side reference", which is then dereferenced through the Deref trait to a str "right hand side reference" and then turned back into a &str. This is something that is very commonly seen in the compiler, and I therefor consider it idiomatic.
let v2: Vec<&str> = v.iter().map(|s| &**s).collect();
Here the deref function of the Deref trait is passed to the map function. It's pretty neat but requires useing the trait or giving the full path.
let v3: Vec<&str> = v.iter().map(std::ops::Deref::deref).collect();
This uses coercion syntax.
let v4: Vec<&str> = v.iter().map(|s| s as &str).collect();
This takes a RangeFull slice of the String (just a slice into the entire String) and takes a reference to it. It's ugly in my opinion.
let v5: Vec<&str> = v.iter().map(|s| &s[..]).collect();
This is uses coercions to convert a &String into a &str. Can also be replaced by a s: &str expression in the future.
let v6: Vec<&str> = v.iter().map(|s| { let s: &str = s; s }).collect();
The following (thanks #huon-dbaupp) uses the AsRef trait, which solely exists to map from owned types to their respective borrowed type. There's two ways to use it, and again, prettiness of either version is entirely subjective.
let v7: Vec<&str> = v.iter().map(|s| s.as_ref()).collect();
and
let v8: Vec<&str> = v.iter().map(AsRef::as_ref).collect();
My bottom line is use the v8 solution since it most explicitly expresses what you want.
The other answers simply work. I just want to point out that if you are trying to convert the Vec<String> into a Vec<&str> only to pass it to a function taking Vec<&str> as argument, consider revising the function signature as:
fn my_func<T: AsRef<str>>(list: &[T]) { ... }
instead of:
fn my_func(list: &Vec<&str>) { ... }
As pointed out by this question: Function taking both owned and non-owned string collections. In this way both vectors simply work without the need of conversions.
All of the answers idiomatically use iterators and collecting instead of a loop, but do not explain why this is better.
In your loop, you first create an empty vector and then push into it. Rust makes no guarantees about the strategy it uses for growing factors, but I believe the current strategy is that whenever the capacity is exceeded, the vector capacity is doubled. If the original vector had a length of 20, that would be one allocation, and 5 reallocations.
Iterating from a vector produces an iterator that has a "size hint". In this case, the iterator implements ExactSizeIterator so it knows exactly how many elements it will return. map retains this and collect takes advantage of this by allocating enough space in one go for an ExactSizeIterator.
You can also manually do this with:
let mut items = Vec::<&str>::with_capacity(another_items.len());
for item in &another_items {
items.push(item);
}
Heap allocations and reallocations are probably the most expensive part of this entire thing by far; far more expensive than taking references or writing or pushing to a vector when no new heap allocation is involved. It wouldn't surprise me if pushing a thousand elements onto a vector allocated for that length in one go were faster than pushing 5 elements that required 2 reallocations and one allocation in the process.
Another unsung advantage is that using the methods with collect do not store in a mutable variable which one should not use if it's unneeded.
another_items.iter().map(|item| item.deref()).collect::<Vec<&str>>()
To use deref() you must add using use std::ops::Deref
This one uses collect:
let strs: Vec<&str> = another_items.iter().map(|s| s as &str).collect();
Here is another option:
use std::iter::FromIterator;
let v = Vec::from_iter(v.iter().map(String::as_str));
Note that String::as_str is stable since Rust 1.7.

get file information from DirEntry in a for loop

I am new to Rust. I am trying to build a JSON object where the keys are file names and the value is the file contents.
So far, I have:
use std::fs;
use std::io;
use std::env;
use std::collections::HashMap;
use std::path::{Path, PathBuf};
fn main() {
make_json();
}
fn make_json() -> io::Result<()> {
let mut modules = HashMap::new();
let mut dir = env::current_dir().unwrap();
let mut read_dir = fs::read_dir(dir);
for entry in try!(read_dir) {
let entry = try!(entry);
let file_name = entry.path().file_name().unwrap().to_string_lossy();
modules.insert(file_name, "");
}
Ok(())
}
When I go to compile it, I get
src/main.rs:19:25: 19:37 error: borrowed value does not live long enough
src/main.rs:19 let file_name = entry.path().file_name().unwrap().to_string_lossy();
^~~~~~~~~~~~
note: in expansion of for loop expansion
src/main.rs:17:5: 21:6 note: expansion site
src/main.rs:13:38: 23:2 note: reference must be valid for the block suffix following statement 0 at 13:37...
src/main.rs:13 let mut modules = HashMap::new();
src/main.rs:14 let mut dir = env::current_dir().unwrap();
src/main.rs:15 let mut read_dir = fs::read_dir(dir);
src/main.rs:16
src/main.rs:17 for entry in try!(read_dir) {
src/main.rs:18 let entry = try!(entry);
...
src/main.rs:19:9: 19:77 note: ...but borrowed value is only valid for the statement at 19:8
src/main.rs:19 let file_name = entry.path().file_name().unwrap().to_string_lossy();
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
src/main.rs:19:9: 19:77 help: consider using a `let` binding to increase its lifetime
src/main.rs:19 let file_name = entry.path().file_name().unwrap().to_string_lossy();
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
error: aborting due to previous error
I understand what this error is telling me; entry is defined within the scope of the for loop, and therefore if I store it to the HashMap it will no longer be valid memory because the place in memory will have been freed already. I get that.
What I don't get, is how I access the the DirEntrys within read_dir without using some sort of closure, since I will need their information outside of whatever closure I retrieve them in.
Everything that I have come across hasn't been able to help me.
DirEntry.path() returns a PathBuf, which is 'static (i.e. it contains no non-static references and is a completely standalong object). It is where the problem lies.
PathBuf.file_name() returns Option<&OsStr>, a reference into that object, and OsStr.to_string_lossy() returns Cow<str>. Note with that last that it is not 'static; with the elided lifetimes reinstated, it’s fn to_string_lossy<'a>(&'a self) -> Cow<'a, str>. This is for efficiency, because if the path is legal UTF-8 then there’s no need to go creating an entirely new owned string (String), it can keep it as a string slice (&str). (Because that’s what Cow<'a, str> is: its variants, with generics filled in, are Owned(String) and Borrowed(&'a str).)
What you need in this location is to turn the Cow<str> into a String. This is accomplished with the into_owned method of Cow<T>.
That line of code thus becomes this:
let file_name = entry.path().file_name().unwrap().to_string_lossy().into_owned();
Problem while dealing with Rust file system forced me to create this rust library brown
While dealing with Rust fs and specially while working with loops, the main issue is that every thing return another thing and then we need to convert that thing.
We need something to flatten the items for us
My suggestion :: Do not do any calculations etc in a loop, it should just have function calls to a well tested library and just checking its results.

Why doesn't this compile - use of undeclared type name `thread::scoped`

I'm trying to get my head around Rust. I've got an alpha version of 1.
Here's the problem I'm trying to program: I have a vector of floats. I want to set up some threads asynchronously. Each thread should wait for the number of seconds specified by each element of the vector, and return the value of the element, plus 10. The results need to be in input order.
It's an artificial example, to be sure, but I wanted to see if I could implement something simple before moving onto more complex code. Here is my code so far:
use std::thread;
use std::old_io::timer;
use std::time::duration::Duration;
fn main() {
let mut vin = vec![1.4f64, 1.2f64, 1.5f64];
let mut guards: Vec<thread::scoped> = Vec::with_capacity(3);
let mut answers: Vec<f64> = Vec::with_capacity(3);
for i in 0..3 {
guards[i] = thread::scoped( move || {
let ms = (1000.0f64 * vin[i]) as i64;
let d = Duration::milliseconds(ms);
timer::sleep(d);
println!("Waited {}", vin[i]);
answers[i] = 10.0f64 + (vin[i] as f64);
})};
for i in 0..3 {guards[i].join(); };
for i in 0..3 {println!("{}", vin[i]); }
}
So the input vector is [1.4, 1.2, 1.5], and I'm expecting the output vector to be [11.4, 11.2, 11.5].
There appear to be a number of problems with my code, but the first one is that I get a compilation error:
threads.rs:7:25: 7:39 error: use of undeclared type name `thread::scoped`
threads.rs:7 let mut guards: Vec<thread::scoped> = Vec::with_capacity(3);
^~~~~~~~~~~~~~
error: aborting due to previous error
There also seem to be a number of other problems, including using vin within a closure. Also, I have no idea what move does, other than the fact that every example I've seen seems to use it.
Your error is due to the fact that thread::scoped is a function, not a type. What you want is a Vec<T> where T is the result type of the function. Rust has a neat feature that helps you here: It automatically detects the correct type of your variables in many situations.
If you use
let mut guards = Vec::with_capacity(3);
the type of guards will be chosen when you use .push() the first time.
There also seem to be a number of other problems.
you are accessing guards[i] in the first for loop, but the length of the guards vector is 0. Its capacity is 3, which means that you won't have any unnecessary allocations as long as the vector never contains more than 3 elements. use guards.push(x) instead of guards[i] = x.
thread::scoped expects a Fn() -> T, so your closure can return an object. You get that object when you call .join(), so you don't need an answer-vector.
vin is moved to the closure. Therefore in the second iteration of the loop that creates your guards, vin isn't available anymore to be moved to the "second" closure. Every loop iteration creates a new closure.
i is moved to the closure. I have no idea what's going on there. But the solution is to let inval = vin[i]; outside the closure, and then use inval inside the closure. This also solves Point 3.
vin is mutable. Yet you never mutate it. Don't bind variables mutably if you don't need to.
vin is an array of f64. Therefore (vin[i] as f64) does nothing. Therefore you can simply use vin[i] directly.
join moves out of the guard. Since you cannot move out of an array, your cannot index into an array of guards and join the element at the specified index. What you can do is loop over the elements of the array and join each guard.
Basically this means: don't iterate over indices (for i in 1..3), but iterate over elements (for element in vector) whenever possible.
All of the above implemented:
use std::thread;
use std::old_io::timer;
use std::time::duration::Duration;
fn main() {
let vin = vec![1.4f64, 1.2f64, 1.5f64];
let mut guards = Vec::with_capacity(3);
for inval in vin {
guards.push(thread::scoped( move || {
let ms = (1000.0f64 * inval) as i64;
let d = Duration::milliseconds(ms);
timer::sleep(d);
println!("Waited {}", inval);
10.0f64 + inval
}));
}
for guard in guards {
let answer = guard.join();
println!("{}", answer);
};
}
In supplement of Ker's answer: if you really need to mutate arrays within a thread, I suppose the most closest valid solution for your task will be something like this:
use std::thread::spawn;
use std::old_io::timer;
use std::sync::{Arc, Mutex};
use std::time::duration::Duration;
fn main() {
let vin = Arc::new(vec![1.4f64, 1.2f64, 1.5f64]);
let answers = Arc::new(Mutex::new(vec![0f64, 0f64, 0f64]));
let mut workers = Vec::new();
for i in 0..3 {
let worker_vin = vin.clone();
let worker_answers = answers.clone();
let worker = spawn( move || {
let ms = (1000.0f64 * worker_vin[i]) as i64;
let d = Duration::milliseconds(ms);
timer::sleep(d);
println!("Waited {}", worker_vin[i]);
let mut answers = worker_answers.lock().unwrap();
answers[i] = 10.0f64 + (worker_vin[i] as f64);
});
workers.push(worker);
}
for worker in workers { worker.join().unwrap(); }
for answer in answers.lock().unwrap().iter() {
println!("{}", answer);
}
}
In order to share vectors between several threads, I have to prove, that these vectors outlive all of my threads. I cannot use just Vec, because it will be destroyed at the end of main block, and another thread could live longer, possibly accessing freed memory. So I took Arc reference counter, which guarantees, that my vectors will be destroyed only when the counter downs to zero.
Arc allows me to share read-only data. In order to mutate answers array, I should use some synchronize tools, like Mutex. That is how Rust prevents me to make data races.

Storing from inside a loop a borrowed value to container in outer scope?

I set myself a little task to acquire some basic Rust knowledge. The task was:
Read some key-value pairs from stdin and put them into a hashmap.
This, however, turned out to be a trickier challenge than expected. Mainly due to the understanding of lifetimes. The following code is what I currently have after a few experiments, but the compiler just doesn't stop yelling at me.
use std::io;
use std::collections::HashMap;
fn main() {
let mut input = io::stdin();
let mut lock = input.lock();
let mut lines_iter = lock.lines();
let mut map = HashMap::new();
for line in lines_iter {
let text = line.ok().unwrap();
let kv_pair: Vec<&str> = text.words().take(2).collect();
map.insert(kv_pair[0], kv_pair[1]);
}
println!("{}", map.len());
}
The compiler basically says:
`text` does not live long enough
As far as I understand, this is because the lifetime of 'text' is limited to the scope of the loop.
The key-value pair that I'm extracting within the loop is therefore also bound to the loops boundaries. Thus, inserting them to the outer map would lead to a dangling pointer since 'text' will be destroyed after each iteration. (Please tell me if I'm wrong)
The big question is: How to solve this issue?
My intuition says:
Make an "owned copy" of the key value pair and "expand" it's lifetime to the outer scope .... but I have no idea how to achieve this.
The lifetime of 'text' is limited to the scope of the loop. The key-value pair that I'm extracting within the loop is therefore also bound to the loops boundaries. Thus, inserting them to the outer map would lead to an dangling pointer since 'text' will be destroyed after each iteration.
Sounds right to me.
Make an "owned copy" of the key value pair.
An owned &str is a String:
map.insert(kv_pair[0].to_string(), kv_pair[1].to_string());
Edit
The original code is below, but I've updated the answer above to be more idiomatic
map.insert(String::from_str(kv_pair[0]), String::from_str(kv_pair[1]));
In Rust 1.1 the function words was marked as deprecated. Now you should use split_whitespace.
Here is an alternative solution which is a bit more functional and idiomatic (works with 1.3).
use std::io::{self, BufRead};
use std::collections::HashMap;
fn main() {
let stdin = io::stdin();
// iterate over all lines, "change" the lines and collect into `HashMap`
let map: HashMap<_, _> = stdin.lock().lines().filter_map(|line_res| {
// convert `Result` to `Option` and map the `Some`-value to a pair of
// `String`s
line_res.ok().map(|line| {
let kv: Vec<_> = line.split_whitespace().take(2).collect();
(kv[0].to_owned(), kv[1].to_owned())
})
}).collect();
println!("{}", map.len());
}

Resources