I don't understand why this doesn't work:
use std::collections::HashSet;
let test = "foo\nbar\n";
let hashset: HashSet<_> = test
.lines()
.collect::<Result<HashSet<_>, _>>()
.unwrap()
I get this error:
a value of type Result<HashSet<_>, _> cannot be built from an iterator over elements of type &str
I tried to use an intermediary Vec but didn't succeed either. I understand the error, but I don't know how to fix it elegantly.
This works but isn't the fastest solution:
use std::collections::HashSet;
let test = "foo\nbar\n";
let mut hashset = HashSet::new();
for word in test.lines() {
hashset.insert(word.to_string());
}
The lines() method cannot fail because it operates on a &str, so there is no Result to collect. Collect directly into a HashSet<&str> instead.
See https://doc.rust-lang.org/std/primitive.str.html#method.lines.
For example:
use std::collections::HashSet;
let test = "foo\nbar\n";
let hashset: HashSet<&str> = test
.lines()
.collect();
See it in action in the playground.
Your confusion seems to come from the fact that there is a similarly named lines method on BufRead, which can fail because it reads from files or other I/O sources.
See https://doc.rust-lang.org/std/io/trait.BufRead.html#method.lines.
Apart from this difference, BufRead::lines also yields owned Strings (wrapped in io::Result) instead of borrowed &str.
If you want to create a HashSet which owns its contents, you can modify your code like this:
let test = "foo\nbar\n";
let hashset: HashSet<String> = test
.lines()
.map(String::from)
.collect();
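For contrast, here is a minimal sketch of the case where collecting into a Result does make sense: BufRead::lines yields io::Result<String>, so collect() can build an io::Result<HashSet<String>>. The helper name unique_lines and the use of std::io::Cursor as a stand-in for a file are assumptions made for illustration only.

```rust
use std::collections::HashSet;
use std::io::{self, BufRead, Cursor};

// Hypothetical helper: collects the lines of any buffered reader into a set.
// The first I/O error, if any, stops the collection and is returned.
fn unique_lines<R: BufRead>(reader: R) -> io::Result<HashSet<String>> {
    reader.lines().collect()
}

fn main() -> io::Result<()> {
    // Cursor wraps an in-memory buffer so it can be read like a file.
    let reader = Cursor::new("foo\nbar\nfoo\n");
    let set = unique_lines(reader)?;
    println!("{} unique lines", set.len());
    Ok(())
}
```

With a &str source there is no error type to collect, which is why the same turbofish fails on str::lines.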
I am a Rust newbie. I tested the following code and have a question: is the type of slice [T]? If so, [T] is unsized, yet the code compiles. Why is that?
#[test]
fn test_scilce(){
let v = vec!['a', 'b', 'v'];
let slice = (v[1..3]).into_iter();
// let s: String = slice.collect();
println!("{:?}", slice);
println!("{:?}", v);
}
Since [T]::into_iter(self) doesn't exist, but [T]::into_iter(&self) does, the compiler inserts the missing reference and treats (v[1..3]).into_iter() as (&v[1..3]).into_iter(). That is in turn the same as (&v[1..3]).iter() and gives out references to the elements of the vector. (There is even a clippy lint warning you of using into_iter() on slice or other references.)
The same auto-referencing mechanism is what allows you to write v.len() instead of the "correct" (&v).len(), despite Vec::len taking &self.
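A small sketch of the auto-referencing described above. The helper name middle_refs is hypothetical, and the fixed range 1..3 assumes the input has at least three elements (it panics otherwise).

```rust
// Hypothetical helper: takes a sub-slice and collects what the iterator yields.
fn middle_refs(v: &[char]) -> Vec<&char> {
    // `into_iter()` on a slice expression is auto-referenced by the compiler,
    // so the items are `&char`, not `char`.
    (v[1..3]).into_iter().collect()
}

fn main() {
    let v = vec!['a', 'b', 'v'];
    let refs = middle_refs(&v);
    println!("{:?}", refs);
    // `v` is still usable: the iterator only borrowed from it.
    println!("{:?}", v);
}
```

The return type Vec<&char> only type-checks because the iterator hands out references, which is the point made in the answer.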
Take the following data type:
let mut items = HashMap::<u64, HashMap<u64, bool>>::new();
I successfully managed to turn it into a vector of tuples like this:
let mut vector_of_tuples: Vec<(u64, u64, bool)> = vec![];
for (outer_id, inner_hash_map) in items.iter() {
for (inner_id, state) in inner_hash_map.iter() {
vector_of_tuples.push((*outer_id, *inner_id, *state));
}
}
But I want to shrink this code logic, possibly with the help of the Map and Zip functions from the Rust standard library.
How can I achieve the same result without using for loops?
You can use collect() to build a vector from an iterator without an explicit loop:
let vector_of_tuples: Vec<(u64, u64, bool)> = items
.iter()
// ...
.collect();
To expand the contents of the inner hash maps into the iterator, you can use flat_map:
let vector_of_tuples: Vec<_> = items
.iter()
.flat_map(|(&outer_id, inner_hash_map)| {
inner_hash_map
.iter()
.map(move |(&inner_id, &state)| (outer_id, inner_id, state))
})
.collect();
In many cases chaining iterator adapters will yield more understandable code than the equivalent for loop because iterators are written in a declarative style and tend not to require side effects. However, in this particular case the original for loop might actually be the more readable option. YMMV, the best option will sometimes depend on the programming background of you and other project contributors.
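To try the flat_map chain end to end, here is a self-contained sketch. The helper name flatten and the sample data are assumptions. The result is sorted only because HashMap iteration order is unspecified, which would otherwise make the output vary between runs.

```rust
use std::collections::HashMap;

// Hypothetical helper wrapping the flat_map chain from the answer.
fn flatten(items: &HashMap<u64, HashMap<u64, bool>>) -> Vec<(u64, u64, bool)> {
    let mut out: Vec<_> = items
        .iter()
        .flat_map(|(&outer_id, inner)| {
            // `move` copies `outer_id` into the inner closure.
            inner
                .iter()
                .map(move |(&inner_id, &state)| (outer_id, inner_id, state))
        })
        .collect();
    // Sort for a deterministic order; not needed if order is irrelevant.
    out.sort();
    out
}

fn main() {
    let mut items = HashMap::<u64, HashMap<u64, bool>>::new();
    items.entry(1).or_default().insert(10, true);
    items.entry(1).or_default().insert(11, false);
    items.entry(2).or_default().insert(20, true);
    println!("{:?}", flatten(&items));
}
```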
I set myself a little task to acquire some basic Rust knowledge. The task was:
Read some key-value pairs from stdin and put them into a hashmap.
This, however, turned out to be a trickier challenge than expected, mainly because of lifetimes. The following code is what I currently have after a few experiments, but the compiler just doesn't stop yelling at me.
use std::io;
use std::collections::HashMap;
fn main() {
let mut input = io::stdin();
let mut lock = input.lock();
let mut lines_iter = lock.lines();
let mut map = HashMap::new();
for line in lines_iter {
let text = line.ok().unwrap();
let kv_pair: Vec<&str> = text.words().take(2).collect();
map.insert(kv_pair[0], kv_pair[1]);
}
println!("{}", map.len());
}
The compiler basically says:
`text` does not live long enough
As far as I understand, this is because the lifetime of 'text' is limited to the scope of the loop.
The key-value pair that I'm extracting within the loop is therefore also bound to the loop's boundaries. Thus, inserting it into the outer map would lead to a dangling pointer, since 'text' will be destroyed after each iteration. (Please tell me if I'm wrong.)
The big question is: How to solve this issue?
My intuition says:
Make an "owned copy" of the key-value pair and "expand" its lifetime to the outer scope ... but I have no idea how to achieve this.
The lifetime of 'text' is limited to the scope of the loop. The key-value pair that I'm extracting within the loop is therefore also bound to the loop's boundaries. Thus, inserting it into the outer map would lead to a dangling pointer, since 'text' will be destroyed after each iteration.
Sounds right to me.
Make an "owned copy" of the key value pair.
The owned counterpart of a &str is a String:
map.insert(kv_pair[0].to_string(), kv_pair[1].to_string());
Edit
The original code is below, but I've updated the answer above to be more idiomatic
map.insert(String::from_str(kv_pair[0]), String::from_str(kv_pair[1]));
In Rust 1.1 the function words was marked as deprecated. Now you should use split_whitespace.
Here is an alternative solution which is a bit more functional and idiomatic (works with 1.3).
use std::io::{self, BufRead};
use std::collections::HashMap;
fn main() {
let stdin = io::stdin();
// iterate over all lines, "change" the lines and collect into `HashMap`
let map: HashMap<_, _> = stdin.lock().lines().filter_map(|line_res| {
// convert `Result` to `Option` and map the `Some`-value to a pair of
// `String`s
line_res.ok().map(|line| {
let kv: Vec<_> = line.split_whitespace().take(2).collect();
(kv[0].to_owned(), kv[1].to_owned())
})
}).collect();
println!("{}", map.len());
}
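The version above indexes kv[0] and kv[1], which panics on a line with fewer than two words. A minimal sketch that skips such lines instead follows. The helper name parse_pairs is hypothetical, and an in-memory Cursor stands in for stdin so the example runs without input; passing io::stdin().lock() works the same way.

```rust
use std::collections::HashMap;
use std::io::{BufRead, Cursor};

// Hypothetical helper: reads "key value" lines from any buffered reader.
// Lines that fail to read or that lack a second word are skipped.
fn parse_pairs<R: BufRead>(reader: R) -> HashMap<String, String> {
    reader
        .lines()
        .filter_map(|line| line.ok())
        .filter_map(|line| {
            let mut words = line.split_whitespace();
            // Both `next()` calls must succeed, otherwise the line is dropped.
            let key = words.next()?;
            let value = words.next()?;
            // Owned copies outlive `line`, which is dropped after this closure.
            Some((key.to_owned(), value.to_owned()))
        })
        .collect()
}

fn main() {
    let input = Cursor::new("a 1\nb 2\nlonely\n");
    let map = parse_pairs(input);
    println!("{}", map.len());
}
```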
I have something that is Read; currently it's a File. I want to read a number of bytes from it that is only known at runtime (length prefix in a binary data structure).
So I tried this:
let mut vec = Vec::with_capacity(length);
let count = file.read(vec.as_mut_slice()).unwrap();
but count is zero because vec.as_mut_slice().len() is zero as well.
[0u8;length] of course doesn't work because the size must be known at compile time.
I wanted to do
let mut vec = Vec::with_capacity(length);
let count = file.take(length).read_to_end(vec).unwrap();
but take's receiver parameter is a T and I only have &mut T (and I'm not really sure why it's needed anyway).
I guess I can replace File with BufReader and dance around with fill_buf and consume which sounds complicated enough but I still wonder: Have I overlooked something?
Like the Iterator adaptors, the IO adaptors take self by value to be as efficient as possible. Also like the Iterator adaptors, a mutable reference to a Read is also a Read.
To solve your problem, you just need Read::by_ref:
use std::io::Read;
use std::fs::File;
fn main() {
let mut file = File::open("/etc/hosts").unwrap();
let length = 5;
let mut vec = Vec::with_capacity(length);
file.by_ref().take(length as u64).read_to_end(&mut vec).unwrap();
let mut the_rest = Vec::new();
file.read_to_end(&mut the_rest).unwrap();
}
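The same pattern can be checked without touching the file system. In this sketch the helper name split_read is hypothetical and a Cursor over a byte vector stands in for the File.

```rust
use std::io::{self, Read};

// Hypothetical helper: reads up to `length` bytes as a prefix, then the rest.
fn split_read<R: Read>(mut reader: R, length: u64) -> io::Result<(Vec<u8>, Vec<u8>)> {
    let mut head = Vec::new();
    // `by_ref()` borrows the reader, so `take` consumes only the borrow.
    reader.by_ref().take(length).read_to_end(&mut head)?;
    let mut rest = Vec::new();
    // The original reader is still available and continues where `take` stopped.
    reader.read_to_end(&mut rest)?;
    Ok((head, rest))
}

fn main() -> io::Result<()> {
    let data = io::Cursor::new(b"hello world".to_vec());
    let (head, rest) = split_read(data, 5)?;
    println!("{:?} / {:?}", head, rest);
    Ok(())
}
```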
1. Fill-this-vector version
Your first solution is close to working. You identified the problem but did not try to solve it! The problem is that whatever the capacity of the vector, it is still empty (vec.len() == 0). Instead, you can fill it with zeroed elements, such as:
let mut vec = vec![0u8; length];
The following full code works (as_mut_slice() has been stable since Rust 1.7, so no feature gate is needed):
use std::fs::File;
use std::io::Read;
fn main() {
let mut file = File::open("/usr/share/dict/words").unwrap();
let length: usize = 100;
let mut vec = vec![0u8; length];
let count = file.read(vec.as_mut_slice()).unwrap();
println!("read {} bytes.", count);
println!("vec = {:?}", vec);
}
Of course, you still have to check whether count == length, and read more data into the buffer if that's not the case.
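One way to do that check-and-retry is a small loop, sketched here. The helper name read_fully is hypothetical, and a Cursor stands in for the file.

```rust
use std::io::{self, ErrorKind, Read};

// Hypothetical helper: keeps calling `read` until `buf` is full or the input
// ends. Returns how many bytes were actually read.
fn read_fully<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // end of input
            Ok(n) => filled += n,
            // An interrupted read is not fatal; try again.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

fn main() -> io::Result<()> {
    let mut reader = io::Cursor::new(vec![1u8, 2, 3]);
    let mut buf = vec![0u8; 5];
    let count = read_fully(&mut reader, &mut buf)?;
    println!("read {} bytes: {:?}", count, &buf[..count]);
    Ok(())
}
```

Returning the count rather than an error on a short input is a design choice of this sketch; the caller decides whether a partial read is acceptable.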
2. Iterator version
Your second solution is better because you don't have to check how many bytes have been read, and you don't have to re-read when count != length. You need the bytes() method of the Read trait (implemented by File). This transforms the file into a stream (i.e. an iterator) of bytes. Because errors can still happen, you don't get an Iterator<Item = u8> but an Iterator<Item = io::Result<u8>>. Hence you need to deal with failures explicitly within the iterator. We're going to use unwrap() here for simplicity:
use std::fs::File;
use std::io::Read;
fn main() {
let file = File::open("/usr/share/dict/words").unwrap();
let length: usize = 100;
let vec: Vec<u8> = file
.bytes()
.take(length)
.map(|r: Result<u8, _>| r.unwrap()) // or deal explicitly with failure!
.collect();
println!("vec = {:?}", vec);
}
You can use a bit of unsafe to create a vector of uninitialized memory and skip the zeroing. Be careful, though: the documentation for Read::read states that calling it with an uninitialized buffer is not safe and can lead to undefined behavior, so vec![0u8; length] is the safer choice.
let mut vec: Vec<u8> = Vec::with_capacity(length);
unsafe { vec.set_len(length); } // contents are uninitialized
let count = file.read(vec.as_mut_slice()).unwrap();
This way, vec.len() is set to its capacity, and the bytes in it are uninitialized (possibly zeros, possibly garbage). You avoid the cost of zeroing the memory, at the price of the caveat above.
Note that the read() method on Read is not guaranteed to fill the whole slice; it may return fewer bytes than the slice length. When this answer was written there were several RFCs proposing methods to fill this gap; Read::read_exact has since been added to the standard library for that purpose.
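A minimal sketch of reading a runtime-known number of bytes with Read::read_exact, which either fills the buffer completely or returns an error. The helper name read_n is hypothetical, and a Cursor stands in for the file.

```rust
use std::io::{self, Read};

// Hypothetical helper: reads exactly `length` bytes or returns an error.
fn read_n<R: Read>(reader: &mut R, length: usize) -> io::Result<Vec<u8>> {
    // A zero-filled buffer keeps everything in safe code.
    let mut buf = vec![0u8; length];
    // `read_exact` fails with `UnexpectedEof` if the input ends early.
    reader.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() {
    let mut reader = io::Cursor::new(vec![1u8, 2, 3, 4]);
    println!("{:?}", read_n(&mut reader, 3));
    // Only one byte is left, so asking for three more reports an error.
    println!("{:?}", read_n(&mut reader, 3).map_err(|e| e.kind()));
}
```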