Immutable access in rust - rust

I am new to Rust, coming from Python, and have used the functional style in Python extensively.
What I am trying to do is to take in a string (slice) (or any iterable) and iterate with a reference to the current index and the next index. Here is my attempt:
fn main() {
// intentionally immutable, this should not change
let x = "this is a
multiline string
with more
then 3 lines.";
// initialize multiple (mutable) iterators over the slice
let mut lineiter = x.chars();
let mut afteriter = x.chars();
// to have some reason to do this
afteriter.skip(1);
// zip them together, comparing the current line with the next line
let mut zipped = lineiter.zip(afteriter);
for (char1, char2) in zipped {
println!("{:?} {:?}", char1, char2);
}
}
I think it should be possible to get different iterators that have different positions in the slice but are referring to the same parts of memory without having to copy the string, but the error I get is as follows:
error[E0382]: use of moved value: `afteriter`
--> /home/alex/Documents/projects/simple-game-solver/src/src.rs:15:35
|
10 | let afteriter = x.chars();
| --------- move occurs because `afteriter` has type `std::str::Chars<'_>`, which does not implement the `Copy` trait
11 | // to have some reason to do this
12 | afteriter.skip(1);
| --------- value moved here
...
15 | let mut zipped = lineiter.zip(afteriter);
| ^^^^^^^^^ value used here after move
I also get a warning telling me that zipped does not need to be mutable.
Is it possible to instantiate multiple iterators over a single variable and if so how can it be done?

Is it possible to instantiate multiple iterators over a single variable and if so how can it be done?
If you check the signature and documentation for Iterator::skip:
fn skip(self, n: usize) -> Skip<Self>
Creates an iterator that skips the first n elements.
After they have been consumed, the rest of the elements are yielded. Rather than overriding this method directly, instead override the nth method.
You can see that it takes self by value (consumes the input iterator) and returns a new iterator. This is not a method which consumes the first n elements of the iterator in-place, it's one which converts the existing iterator into one which skips the first n elements.
So instead of:
let mut afteriter = x.chars();
afteriter.skip(1);
you just write:
let mut afteriter = x.chars().skip(1);
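Putting that together, here is a minimal sketch of the zip-with-the-next-element idea from the question. The helper name `adjacent_pairs` is made up for illustration; only `chars`, `skip`, `zip` and `collect` come from the standard library.

```rust
/// Pairs each character with the character that follows it.
/// `adjacent_pairs` is a hypothetical helper name, not part of the question.
fn adjacent_pairs(s: &str) -> Vec<(char, char)> {
    // Two independent iterators borrow the same string; the string itself is not copied.
    let current = s.chars();
    // `skip` consumes the iterator it is called on and returns the adapted one,
    // so the result is bound directly instead of calling `skip` as a statement.
    let next = s.chars().skip(1);
    // `zip` stops as soon as the shorter iterator (`next`) runs out.
    current.zip(next).collect()
}

fn main() {
    let x = "abc";
    for (char1, char2) in adjacent_pairs(x) {
        println!("{:?} {:?}", char1, char2);
    }
}
```

Neither binding needs `mut` here, because both iterators are handed off to `zip` by value.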
I also get a warning telling me that zipped does not need to be mutable.
That's because Rust's for loop uses the IntoIterator trait, which moves the iterable into the loop. It's not creating a mutable reference, it's just consuming whatever the RHS is.
Therefore it doesn't care about the mutability of the variable. You do need mut if you iterate explicitly (calling next yourself), or if you call some other "terminal" method (e.g. nth or try_fold or all), or if you want to iterate on the mutable reference (that's mostly useful for collections though), but not to hand off iterators to some other combinator method, or to a for loop.
A for loop takes self, if you will. Just as for_each does in fact.
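As a rough illustration of when `mut` is and is not required, the sketch below calls `next` explicitly (which needs a mutable binding) and then hands the same iterator to a `for` loop through an immutable binding. The function name `first_and_rest` is an assumption made for this example.

```rust
/// Splits a string into its first character and the remaining characters.
/// `first_and_rest` is a hypothetical helper used only for illustration.
fn first_and_rest(s: &str) -> (Option<char>, Vec<char>) {
    // `next` takes `&mut self`, so this binding has to be `mut`.
    let mut iter = s.chars();
    let first = iter.next();

    // Moving the iterator into a new, immutable binding is fine:
    // the `for` loop consumes it by value and never borrows it mutably.
    let remaining = iter;
    let mut rest = Vec::new();
    for c in remaining {
        rest.push(c);
    }
    (first, rest)
}

fn main() {
    let (first, rest) = first_and_rest("abc");
    println!("{:?} {:?}", first, rest);
}
```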

Thanks to @Stargateur for giving me the solution. The .skip(1) takes ownership of afteriter and returns a new iterator that skips the first element. What was happening before was that ownership was lost on the .skip call, so the variable could not be used anymore (I am pretty sure).

Related

Get elements from Vector of tab delimited Strings

I have a vector of Strings as in the example below, and for every element in that vector, I want to get the second and third items. I don't know if I should be collecting a &str or String, but I haven't gotten to that part because this does not compile.
Everything is "fine" until I add the slicing [1..]
let elements: Vec<&str> = vec!["foo\tbar\tbaz", "ffoo\tbbar\tbbaz"]
.iter()
.map(|rec| rec.rsplit('\t').collect::<Vec<_>>()[1..])
.collect();
It complains because
the size for values of type `[&str]` cannot be known at compilation time
the trait `std::marker::Sized` is not implemented for `[&str]` rustc(E0277)
As the compiler tells you, the slicing is broken because in Rust indexing with a range returns, well, the slice itself, whose size is unknown at compile time (hence the compiler complaining that it's unsized).
That's why you normally reference the slice e.g.
&thing[1..]
unless it's a context where it doesn't matter. Or you immediately convert the slice to a vector or array.
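A small sketch of those two options, using hypothetical helper names (`tail` and `tail_owned`) and `i32` elements purely for illustration:

```rust
/// Borrows everything after the first element.
/// Panics if `items` is empty, because the range `1..` starts past the end.
fn tail(items: &[i32]) -> &[i32] {
    // `items[1..]` has the unsized type `[i32]`; taking a reference makes it usable.
    &items[1..]
}

/// Copies everything after the first element into a new, owned vector.
/// Panics on an empty slice for the same reason as `tail`.
fn tail_owned(items: &[i32]) -> Vec<i32> {
    items[1..].to_vec()
}

fn main() {
    let numbers = vec![1, 2, 3];
    println!("{:?}", tail(&numbers));
    println!("{:?}", tail_owned(&numbers));
}
```

The borrowed version is only valid while `numbers` is alive; the owned version pays for a copy but has no such restriction.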
However here it would not work, because a slice is a "borrowing" structure, it doesn't own anything. And it borrows the Vec being created inside the map, which means you'll get a borrowing error, because the Vec will be destroyed at the end of the callback, and thus the slice would be referencing invalid memory:
error[E0515]: cannot return value referencing temporary value
--> src/main.rs:5:16
|
5 | .map(|rec| &rec.rsplit('\t').collect::<Vec<_>>()[1..])
| ^------------------------------------^^^^^
| ||
| |temporary value created here
| returns a value referencing data owned by the current function
The solution is to filter the iterator before collecting the vec, using Iterator::skip:
let elements: Vec<&str> = my_vec
.iter()
.map(|rec| rec.rsplit('\t').skip(1).collect::<Vec<_>>())
.collect();
However this means you now have an Iterator<Item=Vec<&str>>, which doesn't collect to a Vec<&str>.
You could always Iterator::flatten the inner vecs, but in reality they're completely unnecessary: you can just Iterator::flat_map each original string into a stream of strings which automatically get folded into the parent:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f2c33c1b6a30224202357dc4bd5c1d19
let my_vec = vec!["foo\tbar\tbaz", "ffoo\tbbar\tbbaz"];
let elements: Vec<&str> = my_vec
.iter()
.flat_map(|rec| rec.rsplit('\t').skip(1))
.collect();
dbg!(elements);
By the by, the code you're showing doesn't match the description, you say:
for every element in that vector, I want to get the second and third items
but since you're using rsplit what you're getting is the second and first: rsplit will iterate from the end, hence the r for reverse.
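If the second and third fields really are what is wanted, one possible sketch is to use `split` (front to back) instead of `rsplit`. The function name `second_and_third` is made up here, and it assumes every record has exactly three tab-separated fields.

```rust
/// Collects every field except the first from each tab-delimited record.
/// With three fields per record (an assumption), that is the second and third.
fn second_and_third<'a>(records: &[&'a str]) -> Vec<&'a str> {
    records
        .iter()
        // `split` walks the fields front to back, so skipping one drops the first field.
        .flat_map(|rec| rec.split('\t').skip(1))
        .collect()
}

fn main() {
    let my_vec = vec!["foo\tbar\tbaz", "ffoo\tbbar\tbbaz"];
    let elements = second_and_third(&my_vec);
    dbg!(elements);
}
```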

Creating word iterator from line iterator

I have a string iterator lines that I get from stdin with
use std::io::{self, BufRead};
let mut stdin = io::stdin();
let lines = stdin.lock().lines().map(|l| l.unwrap());
The lines iterator yields values of type String, not &str. I want to create an iterator that iterates over the input words instead of lines. It seems like this should be doable but my naive attempt does not work:
let words = lines.flat_map(|l| l.split_whitespace());
The compiler tells me that l is being dropped while still borrowed, which makes sense:
error[E0597]: `l` does not live long enough
--> src/lib.rs:6:36
|
6 | let words = lines.flat_map(|l| l.split_whitespace());
| ^ - `l` dropped here while still borrowed
| |
| borrowed value does not live long enough
7 | }
| - borrowed value needs to live until here
Is there some other clean way that accomplishes this?
In your example code, lines is an iterator over the lines read in from the reader you have obtained from stdin. As you say, it returns String instances, but you are not storing them anywhere.
str::split_whitespace (which String exposes through Deref) is defined like this:
pub fn split_whitespace(&self) -> SplitWhitespace
So, it takes a reference to a string - it does not consume the string. It returns an iterator that yields string slices &str - which reference portions of the string, but don't own it.
In fact, as soon as the closure you have passed to flat_map is done with the String, no one owns it, so it is dropped. That would leave the &str values yielded by words dangling, hence the error.
One solution is to collect the lines into a vector, like this:
let lines: Vec<String> = stdin.lock().lines().map(|l| l.unwrap()).collect();
let words = lines.iter().flat_map(|l| l.split_whitespace());
The String instances are kept in the Vec<String>, which can live on so that the &str yielded by words have something to refer to.
If there were a lot of lines, and you did not want to keep them all in memory, you might prefer to do it a line at a time:
let lines = stdin.lock().lines().map(|l| l.unwrap());
let words = lines.flat_map(|l| {
l.split_whitespace()
.map(|s| s.to_owned())
.collect::<Vec<String>>()
.into_iter()
});
Here the words of each line are collected into a Vec, a line at a time. The trade-off is lower overall memory consumption, against the overhead of constructing a Vec<String> for each line and copying each word into it.
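For experimenting with the line-at-a-time version without typing into stdin, the same pipeline can be wrapped in a function that accepts any `BufRead`. This is only a sketch: the function name `words` is an assumption, and `std::io::Cursor` stands in for the locked stdin handle.

```rust
use std::io::{BufRead, Cursor};

/// Yields the whitespace-separated words of `reader`, one line at a time.
/// Like the snippets above, it panics (via `unwrap`) on an I/O error.
fn words<R: BufRead>(reader: R) -> impl Iterator<Item = String> {
    reader.lines().map(|l| l.unwrap()).flat_map(|l| {
        // The words are copied out before `l` is dropped at the end of the
        // closure, so nothing borrows from the line afterwards.
        l.split_whitespace()
            .map(|s| s.to_owned())
            .collect::<Vec<String>>()
            .into_iter()
    })
}

fn main() {
    // An in-memory reader; `io::stdin().lock()` could be passed instead.
    let input = Cursor::new("hello world\nfoo  bar baz\n");
    for word in words(input) {
        println!("{}", word);
    }
}
```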
You might have been hoping for a zero-copy implementation, which consumed the Strings that lines produces. I think that would be possible to create, by creating a split_whitespace() function that takes ownership of the String and returns an iterator that owns the string.

Why do I get "no method named push found for type Option" with a vector of vectors?

I tried to use a String vector inside another vector:
let example: Vec<Vec<String>> = Vec::new();
for _number in 1..10 {
let mut temp: Vec<String> = Vec::new();
example.push(temp);
}
I should have 10 empty String vectors inside my vector, but:
example.get(0).push(String::from("test"));
fails with
error[E0599]: no method named `push` found for type `std::option::Option<&std::vec::Vec<std::string::String>>` in the current scope
--> src/main.rs:9:20
|
9 | example.get(0).push(String::from("test"));
| ^^^^
Why does it fail? Is it even possible to have a vector "inception"?
I highly recommend reading the documentation of types and methods before you use them. At the very least, look at the function's signature. For slice::get:
pub fn get<I>(&self, index: I) -> Option<&<I as SliceIndex<[T]>>::Output>
where
I: SliceIndex<[T]>,
While there's some generics happening here, the important part is that the return type is an Option. An Option<Vec> is not a Vec.
Refer back to The Rust Programming Language's chapter on enums for more information about enums, including Option and Result. If you wish to continue using the semantics of get, you will need to:
1. Switch to get_mut, as you want to mutate the inner vector.
2. Make example mutable.
3. Handle the case where the indexed value is missing. Here I use an if let.
let mut example: Vec<_> = std::iter::repeat_with(Vec::new).take(10).collect();
if let Some(v) = example.get_mut(0) {
v.push(String::from("test"));
}
If you want the program to panic when no value is present at the index, the shortest option is the index syntax []:
example[0].push(String::from("test"));
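To make the difference between the two approaches concrete, here is a hedged sketch that wraps the get_mut variant in a helper reporting whether the push happened. The name `push_at` and the `bool` return value are choices made for this example, not anything from the standard library.

```rust
/// Pushes `value` onto the inner vector at `index`, if there is one.
/// Returns `false` instead of panicking when `index` is out of bounds.
fn push_at(grid: &mut [Vec<String>], index: usize, value: &str) -> bool {
    if let Some(inner) = grid.get_mut(index) {
        inner.push(value.to_owned());
        true
    } else {
        false
    }
}

fn main() {
    // Ten empty inner vectors.
    let mut example: Vec<Vec<String>> = vec![Vec::new(); 10];

    println!("{}", push_at(&mut example, 0, "test")); // index exists
    println!("{}", push_at(&mut example, 10, "test")); // out of bounds, nothing pushed

    // By contrast, `example[10].push(...)` would panic here.
    println!("{:?}", example[0]);
}
```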

How to get a slice from an Iterator?

I started to use clippy as a linter. Sometimes, it shows this warning:
writing `&Vec<_>` instead of `&[_]` involves one more reference and cannot be
used with non-Vec-based slices. Consider changing the type to `&[...]`,
#[warn(ptr_arg)] on by default
I changed the parameter to a slice but this adds boilerplate on the call side. For instance, the code was:
let names = args.arguments.iter().map(|arg| {
arg.name.clone()
}).collect();
function(&names);
but now it is:
let names = args.arguments.iter().map(|arg| {
arg.name.clone()
}).collect::<Vec<_>>();
function(&names);
otherwise, I get the following error:
error: the trait `core::marker::Sized` is not implemented for the type
`[collections::string::String]` [E0277]
So I wonder if there is a way to convert an Iterator to a slice or avoid having to specify the collected type in this specific case.
So I wonder if there is a way to convert an Iterator to a slice
There is not.
An iterator only provides one element at a time, whereas a slice is about getting several elements at a time. This is why you first need to collect all the elements yielded by the Iterator into a contiguous array (Vec) before being able to use a slice.
The first obvious answer is not to worry about the slight overhead, though personally I would prefer placing the type hint next to the variable (I find it more readable):
let names: Vec<_> = args.arguments.iter().map(|arg| {
arg.name.clone()
}).collect();
function(&names);
Another option would be for function to take an Iterator instead (and an iterator of references, at that):
let names = args.arguments.iter().map(|arg| &arg.name);
function(names);
After all, iterators are more general, and you can always "realize" the slice inside the function if you need to.
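What the receiving side might look like is sketched below. The question does not show `function` or the argument type, so `Argument` and `total_name_len` are stand-ins invented for this example.

```rust
/// Stand-in for the element type of `args.arguments` in the question (assumed).
struct Argument {
    name: String,
}

/// Accepts anything that can be iterated to yield `&String`:
/// a mapped iterator, a `&Vec<String>`, a `&[String]`, and so on.
fn total_name_len<'a, I>(names: I) -> usize
where
    I: IntoIterator<Item = &'a String>,
{
    names.into_iter().map(|name| name.len()).sum()
}

fn main() {
    let arguments = vec![
        Argument { name: "alpha".to_string() },
        Argument { name: "be".to_string() },
    ];
    // No clone and no intermediate Vec: the iterator yields references.
    let names = arguments.iter().map(|arg| &arg.name);
    println!("{}", total_name_len(names));
}
```

Taking `IntoIterator` rather than `Iterator` keeps call sites that already have a collected `Vec<String>` working, since `&Vec<String>` can be passed as is.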
So I wonder if there is a way to convert an Iterator to a slice
There is, in applicable cases.
I got here searching for "rust iter to slice"; for my use case, there was a solution:
fn main() {
// example struct
#[derive(Debug)]
struct A(u8);
let list = vec![A(5), A(6), A(7)];
// list_ref passed into a function somewhere ...
let list_ref: &[A] = &list;
let mut iter = list_ref.iter();
// consume some ...
let _a5: Option<&A> = iter.next();
// now want to eg. return a slice of the rest
let slice: &[A] = iter.as_slice();
println!("{:?}", slice); // [A(6), A(7)]
}
That said, .as_slice is defined on an iter of an existing slice, so the previous answerer was correct in that if you've got, eg. a map iter, you would need to collect it first (so there is something to slice from).
docs: https://doc.rust-lang.org/std/slice/struct.Iter.html#method.as_slice

Why does the compiler tell me to "consider using a `let` binding" when I already am?

What is my error and how to fix it?
fn get_m() -> Vec<i8> {
vec![1, 2, 3]
}
fn main() {
let mut vals = get_m().iter().peekable();
println!("Saw a {:?}", vals.peek());
}
(playground)
The compiler's error suggests "consider using a let binding" — but I already am:
error[E0597]: borrowed value does not live long enough
--> src/main.rs:6:45
|
6 | let mut vals = get_m().iter().peekable();
| ------- ^ temporary value dropped here while still borrowed
| |
| temporary value created here
7 | println!("Saw a {:?}", vals.peek());
8 | }
| - temporary value needs to live until here
|
= note: consider using a `let` binding to increase its lifetime
This is obviously a newbie question -- though I thought I'd written enough Rust at this point that I had a handle on the borrow checker... apparently I haven't.
This question is similar to Using a `let` binding to increase value lifetime, but doesn't involve breaking down an expression into multiple statements, so I don't think the problem is identical.
The problem is that the Peekable iterator lives to the end of the function, but it holds a reference to the vector returned by get_m, which only lasts as long as the statement containing that call.
There are actually a lot of things going on here, so let's take it step by step:
1. get_m allocates and returns a vector, of type Vec<i8>.
2. We make the call .iter(). Surprisingly, Vec<i8> has no iter method, nor does it implement any trait that has one. So there are three sub-steps here:
   a. Any method call checks whether its self value implements the Deref trait, and applies it if necessary. Vec<i8> does implement Deref, so we implicitly call its deref method. However, deref takes its self argument by reference, which means that get_m() is now an rvalue appearing in an lvalue context. In this situation, Rust creates a temporary to hold the value, and passes a reference to that. (Keep an eye on this temporary!)
   b. We call deref, yielding a slice of type &[i8] borrowing the vector's elements.
   c. Slices do have an iter method (an inherent method on [T] in current Rust; older releases provided it through a SliceExt trait). Finally! This iter also takes its self argument by reference, and returns a std::slice::Iter holding a reference to the slice.
3. We make the call .peekable(). std::slice::Iter has no peekable method of its own, but it does implement Iterator, and Iterator provides peekable (older releases did so through a separate IteratorExt trait). This takes its self by value, so the Iter is consumed, and we get a std::iter::Peekable back in return, again holding a reference to the slice.
4. This Peekable is then bound to the variable vals, which lives to the end of the function.
5. The temporary holding the original Vec<i8>, to whose elements the Peekable refers, now dies. Oops. This is the borrowed value not living long enough.
But the temporary dies there only because that's the rule for temporaries. If we give it a name, then it lasts as long as its name is in scope:
let vec = get_m();
let mut peekable = vec.iter().peekable();
println!("Saw a {:?}", peekable.peek());
I think that's the story. What still confuses me, though, is why that temporary doesn't live longer, even without a name. The Rust reference says, "A temporary's lifetime equals the largest lifetime of any reference that points to it." But that's clearly not the case here.
This is happening because you are calling .iter().peekable() directly on the temporary vector returned by get_m(). That temporary is dropped at the end of the statement, while vals still holds a reference into it.
Basically, you want something like this:
fn get_m() -> Vec<i8> {
vec![1, 2, 3]
}
fn main() {
let vals = get_m();
let mut val = vals.iter().peekable();
println!("Saw a {:?}", val.peek());
}
(Playground)
Result:
Saw a Some(1)
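If the vector itself is not needed afterwards, another option (not mentioned in the answers above, so treat it as a sketch) is to consume it with into_iter. The resulting Peekable then owns the elements and there is no temporary left to outlive. The helper name `peek_first` is made up for this example.

```rust
fn get_m() -> Vec<i8> {
    vec![1, 2, 3]
}

/// Returns a copy of the first value without advancing past it.
fn peek_first() -> Option<i8> {
    // `into_iter` takes the vector by value, so the iterator owns its
    // contents instead of borrowing from a temporary.
    let mut vals = get_m().into_iter().peekable();
    // `peek` yields `Option<&i8>`; `copied` turns that into `Option<i8>`.
    vals.peek().copied()
}

fn main() {
    println!("Saw a {:?}", peek_first());
}
```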
