Does std::iter::Map in Rust run in parallel? - rust

I have been searching for a way to parallelize map in Rust, and most answers point to the rayon crate. So I wonder: does std::iter::Map iterate sequentially by default?

I wonder if std::iter::Map iterates sequentially by default?
It does.
Rust iterators are lazy, meaning that nothing is computed unless explicitly requested. Items are then computed one by one until the iterator is exhausted.
Map, from the documentation:
An iterator that maps the values of iter with f.
It is an iterator adaptor: it applies a transformation function to each item of the underlying iterator, one by one (when requested, through the next method).
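As a minimal sketch of that behavior, the following records which items the closure has seen after a given number of next calls. The helper name calls_after is made up for this illustration, and RefCell is used only so the log can be inspected afterwards.

```rust
use std::cell::RefCell;

/// Pulls `pulls` items from a mapped iterator and returns the inputs
/// the closure was called with, in call order.
/// (`calls_after` is a hypothetical helper, not part of std.)
fn calls_after(pulls: usize) -> Vec<i32> {
    let seen = RefCell::new(Vec::new());
    let mut doubled = (1..=3).map(|x| {
        seen.borrow_mut().push(x); // record every closure invocation
        x * 2
    });
    for _ in 0..pulls {
        doubled.next(); // each call drives the closure at most once
    }
    drop(doubled); // release the borrow of `seen` held by the closure
    seen.into_inner()
}

fn main() {
    // Nothing runs until an item is requested.
    assert!(calls_after(0).is_empty());
    // Items are processed one at a time, in order.
    assert_eq!(calls_after(2), vec![1, 2]);
    assert_eq!(calls_after(3), vec![1, 2, 3]);
}
```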

Related

Multiplying nested vec by scalar

I have a nested Vec<Vec<f64>> in Rust, and I want to multiply each f64 in place by a value DT. I am currently doing:
dcm_dot.iter_mut().map(|a| a.iter_mut().map(|b| *b * DT));
This compiles; however, I am getting a lazy-iterator warning that the .map()s must be consumed. Is there a more idiomatic way to do this?
Iterators in Rust are lazy, so unless you consume the result of .map(), the closure inside will not even be executed.
To ensure that your code actually changes the Vec, use .for_each() instead, and assign to each element in place (for example with *b *= DT) rather than computing a product that is then discarded.
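A minimal sketch of the for_each approach might look like this. The value of DT and the function name scale_in_place are assumptions made for the example; only dcm_dot and DT appear in the question.

```rust
// Assumed value for illustration; the question does not say what DT is.
const DT: f64 = 0.5;

/// Multiplies every element of the nested vector by DT, in place.
/// (`scale_in_place` is a hypothetical name.)
fn scale_in_place(dcm_dot: &mut [Vec<f64>]) {
    dcm_dot
        .iter_mut()
        .for_each(|row| row.iter_mut().for_each(|b| *b *= DT));
}

fn main() {
    let mut dcm_dot = vec![vec![1.0, 2.0], vec![4.0]];
    scale_in_place(&mut dcm_dot);
    assert_eq!(dcm_dot, vec![vec![0.5, 1.0], vec![2.0]]);
}
```

A plain loop such as for b in dcm_dot.iter_mut().flatten() { *b *= DT; } should do the same thing, if a loop reads better in the surrounding code.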

What is the most idiomatic way to write iterators that map an uncertain number of input items to other objects in Rust?

I'm trying to implement a Lexer. Since lexers emit tokens, I suppose that we can perceive a Lexer as a special iterator that maps certain chunks of chars to Tokens. I therefore expect Lexer to store an Iterator<Item=char> and manipulate that iterator instead of a &str to enable maximum flexibility.
struct Lexer<T: Iterator<Item = char>> {
    source: T,
}
Yet I find it hard to manipulate the iterator, since almost all iterator adaptors take ownership, and with generics I cannot change the type of T at runtime, unless I switch to Box.
self.source.take_while(|x| x.is_whitespace())
A possible workaround is to require that the iterator implement Clone, use a clone every time I want to transform it, remember how many characters I have seen, and call next that many times. I believe that it is too clumsy.
I wonder if there is an idiomatic way to write iterators that map an uncertain number of input items (in this case, chars) into another object (in this case, Tokens)?
The most elegant way I have come up with so far is to use while let and similar constructs, which are not very fluent in style. I inspected the implementation of GroupBy in itertools and found that it uses the while let approach too.
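One possible shape for the while let approach, as a sketch only: wrap the source in std::iter::Peekable so the lexer can look at the next char without consuming it, and borrow the iterator mutably instead of handing it to an adaptor. The Token variants and the helper take_while_peek are invented for this example, and storing Peekable<T> rather than T is a deliberate change from the struct in the question.

```rust
use std::iter::Peekable;

// Hypothetical token type; the question does not define one.
#[derive(Debug, PartialEq)]
enum Token {
    Number(String),
    Ident(String),
    Symbol(char),
}

struct Lexer<T: Iterator<Item = char>> {
    source: Peekable<T>,
}

impl<T: Iterator<Item = char>> Lexer<T> {
    fn new(source: T) -> Self {
        Lexer { source: source.peekable() }
    }

    /// Consumes chars while `pred` holds, borrowing the iterator
    /// instead of taking ownership of it.
    fn take_while_peek(&mut self, pred: impl Fn(char) -> bool) -> String {
        let mut out = String::new();
        while let Some(&c) = self.source.peek() {
            if !pred(c) {
                break; // leave the non-matching char for the next token
            }
            out.push(c);
            self.source.next();
        }
        out
    }
}

impl<T: Iterator<Item = char>> Iterator for Lexer<T> {
    type Item = Token;

    fn next(&mut self) -> Option<Token> {
        self.take_while_peek(char::is_whitespace); // skip whitespace
        let c = *self.source.peek()?; // None once the input is exhausted
        let token = if c.is_ascii_digit() {
            Token::Number(self.take_while_peek(|c| c.is_ascii_digit()))
        } else if c.is_alphabetic() {
            Token::Ident(self.take_while_peek(char::is_alphanumeric))
        } else {
            self.source.next();
            Token::Symbol(c)
        };
        Some(token)
    }
}

fn main() {
    let tokens: Vec<Token> = Lexer::new("x1 = 42".chars()).collect();
    assert_eq!(
        tokens,
        vec![
            Token::Ident("x1".to_string()),
            Token::Symbol('='),
            Token::Number("42".to_string()),
        ]
    );
}
```

Calling self.source.by_ref().take_while(...) also avoids moving the iterator, but take_while consumes the first non-matching item, which is why the sketch peeks instead.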

Efficiency of flattening and collecting slices

If one uses the standard .flatten().collect::<Box<[T]>>() on an Iterator<Item=&[T]> where T: Copy, does it:
perform a single allocation; and
use memcpy to copy each item to the destination
or does it do something less efficient?
Box<[T]> does not implement FromIterator<&T>, so I'll assume your actual inner iterator is something that yields owned Ts.
FromIterator<T> for Box<[T]> forwards to Vec<T>, which uses size_hint() to reserve space for lower + 1 items, and reallocates as it grows beyond that (moving elements as necessary). So the question is, what does Flatten<I> return for size_hint?
The implementation of Iterator::size_hint for Flatten<I> forwards to the internal struct FlattenCompat<I>, which is a little complicated because it supports double-ended iteration, but ultimately returns (0, None) if the outer iterator has not been advanced or exhausted.
So the answer to your question is: it does something less efficient. Namely, (unless you have already called next or next_back on the iterator at least once) it creates an empty Vec<T> and grows it progressively according to whatever growth strategy Vec uses (which is unspecified, but guaranteed by the documentation to result in O(1) amortized push).
This isn't an artificial limitation; it is fundamental to the way Flatten works. The only way you could pre-calculate the size of the flattened iterator is by exhausting the outer iterator and adding up all the inner size_hints. This is a bad idea both because it doesn't always work (the inner iterators may not return useful size_hints) and because you also have to find a way to keep the inner iterators around after exhausting the outer one; there's no solution that would be acceptable for a general purpose iterator adapter.
If you know something about your particular iterator that enables you to know what the final size should be, you can reserve the allocation yourself by calling Vec::with_capacity and use Extend to fill it from the flattened iterator, rather than using collect.
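As a rough sketch of that last suggestion, assuming the inner items are slices whose lengths can be summed up front (the function name flatten_with_capacity is made up for the example):

```rust
/// Flattens a slice of slices into a boxed slice, reserving the full
/// capacity before filling it. (`flatten_with_capacity` is a hypothetical name.)
fn flatten_with_capacity<T: Copy>(chunks: &[&[T]]) -> Box<[T]> {
    // The final length is known here because every chunk reports its len().
    let total: usize = chunks.iter().map(|chunk| chunk.len()).sum();
    let mut out: Vec<T> = Vec::with_capacity(total);
    // Extend fills the already reserved buffer from the flattened iterator.
    out.extend(chunks.iter().flat_map(|chunk| chunk.iter().copied()));
    out.into_boxed_slice()
}

fn main() {
    let a = [1, 2];
    let b = [3, 4, 5];
    let flat = flatten_with_capacity(&[&a[..], &b[..]]);
    assert_eq!(flat.to_vec(), vec![1, 2, 3, 4, 5]);
}
```

When the inner items really are slices, calling out.extend_from_slice(chunk) once per chunk is another option that copies a whole slice per call; whether that compiles down to memcpy is left to the optimizer and is not something this sketch guarantees.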

How do I collect the values of a HashMap into a vector?

I cannot find a way to collect the values of a HashMap into a Vec in the documentation. I have score_table: HashMap<Id, Score> and I want to get all the Scores into all_scores: Vec<Score>.
I was tempted to use the values method (all_scores = score_table.values()), but it does not work since values is not a Vec.
I know that Values implements the ExactSizeIterator trait, but I do not know how to collect all values of an iterator into a vector without manually writing a for loop and pushing the values into the vector one by one.
I also tried to use std::iter::FromIterator, but ended up with something like:
all_scores = Vec::from_iter(score_table.values());
expected type `std::vec::Vec<Score>`
found type `std::vec::Vec<&Score>`
Thanks to the question "Hash map macro refuses to type-check, failing with a misleading (and seemingly buggy) error message?", I changed it to:
all_scores = Vec::from_iter(score_table.values().cloned());
and cargo check no longer reports errors.
Is this a good way to do it?
The method Iterator::collect is designed for this specific task. You're right that you need .cloned() if you want a vector of actual values instead of references (for types that implement Copy, like primitives, .copied() does the same job), so the code looks like this:
all_scores = score_table.values().cloned().collect();
Internally, collect() just uses FromIterator, but it also infers the type of the output. Sometimes there isn't enough information to infer the type, so you may need to explicitly specify the type you want, like so:
all_scores = score_table.values().cloned().collect::<Vec<Score>>();
If you don't need score_table anymore, you can transfer the ownership of Score values to all_scores by:
let all_scores: Vec<Score> = score_table.into_iter()
    .map(|(_id, score)| score)
    .collect();
This approach will be faster and consume less memory than the clone approach by @apetranzilla. It also works with any type, not only types that implement Clone.
There are three useful methods on HashMaps, which all return iterators:
values() borrows the collection and returns references (&T).
values_mut() gives mutable references &mut T which is useful to modify elements of the collection without destroying score_table.
into_values() gives you the elements directly: T! The iterator takes ownership of all the elements. This means that score_table no longer owns them, so you can't use score_table anymore!
In your example, you call values() to get &T references, then convert them to owned values T via .cloned().
Instead, if we have an iterator of owned values, then we can convert it to a Vec using Iterator::collect():
let all_scores: Vec<Score> = score_table.into_values().collect();
Sometimes, you may need to specify the collecting type:
let all_scores = score_table.into_values().collect::<Vec<Score>>();
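To see the three methods side by side, here is a small sketch. The aliases Id and Score are stand-ins chosen for the example (the question does not show their definitions), and bump_and_collect is a hypothetical helper. The results are sorted because HashMap iteration order is unspecified.

```rust
use std::collections::HashMap;

// Stand-in types; the real Id and Score are not shown in the question.
type Id = u32;
type Score = u32;

/// Returns the scores before and after incrementing each one.
/// (`bump_and_collect` is a hypothetical helper.)
fn bump_and_collect(mut score_table: HashMap<Id, Score>) -> (Vec<Score>, Vec<Score>) {
    // values(): borrows the map and yields &Score, so clone into an owned Vec.
    let mut before: Vec<Score> = score_table.values().cloned().collect();

    // values_mut(): yields &mut Score, so the map can be edited in place.
    score_table.values_mut().for_each(|score| *score += 1);

    // into_values(): consumes the map and yields owned Score values.
    let mut after: Vec<Score> = score_table.into_values().collect();

    // Iteration order is unspecified, so sort to get a stable result.
    before.sort_unstable();
    after.sort_unstable();
    (before, after)
}

fn main() {
    let score_table = HashMap::from([(1, 10), (2, 20)]);
    let (before, after) = bump_and_collect(score_table);
    assert_eq!(before, vec![10, 20]);
    assert_eq!(after, vec![11, 21]);
}
```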

Create an iterator from a single element

I would like to prepend an element to an iterator. Specifically, I would like to create an iterator that steps through the sequence [2, 3, 5, 7, 9, ...] up to some maximum. The best I've been able to come up with is
range_step_inclusive(2,2,1).chain(range_step_inclusive(3, max, 2))
But the first iterator is kind of a hack to get the single element 2 as an iterator. Is there a more idiomatic way of creating a single-element iterator (or of prepending an element to an iterator)?
This is the exact use case of std::iter::once.
Creates an iterator that yields an element exactly once.
This is commonly used to adapt a single value into a chain of other kinds of iteration. Maybe you have an iterator that covers almost everything, but you need an extra special case. Maybe you have a function which works on iterators, but you only need to process one value.
range_step_inclusive is long gone, so let's also use the inclusive range syntax (..=):
iter::once(2).chain((3..=max).step_by(2))
You can use the Option by-value iterator, into_iter:
Some(2).into_iter().chain((3..).step_by(2))
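Both variants can be wrapped in small functions and compared; this is only a sketch, and the function names are invented for the example. Note that each version yields 2 even when max is below 2, which may or may not be what the caller wants.

```rust
use std::iter;

/// 2 followed by the odd numbers from 3 up to and including `max`.
/// (Hypothetical helper built on iter::once.)
fn two_then_odds(max: u32) -> impl Iterator<Item = u32> {
    iter::once(2).chain((3..=max).step_by(2))
}

/// Same sequence, using Option's by-value iterator for the single element.
fn two_then_odds_via_option(max: u32) -> impl Iterator<Item = u32> {
    Some(2).into_iter().chain((3..=max).step_by(2))
}

fn main() {
    let a: Vec<u32> = two_then_odds(11).collect();
    let b: Vec<u32> = two_then_odds_via_option(11).collect();
    assert_eq!(a, vec![2, 3, 5, 7, 9, 11]);
    assert_eq!(a, b);
}
```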
This is no less boilerplate, but I think it is clearer:
Repeat::new(2i).take(1).chain(range_step_inclusive(3, max, 2))
Repeat::new creates an endless iterator from the value you provide, and take yields just the first value of that iterator. The rest is nothing new.
You can run this example on the playpen following the link: http://is.gd/CZbxD3
