How to map over only the Some() values in an iterator? - rust

Both of the following work (in 2 invocations), but they feel too verbose.
fn main() {
let v = vec![Some(0), Some(1), None, Some(2)];
assert_eq!(
vec![0,2,4],
v.iter()
.filter(|x| x.is_some())
.map(|x| x.unwrap() * 2)
.collect::<Vec<u8>>());
assert_eq!(
vec![0,2,4],
v.iter()
.filter_map(|x| *x)
.map(|x| x*2)
.collect::<Vec<u8>>());
}
filter_map is close to what I want:
[filter_map] removes the Option layer automatically. If your
mapping is already returning an Option and you want to skip over
Nones, then filter_map is much, much nicer to use.
doc.rust-lang.org
But it doesn't unwrap the value in the closure because it expects an Option to be returned.
Is there a way to both filter on only the Some values, and map over those values with a single invocation?
Such as:
// Fake, does not work
fn main() {
let v = vec![Some(0), Some(1), None, Some(2)];
assert_eq!(
vec![0,2,4],
v.iter()
.map_only_some(|x| x * 2)
.collect::<Vec<u8>>());
}

Well, I figured it out: Option implements IntoIterator (it yields zero or one item), so you just have to flatten the iterator:
// v.iter() yields &Option<u8>; flatten() keeps only the value inside each Some
assert_eq!(
vec![0,2,4],
v.iter()
.flatten()
.map(|x| x * 2)
.collect::<Vec<u8>>());
It's not in a single invocation, but it's much much cleaner and, I think, idiomatic.
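To make the flatten approach concrete, here is a minimal sketch. The function names `double_somes` and `option_len` are made up for illustration; the point is only that an Option behaves like an iterator with zero or one item, which is why flatten drops the None entries.

```rust
/// Doubles every `Some` value and skips every `None`.
/// (`double_somes` is a hypothetical helper name.)
fn double_somes(values: &[Option<u8>]) -> Vec<u8> {
    values
        .iter() // yields &Option<u8>
        .flatten() // &Option<u8> iterates over zero or one &u8
        .map(|x| x * 2) // &u8 * u8 gives u8
        .collect()
}

/// Counts the items an `Option` yields when treated as an iterator:
/// one for `Some`, zero for `None`.
fn option_len(value: Option<u8>) -> usize {
    value.into_iter().count()
}
```

Calling `double_somes(&[Some(0), Some(1), None, Some(2)])` gives the same result as the snippet above.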

Is there a way to both filter on only the Some values, and map over those values with a single invocation?
You are already using it: filter_map().
fn main() {
let v = vec![Some(0), Some(1), None, Some(2)];
assert_eq!(
vec![0, 2, 4],
v.iter()
.filter_map(|x| x.map(|x| x * 2))
.collect::<Vec<u8>>()
);
}
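filter_map tends to pay off most when the mapping step itself can fail. The following is a small sketch under that assumption; `parse_and_double` is a hypothetical name, and dropping values that would overflow a u8 is a choice made for this example, not something the question asks for.

```rust
/// Parses each string as a `u8`, doubles the values that parse, and drops
/// both the parse failures and any value whose double would overflow.
/// (`parse_and_double` is a hypothetical helper name.)
fn parse_and_double(inputs: &[&str]) -> Vec<u8> {
    inputs
        .iter()
        .filter_map(|s| {
            s.parse::<u8>() // Result<u8, ParseIntError>
                .ok() // Option<u8>, None on a parse error
                .and_then(|n| n.checked_mul(2)) // None on overflow
        })
        .collect()
}
```

Because the closure returns an Option, the filtering and the mapping happen in a single filter_map call.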

Related

How do I avoid allocations in Iterator::flat_map?

I have a Vec of integers and I want to create a new Vec which contains those integers and squares of those integers. I could do this imperatively:
let v = vec![1, 2, 3];
let mut new_v = Vec::new(); // new instead of with_capacity for simplicity's sake.
for &x in v.iter() {
new_v.push(x);
new_v.push(x * x);
}
println!("{:?}", new_v);
but I want to use iterators. I came up with this code:
let v = vec![1, 2, 3];
let new_v: Vec<_> = v.iter()
.flat_map(|&x| vec![x, x * x])
.collect();
println!("{:?}", new_v);
but it allocates an intermediate Vec in the flat_map function.
How to use flat_map without allocations?
As of Rust 1.53.0, this can be written with just the array literal:
let v = vec![1, 2, 3];
let new_v: Vec<_> = v.iter()
.flat_map(|&x| [x, x * x])
.collect();
Rust 1.53.0 implements IntoIterator for arrays, so the vec![] and workarounds in previous solutions are no longer needed. This works on all editions.
If your iterator is small and you don't want any external dependencies, a short iterator can be constructed from std::iter::once and std::iter::Iterator::chain. For example,
use std::iter;
let v = vec![1, 2, 3];
let new_v: Vec<_> = v
.iter()
.flat_map(|&x| iter::once(x).chain(iter::once(x * x)))
.collect();
println!("{:?}", new_v);
This could be made into a macro, though be aware that using this for too many elements may cause the recursion limit to be reached. If you're making an iterator for more than a few dozen elements, it's probably not too bad to have an allocation. If you really need the slight increase in performance, nnnmmm's solution is probably better.
macro_rules! small_iter {
() => { std::iter::empty() };
($x: expr) => {
std::iter::once($x)
};
($x: expr, $($y: tt)*) => {
std::iter::once($x).chain(small_iter!($($y)*))
};
}
fn main() {
let v = vec![1, 2, 3];
let new_v: Vec<_> = v
.iter()
.flat_map(|&x| small_iter!(x, x * x))
.collect();
println!("{:?}", new_v);
}
As of version 1.51.0, the struct core::array::IntoIter has been stabilized. You can use it like this:
use core::array;
let v = vec![1, 2, 3];
let new_v: Vec<_> = v.iter()
.flat_map(|&x| array::IntoIter::new([x, x * x]))
.collect();
The documentation warned that this might be deprecated once IntoIterator was implemented for arrays. That happened in Rust 1.53.0, so on newer toolchains you can pass the array to flat_map directly; on 1.51 and 1.52 this is the easiest way to do it.
You can use an ArrayVec (from the external arrayvec crate) for this.
use arrayvec::ArrayVec; // external crate, not part of std
let v = vec![1, 2, 3];
let new_v: Vec<_> = v.iter()
.flat_map(|&x| ArrayVec::from([x, x * x]))
.collect();
Making arrays by-value iterators, so that you wouldn't need ArrayVec, has been discussed; see https://github.com/rust-lang/rust/issues/25725 and the linked PRs.

How to get the index of the current element being processed in the iteration without a for loop?

I have read How to iterate a Vec<T> with the indexed position? where the answer is to use enumerate in a for-loop.
But if I don't use a for-loop like this:
fn main() {
let v = vec![1; 10]
.iter()
.map(|&x| x + 1 /* + index */ ) // <--
.collect::<Vec<_>>();
print!("v{:?}", v);
}
How could I get the index in the above closure?
You can also use enumerate!
let v = vec![1; 10]
.iter()
.enumerate()
.map(|(i, &x)| x + i)
.collect::<Vec<_>>();
println!("v{:?}", v); // prints v[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Let's see how this works. Iterator::enumerate returns Enumerate<Self>. That type also implements Iterator:
impl<I> Iterator for Enumerate<I>
where
I: Iterator,
{
type Item = (usize, <I as Iterator>::Item);
// ...
}
As you can see, the new iterator yields tuples of the index and the original
value.
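As a small sketch of what those tuples look like in practice (the function names `add_index` and `with_positions` are invented for this example):

```rust
/// Adds each element's index to its value, as in the answer above.
/// (`add_index` is a hypothetical helper name.)
fn add_index(values: &[usize]) -> Vec<usize> {
    values
        .iter()
        .enumerate() // yields (usize, &usize)
        .map(|(i, &x)| x + i) // the `&x` pattern copies the value out
        .collect()
}

/// Collects the raw `(index, value)` tuples so they can be inspected.
fn with_positions<'a>(words: &[&'a str]) -> Vec<(usize, &'a str)> {
    words.iter().copied().enumerate().collect()
}
```

The index is always a usize and always starts at 0, regardless of the element type.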
You can simply use enumerate:
fn main() {
let v = vec![1; 10]
.iter()
.enumerate()
.map(|(i, x)| i + x)
.collect::<Vec<_>>();
print!("v{:?}", v);
}
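This works because a for loop simply consumes an iterator (more precisely, anything that implements IntoIterator), so enumerate can be used outside a for loop just as well as inside one: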
In slightly more abstract terms:
for var in expression {
code
}
The expression is an iterator.
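To illustrate that equivalence, here is a sketch with two hypothetical functions that compute the same sum, once with a for loop over enumerate and once with adaptors only:

```rust
/// Sums `index + value` with an explicit for loop.
/// (Both function names are made up for this example.)
fn sum_with_for(values: &[usize]) -> usize {
    let mut total = 0;
    for (i, &x) in values.iter().enumerate() {
        total += i + x;
    }
    total
}

/// Computes the same sum with iterator adaptors and no loop.
fn sum_with_adaptors(values: &[usize]) -> usize {
    values.iter().enumerate().map(|(i, &x)| i + x).sum()
}
```

Both versions drive the same Enumerate iterator; only the consumer differs.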

What type signature to use for an iterator generated from a slice?

I have this toy example, but it's what I'm trying to accomplish:
fn lazy_vec() {
let vec: Vec<i64> = vec![1, 2, 3, 4, 5];
let mut iter: Box<Iterator<Item = i64>> = Box::new(vec.into_iter());
iter = Box::new(iter.map(|x| x + 1));
// potentially do additional similar transformations to iter
println!("{:?}", iter.collect::<Vec<_>>());
}
This (if I'm not mistaken) is a lazy iterator pattern, and the actual map operation doesn't occur until .collect() is called. I want to do the same thing with slices:
fn lazy_slice() {
let vec: Vec<i64> = vec![1, 2, 3, 4, 5];
let slice: &[i64] = &vec[..3];
let mut iter: Box<Iterator<Item = i64>> = Box::new(slice.into_iter());
iter = Box::new(iter.map(|x| x + 1));
// potentially do additional similar transformations to iter
println!("{:?}", iter.collect::<Vec<_>>());
}
This results in a type mismatch:
error[E0271]: type mismatch resolving `<std::slice::Iter<'_, i64> as std::iter::Iterator>::Item == i64`
--> src/main.rs:4:47
|
4 | let mut iter: Box<Iterator<Item = i64>> = Box::new(slice.into_iter());
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected reference, found i64
|
= note: expected type `&i64`
found type `i64`
= note: required for the cast to the object type `std::iter::Iterator<Item=i64>`
I can't figure out what I need to do to resolve this error. The second note made me think I needed:
iter = Box::new(iter.map(|x| x + 1) as Iterator<Item = i64>);
or
iter = Box::new(iter.map(|x| x + 1)) as Box<Iterator<Item = i64>>;
These fail with other errors depending on the exact syntax (e.g. expected reference, found i64, or expected i64, found &i64). I've tried other ways to declare the types involved, but I'm basically just blindly adding & and * in places and not making any progress.
What am I missing here? What do I need to change in order to make this compile?
Edit
Here's a slightly more concrete example - I need iter to be mut so that I can compose an unknown number of such transformations before actually invoking .collect(). My impression was this was a somewhat common pattern, apologies if that wasn't correct.
fn lazy_vec(n: i64) {
let vec: Vec<i64> = vec![1, 2, 3, 4, 5];
let mut iter: Box<Iterator<Item = i64>> = Box::new(vec.into_iter());
for _ in 0..n {
iter = Box::new(iter.map(|x| x + 1));
}
println!("{:?}", iter.collect::<Vec<_>>());
}
I'm aware I could rewrite this specific task in a simpler way (e.g. a single map that adds n to each element) - it's an oversimplified MCVE of the problem I'm running into. My issue is this works for lazy_vec, but I'm not sure how to do the same with slices.
Edit 2
I'm just learning Rust and some of the nomenclature and concepts are new to me. Here's what I'm envisioning doing in Python, for comparison. My intent is to do the same thing with slices that I can currently do with vectors.
#!/usr/bin/env python3
import itertools
ls = [i for i in range(10)]
def lazy_work(input):
for i in range(10):
input = (i + 1 for i in input)
# at this point no actual work has been done
return input
print("From list: %s" % list(lazy_work(ls)))
print("From slice: %s" % list(lazy_work(itertools.islice(ls, 5))))
Obviously in Python there's no issues with typing, but hopefully that more clearly demonstrates my intent?
As discussed in What is the difference between iter and into_iter?, these methods create iterators which yield different types when called on a Vec compared to a slice.
[T]::iter and [T]::into_iter both return an iterator which yields values of type &T. That means that the returned value doesn't implement Iterator<Item = i64> but instead Iterator<Item = &i64>, as the error message states.
However, your subsequent map statements change the type of the iterator's item to an i64, which means the type of the iterator would also need to change. As an analogy, you've essentially attempted this:
let mut a: &i64 = &42;
a = 99;
Iterator::cloned exists to make clones of the iterated values. In this case, it converts each &i64 to an i64, essentially dereferencing the value (Iterator::copied does the same for Copy types):
fn lazy_slice(n: i64) {
let array = [1i64, 2, 3, 4, 5];
// `dyn` is mandatory for trait objects in the 2021 edition.
let mut iter: Box<dyn Iterator<Item = i64>> = Box::new(array.iter().cloned());
for _ in 0..n {
iter = Box::new(iter.map(|x| x + 1));
}
println!("{:?}", iter.collect::<Vec<_>>());
}
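If the boxed iterator has to leave the function, the borrow of the slice needs to appear in the return type. The sketch below assumes a hypothetical helper `lazy_add` and uses Iterator::copied instead of cloned, which is equivalent for i64; it is one possible shape, not the only one.

```rust
/// Lazily adds 1 to every element `n` times; no work happens until the
/// returned iterator is consumed. (`lazy_add` is a hypothetical name.)
fn lazy_add<'a>(items: &'a [i64], n: usize) -> Box<dyn Iterator<Item = i64> + 'a> {
    // `copied()` turns the slice's `&i64` items into `i64`, so every later
    // stage has the same item type and can be stored in the same variable.
    let mut iter: Box<dyn Iterator<Item = i64> + 'a> = Box::new(items.iter().copied());
    for _ in 0..n {
        iter = Box::new(iter.map(|x| x + 1));
    }
    iter
}
```

The `+ 'a` bound tells the compiler that the boxed iterator borrows from the slice, so it cannot outlive the data it reads from.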

Is there any way to insert multiple entries into a HashMap at once in Rust?

Is there any way to insert multiple entries into a HashMap at once in Rust? Or to initialize it with multiple entries? Anything other than manually calling insert on every single element you're inserting?
Edit for an example using English letter frequencies:
I basically want:
let frequencies = {
'a': 0.08167,
'b': 0.01492,
...
'z': 0.00074
}
I know I can achieve the same result by doing a for loop like the following, but I want to know if there is a way to do this without creating additional arrays and then looping over them, or a more elegant solution in general.
let mut frequencies = HashMap::new();
let letters = ['a','b','c', ...... 'z'];
let freqs = [0.08167, 0.01492, 0.02782, ......., 0.00074];
for i in 0..26 {
frequencies.insert(letters[i], freqs[i]);
}
For a literal, I could use the answer here, which will probably work fine for this example, but I'm curious whether there's a way to do this without it being a literal, in case this comes up in the future.
Is there any way to insert multiple entries into a HashMap at once in Rust?
Yes, you can extend a HashMap with values from an Iterator, like this:
use std::collections::HashMap;
fn main() {
let mut map = HashMap::new();
map.extend((1..3).map(|n| (format!("{}*2=", n), n * 2)));
map.extend((7..9).map(|n| (format!("{}*2=", n), n * 2)));
println!("{:?}", map); // Prints e.g. {"1*2=": 2, "8*2=": 16, "7*2=": 14, "2*2=": 4} (HashMap order is unspecified).
}
It is even a bit faster than calling insert manually, because extend uses the size hint provided by the Iterator to reserve some space beforehand.
Check out the source code of the method here, in map.rs.
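Applied to the letter-frequency example from the question, extend can be fed by zipping the two arrays together. This is a sketch only; `frequency_map` and `pair_count_hint` are hypothetical names, and the second function just exposes the size hint that extend gets to see.

```rust
use std::collections::HashMap;

/// Builds a letter -> frequency map from two parallel slices.
/// If the slices differ in length, `zip` stops at the shorter one.
fn frequency_map(letters: &[char], freqs: &[f64]) -> HashMap<char, f64> {
    let pairs = letters.iter().copied().zip(freqs.iter().copied());
    let mut map = HashMap::new();
    map.extend(pairs); // inserts every (char, f64) pair
    map
}

/// Returns the size hint of the zipped iterator, i.e. the information
/// `extend` can use to reserve space up front.
fn pair_count_hint(letters: &[char], freqs: &[f64]) -> (usize, Option<usize>) {
    letters.iter().zip(freqs.iter()).size_hint()
}
```

No intermediate collection is built: the pairs are produced lazily and consumed directly by extend.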
Or to initialize it with multiple entries?
This is possible as well, thanks to HashMap implementing the FromIterator trait. When a collection implements FromIterator, you can use the Iterator::collect shorthand to construct it. Consider the following examples, all of them generating the same map:
use std::collections::HashMap;
fn main() {
let mut map: HashMap<_, _> = (1..3).map(|n| (format!("{}*2=", n), n * 2)).collect();
map.extend((7..9).map(|n| (format!("{}*2=", n), n * 2)));
println!("{:?}", map); // Prints e.g. {"1*2=": 2, "8*2=": 16, "7*2=": 14, "2*2=": 4} (HashMap order is unspecified).
}
use std::collections::HashMap;
fn main() {
let map: HashMap<_, _> = (1..3)
.chain(7..9)
.map(|n| (format!("{}*2=", n), n * 2))
.collect();
println!("{:?}", map); // Prints e.g. {"1*2=": 2, "8*2=": 16, "7*2=": 14, "2*2=": 4} (HashMap order is unspecified).
}
use std::collections::HashMap;
use std::iter::FromIterator;
fn main() {
let iter = (1..3).chain(7..9).map(|n| (format!("{}*2=", n), n * 2));
let map = HashMap::<String, u32>::from_iter(iter);
println!("{:?}", map); // Prints e.g. {"1*2=": 2, "8*2=": 16, "7*2=": 14, "2*2=": 4} (HashMap order is unspecified).
}
use std::collections::HashMap;
fn main() {
let pairs = [
("a", 1),
("b", 2),
("c", 3),
("z", 50),
];
println!("1. Insert multiple entries into a HashMap at once");
let mut map = HashMap::new();
map.extend(pairs);
println!("map: {map:#?}\n");
println!("2. Initialize with multiple entries");
let map = HashMap::from([
("a", 1),
("b", 2),
("c", 3),
("z", 50),
]);
println!("map: {map:#?}\n");
println!("3. Initialize with multiple entries");
let map = HashMap::from(pairs);
println!("map: {map:#?}\n");
println!("4. Initialize with multiple entries");
let map: HashMap<_, _> = pairs.into();
println!("map: {map:#?}");
}
See the Rust Playground.

How do I concatenate two slices in Rust?

I want to take the first x and the last x elements from a vector and concatenate them. I have the following code:
fn main() {
let v = (0u64 .. 10).collect::<Vec<_>>();
let l = v.len();
vec![v.iter().take(3), v.iter().skip(l-3)];
}
This gives me the error
error[E0308]: mismatched types
--> <anon>:4:28
|
4 | vec![v.iter().take(3), v.iter().skip(l-3)];
| ^^^^^^^^^^^^^^^^^^ expected struct `std::iter::Take`, found struct `std::iter::Skip`
<anon>:4:5: 4:48 note: in this expansion of vec! (defined in <std macros>)
|
= note: expected type `std::iter::Take<std::slice::Iter<'_, u64>>`
= note: found type `std::iter::Skip<std::slice::Iter<'_, u64>>`
How do I get my vec of 1, 2, 3, 8, 9, 10? I am using Rust 1.12.
Just use .concat() on a slice of slices:
fn main() {
let v = (0u64 .. 10).collect::<Vec<_>>();
let l = v.len();
let first_and_last = [&v[..3], &v[l - 3..]].concat();
println!("{:?}", first_and_last);
// The output is `[0, 1, 2, 7, 8, 9]`
}
This creates a new vector, and it works with an arbitrary number of slices.
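As a sketch of the "arbitrary number of slices" point, a hypothetical wrapper `concat_all` can take any number of sub-slices and hand them to concat:

```rust
/// Concatenates any number of `u64` slices into one new vector.
/// (`concat_all` is a hypothetical helper name.)
fn concat_all(parts: &[&[u64]]) -> Vec<u64> {
    // `concat` copies every element of every part, in order.
    parts.concat()
}
```

For example, `concat_all(&[&v[..2], &v[4..5], &v[8..]])` stitches three separate ranges of `v` together.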
Ok, first of all, your initial sequence definition is wrong. You say you want 1, 2, 3, 8, 9, 10 as output, so it should look like:
let v = (1u64 .. 11).collect::<Vec<_>>();
Next, you say you want to concatenate slices, so let's actually use slices:
let head = &v[..3];
let tail = &v[l-3..];
At this point, it's really down to which approach you like the most. You can turn those slices into iterators, chain, then collect...
let v2: Vec<_> = head.iter().chain(tail.iter()).collect();
...or make a vec and extend it with the slices directly...
let mut v3 = vec![];
v3.extend_from_slice(head);
v3.extend_from_slice(tail);
...or extend using more general iterators (which will become equivalent in the future with specialisation, but I don't believe it's as efficient just yet)...
let mut v4: Vec<u64> = vec![];
v4.extend(head);
v4.extend(tail);
...or you could use Vec::with_capacity and push in a loop, or do the chained iterator thing, but using extend... but I have to stop at some point.
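The last variants mentioned above are not spelled out, so here is one possible sketch combining Vec::with_capacity with a chained iterator passed to extend. The name `first_and_last` is hypothetical, and the function assumes `n <= v.len()`; it panics otherwise, and the two ranges overlap if `2 * n > v.len()`.

```rust
/// Returns the first `n` and the last `n` elements of `v` in a new vector.
/// Assumes `n <= v.len()`; slicing panics otherwise.
/// (`first_and_last` is a hypothetical helper name.)
fn first_and_last(v: &[u64], n: usize) -> Vec<u64> {
    let head = &v[..n];
    let tail = &v[v.len() - n..];
    // Reserve the exact final size so `extend` never has to reallocate.
    let mut out = Vec::with_capacity(head.len() + tail.len());
    // `copied()` turns the `&u64` items into `u64` values.
    out.extend(head.iter().chain(tail.iter()).copied());
    out
}
```

Unlike the v2 variant, which collects references (a `Vec<&u64>`), this returns owned values.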
Full example code:
fn main() {
let v = (1u64 .. 11).collect::<Vec<_>>();
let l = v.len();
let head = &v[..3];
let tail = &v[l-3..];
println!("head: {:?}", head);
println!("tail: {:?}", tail);
let v2: Vec<_> = head.iter().chain(tail.iter()).collect();
println!("v2: {:?}", v2);
let mut v3 = vec![];
v3.extend_from_slice(head);
v3.extend_from_slice(tail);
println!("v3: {:?}", v3);
// Explicit type to help inference.
let mut v4: Vec<u64> = vec![];
v4.extend(head);
v4.extend(tail);
println!("v4: {:?}", v4);
}
You should collect() the results of the take() and extend() them with the collect()ed results of skip():
let mut p1 = v.iter().take(3).collect::<Vec<_>>();
let p2 = v.iter().skip(l-3);
p1.extend(p2);
println!("{:?}", p1);
Edit: as Neikos said, you don't even need to collect the result of skip(), since extend() accepts arguments implementing IntoIterator (which Skip does, as it is an Iterator).
Edit 2: your numbers are a bit off, though; in order to get 1, 2, 3, 8, 9, 10 you should declare v as follows:
let v = (1u64 .. 11).collect::<Vec<_>>();
This is because a Range is left-closed and right-open: 1u64 .. 11 yields 1 through 10.
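A quick sketch of the two range forms (the function names are made up for this example): the half-open `a..b` excludes `b`, while the inclusive `a..=b` includes it.

```rust
/// Collects the half-open range `start..end`; `end` is excluded.
/// (Both function names are hypothetical.)
fn half_open(start: u64, end: u64) -> Vec<u64> {
    (start..end).collect()
}

/// Collects the inclusive range `start..=end`; `end` is included.
fn inclusive(start: u64, end: u64) -> Vec<u64> {
    (start..=end).collect()
}
```

So `half_open(1, 11)` and `inclusive(1, 10)` produce the same ten values.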
