Rust's drain, iterator dropped ... "removes any remaining elements"

On page 327 of Programming Rust you can find the following statement:
However, unlike the into_iter() method, which takes the collection by value and consumes it, drain merely borrows a mutable reference to the collection, and when the iterator is dropped, it removes any remaining elements from the collection, and leaves it empty.
I'm confused about what it means when it says it removes any remaining elements from the collection. With this code I can see that, when the iterator is dropped, the remaining elements of a are still there:
fn main() {
    let mut a = vec![1, 2, 3, 4, 5];
    {
        let b: Vec<i32> = a.drain(0..3).collect();
    }
    println!("Hello, world! {:?}", a);
}
Perhaps I'm confused at merely the wording. Is there something more to this?

The wording is a bit imprecise.
What it really means is this: if you drop the drain iterator without exhausting it, it still removes every element in the range it was created with. Since you asked it to drain only the first three elements, it doesn't empty the entire vector, only that first part; but it does so even if the iterator is never used:
fn main() {
    let mut a = vec![1, 2, 3, 4, 5];
    {
        let _ = a.drain(0..3);
    }
    println!("Hello, world! {:?}", a);
}
Hello, world! [4, 5]
You can understand it this way: the "collection" mentioned here is not the whole collection that drain was called on, but the "sub-collection" specified by the passed range.
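To make the "even if unused" point concrete, here is a minimal sketch in which the drain iterator is only partially consumed before being dropped. The helper name drain_partially is made up for this illustration, and it assumes the vector has at least three elements.

```rust
/// Hypothetical helper: drains 0..3 from `v`, pulls only `take` items from
/// the iterator, then drops it. Returns (items taken, what is left in `v`).
/// Assumes `v.len() >= 3`, otherwise `drain` panics.
fn drain_partially(mut v: Vec<i32>, take: usize) -> (Vec<i32>, Vec<i32>) {
    let mut d = v.drain(0..3);
    // by_ref() lets us pull a few items without consuming `d` itself.
    let taken: Vec<i32> = d.by_ref().take(take).collect();
    // Dropping the iterator removes the rest of the range 0..3 as well.
    drop(d);
    (taken, v)
}

fn main() {
    let (taken, left) = drain_partially(vec![1, 2, 3, 4, 5], 1);
    println!("taken = {:?}, left = {:?}", taken, left);
}
```

Only one element is read here, yet all three elements of the range are gone afterwards, while 4 and 5 stay in place.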

Related

parallel sorting on separate sections of a single slice

I'm trying to implement a sort of parallel bubble sort, i.e. have a number of threads work on distinct parts of the same slice and then have a final thread combine the sorted parts, similar to a merge sort.
I have this code so far:
pub fn parallel_bubble_sort(to_sort: Arc<&[i32]>) {
    let midpoint = to_sort.len() / 2;
    let ranges = [0..midpoint, midpoint..to_sort.len()];
    let handles = (ranges).map(|range| {
        thread::spawn(|| {
            to_sort[range].sort();
        })
    });
}
But I get a series of errors relating to to_sort's lifetime, etc.
How would someone go about modifying distinct slices of a larger slice across thread boundaries?

Disclaimer: I assume that you want to sort in place, as you call .sort().
There are a couple of problems with your code:
to_sort isn't mutable, so you won't be able to modify it, which is an essential part of sorting ;) So I think that Arc<&[i32]> should almost certainly be &mut [i32].
You cannot split a mutable slice like this. Rust doesn't know whether your ranges overlap, and therefore disallows this entirely. You can, however, use split_at to split it into two parts, or split_at_mut for mutable slices, which is what you need here.
You cannot move borrowed references into threads created with thread::spawn, because it's unknown how long the thread will exist. Overcoming this issue is the hardest part, I'm afraid; I don't know how easy it is in normal Rust without the use of unsafe. I think the easiest solution would be to use a library like rayon, which has already solved those problems for you.
EDIT: Rust 1.63 introduced scoped threads, which eliminate the need for rayon in this use case.
This should be a good start for you:
pub fn parallel_bubble_sort(to_sort: &mut [i32]) {
    let midpoint = to_sort.len() / 2;
    let (left, right) = to_sort.split_at_mut(midpoint);
    std::thread::scope(|s| {
        s.spawn(|| left.sort());
        s.spawn(|| right.sort());
    });
    // TODO: merge left and right
}

fn main() {
    let mut data = [1, 6, 3, 4, 9, 7, 4];
    parallel_bubble_sort(&mut data);
    println!("{:?}", data);
}
[1, 3, 6, 4, 4, 7, 9]
Previous answer for Rust versions older than 1.63
pub fn parallel_bubble_sort(to_sort: &mut [i32]) {
    let midpoint = to_sort.len() / 2;
    let (left, right) = to_sort.split_at_mut(midpoint);
    rayon::scope(|s| {
        s.spawn(|_| left.sort());
        s.spawn(|_| right.sort());
    });
    // TODO: merge left and right
}

fn main() {
    let mut data = [1, 6, 3, 4, 9, 7, 4];
    parallel_bubble_sort(&mut data);
    println!("{:?}", data);
}
[1, 3, 6, 4, 4, 7, 9]
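The merge step is left as a TODO in both versions. One possible way to fill it in, sketched here with scoped threads (so Rust 1.63 or later is assumed), is to merge the two sorted halves into a temporary buffer and copy the result back. The function name parallel_sort and the buffer-based merge are choices made for this illustration, not part of the answer above.

```rust
use std::thread;

/// Sorts the two halves on scoped threads, then merges them.
pub fn parallel_sort(to_sort: &mut [i32]) {
    let midpoint = to_sort.len() / 2;
    {
        let (left, right) = to_sort.split_at_mut(midpoint);
        // The scope joins both threads before it returns.
        thread::scope(|s| {
            s.spawn(|| left.sort());
            s.spawn(|| right.sort());
        });
    }

    // Standard two-way merge into a temporary buffer.
    let mut merged = Vec::with_capacity(to_sort.len());
    let (left, right) = to_sort.split_at(midpoint);
    let (mut i, mut j) = (0, 0);
    while i < left.len() && j < right.len() {
        if left[i] <= right[j] {
            merged.push(left[i]);
            i += 1;
        } else {
            merged.push(right[j]);
            j += 1;
        }
    }
    // At most one of these two tails is non-empty.
    merged.extend_from_slice(&left[i..]);
    merged.extend_from_slice(&right[j..]);
    to_sort.copy_from_slice(&merged);
}

fn main() {
    let mut data = [1, 6, 3, 4, 9, 7, 4];
    parallel_sort(&mut data);
    println!("{:?}", data);
}
```

The merge here runs on the calling thread and needs one extra allocation of the slice's length; an in-place merge is possible but considerably more involved.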

How do I output multiple values from .map() or use map twice in one iteration?

How do I use map twice on one into_iter? Currently I have:
let res_arr_to: Vec<String> = v.result.transactions.into_iter().map( |x| x.to).collect();
let res_arr_from: Vec<String> = v.result.transactions.into_iter().map( |x| x.from).collect();
What I want is both arrays in one array; the order doesn't matter. I need either a closure that outputs two values (if that is even a closure?), or a way to use map twice in one iteration, without using the generated value but instead using the untouched iterator, if that makes sense and is possible. I am a total noob in functional programming, so if there is a completely different way to do this, another explanation is fine too.
v is an EthBlockTxResponse:
#[derive(Debug, Deserialize)]
struct EthTransactionObj {
    from: String,
    to: String,
}

#[derive(Debug, Deserialize)]
struct EthTransactions {
    transactions: Vec<EthTransactionObj>,
}

#[derive(Debug, Deserialize)]
struct EthBlockTxResponse {
    result: EthTransactions,
}
Thanks

You can use .unzip() to collect two vectors at once like this:
let (res_arr_to, res_arr_from): (Vec<_>, Vec<_>) =
    v.result.transactions.into_iter().map(|x| (x.to, x.from)).unzip();
Note that into_iter consumes v.result.transactions, moving out of that field. If that is not what you want, clone the strings instead:
let (res_arr_to, res_arr_from): (Vec<_>, Vec<_>) =
    v.result.transactions.iter().map(|x| (x.to.clone(), x.from.clone())).unzip();

I find the question a bit vague, but I think you're trying to get both x.to and x.from at the same time instead of iterating the data twice and building two vectors. I'll address that first, and then a few other things you might have meant.
One way to do it is with .flat_map(). This produces one flat vector, removing the extra level of nesting. If you wanted tuples, you could just use .map(|x| (x.from, x.to)). Because into_iter() yields owned values, the String fields are moved out of each transaction, so no cloning is needed. I'm assuming you actually want everything in a single vector without nesting.
let res_arr_combined = v.result.transactions.into_iter()
    .flat_map(|x| [x.to, x.from])
    .collect::<Vec<_>>();
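For a self-contained version of the snippet above, here is a minimal sketch using a stand-in struct. Tx and addresses are hypothetical names for this illustration (the question's types come from a serde-deserialized response), and the array-returning closure relies on arrays implementing IntoIterator by value, which assumes Rust 1.53 or later.

```rust
// Hypothetical stand-in for the question's EthTransactionObj (no serde here).
struct Tx {
    from: String,
    to: String,
}

/// Moves every `to` and `from` string into a single flat Vec.
fn addresses(txs: Vec<Tx>) -> Vec<String> {
    txs.into_iter()
        // Each transaction contributes two items; the Strings are moved, not cloned.
        .flat_map(|tx| [tx.to, tx.from])
        .collect()
}

fn main() {
    let txs = vec![
        Tx { from: "a".to_string(), to: "b".to_string() },
        Tx { from: "c".to_string(), to: "d".to_string() },
    ];
    println!("{:?}", addresses(txs));
}
```

The resulting order interleaves to and from per transaction, which is fine here since the question says the order doesn't matter.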
Reference:
Iterator::flat_map()
Excerpt:
The map adapter is very useful, but only when the closure argument produces values. If it produces an iterator instead, there’s an extra layer of indirection. flat_map() will remove this extra layer on its own.
fn main() {
    // Adding more data to an iterator stream.
    (0..5).flat_map(|n| [n, n * n])
        .for_each(|n| print!("{}, ", n));
    println!("");
}
output:
0, 0, 1, 1, 2, 4, 3, 9, 4, 16,
You may not really need the following, but regarding your comment about getting data from an iterator without using the value or changing the iterator's state: there is a .peek() operation you can invoke on iterators wrapped in Peekable.
To get a peekable iterator, you just invoke .peekable() on any iterator.
let mut p = [1, 2, 3, 4].into_iter().peekable();
println!("{:?}", p.peek());
println!("{:?}", p.next());
output:
Some(1)
Some(1)
The peekable iterator behaves the same way as the iterator it wraps, but adds a couple of interesting methods like .next_if(|x| *x > 0), which consumes and returns the next item only if the condition holds for it; otherwise it returns None and leaves that item in place.
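As a small illustration of next_if (which assumes Rust 1.51 or later), the sketch below takes items while they are positive and then shows that the first rejected item is still available. The helper name leading_positives is invented for this example.

```rust
/// Hypothetical helper: collects the leading positive values, then returns
/// them together with whatever item the iterator yields next.
fn leading_positives(values: &[i32]) -> (Vec<i32>, Option<i32>) {
    let mut it = values.iter().copied().peekable();
    let mut taken = Vec::new();
    // next_if returns Some(item) and advances only when the predicate holds.
    while let Some(x) = it.next_if(|&x| x > 0) {
        taken.push(x);
    }
    // The first rejected item was not consumed, so next() still returns it.
    (taken, it.next())
}

fn main() {
    let (taken, next) = leading_positives(&[3, 1, -2, 5]);
    println!("taken = {:?}, next = {:?}", taken, next);
}
```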
One last topic in line with "using map twice in one iteration": perhaps you meant pulling items from a slice in chunks of 2. If v.result.transactions is a Vec, you can use the .chunks() method to group its items by 2s, or by 3s as below:
let a = [1, 2, 3, 4, 5, 6, 7, 8, 9].chunks(3).collect::<Vec<_>>();
println!("{:?}", a);
output:
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

Why does a doubly-reversed iterator act as if it was never reversed?

I have an input vector which contains numbers. In an output vector, I need to get a sequence of partial products but in right-to-left order. The last element of the output must be equal to the last one in the input; the second-to-last element of the output must be a product of the last and second-to-last elements of input; and so on. For example, if the input vector is
let input = vec![2, 3, 4];
then I need the output to be [24, 12, 4].
My implementation takes an iterator over the input, reverses it, maps, reverses again and collects:
fn main() {
    let input = vec![2, 3, 4];
    let mut prod = 1;
    let p: Vec<usize> = input
        .iter()
        .rev()
        .map(|v| {
            prod *= v;
            prod
        }).rev()
        .collect();
    println!("{:?}", p);
}
The result is [2, 6, 24], the same as if I delete both rev()s. The two rev()s do not solve the problem, they just "annihilate" each other.
Is this task solvable in "chain of calls" style, without using for?

This behavior is actually explicitly described in the documentation:
Notes about side effects
The map iterator implements DoubleEndedIterator, meaning that
you can also map backwards:
[…]
But if your closure has state, iterating backwards may act in a way you do
not expect. […]
A way to solve this would be by adding an intermediary collect to be sure that the second rev does not apply on the Map:
fn main() {
    let input = vec![2, 3, 4];
    let mut prod = 1;
    let p: Vec<usize> = input
        .iter()
        .rev()
        .map(|v| {
            prod *= v;
            prod
        })
        // Collect first, so the second rev() acts on the Vec, not on the Map.
        .collect::<Vec<_>>()
        .into_iter()
        .rev()
        .collect();
    println!("{:?}", p);
}
But that requires an extra allocation. Another way would be to collect, and then reverse:
fn main() {
    let input = vec![2, 3, 4];
    let mut prod = 1;
    let mut p: Vec<usize> = input
        .iter()
        .rev()
        .map(|v| {
            prod *= v;
            prod
        }).collect();
    p.reverse();
    println!("{:?}", p);
}
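To see why the two rev() calls cancel out, it can help to record the order in which the map closure actually receives its inputs. The following is a minimal sketch; visit_order is a hypothetical helper written only for this demonstration.

```rust
/// Hypothetical helper: returns the inputs in the order the `map` closure
/// sees them, with either one or two rev() calls in the chain.
fn visit_order(input: &[usize], double_rev: bool) -> Vec<usize> {
    let mut seen = Vec::new();
    if double_rev {
        // The outer rev() pulls from the back of the Map, whose back is the
        // front of the inner Rev, so the closure runs in forward order.
        input.iter().rev().map(|&v| seen.push(v)).rev().for_each(drop);
    } else {
        input.iter().rev().map(|&v| seen.push(v)).for_each(drop);
    }
    seen
}

fn main() {
    println!("one rev:  {:?}", visit_order(&[2, 3, 4], false));
    println!("two revs: {:?}", visit_order(&[2, 3, 4], true));
}
```

With a single rev() the closure sees 4, 3, 2; with two it sees 2, 3, 4, which is why the stateful product in the question comes out as if nothing had been reversed.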

Your prod variable carries state from one item to the next, which is not what a mapping does. Mappings operate on each element independently, which makes them easy to parallelize and easier to reason about. The result you're asking for is, to be precise, a right scan (a reversed case of a prefix sum), but I'm not sure there are convenient methods to collect from the right (probably the easiest mutable way would be using VecDeque::push_front). This led me to perform the operation in two passes for my first version:
fn main() {
    let input: Vec<usize> = vec![2, 3, 4];
    let initprod = 1;
    let prev: Vec<usize> = input
        .iter()
        .rev()
        .scan(initprod, |prod, &v| {
            *prod *= v;
            Some(*prod)
        }).collect();
    let p: Vec<usize> = prev.into_iter().rev().collect();
    println!("{:?}", p);
}
Note that initprod is immutable; prod carries the state. Using into_iter also means prev is consumed. We could use vec.reverse as shown by mcarton, but then we need a mutable variable. Scans can be parallelized, but to a lesser degree than maps; see e.g. the discussion on adding them to Rayon. One might also consider whether an ExactSizeIterator should allow reverse collection into an ordinary vector, but the standard library scan method loses the known size by returning Option (which, by the convention of next, turns it into a take-while-scan).
Here's a variant with fewer copies, using a preallocated VecDeque to collect from the right. I used an extra scope to restrict the mutability. It also requires Rust 1.21 or later for for_each. There's unnecessary overhead in tracking the number of items and in the ring buffer structure, but it's at least still somewhat legible.
use std::collections::VecDeque;

fn main() {
    let input: Vec<usize> = vec![2, 3, 4];
    let p = {
        let mut pmut = VecDeque::with_capacity(input.len());
        let initprod = 1;
        input
            .iter()
            .rev()
            .scan(initprod, |prod, &v| {
                *prod *= v;
                Some(*prod)
            })
            .for_each(|v| {
                pmut.push_front(v)
            });
        pmut
    };
    println!("{:?}", p);
}
Incidentally, following the old adage about Lisp programmers knowing the value of everything and the cost of nothing, here's a Haskell version; I don't really know how inefficient it is:
scanr1 (*) [2, 3, 4]

How do I match to a pattern like `&(&usize, &u32)`?

Let's say I have vectors of primes and powers:
let mut primes: Vec<usize> = ...;
let mut powers: Vec<u32> = ...;
It is a fact that primes.len() == powers.len().
I'd like to return to the user a list of primes which have a corresponding power value greater than 0 (this code is missing proper refs and derefs):
primes.iter().zip(powers)
    .filter(|(p, power)| power > 0)
    .map(|(p, power)| p)
    .collect::<Vec<usize>>()
The compiler is complaining a lot, as you might imagine. In particular, the filter is receiving arguments of type &(&usize, &u32), but I am not correctly de-referencing in the pattern matching. I have tried various patterns the compiler suggests (e.g. &(&p, &power), which is the one that makes the most sense to me), but with no luck. How do I correctly perform the pattern matching so that I can do the power > 0 comparison without issue, and so that I can collect in the end a Vec<usize>?

primes.iter().zip(powers)
iter() iterates by reference, so you get &usize elements for primes. OTOH .zip() calls .into_iter() which iterates owned values, so powers are u32, and these iterators combined iterate over (&usize, u32). Technically, there's nothing wrong with iterating over such mixed type, but the inconsistency may be confusing. You can use .into_iter() or .iter().cloned() on primes to avoid the reference, or call .zip(powers.iter()) to get both as references.
Second thing is that .filter() takes items by reference &(_,_) (since it only "looks" at them), and .map() by owned value (_,_) (which allows it to change and return it).
For small values like integers, you'd usually use these methods like this:
.filter(|&item| …)
.map(|item| …)
Note that in closures the syntax is |pattern: type|, so in the example above &item is equivalent to:
.filter(|by_ref| {
    let item = *by_ref;
})
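Putting the two points together for the question's types, here is a minimal sketch of patterns that strip the references directly in the closure arguments. The function name primes_with_positive_power and the sample data are made up for this illustration; both inputs are iterated by reference.

```rust
/// Hypothetical helper: returns the primes whose corresponding power is > 0.
fn primes_with_positive_power(primes: &[usize], powers: &[u32]) -> Vec<usize> {
    primes
        .iter()
        .zip(powers.iter())
        // filter sees &(&usize, &u32): strip the outer & and the & on power.
        .filter(|&(_, &power)| power > 0)
        // map sees (&usize, &u32): strip the & on the prime to get a usize.
        .map(|(&p, _)| p)
        .collect()
}

fn main() {
    let primes = vec![2, 3, 5, 7];
    let powers = vec![1, 0, 2, 0];
    println!("{:?}", primes_with_positive_power(&primes, &powers));
}
```

Destructuring in the pattern works here because integers are Copy; for non-Copy items you would keep the references and dereference or clone explicitly.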

That works:
fn main() {
    let primes: Vec<usize> = vec![2, 3, 5, 7];
    let powers: Vec<u32> = vec![2, 2, 2, 2];
    let ret = primes.iter().zip(powers.iter())
        .filter_map(|(p, pow)| { // both are refs, so we need to deref
            if *pow > 0 {
                Some(*p)
            } else {
                None
            }
        })
        .collect::<Vec<usize>>();
    println!("{:?}", ret);
}
Note that I also used powers.iter() which yields elements by reference. You could also use cloned() on both iterators and work with values.

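filter_map also works well with match (Ordering is written out in full here; a bare Greater needs use std::cmp::Ordering::Greater in scope, otherwise it is treated as a catch-all binding):
.filter_map(|(p, pow)| match pow.cmp(&0) {
    std::cmp::Ordering::Greater => Some(*p),
    _ => None,
})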

Why does the Rust compiler allow index out of bounds?

Can someone explain why this compiles:
fn main() {
    let a = vec![1, 2, 3];
    println!("{:?}", a[4]);
}
When running it, I got:
thread '' panicked at 'index out of bounds: the len is 3 but the index is 4', ../src/libcollections/vec.rs:1132

If you would like to access elements of the Vec with index checking, you can use the Vec as a slice and then use its get method. For example, consider the following code.
fn main() {
    let a = vec![1, 2, 3];
    println!("{:?}", a.get(2));
    println!("{:?}", a.get(4));
}
This outputs:
Some(3)
None
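In practice you usually want to act on that Option rather than print it. A minimal sketch of one way to do so follows; the helper name get_or and the fallback value are assumptions made for this example.

```rust
/// Hypothetical helper: returns the element at `index`, or `default` when
/// the index is out of bounds, instead of panicking.
fn get_or(v: &[i32], index: usize, default: i32) -> i32 {
    // `get` yields Option<&i32>; copied() turns it into Option<i32>.
    v.get(index).copied().unwrap_or(default)
}

fn main() {
    let a = vec![1, 2, 3];
    println!("{}", get_or(&a, 2, -1));
    println!("{}", get_or(&a, 4, -1));

    // Matching on the Option directly works just as well.
    match a.get(4) {
        Some(x) => println!("found {}", x),
        None => println!("index 4 is out of bounds"),
    }
}
```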

In order to understand the issue, you have to think about it in terms of what the compiler sees.
Typically, a compiler never reasons about the value of an expression, only about its type. Thus:
a is of type Vec<i32>
4 is of an unknown integral type
Vec<i32> implements subscripting, so a[4] type checks
Having a compiler reason about values is not unheard of, and there are various ways to get it:
you can allow evaluation of some expression at compile-time (C++ constexpr for example)
you can encode value into types (C++ non-type template parameters, using Peano's numbers)
you can use dependent typing which bridges the gap between types and values
When this answer was written (before Rust 1.0), Rust supported none of these. Since then it has gained compile-time evaluation (const fn) and const generics, but the length of a Vec is still a runtime value.
Thus, the index is checked at runtime, and the implementation of Vec correctly bails out (here by panicking).

Note that the following is a compile time error:
fn main() {
    let a = [1, 2, 3];
    println!("{:?}", a[4]);
}
error: this operation will panic at runtime
 --> src/main.rs:3:22
  |
3 |     println!("{:?}", a[4]);
  |                      ^^^^ index out of bounds: the length is 3 but the index is 4
  |
  = note: `#[deny(unconditional_panic)]` on by default
This works because without the vec!, the type is [i32; 3], which does actually carry length information.
With the vec!, it's now of type Vec<i32>, which no longer carries length information. Its length is only known at runtime.

Maybe what you mean is:
fn main() {
    let a = vec![1, 2, 3];
    println!("{:?}", a.get(4));
}
get returns an Option, so you get Some or None. Compare this to:
fn main() {
    let a = vec![1, 2, 3];
    println!("{:?}", &a[4]);
}
Indexing with [] is bounds-checked and has no way to signal a missing element, so an out-of-range index panics, whether or not you take a reference to the result.
