Is there any way to unpack an iterator into a tuple? - rust

Is there any way to accomplish something like the following:
let v = vec![1, 2, 3];
let (a, b) = v.iter().take(2);
Such that a = 1 and b = 2 at the end?
I know I could just use a vector but I would like to have named variables.

The itertools crate has methods like tuples and next_tuple that can help with this.
use itertools::Itertools; // 0.9.0
fn main() {
let v = vec![1, 2, 3];
let (a, b) = v.iter().next_tuple().unwrap();
assert_eq!(a, &1);
assert_eq!(b, &2);
}

This may not be exactly what you asked for, but I suppose you rarely want to convert an arbitrarily large vector to a tuple anyway. If you just want to extract the first few elements of a vector into a tuple, you can do so using slice pattern matching:
fn main() {
let v = vec![1, 2, 3];
let (a, b) = match &v[..] {
&[first, second, ..] => (first, second),
_ => unreachable!(),
};
assert_eq!((a, b), (1, 2));
}
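On Rust 1.65 or later, the same slice pattern can also be written with let-else, which avoids the unreachable!() arm and lets the caller decide what to do when the vector is too short. This is only a sketch; the helper name first_two is made up for illustration:

```rust
/// Returns the first two elements of a slice as a tuple,
/// or None if the slice has fewer than two elements.
/// (`first_two` is a hypothetical helper, not part of any library.)
fn first_two(v: &[i32]) -> Option<(i32, i32)> {
    // The slice pattern binds `a` and `b` as references into `v`.
    let [a, b, ..] = v else {
        return None;
    };
    Some((*a, *b))
}

fn main() {
    let v = vec![1, 2, 3];
    let (a, b) = first_two(&v).expect("need at least two elements");
    assert_eq!((a, b), (1, 2));
    assert_eq!(first_two(&[7]), None);
}
```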

I wrote this ugly recursive macro that converts a Vec to a tuple because I wanted to learn something about macros.
macro_rules! tuplet {
{ ($y:ident $(, $x:ident)*) = $v:expr } => {
let ($y, $($x),*) = tuplet!($v ; 1 ; ($($x),*) ; ($v[0]) );
};
{ $v:expr ; $j:expr ; ($y:ident $(, $x:ident)*) ; ($($a:expr),*) } => {
tuplet!( $v ; $j+1 ; ($($x),*) ; ($($a),*,$v[$j]) )
};
{ $v:expr ; $j:expr ; () ; $accu:expr } => {
$accu
}
}
I am new to this and probably very bad at it, so there's most likely a better way to do it. This is just a proof of concept. It allows you to write:
fn main() {
let v = vec![1, 2, 3];
tuplet!((a, b, c) = v);
assert_eq!(a, 1);
assert_eq!(b, 2);
assert_eq!(c, 3);
}
Somewhere in that macro definition you find the part $v[$j], which you could replace by $v.nth($j) if you want to use it for iterators.

gcp is on the right track; his answer seems like the correct one to me.
I'm going to give a more compelling example, though, since the OP seemed in a comment to wonder whether what he asked for is even worthwhile ("I can't think of a good enough reason for this functionality to be possible."). Check out the Person::from_csv function below:
use itertools::Itertools;
#[derive(Debug)]
struct Person<'a> {
first: &'a str,
last: &'a str,
}
impl<'a> Person<'a> {
// Create a Person from a str of form "last,first".
fn from_csv(s: &'a str) -> Option<Self> {
s.split(',').collect_tuple().map(
|(last, first)| Person { first, last }
)
}
}
fn main() {
dbg!(Person::from_csv("Doe")); // None
dbg!(Person::from_csv("Doe,John")); // Some(...)
dbg!(Person::from_csv("Doe,John,foo")); // None
}
It takes the Iterator produced by split and collects the results into a tuple so that we can match and destructure it. If there are too many or too few commas, you won't get a matching tuple. This code is clean because collect_tuple lets us use pattern matching and destructuring.
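If pulling in itertools is not an option, roughly the same "exactly two fields" check can be done with plain Iterator::next calls. This is a sketch of that idea, not the itertools implementation; the function name split_pair is an assumption made for the example:

```rust
/// Splits "last,first" into exactly two fields.
/// Returns None when there are too few or too many commas.
/// (`split_pair` is a hypothetical name used only for this sketch.)
fn split_pair(s: &str) -> Option<(&str, &str)> {
    let mut parts = s.split(',');
    let last = parts.next()?;
    let first = parts.next()?;
    // A third field means the input had too many commas.
    if parts.next().is_some() {
        return None;
    }
    Some((last, first))
}

fn main() {
    assert_eq!(split_pair("Doe"), None);
    assert_eq!(split_pair("Doe,John"), Some(("Doe", "John")));
    assert_eq!(split_pair("Doe,John,foo"), None);
}
```

This gets more verbose with every extra field, which is where collect_tuple pays off.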

Related

Partial move of Vec of tuple

I have a Vec<(String, i64)> and need to iterate over the Strings and move them and then iterate over the i64s.
However, if I move the Strings I have to store the i64 again into another Vec:
let l: Vec<_> = l
.into_iter()
.map(|(string, int)| {
drop(string);
int
})
.collect();
for i in l {
process(i);
}
How can I iterate over the Strings and i64s separately without incurring any additional performance overhead?
The only solution I can think of at the moment that will not cause additional operations is to store the Strings and i64s separately.
You can use std::mem::take() while iterating over the Vec in a first pass to take ownership of the String element while putting a non-allocating Default in its place. This allows you to keep the Vec in its original form, so no extra container is required.
fn foo(mut inp: Vec<(String, i64)>) {
// First pass over the Vec "extracts" the owned Strings, replacing the content
// in the Vec by a non-allocating empty String, which is close to zero cost;
// this leaves the Vec as is, so no intermediate representation is needed.
for s in inp.iter_mut().map(|(s, _)| std::mem::take(s)) {
// Process String
}
// Something happens
// Second pass ignores the empty strings, processes the integers
for i in inp.into_iter().map(|(_, i)| i) {
// Process the integers
}
}
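To see what std::mem::take leaves behind, here is a small check. The helper take_strings is hypothetical, and it collects the Strings into a Vec only so the result can be inspected; in real code you would process them directly, as in foo above:

```rust
use std::mem;

/// Moves every String out of the pairs, leaving empty Strings behind.
/// (Hypothetical helper; collecting is only done to make the result observable.)
fn take_strings(pairs: &mut [(String, i64)]) -> Vec<String> {
    pairs.iter_mut().map(|(s, _)| mem::take(s)).collect()
}

fn main() {
    let mut pairs = vec![("a".to_string(), 1i64), ("b".to_string(), 2)];
    let taken = take_strings(&mut pairs);
    assert_eq!(taken, ["a", "b"]);
    // The Vec still has both entries: the Strings are now empty,
    // and the integers are untouched.
    assert_eq!(pairs, [(String::new(), 1), (String::new(), 2)]);
}
```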
If the type of the list can be changed from Vec<(String, i64)> to Vec<(Option<String>, i64)>, then you can try the following approach.
fn main() {
let mut l = Vec::new();
l.push((Some("a".to_string()), 1i64));
l.push((Some("b".to_string()), 2));
l.push((Some("c".to_string()), 3));
l.push((Some("d".to_string()), 4));
l.iter_mut().for_each(|(s, _)| {
if let Some(x) = s.take() {
println!("Processing string: {}", x);
}
});
l.iter().for_each(|(_, i)| {
println!("Processing int: {}", i);
});
}
Use unzip to separate them:
fn main(){
let original = vec![("a", 1), ("b", 2)];
let (s, i): (Vec<_>, Vec<_>) = original.into_iter().unzip();
for a in s {
println!("{}", a);
}
for b in i {
println!("{}", b);
}
}

How can I group consecutive integers in a vector in Rust?

I have a Vec<i64> and I want to know all the groups of integers that are consecutive. As an example:
let v = vec![1, 2, 3, 5, 6, 7, 9, 10];
I'm expecting something like this or similar:
[[1, 2, 3], [5, 6, 7], [9, 10]];
The view (vector of vectors or maybe tuples or something else) really doesn't matter, but I should get several grouped lists with continuous numbers.
At first glance, it seems like I'll need to use itertools and the group_by function, but I have no idea how...
You can indeed use group_by for this, but you might not really want to. Here's what I would probably write instead:
fn consecutive_slices(data: &[i64]) -> Vec<&[i64]> {
let mut slice_start = 0;
let mut result = Vec::new();
for i in 1..data.len() {
if data[i - 1] + 1 != data[i] {
result.push(&data[slice_start..i]);
slice_start = i;
}
}
if data.len() > 0 {
result.push(&data[slice_start..]);
}
result
}
This is similar in principle to eXodiquas' answer, but instead of accumulating a Vec<Vec<i64>>, I use the indices to accumulate a Vec of slice references that refer to the original data. (This question explains why I made consecutive_slices take &[T].)
It's also possible to do the same thing without allocating a Vec, by returning an iterator; however, I like the above version better. Here's the zero-allocation version I came up with:
fn consecutive_slices(data: &[i64]) -> impl Iterator<Item = &[i64]> {
let mut slice_start = 0;
(1..=data.len()).flat_map(move |i| {
if i == data.len() || data[i - 1] + 1 != data[i] {
let begin = slice_start;
slice_start = i;
Some(&data[begin..i])
} else {
None
}
})
}
It's not as readable as a for loop, but it doesn't need to allocate a Vec for the return value, so this version is more flexible.
Here's a "more functional" version using group_by:
use itertools::Itertools;
fn consecutive_slices(data: &[i64]) -> Vec<Vec<i64>> {
// Signed subtraction: `data[i] as usize - i` would underflow for values smaller than their index.
(&(0..data.len()).group_by(|&i| data[i] - i as i64))
.into_iter()
.map(|(_, group)| group.map(|i| data[i]).collect())
.collect()
}
The idea is to make a key function for group_by that takes the difference between each element and its index in the slice. Consecutive elements will have the same key because indices increase by 1 each time. One reason I don't like this version is that it's quite difficult to get slices of the original data structure; you almost have to create a Vec<Vec<i64>> (hence the two collects). The other reason is that I find it harder to read.
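To make the key idea concrete, here are the keys for the example input, computed with std only. The function name run_keys is made up for this sketch, and it uses signed subtraction so that values smaller than their index cannot underflow:

```rust
/// Computes the grouping key for each element: its value minus its index.
/// Elements of the same consecutive run share a key.
/// (`run_keys` is a hypothetical helper for illustration.)
fn run_keys(data: &[i64]) -> Vec<i64> {
    data.iter()
        .enumerate()
        .map(|(i, &x)| x - i as i64)
        .collect()
}

fn main() {
    let keys = run_keys(&[1, 2, 3, 5, 6, 7, 9, 10]);
    // [1, 2, 3] -> key 1, [5, 6, 7] -> key 2, [9, 10] -> key 3
    assert_eq!(keys, [1, 1, 1, 2, 2, 2, 3, 3]);
}
```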
However, when I first wrote my preferred version (the first one, with the for loop), it had a bug (now fixed), while the other two versions were correct from the start. So there may be merit to writing denser code with functional abstractions, even if there is some hit to readability and/or performance.
let v = vec![1, 2, 3, 5, 6, 7, 9, 10];
let mut res = Vec::new();
let mut prev = v[0];
let mut sub_v = Vec::new();
sub_v.push(prev);
for i in 1..v.len() {
if v[i] == prev + 1 {
sub_v.push(v[i]);
prev = v[i];
} else {
res.push(sub_v.clone());
sub_v.clear();
sub_v.push(v[i]);
prev = v[i];
}
}
res.push(sub_v);
This should solve your problem.
The code iterates over the given vector and checks whether the current i64 (in my case i32) is exactly one more than the previous one; if so, it is pushed into the current group (sub_v). When the run breaks, sub_v is pushed into the result vector and a new group is started. Note that v[0] panics on an empty vector.
But I guess you wanted something functional?
Another possible solution, that uses std only, could be:
fn consecutive_slices(v: &[i64]) -> Vec<Vec<i64>> {
let t: Vec<Vec<i64>> = v
.into_iter()
.chain([*v.last().unwrap_or(&-1)].iter())
.scan(Vec::new(), |s, &e| {
match s.last() {
None => { s.push(e); Some((false, Vec::new())) },
Some(&p) if p == e - 1 => { s.push(e); Some((false, Vec::new()))},
Some(&p) if p != e - 1 => {let o = s.clone(); *s = vec![e]; Some((true, o))},
_ => None,
}
})
.filter_map(|(n, v)| {
match n {
true => Some(v.clone()),
false => None,
}
})
.collect();
t
}
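The chain appends a copy of the last element. It does not continue the current run, so scan emits the final group, which would otherwise be lost.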
I like the answers above but you could also use peekable() to tell if the next value is different.
https://doc.rust-lang.org/stable/std/iter/struct.Peekable.html
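A possible sketch of the peekable() idea, assuming Rust 1.51 or later for Peekable::next_if. The function name consecutive_groups is my own, and this is just one way to write it:

```rust
/// Groups runs of consecutive integers using a peekable iterator.
/// (`consecutive_groups` is a hypothetical name for this sketch.)
fn consecutive_groups(data: &[i64]) -> Vec<Vec<i64>> {
    let mut groups = Vec::new();
    let mut iter = data.iter().copied().peekable();
    while let Some(first) = iter.next() {
        let mut group = vec![first];
        let mut prev = first;
        // next_if consumes the next value only if it continues the run;
        // otherwise the value stays in the iterator for the next group.
        while let Some(next) = iter.next_if(|&n| prev.checked_add(1) == Some(n)) {
            group.push(next);
            prev = next;
        }
        groups.push(group);
    }
    groups
}

fn main() {
    let groups = consecutive_groups(&[1, 2, 3, 5, 6, 7, 9, 10]);
    assert_eq!(groups, vec![vec![1, 2, 3], vec![5, 6, 7], vec![9, 10]]);
}
```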
I would probably use a fold for this?
That's because I'm very much a functional programmer.
Obviously mutating the accumulator is weird :P but this works too and represents another way of thinking about it.
This is basically a recursive solution and can be modified easily to use immutable datastructures.
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=43b9e3613c16cb988da58f08724471a4
fn main() {
let v = vec![1, 2, 3, 5, 6, 7, 9, 10];
let mut res: Vec<Vec<i32>> = vec![];
let (last_group, _): (Vec<i32>, Option<i32>) = v
.iter()
.fold((vec![], None), |(mut cur_group, last), x| {
match last {
None => {
cur_group.push(*x);
(cur_group, Some(*x))
}
Some(last) => {
if x - last == 1 {
cur_group.push(*x);
(cur_group, Some(*x))
} else {
res.push(cur_group);
(vec![*x], Some(*x))
}
}
}
});
res.push(last_group);
println!("{:?}", res);
}

How to convert a string of digits into a vector of digits?

I'm trying to store a string (or str) of digits, e.g. 12345 into a vector, such that the vector contains {1,2,3,4,5}.
As I'm totally new to Rust, I'm having problems with the types (String, str, char, ...) but also the lack of any information about conversion.
My current code looks like this:
fn main() {
let text = "731671";
let mut v: Vec<i32>;
let mut d = text.chars();
for i in 0..text.len() {
v.push( d.next().to_digit(10) );
}
}
You're close!
First, the index loop for i in 0..text.len() is not necessary since you're going to use an iterator anyway. It's simpler to loop directly over the iterator: for ch in text.chars(). Not only that, but your index loop and the character iterator can diverge, because len() returns the number of bytes while chars() yields Unicode scalar values. Since the string is UTF-8, any non-ASCII character takes more than one byte, so the string can have fewer scalar values than bytes.
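A quick way to see the difference between the two counts (the helper byte_and_char_counts is made up for this illustration):

```rust
/// Returns (number of bytes, number of Unicode scalar values) for a string.
/// (Hypothetical helper for illustration.)
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // ASCII only: one byte per character, so both counts agree.
    assert_eq!(byte_and_char_counts("731671"), (6, 6));
    // U+00E9 (e with acute accent) takes two bytes in UTF-8.
    assert_eq!(byte_and_char_counts("7\u{e9}1"), (4, 3));
}
```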
Next hurdle is that to_digit(10) returns an Option, telling you that there is a possibility the character won't be a digit. You can check whether to_digit(10) returned the Some variant of an Option with if let Some(digit) = ch.to_digit(10).
Pieced together, the code might now look like this:
fn main() {
let text = "731671";
let mut v = Vec::new();
for ch in text.chars() {
if let Some(digit) = ch.to_digit(10) {
v.push(digit);
}
}
println!("{:?}", v);
}
Now, this is rather imperative: you're making a vector and filling it digit by digit, all by yourself. You can try a more declarative or functional approach by applying a transformation over the string:
fn main() {
let text = "731671";
let v: Vec<u32> = text.chars().flat_map(|ch| ch.to_digit(10)).collect();
println!("{:?}", v);
}
ArtemGr's answer is pretty good, but their version will skip any characters that aren't digits. If you'd rather have it fail on bad digits, you can use this version instead:
fn to_digits(text: &str) -> Option<Vec<u32>> {
text.chars().map(|ch| ch.to_digit(10)).collect()
}
fn main() {
println!("{:?}", to_digits("731671"));
println!("{:?}", to_digits("731six71"));
}
Output:
Some([7, 3, 1, 6, 7, 1])
None
To mention the quick and dirty elephant in the room, if you REALLY know your string contains only digits in the range '0'..='9', then you can avoid memory allocations and copies and use the underlying &[u8] representation of the string, obtained from str::as_bytes, directly. Subtract b'0' from each element whenever you access it.
If you are doing competitive programming, this is one of the worthwhile speed and memory optimizations.
fn main() {
let text = "12345";
let digit = text.as_bytes();
println!("Text = {:?}", text);
println!("value of digit[3] = {}", digit[3] - b'0');
}
Output:
Text = "12345"
value of digit[3] = 4
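If you are not completely sure about the input, a checked variant costs very little. This is a sketch; digit_at is a hypothetical helper, not a standard function:

```rust
/// Reads the digit at byte position `idx` without allocating.
/// Returns None if `idx` is out of range or the byte is not an ASCII digit.
/// (`digit_at` is a hypothetical helper for this sketch.)
fn digit_at(text: &str, idx: usize) -> Option<u8> {
    let byte = *text.as_bytes().get(idx)?;
    if byte.is_ascii_digit() {
        Some(byte - b'0')
    } else {
        None
    }
}

fn main() {
    assert_eq!(digit_at("12345", 3), Some(4));
    assert_eq!(digit_at("12x45", 2), None); // not a digit
    assert_eq!(digit_at("12345", 9), None); // out of range
}
```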
This solution combines ArtemGr's + notriddle's solutions:
fn to_digits(string: &str) -> Vec<u32> {
let opt_vec: Option<Vec<u32>> = string
.chars()
.map(|ch| ch.to_digit(10))
.collect();
match opt_vec {
Some(vec_of_digits) => vec_of_digits,
None => vec![],
}
}
In my case, I implemented this function for &str through an extension trait.
pub trait ExtraProperties {
fn to_digits(self) -> Vec<u32>;
}
impl ExtraProperties for &str {
fn to_digits(self) -> Vec<u32> {
let opt_vec: Option<Vec<u32>> = self
.chars()
.map(|ch| ch.to_digit(10))
.collect();
match opt_vec {
Some(vec_of_digits) => vec_of_digits,
None => vec![],
}
}
}
In this way, I transform &str to a vector containing digits.
fn main() {
let cnpj: &str = "123456789";
let nums: Vec<u32> = cnpj.to_digits();
println!("cnpj: {cnpj}"); // cnpj: 123456789
println!("nums: {nums:?}"); // nums: [1, 2, 3, 4, 5, 6, 7, 8, 9]
}

Best way to remove elements of Vec depending on other elements of the same Vec

I have a vector of sets and I want to remove all sets that are subsets of other sets in the vector. Example:
a = {0, 3, 5}
b = {0, 5}
c = {0, 2, 3}
In this case I would like to remove b, because it's a subset of a. I'm fine with using a "dumb" n² algorithm.
Sadly, it's pretty tricky to get it working with the borrow checker. The best I've come up with is (Playground):
let mut v: Vec<HashSet<u8>> = vec![];
let mut to_delete = Vec::new();
for (i, set_a) in v.iter().enumerate().rev() {
for set_b in &v[..i] {
if set_a.is_subset(&set_b) {
to_delete.push(i);
break;
}
}
}
for i in to_delete {
v.swap_remove(i);
}
(note: the code above is not correct! See comments for further details)
I see a few disadvantages:
I need an additional vector with additional allocations
Maybe there are more efficient ways than calling swap_remove often
If I need to preserve order, I can't use swap_remove, but have to use remove which is slow
Is there a better way to do this? I'm not just asking about my use case, but about the general case as it's described in the title.
Here is a solution that does not make additional allocations and preserves the order:
fn product_retain<T, F>(v: &mut Vec<T>, mut pred: F)
where F: FnMut(&T, &T) -> bool
{
let mut j = 0;
for i in 0..v.len() {
// invariants:
// items v[0..j] will be kept
// items v[j..i] will be removed
if (0..j).chain(i + 1..v.len()).all(|a| pred(&v[i], &v[a])) {
v.swap(i, j);
j += 1;
}
}
v.truncate(j);
}
fn main() {
// test with a simpler example
// unique elements
let mut v = vec![1, 2, 3];
product_retain(&mut v, |a, b| a != b);
assert_eq!(vec![1, 2, 3], v);
let mut v = vec![1, 3, 2, 4, 5, 1, 2, 4];
product_retain(&mut v, |a, b| a != b);
assert_eq!(vec![3, 5, 1, 2, 4], v);
}
This is a kind of partition algorithm. The elements in the first partition will be kept and in the second partition will be removed.
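Applied to the sets from the question, the predicate becomes "a is not a subset of b". The sketch below repeats product_retain from above so that it compiles on its own; the helpers set and remove_subsets are made up for the example:

```rust
use std::collections::HashSet;

// Same function as in the answer above, repeated so this example is self-contained.
fn product_retain<T, F>(v: &mut Vec<T>, mut pred: F)
where
    F: FnMut(&T, &T) -> bool,
{
    let mut j = 0;
    for i in 0..v.len() {
        // Compare v[i] against the kept items and the not-yet-visited items.
        if (0..j).chain(i + 1..v.len()).all(|a| pred(&v[i], &v[a])) {
            v.swap(i, j);
            j += 1;
        }
    }
    v.truncate(j);
}

/// Hypothetical helper: builds a HashSet from a slice.
fn set(items: &[u8]) -> HashSet<u8> {
    items.iter().copied().collect()
}

/// Hypothetical helper: keeps only the sets that are not a subset of another set.
fn remove_subsets(v: &mut Vec<HashSet<u8>>) {
    product_retain(v, |a, b| !a.is_subset(b));
}

fn main() {
    let mut v = vec![set(&[0, 3, 5]), set(&[0, 5]), set(&[0, 2, 3])];
    remove_subsets(&mut v);
    // {0, 5} is a subset of {0, 3, 5}, so it is removed; order is preserved.
    assert_eq!(v, vec![set(&[0, 3, 5]), set(&[0, 2, 3])]);
}
```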
You can use a while loop instead of the for:
use std::collections::HashSet;
fn main() {
let arr: &[&[u8]] = &[
&[3],
&[1,2,3],
&[1,3],
&[1,4],
&[2,3]
];
let mut v:Vec<HashSet<u8>> = arr.iter()
.map(|x| x.iter().cloned().collect())
.collect();
let mut pos = 0;
while pos < v.len() {
let is_sub = v[pos+1..].iter().any(|x| v[pos].is_subset(x))
|| v[..pos].iter().any(|x| v[pos].is_subset(x));
if is_sub {
v.swap_remove(pos);
} else {
pos+=1;
}
}
println!("{:?}", v);
}
There are no additional allocations.
To avoid using remove and swap_remove, you can change the type of vector to Vec<Option<HashSet<u8>>>:
use std::collections::HashSet;
fn main() {
let arr: &[&[u8]] = &[
&[3],
&[1,2,3],
&[1,3],
&[1,4],
&[2,3]
];
let mut v:Vec<Option<HashSet<u8>>> = arr.iter()
.map(|x| Some(x.iter().cloned().collect()))
.collect();
for pos in 0..v.len(){
let is_sub = match v[pos].as_ref() {
Some(chk) =>
v[..pos].iter().flat_map(|x| x).any(|x| chk.is_subset(x))
|| v[pos+1..].iter().flat_map(|x| x).any(|x| chk.is_subset(x)),
None => false,
};
if is_sub { v[pos]=None };//Replace with None instead remove
}
println!("{:?}", v);//[None, Some({3, 2, 1}), None, Some({1, 4}), None]
}
Regarding "I need an additional vector with additional allocations":
I wouldn't worry about that allocation, since the memory and runtime footprint of that allocation will be really small compared to the rest of your algorithm.
Regarding "Maybe there are more efficient ways than calling swap_remove often" and "If I need to preserve order, I can't use swap_remove, but have to use remove which is slow":
I'd change to_delete from Vec<usize> to Vec<bool> and just mark whether a particular hash set should be removed. You can then use Vec::retain, which conditionally removes elements while preserving order. Unfortunately, this function doesn't pass the index to the closure, so we have to create a workaround (playground):
let mut to_delete = vec![false; v.len()];
for (i, set_a) in v.iter().enumerate().rev() {
for set_b in &v[..i] {
if set_a.is_subset(&set_b) {
to_delete[i] = true;
}
}
}
{
// This relies on retain visiting each element exactly once, in the original order
// (which the Vec::retain documentation guarantees).
let mut i = 0;
v.retain(|_| {
let ret = !to_delete[i];
i += 1;
ret
});
}
If there is a special set value that can never occur under normal conditions, you can use it to mark a set as "to delete", and then check that condition in retain (it would require changing the outer loop from iterator-based to range-based though).
Sidenote (if that HashSet<u8> is not just a toy example): A more efficient way to store and compare sets of small integers would be to use a bitset.
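As a rough illustration of the bitset idea, here is a sketch that assumes every element is below 64 so that a set fits into a single u64 (covering the full u8 range would need something like [u64; 4] or a bitset crate). The names to_bits and is_subset are made up for the example:

```rust
/// Packs a set of small integers into a u64 bitmask.
/// Assumption for this sketch: every element is below 64.
fn to_bits(items: &[u8]) -> u64 {
    items.iter().fold(0, |bits, &x| {
        assert!(x < 64, "this sketch only handles elements below 64");
        bits | (1u64 << x)
    })
}

/// `a` is a subset of `b` if every bit set in `a` is also set in `b`.
fn is_subset(a: u64, b: u64) -> bool {
    (a & b) == a
}

fn main() {
    let a = to_bits(&[0, 3, 5]);
    let b = to_bits(&[0, 5]);
    let c = to_bits(&[0, 2, 3]);
    assert!(is_subset(b, a));
    assert!(!is_subset(c, a));
    assert!(!is_subset(a, b));
}
```

The subset test is then a single AND and a comparison instead of a hash lookup per element.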

What is the idiomatic way to pop the last N elements in a mutable Vec?

I am contributing Rust code to RosettaCode to both learn Rust and contribute to the Rust community at the same time. What is the best idiomatic way to pop the last n elements in a mutable Vec?
Here's roughly what I have written but I'm wanting to see if there's a better way:
fn main() {
let mut nums: Vec<u32> = Vec::new();
nums.push(1);
nums.push(2);
nums.push(3);
nums.push(4);
nums.push(5);
let n = 2;
for _ in 0..n {
nums.pop();
}
for e in nums {
println!("{}", e)
}
}
I'd recommend using Vec::truncate:
fn main() {
let mut nums = vec![1, 2, 3, 4, 5];
let n = 2;
let final_length = nums.len().saturating_sub(n);
nums.truncate(final_length);
println!("{:?}", nums);
}
Additionally, I
used saturating_sub to handle the case where there aren't N elements in the vector
used vec![] to construct the vector of numbers easily
printed out the entire vector in one go
Normally when you "pop" something, you want to have those values. If you want the values in another vector, you can use Vec::split_off:
let tail = nums.split_off(final_length);
If you want access to the elements but do not want to create a whole new vector, you can use Vec::drain:
for i in nums.drain(final_length..) {
println!("{}", i)
}
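Combining saturating_sub with split_off gives a small "pop the last n" helper that returns the removed elements. This is a sketch, and the name pop_n is my own:

```rust
/// Removes the last `n` elements and returns them in their original order.
/// If the vector has fewer than `n` elements, all of them are removed.
/// (`pop_n` is a hypothetical helper for this sketch.)
fn pop_n<T>(v: &mut Vec<T>, n: usize) -> Vec<T> {
    let keep = v.len().saturating_sub(n);
    v.split_off(keep)
}

fn main() {
    let mut nums = vec![1, 2, 3, 4, 5];
    let tail = pop_n(&mut nums, 2);
    assert_eq!(tail, [4, 5]);
    assert_eq!(nums, [1, 2, 3]);

    // Asking for more than is available just empties the vector.
    let rest = pop_n(&mut nums, 10);
    assert_eq!(rest, [1, 2, 3]);
    assert!(nums.is_empty());
}
```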
An alternate approach would be to use Vec::drain instead. This gives you an iterator so you can actually use the elements that are removed.
fn main() {
let mut nums: Vec<u32> = Vec::new();
nums.push(1);
nums.push(2);
nums.push(3);
nums.push(4);
nums.push(5);
let n = 2;
let new_len = nums.len() - n;
for removed_element in nums.drain(new_len..) {
println!("removed: {}", removed_element);
}
for retained_element in nums {
println!("retained: {}", retained_element);
}
}
The drain method accepts any range type that implements RangeBounds (called RangeArgument in older Rust versions), such as <start-inclusive>..<end-exclusive>. Both start and end may be omitted to default to the beginning/end of the vector. So above, we're really just saying start at new_len and drain to the end.
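The different range forms can be tried side by side. In this sketch the function drain_ranges is hypothetical and only exists to return the drained pieces:

```rust
/// Drains the same vector three times with different range forms
/// and returns what each call removed.
/// (`drain_ranges` is a hypothetical function for illustration.)
fn drain_ranges() -> (Vec<i32>, Vec<i32>, Vec<i32>) {
    let mut v = vec![1, 2, 3, 4, 5];
    // Start given, end omitted: from index 3 to the end.
    let tail: Vec<i32> = v.drain(3..).collect();
    // Start omitted, end given: from the beginning up to (not including) index 1.
    let head: Vec<i32> = v.drain(..1).collect();
    // Both omitted: everything that is left.
    let rest: Vec<i32> = v.drain(..).collect();
    (tail, head, rest)
}

fn main() {
    let (tail, head, rest) = drain_ranges();
    assert_eq!(tail, [4, 5]);
    assert_eq!(head, [1]);
    assert_eq!(rest, [2, 3]);
}
```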
You should take a look at the Vec::truncate function from the standard library, that can do this for you.
fn main() {
let mut nums: Vec<u32> = Vec::new();
nums.push(1);
nums.push(2);
nums.push(3);
nums.push(4);
nums.push(5);
let n = 2;
let new_len = nums.len() - n;
nums.truncate(new_len);
for e in nums {
println!("{}", e)
}
}
