Partial move of Vec of tuple - rust

I have a Vec<(String, i64)> and need to iterate over the Strings and move them and then iterate over the i64s.
However, if I move the Strings I have to store the i64 again into another Vec:
let l: Vec<_> = l
.into_iter()
.map(|(string, int)| {
drop(string);
int
})
.collect();
for i in l {
process(i);
}
How can I iterate over the Strings and i64s separately without incurring any additional performance overhead?
The only solution I can think of at the moment that will not cause additional operations is to store the Strings and i64s separately.

You can use std::mem::take() while iterating over the Vec in a first pass to take ownership of the String element while putting a non-allocating Default in its place. This allows you to keep the Vec in its original form, so no extra container is required.
fn foo(mut inp: Vec<(String, i64)>) {
// First pass over the Vec "extracts" the owned Strings, replacing the content
// in the Vec by a non-allocating empty String, which is close to zero cost;
// this leaves the Vec as is, so no intermediate representation is needed.
for s in inp.iter_mut().map(|(s, _)| std::mem::take(s)) {
// Process String
}
// Something happens
// Second pass ignores the empty strings, processes the integers
for i in inp.into_iter().map(|(_, i)| i) {
// Process the integers
}
}
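As a runnable sketch of the two-pass approach above: the function name `split_passes` and the sample data are made up for illustration, and collecting the strings into a Vec merely stands in for whatever real processing consumes them.

```rust
/// Hypothetical helper: moves every String out in a first pass, then
/// consumes the Vec for the integers in a second pass.
fn split_passes(mut inp: Vec<(String, i64)>) -> (Vec<String>, i64) {
    // First pass: std::mem::take swaps each String with String::new(),
    // which does not allocate, and hands back the original by value.
    let strings: Vec<String> = inp.iter_mut().map(|(s, _)| std::mem::take(s)).collect();

    // The Vec itself is untouched, but every String slot is now empty.
    debug_assert!(inp.iter().all(|(s, _)| s.is_empty()));

    // Second pass: consume the Vec and sum the integers.
    let total: i64 = inp.into_iter().map(|(_, i)| i).sum();
    (strings, total)
}

fn main() {
    let data = vec![("a".to_string(), 1), ("b".to_string(), 2)];
    let (strings, total) = split_passes(data);
    println!("{:?} {}", strings, total);
}
```

The only extra work per element is writing an empty String (three machine words) back into the tuple.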

If the type of the list can be changed from Vec<(String, i64)> to Vec<(Option<String>, i64)>, you can try the following approach.
fn main() {
let mut l = Vec::new();
l.push((Some("a".to_string()), 1i64));
l.push((Some("b".to_string()), 2));
l.push((Some("c".to_string()), 3));
l.push((Some("d".to_string()), 4));
l.iter_mut().for_each(|(s, _)| {
if let Some(x) = s.take() {
println!("Processing string: {}", x);
}
});
l.iter().for_each(|(_, i)| {
println!("Processing int: {}", i);
});
}

Use unzip to separate them:
fn main(){
let original = vec![("a", 1), ("b", 2)];
let (s, i): (Vec<_>, Vec<_>) = original.into_iter().unzip();
for a in s {
println!("{}", a);
}
for b in i {
println!("{}", b);
}
}

Related

Conditionally sort a Vec in Rust

Let's say I want to sort a Vec of non-Clone items - but only maybe (this is a boiled down example of an issue in my code).
My attempt would be something like:
fn maybe_sort<T>(x: Vec<T>) -> Vec<T>
where
T: std::cmp::Ord,
{
// First, I need a copy of the vector - but only the vector, not the items inside
let mut copied = x.iter().collect::<Vec<_>>();
copied.sort();
// In my actual code the line below depends on the sorted vec
if rand::random() {
return copied.into_iter().map(|x| *x).collect::<Vec<_>>();
} else {
return x;
}
}
Alas the borrow checker isn't happy. I have a shared reference to each item in the Vec, and although I am not ever returning 2 references to the same item, Rust can't tell.
Is there a way to do this without unsafe? (And if not, what's the cleanest way to do it with unsafe?)
You can .enumerate() the values to keep their original index. You can sort this based on its value T and decide whether to return the sorted version, or reverse the sort by sorting by original index.
fn maybe_sort<T: Ord>(x: Vec<T>) -> Vec<T> {
let mut items: Vec<_> = x.into_iter().enumerate().collect();
items.sort_by(|(_, a), (_, b)| a.cmp(b));
if rand::random() {
// return items in current order
}
else {
// undo the sort
items.sort_by_key(|(index, _)| *index);
}
items.into_iter().map(|(_, value)| value).collect()
}
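To make the enumerate approach above easy to check, here is a deterministic sketch of it. Replacing the coin flip with an explicit `keep_sorted` flag is an assumption made purely so the result is reproducible; it is not part of the original answer.

```rust
/// Same idea as above, with the random choice replaced by a
/// hypothetical `keep_sorted` flag.
fn maybe_sort<T: Ord>(x: Vec<T>, keep_sorted: bool) -> Vec<T> {
    // Pair every item with its original position.
    let mut items: Vec<(usize, T)> = x.into_iter().enumerate().collect();
    items.sort_by(|(_, a), (_, b)| a.cmp(b));
    if !keep_sorted {
        // Undo the sort by ordering on the remembered positions.
        items.sort_by_key(|(index, _)| *index);
    }
    items.into_iter().map(|(_, value)| value).collect()
}

fn main() {
    let words = || vec!["pear".to_string(), "apple".to_string(), "fig".to_string()];
    println!("{:?}", maybe_sort(words(), true));
    println!("{:?}", maybe_sort(words(), false));
}
```

No item is cloned or copied; the price is a second sort when the original order is wanted back.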
If T implements Default, you can do it with a single sort and without unsafe like this:
fn maybe_sort<T: Ord + Default>(mut x: Vec<T>) -> Vec<T> {
    // idx[k] is the original position of the k-th smallest item
    let mut idx = (0..x.len()).collect::<Vec<_>>();
    idx.sort_by_key(|&i| &x[i]);
    if rand::random() {
        return x;
    } else {
        // Move the items out in sorted order, leaving a Default in each slot
        return idx
            .into_iter()
            .map(|i| std::mem::take(&mut x[i]))
            .collect();
    }
}
If T does not implement Default, the same thing can be done with MaybeUninit:
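use std::mem::{ManuallyDrop, MaybeUninit};
fn maybe_sort<T: Ord>(mut x: Vec<T>) -> Vec<T> {
    let mut idx = (0..x.len()).collect::<Vec<_>>();
    idx.sort_by_key(|&i| &x[i]);
    if rand::random() {
        return x;
    } else {
        // pos[j] is the sorted position of the item originally at index j
        let mut pos = vec![0; idx.len()];
        for (k, &j) in idx.iter().enumerate() {
            pos[j] = k;
        }
        let mut r: Vec<MaybeUninit<T>> = Vec::new();
        r.resize_with(x.len(), MaybeUninit::uninit);
        for (k, v) in pos.into_iter().zip(x.drain(..)) {
            r[k] = MaybeUninit::new(v);
        }
        // SAFETY: pos is a permutation, so every slot has been initialized,
        // and MaybeUninit<T> has the same size and alignment as T.
        let mut r = ManuallyDrop::new(r);
        return unsafe { Vec::from_raw_parts(r.as_mut_ptr() as *mut T, r.len(), r.capacity()) };
    }
}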
Finally, here's a safe solution which doesn't require T to implement Default, but allocates an extra buffer (there is theoretically a way to reorder the indices in place, but I'll leave it as an exercise to the reader ☺):
fn maybe_sort<T: Ord> (mut x: Vec<T>) -> Vec<T> {
let mut idx = (0..x.len()).collect::<Vec<_>>();
idx.sort_by_key (|&i| &x[i]);
if rand::random() {
let mut rev = vec![0; x.len()];
for (i, &j) in idx.iter().enumerate() {
rev[j] = i;
}
for i in 0..x.len() {
while rev[i] != i {
let j = rev[i];
x.swap (j, i);
rev.swap (j, i);
}
}
}
x
}

How can I duplicate the first and last elements of a vector?

I would like to take a vector of characters and duplicate the first letter and the last one.
The only way I managed to do that is with this ugly code:
fn repeat_ends(s: &Vec<char>) -> Vec<char> {
let mut result: Vec<char> = Vec::new();
let first = s.first().unwrap();
let last = s.last().unwrap();
result.push(*first);
result.append(&mut s.clone());
result.push(*last);
result
}
fn main() {
let test: Vec<char> = String::from("Hello world !").chars().collect();
println!("{:?}", repeat_ends(&test)); // "HHello world !!"
}
What would be a better way to do it?
I am not sure if it is "better" but one way is using slice patterns:
fn repeat_ends(s: &Vec<char>) -> Vec<char> {
match s[..] {
[first, .. , last ] => {
let mut out = Vec::with_capacity(s.len() + 2);
out.push(first);
out.extend(s);
out.push(last);
out
},
_ => panic!("whatever"), // or s.clone()
}
}
If it can be mutable:
fn repeat_ends(s: &mut Vec<char>) {
if let [first, .. , last ] = s[..] {
s.insert(0, first);
s.push(last);
}
}
If it's ok to mutate the original vector, this does the job:
fn repeat_ends(s: &mut Vec<char>) {
let first = *s.first().unwrap();
s.insert(0, first);
let last = *s.last().unwrap();
s.push(last);
}
fn main() {
let mut test: Vec<char> = String::from("Hello world !").chars().collect();
repeat_ends(&mut test);
println!("{}", test.into_iter().collect::<String>()); // "HHello world !!"
}
Vec::insert:
Inserts an element at position index within the vector, shifting all elements after it to the right.
This means the function repeat_ends would be O(n) with n being the number of characters in the vector. I'm not sure if there is a more efficient method if you need to use a vector, but I'd be curious to hear it if there is.
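One way to avoid the shifting cost of Vec::insert, at the price of building a new vector, is to chain iterators. This is only a sketch: it is still O(n), and returning an empty Vec for empty input is a choice made here, not something the original code does (it panics).

```rust
use std::iter::once;

/// Hypothetical variant that accepts any slice and never panics.
fn repeat_ends(s: &[char]) -> Vec<char> {
    match (s.first(), s.last()) {
        // first + all elements + last, collected in a single pass
        (Some(&first), Some(&last)) => once(first)
            .chain(s.iter().copied())
            .chain(once(last))
            .collect(),
        // Empty input: nothing to duplicate.
        _ => Vec::new(),
    }
}

fn main() {
    let test: Vec<char> = "Hello world !".chars().collect();
    let repeated: String = repeat_ends(&test).into_iter().collect();
    println!("{}", repeated); // HHello world !!
}
```

Taking `&[char]` instead of `&Vec<char>` lets callers pass arrays and sub-slices as well; a `&Vec<char>` still works through deref coercion.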

How to pass &mut str and change the original mut str without a return?

I'm learning Rust from the Book and I was tackling the exercises at the end of chapter 8, but I'm hitting a wall with the one about converting words into Pig Latin. I wanted to see specifically if I could pass a &mut String to a function that takes a &mut str (to also accept slices) and modify the referenced string inside it so the changes are reflected back outside without the need of a return, like in C with a char **.
I'm not quite sure if I'm just messing up the syntax or if it's more complicated than it sounds due to Rust's strict rules, which I have yet to fully grasp. For the lifetime errors inside to_pig_latin() I remember reading something that explained how to properly handle the situation but right now I can't find it, so if you could also point it out for me it would be very appreciated.
Also what do you think of the way I handled the chars and indexing inside strings?
use std::io::{self, Write};
fn main() {
let v = vec![
String::from("kaka"),
String::from("Apple"),
String::from("everett"),
String::from("Robin"),
];
for s in &v {
// cannot borrow `s` as mutable, as it is not declared as mutable
// cannot borrow data in a `&` reference as mutable
to_pig_latin(&mut s);
}
for (i, s) in v.iter().enumerate() {
print!("{}", s);
if i < v.len() - 1 {
print!(", ");
}
}
io::stdout().flush().unwrap();
}
fn to_pig_latin(mut s: &mut str) {
let first = s.chars().nth(0).unwrap();
let mut pig;
if "aeiouAEIOU".contains(first) {
pig = format!("{}-{}", s, "hay");
s = &mut pig[..]; // `pig` does not live long enough
} else {
let mut word = String::new();
for (i, c) in s.char_indices() {
if i != 0 {
word.push(c);
}
}
pig = format!("{}-{}{}", word, first.to_lowercase(), "ay");
s = &mut pig[..]; // `pig` does not live long enough
}
}
Edit: here's the fixed code with the suggestions from below.
fn main() {
// added mut
let mut v = vec![
String::from("kaka"),
String::from("Apple"),
String::from("everett"),
String::from("Robin"),
];
// added mut
for s in &mut v {
to_pig_latin(s);
}
for (i, s) in v.iter().enumerate() {
print!("{}", s);
if i < v.len() - 1 {
print!(", ");
}
}
println!();
}
// converted into &mut String
fn to_pig_latin(s: &mut String) {
let first = s.chars().nth(0).unwrap();
if "aeiouAEIOU".contains(first) {
s.push_str("-hay");
} else {
// added code to make the new first letter uppercase
let second = s.chars().nth(1).unwrap();
*s = format!(
"{}{}-{}ay",
second.to_uppercase(),
// the slice starts right after the second char of the string
&s[first.len_utf8() + second.len_utf8()..],
first.to_lowercase()
);
}
}
What you're trying to do can't work. With a mutable reference you can update the referent in place, but that is extremely limited here:
a &mut str can't change its length or anything of that nature
a &mut str is still just a reference, so the memory has to live somewhere. Here you're creating new Strings inside your function and then trying to use them as the new backing buffers for the reference, which, as the compiler tells you, doesn't work: the String is deallocated at the end of the function
What you could do is take an &mut String, which lets you modify the owned string itself in place and is much more flexible. In fact, it corresponds exactly to your request: an &mut str corresponds to a char*, it's a pointer to a place in memory.
A String is also a pointer, so an &mut String is a double-pointer to a zone in memory.
So something like this:
fn to_pig_latin(s: &mut String) {
let first = s.chars().nth(0).unwrap();
if "aeiouAEIOU".contains(first) {
*s = format!("{}-{}", s, "hay");
} else {
let mut word = String::new();
for (i, c) in s.char_indices() {
if i != 0 {
word.push(c);
}
}
*s = format!("{}-{}{}", word, first.to_lowercase(), "ay");
}
}
You can also likely avoid some of the complete string allocations by using somewhat finer methods e.g.
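use std::fmt::Write; // brings write! support for String into scope
fn to_pig_latin(s: &mut String) {
    let first = s.chars().nth(0).unwrap();
    if "aeiouAEIOU".contains(first) {
        s.push_str("-hay")
    } else {
        // remove the first char in place, then append the suffix
        s.replace_range(..first.len_utf8(), "");
        write!(s, "-{}ay", first.to_lowercase()).unwrap();
    }
}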
although the replace_range + write! is not very readable and not super likely to be much of a gain, so that might as well be a format!, something along the lines of:
fn to_pig_latin(s: &mut String) {
let first = s.chars().nth(0).unwrap();
if "aeiouAEIOU".contains(first) {
s.push_str("-hay")
} else {
*s = format!("{}-{}ay", &s[first.len_utf8()..], first.to_lowercase());
}
}
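The difference between the two reference types described above can be seen in a small sketch. The helper names `shout` and `add_suffix` are invented for illustration and are not part of the exercise.

```rust
/// Works through `&mut str`: the bytes change, the length does not.
fn shout(s: &mut str) {
    s.make_ascii_uppercase();
}

/// Needs `&mut String`: the buffer may grow or be replaced entirely.
fn add_suffix(s: &mut String) {
    s.push_str("-hay");
}

fn main() {
    let mut word = String::from("apple");
    shout(&mut word); // &mut String coerces to &mut str
    add_suffix(&mut word);
    println!("{}", word);
}
```

A function taking `&mut str` therefore accepts both Strings and slices, but it can only perform edits that keep the byte length fixed.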

Flattening a nested structure

Looking for wisdom on fixing this borrow-checker/lifetime issue in Rust. I'm trying to flatten a generic nested structure (into an impl Iterator or Vec). It's perhaps a few &s and 's away from working:
fn iter_els(prev_result: Vec<&El>) -> Vec<&El> {
// Iterate over all elements from a tree, starting at the top-level element.
let mut result = prev_result.clone();
for el in prev_result {
for child in &el.children {
result.push(&child.clone());
}
result.extend(iter_els(&el.children));
}
result
}
You'll note that the immediate error this raises is that iter_els expects a Vec of refs, not a ref itself. When addressing this directly, other issues rear their mischievous heads, as in a game of oxidized but safe whack-a-mole.
There are various solutions to this task. One would be to pass the result as an out-parameter to the function:
fn iter_els<'el>(el_top: &'el El, result: &mut Vec<&'el El>) {
result.push(el_top);
for el in &el_top.children {
iter_els(el, result);
}
}
fn main() {
// build top_el as you did
let mut result = Vec::new();
iter_els(&top_el, &mut result);
println!("{:?}", result);
}
Adapting your original approach imho results in a more complex implementation:
fn iter_els<'el>(prev_result: &Vec<&'el El>) -> Vec<&'el El> {
// Iterate over all elements from a tree, starting at the top-level element.
let mut result = prev_result.clone();
for el in prev_result {
// iter_els returns its input elements as well, so pushing the children
// here first would list every child twice
result.extend(iter_els(&el.children.iter().collect()));
}
result
}
fn main() {
// build top_el as you did
println!("{:?}", iter_els(&vec![&top_el]));
}
Alternatively:
fn iter_els<'el>(prev_result: &'el Vec<El>) -> Vec<&'el El> {
// Iterate over all elements from a tree, starting at the top-level element.
let mut result : Vec<_> = prev_result.iter().collect();
for el in prev_result {
// iter_els returns its input elements as well, so pushing the children
// here first would list every child twice
result.extend(iter_els(&el.children));
}
result
}
fn main() {
// build top_el as you did
println!("{:?}", iter_els(&vec![top_el]));
}
As you can see, the first approach only operates on an immutable El, and one single result Vec, while the other implementations do not get around clone and collect.
Ideally, you would write a custom Iterator for your tree, but I think this could get quite cumbersome, because this iterator would have to keep track of the current state somehow (maybe someone can prove me wrong and show that it's actually easy to do).
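For what it's worth, such an iterator can stay fairly small if it keeps the pending elements on an explicit stack. The sketch below assumes a simplified El with just a name and children, since the real definition is not shown in the question; `ElIter` and `leaf` are made-up names.

```rust
/// Hypothetical stand-in for the question's `El` type.
#[derive(Debug)]
struct El {
    name: String,
    children: Vec<El>,
}

/// Depth-first, pre-order iterator; its whole state is the stack.
struct ElIter<'el> {
    stack: Vec<&'el El>,
}

impl<'el> Iterator for ElIter<'el> {
    type Item = &'el El;

    fn next(&mut self) -> Option<Self::Item> {
        let el = self.stack.pop()?;
        // Push children in reverse so the first child is visited first.
        self.stack.extend(el.children.iter().rev());
        Some(el)
    }
}

fn iter_els(top: &El) -> ElIter<'_> {
    ElIter { stack: vec![top] }
}

/// Convenience constructor for an element without children.
fn leaf(name: &str) -> El {
    El { name: name.to_string(), children: Vec::new() }
}

fn main() {
    let top = El {
        name: "top".to_string(),
        children: vec![
            El { name: "a".to_string(), children: vec![leaf("a1")] },
            leaf("b"),
        ],
    };
    let names: Vec<&str> = iter_els(&top).map(|el| el.name.as_str()).collect();
    println!("{:?}", names); // ["top", "a", "a1", "b"]
}
```

Because the iterator only stores shared references, nothing is cloned, and callers can still `collect()` into a Vec<&El> when they need one.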

What is the idiomatic way to pop the last N elements in a mutable Vec?

I am contributing Rust code to RosettaCode to both learn Rust and contribute to the Rust community at the same time. What is the best idiomatic way to pop the last n elements in a mutable Vec?
Here's roughly what I have written but I'm wanting to see if there's a better way:
fn main() {
let mut nums: Vec<u32> = Vec::new();
nums.push(1);
nums.push(2);
nums.push(3);
nums.push(4);
nums.push(5);
let n = 2;
for _ in 0..n {
nums.pop();
}
for e in nums {
println!("{}", e)
}
}
I'd recommend using Vec::truncate:
fn main() {
let mut nums = vec![1, 2, 3, 4, 5];
let n = 2;
let final_length = nums.len().saturating_sub(n);
nums.truncate(final_length);
println!("{:?}", nums);
}
Additionally, I
used saturating_sub to handle the case where there aren't N elements in the vector
used vec![] to construct the vector of numbers easily
printed out the entire vector in one go
Normally when you "pop" something, you want to have those values. If you want the values in another vector, you can use Vec::split_off:
let tail = nums.split_off(final_length);
If you want access to the elements but do not want to create a whole new vector, you can use Vec::drain:
for i in nums.drain(final_length..) {
println!("{}", i)
}
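Putting saturating_sub and split_off together gives a small reusable helper. This is a sketch; the name `pop_last_n` is made up here and is not a standard library function.

```rust
/// Hypothetical helper: removes up to `n` elements from the end of `v`
/// and returns them in their original order.
fn pop_last_n<T>(v: &mut Vec<T>, n: usize) -> Vec<T> {
    // saturating_sub clamps at 0, so n > v.len() removes everything.
    let final_length = v.len().saturating_sub(n);
    v.split_off(final_length)
}

fn main() {
    let mut nums = vec![1, 2, 3, 4, 5];
    let tail = pop_last_n(&mut nums, 2);
    println!("kept {:?}, removed {:?}", nums, tail);

    // Asking for more than is available empties the vector instead of panicking.
    let rest = pop_last_n(&mut nums, 10);
    println!("kept {:?}, removed {:?}", nums, rest);
}
```

If the removed values are not needed, truncate does the same job without allocating the second vector.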
An alternate approach would be to use Vec::drain instead. This gives you an iterator so you can actually use the elements that are removed.
fn main() {
let mut nums: Vec<u32> = Vec::new();
nums.push(1);
nums.push(2);
nums.push(3);
nums.push(4);
nums.push(5);
let n = 2;
let new_len = nums.len() - n;
for removed_element in nums.drain(new_len..) {
println!("removed: {}", removed_element);
}
for retained_element in nums {
println!("retained: {}", retained_element);
}
}
The drain method accepts any range that implements RangeBounds<usize>, for example <start-inclusive>..<end-exclusive>. Either bound may be omitted to default to the beginning or end of the vector. So above, we're really just saying: start at new_len and drain to the end.
You should take a look at the Vec::truncate function from the standard library, that can do this for you.
fn main() {
let mut nums: Vec<u32> = Vec::new();
nums.push(1);
nums.push(2);
nums.push(3);
nums.push(4);
nums.push(5);
let n = 2;
let new_len = nums.len() - n;
nums.truncate(new_len);
for e in nums {
println!("{}", e)
}
}
