Insert constructed string into Vec in Rust

Right now I am writing a program where I am updating a Vec with a string constructed based on conditions in a for loop. A (very contrived) simplified form of what I'm trying to do is the following:
fn main() {
let mut arr = vec!["_"; 5];
for (i, chr) in "abcde".char_indices() {
arr[i] = &chr.to_string().repeat(3);
}
}
However, I am getting the error temporary value dropped while borrowed. Any pointers on what to do here?

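arr lives for the whole of main, but the String produced by chr.to_string().repeat(3) is a temporary that is dropped at the end of the statement that creates it. Storing a reference to it in arr would leave arr holding a dangling reference, which is why the compiler reports the error.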
You can avoid this problem by using a Vec<String> instead of Vec<&str>.
fn main() {
let mut arr = vec!["_".to_string(); 5];
for (i, chr) in "abcde".char_indices() {
arr[i] = chr.to_string().repeat(3);
}
}
Here the String "_".to_string() is cloned five times only to be overwritten, which is wasteful. I suspect this is not the case in your real code.

If you use String, you can also try this one-liner:
let arr: Vec<String> = "abcde".chars().map(|c| c.to_string().repeat(3)).collect();
Output:
["aaa", "bbb", "ccc", "ddd", "eee"]

As others have mentioned, the Strings that you are creating have to be owned by someone, otherwise they end up being dropped. Because the compiler detects that this drop occurs while your vector still holds references to them, it complains.
You need to think about who needs to own these values. If your eventual array is the natural place for them to live, just move them there:
fn main() {
let mut arr = Vec::with_capacity(5);
for (i, chr) in "abcde".char_indices() {
arr.push(chr.to_string().repeat(3));
}
}
If you absolutely need a vector of references (&String here, but the same applies to &str), you still need to keep the owned values alive for at least as long as the references themselves:
fn i_only_consume_refs(data: Vec<&String>) -> () {}
fn main() {
let mut arr = Vec::with_capacity(5);
for (i, chr) in "abcde".char_indices() {
arr.push(chr.to_string().repeat(3));
}
let refs = arr.iter().collect();
i_only_consume_refs(refs)
}
Here, we are still moving all the created Strings into the vector arr, and only THEN taking references to its elements. This way, the vector of references is valid for as long as arr (which owns the strings) is.
TL;DR: Someone needs to own these Strings while you keep references to them. You cannot create temporary strings, and only store the reference, otherwise you will have a reference to a dropped value, which is very bad indeed, and the compiler will not let you do that.
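If the consumer really wants &str rather than &String, the same ownership split applies. A minimal sketch, assuming two hypothetical helpers (build_owned and total_len) that are not part of the original code:

```rust
// Hypothetical helper: builds the owned strings ("aaa", "bbb", ...).
fn build_owned(input: &str) -> Vec<String> {
    input.chars().map(|c| c.to_string().repeat(3)).collect()
}

// Hypothetical consumer that only needs borrowed string slices.
fn total_len(data: &[&str]) -> usize {
    data.iter().map(|s| s.len()).sum()
}

fn main() {
    // `owned` keeps every String alive for the rest of `main`.
    let owned = build_owned("abcde");
    // `views` borrows from `owned`, so it must not outlive it.
    let views: Vec<&str> = owned.iter().map(String::as_str).collect();
    println!("{:?} ({} bytes)", views, total_len(&views));
}
```

The only requirement is that owned is declared in a scope at least as long-lived as views.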

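The problem is that arr only holds references, and the strings inside must be owned elsewhere. A possible fix is to simply leak the transient String you created inside the loop. Be aware that leaked memory is never freed, so this only makes sense for a small, bounded number of strings that should live until the program exits.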
fn main() {
let mut arr = vec!["_"; 5];
for (i, chr) in "abcde".char_indices() {
arr[i] = Box::leak(Box::new(chr.to_string().repeat(3)));
}
}
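A variation on the same idea, sketched with a hypothetical helper named leak_str: converting the String into a Box<str> first leaks only the string data and yields a &'static str directly. The caveat stays the same, since the memory is never reclaimed.

```rust
// Hypothetical helper: leaks the string data and returns a 'static slice.
// The allocation is intentionally never freed.
fn leak_str(s: String) -> &'static str {
    Box::leak(s.into_boxed_str())
}

fn main() {
    let mut arr = vec!["_"; 5];
    for (i, chr) in "abcde".char_indices() {
        // Each iteration leaks one small allocation.
        arr[i] = leak_str(chr.to_string().repeat(3));
    }
    println!("{:?}", arr);
}
```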

Related

Alternative to swapping vector elements in rust

I'm experimenting with rust by porting some c++ code. I write a lot of code that uses vectors as object pools by moving elements to the back in various ways and then resizing. Here's a ported function:
use rand::{thread_rng, Rng};
fn main() {
for n in 1..11 {
let mut a: Vec<u8> = (1..11).collect();
keep_n_rand(&mut a, n);
println!("{}: {:?}", n, a);
}
}
fn keep_n_rand<T>(x: &mut Vec<T>, n: usize) {
let mut rng = thread_rng();
for i in n..x.len() {
let j = rng.gen_range(0..i);
if j < n {
x.swap(i, j);
}
}
x.truncate(n);
}
It keeps n elements chosen at random. It is done this way because it does not reduce the capacity of the vector so that more objects can be added later without allocating (on average). This might be iterated millions of times.
In c++, I would use x[j] = std::move(x[i]); because I am about to truncate the vector. While it has no impact in this example, if the swap was expensive, it would make sense to move. Is that possible and desirable in rust? I can live with a swap. I'm just curious.
Correct me if I'm wrong: you're looking for a way to retain n random elements in a Vec and discard the rest. In that case, the easiest way would be to use partial_shuffle(), a rand function implemented for slices.
Shuffle a slice in place, but exit early.
Returns two mutable slices from the source slice. The first contains amount elements randomly permuted. The second has the remaining elements that are not fully shuffled.
use rand::{thread_rng, seq::SliceRandom};
fn main() {
let mut rng = thread_rng();
// Use the `RangeInclusive` (`..=`) syntax at times like this.
for n in 1..=10 {
let mut elements: Vec<u8> = (1..=10).collect();
let (elements, _rest) = elements.as_mut_slice().partial_shuffle(&mut rng, n);
println!("{n}: {elements:?}");
}
}
Run this snippet on Rust Playground.
elements is shadowed, going from a Vec to a &mut [T]. If you're only going to use it inside the function, that's probably all you'll need. However, since it's a reference, you can't return it; the data it's pointing to is owned by the original vector, which will be dropped when it goes out of scope. If that's what you need, you'll have to turn the slice into a Vec.
While you can simply construct a new one from it using Vec::from, I suspect (but haven't tested) that it's more efficient to use Vec::split_off.
Splits the collection into two at the given index.
Returns a newly allocated vector containing the elements in the range [at, len). After the call, the original vector will be left containing the elements [0, at) with its previous capacity unchanged.
use rand::{thread_rng, seq::SliceRandom};
fn main() {
let mut rng = thread_rng();
for n in 1..=10 {
let mut elements: Vec<u8> = (1..=10).collect();
elements.as_mut_slice().partial_shuffle(&mut rng, n);
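// rand 0.8 leaves the `n` shuffled elements at the end of the slice,
// so they are split off starting at index `len - n`.
let elements = elements.split_off(elements.len() - n);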
// `elements` is still a `Vec`; this time, containing only
// the shuffled elements. You can use it as the return value.
println!("{n}: {elements:?}");
}
}
Run this snippet on Rust Playground.
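To see what split_off does to both vectors without involving rand, here is a small standard-library-only sketch. The helper split_tail is a made-up name for illustration:

```rust
// Hypothetical helper: moves the last `n` elements into a new Vec.
fn split_tail(pool: &mut Vec<u8>, n: usize) -> Vec<u8> {
    let at = pool.len() - n;
    pool.split_off(at)
}

fn main() {
    let mut pool: Vec<u8> = Vec::with_capacity(16);
    pool.extend(1..=10);
    let capacity_before = pool.capacity();

    let tail = split_tail(&mut pool, 3);

    // The tail is a freshly allocated Vec holding the last three elements.
    assert_eq!(tail, [8, 9, 10]);
    // The original keeps the front elements and its previous capacity.
    assert_eq!(pool, [1, 2, 3, 4, 5, 6, 7]);
    assert_eq!(pool.capacity(), capacity_before);
    println!("kept {:?}, split off {:?}", pool, tail);
}
```

Note that the returned vector is a new allocation, which matters if the goal is to reuse the original buffer as an object pool.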
Since this function lives on a performance-critical path, I'd recommend benchmarking it against your current implementation. At the time of writing this, criterion is the most popular way to do that. That said, rand is an established library, so I imagine it will perform as well as or better than a manual implementation.
Sample Benchmark
I don't know what kind of numbers you're working with, but here's a sample benchmark with for n in 1..=100 and (1..=100).collect() (i.e. 100 instead of 10 in both places) without the print statements:
manual time: [73.683 µs 73.749 µs 73.821 µs]
rand with slice time: [68.074 µs 68.147 µs 68.226 µs]
rand with vec time: [54.147 µs 54.213 µs 54.288 µs]
Bizarrely, splitting off a Vec performed vastly better than not. Unless I made an error in my benchmarks, the compiler is probably doing something under the hood that you'll need a more experienced Rustacean than me to explain.
Benchmark Implementation
Cargo.toml
[dependencies]
rand = "0.8.5"
[dev-dependencies]
criterion = "0.4.0"
[[bench]]
name = "rand_benchmark"
harness = false
[[bench]]
name = "rand_vec_benchmark"
harness = false
[[bench]]
name = "manual_benchmark"
harness = false
benches/manual_benchmark.rs
use criterion::{criterion_group, criterion_main, Criterion};
fn manual_solution() {
for n in 1..=100 {
let mut elements: Vec<u8> = (1..=100).collect();
keep_n_rand(&mut elements, n);
}
}
fn keep_n_rand<T>(elements: &mut Vec<T>, n: usize) {
use rand::{thread_rng, Rng};
let mut rng = thread_rng();
for i in n..elements.len() {
let j = rng.gen_range(0..i);
if j < n {
elements.swap(i, j);
}
}
elements.truncate(n);
}
fn benchmark(c: &mut Criterion) {
c.bench_function("manual", |b| b.iter(manual_solution));
}
criterion_group!(benches, benchmark);
criterion_main!(benches);
benches/rand_benchmark.rs
use criterion::{criterion_group, criterion_main, Criterion};
fn rand_solution() {
use rand::{seq::SliceRandom, thread_rng};
let mut rng = thread_rng();
for n in 1..=100 {
let mut elements: Vec<u8> = (1..=100).collect();
let (_elements, _) = elements.as_mut_slice().partial_shuffle(&mut rng, n);
}
}
fn benchmark(c: &mut Criterion) {
c.bench_function("rand with slice", |b| b.iter(rand_solution));
}
criterion_group!(benches, benchmark);
criterion_main!(benches);
benches/rand_vec_benchmark.rs
use criterion::{criterion_group, criterion_main, Criterion};
fn rand_solution() {
use rand::{seq::SliceRandom, thread_rng};
let mut rng = thread_rng();
for n in 1..=100 {
let mut elements: Vec<u8> = (1..=100).collect();
elements.as_mut_slice().partial_shuffle(&mut rng, n);
let _elements = elements.split_off(elements.len() - n);
}
}
fn benchmark(c: &mut Criterion) {
c.bench_function("rand with vec", |b| b.iter(rand_solution));
}
criterion_group!(benches, benchmark);
criterion_main!(benches);
Is that possible and desirable in rust?
It is not possible unless you constrain T: Copy or T: Clone. While C++ uses non-destructive moves (the source is left in a valid but unspecified state), Rust uses destructive moves (the source is gone), so you cannot move a value out of a vector slot and leave the slot behind.
There are ways around it using unsafe but they require being very careful and it's probably not worth the hassle (you can look at Vec::swap_remove for a taste, it basically does what you're doing here except only between j and the last element of the vec).
I'd also recommend verified_tinker's solution, as I'm not convinced your shuffle is unbiased.
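To illustrate the destructive-move point with safe code only: one middle ground, assuming you can accept a T: Default bound, is std::mem::take, which moves the value out and leaves a default value in the slot. The helper move_slot below is hypothetical, and the last lines show what Vec::swap_remove does.

```rust
use std::mem;

// Hypothetical helper: overwrites `x[j]` with `x[i]`,
// leaving `T::default()` behind in slot `i`.
fn move_slot<T: Default>(x: &mut [T], i: usize, j: usize) {
    let moved = mem::take(&mut x[i]);
    x[j] = moved; // the old value of x[j] is dropped here
}

fn main() {
    let mut v = vec![String::from("a"), String::from("b"), String::from("c")];
    move_slot(&mut v, 2, 0);
    v.truncate(2);
    assert_eq!(v, ["c", "b"]);

    // `swap_remove` removes index 1 and fills the hole with the last element.
    let mut w = vec![10, 20, 30, 40];
    let removed = w.swap_remove(1);
    assert_eq!(removed, 20);
    assert_eq!(w, [10, 40, 30]);
}
```

For cheap types this is unlikely to beat a plain swap, so measure before switching.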

iterating through the chars on the reference of a string

This does not work:
let mut word = String::from("kobin");
for x in &word.chars() {
println!("looping through word: {}", x);
}
But this works:
let mut word = String::from("kobin");
let word_ref = &word;
for x in word_ref.chars() {
println!("looping through word: {}", x);
}
What's the difference? Aren't both referencing word?
&word.chars() is the same as &(word.chars()), so you're taking the iterator and borrowing it. Rust points out in this case that a reference to Chars (the iterator type) is not an iterator, but a Chars itself is. Parenthesizing fully will work:
for x in (&word).chars() { ... }
But when calling methods on things, Rust is smart and will automatically borrow, so you can simply do
for x in word.chars() { ... }
and Rust is smart enough to know that str::chars only needs &self and will insert the & for you.
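The three working spellings can be compared side by side. A small sketch, using a hypothetical function chars_three_ways:

```rust
// Hypothetical helper: collects the characters of `word` in three
// equivalent ways and returns all three results.
fn chars_three_ways(word: String) -> (String, String, String) {
    // Explicit borrow, parenthesized so `&` applies to `word`,
    // not to the iterator returned by `chars()`.
    let explicit: String = (&word).chars().collect();
    // Method-call syntax borrows `word` automatically.
    let auto_ref: String = word.chars().collect();
    // Going through a named reference works the same way.
    let word_ref = &word;
    let via_ref: String = word_ref.chars().collect();
    (explicit, auto_ref, via_ref)
}

fn main() {
    let (a, b, c) = chars_three_ways(String::from("kobin"));
    assert_eq!(a, b);
    assert_eq!(b, c);
    println!("{a}");
}
```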

Construct string slice from vector of string slices

I have a vector holding n string slices. I would like to construct a string slice based on these.
fn main() {
let v: Vec<&str> = vec!["foo", "bar"];
let h: &str = "home";
let result = format!("hello={}#{}&{}#{}", v[0], h, v[1], h);
println!("{}", result);
}
I searched through the docs but I failed to find anything on this subject.
This can be done (somewhat inefficiently) with iterators:
let result = format!("hello={}",
v.iter().map(|s| format!("{}#{}", s, h))
.collect::<Vec<_>>()
.join("&")
);
(Playground)
If high performance is needed, a loop that builds a String will be quite a bit faster. The approach above allocates an additional String for each input &str, then a vector to hold them all before finally joining them together.
Here's a more efficient way to implement this. The function calls the passed closure for each element in the iterator, giving it access to the std::fmt::Write reference passed in, and writes the delimiter between successive calls. (Note that String implements std::fmt::Write!)
use std::fmt::Write;
fn delimited_write<W, I, V, F>(writer: &mut W, seq: I, delim: &str, mut func: F)
-> Result<(), std::fmt::Error>
where W: Write,
I: IntoIterator<Item=V>,
F: FnMut(&mut W, V) -> Result<(), std::fmt::Error>
{
let mut iter = seq.into_iter();
match iter.next() {
None => { },
Some(v) => {
func(writer, v)?;
for v in iter {
writer.write_str(delim)?;
func(writer, v)?;
}
},
};
Ok(())
}
You'd use it to implement your operation like so:
use std::fmt::Write;
fn main() {
let v: Vec<&str> = vec!["foo", "bar"];
let h: &str = "home";
let mut result: String = "hello=".to_string();
delimited_write(&mut result, v.iter(), "&", |w, i| {
write!(w, "{}#{}", i, h)
}).expect("write succeeded");
println!("{}", result);
}
It's not as pretty, but it makes no temporary String or Vec allocations. (Playground)
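For this one specific format, a hand-written loop is another way to avoid the temporary allocations. This is only a sketch, and build_query is a hypothetical name:

```rust
// Hypothetical helper: builds "hello=a#h&b#h&..." directly into one String.
fn build_query(v: &[&str], h: &str) -> String {
    let mut result = String::from("hello=");
    for (i, s) in v.iter().enumerate() {
        // Write the separator before every element except the first.
        if i > 0 {
            result.push('&');
        }
        result.push_str(s);
        result.push('#');
        result.push_str(h);
    }
    result
}

fn main() {
    let v: Vec<&str> = vec!["foo", "bar"];
    println!("{}", build_query(&v, "home"));
}
```

It is less reusable than the generic delimited_write, but easier to read when the format is fixed.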
You will need to iterate over the vector as cdhowie suggests above. Let me explain why this is necessarily an O(n) problem and you can't create a single string slice from a vector of string slices without iterating over the vector:
Your vector only holds references to the strings; it doesn't hold the strings themselves. The strings are likely not stored contiguously in memory (only their references inside your vector are) so combining them into a single slice is not as simple as creating a slice that points to the beginning of the first string referenced in the vector and then extending the size of the slice.
Given that a &str is just a pointer to a location in memory (or in the application binary) plus an integer giving the length of the slice, and that the str it points to is essentially a sequence of UTF-8 bytes, you can imagine the following: if the first &str in your vector references a string stored in one place and the next one references a hardcoded string stored in the executable binary of the program, there is no way to create a single &str that points to both strs without copying at least one of them (in practice, both will probably be copied).
In order to get a single string slice from all of those &str's in your vector, you need to copy each of the str's they reference to a single, contiguous chunk of memory and then create a slice of that chunk. That copying requires iterating over the vector.
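In code, that copy is what slice concatenation does for you. A minimal sketch with a hypothetical helper join_all; the resulting &str borrows from the newly allocated String, not from the inputs:

```rust
// Hypothetical helper: copies every piece into one contiguous String.
fn join_all(parts: &[&str]) -> String {
    parts.concat()
}

fn main() {
    // One piece lives on the heap, the other in the program binary.
    let on_heap = String::from("foo");
    let v: Vec<&str> = vec![on_heap.as_str(), "bar"];

    let owned = join_all(&v);
    // Only now can a single slice cover all of the text.
    let slice: &str = &owned;
    println!("{}", slice);
}
```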

How to destroy reference within scope?

All I am trying to do is copy a vector to another vector by iterating through it. I am getting the error borrowed value does not live long enough. I understand why I am seeing this: a reference must not outlive the value it points to. My question is, how can I destroy the reference as well within the scope after using it?
fn main() {
let x: Vec<&str> = vec!["a","b","c","d"];
let mut y: Vec<&str> = vec![];
let z: String = String::from("xy");
let p: &String = &z;
for i in x {
let k = [i, p].concat();
let q: &str = &k;
y.push(q);
}
println!("{:?}",y);
}
I want to keep Vec<&str> for mut y. I don't want to change it to Vec<String> although that's possible, because I want to keep it on the stack, not heap. Is that possible?
k is allocated on the heap regardless of how you define your Vec; the problem is the scope of k.
for i in x {
let k = [i, p].concat();
let q: &str = &k;
y.push(q);
} // <- k goes out of scope at the end of each iteration.
y can not hold a reference to k because it will not exist when the loop finishes.
As you pointed out, Vec<String> will solve your issue because String is an owned value, whereas &str is a borrowed one, and you are attempting to borrow from a String (k) that has a shorter lifetime than y.
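If y must still end up as a Vec<&str>, one possible arrangement is to collect the owned Strings first and borrow from them afterwards. This is a sketch, and suffix_all is a hypothetical helper name:

```rust
// Hypothetical helper: appends `p` to every element, returning owned Strings.
fn suffix_all(x: &[&str], p: &str) -> Vec<String> {
    x.iter().map(|i| [*i, p].concat()).collect()
}

fn main() {
    let x: Vec<&str> = vec!["a", "b", "c", "d"];
    let z = String::from("xy");

    // `owned` outlives the loop, so references into it stay valid.
    let owned = suffix_all(&x, &z);
    let y: Vec<&str> = owned.iter().map(String::as_str).collect();
    println!("{:?}", y);
}
```

The concatenated text is heap-allocated either way; the difference is only in who owns it.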

How to translate "x-y" to vec![x, x+1, … y-1, y]?

This solution seems rather inelegant:
fn parse_range(&self, string_value: &str) -> Vec<u8> {
let values: Vec<u8> = string_value
.splitn(2, "-")
.map(|part| part.parse().ok().unwrap())
.collect();
{ values[0]..(values[1] + 1) }.collect()
}
Since splitn(2, "-") returns exactly two results for any valid string_value, it would be better to assign the tuple directly to two variables first and last rather than a seemingly arbitrary-length Vec. I can't seem to do this with a tuple.
There are two instances of collect(), and I wonder if it can be reduced to one (or even zero).
Trivial implementation
fn parse_range(string_value: &str) -> Vec<u8> {
let pos = string_value.find('-').expect("No valid string");
let (first, second) = string_value.split_at(pos);
let first: u8 = first.parse().expect("Not a number");
let second: u8 = second[1..].parse().expect("Not a number");
{ first..second + 1 }.collect()
}
Playground
I would recommend returning a Result<Vec<u8>, Error> instead of panicking with expect/unwrap.
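A sketch of what the Result-returning variant could look like. The RangeError type is made up for this example, and ..= is used in place of + 1:

```rust
use std::num::ParseIntError;

// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum RangeError {
    MissingDash,
    BadNumber(ParseIntError),
}

impl From<ParseIntError> for RangeError {
    fn from(e: ParseIntError) -> Self {
        RangeError::BadNumber(e)
    }
}

fn parse_range(string_value: &str) -> Result<Vec<u8>, RangeError> {
    let pos = string_value.find('-').ok_or(RangeError::MissingDash)?;
    let (first, rest) = string_value.split_at(pos);
    let first: u8 = first.parse()?;
    // `rest` still starts with the '-', so skip one byte.
    let second: u8 = rest[1..].parse()?;
    Ok((first..=second).collect())
}

fn main() {
    println!("{:?}", parse_range("3-7"));
    println!("{:?}", parse_range("37"));
}
```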
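Iterator implementation (nightly-only when written)
My next thought was about the second collect. Here is a code example that returns an iterator, so you won't need any collect at all. When this was written it required the nightly features conservative_impl_trait and inclusive_range_syntax; both have been stable since Rust 1.26, so no feature attribute is needed today.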
fn parse_range(string_value: &str) -> impl Iterator<Item = u8> {
let pos = string_value.find('-').expect("No valid string");
let (first, second) = string_value.split_at(pos);
let first: u8 = first.parse().expect("Not a number");
let second: u8 = second[1..].parse().expect("Not a number");
first..=second
}
fn main() {
println!("{:?}", parse_range("3-7").collect::<Vec<u8>>());
}
Instead of calling collect the first time, just advance the iterator:
let mut values = string_value
.splitn(2, "-")
.map(|part| part.parse().unwrap());
let start = values.next().unwrap();
let end = values.next().unwrap();
Do not call .ok().unwrap(): that converts the Result, with its useful error information, into an Option, which carries none. Just call unwrap directly on the Result.
As already mentioned, if you want to return a Vec, you'll want to call collect to create it. If you want to return an iterator, you can. It's not bad even in stable Rust:
fn parse_range(string_value: &str) -> std::ops::Range<u8> {
let mut values = string_value
.splitn(2, "-")
.map(|part| part.parse().unwrap());
let start = values.next().unwrap();
let end = values.next().unwrap();
start..end + 1
}
fn main() {
assert!(parse_range("1-5").eq(1..6));
}
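When this was written, inclusive ranges were not yet stable, hence the + 1. Since Rust 1.26 you can write start..=end and return std::ops::RangeInclusive<u8> instead, which also avoids the overflow that end + 1 causes when end is 255.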
Since splitn(2, "-") returns exactly two results for any valid string_value, it would be better to assign the tuple directly to two variables first and last rather than a seemingly arbitrary-length Vec. I can't seem to do this with a tuple.
This is not possible with Rust's type system. You are asking for dependent types, a way for runtime values to interact with the type system. You'd want splitn to return a (&str, &str) for a value of 2 and a (&str, &str, &str) for a value of 3. That gets even more complicated when the argument is a variable, especially when it's set at run time.
The closest workaround would be to have a runtime check that there are no more values:
assert!(values.next().is_none());
Such a check doesn't feel valuable to me.
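For the specific case of exactly two parts, newer versions of the standard library (Rust 1.52 and later) offer str::split_once, which does return a tuple. A minimal sketch that returns Option purely to keep the example short:

```rust
use std::ops::RangeInclusive;

// Sketch only: returns None when the dash is missing or a part is not a u8.
fn parse_range(string_value: &str) -> Option<RangeInclusive<u8>> {
    // `split_once` yields exactly two pieces, so no Vec or iterator juggling.
    let (first, last) = string_value.split_once('-')?;
    let first: u8 = first.parse().ok()?;
    let last: u8 = last.parse().ok()?;
    Some(first..=last)
}

fn main() {
    assert_eq!(parse_range("1-5"), Some(1..=5));
    assert_eq!(parse_range("15"), None);
    println!("{:?}", parse_range("3-7").map(|r| r.collect::<Vec<u8>>()));
}
```

In real code you would probably prefer a Result so the caller can see why parsing failed.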
See also:
What is the correct way to return an Iterator (or any other trait)?
How do I include the end value in a range?
