How do I convert a Vec<String> to Vec<&str>? - string

I can convert Vec<String> to Vec<&str> this way:
let mut items = Vec::<&str>::new();
for item in &another_items {
items.push(item);
}
Are there better alternatives?

There are quite a few ways to do it, some have disadvantages, others simply are more readable to some people.
This dereferences s (which is of type &String) to a String "right hand side reference", which is then dereferenced through the Deref trait to a str "right hand side reference" and then turned back into a &str. This is something that is very commonly seen in the compiler, and I therefor consider it idiomatic.
let v2: Vec<&str> = v.iter().map(|s| &**s).collect();
Here the deref function of the Deref trait is passed to the map function. It's pretty neat but requires useing the trait or giving the full path.
let v3: Vec<&str> = v.iter().map(std::ops::Deref::deref).collect();
This uses coercion syntax.
let v4: Vec<&str> = v.iter().map(|s| s as &str).collect();
This takes a RangeFull slice of the String (just a slice into the entire String) and takes a reference to it. It's ugly in my opinion.
let v5: Vec<&str> = v.iter().map(|s| &s[..]).collect();
This is uses coercions to convert a &String into a &str. Can also be replaced by a s: &str expression in the future.
let v6: Vec<&str> = v.iter().map(|s| { let s: &str = s; s }).collect();
The following (thanks #huon-dbaupp) uses the AsRef trait, which solely exists to map from owned types to their respective borrowed type. There's two ways to use it, and again, prettiness of either version is entirely subjective.
let v7: Vec<&str> = v.iter().map(|s| s.as_ref()).collect();
and
let v8: Vec<&str> = v.iter().map(AsRef::as_ref).collect();
My bottom line is use the v8 solution since it most explicitly expresses what you want.

The other answers simply work. I just want to point out that if you are trying to convert the Vec<String> into a Vec<&str> only to pass it to a function taking Vec<&str> as argument, consider revising the function signature as:
fn my_func<T: AsRef<str>>(list: &[T]) { ... }
instead of:
fn my_func(list: &Vec<&str>) { ... }
As pointed out by this question: Function taking both owned and non-owned string collections. In this way both vectors simply work without the need of conversions.

All of the answers idiomatically use iterators and collecting instead of a loop, but do not explain why this is better.
In your loop, you first create an empty vector and then push into it. Rust makes no guarantees about the strategy it uses for growing factors, but I believe the current strategy is that whenever the capacity is exceeded, the vector capacity is doubled. If the original vector had a length of 20, that would be one allocation, and 5 reallocations.
Iterating from a vector produces an iterator that has a "size hint". In this case, the iterator implements ExactSizeIterator so it knows exactly how many elements it will return. map retains this and collect takes advantage of this by allocating enough space in one go for an ExactSizeIterator.
You can also manually do this with:
let mut items = Vec::<&str>::with_capacity(another_items.len());
for item in &another_items {
items.push(item);
}
Heap allocations and reallocations are probably the most expensive part of this entire thing by far; far more expensive than taking references or writing or pushing to a vector when no new heap allocation is involved. It wouldn't surprise me if pushing a thousand elements onto a vector allocated for that length in one go were faster than pushing 5 elements that required 2 reallocations and one allocation in the process.
Another unsung advantage is that using the methods with collect do not store in a mutable variable which one should not use if it's unneeded.

another_items.iter().map(|item| item.deref()).collect::<Vec<&str>>()
To use deref() you must add using use std::ops::Deref

This one uses collect:
let strs: Vec<&str> = another_items.iter().map(|s| s as &str).collect();

Here is another option:
use std::iter::FromIterator;
let v = Vec::from_iter(v.iter().map(String::as_str));
Note that String::as_str is stable since Rust 1.7.

Related

Confused about ownership in situations involving lines and map

fn problem() -> Vec<&'static str> {
let my_string = String::from("First Line\nSecond Line");
my_string.lines().collect()
}
This fails with the compilation error:
|
7 | my_string.lines().collect()
| ---------^^^^^^^^^^^^^^^^^^
| |
| returns a value referencing data owned by the current function
| `my_string` is borrowed here
I understand what this error means - it's to stop you returning a reference to a value which has gone out of scope. Having looked at the type signatures of the functions involved, it appears that the problem is with the lines method, which borrows the string it's called on. But why does this matter? I'm iterating over the lines of the string in order to get a vector of the parts, and what I'm returning is this "new" vector, not anything that would (illegally) directly reference my_string.
(I'm aware I could fix this particular example very easily by just using the string literal rather than converting to an "owned" string with String::from. This is a toy example to reproduce the problem - in my "real" code the string variable is read from a file, so I obviously can't use a literal.)
What's even more mysterious to me is that the following variation on the function, which to me ought to suffer from the same problem, works fine:
fn this_is_ok() -> Vec<i32> {
let my_string = String::from("1\n2\n3\n4");
my_string.lines().map(|n| n.parse().unwrap()).collect()
}
The reason can't be map doing some magic, because this also fails:
fn also_fails() -> Vec<&'static str> {
let my_string = String::from("First Line\nSecond Line");
my_string.lines().map(|s| s).collect()
}
I've been playing about for quite a while, trying various different functions inside the map - and some pass and some fail, and I've honestly no idea what the difference is. And all this is making me realise that I have very little handle on how Rust's ownership/borrowing rules work in non-trivial cases, even though I thought I at least understood the basics. So if someone could give me a relatively clear and comprehensive guide to what is going on in all these examples, and how it might be possible to fix those which fail, in some straightforward way, I would be extremely grateful!
The key is in the type of the value yielded by lines: &str. In order to avoid unnecessary clones, lines actually returns references to slices of the string it's called on, and when you collect it to a Vec, that Vec's elements are simply references to slices of your string. So, of course when your function exits and the string is dropped, the references inside the Vec will be dropped and invalid. Remember, &str is a borrowed string, and String is an owned string.
The parsing works because you take those &strs then you read them into an i32, so the data is transferred to a new value and you no longer need a reference to the original string.
To fix your problem, simply use str::to_owned to convert each element into a String:
fn problem() -> Vec<String> {
let my_string = String::from("First Line\nSecond Line");
my_string.lines().map(|v| v.to_owned()).collect()
}
It should be noted that to_string also works, and that to_owned is actually part of the ToOwned trait, so it is useful for other borrowed types as well.
For references to sized values (str is unsized so this doesn't apply), such as an Iterator<Item = &i32>, you can simply use Iterator::cloned to clone every element so they are no longer references.
An alternative solution would be to take the String as an argument so it, and therefore references to it, can live past the scope of the function:
fn problem(my_string: &str) -> Vec<&str> {
my_string.lines().collect()
}
The problem here is that this line:
let my_string = String::from("First Line\nSecond Line");
copies the string data to a buffer allocated on the heap (so no longer 'static). Then lines returns references to that heap-allocated buffer.
Note that &str also implements a lines method, so you don't need to copy the string data to the heap, you can use your string directly:
fn problem() -> Vec<&'static str> {
let my_string = "First Line\nSecond Line";
my_string.lines().collect()
}
Playground
which avoids all unnecessary allocations and copying.

How to prepend a slice to a Vec [duplicate]

This question already has answers here:
Efficiently insert or replace multiple elements in the middle or at the beginning of a Vec?
(3 answers)
Closed 5 years ago.
I was expecting a Vec::insert_slice(index, slice) method — a solution for strings (String::insert_str()) does exist.
I know about Vec::insert(), but that inserts only one element at a time, not a slice. Alternatively, when the prepended slice is a Vec one can append to it instead, but this does not generalize. The idiomatic solution probably uses Vec::splice(), but using iterators as in the example makes me scratch my head.
Secondly, the whole concept of prepending has seemingly been exorcised from the docs. There isn't a single mention. I would appreciate comments as to why. Note that relatively obscure methods like Vec::swap_remove() do exist.
My typical use case consists of indexed byte strings.
String::insert_str makes use of the fact that a string is essentially a Vec<u8>. It reallocates the underlying buffer, moves all the initial bytes to the end, then adds the new bytes to the beginning.
This is not generally safe and can not be directly added to Vec because during the copy the Vec is no longer in a valid state — there are "holes" in the data.
This doesn't matter for String because the data is u8 and u8 doesn't implement Drop. There's no such guarantee for an arbitrary T in a Vec, but if you are very careful to track your state and clean up properly, you can do the same thing — this is what splice does!
the whole concept of prepending has seemingly been exorcised
I'd suppose this is because prepending to a Vec is a poor idea from a performance standpoint. If you need to do it, the naïve case is straight-forward:
fn prepend<T>(v: Vec<T>, s: &[T]) -> Vec<T>
where
T: Clone,
{
let mut tmp: Vec<_> = s.to_owned();
tmp.extend(v);
tmp
}
This has a bit higher memory usage as we need to have enough space for two copies of v.
The splice method accepts an iterator of new values and a range of values to replace. In this case, we don't want to replace anything, so we give an empty range of the index we want to insert at. We also need to convert the slice into an iterator of the appropriate type:
let s = &[1, 2, 3];
let mut v = vec![4, 5];
v.splice(0..0, s.iter().cloned());
splice's implementation is non-trivial, but it efficiently does the tracking we need. After removing a chunk of values, it then reuses that chunk of memory for the new values. It also moves the tail of the vector around (maybe a few times, depending on the input iterator). The Drop implementation of Slice ensures that things will always be in a valid state.
I'm more surprised that VecDeque doesn't support it, as it's designed to be more efficient about modifying both the head and tail of the data.
Taking into consideration what Shepmaster said, you could implement a function prepending a slice with Copyable elements to a Vec just like String::insert_str() does in the following way:
use std::ptr;
unsafe fn prepend_slice<T: Copy>(vec: &mut Vec<T>, slice: &[T]) {
let len = vec.len();
let amt = slice.len();
vec.reserve(amt);
ptr::copy(vec.as_ptr(),
vec.as_mut_ptr().offset((amt) as isize),
len);
ptr::copy(slice.as_ptr(),
vec.as_mut_ptr(),
amt);
vec.set_len(len + amt);
}
fn main() {
let mut v = vec![4, 5, 6];
unsafe { prepend_slice(&mut v, &[1, 2, 3]) }
assert_eq!(&v, &[1, 2, 3, 4, 5, 6]);
}

Way to specify a static slice of variable length

Let's say I have a function with following signature:
fn validate(samples: &[(&str, &[Token])])
Where Token is a custom enum.
I would like to be able to write something along those lines:
let samples = vec![
("a string", &[Token::PLUS, Token::MINUS, Token::PLUS]),
("another string", &[Token::MUL]),
];
validate(&samples);
But code like this yields mismatched types compile error:
error: mismatched types:
expected `&[(&str, &[Token])]`,
found `&collections::vec::Vec<(&str, &[Token; 3])>`
Is it possible to somehow convert the version with static length (&[Token; 3]) to a static slice (&[Token])?
In other words, I would like to be able to specify a static slice in similar way I specify &str, as some kind of "slice literal".
Or I am doing it completely wrong?
EDIT:
In short, I would like to find a syntax that creates an array with static lifetime (or at least a lifetime that is as long as the samples vector's one), and returns slice of it.
Something similar to how strings work, where just typing "a string" gives me reference of type &'static str.
EDIT2:
#Pablo's answer provides pretty good solution to my particular problem, although it is not exactly what I have meant at first.
I guess that the exact thing I have in mind might not be possible, so I will just accept that one for now, unless something more in lines of my initial idea come around.
In short, I would like to find a syntax that creates an array with
static lifetime (or at least a lifetime that is as long as the samples
vector's one), and returns slice of it.
You’d want something like this:
fn sliced(array: [Token; 3]) -> &'static [Token] { unimplemented!() }
So you could use it like this in your example:
let samples: Vec<(&str, &[Token])> = vec![
("a string", sliced([Token::PLUS, Token::MINUS, Token::PLUS])),
// ...
But there are two problems with it. The first and most glaring is that you can’t get a static reference out of a function which doesn’t take in a static reference (in which case it would just return it).
Therefore, since you want a slice at least as long-lived as your array, either you declare a const/static slice (which requires also a const/static declaration of its array), or you declare the array with a let statement first, and then make the slice. (This is what is done at my first alternative, below.) If you create the array inside a use of vec!, together with its slice, the array end its life with vec!, rendering the slice invalid. As an illustration, consider this, which fails due to the same reason:
fn main() {
let slice;
{
let array: [u8; 3] = [1,2,3];
slice = &array;
}
}
The second problem with the sliced function is that its input array has a fixed size, and you’d want to work generically over arrays of arbitrary size. However, this is currently not supported by Rust[1]. You have to work with slices in order to deal with arrays of arbitrary size.
One possibility, then, is to do the following [playpen]:
enum Token {
PLUS,
MINUS,
MUL,
}
fn validate(samples: &[(&str, &[Token])]) {
unimplemented!()
}
fn main() {
let tokens_0 = [Token::PLUS, Token::MINUS, Token::PLUS];
let tokens_1 = [Token::MUL];
let samples: Vec<(&str, &[Token])> = vec![
("a string", &tokens_0),
("another string", &tokens_1),
];
validate(&samples);
}
There are two changes here with respect to your code.
One, this code relies on implicit coercing of an array ([T; N]) as a slice (&[T]) by taking a reference to it. This is demanded by the declaration of samples as being of type Vec<(&str, &[Token])>. This is later satisfied, when using vec!, by passing references to the arrays, and thus eliciting the appropriate coercions.
Two, it creates the arrays of Token before using the vec! macro, which guarantees that they’ll live enough to be referenced from within the Vec it creates, keeping these references valid after vec! is done. This is necessary after resolving the previous type mismatch.
Addendum:
Or, for convenience, you may prefer to use a Vec instead of slices. Consider the following alternative [playpen]:
enum Token {
PLUS,
MINUS,
MUL,
}
fn validate<T>(samples: &[(&str, T)]) where
T: AsRef<[Token]>
{
let _: &[Token] = samples[0].1.as_ref();
unimplemented!()
}
fn main() {
let samples: Vec<(&str, Vec<Token>)> = vec![
("a string", vec![Token::PLUS, Token::MINUS, Token::PLUS]),
("another string", vec![Token::MUL]),
];
validate(&samples);
}
In this case, the AsRef<[Token]> bound on the second element of the tuple accepts any type from which you may take a &[Token], offering an as_ref() method which returns the expected reference. Vec<Token> is an example of such kind of type.
[1] “Rust does not currently support generics over the size of an array type.” [source]
Note: this answer is not valid in this particular situation because the arrays pointed by the nested slices cannot outlive the vector because they are only allocated for the duration of their respective expressions, therefore slices to them can't be stored in the vector.
The proper way would be to either hoist slices to the upper level and put them before the vector, or to use an entirely different structure, e.g. nested Vecs. Examples of all of these are provided in Pablo's answer.
You need to do this:
let samples = vec![
("a string", &[Token::PLUS, Token::MINUS, Token::PLUS] as &[_]),
("another string", &[Token::MUL] as &[_]),
];
validate(&samples);
Rust automatically converts references to arrays (&[T; n]) to slices (&[T]) when the target type is known, but in this case type inference doesn't work well because of the necessary deref coercion, so the compiler can't deduce that you need a slice instead of array and can't insert the appropriate conversion, thus you need to specify the type explicitly.
Also, there is no such thing as "static slice". The closest entity would be a slice with static lifetime, &'static [T], but as far as I remember, this is not the case of it.

How to get a slice from an Iterator?

I started to use clippy as a linter. Sometimes, it shows this warning:
writing `&Vec<_>` instead of `&[_]` involves one more reference and cannot be
used with non-Vec-based slices. Consider changing the type to `&[...]`,
#[warn(ptr_arg)] on by default
I changed the parameter to a slice but this adds boilerplate on the call side. For instance, the code was:
let names = args.arguments.iter().map(|arg| {
arg.name.clone()
}).collect();
function(&names);
but now it is:
let names = args.arguments.iter().map(|arg| {
arg.name.clone()
}).collect::<Vec<_>>();
function(&names);
otherwise, I get the following error:
error: the trait `core::marker::Sized` is not implemented for the type
`[collections::string::String]` [E0277]
So I wonder if there is a way to convert an Iterator to a slice or avoid having to specify the collected type in this specific case.
So I wonder if there is a way to convert an Iterator to a slice
There is not.
An iterator only provides one element at a time, whereas a slice is about getting several elements at a time. This is why you first need to collect all the elements yielded by the Iterator into a contiguous array (Vec) before being able to use a slice.
The first obvious answer is not to worry about the slight overhead, though personally I would prefer placing the type hint next to the variable (I find it more readable):
let names: Vec<_> = args.arguments.iter().map(|arg| {
arg.name.clone()
}).collect();
function(&names);
Another option would be for function to take an Iterator instead (and an iterator of references, at that):
let names = args.arguments.iter().map(|arg| &arg.name);
function(names);
After all, iterators are more general, and you can always "realize" the slice inside the function if you need to.
So I wonder if there is a way to convert an Iterator to a slice
There is. (in applicable cases)
Got here searching "rust iter to slice", for my use-case, there was a solution:
fn main() {
// example struct
#[derive(Debug)]
struct A(u8);
let list = vec![A(5), A(6), A(7)];
// list_ref passed into a function somewhere ...
let list_ref: &[A] = &list;
let mut iter = list_ref.iter();
// consume some ...
let _a5: Option<&A> = iter.next();
// now want to eg. return a slice of the rest
let slice: &[A] = iter.as_slice();
println!("{:?}", slice); // [A(6), A(7)]
}
That said, .as_slice is defined on an iter of an existing slice, so the previous answerer was correct in that if you've got, eg. a map iter, you would need to collect it first (so there is something to slice from).
docs: https://doc.rust-lang.org/std/slice/struct.Iter.html#method.as_slice

How do I transform &str to ~str in Rust?

This is for the current 0.6 Rust trunk by the way, not sure the exact commit.
Let's say I want to for each over some strings, and my closure takes a borrowed string pointer argument (&str). I want my closure to add its argument to an owned vector of owned strings ~[~str] to be returned. My understanding of Rust is weak, but I think that strings are a special case where you can't dereference them with * right? How do I get my strings from &str into the vector's push method which takes a ~str?
Here's some code that doesn't compile
fn read_all_lines() -> ~[~str] {
let mut result = ~[];
let reader = io::stdin();
let util = #reader as #io::ReaderUtil;
for util.each_line |line| {
result.push(line);
}
result
}
It doesn't compile because it's inferring result's type to be [&str] since that's what I'm pushing onto it. Not to mention its lifetime will be wrong since I'm adding a shorter-lived variable to it.
I realize I could use ReaderUtil's read_line() method which returns a ~str. But this is just an example.
So, how do I get an owned string from a borrowed string? Or am I totally misunderstanding.
You should call the StrSlice trait's method, to_owned, as in:
fn read_all_lines() -> ~[~str] {
let mut result = ~[];
let reader = io::stdin();
let util = #reader as #io::ReaderUtil;
for util.each_line |line| {
result.push(line.to_owned());
}
result
}
StrSlice trait docs are here:
http://static.rust-lang.org/doc/core/str.html#trait-strslice
You can't.
For one, it doesn't work semantically: a ~str promises that only one thing owns it at a time. But a &str is borrowed, so what happens to the place you borrowed from? It has no way of knowing that you're trying to steal away its only reference, and it would be pretty rude to trash the caller's data out from under it besides.
For another, it doesn't work logically: ~-pointers and #-pointers are allocated in completely different heaps, and a & doesn't know which heap, so it can't be converted to ~ and still guarantee that the underlying data lives in the right place.
So you can either use read_line or make a copy, which I'm... not quite sure how to do :)
I do wonder why the API is like this, when & is the most restricted of the pointers. ~ should work just as well here; it's not like the iterated strings already exist somewhere else and need to be borrowed.
At first I thought it was possible to use copy line to create owning pointer from the borrowed pointer to the string but this apparently copies burrowed pointer.
So I found str::from_slice(s: &str) -> ~str. This is probably what you need.

Resources