Way to specify a static slice of variable length - rust

Let's say I have a function with following signature:
fn validate(samples: &[(&str, &[Token])])
Where Token is a custom enum.
I would like to be able to write something along those lines:
let samples = vec![
("a string", &[Token::PLUS, Token::MINUS, Token::PLUS]),
("another string", &[Token::MUL]),
];
validate(&samples);
But code like this yields a mismatched types compile error:
error: mismatched types:
expected `&[(&str, &[Token])]`,
found `&collections::vec::Vec<(&str, &[Token; 3])>`
Is it possible to somehow convert the version with static length (&[Token; 3]) to a static slice (&[Token])?
In other words, I would like to be able to specify a static slice in similar way I specify &str, as some kind of "slice literal".
Or am I doing it completely wrong?
EDIT:
In short, I would like to find a syntax that creates an array with static lifetime (or at least a lifetime that is as long as the samples vector's one), and returns slice of it.
Something similar to how strings work, where just typing "a string" gives me reference of type &'static str.
EDIT2:
@Pablo's answer provides a pretty good solution to my particular problem, although it is not exactly what I meant at first.
I guess that the exact thing I have in mind might not be possible, so I will just accept that one for now, unless something more in line with my initial idea comes around.

In short, I would like to find a syntax that creates an array with
static lifetime (or at least a lifetime that is as long as the samples
vector's one), and returns slice of it.
You’d want something like this:
fn sliced(array: [Token; 3]) -> &'static [Token] { unimplemented!() }
So you could use it like this in your example:
let samples: Vec<(&str, &[Token])> = vec![
("a string", sliced([Token::PLUS, Token::MINUS, Token::PLUS])),
// ...
But there are two problems with it. The first and most glaring is that you can’t get a static reference out of a function which doesn’t take in a static reference (in which case it would just return it).
Therefore, since you want a slice at least as long-lived as your array, either you declare a const/static slice (which also requires a const/static declaration of its array), or you declare the array with a let statement first and then make the slice. (This is what is done in my first alternative, below.) If you create the array inside a use of vec!, together with its slice, the array ends its life with vec!, rendering the slice invalid. As an illustration, consider this, which fails for the same reason:
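fn main() {
    let slice: &[u8];
    {
        let array: [u8; 3] = [1, 2, 3];
        slice = &array; // error[E0597]: `array` does not live long enough
    }
    // The borrow is used here, after `array` has already been dropped.
    println!("{:?}", slice);
}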
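For the const/static route mentioned above, here is a minimal sketch. The names TOKENS_0, TOKENS_1 and SAMPLES are hypothetical, and validate is assumed to simply count tokens so that it has something to return:

```rust
enum Token {
    PLUS,
    MINUS,
    MUL,
}

// The arrays are statics, so references to them are valid for 'static.
static TOKENS_0: [Token; 3] = [Token::PLUS, Token::MINUS, Token::PLUS];
static TOKENS_1: [Token; 1] = [Token::MUL];

// Each `&TOKENS_n` is coerced from `&[Token; N]` to `&[Token]`
// because the declared element type asks for a slice.
static SAMPLES: [(&str, &[Token]); 2] = [
    ("a string", &TOKENS_0),
    ("another string", &TOKENS_1),
];

// Assumption for illustration: validation just counts all tokens.
fn validate(samples: &[(&str, &[Token])]) -> usize {
    samples.iter().map(|(_, tokens)| tokens.len()).sum()
}
```

Calling validate(&SAMPLES) then works from anywhere in the program, because nothing here is tied to a local scope.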
The second problem with the sliced function is that its input array has a fixed size, and you’d want to work generically over arrays of arbitrary size. However, when this answer was written, Rust did not support this[1]; const generics, stabilized in Rust 1.51, have since lifted that restriction. Either way, working with slices remains the simplest way to deal with arrays of arbitrary size.
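On compilers that support const generics (stable since Rust 1.51), a helper can accept arrays of any length. This is only a sketch: as_slice is a hypothetical name, and it borrows the array instead of taking it by value, because the lifetime problem described above still applies.

```rust
enum Token {
    PLUS,
    MINUS,
    MUL,
}

// Generic over the array length N; the returned slice borrows from the
// array, so the array must outlive every use of the slice.
fn as_slice<const N: usize>(array: &[Token; N]) -> &[Token] {
    array
}
```

The arrays still have to be declared before the vector that stores the slices, for example with let bindings.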
One possibility, then, is to do the following [playpen]:
enum Token {
PLUS,
MINUS,
MUL,
}
fn validate(samples: &[(&str, &[Token])]) {
unimplemented!()
}
fn main() {
let tokens_0 = [Token::PLUS, Token::MINUS, Token::PLUS];
let tokens_1 = [Token::MUL];
let samples: Vec<(&str, &[Token])> = vec![
("a string", &tokens_0),
("another string", &tokens_1),
];
validate(&samples);
}
There are two changes here with respect to your code.
One, this code relies on the implicit coercion of a reference to an array (&[T; N]) to a slice (&[T]). The coercion is demanded by the declaration of samples as being of type Vec<(&str, &[Token])>, and it is triggered inside vec! by passing references to the arrays.
Two, it creates the arrays of Token before using the vec! macro, which guarantees that they’ll live long enough to be referenced from within the Vec it creates, keeping these references valid after vec! is done. This is necessary after resolving the previous type mismatch.
Addendum:
Or, for convenience, you may prefer to use a Vec instead of slices. Consider the following alternative [playpen]:
enum Token {
PLUS,
MINUS,
MUL,
}
fn validate<T>(samples: &[(&str, T)]) where
T: AsRef<[Token]>
{
let _: &[Token] = samples[0].1.as_ref();
unimplemented!()
}
fn main() {
let samples: Vec<(&str, Vec<Token>)> = vec![
("a string", vec![Token::PLUS, Token::MINUS, Token::PLUS]),
("another string", vec![Token::MUL]),
];
validate(&samples);
}
In this case, the AsRef<[Token]> bound on the second element of the tuple accepts any type from which you may take a &[Token], offering an as_ref() method which returns the expected reference. Vec<Token> is one such type.
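To see what the AsRef<[Token]> bound buys you, here is a small sketch in which the same generic function takes Vecs, fixed-size arrays, and plain slices as the second tuple element. total_tokens is a hypothetical stand-in for validate that only counts tokens:

```rust
enum Token {
    PLUS,
    MINUS,
    MUL,
}

// Accepts any element type T from which a `&[Token]` can be borrowed.
fn total_tokens<T: AsRef<[Token]>>(samples: &[(&str, T)]) -> usize {
    samples
        .iter()
        .map(|(_, tokens)| tokens.as_ref().len())
        .sum()
}
```

Note that within one call every tuple must still use the same T, so Vec<Token> is the convenient choice when the lengths differ.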
[1] “Rust does not currently support generics over the size of an array type.” [source]

Note: this answer is not valid in this particular situation. The arrays pointed to by the nested slices are only allocated for the duration of their respective expressions, so they cannot outlive the vector, and slices of them can't be stored in it.
The proper way would be either to hoist the arrays to the upper level and declare them before the vector, or to use an entirely different structure, e.g. nested Vecs. Examples of both are provided in Pablo's answer.
You need to do this:
let samples = vec![
("a string", &[Token::PLUS, Token::MINUS, Token::PLUS] as &[_]),
("another string", &[Token::MUL] as &[_]),
];
validate(&samples);
Rust automatically converts references to arrays (&[T; n]) to slices (&[T]) when the target type is known, but in this case type inference doesn't work well because of the necessary deref coercion, so the compiler can't deduce that you need a slice instead of array and can't insert the appropriate conversion, thus you need to specify the type explicitly.
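A minimal sketch of the cast combined with arrays hoisted into let bindings, so that the lifetime concern from the note does not arise. count_tokens is a hypothetical wrapper, and validate is assumed to return the token count:

```rust
enum Token {
    PLUS,
    MINUS,
    MUL,
}

// Assumption for illustration: validation just counts all tokens.
fn validate(samples: &[(&str, &[Token])]) -> usize {
    samples.iter().map(|(_, tokens)| tokens.len()).sum()
}

fn count_tokens() -> usize {
    let tokens_0 = [Token::PLUS, Token::MINUS, Token::PLUS];
    let tokens_1 = [Token::MUL];
    // No type annotation on `samples`: the casts alone make every
    // element a `(&str, &[Token])`, whatever the array length.
    let samples = vec![
        ("a string", &tokens_0 as &[_]),
        ("another string", &tokens_1 as &[_]),
    ];
    validate(&samples)
}
```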
Also, there is no such thing as a "static slice". The closest entity would be a slice with static lifetime, &'static [T], but as far as I remember, that is not what you have here.

Related

Creating a vector and returning it along with a reference to one of its elements

In Rust, I have a function that generates a vector of Strings, and I'd like to return this vector along with a reference to one of the strings. Obviously, I would need to appropriately specify the lifetime of the reference, since it is valid only while the vector is in scope. However, I can't get this to work.
Here is a minimal example of a failed attempt:
fn foo<'a>() -> ('a Vec<String>, &'a String) {
let x = vec!["some", "data", "in", "the", "vector"].iter().map(|s| s.to_string()).collect::<Vec<String>>();
(x, &x[1])
}
(for this example, I know I could return the index to the vector, but my general problem is more complex. Also, I'd like to understand how to achieve this)
Rust doesn't allow you to do that without unsafe code. Probably your best option is to return the vector with the index of the element in question.
This is conceptually very similar to trying to create a self-referential struct. See this for more on why this is challenging.
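The index-based suggestion can be sketched like this, reusing the data from the question. Returning position 1 is just the choice made in the question's example:

```rust
// Returns the vector together with the index of the element of interest,
// instead of a reference into the vector.
fn foo() -> (Vec<String>, usize) {
    let x: Vec<String> = ["some", "data", "in", "the", "vector"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    (x, 1)
}
```

The caller then borrows the element only when it is needed, e.g. let (v, i) = foo(); followed by &v[i].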

Ergonomically passing a slice of trait objects

I am converting a variety of types to String when they are passed to a function. I'm not concerned about performance as much as ergonomics, so I want the conversion to be implicit. The original, less generic implementation of the function simply used &[impl Into<String>], but I think that it should be possible to pass a variety of types at once without manually converting each to a string.
The key is that ideally, all of the following cases should be valid calls to my function:
// String literals
perform_tasks(&["Hello", "world"]);
// Owned strings
perform_tasks(&[String::from("foo"), String::from("bar")]);
// Non-string types
perform_tasks(&[1,2,3]);
// A mix of any of them
perform_tasks(&["All", 3, String::from("types!")]);
Some various signatures I've attempted to use:
fn perform_tasks(items: &[impl Into<String>])
The original version fails twice; it can't handle numeric types without manual conversion, and it requires all of the arguments to be the same type.
fn perform_tasks(items: &[impl ToString])
This is slightly closer, but it still requires all of the arguments to be of one type.
fn perform_tasks(items: &[&dyn ToString])
Doing it this way is almost enough, but it won't compile unless I manually add a borrow on each argument.
And that's where we are. I suspect that either Borrow or AsRef will be involved in a solution, but I haven't found a way to get them to handle this situation. For convenience, here is a playground link to the final signature in use (without the needed references for it to compile), alongside the various tests.
The following way works for the first three cases if I understand your intention correctly.
pub fn perform_tasks<I, A>(values: I) -> Vec<String>
where
A: ToString,
I: IntoIterator<Item = A>,
{
values.into_iter().map(|s| s.to_string()).collect()
}
As the other comments pointed out, Rust does not support an array of mixed types. However, you can do one extra step to convert them into a &[&dyn fmt::Display] and then call the same function perform_tasks to get their strings.
let slice: &[&dyn std::fmt::Display] = &[&"All", &3, &String::from("types!")];
perform_tasks(slice);
Here is the playground.
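Put together, a self-contained sketch of the approach looks like this. The function is the one shown above; mixed_example is a hypothetical helper added only to show the mixed-type call, with the owned String bound to a variable first so the borrow clearly outlives the slice:

```rust
use std::fmt::Display;

pub fn perform_tasks<I, A>(values: I) -> Vec<String>
where
    A: ToString,
    I: IntoIterator<Item = A>,
{
    values.into_iter().map(|s| s.to_string()).collect()
}

// Mixed types need one explicit step: borrow each value as a trait object.
fn mixed_example() -> Vec<String> {
    let owned = String::from("types!");
    let slice: &[&dyn Display] = &[&"All", &3, &owned];
    perform_tasks(slice)
}
```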
If I understand your intention right, what you want is like this
fn main() {
let a = 1;
myfn(a);
}
fn myfn(i: &dyn SomeTrait) {
//do something
}
So you want to implicitly borrow an object when passing it as a function argument. However, Rust won't let you borrow implicitly here: borrowing is an important safety mechanism in Rust, and the & helps other programmers quickly identify which values are references and which are not. Rust is therefore designed to require the explicit & to avoid confusion.
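For comparison, a sketch of the version that does compile, with the borrow written out at the call site. describe is a hypothetical function, and Display stands in for SomeTrait:

```rust
use std::fmt::Display;

// Takes a borrowed trait object; the caller must write `&` explicitly.
fn describe(item: &dyn Display) -> String {
    format!("value: {}", item)
}
```

Here describe(&a) is accepted for any a that implements Display, while describe(a) is rejected with a mismatched types error.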

Slice of String vs Slice &String

I was reading the doc from rust lang website and in chapter 4 they did the following example:
let s = String::from("hello world");
let hello = &s[0..5];
let world = &s[6..11];
hello is of type &str that I created from a variable s of type String.
A few lines below, they define the following function:
fn first_word(s: &String) -> &str {
let bytes = s.as_bytes();
for (i, &item) in bytes.iter().enumerate() {
if item == b' ' {
return &s[0..i];
}
}
&s[..]
}
This time s is of type &String but still &s[0..i] gave me a &str slice.
How is it possible? I thought that the correct way to achieve this would be something like &((*str)[0..i]).
Am I missing something? Maybe during the [0..i] operation Rust auto-dereferences the variable?
Thanks
Maybe during the [0..i] operation Rust auto-dereferences the variable?
This is exactly what happens. When you call a method on a reference, or index through one, Rust automatically dereferences it before applying the operation. Types can opt into this behavior by implementing the Deref trait. String implements Deref with a target of str, which means you can call str methods on a String. Read more about deref coercion here.
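A practical consequence of String dereferencing to str is that first_word can take a &str parameter and still be called with a &String. A small sketch of that variant of the function from the question:

```rust
// Same logic as in the question, but the parameter is `&str`.
fn first_word(s: &str) -> &str {
    for (i, &item) in s.as_bytes().iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }
    s
}
```

Passing &owned where owned is a String works through deref coercion, and string literals can be passed directly.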
It's important to realize what happens with &s[1..5]: it parses as &(s[1..5]). That is, s[1..5] is evaluated first, yielding a value of type str, and then a reference to that value is taken. In fact, there's even more indirection: x[y] in Rust is syntactic sugar for *std::ops::Index::index(&x, y). Note the dereference: this function always returns a reference, which the sugar dereferences, and which the & in our code then references again. Naturally, the compiler will optimize this and ensure we are not pointlessly taking references only to dereference them again.
It so happens that the String type implements the Index<Range<usize>> trait, and its Index::Output type is str.
It also happens that the str type supports the same, and that its Output type is also str, via a blanket implementation over SliceIndex.
On your question of auto-dereferencing: it is true that String also implements the Deref trait, so that in many contexts, such as this one, &String is automatically converted to &str; any coercion site that expects a &str also accepts a &String. This means the implementation of Index<Range<usize>> on String is really an optimization that avoids this indirection. If it were not there, the code would still work, and perhaps the compiler could even optimize the indirection away.
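The desugaring can be written out by hand to check that both forms agree. A minimal sketch; prefix_sugar and prefix_desugared are hypothetical names:

```rust
use std::ops::Index;

// The usual indexing syntax.
fn prefix_sugar(s: &String, i: usize) -> &str {
    &s[0..i]
}

// The same operation spelled out: call Index::index, dereference the
// returned `&str` to a `str` place, then take a reference again.
fn prefix_desugared(s: &String, i: usize) -> &str {
    &*Index::index(s, 0..i)
}
```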
But that automatic casting is not why it works — it simply works because indexing is defined on many different types.
Finally:
I thought that the correct way to achieve this would be something like &((*str)[0..i]).
This would not work regardless: a &str is not the same as a &String and cannot be dereferenced to a String the way a &String can. In fact, a &str is in many ways closer to a String than to a &String. A &str is really just a fat pointer to a sequence of UTF-8 bytes, carrying the length of that sequence in its second word; a String is, if you will, an extra-fat pointer that also holds the current capacity of the buffer, and it owns the buffer it points to, so it can free and resize it.
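The pointer widths can be inspected with std::mem::size_of. A small sketch; words is a hypothetical helper, and the three-word size of String reflects the current standard library layout (pointer, capacity, length) rather than a documented guarantee:

```rust
use std::mem::size_of;

// Size of T measured in machine words (multiples of usize).
fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}
```

With this helper, words::<&String>() is 1 (a thin pointer), words::<&str>() is 2 (pointer plus length), and words::<String>() is 3 in current implementations.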

Why does the argument for the find closure need two ampersands?

I have been playing with Rust by porting my Score4 AI engine to it - basing the work on my functional-style implementation in OCaml. I specifically wanted to see how Rust fares with functional-style code.
The end result: It works, and it's very fast - much faster than OCaml. It almost touches the speed of imperative-style C/C++ - which is really cool.
There's a thing that troubles me, though — why do I need two ampersands in the last line of this code?
let moves_and_scores: Vec<_> = moves_and_boards
.iter()
.map(|&(column,board)| (column, score_board(&board)))
.collect();
let target_score = if maximize_or_minimize {
ORANGE_WINS
} else {
YELLOW_WINS
};
if let Some(killer_move) = moves_and_scores.iter()
.find(|& &(_,score)| score==target_score) {
...
I added them because the compiler errors "guided" me to it, but I am trying to understand why... I used the trick mentioned elsewhere on Stack Overflow to "ask" the compiler to tell me what type something is:
let moves_and_scores: Vec<_> = moves_and_boards
.iter()
.map(|&(column,board)| (column, score_board(&board)))
.collect();
let () = moves_and_scores;
...which caused this error:
src/main.rs:108:9: 108:11 error: mismatched types:
expected `collections::vec::Vec<(u32, i32)>`,
found `()`
(expected struct `collections::vec::Vec`,
found ()) [E0308]
src/main.rs:108 let () = moves_and_scores;
...as I expected, moves_and_scores is a vector of tuples: Vec<(u32, i32)>. But then, in the immediate next line, iter() and find() force me to use the hideous double ampersands in the closure parameter:
if let Some(killer_move) = moves_and_scores.iter()
.find(|& &(_,score)| score==target_score) {
Why does the find closure need two ampersands? I could see why it may need one (pass the tuple by reference to save time/space) but why two? Is it because of the iter? That is, is the iter creating references, and then find expects a reference on each input, so a reference on a reference?
If this is so, isn't this, arguably, a rather ugly design flaw in Rust?
In fact, I would expect find and map and all the rest of the functional primitives to be parts of the collections themselves. Forcing me to iter() to do any kind of functional-style work seems burdensome, and even more so if it forces this kind of "double ampersands" in every possible functional chain.
I am hoping I am missing something obvious - any help/clarification most welcome.
This here
moves_and_scores.iter()
gives you an iterator over borrowed vector elements. If you look up its type in the API docs, you'll notice that it's just the iterator for a borrowed slice, and that it implements Iterator with Item=&T, where T is (u32, i32) in your case.
Then you use find, whose predicate takes a &Item as its parameter. Since Item already is a reference in your case, the predicate has to take a &&(u32, i32).
pub trait Iterator {
    ...
    fn find<P>(&mut self, predicate: P) -> Option<Self::Item>
        where P: FnMut(&Self::Item) -> bool {...}
    //                 ^ the predicate receives a reference to the item
    ...
It was probably defined like this because it's only supposed to inspect the item and return a bool. This does not require the item being passed by value.
If you want an iterator over (u32, i32) you could write
moves_and_scores.iter().cloned()
cloned() converts the iterator from one with an Item type &T to one with an Item type T if T is Clone. Another way to do it would be to use into_iter() instead of iter().
moves_and_scores.into_iter()
The difference between the two is that the first option clones the borrowed elements while the 2nd one consumes the vector and moves the elements out of it.
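Both options, side by side, in a minimal sketch. The function names are hypothetical; the element type (u32, i32) is the one from the question:

```rust
// Borrows the data and clones each element while iterating; the
// predicate receives `&(u32, i32)`, so one `&` in the pattern suffices.
fn find_by_cloning(moves_and_scores: &[(u32, i32)], target: i32) -> Option<(u32, i32)> {
    moves_and_scores
        .iter()
        .cloned()
        .find(|&(_, score)| score == target)
}

// Consumes the vector and moves the elements out of it.
fn find_by_consuming(moves_and_scores: Vec<(u32, i32)>, target: i32) -> Option<(u32, i32)> {
    moves_and_scores
        .into_iter()
        .find(|&(_, score)| score == target)
}
```

In both cases the result is an owned tuple rather than a reference into the vector.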
By writing the lambda like this
|&&(_, score)| score == target_score
you destructure the "double reference" and create a local copy of the i32. This is allowed since i32 is a simple type that is Copy.
Instead of destructuring the parameter of your predicate you could also write
|move_and_score| move_and_score.1 == target_score
because the dot operator automatically dereferences as many times as needed.
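The two closure styles can be compared directly. A sketch with hypothetical function names, both iterating with iter() so that the predicate sees a &&(u32, i32):

```rust
// Destructures the double reference and copies out the i32.
fn find_destructured(scores: &[(u32, i32)], target: i32) -> Option<&(u32, i32)> {
    scores.iter().find(|&&(_, score)| score == target)
}

// Leaves the parameter as is; the dot operator dereferences as needed.
fn find_with_field_access(scores: &[(u32, i32)], target: i32) -> Option<&(u32, i32)> {
    scores.iter().find(|move_and_score| move_and_score.1 == target)
}
```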

Why does cloned() allow this function to compile

I'm starting to learn Rust and I tried to implement a function to reverse a vector of strings. I found a solution but I don't understand why it works.
This works:
fn reverse_strings(strings:Vec<&str>) -> Vec<&str> {
let actual: Vec<_> = strings.iter().cloned().rev().collect();
return actual;
}
But this doesn't.
fn reverse_strings(strings:Vec<&str>) -> Vec<&str> {
let actual: Vec<_> = strings.iter().rev().collect(); // without clone
return actual;
}
Error message
src/main.rs:28:10: 28:16 error: mismatched types:
expected `collections::vec::Vec<&str>`,
found `collections::vec::Vec<&&str>`
(expected str,
found &-ptr) [E0308]
Can someone explain to me why? What happens in the second function? Thanks!
So the call to .cloned() is essentially like doing .map(|i| i.clone()) in the same position (i.e. you can replace the former with the latter).
The thing is that when you call iter(), you're iterating/operating on references to the items being iterated. Notice that the vector already consists of 'references', specifically string slices.
So to zoom in a bit, let's replace cloned() with the equivalent map() that I mentioned above, for pedagogical purposes, since they are equivalent. This is what it actually looks like:
.map(|i: & &str| i.clone())
So notice that that's a reference to a reference (slice), because like I said, iter() operates on references to the items, not the items themselves. So since a single element in the vector being iterated is of type &str, then we're actually getting a reference to that, i.e. & &str. By calling clone() on each of these items, we go from a & &str to a &str, just like calling .clone() on a &i64 would result in an i64.
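To check that cloned() and the explicit map() really are interchangeable here, a small sketch with two hypothetical variants of the function. The compiler may warn about calling clone() on a double reference in the second one, which is precisely the behavior being illustrated:

```rust
fn reverse_with_cloned<'a>(strings: Vec<&'a str>) -> Vec<&'a str> {
    strings.iter().cloned().rev().collect()
}

fn reverse_with_map<'a>(strings: Vec<&'a str>) -> Vec<&'a str> {
    // `i` has type `&&str`; cloning it copies the inner `&str`.
    strings.iter().map(|i| i.clone()).rev().collect()
}
```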
So to bring everything together, iter() iterates over references to the elements. So if you create a new vector from the collected items yielded by the iterator you construct (which you constructed by calling iter()) you would get a vector of references to references, that is:
let actual: Vec<& &str> = strings.iter().rev().collect();
So first of all realize that this is not the same as the type you're saying the function returns, Vec<&str>. More fundamentally, however, the lifetimes of these references would be local to the function, so even if you changed the return type to Vec<& &str> you would get a lifetime error.
Something else you could do, however, is to use the into_iter() method. This method actually does iterate over each element, not a reference to it. However, this means that the elements are moved from the original iterator/container. This is only possible in your situation because you're passing the vector by value, so you're allowed to move elements out of it.
fn reverse_strings(strings:Vec<&str>) -> Vec<&str> {
let actual: Vec<_> = strings.into_iter().rev().collect();
return actual;
}
This probably makes a bit more sense than cloning, since we are passed the vector by value, we're allowed to do anything with the elements, including moving them to a different location (in this case the new, reversed vector). And even if we don't, the vector will be dropped at the end of that function anyways, so we might as well. Cloning would be more appropriate if we're not allowed to do that (e.g. if we were passed the vector by reference, or a slice instead of a vector more likely).
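For the by-reference case mentioned at the end, a sketch of how the function might look when it receives a slice. The signature is an assumption, not part of the question; the explicit lifetime says that the returned string slices borrow from the original strings, not from the slice argument:

```rust
// The input is only borrowed, so each `&str` is copied out with cloned().
fn reverse_strings<'a>(strings: &[&'a str]) -> Vec<&'a str> {
    strings.iter().rev().cloned().collect()
}
```

The caller keeps ownership of the original collection and can continue to use it afterwards.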
