Why does std::vec::Vec implement two kinds of the Extend trait? - rust

The struct std::vec::Vec implements two kinds of Extend, as specified here – impl<'a, T> Extend<&'a T> for Vec<T> and impl<T> Extend<T> for Vec<T>. The documentation states that the first kind is an "Extend implementation that copies elements out of references before pushing them onto the Vec". I'm rather new to Rust, and I'm not sure if I'm understanding it correctly.
I would guess that the first kind is used with the equivalent of C++ normal iterators, and the second kind is used with the equivalent of C++ move iterators.
I'm trying to write a function that accepts any data structure that will allow inserting i32s to the back, so I take a parameter that implements both kinds of Extend, but I can't figure out how to specify the generic parameters to get it to work:
fn main() {
let mut vec = std::vec::Vec::<i32>::new();
add_stuff(&mut vec);
}
fn add_stuff<'a, Rec: std::iter::Extend<i32> + std::iter::Extend<&'a i32>>(receiver: &mut Rec) {
let x = 1 + 4;
receiver.extend(&[x]);
}
The compiler complains that &[x] "creates a temporary which is freed while still in use" which makes sense because 'a comes from outside the function add_stuff. But of course what I want is for receiver.extend(&[x]) to copy the element out of the temporary array slice and add it to the end of the container, so the temporary array will no longer be used after receiver.extend returns. What is the proper way to express what I want?

From the outside of add_stuff, Rect must be able to be extended with a reference whose lifetime is given in the inside of add_stuff. Thus, you could require that Rec must be able to be extended with references of any lifetime using higher-ranked trait bounds:
fn main() {
let mut vec = std::vec::Vec::<i32>::new();
add_stuff(&mut vec);
}
fn add_stuff<Rec>(receiver: &mut Rec)
where
for<'a> Rec: std::iter::Extend<&'a i32>
{
let x = 1 + 4;
receiver.extend(&[x]);
}
Moreover, as you see, the trait bounds were overly tight. One of them should be enough if you use receiver consistently within add_stuff.
That said, I would simply require Extend<i32> and make sure that add_stuff does the right thing internally (if possible):
fn add_stuff<Rec>(receiver: &mut Rec)
where
Rec: std::iter::Extend<i32>
{
let x = 1 + 4;
receiver.extend(std::iter::once(x));
}

Related

How to convert from `cell::Ref<'_, [u8]>` into `&[u8]` or the other way around?

I'm modifying a library that holds Items returned by ChunkExact slice iterator. My iterator requires interior mutability and uses a RefCell. As a result of that, my iterator cannot return Items of type &[u8] but returns Ref<'_, [u8]> instead. The two types seem to be equivalent for practical use. For instance, this:
for i in slice.chunks_exact(2) {
println!("{:?}", i);
}
works just as well as this:
for my_iterator() {
println!("{:?}", i);
}
Full working example in the playground.
Close as they are, I cannot convert one item into the other:
= note: expected enum `Option<Ref<'_, [u8]>>`
found enum `Option<&[u8]>`
I saw the experimental cell::Ref::leak method, but seems like what I'm trying to do should not require such a scary feature...
You can use Option::as_deref:
let a: Option<Ref<'_, [u8]>> = ...;
let b: Option<&[u8]> = a.as_deref();
Ref<'_, T> implements Deref<Target = T> which is why Ref<'_, [u8]> can be used like a &[u8] in most contexts. However, they are still different types, which is why you get the error on assigning one to the other in an option.
You cannot go the other way around and create a Ref<'_, T> from a &T since there is no RefCell for it to reference.
This will only work in a context where the original Option<Ref> is kept around. You cannot use this to convert your iterator from returning Ref<'_, [u8]>s to returning &[u8]s. This is because the lifetime of the &[u8] is bound to the Ref, not the original RefCell.

Proper signature for a function accepting an iterator of strings

I'm confused about the proper type to use for an iterator yielding string slices.
fn print_strings<'a>(seq: impl IntoIterator<Item = &'a str>) {
for s in seq {
println!("- {}", s);
}
}
fn main() {
let arr: [&str; 3] = ["a", "b", "c"];
let vec: Vec<&str> = vec!["a", "b", "c"];
let it: std::str::Split<'_, char> = "a b c".split(' ');
print_strings(&arr);
print_strings(&vec);
print_strings(it);
}
Using <Item = &'a str>, the arr and vec calls don't compile. If, instead, I use <Item = &'a'a str>, they work, but the it call doesn't compile.
Of course, I can make the Item type generic too, and do
fn print_strings<'a, I: std::fmt::Display>(seq: impl IntoIterator<Item = I>)
but it's getting silly. Surely there must be a single canonical "iterator of string values" type?
The error you are seeing is expected because seq is &Vec<&str> and &Vec<T> implements IntoIterator with Item=&T, so with your code, you end up with Item=&&str where you are expecting it to be Item=&str in all cases.
The correct way to do this is to expand Item type so that is can handle both &str and &&str. You can do this by using more generics, e.g.
fn print_strings(seq: impl IntoIterator<Item = impl AsRef<str>>) {
for s in seq {
let s = s.as_ref();
println!("- {}", s);
}
}
This requires the Item to be something that you can retrieve a &str from, and then in your loop .as_ref() will return the &str you are looking for.
This also has the added bonus that your code will also work with Vec<String> and any other type that implements AsRef<str>.
TL;DR The signature you use is fine, it's the callers that are providing iterators with wrong Item - but can be easily fixed.
As explained in the other answer, print_string() doesn't accept &arr and &vec because IntoIterator for &[T; n] and &Vec<T> yield references to T. This is because &Vec, itself a reference, is not allowed to consume the Vec in order to move T values out of it. What it can do is hand out references to T items sitting inside the Vec, i.e. items of type &T. In the case of your callers that don't compile, the containers contain &str, so their iterators hand out &&str.
Other than making print_string() more generic, another way to fix the issue is to call it correctly to begin with. For example, these all compile:
print_strings(arr.iter().map(|sref| *sref));
print_strings(vec.iter().copied());
print_strings(it);
Playground
iter() is the method provided by slices (and therefore available on arrays and Vec) that iterates over references to elements, just like IntoIterator of &Vec. We call it explicitly to be able to call map() to convert &&str to &str the obvious way - by using the * operator to dereference the &&str. The copied() iterator adapter is another way of expressing the same, possibly a bit less cryptic than map(|x| *x). (There is also cloned(), equivalent to map(|x| x.clone()).)
It's also possible to call print_strings() if you have a container with String values:
let v = vec!["foo".to_owned(), "bar".to_owned()];
print_strings(v.iter().map(|s| s.as_str()));

Is it possible to make my own Box-like wrapper?

I noticed that Box<T> implements everything that T implements and can be used transparently. For Example:
let mut x: Box<Vec<u8>> = Box::new(Vec::new());
x.push(5);
I would like to be able to do the same.
This is one use case:
Imagine I'm writing functions that operate using an axis X and an axis Y. I'm using values to change those axis that are of type numbers but belongs only to one or the other axis.
I would like my compiler to fail if I attempt to do operations with values that doesn't belong to the good axis.
Example:
let x = AxisX(5);
let y = AxisY(3);
let result = x + y; // error: incompatible types
I can do this by making a struct that will wrap the numbers:
struct AxisX(i32);
struct AxisY(i32);
But that won't give me access to all the methods that i32 provides like abs(). Example:
x.abs() + 3 // error: abs() does not exist
// ...maybe another error because I don't implement the addition...
Another possible use case:
You can appropriate yourself a struct of another library and implement or derive anything more you would want. For example: a struct that doesn't derive Debug could be wrapped and add the implementation for Debug.
You are looking for std::ops::Deref:
In addition to being used for explicit dereferencing operations with the (unary) * operator in immutable contexts, Deref is also used implicitly by the compiler in many circumstances. This mechanism is called 'Deref coercion'. In mutable contexts, DerefMut is used.
Further:
If T implements Deref<Target = U>, and x is a value of type T, then:
In immutable contexts, *x on non-pointer types is equivalent to *Deref::deref(&x).
Values of type &T are coerced to values of type &U
T implicitly implements all the (immutable) methods of the type U.
For more details, visit the chapter in The Rust Programming Language as well as the reference sections on the dereference operator, method resolution and type coercions.
By implementing Deref it will work:
impl Deref for AxisX {
type Target = i32;
fn deref(&self) -> &i32 {
&self.0
}
}
x.abs() + 3
You can see this in action on the Playground.
However, if you call functions from your underlying type (i32 in this case), the return type will remain the underlying type. Therefore
assert_eq!(AxisX(10).abs() + AxisY(20).abs(), 30);
will pass. To solve this, you may overwrite some of those methods you need:
impl AxisX {
pub fn abs(&self) -> Self {
// *self gets you `AxisX`
// **self dereferences to i32
AxisX((**self).abs())
}
}
With this, the above code fails. Take a look at it in action.

How do I create an iterator that invalidates the previous result each time `next` is called? [duplicate]

This would make it possible to safely iterate over the same element twice, or to hold some state for the global thing being iterated over in the item type.
Something like:
trait IterShort<Iter>
where Self: Borrow<Iter>,
{
type Item;
fn next(self) -> Option<Self::Item>;
}
then an implementation could look like:
impl<'a, MyIter> IterShort<MyIter> for &'a mut MyIter {
type Item = &'a mut MyItem;
fn next(self) -> Option<Self::Item> {
// ...
}
}
I realize I could write my own (I just did), but I'd like one that works with the for-loop notation. Is that possible?
The std::iter::Iterator trait can not do this, but you can write a different trait:
trait StreamingIterator {
type Item;
fn next<'a>(&'a mut self) -> Option<&'a mut Self::Item>;
}
Note that the return value of next borrows the iterator itself, whereas in Vec::iter for example it only borrows the vector.
The downside is that &mut is hard-coded. Making it generic would require higher-kinded types (so that StreamingIterator::Item could itself be generic over a lifetime parameter).
Alexis Beingessner gave a talk about this and more titled Who Owns This Stream of Data? at RustCamp.
As to for loops, they’re really tied to std::iter::IntoIterator which is tied to std::iter::Iterator. You’d just have to implement both.
The standard iterators can't do this as far as I can see. The very definition of an iterator is that the outside has control over the elements while the inside has control over what produces the elements.
From what I understand of what you are trying to do, I'd flip the concept around and instead of returning elements from an iterator to a surrounding environment, pass the environment to the iterator. That is, you create a struct with a constructor function that accepts a closure and implements the iterator trait. On each call to next, the passed-in closure is called with the next element and the return value of that closure or modifications thereof are returned as the current element. That way, next can handle the lifetime of whatever would otherwise be returned to the surrounding environment.

Returning a borrow from an owned resource

I am trying to write a function that maps an Arc<[T]> into an Iterable, for use with flat_map (that is, I want to call i.flat_map(my_iter) for some other i: Iterator<Item=Arc<[T]>>).
fn my_iter<'a, T>(n: Arc<[T]>) -> slice::Iter<'a, T> {
let t: &'a [T] = &*n.clone();
t.into_iter()
}
The function above does not work because n.clone() produces an owned value of type Arc<[T]>, which I can dereference to [T] and then borrow to get &[T], but the lifetime of the borrow only lasts until the end of the function, while the 'a lifetime lasts until the client drops the returned iterator.
How do I clone the Arc in such a way that the client takes ownership of the clone, so that the value is only dropped after the client is done with the iterator (assuming no one else is using the Arc)?
Here's some sample code for the source iterator:
struct BaseIter<T>(Arc<[T]>);
impl<T> Iterator for BaseIter<T> {
type Item = Arc<[T]>;
fn next(&mut self) -> Option<Self::Item> {
Some(self.0.clone())
}
}
How do I implement the result of BaseIter(data).flat_map(my_iter) (which is of type Iterator<&T>) given that BaseIter is producing data, not just borrowing it? (The real thing is more complicated than this, it's not always the same result, but the ownership semantics are the same.)
You cannot do this. Remember, lifetimes in Rust are purely compile-time entities and are only used to validate that your code doesn't accidentally access dropped data. For example:
fn my_iter<'a, T>(n: Arc<[T]>) -> slice::Iter<'a, T>
Here 'a does not "last until the client drops the returned iterator"; this reasoning is incorrect. From the point of view of slice::Iter its lifetime parameter means the lifetime of the slice it is pointing at; from the point of view of my_iter 'a is just a lifetime parameter which can be chosen arbitrarily by the caller. In other words, slice::Iter is always tied to some slice with some concrete lifetime, but the signature of my_iter states that it is able to return arbitrary lifetime. Do you see the contradiction?
As a side note, due to covariance of lifetimes you can return a slice of a static slice from such a function:
static DATA: &'static [u8] = &[1, 2, 3];
fn get_data<'a>() -> &'a [u8] {
DATA
}
The above definition compiles, but it only works because DATA is stored in static memory of your program and is always valid when your program is running; this is not so with Arc<[T]>.
Arc<[T]> implies shared ownership, that is, the data inside Arc<[T]> is jointly owned by all clones of the original Arc<[T]> value. Therefore, when the last clone of an Arc goes out of scope, the value it contains is dropped, and the respective memory is freed. Now, consider what would happen if my_iter() was allowed to compile:
let iter = {
let data: Arc<[i32]> = get_arc_slice();
my_iter(data.clone())
};
iter.map(|x| x+1).collect::<Vec<_>>();
Because in my_iter() 'a can be arbitrary and is not linked in any way to Arc<[T]> (and can not be, actually), nothing prevents this code from compilation - the user might as well choose 'static lifetime. However, here all clones of data will be dropped inside the block, and the array it contains inside will be freed. Using iter after the block is unsafe because it now provides access to the freed memory.
How do I clone the Arc in such a way that the client takes ownership of the clone, so that the value is only dropped after the client is done with the iterator (assuming no one else is using the Arc)?
So, as follows from the above, this is impossible. Only the owner of the data determines when this data should be destroyed, and borrowed references (whose existence is always implied by lifetime parameters) may only borrow the data for the time when it exists, but borrows cannot affect when and how the data is destroyed. In order for borrowed references to compile, they need to always borrow only the data which is valid through the whole time these references are active.
What you can do is to rethink your architecture. It is hard to say what exactly can be done without looking at the full code, but in the case of this particular example you can, for example, first collect the iterator into a vector and then iterate over the vector:
let items: Vec<_> = your_iter.collect();
items.iter().flat_map(my_iter)
Note that now my_iter() should indeed accept &Arc<[T]>, just as Francis Gagné has suggested; this way, the lifetimes of the output iterator will be tied to the lifetime of the input reference, and everything should work fine, because now it is guaranteed that Arcs are stored stably in the vector for their later perusal during the iteration.
There's no way to make this work by passing an Arc<[T]> by value. You need to start from a reference to an Arc<[T]> in order to construct a valid slice::Iter.
fn my_iter<'a, T>(n: &'a Arc<[T]>) -> slice::Iter<'a, T> {
n.into_iter()
}
Or, if we elide the lifetimes:
fn my_iter<T>(n: &Arc<[T]>) -> slice::Iter<T> {
n.into_iter()
}
You need to use another iterator as return type of the function my_iter. slice::Iter<'a, T> has an associated type Item = &'a T. You need an iterator with associated type Item = T. Something like vec::IntoIter<T>. You can implement such an iterator yourself:
use std::sync::Arc;
struct BaseIter<T>(Arc<[T]>);
impl<T> Iterator for BaseIter<T> {
type Item = Arc<[T]>;
fn next(&mut self) -> Option<Self::Item> {
Some(self.0.clone())
}
}
struct ArcIntoIter<T>(usize, Arc<[T]>);
impl<T:Clone> Iterator for ArcIntoIter<T> {
type Item = T;
fn next(&mut self) -> Option<Self::Item> {
if self.0 < self.1.len(){
let i = self.0;
self.0+=1;
Some(self.1[i].clone())
}else{
None
}
}
}
fn my_iter<T>(n: Arc<[T]>) -> ArcIntoIter<T> {
ArcIntoIter(0, n)
}
fn main() {
let data = Arc::new(["A","B","C"]);
println!("{:?}", BaseIter(data).take(3).flat_map(my_iter).collect::<String>());
//output:"ABCABCABC"
}

Resources