How to avoid excessive cloning in Rust?

How to avoid excessive cloning in Rust? - rust

I'm trying to learn Rust, and like many before me set off to write a Fibonacci sequence iterator for practice. My first pass used u32s and worked fine, so I decided to try to write a generic version. This is my result:
use num::Integer;
use std::ops::Add;
pub struct Fibonacci<T: Integer + Add + Clone> {
nth: T,
n_plus_one_th: T,
}
impl<T: Integer + Add + Clone> Iterator for Fibonacci<T> {
type Item = T;
fn next(&mut self) -> Option<T> {
let temp = self.nth.clone();
self.nth = self.n_plus_one_th.clone();
self.n_plus_one_th = temp.clone() + self.n_plus_one_th.clone();
Some(temp)
}
}
impl<T: Integer + Add + Clone> Fibonacci<T> {
pub fn new() -> Fibonacci<T> {
Fibonacci {
nth: T::one(),
n_plus_one_th: T::one(),
}
}
}
I tested this with a u32 and a num::BigUint, and it works fine. I'm concerned about all the cloning in the next method though. In particular, I don't understand why I need to clone during the addition step.
I suspect there is a better way to write this using some of Rust's more advanced reference concepts, but so far I haven't figured it out.

The solution is to use a where clause like so:
extern crate num;
use num::One;
use std::ops::Add;
pub struct Fibonacci<T> {
nth: T,
n_plus_one_th: T,
}
impl<T> Fibonacci<T>
where T: One
{
pub fn new() -> Fibonacci<T> {
Fibonacci {
nth: T::one(),
n_plus_one_th: T::one(),
}
}
}
impl<T> Iterator for Fibonacci<T>
where for<'a> &'a T: Add<&'a T, Output = T>
{
type Item = T;
fn next(&mut self) -> Option<T> {
use std::mem::swap;
let mut temp = &self.nth + &self.n_plus_one_th;
swap(&mut self.nth, &mut temp);
swap(&mut self.n_plus_one_th, &mut self.nth);
Some(temp)
}
}
Specifically, the for<'a> &'a T: Add<&'a T, Output=T> clause reads as "for any lifetime 'a, &'a T must implement Add with a RHS of &'a T and Output=T. That is, you can add two &Ts to get a new T.
With that, the only remaining issue is shuffling the values around, which can be done using swap.
I also took the liberty of simplifying the constraints elsewhere (you only needed One, not Integer).

Related

Non-dyn iterable?

I am trying to avoid lifetimes because I still don't have a good understanding of the concept. I am reading this wonderful article and it clarifies many misunderstandings. Although I am not sure I can solve the problem.
I know how to implement iterable collection on dyn. Playground:
use std::collections::HashMap;
pub trait Enumerable {
fn elements<'a, 'b>(&'a self) -> Box<dyn Iterator<Item = (i32, &String)> + 'b>
where
'a: 'b;
}
#[derive(Debug)]
struct Container {
pub map: HashMap<i32, String>,
}
impl Enumerable for Container {
fn elements<'a, 'b>(&'a self) -> Box<dyn Iterator<Item = (i32, &String)> + 'b>
where
'a: 'b,
{
Box::new(self.map.iter().map(|el| (*el.0, el.1)))
}
}
My attempt to implement the same code without dyn. Playground:
use std::collections::HashMap;
pub trait Enumerable<'it, 'it2, It>
where
It: Iterator<Item = &'it2 (i32, &'it String)>,
'it: 'it2,
{
fn elements<'a, 'b>(&'a self) -> It
where
'a: 'b;
}
#[derive(Debug)]
struct Container {
pub map: HashMap<i32, String>,
}
impl<'it, 'it2> Enumerable<'it, 'it2, core::slice::Iter<'it, (i32, &String)>> for Container {
fn elements<'a, 'b>(&'a self) -> core::slice::Iter<'a, (i32, &'b String)>
where
'a: 'b,
{
self.map.iter().map(|el| (*el.0, el.1))
}
}
I was thinking about using impls but there is a restriction on using it in a trait. What is wrong with the code? What other useful articles can you recommend?

Apart of the fact that I don't think you need two separate lifetimes 'a and 'b, your first code example already looks quite promising.
Then, once you have only one lifetime, Rust can figure out lifetimes without any annotations:
use std::collections::HashMap;
pub trait Enumerable {
fn elements(&self) -> Box<dyn Iterator<Item = (i32, &String)> + '_>;
}
#[derive(Debug)]
struct Container {
pub map: HashMap<i32, String>,
}
impl Enumerable for Container {
fn elements(&self) -> Box<dyn Iterator<Item = (i32, &String)> + '_> {
Box::new(self.map.iter().map(|el| (*el.0, el.1)))
}
}
I know this is an XY-problem answer, but maybe it helps anyway. Your main reasoning behind not using dyn was to not deal with lifetimes, so I thought this might be relevant.

There is a solution to the original problem with GAT and TAIT which are not part of stable channgel for today.
Solution is
mod mod1 {
pub trait Enumerable {
type It<'it>: Iterator<Item = (i32, &'it String)>
where
Self: 'it;
fn elements(&self) -> Self::It<'_>;
}
}
//
impl mod1::Enumerable for Container {
type It<'it> = impl Iterator<Item = (i32, &'it String)>;
fn elements(&self) -> Self::It<'_> {
self.map.iter().map(|el| (*el.0, el.1))
}
}
Full solution
There are alternative solutions, but this one works even if the trait is not part of your crate.
Also, I should note, that if possible to avoid using lifetimes you can implement IntoIterator for your &Container:
impl< 'it > IntoIterator for &'it Container
{
type Item = ( &'it i32, &'it String );
type IntoIter = std::collections::hash_map::Iter< 'it, i32, String >;
fn into_iter( self ) -> Self::IntoIter
{
self.map.iter()
}
}
Full solution of a tweaked problem
Because the lifetime is dropped that works even on stable Rust.
Most probably you want to have your own InotIterator-like trait, especially if there is more than a single way how can you iterate your container, but if not you can simply implement standard IntoIterator for reference.

Trouble implementing custom IntoIterator trait

I'm new to rust, so forgive me if the question is naive.
I'm trying to build an OS in rust and I'm following this tutorial. The OS doesn't have memory management yet, so the goal is to build an object which is like a vector in that it can be pushed and popped etc, but it lives on the stack. We do this by initializing it with an array of fixed size. It looks like this:
#[derive(Debug)]
pub struct StackVec<'a, T: 'a> {
storage: &'a mut [T],
len: usize
}
impl<'a, T: 'a> StackVec<'a, T> {
pub fn new(storage: &'a mut [T]) -> StackVec<'a, T> {
StackVec {
storage: storage,
len: 0,
}
}
pub fn with_len(storage: &'a mut [T], len: usize) -> StackVec<'a, T> {
if len > storage.len(){
panic!();
}
StackVec{
storage: storage,
len: len
}
}
pub fn capacity(&self) -> usize {
self.storage.len()
}
pub fn into_slice(self) -> &'a mut [T] {
&mut self.storage[0..self.len]
}
// Other functions which aren't relevant for the question.
}
Popping and pushing increases and decreases the len variable and adds and removes entries from the appropriate place in the array.
Now, we also need to implement the IntoIterator trait. Given that the StackVec contains a reference to an array, I thought that I could just return an iterator from the underlying array:
impl <'a, T:'a> IntoIterator for StackVec<'a, T> {
type Item = T;
type IntoIter = core::array::IntoIter; // <- Throws "not found in `core::array"
fn into_iter(self) -> Self::IntoIter {
self.into_slice().into_iter()
}
}
But no matter how much I play around with it, it still doesn't want to compile. I can't find a way to express using types that into_itershould return the iterator for the array. What am I doing wrong?

Different problems here:
You cannot use array::IntoIterator because you do not have an array, you have a slice, which is quite different. It can be solved, for example, by using the proper core::slice::Iter as in the example.
You are trying to return T but in reality you only give access to &T, so return Item should be &T
Your into_slice method uses a &mut which is not necessary, you can reslice the storage for this implementation.
impl <'a, T:'a> IntoIterator for StackVec<'a, T> {
type Item = &'a T;
type IntoIter = core::slice::Iter<'a, T>;
fn into_iter(self) -> Self::IntoIter {
self.storage[0..self.len].into_iter()
}
}
Playground

Lifetimes for generalizing trait implementation from &T to AsRef<T>

I have some problems generalizing a trait working for &str to other string types (e.g. Rc<str>, Box<str>,String).
First of all my example function should work for:
assert_eq!(count("ababbc", 'a'), 2); // already working
assert_eq!(count(Rc::from("ababbc"), 'a'), 2); // todo
assert_eq!(count("ababbc".to_string(), 'a'), 2); // todo
This is the working code, which makes the first test run:
pub trait Atom: Copy + Eq + Ord + Display + Debug {}
impl Atom for char {}
pub trait Atoms<A, I>
where
I: Iterator<Item = A>,
A: Atom,
{
fn atoms(&self) -> I;
}
impl<'a> Atoms<char, std::str::Chars<'a>> for &'a str {
fn atoms(&self) -> std::str::Chars<'a> {
self.chars()
}
}
pub fn count<I, A, T>(pattern: T, item: A) -> usize
where
A: Atom,
I: Iterator<Item = A>,
T: Atoms<A, I>,
{
pattern.atoms().filter(|i| *i == item).count()
}
To make the next tests run, I changed the signature of count and Atoms in following way:
pub trait Atoms<'a, A, I>
where
I: Iterator<Item = A> + 'a,
A: Atom,
{
fn atoms<'b>(&'b self) -> I
where
'b: 'a;
}
impl<'a, S> Atoms<'a, char, std::str::Chars<'a>> for S
where
S: AsRef<str> + 'a,
{
fn atoms<'b>(&'b self) -> std::str::Chars<'b>
where
'b: 'a,
{
self.as_ref().chars()
}
}
but now the function count does not compile any more:
pub fn count<'a, I, A, T>(pattern: T, item: A) -> usize
where
A: Atom,
I: Iterator<Item = A> + 'a,
T: Atoms<'a, A, I>,
{
pattern.atoms().filter(|i| *i == item).count()
}
Playground-Link
The compiler error is: the parameter type 'T' may not live long enough ... consider adding an explicit lifetime bound...: 'T: 'a'. This is not completely understandable for me, so I tried to apply the help T: Atoms<'a, A, I> + 'a. Now the error is: 'pattern' does not live long enough ... 'pattern' dropped here while still borrowed.
Since the latter error also occurs without implementations of Atoms and by just replacing the function body by pattern.atoms();return 1; I suspect that the type signature of Atoms is not suitable for my purpose.
Has anybody a hint what could be wrong with the type signature of Atoms or count?

trait Atoms<'a, A, I> where I: Iterator<Item = A> + 'a ...
By writing this, you're requiring the iterator to outlive the lifetime 'a. Since 'a is part of the trait's generic parameters, it must be a lifetime that extends before you start using (implicitly) <String as Atoms<'a, char, std::str::Chars>>::atoms. This directly contradicts the idea of returning a new iterator object, since that object did not exist before the call to atoms().
Unfortunately, I'm not sure how to rewrite this so that it will work, without generic associated types (a feature not yet stabilized), but I hope that at least this explanation of the problem helps.

I hope I clarify at least some parts of the problem. After doing some internet research on HRTBs and GATs I come to the conclusion that my problem is hardly solvable with stable rust.
The main problem is that one cannot
have a trait with different lifetime signature than its implementations
keep lifetimes generic in a trait for later instantiation in the implementation
limit the upper bound of a results lifetime if it is owned
I tried several approaches to but most evolve to fail:
at compiling the implementation (because the implementations lifetimes conflict with those of the trait)
at compiling the caller of the trait because a compiling implementation limits the lifetimes in a way, that no object can satisfy them.
At last I found two solutions:
Implement the trait for references
the function atoms(self) does now expect Self and not &Self
Atoms<A,I> is implemented for &'a str and &'a S where S:AsRef<str>
This gives us control of the lifetimes of the self objects ('a) and strips the lifetime completely from the trait.
The disadvantage of this approach is that we have to pass references to our count function even for smart references.
Playground-Link
use std::fmt::Display;
use std::fmt::Debug;
pub trait Atom: Copy + Eq + Ord + Display + Debug {}
impl Atom for char {}
pub trait Atoms<A, I>
where
I: Iterator<Item = A>,
A: Atom,
{
fn atoms(self) -> I;
}
impl<'a> Atoms<char, std::str::Chars<'a>> for &'a str {
fn atoms(self) -> std::str::Chars<'a> {
self.chars()
}
}
impl<'a, S> Atoms<char, std::str::Chars<'a>> for &'a S
where
S: AsRef<str>,
{
fn atoms(self) -> std::str::Chars<'a> {
self.as_ref().chars()
}
}
pub fn count<I, A, T>(pattern: T, item: A) -> usize
where
A: Atom,
I: Iterator<Item = A>,
T: Atoms<A, I>,
{
pattern.atoms().filter(|i| *i == item).count()
}
#[cfg(test)]
mod tests {
use std::rc::Rc;
use super::*;
#[test]
fn test_example() {
assert_eq!(count("ababbc", 'a'), 2);
assert_eq!(count(&"ababbc".to_string(), 'a'), 2);
assert_eq!(count(&Rc::from("ababbc"), 'a'), 2);
}
}
Switch to Generic Associated Types (unstable)
reduce the generic type Atoms<A,I> to an type Atoms<A> with an generic associated type I<'a> (which is instantiable at implementations)
now the function count can refer to the lifetime of I like this fn atoms<'a>(&'a self) -> Self::I<'a>
and all implementations just have to define how the want to map the lifetime 'a to their own lifetime (for example to Chars<'a>)
In this case we have all lifetime constraints in the trait, the implementation can consider to map this lifetime or ignore it. The trait, the implementation and the call site are concise and do not require references or helper lifetimes.
The disadvantage of this solution is that it is unstable, I do not know whether this means that runtime failures would probably occur or that api could change (or both). You will have to activate #![feature(generic_associated_types)] to let it run.
Playground-Link
use std::{fmt::Display, str::Chars};
use std::{fmt::Debug, rc::Rc};
pub trait Atom: Copy + Eq + Ord + Display + Debug {}
impl Atom for char {}
pub trait Atoms<A>
where
A: Atom,
{
type I<'a>: Iterator<Item = A>;
fn atoms<'a>(&'a self) -> Self::I<'a>;
}
impl Atoms<char> for str
{
type I<'a> = Chars<'a>;
fn atoms<'a>(&'a self) -> Chars<'a> {
self.chars()
}
}
impl <S> Atoms<char> for S
where
S: AsRef<str>,
{
type I<'a> = Chars<'a>;
fn atoms<'a>(&'a self) -> Chars<'a> {
self.as_ref().chars()
}
}
pub fn count<A, S>(pattern: S, item: A) -> usize
where
A: Atom,
S: Atoms<A>,
{
pattern.atoms().filter(|i| *i == item).count()
}
#[cfg(test)]
mod tests {
use std::rc::Rc;
use super::*;
#[test]
fn test_example() {
assert_eq!(count("ababbc", 'a'), 2);
assert_eq!(count("ababbc".to_string(), 'a'), 2);
assert_eq!(count(Rc::from("ababbc"), 'a'), 2);
}
}

How to derive Clone for structures with boxed closure?

I have the following structure:
struct MyStruct {
foo: Box<dyn Fn(usize) -> usize>
}
And I want to derive Clone for MyStruct. However, the compiler turns out error:
the trait bound `dyn std::ops::Fn(usize) -> usize: std::clone::Clone` is not satisfied
And for now (rustc 1.46.0), the dyn does not support addition of non-auto traits like Box<dyn Fn(usize) -> usize + Clone>.
From the rust-reference:
A closure is Clone or Copy if it does not capture any values by unique immutable or mutable reference, and if all values it captures by copy or move are Clone or Copy, respectively.
So I think this derivation makes sense in theory, but I don't know if I can do it for now.
I don't mind manually implementing Clone for MyStruct, but I don't know how to do it, either.
I don't want to do something like
#[derive(Clone)]
struct MyStruct<F: Fn(usize) -> usize> {
foo: F
}
because this struct is used as an associated type for an implementation of another struct for a trait, and both that struct and the trait has no generics, and I don't want to messed up with PhantomData.

Expanding on #mcarton comment above, but using the dyn-clone crate to overcome Clone not being object safe, the code is quite straightforward.
use dyn_clone::DynClone;
trait FnClone: DynClone {
fn call(&self, x: usize) -> usize;
}
impl<F> FnClone for F
where F: Fn(usize) -> usize + Clone
{
fn call(&self, x: usize) -> usize {
self(x)
}
}
struct MyStruct {
foo: Box<dyn FnClone>
}
impl Clone for MyStruct {
fn clone(&self) -> MyStruct {
MyStruct {
foo: dyn_clone::clone_box(&*self.foo),
}
}
}
fn main() {
let a = MyStruct{
foo: Box::new(|x| x + 1)
};
let b = a.clone();
println!("{} {}", a.foo.call(1), b.foo.call(2));
}
This crate also provides a macro clone_trait_object! to make the Clone derivable for the Box<dyn FnClone>, but you don't seem to want that.

How do I use a &HashSet<&T> as an IntoIterator<Item = &T>?

I have a function that takes a collection of &T (represented by IntoIterator) with the requirement that every element is unique.
fn foo<'a, 'b, T: std::fmt::Debug, I>(elements: &'b I)
where
&'b I: IntoIterator<Item = &'a T>,
T: 'a,
'b: 'a,
I would like to also write a wrapper function which can work even if the elements are not unique, by using a HashSet to remove the duplicate elements first.
I tried the following implementation:
use std::collections::HashSet;
fn wrap<'a, 'b, T: std::fmt::Debug + Eq + std::hash::Hash, J>(elements: &'b J)
where
&'b J: IntoIterator<Item = &'a T>,
T: 'a,
'b: 'a,
{
let hashset: HashSet<&T> = elements.into_iter().into_iter().collect();
foo(&hashset);
}
playground
However, the compiler doesn't seem happy with my assumption that HashSet<&T> implements IntoIterator<Item = &'a T>:
error[E0308]: mismatched types
--> src/lib.rs:10:9
|
10 | foo(&hashset);
| ^^^^^^^^ expected type parameter, found struct `std::collections::HashSet`
|
= note: expected type `&J`
found type `&std::collections::HashSet<&T>`
= help: type parameters must be constrained to match other types
= note: for more information, visit https://doc.rust-lang.org/book/ch10-02-traits.html#traits-as-parameters
I know I could use a HashSet<T> by cloning all the input elements, but I want to avoid unnecessary copying and memory use.

If you have a &HashSet<&T> and need an iterator of &T (not &&T) that you can process multiple times, then you can use Iterator::copied to convert the iterator's &&T to a &T:
use std::{collections::HashSet, fmt::Debug, hash::Hash, marker::PhantomData};
struct Collection<T> {
item: PhantomData<T>,
}
impl<T> Collection<T>
where
T: Debug,
{
fn foo<'a, I>(elements: I) -> Self
where
I: IntoIterator<Item = &'a T> + Clone,
T: 'a,
{
for element in elements.clone() {
println!("{:?}", element);
}
for element in elements {
println!("{:?}", element);
}
Self { item: PhantomData }
}
}
impl<T> Collection<T>
where
T: Debug + Eq + Hash,
{
fn wrap<'a, I>(elements: I) -> Self
where
I: IntoIterator<Item = &'a T>,
T: 'a,
{
let set: HashSet<_> = elements.into_iter().collect();
Self::foo(set.iter().copied())
}
}
#[derive(Debug, Hash, PartialEq, Eq)]
struct Foo(i32);
fn main() {
let v = vec![Foo(1), Foo(2), Foo(4)];
Collection::<Foo>::wrap(&v);
}
See also:
Using the same iterator multiple times in Rust
Does cloning an iterator copy the entire underlying vector?
Why does cloning my custom type result in &T instead of T?
Note that the rest of this answer made the assumption that a struct named Collection<T> was a collection of values of type T. OP has clarified that this is not true.
That's not your problem, as shown by your later examples. That can be boiled down to this:
struct Collection<T>(T);
impl<T> Collection<T> {
fn new(value: &T) -> Self {
Collection(value)
}
}
You are taking a reference to a type (&T) and trying to store it where a T is required; these are different types and will generate an error. You are using PhantomData for some reason and accepting references via the iterator, but the problem is the same.
In fact, PhantomData makes the problem harder to see as you can just make up values that don't work. For example, we never have any kind of string here but we "successfully" created the struct:
use std::marker::PhantomData;
struct Collection<T>(PhantomData<T>);
impl Collection<String> {
fn new<T>(value: &T) -> Self {
Collection(PhantomData)
}
}
Ultimately, your wrap function doesn't make sense, either:
impl<T: Eq + Hash> Collection<T> {
fn wrap<I>(elements: I) -> Self
where
I: IntoIterator<Item = T>,
This is equivalent to
impl<T: Eq + Hash> Collection<T> {
fn wrap<I>(elements: I) -> Collection<T>
where
I: IntoIterator<Item = T>,
Which says that, given an iterator of elements T, you will return a collection of those elements. However, you put them in a HashMap and iterate on a reference to it, which yields &T. Thus this function signature cannot be right.
It seems most likely that you want to accept an iterator of owned values instead:
use std::{collections::HashSet, fmt::Debug, hash::Hash};
struct Collection<T> {
item: T,
}
impl<T> Collection<T> {
fn foo<I>(elements: I) -> Self
where
I: IntoIterator<Item = T>,
for<'a> &'a I: IntoIterator<Item = &'a T>,
T: Debug,
{
for element in &elements {
println!("{:?}", element);
}
for element in &elements {
println!("{:?}", element);
}
Self {
item: elements.into_iter().next().unwrap(),
}
}
}
impl<T> Collection<T>
where
T: Eq + Hash,
{
fn wrap<I>(elements: I) -> Self
where
I: IntoIterator<Item = T>,
T: Debug,
{
let s: HashSet<_> = elements.into_iter().collect();
Self::foo(s)
}
}
#[derive(Debug, Hash, PartialEq, Eq)]
struct Foo(i32);
fn main() {
let v = vec![Foo(1), Foo(2), Foo(4)];
let c = Collection::wrap(v);
println!("{:?}", c.item)
}
Here we place a trait bound on the generic iterator type directly and a second higher-ranked trait bound on a reference to the iterator. This allows us to use a reference to the iterator as an iterator itself.
See also:
How does one generically duplicate a value in Rust?
Is there any way to return a reference to a variable created in a function?
How do I write the lifetimes for references in a type constraint when one of them is a local reference?

There were a number of orthogonal issues with my code that Shepmaster pointed out, but to solve the issue of using a HashSet<&T> as an IntoIterator<Item=&T>, I found that one way to solve it is with a wrapper struct:
struct Helper<T, D: Deref<Target = T>>(HashSet<D>);
struct HelperIter<'a, T, D: Deref<Target = T>>(std::collections::hash_set::Iter<'a, D>);
impl<'a, T, D: Deref<Target = T>> Iterator for HelperIter<'a, T, D>
where
T: 'a,
{
type Item = &'a T;
fn next(&mut self) -> Option<Self::Item> {
self.0.next().map(|x| x.deref())
}
}
impl<'a, T, D: Deref<Target = T>> IntoIterator for &'a Helper<T, D> {
type Item = &'a T;
type IntoIter = HelperIter<'a, T, D>;
fn into_iter(self) -> Self::IntoIter {
HelperIter((&self.0).into_iter())
}
}
Which is used as follows:
struct Collection<T> {
item: PhantomData<T>,
}
impl<T: Debug> Collection<T> {
fn foo<I>(elements: I) -> Self
where
I: IntoIterator + Copy,
I::Item: Deref<Target = T>,
{
for element in elements {
println!("{:?}", *element);
}
for element in elements {
println!("{:?}", *element);
}
return Self { item: PhantomData };
}
}
impl<T: Debug + Eq + Hash> Collection<T> {
fn wrap<I>(elements: I) -> Self
where
I: IntoIterator + Copy,
I::Item: Deref<Target = T> + Eq + Hash,
{
let helper = Helper(elements.into_iter().collect());
Self::foo(&helper);
return Self { item: PhantomData };
}
}
fn main() {
let v = vec![Foo(1), Foo(2), Foo(4)];
Collection::<Foo>::wrap(&v);
}
I'm guessing that some of this may be more complicated than it needs to be, but I'm not sure how.
full playground

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

How to avoid excessive cloning in Rust? - rust

Related

Non-dyn iterable?

Trouble implementing custom IntoIterator trait

Lifetimes for generalizing trait implementation from &T to AsRef<T>

How to derive Clone for structures with boxed closure?

How do I use a &HashSet<&T> as an IntoIterator<Item = &T>?

Categories

Resources