What is the difference between to_owned() and clone() in this context? - rust

pub fn set(&mut self, key: String, value: String) -> Result<()> {
let cmd = Command::Set {
key: key.clone(),
value: value.to_owned(),
};
serde_json::to_writer(&mut self.writer, &cmd)?;
self.writer.flush()?;
self.map.insert(key, value);
Ok(())
}
In this function I can either use .clone() or to_owned() to create that struct from my 2 function parameter String's key and value. What is the difference and which would be better for this situation?

The difference is explained in the docs of the ToOwned trait:
Some types make it possible to go from borrowed to owned, usually by
implementing the Clone trait.
But Clone works only for going from &T to T. The ToOwned trait
generalizes Clone to construct owned data from any borrow of a given
type.
In your particular case, however, given that you build a cmd for the whole purpose of passing a reference to it to serde_json::to_writer(&mut self.writer, &cmd)?;, you might want to modify the struct's fields to be just references to the types, something like this:
struct Cmd<'s> {
key: &'s str,
value: &'s str,
}
This will avoid you having to clone anything.
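To make the difference concrete, here is a minimal sketch. The names `BorrowedCmd`, `borrowed_cmd`, `copies_of` and `from_str_slice` are made up for illustration, and serde is left out so the snippet stays self-contained. It shows that on a `String` both methods produce a new `String`, that on a `&str` only `to_owned()` does, and that a struct of references needs neither.

```rust
/// Hypothetical borrowed command, modelled on the `Cmd<'s>` struct above.
#[derive(Debug, PartialEq)]
pub struct BorrowedCmd<'s> {
    pub key: &'s str,
    pub value: &'s str,
}

/// Builds the command without allocating: the fields only borrow the inputs.
pub fn borrowed_cmd<'s>(key: &'s str, value: &'s str) -> BorrowedCmd<'s> {
    BorrowedCmd { key, value }
}

/// On a `String`, `clone()` and `to_owned()` both yield a new `String`.
pub fn copies_of(s: &String) -> (String, String) {
    (s.clone(), s.to_owned())
}

/// On a `&str`, `clone()` only copies the reference (`&str` is `Copy`),
/// while `to_owned()` allocates a fresh `String`.
pub fn from_str_slice(s: &str) -> (&str, String) {
    (s.clone(), s.to_owned())
}
```

For the `set` function in the question, both parameters are already `String`, so `key.clone()` and `value.to_owned()` do the same work; which one to write there is mostly a matter of style.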

Related

Proper way in Rust to store a reference in a struct

What is the proper way to store a reference in a struct and operate on it given this example:
// Trait that cannot be changed
pub trait FooTrait {
fn open(&self, client: &SomeType);
fn close(&self);
}
pub struct Foo {
// HOW TO STORE IT HERE???
// client: &SomeType,
}
impl FooTrait for Foo {
fn open(&self, client: &SomeType) {
// HOW TO SAVE IT HERE?
// NOTE that &self cannot be changed into &mut self because the trait cannot be modified
// smth like self.client = client;
}
fn close(&self) {
// HOW TO DELETE IT HERE?
// NOTE that &self cannot be changed into &mut self because the trait cannot be modified
}
}
Is there a design pattern that could fit to my snippet?
This is horribly complicated on its surface because of lifetime issues. Rust is designed to guarantee memory safety, but this pattern creates an untenable situation where the caller of FooTrait::open() needs some way to tell Rust that the client borrow will outlive *self. If it can't do that, Rust will disallow the method call. Actually making this work with references is probably not feasible, as Foo needs a lifetime parameter, but the code that creates the Foo may not know the appropriate lifetime parameter.
You can make this pattern work by combining a few things, but only if you can modify the trait. If you can't change the definition of the trait, then what you are asking is impossible.
You need an Option so that close can clear the value.
You need interior mutability (a Cell) to allow mutating self.client even if self is a shared reference.
You need something other than a bare reference. An owned value or a shared ownership type like Rc or Arc, for example. These types sidestep the lifetime issue entirely. You can make the code generic over Borrow<T> to support them all at once.
use std::cell::Cell;
use std::borrow::Borrow;
pub trait FooTrait {
fn open(&self, client: impl Borrow<SomeType> + 'static);
fn close(&self);
}
pub struct SomeType;
pub struct Foo {
client: Cell<Option<Box<dyn Borrow<SomeType>>>>,
}
impl FooTrait for Foo {
fn open(&self, client: impl Borrow<SomeType> + 'static) {
self.client.set(Some(Box::new(client)));
}
fn close(&self) {
self.client.set(None);
}
}
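As a simplified variant of the same idea, the sketch below swaps the generic `Borrow<SomeType>` for a concrete `Rc<SomeType>`, which is one of the shared-ownership options mentioned above. The `id` field, `Foo::new` and the `client_id` helper are assumptions added for illustration. Because a `Cell` never hands out a reference to its contents, the helper takes the value out, reads it, and puts it back.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Stand-in for the real client type; the `id` field is hypothetical.
pub struct SomeType {
    pub id: u32,
}

pub trait FooTrait {
    fn open(&self, client: Rc<SomeType>);
    fn close(&self);
}

pub struct Foo {
    // Option: `close` can clear it. Cell: it can be changed through `&self`.
    client: Cell<Option<Rc<SomeType>>>,
}

impl Foo {
    pub fn new() -> Self {
        Foo { client: Cell::new(None) }
    }

    /// Hypothetical accessor: take the client out, read it, put it back.
    pub fn client_id(&self) -> Option<u32> {
        let client = self.client.take()?; // leaves `None` in the cell for now
        let id = client.id;
        self.client.set(Some(client)); // restore the stored client
        Some(id)
    }
}

impl FooTrait for Foo {
    fn open(&self, client: Rc<SomeType>) {
        self.client.set(Some(client));
    }

    fn close(&self) {
        // Dropping the stored `Rc` releases this struct's share of the client.
        self.client.set(None);
    }
}
```

A caller would keep its own `Rc` and pass `Rc::clone(&shared)` to `open`; after `close`, the caller's handle is the only one left. If the stored value must be borrowed for longer than one read, a `RefCell` is probably the more convenient container.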

Why doesn't clone() allow for this move?

I don't understand why the following doesn't work:
use std::collections::HashMap;
#[derive(Debug,Clone,PartialEq)]
struct Foo<'a> {
contents: HashMap<&'a str, Foo<'a>>,
}
fn bar<'a>(val: Foo<'a>) -> Foo<'a> {
*val.contents.get("bar").clone().unwrap()
}
error[E0507]: cannot move out of a shared reference
--> src/lib.rs:9:5
|
9 | *val.contents.get("bar").clone().unwrap()
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ move occurs because value has type `Foo<'_>`, which does not implement the `Copy` trait
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8ab3c7355903fc34751d5bd5360bb71a
I'm performing a .clone(), which I thought should allow me to return the resulting value whose ownership should be disentangled from the input value, but that doesn't appear to be the case.
Another weird thing, which isn't a blocker on its own but may hint at the underlying problem, is that for some reason the .clone() returns a &Foo, which is surprising because in other cases I've mostly seen .clone() go from &T -> T. This is why the * is there; without it this doesn't pass type checking. I know Rust has some "magical" referencing/dereferencing rules, but I can't quite figure this one out.
HashMap::get returns an Option<&T>. Cloning this option gives you another Option<&T>, referencing the same object.
If you want to convert an Option<&T> to a new Option<T> where T supports Clone, you can use .cloned(). There's also no need to dereference anymore, as you have a T, not a &T.
This means your code will look like this:
use std::collections::HashMap;
#[derive(Debug, Clone, PartialEq)]
struct Foo<'a> {
contents: HashMap<&'a str, Foo<'a>>,
}
fn bar<'a>(val: Foo<'a>) -> Foo<'a> {
val.contents.get("bar").cloned().unwrap()
}
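The two behaviours can be compared side by side in a small sketch. The helper names `lookup_owned` and `lookup_borrowed` are made up for this illustration, and a plain `HashMap<&str, String>` stands in for the recursive `Foo` type.

```rust
use std::collections::HashMap;

/// `cloned()` turns `Option<&String>` into `Option<String>`: a new value
/// whose ownership is independent of the map.
pub fn lookup_owned(map: &HashMap<&str, String>, key: &str) -> Option<String> {
    map.get(key).cloned()
}

/// `clone()` on the `Option<&String>` only copies the reference, so the
/// result still borrows from the map (hence the `'m` lifetime).
pub fn lookup_borrowed<'m>(
    map: &'m HashMap<&str, String>,
    key: &str,
) -> Option<&'m String> {
    map.get(key).clone()
}
```

The return types tell the story: the second function cannot drop the `'m` lifetime, which is exactly why the original `bar` could not move the value out.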

Size of dyn MyTrait cannot be statically determined in method which takes self?

I'm creating a HashMap<u64, Box<dyn MyTrait>>. I can create the HashMap and insert a struct that implements MyTrait, but when I retrieve MyTrait and try to use it, the compiler complains at me:
error[E0161]: cannot move a value of type dyn MyTrait: the size of dyn MyTrait cannot be statically determined
I was under the impression that a trait consists of two pointers, one to the vtable and one to the object data. So the size of any trait, including MyTrait, should be 2 * pointer_size. Furthermore, the object data pointer points to a MyStruct, which is of known size. Clearly I'm wrong in my understanding but I can't figure out why.
Here's my code:
use std::collections::HashMap;
fn main() {
let mut hm: HashMap<u64, Box<dyn MyTrait>> = HashMap::new();
hm.insert(0, Box::new(MyStruct{num: 0}));
match hm.get(&(0 as u64)) {
Some(r) => {
r.my_fun();
}
None => { println!("not found");}
}
}
pub trait MyTrait {
fn my_fun(self);
}
struct MyStruct {
num: u64,
}
impl MyTrait for MyStruct {
fn my_fun(self) {
println!("num is {}", self.num);
return
}
}
When you declare the method in MyTrait as fn my_fun(self);, you create a method that takes the self parameter by value, not by reference. In Rust, there is no implicit clone for most values (and for the types that are copied implicitly, namely Copy types, the copy is only a straightforward bit-for-bit copy).
In general, passing a parameter by value moves it: afterwards its old location is no longer valid and must not be accessed. (The compiler enforces this, so it cannot happen in safe code.)
The error isn't saying that you can't have a Box<dyn MyTrait> in the value of a HashMap, but rather that you cannot move a dyn MyTrait out of the &Box<dyn MyTrait> reference you had gotten from the hm.get call. Even if you had a sized type there instead of an unsized one, this is still not possible because you cannot move a value out from behind a shared reference, unless the type is Copy. (In which case, it is, well, copied instead of moved at all.)
Most likely, you want to use fn my_fun(&self) instead, which takes the self parameter by reference instead of value.
If you want to mutate the value in the function, you should declare the function fn my_fun(&mut self) and replace the hm.get with hm.get_mut so you have a mutable reference to your MyStruct.
If you really need to take the self parameter by value instead of reference, you can write fn my_fun(self: Box<Self>) which declares that the self parameter is a boxed value, and then change the hm.get line to hm.remove. As the name implies, this leaves the value no longer accessible in the hash map.
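The three options can be sketched together as follows. The method names `describe`, `bump` and `into_num`, and the `run_demo` function, are invented for this illustration; the original trait only has `my_fun`.

```rust
use std::collections::HashMap;

pub trait MyTrait {
    /// Borrows the value; works through `hm.get`.
    fn describe(&self) -> String;
    /// Mutably borrows the value; needs `hm.get_mut`.
    fn bump(&mut self);
    /// Consumes the boxed value; needs an owned `Box`, e.g. from `hm.remove`.
    fn into_num(self: Box<Self>) -> u64;
}

pub struct MyStruct {
    pub num: u64,
}

impl MyTrait for MyStruct {
    fn describe(&self) -> String {
        format!("num is {}", self.num)
    }

    fn bump(&mut self) {
        self.num += 1;
    }

    fn into_num(self: Box<Self>) -> u64 {
        self.num
    }
}

/// Runs the three access patterns in order and returns what each observed,
/// plus whether the key is still present at the end.
pub fn run_demo() -> (String, String, u64, bool) {
    let mut hm: HashMap<u64, Box<dyn MyTrait>> = HashMap::new();
    hm.insert(0, Box::new(MyStruct { num: 0 }));

    // Shared access: the value stays in the map.
    let before = hm.get(&0).map(|r| r.describe()).unwrap_or_default();

    // Mutable access: still in the map, but changed in place.
    if let Some(r) = hm.get_mut(&0) {
        r.bump();
    }
    let after = hm.get(&0).map(|r| r.describe()).unwrap_or_default();

    // By-value access: the box has to leave the map first.
    let num = hm.remove(&0).map(|boxed| boxed.into_num()).unwrap_or(0);

    (before, after, num, hm.contains_key(&0))
}
```

Note that the by-value call only works because `hm.remove` hands back an owned `Box<dyn MyTrait>`; calling `into_num` on the result of `hm.get` would be rejected for the same reason as in the question.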

Eliminate lifetime parameter from a trait whose implementation wraps a HashMap?

I'd like to wrap a few methods of HashMap such as insert and keys. This attempt compiles, and the tests pass:
use std::collections::HashMap;
use std::hash::Hash;
pub trait Map<'a, N: 'a> {
type ItemIterator: Iterator<Item=&'a N>;
fn items(&'a self) -> Self::ItemIterator;
fn insert(&mut self, item: N);
}
struct MyMap<N> {
map: HashMap<N, ()>
}
impl<N: Eq + Hash> MyMap<N> {
fn new() -> Self {
MyMap { map: HashMap::new() }
}
}
impl<'a, N: 'a + Eq + Hash> Map<'a, N> for MyMap<N> {
type ItemIterator = std::collections::hash_map::Keys<'a, N, ()>;
fn items(&'a self) -> Self::ItemIterator {
self.map.keys()
}
fn insert(&mut self, item: N) {
self.map.insert(item, ());
}
}
#[cfg(test)]
mod tests {
use super::*;
#[derive(Eq, Hash, PartialEq, Debug)]
struct MyItem;
#[test]
fn test() {
let mut map = MyMap::new();
let item = MyItem { };
map.insert(&item);
let foo = map.items().collect::<Vec<_>>();
for it_item in map.items() {
assert_eq!(it_item, &&item);
}
assert_eq!(foo, vec![&&item]);
}
}
I'd like to eliminate the need for the lifetime parameter in Map if possible, but so far haven't found a way. The problem seems to result from the definition of std::collections::hash_map::Keys, which requires a lifetime parameter.
Attempts to redefine the Map trait work until it becomes necessary to supply the lifetime parameter on Keys:
use std::collections::HashMap;
use std::hash::Hash;
pub trait Map<N> {
type ItemIterator: Iterator<Item=N>;
fn items(&self) -> Self::ItemIterator;
fn insert(&mut self, item: N);
}
struct MyMap<N> {
map: HashMap<N, ()>
}
impl<N: Eq + Hash> MyMap<N> {
fn new() -> Self {
MyMap { map: HashMap::new() }
}
}
// ERROR: "unconstrained lifetime parameter"
impl<'a, N> Map<N> for MyMap<N> {
type ItemIterator = std::collections::hash_map::Keys<'a, N, ()>;
}
The compiler issues an error about an unconstrained lifetime parameter that I haven't been able to fix without re-introducing the lifetime into the Map trait.
The main goal of this experiment was to see how I could also eliminate Box from previous attempts. As this question explains, that's another way to return an iterator. So I'm not interested in that approach at the moment.
How can I set up Map and an implementation without introducing a lifetime parameter or using Box?
Something to think about is that since hash_map::Keys has a generic lifetime parameter, it is probably necessary for some reason, so your trait to abstract over Keys will probably need it too.
In this case, in the definition of Map, you need some way to specify how long the ItemIterator's Item lives. (The Item is &'a N).
This was your definition:
type ItemIterator: Iterator<Item=&'a N>
You are trying to say that for any struct that implements Map, the struct's associated ItemIterator must be an iterator of references; however, this constraint alone is useless without any further information: we also need to know how long the reference lives for (hence why type ItemIterator: Iterator<Item=&N> throws an error: it is missing this information, and it cannot currently be elided AFAIK).
So, you choose 'a to name a generic lifetime that you guarantee each &'a N will be valid for. Now, in order to satisfy the borrow checker, prove that &'a N will be valid for 'a, and establish some useful promises about 'a, you specify that:
Any value for the reference &self given to items() must live at least as long as 'a. This ensures that for each of the returned items (&'a N), the &self reference must still be valid in order for the item reference to remain valid; in other words, the item references cannot outlive the borrow of self. This invariant allows you to reference &self in the return value of items(). You have specified this with fn items(&'a self). (Side note: my_map.items() is really shorthand for MyMap::items(&my_map).)
Each of the Ns themselves must also remain valid for as long as 'a. This is important if the objects contain any references that won't live forever (aka non-'static references); this ensures that all of the references that the item N contains live at least as long as 'a. You have specified this with the constraint N: 'a.
So, to recap, the definition of Map<'a, N> requires that an implementor's items() function must return an ItemIterator of references that are valid for 'a to items that are valid for 'a. Now, your implementation:
impl<'a, N: 'a + Eq + Hash> Map<'a, N> for MyMap<N> { ... }
As you can see, the 'a parameter is completely unconstrained, so you can use any 'a with the methods from Map on an instance of MyMap, as long as N fulfills its constraints of N: 'a + Eq + Hash. 'a should automatically become the longest lifetime for which both N and the map passed to items() are valid.
Anyway, what you're describing here is known as a streaming iterator, which has been a problem for years. For some relevant discussion, see RFC 1598 on generic associated types (but prepare to be overwhelmed). The RFC was approved long before it was implemented; generic associated types have since been stabilized in Rust 1.65.
Finally, as some people have commented, it's possible that your Map trait might be a bad design from the start, since it may be better expressed as a combination of the built-in IntoIterator<Item=&'a N> and a separate trait for insert(). This would mean that the default iterator used in for loops, etc. would be the items iterator, which is inconsistent with the built-in HashMap, but I am not totally clear on the purpose of your trait, so your design may well make sense.
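On a compiler with generic associated types (Rust 1.65 or later), one possible way to drop the lifetime parameter from the trait is sketched below. This is a redesign rather than a drop-in fix: it assumes the element type can move from the trait parameter `N` to an associated type `Item`, and the `count_items` helper is made up to show generic use.

```rust
use std::collections::hash_map::Keys;
use std::collections::HashMap;
use std::hash::Hash;

pub trait Map {
    type Item;
    // The lifetime now lives on the associated type, not on the trait.
    type ItemIterator<'a>: Iterator<Item = &'a Self::Item>
    where
        Self: 'a;

    fn items(&self) -> Self::ItemIterator<'_>;
    fn insert(&mut self, item: Self::Item);
}

pub struct MyMap<N> {
    map: HashMap<N, ()>,
}

impl<N: Eq + Hash> MyMap<N> {
    pub fn new() -> Self {
        MyMap { map: HashMap::new() }
    }
}

impl<N: Eq + Hash> Map for MyMap<N> {
    type Item = N;
    type ItemIterator<'a> = Keys<'a, N, ()>
    where
        Self: 'a;

    fn items(&self) -> Self::ItemIterator<'_> {
        self.map.keys()
    }

    fn insert(&mut self, item: N) {
        self.map.insert(item, ());
    }
}

/// Hypothetical helper showing that callers need no lifetime parameter.
pub fn count_items<M: Map>(map: &M) -> usize {
    map.items().count()
}
```

The `where Self: 'a` clause is what ties each iterator to the borrow of the map it came from, which is the job the trait-level `'a` was doing before.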

How to express lifetime for Rust iterator for a container

I have a circular buffer like this:
struct CircularBuffer<T: Copy> {
seqno: usize,
data: Vec<T>,
}
And I want to create an external struct being an iterator. This struct would refer to the internal data vector of the CircularBuffer like this one:
struct CircularBufferIterator<'a, T: 'a + Copy> {
buffer: &'a CircularBuffer<T>,
position: usize,
limit: usize,
}
This is the best I could come up with that actually compiles. Can you please suggest a better way to express that the CircularBufferIterator depends on the CircularBuffer object?
What troubles me is T: 'a + Copy. I wonder if it is possible or it makes sense to say that not the T type, but CircularBuffer<T> is the one CircularBufferIterator depends on.
The part I don't see is why do I need to add the 'a lifetime to T. Cannot that be T: Copy, without a lifetime? In other words, I cannot see a case when T reference outlives the CircularBuffer. It is the CircularBuffer reference that outlives the CircularBufferIterator.
The CircularBuffer and the context comes from this blog post.
why do I need to add the 'a lifetime to T
You aren't adding a lifetime to T; you are saying that whatever T is chosen, it can only contain references that outlive 'a. If that wasn't the case, then we might have a reference to a type that has a reference that is now invalid. Using that invalid reference would lead to memory unsafety; a key thing that Rust seeks to avoid.
I originally thought you were asking how to remove the Copy bound, so here's all that I typed up.
One change would be to remove the Copy bound from CircularBuffer but leave it on the implementation of the methods. Then you don't need it on the iterator at all:
struct CircularBuffer<T> {
seqno: usize,
data: Vec<T>,
}
struct CircularBufferIterator<'a, T: 'a> {
buffer: &'a CircularBuffer<T>,
position: usize,
limit: usize,
}
Another change would be to eschew the direct reference to the CircularBuffer altogether, and keep iterators into the Vec instead:
struct CircularBufferIterator<'a, T: 'a> {
first: std::slice::Iter<'a, T>,
second: Option<std::slice::Iter<'a, T>>,
}
However, looking at the Iterator implementation, I see it returns a T, not a &T, so you ultimately need a type that is Copy or Clone. You'll note that the standard library doesn't require this because it returns a reference to the item in the collection. If you do need a non-reference, that's what into_iter or Iterator::cloned is for.
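Here is a minimal sketch of the two-slice-iterator variant that yields references. It rests on assumptions that are not in the question: `seqno` is taken to be the total number of pushes, a hypothetical `capacity` field is added, and `new`, `push` and `iter` are illustrative methods, not the blog post's API.

```rust
pub struct CircularBuffer<T> {
    seqno: usize,    // assumed meaning: total number of pushes so far
    data: Vec<T>,
    capacity: usize, // hypothetical field, not part of the original struct
}

pub struct CircularBufferIterator<'a, T: 'a> {
    first: std::slice::Iter<'a, T>,
    second: Option<std::slice::Iter<'a, T>>,
}

impl<T> CircularBuffer<T> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        CircularBuffer { seqno: 0, data: Vec::with_capacity(capacity), capacity }
    }

    /// Appends until full, then overwrites the oldest slot.
    pub fn push(&mut self, item: T) {
        if self.data.len() < self.capacity {
            self.data.push(item);
        } else {
            let slot = self.seqno % self.capacity;
            self.data[slot] = item;
        }
        self.seqno += 1;
    }

    /// Iterates from the oldest to the newest element, by reference.
    pub fn iter(&self) -> CircularBufferIterator<'_, T> {
        // Once the buffer has wrapped, the oldest element sits at
        // `seqno % capacity`; before that, it is at index 0.
        let start = if self.data.len() < self.capacity {
            0
        } else {
            self.seqno % self.capacity
        };
        let (newer, older) = self.data.split_at(start);
        CircularBufferIterator { first: older.iter(), second: Some(newer.iter()) }
    }
}

impl<'a, T: 'a> Iterator for CircularBufferIterator<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        if let Some(item) = self.first.next() {
            return Some(item);
        }
        // First half exhausted: switch over to the second half, once.
        let mut second = self.second.take()?;
        let item = second.next();
        self.first = second;
        item
    }
}
```

Since the iterator yields `&T`, no `Copy` or `Clone` bound appears anywhere; a caller who wants owned values can write `buf.iter().copied()` or `buf.iter().cloned()` and pay for the bound only there.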
