Mutating one field while iterating over another immutable field - rust

Given the following program:
struct Data {
pub items: Vec<&'static str>,
}
trait Generator {
fn append(&mut self, s: &str) {
self.output().push_str(s);
}
fn data(&self) -> &Data;
fn generate_items(&mut self) {
for item in self.data().items.iter() {
match *item {
"foo" => self.append("it was foo\n"),
_ => self.append("it was something else\n"),
}
}
}
fn output(&mut self) -> &mut String;
}
struct MyGenerator<'a> {
data: &'a Data,
output: String,
}
impl<'a> MyGenerator<'a> {
fn generate(mut self) -> String {
self.generate_items();
self.output
}
}
impl<'a> Generator for MyGenerator<'a> {
fn data(&self) -> &Data {
self.data
}
fn output(&mut self) -> &mut String {
&mut self.output
}
}
fn main() {
let data = Data {
items: vec!["foo", "bar", "baz"],
};
let generator = MyGenerator {
data: &data,
output: String::new(),
};
let output = generator.generate();
println!("{}", output);
}
The following errors are produced trying to compile it:
error[E0502]: cannot borrow `*self` as mutable because it is also borrowed as immutable
--> src/main.rs:15:26
|
13 | for item in self.data().items.iter() {
| ---- - immutable borrow ends here
| |
| immutable borrow occurs here
14 | match *item {
15 | "foo" => self.append("it was foo\n"),
| ^^^^ mutable borrow occurs here
error[E0502]: cannot borrow `*self` as mutable because it is also borrowed as immutable
--> src/main.rs:16:22
|
13 | for item in self.data().items.iter() {
| ---- - immutable borrow ends here
| |
| immutable borrow occurs here
...
16 | _ => self.append("it was something else\n"),
| ^^^^ mutable borrow occurs here
What is the proper way to structure the code so that the mutable field output can be written to while iterating over the immutable field data? Assume the indirection through the Generator trait is being used to share similar logic with other structs, so accessing MyGenerator's fields from the trait's default method implementations needs to be done through accessor methods like this.

This is a common issue in Rust; the typical way of solving it is the replace dance. This involves making more of the data and methods use mutable references:
struct Data {
pub items: Vec<&'static str>,
}
trait Generator {
fn append(&mut self, s: &str) {
self.output().push_str(s);
}
fn data(&mut self) -> &mut Data;
fn generate_items(&mut self) {
// Take the data. The borrow on self ends after this statement.
let data = std::mem::replace(self.data(), Data { items: vec![] });
// Iterate over the local version. Now append can borrow all it wants.
for item in data.items.iter() {
match *item {
"foo" => self.append("it was foo\n"),
_ => self.append("it was something else\n"),
}
}
// Put the data back where it belongs.
*self.data() = data;
}
fn output(&mut self) -> &mut String;
}
struct MyGenerator<'a> {
data: &'a mut Data,
output: String,
}
impl<'a> MyGenerator<'a> {
fn generate(mut self) -> String {
self.generate_items();
self.output
}
}
impl<'a> Generator for MyGenerator<'a> {
fn data(&mut self) -> &mut Data {
self.data
}
fn output(&mut self) -> &mut String {
&mut self.output
}
}
fn main() {
let mut data = Data {
items: vec!["foo", "bar", "baz"],
};
let generator = MyGenerator {
data: &mut data,
output: String::new(),
};
let output = generator.generate();
println!("{}", output);
}
The thing to realize is that the compiler is right to complain. Imagine if calling output() had the side effect of mutating the thing that is referenced by the return value of data(). Then the iterator you're using in the loop could get invalidated. Your trait functions have the implicit contract that they don't do anything like that, but there is no way of checking this. So the only thing you can do is temporarily assume full control over the data, by taking it out.
Of course, this pattern breaks unwind safety; a panic in the loop will leave the empty placeholder in place of the data, because the statement that puts it back never runs.
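When the value being swapped out implements Default, the same dance can be written with std::mem::take instead of spelling out a placeholder. The following is a minimal sketch under assumptions: the accessor items() and the driver run_take_demo are hypothetical names, and the generator owns its Vec directly rather than borrowing a Data.

```rust
use std::mem;

// Hypothetical variant of the trait: the accessor hands out the Vec itself,
// which implements Default, so `mem::take` can leave an empty Vec behind.
trait Generator {
    fn items(&mut self) -> &mut Vec<&'static str>;
    fn output(&mut self) -> &mut String;

    fn append(&mut self, s: &str) {
        self.output().push_str(s);
    }

    fn generate_items(&mut self) {
        // Move the items out; the borrow of `self` ends with this statement.
        let items = mem::take(self.items());
        for item in &items {
            match *item {
                "foo" => self.append("it was foo\n"),
                _ => self.append("it was something else\n"),
            }
        }
        // Move the items back. If `append` panicked, this line would never
        // run and the Vec behind `items()` would stay empty.
        *self.items() = items;
    }
}

struct MyGenerator {
    items: Vec<&'static str>,
    output: String,
}

impl Generator for MyGenerator {
    fn items(&mut self) -> &mut Vec<&'static str> {
        &mut self.items
    }
    fn output(&mut self) -> &mut String {
        &mut self.output
    }
}

// Illustrative driver: returns the generated text and the number of items
// still stored in the generator afterwards.
fn run_take_demo() -> (String, usize) {
    let mut generator = MyGenerator {
        items: vec!["foo", "bar"],
        output: String::new(),
    };
    generator.generate_items();
    (generator.output, generator.items.len())
}
```

The second element of the returned tuple shows that the items are back in place once the loop has finished normally.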

Assume the indirection through the Generator trait is being used to share similar logic with other structs, so accessing MyGenerator's fields from the trait's default method implementations needs to be done through accessor methods like this.
Then it's impossible.
The compiler recognizes access to different fields when it sees such fields directly; it does not break abstraction boundaries to peek inside the functions called.
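To illustrate the point about direct field access, here is a small sketch with the trait removed entirely. The struct layout and the helper run_direct_demo are assumptions made for the example; the only claim is that the borrow checker accepts disjoint field borrows when both are visible in one function body.

```rust
struct MyGenerator {
    items: Vec<&'static str>,
    output: String,
}

impl MyGenerator {
    fn generate_items(&mut self) {
        // `self.items` is borrowed immutably and `self.output` mutably.
        // Both borrows are spelled out in this one function body, so the
        // compiler can see that they touch different fields.
        for item in self.items.iter() {
            match *item {
                "foo" => self.output.push_str("it was foo\n"),
                _ => self.output.push_str("it was something else\n"),
            }
        }
    }
}

// Illustrative driver that returns the generated text.
fn run_direct_demo() -> String {
    let mut generator = MyGenerator {
        items: vec!["foo", "bar", "baz"],
        output: String::new(),
    };
    generator.generate_items();
    generator.output
}
```

Replacing either field access with a method call on self hides the field from the borrow checker and brings the original error back.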
There have been discussions about adding attributes on the methods to specifically mention which field is accessed by which method:
the compiler would enforce that a method does not touch any field NOT mentioned in the attribute
the compiler could then use the knowledge that said method only operates on a subset of the fields
however... this is for non-virtual methods.
For a trait this gets significantly more complicated because a trait does not have fields, and each implementer may have a different set of fields!
So now what?
You will need to change your code:
you can split the trait in two, and require two objects (one to iterate, one to mutate)
you can "hide" the mutability of the append method, forcing users to use interior mutability
...
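One way to read the "two objects" suggestion, sketched here as an assumption rather than as what the answer's author had in mind, is a single accessor that returns both halves at once. The method name parts and the driver run_parts_demo are hypothetical. The implementer can see its own fields, so it is the one that performs the split borrow.

```rust
struct Data {
    items: Vec<&'static str>,
}

// Hypothetical trait shape: one accessor returns the read-only data and the
// writable output together.
trait Generator {
    fn parts(&mut self) -> (&Data, &mut String);

    fn generate_items(&mut self) {
        // A single borrow of `self` yields two independent references.
        let (data, output) = self.parts();
        for item in data.items.iter() {
            match *item {
                "foo" => output.push_str("it was foo\n"),
                _ => output.push_str("it was something else\n"),
            }
        }
    }
}

struct MyGenerator<'a> {
    data: &'a Data,
    output: String,
}

impl<'a> Generator for MyGenerator<'a> {
    fn parts(&mut self) -> (&Data, &mut String) {
        // Direct field access: the compiler sees the fields are disjoint.
        (self.data, &mut self.output)
    }
}

// Illustrative driver that returns the generated text.
fn run_parts_demo() -> String {
    let data = Data {
        items: vec!["foo", "bar", "baz"],
    };
    let mut generator = MyGenerator {
        data: &data,
        output: String::new(),
    };
    generator.generate_items();
    generator.output
}
```

The cost of this shape is that helpers such as append can no longer take &mut self inside the loop; they have to work on the output reference instead.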

You can use RefCell:
RefCell uses Rust's lifetimes to implement 'dynamic borrowing', a
process whereby one can claim temporary, exclusive, mutable access to
the inner value. Borrows for RefCells are tracked 'at runtime',
unlike Rust's native reference types which are entirely tracked
statically, at compile time. Because RefCell borrows are dynamic it
is possible to attempt to borrow a value that is already mutably
borrowed; when this happens it results in thread panic.
use std::cell::{RefCell, RefMut};
struct Data {
pub items: Vec<&'static str>,
}
trait Generator {
fn append(&self, s: &str) {
self.output().push_str(s);
}
fn data(&self) -> &Data;
fn generate_items(&self) {
for item in self.data().items.iter() {
match *item {
"foo" => self.append("it was foo\n"),
_ => self.append("it was something else\n"),
}
}
}
fn output(&self) -> RefMut<String>;
}
struct MyGenerator<'a> {
data: &'a Data,
output: RefCell<String>,
}
impl<'a> MyGenerator<'a> {
fn generate(self) -> String {
self.generate_items();
self.output.into_inner()
}
}
impl<'a> Generator for MyGenerator<'a> {
fn data(&self) -> &Data {
self.data
}
fn output(&self) -> RefMut<String> {
self.output.borrow_mut()
}
}
fn main() {
let data = Data {
items: vec!["foo", "bar", "baz"],
};
let generator = MyGenerator {
data: &data,
output: RefCell::new(String::new()),
};
let output = generator.generate();
println!("{}", output);
}
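The quoted documentation mentions that overlapping borrows are only caught at runtime. The sketch below shows that behaviour in isolation using try_borrow_mut, which reports the conflict as an Err where borrow_mut would panic. The function name is made up for the example.

```rust
use std::cell::RefCell;

fn overlapping_borrow_is_detected() -> bool {
    let output = RefCell::new(String::new());

    // First mutable borrow, kept alive by the `guard` binding.
    let mut guard = output.borrow_mut();
    guard.push_str("it was foo\n");

    // A second mutable borrow while `guard` is alive is refused at runtime.
    let second_attempt_failed = output.try_borrow_mut().is_err();

    // Once the guard is dropped, the cell can be borrowed again.
    drop(guard);
    let reborrow_ok = output.try_borrow_mut().is_ok();

    second_attempt_failed && reborrow_ok
}
```

In the generator above this situation does not arise as long as each RefMut returned by output() is dropped before the next call, which is the case when it is only used as a temporary inside append.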

Related

Passing both &mut self and &self to the same function

Stripped down to the bare essentials, my problematic code looks as follows:
pub struct Item;
impl Item {
/// Partial copy. Not the same as simple assignment.
pub fn copy_from(&mut self, _other: &Item) {
}
}
pub struct Container {
items: Vec<Item>,
}
impl Container {
pub fn copy_from(&mut self, self_idx: usize, other: &Container, other_idx: usize) {
self.items[self_idx].copy_from(&other.items[other_idx]);
}
}
fn main() {
let mut container = Container { items: vec![Item, Item] };
container.copy_from(0, &container, 1);
}
This is of course rejected by the borrow checker:
error[E0502]: cannot borrow `container` as mutable because it is also borrowed as immutable
--> src/main.rs:21:5
|
21 | container.copy_from(0, &container, 1);
| ^^^^^^^^^^---------^^^^----------^^^^
| | | |
| | | immutable borrow occurs here
| | immutable borrow later used by call
| mutable borrow occurs here
For more information about this error, try `rustc --explain E0502`.
I understand why that happens, but I don't have a good solution.
I've considered adding a dedicated copy_from_self function that callers need to use in cases where self == other:
pub fn copy_from_self(&mut self, to_idx: usize, from_idx: usize) {
if to_idx != from_idx {
unsafe {
let from_item: *const Item = &self.items[from_idx];
self.items[to_idx].copy_from(&*from_item);
}
}
}
But this is un-ergonomic, bloats the API surface, and needs unsafe code inside.
Note that in reality, the internal items data structure is not a simple Vec, so any approach specific to Vec or slice will not work.
Is there an elegant, idiomatic solution to this problem?
If I understand the comments on the question correctly, a general solution seems to be impossible, so this answer is necessarily specific to my actual situation.
As mentioned, the actual data structure is not a Vec. If it were a Vec, we could use split_at_mut to at least implement copy_from_self safely.
But as it happens, my actual data structure is backed by a Vec, so I was able to add a helper function:
/// Returns a pair of mutable references to different items. Useful if you need to pass
/// a reference to one item to a function that takes `&mut self` on another item.
/// Panics if `a == b`.
fn get_mut_2(&mut self, a: usize, b: usize) -> (&mut T, &mut T) {
assert!(a != b, "cannot call get_mut_2 with the same index {} == {}", a, b);
if a < b {
// Split so that `a` lands in the first half and `b` starts the second.
let (first, second) = self.items.split_at_mut(b);
(&mut first[a], &mut second[0])
} else {
// Split so that `b` lands in the first half and `a` starts the second.
let (first, second) = self.items.split_at_mut(a);
(&mut second[0], &mut first[b])
}
}
Now we can implement copy_from_self without unsafe code:
pub fn copy_from_self(&mut self, to_idx: usize, from_idx: usize) {
// Copying an item onto itself is a no-op, and get_mut_2 panics on equal indices.
if to_idx != from_idx {
let (to, from) = self.items.get_mut_2(to_idx, from_idx);
to.copy_from(from);
}
}
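For readers who want to try the split_at_mut approach in isolation, here is a self-contained sketch. It is written against a plain slice, which is an assumption: the answer's real data structure is only described as being backed by a Vec. The value field and the driver run_copy_demo are also made up so that the effect of copy_from can be observed.

```rust
struct Item {
    value: u32,
}

impl Item {
    // Stand-in for the "partial copy" from the question.
    fn copy_from(&mut self, other: &Item) {
        self.value = other.value;
    }
}

// Returns mutable references to two different elements of a slice.
// Panics if `a == b` or if either index is out of bounds.
fn get_mut_2<T>(items: &mut [T], a: usize, b: usize) -> (&mut T, &mut T) {
    assert!(a != b, "indices must differ");
    if a < b {
        let (first, second) = items.split_at_mut(b);
        (&mut first[a], &mut second[0])
    } else {
        let (first, second) = items.split_at_mut(a);
        (&mut second[0], &mut first[b])
    }
}

struct Container {
    items: Vec<Item>,
}

impl Container {
    fn copy_from_self(&mut self, to_idx: usize, from_idx: usize) {
        // Copying an item onto itself is treated as a no-op.
        if to_idx != from_idx {
            let (to, from) = get_mut_2(&mut self.items, to_idx, from_idx);
            to.copy_from(from);
        }
    }
}

// Illustrative driver: performs a few copies and returns the final values.
fn run_copy_demo() -> Vec<u32> {
    let mut container = Container {
        items: vec![Item { value: 1 }, Item { value: 2 }, Item { value: 3 }],
    };
    container.copy_from_self(0, 2); // lower index is the destination
    container.copy_from_self(2, 1); // higher index is the destination
    container.copy_from_self(1, 1); // same index: nothing happens
    container.items.iter().map(|item| item.value).collect()
}
```

Both branches of get_mut_2 are exercised by the driver, once with the destination before the source and once after it.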

Cannot move out of borrowed content from closure return value

I found this problem when working on a mid-size project. The following snippet is a minimal summary of the problem.
In the following code I try to map a list of enum variants into a Set of different enum variants. I use a closure so I can capture a mutable reference to my_list which is a list of source enum variants. The closure is then kept inside a MyType instance so it can be called later and the result used inside another method.
To keep the closure, I used a FnMut trait inside a Box. I also wrapped that inside an Option so I can set the closure after instance creation.
I based this a bit from the question asked here: structs with boxed vs. unboxed closures
use std::collections::HashSet;
enum Numbers {
One,
Two,
Three,
}
#[derive(Eq, PartialEq, Hash)]
enum Romans {
I,
II,
III,
}
struct MyType<'a> {
func: Option<Box<dyn FnMut() -> HashSet<Romans> + 'a>>,
}
impl<'a> MyType<'a> {
pub fn set_func<F>(&mut self, a_func: F)
where F: FnMut() -> HashSet<Romans> + 'a {
self.func = Some(Box::new(a_func));
}
pub fn run(&mut self) {
let result = (self.func.unwrap())();
if result.contains(&Romans::I) {
println!("Roman one!");
}
}
}
fn main() {
let my_list = vec![Numbers::One, Numbers::Three];
let mut my_type = MyType {
func: None,
};
my_type.set_func(|| -> HashSet<Romans> {
HashSet::from(my_list
.iter()
.map(|item| {
match item {
Numbers::One => Romans::I,
Numbers::Two => Romans::II,
Numbers::Three => Romans::III,
}
})
.collect()
)
});
my_type.run();
}
When I try to compile, I get the following error:
error[E0507]: cannot move out of borrowed content
--> src/main.rs:27:23
|
27 | let result = (self.func.unwrap())();
| ^^^^^^^^^ cannot move out of borrowed content
error: aborting due to previous error
I don't quite understand what is being moved out. Is it a hidden self? The resulting HashSet? or maybe the values inside the set?
What am I doing wrong?
The trouble you're having is that calling unwrap on an Option consumes it: unwrap takes self by value. Inside run(), your MyType only has a &mut self reference to itself, so it cannot take ownership of its field.
The solution is to take a mutable reference to the stored function instead:
pub fn run(&mut self) {
if let Some(func) = &mut self.func {
let result = func();
if result.contains(&Romans::I) {
println!("Roman one!");
}
}
}
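The same idea can be expressed with Option::as_mut, and Option::take covers the case where moving the closure out really is intended. The sketch below uses a simplified, hypothetical Runner type with a counter closure instead of the HashSet from the question.

```rust
struct Runner<'a> {
    func: Option<Box<dyn FnMut() -> u32 + 'a>>,
}

impl<'a> Runner<'a> {
    // Borrow the closure in place; it stays stored for later calls.
    fn run(&mut self) -> Option<u32> {
        self.func.as_mut().map(|f| f())
    }

    // Move the closure out, leaving `None` behind, then call it once.
    fn run_once(&mut self) -> Option<u32> {
        self.func.take().map(|mut f| f())
    }
}

// Illustrative driver: calls the stored closure several times and returns
// every result in order.
fn run_closure_demo() -> (Option<u32>, Option<u32>, Option<u32>, Option<u32>) {
    let mut counter = 0;
    let mut runner = Runner {
        func: Some(Box::new(|| {
            counter += 1;
            counter
        })),
    };
    let first = runner.run();
    let second = runner.run();
    let third = runner.run_once();
    // The closure was moved out by `run_once`, so nothing is left to call.
    let fourth = runner.run();
    (first, second, third, fourth)
}
```

Neither method moves out of borrowed content: as_mut only borrows the field, and take replaces it with None before handing the value over.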

How do I efficiently build a vector and an index of that vector while processing a data stream?

I have a struct Foo:
struct Foo {
v: String,
// Other data not important for the question
}
I want to handle a data stream and save the result into Vec<Foo> and also create an index for this Vec<Foo> on the field Foo::v.
I want to use a HashMap<&str, usize> for the index, where the keys will be &Foo::v and the value is the position in the Vec<Foo>, but I'm open to other suggestions.
I want to do the data stream handling as fast as possible, which requires not doing obvious things twice.
For example, I want to:
allocate a String only once per item read from the data stream
not search the index twice, once to check that the key does not exist, once for inserting new key.
not increase the run time by using Rc or RefCell.
The borrow checker does not allow this code:
let mut l = Vec::<Foo>::new();
{
let mut hash = HashMap::<&str, usize>::new();
//here is loop in real code, like:
//let mut s: String;
//while get_s(&mut s) {
let s = "aaa".to_string();
let idx: usize = match hash.entry(&s) { //a
Occupied(ent) => {
*ent.get()
}
Vacant(ent) => {
l.push(Foo { v: s }); //b
ent.insert(l.len() - 1);
l.len() - 1
}
};
// do something with idx
}
There are multiple problems:
hash.entry borrows the key so s must have a "bigger" lifetime than hash
I want to move s at line (b), while I have a read-only reference at line (a)
So how should I implement this simple algorithm without an extra call to String::clone or calling HashMap::get after calling HashMap::insert?
In general, what you are trying to accomplish is unsafe and Rust is correctly preventing you from doing something you shouldn't. For a simple example why, consider a Vec<u8>. If the vector has one item and a capacity of one, adding another value to the vector will cause a re-allocation and copying of all the values in the vector, invalidating any references into the vector. This would cause all of your keys in your index to point to arbitrary memory addresses, thus leading to unsafe behavior. The compiler prevents that.
In this case, there are two extra pieces of information that the compiler is unaware of but the programmer isn't:
There's an extra indirection: String is heap-allocated, so moving the pointer to that heap allocation isn't really a problem.
The String will never be changed. If it were, then it might reallocate, invalidating the referred-to address. Using a Box<str> instead of a String would be a way to enforce this via the type system.
In cases like this, it is OK to use unsafe code, so long as you properly document why it's not unsafe.
use std::collections::HashMap;
#[derive(Debug)]
struct Player {
name: String,
}
fn main() {
let names = ["alice", "bob", "clarice", "danny", "eustice", "frank"];
let mut players = Vec::new();
let mut index = HashMap::new();
for &name in &names {
let player = Player { name: name.into() };
let idx = players.len();
// I copied this code from Stack Overflow without reading the prose
// that describes why this unsafe block is actually safe
let stable_name: &str = unsafe { &*(player.name.as_str() as *const str) };
players.push(player);
index.insert(idx, stable_name);
}
for (k, v) in &index {
println!("{:?} -> {:?}", k, v);
}
for v in &players {
println!("{:?}", v);
}
}
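The "extra indirection" argument can be checked without any unsafe code by comparing raw pointers. This is a small sketch with a made-up function name; it only observes addresses and does not dereference anything.

```rust
fn heap_pointer_survives_moves() -> bool {
    let name = String::from("alice");
    // Address of the heap buffer that holds the bytes of "alice".
    let before = name.as_ptr();

    // Start with a tiny capacity so the pushes below force the Vec to
    // reallocate and move its elements several times.
    let mut players: Vec<String> = Vec::with_capacity(1);
    players.push(name);
    for i in 0..100 {
        players.push(format!("filler-{}", i));
    }

    // The String value itself was moved around, but its heap buffer was not.
    let after = players[0].as_ptr();
    before == after
}
```

What the check cannot show is the second condition from the list above: the address only stays valid as long as nobody mutates or drops the String.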
However, my guess is that you don't want this code in your main method but want to return it from some function. That will be a problem, as you will quickly run into Why can't I store a value and a reference to that value in the same struct?.
Honestly, there are styles of code that don't fit well within Rust's limitations. If you run into these, you could:
decide that Rust isn't a good fit for you or your problem.
use unsafe code, preferably thoroughly tested and only exposing a safe API.
investigate alternate representations.
For example, I'd probably rewrite the code to have the index be the primary owner of the key:
use std::collections::BTreeMap;
#[derive(Debug)]
struct Player<'a> {
name: &'a str,
data: &'a PlayerData,
}
#[derive(Debug)]
struct PlayerData {
hit_points: u8,
}
#[derive(Debug)]
struct Players(BTreeMap<String, PlayerData>);
impl Players {
fn new<I>(iter: I) -> Self
where
I: IntoIterator,
I::Item: Into<String>,
{
let players = iter
.into_iter()
.map(|name| (name.into(), PlayerData { hit_points: 100 }))
.collect();
Players(players)
}
fn get<'a>(&'a self, name: &'a str) -> Option<Player<'a>> {
self.0.get(name).map(|data| Player { name, data })
}
}
fn main() {
let names = ["alice", "bob", "clarice", "danny", "eustice", "frank"];
let players = Players::new(names.iter().copied());
for (k, v) in &players.0 {
println!("{:?} -> {:?}", k, v);
}
println!("{:?}", players.get("eustice"));
}
Alternatively, as shown in What's the idiomatic way to make a lookup table which uses field of the item as the key?, you could wrap your type and store it in a set container instead:
use std::collections::BTreeSet;
#[derive(Debug, PartialEq, Eq)]
struct Player {
name: String,
hit_points: u8,
}
#[derive(Debug, Eq)]
struct PlayerByName(Player);
impl PlayerByName {
fn key(&self) -> &str {
&self.0.name
}
}
impl PartialOrd for PlayerByName {
fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
Some(self.cmp(other))
}
}
impl Ord for PlayerByName {
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
self.key().cmp(&other.key())
}
}
impl PartialEq for PlayerByName {
fn eq(&self, other: &Self) -> bool {
self.key() == other.key()
}
}
impl std::borrow::Borrow<str> for PlayerByName {
fn borrow(&self) -> &str {
self.key()
}
}
#[derive(Debug)]
struct Players(BTreeSet<PlayerByName>);
impl Players {
fn new<I>(iter: I) -> Self
where
I: IntoIterator,
I::Item: Into<String>,
{
let players = iter
.into_iter()
.map(|name| {
PlayerByName(Player {
name: name.into(),
hit_points: 100,
})
})
.collect();
Players(players)
}
fn get(&self, name: &str) -> Option<&Player> {
self.0.get(name).map(|pbn| &pbn.0)
}
}
fn main() {
let names = ["alice", "bob", "clarice", "danny", "eustice", "frank"];
let players = Players::new(names.iter().copied());
for player in &players.0 {
println!("{:?}", player.0);
}
println!("{:?}", players.get("eustice"));
}
not increase the run time by using Rc or RefCell
Guessing about performance characteristics without performing profiling is never a good idea. I honestly don't believe that there'd be a noticeable performance loss from incrementing an integer when a value is cloned or dropped. If the problem required both an index and a vector, then I would reach for some kind of shared ownership.
not increase the run time by using Rc or RefCell.
@Shepmaster already demonstrated accomplishing this using unsafe. Once you have that working, I would encourage you to check how much Rc would actually cost you. Here is a full version with Rc:
use std::{
collections::{hash_map::Entry, HashMap},
rc::Rc,
};
#[derive(Debug)]
struct Foo {
v: Rc<str>,
}
#[derive(Debug)]
struct Collection {
vec: Vec<Foo>,
index: HashMap<Rc<str>, usize>,
}
impl Foo {
fn new(s: &str) -> Foo {
Foo {
v: s.into(),
}
}
}
impl Collection {
fn new() -> Collection {
Collection {
vec: Vec::new(),
index: HashMap::new(),
}
}
fn insert(&mut self, foo: Foo) {
match self.index.entry(foo.v.clone()) {
Entry::Occupied(o) => panic!(
"Duplicate entry for: {}, {:?} inserted before {:?}",
foo.v,
o.get(),
foo
),
Entry::Vacant(v) => v.insert(self.vec.len()),
};
self.vec.push(foo)
}
}
fn main() {
let mut collection = Collection::new();
for foo in vec![Foo::new("Hello"), Foo::new("World"), Foo::new("Go!")] {
collection.insert(foo)
}
println!("{:?}", collection);
}
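To make the cost of the foo.v.clone() call above concrete, the following sketch shows that cloning an Rc<str> shares the existing allocation and only bumps a reference count. The function name is made up for the example.

```rust
use std::rc::Rc;

fn rc_clone_shares_allocation() -> (bool, usize) {
    // One allocation holding the string data and the reference counts.
    let key: Rc<str> = Rc::from("Hello");
    // Cloning copies the pointer and increments the strong count.
    let copy = Rc::clone(&key);
    (Rc::ptr_eq(&key, &copy), Rc::strong_count(&key))
}
```

Whether that increment matters for a given workload is still something only a profile can answer.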
The error is:
error: `s` does not live long enough
--> <anon>:27:5
|
16 | let idx: usize = match hash.entry(&s) { //a
| - borrow occurs here
...
27 | }
| ^ `s` dropped here while still borrowed
|
= note: values in a scope are dropped in the opposite order they are created
The note: at the end is where the answer is.
s must outlive hash because you are using &s as a key in the HashMap. This reference will become invalid when s is dropped. But, as the note says, hash will be dropped after s. A quick fix is to swap the order of their declarations:
let s = "aaa".to_string();
let mut hash = HashMap::<&str, usize>::new();
But now you have another problem:
error[E0505]: cannot move out of `s` because it is borrowed
--> <anon>:22:33
|
17 | let idx: usize = match hash.entry(&s) { //a
| - borrow of `s` occurs here
...
22 | l.push(Foo { v: s }); //b
| ^ move out of `s` occurs here
This one is more obvious. s is borrowed by the Entry, which will live to the end of the block. Cloning s will fix that:
l.push(Foo { v: s.clone() }); //b
I only want to allocate s only once, not cloning it
But the type of Foo.v is String, so it will own its own copy of the str anyway. That type alone means you have to copy s.
You can replace it with a &str instead which will allow it to stay as a reference into s:
use std::collections::hash_map::Entry::{Occupied, Vacant};
use std::collections::HashMap;
struct Foo<'a> {
v: &'a str,
}
pub fn main() {
// s now lives longer than l
let s = "aaa".to_string();
let mut l = Vec::<Foo>::new();
{
let mut hash = HashMap::<&str, usize>::new();
let idx: usize = match hash.entry(&s) {
Occupied(ent) => {
*ent.get()
}
Vacant(ent) => {
l.push(Foo { v: &s });
ent.insert(l.len() - 1);
l.len() - 1
}
};
}
}
Note that, previously I had to move the declaration of s to before hash, so that it would outlive it. But now, l holds a reference to s, so it has to be declared even earlier, so that it outlives l.
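The rule from the compiler note, that values in a scope are dropped in the opposite order they are created, can be observed directly. In this sketch the Noisy type and the drop_order function are made up for the demonstration; each value records its name in a shared log when it is dropped.

```rust
use std::cell::RefCell;

// Records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    // The log is declared first so that it outlives both values below.
    let log = RefCell::new(Vec::new());
    {
        let _first = Noisy { name: "first", log: &log };
        let _second = Noisy { name: "second", log: &log };
        // End of scope: `_second` is dropped before `_first`.
    }
    log.into_inner()
}
```

This is why s has to be declared before hash and before l: whatever is declared later is dropped earlier, so the borrowers go away before the String they point into.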

Lifetime on trait returning iterator

I'm working with a trait requiring a function returning an iterator without consuming the object. The iterator itself returns copies of data values, not references. As the iterator implementation requires a reference to the object it is iterating over, I end up having to declare lots of lifetimes (more than I would have thought necessary, but could not get it to compile otherwise). I then run into trouble with borrow duration - a minimal "working" example is as follows:
pub trait MyTrait<'a> {
type IteratorType: Iterator<Item=u32>;
fn iter(&'a self) -> Self::IteratorType;
fn touch(&'a mut self, value: u32);
}
struct MyStruct {
data: Vec<u32>
}
struct MyIterator<'a> {
structref: &'a MyStruct,
next: usize,
}
impl<'a> Iterator for MyIterator<'a> {
type Item = u32;
fn next(&mut self) -> Option<u32> {
if self.next < self.structref.data.len() {
self.next += 1;
return Some(self.structref.data[self.next-1]);
} else {
return None;
}
}
}
impl<'a> MyTrait<'a> for MyStruct {
type IteratorType = MyIterator<'a>;
fn iter(&'a self) -> Self::IteratorType {
return MyIterator { structref: &self, next: 0 };
}
fn touch(&'a mut self, value: u32) {
}
}
fn touch_all<'a,T>(obj: &'a mut T) where T: MyTrait<'a> {
let data: Vec<u32> = obj.iter().collect();
for value in data {
obj.touch(value);
}
}
Compiling this gives me the error:
error[E0502]: cannot borrow `*obj` as mutable because it is also borrowed as immutable
|
39 | let data: Vec<u32> = obj.iter().collect();
| --- immutable borrow occurs here
40 | for value in data {
41 | obj.touch(value);
| ^^^ mutable borrow occurs here
42 | }
43 | }
| - immutable borrow ends here
By my limited understanding of lifetimes, I would have thought the immutable borrow only extends to the line where I make it - after all the iterator is consumed and I no longer hold any references to obj or data contained in it. Why does the lifetime of the borrow extend to the entire function, and how do I fix this?
Here is a sequence of steps on how I arrived here - running the code should provide the associated compiler errors.
no explicit lifetimes
IteratorType needs lifetime
Unconstrained lifetime parameter
To clarify: I'd like to be able to make calls like this:
fn main() {
let mut obj: MyStruct = MyStruct { data : vec![] };
touch_all(&mut obj);
}
rather than having to call
touch_all(&mut &obj);
which would be needed for the proposal by mcarton (1st and 2nd comment).
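The borrow lasts for the whole function because touch_all takes &'a mut T with T: MyTrait<'a>, so the lifetime chosen for iter is the same one that covers the entire mutable borrow of obj. One possible direction, which is not taken from this thread and assumes a compiler with generic associated types (stable since Rust 1.65), is to move the lifetime from the trait onto the associated type. The touched field and the driver run_touch_demo are additions made only so the effect can be observed.

```rust
pub trait MyTrait {
    // The lifetime lives on the associated type, not on the trait.
    type IteratorType<'a>: Iterator<Item = u32>
    where
        Self: 'a;

    fn iter(&self) -> Self::IteratorType<'_>;
    fn touch(&mut self, value: u32);
}

struct MyStruct {
    data: Vec<u32>,
    touched: u32, // hypothetical field so that `touch` has a visible effect
}

struct MyIterator<'a> {
    structref: &'a MyStruct,
    next: usize,
}

impl<'a> Iterator for MyIterator<'a> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        // Yields copies of the stored values, not references.
        let value = self.structref.data.get(self.next).copied()?;
        self.next += 1;
        Some(value)
    }
}

impl MyTrait for MyStruct {
    type IteratorType<'a> = MyIterator<'a> where Self: 'a;

    fn iter(&self) -> Self::IteratorType<'_> {
        MyIterator { structref: self, next: 0 }
    }

    fn touch(&mut self, value: u32) {
        self.touched += value;
    }
}

fn touch_all<T: MyTrait>(obj: &mut T) {
    // The shared borrow ends once the iterator has been collected.
    let data: Vec<u32> = obj.iter().collect();
    for value in data {
        obj.touch(value);
    }
}

// Illustrative driver using the call shape asked for in the question.
fn run_touch_demo() -> u32 {
    let mut obj = MyStruct { data: vec![1, 2, 3], touched: 0 };
    touch_all(&mut obj);
    obj.touched
}
```

With this shape each call to iter picks its own short lifetime, so the later mutable borrow in the loop no longer conflicts with it.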

Resources