Get mutable reference from immutable array - rust

I don't know if I phrased the title correctly, but here's the issue:
let mut rows: Vec<Box<String>> = vec![];
let row1 = &mut **rows.get(0).unwrap();
I want to store mutable references to multiple strings which are stored in boxes in the vector. This should be perfectly safe since I'm not referencing anything in the vector, just getting a box from the vector, dereferencing it and changing the memory it points to. If the vector gets too large and needs to reallocate its data my strings stay intact. But Rust's compiler won't let me do that; I get the error: cannot borrow data in a '&' reference as mutable.
How do I design around this?
I could make rows mutable and use get_mut, but then I wouldn't be able to, for example, have mutable references to two rows at the same time:
let mut rows: Vec<Box<String>> = vec![];
let row1 = &mut **rows.get_mut(0).unwrap();
let row2 = &mut **rows.get_mut(1).unwrap();
*row1 = String::from("aaa");
*row2 = String::from("bbb");
This gives cannot borrow 'rows' as mutable more than once at a time.
Another solution would be to get each row only when I need to use it and then get it again if I need to use it again, but I don't think that's a very performant idea since I'd have to loop through the array to find the row I want every time I need to change something in it (I wouldn't be getting the array element by index).
EDIT: I am trying to design around storing mutable references to the strings, since Rust's compiler won't let me do that. Either I'm missing something I can do to have multiple mutable references to the strings, or I need to find another way to accomplish that. I need to have mutable references to what the boxes contain. In my program they're not strings, they're structs which have mutating methods, but I used strings here for the sake of simplicity. Here's a bit of code to clarify:
struct Table
{
cols: Vec<Box<Col>>
}
//...
let mut table = Table::new();
let mut id_col = table.new_col("ID" .to_owned());
let mut status_col = table.new_col("Status" .to_owned());
let mut title_col = table.new_col("Title" .to_owned());
let mut deadline_col = table.new_col("Deadline" .to_owned());
let mut tags_col = table.new_col("Tags" .to_owned());
let mut repeat_col = table.new_col("Repeat" .to_owned());
(I used rows in my first example to make it easier to understand)
I will iterate over some data and append stuff to these columns, so I don't want to search for them by name in the vector in each iteration. I want to have "cached" references to them (which is what all of these variables are). The problem is that the compiler won't let me do that because I can't borrow table as mutable more than once. So what I mean by "designing around" is changing my way of thinking and restructuring my code so I don't have this problem.

I want to store mutable references to multiple strings which are stored in boxes in the vector. This should be perfectly safe since I'm not referencing anything in the vector, just getting a box from the vector, dereferencing it and changing the memory it points to. If the vector gets too large and needs to reallocate its data my strings stay intact.
The compiler has literally no idea about this, and it's not a certainty either: in your scheme nothing prevents getting a second mutable handle on the same box or string and breaking that.
RefCell exists for this sort of situation ("interior mutability" is the concept you want to look for), and it implies a runtime performance hit as it needs to keep track of extant borrows (it's basically a single-threaded RwLock).
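As a minimal sketch of what that looks like for the rows example (the function names here are made up for illustration): the vector stays immutable, and each RefCell tracks its own borrows at runtime. The second function shows the cost of that tracking, a second mutable borrow of the same cell is refused while the first is alive.

```rust
use std::cell::RefCell;

/// Builds an immutable vector of rows and edits two of them at once.
/// The `Vec` itself is never borrowed mutably; each `RefCell` checks
/// its own borrows at runtime instead.
fn edit_two_rows() -> Vec<RefCell<String>> {
    let rows = vec![RefCell::new(String::new()), RefCell::new(String::new())];
    {
        // Two live mutable handles, one per cell.
        let mut row1 = rows[0].borrow_mut();
        let mut row2 = rows[1].borrow_mut();
        *row1 = String::from("aaa");
        *row2 = String::from("bbb");
    } // both guards are dropped here
    rows
}

/// Returns true if a second mutable borrow of the same cell is refused
/// while the first one is still alive.
fn second_borrow_is_refused() -> bool {
    let cell = RefCell::new(String::from("x"));
    let _first = cell.borrow_mut();
    let refused = cell.try_borrow_mut().is_err();
    refused
}
```

Calling borrow_mut instead of try_borrow_mut in the second case would panic rather than return an error.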
I could make rows mutable and use get_mut, but then I wouldn't be able to, for example, have mutable references to two rows at the same time
That's what split_at_mut (and friends) is for.
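A small sketch of the slice-splitting approach, assuming the two indices are known and ordered (set_two_rows is a hypothetical helper, not something from the question):

```rust
/// Overwrites the rows at two distinct indices at once.
/// Assumes `i < j < rows.len()`; panics otherwise.
fn set_two_rows(rows: &mut [Box<String>], i: usize, j: usize) {
    assert!(i < j, "indices must be distinct and ordered");
    // `left` covers rows[..j] and `right` covers rows[j..]; the halves cannot overlap,
    // so the compiler accepts one mutable reference into each.
    let (left, right) = rows.split_at_mut(j);
    let row1: &mut String = &mut *left[i];
    let row2: &mut String = &mut *right[0];
    *row1 = String::from("aaa");
    *row2 = String::from("bbb");
}
```

Both references are alive at the same time, which is exactly what two get_mut calls on the same vector cannot give you.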
(I used rows in my first example to make it easier to understand) I will iterate over some data and append stuff to these columns, so I don't want to search for them by name in the vector in each iteration. I want to have "cached" references to them (which is what all of these variables are). The problem is that the compiler won't let me do that because I can't borrow table as mutable more than once. So what I mean by "designing around" is changing my way of thinking and restructuring my code so I don't have this problem.
Yeah no, that's not just having mutable handles on separate parts of a collection, it's having mutable handles while modifying the parent. You probably need Rc (instead of Box) and a RefCell (for interior mutability).
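Here is a rough sketch of the Rc plus RefCell layout. Col, its fields, and the fill function are hypothetical stand-ins, since the question doesn't show the real definitions; only the shape of Table and new_col follows the question.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical stand-ins for the question's types.
struct Col {
    name: String,
    cells: Vec<String>,
}

struct Table {
    cols: Vec<Rc<RefCell<Col>>>,
}

impl Table {
    fn new() -> Self {
        Table { cols: Vec::new() }
    }

    /// Adds a column and hands back a shared handle to it.
    fn new_col(&mut self, name: String) -> Rc<RefCell<Col>> {
        let col = Rc::new(RefCell::new(Col { name, cells: Vec::new() }));
        self.cols.push(Rc::clone(&col));
        col
    }
}

/// Caches two column handles, then appends to them while iterating.
fn fill(data: &[(u32, &str)]) -> Table {
    let mut table = Table::new();
    let id_col = table.new_col("ID".to_owned());
    let title_col = table.new_col("Title".to_owned());
    for (id, title) in data {
        // Each borrow_mut is checked at runtime and released at the end of the statement.
        id_col.borrow_mut().cells.push(id.to_string());
        title_col.borrow_mut().cells.push(title.to_string());
    }
    table
}
```

The handles don't borrow table at all, so table can still be modified (for example by adding another column) while they are alive.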
Or do a second pass of split_first_mut or an unrolled iterator:
let mut t = table.cols.iter_mut();
let id_col = t.next().unwrap();
let status_col = t.next().unwrap();
let title_col = t.next().unwrap();
let deadline_col = t.next().unwrap();
let tags_col = t.next().unwrap();
let repeat_col = t.next().unwrap();
with the issue that you will not be able to modify table until all of these references are dead. And that it's pretty ugly.
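If the ugliness is the main objection, a slice pattern is one alternative to the unrolled iterator. This is a sketch under the assumption that a column can be modelled as a plain Vec<String>; fill_first_three is a made-up name. The same restriction applies: the table cannot be modified while these references are alive.

```rust
/// Fills the first three columns of `cols` in one pass.
/// Returns false if there are fewer than three columns.
fn fill_first_three(cols: &mut [Vec<String>], row: [&str; 3]) -> bool {
    // The pattern hands out one `&mut` per element, all alive together.
    if let [id_col, status_col, title_col, ..] = cols {
        id_col.push(row[0].to_owned());
        status_col.push(row[1].to_owned());
        title_col.push(row[2].to_owned());
        true
    } else {
        false
    }
}
```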

Related

Incrementally build data structures which reference the same data

I have an EntityMap object which deals with spatially indexing anything that can have a bounding box. I've implemented it in such a way that it stores references to objects rather than owned values (this may not be the best way to do, but I think changing it to owned values would only shift my problem).
What I am currently attempting to do is add objects to a vector in such a way that they do not collide with anything that was previously added to the vector. The following is pseudo-rust of what I just described:
let mut final_moves = Vec::new();
let mut move_map = EntityMap::new();
for m in moves.into_iter() {
let close_moves = move_map.find_intersecting(m.bounding_box());
let new_move = m.modify_until_not_intersecting(close_moves);
final_moves.push(new_move);
move_map.insert(final_moves.last().unwrap());
}
final_moves is what is being returned from the function. The issue here is that I'm mutably borrowing final_moves, but I want to store references to its objects inside move_map. The conflict I can't figure out how to resolve is that I want to incrementally build final_moves and move_map at the same time, but this seems to require that they both be mutable, which then stops me from borrowing data from move_map.
How do I restructure my code to accomplish my goals?
You could change the types of move_map and final_moves to contain Rc<Move> instead of just Move.
let mut final_moves = Vec::new();
let mut move_map = EntityMap::new();
for m in moves.into_iter() {
let close_moves = move_map.find_intersecting(m.bounding_box());
let new_move = Rc::new(m.modify_until_not_intersecting(close_moves));
final_moves.push(Rc::clone(&new_move));
move_map.insert(new_move);
}
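Since the snippet above is still pseudo-Rust, here is a self-contained sketch of the same idea. Move and EntityMap are simplified one-dimensional stand-ins invented for illustration (an interval and a linear scan), not the asker's real types.

```rust
use std::rc::Rc;

// Hypothetical 1-D stand-ins for the question's `Move` and `EntityMap`.
#[derive(Debug, PartialEq)]
struct Move {
    start: i32,
    len: i32,
}

impl Move {
    fn overlaps(&self, other: &Move) -> bool {
        self.start < other.start + other.len && other.start < self.start + self.len
    }
}

struct EntityMap {
    entries: Vec<Rc<Move>>,
}

impl EntityMap {
    fn new() -> Self {
        EntityMap { entries: Vec::new() }
    }
    fn insert(&mut self, m: Rc<Move>) {
        self.entries.push(m);
    }
    fn collides(&self, m: &Move) -> bool {
        self.entries.iter().any(|e| e.overlaps(m))
    }
}

/// Shifts each move right until it is free, then shares it between both containers.
fn place_all(moves: Vec<Move>) -> Vec<Rc<Move>> {
    let mut final_moves = Vec::new();
    let mut move_map = EntityMap::new();
    for mut m in moves {
        while move_map.collides(&m) {
            m.start += 1;
        }
        let new_move = Rc::new(m);
        final_moves.push(Rc::clone(&new_move)); // second owner
        move_map.insert(new_move); // first owner is moved into the map
    }
    final_moves
}
```

Neither container borrows from the other, so both can be built mutably in the same loop.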

Is there a way to add elements to a container while immutably borrowing earlier elements?

I'm building a GUI and I want to store all used textures in one place, but I have to add new textures while older textures are already immutably borrowed.
let (cat, mouse, dog) = (42, 360, 420); // example values
let mut container = vec![cat, mouse]; // new container
let foo = &container[0]; // now container is immutably borrowed
container.push(dog); // error: mutable borrow
Is there any kind of existing structure that allows something like this,
or can I implement something like this using raw pointers?
The absolute easiest thing is to introduce shared ownership:
use std::rc::Rc;
fn main() {
let (cat, mouse, dog) = (42, 360, 420);
let mut container = vec![Rc::new(cat), Rc::new(mouse)];
let foo = container[0].clone();
container.push(Rc::new(dog));
}
Now container and foo jointly own cat.
Is there any kind of existing structure that allows something like this,
Yes, but there are always tradeoffs. Above, we used Rc to share ownership, which involves a reference counter.
Another potential solution is to use an arena:
extern crate typed_arena;
use typed_arena::Arena;
fn main() {
let container = Arena::new();
let cat = container.alloc(42);
let mouse = container.alloc(360);
let dog = container.alloc(420);
}
This isn't indexable, you cannot take ownership of the value again, and you cannot remove a value.
Being able to remove things from the collection always makes invalidating references dangerous.
can I implement something like this using raw pointers
Almost certainly. Whether you will get it right is always the tricky point.
but I have to add new textures while older textures are already immutably borrowed
Many times, you don't have to do any such thing. For example, you can split up your logic into phases. You have two containers; one that people take references into, and another to collect new values. At the end of the phase, you combine the two collections into one. You have to ensure that no references are used after the end of the phase, of course.
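A minimal sketch of the phase idea, using plain integers in place of textures and a made-up run_phase function. It assumes the container is non-empty.

```rust
/// Runs one phase: reads from `container` through a shared borrow while
/// collecting new values separately, then merges once the borrow is over.
/// Assumes `container` is non-empty.
fn run_phase(container: &mut Vec<u32>) {
    let mut pending = Vec::new();
    {
        let first = &container[0]; // shared borrow lives for this block only
        pending.push(*first * 10);
        pending.push(*first + 1);
    }
    // No references into `container` remain, so mutation is allowed again.
    container.append(&mut pending);
}
```

The inner block is only there to make the end of the phase visible; the borrow checker would accept the code without it because the reference is not used afterwards.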

Not live long enough with CSV and dataflow

fn main() {
timely::execute_from_args(std::env::args().skip(0), move |worker| {
let (mut input, probe) = worker.dataflow::<_, _, _>(|scope| {
let (input, data) = scope.new_collection();
let probe = data.inspect(|x| println!("observed data: {:?}", x)).probe();
(input, probe)
});
let mut rdr = csv::ReaderBuilder::new()
.has_headers(false)
.flexible(true)
.delimiter(b'\t')
.from_reader(io::stdin());
for result in rdr.deserialize() {
let record = result.expect("a CSV record");
let mut vec = Vec::new();
for i in 0..13 {
vec.push(&record[i]);
}
input.insert(vec);
}
});
}
The error is that record does not live long enough. I am trying to read each CSV record into a vector and then insert the vector into the dataflow. Each part works on its own: I can read the CSV into a vector, and I can use the dataflow elsewhere.
The problem is that you are pushing to the Vec a borrowed value: &record[i]. The & means borrow, and as a consequence the original value record must outlive the borrower vec.
That might seem fine: both are declared in the for body, so they have the same lifetime and neither outlives the other. But the line input.insert(vec) moves vec. That means vec becomes owned by input and lives as long as input does (as far as I understand). Because input is declared outside the for body, the moved vec outlives record and the record[i] values it borrows from.
There are a few solutions, but all of them try to remove the dependency between record and input:
If the record is an array of primitive values, or something that implements the Copy trait, you can simply omit the borrow and the value will be copied into the vector: vec.push(record[i]).
Clone the record value into the vector: vec.push(record[i].clone()). This creates a clone that vec owns, as above, which avoids the borrow.
If the elements in the record array implement neither Copy nor Clone, you have to move them. Because the values are in an array, you have to move the array as a whole (an array can't be left with some of its elements moved out). One solution is to transform it into an iterator that moves out the values one by one, and then push them into the vector:
for element in record.into_iter().take(13) {
vec.push(element)
}
Replace the record value with a different value. One final solution in order to move only parts of the array is to replace the element in the array with something else. This means that although you remove an element from the array, you replace it with something else, and the array continues to be valid.
for i in 0..13 {
// mem::replace needs a mutable reference, so record must be declared with let mut
vec.push(std::mem::replace(&mut record[i], Default::default()));
}
You can replace Default::default() with another value if you want to.
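To show the ownership side without the csv and timely crates, here is a sketch where tab-separated lines stand in for CSV records and a plain Vec stands in for input (collect_rows is a hypothetical name). It uses std::mem::take, which is shorthand for replacing a value with its Default.

```rust
/// Moves the first `n` fields of each record into an owned row and stores the
/// row in `sink`, which outlives every record.
/// Assumes each line has at least `n` tab-separated fields; panics otherwise.
fn collect_rows(lines: &[&str], n: usize) -> Vec<Vec<String>> {
    let mut sink = Vec::new(); // plays the role of `input`
    for line in lines {
        // `record` is local to this iteration, like the CSV record.
        let mut record: Vec<String> = line.split('\t').map(str::to_owned).collect();
        let mut row = Vec::new();
        for i in 0..n {
            // Move the field out, leaving an empty String behind.
            row.push(std::mem::take(&mut record[i]));
        }
        sink.push(row); // `row` owns its data, so `record` may be dropped
    }
    sink
}
```

Because every row owns its strings, nothing in sink refers back to a record that has already gone out of scope.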
I hope this helps. I'm still a noob in Rust, so improvements and critique on the answer are accepted :)

How does one create a HashMap with a default value in Rust?

Being fairly new to Rust, I was wondering on how to create a HashMap with a default value for a key? For example, having a default value 0 for any key inserted in the HashMap.
In Rust, I know this creates an empty HashMap:
let mut mymap: HashMap<char, usize> = HashMap::new();
I am looking to maintain a counter for a set of keys, for which one way to go about it seems to be:
for ch in "AABCCDDD".chars() {
mymap.insert(ch, 0);
}
Is there a way to do it in a much better way in Rust, maybe something equivalent to what Ruby provides:
mymap = Hash.new(0)
mymap["b"] = 1
mymap["a"] # 0
Answering the problem you have...
I am looking to maintain a counter for a set of keys.
Then you want to look at How to lookup from and insert into a HashMap efficiently?. Hint: *map.entry(key).or_insert(0) += 1
Answering the question you asked...
How does one create a HashMap with a default value in Rust?
No, HashMaps do not have a place to store a default. Doing so would cause every user of that data structure to allocate space to store it, which would be a waste. You'd also have to handle the case where there is no appropriate default, or when a default cannot be easily created.
Instead, you can look up a value using HashMap::get and provide a default if it's missing using Option::unwrap_or:
use std::collections::HashMap;
fn main() {
let mut map: HashMap<char, usize> = HashMap::new();
map.insert('a', 42);
let a = map.get(&'a').cloned().unwrap_or(0);
let b = map.get(&'b').cloned().unwrap_or(0);
println!("{}, {}", a, b); // 42, 0
}
If unwrap_or doesn't work for your case, there are several similar functions that might:
Option::unwrap_or_else
Option::map_or
Option::map_or_else
Of course, you are welcome to wrap this in a function or a data structure to provide a nicer API.
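One possible shape for such a wrapper, as a sketch only. DefaultMap is a made-up type, not something from the standard library, and it only covers get and insert.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical wrapper: a map that answers lookups of missing keys with a
/// stored default, without inserting anything.
struct DefaultMap<K, V> {
    map: HashMap<K, V>,
    default: V,
}

impl<K: Eq + Hash, V> DefaultMap<K, V> {
    fn new(default: V) -> Self {
        DefaultMap { map: HashMap::new(), default }
    }

    /// Read-only lookup; falls back to the stored default.
    fn get(&self, key: &K) -> &V {
        self.map.get(key).unwrap_or(&self.default)
    }

    fn insert(&mut self, key: K, value: V) -> Option<V> {
        self.map.insert(key, value)
    }
}
```

Note that get takes &self, so looking up a missing key never modifies the map, unlike the C++ behaviour discussed further down.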
ArtemGr brings up an interesting point:
in C++ there's a notion of a map inserting a default value when a key is accessed. That always seemed a bit leaky though: what if the type doesn't have a default? Rust is less demanding on the mapped types and more explicit about the presence (or absence) of a key.
Rust adds an additional wrinkle to this. Actually inserting a value would require that simply getting a value can also change the HashMap. This would invalidate any existing references to values in the HashMap, as a reallocation might be required. Thus you'd no longer be able to get references to two values at the same time! That would be very restrictive.
What about using entry to get an element from the HashMap and then modifying it?
From the docs:
fn entry(&mut self, key: K) -> Entry<K, V>
Gets the given key's corresponding entry in the map for in-place
manipulation.
example
use std::collections::HashMap;
let mut letters = HashMap::new();
for ch in "a short treatise on fungi".chars() {
let counter = letters.entry(ch).or_insert(0);
*counter += 1;
}
assert_eq!(letters[&'s'], 2);
assert_eq!(letters[&'t'], 3);
assert_eq!(letters[&'u'], 1);
assert_eq!(letters.get(&'y'), None);
.or_insert() and .or_insert_with()
Adding to the existing example for .entry().or_insert(), I wanted to mention that if the default value passed to .or_insert() is dynamically generated, it's better to use .or_insert_with().
Using .or_insert_with() as below, the default value is not generated if the key already exists. It only gets created when necessary.
for v in 0..s.len() {
components.entry(unions.get_root(v))
.or_insert_with(|| vec![]) // vec only created if needed.
.push(v);
}
In the snippet below, the default vector passed to .or_insert() is generated on every call. If the key exists, a vector is created and then disposed of, which can be wasteful.
components.entry(unions.get_root(v))
.or_insert(vec![]) // vec always created.
.push(v);
So for fixed values that don't have much creation overhead, use .or_insert(), and for values that have appreciable creation overhead, use .or_insert_with().
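To make the difference observable, this sketch counts how often the closure passed to .or_insert_with() actually runs. group_by_parity is an invented example and does not use the unions structure from the snippets above.

```rust
use std::collections::HashMap;

/// Groups `values` by their remainder modulo 2 and counts how many times the
/// default-producing closure actually ran.
fn group_by_parity(values: &[u32]) -> (HashMap<u32, Vec<u32>>, usize) {
    let mut groups: HashMap<u32, Vec<u32>> = HashMap::new();
    let mut closure_calls = 0;
    for &v in values {
        groups
            .entry(v % 2)
            .or_insert_with(|| {
                closure_calls += 1; // runs only when the key is missing
                Vec::new()
            })
            .push(v);
    }
    (groups, closure_calls)
}
```

For five inputs the closure runs twice, once per distinct key. When the default is just Default::default(), .or_default() is a shorter way to write the same thing.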
Another way to start a map with initial values is to build it from a vector of tuples, for instance:
let map = vec![("field1".to_string(), value1), ("field2".to_string(), value2)].into_iter().collect::<HashMap<_, _>>();
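Two variations on the same idea, as a sketch (function names are made up). The first uses HashMap::from on an array of pairs, which needs Rust 1.56 or newer; the second gives every key of the question's example string an initial 0.

```rust
use std::collections::HashMap;

/// Builds a map with initial values from an array of pairs.
/// `HashMap::from` on an array needs Rust 1.56 or newer.
fn initial_counts() -> HashMap<char, usize> {
    HashMap::from([('a', 0), ('b', 0), ('c', 0)])
}

/// Gives every distinct character of `keys` an initial count of 0.
fn zeroed(keys: &str) -> HashMap<char, usize> {
    keys.chars().map(|ch| (ch, 0)).collect()
}
```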

Sort HashMap data by value

I want to sort HashMap data by value in Rust (e.g., when counting character frequency in a string).
The Python equivalent of what I’m trying to do is:
count = {}
for c in text:
count[c] = count.get(c, 0) + 1
sorted_data = sorted(count.items(), key=lambda item: -item[1])
print('Most frequent character in text:', sorted_data[0][0])
My corresponding Rust code looks like this:
// Count the frequency of each letter
let mut count: HashMap<char, u32> = HashMap::new();
for c in text.to_lowercase().chars() {
*count.entry(c).or_insert(0) += 1;
}
// Get a list of the most frequently used characters, sorted by
// field 1 (the count) in descending order:
let mut count_vec: Vec<(&char, &u32)> = count.iter().collect();
count_vec.sort_by(|a, b| b.1.cmp(a.1));
println!("Most frequent character in text: {}", count_vec[0].0);
Is this idiomatic Rust? Can I construct the count_vec in a way so that it would consume the HashMap's data and own it (e.g., using map())? Would this be more idiomatic?
Is this idiomatic Rust?
There's nothing particularly unidiomatic, except possibly for the unnecessary full type constraint on count_vec; you could just use
let mut count_vec: Vec<_> = count.iter().collect();
It's not difficult from context to work out what the full type of count_vec is. You could also omit the type constraint for count entirely, but then you'd have to play shenanigans with your integer literals to have the correct value type inferred. That is to say, an explicit annotation is eminently reasonable in this case.
The other borderline change you could make if you feel like it would be to use |a, b| a.1.cmp(b.1).reverse() for the sort closure. The Ordering::reverse method just reverses the result so that less-than becomes greater-than, and vice versa. This makes it slightly more obvious that you meant what you wrote, as opposed to accidentally transposing two letters.
Can I construct the count_vec in a way so that it would consume the HashMap's data and own it?
Not in any meaningful way. Just because HashMap is using memory doesn't mean that memory is in any way compatible with Vec. You could use count.into_iter() to consume the HashMap and move the elements out (as opposed to iterating over pointers), but since both char and u32 are trivially copyable, this doesn't really gain you anything.
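For completeness, a sketch of the consuming version (sorted_counts is a made-up name). It uses count.into_iter() to get an owned Vec<(char, u32)> and std::cmp::Reverse for the descending order. The tie-break on the character is an addition that the question's code does not have; it makes the result independent of HashMap iteration order.

```rust
use std::cmp::Reverse;
use std::collections::HashMap;

/// Counts characters, then consumes the map into an owned, sorted vector.
/// Highest count first; ties are broken by character.
fn sorted_counts(text: &str) -> Vec<(char, u32)> {
    let mut count: HashMap<char, u32> = HashMap::new();
    for c in text.to_lowercase().chars() {
        *count.entry(c).or_insert(0) += 1;
    }
    // into_iter moves the (char, u32) pairs out of the map.
    let mut count_vec: Vec<(char, u32)> = count.into_iter().collect();
    count_vec.sort_by_key(|&(c, n)| (Reverse(n), c));
    count_vec
}
```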
This could be another way to address the matter without the need of an intermediary vector.
// Count the frequency of each letter
let mut count: HashMap<char, u32> = HashMap::new();
for c in text.to_lowercase().chars() {
*count.entry(c).or_insert(0) += 1;
}
let top_char = count.iter().max_by(|a, b| a.1.cmp(&b.1)).unwrap();
println!("Most frequent character in text: {}", top_char.0);
use BTreeMap for sorted data
BTreeMap sorts its elements by key by default, therefore exchanging the place of your key and value and putting them into a BTreeMap
let count_b: BTreeMap<&u32,&char> = count.iter().map(|(k,v)| (v,k)).collect();
should give you a sorted map according to character frequency.
Characters that share a frequency will be lost, though, because a map keeps only one value per key. If you only want the most frequent character, that does not matter.
You can get the result using
println!("Most frequent character in text: {}", count_b.last_key_value().unwrap().1);
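If the lost ties do matter, one option is to map each frequency to a list of characters instead of a single one. This is a sketch with an invented by_frequency function; the per-group sort is there because HashMap iteration order is unspecified.

```rust
use std::collections::{BTreeMap, HashMap};

/// Inverts a character count into frequency -> characters, keeping ties.
fn by_frequency(count: &HashMap<char, u32>) -> BTreeMap<u32, Vec<char>> {
    let mut by_freq: BTreeMap<u32, Vec<char>> = BTreeMap::new();
    for (&ch, &n) in count {
        by_freq.entry(n).or_default().push(ch);
    }
    // HashMap iteration order is unspecified, so sort each group.
    for chars in by_freq.values_mut() {
        chars.sort_unstable();
    }
    by_freq
}
```

The last entry of the returned map then holds every character that shares the highest frequency.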
