Rust lifetime in closure environment - rust

I want to implement a graph structure in Rust. For this goal, I wrote simple abstractions:
pub struct Graph<'a> {
pub nodes: Vec<Node>,
pub edges: Vec<Edge<'a>>,
}
#[derive(Debug)]
pub struct Node {
pub id: String,
pub label: String,
}
pub struct Edge<'a> {
pub source: &'a Node,
pub target: &'a Node,
}
Graph contains vectors of Nodes and Edges. Every Edge has a ref to a Node in the same Graph.
I don't know whether it is possible to write something like this.
I tried to write a static method that builds a new Graph instance from a JSON representation:
impl<'a> Graph<'a> {
pub fn from_json(json: &String) -> Graph {
if let json::JsonValue::Object(deserialized) = json::parse(json.as_ref()).unwrap() {
let nodes: Vec<Node> = deserialized
.get("nodes")
.unwrap()
.members()
.map(|v| {
if let json::JsonValue::Object(ref val) = *v {
return Node {
id: val.get("id").unwrap().to_string(),
label: val.get("label").unwrap().to_string(),
};
}
panic!("Invalid structure of json graph body.")
})
.collect::<Vec<Node>>();
let edges: Vec<Edge> = deserialized
.get("edges")
.unwrap()
.members()
.map(|v| {
if let json::JsonValue::Object(ref val) = *v {
let source = (*nodes)
.iter()
.find(|&v| v.id == val.get("source").unwrap().to_string())
.unwrap();
let target = (*nodes)
.iter()
.find(|&v| v.id == val.get("target").unwrap().to_string())
.unwrap();
return Edge { source, target };
}
panic!("Invalid structure of json graph body.")
})
.collect::<Vec<Edge>>();
return Graph { nodes, edges };
}
panic!("Incorrect struct of json contains!");
}
}
When I compile, I get this error:
error[E0373]: closure may outlive the current function, but it borrows `nodes`, which is owned by the current function
--> src/graph.rs:30:22
|
30 | .map(|v| {
| ^^^ may outlive borrowed value `nodes`
31 | if let json::JsonValue::Object(ref val) = *v {
32 | let source = (*nodes).iter().find(|&v| v.id == val.get("source").unwrap().to_string()).unwrap();
| ----- `nodes` is borrowed here
|
help: to force the closure to take ownership of `nodes` (and any other referenced variables), use the `move` keyword
|
30 | .map(move |v| {
| ^^^^^^^^
error: aborting due to previous error
A possible solution to this problem is to add move before the closure parameters, but I need the nodes vector to build the Graph instance.
What am I doing wrong?

After some research, I found these articles: Rust doc. Smart pointers and Users Rust Lang, and I understood my mistakes.
First, I removed the lifetime parameters from the struct definitions.
use std::rc::Rc;
#[derive(Debug)]
pub struct Graph {
pub nodes: Vec<Rc<Node>>,
pub edges: Vec<Edge>
}
#[derive(Debug)]
pub struct Node {
pub id: String,
pub label: String
}
#[derive(Debug)]
pub struct Edge {
pub source: Rc<Node>,
pub target: Rc<Node>
}
Second, I rewrote the from_json function to use Rc<T> instead of plain references.
impl Graph {
pub fn from_json(json: & String) -> Graph {
if let json::JsonValue::Object(deserialized) = json::parse(json.as_ref()).unwrap() {
let nodes : Vec<Rc<Node>> = deserialized.get("nodes").unwrap().members()
.map(|v| {
if let json::JsonValue::Object(ref val) = *v {
return Rc::new(Node {
id: val.get("id").unwrap().to_string(),
label: val.get("label").unwrap().to_string()
});
}
panic!("Invalid structure of json graph body.")
}).collect::<Vec<Rc<Node>>>();
let edges : Vec<Edge> = deserialized.get("edges").unwrap().members()
.map(|v| {
if let json::JsonValue::Object(ref val) = *v {
let source = nodes.iter().find(|&v| v.id == val.get("source").unwrap().to_string()).unwrap();
let target = nodes.iter().find(|&v| v.id == val.get("target").unwrap().to_string()).unwrap();
return Edge {
source: Rc::clone(&source),
target: Rc::clone(&target)
};
}
panic!("Invalid structure of json graph body.")
}).collect::<Vec<Edge>>();
return Graph {
nodes,
edges
}
}
panic!("Incorrect struct of json contains!");
}
}
Now it works. Thanks for sharing the useful links. I also found a lot of helpful information about building graph structures in Rust, such as: Graph structure in Rust
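Another common way to avoid references between nodes and edges is to store indices into the node vector. The following is only a minimal sketch of that idea, not the approach used above: `from_parts` is a hypothetical helper that takes already-parsed data instead of JSON, and the id-to-position map is an assumption added to avoid a linear search per edge.

```rust
use std::collections::HashMap;

#[derive(Debug)]
pub struct Node {
    pub id: String,
    pub label: String,
}

/// An edge stores positions in `Graph::nodes` instead of references.
#[derive(Debug, PartialEq)]
pub struct Edge {
    pub source: usize,
    pub target: usize,
}

#[derive(Debug)]
pub struct Graph {
    pub nodes: Vec<Node>,
    pub edges: Vec<Edge>,
}

impl Graph {
    /// Hypothetical helper: builds a graph from parsed nodes and (source id, target id) pairs.
    /// Returns `None` if an edge names an unknown node id.
    pub fn from_parts(nodes: Vec<Node>, edge_ids: &[(&str, &str)]) -> Option<Graph> {
        let edges = {
            // Map each node id to its position in `nodes`.
            let positions: HashMap<&str, usize> = nodes
                .iter()
                .enumerate()
                .map(|(i, n)| (n.id.as_str(), i))
                .collect();
            edge_ids
                .iter()
                .map(|&(s, t)| {
                    Some(Edge {
                        source: *positions.get(s)?,
                        target: *positions.get(t)?,
                    })
                })
                .collect::<Option<Vec<Edge>>>()?
        }; // `positions` is dropped here, so `nodes` can be moved below.
        Some(Graph { nodes, edges })
    }
}

fn main() {
    let nodes = vec![
        Node { id: "a".into(), label: "A".into() },
        Node { id: "b".into(), label: "B".into() },
    ];
    let graph = Graph::from_parts(nodes, &[("a", "b")]).expect("unknown node id");
    let edge = &graph.edges[0];
    println!("{} -> {}", graph.nodes[edge.source].label, graph.nodes[edge.target].label);
}
```

Indices stay valid when the node vector grows, and the struct needs neither lifetimes nor reference counting. The trade-off is that removing a node would invalidate later indices, so this sketch suits graphs that are built once and then only read.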

Related

How can I update a mutable reference in a loop?

I'm trying to implement a Trie/Prefix Tree in Rust and I'm having trouble with the borrow checker. Here is my implementation so far and I'm getting an error when I call children.insert.
cannot borrow *children as mutable because it is also borrowed as immutable
use std::collections::HashMap;
#[derive(Clone, Debug)]
struct PrefixTree {
value: String,
children: HashMap<char, PrefixTree>
}
fn insert(mut tree: &mut PrefixTree, key: &str, value: String) {
let mut children = &mut tree.children;
for c in key.chars() {
if !children.contains_key(&c) {
children.insert(c, PrefixTree {
value: String::from(&value),
children: HashMap::new()
});
}
let subtree = children.get(&c);
match subtree {
Some(s) => {
children = &mut s.children;
},
_ => {}
}
}
tree.value = value;
}
fn main() {
let mut trie = PrefixTree {
value: String::new(),
children: HashMap::new()
};
let words = vec!["Abc", "Abca"];
for word in words.iter() {
insert(&mut trie, word, String::from("TEST"));
}
println!("{:#?}", trie);
}
I think this problem is related to Retrieve a mutable reference to a tree value but in my case I need to update the mutable reference and continue looping. I understand why I'm getting the error since I'm borrowing a mutable reference twice, but I'm stumped about how to rewrite this so I'm not doing it that way.
When you're doing multiple things with a single key (like find or insert and get) and run into borrow trouble, try using the Entry API (via .entry()):
fn insert(mut tree: &mut PrefixTree, key: &str, value: String) {
let mut children = &mut tree.children;
for c in key.chars() {
let tree = children.entry(c).or_insert_with(|| PrefixTree {
value: String::from(&value),
children: HashMap::new(),
});
children = &mut tree.children;
}
tree.value = value;
}
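Note that the function above still assigns the value to the root node after the loop, as the code in the question does. If the intent is to store the value at the node reached by the last character, the same Entry pattern can walk a node reference instead of a children reference. This is a sketch under that assumption; the `Default` derive and the `get` helper are additions for illustration and are not part of the original code.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, Default)]
struct PrefixTree {
    value: String,
    children: HashMap<char, PrefixTree>,
}

/// Walks (and creates) one node per character, then stores `value` at the last node.
fn insert(tree: &mut PrefixTree, key: &str, value: String) {
    let mut node = tree;
    for c in key.chars() {
        // `entry` finds or creates the child in a single lookup.
        node = node.children.entry(c).or_default();
    }
    node.value = value;
}

/// Hypothetical lookup helper used to check the result.
fn get<'a>(tree: &'a PrefixTree, key: &str) -> Option<&'a str> {
    let mut node = tree;
    for c in key.chars() {
        node = node.children.get(&c)?;
    }
    Some(node.value.as_str())
}

fn main() {
    let mut trie = PrefixTree::default();
    insert(&mut trie, "Abc", String::from("first"));
    insert(&mut trie, "Abca", String::from("second"));
    println!("{:?} {:?}", get(&trie, "Abc"), get(&trie, "Abca"));
}
```

In this variant, intermediate nodes keep an empty string as their value, which differs from the original code where every newly created node receives a copy of the value.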

Implementing 2D vector in Rust

// GMatrix is the structure that will implement the matrix
pub struct GMatrix {
pub vec_list: Vec<String>,
pub matrix: Vec<Vec<i32>>,
}
impl GMatrix {
pub fn new() -> GMatrix {
let v: Vec<Vec<i32>> = Vec::new();
GMatrix {
vec_list: vec![],
matrix: v,
}
}
// insert_vertex inserts into the vertex
pub fn insert_vertex(&mut self, vertex_name: &str) -> Result<(), String> {
if self.vec_list.iter().any(|i| i == vertex_name) {
return Err(format!("Vector already present"));
}
self.vec_list.push(vertex_name.to_string());
let mut v: Vec<i32> = Vec::new();
self.matrix.append(&v);
self.update_vector();
Ok(())
}
/// update_vector adds another row when another vector is added
/// will be called inside insert_vector function, so no need to
/// be public
fn update_vector(&mut self) {
for i in 0..self.vec_list.len() - 1 {
if self.matrix[i].len() < self.vec_list.len() {
self.matrix[i].push(-1);
}
}
}
}
I guess the error is on line 23, where I try to append another vector. The compiler reports this error:
|
23 | self.matrix.append(&v);
| ^^ types differ in mutability
|
= note: expected mutable reference `&mut Vec<Vec<i32>>`
found reference `&Vec<i32>`
error: aborting due to previous error
I thought I created a mutable vector in insert_vertex, and yet when I try to append it to another vector, I get an error.
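The error message points at the cause: Vec::append takes a mutable reference to another vector of the same type (here `&mut Vec<Vec<i32>>`) and moves all of its elements, whereas adding a single row is done with Vec::push, which takes the row by value. The sketch below shows one way insert_vertex could look. It assumes the goal is a square matrix in which -1 means "no edge"; that interpretation and the reworded error text are assumptions, not something the question states.

```rust
pub struct GMatrix {
    pub vec_list: Vec<String>,
    pub matrix: Vec<Vec<i32>>,
}

impl GMatrix {
    pub fn new() -> GMatrix {
        GMatrix {
            vec_list: Vec::new(),
            matrix: Vec::new(),
        }
    }

    /// Adds a vertex and grows the matrix by one row and one column.
    pub fn insert_vertex(&mut self, vertex_name: &str) -> Result<(), String> {
        if self.vec_list.iter().any(|name| name == vertex_name) {
            return Err(format!("Vertex {} already present", vertex_name));
        }
        self.vec_list.push(vertex_name.to_string());
        let size = self.vec_list.len();
        // Widen every existing row by one column.
        for row in &mut self.matrix {
            row.push(-1);
        }
        // `push` moves one row into the matrix; `append` would need `&mut Vec<Vec<i32>>`.
        self.matrix.push(vec![-1; size]);
        Ok(())
    }
}

fn main() {
    let mut g = GMatrix::new();
    g.insert_vertex("a").unwrap();
    g.insert_vertex("b").unwrap();
    println!("{:?}", g.matrix);
}
```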

Rust can't assign Option internal mutable reference

Hopefully the title is accurate.
I would like to set a field on the Node struct that is inside of a vector. The node is a mutable reference, so I'm not sure why I can't assign to it. I'm guessing I am not properly unwrapping the Option?
Code example:
#[derive(Debug)]
enum ContentType {
Big,
Small,
}
#[derive(Debug)]
struct Node {
content_type: Option<ContentType>
}
#[derive(Debug)]
struct List {
nodes: Vec<Node>,
}
impl List {
fn get_node(&self, index: usize) -> Option<&Node> {
return self.nodes.get(index);
}
}
fn main() {
let list = List {
nodes: vec![Node {content_type: None}]
};
let node = &mut list.get_node(0);
println!("{:?}", list);
if let Some(x) = node {
x.content_type = Some(ContentType::Big)
}
}
Playground:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=18bdaf8b903d57dfbf49ebfb3252cf34
Receiving this error:
cannot assign to `x.content_type` which is behind a `&` reference
The error specifically refers to the return type of get_node, which is Option<&Node>. When you take the content out of the option here:
if let Some(x) = node {
x.content_type = Some(ContentType::Big)
}
x becomes &Node, which is not a mutable reference.
You need to change get_node to return a mutable reference.
impl List {
// Change to &mut self to borrow mutable items from self, and change the return type.
fn get_node(&mut self, index: usize) -> Option<&mut Node> {
return self.nodes.get_mut(index);
}
}
fn main() {
let mut list = List {
nodes: vec![Node {content_type: None}]
};
// Move this print statement above get_node(),
// as you can't get a non mutable reference while you are still holding onto a mutable reference
println!("{:?}", list);
let node = list.get_node(0);
if let Some(x) = node {
x.content_type = Some(ContentType::Big)
}
}
Note, however, that this means you can't hold mutable references to two different nodes at the same time. See this question for a potential solution: How to get mutable references to two array elements at the same time?
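One way to borrow two distinct elements mutably in safe code is the standard slice method split_at_mut, which splits a slice into two non-overlapping halves. The following is a rough sketch of that technique; `two_mut` is a hypothetical helper name and is not taken from the linked question.

```rust
/// Returns mutable references to `items[i]` and `items[j]`,
/// or `None` if the indices are equal or out of bounds.
fn two_mut<T>(items: &mut [T], i: usize, j: usize) -> Option<(&mut T, &mut T)> {
    if i == j || i >= items.len() || j >= items.len() {
        return None;
    }
    if i < j {
        // `left` covers [0, j) and `right` starts at index j.
        let (left, right) = items.split_at_mut(j);
        Some((&mut left[i], &mut right[0]))
    } else {
        // `left` covers [0, i) and `right` starts at index i.
        let (left, right) = items.split_at_mut(i);
        Some((&mut right[0], &mut left[j]))
    }
}

fn main() {
    let mut values = vec![10, 20, 30];
    if let Some((a, b)) = two_mut(&mut values, 0, 2) {
        std::mem::swap(a, b);
    }
    println!("{:?}", values);
}
```

The same helper could be called on `list.nodes` from within a method of List, since a `&mut Vec<Node>` coerces to `&mut [Node]`.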
It means what it says: get_node returns an Option<&Node>, which is a non-mutable reference to a node. You can't mutate it.
Perhaps you meant to do
impl List {
fn get_node(&mut self, index: usize) -> Option<&mut Node> {
return self.nodes.get_mut(index);
}
}
Then you can do
let mut list = List {nodes: vec![Node {content_type: None}]};
let node = list.get_node(0);
if let Some(x) = node {
x.content_type = Some(ContentType::Big)
}

How to implement a macro that defines a new public type and returns an instance of that type?

I want to implement a struct using macro_rules! because the generics require a lot of boilerplate and trait hunting.
The struct in question has a hash table inside but the key and the value types are to be provided by the user. The code is as follows:
macro_rules! new_ytz {
($T: ty) => {
// define the struct
pub struct Ytz {
table: hashbrown::hash_map::HashMap<$T, $T>,
}
impl Ytz {
pub fn new() -> Self {
Ytz {
table: hashbrown::hash_map::HashMap::<$T, $T>::new(),
}
}
pub fn add(&mut self, item: &$T) {
if self.table.contains_key(item) {
*self.table.get_mut(item).unwrap() += *item;
} else {
self.table.insert(*item, *item);
}
}
pub fn largest(&self) -> $T {
let mut result = 0;
for v in self.table.values() {
if result < *v {
result = *v;
}
}
result
}
}
// construct an instance of the struct and return it
Ytz::new()
};
}
// driver
fn main() {
let mut y = new_ytz!(u64); // should construct the object and return Ytz::new()
y.add(&71);
y.add(&25);
y.add(&25);
y.add(&25);
y.add(&34);
println!("{}", y.largest());
}
This won't compile since it tries to paste the struct within the main function:
error: expected expression, found keyword `pub`
--> src/main.rs:4:9
|
4 | pub struct Ytz {
| ^^^ expected expression
...
40 | let mut y = new_ytz!(u64); // should construct the object and return Ytz::new()
| ------------- in this macro invocation
How can I work around it? How can I paste the struct outside the main function publicly, along with the impl block?
"generics require a lot of boilerplate"
use std::collections::HashMap;
use core::hash::Hash;
use std::ops::AddAssign;
struct YtzU64<T: Eq + Ord + Hash + Copy + AddAssign> {
table: HashMap<T, T>
}
impl<T: Eq + Ord + Hash + Copy + AddAssign> YtzU64<T> {
pub fn new() -> Self {
Self {
table: HashMap::new()
}
}
pub fn add(&mut self, item: &T) {
if let Some(total) = self.table.get_mut(item) {
// Add the new item to the stored total, as the original macro does.
*total += *item;
} else {
self.table.insert(*item, *item);
}
}
pub fn largest(&self) -> Option<T> {
let mut values = self.table.values();
let mut largest:Option<T> = values.next().map(|t| *t);
for v in values {
if largest < Some(*v) {
largest = Some(*v);
}
}
largest
}
}
fn main() {
let mut y = YtzU64::new();
y.add(&71);
y.add(&25);
y.add(&25);
y.add(&25);
y.add(&34);
println!("{}", y.largest().unwrap());
}
My translation of your macro requires less boilerplate than the macro itself: it has two fewer levels of indentation and four fewer lines (macro_rules!, the pattern match at the top, and two closing braces at the end). Note that I changed the API slightly: largest now returns an Option, to match std::iter::Iterator::max(). Also note that your API design is limited to T: Copy. You would have to redesign it a little to support types that are Clone but not Copy, or types that are neither.
"trait hunting"
The compiler is your friend. Watch what happens when I remove one of the trait bounds:
error[E0277]: the trait bound `T: std::hash::Hash` is not satisfied
...
Using a macro is an interesting exercise, but re-implementing generics using macros is not useful.
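If a macro is still wanted, the "expected expression" error can be avoided by splitting the two jobs: invoke the macro at module level, where item definitions are allowed, and construct the instance separately. The sketch below assumes that approach. The macro name `define_ytz`, the extra type-name parameter, and the use of std's HashMap instead of hashbrown are all choices made for illustration, and the `0` literal restricts it to integer types.

```rust
// Defines a struct named `$name` with a table keyed and valued by `$t`.
macro_rules! define_ytz {
    ($name:ident, $t:ty) => {
        pub struct $name {
            table: ::std::collections::HashMap<$t, $t>,
        }

        impl $name {
            pub fn new() -> Self {
                $name {
                    table: ::std::collections::HashMap::new(),
                }
            }

            /// Adds `item` to the running total stored under the key `item`.
            pub fn add(&mut self, item: $t) {
                *self.table.entry(item).or_insert(0) += item;
            }

            /// Returns the largest stored total, or `None` if the table is empty.
            pub fn largest(&self) -> Option<$t> {
                self.table.values().copied().max()
            }
        }
    };
}

// Invoked outside any function, so the struct and impl are ordinary module items.
define_ytz!(YtzU64, u64);

fn main() {
    let mut y = YtzU64::new();
    for &n in &[71u64, 25, 25, 25, 34] {
        y.add(n);
    }
    println!("{:?}", y.largest());
}
```

Passing the type name as a parameter lets the macro be invoked more than once in the same module without defining the same struct twice.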

How do I efficiently build a vector and an index of that vector while processing a data stream?

I have a struct Foo:
struct Foo {
v: String,
// Other data not important for the question
}
I want to handle a data stream and save the result into Vec<Foo> and also create an index for this Vec<Foo> on the field Foo::v.
I want to use a HashMap<&str, usize> for the index, where the keys will be &Foo::v and the value is the position in the Vec<Foo>, but I'm open to other suggestions.
I want to do the data stream handling as fast as possible, which requires not doing obvious things twice.
For example, I want to:
allocate a String only once per read from the data stream
not search the index twice, once to check that the key does not exist and once to insert the new key.
not increase the run time by using Rc or RefCell.
The borrow checker does not allow this code:
let mut l = Vec::<Foo>::new();
{
let mut hash = HashMap::<&str, usize>::new();
//here is loop in real code, like:
//let mut s: String;
//while get_s(&mut s) {
let s = "aaa".to_string();
let idx: usize = match hash.entry(&s) { //a
Occupied(ent) => {
*ent.get()
}
Vacant(ent) => {
l.push(Foo { v: s }); //b
ent.insert(l.len() - 1);
l.len() - 1
}
};
// do something with idx
}
There are multiple problems:
hash.entry borrows the key so s must have a "bigger" lifetime than hash
I want to move s at line (b), while I have a read-only reference at line (a)
So how should I implement this simple algorithm without an extra call to String::clone or calling HashMap::get after calling HashMap::insert?
In general, what you are trying to accomplish is unsafe and Rust is correctly preventing you from doing something you shouldn't. For a simple example why, consider a Vec<u8>. If the vector has one item and a capacity of one, adding another value to the vector will cause a re-allocation and copying of all the values in the vector, invalidating any references into the vector. This would cause all of your keys in your index to point to arbitrary memory addresses, thus leading to unsafe behavior. The compiler prevents that.
In this case, there are two extra pieces of information that the compiler is unaware of but the programmer isn't:
There's an extra indirection — String is heap-allocated, so moving the pointer to that heap allocation isn't really a problem.
The String will never be changed. If it were, then it might reallocate, invalidating the referred-to address. Using a Box<str> instead of a String would be a way to enforce this via the type system.
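The indirection argument can be checked in safe code. The sketch below is an illustration only: it uses Box<str> so the string cannot grow, and the function name and the filler names are made up. It records the address of the string data, moves the box into a vector, forces the vector to grow, and compares the address afterwards.

```rust
/// Returns true if the string data stays at the same heap address
/// after its owner is moved into a vector that later grows.
fn heap_address_is_stable() -> bool {
    let boxed: Box<str> = String::from("alice").into_boxed_str();
    let data_before = boxed.as_ptr();

    // Moving the box copies only the pointer and length, not the string data.
    let mut names: Vec<Box<str>> = Vec::with_capacity(1);
    names.push(boxed);

    // Growing the vector moves the boxes themselves, but not what they point to.
    for i in 0..100 {
        names.push(format!("player{}", i).into_boxed_str());
    }

    names[0].as_ptr() == data_before
}

fn main() {
    println!("stable: {}", heap_address_is_stable());
}
```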
In cases like this, it is OK to use unsafe code, so long as you properly document why it's not unsafe.
use std::collections::HashMap;
#[derive(Debug)]
struct Player {
name: String,
}
fn main() {
let names = ["alice", "bob", "clarice", "danny", "eustice", "frank"];
let mut players = Vec::new();
let mut index = HashMap::new();
for &name in &names {
let player = Player { name: name.into() };
let idx = players.len();
// I copied this code from Stack Overflow without reading the prose
// that describes why this unsafe block is actually safe
let stable_name: &str = unsafe { &*(player.name.as_str() as *const str) };
players.push(player);
index.insert(idx, stable_name);
}
for (k, v) in &index {
println!("{:?} -> {:?}", k, v);
}
for v in &players {
println!("{:?}", v);
}
}
However, my guess is that you don't want this code in your main method but want to return it from some function. That will be a problem, as you will quickly run into Why can't I store a value and a reference to that value in the same struct?.
Honestly, there are styles of code that don't fit well within Rust's limitations. If you run into these, you could:
decide that Rust isn't a good fit for you or your problem.
use unsafe code, preferably thoroughly tested and only exposing a safe API.
investigate alternate representations.
For example, I'd probably rewrite the code to have the index be the primary owner of the key:
use std::collections::BTreeMap;
#[derive(Debug)]
struct Player<'a> {
name: &'a str,
data: &'a PlayerData,
}
#[derive(Debug)]
struct PlayerData {
hit_points: u8,
}
#[derive(Debug)]
struct Players(BTreeMap<String, PlayerData>);
impl Players {
fn new<I>(iter: I) -> Self
where
I: IntoIterator,
I::Item: Into<String>,
{
let players = iter
.into_iter()
.map(|name| (name.into(), PlayerData { hit_points: 100 }))
.collect();
Players(players)
}
fn get<'a>(&'a self, name: &'a str) -> Option<Player<'a>> {
self.0.get(name).map(|data| Player { name, data })
}
}
fn main() {
let names = ["alice", "bob", "clarice", "danny", "eustice", "frank"];
let players = Players::new(names.iter().copied());
for (k, v) in &players.0 {
println!("{:?} -> {:?}", k, v);
}
println!("{:?}", players.get("eustice"));
}
Alternatively, as shown in What's the idiomatic way to make a lookup table which uses field of the item as the key?, you could wrap your type and store it in a set container instead:
use std::collections::BTreeSet;
#[derive(Debug, PartialEq, Eq)]
struct Player {
name: String,
hit_points: u8,
}
#[derive(Debug, Eq)]
struct PlayerByName(Player);
impl PlayerByName {
fn key(&self) -> &str {
&self.0.name
}
}
impl PartialOrd for PlayerByName {
fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
Some(self.cmp(other))
}
}
impl Ord for PlayerByName {
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
self.key().cmp(&other.key())
}
}
impl PartialEq for PlayerByName {
fn eq(&self, other: &Self) -> bool {
self.key() == other.key()
}
}
impl std::borrow::Borrow<str> for PlayerByName {
fn borrow(&self) -> &str {
self.key()
}
}
#[derive(Debug)]
struct Players(BTreeSet<PlayerByName>);
impl Players {
fn new<I>(iter: I) -> Self
where
I: IntoIterator,
I::Item: Into<String>,
{
let players = iter
.into_iter()
.map(|name| {
PlayerByName(Player {
name: name.into(),
hit_points: 100,
})
})
.collect();
Players(players)
}
fn get(&self, name: &str) -> Option<&Player> {
self.0.get(name).map(|pbn| &pbn.0)
}
}
fn main() {
let names = ["alice", "bob", "clarice", "danny", "eustice", "frank"];
let players = Players::new(names.iter().copied());
for player in &players.0 {
println!("{:?}", player.0);
}
println!("{:?}", players.get("eustice"));
}
"not increase the run time by using Rc or RefCell"
Guessing about performance characteristics without performing profiling is never a good idea. I honestly don't believe that there'd be a noticeable performance loss from incrementing an integer when a value is cloned or dropped. If the problem required both an index and a vector, then I would reach for some kind of shared ownership.
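To make the cost concrete: cloning an Rc<str> does not copy the string, it only increments a reference count, and dropping a clone decrements it. The small sketch below illustrates that behavior with standard library calls; the function name is made up for the example, and it is not a benchmark.

```rust
use std::rc::Rc;

/// Returns (same allocation?, count while the clone is alive, count after it is dropped).
fn shared_counts() -> (bool, usize, usize) {
    let name: Rc<str> = Rc::from("alice");
    // The clone points at the same allocation; no string data is copied.
    let key = Rc::clone(&name);
    let same_allocation = Rc::ptr_eq(&name, &key);
    let while_shared = Rc::strong_count(&name);
    drop(key);
    let after_drop = Rc::strong_count(&name);
    (same_allocation, while_shared, after_drop)
}

fn main() {
    println!("{:?}", shared_counts());
}
```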
"not increase the run time by using Rc or RefCell."
@Shepmaster already demonstrated how to accomplish this using unsafe. Once you have that working, I would encourage you to measure how much Rc would actually cost you. Here is a full version with Rc:
use std::{
collections::{hash_map::Entry, HashMap},
rc::Rc,
};
#[derive(Debug)]
struct Foo {
v: Rc<str>,
}
#[derive(Debug)]
struct Collection {
vec: Vec<Foo>,
index: HashMap<Rc<str>, usize>,
}
impl Foo {
fn new(s: &str) -> Foo {
Foo {
v: s.into(),
}
}
}
impl Collection {
fn new() -> Collection {
Collection {
vec: Vec::new(),
index: HashMap::new(),
}
}
fn insert(&mut self, foo: Foo) {
match self.index.entry(foo.v.clone()) {
Entry::Occupied(o) => panic!(
"Duplicate entry for: {}, {:?} inserted before {:?}",
foo.v,
o.get(),
foo
),
Entry::Vacant(v) => v.insert(self.vec.len()),
};
self.vec.push(foo)
}
}
fn main() {
let mut collection = Collection::new();
for foo in vec![Foo::new("Hello"), Foo::new("World"), Foo::new("Go!")] {
collection.insert(foo)
}
println!("{:?}", collection);
}
The error is:
error: `s` does not live long enough
--> <anon>:27:5
|
16 | let idx: usize = match hash.entry(&s) { //a
| - borrow occurs here
...
27 | }
| ^ `s` dropped here while still borrowed
|
= note: values in a scope are dropped in the opposite order they are created
The note: at the end is where the answer is.
s must outlive hash because you are using &s as a key in the HashMap. This reference will become invalid when s is dropped. But, as the note says, hash will be dropped after s. A quick fix is to swap the order of their declarations:
let s = "aaa".to_string();
let mut hash = HashMap::<&str, usize>::new();
But now you have another problem:
error[E0505]: cannot move out of `s` because it is borrowed
--> <anon>:22:33
|
17 | let idx: usize = match hash.entry(&s) { //a
| - borrow of `s` occurs here
...
22 | l.push(Foo { v: s }); //b
| ^ move out of `s` occurs here
This one is more obvious. s is borrowed by the Entry, which will live to the end of the block. Cloning s will fix that:
l.push(Foo { v: s.clone() }); //b
"I only want to allocate s once, not clone it"
But the type of Foo.v is String, so it will own its own copy of the str anyway. That type alone means you have to copy s.
You can replace it with a &str instead which will allow it to stay as a reference into s:
struct Foo<'a> {
v: &'a str,
}
pub fn main() {
// s now lives longer than l
let s = "aaa".to_string();
let mut l = Vec::<Foo>::new();
{
let mut hash = HashMap::<&str, usize>::new();
let idx: usize = match hash.entry(&s) {
Occupied(ent) => {
*ent.get()
}
Vacant(ent) => {
l.push(Foo { v: &s });
ent.insert(l.len() - 1);
l.len() - 1
}
};
}
}
Note that previously I had to move the declaration of s before hash, so that s would outlive it. But now l holds a reference to s, so s has to be declared even earlier, so that it outlives l.
