Rust closure return value - rust

I'm a Rust beginner and am working on a small personal project. I'm working on a function that takes a Vec of records and turns them into strings to be used as rows in a CSV file.
My currently incomplete function looks like
pub fn write_csv_records<T: CSVWritable>( file_path: &str, seperator: char, records: Vec<Box<dyn CSVWritable>>, columns: Vec<String> ) -> ()
{
let body = records.iter()
.map(|r| r.to_csv_row() )
.map(|row_values| {
columns.iter()
// this is my problem: the closure in unwrap_or_else causes an issue!
.map( |column| row_values.get( column ).unwrap_or_else(|| &String::from("") ).clone() )
.collect::<Vec<String>>()
.join( &seperator.to_string() )
})
.collect::<Vec<String>>()
.join( "\n" );
}
I have had a persistent error that reads "cannot return a reference to data owned by the current function".
I'm at a loss for how to proceed. Is this related to lifetimes? Am I missing how unwrap_or_else works? What is the right way to provide a default value when something is absent in the row_values HashMap?

In your unwrap_or_else() call, the closure returns a reference to a String that the closure itself creates and owns. This isn't allowed in Rust because that String is dropped when the closure finishes, so the reference would no longer be valid. Since you're just immediately cloning, you should instead return the actual String from the closure, not a reference to it.
Also, if your CSVWritable trait returns something like a HashMap, where the get() method returns an Option<&V> rather than an Option<V>, then you need to do some extra work to get an owned value (because you're collecting into Vec<String> and not Vec<&String>). So you'll want something like this:
pub trait CSVWritable {
fn to_csv_row(&self) -> std::collections::HashMap<String, String>;
}
pub fn write_csv_records(file_path: &str, seperator: char, records: Vec<Box<dyn CSVWritable>>, columns: Vec<String>) {
let body = records
.iter()
.map(|r| r.to_csv_row())
.map(|row_values| {
columns
.iter()
.map(|column| match row_values.get(column) {
Some(s) => s.clone(),
None => String::from(""),
})
.collect::<Vec<String>>()
.join(&seperator.to_string())
})
.collect::<Vec<String>>()
.join("\n");
}
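The match above can also be written more compactly with Option::cloned followed by Option::unwrap_or_default. Below is a minimal sketch of just the row-building step; the helper name csv_row and the sample data are made up for illustration and are not part of the original code.

```rust
use std::collections::HashMap;

// Hypothetical helper: builds one CSV row from a map of column name -> value.
fn csv_row(row_values: &HashMap<String, String>, columns: &[String], separator: char) -> String {
    columns
        .iter()
        // get() yields Option<&String>; cloned() turns it into Option<String>,
        // and unwrap_or_default() substitutes an empty String when the key is absent.
        .map(|column| row_values.get(column).cloned().unwrap_or_default())
        .collect::<Vec<String>>()
        .join(&separator.to_string())
}

fn main() {
    let mut row = HashMap::new();
    row.insert("name".to_string(), "Ferris".to_string());
    let columns = vec!["name".to_string(), "age".to_string()];
    // "age" is missing from the map, so its cell is left empty.
    println!("{}", csv_row(&row, &columns, ','));
}
```

Because the closure now returns an owned String in both cases, no reference to a temporary ever leaves it.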

Related

RefCell cannot return string slice from String; temporary value problem

I'm trying to return a &str slice from a String which is behind an Rc<RefCell<>>
use std::rc::Rc;
use std::cell::{
RefCell,
Ref
};
struct Data {
data: Rc<RefCell<String>>
}
impl Data {
fn set(
&mut self,
data: String
) {
*self.data.borrow_mut() = data;
}
// TODO: fix here
fn get(&self) -> &str {
self.data.borrow().as_str()
}
}
fn main() {
let mut data = Data {
data: Rc::new(RefCell::new(String::new()))
};
data.set(String::from("how can i solve this?"));
let d = data.get();
println!("{}", d);
}
here's the compilation error
error[E0515]: cannot return reference to temporary value
--> questions/refcell-data-inspection/src/main.rs:21:9
|
21 | self.data.borrow().as_str()
| ------------------^^^^^^^^^
| |
| returns a reference to data owned by the current function
| temporary value created here
Well, I understand it's a temporary value created because borrow returns a guard, which will go out of scope.
How can I fix this in order to return a &str slice from that String?
Well, as you said yourself, the borrow guard goes out of scope at the end of the function, so you cannot do that. Think of it this way: if you managed to return a reference into the String, how would the RefCell know that this reference exists and that it must not hand out, for example, a mutable reference?
The only thing you can do is return that guard. This means you must change the function's signature, but the result will be as usable as a &str would be. So try this:
fn get(&self) -> Ref<'_, String> {
self.data.borrow()
}
EDIT. You might want to also look at that question: How to borrow the T from a RefCell<T> as a reference?
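If the caller should see a str rather than a String, the guard can be narrowed with Ref::map from the standard library. The following is a sketch under the assumption that the Data struct is the one from the question; the sample strings are invented.

```rust
use std::cell::{Ref, RefCell};
use std::rc::Rc;

struct Data {
    data: Rc<RefCell<String>>,
}

impl Data {
    fn set(&mut self, data: String) {
        *self.data.borrow_mut() = data;
    }

    // Return the borrow guard, projected from String to str with Ref::map.
    // The RefCell stays immutably borrowed for as long as the guard is alive.
    fn get(&self) -> Ref<'_, str> {
        Ref::map(self.data.borrow(), |s| s.as_str())
    }
}

fn main() {
    let mut data = Data {
        data: Rc::new(RefCell::new(String::new())),
    };
    data.set(String::from("first value"));
    {
        let d = data.get();
        // Dereferencing the guard gives a plain &str.
        println!("{}", &*d);
    } // the guard is dropped here, releasing the borrow
    data.set(String::from("second value"));
    println!("{}", &*data.get());
}
```

The guard dereferences to str, so most code that accepted a &str keeps working with &*d or through auto-deref.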

Hashmap value not updating on consecutive inserts [Rust]

I have a struct that contains various Routers. Mostly hashmaps. But for this specific hashmap, the values are not updating after insertion. There is no delete function. Just an insert function(shown below).
This is the main struct
pub struct Router {
....
web_socket_routes: Arc<RwLock<HashMap<String, HashMap<String, (PyFunction, u8)>>>>,
}
This is a getter
#[inline]
pub fn get_web_socket_map(
&self,
) -> &Arc<RwLock<HashMap<String, HashMap<String, (PyFunction, u8)>>>> {
&self.web_socket_routes
}
This is the insert method
pub fn add_websocket_route(
&mut self,
route: &str,
connect_route: (Py<PyAny>, bool, u8),
close_route: (Py<PyAny>, bool, u8),
message_route: (Py<PyAny>, bool, u8),
) {
let table = self.get_web_socket_map();
let (connect_route_function, connect_route_is_async, connect_route_params) = connect_route;
let (close_route_function, close_route_is_async, close_route_params) = close_route;
let (message_route_function, message_route_is_async, message_route_params) = message_route;
let insert_in_router =
|handler: Py<PyAny>, is_async: bool, number_of_params: u8, socket_type: &str| {
let function = if is_async {
PyFunction::CoRoutine(handler)
} else {
PyFunction::SyncFunction(handler)
};
let mut route_map = HashMap::new();
route_map.insert(socket_type.to_string(), (function, number_of_params));
println!("socket type is {:?} {:?}", table, route);
table.write().unwrap().insert(route.to_string(), route_map);
};
insert_in_router(
connect_route_function,
connect_route_is_async,
connect_route_params,
"connect",
);
insert_in_router(
close_route_function,
close_route_is_async,
close_route_params,
"close",
);
insert_in_router(
message_route_function,
message_route_is_async,
message_route_params,
"message",
);
}
After all the 3 insert_in_router calls, web_socket_routes only contains the insertion of the last insert_in_router call?
I have tried changing the Arc<RwLock< for a generic DashMap but I am still facing the same issues.
Why is this happening?
Your closure unconditionally creates a new inner hashmap each time, which it uses as value in the outer hashmap. However, it inserts it into the outer hashmap under the same key (route.to_string()) all three times, which results in each insert overwriting the previous one(s).
You need logic that creates a new inner hashmap only if one is missing under that key, and otherwise looks up the existing one. It should then insert the value into the inner hashmap, whether that is the freshly created one or the one it looked up. In Rust this is conveniently done using the entry API:
table
.write()
.unwrap()
.entry(route.to_string())
.or_default()
.insert(socket_type.to_string(), (function, number_of_params));
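To see the difference in isolation, here is a small self-contained sketch of the same entry pattern. It drops the RwLock and the PyFunction type and stores only a u8 per handler; the Routes alias, the add_handler function, and the "/ws" route are hypothetical names used for illustration.

```rust
use std::collections::HashMap;

// Simplified stand-in for the router table: route -> (socket type -> param count).
type Routes = HashMap<String, HashMap<String, u8>>;

fn add_handler(table: &mut Routes, route: &str, socket_type: &str, number_of_params: u8) {
    table
        .entry(route.to_string())
        // Reuse the inner map if the route already exists, otherwise create an empty one.
        .or_default()
        .insert(socket_type.to_string(), number_of_params);
}

fn main() {
    let mut table = Routes::new();
    for (socket_type, params) in [("connect", 0), ("close", 0), ("message", 1)] {
        add_handler(&mut table, "/ws", socket_type, params);
    }
    // All three handlers end up in the same inner map.
    println!("{} handlers registered for /ws", table["/ws"].len());
}
```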

Can I create a custom iterator that iterates over one sequence then another (chain doesn't work)

I have a struct Folder. I have a method called contents. I want that method to return an object that supports IntoIterator so that the caller can just go
for x in folder.contents(){
...
}
The Item type is (since this is what the hashmap iterator returns - see a little lower)
(&OsString, &FileOrFolder)
where FileOrFolder is an enum
enum FileOrFolder{
File(File),
Folder(Folder)
}
The iterator itself needs to first enumerate a HashMap<OsString, FileOrFolder> owned by the folder and then, second, enumerate a Vec<File>. The Vec of files is created on the fly by the contents fn or by the IntoIterator call, whatever works. I tried simply using chain but quickly realized that wasn't going to work. So my rough sketch of what I am trying to do is this:
// the iterator
pub struct FFIter {
files: Vec<FileOrFolder>,
files_iter:Box<dyn Iterator<Item=FileOrFolder>>,
dirs: Box<dyn Iterator<Item = (&OsString, &FileOrFolder)>>,
dirs_done:bool
}
// the thing returned by the contents fn
struct FolderContents{
folder:&Folder
}
// make it iterable
impl IntoIterator for FolderContents {
type Item =(&OsString, &FileOrFolder);
type IntoIter = FFIter;
fn into_iter(self) -> Self::IntoIter {
let files = self.folder.make_the_files()
FFIter {
files: files, // to keep files 'alive'
files_iter: files.iter(),
dirs: Box::new(self.hashmap.iter()),
dirs_done:false
}
}
}
impl Iterator for FFIter {
type Item = (&OsString, &FileOrFolder);
fn next(&mut self) -> Option<(&OsString, &FileOrFolder)> {
None // return empty, lets just get the skeleton built
}
}
impl Folder{
pub fn contents(&self) -> FolderContents{
FolderContents{folder:&self}
}
}
I know this is full of errors, but I need to know if this is doable at all. As you can see I am not even trying to write the code that returns anything. I am just trying to get the basic outline to compile.
I started arm wrestling with the lifetime system and got to the point where I had this
error[E0658]: generic associated types are unstable
--> src\state\files\file_or_folder.rs:46:5
|
46 | type Item<'a> =(&'a OsString, &'a FileOrFolder);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: see issue #44265 <https://github.com/rust-lang/rust/issues/44265> for more information
Which kinda sucked as that is what the compiler said I should do.
I am happy to keep ploughing away at this following the suggestions from the compiler / reading / ... But in the past I have posted a question along these lines and been told - 'of course it can't be done'. So should I be able to make this work?
The Folder type is not Copy and expensive to clone. The File type is simple (string and i64), Copy and Clone
I know I could simply make the caller call two different iterations and merge them, but I am trying to write a transparent replacement module to drop into a large existing codebase.
If somebody says that chain() should work that's great, I will have another go at that.
EDIT: Jmp said chain should work.
Here's what I tried:
pub fn contents(&self) -> Box<dyn Iterator<Item = (&OsString, &FileOrFolder)> + '_> {
let mut files = vec![];
if self.load_done {
for entry in WalkDir::new(&self.full_path)
.max_depth(1)
.skip_hidden(false)
.follow_links(false)
.into_iter()
{
let ent = entry.unwrap();
if ent.file_type().is_file() {
if let Some(name) = ent.path().file_name() {
files.push((
name.to_os_string(),
FileOrFolder::File(File {
name: name.to_os_string(),
size: ent.metadata().unwrap().len() as u128,
}),
));
}
}
}
};
Box::new(
self.contents
.iter()
.map(|(k, v)| (k, v))
.chain(files.iter().map(|x| (&x.0, &x.1))),
)
}
but the compiler complains, correctly, that 'files' gets destroyed at the end of the call. What I need is for the vec to be held by the iterator and then dropped at the end of the iteration. Folder itself cannot hold the files: the whole point here is to populate files on the fly, because holding them is too expensive memory-wise.
You claim that files is populated on the fly, but that's precisely what your code is not doing: your code precomputes files before attempting to return it. The solution is to really compute files on the fly, something like this:
pub fn contents(&self) -> Box<dyn Iterator<Item = (&OsString, &FileOrFolder)> + '_> {
    // Nothing is walked here; entries are produced lazily as the iterator is consumed.
    let files = WalkDir::new(&self.full_path)
        .max_depth(1)
        .skip_hidden(false)
        .follow_links(false)
        .into_iter()
        .filter_map(|entry| {
            let ent = entry.unwrap();
            if ent.file_type().is_file() {
                if let Some(name) = ent.path().file_name() {
                    Some((
                        name.to_os_string(),
                        FileOrFolder::File(File {
                            name: name.to_os_string(),
                            size: ent.metadata().unwrap().len() as u128,
                        }),
                    ))
                } else {
                    None
                }
            } else {
                None
            }
        });
    Box::new(self.contents.iter().chain(files))
}
Since you haven't given us an MRE, I haven't tested the above, but I think it will fail because self.contents.iter() returns references, whereas files returns owned values. Fixing this requires changing the prototype of the function to return some form of owned values since files cannot be made to return references. I see two ways to do this:
Easiest is to make FileOrFolder clonable and get rid of the references in the prototype:
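pub fn contents(&self) -> Box<dyn Iterator<Item = (OsString, FileOrFolder)> + '_> {
    let files = ...;
    // HashMap::iter yields (&K, &V) tuples, so clone each half explicitly.
    Box::new(
        self.contents
            .iter()
            .map(|(k, v)| (k.clone(), v.clone()))
            .chain(files),
    )
}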
Or you can make a wrapper type similar to Cow that can hold either a reference or an owned value:
enum OwnedOrRef<'a, T> {
Owned (T),
Ref (&'a T),
}
pub fn contents(&self) -> Box<dyn Iterator<Item = (OwnedOrRef<'_, OsString>, OwnedOrRef<'_, FileOrFolder>)> + '_> {
    let files = ...;
    Box::new(
        self.contents
            .iter()
            .map(|(k, v)| (OwnedOrRef::Ref(k), OwnedOrRef::Ref(v)))
            .chain(files.map(|(k, v)| (OwnedOrRef::Owned(k), OwnedOrRef::Owned(v)))),
    )
}
You can even use Cow if FileOrFolder can implement ToOwned.
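As a rough illustration of the Cow variant, the sketch below chains borrowed entries from a map with owned entries produced lazily. It is not the asker's code: Entry, the String keys, and the generated-N names are invented stand-ins, and a simple range replaces the WalkDir traversal.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

// Hypothetical stand-in for FileOrFolder; Clone is what lets Cow own a copy.
#[derive(Clone, Debug, PartialEq)]
struct Entry {
    size: u64,
}

struct Folder {
    contents: HashMap<String, Entry>,
}

impl Folder {
    // Yields borrowed items from the map first, then owned items created on the fly.
    fn contents(&self) -> Box<dyn Iterator<Item = (Cow<'_, str>, Cow<'_, Entry>)> + '_> {
        Box::new(
            self.contents
                .iter()
                .map(|(k, v)| (Cow::Borrowed(k.as_str()), Cow::Borrowed(v)))
                // Stand-in for a directory walk: each item is built only when requested.
                .chain((0..2u64).map(|i| {
                    (
                        Cow::Owned(format!("generated-{i}")),
                        Cow::Owned(Entry { size: i }),
                    )
                })),
        )
    }
}

fn main() {
    let mut contents = HashMap::new();
    contents.insert("stored".to_string(), Entry { size: 10 });
    let folder = Folder { contents };
    for (name, entry) in folder.contents() {
        println!("{name}: {} bytes", entry.size);
    }
}
```

The caller reads both kinds of item through the same Deref interface, and the owned items are dropped as the loop moves on, so no Vec of generated entries is ever held.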

Return exact value in Rust HashMap

I can't find a suitable way to return the exact value for a key in a HashMap in Rust. All the existing get methods return the value in a different form rather than the value itself.
You probably want the HashMap::remove method - it deletes the key from the map and returns the original value rather than a reference:
use std::collections::HashMap;
struct Thing {
content: String,
}
fn main() {
let mut hm: HashMap<u32, Thing> = HashMap::new();
hm.insert(
123,
Thing {
content: "abc".into(),
},
);
hm.insert(
432,
Thing {
content: "def".into(),
},
);
// Remove object from map, and take ownership of it
let value = hm.remove(&432);
if let Some(v) = value {
println!("Took ownership of Thing with content {:?}", v.content);
};
}
The get methods must return a reference to the object because the original object can only exist in one place (it is owned by the HashMap). The remove method can return the original object (i.e "take ownership") only because it removes it from its original owner.
Another solution, depending on the specific situation, may be to take the reference and call .clone() on it to make a new copy of the object (in this case it wouldn't work because Clone isn't implemented for our Thing example object, but it would work if the value was, say, a String).
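For completeness, here is a sketch of the clone approach, assuming Thing is allowed to derive Clone (which the example above does not do). The lookup_owned helper is a hypothetical name.

```rust
use std::collections::HashMap;

// Same shape as the Thing above, but with Clone derived so copies can be made.
#[derive(Clone, Debug, PartialEq)]
struct Thing {
    content: String,
}

// Hypothetical helper: returns an owned copy and leaves the map untouched.
fn lookup_owned(map: &HashMap<u32, Thing>, key: u32) -> Option<Thing> {
    // get() returns Option<&Thing>; cloned() converts it to Option<Thing>.
    map.get(&key).cloned()
}

fn main() {
    let mut hm: HashMap<u32, Thing> = HashMap::new();
    hm.insert(432, Thing { content: "def".into() });

    if let Some(copy) = lookup_owned(&hm, 432) {
        println!("Got an owned copy with content {:?}", copy.content);
    }
    // The original is still in the map.
    println!("Map still holds {} entry", hm.len());
}
```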
Finally it may be worth noting you can still use the reference to the object in many circumstances - e.g the previous example could be done by getting a reference:
use std::collections::HashMap;
struct Thing {
content: String,
}
fn main() {
let mut hm: HashMap<u32, Thing> = HashMap::new();
hm.insert(
123,
Thing {
content: "abc".into(),
},
);
hm.insert(
432,
Thing {
content: "def".into(),
},
);
let value = hm.get(&432); // Get reference to the Thing containing "def" instead of removing it from the map and taking ownership
// Print the `content` as in previous example.
if let Some(v) = value {
println!("Showing content of referenced Thing: {:?}", v.content);
}
}
There are two basic methods of obtaining the value for the given key: get() and get_mut(). Use the first one if you just want to read the value, and the second one if you need to modify the value:
fn get(&self, k: &Q) -> Option<&V>
fn get_mut(&mut self, k: &Q) -> Option<&mut V>
As you can see from their signatures, both of these methods return Option rather than a direct value. The reason is that there may be no value associated to the given key:
use std::collections::HashMap;
let mut map = HashMap::new();
map.insert(1, "a");
assert_eq!(map.get(&1), Some(&"a")); // key exists
assert_eq!(map.get(&2), None); // key does not exist
If you are sure that the map contains the given key, you can use unwrap() to get the value out of the option:
assert_eq!(map.get(&1).unwrap(), &"a");
However, in general, it is better (and safer) to consider also the case when the key might not exist. For example, you may use pattern matching:
if let Some(value) = map.get(&1) {
assert_eq!(value, &"a");
} else {
// There is no value associated to the given key.
}

Borrowed value doesn't live long enough, trying to expose iterators instead of concrete Vec representations of the data

I have a struct representing a grid of data, and accessors for the rows and columns. I'm trying to add accessors for the rows and columns which return iterators instead of Vec.
use std::slice::Iter;
#[derive(Debug)]
pub struct Grid<Item : Copy> {
raw : Vec<Vec<Item>>
}
impl <Item : Copy> Grid <Item>
{
pub fn new( data: Vec<Vec<Item>> ) -> Grid<Item> {
Grid{ raw : data }
}
pub fn width( &self ) -> usize {
self.rows()[0].len()
}
pub fn height( &self ) -> usize {
self.rows().len()
}
pub fn rows( &self ) -> Vec<Vec<Item>> {
self.raw.to_owned()
}
pub fn cols( &self ) -> Vec<Vec<Item>> {
let mut cols = Vec::new();
for i in 0..self.height() {
let col = self.rows().iter()
.map( |row| row[i] )
.collect::<Vec<Item>>();
cols.push(col);
}
cols
}
pub fn rows_iter( &self ) -> Iter<Vec<Item>> {
// LIFETIME ERROR HERE
self.rows().iter()
}
pub fn cols_iter( &self ) -> Iter<Vec<Item>> {
// LIFETIME ERROR HERE
self.cols().iter()
}
}
Both functions rows_iter and cols_iter have the same problem: error: borrowed value does not live long enough. I've tried a lot of things, but pared it back to the simplest thing to post here.
You can use the into_iter method, which returns a std::vec::IntoIter. The iter method only borrows the collection it iterates over, whereas into_iter takes ownership of it, so the vector lives as long as the iterator does.
pub fn cols_iter( &self ) -> std::vec::IntoIter<Vec<Item>> {
    self.cols().into_iter()
}
However, I think that the design of your Grid type could be improved a lot. Always cloning a vector is not a good thing (to name one issue).
Iterators only contain borrowed references to the original data structure; they don't take ownership of it. Therefore, a vector must live longer than an iterator on that vector.
rows and cols allocate and return a new Vec. rows_iter and cols_iter are trying to return an iterator on a temporary Vec. This Vec will be deallocated before rows_iter or cols_iter return. That means that an iterator on that Vec must be deallocated before the function returns. However, you're trying to return the iterator from the function, which would make the iterator live longer than the end of the function.
There is simply no way to make rows_iter and cols_iter compile as is. I believe these methods are simply unnecessary, since you already provide the public rows and cols methods.
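If iterator accessors are still wanted, one possible redesign is to borrow from the stored data instead of from a temporary copy. The sketch below is only an illustration of that idea, not the original Grid: it assumes a rectangular grid, takes the column count from the first row, and builds each column lazily.

```rust
#[derive(Debug)]
pub struct Grid<T: Copy> {
    raw: Vec<Vec<T>>,
}

impl<T: Copy> Grid<T> {
    pub fn new(data: Vec<Vec<T>>) -> Grid<T> {
        Grid { raw: data }
    }

    // Borrows the rows held by the grid, so the iterator cannot outlive `self`.
    pub fn rows_iter(&self) -> std::slice::Iter<'_, Vec<T>> {
        self.raw.iter()
    }

    // Columns are not stored anywhere, so each one is built on demand and
    // handed out as an owned Vec.
    pub fn cols_iter(&self) -> impl Iterator<Item = Vec<T>> + '_ {
        // Assumes every row has the same length as the first one.
        let width = self.raw.first().map_or(0, Vec::len);
        (0..width).map(move |i| self.raw.iter().map(|row| row[i]).collect())
    }
}

fn main() {
    let grid = Grid::new(vec![vec![1, 2, 3], vec![4, 5, 6]]);
    for row in grid.rows_iter() {
        println!("row: {:?}", row);
    }
    for col in grid.cols_iter() {
        println!("col: {:?}", col);
    }
}
```

Neither method clones the whole grid, and the returned iterators are tied to the lifetime of the Grid itself rather than to a Vec that is dropped when the method returns.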

Resources