I faced the same problem as mentioned in this question. In short, the problem is borrowing an object as mutable, due to its usage inside a closure, while also borrowing it as immutable, due to its usage inside a function (or a macro, in this case).
fn main() {
    let mut count = 0;
    let mut inc = || {
        count += 2;
    };
    for _index in 1..5 {
        inc();
        println!("{}", count);
    }
}
There are two solutions to this problem: define the closure inside the for loop instead of outside it, or avoid capturing the variable by passing a mutable reference through the closure's parameters:
1.
fn main() {
    let mut count = 0;
    for _index in 1..5 {
        let mut inc = || {
            count += 2;
        };
        inc();
        println!("{}", count);
    }
}
2.
fn main() {
    let mut count = 0;
    let inc = |count: &mut i32| {
        *count += 2;
    };
    for _index in 1..5 {
        inc(&mut count);
        println!("{}", count);
    }
}
So I have the following questions on my mind:
Which of these follows best practice?
Is there a third way of doing things the right way?
According to my understanding, closures are just anonymous functions, so defining them multiple times should be as efficient as defining them a single time. But I am not able to find a definite answer to this question in the official Rust reference. Help!
Regarding which one is the right solution, I would say it depends on the use case. They are so similar that it shouldn't matter in most cases, unless there is something else to sway the decision. I don't know of any third solution.
However, closures are not just anonymous functions but also anonymous structs: a closure is an anonymous struct that calls an anonymous function. The members of the struct are the closure's captured values (or references to them). This is important because structs need to be initialized and potentially moved around, unlike functions. This means the more values your closure borrows, the more expensive it is to initialize and to pass by value as an argument to functions. Likewise, if you initialize your closure inside a loop, the initialization might happen every iteration (if it is not optimized out of the loop), making it less performant than initializing it outside the loop.
We can try and desugar the first example into the following code:
struct IncClosureStruct<'a> {
    count: &'a mut i32,
}

fn inc_closure_fn<'a>(borrows: &mut IncClosureStruct<'a>) {
    *borrows.count += 2
}

fn main() {
    let mut count = 0;
    for _index in 1..5 {
        let mut inc_struct = IncClosureStruct { count: &mut count };
        inc_closure_fn(&mut inc_struct);
        println!("{}", count);
    }
}
Note: The compiler doesn't necessarily do exactly like this, but it is a useful approximation.
Here you can see the closure struct IncClosureStruct and its function inc_closure_fn, which together provide the functionality of inc. You can see we initialize the struct in the loop and then call it immediately. If we were to desugar the second example, IncClosureStruct would have no members, but inc_closure_fn would take an additional argument that references the counter. The counter reference would then go to the function call instead of the struct initializer.
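For comparison, the second example might desugar to something like the following (again only an approximation, not what the compiler literally emits): the closure captures nothing, so its struct is empty, and the counter reference travels as a call argument instead of living in the struct.

```rust
// Approximate desugaring of the second example: the closure captures
// nothing, so its struct has no members; the counter reference is
// passed at each call site instead of being stored in the struct.
struct IncClosureStruct;

fn inc_closure_fn(_borrows: &IncClosureStruct, count: &mut i32) {
    *count += 2
}

fn main() {
    let mut count = 0;
    let inc_struct = IncClosureStruct; // initialized once, outside the loop
    for _index in 1..5 {
        inc_closure_fn(&inc_struct, &mut count);
        println!("{}", count);
    }
}
```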
These two examples end up the same efficiency-wise, because the number of values actually passed to the function is the same in both cases: one reference. Initializing a struct with one member is the same as initializing the member itself; the wrapping struct is gone by the time you reach machine code. I tried this on Godbolt and, as far as I can tell, the resulting assembly is the same.
However, optimizations don't catch all situations. So, if performance is important, benchmarking is the way to go.
Related
I have this piece of code below where I iterate over a Vec (self.regions) and, under the right conditions, mutate the current element of the Vec.
However, when processing a region, I must also call another method on self that borrows it, leaving me with both a mutable and a shared reference to self at the same time, which is obviously impossible (and will not compile).
How can I change my code in order to make it work?
I've thought of moving line 3 outside the loop and using iter() instead of iter_mut(), but then I don't see how to "keep the two iterators in sync" in order to make sure that both pieces of code refer to the same region.
fn growth(&'a mut self) {
    for region in self.regions.iter_mut() {
        let candidates = self.neighbors(region); // Has to involve taking a &self
        // Do stuff with `candidates` that involves mutating `region`
    }
}
Edit
My post was lacking context, so here is the full function. self.regions is of type Vec<Vec<&'a Cell>>, and the signature of neighbors is fn neighbors(&self, region: &Vec<&Cell>) -> HashSet<&Cell>.
The objective of my program is to tesselate a grid into random polyominoes, by randomly selecting some sources and (the part that I posted) "growing them" into regions by iterating over each region and adding a random neighbor of the region to the region itself.
The neighbors function depends on the previous iterations' outcome, so it cannot be precomputed.
fn growth(&'a mut self) -> bool {
    let mut unchoosen_cells = 20; // More complex in reality
    let mut steps: usize = 0;
    while unchoosen_cells != 0 {
        if steps > GROWTH_STEP_LIMIT { return false; }
        for region in self.regions.iter_mut() {
            if rand::random() { continue; }
            let candidates = self.neighbors(region);
            if let Some(cell) = candidates.iter().choose(&mut thread_rng()) {
                // cell.in_region = true;
                region.push(cell);
                unchoosen_cells -= 1;
            }
            steps += 1;
        }
    }
    true
}
After reading all the comments, I understand that my issue is the result of a poor design, and I will refactor my code to use indices instead of references.
Thanks to all the people who took the time to help me; I'm marking this as solved.
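For reference, a minimal sketch of that index-based refactor. The `Grid` type and the trivial `neighbors` below are hypothetical stand-ins for the original types; the point is only the borrow structure: iterating over `0..self.regions.len()` means the shared borrow taken for `neighbors` ends before the mutable borrow used to push.

```rust
// Hypothetical, simplified sketch of the index-based refactor;
// `Grid` and this trivial `neighbors` are stand-ins, not the
// asker's real types.
struct Grid {
    regions: Vec<Vec<usize>>,
}

impl Grid {
    fn neighbors(&self, region_idx: usize) -> Vec<usize> {
        // Stand-in for the real neighbor computation (takes &self).
        self.regions[region_idx].iter().map(|c| c + 1).collect()
    }

    fn growth(&mut self) {
        for i in 0..self.regions.len() {
            // The shared borrow for `neighbors` ends at the end of
            // this statement, so mutating the region below is fine.
            let candidates = self.neighbors(i);
            if let Some(&cell) = candidates.first() {
                self.regions[i].push(cell);
            }
        }
    }
}

fn main() {
    let mut g = Grid { regions: vec![vec![0], vec![5, 6]] };
    g.growth();
    println!("{:?}", g.regions);
}
```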
My question seems to be closely related to Rust error "cannot infer an appropriate lifetime for borrow expression" when attempting to mutate state inside a closure returning an Iterator, but I think it's not the same. So, this
use std::iter;

fn example(text: String) -> impl Iterator<Item = Option<String>> {
    let mut i = 0;
    let mut chunk = None;
    iter::from_fn(move || {
        if i <= text.len() {
            let p_chunk = chunk;
            chunk = Some(&text[..i]);
            i += 1;
            Some(p_chunk.map(|s| String::from(s)))
        } else {
            None
        }
    })
}

fn main() {}
does not compile. The compiler says it cannot determine the appropriate lifetime for &text[..i]. This is the smallest example I could come up with. The idea is that there is internal state, a slice of text, and the iterator returns new Strings allocated from that internal state. I'm new to Rust, so maybe it's all obvious, but how would I annotate the lifetimes so that this compiles?
Note that this example is different from the linked example, because there point was passed as a reference, while here text is moved. Also, the answer there is one and a half years old by now, so maybe there is an easier way.
EDIT: Added p_chunk to emphasize that chunk needs to be persistent across calls to next and so cannot be local to the closure but should be captured by it.
Your code is an example of attempting to create a self-referential struct, where the struct is implicitly created by the closure. Since both text and chunk are moved into the closure, you can think of both as members of a struct. As chunk refers to the contents in text, the result is a self-referential struct, which is not supported by the current borrow checker.
While self-referential structs are unsafe in general due to moves, in this case it would be safe because text is heap-allocated and is not subsequently mutated, nor does it escape the closure. Therefore it is impossible for the contents of text to move, and a sufficiently smart borrow checker could prove that what you're trying to do is safe and allow the closure to compile.
The answer to the [linked question] says that referencing through an Option is possible but the structure cannot be moved afterwards. In my case, the self-reference is created after text and chunk were moved in place, and they are never moved again, so in principle it should work.
Agreed - it should work in principle, but it is well known that the current borrow checker doesn't support it. The support would require multiple new features: the borrow checker should special-case heap-allocated types like Box or String whose moves don't affect references into their content, and in this case also prove that you don't resize or mem::replace() the closed-over String.
In this case the best workaround is the "obvious" one: instead of persisting the chunk slice, persist a pair of usize indices (or a Range) and create the slice when you need it.
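A sketch of that workaround, keeping the original Option-of-String interface but persisting a Range<usize> instead of a borrowed slice, so the closure no longer refers into a value it owns:

```rust
use std::iter;
use std::ops::Range;

// Workaround sketch: persist indices into `text` rather than a `&str`
// borrowed from it, so the closure is no longer self-referential.
fn example(text: String) -> impl Iterator<Item = Option<String>> {
    let mut i = 0;
    let mut chunk: Option<Range<usize>> = None;
    iter::from_fn(move || {
        if i <= text.len() {
            let p_chunk = chunk.clone();
            chunk = Some(0..i);
            i += 1;
            // The slice is recreated from the indices only when needed.
            Some(p_chunk.map(|r| text[r].to_string()))
        } else {
            None
        }
    })
}

fn main() {
    let v: Vec<_> = example("ab".to_string()).collect();
    println!("{:?}", v);
}
```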
If you move the chunk Option into the closure, your code compiles. I can't quite answer why declaring chunk outside the closure results in a lifetime error for the borrow of text inside the closure, but the chunk Option looks superfluous anyways and the following code should be equivalent:
fn example(text: String) -> impl Iterator<Item = Option<String>> {
    let mut i = 0;
    iter::from_fn(move || {
        if i <= text.len() {
            let chunk = text[..i].to_string();
            i += 1;
            Some(Some(chunk))
        } else {
            None
        }
    })
}
Additionally, it seems unlikely that you really want an Iterator<Item = Option<String>> here instead of an Iterator<Item = String>, since the iterator never yields Some(None) anyways.
fn example(text: String) -> impl Iterator<Item = String> {
    let mut i = 0;
    iter::from_fn(move || {
        if i <= text.len() {
            let chunk = text[..i].to_string();
            i += 1;
            Some(chunk)
        } else {
            None
        }
    })
}
Note, you can also write this iterator without allocating a String for each chunk, if you take a &str as an argument and tie the lifetime of the output to the input argument:
fn example<'a>(text: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    let mut i = 0;
    iter::from_fn(move || {
        if i <= text.len() {
            let chunk = &text[..i];
            i += 1;
            Some(chunk)
        } else {
            None
        }
    })
}
This is an example taken from the Mutex documentation:
use std::sync::{Arc, Mutex};
use std::sync::mpsc::channel;
use std::thread;

const N: usize = 10;

fn main() {
    let data = Arc::new(Mutex::new(0));
    let (tx, rx) = channel();
    for _ in 0..N {
        let (data, tx) = (data.clone(), tx.clone());
        thread::spawn(move || {
            // snippet
        });
    }
    rx.recv().unwrap();
}
My question is where the snippet comment is. It is given as
let mut data = data.lock().unwrap();
*data += 1;
if *data == N {
    tx.send(()).unwrap();
}
The type of data is Arc<Mutex<usize>>, so when calling data.lock(), I assumed that the Arc is being automatically dereferenced and a usize is assigned to data. Why do we need a * in front of data to dereference it again?
The following code, which first dereferences the Arc and then proceeds with just a usize, also works in place of the snippet.
let mut data = *data.lock().unwrap();
data += 1;
if data == N {
    tx.send(()).unwrap();
}
Follow the docs. Starting with Arc<T>:
Does Arc::lock exist? No. Check Deref.
Deref::Target is T. Check Mutex<T>.
Does Mutex::lock exist? Yes. It returns LockResult<MutexGuard<T>>.
Where does unwrap come from? LockResult<T> is a synonym for Result<T, PoisonError<T>>. So it's Result::unwrap, which results in a MutexGuard<T>.
Therefore, data is of type MutexGuard<usize>.
So this is wrong:
so when calling data.lock(), I assumed that the Arc is being automatically dereferenced and a usize is assigned to data.
Thus the question is not why you can't assign directly, but how you're able to assign a usize value at all. Again, follow the docs:
data is a MutexGuard<usize>, so check MutexGuard<T>.
*data is a pointer dereference in a context that requires mutation. Look for an implementation of DerefMut.
It says that for MutexGuard<T>, it implements DerefMut::deref_mut(&mut self) -> &mut T.
Thus, *data goes through DerefMut::deref_mut and gives mutable access to the usize inside the guard.
Then we have your modified example. At this point, it should be clear that this is not at all doing the same thing: it's mutating a local variable that happens to contain the same value as the mutex. But because it's a local variable, changing it has absolutely no bearing on the contents of the mutex.
Thus, the short version is: the result of locking a mutex is a "smart pointer" wrapping the actual value, not the value itself. Thus you have to dereference it to access the value.
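A minimal illustration of that: the guard returned by lock() is the smart pointer, and the * goes through its Deref/DerefMut implementations to reach the wrapped value.

```rust
use std::sync::Mutex;

fn main() {
    let m = Mutex::new(0usize);
    {
        // `guard` is a MutexGuard<usize>, not a usize.
        let mut guard = m.lock().unwrap();
        // `*guard` goes through DerefMut to the usize inside the mutex.
        *guard += 1;
    } // `guard` is dropped here, releasing the lock.
    assert_eq!(*m.lock().unwrap(), 1);
}
```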
I needed to pass a resource between several functions which take a closure as an argument. The data was handled inside these closures, and I needed the changes made to the variable in one of them to be reflected in the rest.
The first thing I thought was to use Rc. I had previously used Arc to handle the data between different threads, but since these functions aren't running in different threads I chose Rc instead.
The most simplified code that I have, to show my doubts:
I used RefCell because I wanted to see whether this syntax would work the way I expected:
*Rc::make_mut(&mut rc_pref_temp)...
use std::sync::Arc;
use std::rc::Rc;
use std::sync::Mutex;
use std::cell::RefCell;
use std::cell::Cell;

fn main() {
    test2();
    println!("---");
    test();
}

#[derive(Debug, Clone)]
struct Prefe {
    name_test: RefCell<u64>,
}

impl Prefe {
    fn new() -> Prefe {
        Prefe {
            name_test: RefCell::new(3 as u64),
        }
    }
}

fn test2() {
    let mut prefe: Prefe = Prefe::new();
    let mut rc_pref = Rc::new(Mutex::new(prefe));
    println!("rc_pref Mutex: {:?}", rc_pref.lock().unwrap().name_test);

    let mut rc_pref_temp = rc_pref.clone();
    *rc_pref_temp.lock().unwrap().name_test.get_mut() += 1;
    println!("rc_pref_clone Mutex: {:?}", rc_pref_temp.lock().unwrap().name_test);
    *rc_pref_temp.lock().unwrap().name_test.get_mut() += 1;
    println!("rc_pref_clone Mutex: {:?}", rc_pref_temp.lock().unwrap().name_test);

    println!("rc_pref Mutex: {:?}", rc_pref.lock().unwrap().name_test);
}

fn test() {
    let mut prefe: Prefe = Prefe::new();
    let mut rc_pref = Rc::new(prefe);
    println!("rc_pref: {:?}", rc_pref.name_test);

    let mut rc_pref_temp = rc_pref.clone();
    *((*Rc::make_mut(&mut rc_pref_temp)).name_test).get_mut() += 1;
    println!("rc_pref_clone: {:?}", rc_pref_temp.name_test);
    *((*Rc::make_mut(&mut rc_pref_temp)).name_test).get_mut() += 1;
    println!("rc_pref_clone: {:?}", rc_pref_temp.name_test);

    println!("rc_pref: {:?}", rc_pref.name_test);
}
The code is simplified; the scenario where it is used is totally different. I note this to avoid comments like "you can lend a value to the function", because what interests me is why the cases shown work the way they do.
stdout:
rc_pref Mutex: RefCell { value: 3 }
rc_pref_clone Mutex: RefCell { value: 4 }
rc_pref_clone Mutex: RefCell { value: 5 }
rc_pref Mutex: RefCell { value: 5 }
---
rc_pref: RefCell { value: 3 }
rc_pref_clone: RefCell { value: 4 }
rc_pref_clone: RefCell { value: 5 }
rc_pref: RefCell { value: 3 }
About test()
I'm new to Rust so I don't know if this crazy syntax is the right way.
*((*Rc::make_mut(&mut rc_pref_temp)).name_test).get_mut() += 1;
When running test() you can see that the previous syntax works, in that it increases the value, but the increase does not affect the clones. I expected that with *Rc::make_mut(&mut rc_pref_temp)... the clones of a shared reference would reflect the same values.
If Rc has references to the same object, why do the changes to an object not apply to the rest of the clones? Why does this work this way? Am I doing something wrong?
Note: I use RefCell because in some tests I thought it might have something to do with the problem.
About test2()
I've got it working as expected using Mutex with Rc, but I do not know if this is the correct way. I have some idea of how Mutex and Arc work, but after using this syntax:
*Rc::make_mut(&mut rc_pref_temp)...
With the use of Mutex in test2(), I wonder whether Mutex is not only responsible for guarding access to the data but also in charge of reflecting the changes in all the cloned references.
Do the shared references actually point to the same object? I want to think they do, but with the above code where the changes are not reflected without the use of Mutex, I have some doubts.
You need to read and understand the documentation for functions you use before you use them. Rc::make_mut says, emphasis mine:
Makes a mutable reference into the given Rc.
If there are other Rc or Weak pointers to the same value, then
make_mut will invoke clone on the inner value to ensure unique
ownership. This is also referred to as clone-on-write.
See also get_mut, which will fail rather than cloning.
You have multiple Rc pointers because you called rc_pref.clone(). Thus, when you call make_mut, the inner value will be cloned and the Rc pointers will now be disassociated from each other:
use std::rc::Rc;

fn main() {
    let counter = Rc::new(100);
    let mut counter_clone = counter.clone();

    println!("{}", Rc::strong_count(&counter)); // 2
    println!("{}", Rc::strong_count(&counter_clone)); // 2

    *Rc::make_mut(&mut counter_clone) += 50;

    println!("{}", Rc::strong_count(&counter)); // 1
    println!("{}", Rc::strong_count(&counter_clone)); // 1

    println!("{}", counter); // 100
    println!("{}", counter_clone); // 150
}
The version with the Mutex works because it's completely different. You aren't calling a function which clones the inner value anymore. Of course, it doesn't make sense to use a Mutex when you don't have threads. The single-threaded equivalent of a Mutex is... RefCell!
I honestly don't know how you found Rc::make_mut; I've never even heard of it before. The module documentation for cell doesn't mention it, nor does the module documentation for rc.
I'd highly encourage you to take a step back and re-read through the documentation. The second edition of The Rust Programming Language has a chapter on smart pointers, including Rc and RefCell. Read the module-level documentation for rc and cell as well.
Here's what your code should look like. Note the usage of borrow_mut.
fn main() {
    let prefe = Rc::new(Prefe::new());
    println!("prefe: {:?}", prefe.name_test); // 3

    let prefe_clone = prefe.clone();

    *prefe_clone.name_test.borrow_mut() += 1;
    println!("prefe_clone: {:?}", prefe_clone.name_test); // 4

    *prefe_clone.name_test.borrow_mut() += 1;
    println!("prefe_clone: {:?}", prefe_clone.name_test); // 5

    println!("prefe: {:?}", prefe.name_test); // 5
}
This question already has an answer here:
Lifetime of variables passed to a new thread
(1 answer)
Closed 6 years ago.
Take this simple example where we're using an immutable list of vectors to calculate new values.
Given this working, single threaded example:
use std::collections::LinkedList;

fn calculate_vec(v: &Vec<i32>) -> i32 {
    let mut result: i32 = 0;
    for i in v {
        result += *i;
    }
    return result;
}

fn calculate_from_list(list: &LinkedList<Vec<i32>>) -> LinkedList<i32> {
    let mut result: LinkedList<i32> = LinkedList::new();
    for v in list {
        result.push_back(calculate_vec(v));
    }
    return result;
}

fn main() {
    let mut list: LinkedList<Vec<i32>> = LinkedList::new();
    // some arbitrary values
    list.push_back(vec![0, -2, 3]);
    list.push_back(vec![3, -4, 3]);
    list.push_back(vec![7, -10, 6]);

    let result = calculate_from_list(&list);

    println!("Here's the result!");
    for i in result {
        println!("{}", i);
    }
}
Assuming calculate_vec is a processor-intensive function, we may want to use multiple threads to run this. The following example works, but requires (what I think is) an unnecessary vector clone.
use std::collections::LinkedList;

fn calculate_vec(v: &Vec<i32>) -> i32 {
    let mut result: i32 = 0;
    for i in v {
        result += *i;
    }
    return result;
}

fn calculate_from_list(list: &LinkedList<Vec<i32>>) -> LinkedList<i32> {
    use std::thread;

    let mut result: LinkedList<i32> = LinkedList::new();
    let mut join_handles = LinkedList::new();
    for v in list {
        let v_clone = v.clone(); // <-- how to avoid this clone?
        join_handles.push_back(thread::spawn(move || calculate_vec(&v_clone)));
    }
    for j in join_handles {
        result.push_back(j.join().unwrap());
    }
    return result;
}

fn main() {
    let mut list: LinkedList<Vec<i32>> = LinkedList::new();
    // some arbitrary values
    list.push_back(vec![0, -2, 3]);
    list.push_back(vec![3, -4, 3]);
    list.push_back(vec![7, -10, 6]);

    let result = calculate_from_list(&list);

    println!("Here's the result!");
    for i in result {
        println!("{}", i);
    }
}
This example works, but only when the vector is cloned. Logically, I don't think that should be needed, since the vector is immutable: there is no reason each call to calculate_vec should need to allocate a new vector.
How could this simple example be multi-threaded without needing to clone the data before it's passed to the closure?
Update: here's a working example that uses Arc based on #ker's suggestion, although it does need to take ownership.
Note 1: I'm aware there are third-party libraries to handle threading, but I would be interested to know if this is possible using Rust's standard library.
Note 2: There are quite a few similar questions on threading, but the examples often involve threads writing to data, which isn't the case here.
There are multiple ways to solve your problem.
Move the vector into an Arc<LinkedList<Vec<i32>>> and clone that. After the calculation, you can use try_unwrap to get your LinkedList<Vec<i32>> back. This works with just the Rust standard library. Here's a working example that uses Arc, though LinkedList was replaced by Vec to allow indexing. Also note that the function needs to own the argument being passed to it in this case.
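A minimal sketch of that first option (with Vec instead of LinkedList, as noted, and a trivial sum standing in for calculate_vec):

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    // Share the data by cloning the Arc, not the vectors themselves.
    let data = Arc::new(vec![vec![0, -2, 3], vec![3, -4, 3]]);

    let mut handles = Vec::new();
    for i in 0..data.len() {
        let data = Arc::clone(&data);
        // Each thread reads `data[i]` through its own Arc clone.
        handles.push(thread::spawn(move || data[i].iter().sum::<i32>()));
    }

    let results: Vec<i32> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    println!("{:?}", results);

    // All other Arc clones are dropped once the threads are joined,
    // so we can reclaim ownership of the original data.
    let owned: Vec<Vec<i32>> = Arc::try_unwrap(data).unwrap();
    println!("{} vectors recovered", owned.len());
}
```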
Use the crossbeam crate to create threads that can reference their scope, freeing you from the need to do all that join_handles code by hand. This will have a minimal impact on your code, since it works exactly like you want.
crossbeam::scope(|scope| {
    for v in list {
        scope.spawn(|| calculate_vec(&v));
    }
});
Use the scoped_threadpool crate. It works just like crossbeam, but instead of creating one thread per task, it spreads the tasks over a limited number of threads. (thanks #delnan)
Use the rayon crate for direct data parallelism:
use rayon::prelude::*;

list.par_iter().map(|v| calculate_vec(&v)).collect()