As I understand it, acquiring a lock on a mutex and then immediately calling a function on the guarded structure, without declaring a separate variable for the MutexGuard, releases the guard once the function call is done.
My question is whether this is also the case when getting a lock within a loop declaration like so:
for ele in mtx.lock().await.clone() {
    // do something requiring lock on mtx
}
The expectation here would be that once the clone call completes, the lock on mtx is released and can be reacquired inside the loop. Is this the case? If not, why is this not the case?
No, this is not the case. Temporaries created in the iterator expression will live until the end of the for loop, and the mutex guard will only be dropped after the loop.
Temporaries are generally dropped at the end of the statement. You can see the full rules in the documentation on temporary scopes:
Apart from lifetime extension, the temporary scope of an expression is the smallest scope that contains the expression and is one of the following:
The entire function body.
A statement.
The body of an if, while or loop expression.
The else block of an if expression.
The condition expression of an if or while expression, or a match guard.
The expression for a match arm.
The second operand of a lazy boolean expression.
These rules cover a lot of subtle corner cases, so it's not easy to give a short summary of why the language is designed this way. For for loops in particular, it would be very annoying if temporaries created inside the iterator expression were dropped immediately, since then things like this would cause a borrow checker error:
for x in my_vec.iter().filter(|&&y| y != 0) {
    ...
}
If the compiler only kept the result of the iterator expression, the iterator returned by my_vec.iter() would be dropped immediately, which of course isn't desired.
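If you do want the lock released before the loop body runs, you can bind the clone in its own statement, so the guard's temporary scope ends there. A minimal sketch, assuming mtx is something like an Arc<tokio::sync::Mutex<Vec<i32>>> (the .await in the question suggests an async mutex):

async fn process(mtx: std::sync::Arc<tokio::sync::Mutex<Vec<i32>>>) {
    // The guard is a temporary of this statement only, so the lock is
    // released as soon as the clone has been made.
    let items = mtx.lock().await.clone();
    for ele in items {
        // The lock on mtx is free here and can be reacquired inside the loop.
        mtx.lock().await.push(ele + 1);
    }
}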
Consider the following scenario.
// imports assumed: std's HashMap plus crossbeam's scoped threads
// (the |_| closures and the trailing .unwrap() match that API)
use std::collections::HashMap;
use crossbeam::thread;

let mut map = HashMap::new();
map.insert(2, 5);
thread::scope(|s| {
    s.spawn(|_| {
        map.insert(1, 5);
    });
    s.spawn(|_| {
        let d = map.get(&2).unwrap();
    });
})
.unwrap();
This code cannot be compiled because we borrow the variable map mutably in the first spawned thread and borrow it again in the second. The classical solution is wrapping map in Arc<Mutex<...>>. But in the above code, we don't need to lock the whole hashmap: although the two threads access the same hashmap concurrently, they access completely different regions of it.
So I want to share map across threads without using a lock, but how can I achieve that? I'm also open to using unsafe Rust...
in the above code, we don't need to lock the whole hashmap
Actually, we do.
Every insert into the HashMap may trigger a reallocation if the map is at capacity at that point. Now imagine the following sequence of events:
The second thread calls get and retrieves a reference to the value (at runtime it's just an address).
The first thread calls insert.
The map gets reallocated; the old chunk of memory is now invalid.
The second thread dereferences the previously retrieved reference - boom, we get UB!
So, if you need to insert something in the map concurrently, you have to synchronize that somehow.
For the standard HashMap, the only way to do this is to lock the whole map, since a reallocation invalidates every element. If you used something like DashMap, which synchronizes access internally and therefore allows inserting through a shared reference, no locking would be required on your side. However, it can be more cumbersome in other parts of the API (e.g. you can't return a reference to a value inside the map - the get method returns an RAII wrapper, which is used for synchronization), and you can run into unexpected deadlocks.
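For illustration, here is a minimal sketch of the question's two accesses using the dashmap crate (assuming DashMap and std::thread::scope are available). DashMap does the fine-grained synchronization internally, so both threads can work through a shared reference:

use dashmap::DashMap;
use std::thread;

fn main() {
    let map = DashMap::new();
    map.insert(2, 5);
    thread::scope(|s| {
        s.spawn(|| {
            map.insert(1, 5); // insert works through &self
        });
        s.spawn(|| {
            if let Some(d) = map.get(&2) {
                // `d` is an RAII guard that derefs to the value,
                // not a plain reference into the map
                println!("{}", *d);
            }
        });
    });
}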
Does Rust drop the memory after a move is used to reassign a value?
What happens to the string "aaa" in this example?
let mut s = String::from("aaa");
s = String::from("bbb");
I'm guessing that the "aaa" String is dropped - it makes sense, as it is not used anymore. However, I cannot find anything confirming this in the docs. (for example, the Book only provides an explanation of what happens when we assign a new value using move).
I'm trying to wrap my head around the rules Rust uses to ensure memory safety, but I cannot find an explicit rule of what happens in such a situation.
Yes, assignment to a variable will drop the value that is being replaced.
Not dropping the replaced value would be a disaster - since Drop is frequently used for deallocation and other cleanup, if the value wasn't dropped, you'd end up with all sorts of leaks.
Move semantics are implicit here. s is assigned by moving in the String produced by String::from("bbb"). The original String stored in s is dropped as a side effect: replacing it leaves that value with no owner, so it is dropped as part of the assignment.
Per the destructor documentation (emphasis added):
When an initialized variable or temporary goes out of scope, its destructor is run, or it is dropped. Assignment also runs the destructor of its left-hand operand, if it's initialized. If a variable has been partially initialized, only its initialized fields are dropped.
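You can observe this directly with a small type that prints from its Drop impl (the Noisy type below is made up for this illustration):

struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        println!("dropping {}", self.0);
    }
}

fn main() {
    let mut s = Noisy("aaa");
    s = Noisy("bbb");         // prints "dropping aaa" as part of the assignment
    println!("end of main");  // "dropping bbb" is printed afterwards, when s goes out of scope
}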
The code below compiles only if the dyn Fn is UnwindSafe + RefUnwindSafe, because panic::catch_unwind requires that in order to catch the panic.
use std::panic;
use std::panic::{UnwindSafe, RefUnwindSafe};
fn launch_closure(f: Box<dyn Fn() + UnwindSafe + RefUnwindSafe>) {
    let result = panic::catch_unwind(|| {
        f();
    });
}
However, the std::thread::JoinHandle::join function is able to catch the panic even if the thread's closure is not UnwindSafe + RefUnwindSafe:
If the child thread panics, Err is returned with the parameter given
to panic!.
How?
I'd like to be able to know if my closure panicked, but UnwindSafe + RefUnwindSafe is too restrictive; I cannot use Condvar, for example.
thread::spawn wraps the closure in an AssertUnwindSafe to tell the compiler that it knows that the given closure is unwind-safe:
let try_result = panic::catch_unwind(panic::AssertUnwindSafe(|| {
    crate::sys_common::backtrace::__rust_begin_short_backtrace(f)
}));
So, what is unwind safety and how can thread::spawn make that assertion?
From the documentation for UnwindSafe:
In Rust a function can “return” early if it either panics or calls a function which transitively panics. This sort of control flow is not always anticipated, and has the possibility of causing subtle bugs through a combination of two critical components:
A data structure is in a temporarily invalid state when the thread panics.
This broken invariant is then later observed.
A type is not unwind safe if both of these can be true.
Types like Mutex and RwLock are unwind safe because they use poisoning to protect you from broken invariants. If a panic occurs in another thread that holds a lock on a Mutex, the mutex becomes poisoned and you must explicitly call PoisonError::into_inner to access the possibly-inconsistent data. If you cause a bug by making assumptions about a poisoned mutex, then that's your own responsibility and the Rust type system can't help you there.
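A minimal sketch of what poisoning looks like in practice (names and values are made up for the example):

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));

    let d = Arc::clone(&data);
    let _ = thread::spawn(move || {
        let _guard = d.lock().unwrap();
        panic!("boom"); // panicking while holding the lock poisons the mutex
    })
    .join();

    match data.lock() {
        Ok(guard) => println!("not poisoned: {:?}", *guard),
        Err(poisoned) => {
            // explicit opt-in to the possibly-inconsistent data
            let guard = poisoned.into_inner();
            println!("recovered from poison: {:?}", *guard);
        }
    }
}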
Mutable references and non-sharable types with interior mutability like RefCell are not unwind safe because they don't offer such protection. However, they are also not Sync, so you can't get into a situation where you use them after another thread has panicked while holding a reference.
The final piece of the puzzle is that thread::spawn creates a new thread stack. That means it can guarantee that the closure is called first in the stack so nothing in the same thread as the closure can access its environment after a panic is caught.
While thread::spawn can't guarantee that the closure is unwind safe in the general case, it knows that:
The closure is Send (by its own bounds), so it cannot contain references to non-Sync types.
The non-unwind safe types in std (mutable references and cell types) are also not Sync, which means nothing can access a non-unwind safe type from outside the thread.
Nothing in the same thread as the call to the closure can access its environment after the panic is caught.
So it is safe for the closure to be unwound because there is no possibility of broken invariants being unintentionally observed after a panic.
It is of course possible for a closure to make use of a user-defined type that is not unwind safe but is Sync, in which case this assumption would turn out to be incorrect. However, this would require unsafe code either from a third party crate or by the same author as the closure itself. It is always the responsibility of the author of unsafe code to ensure memory safety. It is not sound for a type to be Sync if a panic in another thread could cause UB. It is up to the author to decide if logical bugs are acceptable, but memory unsafety never is.
So... can you use the same trick in your code? Unfortunately, you probably cannot. Since you don't have control over the caller of launch_closure, you can't make the guarantee that a panic won't cause invalid state to be observed by callers in the same thread.
This question already has an answer here:
How can I pass a reference to a stack variable to a thread?
(1 answer)
Closed 5 years ago.
I have the following code:
fn main() {
    let message = "Can't shoot yourself in the foot if you ain't got no gun";
    let t1 = std::thread::spawn(|| {
        println!("{}", message);
    });
    t1.join();
}
rustc gives me the compilation error:
closure may outlive the current function, but it borrows message, which is owned by the current function
This is wrong since:
The function it's referring to here is (I believe) main. The threads will be killed (or hit UB) once main has finished executing.
The function it's referring to clearly invokes .join() on said thread.
Is the previous code unsafe in any way? If so, why? If not, how can I get the compiler to understand that?
Edit: Yes, I am aware I can just move the message in this case; my question is specifically asking how I can pass a reference to it, preferably without having to heap-allocate it, similarly to how this C++ code would do it:
std::thread([&message]() -> void {/* etc */});
(Just to clarify, what I'm actually trying to do is access a thread-safe data structure from two threads... other solutions to the problem that don't involve making a copy would also help.)
Edit 2: The question this has been marked as a duplicate of is 5 pages long, and as such I'd consider it an invalid question in its own right.
Is the previous code 'unsafe' in any way? If so, why?
The goal of Rust's type-checking and borrow-checking system is to disallow unsafe programs, but that does not mean that all programs that fail to compile are unsafe. In this specific case, your code is not unsafe, but it does not satisfy the type constraints of the functions you are using.
The function it's referring to clearly invokes .join() on said thread.
But there is nothing from a type-checker standpoint that requires the call to .join. A type-checking system (on its own) can't enforce that a function has or has not been called on a given object. You could just as easily imagine an example like
let message = "Can't shoot yourself in the foot if you ain't got no gun";
let mut handles = vec![];
for i in 0..3 {
    let t1 = std::thread::spawn(|| {
        println!("{} {}", message, i);
    });
    handles.push(t1);
}
for t1 in handles {
    t1.join();
}
where a human can tell that each thread is joined before main exits. But a typechecker has no way to know that.
The function it's referring to here is (I believe) main. So presumably those threads will be killed when main exits anyway (and them running after main exits is UB).
From the standpoint of the checkers, main is just another function. There is no special knowledge that this specific function can have extra behavior. If this were any other function, the thread would not be auto-killed. Expanding on that, even for main there is no guarantee that the child threads will be killed instantly. If it takes 5ms for the child threads to be killed, that is still 5ms where the child threads could be accessing the content of a variable that has gone out of scope.
To gain the behavior that you are looking for with this specific snippet (as-is), the lifetime of the closure would have to be tied to the lifetime of the t1 object, such that the closure was guaranteed to never be used after the handles have been cleaned up. While that is certainly an option, it is significantly less flexible in the general case. Because it would be enforced at the type level, there would be no way to opt out of this behavior.
You could consider using crossbeam, specifically crossbeam::scope's .spawn, which enforces this lifetime requirement where the standard library does not, meaning a thread must stop execution before the scope is finished.
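For example, a minimal sketch using crossbeam's scoped threads (assuming the crossbeam crate is available; the API details vary slightly between versions):

fn main() {
    let message = "Can't shoot yourself in the foot if you ain't got no gun";
    crossbeam::thread::scope(|s| {
        s.spawn(|_| {
            println!("{}", message); // borrows message; no move needed
        });
    })
    .unwrap(); // every spawned thread is joined here, before message goes away
}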
In your specific case, your code is fine as long as you transfer ownership of message to the child thread instead of borrowing it from the main function, because then there is no risk of unsafe access with or without your call to .join. Your code works if you change
let t1 = std::thread::spawn(|| {
to
let t1 = std::thread::spawn(move || {
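so that the complete example becomes:

fn main() {
    let message = "Can't shoot yourself in the foot if you ain't got no gun";
    let t1 = std::thread::spawn(move || {
        // message is moved (copied, since &str is Copy) into the closure
        println!("{}", message);
    });
    t1.join();
}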
function fn(args) {
    var a = 'something';
    doSomething('dummy', function() {
    });
}
fn();
In this code, does the anonymous callback become a closure, or does it just exit after execution? If it becomes a closure, how can I get the memory back, given that it always has access to fn's activation object?
It will only be a closure if the lambda uses the enclosing function's (fn) local variables or parameters, e.g. a or args.
Re: Memory recovery - Don't worry about it* - the GC will know when references are no longer reachable and collect them (whether they are used in closures or not). See also here.
* Don't worry too much