function fn(args){
var a= 'something';
doSomething('dummy',function(){
});
}
fn();
In this code, is anonymous callback become closure or just exit after execution? If it become become a closure, how can i get back memory because it always has access to fn's activation object.
It will only be a closure if the lambda uses the enclosing functions'(fn) local variables or parameter, e.g. a or args.
Re: Memory recovery - Don't worry about it* - the GC will know when references are no longer reachable and collect them (whether they are used in closures or not). See also here.
* Don't worry too much
Related
Consider the following scenarios.
let mut map = HashMap::new();
map.insert(2,5);
thread::scope(|s|{
s.spawn(|_|{
map.insert(1,5);
});
s.spawn(|_|{
let d = map.get(&2).unwrap();
});
}).unwrap();
This code cannot be compiled because we borrow the variable map mutably in h1 and borrow again in h2. The classical solution is wrapping map by Arc<Mutex<...>>. But in the above code, we don't need to lock whole hashmap. Because, although two threads concurrently access to same hashmap, they access completely different region of it.
So I want to share map through thread without using lock, but how can I acquire it? I'm also open to use unsafe rust...
in the above code, we don't need to lock whole hashmap
Actually, we do.
Every insert into the HashMap may possibly trigger its reallocation, if the map is at that point on its capacity. Now, imagine the following sequence of events:
Second thread calls get and retrieves reference to the value (at runtime it'll be just an address).
First thread calls insert.
Map gets reallocated, the old chunk of memory is now invalid.
Second thread dereferences the previously-retrieved reference - boom, we get UB!
So, if you need to insert something in the map concurrently, you have to synchronize that somehow.
For the standard HashMap, the only way to do this is to lock the whole map, since the reallocation invalidates every element. If you used something like DashMap, which synchronizes access internally and therefore allows inserting through shared reference, this would require no locking from your side - but can be more cumbersome in other parts of API (e.g. you can't return a reference to the value inside the map - get method returns RAII wrapper, which is used for synchronization), and you can run into unexpected deadlocks.
In my understanding gaining a lock on a mutex and then immediately calling a function on the guarded structure without declaring a separate variable for the MutexGuard releases this guard once the function call is done.
My question is whether this is also the case when getting a lock within a loop declaration like so:
for ele in mtx.lock().await.clone() {
// do something requiring lock on mtx
}
The expectation here would be that once the clone call completes, the lock on mtx is released and can be reacquired inside the loop. Is this the case? If not, why is this not the case?
No, this is not the case. Temporaries created in the iterator expression will live until the end of the for loop, and the mutex guard will only be dropped after the loop.
Temporaries are generally dropped at the end of the statement. You can see the full rules in the documentation on temporary scopes:
Apart from lifetime extension, the temporary scope of an expression is the smallest scope that contains the expression and is one of the following:
The entire function body.
A statement.
The body of a if, while or loop expression.
The else block of an if expression.
The condition expression of an if or while expression, or a match guard.
The expression for a match arm.
The second operand of a lazy boolean expression.
These rules cover a lot of subtle corner cases, so it's not easy to give a short summary why the language is designed this way. For for loops in particular, it would be very annoying if temporaries created inside the iterator expression would get immediately dropped, since then things like this would cause a borrow checker error:
for x in my_vec.iter().filter(|&&y| y != 0) {
...
}
If the compiler would only keep the result of the iterator expression, the iterator returned by my_vec.iter() would be immediately dropped, which of course isn't desired.
The code below compiles only if dyn Fn is UnwindSafe + RefUnwindSafe, because panic::catch_unwind requires it to be able to catch the panic.
use std::panic;
use std::panic::{UnwindSafe, RefUnwindSafe};
fn launch_closure(f: Box<dyn Fn() + UnwindSafe + RefUnwindSafe>) {
let result = panic::catch_unwind(|| {
f();
});
}
However, std::thread::JoinHandle::join function is able to catch the panic even if the thread closure is not UnwindSafe + RefUnwindSafe:
If the child thread panics, Err is returned with the parameter given
to panic!.
How?
I'd like to be able to know if my closure panicked, but UnwindSafe + RefUnwindSafe is too restrictive, I cannot use CondVar for example.
Playground
thread::spawn wraps the closure in an AssertUnwindSafe to tell the compiler that it knows that the given closure is unwind-safe:
let try_result = panic::catch_unwind(panic::AssertUnwindSafe(|| {
crate::sys_common::backtrace::__rust_begin_short_backtrace(f)
}));
So, what is unwind safety and how can thread::spawn make that assertion?
From the documentation for UnwindSafe:
In Rust a function can “return” early if it either panics or calls a function which transitively panics. This sort of control flow is not always anticipated, and has the possibility of causing subtle bugs through a combination of two critical components:
A data structure is in a temporarily invalid state when the thread panics.
This broken invariant is then later observed.
A type is not unwind safe if both of these can be true.
Types like Mutex and RwLock are unwind safe because they use poisoning to protect you from broken invariants. If a panic occurs in another thread that has a lock on a Mutex then it becomes poisoned and you must explicitly call PoisonError::into_inner to access the possibly-inconsistent data. If you cause a bug by making assumptions about a poisoned mutex, then that's your own responsibility and the Rust type system can't help you there.
Mutable references and non-sharable types with interior mutability like RefCell are not unwind safe because they don't offer such protection. However, they are also not Sync, so you can't get into a situation where you use them after another thread has panicked while holding a reference.
The final piece of the puzzle is that thread::spawn creates a new thread stack. That means it can guarantee that the closure is called first in the stack so nothing in the same thread as the closure can access its environment after a panic is caught.
While thread::spawn can't guarantee that the closure is unwind safe in the general case, it knows that:
The closure is Send (by its own bounds), so it cannot contain references to non-Sync types.
The non-unwind safe types in std (mutable references and cell types) are also not Sync, which means nothing can access a non-unwind safe type from outside the thread
Nothing in the same thread as the call to the closure can access its environment after the panic is caught.
So it is safe for the closure to be unwound because there is no possibility of broken invariants being unintentionally observed after a panic.
It is of course possible for a closure to make use of a user-defined type that is not unwind safe but is Sync, in which case this assumption would turn out to be incorrect. However, this would require unsafe code either from a third party crate or by the same author as the closure itself. It is always the responsibility of the author of unsafe code to ensure memory safety. It is not sound for a type to be Sync if a panic in another thread could cause UB. It is up to the author to decide if logical bugs are acceptable, but memory unsafety never is.
So... can you use the same trick in your code? Unfortunately, you probably cannot. Since you don't have control over the caller of launch_closure, you can't make the guarantee that a panic won't cause invalid state to be observed by callers in the same thread.
scenario 1
function a(callback){
console.log("not calling callback");
}
a(function(callback_res){
console.log("callback_res", callback_res);
});
scenario 2
function a(callback){
console.log("calling callback");
callback(true);
}
a(function(callback_res){
console.log("callback_res", callback_res);
});
will function a be waiting for callback and will not terminate in scenario 1? However program gets terminated in both scenario.
The problem is not safety but intention. If a function accepts a callback, it's expected that it will be called at some point. If it ignores the argument it accepts, the signature is misleading.
This is a bad practice because function signature gives false impression about how a function works.
It also may cause parameter is unused warning in linters.
will function a be waiting for callback and will not terminate in scenario 1?
The function doesn't contain asynchronous code and won't wait for anything. The fact that callbacks are commonly used in asynchronous control flow doesn't mean that they are asynchronous per se.
will function a be waiting for callback and will not terminate in scenario 1?
No. There is nothing in the code you show that waits for a callback to be called.
Passing a callback to a function is just like passing an integer to a function. The function is free to use it or not and it doesn't mean anything more than that to the interpreter. the JS interpreter has no special logic to "wait for a passed callback to get called". That has no effect one way or the other on when the program terminates. It's just a function argument that the called function can decide whether to use or ignore.
As another example, it used to be common to pass two callbacks to a function, one was called upon success and one was called upon error:
function someFunc(successFn, errorFn) {
// do some operation and then call either successFn or errorFn
}
In this case, it was pretty clear that one of these was going to get called and the other was not. There's no need (from the JS interpreter's point of view) to call a passed callback. That's purely the prerogative of the logic of your code.
Now, it would not be a good practice to design a function that shows a callback in the calling signature and then never, ever call that callback. That's just plain wasteful and a misleading design. There are many cases of callbacks that are sometimes called and sometimes not depending upon circumstances. Array.prototype.forEach is one such example. If you call array.forEach(fn) on an empty array, the callback is never called. But, of course, if you call it on a non-empty array, it is called.
If your function carries out asynchronous operations and the point of the callback is to communicate when the asynchronous operation is done and whether it concluded with an error or a value, then it would generally be bad form to have code paths that would never call the callback because it would be natural for a caller to assume the callback is doing to get called eventually. I can imagine there might be some exceptions to this, but they better be documented really well with the doc/comments for the function.
For asynchronous operations, your question reminds me somewhat of this: Do never resolved promises cause memory leak? which might be useful to read.
I'm wrapping a Rust object to be used from Lua. I need the object to be destroyed when neither Rust code nor Lua still has a reference to it, so the obvious (to me) solution is to use Rc<T>, stored in Lua-managed memory.
The Lua API (I'm using rust-lua53 for now) lets you allocate a chunk of memory and attach methods and a finalizer to it, so I want to store an Rc<T> into that chunk of memory.
My current attempt looks like. First, creating an object:
/* Allocate a block of uninitialized memory to use */
let p = state.new_userdata(mem::size_of::<Rc<T>>() as size_t) as *mut Rc<T>;
/* Make a ref-counted pointer to a Rust object */
let rc = Rc::<T>::new(...);
/* Store the Rc */
unsafe { ptr::write(p, rc) };
And in the finaliser:
let p: *mut Rc<T> = ...; /* Get a pointer to the item to finalize */
unsafe { ptr::drop_in_place(p) }; /* Release the object */
Now this seems to work (as briefly tested by adding a println!() to the drop method). But is it correct and safe (as long as I make sure it's not accessed after finalization)? I don't feel confident enough in unsafe Rust to be sure that it's ok to ptr::write an Rc<T>.
I'm also wondering about, rather than storing an Rc<T> directly, storing an Option<Rc<T>>; then instead of drop_in_place() I would ptr::swap() it with None. This would make it easy to handle any use after finalization.
Now this seems to work (as briefly tested by adding a println!() to the drop method). But is it correct and safe (as long as I make sure it's not accessed after finalisation)? I don't feel confident enough in unsafe Rust to be sure that it's ok to ptr::write an Rc<T>.
Yes, you may ptr::write any Rust type to any memory location. This "leaks" the Rc<T> object, but writes a bit-equivalent to the target location.
When using it, you need to guarantee that no one modified it outside of Rust code and that you are still in the same thread as the one where it was created. If you want to be able to move across threads, you need to use Arc.
Rust's thread safety cannot protect you here, because you are using raw pointers.
I'm also wondering about, rather than storing an Rc<T> directly, storing an Option<Rc<T>>; then instead of drop_in_place() I would ptr::swap() it with None. This would make it easy to handle any use after finalisation.
The pendant to ptr::write is ptr::read. So if you can guarantee that no one ever tries to ptr::read or drop_in_place() the object, then you can just call ptr::read (which returns the object) and use that object as you would use any other Rc<T> object. You don't need to care about dropping or anything, because now it's back in Rust's control.
You should also be using new_userdata_typed instead of new_userdata, since that takes the memory handling off your hands. There are other convenience wrapper functions ending with the postfix _typed for most userdata needs.
Your code will work; of course, note that the drop_in_place(p) will just decrease the counter of the Rc and only drop the contained T if and only if it was the last reference, which is the correct action.