I want to lock multiple pages before the executor runs an action. How can I do that?
First, from the code it looks like it's possible to hit the same entry multiple times. This means you'll immediately deadlock, as the second access will try to lock a mutex which is already locked. You should probably use or_insert_with in order to only acquire the lock if there was no entry for that key.
Second, a MutexGuard (whether tokio's or the stdlib's) borrows the Mutex. Since the Mutex is only borrowed for the span of the iteration, the MutexGuard can't outlive it. The way to fix that is to extend the Mutex's lifetime, e.g. by adding it to a collection that lives outside the iteration and then borrowing from that. At that point you probably need two iterations, because if you push a lock into the "owner" collection and then immediately borrow it, the collection will still be borrowed on the next iteration and you won't be able to insert a new item:
let mut locks = vec![];
let mut page_map = HashMap::new();
for key in keys {
let page = store.get(key).unwrap();
locks.push((key, page.clone()));
}
for (key, lock) in &locks {
page_map.entry(key).or_insert_with(|| lock.lock());
}
Although then you're hitting the third issue (and a second likelihood of deadlock): if this can be executed twice concurrently, two calls may iterate key_feature_map in different orders, meaning one call might lock page A then try to acquire page B, while a second call has already locked page B and now tries to acquire page A.
Both calls are waiting on the other's lock, so they'll be deadlocked.
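The classic fix for this ordering deadlock is to impose a single global lock order. Here is a minimal sketch (the types and names are illustrative, not the asker's real code): sort and deduplicate the requested keys before locking, so the same-entry deadlock and the cross-caller deadlock both become impossible.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Sort and deduplicate so every caller acquires page locks in the same
// global order, and never tries to lock the same page twice.
fn lock_order(requested: &[u32]) -> Vec<u32> {
    let mut keys = requested.to_vec();
    keys.sort();   // same global order for every caller
    keys.dedup();  // never lock the same page twice
    keys
}

fn main() {
    let store: HashMap<u32, Arc<Mutex<String>>> = [
        (1, Arc::new(Mutex::new("page A".to_string()))),
        (2, Arc::new(Mutex::new("page B".to_string()))),
    ]
    .into_iter()
    .collect();

    // One caller asks for (2, 1, 2), another for (1, 2): both end up
    // locking page 1 before page 2, so they cannot deadlock each other.
    let order = lock_order(&[2, 1, 2]);
    assert_eq!(order, vec![1, 2]);

    // Acquire the locks in the agreed order, holding all the guards.
    let _guards: Vec<_> = order.iter().map(|k| store[k].lock().unwrap()).collect();
}
```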
Consider the following scenario.
let mut map = HashMap::new();
map.insert(2,5);
thread::scope(|s|{
s.spawn(|_|{
map.insert(1,5);
});
s.spawn(|_|{
let d = map.get(&2).unwrap();
});
}).unwrap();
This code does not compile because we borrow the variable map mutably in the first spawned thread and borrow it again in the second. The classical solution is wrapping map in Arc<Mutex<...>>. But in the code above we don't need to lock the whole hashmap: although two threads concurrently access the same hashmap, they access completely disjoint regions of it.
So I want to share map across threads without using a lock — how can I achieve that? I'm also open to using unsafe Rust...
in the above code, we don't need to lock whole hashmap
Actually, we do.
Every insert into the HashMap may possibly trigger its reallocation, if the map is at that point on its capacity. Now, imagine the following sequence of events:
Second thread calls get and retrieves reference to the value (at runtime it'll be just an address).
First thread calls insert.
Map gets reallocated, the old chunk of memory is now invalid.
Second thread dereferences the previously-retrieved reference - boom, we get UB!
So, if you need to insert something in the map concurrently, you have to synchronize that somehow.
For the standard HashMap, the only way to do this is to lock the whole map, since the reallocation invalidates every element. If you used something like DashMap, which synchronizes access internally and therefore allows inserting through shared reference, this would require no locking from your side - but can be more cumbersome in other parts of API (e.g. you can't return a reference to the value inside the map - get method returns RAII wrapper, which is used for synchronization), and you can run into unexpected deadlocks.
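The whole-map locking described above can be sketched with nothing but the standard library. Note this uses std::thread::scope (stable since Rust 1.63) rather than the crossbeam-style scope in the question's snippet:

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::thread;

// Locking the whole map is the only safe option for std's HashMap:
// any insert may reallocate and invalidate every outstanding reference.
fn concurrent_insert_and_read() -> usize {
    let map = Mutex::new(HashMap::from([(2, 5)]));
    thread::scope(|s| {
        s.spawn(|| {
            map.lock().unwrap().insert(1, 5);
        });
        s.spawn(|| {
            // Copy the value out while holding the lock; never keep a
            // reference into the map past the guard.
            let d = *map.lock().unwrap().get(&2).unwrap();
            assert_eq!(d, 5);
        });
    });
    map.into_inner().unwrap().len()
}

fn main() {
    // Both entries are present regardless of thread interleaving.
    assert_eq!(concurrent_insert_and_read(), 2);
}
```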
Consider the example below. It seems like this might be UB by definition, but it remains unclear to me whether it is.
use std::{fs, thread};

fn main() {
let mut threads = Vec::new();
for _ in 0..100 {
let thread = thread::spawn(move || {
fs::read_to_string("./config/init.json")
.unwrap()
.trim()
.to_string()
});
threads.push(thread);
}
for handler in threads {
handler.join().unwrap();
}
}
On most operating systems only individual read operations are guaranteed to be atomic. read_to_string may perform multiple distinct reads, which means that it's not guaranteed to be atomic between multiple threads/processes. If another process is modifying this file concurrently, read_to_string could return a mixture of data from before and after the modification. In other words, each read_to_string operation is not guaranteed to return an identical result, and some may even fail while others succeed if another process deletes the file while the program is running.
However, none of this behavior is classified as "undefined." Absent hardware problems, you are guaranteed to get back a std::io::Result<String> in a valid state, which is something you can reason about. Once UB is invoked, you can no longer reason about the state of the program.
By way of analogy, consider a choose your own adventure book. At the end of each segment you'll have some instructions like "If you choose to go into the cave, go to page 53. If you choose to take the path by the river, go to page 20." Then you turn to the appropriate page and keep reading. This is a bit like Result -- if you have an Ok you do one thing, but if you have an Err you do another thing.
Once undefined behavior is invoked, this kind of choice no longer makes sense because the program is in a state where the rules of the language no longer apply. The program could do anything, including deleting random files from your hard drive. In the book analogy, the book caught fire. Trying to follow the rules of the book no longer makes any sense, and you hope the book doesn't burn your house down with it.
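The book analogy maps directly onto branching on a Result. A tiny sketch (the path is made up for illustration — reading a file that doesn't exist is a defined, reasoned-about Err, not UB):

```rust
use std::fs;

fn main() {
    // Either branch is a defined state, like turning to page 53 or page 20.
    let result = fs::read_to_string("./definitely-not-a-real-file.json");
    match &result {
        Ok(contents) => println!("read {} bytes", contents.len()),
        Err(e) => println!("failed (a perfectly well-defined outcome): {}", e),
    }
    assert!(result.is_err()); // the file doesn't exist, so we hit the Err page
}
```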
In Rust you're not supposed to be able to invoke UB without using the unsafe keyword, so if you don't see that keyword anywhere then UB isn't on the table.
I was thinking about the Rust async infrastructure, and at the heart of the API lies the Future trait, which is:
pub trait Future {
type Output;
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}
According to the docs
The core method of future, poll, attempts to resolve the future into a final value. This method does not block if the value is not ready. Instead, the current task is scheduled to be woken up when it’s possible to make further progress by polling again. The context passed to the poll method can provide a Waker, which is a handle for waking up the current task.
This implies that the Context value passed to .poll() (particularly the waker) needs some way to refer to the pinned future &mut self in order to wake it up. However &mut implies that the reference is not aliased. Am I misunderstanding something or is this a "special case" where aliasing &mut is allowed? If the latter, are there other such special cases besides UnsafeCell?
needs some way to refer to the pinned future &mut self in order to wake it up.
No, it doesn't. The key thing to understand is that a “task” is not the future: it is the future and what the executor knows about it. What exactly the waker mutates is up to the executor, but it must be something that isn't the Future. I say “must” not just because of Rust mutability rules, but because Futures don't contain any state that says whether they have been woken. So, there isn't anything there to usefully mutate; 100% of the bytes of the future's memory are dedicated to the specific Future implementation and none of the executor's business.
Well, on the very next page of the book you will notice that Task contains a boxed future, and the waker is created from a reference to the task. So there is a reference to the future held from the task, albeit indirectly.
OK, let's look at those data structures. Condensed:
struct Executor {
ready_queue: Receiver<Arc<Task>>,
}
struct Task {
future: Mutex<Option<BoxFuture<'static, ()>>>,
task_sender: SyncSender<Arc<Task>>,
}
The reference to the task is an Arc<Task>, and the future itself is inside a Mutex (interior mutability) in the task. Therefore,
It is not possible to get an &mut Task from the Arc<Task>, because Arc doesn't allow that.
The future is in a Mutex which does run-time checking that there is at most one mutable reference to it.
The only things you can do with an Arc<Task> are
clone it and send it
get & access to the future in a Mutex (which allows requesting run-time-checked mutation access to the Future)
get & access to the task_sender (which allows sending things to ready_queue).
So, in this case, when the waker is called, it sort-of doesn't even mutate anything specific to the Task at all: it makes a clone of the Arc<Task> (which increments an atomic reference count stored next to the Task) and puts it on the ready_queue (which mutates storage shared between the Sender and Receiver).
Another executor might indeed have task-specific state in the Task that is mutated, such as a flag marking that the task is already woken and doesn't need to be woken again. That flag might be stored in an AtomicBoolean field in the task. But still, it does not alias with any &mut of the Future because it's not part of the Future, but the task.
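That "already woken" flag can be sketched in a few lines. This is not the book's executor, just an illustration of the point that the flag lives in the Task, not the Future, so setting it never aliases the future's &mut:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Executor-side task state; the future itself (elided) would sit alongside.
struct Task {
    woken: AtomicBool,
}

// Returns true if this call transitioned the task from not-woken to woken,
// i.e. if the caller should enqueue the task on the ready queue.
fn wake(task: &Arc<Task>) -> bool {
    // swap returns the previous value: `false` means we were first.
    !task.woken.swap(true, Ordering::AcqRel)
}

fn main() {
    let task = Arc::new(Task { woken: AtomicBool::new(false) });
    assert!(wake(&task));  // first wake: needs enqueueing
    assert!(!wake(&task)); // second wake: already pending, no-op
}
```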
All that said, there actually is something special about Futures and noalias — but it's not about executors, it's about Pin. Pinning explicitly allows the pinned type to contain “self-referential” pointers into itself, so Rust does not declare noalias for Pin<&mut T>. However, exactly what the language rules around this are is still not quite rigorously specified; the current situation is just considered a kludge so that async functions can be correctly compiled, I think.
There is no such special case. Judging from your comment about the Rust executor, you are misunderstanding how interior mutability works.
The example in the Rust book uses an Arc wrapped Task structure, with the future contained in a Mutex. When the task is run, it locks the mutex and obtains the singular &mut reference that's allowed to exist.
Now look at how the example implements wake_by_ref, and notice that it never touches the future at all. Indeed, that function could not lock the future anyway, as the caller already holds the lock. It cannot safely obtain a &mut reference, and the API prevents you from doing so; therefore, there is no issue.
The restriction for UnsafeCell and its wrappers is that only one &mut reference may exist for an object at any point in time. However, multiple & immutable references may exist to the UnsafeCell or structures containing it just fine - that is the point of interior mutability.
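That last point is easy to demonstrate with Cell, the simplest UnsafeCell wrapper: multiple shared references coexist, and mutation goes through `&`, never `&mut`.

```rust
use std::cell::Cell;

fn main() {
    let counter = Cell::new(0);
    // Two & references to the same Cell coexist perfectly legally;
    // that is the whole point of interior mutability.
    let a: &Cell<i32> = &counter;
    let b: &Cell<i32> = &counter;
    a.set(a.get() + 1); // mutation through a shared reference
    b.set(b.get() + 1); // and through another one
    assert_eq!(counter.get(), 2);
}
```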
I'm currently moving a project from Python to Rust, and I ran into a small problem with the concept of ownership in Rust.
My main thread creates an empty vector; then I start another thread that runs forever (a loop) and adds data to the vector every second. Up to this point, no problem.
The problem comes when, from my main thread, I want to read the vector (print it or use the data inside it). Since ownership has been moved, I can't use the vector in my main thread.
let mut prices: Vec<f64> = Vec::new();
thread::spawn(move || {
start_filling_prices(&mut prices);
});
println!("{:?}", prices);
^^^^^^^ value borrowed here after move
I've tried using a Mutex, Arc, Box, but the same problem occurs every time (or maybe I used the wrong one).
So, does someone know how to use the same variable in different threads at the same time?
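A common pattern for this (a sketch, not taken from the original thread — and with the infinite loop bounded so it terminates) is Arc<Mutex<Vec<...>>>: clone the Arc into the spawned thread, and each side locks on access.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

fn main() {
    let prices: Arc<Mutex<Vec<f64>>> = Arc::new(Mutex::new(Vec::new()));

    // The clone is a second handle to the same vector; it is moved into
    // the thread, while the original `prices` stays usable in main.
    let writer = Arc::clone(&prices);
    let handle = thread::spawn(move || {
        for i in 0..3 {
            writer.lock().unwrap().push(i as f64);
            thread::sleep(Duration::from_millis(10));
        }
    });

    handle.join().unwrap();
    // The main thread reads through its own handle.
    let snapshot = prices.lock().unwrap().clone();
    assert_eq!(snapshot, vec![0.0, 1.0, 2.0]);
    println!("{:?}", snapshot);
}
```

In a real program the writer would loop forever, and the main thread would lock briefly whenever it wants a snapshot, rather than joining first.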
I have the following code:
fn main() {
let message = "Can't shoot yourself in the foot if you ain't got no gun";
let t1 = std::thread::spawn(|| {
println!("{}", message);
});
t1.join();
}
rustc gives me the compilation error:
closure may outlive the current function, but it borrows message, which is owned by the current function
This is wrong since:
The function it's referring to here is (I believe) main. The threads will be killed, or their continued execution would be UB, once main finishes executing.
The function it's referring to clearly invokes .join() on said thread.
Is the previous code unsafe in any way? If so, why? If not, how can I get the compiler to understand that?
Edit: Yes I am aware I can just move the message in this case, my question is specifically asking how can I pass a reference to it (preferably without having to heap allocate it, similarly to how this code would do it:
std::thread([&message]() -> void {/* etc */});
(Just to clarify, what I'm actually trying to do is access a thread safe data structure from two threads... other solutions to the problem that don't involve making the copy work would also help).
Edit 2: The question this has been marked as a duplicate of is five pages long; as such, I'd consider it an invalid question in its own right.
Is the previous code 'unsafe' in any way? If so, why?
The goal of Rust's type-checking and borrow-checking system is to disallow unsafe programs, but that does not mean that all programs that fail to compile are unsafe. In this specific case, your code is not unsafe, but it does not satisfy the type constraints of the functions you are using.
The function it's referring to clearly invokes .join() on said thread.
But there is nothing, from a type-checker's standpoint, that requires the call to .join. A type-checking system (on its own) can't enforce that a function has or has not been called on a given object. You could just as easily imagine an example like
let message = "Can't shoot yourself in the foot if you ain't got no gun";
let mut handles = vec![];
for i in 0..3 {
let t1 = std::thread::spawn(|| {
println!("{} {}", message, i);
});
handles.push(t1);
}
for t1 in handles {
t1.join();
}
where a human can tell that each thread is joined before main exits. But a typechecker has no way to know that.
The function it's referring to here is (I believe) main. So presumably those threads will be killed when main exits anyway (and them running after main exits is UB).
From the standpoint of the checkers, main is just another function. There is no special knowledge that this specific function can have extra behavior. If this were any other function, the thread would not be auto-killed. Expanding on that, even for main there is no guarantee that the child threads will be killed instantly. If it takes 5ms for the child threads to be killed, that is still 5ms where the child threads could be accessing the content of a variable that has gone out of scope.
To gain the behavior that you are looking for with this specific snippet (as-is), the lifetime of the closure would have to be tied to the lifetime of the t1 object, such that the closure was guaranteed to never be used after the handles have been cleaned up. While that is certainly an option, it is significantly less flexible in the general case. Because it would be enforced at the type level, there would be no way to opt out of this behavior.
You could consider using crossbeam, specifically crossbeam::scope's .spawn, which enforces this lifetime requirement where the standard library does not, meaning a thread must stop execution before the scope is finished.
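Beyond crossbeam, the standard library has since gained std::thread::scope (stable in Rust 1.63) with the same guarantee: every spawned thread is joined before the scope returns, so borrowing from the enclosing function is allowed. A sketch applying it to the original snippet:

```rust
use std::thread;

fn main() {
    let message = "Can't shoot yourself in the foot if you ain't got no gun";

    // Every thread spawned inside the scope is guaranteed to finish
    // before thread::scope returns, so borrowing `message` is fine.
    let lens: Vec<usize> = thread::scope(|s| {
        let handles: Vec<_> = (0..3)
            .map(|i| {
                s.spawn(move || {
                    println!("{} {}", message, i);
                    message.len()
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    }); // all borrows end here, together with the threads

    assert_eq!(lens, vec![message.len(); 3]);
}
```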
In your specific case, your code works fine as long as you transfer ownership of message to the child thread instead of borrowing it from the main function, because there is no risk of unsafe code with or without your call to .join. Your code works fine if you change
let t1 = std::thread::spawn(|| {
to
let t1 = std::thread::spawn(move || {