This is the structure of ucontext_t:
typedef struct ucontext_t
{
    unsigned long int __ctx(uc_flags);
    struct ucontext_t *uc_link;
    stack_t uc_stack;
    mcontext_t uc_mcontext;
    sigset_t uc_sigmask;
    struct _libc_fpstate __fpregs_mem;
    __extension__ unsigned long long int __ssp[4];
} ucontext_t;
I know uc_sigmask specifies the signals to be blocked after a setcontext. But how is it implemented? Is setting the signal mask, restoring the registers, and jumping to the target rip done atomically?
The documentation says std::sync::Semaphore is deprecated. What's the system semaphore? And what's the best replacement for this struct now?
Deprecated since 1.7.0: easily confused with system semaphore and not used enough to pull its weight
System semaphore refers to whatever semaphore the operating system provides. On POSIX systems (Linux, macOS) these are the functions you get from #include <semaphore.h> (man page); a minimal sketch follows below. std::sync::Semaphore was implemented in Rust and was separate from the OS's semaphore, although it did use some OS-level synchronization primitives (std::sync::Condvar, which is based on pthread_cond_t on Linux).
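For reference, this is roughly what using such an OS semaphore looks like in C on Linux (an unnamed, thread-shared POSIX semaphore; illustration only):

#include <semaphore.h>

int main(void)
{
    sem_t sem;
    sem_init(&sem, 0, 1); /* pshared = 0: shared between threads; initial count 1 */
    sem_wait(&sem);       /* acquire: decrement, blocking while the count is zero */
    /* ... critical section ... */
    sem_post(&sem);       /* release: increment, waking one waiter */
    sem_destroy(&sem);
    return 0;
}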
std::sync::Semaphore was never stabilized. The source code for Semaphore contains an unstable attribute:
#![unstable(feature = "semaphore",
            reason = "the interaction between semaphores and the acquisition/release \
                      of resources is currently unclear",
            issue = "27798")]
The issue number in the attribute points to the tracking issue where this feature was discussed.
The best replacement within std is either a std::sync::Condvar or a busy loop paired with a std::sync::Mutex. Pick a Condvar over a busy loop if you think you might be waiting more than a few thousand clock cycles.
The documentation for Condvar has a good example of how to use it as a (binary) semaphore:
use std::sync::{Arc, Mutex, Condvar};
use std::thread;

let pair = Arc::new((Mutex::new(false), Condvar::new()));
let pair2 = Arc::clone(&pair);

// Inside of our lock, spawn a new thread, and then wait for it to start.
thread::spawn(move || {
    let (lock, cvar) = &*pair2;
    let mut started = lock.lock().unwrap();
    *started = true;
    // We notify the condvar that the value has changed.
    cvar.notify_one();
});

// Wait for the thread to start up.
let (lock, cvar) = &*pair;
let mut started = lock.lock().unwrap();
while !*started {
    started = cvar.wait(started).unwrap();
}
This example could be adapted to work as a counting semaphore by changing Mutex::new(false) to Mutex::new(0) and making a few corresponding changes, as in the sketch below.
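For instance, a counting semaphore along those lines might look like this (the Semaphore type and its acquire/release names are illustrative, not part of std):

use std::sync::{Condvar, Mutex};

// A minimal counting semaphore built from Mutex + Condvar.
struct Semaphore {
    count: Mutex<usize>,
    cvar: Condvar,
}

impl Semaphore {
    fn new(count: usize) -> Self {
        Semaphore { count: Mutex::new(count), cvar: Condvar::new() }
    }

    // Block until a permit is available, then take it.
    fn acquire(&self) {
        let mut count = self.count.lock().unwrap();
        while *count == 0 {
            count = self.cvar.wait(count).unwrap();
        }
        *count -= 1;
    }

    // Return a permit and wake one waiting thread.
    fn release(&self) {
        *self.count.lock().unwrap() += 1;
        self.cvar.notify_one();
    }
}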
Does into_inner() return all the relaxed writes in this example program? If so, which concept guarantees this?
extern crate crossbeam;

use std::sync::atomic::{AtomicUsize, Ordering};

fn main() {
    let thread_count = 10;
    let increments_per_thread = 100000;
    let i = AtomicUsize::new(0);
    crossbeam::scope(|scope| {
        for _ in 0..thread_count {
            scope.spawn(|| {
                for _ in 0..increments_per_thread {
                    i.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    println!(
        "Result of {}*{} increments: {}",
        thread_count,
        increments_per_thread,
        i.into_inner()
    );
}
(https://play.rust-lang.org/?gist=96f49f8eb31a6788b970cf20ec94f800&version=stable)
I understand that crossbeam guarantees that all threads are finished, and since ownership goes back to the main thread, I also understand that there will be no outstanding borrows. But the way I see it, there could still be pending writes, if not on the CPUs, then in the caches.
Which concept guarantees that all writes are finished and all caches are synced back to the main thread when into_inner() is called? Is it possible to lose writes?
Does into_inner() return all the relaxed writes in this example program? If so, which concept guarantees this?
It's not into_inner that guarantees it, it's join.
What into_inner guarantees is that either some synchronization has been performed since the final concurrent write (join of thread, last Arc having been dropped and unwrapped with try_unwrap, etc.), or the atomic was never sent to another thread in the first place. Either case is sufficient to make the read data-race-free.
Crossbeam documentation is explicit about using join at the end of a scope:
This [the thread being guaranteed to terminate] is ensured by having the parent thread join on the child thread before the scope exits.
Regarding losing writes:
Which concept guarantees that all writes are finished and all caches are synced back to the main thread when into_inner() is called? Is it possible to lose writes?
As stated in various places in the documentation, Rust inherits the C++ memory model for atomics. In C++11 and later, the completion of a thread synchronizes with the corresponding successful return from join. This means that by the time join completes, all actions performed by the joined thread must be visible to the thread that called join, so it is not possible to lose writes in this scenario.
In terms of atomics, you can think of a join as an acquire read of an atomic that the thread performed a release store on just before it finished executing.
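As an illustration only (real code should simply call join), that handshake could be hand-rolled like this, with the worker publishing its relaxed increments via a release store and the main thread picking them up with an acquire load:

use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let counter = Arc::new(AtomicUsize::new(0));
    let done = Arc::new(AtomicBool::new(false));

    let (c, d) = (Arc::clone(&counter), Arc::clone(&done));
    thread::spawn(move || {
        for _ in 0..100_000 {
            c.fetch_add(1, Ordering::Relaxed); // relaxed writes...
        }
        d.store(true, Ordering::Release); // ...published by this release store
    });

    // The acquire load that reads `true` synchronizes with the release store,
    // so every increment before it is guaranteed to be visible here.
    while !done.load(Ordering::Acquire) {}
    assert_eq!(counter.load(Ordering::Relaxed), 100_000);
}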
I will include this answer as a potential complement to the other two.
The kind of inconsistency that was mentioned, namely whether some writes could be missing before the final read of the counter, is not possible here. It would be undefined behaviour if writes to a value could be postponed until after its consumption with into_inner. However, there are no unexpected race conditions in this program, even without the counter being consumed with into_inner, and even without the help of crossbeam scopes.
Let us write a new version of the program without crossbeam scopes and where the counter is not consumed (Playground):
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let thread_count = 10;
    let increments_per_thread = 100000;
    let i = Arc::new(AtomicUsize::new(0));
    let threads: Vec<_> = (0..thread_count)
        .map(|_| {
            let i = i.clone();
            thread::spawn(move || {
                for _ in 0..increments_per_thread {
                    i.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for t in threads {
        t.join().unwrap();
    }
    println!(
        "Result of {}*{} increments: {}",
        thread_count,
        increments_per_thread,
        i.load(Ordering::Relaxed)
    );
}
This version still works pretty well! Why? Because a synchronizes-with relation is established between the ending thread and its corresponding join. And so, as is well explained in a separate answer, all actions performed by the joined thread must be visible to the caller thread.
One could probably also wonder whether even the relaxed memory ordering constraint is sufficient to guarantee that the full program behaves as expected. This part is addressed by the Rust Nomicon, emphasis mine:
Relaxed accesses are the absolute weakest. They can be freely re-ordered and provide no happens-before relationship. Still, relaxed operations are still atomic. That is, they don't count as data accesses and any read-modify-write operations done to them occur atomically. Relaxed operations are appropriate for things that you definitely want to happen, but don't particularly otherwise care about. For instance, incrementing a counter can be safely done by multiple threads using a relaxed fetch_add if you're not using the counter to synchronize any other accesses.
The mentioned use case is exactly what we are doing here. Each thread is not required to observe the incremented counter in order to make decisions, and yet all operations are atomic. In the end, the thread joins synchronize with the main thread, thus implying a happens-before relation, and guaranteeing that the operations are made visible there. As Rust adopts the same memory model as C++11's (this is implemented by LLVM internally), we can see regarding the C++ std::thread::join function that "The completion of the thread identified by *this synchronizes with the corresponding successful return". In fact, the very same example in C++ is available in cppreference.com as part of the explanation on the relaxed memory order constraint:
#include <vector>
#include <iostream>
#include <thread>
#include <atomic>

std::atomic<int> cnt = {0};

void f()
{
    for (int n = 0; n < 1000; ++n) {
        cnt.fetch_add(1, std::memory_order_relaxed);
    }
}

int main()
{
    std::vector<std::thread> v;
    for (int n = 0; n < 10; ++n) {
        v.emplace_back(f);
    }
    for (auto& t : v) {
        t.join();
    }
    std::cout << "Final counter value is " << cnt << '\n';
}
The fact that you can call into_inner (which consumes the AtomicUsize) means that there are no more borrows on that backing storage.
Each fetch_add is an atomic operation with Relaxed ordering, so once the threads are complete there shouldn't be anything that changes it (and if there were, that would be a bug in crossbeam).
See the description of into_inner for more info.
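In short, into_inner takes the atomic by value, so the type system itself proves exclusive access at the call site; a minimal sketch:

use std::sync::atomic::AtomicUsize;

fn main() {
    let a = AtomicUsize::new(5);
    // `into_inner` consumes the atomic; this only compiles if no other
    // reference (and hence no other thread) can still touch it.
    let v: usize = a.into_inner();
    assert_eq!(v, 5);
}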
Suppose we have defined a structure T:
struct T {
    int a, b;
};
If the address of b is 0x8b3000c and sizeof(int) is 4, what value will container_of() return when invoked on &b?
container_of is a macro in the Linux kernel that computes the address of the containing structure from a pointer to one of its members.
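Its classic (slightly simplified) definition looks like this:

#define container_of(ptr, type, member) ({                      \
    const typeof(((type *)0)->member) *__mptr = (ptr);          \
    (type *)((char *)__mptr - offsetof(type, member)); })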
For example, in your case:
struct T {
    int a, b;
};
applying container_of to the address of b yields the address of the enclosing struct T:
struct T *pT = container_of(ptr_b, struct T, b);
where ptr_b holds the address of b, i.e. &b.
Normally we don't care about the actual numeric address we get, like 0x8b3000c, since we work with identifiers.
But since you are interested in the numbers: both members are ints of size 4, so ignoring padding, pT will be 0x8b3000c - 4 = 0x8b30008.
Please never make such assumptions while coding, though: a struct may be padded. Always let offsetof compute the member's offset, as container_of itself does; a demonstration follows.
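A minimal self-contained demonstration of the same computation, using offsetof explicitly (values taken from the question):

#include <stddef.h>
#include <stdio.h>

struct T {
    int a, b;
};

int main(void)
{
    struct T t = { 1, 2 };
    int *ptr_b = &t.b;
    /* Subtract the member's offset to recover the containing struct;
       this is exactly what container_of expands to. */
    struct T *pT = (struct T *)((char *)ptr_b - offsetof(struct T, b));
    printf("%d %d\n", pT->a, pT->b); /* prints "1 2" */
    return 0;
}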
I decided to use structs in this program to keep it organized, so I now have a chain of structs. My question is whether I must malloc a struct that is within another struct. For example:
typedef struct OnlineS {
    struct BBSIS *bbsi;
    struct BBVIS *bbvi;
    struct VBVIS *vbvi;
} *OnlineP;

typedef struct BBSIS {
    struct FirstFitS *ff;
    struct BestFitS *bf;
    struct NextFitS *nf;
    int itemNum;
    int binNum;
    int binMin;
    int binMax;
    int *items;
} *BBSIP;
And so on. So would my declarations and mallocs look like this?
OnlineP on = malloc(sizeof (struct OnlineS));
on->bbsi = malloc(sizeof (struct BBSIS));
on->bbsi->bf = malloc(sizeof (struct BestFitS));
on->bbsi->nf = malloc(sizeof (struct NextFitS));
on->bbsi->ff = malloc(sizeof (struct FirstFitS));
on->bbvi = malloc(sizeof (struct BBVIS));
on->bbvi->bf = malloc(sizeof (struct BestFitS));
//ETC
If you use pointers to structs within a struct, you must manage the memory for those as well (malloc/free).
If you use structs directly within a struct, you do not manage memory for the inner structures; since they are part of the outer struct, there is no need to.
You use pointers to structs in your outer struct, so you must use malloc and free.
First allocate memory for your outer struct, then either set all the inner-struct pointers to NULL or allocate memory for them, as in the contrast sketch below.
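To make the difference concrete, here is a small sketch (the Inner/ByPointer/Embedded names are illustrative):

#include <stdlib.h>

struct Inner { int x; };

struct ByPointer { struct Inner *in; }; /* inner struct needs its own malloc/free */
struct Embedded  { struct Inner in;  }; /* inner struct is part of the outer allocation */

int main(void)
{
    struct ByPointer *p = malloc(sizeof *p);
    p->in = malloc(sizeof *p->in); /* separate allocation for the pointed-to struct */

    struct Embedded *e = malloc(sizeof *e);
    e->in.x = 0; /* already allocated together with the outer struct */

    free(p->in); /* free the inner struct before the outer one */
    free(p);
    free(e); /* one free matches the one malloc */
    return 0;
}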
There is no struct in your struct.
There is a pointer in your struct, and the memory for the pointer itself is allocated as part of the outer struct.
Consider the following construct:
typedef struct node {
    struct node *next;
} node;
(Which is very common: a linked list.)
How many nodes should it allocate? If allocating a struct also had to allocate everything its pointers refer to, allocating a single node would require allocating an endless chain. That is why pointed-to structs are allocated (and freed) explicitly, one at a time.
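For instance, each node of such a list gets its own explicit allocation:

#include <stdlib.h>

typedef struct node {
    struct node *next;
} node;

int main(void)
{
    /* Build a two-node list: every node gets its own malloc,
       and the `next` pointer itself costs nothing extra. */
    node *head = malloc(sizeof *head);
    head->next = malloc(sizeof *head->next);
    head->next->next = NULL;

    free(head->next); /* free the tail first, while head->next still reaches it */
    free(head);
    return 0;
}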