Linux shared memory mutex, struct with pointer to shared mutex

I'm working on porting some existing Windows code over to Linux, and I've come across something I'm not entirely sure how to handle.
The code is originally RTX Windows and must be deterministic. The first thing I've come across is a structure that contains semaphore and mutex objects and sets up pointers to the mutex and semaphore to be passed around/used by other callers.
volatile struct mystruct {
    volatile pthread_mutex_t *qmutexid;
    volatile sem_t *qsemid;
    volatile int processID;
    volatile int msize;
    volatile char msgarray[];
};
This struct is cast over a large piece of memory that has data coming in and out of it via a linked-list queue, but the semaphore and mutexes are a necessity to ensure integrity.
What I want to know is whether the following assignment for the pointer is valid.
void myfunctioninit(char *qname, int msg_size, int depth)
{
    struct mystruct struct1;
    pthread_mutex_t mutexQueAccess;
    int status;

    status = pthread_mutex_init(&mutexQueAccess, NULL);
    struct1.qmutexid = &mutexQueAccess;
}
The other part of this is that mutexes in Windows are assigned/accessed by name. Other processes need access to this mutex, so how do I go about doing it so the mutex can be shared across multiple processes/threads?
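For reference, a common Linux approach (a sketch under my assumptions, not part of the original code) is to place the pthread_mutex_t and sem_t objects themselves, rather than pointers to them, in a named POSIX shared-memory segment and mark them process-shared; any process that opens the same name can then map the segment and use them. The segment name "/myqueue" and struct shmqueue below are made up for the example.

/* Sketch: create a named shared-memory segment holding a process-shared
 * mutex and semaphore. Initialization should happen in exactly one process. */
#include <fcntl.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

struct shmqueue {
    pthread_mutex_t qmutex;   /* the mutex itself lives in shared memory */
    sem_t           qsem;
    int             msize;
    char            msgarray[1024];
};

int main(void)
{
    int fd = shm_open("/myqueue", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, sizeof(struct shmqueue)) < 0) { perror("ftruncate"); return 1; }

    struct shmqueue *q = mmap(NULL, sizeof *q, PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    if (q == MAP_FAILED) { perror("mmap"); return 1; }

    /* Mark the mutex as shareable between processes. */
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&q->qmutex, &attr);
    pthread_mutexattr_destroy(&attr);

    /* pshared = 1 makes the semaphore usable by any process that maps it. */
    sem_init(&q->qsem, 1, 0);

    /* Other processes shm_open("/myqueue") + mmap and use the same objects. */
    pthread_mutex_lock(&q->qmutex);
    q->msize = 0;
    pthread_mutex_unlock(&q->qmutex);
    return 0;
}

Link with -pthread (and -lrt on older glibc). If a Windows-style name for the semaphore itself is wanted, sem_open() with a name is an alternative to sem_init() in shared memory.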

Related

How does a thread which is blocked while trying to mutex_lock get to know that the lock is released by another thread?

In Linux, I have a scenario where two threads execute a critical section; one acquires the lock (thread A) and the other (thread B) waits for the lock. Later, thread A releases the mutex lock. I am trying to understand how thread B is moved to the running state and acquires the lock. How does thread B (or the operating system) know that the lock has been released by thread A?
I have a theory; please correct me if I am wrong: thread B enters the TASK_INTERRUPTIBLE state (blocked at the mutex, waiting) and receives a signal when thread A unlocks the mutex, so it comes back to the run queue (TASK_RUNNING).
The Linux mutex struct keeps track of the current owner of the mutex (if any):
struct mutex {
    atomic_long_t owner;
    // ...
};
There's also a struct to keep track of what other tasks are waiting on a mutex:
/*
* This is the control structure for tasks blocked on mutex,
* which resides on the blocked task's kernel stack:
*/
struct mutex_waiter {
    struct list_head list;
    struct task_struct *task;
    struct ww_acquire_ctx *ww_ctx;
#ifdef CONFIG_DEBUG_MUTEXES
    void *magic;
#endif
};
Simplifying quite a bit: when you unlock a mutex, the kernel looks at what other tasks are waiting on that mutex. It picks one of them to become the owner, sets the mutex's owner field to refer to the selected task, and removes that task from the list of tasks waiting for the mutex. At that point, there's at least a good chance that the task has become unblocked, in which case it's ready to run; it's then up to the scheduler to decide when to actually run it.
Optimization
Since mutexes are used a lot, and they get locked and unlocked quite a bit, the implementation uses some optimizations for speed. For example, consider the following:
/*
* @owner: contains: 'struct task_struct *' to the current lock owner,
* NULL means not owned. Since task_struct pointers are aligned at
* at least L1_CACHE_BYTES, we have low bits to store extra state.
*
* Bit0 indicates a non-empty waiter list; unlock must issue a wakeup.
* Bit1 indicates unlock needs to hand the lock to the top-waiter
* Bit2 indicates handoff has been done and we're waiting for pickup.
*/
#define MUTEX_FLAG_WAITERS 0x01
#define MUTEX_FLAG_HANDOFF 0x02
#define MUTEX_FLAG_PICKUP 0x04
#define MUTEX_FLAGS 0x07
So, when you ask the kernel to unlock a mutex, it can "glance" at one bit in the owner pointer to figure out whether this is a "simple" case (nobody's waiting on the mutex, so just mark it as unlocked, and off we go) or a more complex one (at least one task is waiting on the mutex, so a task needs to be selected to be unblocked and marked as the new owner of the mutex).
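As a rough userspace illustration of that trick (my own sketch, not the kernel code): because the owner pointer is always aligned, its low bits are free, so flag bits can ride in them and be masked off when the real pointer is needed.

/* Illustration only: pack status flags into the low bits of an aligned
 * pointer, the same trick the kernel's mutex uses for its owner field. */
#include <stdint.h>
#include <stdio.h>

#define MUTEX_FLAG_WAITERS 0x01
#define MUTEX_FLAG_HANDOFF 0x02
#define MUTEX_FLAG_PICKUP  0x04
#define MUTEX_FLAGS        0x07

struct task { char name[16]; } __attribute__((aligned(64)));

int main(void)
{
    static struct task owner_task = { "taskA" };

    /* Store the owner pointer together with a "waiters exist" flag. */
    uintptr_t owner = (uintptr_t)&owner_task | MUTEX_FLAG_WAITERS;

    /* Mask the low bits off to recover the pointer, or keep them for the flags. */
    struct task *task = (struct task *)(owner & ~(uintptr_t)MUTEX_FLAGS);
    unsigned long flags = owner & MUTEX_FLAGS;

    printf("owner=%s waiters=%lu\n", task->name, flags & MUTEX_FLAG_WAITERS);
    return 0;
}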
References
https://github.com/torvalds/linux/blob/master/include/linux/mutex.h
https://github.com/torvalds/linux/blob/master/kernel/locking/mutex.c
Disclaimer
The code extracts above are (I believe) current as I write this answer. But as noted above, mutexes get used a lot. If you look at the code for a mutex 5 or 10 years from now, chances are you'll find that somebody has done some work on optimizing the code, so it may not precisely match what I've quoted above. Most of the concepts are likely to remain similar, but changes in details (especially the optimizations) are to be expected.

Do I need to protect fd in multi-threaded read call?

Does the read system call imply synchronization on the descriptor inside the kernel? I've seen some code use just a read call to synchronize and coordinate between multiple consumer threads, like this:
rfd,wfd = pipe() or socketpair(); // setup fd
// a single writer:
write(wfd, ...);
// multiple read threads, each blocks on read:
read(rfd, work); // my questions is here
// do the work assigned by writer
while I used to think that an explicit lock like a pthread_mutex has to be used, like this:
pthread_mutex_t lock;
// work threads:
pthread_mutex_lock(&lock);
read(rfd, work);
pthread_mutex_unlock(&lock);
// do work
So my question is whether an explicit lock is necessary in this situation. Does the read call guarantee proper thread safety in this case?
It is safe for multiple threads to call read() on the same file descriptor at once, but the read() call does not synchronize memory (it is not listed in the functions specified by POSIX to do so).
This means that it is only safe to rely on just the read() if you transfer all information required over the file descriptor itself. If you want to use shared memory, you also need a memory synchronisation function - you don't have to hold the lock over the read() though, it would be safe to have a pattern like:
/* writer */
pthread_mutex_lock(&lock);
/* ...write to shared memory... */
pthread_mutex_unlock(&lock);
write(wfd, ...);
/* readers */
read(rfd, ...);
pthread_mutex_lock(&lock);
pthread_mutex_unlock(&lock);
/* ... read from shared memory ... */
However, this would be quite odd, because if you are using shared memory and mutexes, you might as well use part of the shared memory and a condition variable to implement the signalling from writer to reader, instead of involving file descriptors.
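For illustration, a minimal sketch of that alternative (the names, such as work_available, are invented here): the writer publishes work under the mutex and signals a condition variable, and each reader sleeps on the condition variable instead of blocking in read().

/* Sketch: writer-to-reader signalling with a mutex + condition variable
 * instead of a pipe. "work_available" stands in for a real work queue. */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int work_available = 0;

static void *reader(void *arg)
{
    pthread_mutex_lock(&lock);
    while (work_available == 0)           /* loop: wakeups can be spurious */
        pthread_cond_wait(&cond, &lock);
    work_available--;
    pthread_mutex_unlock(&lock);
    printf("reader %ld got a work item\n", (long)(intptr_t)arg);
    return NULL;
}

int main(void)
{
    pthread_t readers[2];
    for (long i = 0; i < 2; i++)
        pthread_create(&readers[i], NULL, reader, (void *)(intptr_t)i);

    /* Writer: publish two work items under the lock, then wake the readers. */
    pthread_mutex_lock(&lock);
    work_available += 2;
    pthread_mutex_unlock(&lock);
    pthread_cond_broadcast(&cond);

    for (int i = 0; i < 2; i++)
        pthread_join(readers[i], NULL);
    return 0;
}

Unlike the read()-based version, the mutex/condition-variable pair both wakes the reader and synchronizes the shared memory it is about to touch.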

What does raw_spinlock mean?

I was studying the raw_spinlock struct, which is in /usr/src/linux/include/linux/spinlock_types.h:
typedef struct raw_spinlock {
    arch_spinlock_t raw_lock;
#ifdef CONFIG_GENERIC_LOCKBREAK
    unsigned int break_lock;
#endif
#ifdef CONFIG_DEBUG_SPINLOCK
    unsigned int magic, owner_cpu;
    void *owner;
#endif
#ifdef CONFIG_DEBUG_LOCK_ALLOC
    struct lockdep_map dep_map;
#endif
} raw_spinlock_t;
I think raw_lock is for a lock which is dependent on an architecture and dep_map is a kind of data structure to avoid deadlocks, but what do break_lock, magic, owner_cpu, and *owner mean?
spinlock
spinlock is the public API for spinlocks in kernel code.
See Documentation/locking/spinlocks.txt.
raw_spinlock
raw_spinlock is the actual implementation of normal spinlocks. On non-RT kernels, spinlock is just a wrapper around raw_spinlock. On RT kernels, spinlock does not always use raw_spinlock.
See this article on LWN.
arch_spinlock
arch_spinlock is the platform-specific part of the spinlock implementation. raw_spinlock is generally platform-independent and delegates the low-level operations to arch_spinlock.
lockdep_map
lockdep_map is a dependency map for the locking correctness validator.
See Documentation/locking/lockdep-design.txt.
break_lock
On SMP kernels, when spin_lock() on one CPU starts looping while the lock is held on another CPU, it sets this flag to 1. The CPU that holds the lock can periodically check this flag using spin_is_contended() and then call spin_unlock().
This achieves two goals at the same time:
avoid frequent locking/unlocking;
avoid holding the lock for a long time, which would prevent others from acquiring it.
See also this article.
magic, owner, owner_cpu
These fields are enabled when CONFIG_DEBUG_SPINLOCK is set and help to detect common bugs:
magic is set to an arbitrarily chosen constant when the spinlock is created (SPINLOCK_MAGIC, which is 0xdead4ead);
owner is set to the current process in spin_lock();
owner_cpu is set to the current CPU id in spin_lock().
spin_unlock() checks that it is called by the same process and on the same CPU as the corresponding spin_lock() call.
spin_lock() checks that magic is equal to SPINLOCK_MAGIC to ensure that the caller passed a pointer to a correctly initialized spinlock and that (hopefully) no memory corruption has occurred.
See kernel/locking/spinlock_debug.c.
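As a rough userspace analogue (invented for illustration; the kernel's real checks live in spinlock_debug.c and differ in detail), the debug fields amount to something like this:

/* Toy spinlock mimicking the kernel's debug fields (magic, owner, owner_cpu).
 * All names are made up for the example. */
#define _GNU_SOURCE            /* for sched_getcpu() on glibc */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define TOY_SPINLOCK_MAGIC 0xdead4ead

struct toy_spinlock {
    atomic_flag locked;
    unsigned int magic;        /* set at init; catches uninitialized locks / corruption */
    pthread_t owner;           /* which thread currently holds the lock */
    int owner_cpu;             /* which CPU it was taken on */
};

void toy_spin_init(struct toy_spinlock *l)
{
    atomic_flag_clear(&l->locked);
    l->magic = TOY_SPINLOCK_MAGIC;               /* like SPINLOCK_MAGIC */
}

void toy_spin_lock(struct toy_spinlock *l)
{
    assert(l->magic == TOY_SPINLOCK_MAGIC);      /* the "magic" check */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;                                        /* spin */
    l->owner = pthread_self();
    l->owner_cpu = sched_getcpu();
}

void toy_spin_unlock(struct toy_spinlock *l)
{
    assert(l->magic == TOY_SPINLOCK_MAGIC);
    assert(pthread_equal(l->owner, pthread_self()));  /* same thread that locked it */
    /* The kernel also checks owner_cpu here; with preemption disabled the CPU
     * cannot change, whereas a userspace thread may migrate, so we only record it. */
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}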

Linux semaphores: spinlock or signals?

How does the current implementation of semaphores work? Does it use spinlocks or signals?
How does the scheduler know which one to invoke if signals are used?
Also how does it work in user space? Kernel locking recommends spinlocks but user space does not. So are the implementations different in user space and kernel space for semaphores?
Use the power of Open Source - just look at the source code.
The kernel-space semaphore is defined as
struct semaphore {
    raw_spinlock_t lock;
    unsigned int count;
    struct list_head wait_list;
};
lock is used to protect count and wait_list.
All tasks waiting on a semaphore reside in wait_list. When the semaphore is upped, one task is woken up.
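A heavily simplified userspace model of that logic (my paraphrase for illustration, not the kernel source): the raw_spinlock's role is played by a short-held mutex that only protects count, and the wait_list's role by a condition variable that blocked tasks sleep on until up() wakes one of them.

/* Simplified userspace model of the kernel semaphore's logic. */
#include <pthread.h>

struct toy_semaphore {
    pthread_mutex_t lock;     /* plays the role of the raw_spinlock */
    pthread_cond_t  wait;     /* plays the role of wait_list */
    unsigned int    count;
};

void toy_sema_init(struct toy_semaphore *s, unsigned int count)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->wait, NULL);
    s->count = count;
}

void toy_down(struct toy_semaphore *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)                 /* no resources: sleep, don't spin */
        pthread_cond_wait(&s->wait, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

void toy_up(struct toy_semaphore *s)
{
    pthread_mutex_lock(&s->lock);
    s->count++;
    pthread_cond_signal(&s->wait);        /* wake one waiter, as up() does */
    pthread_mutex_unlock(&s->lock);
}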
User-space semaphores rely on the semaphore-related system calls that the kernel provides. The kernel's definition of a user-space semaphore is:
/* One semaphore structure for each semaphore in the system. */
/* One semaphore structure for each semaphore in the system. */
struct sem {
    int semval;                   /* current value */
    int sempid;                   /* pid of last operation */
    spinlock_t lock;              /* spinlock for fine-grained semtimedop */
    struct list_head sem_pending; /* pending single-sop operations */
};
The kernel's definition of the user-space semaphore is similar to the kernel-space one. sem_pending is a list of waiting processes plus some additional info.
I should highlight again that neither the kernel-space semaphore nor the user-space one uses a spinlock to wait on the lock. The spinlock is included in both structures only to protect the structure members from concurrent access. After the structure is modified, the spinlock is released and the task rests in the list until woken.
Furthermore, spinlocks are unsuitable for waiting on an event from another thread: before acquiring a spinlock, the kernel disables preemption, so on a uniprocessor machine the spinlock would never be released.
I should also note that user-space semaphores, while serving user space, execute in kernel space.
P.S. Source code for the kernel-space semaphore resides in include/linux/semaphore.h and kernel/semaphore.c, and for the user-space one in ipc/sem.c.

What is the Re-entrant lock and concept in general?

I always get confused. Would someone explain what Reentrant means in different contexts? And why would you want to use reentrant vs. non-reentrant?
Take the pthread (POSIX) locking primitives: are they re-entrant or not? What pitfalls should be avoided when using them?
Is mutex re-entrant?
Re-entrant locking
A reentrant lock is one where a process can claim the lock multiple times without blocking on itself. It's useful in situations where it's not easy to keep track of whether you've already grabbed a lock. If a lock is non re-entrant you could grab the lock, then block when you go to grab it again, effectively deadlocking your own process.
Reentrancy in general is a property of code where it has no central mutable state that could be corrupted if the code was called while it is executing. Such a call could be made by another thread, or it could be made recursively by an execution path originating from within the code itself.
If the code relies on shared state that could be updated in the middle of its execution it is not re-entrant, at least not if that update could break it.
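The C library's strtok() versus strtok_r() is the classic example of this distinction: strtok() hides its scan position in internal static state, so nested or concurrent traversals clobber each other, while strtok_r() makes the caller carry that state.

/* Nested tokenizing works with strtok_r() because each traversal keeps its
 * own save pointer; doing the same nesting with plain strtok() would corrupt
 * the outer loop's hidden position. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    char outer[] = "a,b,c";
    char *outer_save;

    for (char *o = strtok_r(outer, ",", &outer_save); o != NULL;
         o = strtok_r(NULL, ",", &outer_save)) {
        char inner[] = "1 2 3";            /* fresh copy: strtok_r() writes into it */
        char *inner_save;
        for (char *i = strtok_r(inner, " ", &inner_save); i != NULL;
             i = strtok_r(NULL, " ", &inner_save))
            printf("%s-%s ", o, i);
    }
    printf("\n");                          /* a-1 a-2 a-3 b-1 b-2 b-3 c-1 c-2 c-3 */
    return 0;
}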
A use case for re-entrant locking
A (somewhat generic and contrived) example of an application for a re-entrant lock might be:
You have some computation involving an algorithm that traverses a graph (perhaps with cycles in it). A traversal may visit the same node more than once due to the cycles or due to multiple paths to the same node.
The data structure is subject to concurrent access and could be updated for some reason, perhaps by another thread. You need to be able to lock individual nodes to deal with potential data corruption due to race conditions. For some reason (perhaps performance) you don't want to globally lock the whole data structure.
Your computation can't retain complete information on what nodes you've visited, or you're using a data structure that doesn't allow 'have I been here before' questions to be answered quickly. An example of this situation would be a simple implementation of Dijkstra's algorithm with a priority queue implemented as a binary heap or a breadth-first search using a simple linked list as a queue. In these cases, scanning the queue for existing insertions is O(N) and you may not want to do it on every iteration.
In this situation, keeping track of what locks you've already acquired is expensive. Assuming you want to do the locking at the node level a re-entrant locking mechanism alleviates the need to tell whether you've visited a node before. You can just blindly lock the node, perhaps unlocking it after you pop it off the queue.
Re-entrant mutexes
A simple mutex is not re-entrant, as only one thread can be in the critical section at a given time. If you grab the mutex and then try to grab it again, a simple mutex doesn't have enough information to tell who was holding it previously. To do this recursively you need a mechanism where each thread has a token, so you can tell who has grabbed the mutex. This makes the mutex mechanism somewhat more expensive, so you may not want to do it in all situations.
IIRC the POSIX threads API does offer the option of re-entrant and non re-entrant mutexes.
A re-entrant lock lets you write a method M that puts a lock on resource A and then call M recursively or from code that already holds a lock on A.
With a non re-entrant lock, you would need 2 versions of M, one that locks and one that doesn't, and additional logic to call the right one.
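A minimal sketch of that "two versions" workaround with a plain (non re-entrant) pthread mutex; the function names are invented for the example:

/* With a non re-entrant mutex, the recursive worker must not take the lock
 * itself; a locking wrapper takes it once at the entry point. */
#include <pthread.h>

static pthread_mutex_t a_lock = PTHREAD_MUTEX_INITIALIZER;

/* Does the real (recursive) work; assumes the caller already holds a_lock. */
static long m_unlocked(long n)
{
    if (n <= 1)
        return 1;
    return n * m_unlocked(n - 1);
}

/* Public entry point: takes the lock exactly once, then recurses freely. */
long m(long n)
{
    pthread_mutex_lock(&a_lock);
    long r = m_unlocked(n);
    pthread_mutex_unlock(&a_lock);
    return r;
}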
The reentrant lock is very well described in this tutorial.
The example in the tutorial is far less contrived than the one in the answer about traversing a graph. A reentrant lock is useful in very simple cases.
The what and why of the recursive mutex need not be as complicated as the accepted answer makes it.
I would like to write down my understanding after some digging around the net.
First, you should realize that when talking about mutexes, multi-threading is definitely involved too (a mutex is used for synchronization; I don't need a mutex if I only have one thread in my program).
Secondly, you should know the difference between a normal mutex and a recursive mutex.
Quoted from APUE:
(A recursive mutex is) a mutex type that allows the same thread to lock
it multiple times without first unlocking it.
The key difference is that, within the same thread, relocking a recursive lock does not lead to deadlock, nor does it block the thread.
Does this mean that a recursive lock never causes deadlock?
No, it can still cause deadlock, just like a normal mutex, if you have locked it in one thread without unlocking it and then try to lock it in other threads.
Let's see some code as proof.
normal mutex with deadlock
#include <pthread.h>
#include <stdio.h>

pthread_mutex_t lock;

void * func1(void *arg){
    printf("thread1\n");
    pthread_mutex_lock(&lock);
    printf("thread1 hey hey\n");
    return NULL;
}

void * func2(void *arg){
    printf("thread2\n");
    pthread_mutex_lock(&lock);
    printf("thread2 hey hey\n");
    return NULL;
}

int main(){
    pthread_mutexattr_t lock_attr;
    int error;
    pthread_mutexattr_init(&lock_attr);   /* the attribute must be initialized before settype */
    // error = pthread_mutexattr_settype(&lock_attr, PTHREAD_MUTEX_RECURSIVE);
    error = pthread_mutexattr_settype(&lock_attr, PTHREAD_MUTEX_DEFAULT);
    if(error){
        perror(NULL);
    }
    pthread_mutex_init(&lock, &lock_attr);

    pthread_t t1, t2;
    pthread_create(&t1, NULL, func1, NULL);
    pthread_create(&t2, NULL, func2, NULL);
    pthread_join(t2, NULL);
}
output:
thread1
thread1 hey hey
thread2
common deadlock example, no problem.
recursive mutex with deadlock
Just uncomment this line
error = pthread_mutexattr_settype(&lock_attr, PTHREAD_MUTEX_RECURSIVE);
and comment out the other one.
output:
thread1
thread1 hey hey
thread2
Yes, recursive mutex can also cause deadlock.
normal mutex, relock in the same thread
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

pthread_mutex_t lock;

void func3(){
    printf("func3\n");
    pthread_mutex_lock(&lock);
    printf("func3 hey hey\n");
}

void * func1(void *arg){
    printf("thread1\n");
    pthread_mutex_lock(&lock);
    func3();
    printf("thread1 hey hey\n");
    return NULL;
}

void * func2(void *arg){
    printf("thread2\n");
    pthread_mutex_lock(&lock);
    printf("thread2 hey hey\n");
    return NULL;
}

int main(){
    pthread_mutexattr_t lock_attr;
    int error;
    pthread_mutexattr_init(&lock_attr);   /* the attribute must be initialized before settype */
    // error = pthread_mutexattr_settype(&lock_attr, PTHREAD_MUTEX_RECURSIVE);
    error = pthread_mutexattr_settype(&lock_attr, PTHREAD_MUTEX_DEFAULT);
    if(error){
        perror(NULL);
    }
    pthread_mutex_init(&lock, &lock_attr);

    pthread_t t1, t2;
    pthread_create(&t1, NULL, func1, NULL);
    sleep(2);
    pthread_create(&t2, NULL, func2, NULL);
    pthread_join(t2, NULL);
}
output:
thread1
func3
thread2
Deadlock in thread t1, in func3.
(I use sleep(2) to make it easier to see that the deadlock is first caused by the relock in func3.)
recursive mutex, relock in the same thread
Again, uncomment the recursive mutex line and comment out the other line.
output:
thread1
func3
func3 hey hey
thread1 hey hey
thread2
Deadlock in thread t2, in func2. See? func3 finishes and exits; relocking does not block the thread or lead to deadlock.
So, last question: why do we need it?
For recursive functions (called in multi-threaded programs, where you want to protect some resource/data).
E.g. you have a multi-threaded program and call a recursive function in thread A. There is some data you want to protect in that recursive function, so you use the mutex mechanism. The execution of that function is sequential within thread A, so you would definitely relock the mutex during the recursion. Using a normal mutex causes deadlock; the recursive mutex was invented to solve this.
See an example from the accepted answer
When to use recursive mutex?.
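Or, as a minimal example of that use case (invented for illustration): a recursive function, called from several threads, relocks a PTHREAD_MUTEX_RECURSIVE mutex at every level of the recursion without deadlocking, because the thread that already owns the mutex is allowed to lock it again.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t rlock;
static long shared_total;                 /* the data being protected */

static void add_countdown(int n)
{
    pthread_mutex_lock(&rlock);           /* relocked at every recursion level */
    shared_total += n;
    if (n > 0)
        add_countdown(n - 1);
    pthread_mutex_unlock(&rlock);         /* unlocked as many times as locked */
}

static void *worker(void *arg)
{
    add_countdown(5);
    return NULL;
}

int main(void)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&rlock, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("shared_total = %ld\n", shared_total);   /* 30 = 2 * (5+4+3+2+1+0) */
    return 0;
}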
Wikipedia explains the recursive mutex very well and is definitely worth a read: Wikipedia: Reentrant_mutex
