When to call sem_unlink()? (Linux)

I'm a little confused by the Linux API sem_unlink(), mainly about when or why to call it. I've used semaphores on Windows for many years. On Windows, once you close the last handle of a named semaphore, the system removes the underlying kernel object. But it appears that on Linux you, the developer, need to remove the kernel object yourself by calling sem_unlink(). If you don't, the kernel object persists in the /dev/shm folder.
The problem I'm running into: if process A calls sem_unlink() while process B has the semaphore locked, it immediately destroys the semaphore, and now process B is no longer "protected" by the semaphore when/if process C comes along. What's more, the man page is confusing at best:
"The semaphore name is removed immediately. The semaphore is destroyed once all other processes that have the semaphore open close it."
How can it destroy the object immediately if it has to wait until all other processes that have it open close it?
Clearly I don't understand the proper use of semaphore objects on Linux. Thanks for any help. Below is some sample code I'm using to test this.
#include <fcntl.h>
#include <semaphore.h>
#include <sys/stat.h>

int main(void)
{
    sem_t *pSemaphore = sem_open("/MyName", O_CREAT, S_IRUSR | S_IWUSR, 1);
    if (pSemaphore != SEM_FAILED)
    {
        if (sem_wait(pSemaphore) == 0)
        {
            // Perform "protected" operations here
            sem_post(pSemaphore);
        }
        sem_close(pSemaphore);
        sem_unlink("/MyName");
    }
    return 0;
}

Response to your questions:
In contrast to the Windows semaphore behaviour you describe, POSIX semaphores are kernel-persistent: a semaphore retains its value even when no process has it open (i.e. when its reference count is 0).
If process A calls sem_unlink() while process B has the semaphore locked, the semaphore's reference count is not 0, so the semaphore is not destroyed; only its name is removed.
A basic description of sem_close() vs sem_unlink() should help the overall understanding:
sem_close(): closes the semaphore in the calling process (this also happens when a process exits); the semaphore itself still remains in the system.
sem_unlink(): removes the name from the system; the semaphore itself is destroyed only when its reference count reaches 0 (that is, after all processes that have it open call sem_close() or exit).
References:
W. Richard Stevens, UNIX Network Programming, Volume 2: Interprocess Communication, ch. 10.

The sem_unlink() function removes the semaphore identified by name and marks
the semaphore to be destroyed once all processes cease using it (this may mean
immediately, if all processes that had the semaphore open have already closed it).
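
To see the name-vs-object distinction in action, here is a minimal sketch (the name /demo_sem and the fork layout are mine, purely for illustration): the creator unlinks the name right after opening, yet both the parent and the forked child keep using the semaphore until the last sem_close().

#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    // Hypothetical name, used only for this demonstration.
    sem_t *sem = sem_open("/demo_sem", O_CREAT, S_IRUSR | S_IWUSR, 1);
    if (sem == SEM_FAILED)
        return 1;

    // Remove the name immediately: no later sem_open("/demo_sem", 0)
    // can find it, but our already-open semaphore stays fully usable.
    sem_unlink("/demo_sem");

    if (fork() == 0) {               // the child inherits the open semaphore
        sem_wait(sem);
        puts("child in critical section");
        sem_post(sem);
        sem_close(sem);
        _exit(0);
    }

    sem_wait(sem);
    puts("parent in critical section");
    sem_post(sem);

    wait(NULL);                      // reap the child
    sem_close(sem);                  // last close: the kernel frees the object
    return 0;
}

Because the name disappears before any unrelated process can open it, the process C scenario from the question cannot attach to a half-torn-down semaphore. The usual convention is that one designated process (typically the creator) calls sem_unlink(), and every user of the semaphore simply calls sem_close().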


Close call does not release underlying resources for the device

A bit of context: Linux 3.10.40, a multi-threaded application, the main thread waiting for user input (keyboard), the other threads waiting (epoll_wait()) for events. No specific priority for the application or its child threads, no binding to a specific core.
I have a problem when I try to close the device /dev/ttyGS from my application in user space. close() returns 0 and the file descriptor is indeed removed from the process's fd list, but the underlying tty port is not released (that is because the gs_close() callback is not called).
It "only" happens when I test the following scenario: unloading my driver while /dev/ttyGS is still open.
However, if I close /dev/ttyGS during the "normal" application exit path, i.e. do the teardown sequence (including the close(fd) call), exit the application, and then unload the driver (in the shell), I do not face this issue.
From my application (main thread):
// during application initialization
fd = open("/dev/ttyGS0", O_NONBLOCK | O_NOCTTY);
fd1 = epoll_create(....);
epoll_ctl(fd1, EPOLL_CTL_ADD, fd, &evt);
fd2 = epoll_create(....);
....
// then during the application's life
system("rmmod mydriver");
    mydriver_exit
        // some code ....
        eventfd_signal
        // some code ....
        wait_event_interruptible
// Then from the event thread of my application
exit epoll_wait(fd2)
// some code ....
epoll_ctl(fd1, EPOLL_CTL_DEL, fd, NULL);
close(fd)
    // .... some code within the kernel fs subsystem
    fput(filp);
    if (atomic_long_dec_and_test(&file->f_count)) {
        // some code ....
        if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
            if (!task_work_add(task, &file->f_u.fu_rcuhead, true))
                return;
        }
        // some code ....
        schedule_work(&delayed_fput_work);
        spin_unlock_irqrestore(&delayed_fput_lock, flags);
    }
// return from syscall
// some code ....
write(some_sysfs_attribute)
// some code ....
wake_up_interruptible
// return from syscall
// some code ....
go_back_to_epoll_wait(fd2)
// etc...
Is it correct to call close() from a child thread when the open() was performed in another (the main) thread of the application? I guess so...
The problem I have here is that file->f_count is greater than 1, so the if branch is not taken, and therefore the work, which would eventually trigger tty_release() and thus the gs_close() callback, is not scheduled.
I grepped for the places that increment f_count in the fs subsystem and, apart from open, the hits were in the locking subpart (i.e. fs/lockd).
So I was wondering whether some lock involved by the close() call could have a grasp on the file (increasing its reference count) during the close, which would prevent the work (and thus the callback) from being scheduled.
From what I know, file descriptors are shared between all the threads of a process, and looking in /proc/<my_app_pid>/fd and /proc/<my_app_child_pid>/fd I indeed see the same fds.
If I am not mistaken, the fd table is shared between all the threads (within the same process), which I guess might involve some kind of lock, which might explain the problem.
The thing is that I don't really know the fs subsystem (neither its architecture nor its source code). I have tried to read the source, and although some parts of it are understandable, others are trickier, especially without a good overview. I am struggling to identify what could have a grasp on the reference count.
Any idea of what the problem could be?
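
On the narrow threading question: yes, that is fine. The file descriptor table belongs to the process, not to any one thread, so a child thread may close what the main thread opened. A minimal sketch (using /dev/null as a stand-in, since I don't have your gadget device):

#include <fcntl.h>
#include <unistd.h>
#include <cstdio>
#include <thread>

int main()
{
    // Opened by the main thread.
    int fd = open("/dev/null", O_RDWR | O_NONBLOCK | O_NOCTTY);
    if (fd < 0)
        return 1;

    std::thread closer([fd] {
        // Same per-process fd table, so this close() is legitimate.
        if (close(fd) == 0)
            std::printf("closed fd %d from a child thread\n", fd);
    });
    closer.join();
    return 0;
}

Keep in mind that close() only drops the descriptor's reference: anything else still holding the struct file (a dup()'ed descriptor, a forked child, or a syscall currently operating on the file) keeps f_count elevated, which is consistent with the f_count > 1 you observed.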

mutex destroyed while busy

There is a singleton EventHandler object that receives events from the main thread. It registers the input in a vector and creates a thread that runs a lambda, which waits for some time before deleting the input from the vector, to prevent repeated execution of the event for that input for some time.
But I'm getting a "mutex destroyed while busy" error. I'm not sure where or how it happened. I'm not even sure what it means, because the object should never be destructed, being a singleton. Some help would be appreciated.
class EventHandler{
public:
    std::mutex simpleLock;
    std::vector<UInt32> stuff;

    void RegisterBlock(UInt32 input){
        stuff.push_back(input);
        // Remove the input again after 200 ms so the event can fire anew.
        std::thread removalCallBack([&](UInt32 input){
            std::this_thread::sleep_for(std::chrono::milliseconds(200));
            simpleLock.lock();
            auto it = Find(stuff, input);
            if (it != stuff.end())
                stuff.erase(it);
            simpleLock.unlock();
        }, input);
        removalCallBack.detach();
    }

    virtual EventResult ReceiveEvent(UInt32 input){
        simpleLock.lock();
        if (Find(stuff, input) != stuff.end()){
            RegisterBlock(input);
            //dostuff
        }
        simpleLock.unlock();
        // (return an EventResult here)
    }
};
What is happening is that a thread is created:
std::thread removalCallBack([&](UInt32 input){
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
    simpleLock.lock();
    ...
removalCallBack.detach();
and then, since removalCallBack is a local variable of the function RegisterBlock, when the function exits the destructor for removalCallBack is called, and ~thread() invokes std::terminate() if the thread is still joinable (i.e. if detach() has not been reached yet).
Documentation for thread destructor
~thread(); (since C++11)
Destroys the thread object. If *this still has an associated running thread (i.e. joinable() == true), std::terminate() is called.
but depending on timing, simpleLock may still be owned by the detached thread (i.e. busy) when the mutex itself is destroyed, which according to the spec leads to undefined behavior; in your case, the "destroyed while busy" error.
To avoid this error, you should either allow the thread object to live on after the function exits (e.g. not make it a local variable) or block until the thread finishes before the function exits, using thread::join.
Dealing with cleaning up after threads can be tricky, especially if they are essentially used as different programs occupying the same address space; in those cases a manager thread is often created, just like you thought of, whose only job is to reclaim thread-related resources. Your situation is a little easier because of the simplicity of the work done in the removalCallBack thread, but there is still cleanup to do.
If the thread object is created by new, then although the system resources used by the underlying system thread will get cleaned up, the memory the C++ object uses remains allocated until delete is called.
Also consider that if the program exits while threads are still running, those threads are terminated, and if a mutex is locked when that happens, once again the behavior is undefined.
What is usually done to guarantee that a thread is no longer running is to join with it; and although the C++ documentation doesn't spell this out, the pthread_join man page states:
Once a thread has been detached, it can't be joined with pthread_join(3) or be made joinable again.
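
As a concrete illustration, here is one possible restructuring (a sketch, under the assumption that RegisterBlock is never called with simpleLock already held, and with std::find standing in for your Find): scoped lock guards so the mutex is never left locked, and owned worker threads that are joined before the handler goes away.

#include <algorithm>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

using UInt32 = unsigned int;  // assumption: UInt32 is a 32-bit unsigned type

class EventHandler {
public:
    void RegisterBlock(UInt32 input) {
        {
            std::lock_guard<std::mutex> guard(simpleLock);
            stuff.push_back(input);
        }
        // Keep the thread object instead of detaching it.
        workers.emplace_back([this, input] {
            std::this_thread::sleep_for(std::chrono::milliseconds(200));
            std::lock_guard<std::mutex> guard(simpleLock);  // unlocks on every path
            auto it = std::find(stuff.begin(), stuff.end(), input);
            if (it != stuff.end())
                stuff.erase(it);
        });
    }

    ~EventHandler() {
        // Join instead of detach: no worker can outlive the mutex it uses.
        for (auto &t : workers)
            if (t.joinable())
                t.join();
    }

private:
    std::mutex simpleLock;
    std::vector<UInt32> stuff;
    std::vector<std::thread> workers;
};

The workers vector grows with every registered block; a long-running program would also prune finished threads, but the point here is only that every thread is joined before the mutex is destroyed.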

How does Wait/Signal (semaphore) implementation pseudo-code "work"?

Wait(semaphore sem) {
    DISABLE_INTS
    sem.val--
    if (sem.val < 0) {
        add thread to sem.L
        block(thread)
    }
    ENABLE_INTS
}

Signal(semaphore sem) {
    DISABLE_INTS
    sem.val++
    if (sem.val <= 0) {
        th = remove next thread from sem.L
        wakeup(th)
    }
    ENABLE_INTS
}
If block(thread) stops a thread from executing, how, where, and when does it return?
Which thread enables interrupts following the Wait()?
the thread that called block() shouldn’t return until another thread has called wakeup(thread)!
but how does that other thread get to run?
where exactly does the thread switch occur?
block(thread) works this way:
Enables interrupts.
Uses some kind of waiting mechanism (provided by the operating system, or busy waiting in the simplest case) to wait until wakeup(thread) is called on this thread. At this point the thread yields its time to the scheduler.
Disables interrupts and returns.
Yes, UP and DOWN are mostly useful when called from different threads, but it is not impossible to call them from a single thread: if you initialize the semaphore with a value > 0, the same thread can enter the critical section and execute both DOWN (before) and UP (after). The value that initializes the semaphore tells how many threads may be inside the critical section at once, which might be 1 (a mutex) or any other positive number.
How are the threads created? That is not shown on the lecture slide, because the slide only demonstrates the principle of how a semaphore works, in pseudocode. How you use those semaphores in your application is a completely different story.
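
To make the pseudocode concrete: in user space the DISABLE_INTS/ENABLE_INTS pair corresponds to holding a mutex, and block/wakeup correspond to waiting on and signalling a condition variable. A minimal counting-semaphore sketch along those lines (note that, unlike the slide, the count never goes negative here; with condition variables it is simpler to test before decrementing, whereas the slide's negative value doubles as the length of sem.L):

#include <condition_variable>
#include <mutex>

class Semaphore {
public:
    explicit Semaphore(int initial) : count(initial) {}

    void Wait() {                              // the slide's Wait()/DOWN
        std::unique_lock<std::mutex> lk(m);    // plays the role of DISABLE_INTS
        // block(thread): wait() atomically releases the lock, sleeps, and
        // re-acquires the lock before returning -- that is where and when
        // the blocked call "returns" after someone else calls Signal().
        cv.wait(lk, [this] { return count > 0; });
        --count;
    }                                          // lock released = ENABLE_INTS

    void Signal() {                            // the slide's Signal()/UP
        std::lock_guard<std::mutex> lk(m);
        ++count;
        cv.notify_one();                       // wakeup(th) for one waiter
    }

private:
    std::mutex m;
    std::condition_variable cv;
    int count;
};

This also answers the interrupt question: nobody has to re-enable interrupts on behalf of the blocked thread, because the wait primitive itself releases the lock while sleeping and re-takes it on wakeup.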

flock locking order?

I'm using a simple test script from
http://www.tuxradar.com/practicalphp/8/11/0
like this:
<?php
$fp = fopen("foo.txt", "w");
if (flock($fp, LOCK_EX)) {
    print "Got lock!\n";
    sleep(10);
    flock($fp, LOCK_UN);
}
fclose($fp);
I opened 5 shells and executed the script one after the other.
The scripts block until the lock is freed and then continue after it is released.
I'm not really interested in the PHP side of this; my question is:
does anyone know the order in which flock() locks are acquired?
e.g.
t0: process 1 locks
t1: process 2 try_lock <- blocking
t2: process 3 try_lock <- blocking
t3: process 1 releases the lock
t4: ?? which process gets the lock?
Is there a simple deterministic order, like a queue, or does the kernel 'just' pick one by "more advanced rules"?
If there are multiple processes waiting for an exclusive lock, it's not specified which one succeeds in acquiring it first. Don't rely on any particular ordering.
Having said that, the current kernel code wakes them in the order they blocked. This comment is in fs/locks.c:
/* Insert waiter into blocker's block list.
* We use a circular list so that processes can be easily woken up in
* the order they blocked. The documentation doesn't require this but
* it seems like the reasonable thing to do.
*/
If you want to have a set of processes run in order, don't use flock(). Use SysV semaphores (semget() / semop()).
Create a semaphore set that contains one semaphore for each process after the first, and initialise them all to 0 (SysV semaphore values cannot be negative). Every process after the first does a semop() on its own semaphore with a sem_op value of -1; since the value is already 0, this blocks. After the first process is complete, it does a semop() on the second process's semaphore with a sem_op value of 1, which wakes the second process. After the second process is complete, it does a semop() on the third process's semaphore with a sem_op value of 1, and so on down the chain, as sketched below.
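
A minimal sketch of that scheme in code (the helper names and key are my own; error handling omitted):

#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>

union semun { int val; };              // required by SysV semctl(SETVAL)

// One semaphore per process after the first; all start at 0 so that
// a decrement blocks until the predecessor posts.
int make_chain(key_t key, int nwaiters)
{
    int id = semget(key, nwaiters, IPC_CREAT | 0600);
    union semun arg;
    arg.val = 0;                       // 0, not -1: values are unsigned
    for (int i = 0; i < nwaiters; ++i)
        semctl(id, i, SETVAL, arg);
    return id;
}

void wait_turn(int id, int idx)        // called by process idx + 1
{
    struct sembuf op = { (unsigned short)idx, -1, 0 };
    semop(id, &op, 1);                 // blocks while the value is 0
}

void pass_turn(int id, int idx)        // called by process idx when done
{
    struct sembuf op = { (unsigned short)idx, +1, 0 };
    semop(id, &op, 1);                 // wakes the blocked successor
}

So process 1 runs and then calls pass_turn(id, 0); process 2 first calls wait_turn(id, 0), runs, then calls pass_turn(id, 1); and so on.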

pthread condition variables on Linux, odd behaviour

I'm synchronizing reader and writer processes on Linux.
I have zero or more processes (the readers) that need to sleep until they are woken up, read a resource, go back to sleep, and so on. Please note that I don't know how many reader processes are up at any given moment.
I have one process (the writer) that writes on a resource, wakes up the readers and does its business until another resource is ready (in detail, I developed a no starve reader-writers solution, but that's not important).
To implement the sleep / wake up mechanism I use a Posix condition value, pthread_cond_t. The clients call a pthread_cond_wait() on the variable to sleep, while the server does a pthread_cond_broadcast() to wake them all up. As the manual says, I surround these two calls with a lock/unlock of the associated pthread mutex.
The condition variable and the mutex are initialized in the server and shared between processes through a shared memory area (because I'm not working with threads, but with separate processes), and I'm sure my kernel and C library support this (because I checked _POSIX_THREAD_PROCESS_SHARED).
What happens is that the first client process sleeps and wakes up perfectly. When I start the second process, it blocks on its pthread_cond_wait() and never wakes up, even if I'm sure (by the logs) that pthread_cond_broadcast() is called.
If I kill the first process and launch another one, it works perfectly. In other words, pthread_cond_broadcast() seems to wake up only one process at a time. If more than one process waits on the very same shared condition variable, only the first one manages to wake up correctly, while the others just seem to ignore the broadcast.
Why this behaviour? If I send a pthread_cond_broadcast(), every waiting process should wake up, not just one (and, however, not always the same one).
Have you set the PTHREAD_PROCESS_SHARED attribute on both your condvar and mutex?
For Linux consult the following man pages:
pthread_mutexattr_init (with sample)
pthread_mutexattr_setpshared
pthread_condattr_init
pthread_condattr_setpshared
Methods, types, constants etc. are normally defined in /usr/include/pthread.h, /usr/include/nptl/pthread.h.
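For reference, a minimal initialisation sketch: both objects get the PTHREAD_PROCESS_SHARED attribute and live inside shared memory. Here I use an anonymous MAP_SHARED mapping created before fork(); for unrelated processes you would place the struct in a shm_open() segment instead.

#include <pthread.h>
#include <sys/mman.h>

struct shared_sync {
    pthread_mutex_t mtx;
    pthread_cond_t  cond;
};

struct shared_sync *create_shared_sync(void)
{
    // Anonymous shared mapping: visible to children after fork().
    void *p = mmap(NULL, sizeof(struct shared_sync),
                   PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    struct shared_sync *s = (struct shared_sync *)p;

    pthread_mutexattr_t ma;
    pthread_mutexattr_init(&ma);
    pthread_mutexattr_setpshared(&ma, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&s->mtx, &ma);
    pthread_mutexattr_destroy(&ma);

    pthread_condattr_t ca;
    pthread_condattr_init(&ca);
    pthread_condattr_setpshared(&ca, PTHREAD_PROCESS_SHARED);
    pthread_cond_init(&s->cond, &ca);
    pthread_condattr_destroy(&ca);

    return s;
}

Without the pshared attribute on the mutex as well as the condvar, waking multiple waiters across processes is exactly the kind of thing that breaks.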
Do you test for some condition before calling pthread_cond_wait()? I am asking because it's a very common mistake: your process must not call wait() unless it knows some other process will call signal() (or broadcast()) later.
Considering this code (from the pthread_cond_wait man page):
pthread_mutex_lock(&mut);
while (x <= y) {
    pthread_cond_wait(&cond, &mut);
}
/* operate on x and y */
pthread_mutex_unlock(&mut);
If you omit the while test and just signal from another process whenever your (x <= y) condition is true, it won't work, since the signal only wakes up processes that are already waiting. If signal() is called before the other process calls wait(), the signal is lost and the waiting process will wait forever.
EDIT: about the while loop.
When you signal one process from another process, the woken process is put on the ready list but is not necessarily scheduled immediately, and your condition (x <= y) may change again in the meantime, since no one holds the lock. That's why you need to check the condition each time you are about to wait. It should always be: wake up -> check whether the condition still holds -> do work.
Hope it's clear.
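For completeness, a sketch of the matching writer side (x, y, mut, and cond as in the man-page fragment above): update the shared state and broadcast while holding the same mutex, so a waiter can never test the predicate between your update and your wake-up.

pthread_mutex_lock(&mut);
x = y + 1;                      /* make the waited-on condition (x <= y) false */
pthread_cond_broadcast(&cond);  /* wake all waiters; each re-tests the predicate */
pthread_mutex_unlock(&mut);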
The documentation says that it should work... are you sure it's the same condition variable that the rest of the waiters are looking at?
This is the example code from opengroup.org:
pthread_cond_wait(mutex, cond):
    value = cond->value;                   /* 1 */
    pthread_mutex_unlock(mutex);           /* 2 */
    pthread_mutex_lock(cond->mutex);       /* 10 */
    if (value == cond->value) {            /* 11 */
        me->next_cond = cond->waiter;
        cond->waiter = me;
        pthread_mutex_unlock(cond->mutex);
        unable_to_run(me);
    } else
        pthread_mutex_unlock(cond->mutex); /* 12 */
    pthread_mutex_lock(mutex);             /* 13 */

pthread_cond_signal(cond):
    pthread_mutex_lock(cond->mutex);       /* 3 */
    cond->value++;                         /* 4 */
    if (cond->waiter) {                    /* 5 */
        sleeper = cond->waiter;            /* 6 */
        cond->waiter = sleeper->next_cond; /* 7 */
        able_to_run(sleeper);              /* 8 */
    }
    pthread_mutex_unlock(cond->mutex);     /* 9 */
What the last poster said is correct. The KEY to the whole cond-variable situation working correctly is that the cond-var is NOT signalled prior to being waited on. It's strictly a signal to be used when others (one or many) are waiting. When no one is waiting, it's effectively a NOP; which, btw, is NOT how I believe it SHOULD work, but it is how it DOES work.
larry
