Critical region codes and Semaphores - semaphore

Semaphores, which are data structures created by the operating system, are used to provide synchronization and mutual exclusion between processes. wait() and signal() are the operations invoked by the operating system to manage a semaphore, and these operations cannot be interrupted by interrupt service routines.
What I am wondering is whether the critical-region code between the wait() and signal() operations can be interrupted or not.

Yes, they can be interrupted, simply because the definition itself imposes no such restriction.
In concurrent programming, concurrent accesses to shared resources can lead to unexpected or erroneous behavior, so parts of the program where the shared resource is accessed are protected. This protected section is the critical section or critical region. It cannot be executed by more than one process at a time.
So a critical section demands mutual exclusion, but it says nothing about atomicity.
So yes, the critical-region code between the wait() and signal() operations can be interrupted, but a good synchronization construct guarantees that once a process/thread has entered the critical section, no other process can enter it, even if the first one is interrupted partway through.
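To make that concrete, here is a minimal sketch, assuming POSIX semaphores and pthreads (the shared counter and function names are invented for illustration). Either worker can be preempted between sem_wait() and sem_post(), yet no other thread gets past sem_wait() until the holder posts:

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    static sem_t gate;           /* binary semaphore: 1 = free, 0 = taken */
    static long shared_counter;  /* the shared resource being protected */

    static void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            sem_wait(&gate);     /* "wait()": blocks while another thread is inside */
            shared_counter++;    /* critical section: the thread may be preempted
                                    here, but no other thread gets past sem_wait() */
            sem_post(&gate);     /* "signal()": lets exactly one waiter proceed */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        sem_init(&gate, 0, 1);   /* initial value 1 gives mutual exclusion */
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("%ld\n", shared_counter);   /* always 200000 */
        return 0;
    }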

Related

What is the difference between an atomic operation and a critical section? Which of the two prevents context switching?

A programming language or the processor already provides "default" atomic operations, and we can use them, as far as I understand.
https://en.wikipedia.org/wiki/Linearizability
What is the difference between an atomic operation and critical section?
Atomic operations are instructions that guarantee atomic accesses/updates of shared (small) variables. These generally include operations like incrementation, decrementation, addition, subtraction, compare-and-swap (aka CAS), exchange, and logical operations (and, or, xor), as well as basic loads/stores. If you want to perform a non-trivial operation that is not supported by the target platform (or one involving large variables), then you cannot use a single atomic operation. This means that either multiple of them are required or another mechanism should be used instead (eg. critical section, transactional memory). Note that using multiple atomic operations often makes things significantly more complex (see the ABA problem). On mainstream CPUs, atomic operations are generally implemented by locking cache lines of shared caches (eg. L3) so that only one thread can access them at a time.
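As a rough illustration using C11 atomics (the variable and function names are invented), a single fetch-and-add is indivisible, a read-modify-write split into two accesses is not, and a CAS retry loop can build an operation the hardware does not provide directly:

    #include <stdatomic.h>

    static atomic_int counter;

    void safe_increment(void)
    {
        atomic_fetch_add(&counter, 1);   /* one indivisible read-modify-write */
    }

    void unsafe_increment(void)
    {
        int v = atomic_load(&counter);   /* two separate atomic accesses: another   */
        atomic_store(&counter, v + 1);   /* thread can slip in between, losing an update */
    }

    /* A non-trivial operation (saturating add) built from a CAS retry loop. */
    void saturating_add(atomic_int *p, int delta, int limit)
    {
        int old = atomic_load(p);
        int desired;
        do {
            desired = old + delta;
            if (desired > limit)
                desired = limit;
            /* on failure, 'old' is reloaded with the current value */
        } while (!atomic_compare_exchange_weak(p, &old, desired));
    }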
Critical sections are meant to protect one or more instructions from being executed by multiple threads at the same time. They are generally protected using a system mutex. The thread entering the critical section locks the associated mutex and unlocks it when leaving the section. System mutexes cause a thread entering a critical section to wait if the associated mutex is already locked. This is generally done using a context switch (the thread is descheduled and rescheduled later).
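By contrast, a critical section protected by a mutex can guard an arbitrary multi-step update. A minimal pthread sketch (the shared total and function name are invented for illustration):

    #include <pthread.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static long shared_total;          /* illustrative shared state */

    void add_to_total(long amount)
    {
        pthread_mutex_lock(&lock);     /* blocks (usually a context switch) if held */
        shared_total += amount;        /* critical section: any multi-step update fits */
        pthread_mutex_unlock(&lock);   /* wakes one waiter, if any */
    }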
Critical sections can be efficient when the lock is only rarely already taken by another thread; context switches can significantly impact performance. Atomic operations are not great either when many threads operate on the same variable: contention effects can make atomic accesses significantly slower (eg. spin locks). This is especially true for atomic CAS operations. Some platforms (eg. GPUs) can execute atomic operations very quickly since they have dedicated units to execute them efficiently.
which of the two prevents context switching?
Neither of the two prevents context switching. Modern operating systems can perform a context switch at any time. That being said, critical sections generally cause context switches: a thread trying to enter a critical section already locked by another thread will typically be put to sleep and woken by the OS scheduler when the other thread unlocks the section. Atomic operations do not impact the scheduling of the system (at least not on mainstream platforms).
Note that the above text is also true for processes.
Speaking only to the nomenclature question:
"Atomic" means "cannot be broken down into smaller parts." In programming, an operation performed by one thread is "atomic" (as seen from other threads) if there is no possible way for the other threads to see the operation in a half-way done state. From the point of view of other threads, it's as if the entire operation happened in a single instant. It either has already happened, or it hasn't happened yet. There is no in between.
As Jérôme Richard points out, modern computer hardware provides atomic operations on simple variables. We can use those to make more complex operations seem "atomic" from the point of view of other threads either by using the hardware atomics in tricky non-blocking algorithms, or by using the hardware atomics in the implementation of mutex locks.
"Critical section" comes from a time before multi-threading. In operating system kernel code, and in "bare metal" application code, there has always been a limited form of concurrency between the main body of code and the interrupt handlers. "Critical section," back in the day, referred to a routine in the main body of code that was protected from interference by the interrupt handlers by executing it with interrupts disabled.
Systems programmers today still use "critical section" with the original meaning, but now we also sometimes say it to talk about a routine that is executed by a thread while the thread has a mutex locked.
IMO, "critical section" encourages a somewhat less useful way of thinking about mutex locks though because it's never the code that needs protection from interference. It's always about protecting the integrity of shared data. Sometimes a programmer who worries about defining The critical section can lose sight of the fact that there may be multiple routines in the program that all access the same shared data.
IMO, this is one place where an object-oriented style of programming shines, because it's easier to keep track of what needs to be protected if it is encapsulated in private members of some object and can only be accessed through the object's thread-safe public methods.
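In C terms the idea looks roughly like this sketch (the account type and its functions are invented for illustration): the mutex lives next to the data it protects, and every access path goes through functions that take the lock:

    #include <pthread.h>

    /* Hypothetical type: the lock and the data it protects live together,
       and only these functions ever touch the balance. */
    typedef struct {
        pthread_mutex_t lock;
        long balance_cents;
    } account_t;

    void account_init(account_t *a)
    {
        pthread_mutex_init(&a->lock, NULL);
        a->balance_cents = 0;
    }

    void account_deposit(account_t *a, long cents)
    {
        pthread_mutex_lock(&a->lock);
        a->balance_cents += cents;
        pthread_mutex_unlock(&a->lock);
    }

    long account_balance(account_t *a)
    {
        pthread_mutex_lock(&a->lock);
        long b = a->balance_cents;
        pthread_mutex_unlock(&a->lock);
        return b;
    }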

Lightweight mutex

The citation comes from http://preshing.com/20111124/always-use-a-lightweight-mutex/
The Windows Critical Section is what we call a lightweight mutex. It’s optimized for the case when there are no other threads competing for the lock. To demonstrate using a simple example, here’s a single thread which locks and unlocks a Windows Mutex exactly one million times.
Does it mean that a lightweight mutex is just a smart heavy (kernel) mutex?
By "smart" I mean one that skips the syscall whenever the mutex is free?
In summary, yes: on Windows, critical sections and mutexes are similar, but critical sections are lighter weight because they avoid a system call when there is no contention.
Windows has two different mutual-exclusion primitives: critical sections and mutexes. They serve similar functions, but critical sections are significantly faster than mutexes.
Mutexes always result in a system call down to the kernel, which requires a processor ring-mode switch and entails a significant amount of overhead. (The user-mode thread traps into the kernel, where the request is serviced in ring 0; the user-mode thread remains halted until execution returns back out of kernel mode.) Although they are slower, mutexes are much more powerful and flexible. They can be shared across processes, a waiting thread can specify a time-out period, and a waiting thread can also determine whether the thread that owned the mutex terminated or whether the mutex was deleted.
Critical sections are much lighter-weight objects, and therefore much faster than mutexes. In the most common case of uncontended acquires, critical sections are incredibly fast because they just atomically increment a value in user-mode and return immediately. (Internally, the InterlockedCompareExchange API is used to "acquire" the critical section.)
Critical sections only switch to kernel mode when there is contention over the acquisition. In such cases, the critical section actually allocates a semaphore internally, storing it in a dedicated field in the critical section's structure (which is originally unallocated). So basically, in cases of contention, you see performance degrade to that of a mutex because you effectively are using a mutex. The user-mode thread is suspended and kernel-mode is entered to wait on either the semaphore or an event.
Critical sections in Windows are somewhat akin to "futexes" in Linux. A futex is a "Fast User-space muTEX" that, like a critical section, only switches to kernel-mode when arbitration is required.
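To show that fast-path/slow-path split, here is a heavily simplified futex-style lock, assuming Linux (error handling omitted; a production version needs more care, see Drepper's "Futexes Are Tricky"). The uncontended acquire and release are single atomic operations in user mode; only contention enters the kernel:

    #include <stdatomic.h>
    #include <linux/futex.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    static atomic_int lock_word;   /* 0 = unlocked, 1 = locked, 2 = locked with waiters */

    static void futex_wait(atomic_int *addr, int val)
    {
        syscall(SYS_futex, addr, FUTEX_WAIT, val, NULL, NULL, 0);
    }

    static void futex_wake_one(atomic_int *addr)
    {
        syscall(SYS_futex, addr, FUTEX_WAKE, 1, NULL, NULL, 0);
    }

    void lw_lock(void)
    {
        int c = 0;
        /* Fast path: an uncontended acquire is a single CAS in user mode. */
        if (atomic_compare_exchange_strong(&lock_word, &c, 1))
            return;
        /* Slow path: advertise a waiter (state 2), then sleep in the kernel
           until we manage to grab the lock ourselves. */
        if (c != 2)
            c = atomic_exchange(&lock_word, 2);
        while (c != 0) {
            futex_wait(&lock_word, 2);          /* sleeps only while word == 2 */
            c = atomic_exchange(&lock_word, 2);
        }
    }

    void lw_unlock(void)
    {
        /* Fast path: if nobody was waiting (old value 1), no system call. */
        if (atomic_exchange(&lock_word, 0) == 2)
            futex_wake_one(&lock_word);
    }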
The performance benefit of a critical section comes with serious caveats, including the inability to specify a wait time-out period, the inability of a thread to determine whether the owning thread was terminated before it released the critical section, the inability to determine whether the critical section was deleted, and the inability to use critical sections across processes (critical sections are process-local objects).
As such, you should keep the following guidelines in mind when deciding between critical sections and mutexes:
If you're going to use a critical section, the operation must be non-blocking. If an operation can block (such as a socket), then you shouldn't use a critical section because it does not allow the waiting thread to specify a wait time-out, which can lead to deadlock.
If it's possible that the thread might throw an exception or be terminated unexpectedly, then use a mutex. With a critical section, there is no way for waiting threads to be notified that the original thread was terminated or that the critical section was deleted.
Critical sections make the most sense when the protected operation has a relatively short duration. These are the cases where avoiding the overhead of a mutex is most important, and also the cases where you're least likely to run into problems with a critical section.
You'll find lots of benchmarks online showing the relative performance difference between critical sections and mutexes, including in the article you link, which says critical sections are 25 times faster than mutexes. I have a comment here in my class library from an article I read a long time ago that says, "On a Pentium II 300 MHz, the round-trip for a critical section (assuming no contention, so no context switching required) takes 0.29 µs. With a mutex, it takes 5.3 µs." The consensus seems to be somewhere between 15–30× faster when you can avoid a kernel-mode transition. I didn't bother to benchmark it myself. :-)
Further reading:
Critical Section Objects on MSDN:
A critical section object provides synchronization similar to that provided by a mutex object, except that a critical section can be used only by the threads of a single process. Event, mutex, and semaphore objects can also be used in a single-process application, but critical section objects provide a slightly faster, more efficient mechanism for mutual-exclusion synchronization (a processor-specific test and set instruction). Like a mutex object, a critical section object can be owned by only one thread at a time, which makes it useful for protecting a shared resource from simultaneous access. Unlike a mutex object, there is no way to tell whether a critical section has been abandoned.
[ … ]
A thread uses the EnterCriticalSection or TryEnterCriticalSection function to request ownership of a critical section. It uses the LeaveCriticalSection function to release ownership of a critical section. If the critical section object is currently owned by another thread, EnterCriticalSection waits indefinitely for ownership. In contrast, when a mutex object is used for mutual exclusion, the wait functions accept a specified time-out interval.
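In code, that API is used roughly as follows (a minimal sketch; the counter and function names are invented for illustration):

    #include <windows.h>

    static CRITICAL_SECTION g_cs;   /* protects g_item_count below */
    static int g_item_count;

    void init_at_startup(void)
    {
        InitializeCriticalSection(&g_cs);   /* must run before any Enter call */
    }

    void add_item(void)
    {
        EnterCriticalSection(&g_cs);        /* waits indefinitely if another thread owns it */
        g_item_count++;
        LeaveCriticalSection(&g_cs);
    }

    BOOL try_add_item(void)
    {
        if (!TryEnterCriticalSection(&g_cs))   /* non-blocking attempt */
            return FALSE;
        g_item_count++;
        LeaveCriticalSection(&g_cs);
        return TRUE;
    }

    void cleanup_at_shutdown(void)
    {
        DeleteCriticalSection(&g_cs);
    }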
INFO: Critical Sections Versus Mutexes, also on MSDN:
Critical sections and mutexes provide synchronization that is very similar, except that critical sections can be used only by the threads of a single process. There are two areas to consider when choosing which method to use within a single process:
Speed. The Synchronization overview says the following about critical sections:
... critical section objects provide a slightly faster, more efficient mechanism for mutual-exclusion synchronization.
Critical sections use a processor-specific test and set instruction to determine mutual exclusion.
Deadlock. The Synchronization overview says the following about mutexes:
If a thread terminates without releasing its ownership of a mutex object, the mutex is considered to be abandoned. A waiting thread can acquire ownership of an abandoned mutex, but the wait function's return value indicates that the mutex is abandoned.
WaitForSingleObject() will return WAIT_ABANDONED for a mutex that has been abandoned. However, the resource that the mutex is protecting is left in an unknown state.
There is no way to tell whether a critical section has been abandoned.
The article you link to in the question also links to this post on Larry Osterman's blog, which gives some more interesting details about the implementation.

What word refers to a code segment/function that can be/is executed concurrently/in parallel by two different threads?

I came across this term while studying threads, synchronization, and writing multi-threaded programs. If I remember correctly, it refers to a section of code that two threads execute in parallel.
If I remember incorrectly, it might actually refer to a section of code that can run simultaneously. Then again, I might be off entirely (sorry).
The term is on the tip of my tongue and I (desperately) want to google it.
REENTRANT and THREAD-SAFE. Both are necessary.
See this Wiki entry on "reentrant":
In computing, a computer program or subroutine is called reentrant if it can be interrupted in the middle of its execution and then safely called again ("re-entered") before its previous invocations complete execution. The interruption could be caused by an internal action such as a jump or call, or by an external action such as a hardware interrupt or signal. Once the reentered invocation completes, the previous invocations will resume correct execution.
This definition originates from single-threaded programming environments where the flow of control could be interrupted by a hardware interrupt and transferred to an interrupt service routine (ISR). Any subroutine used by the ISR that could potentially have been executing when the interrupt was triggered should be reentrant. Often, subroutines accessible via the operating system kernel are not reentrant. Hence, interrupt service routines are limited in the actions they can perform; for instance, they are usually restricted from accessing the file system and sometimes even from allocating memory.
A subroutine that is directly or indirectly recursive should be reentrant. This policy is partially enforced by structured programming languages. However, a subroutine can fail to be reentrant if it relies on a global variable to remain unchanged but that variable is modified when the subroutine is recursively invoked.
This definition of reentrancy differs from that of thread-safety in multi-threaded environments. A reentrant subroutine can achieve thread-safety, but being reentrant alone might not be sufficient to be thread-safe in all situations. Conversely, thread-safe code does not necessarily have to be reentrant (see below for examples).
...
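A classic C illustration of the distinction is strtok() versus strtok_r(): the former keeps its parsing position in hidden static state, so it is neither reentrant nor thread-safe, while the latter makes the caller hold that state:

    #define _POSIX_C_SOURCE 200809L   /* for strtok_r */
    #include <string.h>
    #include <stdio.h>

    /* strtok() stores its position in hidden static state: a second,
       overlapping use (from a signal handler, recursion, or another
       thread) corrupts the first. Not reentrant, not thread-safe. */
    void parse_with_strtok(char *line)
    {
        for (char *tok = strtok(line, ","); tok; tok = strtok(NULL, ","))
            printf("%s\n", tok);
    }

    /* strtok_r() keeps the position in a caller-supplied variable, so
       every invocation is independent: reentrant and thread-safe. */
    void parse_with_strtok_r(char *line)
    {
        char *save;
        for (char *tok = strtok_r(line, ",", &save); tok;
             tok = strtok_r(NULL, ",", &save))
            printf("%s\n", tok);
    }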
I think the term you're looking for is a Critical Section - a piece of code whose function is critically important when dealing with multiple threads.
However, your question posits a block of code that can run simultaneously on multiple threads, which is different from a critical section: a critical section is specifically a block of code that must run on only one thread at a time, for instance, code that increments a bank balance. It's the type of code where one would expect that multiple threads could try to run it, but that specifically requires that only one thread actually be allowed to run it at a time.
There is no name, to the best of my knowledge, for a block of code that could be executed simultaneously on multiple threads, because lots of code does that innocuously.

Why not use a mutex inside an interrupt

I came across this post and noticed that in Clifford's answer he said that we shouldn't use a mutex in an interrupt. I know that in an interrupt we have to avoid too many instructions and delays, etc., but I am not very clear on the reasons. Could anyone clarify why we have to avoid this?
If we want to establish synchronous communication between two interrupt-driven threads, what other mechanisms can we use if a mutex is not allowed?
The original question you cite refers to code on an Atmel ATmega AVR, a simple 8-bit microcontroller. In that context, one can assume that the mutex mechanism is part of a simple RTOS.
In such a system, there is a thread context and an interrupt context. Interrupts are invoked by the hardware, while threads are scheduled by the RTOS scheduler. When an interrupt occurs, any thread is immediately pre-empted; the interrupt must run to completion and can only be preempted by a higher-priority interrupt (where nested interrupts are supported). All pending interrupts will run to completion before the scheduler can run.
Blocking on a mutex (or indeed any blocking kernel object) is a scheduling event. If you were to make a blocking call in an interrupt, the scheduler would never run. In practice, an RTOS would either ignore the blocking call, raise an exception, or enter a terminal error handler.
Some OSes, such as SMX, Velocity, or even WinCE, have somewhat more complex interrupt architectures and support a variety of deferred interrupt handlers. Deferred interrupt handlers are run-to-completion routines scheduled from an interrupt but running outside the interrupt context; the rules for blocking in such handlers may differ, so you would need to refer to the specific OS documentation. Without deferred interrupt handlers, the usual solution is to have a thread wait on some blocking object such as a semaphore, and have the interrupt itself do little more than cause the object to unblock (by giving a semaphore, for example), as in the sketch below.
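As a sketch of that pattern, assuming a FreeRTOS-style API (the UART interrupt and handler task are invented for illustration), the ISR does nothing but give a semaphore, and an ordinary thread does the blocking and the real work:

    #include "FreeRTOS.h"
    #include "semphr.h"
    #include "task.h"

    static SemaphoreHandle_t data_ready;   /* created once at startup */

    void app_init(void)
    {
        data_ready = xSemaphoreCreateBinary();
    }

    /* Interrupt context: never blocks, just signals and returns. */
    void UART_IRQHandler(void)
    {
        BaseType_t woken = pdFALSE;
        xSemaphoreGiveFromISR(data_ready, &woken);
        portYIELD_FROM_ISR(woken);   /* request a switch if the handler task
                                        is now the highest-priority ready task */
    }

    /* Thread context: blocks on the semaphore and does the real work. */
    void uart_handler_task(void *params)
    {
        (void)params;
        for (;;) {
            xSemaphoreTake(data_ready, portMAX_DELAY);  /* sleep until ISR signals */
            /* ... process the received data ... */
        }
    }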
Multi-processor/core and parallel processing systems are another issue altogether. Such systems are well beyond the scope of the question where the original comment was made, and beyond my experience; my comment may not apply in such a system, but there are no doubt additional complexities and considerations in any case.
A mutex is typically used to ensure that a resource is used by only one user at any given time.
When a thread needs to use a resource it attempts to get the mutex first to ensure the resource is available. If the mutex is not available then the thread typically blocks to wait for the mutex to become available.
While a thread owns the mutex, it prevents other threads from obtaining the mutex and interfering with its use of the resource. Higher priority threads are often the concern here because those are the threads that may preempt the mutex owner.
The RTOS kernel assigns ownership of the mutex to a particular thread and typically only the mutex owner can release the mutex.
Now let's imagine this from an interrupt handler's point of view.
If an interrupt handler attempts to get a mutex that is not available, what should it do? The interrupt handler cannot block like the thread (the kernel is not equipped to push the context of an interrupt handler or switch to a thread from an interrupt handler).
If the interrupt handler obtains the mutex, what higher priority code is there that could interrupt the interrupt handler and attempt to use the mutex? Is the interrupt handler going to release the mutex before completing?
How does the kernel assign ownership of the mutex to an interrupt handler? An interrupt handler is not a thread. If the interrupt handler does not release the mutex then how will the kernel validate that the mutex is being released by the owner?
So maybe you have answers for all those questions. Maybe you can guarantee that the interrupt handler runs only when the mutex is available, or that the interrupt handler will not block on the mutex. Or maybe you're trying to protect the resource access from an even higher-priority nested interrupt handler that also wants to use the resource. And maybe your kernel doesn't have any hangup with assigning ownership or restricting who releases the mutex. I guess if you've got all these questions answered, then maybe you have a case for using a mutex within an interrupt handler.
But perhaps what you really need is a semaphore instead. One common application of a semaphore is to signal an event. Semaphores are very often used this way within interrupt handlers. The interrupt handler posts or sets the semaphore to signal that an event has occurred. The threads pend on the semaphore to wait for the event condition. (A semaphore doesn't have the ownership restriction that a mutex has.) Event-signalling semaphores are one common way to establish synchronous communication between two interrupt-driven threads.
The term "mutex" is often defined both as being the simplest form of synchronization between execution contexts, and also as being a construct that will not only check whether a resource is available, but wait for it to become available if it isn't, acquiring it as soon as it becomes available. These definitions are inconsistent, since the simplest forms of synchronization merely involve testing whether one has been granted ownership of a resource, and don't provide any in-built mechanism to wait for it if it isn't.
It is almost never proper to have code within an interrupt handler that waits for a resource to become available, unless the only things that could hold the resource would be higher-priority interrupts or hardware that will spontaneously release it. If the term "mutex" is only used to describe such constructs, then there would be very few cases where one could properly use a mutex within an interrupt handler. If, however, one uses the term "mutex" more broadly to refer to the simplest structures that will ensure that a piece of code that accesses a resource can only execute at times when no other piece of code anywhere in the universe will be accessing that resource, then the use of such constructs within interrupts is often not only proper, but required.
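In that broader sense, on a single-core microcontroller such a "mutex" is often just saving, disabling, and restoring the interrupt mask around the shared access. A minimal sketch, assuming CMSIS-style intrinsics on a Cortex-M part (the receive buffer is invented for illustration):

    #include <stdint.h>
    #include "cmsis_compiler.h"   /* CMSIS intrinsics; header name varies by vendor */

    static volatile uint8_t rx_buffer[64];   /* filled by a UART ISR (illustrative) */
    static volatile unsigned rx_count;

    /* Called from thread code; must not race with the ISR on the same data. */
    unsigned drain_rx_buffer(uint8_t *dst)
    {
        uint32_t primask = __get_PRIMASK();   /* save the current interrupt mask */
        __disable_irq();                      /* enter the critical section      */

        unsigned n = rx_count;
        for (unsigned i = 0; i < n; i++)
            dst[i] = rx_buffer[i];
        rx_count = 0;

        __set_PRIMASK(primask);               /* restore (re-enable) interrupts */
        return n;
    }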
While there might be unusual cases where there's some problem with using a mutex in an interrupt handler, it's quite common practice and there's nothing wrong with it.
It really only makes sense on systems with more than one core. With just a single core (and no hyper-threading), the mutex would never do anything anyway. If the core is running code that acquires a mutex that interrupt code can acquire, interrupts (or the subset of them that matter) are disabled anyway. So with just one core, the mutex would never see any contention.
However, with multiple cores, it's common to use mutexes to protect structures that communicate between interrupt and non-interrupt code. So long as you know what you're doing, and you have to if you're going to write interrupt handlers, there's nothing wrong with it.
How the mutex blocks and unblocks is heavily implementation dependent. It can put the CPU to sleep and be woken by an inter-process interrupt. It can spin the CPU in some CPU-specific way.
Note that a totally unrelated concept that is often confused with this is using user-space mutexes in user-space signal handlers. That's a completely different question.

Difference between Mutex, Semaphore & Spin Locks

I am doing experiments with IPC, especially with Mutex, Semaphore and Spin Lock.
What I have learnt is that a mutex is an asynchronous locking mechanism (with sleeping, as per the theory I read on the net), a semaphore is a synchronous locking mechanism (with signalling and sleeping), and a spin lock is a synchronous but non-sleeping mechanism.
Can anyone help me clarify this in depth?
And another doubt is about the mutex: when I wrote a program with threads and a mutex, while one thread was running, the other thread was not in a sleep state but continuously tried to acquire the lock. So is a mutex sleeping or non-sleeping?
First, remember the goal of these 'synchronizing objects':
These objects were designed to provide efficient and coherent use of 'shared data' between more than one thread within one process or from different processes.
These objects can be 'acquired' or 'released'.
That is it!!! End of story!!!
Now, if it helps you, let me add my grain of sand:
1) Critical Section = User object used for allowing the execution of just one active thread, from many others, within one process. The other, non-selected threads (those not acquiring this object) are put to sleep.
[No interprocess capability; a very primitive object.]
2) Mutex Semaphore (aka Mutex) = Kernel object used for allowing the execution of just one active thread, from many others, within one process or among different processes. The other, non-selected threads (those not acquiring this object) are put to sleep. This object supports thread ownership, thread-termination notification, recursion (multiple 'acquire' calls from the same thread) and 'priority inversion avoidance'.
[Interprocess capability; very safe to use; a kind of 'high-level' synchronization object.]
3) Counting Semaphore (aka Semaphore) = Kernel object used for allowing the execution of a group of active threads, from many others, within one process or among different processes. The other, non-selected threads (those not acquiring this object) are put to sleep.
[Interprocess capability, but not very safe to use because it lacks the following 'mutex' attributes: thread-termination notification, recursion?, 'priority inversion avoidance'?, etc.]
4) And now, talking about 'spinlocks', first some definitions:
Critical Region = A region of memory shared by two or more processes.
Lock = A variable whose value allows or denies entrance to a 'critical region'. (It could be implemented as a simple 'boolean flag'.)
Busy waiting = Continuously testing a variable until some value appears.
Finally:
Spin-lock (aka Spinlock) = A lock which uses busy waiting. (The lock is acquired with xchg or similar atomic operations.)
[No thread sleeping; mostly used at kernel level only. Inefficient for user-level code.]
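A minimal sketch of such a lock using C11's atomic_flag (illustrative only; real kernel spinlocks add back-off, fairness, and preemption control):

    #include <stdatomic.h>

    static atomic_flag slock = ATOMIC_FLAG_INIT;

    void spin_lock(void)
    {
        /* Busy waiting: burn CPU until the test-and-set finds the flag clear. */
        while (atomic_flag_test_and_set_explicit(&slock, memory_order_acquire))
            ;   /* spin: no sleeping, no scheduler involvement */
    }

    void spin_unlock(void)
    {
        atomic_flag_clear_explicit(&slock, memory_order_release);
    }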
As a last comment, I am not sure but I can bet you some big bucks that the above first 3 synchronizing objects (#1, #2 and #3) make use of this simple beast (#4) as part of their implementation.
Have a good day!
References:
-Real-Time Concepts for Embedded Systems by Qing Li with Caroline Yao (CMP Books).
-Modern Operating Systems (3rd) by Andrew Tanenbaum (Pearson Education International).
-Programming Applications for Microsoft Windows (4th) by Jeffrey Richter (Microsoft Programming Series).
Here is a great explanation of the difference between semaphores and mutexes:
http://blog.feabhas.com/2009/09/mutex-vs-semaphores-–-part-1-semaphores/
The short answer has to do with ownership, at least for binary semaphores, but I suggest you read the entire article.
A mutex is a locking mechanism, while a semaphore is a wait-and-signal mechanism.
Both have different applications.
There is a very good explanation given by an IISc professor.
Link for video
