I read that a mutex is a semaphore with value 1 (a binary semaphore) used to enforce mutual exclusion.
I read this link
Semaphore vs. Monitors - what's the difference?
which says that a monitor helps in achieving mutual exclusion.
Can someone tell me the difference between a mutex and a monitor, as both seem to achieve the same thing (mutual exclusion)?
Since you haven't specified which OS or language/library you are talking about, let me answer in a generic way.
Conceptually they are the same, but they are usually implemented slightly differently.
Monitor
Usually, the implementation of monitors is faster and lighter-weight, since it is designed for multi-threaded synchronization within the same process. Also, it is usually provided by the framework/library itself (as opposed to requesting it from the OS).
Mutex
Usually, mutexes are provided by the OS kernel, and libraries/frameworks simply provide an interface to invoke them. This makes them heavier-weight and slower, but they work across threads in different processes. The OS might also provide features for accessing the mutex by name, for easy sharing between instances of separate executables (as opposed to using a handle, which can only be shared via fork).
Monitors are different from mutexes, but they can be considered similar in the sense that monitors are built on top of mutexes.
A monitor is a synchronization construct that allows threads to have both mutual exclusion (using a lock) and cooperation, i.e. the ability to make threads wait for a certain condition to become true (using a wait-set).
In other words, along with data that implements a lock, every Java object is logically associated with data that implements a wait-set. Whereas locks help threads work independently on shared data without interfering with one another, wait-sets help threads cooperate with one another towards a common goal: for example, all waiting threads are moved to the wait-set, and all are notified once the lock is released. The wait-set, with the additional help of the lock (mutex), is what builds a monitor.
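Java bakes this lock + wait-set pairing into every object, but the same shape can be sketched in plain C: a pthread mutex plays the role of the lock and a condition variable plays the role of the wait-set. This is just an illustrative analogue (not how the JVM implements it); compile with -pthread.

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;  /* the "lock"     */
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;   /* the "wait-set" */
static bool ready = false;                                 /* shared state   */

/* Cooperation: block until another thread makes the condition true. */
void wait_until_ready(void) {
    pthread_mutex_lock(&lock);
    while (!ready)                        /* re-check after every wakeup     */
        pthread_cond_wait(&cond, &lock);  /* releases the lock while waiting,
                                             re-acquires it before returning */
    pthread_mutex_unlock(&lock);
}

/* Mutual exclusion: update the shared state, then notify all waiters. */
void set_ready(void) {
    pthread_mutex_lock(&lock);
    ready = true;
    pthread_cond_broadcast(&cond);        /* the notifyAll() equivalent      */
    pthread_mutex_unlock(&lock);
}
```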
If you want, you can see my answer here, which may or may not be relevant to this question.
You can find another relevant discussion here:
Semaphore vs. Monitors - what's the difference?
Unfortunately, the textbook definitions do not always correspond to how different platforms and languages use the terms. So to get precise answers you have to specify the platform and context. But in general:
A mutex is a lock which can only be owned by a single thread at a time. The lock doesn't in itself protect anything, but code can check for ownership of a mutex to ensure that some section of code is only executed by a single thread at a time. If a thread wants to acquire a mutex lock that is already taken, the thread is blocked until it becomes available.
In Java terminology, a monitor is a mutex lock which is implicitly associated with an object. When the synchronized keyword is applied to classes or methods, an implicit mutex lock is created around the code, which ensures that only one thread at a time can execute it. This is called a monitor lock or just a monitor.
So in Java a monitor is not a specific object, rather any object has a monitor lock available which is invoked with the synchronized keyword.
The synchronized keyword can also be used on a block of code, in which case the object to lock on is explicitly specified. Here it gets a bit weird, because you can use the monitor of one object to lock access to another object.
In computer science textbooks you may meet a different kind of monitor, the Brinch Hansen or Hoare monitor, which is a class or module that is implicitly thread-safe (like a synchronized class in Java) and which has multiple conditions that threads can wait/signal on. This is a higher-level concept than the Java monitor.
C#/.NET has monitors similar to Java's, but it also has a Mutex class in the standard library - which is different from the mutex lock used in the monitor. The monitor lock only exists inside a single process, while the Mutex lock is machine-wide. So a monitor lock is appropriate for making objects and data structures thread-safe, but not for providing system-wide exclusive access to, say, a file or device.
So bottom line: These terms can mean different things, so if you want a more specific answer you should specify a specific platform.
Related
I was reading up on threads, and as I understand it, they are a set of values for an execution context. From what I understand, a thread is composed of values (registers, PC, stack, etc.) that allow a CPU to continue running a set of instructions.
However, my question is: how are these threads made? I hear some of my professors throw around the word thread as a way to break up a process into multiple (mostly) independent parts of code (i.e. multithreading). How does this work? Is there another section of memory that stores specifically what a thread can run, as well as its state?
First of all, you have to understand that operating systems vary greatly, both in their general workings and in their implementations of seemingly identical functions.
So don't go into these kinds of questions thinking that if one operating system does something in some way, then other operating systems do it in a similar manner.
Now to your question:
how are these threads made?
I will answer it using Linux as an example. When creating a new process, Linux lets you specify which data structures (file descriptors, I/O context, etc.) the new process will share with its parent process. You can do this using the clone system call.
You can see in the documentation of clone that it takes parameters specifying these sharing properties.
Now you can call a task_struct a thread if it shares all sharable data structures with its parent (because this property is consistent with the conventional definition of a thread), and if it shares none of them you would call it a process.
But as far as Linux is concerned there is no separate notion of a thread versus a process; all you have is a task_struct, which may share certain resources with its parent.
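To make that concrete, here is a minimal sketch of creating a thread-like task with clone on Linux. The flag set is illustrative; real thread libraries pass more flags (CLONE_THREAD, CLONE_SIGHAND, CLONE_SETTLS, ...) and manage the stack and TLS far more carefully.

```c
#define _GNU_SOURCE          /* for clone() */
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

static int counter = 0;      /* shared with the child only because of CLONE_VM */

static int child_fn(void *arg) {
    counter++;               /* runs in the same address space as the parent */
    return 0;
}

int main(void) {
    const size_t stack_size = 64 * 1024;
    char *stack = malloc(stack_size);    /* the new task needs its own stack */
    if (!stack) return 1;

    /* Share the address space, open files, and filesystem info with the
     * parent -- roughly what "a thread" means. Drop these flags and you
     * get a fork()-like separate process instead. */
    int tid = clone(child_fn, stack + stack_size,   /* stack grows downward */
                    CLONE_VM | CLONE_FILES | CLONE_FS | SIGCHLD, NULL);
    if (tid == -1) { perror("clone"); return 1; }

    waitpid(tid, NULL, 0);
    printf("counter = %d\n", counter);   /* prints 1: the memory was shared */
    free(stack);
    return 0;
}
```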
Let's say that I have a shared object with a piece of code protected by a critical section, and more than one thread accesses the object for read/write. When a thread is inside the critical section, the other threads are waiting. Once the thread gets out of the CS, the OS gives access to any of the waiting threads.
If I am confined to only one process, is the CS alone good enough protection for the shared object?
I ask because I have seen on the web that the right way to do it is to use a kernel object (e.g. mutex, semaphore) to guard the CS. A thread wishing to use the shared resource needs to obtain the mutex/semaphore first with a WaitForSingleObject type of function. If a mutex is used, then only one of them can access the resource. Once the mutex is obtained, the thread enters the CS, does what it is supposed to do, then leaves the CS and releases the mutex. Then the OS allows any other waiting thread to obtain the mutex, and so on and so forth.
But isn't this the same as using only the CS?
Also, using a mutex is supposed to be significantly slower than using a CS alone. The only problem I see with using only a CS is that if the thread crashes inside the CS, then the other threads may never access the shared resource.
Is there any other reason why this approach is better?
thanks in advance
It sounds like you're discussing some Windows-specific terminology in a way that's getting it mixed up with some general computer science terminology.
In computer science the term "critical section" is used for areas of code that must run exclusively (usually due to data sharing). In Windows, there's a synchronization object called CRITICAL_SECTION that can be used to provide exclusive access to areas of execution. Another attribute of a CRITICAL_SECTION object on Windows is that it is limited to being used within a single process.
In computer science, the term 'mutex' is often used to describe an object that can be used to provide synchronization among parallel or concurrent threads of execution. In Windows, there is also a mutex object, which can be created with the CreateMutex() function (which returns a HANDLE representing the mutex). That object can be used to synchronize access among threads in the same or different processes, so in many ways it can be used similarly to a CRITICAL_SECTION (but with different APIs). If you want to synchronize threads of execution that are in different processes, a mutex object can be used, while a CRITICAL_SECTION object cannot.
So to answer your question (I think): if you are only concerned with protecting a critical section among threads that are part of the same process, a CRITICAL_SECTION object should be adequate. A mutex object can be used instead, but it may be somewhat less performant. There should be no need to use both types of objects.
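As a sketch of the two options in C (the mutex name below is illustrative, not from your code):

```c
#include <windows.h>

static CRITICAL_SECTION g_cs;   /* usable only within this process */
static LONG g_shared = 0;

/* Option 1: in-process protection with a CRITICAL_SECTION. */
void in_process_update(void) {
    EnterCriticalSection(&g_cs);    /* cheap user-mode fast path */
    g_shared++;
    LeaveCriticalSection(&g_cs);
}

/* Option 2: a named kernel mutex, visible to other processes too. */
void cross_process_update(void) {
    HANDLE h = CreateMutexA(NULL, FALSE, "Global\\MySharedResource");
    if (!h) return;
    if (WaitForSingleObject(h, INFINITE) == WAIT_OBJECT_0) {
        g_shared++;                 /* ... touch the shared resource ... */
        ReleaseMutex(h);
    }
    CloseHandle(h);
}

int main(void) {
    InitializeCriticalSection(&g_cs);
    in_process_update();
    cross_process_update();
    DeleteCriticalSection(&g_cs);
    return 0;
}
```

Note that there is no need to nest them as the question describes; pick whichever matches the scope you need.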
I was wondering whether it would ever make sense to use a mutex or semaphore when there is only one thread.
Thanks for your help.
I design thread protection into my components because they are reusable and scalable components intended to work in any environment I can realistically anticipate. Many times they are initially used in a single-threaded environment. Often the scope of the implementation expands to include more threads. Then I don't have to chase down resources to protect from the new access scenarios.
A mutex can make sense, since a mutex can be used for system-wide sharing rather than just internal, process-wide sharing. For example, you can use a mutex to prevent an application from being started twice.
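A common sketch of that trick on Windows, in C (the mutex name is illustrative):

```c
#include <windows.h>
#include <stdio.h>

int main(void) {
    /* Try to create (and immediately own) a machine-wide named mutex. */
    HANDLE h = CreateMutexA(NULL, TRUE, "Global\\MyAppSingleInstance");
    if (h != NULL && GetLastError() == ERROR_ALREADY_EXISTS) {
        puts("Another instance is already running; exiting.");
        CloseHandle(h);
        return 1;
    }
    /* ... normal application work; holding the mutex marks this process
       as "the" running instance ... */
    if (h) CloseHandle(h);   /* also released automatically at process exit */
    return 0;
}
```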
This may be a bit out there, but let's say you are writing a recursive function and you want each level to register with a separate resource. This way you can keep the responsibility of cleaning up the resource in one place (the resource pool).
Sounds like a trick question. Technically, yes. A named mutex can be used to synchronize multiple processes, each containing a single thread.
You can use system-wide semaphores (and even mutexes) to do inter-process communication.
You can signal from a single-threaded process to another single-threaded process by acquire()/release()-ing on a named semaphore, for example.
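For example, on POSIX systems the two processes could rendezvous on a named semaphore like this (a minimal sketch; the name "/demo_sem" is made up, link with -pthread):

```c
#include <fcntl.h>      /* O_CREAT */
#include <semaphore.h>

/* Run in process A: wake up whoever is waiting on the semaphore. */
int signal_peer(void) {
    sem_t *s = sem_open("/demo_sem", O_CREAT, 0644, 0);  /* initial value 0 */
    if (s == SEM_FAILED) return -1;
    sem_post(s);                 /* "release": lets one waiter through */
    return sem_close(s);
}

/* Run in process B: block until process A posts. */
int wait_for_peer(void) {
    sem_t *s = sem_open("/demo_sem", O_CREAT, 0644, 0);
    if (s == SEM_FAILED) return -1;
    sem_wait(s);                 /* "acquire": sleeps until the post arrives */
    return sem_close(s);
}
```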
If the environment supports system interrupts, that adds non-linear behaviour: a semaphore can be used to make the main thread sleep until an interrupt triggers.
I have two processes which access the same physical memory (a GPIO data address).
So how can I synchronize between these apps?
I understand that we have locking mechanisms such as mutexes and semaphores, so which method is the fastest?
Thanks for your help,
-nm
Mutexes and semaphores are generally considered to be concurrency solutions in the same address space -- meaning that different parts of the same program will lock their access to a resource using one of these contraptions.
When you're dealing with separate processes, the standard way to do this on Linux is to create something in /var/lock, like /var/lock/myapp.lock, and place your PID followed by a newline in it. Then other processes will check for its existence, and if you're crafty check the PID to make sure it's still alive, too.
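A minimal C sketch of that convention (the path and error handling are illustrative; production code would also handle stale locks left by dead processes):

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 0 if we now hold the lock, -1 if another process holds it. */
int take_lock(const char *path) {    /* e.g. "/var/lock/myapp.lock" */
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd == -1)
        return -1;                   /* file exists: someone got here first */
    char buf[32];
    int n = snprintf(buf, sizeof buf, "%d\n", (int)getpid());
    if (n > 0)
        write(fd, buf, (size_t)n);   /* PID followed by a newline */
    close(fd);
    return 0;
}

void drop_lock(const char *path) {
    unlink(path);                    /* remove the file to release the lock */
}
```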
If you need real-time access to the area, skip the filesystem and the processes will have to communicate via IPC (LET_ME_KNOW_WHEN_DONE, OKAY_IM_DONE, you get the idea), or -- better -- write a process whose sole purpose is to read and write to the GPIO memory, and your other programs communicate with it via IPC (probably the best approach).
Mutex means mutual exclusion - a semaphore is just a variable used to determine whether the resource is in use. In Windows, there is a Mutex object that can be created to protect a shared resource.
The issue is: what language are you using? What OS? (I am assuming Linux.) Most languages provide support for multi-threading and mutual exclusion, and you should use the built-in constructs.
For example, using C on Linux, you might want to include semaphore.h and look up the calls for sem_init, sem_wait, etc.
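A minimal sketch of those calls (link with -pthread; the semaphore here acts as a one-slot lock):

```c
#include <semaphore.h>
#include <stdio.h>

int main(void) {
    sem_t sem;
    sem_init(&sem, 0, 1);   /* 0 = shared between threads of this process,
                               initial value 1 = one holder at a time */
    sem_wait(&sem);         /* enter the protected region */
    puts("inside the critical section");
    sem_post(&sem);         /* leave it */
    sem_destroy(&sem);
    return 0;
}
```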
Back in my days as a BeOS programmer, I read this article by Benoit Schillings, describing how to create a "benaphore": a method of using an atomic variable to enforce a critical section that avoids the need to acquire/release a mutex in the common (no-contention) case.
I thought that was rather clever, and it seems like you could do the same trick on any platform that supports atomic-increment/decrement.
On the other hand, this looks like something that could just as easily be included in the standard mutex implementation itself... in which case implementing this logic in my program would be redundant and wouldn't provide any benefit.
Does anyone know if modern locking APIs (e.g. pthread_mutex_lock()/pthread_mutex_unlock()) use this trick internally? And if not, why not?
What your article describes is in common use today. Most often it's called a "Critical Section", and it consists of an interlocked variable, a bunch of flags, and an internal synchronization object (a mutex, if I remember correctly). Generally, in scenarios with little contention, the Critical Section executes entirely in user mode, without involving the kernel synchronization object. This guarantees fast execution. When contention is high, the kernel object is used for waiting, which releases the time slice and is conducive to faster turnaround.
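For reference, the benaphore algorithm itself fits in a few lines. Here is a C11 sketch, with a POSIX semaphore standing in for BeOS's sem_id (the type and function names are mine, not from the article; link with -pthread):

```c
#include <semaphore.h>
#include <stdatomic.h>

typedef struct {
    atomic_int count;   /* how many threads have asked for the lock */
    sem_t sem;          /* kernel object, touched only under contention */
} benaphore;

void ben_init(benaphore *b) {
    atomic_init(&b->count, 0);
    sem_init(&b->sem, 0, 0);     /* starts at 0: nothing to hand out yet */
}

void ben_lock(benaphore *b) {
    /* Fast path: count goes 0 -> 1 and we own the lock, no kernel call. */
    if (atomic_fetch_add(&b->count, 1) > 0)
        sem_wait(&b->sem);       /* contended: block in the kernel */
}

void ben_unlock(benaphore *b) {
    /* If anyone queued up behind us, hand the lock to exactly one waiter. */
    if (atomic_fetch_sub(&b->count, 1) > 1)
        sem_post(&b->sem);
}
```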
Generally, there is very little sense in implementing synchronization primitives in this day and age. Operating systems come with a big variety of such objects, and they are optimized and tested in a significantly wider range of scenarios than a single programmer can imagine. It literally takes years to invent, implement and test a good synchronization mechanism. That's not to say that there is no value in trying :)
Java's AbstractQueuedSynchronizer (and its sibling AbstractQueuedLongSynchronizer) works similarly, or at least it could be implemented similarly. These types form the basis for several concurrency primitives in the Java library, such as ReentrantLock and FutureTask.
It works by way of using an atomic integer to represent state. A lock may define the value 0 as unlocked, and 1 as locked. Any thread wishing to acquire the lock attempts to change the lock state from 0 to 1 via an atomic compare-and-set operation; if the attempt fails, the current state is not 0, which means that the lock is owned by some other thread.
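That compare-and-set step is not Java-specific; the same core move looks like this in C11 atomics (a minimal sketch, with names of my own choosing):

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_int state; } cas_lock;   /* 0 = unlocked, 1 = locked */

/* Try to take the lock: succeeds only if state was 0 when we looked. */
bool try_acquire(cas_lock *l) {
    int expected = 0;
    return atomic_compare_exchange_strong(&l->state, &expected, 1);
}

void release(cas_lock *l) {
    atomic_store(&l->state, 0);   /* hand the lock back */
}
```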
AbstractQueuedSynchronizer also facilitates waiting on locks and notification of conditions by maintaining CLH queues, which are lock-free linked lists representing the line of threads waiting either to acquire the lock or to receive notification via a condition. Such notification moves one or all of the threads waiting on the condition to the head of the queue of those waiting to acquire the related lock.
Most of this machinery can be implemented in terms of an atomic integer representing the state as well as a couple of atomic pointers for each waiting queue. The actual scheduling of which threads will contend to inspect and change the state variable (via, say, AbstractQueuedSynchronizer#tryAcquire(int)) is outside the scope of such a library and falls to the host system's scheduler.