#include <iostream>
#include <mutex>

using namespace std;

int main()
{
    mutex m;
    m.lock();
    cout << "locked once\n";
    m.lock();
    cout << "locked twice\n";
    return 0;
}
Output:
./a.out
locked once
locked twice
Shouldn't the program deadlock at the point of the second lock, i.e. when a mutex is locked twice by the same thread?
If lock is called by a thread that already owns the mutex, the behavior is undefined: the program may deadlock, or, if the implementation can detect the deadlock, a resource_deadlock_would_occur error condition may be thrown.
http://en.cppreference.com/w/cpp/thread/mutex/lock
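If you genuinely need to re-lock a mutex from the thread that already holds it, the standard tool is std::recursive_mutex, which maintains a per-thread ownership count. A minimal sketch of the same program with the recursive type (my illustration, not part of the question):

#include <iostream>
#include <mutex>

int main()
{
    std::recursive_mutex m;
    m.lock();
    std::cout << "locked once\n";
    m.lock();                    // fine: the owning thread's lock count becomes 2
    std::cout << "locked twice\n";
    m.unlock();                  // must unlock once per lock
    m.unlock();
    return 0;
}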
I am practising multithreaded programming with C++. When I use std::lock_guard in the code below, its run time becomes shorter than without the lock. That is surprising. Why?
The lock version:
#include <iostream>
#include <thread>
#include <mutex>
#include <ctime>

using namespace std;

class test {
    std::mutex m;
    int a;
public:
    test() : a(0) {}
    void add() {
        std::lock_guard<std::mutex> guard(m);
        for (int i = 0; i < 1e9; i++) {
            a++;
        }
    }
    void print() {
        std::cout << a << std::endl;
    }
};

int main() {
    test t;
    auto start = clock();
    std::thread t1(&test::add, ref(t));
    std::thread t2(&test::add, ref(t));
    t1.join();
    t2.join();
    auto end = clock();
    t.print();
    cout << "time = " << double(end - start) / CLOCKS_PER_SEC << "s" << endl;
    return 0;
}
and the output is:
2000000000
time = 5.71852s
The no-lock version is:
#include <iostream>
#include <thread>
#include <mutex>
#include <ctime>

using namespace std;

class test {
    std::mutex m;
    int a;
public:
    test() : a(0) {}
    void add() {
        // std::lock_guard<std::mutex> guard(m);
        for (int i = 0; i < 1e9; i++) {
            a++;
        }
    }
    void print() {
        std::cout << a << std::endl;
    }
};

int main() {
    test t;
    auto start = clock();
    std::thread t1(&test::add, ref(t));
    std::thread t2(&test::add, ref(t));
    t1.join();
    t2.join();
    auto end = clock();
    t.print();
    cout << "time = " << double(end - start) / CLOCKS_PER_SEC << "s" << endl;
    return 0;
}
and the output is:
1010269798
time = 10.765s
I'm using Ubuntu 18.04, and the g++ version is:
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
In my opinion, the lock is an extra operation, so of course it should cost more time.
Can someone explain this? Thanks.
Modifying a variable from multiple threads causes undefined behaviour. This means the compiler and the processor are free to do whatever they want in this case (removing the loop, for example, or not reloading the variable from memory, since it is not supposed to be modified by another thread in the first place). As a result, studying the performance of this case is not really meaningful.
Assuming the compiler does not perform any (allowed) advanced optimizations, the program contains a race condition. It is certainly slower because of a cache-line bouncing effect: multiple cores compete for the same cache line, and moving it from one core to another is very slow compared to incrementing the variable from the L1 cache (this is certainly the overhead you see). Indeed, on standard x86-64 platforms like mainstream Intel processors, moving a cache line from one core to another means invalidating the copies held by the other cores' L1/L2 caches and fetching it from the L3 cache, which is much slower than the L1 (lower throughput and much higher latency). Note that this behaviour depends on the target platform (mainly the processor, besides compiler optimizations), but most platforms work similarly. For more information, please read this and that about cache-coherence protocols.
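If you want to measure anything meaningful, the race has to go first. A minimal race-free sketch (my own illustration, not from the answer above): each thread counts in a local variable and merges the result under the lock once, which avoids both the undefined behaviour and the cache-line bouncing:

#include <iostream>
#include <mutex>
#include <thread>

class test {
    std::mutex m;
    long long a = 0;
public:
    void add() {
        long long local = 0;               // thread-private: lives in a register or the local L1
        for (int i = 0; i < 1000000000; i++) {
            local++;
        }
        std::lock_guard<std::mutex> guard(m);
        a += local;                        // one short critical section per thread
    }
    long long get() { return a; }
};

int main() {
    test t;
    std::thread t1(&test::add, std::ref(t));
    std::thread t2(&test::add, std::ref(t));
    t1.join();
    t2.join();
    std::cout << t.get() << '\n';          // prints 2000000000
}

With one short critical section per thread, the cost of the mutex itself becomes negligible next to the loop.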
As you can see, when I remove mt.lock() and mt.unlock(), the result is smaller than 50000.
Why? What actually happens? I would be very grateful if you could explain it to me.
#include <iostream>
#include <thread>
#include <vector>
#include <mutex>

using namespace std;

class counter {
public:
    mutex mt;
    int value;

    counter() : value(0) {}
    void increase()
    {
        //mt.lock();
        value++;
        //mt.unlock();
    }
};

int main()
{
    counter c;
    vector<thread> threads;
    for (int i = 0; i < 5; ++i) {
        threads.push_back(thread([&]()
        {
            for (int i = 0; i < 10000; ++i) {
                c.increase();
            }
        }));
    }
    for (auto& t : threads) {
        t.join();
    }
    cout << c.value << endl;
    return 0;
}
++ is actually a sequence of operations: reading the value, incrementing it, and writing it back. Since the sequence isn't atomic, multiple threads executing it on the same variable will get mixed up.
As an example, consider three threads operating in the same region without any locking:
Threads 1 and 2 read value as 999
Thread 1 computes the incremented value as 1000 and updates the variable
Thread 3 reads 1000, increments to 1001 and updates the variable
Thread 2 computes the incremented value as 999 + 1 = 1000 and overwrites thread 3's work with 1000
Now if you were using something like the "fetch-and-add" instruction, which is atomic, you wouldn't need any locks. See fetch_add.
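Concretely, the counter above can drop the mutex by making value a std::atomic<int>, so the whole read-modify-write becomes one atomic operation. A sketch of that change (my adaptation, assuming C++11):

#include <atomic>
#include <iostream>
#include <thread>
#include <vector>

class counter {
public:
    std::atomic<int> value{0};
    void increase() {
        value.fetch_add(1, std::memory_order_relaxed);  // one atomic read-modify-write
    }
};

int main() {
    counter c;
    std::vector<std::thread> threads;
    for (int i = 0; i < 5; ++i) {
        threads.emplace_back([&] {
            for (int j = 0; j < 10000; ++j) {
                c.increase();
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    std::cout << c.value << std::endl;  // now reliably 50000
    return 0;
}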
I have the following code:
#include <iostream>
#include <thread>
#include <mutex>
#include <unistd.h>

using namespace std;

bool isRunning;
mutex locker;

void threadFunc(int num) {
    while (isRunning) {
        locker.lock();
        cout << num << endl;
        locker.unlock();
        sleep(1);
    }
}

int main(int argc, char *argv[])
{
    isRunning = true;
    thread thr1(threadFunc, 1);
    thread thr2(threadFunc, 2);
    cout << "Hello World!" << endl;
    thr1.join();
    thr2.join();
    return 0;
}
When running this code I expect output like:
1
2
1
2
1
2
1
2
...
but I don't get that; instead I get something like this:
1
2
1
2
2 <--- why so?
1
2
1
And if I run this code on Windows, replacing #include <unistd.h> with #include <windows.h> and sleep(1) with Sleep(1000), the output is exactly what I want, i.e. 1212121212.
So why is that, and how can I achieve the same result on Linux?
It comes down to the scheduling of threads. Sometimes one thread executes faster than the other. Apparently thread 2 ran faster once, so you got ... 1 2 2 ... There is nothing wrong with that: the mutex only ensures that one thread at a time prints its number, nothing more. There are uncertainties, such as when a thread goes to sleep and when it is woken up, and these do not take exactly the same time in both threads on every iteration.
To make the threads execute alternately, a different semaphore arrangement is needed. For example, let there be two semaphores, s1 and s2, with initial values 1 and 0 respectively. Consider the following pseudocode:
// Thread 1:
P (s1)
print number
V (s2)
// Thread 2:
P (s2)
print number
V (s1)
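This P/V scheme translates directly to std::binary_semaphore (acquire is P, release is V). A sketch assuming C++20 (the printer helper is my own naming):

#include <iostream>
#include <semaphore>
#include <thread>

std::binary_semaphore s1{1};  // thread 1 may go first
std::binary_semaphore s2{0};

void printer(int num, std::binary_semaphore& mine, std::binary_semaphore& other) {
    for (int i = 0; i < 4; ++i) {
        mine.acquire();        // P(mine): wait for our turn
        std::cout << num << '\n';
        other.release();       // V(other): hand the turn to the other thread
    }
}

int main() {
    std::thread t1(printer, 1, std::ref(s1), std::ref(s2));
    std::thread t2(printer, 2, std::ref(s2), std::ref(s1));
    t1.join();
    t2.join();                 // prints 1 2 1 2 1 2 1 2
}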
I read this in Advanced Programming in the UNIX Environment, 3rd edition, section 11.6.2, Deadlock Avoidance:
A thread will deadlock itself if it tries to lock the same mutex twice
In order to verify this, I wrote a demo:
#include <stdio.h>
#include <pthread.h>

pthread_mutex_t mutex;

int main() {
    pthread_mutex_init(&mutex, NULL);
    pthread_mutex_lock(&mutex);
    printf("lock 1\n");
    pthread_mutex_lock(&mutex);
    printf("lock 2\n");
    pthread_mutex_unlock(&mutex);
    printf("unlock 1\n");
    pthread_mutex_unlock(&mutex);
    printf("unlock 2\n");
    pthread_mutex_destroy(&mutex);
    return 0;
}
The main thread didn't block, and the output is:
lock 1
lock 2
unlock 1
unlock 2
Why is it so?
How are you compiling this? I suspect you did not pass the -pthread option to the compiler, so pthread-related functions like the ones above remain no-ops (i.e. the real implementations are not pulled in).
I just tested your program compiled as
cc -pthread meh.c
and the result nicely hangs after "lock 1".
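If you would rather have the double lock fail deterministically than hang, POSIX offers an error-checking mutex type. A sketch using the standard attribute API (my illustration; compile with -pthread):

#include <errno.h>
#include <pthread.h>
#include <stdio.h>

int main(void) {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);

    pthread_mutex_t mutex;
    pthread_mutex_init(&mutex, &attr);

    pthread_mutex_lock(&mutex);
    printf("lock 1\n");
    if (pthread_mutex_lock(&mutex) == EDEADLK) {
        // the error-checking type reports the self-deadlock instead of hanging
        printf("second lock refused: EDEADLK\n");
    }
    pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    pthread_mutexattr_destroy(&attr);
    return 0;
}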
Everyone knows the classic model of a process listening for connections on a socket and forking a new process to handle each new connection. Normal practice is for the parent process to immediately call close on the newly created socket, decrementing the handle count so that only the child has a handle to the new socket.
I've read that the only difference between a process and a thread in Linux is that threads share the same memory. In that case, I'm assuming that spawning a new thread to handle a new connection also duplicates file descriptors and would likewise require the 'parent' thread to close its copy of the socket?
No. Threads share the same memory, so they share the same variables, and they also share the same file descriptor table. If you close the socket in the parent thread, it will also be closed in the child thread.
EDIT:
man fork: The child inherits copies of the parent’s set of open file descriptors.
man pthreads: threads share a range of other attributes (i.e., these attributes are process-wide rather than per-thread): [...] open file descriptors
And some code:
#include <cstring>
#include <iostream>
using namespace std;

#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

// global variable: one file descriptor shared by both threads
int fd = -1;

void * threadProc(void * param) {
    cout << "thread: begin" << endl;
    sleep(2);
    int rc = close(fd);
    if (rc == -1) {
        int errsv = errno;
        cout << "thread: close() failed: " << strerror(errsv) << endl;
    }
    else {
        cout << "thread: file is closed" << endl;
    }
    cout << "thread: end" << endl;
    return NULL;
}

int main() {
    int rc = open("/etc/passwd", O_RDONLY);
    fd = rc;
    pthread_t threadId;
    rc = pthread_create(&threadId, NULL, &threadProc, NULL);
    sleep(1);
    rc = close(fd);
    if (rc == -1) {
        int errsv = errno;
        cout << "main: close() failed: " << strerror(errsv) << endl;
        return 0;
    }
    else {
        cout << "main: file is closed" << endl;
    }
    sleep(2);
}
Output is:
thread: begin
main: file is closed
thread: close() failed: Bad file descriptor
thread: end
In principle, Linux clone() can create not only a new process (like fork()) or a new thread (like pthread_create), but anything in between.
In practice, it is only ever used for one or the other. Threads created with pthread_create share their file descriptors with all other threads in the process (not just the parent). This is non-negotiable.
Sharing a file descriptor and having a copy of it are different things. If you have a copy (as after fork()), all copies must be closed before the underlying file handle goes away. If you share the FD, as threads do, then once any thread closes it, it is gone.
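The fork() side of that contrast is easy to demonstrate: the child closes its copy, and the parent's descriptor keeps working (unlike the thread example above, where close() in one thread broke the other). A sketch:

#include <fcntl.h>
#include <iostream>
#include <sys/wait.h>
#include <unistd.h>

int main() {
    int fd = open("/etc/passwd", O_RDONLY);
    pid_t pid = fork();
    if (pid == 0) {      // child: owns a *copy* of fd
        close(fd);       // closes only the child's copy
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    char buf[16];
    ssize_t n = read(fd, buf, sizeof buf);  // parent's fd is still valid
    std::cout << "parent read " << n << " bytes after child's close()\n";
    close(fd);
    return 0;
}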
On Linux, threads are implemented via the clone syscall using the CLONE_FILES flag:
If CLONE_FILES is set, the calling process and the child processes share the same file descriptor table. Any file descriptor created by the calling process or by the child process is also valid in the other process. Similarly, if one of the processes closes a file descriptor, or changes its associated flags (using the fcntl(2) F_SETFD operation), the other process is also affected.
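You can watch CLONE_FILES do this by calling clone() by hand. A rough, Linux-specific sketch (error handling omitted; my illustration, not from the glibc source below):

#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/wait.h>
#include <cerrno>
#include <csignal>
#include <cstring>
#include <iostream>

int fd = -1;

int childFunc(void *) {
    close(fd);          // CLONE_FILES: this closes the parent's fd too
    return 0;
}

int main() {
    fd = open("/etc/passwd", O_RDONLY);
    const int STACK_SIZE = 1024 * 1024;
    char *stack = new char[STACK_SIZE];
    // share the file descriptor table (and memory), much as pthreads does
    pid_t pid = clone(childFunc, stack + STACK_SIZE,
                      CLONE_FILES | CLONE_VM | SIGCHLD, nullptr);
    waitpid(pid, nullptr, 0);
    char buf[16];
    if (read(fd, buf, sizeof buf) == -1)
        std::cout << "read() failed: " << strerror(errno) << std::endl;  // EBADF
    delete[] stack;
    return 0;
}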
Also have a look at the glibc source code for the details of how it is used in createthread.c:
int clone_flags = (CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGNAL
| CLONE_SETTLS | CLONE_PARENT_SETTID
| CLONE_CHILD_CLEARTID | CLONE_SYSVSEM
#if __ASSUME_NO_CLONE_DETACHED == 0
| CLONE_DETACHED
#endif
| 0);