Yesterday I realized that a mutex combined with condition variables is similar to the idea of a coroutine, if the caller thread waits for the callee thread to signal its execution.
The idea is to have two threads cooperating, with the mutex representing the "execution lock".
I tried to verify the idea in my favorite Scheme. The implementation worked fine until I expanded it to two worker threads: the threads go slightly out of order once the iteration count reaches around 8000.
I can't really see why the threads sometimes end up in the wrong order. If the algorithm were wrong, the program shouldn't have worked at all, since with all the mutual waiting a deadlock is supposed to happen when the algorithm is wrong. I'd be really interested in an insight.
Here's the code so far:
(use-modules (ice-9 threads))
(define mtx1 (make-mutex))
(define mtx2 (make-mutex))
(define cv1 (make-condition-variable)) ;; cv1: B -> A
(define cv2 (make-condition-variable)) ;; cv2: B -> C
(define cv3 (make-condition-variable)) ;; cv3: A -> B
(define cv4 (make-condition-variable)) ;; cv4: C -> B
(define v 0)
(lock-mutex mtx1) ;; block t1
(lock-mutex mtx2) ;; block t2
(define (B->A)
  (signal-condition-variable cv1)     ;; signal B -> A is going to happen
  (wait-condition-variable cv3 mtx1)) ;; release mtx1 and wait for A -> B
(define (B->C)
  (signal-condition-variable cv2)     ;; signal B -> C is going to happen
  (wait-condition-variable cv4 mtx2)) ;; release mtx2 and wait for C -> B
(define (A->B)
  (signal-condition-variable cv3)     ;; signal A -> B is going to happen
  (wait-condition-variable cv1 mtx1)) ;; release mtx1 and wait for B -> A
(define (C->B)
  (signal-condition-variable cv4)     ;; signal C -> B is going to happen
  (wait-condition-variable cv2 mtx2)) ;; release mtx2 and wait for B -> C
(call-with-new-thread
 (lambda ()
   (lock-mutex mtx1) ;; wait for B to release mtx1
   (let A ()
     (A->B)
     (set! v (+ v 1))
     (format #t "A: v=~a~%" v)
     (A))))
(call-with-new-thread
 (lambda ()
   (lock-mutex mtx2) ;; wait for B to release mtx2
   (let C ()
     (C->B)
     (set! v (+ v 1))
     (format #t "C: v=~a~%" v)
     (C))))
(wait-condition-variable cv3 mtx1) ;; trigger first execution of A, resume by A->B
(wait-condition-variable cv4 mtx2) ;; trigger first execution of C, resume by C->B
(let B ()
  (set! v (+ v 1))
  (format #t "B: v=~a~%" v)
  (B->A)
  (B->C)
  (B))
You can use the shell snippet below to test the program and see how it goes wrong:
for (( i=1; ; i+=1 )); do
  echo "=== Run $i ==="
  MD5_1=$(guile message.scm | tee "/tmp/message_$i.txt" | head -10000 | md5sum)
  if [[ $i -gt 1 && "$MD5_2" != "$MD5_1" ]]; then
    echo "bug"
    break
  fi
  MD5_2="$MD5_1"
done
I've implemented an equivalent C version. It seems to work properly, following the same logic!
#include <stdio.h>
#include <pthread.h>

pthread_mutex_t mtx1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t mtx2 = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cv1 = PTHREAD_COND_INITIALIZER; /* cv1: B -> A */
pthread_cond_t cv2 = PTHREAD_COND_INITIALIZER; /* cv2: B -> C */
pthread_cond_t cv3 = PTHREAD_COND_INITIALIZER; /* cv3: A -> B */
pthread_cond_t cv4 = PTHREAD_COND_INITIALIZER; /* cv4: C -> B */
int v = 0;

void BA(void) {
    pthread_cond_signal(&cv1);      /* signal B -> A is going to happen */
    pthread_cond_wait(&cv3, &mtx1); /* release mtx1 and wait for A -> B */
}

void AB(void) {
    pthread_cond_signal(&cv3);      /* signal A -> B is going to happen */
    pthread_cond_wait(&cv1, &mtx1); /* release mtx1 and wait for B -> A */
}

void BC(void) {
    pthread_cond_signal(&cv2);      /* signal B -> C is going to happen */
    pthread_cond_wait(&cv4, &mtx2); /* release mtx2 and wait for C -> B */
}

void CB(void) {
    pthread_cond_signal(&cv4);      /* signal C -> B is going to happen */
    pthread_cond_wait(&cv2, &mtx2); /* release mtx2 and wait for B -> C */
}

void *A(void *args) {
    pthread_mutex_lock(&mtx1);      /* wait for B to release mtx1 */
    for (;;) {
        AB();
        v += 1;
        printf("A: v=%d\n", v);
    }
}

void *C(void *args) {
    pthread_mutex_lock(&mtx2);      /* wait for B to release mtx2 */
    for (;;) {
        CB();
        v += 1;
        printf("C: v=%d\n", v);
    }
}

int main() {
    pthread_t t1, t2;
    pthread_mutex_lock(&mtx1);
    pthread_mutex_lock(&mtx2);
    pthread_create(&t1, NULL, A, NULL);
    pthread_create(&t2, NULL, C, NULL);
    pthread_cond_wait(&cv3, &mtx1); /* trigger first execution of A, resumed by A->B */
    pthread_cond_wait(&cv4, &mtx2); /* trigger first execution of C, resumed by C->B */
    for (;;) {
        v += 1;
        printf("B: v=%d\n", v);
        BA();
        BC();
    }
    return 0;
}
Since the equivalent C implementation works just as intended, this suggests that it is Guile Scheme that is not functioning properly, rather than the algorithm.
Related
My understanding of Semaphore principle
I am currently trying to understand how semaphores work.
I have understood that when calling P(sem), if sem=0 the thread gets blocked; otherwise the value of the semaphore is decreased and the thread is let into the critical section.
When calling V(sem), the semaphore value is increased; if sem=0 and a thread is waiting, that thread is woken up.
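To make that concrete, here is roughly how I picture P and V being implemented: a counter protected by a mutex and a condition variable. This is only a minimal sketch of my mental model; the names my_sem, my_P and my_V are mine, not any particular library's API.
#include <pthread.h>

typedef struct {
    int value;                  /* the semaphore counter */
    pthread_mutex_t lock;
    pthread_cond_t nonzero;
} my_sem;

/* e.g.  my_sem s = MY_SEM_INIT(0); */
#define MY_SEM_INIT(n) { (n), PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER }

/* P: wait until the counter is positive, then decrement it. */
void my_P(my_sem *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->value == 0)                       /* a woken thread re-checks the counter here */
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->value -= 1;
    pthread_mutex_unlock(&s->lock);
}

/* V: increment the counter and wake one waiter, if any. */
void my_V(my_sem *s)
{
    pthread_mutex_lock(&s->lock);
    s->value += 1;
    pthread_cond_signal(&s->nonzero);
    pthread_mutex_unlock(&s->lock);
}
In this sketch a thread that is woken up inside P re-checks the counter, decrements it, and only then returns from P, rather than returning straight away; whether that matches real semaphores is exactly what I am unsure about below.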
Now consider this problem:
Two threads are running, thread1 runs function1, thread2 runs function2. There are two semaphores s=0 and m=1 which are shared between both threads.
function1 {
    while(true) {
        P(s)
        P(s)
        P(m)
        print("a")
        V(m)
    }
}

function2 {
    while(true) {
        P(m)
        print("b")
        V(m)
        V(s)
    }
}
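For reference, here is the same pair of functions translated to POSIX semaphores, so the actual interleaving can be observed directly. The includes, the thread creation and the fflush calls are my own additions; compile with -pthread.
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

sem_t s, m;

void *function1(void *arg) {
    for (;;) {
        sem_wait(&s);        /* P(s) */
        sem_wait(&s);        /* P(s) */
        sem_wait(&m);        /* P(m) */
        printf("a");
        fflush(stdout);
        sem_post(&m);        /* V(m) */
    }
    return NULL;
}

void *function2(void *arg) {
    for (;;) {
        sem_wait(&m);        /* P(m) */
        printf("b");
        fflush(stdout);
        sem_post(&m);        /* V(m) */
        sem_post(&s);        /* V(s) */
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    sem_init(&s, 0, 0);      /* s = 0 */
    sem_init(&m, 0, 1);      /* m = 1 */
    pthread_create(&t1, NULL, function1, NULL);
    pthread_create(&t2, NULL, function2, NULL);
    pthread_join(t1, NULL);  /* the threads loop forever; stop with Ctrl-C */
    pthread_join(t2, NULL);
    return 0;
}
Running it makes it easy to see what mixture of a's and b's actually comes out.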
What I expected would happen
I expected the output to be b's and a's printed in some random order.
The threads start. Let's say thread1 enters function1 first:
Step 1: P(s) -> s=0, so block the thread
thread2 enters function2
P(m) -> m=1 -> set m=m-1=0
print b
V(m) -> m=m+1=1
V(s) -> s=0 and a thread is waiting -> set s=s+1=1 and wake up thread1
Step 2: Thread1 returns to the second P-statement
thread1 continues in function1
P(s) -> s=1 -> set s=s-1=0
P(m) -> m=1 -> set m=m-1=0
print a
V(m) -> m=0 but no one waiting -> set m=m+1=1
Step 3: Thread1 runs function1 again
P(s) -> s=0 -> block
Thread2 runs function2
P(m) -> m=1 -> set m=m-1=0
print b
V(m) -> m=m+1=1
V(s) -> s=0 and a thread is waiting -> wake up thread1
Step 4: Thread1 returns to the second P-statement in function1
P(s) -> s=1 -> set s=s-1=0
P(m) -> m=1 -> set m=m-1=0
print a
V(m) -> set m=1
Step 5: Thread2 runs function2
P(m) -> m=1 -> set m=0
print b
V(m) -> set m=1
V(s) -> s=0 -> set s=1, no thread waiting
Step 6: Thread2 runs function2 again
print b
... and so on
The Problem/The Questions
I am very unsure whether it is correct that thread1 returns to the second P-statement after it is woken up by thread2 running function2.
It seems wrong to me, because if I consider how a semaphore would usually be implemented:
P(s)
* do something *
V(s)
If it were to return right after the P-statement, the value of the semaphore would not get decreased, and the V-statement would increase the semaphore to a wrong value of 2.
But if it repeats the first P-statement in the example, that would mean the output consists only of b's.
Can someone tell me if my understanding is correct or if not, correct my mistake?
I wanted to print 1 to 10 using 3 threads taking turns. My code is able to do that, but after that the program gets stuck. I tried using pthread_exit at the end of the thread function. I also tried removing the while(1) in main and using pthread_join there, but I still got the same result. How should I terminate the threads?
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond1 = PTHREAD_COND_INITIALIZER;
pthread_cond_t cond2 = PTHREAD_COND_INITIALIZER;
pthread_cond_t cond3 = PTHREAD_COND_INITIALIZER;
int done = 1;

// Thread function
void *foo()
{
    for (int i = 0; i < 10; i++)
    {
        printf(" \n #############");
        pthread_mutex_lock(&lock);
        if (done == 1)
        {
            done = 2;
            printf(" \n %d", i);
            pthread_cond_signal(&cond2);
            pthread_cond_wait(&cond1, &lock);
            printf(" \n Thread 1 woke up");
        }
        else if (done == 2)
        {
            printf(" \n %d", i);
            done = 3;
            pthread_cond_signal(&cond3);
            pthread_cond_wait(&cond2, &lock);
            printf(" \n Thread 2 woke up");
        }
        else
        {
            printf(" \n %d", i);
            done = 1;
            pthread_cond_signal(&cond1);
            pthread_cond_wait(&cond3, &lock);
            printf(" \n Thread 3 woke up");
        }
        pthread_mutex_unlock(&lock);
    }
    pthread_exit(NULL);
    return NULL;
}

int main(void)
{
    pthread_t tid1, tid2, tid3;
    pthread_create(&tid1, NULL, foo, NULL);
    pthread_create(&tid2, NULL, foo, NULL);
    pthread_create(&tid3, NULL, foo, NULL);
    while (1);
    printf("\n $$$$$$$$$$$$$$$$$$$$$$$$$$$");
    return 0;
}
How should I terminate the threads?
Either returning from the outermost call to the thread function or calling pthread_exit() terminates a thread. Returning a value p from the outermost call is equivalent to calling pthread_exit(p).
the program gets stuck
Well, of course it does when the program performs while(1);.
Also, I tried to remove while (1) in main and using pthread join there. But still, I got the same result.
You do need to join the threads to ensure that they terminate before the overall program does. This is the only appropriate way to achieve that. But if your threads in fact do not terminate in the first place, then that's moot.
In your case, observe that each thread unconditionally performs a pthread_cond_wait() on every iteration of the loop, requiring it to be signaled before it resumes. Normally, the preceding thread will signal it, but that does not happen after the last iteration of the loop. You could address that by having each thread perform an appropriate additional call to pthread_cond_signal() after it exits the loop, or by ensuring that the threads do not wait in the last loop iteration.
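To illustrate, here is a minimal standalone sketch (my own example, with my own worker/turn names, not a patch to the code above) of a turn-taking loop that never leaves a thread stranded: the wait is guarded by a predicate, the hand-over signal comes after the work so it also happens on the last iteration, and main joins the threads instead of spinning in while(1);.
#include <stdio.h>
#include <pthread.h>

#define TURNS 10

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int turn = 0;                            /* whose turn it is: 0 or 1 */

static void *worker(void *arg)
{
    int me = *(int *)arg;
    for (int i = 0; i < TURNS; i++) {
        pthread_mutex_lock(&lock);
        while (turn != me)                      /* predicate-guarded wait */
            pthread_cond_wait(&cond, &lock);
        printf("thread %d: %d\n", me, i + 1);
        turn = 1 - me;                          /* hand the turn over */
        pthread_cond_signal(&cond);             /* wake the other thread, even on the last iteration */
        pthread_mutex_unlock(&lock);
    }
    return NULL;                                /* returning terminates the thread */
}

int main(void)
{
    pthread_t t[2];
    int ids[2] = { 0, 1 };
    pthread_create(&t[0], NULL, worker, &ids[0]);
    pthread_create(&t[1], NULL, worker, &ids[1]);
    pthread_join(t[0], NULL);                   /* wait for the threads instead of while(1); */
    pthread_join(t[1], NULL);
    return 0;
}
For three threads, one would either give each thread its own condition variable (as in the original code) or cycle turn through 0, 1, 2 and replace the signal with pthread_cond_broadcast, so that the thread whose turn it is always gets woken.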
I only see the whole output after a 5-second delay, but I think it should behave otherwise.
I expect the following output:
main is here
hi received
(and only then sleep for 5 sec)
but my code seems to sleep for 5 seconds first and only then print everything.
open Thread
open Event

let t1 ch =
  let m = sync (receive ch) in
  print_string (m ^ " -> received\n");
  delay 5.0;
  sync (send ch "t1 got the message")

let main () =
  let ch = new_channel () in
  create t1 ch;
  print_string "main is here\n";
  sync (send ch "hi");
  print_string ("main confirms :" ^ sync (receive ch))

let () = main ()
I would gladly read some tutorials online but I didn't find any.
Try flushing the output: stdout is buffered, so what main prints may not actually appear until the buffer is flushed.
print_string "main is here\n";
flush stdout
I've been trying to generate strings in this way:
a
b
.
.
z
aa
ab
.
.
zz
.
.
.
.
zzzz
I want to know why a Segmentation fault (core dumped) error appears when it reaches 'yz'. I know my code doesn't cover all the possible strings, like 'zb' or 'zc', but that's not the point; I want to know why this error occurs. I am not a master at coding, as you can see, so please try to explain it clearly. Thanks :)
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
void move_positions (char s[]);
int main (int argc, char *argv[])
{
    char s[28];
    s[0] = ' ';
    s[1] = '\0';
    int a = 0;
    for (int r = 'a'; r <= 'z'; r++)
    {
        for (int t = 'a'; t <= 'z'; t++)
        {
            for (int u = 'a'; u <= 'z'; u++)
            {
                for (int y = 'a'; y <= 'z'; y++)
                {
                    s[a] = (char)y;
                    printf ("%s\n", s);
                    if (s[0] == 'z')
                    {
                        move_positions(s);
                        a++;
                    }
                }
                s[a-1] = (char)u;
            }
            s[a-2] = (char)t;
        }
        s[a-3] = (char)r;
    }
    return 0;
}
void move_positions (char s[])
{
    char z[28];
    z[0] = ' ';
    z[1] = '\0';
    strcpy(s, strcat(z, s));
}
First, let's compile with debugging turned on:
gcc -g prog.c -o prog
Now let's run it under a debugger:
> gdb prog
GNU gdb 6.3.50-20050815 (Apple version gdb-1822) (Sun Aug 5 03:00:42 UTC 2012)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries .. done
(gdb) run
Starting program: /Users/andrew/Documents/programming/sx/13422880/prog
Reading symbols for shared libraries +............................. done
a
b
c
d
e
...
yu
yv
yw
yx
yy
yz
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x00007fffc0bff6c5
0x0000000100000c83 in main (argc=1, argv=0x7fff5fbff728) at prog.c:22
22 s[a] = (char)y;
Ok, it crashed on line 22, trying to do s[a] = (char)y. What's a?
(gdb) p a
$1 = 1627389953
So you're setting roughly the 1.6 billionth entry of the array s. What is s?
(gdb) ptype s
type = char [28]
Writing the ~1.6 billionth entry of a 28-element array? That's not going to work. Looks like you need to reset a to zero at the start of some of your loops.
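As a point of comparison, here is one way to generate the same sequence without the index bookkeeping at all: build each length separately and increment the string like a base-26 counter. This is only a sketch of an alternative approach, not a minimal fix to the code above.
#include <stdio.h>
#include <string.h>

int main(void)
{
    char s[5];                              /* up to 4 letters plus the terminator */

    for (int len = 1; len <= 4; len++) {
        memset(s, 'a', (size_t)len);        /* start at "a", "aa", "aaa", ... */
        s[len] = '\0';

        for (;;) {
            printf("%s\n", s);

            /* Increment the rightmost letter, carrying like a base-26 counter. */
            int i = len - 1;
            while (i >= 0 && s[i] == 'z') {
                s[i] = 'a';
                i--;
            }
            if (i < 0)                      /* overflowed past "zz...z": done with this length */
                break;
            s[i]++;
        }
    }
    return 0;
}
Because the write index never goes past the current length, nothing can run off the end of the array.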
Can someone explain how atomicModifyIORef works? In particular:
(1) Does it wait for a lock, or does it optimistically try and retry if there's contention (like a TVar)?
(2) Why is the signature of atomicModifyIORef different from the signature of modifyIORef? In particular, what is this extra variable b?
Edit: I think I've figured out the answer to (2): b is a value to be extracted (it can be empty if not needed). In a single-threaded program, knowing the value is trivial, but in a multithreaded program one may want to know what the previous value was at the time the function was applied. I assume this is why modifyIORef doesn't have this extra return value (such usages of modifyIORef that need the return value should probably use atomicModifyIORef anyway). I'm still interested in the answer to (1), though.
Does it wait for a lock, or optimistically try and retry if there's contention (like TVar).
atomicModifyIORef uses a locking instruction on the underlying hardware architecture to swap the pointer to an allocated Haskell object in an atomic fashion.
On x86 it uses the cas operation (the cmpxchg compare-and-swap instruction), exposed as a primitive to the language via atomicModifyMutVar#, which is implemented as a runtime service in Cmm as:
stg_atomicModifyMutVarzh
{
    ...
  retry:
    x = StgMutVar_var(mv);
    StgThunk_payload(z,1) = x;
#ifdef THREADED_RTS
    (h) = foreign "C" cas(mv + SIZEOF_StgHeader + OFFSET_StgMutVar_var, x, y) [];
    if (h != x) { goto retry; }
#else
    StgMutVar_var(mv) = y;
#endif
    ...
}
That is, it will try to do the swap, and retry otherwise.
The implementation of cas as a primitive shows how we get down to the metal:
/*
* Compare-and-swap. Atomically does this:
*/
EXTERN_INLINE StgWord cas(StgVolatilePtr p, StgWord o, StgWord n);
/*
* CMPXCHG - the single-word atomic compare-and-exchange instruction. Used
* in the STM implementation.
*/
EXTERN_INLINE StgWord
cas(StgVolatilePtr p, StgWord o, StgWord n)
{
#if i386_HOST_ARCH || x86_64_HOST_ARCH
    __asm__ __volatile__ (
        "lock\ncmpxchg %3,%1"
        :"=a"(o), "=m" (*(volatile unsigned int *)p)
        :"0" (o), "r" (n));
    return o;
#elif arm_HOST_ARCH && defined(arm_HOST_ARCH_PRE_ARMv6)
    StgWord r;
    arm_atomic_spin_lock();
    r = *p;
    if (r == o) { *p = n; }
    arm_atomic_spin_unlock();
    return r;
#elif !defined(WITHSMP)
    StgWord result;
    result = *p;
    if (result == o) {
        *p = n;
    }
    return result;
So you can see that it uses an atomic instruction on Intel; on other architectures, different mechanisms are used. In either case the runtime retries until the swap succeeds.
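If it helps, here is the same optimistic retry pattern written as a standalone C11 sketch (my own illustration, not GHC's code): read the current value, compute a replacement, and compare-and-swap it in, looping if another thread got in first.
#include <stdatomic.h>
#include <stdio.h>

/* Atomically apply a modification (here: increment) to a shared counter
 * using the optimistic compare-and-swap retry loop described above. */
static long atomic_increment(_Atomic long *counter)
{
    long old = atomic_load(counter);
    for (;;) {
        long new = old + 1;                              /* the "modified" value */
        /* If *counter still equals old, replace it with new and succeed;
         * otherwise old is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(counter, &old, new))
            return old;                                  /* return the previous value */
    }
}

int main(void)
{
    _Atomic long counter = 0;
    for (int i = 0; i < 5; i++)
        printf("previous value: %ld\n", atomic_increment(&counter));
    printf("final value: %ld\n", atomic_load(&counter));
    return 0;
}
Note that atomic_compare_exchange_weak may also fail spuriously, which is another reason it sits inside a loop even when there is no contention.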
atomicModifyIORef takes a reference r :: IORef a and a function f :: a -> (a, b) and does the following:
It reads the value of r and applies f to it, yielding (a', b). Then r is updated with the new value a', while b is the return value. This read and write access is done atomically.
Of course this atomicity only works if all accesses to r are done via atomicModifyIORef.
Note that you can find this information by looking at the source [1].
[1] https://hackage.haskell.org/package/base-4.12.0.0/docs/Data-IORef.html#v:atomicModifyIORef