Clojure: Running multiple threads with functions that take zero or one input - multithreading

I'm experimenting in Clojure with running independent threads and I'm getting different behaviors I don't understand.
For my code editor I'm using Atom (not emacs), REPL is Chlorine.
I'm testing a really simple function that just prints numbers.
This one prints from 100 to 1 and takes no inputs:
(defn pl100 []
  "pl100 = Print Loop from 100 to 1"
  (loop [counter 100]
    (when (pos? counter)
      (do
        (Thread/sleep 100)
        (println (str "counter: " counter))
        (recur (dec counter))))))
This one does the exact same thing, except it takes an input:
(defn pl-n [n]
  "pl-n = Print Loop from n to 1"
  (loop [counter n]
    (when (pos? counter)
      (do
        (Thread/sleep 100)
        (println (str "counter: " counter))
        (recur (dec counter))))))
When I use
(.start (Thread. #(.run pl100)))
; --> prints to console REPL
; --> runs with no errors
This code prints to the console REPL (where I call lein) and runs with no errors.
When I use
(.start (Thread. #(.run (pl-n 100))))
; prints to console REPL
; --> java.lang.NullPointerException: Cannot invoke "Object.getClass()" because "target" is null
This code prints to the console REPL but ends with the above exception.
When I use
(.start (Thread. pl100))
; --> prints to the console REPL
; --> runs with no errors
This code prints to the console REPL and runs with no errors.
When I use
(.start (Thread. (pl-n 100)))
; --> prints to Atom REPL, not console REPL!
; ends with exception
; Execution error (NullPointerException) at java.lang.Thread/<init> (Thread.java:396).
; name cannot be null
; class java.lang.NullPointerException
This code prints to the Atom REPL (I'm using Atom, not emacs), not to the console REPL like the others, and ends with the above exception.
So, can someone please help me understand:
Why does Java give an error when I run the function that takes an input? Why are the two function calls not equivalent?
What is (.run ...) doing?
Why is it that sometimes the code prints to the console and other times to Atom/Chlorine?

To answer in brief: .run has to be invoked on a Runnable, and a Clojure function is one. Your first exhibit gives it the function pl100 and works as you expect:
#(.run pl100)
Something altogether different happens if you give .run not a function, but instead the value returned by calling the pl100 function. In fact, pl100 returns nil, so .run throws a NullPointerException:
#(.run (pl100)) ;; NullPointerException
That explains why your second exhibit did not do what you expected: (pl-n 100) ran (which is why the counting still printed), returned nil, and then you got an exception when you tried to call .run on nil:
#(.run (pl-n 100)) ;; NullPointerException
To bridge the gap between Thread, which wants a Runnable (a function of no arguments), and your function pl-n, which requires an argument, you can introduce a function of no arguments (to satisfy Thread) that calls pl-n with the desired argument. Idiomatically this would be an anonymous function. Unfortunately, you can't nest #() within #(), so you will have to use the more verbose (fn [] ...) syntax for one of the anonymous functions, most likely the outer one.
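For example, a minimal sketch of that suggestion:
(.start (Thread. (fn [] (pl-n 100))))   ; the outer fn takes no arguments, so Thread accepts it as a Runnable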

Try something like this:
(ns tst.demo.core
  (:use demo.core tupelo.core tupelo.test)
  (:require [tupelo.string :as str]))

(defn pl100 []
  "pl100 = Print Loop from 100 to 1"
  (loop [counter 10]
    (when (pos? counter)
      (do
        (Thread/sleep 100)
        (println (str "pl100 counter: " counter))
        (recur (dec counter))))))

(defn pl-n [n]
  "pl-n = Print Loop from n to 1"
  (loop [counter n]
    (when (pos? counter)
      (do
        (Thread/sleep 100)
        (println (str "pl-n counter: " counter))
        (recur (dec counter))))))

(dotest
  (newline)
  (.start (Thread. pl100))
  (Thread/sleep (* 2 1000))

  (newline)
  (.start (Thread. #(pl-n 5)))
  (Thread/sleep (* 2 1000))

  (newline)
  (println :done))
Clojure functions are already instances of Runnable, so you don't need the #(.run xxx) syntax. Result:
--------------------------------------
Clojure 1.10.2-alpha1 Java 15
--------------------------------------
Testing tst.demo.core
pl100 counter: 10
pl100 counter: 9
pl100 counter: 8
pl100 counter: 7
pl100 counter: 6
pl100 counter: 5
pl100 counter: 4
pl100 counter: 3
pl100 counter: 2
pl100 counter: 1
pl-n counter: 5
pl-n counter: 4
pl-n counter: 3
pl-n counter: 2
pl-n counter: 1
:done
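As an aside, you can confirm the Runnable claim at the REPL; Clojure functions also implement Callable:
(instance? java.lang.Runnable pl100)            ;=> true
(instance? java.util.concurrent.Callable pl100) ;=> true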
To make it even simpler, just use a Clojure future:
(future (pl100))
(Thread/sleep (* 2 1000))
(newline)
(future (pl-n 5))
(Thread/sleep (* 2 1000))
(newline)
(println :done)
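If you would rather wait for the work to finish than sleep for a fixed time, you can deref each future; deref blocks until the future's body completes (a small sketch, not part of the original example):
@(future (pl100))   ; blocks until pl100 finishes, then yields its return value (nil here)
@(future (pl-n 5))
(println :done)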
If you remove the Thread/sleep, you can see them running in parallel:
(future (pl100))
(future (pl-n 5))
(Thread/sleep (* 2 1000))
(newline)
(println :done)
with result
pl100 counter: 10
pl-n counter: 5
pl100 counter: 9
pl-n counter: 4
pl100 counter: 8
pl-n counter: 3
pl100 counter: 7pl-n counter: 2
pl100 counter: 6pl-n counter: 1
pl100 counter: 5
pl100 counter: 4
pl100 counter: 3
pl100 counter: 2
pl100 counter: 1
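The run-together lines such as "pl100 counter: 7pl-n counter: 2" appear because println writes the text and the trailing newline separately, so two threads can interleave their output. If you want every line to stay intact, one option (a sketch, not from the original answer; safe-println is just an illustrative name) is to serialize printing around a shared lock:
(def print-lock (Object.))

(defn safe-println [& args]
  (locking print-lock          ; only one thread prints at a time
    (apply println args)))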

Related

Clojure proxy multithreading issue

I'm trying to create a proxy for ArrayBlockingQueue that intercepts calls to it for monitoring
(ns clj-super-bug.core
  (:import [java.util.concurrent ArrayBlockingQueue Executors]))

(let [thread-count 10
      put-count    100
      executor     (Executors/newFixedThreadPool thread-count)
      puts         (atom 0)
      queue        (proxy [ArrayBlockingQueue] [1000]
                     (put [el]
                       (proxy-super put el)
                       (swap! puts inc)))]
  (.invokeAll executor (repeat put-count #(.put queue 0)))
  (assert (= (.size queue) put-count) "should have put in put-count items")
  (println @puts))
I would expect this code to always print 100, but occasionally it prints something else, like 51. Am I using proxy or proxy-super wrong?
I debugged this to the point that it seems that the proxy method is not actually called on some occasions, just the base method (the items show up in the queue, as indicated by the assert). Also, I suppose it's multithreading related because if I have thread-count = 1 it's always 100.
Turns out this is a known issue with proxy-super: https://dev.clojure.org/jira/browse/CLJ-2201
"If you have a proxy with method M, which invokes proxy-super, then while that proxy-super is running all calls to M on that proxy object will immediately invoke the super M not the proxied M." That's exactly what's happening.
I would not do the subclass via proxy.
If you subclass ArrayBlockingQueue, you are saying your code is an instance of ABQ. So, you are making a specialized version of ABQ, and must take responsibility for all of the implementation details of the ABQ source code.
However, you don't need to be an instance of ABQ. All you really need is to use an instance of ABQ, which is easily done by composition.
So, we write a wrapper function which delegates to an ABQ:
(ns tst.demo.core
  (:use demo.core tupelo.core tupelo.test)
  (:require
    [clojure.string :as str]
    [clojure.java.io :as io])
  (:import [java.util.concurrent ArrayBlockingQueue Executors TimeUnit]))

(dotest
  (let [N         100
        puts-done (atom 0)
        abq       (ArrayBlockingQueue. (+ 3 N))
        putter    (fn []
                    (.put abq 0)
                    (swap! puts-done inc))]
    (dotimes [_ N]
      (future (putter)))
    (Thread/sleep 1000)
    (println (format "N: %d puts-done: %d" N @puts-done))
    (assert (= N @puts-done)
      (format "should have put in puts-done items; N = %d puts-done = %d" N @puts-done))))
result:
N: 100 puts-done: 100
Using the executor:
(dotest
  (let [N            100
        puts-done    (atom 0)
        thread-count 10
        executor     (Executors/newFixedThreadPool thread-count)
        abq          (ArrayBlockingQueue. (+ 3 N))
        putter       (fn []
                       (.put abq 0)
                       (swap! puts-done inc))
        putters      (repeat N #(putter))]
    (.invokeAll executor putters)
    (println (format "N: %d puts-done: %d" N @puts-done))
    (assert (= N @puts-done)
      (format "should have put in puts-done items; N = %d puts-done = %d" N @puts-done))))
result:
N: 100 puts-done: 100
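One housekeeping note that is not in the original answer: a fixed thread pool keeps its worker threads alive until it is shut down, so in a long-lived process you would normally release it once .invokeAll returns, by adding this at the end of the let body:
(.shutdown executor)   ; lets the pool's threads exit after the queued work is done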
Update #1
Regarding the cause, I'm not sure. I tried to fix the original version with locking, but no joy (which is consistent with the CLJ-2201 description above: while proxy-super is running, other threads' calls to put go straight to the super method, so they never reach the lock or the counter):
(def lock-obj (Object.))

(dotest
  (let [N            100
        puts-done    (atom 0)
        thread-count 10
        executor     (Executors/newFixedThreadPool thread-count)
        abq          (proxy [ArrayBlockingQueue] [(+ 3 N)]
                       (put [el]
                         (locking lock-obj
                           (proxy-super put el)
                           (swap! puts-done inc))))]
    (.invokeAll executor (repeat N #(.put abq 0)))
    (println (format "N: %d puts-done: %d" N @puts-done))))
with results:
N: 100 puts-done: 46
N: 100 puts-done: 71
N: 100 puts-done: 85
N: 100 puts-done: 83
Update #2
Tried some more tests using a java subclass of ABQ:
package demo;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
public class Que<E> extends ArrayBlockingQueue<E> {
public static AtomicInteger numPuts = new AtomicInteger(0);
public static Que<Integer> queInt = new Que<>( 999 );
public Que(int size) { super(size); }
public void put(E element) {
synchronized (numPuts) {
try {
super.put(element);
numPuts.getAndIncrement();
} catch (Exception ex) {
System.out.println( "caught " + ex);
} } } }
...
(:import [java.util.concurrent Executors TimeUnit]
[demo Que] ) )
(dotest
  (let [N            100
        puts-done    (atom 0)
        thread-count 10
        executor     (Executors/newFixedThreadPool thread-count)]
    (.invokeAll executor (repeat N #(.put Que/queInt 0)))
    (println (format "N: %d puts-done: %d" N (.get Que/numPuts)))))
results (repeated runs => accumulation):
N: 100 puts-done: 100
N: 100 puts-done: 200
N: 100 puts-done: 300
N: 100 puts-done: 400
N: 100 puts-done: 500
so it works great with a Java subclass. I get the same results with or without the synchronized block.
So, it looks to be something in the Clojure proxy area.

Crystal recursive function resulting in a signal 11 (invalid memory access)?

I wanted to test recursive functions in Crystal, so I wrote something like...
def repeat(n)
  return if n < 0
  # Useless line just to do something
  n + 1
  repeat(n - 1)
end
repeat(100_000_000)
I didn't expect this to work with either crystal recurse.cr or crystal build recurse.cr, because neither one of those can optimize for tail call recursion - as expected, both resulted in stack overflows.
When I used the --release flag, it was totally fine - and incredibly fast.
If I did the following...
NUM = 100_000

def p(n)
  puts n if n % (NUM / 4) == 0
end

def recursive_repeat(n)
  return if n < 0
  p(n)
  recursive_repeat(n - 1)
end

recursive_repeat(NUM)
... everything is fine - I don't even need to build it.
If I change to a much larger number and a non-recursive function, again it's fine...
NUM = 100_000_000

def p(n)
  puts n if n % (NUM / 4) == 0
end

def repeat(num)
  (0..num).each { |n| p(n) }
end

repeat(NUM)
If I instead use recursion, however...
NUM = 100_000_000

def p(n)
  puts n if n % (NUM / 4) == 0
end

def recursive_repeat(n)
  return if n < 0
  p(n)
  recursive_repeat(n - 1)
end

recursive_repeat(NUM)
...I get the following output:
100000000
Invalid memory access (signal 11) at address 0x7fff500edff8
[4549929098] *CallStack::print_backtrace:Int32 +42
[4549928595] __crystal_sigfault_handler +35
[140735641769258] _sigtramp +26
[4549929024] *recursive_repeat<Int32>:Nil +416
[4549929028] *recursive_repeat<Int32>:Nil +420 (65519 times)
[4549926888] main +2808
How does using puts like that trigger a stack overflow - especially given that it would only write 5 lines total?
Shouldn't this still be TCO?
Edit
Okay, to add to the weirdness, this works...
NUM = 100_000_000

def p(n)
  puts n if n % (NUM / 4) == 0
end

def repeat(num)
  (0..num).each { |n| p(n) }
end

repeat(NUM)

def recursive_repeat(n)
  return if n < 0
  p(n)
  recursive_repeat(n - 1)
end

recursive_repeat(NUM)
... but removing the call to repeat(NUM), literally keeping every other line the same, will again result in the error.

How incomplete is ClojureScript now? (range), (iterate), etc.

I've been trying to use ClojureScript instead of Clojure lately.
When I compile and run the following on Node.js
(.log js/console (range 10))
I've got
$ node app
{ meta: null,
start: 0,
end: 10,
step: 1,
__hash: null,
'cljs$lang$protocol_mask$partition1$': 0,
'cljs$lang$protocol_mask$partition0$': 32375006 }
I'm a bit surprised to see that this simple code does not work.
Is this due to my specific environment? I hope so, and if the problem is on my side, please advise.
Here is the compiled js:
cljs.nodejs = {};
cljs.nodejs.require = require;
cljs.nodejs.process = process;
cljs.core.string_print = cljs.nodejs.require.call(null, "util").print;
var rxcljs = {core:{}};
console.log(cljs.core.range.call(null, 10));
You can either console.log the string representation of (range 10):
(.log js/console (pr-str (range 10)))
or simply use the println function:
(println (range 10))
In either case, (0 1 2 3 4 5 6 7 8 9) is printed as expected.
Looks like you want to print the vector instead; range returns a lazy seq.
Try this:
(.log js/console (vec (range 10)))
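Another option, not mentioned above, is to convert the seq to a native JavaScript value before logging it, so the Node console shows a plain array:
(.log js/console (clj->js (range 10)))   ; logs [0 1 2 3 4 5 6 7 8 9] as a JS array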

Is simulating race conditions with gdb/lldb feasible?

I'm wondering if it would be feasible to automatically test for race conditions using a debugger.
For example, imagine you want to test a multi-threaded queue. Among other things, you would want to test that you can concurrently call enqueue() and dequeue().
A simple unit-test could be able to start two threads, each calling enqueue() and dequeue() respectively in a loop and checking the results:
// thread A
for( int i=0; i<count; i+=1 ) {
    enqueue( queue, i );
}

// thread B
for( int i=0; i<count; i+=1 ) {
    ASSERT( i == dequeue( queue ) );
}
Now, a clever test driver, running the unit test in gdb or lldb, should be able to wait for breakpoints set inside both loops and then use the debugger's si (step instruction) command to simulate all possible interleavings of the two threads.
My question is not if this is technically possible (it is). What I want to know is this:
Assuming the enqueue() function has 10 instructions and the dequeue() function has 20 - how many different interleavings does the test have to try?
Let's see...
If we only have 2 instructions in each: a,b and A,B:
a,b,A,B
a,A,b,B
a,A,B,b
A,a,b,B
A,a,B,b
A,B,a,b
That's 6.
For a,b,c and A,B,C:
a,b,c,A,B,C
a,b,A,c,B,C
a,b,A,B,c,C
a,b,A,B,C,c
a,A,b,c,B,C
a,A,b,B,c,C
a,A,B,b,c,C
a,A,b,B,C,c
a,A,B,b,C,c
a,A,B,C,b,c
A,a,b,c,B,C
A,a,b,B,c,C
A,a,B,b,c,C
A,B,a,b,c,C
A,a,b,B,C,c
A,a,B,b,C,c
A,B,a,b,C,c
A,a,B,C,b,c
A,B,a,C,b,c
A,B,C,a,b,c
That's 20, unless I'm missing something.
If we generalize it to N instructions (say, N is 26) in each and start with a...zA...Z, then there will be 27 possible positions for z (from before A to after Z), at most 27 positions for y, at most 28 for x, at most 29 for w, etc. This suggests a factorial at worst. In reality it's less than that, but I'm being a bit lazy, so I'm going to use the output of a simple program that calculates the number of possible "interleavings" instead of deriving the exact formula:
1 & 1 -> 2
2 & 2 -> 6
3 & 3 -> 20
4 & 4 -> 70
5 & 5 -> 252
6 & 6 -> 924
7 & 7 -> 3432
8 & 8 -> 12870
9 & 9 -> 48620
10 & 10 -> 184756
11 & 11 -> 705432
12 & 12 -> 2704156
13 & 13 -> 10400600
14 & 14 -> 40116600
15 & 15 -> 155117520
16 & 16 -> 601080390
So, with these results you may conclude that while the idea is correct, it's going to take an unreasonable amount of time to use it for code validation.
Also, you should remember that you need to take into account not only the order of instruction execution, but also the state of the queue. That's going to increase the number of iterations.
Here's the program (in C):
#include <stdio.h>

unsigned long long interleavings(unsigned remaining1, unsigned remaining2)
{
    switch (!!remaining1 * 2 + !!remaining2)
    {
    default: // remaining1 == 0 && remaining2 == 0
        return 0;
    case 1: // remaining1 == 0 && remaining2 != 0
    case 2: // remaining1 != 0 && remaining2 == 0
        return 1;
    case 3: // remaining1 != 0 && remaining2 != 0
        return interleavings(remaining1 - 1, remaining2) +
               interleavings(remaining1, remaining2 - 1);
    }
}

int main(void)
{
    unsigned i;
    for (i = 0; i <= 16; i++)
        printf("%3u items can interleave with %3u items %llu times\n",
               i, i, interleavings(i, i));
    return 0;
}
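As a side note not spelled out in the original answer: two instruction streams of lengths m and n can interleave in (m+n)! / (m! * n!) ways, so the table above is just the central binomial coefficients C(2N, N). A quick sanity check in Clojure (the helper name choose is purely illustrative):
(defn choose [n k]
  ;; multiplicative formula for the binomial coefficient; exact at every step
  (reduce (fn [acc i] (quot (* acc (- n (dec i))) i))
          1
          (range 1 (inc k))))

(choose 32 16) ;=> 601080390, matching the "16 & 16" row above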
BTW, you could also save an order of magnitude (or two) of overhead, due to interfacing with the debugger and the various context switches, by simulating pseudo-code instead. See this answer to a somewhat related question for a sample implementation. This may also give you more fine-grained control over switching between the threads than direct execution.

Does the rate of retries per thread rise as more threads change a Clojure ref?

I worry about this a little.
Imagine the simplest form of version control: programmers copy the whole directory from the master repository, and after changing a file they copy it back, provided the master repository is still unchanged. If it has been changed by someone else in the meantime, they must try again.
As the number of programmers increases, it is natural for retries to increase too, but perhaps not in proportion to the number of programmers.
If ten programmers work and each job takes an hour, at least ten hours are needed to complete all the work.
If they all keep trying, about 9 + 8 + 7 + ... + 1 = 45 man-hours come to nothing.
With a hundred programmers, about 99 + 98 + ... + 1 = 4950 man-hours come to nothing.
I tried to count the number of retries and got the following results.
Source
(defn fib [n]
  (if (or (zero? n) (= n 1))
    1
    (+ (fib (dec n)) (fib (- n 2)))))
(defn calc! [r counter-A counter-B counter-C n]
  (dosync
    (swap! counter-A inc)
    ;; (Thread/sleep n)
    (fib n)
    (swap! counter-B inc)
    (alter r inc)
    (swap! counter-C inc)))
(defn main [thread-num n]
  (let [r         (ref 0)
        counter-A (atom 0)
        counter-B (atom 0)
        counter-C (atom 0)]
    (doall (pmap deref
                 (for [_ (take thread-num (repeat nil))]
                   (future (calc! r counter-A counter-B counter-C n)))))
    (println thread-num " Thread. #ref:" @r)
    (println "A:" @counter-A ", B:" @counter-B ", C:" @counter-C)))
CPU: 2.93GHz Quad-Core Intel Core i7
result
user> (time (main 10 25))
10 Thread. #ref: 10
A: 53 , B: 53 , C: 10
"Elapsed time: 94.412 msecs"
nil
user> (time (main 100 25))
100 Thread. #ref: 100
A: 545 , B: 545 , C: 100
"Elapsed time: 966.141 msecs"
nil
user> (time (main 1000 25))
1000 Thread. #ref: 1000
A: 5507 , B: 5507 , C: 1000
"Elapsed time: 9555.165 msecs"
nil
I changed the job to (Thread/sleep n) instead of (fib n) and got similar results.
user> (time (main 10 20))
10 Thread. #ref: 10
A: 55 , B: 55 , C: 10
"Elapsed time: 220.616 msecs"
nil
user> (time (main 100 20))
100 Thread. #ref: 100
A: 689 , B: 689 , C: 117
"Elapsed time: 2013.729 msecs"
nil
user> (time (main 1000 20))
1000 Thread. #ref: 1000
A: 6911 , B: 6911 , C: 1127
"Elapsed time: 20243.214 msecs"
nil
In the Thread/sleep case, I would expect retries to increase even more than this, because the CPU is free while a thread sleeps.
Why don't retries increase?
Thanks.
Because you are not actually spawning 10, 100 or 1000 threads! Creating a future does not always create a new thread. It uses a thread pool behind the scenes, where it keeps queuing the jobs (Runnables, to be technical). The thread pool is a cached thread pool, which reuses threads to run the jobs.
So in your case you are not actually spawning 1000 threads. If you want to see the retries in action, get a level below future: create your own thread pool and push Runnables into it, as in the sketch below.
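A minimal sketch of that idea (not from the original answer; main-pool is just an illustrative name). It reuses calc! from the question and runs the jobs on an explicit fixed-size pool, so thread-num transactions really can contend at once:
(import '[java.util.concurrent Executors])

(defn main-pool [thread-num n]
  (let [r         (ref 0)
        counter-A (atom 0)
        counter-B (atom 0)
        counter-C (atom 0)
        pool      (Executors/newFixedThreadPool thread-num)
        jobs      (repeat thread-num #(calc! r counter-A counter-B counter-C n))]
    (.invokeAll pool jobs)   ; blocks until every job has finished
    (.shutdown pool)
    (println thread-num " Thread. #ref:" @r)
    (println "A:" @counter-A ", B:" @counter-B ", C:" @counter-C)))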
Self answer
I modified the main function so it does not use pmap, and got results that work out as calculated.
(defn main [thread-num n]
  (let [r         (ref 0)
        counter-A (atom 0)
        counter-B (atom 0)
        counter-C (atom 0)]
    (doall (map deref (doall (for [_ (take thread-num (repeat nil))]
                               (future (calc! r counter-A counter-B counter-C n))))))
    (println thread-num " Thread. #ref:" @r)
    (println "A:" @counter-A ", B:" @counter-B ", C:" @counter-C)))
fib
user=> (main 10 25)
10 Thread. #ref: 10
A: 55 , B: 55 , C: 10
nil
user=> (main 100 25)
100 Thread. #ref: 100
A: 1213 , B: 1213 , C: 100
nil
user=> (main 1000 25)
1000 Thread. #ref: 1000
A: 19992 , B: 19992 , C: 1001
nil
Thread/sleep
user=> (main 10 20)
10 Thread. #ref: 10
A: 55 , B: 55 , C: 10
nil
user=> (main 100 20)
100 Thread. #ref: 100
A: 4979 , B: 4979 , C: 102
nil
user=> (main 1000 20)
1000 Thread. #ref: 1000
A: 491223 , B: 491223 , C: 1008
nil
