Clojure proxy multithreading issue

I'm trying to create a proxy for ArrayBlockingQueue that intercepts calls to it for monitoring:
(ns clj-super-bug.core
  (:import [java.util.concurrent ArrayBlockingQueue Executors]))

(let [thread-count 10
      put-count 100
      executor (Executors/newFixedThreadPool thread-count)
      puts (atom 0)
      queue (proxy [ArrayBlockingQueue] [1000]
              (put [el]
                (proxy-super put el)
                (swap! puts inc)))]
  (.invokeAll executor (repeat put-count #(.put queue 0)))
  (assert (= (.size queue) put-count) "should have put in put-count items")
  (println @puts))
I would expect this code to always print 100, but occasionally it prints something else, like 51. Am I using proxy or proxy-super wrong?
I debugged this to the point where it seems the proxy method is not actually called on some occasions, only the base method (the items do show up in the queue, as the assert confirms). I also suppose it is multithreading-related, because with thread-count = 1 it is always 100.

Turns out this is a known issue with proxy-super: https://dev.clojure.org/jira/browse/CLJ-2201
"If you have a proxy with method M, which invokes proxy-super, then while that proxy-super is running all calls to M on that proxy object will immediately invoke the super M not the proxied M." That's exactly what's happening.

I would not do the subclassing via proxy.
If you subclass ArrayBlockingQueue, you are saying your code is an instance of ABQ. So, you are making a specialized version of ABQ, and must take responsibility for all of the implementation details of the ABQ source code.
However, you don't need to be an instance of ABQ. All you really need is to use an instance of ABQ, which is easily done by composition.
So, we write a wrapper function which delegates to an ABQ:
(ns tst.demo.core
  (:use demo.core tupelo.core tupelo.test)
  (:require
    [clojure.string :as str]
    [clojure.java.io :as io])
  (:import [java.util.concurrent ArrayBlockingQueue Executors TimeUnit]))

(dotest
  (let [N         100
        puts-done (atom 0)
        abq       (ArrayBlockingQueue. (+ 3 N))
        putter    (fn []
                    (.put abq 0)
                    (swap! puts-done inc))]
    (dotimes [_ N]
      (future (putter)))
    (Thread/sleep 1000)
    (println (format "N: %d   puts-done: %d" N @puts-done))
    (assert (= N @puts-done)
      (format "should have put in puts-done items; N = %d  puts-done = %d" N @puts-done))))
result:
N: 100 puts-done: 100
Using the executor:
(dotest
  (let [N            100
        puts-done    (atom 0)
        thread-count 10
        executor     (Executors/newFixedThreadPool thread-count)
        abq          (ArrayBlockingQueue. (+ 3 N))
        putter       (fn []
                       (.put abq 0)
                       (swap! puts-done inc))
        putters      (repeat N #(putter))]
    (.invokeAll executor putters)
    (println (format "N: %d   puts-done: %d" N @puts-done))
    (assert (= N @puts-done)
      (format "should have put in puts-done items; N = %d  puts-done = %d" N @puts-done))))
result:
N: 100 puts-done: 100
Update #1
Regarding the cause, I'm not sure. I tried to fix the original version with locking, but no joy:
(def lock-obj (Object.))

(dotest
  (let [N            100
        puts-done    (atom 0)
        thread-count 10
        executor     (Executors/newFixedThreadPool thread-count)
        abq          (proxy [ArrayBlockingQueue]
                            [(+ 3 N)]
                       (put [el]
                         (locking lock-obj
                           (proxy-super put el)
                           (swap! puts-done inc))))]
    (.invokeAll executor (repeat N #(.put abq 0)))
    (println (format "N: %d   puts-done: %d" N @puts-done))))
with results:
N: 100 puts-done: 46
N: 100 puts-done: 71
N: 100 puts-done: 85
N: 100 puts-done: 83
Update #2
I tried some more tests using a Java subclass of ABQ:
package demo;

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class Que<E> extends ArrayBlockingQueue<E> {
    public static AtomicInteger numPuts = new AtomicInteger(0);
    public static Que<Integer> queInt = new Que<>(999);

    public Que(int size) { super(size); }

    public void put(E element) {
        synchronized (numPuts) {
            try {
                super.put(element);
                numPuts.getAndIncrement();
            } catch (Exception ex) {
                System.out.println("caught " + ex);
            }
        }
    }
}
...
  (:import [java.util.concurrent Executors TimeUnit]
           [demo Que]))

(dotest
  (let [N            100
        puts-done    (atom 0)
        thread-count 10
        executor     (Executors/newFixedThreadPool thread-count)]
    (.invokeAll executor (repeat N #(.put Que/queInt 0)))
    (println (format "N: %d   puts-done: %d" N (.get Que/numPuts)))))
results (repeated runs => accumulation):
N: 100 puts-done: 100
N: 100 puts-done: 200
N: 100 puts-done: 300
N: 100 puts-done: 400
N: 100 puts-done: 500
so it works great with a Java subclass. I get the same results with or without the synchronized block.
So, it looks to be something in the Clojure proxy area.

Related

Clojure Running Multiple Threads with Functions with zero or one input

I'm experimenting in Clojure with running independent threads, and I'm getting different behaviors I don't understand.
For my code editor I'm using Atom (not Emacs); the REPL is Chlorine.
I'm testing a really simple function that just prints numbers.
This one prints from 100 to 1 and takes no inputs:
(defn pl100 []
  "pl100 = Print Loop from 100 to 1"
  (loop [counter 100]
    (when (pos? counter)
      (do
        (Thread/sleep 100)
        (println (str "counter: " counter))
        (recur (dec counter))))))
This one does the exact same thing, except it takes an input:
(defn pl-n [n]
  "pl-n = Print Loop from n to 1"
  (loop [counter n]
    (when (pos? counter)
      (do
        (Thread/sleep 100)
        (println (str "counter: " counter))
        (recur (dec counter))))))
When I use
(.start (Thread. #(.run pl100)))
; --> prints to console REPL
; --> runs with no errors
this code
prints to the console REPL (where I call lein) and
runs with no errors
When I use
(.start (Thread. #(.run (pl-n 100))))
; prints to console REPL
; --> java.lang.NullPointerException: Cannot invoke "Object.getClass()" because "target" is null
this code
prints to the console REPL
ends with the above exception
When I use
(.start (Thread. pl100))
; --> prints to the console REPL
; --> runs with no errors
this code
prints to the console REPL
runs with no errors
When I use
(.start (Thread. (pl-n 100)))
; --> prints to Atom REPL, not console REPL!
; ends with exception
; Execution error (NullPointerException) at java.lang.Thread/<init> (Thread.java:396).
; name cannot be null
; class java.lang.NullPointerException
this code
prints to the Atom REPL (I'm using Atom, not emacs)! Not to the console REPL like the others
ends with exception
So, can someone please help me understand:
Why is it that when I run a function that takes an input, Java gives an error? Why are the function calls not equivalent?
What is (.run ...) doing?
Why is it that sometimes the code prints to the console and other times to Atom/Chlorine?
To answer in brief: .run needs a Runnable target, and every Clojure function is one. Your first exhibit gives it the function pl100 itself, and works as you expect:
#(.run pl100)
Something altogether different would happen if you gave .run not the function, but instead the value returned by calling the pl100 function. In fact, pl100 returns nil, so calling .run on that would throw a NullPointerException:
#(.run (pl100)) ;; NullPointerException
That explains why your second exhibit did not do what you expected. pl-n returned nil, and then you got an exception when .run was invoked on that nil:
#(.run (pl-n 100)) ;; NullPointerException
To bridge the gap - between Thread., which needs a function of no arguments (a Runnable), and your function pl-n, which requires an argument - you can introduce a function of no arguments that calls pl-n with the desired argument. Idiomatically this would be an anonymous function. Unfortunately, you can't nest #() within #(), so you will have to use the more verbose (fn [] ...) syntax for one of the anonymous functions, most likely the outer one.
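For example, a minimal one-liner along those lines (separate from the full test below) would be:

(.start (Thread. (fn [] (pl-n 100)))) ; the outer fn takes no args, so Thread. receives a Runnable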
Try something like this:
(ns tst.demo.core
  (:use demo.core tupelo.core tupelo.test)
  (:require [tupelo.string :as str]))

(defn pl100 []
  "pl100 = Print Loop from 100 to 1"
  (loop [counter 10]
    (when (pos? counter)
      (do
        (Thread/sleep 100)
        (println (str "pl100 counter: " counter))
        (recur (dec counter))))))

(defn pl-n [n]
  "pl-n = Print Loop from n to 1"
  (loop [counter n]
    (when (pos? counter)
      (do
        (Thread/sleep 100)
        (println (str "pl-n counter: " counter))
        (recur (dec counter))))))

(dotest
  (newline)
  (.start (Thread. pl100))
  (Thread/sleep (* 2 1000))

  (newline)
  (.start (Thread. #(pl-n 5)))
  (Thread/sleep (* 2 1000))

  (newline)
  (println :done))
Clojure functions already implement Runnable (and Callable), so you don't need the #(.run xxx) syntax. Result:
--------------------------------------
Clojure 1.10.2-alpha1 Java 15
--------------------------------------
Testing tst.demo.core
pl100 counter: 10
pl100 counter: 9
pl100 counter: 8
pl100 counter: 7
pl100 counter: 6
pl100 counter: 5
pl100 counter: 4
pl100 counter: 3
pl100 counter: 2
pl100 counter: 1
pl-n counter: 5
pl-n counter: 4
pl-n counter: 3
pl-n counter: 2
pl-n counter: 1
:done
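As a quick REPL check (illustrative, not from the original post), you can confirm that a plain Clojure function already satisfies Runnable and Callable, which is why passing pl100 directly to Thread. works:

(instance? java.lang.Runnable (fn [] 42))            ;=> true
(instance? java.util.concurrent.Callable (fn [] 42)) ;=> true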
To make it even simpler, just use a Clojure future:
(future (pl100))
(Thread/sleep (* 2 1000))
(newline)
(future (pl-n 5))
(Thread/sleep (* 2 1000))
(newline)
(println :done)
If you remove the Thread/sleep, you can see them running in parallel:
(future (pl100))
(future (pl-n 5))
(Thread/sleep (* 2 1000))
(newline)
(println :done)
with result
pl100 counter: 10
pl-n counter: 5
pl100 counter: 9
pl-n counter: 4
pl100 counter: 8
pl-n counter: 3
pl100 counter: 7pl-n counter: 2
pl100 counter: 6pl-n counter: 1
pl100 counter: 5
pl100 counter: 4
pl100 counter: 3
pl100 counter: 2
pl100 counter: 1

Crystal recursive function resulting in a signal 11 (invalid memory access)?

I wanted to test recursive functions in Crystal, so I wrote something like...
def repeat(n)
  return if n < 0
  # Useless line just to do something
  n + 1
  repeat(n - 1)
end

repeat(100_000_000)
I didn't expect this to work with either crystal recurse.cr or crystal build recurse.cr, because neither one of those can optimize for tail call recursion - as expected, both resulted in stack overflows.
When I used the --release flag, it was totally fine - and incredibly fast.
If I did the following...
NUM = 100_000

def p(n)
  puts n if n % (NUM / 4) == 0
end

def recursive_repeat(n)
  return if n < 0
  p(n)
  recursive_repeat(n - 1)
end

recursive_repeat(NUM)
... everything is fine - I don't even need to build it.
If I change to a much larger number and a non-recursive function, again it's fine...
NUM = 100_000_000

def p(n)
  puts n if n % (NUM / 4) == 0
end

def repeat(num)
  (0..num).each { |n| p(n) }
end

repeat(NUM)
If I instead use recursion, however...
NUM = 100_000_000

def p(n)
  puts n if n % (NUM / 4) == 0
end

def recursive_repeat(n)
  return if n < 0
  p(n)
  recursive_repeat(n - 1)
end

recursive_repeat(NUM)
...I get the following output:
100000000
Invalid memory access (signal 11) at address 0x7fff500edff8
[4549929098] *CallStack::print_backtrace:Int32 +42
[4549928595] __crystal_sigfault_handler +35
[140735641769258] _sigtramp +26
[4549929024] *recursive_repeat<Int32>:Nil +416
[4549929028] *recursive_repeat<Int32>:Nil +420 (65519 times)
[4549926888] main +2808
How does using puts like that trigger a stack overflow - especially given that it would only write 5 lines total?
Shouldn't this still be TCO?
Edit
Okay, to add to the weirdness, this works...
NUM = 100_000_000

def p(n)
  puts n if n % (NUM / 4) == 0
end

def repeat(num)
  (0..num).each { |n| p(n) }
end

repeat(NUM)

def recursive_repeat(n)
  return if n < 0
  p(n)
  recursive_repeat(n - 1)
end

recursive_repeat(NUM)
... but removing the call to repeat(NUM), literally keeping every other line the same, will again result in the error.

Use a MailboxProcessor with reply-channel to create limited agents that return values in order

Basically, I want to change the following into a limited-threading solution, because in my situation the list of calculations is too large, spawning too many threads, and I'd like to experiment and measure performance with fewer threads.
// the trivial approach (and largely my current situation)
let doWork() =
    [1 .. 10]
    |> List.map (fun i -> async {
        do! Async.Sleep (100 * i) // longest thread will run 1 sec
        return i * i              // some complex calculation returning a certain type
    })
    |> Async.Parallel
    |> Async.RunSynchronously // works, total wall time 1s
My new approach: this code is borrowed from / inspired by this online snippet from Tomas Petricek (which I tested; it works, but I need it to return a value, not unit).
type LimitAgentMessage =
    | Start of Async<int> * AsyncReplyChannel<int>
    | Finished

let threadingLimitAgent limit = MailboxProcessor.Start(fun inbox -> async {
    let queue = System.Collections.Generic.Queue<_>()
    let count = ref 0
    while true do
        let! msg = inbox.Receive()
        match msg with
        | Start (work, reply) -> queue.Enqueue((work, reply))
        | Finished -> decr count
        if count.Value < limit && queue.Count > 0 then
            incr count
            let work, reply = queue.Dequeue()
            // Start it in a thread pool (on background)
            Async.Start(async {
                let! x = work
                do! async { reply.Reply x }
                inbox.Post(Finished)
            })
    })
// given a synchronous list of tasks, run each task asynchronously,
// return calculated values in original order
let worker lst =
    // this doesn't work as expected, it waits for each reply
    let agent = threadingLimitAgent 10
    lst
    |> List.map (fun x ->
        agent.PostAndReply(
            fun replyChannel -> Start(x, replyChannel)))
Now, with this in place, the original code would become:
let doWork() =
    [1 .. 10]
    |> List.map (fun i -> async {
        do! Async.Sleep (100 * i) // longest thread will run 1 sec
        return i * i              // some complex calculation returning a certain type
    })
    |> worker // worker is not working (correct output, runs 5.5s)
All in all, the output is correct (it does calculate and propagate back the replies), but it does not do so in the (limited) set of threads.
I've been playing around a bit, but think I'm missing the obvious (and besides, who knows, someone may like the idea of a limited-threads mailbox processor that returns its calculations in order).
The problem is the call to agent.PostAndReply. PostAndReply blocks until the work has finished, so calling it inside List.map causes the work to be executed sequentially. One solution is to use PostAndAsyncReply, which does not block and also returns an async handle for getting the result back.
let worker lst =
    let agent = threadingLimitAgent 10
    lst
    |> List.map (fun x ->
        agent.PostAndAsyncReply(
            fun replyChannel -> Start(x, replyChannel)))
    |> Async.Parallel

let doWork() =
    [1 .. 10]
    |> List.map (fun i -> async {
        do! Async.Sleep (100 * i)
        return i * i
    })
    |> worker
    |> Async.RunSynchronously
That's of course only one possible solution (getting all async handles back and awaiting them in parallel).

ForkJoinPool for parallel processing

I am trying to run some code 1 million times. I initially wrote it using Threads, but this seemed clunky. I started doing some more reading and came across ForkJoin. This seemed like exactly what I needed, but I can't figure out how to translate what I have below into "Scala style". Can someone explain the best way to use ForkJoin in my code?
val l = (1 to 1000000) map { _.toLong }
println("running......be patient")

l.foreach { x =>
  if (x % 10000 == 0) println("got to: " + x)
  val thread = new Thread {
    override def run {
      // my code (API calls) here. writes to file if call success
    }
  }
}
The easiest way is to use par (it will use ForkJoinPool automatically):
val l = (1 to 1000000) map { _.toLong } toList

l.par.foreach { x =>
  if (x % 10000 == 0) println("got to: " + x) // will be executed in parallel
  // your code (API calls) here. It will also run in parallel (but on the same thread as the println above)
}
Another way is to use Future:
import scala.concurrent._
import ExecutionContext.Implicits.global // the global ExecutionContext is backed by a ForkJoinPool

val l = (1 to 1000000) map { _.toLong }
println("running......be patient")

l.foreach { x =>
  if (x % 10000 == 0) println("got to: " + x)
  Future {
    // your code (API calls) here. writes to file if call success
  }
}
If you need work stealing - you should mark blocking code with scala.concurrent.blocking:
Future {
  scala.concurrent.blocking {
    // blocking API call here
  }
}
This tells the ForkJoinPool to compensate for the blocked thread with a new one, so you can avoid thread starvation (but there are some disadvantages).
In Scala, you can use Future and Promise:
val l = (1 to 1000000) map { _.toLong }
println("running......be patient")

l.foreach { x =>
  if (x % 10000 == 0) println("got to: " + x)
  Future {
    println(x)
  }
}

Does the rate of retries per thread rise as the number of threads changing a Clojure ref increases?

I worry about this a little.
Imagine the simplest possible version-control scheme: programmers copy the whole directory from the master repository, and after changing a file they copy it back, provided the master repository is still unchanged. If it has been changed by someone else in the meantime, they must try again.
When the number of programmers increases, it is natural that retries also increase, but the increase might not be proportional to the number of programmers.
If ten programmers work and a task takes an hour per person, at least ten hours are needed to complete all the work.
If they are earnest, about 9 + 8 + 7 + ... + 1 = 45 man-hours come to nothing.
With a hundred programmers, about 99 + 98 + ... + 1 = 4950 man-hours come to nothing.
I tried to count the number of retries and got the following results.
Source
(defn fib [n]
  (if (or (zero? n) (= n 1))
    1
    (+ (fib (dec n)) (fib (- n 2)))))

(defn calc! [r counter-A counter-B counter-C n]
  (dosync
    (swap! counter-A inc)
    ;;(Thread/sleep n)
    (fib n)
    (swap! counter-B inc)
    (alter r inc)
    (swap! counter-C inc)))

(defn main [thread-num n]
  (let [r         (ref 0)
        counter-A (atom 0)
        counter-B (atom 0)
        counter-C (atom 0)]
    (doall (pmap deref
             (for [_ (take thread-num (repeat nil))]
               (future (calc! r counter-A counter-B counter-C n)))))
    (println thread-num " Thread. #ref:" @r)
    (println "A:" @counter-A ", B:" @counter-B ", C:" @counter-C)))
CPU: 2.93GHz Quad-Core Intel Core i7
result
user> (time (main 10 25))
10 Thread. #ref: 10
A: 53 , B: 53 , C: 10
"Elapsed time: 94.412 msecs"
nil
user> (time (main 100 25))
100 Thread. #ref: 100
A: 545 , B: 545 , C: 100
"Elapsed time: 966.141 msecs"
nil
user> (time (main 1000 25))
1000 Thread. #ref: 1000
A: 5507 , B: 5507 , C: 1000
"Elapsed time: 9555.165 msecs"
nil
I changed the job to (Thread/sleep n) instead of (fib n) and got similar results.
user> (time (main 10 20))
10 Thread. #ref: 10
A: 55 , B: 55 , C: 10
"Elapsed time: 220.616 msecs"
nil
user> (time (main 100 20))
100 Thread. #ref: 100
A: 689 , B: 689 , C: 117
"Elapsed time: 2013.729 msecs"
nil
user> (time (main 1000 20))
1000 Thread. #ref: 1000
A: 6911 , B: 6911 , C: 1127
"Elapsed time: 20243.214 msecs"
nil
In the Thread/sleep case, I think retries could increase even more than in these results, because the CPU is available.
Why don't retries increase?
Thanks.
Because you are not actually spawning 10, 100 or 1000 threads! Creating a future does not always create a new thread; it uses a thread pool behind the scenes, which keeps queuing the jobs (Runnables, to be technical). That pool is a cached thread pool which reuses its threads for running the jobs.
So in your case you are not actually spawning 1000 threads. If you want to see the retries in action, get a level below future: create your own thread pool and push Runnables into it.
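For example, here is a rough sketch of that idea (main-pool is a made-up name; it assumes the calc! function from the question is in scope): one job per requested thread is submitted to a fixed-size pool, so the jobs genuinely contend for the ref at the same time.

(import '(java.util.concurrent Executors TimeUnit))

(defn main-pool [thread-num n]
  (let [r         (ref 0)
        counter-A (atom 0)
        counter-B (atom 0)
        counter-C (atom 0)
        pool      (Executors/newFixedThreadPool thread-num)]
    (dotimes [_ thread-num]
      ;; Executor/execute takes a Runnable; Clojure fns qualify
      (.execute pool (fn [] (calc! r counter-A counter-B counter-C n))))
    (.shutdown pool)
    (.awaitTermination pool 1 TimeUnit/MINUTES)
    (println thread-num "threads, ref:" @r)
    (println "A:" @counter-A ", B:" @counter-B ", C:" @counter-C)))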
self answer
I modified the main function not to use pmap and got results which work out as calculated.
(defn main [thread-num n]
  (let [r         (ref 0)
        counter-A (atom 0)
        counter-B (atom 0)
        counter-C (atom 0)]
    (doall (map deref (doall (for [_ (take thread-num (repeat nil))]
                               (future (calc! r counter-A counter-B counter-C n))))))
    (println thread-num " Thread. #ref:" @r)
    (println "A:" @counter-A ", B:" @counter-B ", C:" @counter-C)))
fib
user=> (main 10 25)
10 Thread. #ref: 10
A: 55 , B: 55 , C: 10
nil
user=> (main 100 25)
100 Thread. #ref: 100
A: 1213 , B: 1213 , C: 100
nil
user=> (main 1000 25)
1000 Thread. #ref: 1000
A: 19992 , B: 19992 , C: 1001
nil
Thread/sleep
user=> (main 10 20)
10 Thread. #ref: 10
A: 55 , B: 55 , C: 10
nil
user=> (main 100 20)
100 Thread. #ref: 100
A: 4979 , B: 4979 , C: 102
nil
user=> (main 1000 20)
1000 Thread. #ref: 1000
A: 491223 , B: 491223 , C: 1008
nil
