Clojure - Using agents slows down execution too much - multithreading

I am writing a benchmark for a program in Clojure. I have n threads accessing a cache at the same time. Each thread will access the cache x times. Each request should be logged inside a file.
To this end I created an agent that holds the path to the file to be written to. When I want to write I send-off a function that writes to the file and simply returns the path. This way my file-writes are race-condition free.
When I execute my code without the agent it finished in a few miliseconds. When I use the agent, and ask each thread to send-off to the agent each time my code runs horribly slow. I'm talking minutes.
(defn load-cache-only [usercount cache-size]
"Test requesting from the cache only."
; Create the file to write the benchmark results to.
(def sink "benchmarks/results/load-cache-only.txt")
(let [data-agent (agent sink)
; Data for our backing store generated at runtime.
store-data (into {} (map vector (map (comp keyword str)
(repeat "item")
(range 1 cache-size))
(range 1 cache-size)))
cache (create-full-cache cache-size store-data)]
(barrier/run-with-barrier (fn [] (load-cache-only-work cache store-data data-agent)) usercount)))
(defn load-cache-only-work [cache store-data data-agent]
"For use with 'load-cache-only'. Requests each item in the cache one.
We time how long it takes for each request to be handled."
(let [cache-size (count store-data)
foreachitem (fn [cache-item]
(let [before (System/nanoTime)
result (cache/retrieve cache cache-item)
after (System/nanoTime)
diff_ms ((comp str float) (/ (- after before) 1000))]
;(send-off data-agent (fn [filepath]
;(file/insert-record filepath cache-size diff_ms)
;filepath))
))]
(doall (map foreachitem (keys store-data)))))
The (barrier/run-with-barrier) code simply spawns usercount number of threads and starts them at the same time (using an atom). The function I pass is the body of each thread.
The body willl simply map over a list named store-data, which is a key-value list (e.g., {:a 1 :b 2}. The length of this list in my code right now is 10. The number of users is 10 as well.
As you can see, the code for the agent send-off is commented out. This makes the code execute normally. However, when I enable the send-offs, even without writing to the file, the execution time is too slow.
Edit:
I made each thread, before he sends off to the agent, print a dot.
The dots appear just as fast as without the send-off. So there must be something blocking in the end.
Am I doing something wrong?

You need to call (shutdown-agents) when you're done sending stuff to your agent if you want the JVM to exit in reasonable time.
The underlying problem is that if you don't shutdown your agents, the threads backing its threadpool will never get shut down, and prevent the JVM from exiting. There's a timeout that will shutdown the pool if there's nothing else running, but it's fairly lengthy. Calling shutdown-agents as soon as you're done producing actions will resolve this problem.

Related

How to Properly Terminate a Thread which is Blocking (Lparallel Common Lisp)

In the Lparallel API, the recommended way to terminate all threaded tasks is to stop the kernel with (lparallel:end-kernel). But when a thread is blocking—eg, with (pop-queue queue1) waiting for an item to appear in the queue—it will still be active when the kernel is stopped. In this case (at least in SBCL) the kernel shutdown occasionally (but not every time) fails with:
debugger invoked on a SB-KERNEL:BOUNDING-INDICES-BAD-ERROR in thread
#<THREAD "lparallel" RUNNING {1002F04973}>:
The bounding indices 1 and NIL are bad for a sequence of length 0.
See also:
The ANSI Standard, Glossary entry for "bounding index designator"
The ANSI Standard, writeup for Issue SUBSEQ-OUT-OF-BOUNDS:IS-AN-ERROR
debugger invoked on a SB-SYS:INTERACTIVE-INTERRUPT in thread
#<THREAD "main thread" RUNNING {10012E0613}>:
Interactive interrupt at #x1001484328.
I’m assuming this has something to do with the blocking thread not terminating correctly. How should a blocking thread be properly terminated before shutting down the kernel? (The API says kill-tasks should only be used in exceptional circumstances, which I’m taking not to apply to this “normal” shutdown circumstance.)
The problem with killing a thread is that it might happen anywhere, when the thread could be in any unknown state.
The only way to safely terminate a thread it is to let it shutdown itself gracefully, meaning you expect that during normal operations, there is a way for the thread to know it should stop working. Then you can properly clean your resources, close databases, free foreign pointers, log all things, ...
The queues you are using have operations that can timeout, that is a simple yet safe way to ensure you can avoid blocking forever and exit properly. But that's not the only option (you can use them in addition to what is shown below).
 Shared / global flag
When a timeout occurs, or when you receive a message, you check a global boolean variable (or one that is shared among all interested threads). That's also a simple way to exit, and it can be read by multiple threads. This is however a concurrent access, so you should use locks or atomic operations (http://www.sbcl.org/manual/#Atomic-Operations), for example use defglobal and a fixnum type with atomic-incf, etc.
 Control messages
Send control data in the queues and use them to determine how to shutdown gracefully, and how to propagate the information down the pipes, or how to restart things. This is safe (just message-passing) and allows any kind of control you might want to implement in your thread.
(defpackage :so (:use :cl :bt :lparallel.queue))
(in-package :so)
Let's define two services.
The first one echoes back its input:
(defun echo (in out)
(lambda ()
(loop
for value = (pop-queue in)
do (push-queue value out)
until (eq value :stop))))
Notice how it is expected to finish properly when given a :stop input, and how it also propagates the :stop message to its output queue.
The second thread will perform a modular addition, and also sleeps a bit between requests:
(defun modulo-adder (x m in out)
(lambda ()
(loop
for value = (progn (sleep 0.02)
(pop-queue in))
do (push-queue (typecase value
(keyword value)
(number (mod (+ x value) m)))
out)
until (eq value :stop))))
Create queues:
(defparameter *q1* (make-queue))
(defparameter *q2* (make-queue))
Create threads:
(progn
(bt:make-thread (echo *q1* *q2*) :name "echo")
(bt:make-thread (modulo-adder 5 1024 *q2* *q1*) :name "adder"))
Both threads are connected to each others in a circular fashion, creating an infinite loop of additions. No value is currently exchanged between threads, and you can see them running for example with slime-list-threads or any other implementation-provided way; In any case (bt:all-threads) returns a list.
slime-list-threads
10 adder Running
11 echo Running
...
Add an item, now there is an infinite exchange of data between threads:
(push-queue 10 *q1*)
Wait, then stop them both:
(push-queue :stop *q1*)
Both threads stopped gracefully (they are no more visible in lists of threads).
We can inspect what remains in the queues (result vary from one test to another):
(list (try-pop-queue *q1*)
(try-pop-queue *q2*))
(99 NIL)
(list (try-pop-queue *q1*)
(try-pop-queue *q2*))
(:STOP NIL)
(list (try-pop-queue *q1*)
(try-pop-queue *q2*))
(NIL NIL)
Interrupting a thread
You create a service, controlled by messages or a global flag, but then you have a bug and the thread hangs. Instead of killing it and lose everything, you want at least to unwind the thread stack properly. This is a dangerous too, but you can use bt:interrupt to stop a thread anywhere it is running right now and execute a function.
(define-condition stop () ())
(defun signal-stop ()
(signal 'stop))
(defun endless ()
(let ((output *standard-output*))
(lambda ()
(print "START" output)
(unwind-protect (handler-case (loop)
(stop ()
(print "INTERRUPTED" output)))
(print "STOP" output)))))
Start it:
(bt:make-thread (endless) :name "loop")
This prints "START" and loops.
Then we interrupt it:
(bt:interrupt-thread (find "loop"
(bt:all-threads)
:test #'string=
:key #'bt:thread-name)
#'signal-stop)
The following is printed:
"INTERRUPTED"
"STOP"
Those messages would not be printed if the thread was killed, but note that you could still manage to have corrupted data given how random the interruption is. Also, it can unblock blocking calls like sleep or pop-queue.

F# / MailBoxProcessor is unresponsive to PostAndReply under nearly 100% load

I have a MailBoxProcessor, which does the following things:
Main loop (type AsyncRunner: https://github.com/kkkmail/ClmFSharp/blob/master/Clm/ContGen/AsyncRun.fs#L257 – the line number may change as I keep updating the code). It generates some "models", compiles each of them into a model specific folder, spawns them as external processes, and then each model uses WCF to "inform" AsyncRunner about its progress by calling updateProgress. A model may take several days to run. Once any of the models is completed, the runner generates / spawns more. It is designed to run at 100% processor load (but with priority: ProcessPriorityClass.BelowNormal), though I can specify a smaller number of logical cores to use (some number between 1 and Environment.ProcessorCount). Currently I "async"-ed almost everything that goes inside MailBoxProcessor by using … |> Async.Start to ensure that I "never ever" block the main loop.
I can "ask" the runner (using WCF) about its state by calling member this.getState () = messageLoop.PostAndReply GetState.
OR I can send some commands to it (again using WCF), e.g. member this.start(), member this.stop(), …
Here is where it gets interesting. Everything works! However, if I run a "monitor", which would ask for a state by effectively calling PostAndReply (exposed as this.getState ()) in an infinite loop, the after a while it sort of hangs up. I mean that it does eventually return, but with some unpredictably large delays (like a few minutes). At that same time, I can issue commands and they do return fast while getState still has not returned.
Is it possible to make it responsive at nearly 100% load? Thanks a lot!
I would suggest not asyncing anything(other than your spawning of processes) in your main program, since your code creates additional processes. Your main loop is waiting on the loop return to continue before processing the GetState() method.

Achieving multiple locks in clojure

I'm new to Clojure and am writing a web application. It includes a function fn performed on user user-id which includes several steps of reading and writing to the database and file system. These steps cannot be performed simultaneously by multiple threads (will cause database and file system inconsistencies) and I don't believe they can be performed using a database transaction. However, they are specific to one user and thus can be performed simultaneously for different users.
Thus, if a http request is made to perform fn for a specific user-id I need to make sure that it is completed before any http requests can perform fn for this user-id
I've come up with a solution that seems to work in the REPL but have not tried it in the web server yet. However, being unexperienced with Clojure and threaded programming I'm not sure whether this is a good or safe way to solve the problem. The following code has been developed by trial-and-error and uses the locking function - which seems to go against the "no locks" philosophy of Clojure.
(ns locking.core)
;;; Check if var representing lock exists in namespace
;;; If not, create it. Creating a new var if one already
;;; exists seems to break the locking.
(defn create-lock-var
[var-name value]
(let [var-sym (symbol var-name)]
(do
(when (nil? (ns-resolve 'locking.core var-sym))
(intern 'locking.core var-sym value))
;; Return lock var
(ns-resolve 'locking.core var-sym))))
;;; Takes an id which represents the lock and the function
;;; which may only run in one thread at a time for a specific id
(defn lock-function
[lock-id transaction]
(let [lock (create-lock-var (str "lock-id-" lock-id) lock-id)]
(future
(locking lock
(transaction)))))
;;; A function to test the locking
(defn test-transaction
[transaction-count sleep]
(dotimes [x transaction-count]
(Thread/sleep sleep)
(println "performing operation" x)))
If I open three windows in REPL and execute these functions, it works
repl1 > (lock-function 1 #(test-transaction 10 1000)) ; executes immediately
repl2 > (lock-function 1 #(test-transaction 10 1000)) ; waits for repl1 to finish
repl2 > (lock-function 2 #(test-transaction 10 1000)) ; executes immediately because id=2
Is this reliable? Are there better ways to solve the problem?
UPDATE
As pointed out, the creation of the lock variable is not atomic. I've re-written the lock-function function and it seems to work (no need for create-lock-var)
(def locks (atom {}))
(defn lock-transaction
[lock-id transaction]
(let [lock-key (keyword (str "lock-id-" lock-id))]
(do
(compare-and-set! locks (dissoc #locks lock-key) (assoc #locks lock-key lock-id))
(future
(locking (lock-key #locks)
(transaction))))))
Note: Renamed the function to lock-transaction, seems more appropriate.
Don't use N vars in a namespace, use an atom wrapped around 1 hash-map mapping N symbols to N locks. This fixes your current race condition, avoids creating a bunch of silly vars, and is easier to write anyway.
Since you're making a web app, I have to warn you: even if you do manage to get in-process locking right (which is not easy in itself), it will be for nothing as soon as you deploy your web server on more than one machine (which is almost mandatory if you want your app to be highly-available).
So basically, if you want to use locking, you'd better use distributed locking. From this point on, this discussion is not Clojure-specific, since Clojure's concurrency tools won't be especially helpful here.
For distributed locking, you could use something like Zookeeper. If you don't want to set up a whole Zookeeper cluster just for this, maybe you can compromise by using a Redis database (the Carmine library gives you distributed locks out of the box), although last time I heard Redis locking is not 100% reliable.
Now, it seems to me locking is not especially a requirement, and is not the best approach, especially if you're striving for idiomatic Clojure. How about using a queue instead ? Some popular JVM message brokers (such as HornetQ and ActiveMQ) give you Message Grouping, which guarantees that messages of the same group-id will be processed (serially) by the same consumer. All you have to do is have some threads listen to the right queue and set the user-id as the group id for your messages.
HACK: If you don't want to set up a distributed message broker, maybe you could get around by enabling sticky sessions on you load balancer, and using such a message broker in-VM.
By the way, don't name your function fn :).

Upper limit for number of jobs in a go block?

Here is the code:
(ns typedclj.async
(:require [clojure.core.async
:as a
:refer [>! <! >!! <!!
go chan buffer
close! thread
alts! alts!! timeout]]
[clj-http.client :as -cc]))
(time (dorun
(let [c (chan)]
(doseq [i (range 10 1e4)]
(go (>! c i))))))
And I got an error:
Exception in thread "async-dispatch-12" java.lang.AssertionError: Assert failed: No more than 1024 pending puts are allowed on a single channel. Consider using a windowed buffer.
(< (.size puts) impl/MAX-QUEUE-SIZE)
at clojure.core.async.impl.channels.ManyToManyChannel.put_BANG_(channels.clj:150)
at clojure.core.async.impl.ioc_macros$put_BANG_.invoke(ioc_macros.clj:959)
at typedclj.async$eval11807$fn__11816$state_machine__6185__auto____11817$fn__11819.invoke(async.clj:19)
at typedclj.async$eval11807$fn__11816$state_machine__6185__auto____11817.invoke(async.clj:19)
at clojure.core.async.impl.ioc_macros$run_state_machine.invoke(ioc_macros.clj:940)
at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invoke(ioc_macros.clj:944)
at typedclj.async$eval11807$fn__11816.invoke(async.clj:19)
at clojure.lang.AFn.run(AFn.java:22)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)...
According to http://martintrojer.github.io/clojure/2013/07/07/coreasync-and-blocking-io/
... This will break the 1 job = 1 thread knot, thus this thread
parking will allow us to scale the number of jobs way beyond any
thread limit on the platform (usually around 1000 on the JVM).
core.async gives (blocking) channels and a new (unbounded) thread pool
when using 'thread'. This (in effect) is just some sugar over using
java threads (or clojure futures) and BlockingQueues from
java.util.concurrent. The main feature is go blocks in which threads
can be parked and resumed on the (potentially) blocking calls dealing
with core.async's channels...
Is 1e4 jobs already too many? What is the upper limit then?
I don't usually rant like this so I hope you will forgive me this one transgression:
In a more prefect world every programmer would repeat to themselves "there is no such thing as an unbounded queue" five times before sleeping and first thing upon waking. This mode of thinking requires firguring out how backpressure will be handled in your system so when there is a slowdown somewhere in the process the parts before that have a way to find out about it and slow themselves down in response. In core.async the default back pressure is immediate because the default buffer size is zero. No go block succeeds in putting something into a chan until someone is ready to consume it.
chans look basically like this:
"queue of pending puts" --> buffer --> "queue of pending takes"
The putter and taker queues are intended to allow time for the two processes that are communicating via this pipe to schedule themselves so progress can be made. Without these there would be no room for threads to schedule and deadlocks would happen. They are NOT intended to be used as the buffer. thats what the buffer in the middle is for, and this was the design behind making that the only one that has a explicit size. explicitly set the buffer size for your system by setting the size of the buffer in the chan:
user> (time (dorun
(let [c (chan 1e6)]
(doseq [i (range 10 1e4)]
(go (>! c i))))))
"Elapsed time: 83.526679 msecs"
nil
In this case I have "calculated" that my system as a whole will be in a good state if there are up to a million waiting jobs. Of course your real world expierences will be different, and very much unique to your situation.
Thanks for your patience,
The limit of unconsumed puts is the size of the channels buffer plus the size of the queue.
The queue size in core.async is limited to 1024 but one should not rely on that.

Canonical Way to Ensure Only One Instance of a Service Is Running / Starting / Stopping in Clojure?

I'm writing a stateful server in Clojure backed by Neo4j that can serve socket requests, like HTTP. Which means, of course, that I need to be able to start and stop socket servers from within this server. Design-wise, I would want to be able to declare a "service" within this server and start and stop it.
What I'm trying to wrap my mind around in Clojure is how to ensure that starting and stopping these services is thread-safe. This server I'm writing will have NREPL embedded inside it and process incoming requests in a parallel way. Some of these requests will be administrative: start service X, stop service Y. Which opens up the possibility that two start requests come in at the same time.
Starting should synchronously check a "running" flag and a "starting" flag and fail if either are set. In the same transaction, the "starting" flag should be set.
After the "starting" flag is set, the transaction closes. That makes the "starting" flag visible to other transactions.
Then the (start) function actually starts the service.
If (start) succeeds, the "running" and "starting" flags are synchronously set.
If (start) fails, the "starting" flag is set and the exception is returned.
Stopping needs the same thing, checking a "running" flag and checking and setting it's own "stopping" flag.
I'm trying to reason through all possible combinations of (start) and (stop).
Have I missed anything?
Is there a library for this already? If not, what should a library like this look like? I'll open source it and put it on Github.
Edit:
This is what I have so far. There's a hole I can see though. What am I missing?
(ns extenium.db
(:require [clojure.tools.logging :as log])
(:import org.neo4j.graphdb.factory.GraphDatabaseFactory))
(def ^:private
db- (ref {:ref nil
:running false
:starting false
:stopping false}))
(defn stop []
(dosync
(if (or (not (:running (ensure db-)))
(:stopping (ensure db-)))
(throw (IllegalStateException. "Database already stopped or stopping."))
(alter db- assoc :stopping true)))
(try
(log/info "Stopping database")
(.shutdown (:ref db-))
(dosync
(alter db- assoc :ref nil))
(log/info "Stopped database")
(finally
(dosync
(alter db- assoc :stopping false)))))
In the try block, I log, then call .shutdown, then log again. If the first log fails (I/O exceptions can happen), then (:stopping db-) is set to false, which unblocks it and is fine. .shutdown is a void function from Neo4j, so I don't have to evaluate a return value. If it fails, (:stopping db-) is set to false, so that's fine too. Then I set the (:ref db-) to nil. What if that fails? (:stopping db-) is set to false, but the (:ref db-) is left hanging. So that's a hole. Same case with the second log call. What am I missing?
Would this be better if I just used Clojure's locking primitives instead of a ref dance?
This is actually a natural fit for a simple lock:
(locking x
(do-stuff))
Here x is the object on which to synchronize.
To elaborate: starting and stopping a service is a side effect; side effects should not be initiated from inside a transaction, except possibly as Agent actions. Here though locks are exactly what the design calls for. Note that there's nothing wrong in using them in Clojure when they are a good fit for the problem at hand, in fact I would say locking is the canonical solution here. (See Stuart Halloway's Lancet, introduced in Programming Clojure (1st ed.), for an example of a Clojure library using locks which has seen some widespread use, mostly in Leiningen.)
Update: Adding fail-fast behaviour:
This is still a good fit for a lock, namely a java.util.concurrent.locks.ReentrantLock (follow link for Javadoc):
(import java.util.concurrent.locks.ReentrantLock)
(def lock (ReentrantLock.))
(defn start []
(if (.tryLock lock)
(try
(do-stuff)
(finally (.unlock lock)))
(do-other-stuff)))
(do-stuff) will be executed if lock acquisition succeeds; otherwise, (do-other-stuff) will happen. Current thread will not block in either case.
This sounds like a good use case for agents, they allow you to serialize changes to a piece of mutable state, the Clojure Agents documentation has a good overview.
You can use the error handler and agent-error methods to handle exceptions and never need to worry about locks or race conditions.
(def service (agent {:status :stopped}))
(defn start-service [{:keys [status] :as curr}]
(if (= :stopped status)
(do
(println "starting service")
{:status :started})
(do
(println "service already running")
curr)))
;; start the service like this
(send-off service start-service)
;; gets the current status of the service
#service

Resources