How to initialize atom in thread safe manner? - multithreading

Let's say I have an atom:
(def my-atom (atom nil))
Then I initialize it as follows:
(defn init-atom [init-value]
  (when (nil? @my-atom)
    (reset! my-atom init-value)))
If init-atom is called concurrently from different threads, a race condition may occur. I'm looking for a way to safely and correctly initialize an atom. Is there anything for that?
UPD:
Actually I'm initializing it as follows:
(defn init-atom [produce-init-fn]
  (when (nil? @my-atom)
    (reset! my-atom (produce-init-fn))))
produce-init-fn may contain side effects.

The following will make sure that the atom is initialized only once:
(defn init-atom [init-value]
  ;; keep the existing value when the atom is already initialized
  (swap! my-atom #(if (nil? %) init-value %)))
Atom and swap! semantics guarantee that the function passed to swap! will be executed atomically.
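For a plain value, compare-and-set! expresses the same "only if still nil" rule directly and also tells the caller whether it was the one that initialized the atom. A minimal sketch, assuming the hypothetical names demo-atom and init-once! (they are not from the question), in which eight futures race and exactly one should win:

```clojure
;; Hypothetical names, for illustration only.
(def demo-atom (atom nil))

(defn init-once!
  "Sets demo-atom to init-value only if it is still nil.
  Returns true for the single caller that performed the initialization."
  [init-value]
  (compare-and-set! demo-atom nil init-value))

;; Race eight futures against each other and collect who won.
(def results
  (->> (range 8)
       (mapv (fn [i] (future (init-once! i))))
       (mapv deref)))

(println "winners:" (count (filter true? results)) "value:" @demo-atom)
```

Which future wins is not deterministic, only the number of winners is. In a standalone script, calling (shutdown-agents) at the end lets the JVM exit promptly after using futures.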
If you pass a function producing the init value then it won't work as swap! might invoke the function multiple times in case of conflicting transactions. Then you need to use some kind of locking like in the other answer:
(let [o (Object.)]
  (defn init-atom [init-value-fn]
    (locking o
      (swap! my-atom #(if (nil? %) (init-value-fn) %)))))
init-value-fn still might be called more than once if there are other concurrent transactions with my-atom.
If you need to support lazy initialization and the init-value-fn is known upfront and the same for all the threads, you can wrap the call in a delay. It will then be called only once, and its result will be cached and reused:
(def my-init-value (delay (init-value-fn)))
(defn init-atom []
  (swap! my-atom #(if (nil? %) @my-init-value %)))
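To see the delay approach at work, here is a small sketch under stated assumptions: expensive-init, init-calls, init-value, lazy-atom and init-lazy-atom! are hypothetical names, and expensive-init stands in for a side-effecting init function. It counts how often the init body really runs when several threads initialize at once:

```clojure
;; Hypothetical names; expensive-init stands in for a side-effecting init fn.
(def init-calls (atom 0))

(defn expensive-init []
  (swap! init-calls inc)        ; count how often the body really runs
  {:ready true})

(def init-value (delay (expensive-init)))  ; body runs on first deref only
(def lazy-atom (atom nil))

(defn init-lazy-atom! []
  ;; swap! may retry this function, but a retry only re-reads the cached delay
  (swap! lazy-atom #(if (nil? %) @init-value %)))

;; Eight threads race to initialize; wait for all of them.
(run! deref (mapv (fn [_] (future (init-lazy-atom!))) (range 8)))

(println "init calls:" @init-calls "value:" @lazy-atom)
```

Even if swap! retries under contention, the retried function only dereferences the already realized delay, so the side effect is not repeated.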

This should do the trick:
(let [o (Object.)]
  (defn init-atom [init-value]
    (locking o
      (when (nil? @my-atom)
        (reset! my-atom init-value)))))

Related

swap! value in atom (nested-map) Clojure

I'm trying to update a nested counter in an atom (a map) from multiple threads, but getting unpredictable results.
(def a (atom {:id {:counter 0}}))
(defn upvote [id]
  (swap! a assoc-in [(keyword id) :counter] (inc (get-in @a [(keyword id) :counter]))))
(dotimes [i 10] (.start (Thread. (fn [] (upvote "id")))))
(Thread/sleep 12000)
(prn @a)
I'm new to Clojure so very possible I'm doing something wrong, but can't figure out what. It's printing a counter value with results varying from 4-10, different each time.
I want to atomically update the counter value and hoped that this approach would always give me a counter value of 10. That it would just retry upon failure and eventually get to 10.
It's for an up-vote function that can get triggered concurrently.
Can you see what I'm doing wrong here?
You are updating the atom non-atomically in your code. You first get its value by @a, and then apply it using the swap function. The value may change in between.
The atomic way to update the value is to use a pure function within swap, without referring to the previous atom value via @:
(defn upvote [id]
  (swap! a update-in [(keyword id) :counter] inc))
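A quick way to check the fix is to run the ten upvotes from ten threads and wait for all of them before reading the atom. This is only a sketch: votes and upvote! are hypothetical names chosen so they do not clash with the question's definitions, and futures replace the raw Thread objects so that no sleep is needed.

```clojure
;; Hypothetical names mirroring the question's atom and function.
(def votes (atom {:id {:counter 0}}))

(defn upvote! [id]
  ;; update-in receives the current value from swap!, so retries stay correct
  (swap! votes update-in [(keyword id) :counter] inc))

;; Start ten concurrent upvotes, then block until every one has finished.
(let [workers (mapv (fn [_] (future (upvote! "id"))) (range 10))]
  (run! deref workers))

(println @votes) ; {:id {:counter 10}}
```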

Clojure how to get access to one field from two threads?

Can't understand multithreading in clojure. Can't find examples of REAL multithreading. Most samples with atoms, refs, vars are singlethreaded. So, I have a quest. Two threads gaining access to one field, each thread can change it. I use atom for this purpose, so the Code is:
(do
  (def field (atom "v0"))
  (defn f1 []
    (dotimes [i 100000]
      (if (= i 9999)
        (reset! field "v1"))))
  (defn f2 []
    (dotimes [i 100000]
      (if (= i 777)
        (reset! field "v2"))))
  (do
    (deref (future (Thread/sleep 10) (f1))
           0 f2)
    (prn @field)))
But nothing, the value of field is "v0". How to make normal twothreaded example with cycles in each thread and with access to variable???
Look at the docs of deref:
clojure.core/deref
([ref] [ref timeout-ms timeout-val])
returns the in-transaction-value of ref, else returns the
most-recently-committed value of ref. When applied to a var, agent
or atom, returns its current state. When applied to a delay, forces
it if not already forced. When applied to a future, will block if
computation not complete. When applied to a promise, will block
until a value is delivered. The variant taking a timeout can be
used for blocking references (futures and promises), and will return
timeout-val if the timeout (in milliseconds) is reached before a
value is available. See also - realized?.
Your timeout is 0, which means deref returns the timeout value immediately.
That value is f2, a function value (not a function call), so f2 is never called, and prn runs before the future has reached the reset! in f1.
if you want "v1" you should deref like:
(deref (future (Thread/sleep 10) (f1)) 100 (f2))
if you want "v2":
(deref (future (Thread/sleep 10) (f1)) 0 (f2))
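If the goal is simply two threads that loop and modify one shared value, a sketch like the following may be closer to what was asked. The names shared and bump-many are hypothetical, and a counter replaces the string so the result does not depend on which thread finishes last:

```clojure
;; Hypothetical names; both threads update the same atom in a loop.
(def shared (atom 0))

(defn bump-many [n]
  (dotimes [_ n]
    (swap! shared inc)))   ; atomic read-modify-write, retried on conflict

(let [t1 (future (bump-many 100000))
      t2 (future (bump-many 100000))]
  @t1                      ; block until the first thread is done
  @t2                      ; block until the second thread is done
  (println @shared))       ; 200000
```

Dereferencing both futures without a timeout is what makes the main thread wait for the work to finish.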

Clojure wait for condition without spinning

I'm implementing a mechanism for a thread to have a queue that contains messages. The queue is built using LinkedBlockingQueue from java.util.concurrent. What I want to achieve is something like the following.
Thread with mailbox:
defn work:
  * do some stuff
  * Get the head of the queue (a message):
    - if it is "hello":
        <do some stuff>
        <recur work fn>
    - if it is "bye":
        <do some stuff>
    - if it is none of the above, add the message to the back of queue
      and restart from "Get the head of the queue"
  * <reaching this point implies terminating the thread>
My first idea that I tried to implement was using a loop wrapped around the * Get the head of the queue step, with a conditional to check the message and add it back to the queue in an :else branch if it did not match any of the clauses. The downside of this was that calling recur in any of the bodies of the clauses of the cond would always recur the loop, while the recur I need (e.g., in the hello case) should recur the function (i.e., work). So that was not an option. Another downside would be that, if it took a long time for such a message to arrive, the thread would spin indefinitely and eat resources.
The next idea I had (but have not yet implemented) is using a future. The scheme would be as follows.
* Get all the matches I have to match (i.e., "hello" and "bye")
* Start a future and pass it the list of messages:
    * While the queue does not contain any of the messages
        recur
    * when found, return the first element that matches.
* Wait for the future to deliver.
* if it is "hello":
    <do some stuff>
    <recur work fn>
  if it is "bye":
    <do some stuff>
When doing it this way I get almost what I want:
* Receiving either "hello" or "bye" blocks until I have either one.
* I can make an indefinite number of clauses to match the message.
* I have extracted the looping behaviour into a future that blocks, which has the nice side-effect that each time I evaluate my cond I'm sure I have a matching message and don't have to worry about retrying.
One thing I really would like, but can't imagine how to achieve, is that the future in this case does not spin. As it stands it would keep eating up precious CPU resources traversing the queue indefinitely, while it might be perfectly normal to never receive one of the messages it is looking for.
Perhaps it would make sense to abandon the LinkedBlockingQueue and trade it in for a data structure that has a method, say, getEither(List<E> oneOfThese) that blocks until one of these elements is available.
Another thought I had, which is a way I could possibly do it in Java, is having the aforementioned getEither() operation on the queue call wait() if none of the elements are in the queue. When another thread puts a message in the queue I can call notify() so that each thread will check the queue against its list of wanted messages.
Example
The code below works fine. However, it has the spinning problem. It's basically a very elementary example of what I'm trying to achieve.
(def queue (ref '()))
(defn contains-element [elements collection]
  (some (zipmap elements (repeat true)) collection))
(defn has-element
  [col e]
  (some #(= e %) col))
(defn find-first
  [f coll]
  (first (filter f coll)))
; This function is blocking, which is what I want.
; However, it spins and thus uses a LOT of CPU,
; which is *not* what I want.
(defn get-either
  [getthese queue]
  (dosync
    (let [match   (first (filter #(has-element getthese %) @queue))
          newlist (filter #(not= match %) @queue)]
      (if (not (nil? match))
        (do (ref-set queue newlist)
            match)
        (do (Thread/sleep 500)
            (recur))))))
(defn somethread
  [iwantthese]
  (let [element (get-either iwantthese queue)
        wanted  (filter #(not= % element) iwantthese)]
    (println (str "I got " element))
    (Thread/sleep 500)
    (recur wanted)))
(defn test
  []
  (.start (Thread. (fn [] (somethread '(3 4 5)))))
  (dosync (alter queue #(cons 1 %)))
  (println "Main: added 1")
  (Thread/sleep 1000)
  (dosync (alter queue #(cons 2 %)))
  (println "Main: added 2")
  (Thread/sleep 1000)
  (dosync (alter queue #(cons 3 %)))
  (println "Main: added 3")
  (Thread/sleep 1000)
  (dosync (alter queue #(cons 4 %)))
  (println "Main: added 4")
  (Thread/sleep 1000)
  (dosync (alter queue #(cons 5 %)))
  (println "Main: added 5"))
Any tips?
(In case anyone noticed, yes, this is like actors and the purpose is an implementation in Clojure for academic purposes)
You need 2 queues instead of one: an incoming queue and a "dead-letter" queue.
1. A "thread" should read from the incoming queue in a blocking way (LinkedBlockingQueue.take(), core.async/<! or using agents).
2. If the message doesn't match any clause:
   - Put the message on the end of the dead queue.
   - Go to 1.
3. If the message matches a clause:
   - Run the clause work.
   - For each message in the dead queue, match against the clauses, removing the ones that are matched.
   - Go to 1.
See below for the two implementations.
Agents
Agents are quite similar to actors, the "only" difference is that you send data/messages to actors but you send functions to agents. A possible implementation would be:
(defn create-actor [behaviour]
(agent {:dead-queue []
:behaviour behaviour}))
dead-queue will contain messages that didn't match any of the clauses. This is basically your "end of the queue".
behaviour should be some map/vector of match-fn to fn to run. In my particular implementation, I have chosen a map, where keys are the element to match and values are the fn to run when the new item matches:
(def actor (create-actor {3 println
4 (partial println "Got a ")
5 #(println "Got a " %)}))
You will probably require a more sophisticated behaviour data structure. The only thing important is to know if the element was processed or not, so you know if the element has to go to the dead queue or not.
To send messages to the actor:
(defn push [actor message]
(send actor
(fn [state new-message]
(if-let [f (get-in state [:behaviour new-message])]
(do
(f new-message)
state)
(update-in state [:dead-queue] conj new-message)))
message))
So if there is a match on the behaviour, the message is processed immediately. If not, it is stored in the dead queue. You could also try to match/process all the messages in the dead queue after processing the new message, if you expect that the behaviours are not pure functions. In this example implementation this is not possible.
We could change the behaviour of the actor to give the messages on the dead queue a chance to be processed:
(defn change-behaviour [actor behaviour]
  (send actor
        (fn [state new-behaviour]
          (let [to-process     (filter new-behaviour (:dead-queue state))
                new-dead-queue (vec (remove (set to-process) (:dead-queue state)))]
            (doseq [old-message to-process
                    :let [f (get new-behaviour old-message)]]
              (f old-message))
            {:behaviour  new-behaviour
             :dead-queue new-dead-queue}))
        behaviour))
And an example of using it:
(push actor 4)
(push actor 18)
(push actor 1)
(push actor 18)
(push actor 5)
(change-behaviour actor {18 (partial println "There was an")})
And the same solution based on core.async:
(defn create-actor [behaviour]
(let [queue (async/chan)]
(async/go-loop [dead-queue []
behaviour behaviour]
(let [[type val] (async/<! queue)]
(if (= type :data)
(if-let [f (get behaviour val)]
(do
(f val)
(recur dead-queue behaviour))
(recur (conj dead-queue val) behaviour))
(let [to-process (filter val dead-queue)
new-dead-queue (vec (remove (set to-process) dead-queue))]
(doseq [old-msg to-process
:let [f (get val old-msg)]]
(f old-msg))
(recur new-dead-queue val)))))
queue))
(defn push [actor message]
(async/go
(async/>! actor [:data message])))
(defn change-behaviour [actor behaviour]
(async/go
(async/>! actor [:behaviour behaviour])))
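The steps above also mention LinkedBlockingQueue.take() as a blocking read. A rough sketch of that variant follows, under stated assumptions: run-mailbox, inbox, worker and the :stop sentinel are hypothetical, and for brevity the dead-letter messages are only collected, not replayed. The point it illustrates is that .take parks the thread instead of spinning.

```clojure
(import 'java.util.concurrent.LinkedBlockingQueue)

;; Hypothetical sketch: run-mailbox and the :stop sentinel are not from the answer.
(defn run-mailbox
  "Blocks on the inbox; returns [handled dead-letters] once :stop arrives."
  [^LinkedBlockingQueue inbox behaviour]
  (loop [handled []
         dead    []]
    (let [msg (.take inbox)]                 ; parks the thread, no spinning
      (cond
        (= msg :stop)             [handled dead]
        (contains? behaviour msg) (do ((get behaviour msg) msg)
                                      (recur (conj handled msg) dead))
        :else                     (recur handled (conj dead msg))))))

(def inbox (LinkedBlockingQueue.))
(def worker (future (run-mailbox inbox {3 println, 4 println})))

;; 1 and 18 match no clause, so they end up in the dead-letter vector.
(doseq [m [1 3 18 4 :stop]]
  (.put inbox m))

(println @worker) ; [[3 4] [1 18]]
```

Note that LinkedBlockingQueue rejects nil elements, so a real mailbox would need a non-nil message representation.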
Have you considered using core.async? It provides what you need in a lightweight way.

Clojure core.async, CPU hangs after timeout. Anyway to properly kill macro thread produced by (go..) block?

Based on core.async walk through example, I created below similar code to handle some CPU intensive jobs using multiple channels with a timeout of 10 seconds. However after the main thread returns, the CPU usage remains around 700% (8 CPUs machine). I have to manually run nrepl-close in emacs to shut down the Java process.
Is there any proper way to kill macro thread produced by (go..) block ? I tried close! each chan, but it doesn't work. I want to make sure CPU usage back to 0 by Java process after main thread returns.
(defn RETURNED-STR-FROM-SOME-CPU-INTENSE-JOB [] (do... (str ...)))
(let [n 1000
cs (repeatedly n chan)]
(doseq [c cs]
(go
(>! c (RETURNED-STR-FROM-SOME-CPU-INTENSE-JOB ))))
(dotimes [i n]
(let [[result source] (alts!! (conj cs (timeout 10000))) ] ;;wait for 10 seconds for each job
(if (list-contains? cs source) ;;if returned chan belongs to cs
(prn "OK JOB FINISHED " result)
(prn "JOB TIMEOUT")
)))
(doseq [i cs]
(close! i)) ;;not useful for "killing" macro thread
(prn "JOBS ARE DONE"))
;;Btw list-contains? function is used to judge whether an element is in a list
;;http://stackoverflow.com/questions/3249334/test-whether-a-list-contains-a-specific-value-in-clojure
(defn list-contains? [coll value]
(let [s (seq coll)]
(if s
(if (= (first s) value) true (recur (rest s) value))
false)))
In REPL there seems to be no clean way yet.
I first tried a very dirty way, using the deprecated method Thread.stop:
(doseq [i @threadpool]
  (.stop i))
It seemed to work, as CPU usage dropped once the main thread returned to the REPL, but if I ran the program again in the REPL, it would just hang at the go block part!
Then I googled around and found this blog and it says
One final thing to note: we don't explicitly do any work to shutdown the go routines. Go routines will automatically stop operation when the main function exits. Thus, go routines are like daemon threads in the JVM (well, except for the "thread" part ...)
So I tried again by building my project into an uberjar and running it from a command console, and it turned out that CPU usage dropped immediately when the blinking cursor returned to the console!
Based on answer for another related question How to control number of threads in (go...), I've found a better way to properly kill all the threads started by (go...) block:
First alter the executor var and supply a custom thread pool
;; def, not defonce, so that the executor can be re-defined
;; Number of threads are fixed to be 4
(def my-executor
(java.util.concurrent.Executors/newFixedThreadPool
4
(conc/counted-thread-factory "my-async-dispatch-%d" true)))
(alter-var-root #'clojure.core.async.impl.dispatch/executor
(constantly (delay (tp/thread-pool-executor my-executor))))
Then call .shutdownNow and .awaitTermination method at the end of (go...) block
(.shutdownNow my-executor)
(while (not (.awaitTermination my-executor 10 java.util.concurrent.TimeUnit/SECONDS ) )
(prn "...waiting 10 secs for executor pool to finish") )
[UPDATE]
The shutdown executor method above seems not pure enough. The final solution for my case is to send a function with control of its own timeout into go block, using thunk-timeout function. Credits go to this post. Example below
(defn toSendToGo [args timeoutUnits]
  (let [result   (atom nil)
        timeout? (atom false)]
    (try
      (thunk-timeout
        (fn [] (reset! result (myFunction args))) timeoutUnits)
      (catch java.util.concurrent.TimeoutException e
        (do (prn "!Time out after " timeoutUnits " seconds!!")
            (reset! timeout? true))))
    (if @timeout? (do sth))
    @result))
(let [c ( chan)]
(go (>! c (toSendToGo args timeoutUnits))))
(shutdown-agents)
Implementation-specific, JVM: both agents and channels use a global thread pool, and the termination function for agents iterates and closes all open threads in the VM. Empty the channels first: this action is immediate and non-reversible (especially if you are in a REPL).

Alternate version of swap! also returning swapped out value

I talked about this a bit on IRC's #clojure channel today but would like to go more in detail here. Basically, in order to better understand atoms, swap!, deref and Clojure concurrency as a whole, I'd like to try to write a function which not only returns the value that was swapped-in using swap!, but also the value that was swapped out.
(def foo (atom 42))
.
.
.
((fn [a]
   (do
     (println "swapped out: " @a)
     (println "swapped in: " (swap! a rand-int)))) foo)
may print:
swapped out: 42
swapped in: 14
However if another thread does swap! the same atom between the @a deref and the call to swap! then I may be swapping out a value that is not 42.
How can I write a function which gives back correctly both values (the swapped out and the swapped in)?
I don't care about the various values that the atom does change to: all I want to know is what was the value swapped out.
Can this be written using code that is guaranteed not to deadlock and if so why?
Clojure's swap! is just a spinning compare-and-set. You can define an alternate version that returns whatever you like:
(defn alternate-swap [atom f & args]
  (loop []
    (let [old @atom
          new (apply f old args)]
      (if (compare-and-set! atom old new)
        [old new] ; return value
        (recur)))))
Atoms are uncoordinated, so any attempt to do this outside of the swapping function itself will likely fail. You could write a function that you call instead of swap!, which constructs a function that saves the existing value before applying the real function, and then passes this constructed function to swap!.
user> (def foo (atom []))
#'user/foo
user> (defn save-n-swap! [a f & args]
(swap! a (fn [old-val]
(let [new-val (apply f (cons old-val args))]
(println "swapped out: " old-val "\n" "swapped in: " new-val)
new-val))))
#'user/save-n-swap!
user> (save-n-swap! foo conj 4)
swapped out: []
swapped in: [4]
[4]
user> (save-n-swap! foo conj 4)
swapped out: [4]
swapped in: [4 4]
[4 4]
This example prints the values. It would also make sense to push them to a changelog stored in another atom.
If you want the return value, Stuart's answer is the correct one, but if you are just going to do a bunch of println calls to understand how atoms/refs work, I would recommend adding a watch to the atom/ref: http://clojuredocs.org/clojure_core/1.2.0/clojure.core/add-watch
(add-watch your-atom :debug (fn [_ _ old new] (println "out" old "new" new)))
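The changelog idea mentioned above can be combined with a watch. A minimal single-threaded sketch, assuming the hypothetical names watched and changelog:

```clojure
;; Hypothetical names; the watch records every old/new pair in a second atom.
(def watched (atom 0))
(def changelog (atom []))

(add-watch watched :changelog
           (fn [_key _ref old-val new-val]
             (swap! changelog conj {:old old-val :new new-val})))

(swap! watched inc)
(swap! watched + 10)

(println @changelog) ; [{:old 0, :new 1} {:old 1, :new 11}]
```

With several threads updating the atom, watch calls may interleave, so the order of changelog entries is only reliable in the single-threaded case shown here.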
You could use a macro like:
(defmacro swap!-> [atom & args]
  `(let [old-val# (atom nil)
         new-val# (swap! ~atom #(do
                                  (swap! old-val# (constantly %))
                                  (-> % ~args)))]
     {:old @old-val# :new new-val#}))
(def data (atom {}))
(swap!-> data assoc :a 3001)
=> {:new {:a 3001} :old {}}
Refer to swap-vals! available since 1.9: https://clojuredocs.org/clojure.core/swap-vals%21
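A short sketch of swap-vals! using the atom from the question; it assumes Clojure 1.9 or newer, and inc replaces rand-int so the result is predictable:

```clojure
(def foo (atom 42))

;; swap-vals! returns [old-value new-value] from the same atomic update.
(let [[swapped-out swapped-in] (swap-vals! foo inc)]
  (println "swapped out:" swapped-out)  ; swapped out: 42
  (println "swapped in:" swapped-in))   ; swapped in: 43
```

Both values come from the same successful compare-and-set, so no other thread's update can slip in between them.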
You could rely on a promise to store the current value inside the swap! operation. Then you return the new and old value in a vector, as follows:
(defn- swap-and-return-old-value!
  [^clojure.lang.IAtom atom f & args]
  (let [old-value-promise (promise)
        new-value (swap! atom
                         (fn [old-value]
                           (deliver old-value-promise old-value)
                           (apply f old-value args)))]
    [new-value @old-value-promise]))
