How do I do I/O in a Julia distributed for loop run non-interactively?

With the following code in foo.jl:
using Distributed
@distributed for k = 1:4
    println("Creating file ", k)
    write(string("file_", k), "foo")
end
executing include("foo.jl") in the REPL prints the expected lines and creates the expected files, but when I step out of the REPL and run
julia foo.jl
nothing is printed and no files are created.
Why is there a difference and what is required for the script to run as expected outside of the REPL?

As stated in the documentation:
Note that without a reducer function, @distributed executes asynchronously, i.e. it spawns independent tasks on all available workers and returns immediately without waiting for completion. To wait for completion, prefix the call with @sync
This is what happens in your case (which does not use a reducer function): @distributed spawns tasks and returns immediately. Your tasks are so short that in the REPL they finish almost at once, so you don't really notice the difference from a synchronous loop (except that the output may be interleaved with the REPL prompt).
In a script, however, the main Julia process terminates right after spawning the tasks, before they get a chance to run, so you never see their output.
As advised by the documentation, use @sync to wait for the tasks to complete before exiting:
using Distributed
@sync @distributed for k = 1:4
    println("Creating file ", k)
    write(string("file_", k), "foo")
end
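The documentation note quoted above also implies a second option: when a reducer function is given, @distributed waits for every iteration and returns the reduced value. A minimal sketch of that variant follows; the scratch directory `dir` and the variable `nwritten` are assumptions made for illustration, and the loop runs on the main process because no workers are added.

```julia
using Distributed

# Scratch directory so the sketch does not clutter the working directory
# (an assumption for this example, not part of the original script).
dir = mktempdir()

# With a reducer such as (+), @distributed blocks until all iterations
# have finished and returns the reduced value.
nwritten = @distributed (+) for k = 1:4
    write(joinpath(dir, string("file_", k)), "foo")
    1  # each iteration contributes 1 to the sum
end

println("Files written: ", nwritten)
```

If workers are added with addprocs, any helper functions used inside the loop would need to be defined on them with @everywhere.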

@distributed isn't blocking, so the Julia process finishes before any files are written. Try
using Distributed
@sync @distributed for k = 1:4
    println("Creating file ", k)
    write(string("file_", k), "foo")
end
instead.

Related

Julia @threads single

Is there something in Julia Threads similar to a single command in OpenMP that will ensure all threads wait before a particular block of code and then execute that block in only one thread? I have a loop that distributes calculations of forces between threads before performing an update to all locations at once, and I cannot find any feature to achieve this without terminating the @threads loop.
You can use locks:
function f()
    l = Threads.SpinLock()
    x = 0
    Threads.@threads for i in 1:10^7
        Threads.lock(l)
        x += 1 # only one thread at a time executes this block
        Threads.unlock(l)
    end
    return x
end
Note that SpinLock is intended for non-blocking code (that is, computation only, with no I/O inside the loop). If I/O is involved, ReentrantLock should be used instead.
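For reference, here is a minimal sketch of the ReentrantLock variant mentioned above. The function name `count_with_lock` is hypothetical, and the do-block form of lock is used so the lock is released even if the body throws. Note that a lock only guarantees mutual exclusion: every thread still runs the block, one at a time, so this is not an exact equivalent of OpenMP's single.

```julia
using Base.Threads

# Hypothetical helper: increments a shared counter n times from a threaded loop.
function count_with_lock(n)
    l = ReentrantLock()
    x = Ref(0)  # shared counter, boxed so the loop body can mutate it
    @threads for i in 1:n
        # lock(f, l) acquires l, runs f, and always releases l afterwards.
        lock(l) do
            x[] += 1
        end
    end
    return x[]
end

println(count_with_lock(10_000))
```

Start Julia with several threads (for example `julia -t 4`) to see the lock actually being contended.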

F# / MailBoxProcessor is unresponsive to PostAndReply under nearly 100% load

I have a MailBoxProcessor, which does the following things:
Main loop (type AsyncRunner: https://github.com/kkkmail/ClmFSharp/blob/master/Clm/ContGen/AsyncRun.fs#L257 – the line number may change as I keep updating the code). It generates some "models", compiles each of them into a model specific folder, spawns them as external processes, and then each model uses WCF to "inform" AsyncRunner about its progress by calling updateProgress. A model may take several days to run. Once any of the models is completed, the runner generates / spawns more. It is designed to run at 100% processor load (but with priority: ProcessPriorityClass.BelowNormal), though I can specify a smaller number of logical cores to use (some number between 1 and Environment.ProcessorCount). Currently I "async"-ed almost everything that goes inside MailBoxProcessor by using … |> Async.Start to ensure that I "never ever" block the main loop.
I can "ask" the runner (using WCF) about its state by calling member this.getState () = messageLoop.PostAndReply GetState.
OR I can send some commands to it (again using WCF), e.g. member this.start(), member this.stop(), …
Here is where it gets interesting. Everything works! However, if I run a "monitor", which asks for the state by effectively calling PostAndReply (exposed as this.getState ()) in an infinite loop, then after a while it sort of hangs. I mean that it does eventually return, but with unpredictably large delays (a few minutes). At the same time, I can issue commands and they return fast while getState still has not returned.
Is it possible to make it responsive at nearly 100% load? Thanks a lot!
I would suggest not asyncing anything (other than your spawning of processes) in your main program, since your code creates additional processes. Your main loop is waiting on the loop return to continue before processing the GetState() method.

Call MEX function without blocking main thread

In my Matlab code, I call a MEX function that takes a few seconds to execute (feature extraction with Caffe, http://caffe.berkeleyvision.org/). I was wondering if there is a way of calling this function without blocking Matlab's main thread, so I can run other Matlab commands simultaneously while waiting for it to finish.
Would it be possible, for example, to launch the MEX call in another thread using the Parallel Computing Toolbox?
Without editing the MEX file, one way is with batch:
c = parcluster();
job = batch(c, @myMEXfun, numOutputs, {myinput1,myinput2});
% do something else in MATLAB
job.wait();
out = job.fetchOutputs();
There are also possibilities with parfeval:
p = gcp();
f = parfeval(@myMEXfun, numOutputs, myinput1, myinput2);
% do something else in MATLAB
out = fetchOutputs(f); % Blocks until complete
They both allow asynchronous execution.
This can also be done without the Parallel Computing Toolbox, but with significant changes to the MEX file source to create a thread and with additional syntax to check for completion and retrieve outputs.

Clojure - Using agents slows down execution too much

I am writing a benchmark for a program in Clojure. I have n threads accessing a cache at the same time. Each thread will access the cache x times. Each request should be logged inside a file.
To this end I created an agent that holds the path to the file to be written to. When I want to write I send-off a function that writes to the file and simply returns the path. This way my file-writes are race-condition free.
When I execute my code without the agent it finishes in a few milliseconds. When I use the agent and ask each thread to send-off to the agent each time, my code runs horribly slowly. I'm talking minutes.
(defn load-cache-only [usercount cache-size]
  "Test requesting from the cache only."
  ; Create the file to write the benchmark results to.
  (def sink "benchmarks/results/load-cache-only.txt")
  (let [data-agent (agent sink)
        ; Data for our backing store generated at runtime.
        store-data (into {} (map vector (map (comp keyword str)
                                             (repeat "item")
                                             (range 1 cache-size))
                                 (range 1 cache-size)))
        cache (create-full-cache cache-size store-data)]
    (barrier/run-with-barrier (fn [] (load-cache-only-work cache store-data data-agent)) usercount)))
(defn load-cache-only-work [cache store-data data-agent]
  "For use with 'load-cache-only'. Requests each item in the cache one.
  We time how long it takes for each request to be handled."
  (let [cache-size (count store-data)
        foreachitem (fn [cache-item]
                      (let [before (System/nanoTime)
                            result (cache/retrieve cache cache-item)
                            after (System/nanoTime)
                            diff_ms ((comp str float) (/ (- after before) 1000))]
                        ;(send-off data-agent (fn [filepath]
                        ;(file/insert-record filepath cache-size diff_ms)
                        ;filepath))
                        ))]
    (doall (map foreachitem (keys store-data)))))
The (barrier/run-with-barrier) code simply spawns usercount number of threads and starts them at the same time (using an atom). The function I pass is the body of each thread.
The body will simply map over a list named store-data, which is a key-value list (e.g., {:a 1 :b 2}). The length of this list in my code right now is 10. The number of users is 10 as well.
As you can see, the code for the agent send-off is commented out. This makes the code execute normally. However, when I enable the send-offs, even without writing to the file, the execution time is too slow.
Edit:
I made each thread print a dot before it sends off to the agent.
The dots appear just as fast as without the send-off. So there must be something blocking in the end.
Am I doing something wrong?
You need to call (shutdown-agents) when you're done sending stuff to your agent if you want the JVM to exit in reasonable time.
The underlying problem is that if you don't shutdown your agents, the threads backing its threadpool will never get shut down, and prevent the JVM from exiting. There's a timeout that will shutdown the pool if there's nothing else running, but it's fairly lengthy. Calling shutdown-agents as soon as you're done producing actions will resolve this problem.

Lua Script coroutine

Hi, I need some help with my Lua script. I have a script that runs a server-like application (an infinite loop). The problem is that it doesn't execute the second coroutine.
Could you tell me what's wrong? Thank you.
function startServer()
    print( "...Running server" )
    --run a server like application infinite loop
    os.execute( "server.exe" )
end
function continue()
    print("continue")
end
co = coroutine.create( startServer() )
co1 = coroutine.create( continue() )
Lua has cooperative multithreading. Threads are not switched automatically, but must yield to others. When one thread is running, every other thread is waiting for it to finish or yield. Your first thread in this example seems to run server.exe, which, I assume, never finishes until interrupted. Thus the second thread never gets its turn to run.
You also run the threads wrong. In your example you're not running any threads at all. You execute the function and would then try to create a coroutine from its return value, which would naturally fail. But since you never get back from server.exe, you haven't noticed this problem yet. Remove those brackets after startServer and continue to fix it.
As already noted, there are several issues with the script that prevent you from getting what you want:
os.execute("...") blocks until the command completes, and in your case it never completes (as it runs an infinite loop). Solution: you need to detach that process from yours by using something like io.popen() instead of os.execute().
co = coroutine.create( startServer() ) doesn't create a coroutine in your case. The coroutine.create call accepts a function reference, and you pass it the result of the startServer call, which is nil. Solution: use co = coroutine.create( startServer ) (note that the parentheses are dropped, so it's not a function call anymore).
You are not yielding from your coroutines; if you want several coroutines to work together, they need to cooperate by giving control to each other when appropriate. That's what the yield command is for, and that's why it's called non-preemptive multithreading. Solution: you need to use a combination of resume and yield calls after you create your coroutine.
startServer doesn't need to be a coroutine, as you are not giving control back to it; its only purpose is to start the server.
In your case, the solution may not even need coroutines as all you need to do is: (1) start the server and let it detach from your process (for example, using popen) and (2) work with your process using whatever communication protocol it requires (pipes, sockets, etc.).
There are more complex and complete solutions (like LuaLanes) and also several good descriptions on creating simple coroutine dispatchers.
Your coroutine is not yielding
