This code is really pushing the limits of my understanding, so bear with me.
Previously, I implemented coroutines in Racket with the following code:
;; Coroutine definition
(define (make-generator procedure)
  (define last-return values)
  (define last-value #f)
  (define status 'suspended)
  (define (last-continuation _)
    (let ([result (procedure yield)])
      (last-return result)))
  (define (yield value)
    (call/cc (lambda (continuation)
               (set! last-continuation continuation)
               (set! last-value value)
               (set! status 'suspended)
               (last-return value))))
  (lambda args
    (call/cc (lambda (return)
               (set! last-return return)
               (cond ((null? args)
                      (let ()
                        (set! status 'dead)
                        (last-continuation last-value)))
                     ((eq? (car args) 'coroutine?) 'coroutine)
                     ((eq? (car args) 'status?) status)
                     ((eq? (car args) 'dead?) (eq? status 'dead))
                     ((eq? (car args) 'alive?) (not (eq? status 'dead)))
                     ((eq? (car args) 'kill!) (set! status 'dead))
                     (#t (apply last-continuation args)))))))
;; Define a function that will return a suspended coroutine created from given args and body forms
(define-syntax (define-coroutine stx)
  (syntax-case stx ()
    ((_ (name . args) . body)
     #`(define (name . args)
         (make-generator
          (lambda (#,(datum->syntax stx 'yield))
            . body))))))
What I want to do is implement an exception handler (with-handlers) that calls the (yield) function. The idea is that a second thread can send a signal to the thread evaluating the coroutine, forcing it to yield when it's running for too long.
I've tried the following in the args lambda, which successfully returned early, but subsequent evaluations of (my-coroutine 'dead?) reported that the coroutine was in the 'dead state:
(with-handlers
    ([exn:break?
      (lambda (break)
        (yield 'coroutine-timeout))])
  (break-enabled #t) ; register for yield requests from coroutine manager thread
  (last-continuation last-value))
Alternatively, I've tried the following, but it didn't produce a procedure that can be applied to arguments:
(with-handlers
    ([exn:break?
      (lambda (break)
        (set! last-continuation (exn:break-continuation break))
        (set! last-value 'coroutine-timeout)
        (set! status 'suspended)
        (last-return 'coroutine-timeout))])
  (break-enabled #t) ; register for yield requests from coroutine manager thread
  (last-continuation last-value))
I'm trying to understand how continuations and exceptions interact/block each other. It seems like I may need to use Parameters somehow?
How can I successfully write a signal handler that will (yield) correctly so that I can resume the coroutine later?
Edit:
I am mixing metaphors here (cooperative and preemptive multithreading). However, my question seems possible to me (from a layman's perspective), since I can evaluate functions defined in my coroutine (including (yield)) from within the exception handler. I'm essentially trying to limit resource starvation in my worker threads, as well as mitigate a certain class of deadlock (where task 1 can only complete after task 2 has run, and there are no free threads for task 2 to run on).
I have written a (go) function for these coroutines that is modeled after Go's goroutines. I assume they achieve their asynchronous behavior on single threads by having cooperative yield checks in the underlying code they control. Perhaps it runs in a VM as you suggested and there are checks, or perhaps their operators have the checks. Whatever the case may be, I'm trying to achieve similar behavior with a different strategy.
As far as "how continuations and exceptions interact/block each other," it's important to know that exceptions are implemented using delimited continuations. In particular, the exception system makes use of continuation barriers. Both of these are introduced in the Racket reference §1.1.12 Prompts, Delimited Continuations, and Barriers:
A continuation barrier is another kind of continuation frame that prohibits certain replacements of the current continuation with another. … A continuation barrier thus prevents “downward jumps” into a continuation that is protected by a barrier. Certain operations install barriers automatically; in particular, when an exception handler is called, a continuation barrier prohibits the continuation of the handler from capturing the continuation past the exception point.
You may also want to see the material on exceptions from later in the evaluation model section and from the control flow section, which cites an academic paper on the subject. The differences between call-with-exception-handler and with-handlers are also relevant to capturing continuations from within exception handlers.
Basically, though, the continuation barrier prevents using exception handlers for continuations that you abort and might later resume: you should use continuation barriers and prompts directly for that.
More broadly, I would suggest that you look at Racket's substantial existing support for concurrency. Even if you want to implement coroutines as an experiment, it would be useful for inspiration and examples of implementation techniques. Racket comes with derived constructs such as engines ("processes that can be preempted by a timer or other external trigger") and generators, in addition to the fundamental building blocks, green threads and synchronizable events (which are based on the Concurrent ML model).
The gist of your question:
How can I implement an exception handler for coroutines, such that a second thread can send a signal to a thread evaluating a coroutine, forcing it to yield when it's running for too long?
And once more:
How can I successfully write a signal handler that will (yield) correctly so that I can resume the coroutine later?
It seems to me that you are not cleanly separating cooperative and preemptive multitasking, since you seem to want to combine coroutines (cooperative) with time-outs (preemptive). (You also mention threads, but seem to conflate them with coroutines.)
With cooperative multitasking there is no way that you can force anyone else to stop running; hence the moniker "cooperative".
With preemptive multitasking you do not need to yield, because the scheduler will preempt you when your allocated time has run out. The scheduler is also responsible for saving your continuation, but it is not the (scheduler's) current continuation, since the scheduler is wholly separate from the user thread.
Perhaps the closest thing to what you are proposing is simulating preemptive multitasking via polling. At every (simulated) timestep (e.g., one VM instruction), the simulation needs to check whether any interrupts/signals have been received by a running thread and handle them.
Related
In the Lparallel API, the recommended way to terminate all threaded tasks is to stop the kernel with (lparallel:end-kernel). But when a thread is blocking, e.g., with (pop-queue queue1) waiting for an item to appear in the queue, it will still be active when the kernel is stopped. In this case (at least in SBCL) the kernel shutdown occasionally (but not every time) fails with:
debugger invoked on a SB-KERNEL:BOUNDING-INDICES-BAD-ERROR in thread
#<THREAD "lparallel" RUNNING {1002F04973}>:
The bounding indices 1 and NIL are bad for a sequence of length 0.
See also:
The ANSI Standard, Glossary entry for "bounding index designator"
The ANSI Standard, writeup for Issue SUBSEQ-OUT-OF-BOUNDS:IS-AN-ERROR
debugger invoked on a SB-SYS:INTERACTIVE-INTERRUPT in thread
#<THREAD "main thread" RUNNING {10012E0613}>:
Interactive interrupt at #x1001484328.
I’m assuming this has something to do with the blocking thread not terminating correctly. How should a blocking thread be properly terminated before shutting down the kernel? (The API says kill-tasks should only be used in exceptional circumstances, which I’m taking not to apply to this “normal” shutdown circumstance.)
The problem with killing a thread is that it might happen anywhere, when the thread could be in any unknown state.
The only way to safely terminate a thread is to let it shut itself down gracefully, meaning you expect that during normal operations there is a way for the thread to know it should stop working. Then you can properly clean up your resources, close databases, free foreign pointers, log all things, ...
The queues you are using have operations that can time out; that is a simple yet safe way to ensure you can avoid blocking forever and exit properly. But that's not the only option (you can use timeouts in addition to what is shown below).
Shared / global flag
When a timeout occurs, or when you receive a message, you check a global boolean variable (or one that is shared among all interested threads). That's also a simple way to exit, and it can be read by multiple threads. This is, however, concurrent access, so you should use locks or atomic operations (http://www.sbcl.org/manual/#Atomic-Operations); for example, use defglobal and a fixnum type with atomic-incf, etc.
Control messages
Send control data on the queues and use it to determine how to shut down gracefully, how to propagate the information down the pipes, or how to restart things. This is safe (just message passing) and allows any kind of control you might want to implement in your threads.
(defpackage :so (:use :cl :bt :lparallel.queue))
(in-package :so)
Let's define two services.
The first one echoes back its input:
(defun echo (in out)
  (lambda ()
    (loop
      for value = (pop-queue in)
      do (push-queue value out)
      until (eq value :stop))))
Notice how it is expected to finish properly when given a :stop input, and how it also propagates the :stop message to its output queue.
The second thread will perform a modular addition, and it also sleeps a bit between requests:
(defun modulo-adder (x m in out)
  (lambda ()
    (loop
      for value = (progn (sleep 0.02)
                         (pop-queue in))
      do (push-queue (typecase value
                       (keyword value)
                       (number (mod (+ x value) m)))
                     out)
      until (eq value :stop))))
Create queues:
(defparameter *q1* (make-queue))
(defparameter *q2* (make-queue))
Create threads:
(progn
  (bt:make-thread (echo *q1* *q2*) :name "echo")
  (bt:make-thread (modulo-adder 5 1024 *q2* *q1*) :name "adder"))
Both threads are connected to each other in a circular fashion, creating an infinite loop of additions. No value is currently exchanged between the threads, and you can see them running, for example, with slime-list-threads or any other implementation-provided way; in any case, (bt:all-threads) returns a list.
slime-list-threads
10 adder Running
11 echo Running
...
Add an item, now there is an infinite exchange of data between threads:
(push-queue 10 *q1*)
Wait, then stop them both:
(push-queue :stop *q1*)
Both threads stopped gracefully (they are no more visible in lists of threads).
We can inspect what remains in the queues (results vary from one run to another):
(list (try-pop-queue *q1*)
(try-pop-queue *q2*))
(99 NIL)
(list (try-pop-queue *q1*)
(try-pop-queue *q2*))
(:STOP NIL)
(list (try-pop-queue *q1*)
(try-pop-queue *q2*))
(NIL NIL)
Interrupting a thread
You create a service, controlled by messages or a global flag, but then you have a bug and the thread hangs. Instead of killing it and losing everything, you want at least to unwind the thread's stack properly. This is dangerous too, but you can use bt:interrupt-thread to stop a thread wherever it is running right now and make it execute a function.
(define-condition stop () ())

(defun signal-stop ()
  (signal 'stop))

(defun endless ()
  (let ((output *standard-output*))
    (lambda ()
      (print "START" output)
      (unwind-protect (handler-case (loop)
                        (stop ()
                          (print "INTERRUPTED" output)))
        (print "STOP" output)))))
Start it:
(bt:make-thread (endless) :name "loop")
This prints "START" and loops.
Then we interrupt it:
(bt:interrupt-thread (find "loop"
                           (bt:all-threads)
                           :test #'string=
                           :key #'bt:thread-name)
                     #'signal-stop)
The following is printed:
"INTERRUPTED"
"STOP"
Those messages would not be printed if the thread had been killed, but note that you could still end up with corrupted data, given how random the interruption is. Also, interruption can unblock blocking calls like sleep or pop-queue.
I need to make an HTTP request which can quite often fail, and I'm not interested in the result at all, whether it worked or not. Also, I don't want to wait for it to return.
So, I'd like to wrap that call in a separate thread and make sure that the thread won't stick around when something is blocking.
My current approach is something like this:
(defn- call-and-forget [url]
  (let [timeout 250
        combined-timeout (* timeout 2.5)
        f (future
            (try
              (http/delete url
                           {:socket-timeout timeout
                            :conn-timeout timeout})
              (catch Throwable e
                (printf "Could not call %s: %s"
                        url (.getMessage e)))))]
    (deref f combined-timeout nil)
    (when-not (future-done? f)
      (future-cancel f))))
I hereby put this code under the Apache 2.0 license
It uses clj-http to make the call and a Future to create another thread. I am aware that this uses a thread from the built-in pool, and of the discussion over in this thread. The amount of complexity added by using my own thread pool, thread factory, executor service, uncaught handler and so on is not really worth it.
Would you agree that the code above is a good, working solution, or do you see a better way?
Looks good. You could also do
(when (= :failed (deref f timeout-ms :failed))
  (future-cancel f))
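Folded back into the original function, that suggestion looks something like this (a sketch; ::timed-out is an arbitrary sentinel keyword standing in for :failed, and the namespace and timeouts are taken from the question's own code, which uses clj-http):

(ns example.call-and-forget
  (:require [clj-http.client :as http]))

(defn- call-and-forget [url]
  (let [timeout 250
        combined-timeout (* timeout 2.5)
        f (future
            (try
              (http/delete url {:socket-timeout timeout
                                :conn-timeout timeout})
              (catch Throwable e
                (printf "Could not call %s: %s" url (.getMessage e)))))]
    ;; ::timed-out can never be returned by the body of the future, so
    ;; getting it back from deref means the timeout elapsed first.
    (when (= ::timed-out (deref f combined-timeout ::timed-out))
      (future-cancel f))))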
My question may seem weird, but I think I'm facing an issue with volatile objects.
I have written a library implemented like this (just a sketch, not the real content):
(def var1 (volatile! nil))
(def var2 (volatile! nil))

(defn do-things [a]
  (vreset! var1 a)
  (vswap! var2 inc)
  {:a @var1 :b @var2})
So I have global vars: some are initialized with external values, others are computed, and I return their contents.
I used volatiles to get better speed than with atoms, and to avoid redefining a new var for every calculation.
The problem is that this seems to fail in practice, because I map do-things over a collection (in another program), occasionally with inner sub-calls to this function, like (pseudo-code):
(map
 (fn [x]
   (let [analysis (do-things x)]
     (if blabla
       (do-things (f x))
       analysis)))
 coll)
Will the inner conditional call spawn another thread under the hood? It seems so, because sometimes the calls work, sometimes not.
Is there any other way to do this, apart from defining the volatiles inside every do-things body?
EDIT
Actually the error was something else, but the question still stands: is this an acceptable/safe way to proceed without any explicit use of multithreading capabilities?
There are very few constructs in Clojure that create threads on your behalf - generally Clojure can and will run on one or more threads depending on how you structure your program. pmap is a good example that creates and manages a pool of threads to map in parallel. Another is clojure.core.reducers/fold, which uses a fork/join pool, but really that's about it. In all other cases it's up to you to create and manage threads.
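For instance, a quick way to see pmap's thread pool at work (a small illustration; the sleep just makes the parallelism visible):

;; Eight 100 ms tasks finish in well under 800 ms because pmap spreads
;; them over a pool of threads. doall forces the lazy result inside time.
(time (doall (pmap (fn [x] (Thread/sleep 100) (inc x)) (range 8))))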
Volatiles should only be used with great care and in circumstances where you control the scope of use such that you are guaranteed not to be competing with threads to read and write the same volatile. Volatiles guarantee that writes can be read on another thread, but they do nothing to guarantee atomicity. For that, you must use either atoms (for uncoordinated) or refs and the STM (for coordinated).
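To make that difference concrete, here is a minimal sketch (the names a, v, and hammer! are made up for illustration): hammering an atom from many threads never loses an update, while the same loop over a volatile typically does.

(def a (atom 0))
(def v (volatile! 0))

(defn hammer!
  "Run 100 futures, each applying inc-fn 1000 times, and wait for all of them."
  [inc-fn]
  (let [fs (doall (repeatedly 100 #(future (dotimes [_ 1000] (inc-fn)))))]
    (run! deref fs)))

(hammer! #(swap! a inc))  ; @a is reliably 100000: swap! retries on conflict
(hammer! #(vswap! v inc)) ; @v is usually lower: racing read-modify-writes lose updates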
I wrote a function to get the old value of an atom while putting a new value, all in one atomic operation:
(defn get-and-reset!
  "Resets atom to newval and returns the old value. Atomic."
  [at newval]
  (let [tmp (atom [])]
    (swap! at #(do (reset! tmp %) newval))
    @tmp))
The documentation says the swap! function shouldn't have side effects because it can be called multiple times. That alone doesn't seem like a problem since tmp never leaves the function and it's the last value that it gets reset! to that matters. The function seems to work but I haven't tested it thoroughly with multiple threads, etc. Is this local side-effect a safe exception to the documentation, or am I missing some other subtle problem?
Yes, that will work with the current implementation of atoms in Clojure, and is (almost) guaranteed to work by contract.
The key here is that atoms are synchronous. Therefore, the inner swap! is guaranteed to complete before the outer swap!. Since tmp is only used locally, from a single thread, the inner swap! is also guaranteed not to conflict with a swap! (of tmp) on another thread.
While the outer swap! (i.e., swap! of at) could conflict with other threads, this swap! will retry when a conflict is detected. Since swap! is synchronous, these retries will occur serially w.r.t. the thread the swap! is invoked on. I suppose it's conceivable this last condition does not necessarily hold. E.g., it would be possible for an implementation of atoms to perform the swap! on a different thread, and issue retries as soon as a conflict is detected (without waiting for previous tries to finish). However, that's not the way atoms are currently implemented, and (in my opinion) doesn't seem like a very likely way to implement atoms.
If this weakness bothers you, you can use compare-and-set! instead:
(defn get-and-reset!
  "Resets atom to newval and returns the old value. Atomic."
  [at newval]
  (loop [oldval @at]
    (if (compare-and-set! at oldval newval)
      ;; then (no conflict => return oldval)
      oldval
      ;; else (conflict => retry)
      (recur @at))))
Atoms cannot do what you are trying to do.
Atoms are only intended for uncoordinated synchronous updates of a single identity. For instance, the functions used to update atoms may run many times, so whatever you do with that value may happen many times for each value that makes it into the atom.
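You can observe the retries directly by putting a print inside the update function while another thread creates contention (a sketch; how many retries you see depends on timing):

(def counter (atom 0))

;; Contention in the background...
(def noise (future (dotimes [_ 100000] (swap! counter inc))))

;; ...so this update function may print "trying with ..." several times
;; before one of its results is successfully installed.
(swap! counter (fn [n] (println "trying with" n) (inc n)))
@noise ; wait for the background future to finish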
Agents are often a better choice for this sort of thing, because if you send an action to an agent it will run at most once:
"At any point in time, at most one action for each Agent is being executed.
Actions dispatched to an agent from another single agent or thread will occur in the order they were sent"
Another option is to add a watch to the agent or atom and have that watch react to each change after it happens. If you can convince yourself that neither of these approaches works for you, then you may have found one of the rare cases where coordinated change is actually required, and refs would be the better tool. Usually agents, or atoms with watches, cover most situations.
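A minimal sketch of the watch approach (the atom state and the key :log are made-up names):

(def state (atom 0))

;; A watch function receives the key, the reference, and the old and new
;; values; it runs after each successful change.
(add-watch state :log
           (fn [_key _ref old new]
             (println "changed from" old "to" new)))

(swap! state inc) ; prints: changed from 0 to 1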
When working with channels, is future recommended or is thread? Are there times when future makes more sense?
Rich Hickey's blog post on core.async recommends using thread rather than future:
While you can use these operations on threads created with e.g. future, there is also a macro, thread, analogous to go, that will launch a first-class thread and similarly return a channel, and should be preferred over future for channel work.
~ http://clojure.com/blog/2013/06/28/clojure-core-async-channels.html
However, a core.async example makes extensive use of future when working with channels:
(defn fake-search [kind]
  (fn [c query]
    (future
      (<!! (timeout (rand-int 100)))
      (>!! c [kind query]))))
~ https://github.com/clojure/core.async/blob/master/examples/ex-async.clj
Summary
In general, thread with its channel return will likely be more convenient for the parts of your application where channels are prominent. On the other hand, any subsystems in your application that interface with some channels at their boundaries but don't use core.async internally should feel free to launch threads in whichever way makes the most sense for them.
Differences between thread and future
As pointed out in the fragment of the core.async blog post you quote, thread returns a channel, just like go:
(let [c (thread :foo)]
  (<!! c))
;= :foo
The channel is backed by a buffer of size 1 and will be closed after the value returned by the body of the thread form is put on it. (Except if the returned value happens to be nil, in which case the channel will be closed without anything being put on it -- core.async channels do not accept nil.)
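For instance:

;; A body that returns nil just closes the channel, so the take returns nil:
(<!! (thread nil))
;= nil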
This makes thread fit in nicely with the rest of core.async. In particular, it means that go + the single-bang ops and thread + the double-bang ops really are used in the same way in terms of code structure; you can use the returned channel in alt! / alts! (and their double-bang equivalents) and so forth.
In contrast, the return value of future can be deref'd (@) to obtain the value returned by the future form's body (possibly nil). This makes future fit in very well with regular Clojure code not using channels.
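For comparison with the thread example above:

(let [f (future :foo)]
  @f)
;= :foo
;; and, unlike a take from a thread channel, every later @f returns :foo again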
There's another difference in the thread pool being used -- thread uses a core.async-specific thread pool, while future uses one of the Agent-backing pools.
Of course all the double-bang ops, as well as put! and take!, work just fine regardless of the way in which the thread they are called from was started.
It sounds like he is recommending using core.async's built-in thread macro rather than Java's Thread class.
http://clojure.github.io/core.async/#clojure.core.async/thread
Aside from which threadpool things are run in (as pointed out in another answer), the main difference between async/thread and future is this:
thread will return a channel which only lets you take! from it once before you just get nil; that's good if you need channel semantics, but not ideal if you want to use the result over and over.
In contrast, future returns a derefable object which, once the thread is complete, will return the answer every time you deref it, making it convenient when you want the result more than once; but this comes at the cost of channel semantics.
If you want to preserve channel semantics, you can use async/thread and place the result on (and return) an async/promise-chan, which, once it has a value, will always return that value on later take!s. It's slightly more work than just calling future, since you have to explicitly place the result on the promise-chan and return it instead of the thread channel, but it buys you interoperability with the rest of the core.async infrastructure.
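Here is a minimal sketch of that approach (thread-promise is a made-up helper name, not part of core.async):

(require '[clojure.core.async :as async :refer [<!! thread promise-chan put!]])

(defn thread-promise
  "Run f on a core.async thread; return a promise-chan that yields its
  result to every taker."
  [f]
  (let [p (promise-chan)]
    (thread
      (if-some [v (f)]
        (put! p v)
        (async/close! p))) ; nil result: close the channel, so takes return nil
    p))

(def result (thread-promise (fn [] (+ 1 2))))
(<!! result) ;= 3
(<!! result) ;= 3 -- repeat takes keep succeeding, unlike a plain thread channel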
It almost makes one wonder if there shouldn't be a core.async/thread-promise and core.async/go-promise to make this more convenient...