Lparallel.queue thread-safe?

Lparallel.queue thread-safe? - multithreading

I may not be looking in the correct location for the documentation of lparallel.queue, but can we assume that those queues are thread-safe and that the queues take care of any locking/unlocking themselves, so that the user of the queues don't have to explicitly perform any locking/unlocking? If this is documented somewhere, I would appreciate the link.

I don't know where it's specified, but looking at the code shows that the queue functions are indeed thread safe.
The queue functions all expand into calls to DEFINE-LOCKING-FN which in turns expands to a DEFINE-LOCKING-FN/BASE which results in a DEFUN of the function with the content wrapped in a WITH-LOCK-HELD.

Looking at the source code, the queues seem to be locked.
(define-queue-fn push-queue (object queue)
push-cons-queue
push-vector-queue)
...
(define-locking-fn push-cons-queue (object queue) (t cons-queue) (values) lock
(with-cons-queue-slots (impl cvar) queue
(push-raw-queue object impl)
(when cvar
(condition-notify cvar)))
(values))
The unlocked functions have a separate name:
(define-queue-fn push-queue/no-lock (object queue)
push-cons-queue/no-lock
push-vector-queue/no-lock)

Related

Future vs Thread: Which is better for working with channels in core.async?

When working with channels, is future recommended or is thread? Are there times when future makes more sense?
Rich Hickey's blog post on core.async recommends using thread rather than future:
While you can use these operations on threads created with e.g. future, there is also a macro, thread , analogous to go, that will launch a first-class thread and similarly return a channel, and should be preferred over future for channel work.
~ http://clojure.com/blog/2013/06/28/clojure-core-async-channels.html
However, a core.async example makes extensive use of future when working with channels:
(defn fake-search [kind]
(fn [c query]
(future
(<!! (timeout (rand-int 100)))
(>!! c [kind query]))))
~ https://github.com/clojure/core.async/blob/master/examples/ex-async.clj

Summary
In general, thread with its channel return will likely be more convenient for the parts of your application where channels are prominent. On the other hand, any subsystems in your application that interface with some channels at their boundaries but don't use core.async internally should feel free to launch threads in whichever way makes the most sense for them.
Differences between thread and future
As pointed out in the fragment of the core.async blog post you quote, thread returns a channel, just like go:
(let [c (thread :foo)]
(<!! c))
;= :foo
The channel is backed by a buffer of size 1 and will be closed after the value returned by the body of the thread form is put on it. (Except if the returned value happens to be nil, in which case the channel will be closed without anything being put on it -- core.async channels do not accept nil.)
This makes thread fit in nicely with the rest of core.async. In particular, it means that go + the single-bang ops and thread + the double-bang ops really are used in the same way in terms of code structure, you can use the returned channel in alt! / alts! (and the double-bang equivalents) and so forth.
In contrast, the return of future can be deref'd (#) to obtain the value returned by the future form's body (possibly nil). This makes future fit in very well with regular Clojure code not using channels.
There's another difference in the thread pool being used -- thread uses a core.async-specific thread pool, while future uses one of the Agent-backing pools.
Of course all the double-bang ops, as well as put! and take!, work just fine regardless of the way in which the thread they are called from was started.

it sounds like he is recommending using core. async's built in thread macro rather than java's Thread class.
http://clojure.github.io/core.async/#clojure.core.async/thread

Aside from which threadpool things are run in (as pointed out in another answer), the main difference between async/thread and future is this:
thread will return a channel which only lets you take! from the channel once before you just get nil, so good if you need channel semantics, but not ideal if you want to use that result over and over
in contrast, future returns a dereffable object, which once the thread is complete will return the answer every time you deref , making it convenient when you want to get this result more than once, but this comes at the cost of channel semantics
If you want to preserve channel semantics, you can use async/thread and place the result on (and return a) async/promise-chan, which, once there's a value, will always return that value on later take!s. It's slightly more work than just calling future, since you have to explicitly place the result on the promise-chan and return it instead of the thread channel, but buys you interoperability with the rest of the core.async infrastructure.
It almost makes one wonder if there shouldn't be a core.async/thread-promise and core.async/go-promise to make this more convenient...

The difference between wait_queue_head and wait_queue in linux kernel

I can find many examples regarding wait_queue_head.
It works as a signal, create a wait_queue_head, someone
can sleep using it until someother kicks it up.
But I can not find a good example of using wait_queue itself, supposedly very related to it.
Could someone gives example, or under the hood of them?

From Linux Device Drivers:
The wait_queue_head_t type is a fairly simple structure, defined in
<linux/wait.h>. It contains only a lock variable and a linked list
of sleeping processes. The individual data items in the list are of
type wait_queue_t, and the list is the generic list defined in
<linux/list.h>.
Normally the wait_queue_t structures are allocated on the stack by
functions like interruptible_sleep_on; the structures end up in the
stack because they are simply declared as automatic variables in the
relevant functions. In general, the programmer need not deal with
them.
Take a look at A Deeper Look at Wait Queues part.
Some advanced applications, however, can require dealing with
wait_queue_t variables directly. For these, it's worth a quick look at
what actually goes on inside a function like interruptible_sleep_on.
The following is a simplified version of the implementation of
interruptible_sleep_on to put a process to sleep:
void simplified_sleep_on(wait_queue_head_t *queue)
{
wait_queue_t wait;
init_waitqueue_entry(&wait, current);
current->state = TASK_INTERRUPTIBLE;
add_wait_queue(queue, &wait);
schedule();
remove_wait_queue (queue, &wait);
}
The code here creates a new wait_queue_t variable (wait, which gets
allocated on the stack) and initializes it. The state of the task is
set to TASK_INTERRUPTIBLE, meaning that it is in an interruptible
sleep. The wait queue entry is then added to the queue (the
wait_queue_head_t * argument). Then schedule is called, which
relinquishes the processor to somebody else. schedule returns only
when somebody else has woken up the process and set its state to
TASK_RUNNING. At that point, the wait queue entry is removed from the
queue, and the sleep is done
The internals of the data structures involved in wait queues:
Update:
for the users who think the image is my own - here is one more time the link to the Linux Device Drivers where the image is taken from

Wait queue is simply a list of processes and a lock.
wait_queue_head_t represents the queue as a whole. It is the head of the waiting queue.
wait_queue_t represents the item of the list - a single process waiting in the queue.

Worker thread doesn't have message loop (MFC, windows). Can we make it to receive messages?

Mfc provides both worker and UI thread. UI thread is enabled with message receiving capabilities (send, post). Could it be possible to let worker thread too receive messages.

Call CWinThread::PumpMessage() repeatedly until it returns a WM_QUIT message.

It seems you need a thread, that can handle multiple messages from another threads. Another threads would add-a-message to the message-queue of this thread. Well, in that case you may use PeekMessage to startup a loop, which would eventually create a hidden window, and then use GetMessage to get the messages. The other threads would use PostThreadMessage with the thread ID (the one having Peek/GetMessage), and the message-code, LPARAM, WPARAM.
It would be like (not syntactically correct):
TheProcessor()
{
MSG msg;
PeekMessage(&msg,...);
while(GetMessage(&msg...)
{ /* switch case here */ }
}
The threads would call PostThreadMessage - See MSDN for more info.
When you need to send more data than LPARAM/WPARAM can hold, you eventually need to allocate them on heap, and then delete AFTER processing the message in your custom message-loop. This would be cumbersome and buggy.
But... I would suggest you to have your own class, on top of std::queue/deque or other DS, where you can add AddMessage/PushMessage, and PopMessage (or whatever names you like). You need to use SetEvent, WaitForSingleObject to trigger the new message in loop (See one of the implementation here. You may make it generic for one data-type, or make it template class - that would support any data-type (your underlying DS (queue) would utilize the same data-type). You also need not to worry about heaps and deletions. This is less error prone. You may however, have to handle MT issues.
Using Windows events involves kernel mode transition (since events are named/kernel objects), and you may like to use Conditional Variables which are user objects.Or you may straightaway use unbounded_buffer class from Concurrency Runtime Library available in VC10. See this article (jump to unbounded_buffer).

Yes you can create a message queue on a worker thread. You will need to run a message pump on that thread.

Clojure mutable storage types

I'm attempting to learn Clojure from the API and documentation available on the site. I'm a bit unclear about mutable storage in Clojure and I want to make sure my understanding is correct. Please let me know if there are any ideas that I've gotten wrong.
Edit: I'm updating this as I receive comments on its correctness.
Disclaimer: All of this information is informal and potentially wrong. Do not use this post for gaining an understanding of how Clojure works.
Vars always contain a root binding and possibly a per-thread binding. They are comparable to regular variables in imperative languages and are not suited for sharing information between threads. (thanks Arthur Ulfeldt)
Refs are locations shared between threads that support atomic transactions that can change the state of any number of refs in a single transaction. Transactions are committed upon exiting sync expressions (dosync) and conflicts are resolved automatically with STM magic (rollbacks, queues, waits, etc.)
Agents are locations that enable information to be asynchronously shared between threads with minimal overhead by dispatching independent action functions to change the agent's state. Agents are returned immediately and are therefore non-blocking, although an agent's value isn't set until a dispatched function has completed.
Atoms are locations that can be synchronously shared between threads. They support safe manipulation between different threads.
Here's my friendly summary based on when to use these structures:
Vars are like regular old variables in imperative languages. (avoid when possible)
Atoms are like Vars but with thread-sharing safety that allows for immediate reading and safe setting. (thanks Martin)
An Agent is like an Atom but rather than blocking it spawns a new thread to calculate its value, only blocks if in the middle of changing a value, and can let other threads know that it's finished assigning.
Refs are shared locations that lock themselves in transactions. Instead of making the programmer decide what happens during race conditions for every piece of locked code, we just start up a transaction and let Clojure handle all the lock conditions between the refs in that transaction.
Also, a related concept is the function future. To me, it seems like a future object can be described as a synchronous Agent where the value can't be accessed at all until the calculation is completed. It can also be described as a non-blocking Atom. Are these accurate conceptions of future?

It sounds like you are really getting Clojure! good job :)
Vars have a "root binding" visible in all threads and each individual thread can change the value it sees with out affecting the other threads. If my understanding is correct a var cannot exist in just one thread with out a root binding that is visible to all and it cant be "rebound" until it has been defined with (def ... ) the first time.
Refs are committed at the end of the (dosync ... ) transaction that encloses the changes but only when the transaction was able to finish in a consistent state.

I think your conclusion about Atoms is wrong:
Atoms are like Vars but with thread-sharing safety that blocks until the value has changed
Atoms are changed with swap! or low-level with compare-and-set!. This never blocks anything. swap! works like a transaction with just one ref:
the old value is taken from the atom and stored thread-local
the function is applied to the old value to generate a new value
if this succeeds compare-and-set is called with old and new value; only if the value of the atom has not been changed by any other thread (still equals old value), the new value is written, otherwise the operation restarts at (1) until is succeeds eventually.

I've found two issues with your question.
You say:
If an agent is accessed while an action is occurring then the value isn't returned until the action has finished
http://clojure.org/agents says:
the state of an Agent is always immediately available for reading by any thread
I.e. you never have to wait to get the value of an agent (I assume the value changed by an action is proxied and changed atomically).
The code for the deref-method of an Agent looks like this (SVN revision 1382):
public Object deref() throws Exception{
if(errors != null)
{
throw new Exception("Agent has errors", (Exception) RT.first(errors));
}
return state;
}
No blocking is involved.
Also, I don't understand what you mean (in your Ref section) by
Transactions are committed on calls to deref
Transactions are committed when all actions of the dosync block have been completed, no exceptions have been thrown and nothing has caused the transaction to be retried. I think deref has nothing to do with it, but maybe I misunderstand your point.

Martin is right when he say that Atoms operation restarts at 1. until is succeeds eventually.
It is also called spin waiting.
While it is note really blocking on a lock the thread that did the operation is blocked until the operation succeeds so it is a blocking operation and not an asynchronously operation.
Also about Futures, Clojure 1.1 has added abstractions for promises and futures.
A promise is a synchronization construct that can be used to deliver a value from one thread to another. Until the value has been delivered, any attempt to dereference the promise will block.
(def a-promise (promise))
(deliver a-promise :fred)
Futures represent asynchronous computations. They are a way to get code to run in another thread, and obtain the result.
(def f (future (some-sexp)))
(deref f) ; blocks the thread that derefs f until value is available

Vars don't always have a root binding. It's legal to create a var without a binding using
(def x)
or
(declare x)
Attempting to evaluate x before it has a value will result in
Var user/x is unbound.
[Thrown class java.lang.IllegalStateException]

Design Pattern for multithreaded observers

In a digital signal acquisition system, often data is pushed into an observer in the system by one thread.
example from Wikipedia/Observer_pattern:
foreach (IObserver observer in observers)
observer.Update(message);
When e.g. a user action from e.g. a GUI-thread requires the data to stop flowing, you want to break the subject-observer connection, and even dispose of the observer alltogether.
One may argue: you should just stop the data source, and wait for a sentinel value to dispose of the connection. But that would incur more latency in the system.
Of course, if the data pumping thread has just asked for the address of the observer, it might find it's sending a message to a destroyed object.
Has someone created an 'official' Design Pattern countering this situation? Shouldn't they?

If you want to have the data source to always be on the safe side of concurrency, you should have at least one pointer that is always safe for him to use.
So the Observer object should have a lifetime that isn't ended before that of the data source.
This can be done by only adding Observers, but never removing them.
You could have each observer not do the core implementation itself, but have it delegate this task to an ObserverImpl object.
You lock access to this impl object. This is no big deal, it just means the GUI unsubscriber would be blocked for a little while in case the observer is busy using the ObserverImpl object. If GUI responsiveness would be an issue, you can use some kind of concurrent job-queue mechanism with an unsubscription job pushed onto it. ( like PostMessage in Windows )
When unsubscribing, you just substitute the core implementation for a dummy implementation. Again this operation should grab the lock. This would indeed introduce some waiting for the data source, but since it's just a [ lock - pointer swap - unlock ] you could say that this is fast enough for real-time applications.
If you want to avoid stacking Observer objects that just contain a dummy, you have to do some kind of bookkeeping, but this could boil down to something trivial like an object holding a pointer to the Observer object he needs from the list.
Optimization :
If you also keep the implementations ( the real one + the dummy ) alive as long as the Observer itself, you can do this without an actual lock, and use something like InterlockedExchangePointer to swap the pointers.
Worst case scenario : delegating call is going on while pointer is swapped --> no big deal all objects stay alive and delegating can continue. Next delegating call will be to new implementation object. ( Barring any new swaps of course )

You could send a message to all observers informing them the data source is terminating and let the observers remove themselves from the list.
In response to the comment, the implementation of the subject-observer pattern should allow for dynamic addition / removal of observers. In C#, the event system is a subject/observer pattern where observers are added using event += observer and removed using event -= observer.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string