recoding a c++ task queue in rust. Is futures the right abstraction?

recoding a c++ task queue in rust. Is futures the right abstraction? - rust

I am rewriting a c++ project in rust as my first non-tiny rust program. I thought I would start with a simple but key gnarly piece of code.
Its a queue of std::packaged_tasks that run at specific times. A client says
running_func_fut_ = bus_->TimerQueue().QueueTask(std::chrono::milliseconds(func_def.delay),
[this, func, &unit]()
{
func(this, &unit);
Done();
}, trace);
func is a std::function, but they key point is that as far as the queue is concerned is queuing up a lambda (closure in rust speak )
It returns a std::future which the client can ignore or can hang onto. If they hang onto it they can see if the task completed yet. (It could return a result but in my current use case the functions are all void, the client just needs to know if the task completed). All the tasks run on a single dedicated thread. The QueueTask method wraps the passed lambda up in a packaged_task and then places it in a multiset of objects that say when and what to run.
I am reading the rust docs and it seems that futures encapsulate both the callable object and the 'get me the result' mechanism.
So I think I need a BTreeSet (I need the queue sorted by launch time so I can pick the next one to run) of futures, but I am not even sure how to declare one of those. SO before I dive into the deep end of futures, is this the right approach? Is there a better , more natural, abstraction for rust?

For the output, you probably do want a Future. However, for the input, you probably want a function object (Box<dyn FnOnce(...)>); see https://doc.rust-lang.org/book/ch19-05-advanced-functions-and-closures.html.

Related

Py3.6 :: ThreadPoolExecutor future.add_done_callback vs. concurrent.futures.as_completed

I’m learning concurrent.futures.ThreadPoolExecutor in Py3.6 and a bit confused as to what’s the difference, pros-and-cons between using
1 future.add_done_callback(callback)
2 concurrent.futures.as_completed(futures)
When would you choose one over the other? If I understand correctly the purpose is same for both more or less.. #1 calls the callback(future) fn as soon as the task has finished and corresponding future has settled, and #2 returns the futures object in the order which the tasks finish and futures settle..
In both cases we we can retrieve the returned value using future.results() (or raise future.exception() if exception was raised).
Thanks for any clarification around that.

The definitions of the functions are at https://github.com/python/cpython/blob/f85af035c5cb9a981f5e3164425f27cf73231b5f/Lib/concurrent/futures/_base.py#L200
def as_completed(fs, timeout=None):
"""An iterator over the given futures that yields each as it completes.
add_done_callback is a method within futures class and a lower level function than as_completed. Essentially, as_completed uses add_done_callback internally. as_completed also has timeout argument for callback.
In general, we may use as_completed if working with multiple futures, while add_done_callback is used with single future.
Overall, both add_done_callback and as_completed may achieve the similar objectives for simpler programs.
Just a thought. We can use different callback function for each future in the list of futures with add_done_callback, while as_completed may work only accept a single single callback.

How to wrap Web Worker response messages in futures?

Please consider a scala.js application which runs in the browser and consists of a main program and a web worker.
The main thread delegates long running operations to the web worker by passing messages that contain the names of methods and the parameters required to invoke them. The worker passes method return values back to the main thread in the form of response messages.
In simpler terms, this program abstracts web worker messaging so that code in the main thread can call methods in the worker thread in idiomatic and asynchronous Scala syntax.
Because web workers do not associate messages with their responses in any way, the abstraction relies on a registry, an intermediary object, that governs each cross context method call to associate the invocation with the result. This singleton could also bind callback functions but is there a way to accomplish this with futures instead of callbacks?
How can I build an abstraction over this registry that allows programmers to use it with the standard asynchronous programming structures in Scala: futures and promises?
How should I write this functionality so that scala programmers can interact with it in the canonical way? For example:
// long running method in the web worker
val f: Future[String] = Registry.ultimateQuestion(42) // async
f onSuccess { case q => println("The ultimate question is: " + q) }
I'm new to futures and promises, but it seems like they usually complete when some execution block terminates. In this case, receiving a response from the web worker signifies completion of the future. Is there a way to write a custom future that delegates its completion status to an external process? Is there another way to link the web worker response message to the status of the future?
Can/Should I extend the Future trait? Is this possible in Scala.js? Is there a concrete class that I should extend? Is there some other way to encapsulate these cross context web worker method calls in existing asynchronous Scala functionality?
Thank you for your consideration.

Hmm. Just spitballing here (I haven't used workers yet), but it seems like associating the request with the Future is fairly easy in the single-threaded JavaScript world you're working in.
Here's a hypothetical design. Say that each request/response to the worker is automatically wrapped in an Envelope; the Envelope contains a RequestId. So the send side looks something like (this is pseudo-code, but real-ish):
def sendRequest[R](msg:Message):Future[R] = {
val promise = Promise[R]
val id = nextRequestId()
val envelope = Envelope(id, msg)
register(id, promise)
sendToWorker(envelope)
promise.future
}
The worker processes msg, wraps the result in another Envelope, and the result gets handled back in the main thread with something like:
def handleResult(resultEnv:Envelope):Unit = {
val promise = findRegistered(resultEnv.id)
val result = resultEnv.msg
promise.success(result)
}
That needs some filling in, and some thought about what the types like R should be, but that sort of outline would probably work decently well. If this was the JVM you'd have to worry about all sorts of race conditions, but in the single-threaded JS world it probably can be as simple as using an autoincrementing integer for the request ID, and storing away the Promise...

How do I Yield() to another thread in a Win8 C++/Xaml app?

Note: I'm using C++, not C#.
I have a bit of code that does some computation, and several bits of code that use the result. The bits that use the result are already in tasks, but the original computation is not -- it's actually in the callstack of the main thread's App::App() initialization.
Back in the olden days, I'd use:
while (!computationIsFinished())
std::this_thread::yield(); // or the like, depending on API
Yet this doesn't seem to exist for Windows Store apps (aka WinRT, pka Metro-style). I can't use a continuation because the bits that use the results are unconnected to where the original computation takes place -- in addition to that computation not being a task anyway.
Searching found Concurrency::Context::Yield(), but Context appears not to exist for Windows Store apps.
So... say I'm in a task on the background thread. How do I yield? Especially, how do I yield in a while loop?

First of all, doing expensive computations in a constructor is not usually a good idea. Even less so when it's the "App" class. Also, doing heavy work in the main (ASTA) thread is pretty much forbidden in the WinRT model.
You can use concurrency::task_completion_event<T> to interface code that isn't task-oriented with other pieces of dependent work.
E.g. in the long serial piece of code:
...
task_completion_event<ComputationResult> tce;
task<ComputationResult> computationTask(tce);
// This task is now tied to the completion event.
// Pass it along to interested parties.
try
{
auto result = DoExpensiveComputations();
// Successfully complete the task.
tce.set(result);
}
catch(...)
{
// On failure, propagate the exception to continuations.
tce.set_exception(std::current_exception());
}
...
Should work well, but again, I recommend breaking out the computation into a task of its own, and would probably start by not doing it during construction... surely an anti-pattern for a responsive UI. :)

Qt simply uses Sleep(0) in their WinRT yield implementation.

Future vs Thread: Which is better for working with channels in core.async?

When working with channels, is future recommended or is thread? Are there times when future makes more sense?
Rich Hickey's blog post on core.async recommends using thread rather than future:
While you can use these operations on threads created with e.g. future, there is also a macro, thread , analogous to go, that will launch a first-class thread and similarly return a channel, and should be preferred over future for channel work.
~ http://clojure.com/blog/2013/06/28/clojure-core-async-channels.html
However, a core.async example makes extensive use of future when working with channels:
(defn fake-search [kind]
(fn [c query]
(future
(<!! (timeout (rand-int 100)))
(>!! c [kind query]))))
~ https://github.com/clojure/core.async/blob/master/examples/ex-async.clj

Summary
In general, thread with its channel return will likely be more convenient for the parts of your application where channels are prominent. On the other hand, any subsystems in your application that interface with some channels at their boundaries but don't use core.async internally should feel free to launch threads in whichever way makes the most sense for them.
Differences between thread and future
As pointed out in the fragment of the core.async blog post you quote, thread returns a channel, just like go:
(let [c (thread :foo)]
(<!! c))
;= :foo
The channel is backed by a buffer of size 1 and will be closed after the value returned by the body of the thread form is put on it. (Except if the returned value happens to be nil, in which case the channel will be closed without anything being put on it -- core.async channels do not accept nil.)
This makes thread fit in nicely with the rest of core.async. In particular, it means that go + the single-bang ops and thread + the double-bang ops really are used in the same way in terms of code structure, you can use the returned channel in alt! / alts! (and the double-bang equivalents) and so forth.
In contrast, the return of future can be deref'd (#) to obtain the value returned by the future form's body (possibly nil). This makes future fit in very well with regular Clojure code not using channels.
There's another difference in the thread pool being used -- thread uses a core.async-specific thread pool, while future uses one of the Agent-backing pools.
Of course all the double-bang ops, as well as put! and take!, work just fine regardless of the way in which the thread they are called from was started.

it sounds like he is recommending using core. async's built in thread macro rather than java's Thread class.
http://clojure.github.io/core.async/#clojure.core.async/thread

Aside from which threadpool things are run in (as pointed out in another answer), the main difference between async/thread and future is this:
thread will return a channel which only lets you take! from the channel once before you just get nil, so good if you need channel semantics, but not ideal if you want to use that result over and over
in contrast, future returns a dereffable object, which once the thread is complete will return the answer every time you deref , making it convenient when you want to get this result more than once, but this comes at the cost of channel semantics
If you want to preserve channel semantics, you can use async/thread and place the result on (and return a) async/promise-chan, which, once there's a value, will always return that value on later take!s. It's slightly more work than just calling future, since you have to explicitly place the result on the promise-chan and return it instead of the thread channel, but buys you interoperability with the rest of the core.async infrastructure.
It almost makes one wonder if there shouldn't be a core.async/thread-promise and core.async/go-promise to make this more convenient...

Clojure mutable storage types

I'm attempting to learn Clojure from the API and documentation available on the site. I'm a bit unclear about mutable storage in Clojure and I want to make sure my understanding is correct. Please let me know if there are any ideas that I've gotten wrong.
Edit: I'm updating this as I receive comments on its correctness.
Disclaimer: All of this information is informal and potentially wrong. Do not use this post for gaining an understanding of how Clojure works.
Vars always contain a root binding and possibly a per-thread binding. They are comparable to regular variables in imperative languages and are not suited for sharing information between threads. (thanks Arthur Ulfeldt)
Refs are locations shared between threads that support atomic transactions that can change the state of any number of refs in a single transaction. Transactions are committed upon exiting sync expressions (dosync) and conflicts are resolved automatically with STM magic (rollbacks, queues, waits, etc.)
Agents are locations that enable information to be asynchronously shared between threads with minimal overhead by dispatching independent action functions to change the agent's state. Agents are returned immediately and are therefore non-blocking, although an agent's value isn't set until a dispatched function has completed.
Atoms are locations that can be synchronously shared between threads. They support safe manipulation between different threads.
Here's my friendly summary based on when to use these structures:
Vars are like regular old variables in imperative languages. (avoid when possible)
Atoms are like Vars but with thread-sharing safety that allows for immediate reading and safe setting. (thanks Martin)
An Agent is like an Atom but rather than blocking it spawns a new thread to calculate its value, only blocks if in the middle of changing a value, and can let other threads know that it's finished assigning.
Refs are shared locations that lock themselves in transactions. Instead of making the programmer decide what happens during race conditions for every piece of locked code, we just start up a transaction and let Clojure handle all the lock conditions between the refs in that transaction.
Also, a related concept is the function future. To me, it seems like a future object can be described as a synchronous Agent where the value can't be accessed at all until the calculation is completed. It can also be described as a non-blocking Atom. Are these accurate conceptions of future?

It sounds like you are really getting Clojure! good job :)
Vars have a "root binding" visible in all threads and each individual thread can change the value it sees with out affecting the other threads. If my understanding is correct a var cannot exist in just one thread with out a root binding that is visible to all and it cant be "rebound" until it has been defined with (def ... ) the first time.
Refs are committed at the end of the (dosync ... ) transaction that encloses the changes but only when the transaction was able to finish in a consistent state.

I think your conclusion about Atoms is wrong:
Atoms are like Vars but with thread-sharing safety that blocks until the value has changed
Atoms are changed with swap! or low-level with compare-and-set!. This never blocks anything. swap! works like a transaction with just one ref:
the old value is taken from the atom and stored thread-local
the function is applied to the old value to generate a new value
if this succeeds compare-and-set is called with old and new value; only if the value of the atom has not been changed by any other thread (still equals old value), the new value is written, otherwise the operation restarts at (1) until is succeeds eventually.

I've found two issues with your question.
You say:
If an agent is accessed while an action is occurring then the value isn't returned until the action has finished
http://clojure.org/agents says:
the state of an Agent is always immediately available for reading by any thread
I.e. you never have to wait to get the value of an agent (I assume the value changed by an action is proxied and changed atomically).
The code for the deref-method of an Agent looks like this (SVN revision 1382):
public Object deref() throws Exception{
if(errors != null)
{
throw new Exception("Agent has errors", (Exception) RT.first(errors));
}
return state;
}
No blocking is involved.
Also, I don't understand what you mean (in your Ref section) by
Transactions are committed on calls to deref
Transactions are committed when all actions of the dosync block have been completed, no exceptions have been thrown and nothing has caused the transaction to be retried. I think deref has nothing to do with it, but maybe I misunderstand your point.

Martin is right when he say that Atoms operation restarts at 1. until is succeeds eventually.
It is also called spin waiting.
While it is note really blocking on a lock the thread that did the operation is blocked until the operation succeeds so it is a blocking operation and not an asynchronously operation.
Also about Futures, Clojure 1.1 has added abstractions for promises and futures.
A promise is a synchronization construct that can be used to deliver a value from one thread to another. Until the value has been delivered, any attempt to dereference the promise will block.
(def a-promise (promise))
(deliver a-promise :fred)
Futures represent asynchronous computations. They are a way to get code to run in another thread, and obtain the result.
(def f (future (some-sexp)))
(deref f) ; blocks the thread that derefs f until value is available

Vars don't always have a root binding. It's legal to create a var without a binding using
(def x)
or
(declare x)
Attempting to evaluate x before it has a value will result in
Var user/x is unbound.
[Thrown class java.lang.IllegalStateException]

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string