Wait without blocking thread? - How? - multithreading

This is probably one of the most elementary things in F#, but I've just realized I've got no idea what's going on behind the scenes.
let testMe() =
    async { printfn "before!"
            do! myAsyncFunction() // Waits without blocking thread
            printfn "after!" }
testMe() |> Async.RunSynchronously
What's happening at do! myAsyncFunction()? I'm aware that it waits for myAsyncFunction to finish before moving on. But how can it do that without blocking the thread?
My best guess is that everything after do! myAsyncFunction() is passed along as a continuation, which gets executed on the same thread myAsyncFunction() was scheduled on, once myAsyncFunction() has finished executing.. but then again, that's just a guess.

As you correctly pointed out, the myAsyncFunction is passed a continuation and it calls it to resume the rest of the asynchronous workflow when it completes.
You can understand it better by looking at the desugared version of the code:
let testMe() =
    async.Delay(fun () ->
        printfn "before!"
        async.Bind(myAsyncFunction(), fun () ->
            printfn "after!"
            async.Zero()))
The key thing is that the asynchronous workflow created by myAsyncFunction is given to the Bind operation, which starts it and gives it the second argument (a continuation) as a function to call when the workflow completes. If you simplify a lot, then an asynchronous workflow could be defined like this:
type MyAsync<'T> = (('T -> unit) * (exn -> unit)) -> unit
So, an asynchronous workflow is just a function that takes some continuations as arguments. When it gets the continuations, it does something (e.g. creates a timer or starts I/O) and eventually calls one of them. The question "On which thread are the continuations called?" is an interesting one. In a simple model, it depends on the MyAsync that you're starting: it may decide to run them anywhere it wants (e.g. Async.SwitchToNewThread runs them on a new thread). The F# library includes some additional handling that makes GUI programming using workflows easier.
Your example uses Async.RunSynchronously, which blocks the current thread until the workflow completes, but you could also use Async.Start, which just starts the workflow and ignores the result when it is produced. The implementation of Async.Start could look like this:
let Start (async:MyAsync<unit>) = async (ignore, ignore)

Related

recoding a c++ task queue in rust. Is futures the right abstraction?

I am rewriting a C++ project in Rust as my first non-tiny Rust program. I thought I would start with a simple but key gnarly piece of code.
It's a queue of std::packaged_tasks that run at specific times. A client says
running_func_fut_ = bus_->TimerQueue().QueueTask(std::chrono::milliseconds(func_def.delay),
[this, func, &unit]()
{
func(this, &unit);
Done();
}, trace);
func is a std::function, but the key point is that, as far as the queue is concerned, it is queuing up a lambda (a closure in Rust speak).
It returns a std::future which the client can ignore or can hang onto. If they hang onto it they can see if the task completed yet. (It could return a result but in my current use case the functions are all void, the client just needs to know if the task completed). All the tasks run on a single dedicated thread. The QueueTask method wraps the passed lambda up in a packaged_task and then places it in a multiset of objects that say when and what to run.
I am reading the Rust docs and it seems that futures encapsulate both the callable object and the 'get me the result' mechanism.
So I think I need a BTreeSet of futures (I need the queue sorted by launch time so I can pick the next one to run), but I am not even sure how to declare one of those. So before I dive into the deep end of futures, is this the right approach? Is there a better, more natural abstraction for Rust?
For the output, you probably do want a Future. However, for the input, you probably want a function object (Box<dyn FnOnce(...)>); see https://doc.rust-lang.org/book/ch19-05-advanced-functions-and-closures.html.
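The split suggested above (a plain callable going in, a future coming out) is language independent. The following is a rough Python sketch of that shape, not Rust and not the asker's C++ code: the TimerQueue class and its queue_task and close methods are hypothetical names chosen to mirror the question. A heap keeps entries ordered by due time, one dedicated thread runs them, and each entry carries a concurrent.futures.Future that the client may keep or ignore.

```python
import heapq
import itertools
import threading
import time
from concurrent.futures import Future

class TimerQueue:
    """Runs queued callables at their due time on one dedicated thread."""

    def __init__(self):
        self._heap = []                 # entries: (due_time, seq, func, future)
        self._seq = itertools.count()   # tie-breaker so callables are never compared
        self._cond = threading.Condition()
        self._closed = False
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def queue_task(self, delay_seconds, func):
        """Schedule `func`; the returned future can be kept or dropped."""
        future = Future()
        due = time.monotonic() + delay_seconds
        with self._cond:
            heapq.heappush(self._heap, (due, next(self._seq), func, future))
            self._cond.notify()
        return future

    def close(self):
        """Stop the worker thread; tasks still pending are dropped."""
        with self._cond:
            self._closed = True
            self._cond.notify()
        self._thread.join()

    def _run(self):
        while True:
            with self._cond:
                while not self._closed:
                    if self._heap:
                        wait = self._heap[0][0] - time.monotonic()
                        if wait <= 0:
                            break           # earliest task is due
                        self._cond.wait(wait)
                    else:
                        self._cond.wait()   # nothing queued yet
                if self._closed:
                    return
                _due, _seq, func, future = heapq.heappop(self._heap)
            # Run outside the lock so a task may queue further tasks.
            if future.set_running_or_notify_cancel():
                try:
                    future.set_result(func())
                except Exception as exc:
                    future.set_exception(exc)
```

A client calls q.queue_task(0.5, some_callable) and either ignores the returned future or later checks future.done() or future.result(). In Rust the stored callable would play the role of the Box<dyn FnOnce(...)> mentioned above.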

Is there any linter that detects blocking calls in an async function?

https://www.aeracode.org/2018/02/19/python-async-simplified/
It's not going to ruin your day if you call a non-blocking synchronous
function, like this:
def get_chat_id(name):
    return "chat-%s" % name
async def main():
    result = get_chat_id("django")
However, if you call a blocking function, like the Django ORM, the
code inside the async function will look identical, but now it's
dangerous code that might block the entire event loop as it's not
awaiting:
def get_chat_id(name):
    return Chat.objects.get(name=name).id
async def main():
    result = get_chat_id("django")
You can see how it's easy to have a non-blocking function that
"accidentally" becomes blocking if a programmer is not super-aware of
everything that calls it. This is why I recommend you never call
anything synchronous from an async function without doing it safely,
or without knowing beforehand it's a non-blocking standard library
function, like os.path.join.
So I am looking for a way to automatically catch instances of this mistake. Are there any linters for Python which will report sync function calls from within an async function as a violation?
Can I configure Pylint or Flake8 to do this?
I don't necessarily mind if it catches the first case above too (which is harmless).
Update:
On one level I realise this is a stupid question, as pointed out in Mikhail's answer. What we need is a definition of a "dangerous synchronous function" that the linter should detect.
So for purpose of this question I give the following definition:
A "dangerous synchronous function" is one that performs IO operations. These are the same operations which have to be monkey-patched by gevent, for example, or which have to be wrapped in async functions so that the event loop can context switch.
(I would welcome any refinement of this definition)
So I am looking for a way to automatically catch instances of this
mistake.
Let's make a few things clear: the mistake discussed in the article is calling any long-running sync function inside an asyncio coroutine (it can be a blocking I/O call or just a pure CPU function with a lot of calculations). It's a mistake because it blocks the whole event loop, which leads to a significant performance degradation (more about it here, including the comments below the answer).
Is there any way to catch this situation automatically? Before run time, no: no one except you can predict whether a particular function will take 10 seconds or 0.01 seconds to execute. At run time, this is already built into asyncio; all you have to do is enable debug mode.
If you are afraid that some sync function can vary between being long-running (detectable at run time in debug mode) and short-running (not detectable), just execute the function in a background thread using run_in_executor; that guarantees the event loop will not be blocked.
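Both suggestions can be tried in a few lines. The sketch below is only an illustration under stated assumptions: blocking_lookup is a hypothetical stand-in for a blocking call such as an ORM query, and heartbeat, measure, Collect and run_demo are helper names invented here. It runs the same blocking function once directly inside a coroutine and once through run_in_executor, with asyncio debug mode enabled, and collects whatever the "asyncio" logger reports. In debug mode asyncio logs a warning when a single step takes longer than loop.slow_callback_duration (0.1 seconds by default).

```python
import asyncio
import logging
import time

def blocking_lookup(name):
    """Hypothetical stand-in for a blocking call such as an ORM query."""
    time.sleep(0.2)
    return "chat-%s" % name

async def heartbeat(ticks):
    """Records a tick every time the event loop gets a chance to run it."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)

async def call_directly():
    # Blocks the event loop for about 0.2 s; nothing else can run meanwhile.
    return blocking_lookup("django")

async def call_in_executor():
    # Runs the blocking function on the default thread pool instead.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_lookup, "django")

async def measure(coro_func):
    """Return (result, number of heartbeat ticks seen while waiting)."""
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)          # let the heartbeat start
    result = await coro_func()
    beat.cancel()
    return result, len(ticks)

class Collect(logging.Handler):
    """Logging handler that keeps formatted messages in a list."""
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())

def run_demo():
    handler = Collect()
    logger = logging.getLogger("asyncio")
    logger.addHandler(handler)
    try:
        direct = asyncio.run(measure(call_directly), debug=True)
        offloaded = asyncio.run(measure(call_in_executor), debug=True)
    finally:
        logger.removeHandler(handler)
    return direct, offloaded, handler.messages
```

In the direct case the heartbeat barely ticks and the debug log contains a slow-callback warning; in the executor case the heartbeat keeps ticking while the lookup runs on a worker thread.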

How can I stop async/await from bubbling up in functions?

Let's say I have a function A that uses a function B which uses C, etc:
A -> B -> C -> D and E
Now assume that function D has to use async/await. This means I have to use async/await to the call of function C and then to the call of function B and so on. I understand that this is because they depend on each other and if one of them is waiting for a function to resolve, then transitively, they all have to. What alternatives can I do to make this cleaner?
There is a way to do this, but you'll lose the benefits of async-await.
One of the reasons for async-await is that if your thread has to wait for another process to complete, like a read or write to the hard disk, a database query, or fetching some information from the internet, your thread can do some other useful work instead of just waiting idly for this other process to complete.
This is done by using the keyword await. Once your thread sees the await, it doesn't really wait idly. Instead, it remembers the context (think of variable values, parts of the call stack, etc.) and goes up the call stack to see if the caller is not awaiting. If not, it starts executing the caller's statements until it sees an await. It then goes up the call stack again to see if that caller is not awaiting, etc.
Once this other process is completed the thread (or maybe another thread from the thread pool that acts as if it is the original thread) continues with the statements after the await until the procedure is finished, or until it sees another await.
To be able to do this, your procedure must know, that when it sees an await, the context needs to be saved and the thread must act like described above. That is why you declare a method async.
That is why typical async functions are functions that order other processes to do something lengthy: disk access, database access, internet communications. Those are typical functions where you'll find a ReadAsync / WriteAsync, next to the standard Read / Write functions. You'll also find them in classes that are typically designed to call these processes, like StreamReaders, TextWriters etc.
If you have a non-async class that calls an async function and waits until the async function completes before returning, the go-up-the-call-stack-to-see-if-the-caller-is-not-awaiting stops here: your program acts as if it is not using async-await.
Almost!
If you start an awaitable task without waiting for it to finish, and do something else before you wait for the result, then this something else is executed instead of the idle wait that the thread would have done if you had used the non-async version.
How to call async function from non-async function
ICollection<string> ReadData(...)
{
    // Call the async function; don't await yet, you'll have far more interesting things to do.
    var taskReadLines = myReader.ReadLinesAsync(...);
    DoSomethingInteresting();
    // Now you need the data from the read task.
    // However, because this method is not async, you can't await.
    // This Wait will really be an idle wait.
    taskReadLines.Wait();
    ICollection<string> readLines = taskReadLines.Result;
    return readLines;
}
Your callers won't benefit from async-await, however your thread will be able to do something interesting while the lines have not been read yet.

Lua Script coroutine

Hi, I need some help with my Lua script. I have a script here that runs a server-like application (an infinite loop). The problem is that it doesn't execute the second coroutine.
Could you tell me what's wrong? Thank you.
function startServer()
    print( "...Running server" )
    --run a server like application infinite loop
    os.execute( "server.exe" )
end
function continue()
    print("continue")
end
co = coroutine.create( startServer() )
co1 = coroutine.create( continue() )
Lua has cooperative multithreading. Threads are not switched automatically, but must yield to others. When one thread is running, every other thread is waiting for it to finish or yield. Your first thread in this example seems to run server.exe, which, I assume, never finishes until interrupted. Thus the second thread never gets its turn to run.
You also run threads wrong. In your example you're not running any threads at all. You execute the function and then try to create a coroutine from its return value, which naturally fails. But since you never get back from server.exe, you haven't noticed this problem yet. Remove the parentheses after startServer and continue to fix it.
As already noted, there are several issues with the script that prevent you from getting what you want:
os.execute("...") blocks until the command is completed, and in your case it doesn't complete (as it runs an infinite loop). Solution: you need to detach that process from yours by using something like io.popen() instead of os.execute()
co = coroutine.create( startServer() ) doesn't create a coroutine in your case. The coroutine.create call accepts a function reference, and you pass it the result of the startServer call, which is nil. Solution: use co = coroutine.create( startServer ) (note that the parentheses are dropped, so it's not a function call anymore).
You are not yielding from your coroutines; if you want several coroutines to work together, they need to be cooperating by giving control to each other when appropriate. That's what yield command is for and that's why it's called non-preemptive multithreading. Solution: you need to use a combination of resume and yield calls after you create your coroutine.
startServer doesn't need to be a coroutine as you are not giving control back to it; its only purpose is to start the server.
In your case, the solution may not even need coroutines as all you need to do is: (1) start the server and let it detach from your process (for example, using popen) and (2) work with your process using whatever communication protocol it requires (pipes, sockets, etc.).
There are more complex and complete solutions (like LuaLanes) and also several good descriptions on creating simple coroutine dispatchers.
Your coroutine is not yielding
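The resume/yield cooperation described in these answers can be sketched with Python generators, which behave much like Lua coroutines for this purpose. This is an analogy rather than Lua code: server, continue_task and run_round_robin are hypothetical names, and the "server" here does a few fake units of work instead of launching server.exe. One difference worth noting is that calling a Python generator function only creates the coroutine object and does not run its body, whereas coroutine.create in Lua must be given the function itself.

```python
from collections import deque

def server(log):
    """Pretend server loop that yields after each unit of work."""
    for request in range(3):
        log.append("server handled request %d" % request)
        yield                   # hand control back to the dispatcher

def continue_task(log):
    """Second task; it only runs because the server yields."""
    log.append("continue")
    yield

def run_round_robin(tasks):
    """Resume each task in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # comparable to coroutine.resume in Lua
        except StopIteration:
            continue            # task finished; do not reschedule it
        ready.append(task)
```

With log = [] and run_round_robin([server(log), continue_task(log)]), the "continue" entry appears between the first and second server entries. If server never yielded, continue_task would not run until the server finished, which is the situation in the question.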

transforming a thread blocking to non thread blocking in f#

I retrieve data from the Bloomberg API, and am quite surprised by the slowness.
My computation is I/O-bound because of this.
Therefore I decided to use an async monad builder to unthrottle it.
Upon running it, the results are not much better, which should have been obvious, as I make a call to a function, NextEvent, which blocks the thread.
let outerloop args dic =
...
let rec innerloop continuetoloop =
let eventObj = session.NextEvent(); //This blocks
...
let seqtable = reader.ReadFile( @"C:\homeware\sector.csv", ";".[0], true)
let dic = ConcurrentDictionary<_,_> ()
let wf = seqtable |> Seq.mapi (fun i item -> async { outerloop item dic } )
wf |> Async.Parallel
|> Async.RunSynchronously
|> ignore
printfn "%A" ret
Is there a good way to wrap that blocking call to a nonblocking call ?
Also, why is the async framework not creating as many threads as I have requests (like 200)? When I inspect the threads from which I receive values, I see only 4-5 in use.
UPDATE
I found a compelling reason why it will never be possible.
An async operation takes what comes after the async instruction and schedules it somewhere in the thread pool.
For all that matters, as long as async functions are used correctly, that is, always returning to the thread pool they originated from, we can consider that we are executing on a single thread.
Being on a single thread means all that scheduling will always be executed somewhere later, and a blocking instruction cannot avoid the fact that, once it runs, it will have to block the workflow at some point in the future.
Is there a good way to wrap that blocking call to a nonblocking call ?
No. You can never wrap blocking calls to make them non-blocking. If you could, async would just be a design pattern rather than a fundamental paradigm shift.
Also, why is the async framework not creating as many threads as I have requests (like 200)?
Async is built upon the thread pool which is designed not to create threads aggressively because they are so expensive. The whole point of (real) async is that it does not require as many threads as there are connections. You can handle 10,000 simultaneous connections with ~30 threads.
You seem to have a complete misunderstanding of what async is and what it is for. I'd suggest buying any of the F# books that cover this topic and reading up on it. In particular, your solution is not asynchronous because you just call your blocking StartGetFieldsValue member from inside your async workflow, defeating the purpose of async. You might as well just do Array.Parallel.map getFieldsValue instead.
Also, you want to use a purely functional API when doing things in parallel rather than mutating a ConcurrentDictionary in-place. So replace req.StartGetFieldsValue ret with
let! result = req.StartGetFieldsValue()
...
return result
and replace ignore with dict.
Here is a solution I made that seems to be working.
Of course, it uses not only async (lowercase a), but Async as well.
I define a simple type that has one event, finished, and one method, asyncstart, which takes the method to run as an argument. It launches the method, then fires the finished event at the appropriate place (in my case I had to capture the synchronization context, etc.).
Then, on the consumer side, I simply use
let! completion = Async.AwaitEvent wapper.finished |> Async.StartAsChild
let! completed = completion
While running this code, on the consumer side, I use only async calls, making my code non-blocking. Of course, there has to be some thread somewhere which is blocked, but this happens outside of my main serving loop, which remains responsive.

Resources