How to use aio_read(3) in Haskell? - haskell

In Linux one is able to asynchronously read from a file by calling aio_read(3) from C. A sigevent structure is one of the parameters and there are different options one can specify to be notified when the operation is complete. Let me summarize:
SIGEV_NONE no notification.
The status can be checked with aio_error(3). The operation is async, but completion must be busily awaited in some loop which is not what I want.
SIGEV_SIGNAL a signal is raised to the process.
In theory, this can be caught in Haskell by installing a signal handler via System.Posix.Signals. There is a problem though: the API of SignalInfo doesn't include the crucial si_value that let's one communicate some specifics about the read request, like a StaticPtr. This is unfortunate.
SIGEV_THREAD this would start a new thread, according to the documentation.
I don't know how to represent this in Haskell. My best guess would be an IO () action. I'm not sure how to write the accompanying native code.
How can I use aio_read or something of that sort in Haskell? I will probably not get around using FFI on this (or a library).

Related

How to propagate errors from multiple threads in rust?

In a Rust application that is:
Synchronous in the sense of not using "async"
multi-threaded using std::thread
threads are communicating via channels
the "anyhow" crate is being used to annotate and propagate Results
I am propagating all errors up to the main thread, but I only see the Error that is hit by the main thread. This usually happens before I join the child threads, so I don't see the actual root cause.
What minimum-boilerplate modification can I make to see the Errors from multiple threads?
(I'll put some ideas I have in answers, but I'm hoping there is something better.)
I could use my main thread only for supervising child threads, aggregate all of their Results in some kind of Vec as I join them, filter it, and write custom code to print it.
This still feels like more work than should be necessary; I'm not the first person to write a threaded rust application that handles errors.
I could let threads panic and "propagate the panics":
https://doc.rust-lang.org/beta/std/thread/type.Result.html
But this is ugly:
It introduces a difference between how errors are handled in child threads vs. the main thread.
At every fallible call site in my original code I have to add .unwrap().
If I unwrap() every error in the child threads, I might as well not even be using Result error handling, because everything will be returning Ok() or panicking. If I make that transformation, then I change the signature of all my existing functions, which is also gross.
One option would be to upgrade the entire application to use a logging framework, and then scrape through the logs.
This will require modifications everywhere have added an anyhow .context(), ensure!, or bail! annotation, to conditionally also throw a logger error.
Eventually I will want both logging/telemetry and clean teardown from propagating Results, but even then I will not want to have several lines of boilerplate for every single fallible function call.

Watch blocking call nodejs

I'm trying to connect to a 3rd party library, that has a function that can block. I would like to use it, but without blocking. Is it possible to wrap a blocking call that I don't have the control of, to make it async?
// calling this function will block the nodejs thread
blockingCall();
What I would like would be something like this.
// wrapper for the blocking call
var wrapper = wrapBlockingCall(blockingCall);
wrapper.on('complete', function() {});
Is this possible? Does this make sense?
There is no way to make a blocking JavaScript code non-blocking in Node.js - the mechanism which Node uses for its non-blocking behaviour is implemented in the C/C++ layer, which in turn is used only when doing I/O operations (reading from disk, networking etc.).
In reality, every line of JavaScript your program uses will be executed one-by-one, because it is always executed on the same thread, no matter what you do.
The only option I see is to execute the offending code in a separate Node process using the built-in Child Process module. However, this will have significant performance impact, even bigger one if the code needs to be executed frequently.
Note:
After reading the comments under your question, it seems that you are actually the author of the blocking function, which in turn calls a C API which performs blocking I/O. There are ways of calling C functions which would normally block in a manner which does not block the upper JavaScript layer.
While I am not a C expert, I think this is accomplished using the libuv library included in Node - have a look at the addons documentation for more info.

Node Background Threads - When Do These Get Created?

I've been doing a fair amount of work with Node lately, trying to build a system which has certain characteristics, one of which is non-blocking / parallelism - a Node strong suit, as I understand it.
What I don't fully understand is when a separate thread is spun off to handle some processing. I'm pretty sue this happens on a function call/call back, but certainly not all of them.
In my specific case, it's an Express based app. At app start-up it does several things including instantiating a RabbitMQ based "bus", an object with a method which will write to the bus (objA) and object which will subscribe to the bus and process messages coming across it (objB).
objA will write to the bus inside an express callback
app.put((req,res) => {
objA.methodWhichWritesToBus();
});
I believe at this point, that objA.methodWhichWritesToBus is executed in a background/worker thread - whatever you call it, not on the main event loop.
Is that the only point at which this sort of thing happens? methodWhichWritesToBus is IO instensive (it calls an elastic search service on another box and brings back 10's to 100's of thousands of records) with lots of chained promises etc., but none of that gets split off, does it?
How about the fact that the obj on which the method is called is instantiated outside the Express callback - does that affect the parallel-ism?
Finally, are the ways to effect/force a method etc to "run in the background"?
I've been noodling this, testing it, for awhile now but all on one machine so it's difficult to tell what's going on.
Who can clarify this for me?
Pre-answer: this is a topic best learned by going and reading, doing coding exercises to solidify your understanding, and working with the technology in a significant way. You're not going to "get it" based on a Q&A format. That said...
What I don't fully understand is when a separate thread is spun off to handle some processing.
Never, sort of. "Processing" as in the computation that happens in your javascript program, happens in the main event loop thread. End of story. However, waiting on I/O to come back from the OS is not considered "processing" so there are various queues managed by node and the OS to track pending I/O requests and invoke callbacks when data is ready. There are a handful of threads node uses internally to manage this stuff with the OS, but from your program's perspective, those threads are irrelevant. Your program can ask node to do some IO, then your program keeps running in parallel, and when the I/O is done, node will eventually invoke the callback in the main event loop and you can process the results.
I believe at this point, that objA.methodWhichWritesToBus is executed in a background/worker thread - whatever you call it, not on the main event loop.
You call it "asynchronously" and it happens whenever you do IO, including filesystem calls, networking, or child processes. Which is to say, quite a lot.
How about the fact that the obj on which the method is called is instantiated outside the Express callback - does that affect the parallel-ism?
Nope.
Finally, are the ways to effect/force a method etc to "run in the background"?
Generally I/O is done asynchronously by default, so no you don't normally need to force anything to run in the background. It's baked into the node design by way of the node core APIs themselves. However, there are ways to delay synchronous processing to a future event loop using setImmediate, setTimeout, or process.nextTick. I explain these in some detail in my blog post setTimeout and friends.
More precisely, all networking is asynchronous. End of story. Specifically, the APIs in node core that are available are all asynchronous, and there's simply no synchronous API available in node. For filesystem IO and child processes, there are both synchronous and asynchronous APIs, but the synchronous APIs must only be used under special limited circumstances, and if you don't know confidently that it's OK in this specific case to make a synchronous IO API call, you should use the asynchronous API so you don't break the lynchpin that makes node perform as it does.

Does Go have callback concept?

I found many talks saying that Node.js is bad because of callback hell and Go is good because of its synchronous model.
What I feel is Go can also do callback as same as Node.js but in a synchronous way. As we can pass anonymous function and do closure things
So, why are they comparing Go and Node.js in callback perspective as if Go cannot become callback hell.
Or I misunderstand the meaning of callback and anonymous function in Go?
A lot of things take time, e.g. waiting on a network socket, a file system read, a system call, etc. Therefore, a lot of languages, or more precisely their standard library, include asynchronous version of their functions (often in addition to the synchronous version), so that your program is able to do something else in the mean-time.
In node.js things are even more extreme. They use a single-threaded event loop and therefore need to ensure that your program never blocks. They have a very well written standard library that is built around the concept of being asynchronous and they use callbacks in order to notify you when something is ready. The code basically looks like this:
doSomething1(arg1, arg2, function() {
doSomething2(arg1, arg2, function() {
doSomething3(function() {
// done
});
});
});
somethingElse();
doSomething1 might take a long time to execute (because it needs to read from the network for example), but your program can still execute somethingElse in the mean time. After doSomething1 has been executed, you want to call doSomething2 and doSomething3.
Go on the other hand is based around the concept of goroutines and channels (google for "Communicating Sequential Processes", if you want to learn more about the abstract concept). Goroutines are very cheap (you can have several thousands of them running at the same time) and therefore you can use them everywhere. The same code might look like this in Go:
go func() {
doSomething1(arg1, arg2)
doSomething2(arg1, arg2)
doSomething3()
// done
}()
somethingElse()
Whereas node.js focus on providing only asynchronous APIs, Go usually encourages you to write only synchronous APIs (without callbacks or channels). The call to doSomething1 will block the current goroutine and doSomething2 will only be executed after doSomething1 has finished. But that's not a problem in Go, since there are usually other goroutines available that can be scheduled to run on the system thread. In this case, somethingElse is part of another goroutine and can be executed in the meantime, just like in the node.js example.
I personally prefer the Go code, since it's easier to read and reason about. Another advantage of Go is that it also works well with computation heavy tasks. If you start a heavy computation in node.js that doesn't need to wait for network of filesystem calls, this computation basically blocks your event loop. Go's scheduler on the other hand will do its best to dispatch the goroutines on a few number of system threads and the OS might run those threads in parallel if your CPU supports it.
What I feel is Golang can also do callback as same as Node.js but in a synchronous way. As we can pass anonymous function and do closure things
So, why are they comparing Golang and Node.js in callback perspective as if Golang cannot become callback hell.
Yes, of course it is possible to mess things up in Go as well. The reason why you don't see as much callbacks as in node.js is that Go has channels for communication, which allow for a way of structuring your code without using callbacks.
So, since there are channels, callbacks are not used as often therefore it is unlikely to stumble over callback infested code. Of course this doesn't mean that you cannot write scary code with channels as well...

Preemptive multithreading in Lua

I'm using lua as the scripting language for handling events in my application, and I don't want to restrict users to writing short handlers - e.g. someone might want to have one handler run an infinite loop, and another handler would interrupt the first one. Obviously, lua doesn't directly support such behavior, so I'm looking for workarounds.
First of all, I'd like to avoid modifying the engine. Is it possible to set up a debug hook that would yield once the state has reached its quota? Judging by the documentation, it shouldn't be hard at all, but I don't know if there are any caveats to this.
And second, can I use lua_close to terminate a thread as I would in actual multithreading?
I've done something similar in the past. Its completely possible to multi-thread on separate Lua states. Be sure to take a look at luaL_lock() and luaL_unlock() (plus associated setup/cleanup), as you will no doubt need this setup (a simple mutex should do the trick).
After that, it should be a fairly simple matter of creating a lock/wait/interrupt API for your handlers.

Resources