I'm writing an event-driven program with the event scheduling written in C. The program uses Python for extension modules. I would like to allow extension modules to use async/await syntax to implement coroutines. Coroutines will just interact with parts of my program, no IO is involved. My C scheduler is single-threaded and I need coroutines to execute in its thread.
In a pure Python program, I would just use asyncio as is and let its event loop drive all events. That, however, is not an option here: my event loop needs to serve millions of C-based events per second, and I cannot afford Python's overhead.
I tried to write my own event loop implementation that delegates all scheduling to my C scheduler. I tried a few approaches:
Re-implement EventLoop, Future, Task, etc. to imitate how asyncio works (minus I/O), so that call_soon delegates scheduling to my C event loop. This is safe, but it requires a fair amount of work, and my implementation will always be inferior to asyncio's in documentation, debugging support, intricate semantic details, and correctness/test coverage.
I can use the vanilla Task, Future, etc. from asyncio and only write a custom implementation of AbstractEventLoop, delegating scheduling to my C event loop in the same way. This is pretty straightforward, but I can see that the vanilla event loop touches non-obvious internals (task._source_traceback, _asyncgen_finalizer_hook, _set_running_loop), so my implementation is still second class. I also have to rely on the undocumented Handle._run to actually invoke callbacks (a sketch of the C side appears after this list).
Things appeared to get simpler and better when I subclassed BaseEventLoop instead of AbstractEventLoop (though the docs say I shouldn't do that). I still need Handle._run, though.
I could spawn a separate thread that calls run_forever() on a stock asyncio event loop and run all my coroutines there, but the coroutines depend on my program's extension API, which does not support concurrent calls. So I would somehow have to make that event loop pause my C event loop while calling Handle._run(), and I don't see a reasonable way to achieve that.
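For concreteness, here is a rough sketch of what the C-side glue for the second and third approaches might look like. The wiring is hypothetical: it assumes my custom loop's call_soon() hands each asyncio.Handle to the C scheduler as an owned PyObject*, which the scheduler later invokes like this:

    /* Hypothetical glue: how the C scheduler could invoke a queued
     * asyncio.Handle via the undocumented Handle._run. Assumes call_soon()
     * stored `handle` as an owned reference when it delegated to C. */
    #include <Python.h>

    void run_python_handle(PyObject *handle)
    {
        PyGILState_STATE gstate = PyGILState_Ensure();  /* scheduler thread may not hold the GIL */
        PyObject *result = PyObject_CallMethod(handle, "_run", NULL);
        if (result == NULL)
            PyErr_WriteUnraisable(handle);  /* don't let one bad callback kill the loop */
        Py_XDECREF(result);
        Py_DECREF(handle);                  /* drop the reference taken in call_soon() */
        PyGILState_Release(gstate);
    }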
Any ideas on how to best do this? How did others solve this problem?
I found that trio, a third-party alternative to asyncio, provides explicit support for running on top of a foreign event loop through something called guest mode. That solves my problem!
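For anyone curious about what guest mode asks of the host: essentially just a thread-safe "run this callback soon" hook (the run_sync_soon_threadsafe argument to trio.lowlevel.start_guest_run) that the host loop honors. A minimal sketch of what the C scheduler might expose for that purpose; the names and wiring here are illustrative, not trio API:

    /* Hypothetical host-side primitive for guest mode: a thread-safe FIFO
     * of callbacks that the C event loop drains once per iteration. */
    #include <pthread.h>
    #include <stdlib.h>

    typedef void (*callback_fn)(void *arg);
    typedef struct node { callback_fn fn; void *arg; struct node *next; } node;

    static node *head = NULL, *tail = NULL;
    static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Callable from any thread; this is what run_sync_soon_threadsafe
     * would forward to. */
    void call_soon_threadsafe(callback_fn fn, void *arg)
    {
        node *n = malloc(sizeof *n);
        n->fn = fn; n->arg = arg; n->next = NULL;
        pthread_mutex_lock(&queue_lock);
        if (tail) tail->next = n; else head = n;
        tail = n;
        pthread_mutex_unlock(&queue_lock);
    }

    /* Called by the C scheduler, on its own thread, once per loop iteration. */
    void drain_callbacks(void)
    {
        pthread_mutex_lock(&queue_lock);
        node *n = head;
        head = tail = NULL;
        pthread_mutex_unlock(&queue_lock);
        while (n) {
            node *next = n->next;
            n->fn(n->arg);   /* runs the guest's callback on the loop thread */
            free(n);
            n = next;
        }
    }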
Related
The documentation of Node.js describes the so-called phases of its underlying event loop.
It also explicitly states that the idle and prepare phases are only used internally.
Since the Node.js event loop is libuv's, it seems safe to assume that those phases map onto libuv's idle and prepare handles.
They would allow greater granularity when organizing tasks in a program. In particular, they are the only way to schedule something between the execution of the I/O callbacks and the poll phase.
Anyway, they are not exposed by the underlying environment.
Why are those phases hidden, effectively giving users an apparently poorer event loop than the one libuv offers?
Is there any other way to schedule tasks as described above?
Side note: this is just curiosity.
I used to work with both libuv and Node.js and I noticed this, so I want to know whether there is a technical reason for it or... well, whether that is simply how it was designed, with no particular reason.
I don't think there is a specific reason to "forbid" them. Moreover, they are not really forbidden, they are just not exposed. You could create a Node addon that lets you create idle and prepare handles, and there would be no problem at all. There are some things you must be aware of:
Idle handles have a terrible name: they don't run when the loop is actually idle. They run once per loop iteration, after the timers, and if any idle handle is active, the loop will block for I/O for zero seconds. So they can be dangerous, because the CPU will spin if you don't stop them.
Callbacks registered with process.nextTick are called when the C++ <-> JS boundary is crossed (see calls to MakeCallback), so I/O callbacks could be deferred and run a bit later. If you exposed prepare handles to JS, you would use MakeCallback in the C++ code, so some of the process.nextTick callbacks would also run alongside your prepare callbacks.
As a general note: idle, check and prepare handles were somewhat inherited from libev (which libuv used internally at one point). Check and prepare can be used when embedding libuv with other libraries, and idle handles are a bit weird, as I mentioned above. Also, libuv follows its own path these days, so not everything libuv has will end up exposed in Node land.
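For reference, here is what those handles look like in plain libuv C (the same calls a Node addon would make). The sketch also demonstrates the caveat above: while an idle handle is active, the loop polls for I/O with a zero timeout, so the CPU spins until you stop it:

    /* Minimal libuv idle/prepare demo. While the idle handle is active,
     * uv_run() never blocks, so we stop it after a few iterations. */
    #include <stdio.h>
    #include <uv.h>

    static uv_idle_t idler;
    static uv_prepare_t prep;

    static void on_idle(uv_idle_t *handle)
    {
        static int count = 0;
        if (++count >= 5) {
            printf("idle ran %d times, stopping\n", count);
            uv_idle_stop(&idler);    /* otherwise the loop spins forever */
            uv_prepare_stop(&prep);  /* no active handles left: uv_run() returns */
        }
    }

    static void on_prepare(uv_prepare_t *handle)
    {
        /* Runs once per iteration, right before the loop blocks for I/O. */
        printf("prepare: about to poll\n");
    }

    int main(void)
    {
        uv_loop_t *loop = uv_default_loop();
        uv_idle_init(loop, &idler);
        uv_idle_start(&idler, on_idle);
        uv_prepare_init(loop, &prep);
        uv_prepare_start(&prep, on_prepare);
        uv_run(loop, UV_RUN_DEFAULT);
        return 0;
    }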
You could ask the reverse question: "Why do you need the idle phase, for example, to be exposed?" You can just use setImmediate().
Also, why do you want to execute something between the I/O callbacks and the polling phase, given that you don't explicitly control those things anyway?
It's a few times harder to program with continuations (callbacks) than in a straight sequential execution model. Can Node.js do blocking calls?
Yes, it can. For example, you can read a file with fs.readFileSync() rather than fs.readFile(). Each library usually provides an xxxSync method for synchronous/blocking calls.
But you should NOT use the sync methods very often. Remember that Node.js uses a single thread of execution for JavaScript code. If you block this thread, you block it for everybody (unlike C#/Java, where a new thread can be created for each request).
If the async approach is too much for you, you might want to use another platform (Ruby, Python, PHP).
Yes, Node.js 0.11.x can do blocking calls inside a generator. By "blocking call" I mean suspending execution of the current function for a while. Look at the co library.
This is the only recommended way to do blocking calls.
Other than that, you can look at fibers, but they need to be used carefully, and not in general-purpose libraries.
There are also the couple of *Sync calls mentioned before, but please avoid them entirely unless you're writing a one-liner.
I've read that Node.js uses both threads and an event loop.
I'm curious to know how it decides how to handle a callback. Is it specified by the EventEmitter (and does the engineer know whether it is going to block or not)?
Or does the core itself choose at runtime?
If the latter, how does it detect whether a callback has to run asynchronously or on a thread?
I've already read a lot of resources but couldn't find anything about this. I'm reading the source code, but it's hard going, since it's been a long time since I last coded in C++.
Thanks.
Your JavaScript code always runs in a single thread. That's because the V8 JavaScript engine is not thread-safe.
However, as an implementation detail, some of the C++ code may use threads. For example, suppose you write some JavaScript code that connects to a database. Your JavaScript code will of course be async, like any good Node code. But async coding is very uncommon in the C/C++ world, so the database vendor probably didn't write an async C/C++ API.
So whoever writes a Node package for database access has to write a shim that adapts between the "blocking" C++ behavior and the "non-blocking, event-driven" Node behavior. When you call, say, a "connect" method, that goes to C++ code that spawns a new thread, and that thread issues a (blocking) "connect" call to the database, which blocks that thread until the connection is done. The C++ code then posts the "connection done" event back to the event queue, and the next time the main (JavaScript) thread polls the event queue, your callback fires.
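In libuv terms, that shim pattern is exactly what uv_queue_work() packages up: run a blocking function on a threadpool thread, then deliver a completion callback on the loop thread. A rough sketch, where fake_blocking_connect is a stand-in for a vendor's blocking API:

    /* The blocking call runs on a threadpool thread; the completion
     * callback runs back on the event-loop thread. */
    #include <stdio.h>
    #include <unistd.h>
    #include <uv.h>

    static void fake_blocking_connect(void)
    {
        sleep(1);  /* pretend to block on a database handshake */
    }

    static void do_work(uv_work_t *req)
    {
        /* Threadpool thread: blocking here does not stall the loop. */
        fake_blocking_connect();
    }

    static void after_work(uv_work_t *req, int status)
    {
        /* Event-loop thread: this is where the JS callback would fire. */
        printf("connection done (status=%d)\n", status);
    }

    int main(void)
    {
        uv_loop_t *loop = uv_default_loop();
        uv_work_t req;
        uv_queue_work(loop, &req, do_work, after_work);
        uv_run(loop, UV_RUN_DEFAULT);
        return 0;
    }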
So yes, there are threads, but their use should be completely transparent to you. When you're writing Node.js code in JavaScript, you don't need to worry about threads -- you just care that things happen when they're supposed to. Package authors may use threads, but that's purely an implementation detail and you should never have to worry about it. Your JavaScript code never explicitly uses threads.
I haven't been able to write a program in Lua that loads more than one CPU core. Since Lua supports the concept via coroutines, I believe it should be achievable.
The reason I'm failing must be one of:
It's not possible in Lua
I'm not able to write it ☺ (and I hope that's the case)
Can someone more experienced (I discovered Lua two weeks ago) point me in the right direction?
The point is to write a number-crunching script that puts a high load on ALL cores,
for the purpose of demonstrating the power of Lua.
Thanks...
Lua coroutines are not the same thing as threads in the operating system sense.
OS threads are preemptive. That means that they will run at arbitrary times, stealing timeslices as dictated by the OS. They will run on different processors if they are available. And they can run at the same time where possible.
Lua coroutines do not do this. Coroutines may have the type "thread", but there can only ever be a single coroutine active at once. A coroutine will run until the coroutine itself decides to stop running by issuing a coroutine.yield command. And once it yields, it will not run again until another routine issues a coroutine.resume command to that particular coroutine.
Lua coroutines provide cooperative multithreading, which is why they are called coroutines. They cooperate with each other. Only one thing runs at a time, and you only switch tasks when the tasks explicitly say to do so.
You might think that you could just create OS threads, create some coroutines in Lua, and then just resume each one in a different OS thread. This would work as long as each OS thread executed code in a different Lua instance. The Lua API is reentrant; you are allowed to call into it from different OS threads, but only if you are calling from different Lua instances. If you try to multithread through the same Lua instance, Lua will likely do unpleasant things.
All of the Lua threading modules that exist create an alternate Lua instance for each thread. Lua-lltreads just makes an entirely new Lua instance for each thread; there is no API for thread-to-thread communication beyond copying the parameters passed to the new thread. LuaLanes does provide some cross-connecting code.
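To make the "one Lua instance per OS thread" point concrete, here is a minimal C sketch: each pthread creates and owns its own lua_State, so no two threads ever touch the same instance. The number-crunching chunk is just a placeholder, and the link flags vary by distro (e.g. cc demo.c -llua -lpthread):

    #include <pthread.h>
    #include <stdio.h>
    #include <lua.h>
    #include <lauxlib.h>
    #include <lualib.h>

    static void *crunch(void *arg)
    {
        int id = *(int *)arg;
        lua_State *L = luaL_newstate();   /* private instance for this thread */
        luaL_openlibs(L);
        lua_pushinteger(L, id);
        lua_setglobal(L, "id");
        /* Each state runs its chunk in parallel with the others. */
        if (luaL_dostring(L, "local s = 0; for i = 1, 1e8 do s = s + i end; "
                             "print('thread ' .. id .. ' done: ' .. s)") != 0)
            fprintf(stderr, "%s\n", lua_tostring(L, -1));
        lua_close(L);
        return NULL;
    }

    int main(void)
    {
        pthread_t t[4];
        int ids[4] = {0, 1, 2, 3};
        for (int i = 0; i < 4; i++)
            pthread_create(&t[i], NULL, crunch, &ids[i]);
        for (int i = 0; i < 4; i++)
            pthread_join(t[i], NULL);
        return 0;
    }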
It is not possible with the core Lua libraries (unless you count creating multiple processes and communicating via input/output), but I think there are Lua bindings for the various threading libraries out there.
The answer from jpjacobs to one of the related questions links to LuaLanes, which seems to be a multi-threading library. (I have no experience with it, though.)
If you embed Lua in an application, you will usually want the multithreading tied in somehow to your application's own multithreading.
In addition to LuaLanes, take a look at llthreads
In addition to the already-suggested LuaLanes, llthreads and the other things mentioned here, there is a simpler way.
If you're on a POSIX system, try doing it the old-fashioned way with posix.fork() (from luaposix). You know: split the task into batches, fork the same number of processes as there are cores, crunch the numbers, collate the results (the underlying C pattern is sketched below).
Also, make sure you're using LuaJIT 2 to get maximum speed.
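For reference, the underlying POSIX pattern that luaposix wraps, sketched in plain C. The core count is hard-coded for brevity (a real program would query sysconf(_SC_NPROCESSORS_ONLN)), and results would normally travel back over pipes or files:

    /* Fork one worker per core, crunch a batch in each, reap the children. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static long crunch_batch(int batch)
    {
        long s = 0;
        for (long i = 0; i < 100000000L; i++)
            s += i ^ batch;  /* stand-in for real number crunching */
        return s;
    }

    int main(void)
    {
        const int ncores = 4;  /* assumption: hard-coded for the sketch */
        for (int i = 0; i < ncores; i++) {
            pid_t pid = fork();
            if (pid == 0) {                 /* child: one batch per core */
                printf("batch %d -> %ld\n", i, crunch_batch(i));
                exit(0);
            }
        }
        while (wait(NULL) > 0)              /* parent: collate/reap children */
            ;
        return 0;
    }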
It's very easy: just create multiple Lua interpreters and run Lua programs inside all of them.
Lua multithreading is a shared-nothing model. If you need to exchange data, you must serialize it into strings and pass it from one interpreter to another, via a C extension, sockets, or any other kind of IPC.
Serializing data via IPC-like transport mechanisms is not the only way to share data across threads.
If you're programming in an object-oriented language like C++, it's quite possible for multiple threads to access shared objects through object pointers. It's just not safe to do so unless you provide some kind of guarantee that no two threads will attempt to read and write the same data simultaneously.
There are many options for providing that guarantee; lock-free and wait-free mechanisms are becoming increasingly popular.
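The simplest such guarantee is a mutex around the shared data; a minimal C sketch:

    /* Two threads bump a shared counter; the mutex ensures no two of them
     * read and write it at the same time. */
    #include <pthread.h>
    #include <stdio.h>

    static long counter = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *bump(void *arg)
    {
        for (int i = 0; i < 1000000; i++) {
            pthread_mutex_lock(&lock);    /* exclusive access to counter */
            counter++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, bump, NULL);
        pthread_create(&b, NULL, bump, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("%ld\n", counter);  /* always 2000000 with the lock held */
        return 0;
    }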
I know the very basics of using coroutines as a base and implementing a toy scheduler, but I assume that's an oversimplified view of asynchronous schedulers as a whole. There is a whole set of holes in my thinking.
How do you keep the CPU from spinning while the scheduler is idle or waiting? Some fibers just sleep; others wait for input from the operating system.
You'd need to multiplex the I/O operations onto an event-based interface (select/poll), so you can leverage the OS to do the waiting while still being able to schedule other fibers. select/poll take a timeout argument: for fibers that want to sleep, you can keep a priority queue of wakeup times and use the soonest one as the select/poll timeout to emulate a sleep call.
Trying to serve fibers that perform blocking operations (read/write/sleep, etc.) directly won't work unless you schedule each fiber on a native thread, which rather defeats the purpose.
See http://swtch.com/libtask/ for a working implementation.
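To make the idea concrete, here is a C sketch (all names hypothetical) of that scheduler core: the soonest wakeup among the sleeping fibers becomes the poll() timeout, so the OS does the waiting and the process never spins. A trivially scanned array stands in for the priority queue, and the actual fiber switching is elided (see the setcontext answer below):

    #include <poll.h>
    #include <stddef.h>
    #include <time.h>

    #define MAX_SLEEPERS 64

    static double wakeups[MAX_SLEEPERS];  /* absolute wakeup times, 0 = unused */

    static double now(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    /* Milliseconds until the soonest sleeper wakes, or -1 to block forever. */
    static int next_timeout_ms(void)
    {
        double soonest = -1;
        for (int i = 0; i < MAX_SLEEPERS; i++)
            if (wakeups[i] > 0 && (soonest < 0 || wakeups[i] < soonest))
                soonest = wakeups[i];
        if (soonest < 0)
            return -1;
        double dt = soonest - now();
        return dt > 0 ? (int)(dt * 1000) : 0;
    }

    static void scheduler_iteration(struct pollfd *fds, nfds_t nfds)
    {
        /* Block in the kernel until a watched fd is ready or the next
         * sleeper is due; then resume the runnable fibers. */
        poll(fds, nfds, next_timeout_ms());
        /* ...walk fds/wakeups and switch to each runnable fiber here... */
    }

    int main(void)
    {
        wakeups[0] = now() + 0.5;       /* one fiber sleeping for 500 ms */
        scheduler_iteration(NULL, 0);   /* sleeps in poll(), not spinning */
        return 0;
    }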
You should probably take a look at the setcontext family of functions (http://en.wikipedia.org/wiki/Setcontext). This means that within your application you will need to re-implement every function that may block (read, write, sleep, etc.) in an asynchronous form that returns to the scheduler.
Only the "scheduler fiber" gets to wait for completion events using select(), poll() or epoll(). That way, when the scheduler is idle, the process sleeps inside the select/poll/epoll call and does not take up CPU.
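A minimal sketch of that mechanism using swapcontext(): a fiber yields back to the scheduler context, which is exactly the "return to the scheduler" step described above (error checking omitted for brevity):

    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t scheduler_ctx, fiber_ctx;
    static char fiber_stack[64 * 1024];

    static void fiber_main(void)
    {
        printf("fiber: running, now yielding to scheduler\n");
        swapcontext(&fiber_ctx, &scheduler_ctx);  /* yield */
        printf("fiber: resumed, finishing\n");
    }   /* returning switches to uc_link (the scheduler) */

    int main(void)
    {
        getcontext(&fiber_ctx);
        fiber_ctx.uc_stack.ss_sp = fiber_stack;
        fiber_ctx.uc_stack.ss_size = sizeof fiber_stack;
        fiber_ctx.uc_link = &scheduler_ctx;       /* where to go when it returns */
        makecontext(&fiber_ctx, fiber_main, 0);

        printf("scheduler: starting fiber\n");
        swapcontext(&scheduler_ctx, &fiber_ctx);  /* run until it yields */
        printf("scheduler: fiber yielded, resuming it\n");
        swapcontext(&scheduler_ctx, &fiber_ctx);  /* run it to completion */
        printf("scheduler: fiber finished\n");
        return 0;
    }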
Though it's a little late to answer, I'd like to mention that I have a practical implementation of a fiber library in C, called libevfibers.
Despite being a young project, it is used in production. It provides a solution not only for classical asynchronous operations like reading from and writing to a socket, but also handles filesystem I/O in a non-blocking manner. The project leverages three great libraries: libcoro, libev and libeio.
You can also manage control flow via coroutines. One library that supports creating them is Boost.Asio.
A good example is available here: Boost Stackful Coroutines
From an implementation point of view, you can start with an asynchronous event loop implementation and then implement the fiber scheduling on top of it, using the asynchronous event handlers to switch to the corresponding fiber.
A sleeping/waiting fiber just means it isn't scheduled at the moment; it has simply switched back to the event loop.
BTW, if you are looking for some actual code, have a look at http://svn.cmeerw.net/src/nginetd/trunk/ which is still a work in progress, but tries to implement a fiber scheduler on top of a multi-threaded event loop (using Win32 I/O completion ports or Linux's edge-triggered epoll).