I honestly couldn't find an answer to this seemingly simple question on the web.
How are asynchronous methods implemented? Does it involve creating a thread each time the method is called, or is there a more specialized operating system service for this?
I know that most programming languages have built-in facilities for this (such as an async keyword), but I'm asking about the underlying operating system feature that is used.
There is no special operating system feature for asynchronous methods other than basic multithreading features. A method's being asynchronous simply means that the caller can move on without waiting for the method to complete. Sometimes the caller provides a callback function for notification of when the method has completed.
Depending on the programming language, when you write an asynchronous method, you may need to write the thread handling yourself. For example, the asynchronous method may spawn a new thread to handle the request, or it may queue the request, and one or more other threads may dequeue the requests and handle them. Note that the "asynchronous" method actually involves a short stretch of synchronous code, in this case to spawn the new thread or to queue the request, with the main task executed asynchronously, typically in another thread.
Related
Is "event based" the same as "asynchronous"?
No it doesn't imply that the events are asynchronous.
In a event driven single threaded system, you can fire events, but they are all processed serially. They may yield as part of their processing, but nothing happens concurrently, if they yield, they stop processing and have to wait until they are messaged again to start processing again.
Examples of this are Swing ( Java ), Twisted ( Python ), Node.js ( JavaScript ), EventMachine ( Ruby )
All of these examples are event driven message loops, but they are all single threaded, every event will block all subsequent events on that same thread.
In programming, asynchronous events are those occurring independently of the main program flow. Asynchronous actions are actions executed in a non-blocking scheme, allowing the main program flow to continue processing.
So just because something is event driven doesn't make it asynchronous, and just because something is asynchronous doesn't make it event driven either; much less concurrent.
They are essentially orthogonal concepts.
"event driven" essentially means that the code associated to certain function calls is bind at runtime (and can change through the execution).
Who "fires" the event doesn't know what will handle it, and who handle the event is defined to respond to the event through an association defined while the program executes. Typically though function pointers, reference or pointers to object carrying virtual methods etc.)
"asynchronous" means that a program flow doesn't have to wait for a call to be executed before proceed over (mostly implemented with a call that returns immediately after delegating the execution to another thread or process)
Not all events are asynchronous (think to the windows SendMessage, respect to PostMessage), and not all asynchronous calls are necessary implemented by "events" (although the use of the event mechanism is quite common to implement asynchronous calls)
One meaning of asynchronous is that at a point where you emit an computation, you don't wait for an answer, but you get the answer later. The answer comes in orthogonal to you normal control flow.
One way the answer comes in is using events: They happen spontaneously in this case, without your code triggering them. In a handler you may process the result.
Whereas the computation and answer is connected by the point in control flow for synchronous mode, you have to do the connection yourself in asynchronous mode. For example by use of a sequence number or something.
Node.js, is an asynchronous I/O. what does this actually mean?
Is there different between I create an async function by spawning another thread to do the process?
e.g.
void asyncfuntion(){
Thread apple = new Thread(){
public void run(){
...do stuff
}
}
apple.start()
}
If there is difference, can I do a asynchronous I/O in javascript?
Asynchronous I/O
Asynchronous I/O (from Wikipedia)
Asynchronous I/O, or non-blocking I/O, is a form of input/output
processing that permits other processing to continue before the
transmission has finished.
What this means is, if a process wants to do a read() or write(), in a synchronous call, the process would have to wait until the hardware finishes the physical I/O so that it can be informed of the success/failure of the I/O operation.
On asynchronous mode, once the process issues a read/write I/O asynchronously, the system calls is returned immediately once the I/O has been passed down to the hardware or queued in the OS/VM. Thus the execution of the process isn't blocked (hence why it's called non-blocking I/O) since it doesn't need to wait for the result from the system call, it will receive the result later.
Asynchronous Function
Asynchronous functions is a function that returns the data back to the caller by a means of event handler (or callback functions). The callback function can be called at any time (depending on how long it takes the asynchronous function to complete). This is unlike the synchronous function, which will execute its instructions before returning a value.
...can I do a asynchronous I/O in java?
Yes, Java NIO provides non-blocking I/O support via Selector's. Also, Apache MINA, is a networking framework that also includes non-blocking I/O. A related SO question answers that question.
There are several great articles regarding asynchronous code in node.js:
Asynchronous Code Design with Node.js
Understanding Event-driven Programming
Understanding event loops and writing great code for Node.js
In addition to #The Elite Gentleman's answer, node doesn't spawn threads for asynchronous I/O functions. Everything under node runs in a single threaded event loop. That is why it is really important to avoid synchronized versions of some I/O functions unless absolutely necessary, like fs.readSync
You may read this excellent blog post for some insight: http://blog.mixu.net/2011/02/01/understanding-the-node-js-event-loop/
I was investigating the same question since this async IO pattern was very new to me. I found this talk on infoq.com which made me very happy. The guy explains very well where async IO actually resides (the OS -> the kernel) and how it is embedded in node.js as the primary idiom for doing IO. Enjoy!
http://www.infoq.com/presentations/Nodejs-Asynchronous-IO-for-Fun-and-Profit
node.js enables a programmer to do asynchronous IO by forcing the usage of callbacks. Now callbacks are similar to the old asynchronous functions that we have been using for a long time to handle DOM events in javascript!
e.g.
asyncIOReadFile(fileName, asynFuncReadTheActualContentsAndDoSomethingWithThem);
console.log('IDontKnowWhatIsThereInTheFileYet')
function asynFuncReadTheActualContentsAndDoSomethingWithThem(fileContents) {
console.log('Now I know The File Contents' + fileContents)
}
//Generally the output of the above program will be following
'IDontKnowWhatIsThereInTheFileYet'
//after quite a bit of time
'Now I know The File Contents somebinarystuff'
From https://en.wikipedia.org/wiki/Asynchronous_I/O
Synchronous blocking I/O
A simple approach to I/O would be to start the access and then wait for it to complete.
Would block the progress of a program while the communication is in progress, leaving system resources idle.
Asynchronous I/O
Alternatively, it is possible to start the communication and then perform processing that does not require that the I/O be completed.
Any task that depends on the I/O having completed ... still needs to wait for the I/O operation to complete, and thus is still blocked,
but other processing that does not have a dependency on the I/O operation can continue.
Example:
https://en.wikipedia.org/wiki/Node.js
Node.js has an event-driven architecture capable of asynchronous I/O
Event-driven and asynchronous are often used as synonyms. Are there any differences between the two?
Also, what is the difference between epoll and aio? How do they fit together?
Lastly, I've read many times that AIO in Linux is horribly broken. How exactly is it broken?
Thanks.
Events is one of the paradigms to achieve asynchronous execution.
But not all asynchronous systems use events. That is about semantic meaning of these two - one is super-entity of another.
epoll and aio use different metaphors:
epoll is a blocking operation (epoll_wait()) - you block the thread until some event happens and then you dispatch the event to different procedures/functions/branches in your code.
In AIO, you pass the address of your callback function (completion routine) to the system and the system calls your function when something happens.
Problem with AIO is that your callback function code runs on the system thread and so on top of the system stack. A few problems with that as you can imagine.
They are completely different things.
The events-driven paradigm means that an object called an "event" is sent to the program whenever something happens, without that "something" having to be polled in regular intervals to discover whether it has happened. That "event" may be trapped by the program to perform some actions (i.e. a "handler") -- either synchronous or asynchronous.
Therefore, handling of events can either be synchronous or asynchronous. JavaScript, for example, uses a synchronous eventing system.
Asynchronous means that actions can happen independent of the current "main" execution stream. Mind you, it does NOT mean "parallel", or "different thread". An "asynchronous" action may actually run on the main thread, blocking the "main" execution stream in the meantime. So don't confuse "asynchronous" with "multi-threading".
You may say that, technically speaking, an asynchronous operation automatically assumes eventing -- at least "completed", "faulted" or "aborted/cancelled" events (one or more of these) are sent to the instigator of the operation (or the underlying O/S itself) to signal that the operation has ceased. Thus, async is always event-driven, but not the other way round.
Event driven is a single thread where events are registered for a certain scenario. When that scenario is faced, the events are fired. However even at that time each of the events are fired in a sequential manner. There is nothing Asynchronous about it. Node.js (webserver) uses events to deal with multiple requests.
Asynchronous is basically multitasking. It can spawn off multiple threads or processes to execute a certain function. It's totally different from event driven in the sense that each thread is independent and hardly interact with the main thread in an easy responsive manner. Apache (webserver) uses multiple threads to deal with incoming requests.
Lastly, I've read many times that AIO in Linux is horribly broken. How exactly is it broken?
AIO as done via KAIO/libaio/io_submit comes with a lot of caveats and is tricky to use well if you want it to behave rather than silently blocking (e.g. only works on certain types of fd, when using files/block devices only actually works for direct I/O but those are the tip of the iceberg). It did eventually gain the ability to indicate file descriptor readiness with the 4.19 kernel) which is useful for programs using sockets.
POSIX AIO on Linux is actually a userspace threads implementation by glibc and comes with its own limitations (e.g. it's considered slow and doesn't scale well).
These days (2020) hope for doing arbitrary asynchronous I/O on Linux with less pain and tradeoffs is coming from io_uring...
I know the very basics about using coroutines as a base and implementing a toy scheduler. But I assume it's oversimplified view about asynchronous schedulers in whole. There are whole set of holes missing in my thoughts.
How to keep the cpu from running a scheduler that's running idle/waiting? Some fibers just sleep, others wait for input from operating system.
You'd need to multiplex io operations into an event based interface(select/poll), so you can leverage the OS to do the waiting, while still being able to schedule other fibers. select/poll have a timeout argument - for fibers that want to sleep, you can create a priority queue that uses that option of select/poll to emulate a sleep call.
Trying to serve fibers that does blocking operations (call read/write/sleep etc). directly won't work unless you schedule each fiber in a native thread - which kind of beats the purpose.
See http://swtch.com/libtask/ for a working implementation.
You should probably take a look at the setcontext family of functions (http://en.wikipedia.org/wiki/Setcontext). This will mean that within your application you will need to re-implement all functions that may block (read, write, sleep etc) into asynchronous forms and return to the scheduler.
Only the "scheduler fibre" will get to wait on completion events using select(), poll() or epoll(). This means when the scheduler is idle, the process will be sleeping in the select/poll/epoll call, and would not be taking up CPU.
Though it's a little bit late to answer, I'd like to mention that I have a practical implementation of a fiber library in C, called libevfibers.
Despite of being a young project, it is used in production. It provides a solution not only to classical asynchronous operations like reading/writing a socket, but also addresses the filesystem IO in a non-blocking manner. The project leverages 3 great libraries --- libcoro, libev and libeio.
You can control controlflow also via the use of coroutines. A library that supports the creation of those is BOOST.ASIO.
A good example is available here: Boost Stackful Coroutines
From an implementation point of view, you can start with an asynchronous event loop implementation. Then you can just implement the fiber scheduling on top of that by using the asynchronous event handlers to switch to the corresponding fiber.
A sleeping/waiting fiber just means that it isn't scheduled at the moment - it just switches to the event loop instead.
BTW, if you are looking for some actual code, have a look at http://svn.cmeerw.net/src/nginetd/trunk/ which is still work in progress, but tries to implement a fiber scheduler on top of a multi-threaded event loop (with Win32 I/O completion ports or Linux's edge-triggered epoll).
Does an asynchronous call always create a new thread? What is the difference between the two?
Does an asynchronous call always create or use a new thread?
Wikipedia says:
In computer programming, asynchronous events are those occurring independently of the main program flow. Asynchronous actions are actions executed in a non-blocking scheme, allowing the main program flow to continue processing.
I know async calls can be done on single threads? How is this possible?
Whenever the operation that needs to happen asynchronously does not require the CPU to do work, that operation can be done without spawning another thread. For example, if the async operation is I/O, the CPU does not have to wait for the I/O to complete. It just needs to start the operation, and can then move on to other work while the I/O hardware (disk controller, network interface, etc.) does the I/O work. The hardware lets the CPU know when it's finished by interrupting the CPU, and the OS then delivers the event to your application.
Frequently higher-level abstractions and APIs don't expose the underlying asynchronous API's available from the OS and the underlying hardware. In those cases it's usually easier to create threads to do asynchronous operations, even if the spawned thread is just waiting on an I/O operation.
If the asynchronous operation requires the CPU to do work, then generally that operation has to happen in another thread in order for it to be truly asynchronous. Even then, it will really only be asynchronous if there is more than one execution unit.
This question is darn near too general to answer.
In the general case, an asynchronous call does not necessarily create a new thread. That's one way to implement it, with a pre-existing thread pool or external process being other ways. It depends heavily on language, object model (if any), and run time environment.
Asynchronous just means the calling thread doesn't sit and wait for the response, nor does the asynchronous activity happen in the calling thread.
Beyond that, you're going to need to get more specific.
No, asynchronous calls do not always involve threads.
They typically do start some sort of operation which continues in parallel with the caller. But that operation might be handled by another process, by the OS, by other hardware (like a disk controller), by some other computer on the network, or by a human being. Threads aren't the only way to get things done in parallel.
JavaScript is single-threaded and asynchronous. When you use XmlHttpRequest, for example, you provide it with a callback function that will be executed asynchronously when the response returns.
John Resig has a good explanation of the related issue of how timers work in JavaScript.
Multi threading refers to more than one operation happening in the same process. While async programming spreads across processes. For example if my operations calls a web service, The thread need not wait till the web service returns. Here we use async programming which allows the thread not wait for a process in another machine to complete. And when it starts getting response from the webservice it can interrupt the main thread to say that web service has completed processing the request. Now the main thread can process the result.
Windows always had asynchronous processing since the non preemptive times (versions 2.13, 3.0, 3.1, etc) using the message loop, way before supporting real threads. So to answer your question, no, it is not necessary to create a thread to perform asynchronous processing.
Asynchronous calls don't even need to occur on the same system/device as the one invoking the call. So if the question is, does an asynchronous call require a thread in the current process, the answer is no. However, there must be a thread of execution somewhere processing the asynchronous request.
Thread of execution is a vague term. In a cooperative tasking systems such as the early Macintosh and Windows OS'es, the thread of execution could simply be the same process that made the request running another stack, instruction pointer, etc... However, when people generally talk about asynchronous calls, they typically mean calls that are handled by another thread if it is intra-process (i.e. within the same process) or by another process if it is inter-process.
Note that inter-process (or interprocess) communication (IPC) is commonly generalized to include intra-process communication, since the techniques for locking, and synchronizing data are usually the same regardless of what process the separate threads of execution run in.
Some systems allow you to take advantage of the concurrency in the kernel for some facilities using callbacks. For a rather obscure instance, asynchronous IO callbacks were used to implement non-blocking internet severs back in the no-preemptive multitasking days of Mac System 6-8.
This way you have concurrent execution streams "in" you program without threads as such.
Asynchronous just means that you don't block your program waiting for something (function call, device, etc.) to finish. It can be implemented in a separate thread, but it is also common to use a dedicated thread for synchronous tasks and communicate via some kind of event system and thus achieve asynchronous-like behavior.
There are examples of single-threaded asynchronous programs. Something like:
...do something
...send some async request
while (not done)
...do something else
...do async check for results
The nature of asynchronous calls is such that, if you want the application to continue running while the call is in progress, you will either need to spawn a new thread, or at least utilise another thread you that you have created solely for the purposes of handling asynchronous callbacks.
Sometimes, depending on the situation, you may want to invoke an asynchronous method but make it appear to the user to be be synchronous (i.e. block until the asynchronous method has signalled that it is complete). This can be achieved through Win32 APIs such as WaitForSingleObject.