Do 'asynchronous', 'non-blocking', and 'concurrent' imply one another? - multithreading

Here are the definitions by Wikipedia:
Asynchrony, in computer programming, refers to the occurrence of events independently of the main program flow and ways to deal with such events. These may be "outside" events such as the arrival of signals, or actions instigated by a program that take place concurrently with program execution, without the program blocking to wait for results.
And:
Concurrent computing is a form of computing in which several computations are executed during overlapping time periods—concurrently—instead of sequentially (one completing before the next starts).
In the context of single-threaded computation, do 'asynchronous', 'non-blocking', and 'concurrent' imply one another?
If not, could you give me a counter-example?
Note that I have excluded the word 'parallel' as it implies multiple threads.

Non-blocking operations are based on two approaches:
by simply returning without data (when no data is available - in such cases the caller has to "come back" by itself and "read" again)
by using callbacks. In that context "blocking" means that you wait for an operation to reach a certain state - whereas "non-blocking" means that you trigger the operation - and when that state is reached, you are notified.
Please note: neither option implies concurrency or multiple threads on the client side. You can absolutely implement such a system using a single process (think coroutines or node.js, for example).
In that sense, a non-blocking operation is always asynchronous, as you don't know when it will have results for you, or when it will call you back. Both concepts can be implemented using concurrency, but there is no absolute need to do it that way.
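As a rough illustration of the first approach (the call returns immediately and the caller has to come back), here is a hedged Python sketch using a non-blocking socket; the host, request, and busy-wait loop are only placeholders for the idea, not a recommended pattern.

```python
import socket

# A minimal sketch of approach 1: recv() returns immediately,
# and the caller has to "come back" and try again later.
sock = socket.create_connection(("example.com", 80))
sock.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
sock.setblocking(False)                      # from here on, recv() never waits

chunks = []
while True:
    try:
        chunk = sock.recv(4096)              # returns at once, never blocks
        if not chunk:                        # empty bytes -> peer closed the connection
            break
        chunks.append(chunk)
    except BlockingIOError:
        # No data available right now; do other useful work and come back later.
        # (A real program would use select()/poll() instead of spinning.)
        pass

print(len(b"".join(chunks)), "bytes received")
```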

Non-blocking and concurrent don't really apply to single threaded programs, due to the fact that they refer to ways of managing multiple threads. Non-blocking means that a program doesn't wait for all threads to finish before moving on, and concurrent computation can only happen if you have multiple threads doing the calculation. (Someone please correct me if I'm wrong.)
Asynchrony is the only term that applies to single threaded programming, in the form of human input, communication with other programs, etc. Because of this, no, they don't imply each other in the context of single threaded programs.

Related

Dart is Single Threaded but why it uses Future Objects and perform asynchronous operations

In the documentation, Dart is single threaded, but to perform two operations at a time we use Future objects, which work much like threads.
Use Future objects (futures) to perform asynchronous operations.
If Dart is single threaded, then why does it allow us to perform asynchronous operations?
Note: Asynchronous operations are parallel operations which are called threads
You mentioned that :
Asynchronous operations are parallel operations which are called threads
First of all, asynchronous operations are not exactly parallel or even concurrent. It simply means that we do not want to block our flow of execution (thread) or wait for a response until certain work is done. The way we implement asynchronous operations, however, decides whether they end up parallel or concurrent.
Parallelism vs concurrency?
Parallelism is actually doing lots of things simultaneously, at the same time. Example: you are walking and at the same time you're digesting your food. Both tasks run completely in parallel, at exactly the same time.
While
Concurrency is the illusion of parallelism. Tasks seem to be executed in parallel, but they aren't. It's like handling lots of things at a time but only doing one task at a specific moment. Example: you are walking and suddenly stop to tie your shoe lace. After tying your shoe lace you start walking again.
Now coming to Dart, Future objects along with the async and await keywords are used to perform asynchronous tasks. Here asynchronous doesn't mean that tasks will be executed in parallel or concurrently with each other. Instead, in Dart even an asynchronous task is executed on the same thread, which means that while we wait for another task to be completed, we continue executing our synchronous code. Future objects are used to represent the result of a task which will be done at some time in the future.
If you really want to execute your task concurrently, then consider using Isolates, which run in a separate thread and don't share memory with the main (spawning) thread.
Why? Because it is a necessity. Some operations, like http requests or timers, are asynchronous in nature.
There are isolates, which allow you to execute code in a different process. The difference from threads in other programming languages is that isolates do not share memory with each other (which would lead to concurrency issues); they only communicate through messages.
To receive these messages (or, wrapped in a Future, their results), Dart uses an event loop.
The Event Loop and Dart
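For readers more comfortable with Python, here is a hedged sketch of the same single-threaded event-loop idea using asyncio; nothing Dart-specific is implied, it just shows that awaiting a future does not create a thread.

```python
import asyncio

# One thread, many pending operations: the event loop interleaves the coroutines.

async def fetch(name: str, delay: float) -> str:
    # Simulates an I/O-bound operation (e.g. an HTTP request or a timer).
    await asyncio.sleep(delay)          # yields control back to the event loop
    return f"{name} finished after {delay}s"

async def main() -> None:
    # Both "requests" are in flight at once, yet only one thread exists.
    results = await asyncio.gather(fetch("a", 1.0), fetch("b", 1.0))
    print(results)                      # completes in ~1s total, not 2s

asyncio.run(main())
```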
Are Futures in Dart threads?
Dart is single threaded, but it can call native code (like C/C++) to perform asynchronous operations, which can introduce new threads.
In Flutter, the Flutter engine is implemented in C++ and provides the low-level implementation of Flutter's core API, including asynchronous tasks like file and network I/O, through new threads underneath.
Like Dart, JavaScript is also single threaded. I find this video very helpful for understanding the "single threaded" idea: what the heck is the event loop
Here are a few notes:
Asynchronous doesn't mean multi-threaded. It means the code is not run at the same time. Usually asynchronous just means that it is scheduled to be run on the same thread (Isolate) after other tasks have finished.
Dart isn't actually single threaded. You can create another thread by creating another Isolate. However, within an Isolate the Dart code runs on a single thread and separate Isolates don't share memory. They can only communicate by messages.
A Future says that a value (or an error) will be returned at some point in the future. It doesn't say which thread the work is done on. Most futures are done on the current Isolate, but some futures (IO, for example) can be done on separate threads.
See this answer for links to more resources.
I have an article explaining this https://medium.com/#truongsinh/flutter-dart-async-concurrency-demystify-1cc739aaae57
In short, Flutter/Dart is not technically single-threaded, even though Dart code is executed in a single thread. Dart is a concurrent language with message passing pattern, that can take full advantage of modern multi-core architecture, without worrying about lock or mutex. Blocking in Dart can be either I/O-bound or CPU-bound, which should be solved, respectively, by Future and Dart’s Isolate/Flutter’s compute.
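As a hedged, non-Dart illustration of that I/O-bound vs CPU-bound split, the Python sketch below keeps I/O-bound waiting on the event loop (the rough analogue of a Future) and ships CPU-bound work to another process (the rough analogue of an Isolate or Flutter's compute); fib is just an arbitrary stand-in for heavy computation.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def fib(n: int) -> int:
    # Deliberately slow, CPU-bound work (stand-in for parsing, image decoding, ...).
    return n if n < 2 else fib(n - 1) + fib(n - 2)

async def main() -> None:
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # CPU-bound: run in a separate process so the event loop stays responsive
        # (roughly what sending work to an Isolate achieves).
        cpu_task = loop.run_in_executor(pool, fib, 32)
        # I/O-bound: just await it on the event loop (roughly what a Future covers).
        io_task = asyncio.sleep(1.0, result="I/O done")
        print(await asyncio.gather(cpu_task, io_task))

if __name__ == "__main__":
    asyncio.run(main())
```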

Grayzone between blocking and non-blocking I/O?

I am familiar with programming according to the two paradigms, blocking and non-blocking, on the JVM (Java/nio, Scala/Akka).
However, I see a kind of grayzone in between that confuses me.
Look at any non-blocking program of your choice: it is full of blocking statements!
For example, each assignment of a variable is a blocking operation that waits for CPU-registers and memory-reads to succeed.
Moreover, non-blocking programs even contain blocking statements that carry out computations on complex in-memory-collections, without violating the non-blocking paradigm.
In contrast to that, the non-blocking paradigm would clearly be violated if we called some external web service in a blocking way to receive its result.
But what is in between these extremes? What about reading/writing a tiny file, a local socket, or making an API call to an embedded data storage engine (such as SQLite, RocksDB, etc.)? Is it ok to do blocking reads/writes to these APIs? They usually give strong timing guarantees in practice (say << 1 ms as long as the OS is not stalled), so there is almost no practical difference from pure in-memory access. As a precise example: is calling RocksDB's get/put within an Akka Actor considered inadvisable blocking I/O?
So, my question is whether there are rules of thumb or precise criteria that help me in deciding whether I may stick to a simple blocking statement in my non-blocking program, or whether I shall wrap such a statement into non-blocking boilerplate (framework-depending, e.g., outsourcing such calls to a separate thread-pool, nesting one step deeper in a Future or Monad, etc.).
for example, each assignment of a variable is a blocking operation that waits for CPU-registers and memory-reads to succeed
That's not really what is considered "blocking". Those operations are constant time, and that constant is very low (a few cycles in general) compared to the latency of any IO operations (anywhere between thousands and billions of cycles) - except for page faults due to swapped memory, but if those happen regularly you have a problem anyway.
And if we want to get all nitpicky, individual instructions do not fully block a CPU thread as modern CPUs can reorder instructions and execute ones that have no data dependencies out of order while waiting for memory/caches or other more expensive instructions to finish.
Moreover, non-blocking programs even contain blocking statements that carry out computations on complex in-memory-collections, without violating the non-blocking paradigm.
Those are not considered as blocking the CPU from doing work. They should not even block user interactivity if they are correctly designed to present the results to the user when they are done without blocking the UI.
Is it ok to do blocking reads/writes to these APIs?
That always depends on why you are using non-blocking approaches in the first place. What problem are you trying to solve? Maybe one API warrants a non-blocking approach while the other does not.
For example, most file IO methods are nominally blocking, but writes without fsync can be very cheap, especially if you're not writing to spinning rust, so it can be overkill to avoid those methods on your compute threadpool. On the other hand, one usually does not want to block a thread in a fixed threadpool while waiting for a multi-second database query.
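To make that rule of thumb concrete, here is a hedged Python sketch (rather than Akka): a cheap, bounded-latency lookup is called inline on the event loop, while the slow query is pushed onto a dedicated pool. fast_local_get and slow_remote_query are hypothetical placeholders, not real APIs.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical placeholders: a very cheap local lookup vs. a slow remote query.
def fast_local_get(key: str) -> str:
    return f"value-for-{key}"            # e.g. an embedded-store read, << 1 ms

def slow_remote_query(sql: str) -> str:
    time.sleep(2)                        # e.g. a multi-second database query
    return "rows"

blocking_pool = ThreadPoolExecutor(max_workers=4)   # dedicated pool for slow blocking calls

async def handler() -> None:
    loop = asyncio.get_running_loop()
    # Cheap, bounded-latency call: often acceptable to run inline on the event loop.
    value = fast_local_get("user:42")
    # Slow, unbounded call: keep it off the event loop / fixed-size dispatcher.
    rows = await loop.run_in_executor(blocking_pool, slow_remote_query, "SELECT ...")
    print(value, rows)

asyncio.run(handler())
```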

What really is asynchronous computing?

I've been reading (and working) quite a bit with massively multi-threaded applications, and with IO, and I've found that the term asynchronous has become some sort of catch-all for multiple vague ideas. I'm wondering if I understand it correctly. The way I see it is that there are two main branches of "asynchronicity".
Asynchronous I/O. Such as network read/write. What this really boils down to is efficient parallel processing between multiple CPUs, such as your main CPU and your NIC CPU. The idea is to have multiple processors running in parallel, exchanging data, without blocking while waiting for the other to finish and return the results of its job.
Minimizing context-switching penalties by minimizing use of threads. This seems to be what the .NET framework is focusing on with its async/await features. Instead of spawning/closing/blocking threads, break parallel jobs into tasks, and use a software task scheduler to keep a pool of threads as busy as possible without resorting to spawning new threads.
These seem like two entirely separate concepts with no similarities that could tie them together, but are both referred to by the same "asynchronous computing" vocabulary.
Am I understanding all of this correctly?
Asynchronous basically means not blocking, i.e. not having to wait for an operation to complete.
Threads are just one way of accomplishing that. There are many ways of doing this, from the hardware level to the OS level to the software level.
Someone with more experience than me can give examples of asynchronicity not related to threads.
What this really boils down to is efficient parallel processing between multiple CPUs, such as your main CPU and your NIC CPU. The idea is to have multiple processors running in parallel...
Asynchronous programming is not all about multi-core CPUs and parallelism: consider a single-core CPU with just one thread that creates email messages and sends them. In a synchronous fashion, it would spend a few microseconds to create the message, a lot more time to send it through the network, and only then create the next message. But in an asynchronous program, the thread could create a new message while the previous one is being sent through the network. One implementation of that kind of program can use the .NET async/await feature, where you can have just one thread. But even a blocking IO program could be considered asynchronous: if the main thread creates the messages and queues them in a buffer, and another thread pulls them from the buffer and sends them using blocking IO, then from the main thread's point of view it's completely async.
.NET async/await just uses the OS APIs, which are already async: reading/writing a file, sending/receiving data through the network - they are all async anyway; the OS doesn't block on them (the drivers themselves are async).
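A minimal sketch of that queue-based pattern in Python, assuming a hypothetical send_over_network function: the main thread only enqueues messages, so from its point of view sending is asynchronous even though the worker thread uses plain blocking calls.

```python
import queue
import threading
import time

outbox = queue.Queue()

def send_over_network(message: str) -> None:
    time.sleep(0.5)                      # hypothetical blocking network send

def sender_worker() -> None:
    while True:
        message = outbox.get()           # blocks, but only this worker thread waits
        if message is None:
            break                        # sentinel: no more messages
        send_over_network(message)

worker = threading.Thread(target=sender_worker, daemon=True)
worker.start()

for i in range(3):
    outbox.put(f"message {i}")           # returns immediately: async from the main thread's view
    print(f"queued message {i}; main thread keeps creating the next one")

outbox.put(None)                         # tell the worker to stop
worker.join()
```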
Asynchronous is a general term, which does not have widely accepted meaning. Different domains have different meanings to it.
For instance, async IO means that instead of blocking on an IO call, something else happens. That something else can be very different things, but it usually involves some sort of notification of call completion. Details might differ: for instance, the notification might be built into the call itself, like in MS Completion Ports (if memory serves). Or it can be something you check before you make the call, so that the call cannot block - this is what poll() and friends do.
Async might also simply mean parallel execution. For instance, one might say that 'the database is updated asynchronously', meaning that there is a dedicated thread which handles database connectivity, and that thread does not slow down the main processing thread.
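For the poll()-style "check before you call" variant, here is a hedged sketch using Python's selectors module (which wraps poll/epoll/kqueue underneath); the tiny echo server is only scaffolding for the readiness idea.

```python
import selectors
import socket

# Readiness-style non-blocking I/O: ask the OS which sockets can be read
# *before* calling accept()/recv(), so those calls themselves never block.
sel = selectors.DefaultSelector()

server = socket.socket()
server.bind(("localhost", 9000))
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ)

while True:
    for key, _mask in sel.select(timeout=1.0):    # poll(): which sockets are ready?
        if key.fileobj is server:
            conn, _addr = server.accept()         # guaranteed not to block now
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)
        else:
            data = key.fileobj.recv(4096)         # also guaranteed not to block
            if data:
                key.fileobj.sendall(data)         # echo back (small writes only in this sketch)
            else:
                sel.unregister(key.fileobj)
                key.fileobj.close()
```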

How to articulate the difference between asynchronous and parallel programming?

Many platforms promote asynchrony and parallelism as means for improving responsiveness. I understand the difference generally, but often find it difficult to articulate in my own mind, as well as for others.
I am a workaday programmer and use async & callbacks fairly often. Parallelism feels exotic.
But I feel like they are easily conflated, especially at the language design level. Would love a clear description of how they relate (or don't), and the classes of programs where each is best applied.
When you run something asynchronously it means it is non-blocking, you execute it without waiting for it to complete and carry on with other things. Parallelism means to run multiple things at the same time, in parallel. Parallelism works well when you can separate tasks into independent pieces of work.
Take for example rendering frames of a 3D animation. To render the animation takes a long time so if you were to launch that render from within your animation editing software you would make sure it was running asynchronously so it didn't lock up your UI and you could continue doing other things. Now, each frame of that animation can also be considered as an individual task. If we have multiple CPUs/Cores or multiple machines available, we can render multiple frames in parallel to speed up the overall workload.
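A hedged Python sketch of that rendering example, with render_frame as a hypothetical stand-in for the real renderer: submitting the jobs returns immediately (so the editor stays responsive), and the pool renders the independent frames in parallel.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed

def render_frame(frame_number: int) -> str:
    # Hypothetical stand-in for an expensive, independent rendering job.
    return f"frame-{frame_number}.png"

def main() -> None:
    with ProcessPoolExecutor() as pool:                    # one worker per CPU core by default
        futures = [pool.submit(render_frame, n) for n in range(8)]
        # Submitting returned immediately; the UI/main loop could keep running here.
        for future in as_completed(futures):               # collect frames as they finish
            print("finished", future.result())

if __name__ == "__main__":
    main()
```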
I believe the main distinction is between concurrency and parallelism.
Async and Callbacks are generally a way (tool or mechanism) to express concurrency i.e. a set of entities possibly talking to each other and sharing resources.
In the case of async or callbacks, communication is implicit, while sharing of resources is optional (consider RMI, where results are computed on a remote machine).
As correctly noted this is usually done with responsiveness in mind; to not wait for long latency events.
Parallel programming usually has throughput as its main objective, while latency, i.e. the completion time for a single element, might be worse than in an equivalent sequential program.
To better understand the distinction between concurrency and parallelism I am going to quote from Probabilistic models for concurrency by Daniele Varacca, which is a good set of notes on the theory of concurrency:
A model of computation is a model for concurrency when it is able to represent systems as composed of independent autonomous components, possibly communicating with each other. The notion of concurrency should not be confused with the notion of parallelism. Parallel computations usually involve a central control which distributes the work among several processors. In concurrency we stress the independence of the components, and the fact that they communicate with each other. Parallelism is like ancient Egypt, where the Pharaoh decides and the slaves work. Concurrency is like modern Italy, where everybody does what they want, and all use mobile phones.
In conclusion, parallel programming is somewhat a special case of concurrency where separate entities collaborate to obtain high performance and throughput (generally).
Async and Callbacks are just a mechanism that allows the programmer to express concurrency.
Consider that well-known parallel programming design patterns such as master/worker or map/reduce are implemented by frameworks that use such lower level mechanisms (async) to implement more complex centralized interactions.
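As a small hedged illustration in Python: a master/worker (map/reduce-style) computation built directly on the lower-level future mechanism an executor exposes.

```python
from concurrent.futures import ThreadPoolExecutor

def worker(chunk):
    return sum(chunk)                          # each worker handles one chunk

data = list(range(1_000))
chunks = [data[i:i + 100] for i in range(0, len(data), 100)]

with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = pool.map(worker, chunks)    # "master" distributes chunks to workers
    print("total:", sum(partial_sums))         # reduce step combines the partial results
```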
This article explains it very well: http://urda.cc/blog/2010/10/04/asynchronous-versus-parallel-programming
It has this about asynchronous programming:
Asynchronous calls are used to prevent “blocking” within an application. [Such a] call will spin-off in an already existing thread (such as an I/O thread) and do its task when it can.
this about parallel programming:
In parallel programming you still break up work or tasks, but the key difference is that you spin up new threads for each chunk of work
and this in summary:
asynchronous calls will use threads already in use by the system, while parallel programming requires the developer to break the work up and to spin up and tear down the threads needed.
async: Do this by yourself somewhere else and notify me when you complete (callback). In the meantime I can continue doing my own thing.
parallel: Hire as many guys (threads) as you wish, split the job among them to complete it more quickly, and let me know (callback) when you're done. In the meantime I might continue with my other stuff.
The main difference is that parallelism mostly depends on hardware.
My basic understanding is:
Asynchronous programming solves the problem of waiting around for an expensive operation to complete before you can do anything else. If you can get other stuff done while you're waiting for the operation to complete then that's a good thing. Example: keeping a UI running while you go and retrieve more data from a web service.
Parallel programming is related but is more concerned with breaking a large task into smaller chunks that can be computed at the same time. The results of the smaller chunks can then be combined to produce the overall result. Example: ray-tracing where the colour of individual pixels is essentially independent.
It's probably more complicated than that, but I think that's the basic distinction.
I tend to think of the difference in these terms:
Asynchronous: Go away and do this task; when you're finished, come back, tell me, and bring the results. I'll be getting on with other things in the meantime.
Parallel: I want you to do this task. If it makes it easier, get some folks in to help. This is urgent though, so I'll wait here until you come back with the results. I can do nothing else until you come back.
Of course an asynchronous task might make use of parallelism, but the differentiation - to my mind at least - is whether you get on with other things while the operation is being carried out or if you stop everything completely until the results are in.
It is a question of order of execution.
If A is asynchronous with B, then I cannot predict beforehand when subparts of A will happen with respect to subparts of B.
If A is parallel with B, then things in A are happening at the same time as things in B. However, an order of execution may still be defined.
Perhaps the difficulty is that the word asynchronous is equivocal.
I execute an asynchronous task when I tell my butler to run to the store for more wine and cheese, and then forget about him and work on my novel until he knocks on the study door again. Parallelism is happening here, but the butler and I are engaged in fundamentally different tasks and of different social classes, so we don't apply that label here.
My team of maids is working in parallel when each of them is washing a different window.
My race car support team is asynchronously parallel in that each crew member works on a different tire, and they don't need to communicate with each other or manage shared resources while they do their job.
My football (aka soccer) team does parallel work as each player independently processes information about the field and moves about on it, but they are not fully asynchronous because they must communicate and respond to the communication of others.
My marching band is also parallel as each player reads music and controls their instrument, but they are highly synchronous: they play and march in time to each other.
A cammed gatling gun could be considered parallel, but everything is 100% synchronous, so it is as though one process is moving forward.
Why asynchronous?
Today's applications are growing more and more connected and also have potentially long-running tasks or blocking operations such as network I/O or database operations. So it's very important to hide the latency of these operations by starting them in the background and returning to the user interface as quickly as possible. This is where asynchrony comes into the picture: responsiveness.
Why parallel programming?
Today's data sets are growing larger and computations are growing more complex, so it's very important to reduce the execution time of these CPU-bound operations by dividing the workload into chunks and then executing those chunks simultaneously. We can call this "parallel".
Obviously it gives our application higher performance.
Asynchronous
Let's say you are the point of contact for your client and you need to be responsive, i.e. you need to share status, the complexity of the operation, the resources required, etc. whenever asked. Now you have a time-consuming operation to be done and hence cannot take it up yourself, as you need to be responsive to the client 24/7. So you delegate the time-consuming operation to someone else so that you can remain responsive. This is asynchronous.
Parallel programming
Let's say you have a task to read, say, 100 lines from a text file, and reading one line takes 1 second. Hence, you'll require 100 seconds to read the text file. Now you're worried that the client must wait for 100 seconds for the operation to finish. Hence you create 9 more clones and make each of them read 10 lines from the text file. Now the time taken is only 10 seconds to read 100 lines. Hence you have better performance.
To sum up, asynchronous coding is done to achieve responsiveness and parallel programming is done for performance.
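A hedged Python sketch of that clone example, with read_lines as a hypothetical stand-in for the 1-second-per-line reads: ten workers each take a tenth of the work, so the wall-clock time drops roughly tenfold under those assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def read_lines(start: int, count: int):
    # Hypothetical stand-in: pretend each line takes ~1 second to read.
    lines = []
    for i in range(start, start + count):
        time.sleep(1)
        lines.append(f"line {i}")
    return lines

with ThreadPoolExecutor(max_workers=10) as pool:     # the "9 more clones" plus the original
    jobs = [pool.submit(read_lines, start, 10) for start in range(0, 100, 10)]
    all_lines = [line for job in jobs for line in job.result()]

print(len(all_lines), "lines read in roughly 10 seconds instead of 100")
```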
Asynchronous: running a method or task in the background, without blocking. It may not necessarily run on a separate thread. It uses context switching / time scheduling.
Parallel tasks: each task runs in parallel. They do not use context switching / time scheduling.
I came here fairly comfortable with the two concepts, but with something not clear to me about them.
After reading through some of the answers, I think I have a correct and helpful metaphor to describe the difference.
If you think of your individual lines of code as separate but ordered playing cards (stop me if I am explaining how old-school punch cards work), then for each separate procedure written, you will have a unique stack of cards (don't copy & paste!), and the difference between running code normally and running it asynchronously depends on whether you care or not.
When you run the code, you hand the OS a set of single operations (that your compiler or interpreter broke your "higher" level code into) to be passed to the processor. With one processor, only one line of code can be executed at any one time. So, in order to accomplish the illusion of running multiple processes at the same time, the OS uses a technique in which it sends the processor only a few lines from a given process at a time, switching between all the processes according to how it sees fit. The result is multiple processes showing progress to the end user at what seems to be the same time.
For our metaphor, the relationship is that the OS always shuffles the cards before sending them to the processor. If your stack of cards doesn't depend on another stack, you don't notice that your stack stopped getting selected from while another stack became active. So if you don't care, it doesn't matter.
However, if you do care (e.g., there are multiple processes - or stacks of cards - that do depend on each other), then the OS's shuffling will screw up your results.
Writing asynchronous code requires handling the dependencies between the order of execution regardless of what that ordering ends up being. This is why constructs like "call-backs" are used. They say to the processor, "the next thing to do is tell the other stack what we did". By using such tools, you can be assured that the other stack gets notified before it allows the OS to run any more of its instructions. ("If called_back == false: send(no_operation)" - not sure if this is actually how it is implemented, but logically, I think it is consistent.)
For parallel processes, the difference is that you have two stacks that don't care about each other and two workers to process them. At the end of the day, you may need to combine the results from the two stacks, which would then be a matter of synchronicity but, for execution, you don't care again.
Not sure if this helps, but I always find multiple explanations helpful. Also, note that asynchronous execution is not constrained to an individual computer and its processors. Generally speaking, it deals with time, or (even more generally speaking) an order of events. So if you send dependent stack A to network node X and its coupled stack B to Y, the correct asynchronous code should be able to account for the situation as if it were running locally on your laptop.
Generally, there are only two ways you can do more than one thing at a time. One is asynchronous, the other is parallel.
At a high level, the popular server NGINX and the well-known Python library Tornado both fully exploit the asynchronous paradigm: a single-threaded server can simultaneously serve thousands of clients (using an IOLoop and callbacks). They rely on exceptional control flow to implement the asynchronous programming paradigm. So asynchronous code doesn't always do things simultaneously, but for I/O-bound work asynchrony can really improve performance.
The parallel paradigm refers to multi-threading and multiprocessing. These can fully utilize multi-core processors and really do things simultaneously.
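A hedged Python sketch of that single-threaded asynchronous style, with asyncio standing in for Tornado's IOLoop: one thread and one event loop handle many concurrent connections.

```python
import asyncio

# One thread, one event loop, potentially thousands of concurrent clients:
# each connection is a coroutine that yields while waiting on I/O.

async def handle_client(reader: asyncio.StreamReader,
                        writer: asyncio.StreamWriter) -> None:
    data = await reader.readline()        # waiting here does not block other clients
    writer.write(b"echo: " + data)
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def main() -> None:
    server = await asyncio.start_server(handle_client, "127.0.0.1", 8888)
    async with server:
        await server.serve_forever()

asyncio.run(main())
```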
Summary of all the above answers
Parallel computing:
▪ solves the throughput problem; it is concerned with breaking a large task into smaller chunks
▪ is machine related (multiple machines/cores/CPUs/processors needed), e.g. master/slave, map/reduce. Parallel computations usually involve a central control which distributes the work among several processors.
Asynchronous:
▪ solves the latency problem, i.e. the problem of 'waiting around' for an expensive operation to complete before you can do anything else
▪ is thread related (multiple threads needed); threading (using Thread, Runnable, Executor) is one fundamental way to perform asynchronous operations in Java

Concurrent execution/Re-entrant /ThreadSafe/?

I have read many answers given here for questions related to thread safety and re-entrancy, but when I think about them, some more questions come to mind, hence these questions.
1.) I have one executable program, say some *.exe. If I run this program on a command prompt, and while it is executing I run the same program on another command prompt, then under what conditions could the results be corrupted? I.e., should the code of this program be re-entrant, or should it be thread safe alone?
2.) While defining re-entrancy, we say that the routine can be re-entered while it is already running. In what situations can the function be re-entered (apart from being a recursive routine; I am not talking about recursive execution here)? There has to be some thread to execute the same code again, or how else can that function be entered again?
3.) In a practical case, will two threads execute the same code, i.e. perform the same functionality? I thought the idea of multi-threading is to execute different functionality concurrently (on different cores/processors).
Sorry if these queries seem disparate, but they all occurred to me at the same time, when I read the thread-safe vs. re-entrant post on SO, hence I put them together.
Any pointers, reading material will be appreciated.
thanks,
-AD.
I'll try to explain these, in order:
Each program runs in its own process, and gets its own isolated memory space. You don't have to worry about thread safety in this situation. (However, if the processes are both accessing some other shared resource, such as a file, you may have different issues. For example, process 1 may "lock" the data file, preventing process 2 from being able to open it).
The idea here is that two threads may try to run the same routine at the same time. This is not always valid - it takes special care to define a class or a process in a way that multiple threads can use the same instance of the same class, or the same static function, without errors occurring. This typically requires synchronization in the class.
Two threads often execute the same code. There are two different conceptual ways to partition your work when threading. You can either think in terms of tasks - i.e. one thread does task A while another does task B. Alternatively, you can think in terms of decomposing the problem based on data. In this case, you work with a large collection, and each element is processed using the same routine, but the processing happens in parallel. For more info, you can read this blog post I wrote on Decomposition for Parallelism.
Two processes cannot share memory. So thread-safety is moot here.
Re-entrancy means that a method can be safely executed by two threads at the same time. This doesn't require recursion - threads are separate units of execution, and there is nothing keeping them both from attempting to run the same method simultaneously.
The benefits of threading can come in two ways. One is when you perform different types of operations concurrently (like running CPU-intensive code and I/O-intensive code at the same time). The other is when you can divide a long-running operation among multiple processors. In this latter case, two threads may be executing the same function at the same time on different input data sets.
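A hedged Python sketch of that latter case: several threads run the same function on different slices of the input, and a lock keeps the shared accumulation thread-safe.

```python
import threading

total = 0
total_lock = threading.Lock()

def sum_slice(numbers) -> None:
    # The same function, executed simultaneously by several threads
    # on different input data; the lock protects the shared result.
    global total
    partial = sum(numbers)
    with total_lock:
        total += partial

data = list(range(1_000))
threads = [threading.Thread(target=sum_slice, args=(data[i::4],)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("total:", total)
```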
First of all, I strongly suggest you look at some basics of computer systems, especially how a process/thread executes on the CPU and is scheduled by the operating system: for example, virtual addresses, context switching, and process/thread concepts (e.g., each thread has its own stack and registers while the heap is shared by threads; a thread is an execution and scheduling unit, so it maintains the control flow of the code), and so on. All of your questions are related to understanding how your program actually works on the CPU.
1) and 2) are already answered.
3) Multithreading is just the concurrent execution of arbitrary threads. The same code can be executed by multiple threads. These threads can share some data and can even create data races, which are very hard to find. Of course, many times threads execute separate code (we call this thread-level parallelism).
In this context, I have used concurrent with two meanings: (a) on a single processor, multiple threads share one physical processor, but the operating system gives a sort of illusion that the threads are running concurrently; (b) on a multicore, yes, physically two or more threads can be executed concurrently.
Gaining a concrete understanding of concurrent/parallel execution takes quite a long time. But you already have a solid understanding!
