What are the best resources to learn Express.js? Can anybody explain the Node.js framework and how exactly it works, in particular the non-blocking event loop concept?
I've found the Express website explains things pretty well, and Express to be quite approachable for new users.
A multi-threaded system (Java and its underlying JVM, for instance) contains many threads of execution. These may each execute their own instructions simultaneously (on a multi-core CPU), or the OS may switch between them, letting each thread run for a scheduled period of time before scheduling the next thread for execution.
Node programs are executed in the Node environment, which is single threaded, so there is only a single thread of code execution for the entire program and no multiple threads executing concurrently.
A simple analogy would be comparing the event loop with a standard programming construct, the while loop, which is exactly what it is.
while(1){
// Node sets this up. Do stuff.. Runs until our program terminates.
}
Starting a node program would start this loop. You could imagine your program being inserted into this loop.
If the first instruction in your program was to read a file from disk, that request would be dispatched to the underlying OS through a system call to read the file.
Node provides both asynchronous and synchronous functions for things like reading a file. The asynchronous form is generally preferred because, in a single-threaded system, a synchronous call halts the entire program until the read completes, so a slow or problematic read stalls everything else.
while(1){
require('fs').readFileSync('file.txt');
// stop everything until the OS reports the file has been read
}
In the (preferred) asynchronous version, the request to read the file is issued to the OS, a callback function is specified, and the loop continues. The program does not sit waiting for the OS to respond. Once the OS reports the result, on a later loop (aka tick), your provided callback function (essentially just a location in memory) is called by the system with the result.
while(1){
// 1st loop does this
require('fs').readFile('file.txt', callback);
// 2nd loop does this, system calls our callback function with the result
callback(err, result)
}
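To make the while-loop analogy concrete, here is a minimal sketch of a toy event loop. It is only an illustration: Node's real loop lives in native code, and the names `queue`, `completeLater`, `fakeReadFile` and `runLoop` are made up for this example.

```javascript
// A toy event loop: a queue of callbacks drained by a plain while loop.
// Illustration only; this is not how Node implements its loop internally.
const queue = [];
const log = [];

// Hypothetical helper: pretend the OS finished some work and queue the callback.
function completeLater(callback, result) {
  queue.push(() => callback(null, result));
}

// Hypothetical stand-in for fs.readFile: returns immediately,
// the callback only runs on a later turn of the loop.
function fakeReadFile(name, callback) {
  completeLater(callback, 'contents of ' + name);
}

function runLoop(main) {
  queue.push(main);            // the program itself is the first task
  while (queue.length > 0) {   // one iteration is one "tick"
    const task = queue.shift();
    task();                    // each task runs to completion before the next
  }
  return log;
}

runLoop(() => {
  fakeReadFile('file.txt', (err, data) => log.push('callback: ' + data));
  log.push('main finished');
});

console.log(log); // 'main finished' is logged before the callback entry
```

The point of the sketch is the ordering: the main program runs to completion first, and the callback only runs on the following tick.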
There are anticipated advantages of a single threaded system. One is that there is no context switching between threads that needs to be done by the OS, which removes the overhead of performing that task in the system.
The other is the simplicity of programming with callback functions as the means of implementing asynchronicity, although how this compares with the way other systems and programming languages handle it is hotly debated.
There are many good resources to learn Express.js e.g.:
http://shop.oreilly.com/product/0636920032977.do
https://www.udemy.com/all-about-nodejs/
https://www.manning.com/books/express-in-action
https://www.packtpub.com/web-development/mastering-web-application-development-express
http://expressjsguide.com/
https://github.com/azat-co/expressworks
You may want to check also these blogs:
https://codeforgeek.com/2014/10/express-complete-tutorial-part-1/
https://strongloop.com/strongblog/category/express/
Related
Like many before, I came across this diagram describing the architecture of NodeJS and since I'm new to the concept of Asynchronous programming, let me first share with you my (possibly flawed) understanding of Node's architecture.
To my knowledge, the main application script is first compiled into binary code using Chrome's V8 Engine. After this it moves through Node.JS bindings, which is a low-level API that allows the binary code to be handled by the event mechanism. Then, a single thread is allocated to the event loop, which loops infinitely, continually picks up the first (i.e. the oldest) event in the event queue and assigns a worker thread to process the event. After that, the callback is stored in the event queue, moved to a worker thread by the event-loop thread, and - depending on whether or not the callback function had another nested callback function - is either done or executes any of the callback functions that have not yet been processed.
Now here's what I don't get. The event loop is able to continually assign events to worker threads, but the code that the worker threads have to process is still CPU blocking and the number of worker threads is still limited. In a synchronous process, wouldn't it be able to assign different pieces of code to different worker threads on the server's CPU?
Let's use an example:
var fs = require('fs');
fs.readFile('text.txt', function(err, data) {
    if (err) {
        console.log(err);
    } else {
        console.log(data.toString());
    }
});
console.log('This will probably be finished first.');
This example will log 'This will probably be finished first' and then output the data of the text.txt file later, since it's the callback function of the fs.readFile() function. Now I understand that NodeJS has a non-blocking architecture since the second block of code is finished earlier than the first even though it was called in a later stage. However, the total amount of time it takes for the program to be finished would still be the addition of the time it takes for each function to finish, right?
The only answer I can think of is that asynchronous programming allows for multithreading whereas synchronous programming does not. Otherwise, asynchronous event handling wouldn't actually be faster than synchronous programming, right?
Thanks in advance for your replies.
I've been reading up and going through as much of NodeJs code as I can but I'm a bit confused about this:
What exactly does Node being single threaded mean and what does non-blocking I/O mean? I can achieve the first one by spawning a child process and the second one by using async library. But I wanted to be clear what it meant and how non-blocking I/O can still slow up your app.
I'll try my best to explain.
Single-threaded means that the Node.js JavaScript runtime, at any particular point in time, is executing only one piece of code from all the code it has loaded. In effect, it starts somewhere and works its way down through all instructions (the call stack) until it's done. While it's executing code, nothing can interrupt this process, and all I/O must wait. Thankfully, most call stacks are relatively short, and lots of things we do in Node.js are more of the "bookkeeping" type than CPU-heavy.
Being single-threaded, though, any instruction that takes a long time would be a huge problem for the responsiveness of a system. The runtime can only do one thing at a time, so everything must wait until that instruction has finished. If any "I/O" instruction (say reading from disk) blocked execution, the system would be unnecessarily unavailable during that time.
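A small sketch of that responsiveness problem, assuming nothing beyond plain Node.js. The helper names `busyWait` and `timerDelayDemo` are invented for the example. A timer is requested for 10 ms, but a 200 ms synchronous loop hogs the thread, so the timer callback cannot run until the loop is done.

```javascript
// Sketch: a long synchronous computation delays everything else on the thread.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // burn CPU; no callback can run while we are in here
  }
}

function timerDelayDemo() {
  return new Promise((resolve) => {
    const start = Date.now();
    // Ask for a callback after 10 ms...
    setTimeout(() => resolve(Date.now() - start), 10);
    // ...then occupy the only JavaScript thread for 200 ms.
    busyWait(200);
  });
}

timerDelayDemo().then((elapsed) => {
  console.log('timer fired after ' + elapsed + ' ms (asked for 10 ms)');
});
```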
But thankfully, we've got non-blocking I/O.
Instead of waiting for a file to be read:
console.log(readFileSync(filePath))
you write your code so that you DON'T wait for a file to be read:
readFile(filePath)
The readFile call returns almost instantly (perhaps in a few nanoseconds), so the runtime can continue executing the instructions that come next. But if the readFile call returns before the data has been read, there's no way that the readFile call can return the file contents. That's where callbacks come in:
readFile(filePath, function(err, contents) { console.log(contents) })
Still, the readFile call returns almost instantly. The runtime can continue. It will finish the current work before it (all instructions coming after readFile). Nothing is done with the function that's passed, other than storing a reference to it.
Then, at some later point in time (perhaps 10ms, 100ms, or 1000ms later), when reading the file has completed, the callback is called with the full contents of the file as its second argument. Before that time, any number of other batches of work could have been done by the runtime.
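Here is a runnable sketch of that sequence. The temporary file and the helper name `readOrderDemo` are assumptions made for the example. It records the order in which things happen: the line after `readFile` runs first, and the callback runs later with the contents.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

function readOrderDemo() {
  return new Promise((resolve, reject) => {
    const order = [];
    // Hypothetical sample file, created in a fresh temp directory for this demo.
    const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'readfile-demo-'));
    const filePath = path.join(dir, 'sample.txt');
    fs.writeFileSync(filePath, 'hello');

    fs.readFile(filePath, 'utf8', (err, contents) => {
      if (err) return reject(err);
      fs.unlinkSync(filePath);   // clean up the demo file
      fs.rmdirSync(dir);
      order.push('callback got: ' + contents);
      resolve(order);
    });

    // readFile has already returned; the file has not been delivered yet.
    order.push('readFile returned');
  });
}

readOrderDemo().then((order) => console.log(order));
```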
Now I will address your comments about spawning child processes and the Async library. You are wrong on both counts.
Spawning a child process is a way to let Node.js use more than one CPU core. Being single-threaded, a single Node.js process cannot make use of more than one core. Still, if you are on a multi-core computer, you may want to use all those cores. Hence, start multiple Node.js processes.
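As a rough sketch of the idea, the snippet below starts two extra Node.js processes with `child_process.execFile` and lets each do some CPU-bound work. The helper names `sumInChild` and `sumInParallel` and the summing workload are made up for illustration. For servers, the built-in `cluster` module is a common alternative.

```javascript
const { execFile } = require('child_process');

// Run a small CPU-bound snippet in a separate Node.js process.
function sumInChild(limit) {
  const script =
    'let s = 0; for (let i = 1; i <= ' + Number(limit) + '; i++) s += i; console.log(s);';
  return new Promise((resolve, reject) => {
    // process.execPath is the path of the currently running node binary.
    execFile(process.execPath, ['-e', script], (err, stdout) => {
      if (err) return reject(err);
      resolve(Number(stdout.trim()));
    });
  });
}

// The children are separate OS processes, so the OS is free to
// schedule them on different cores.
function sumInParallel() {
  return Promise.all([sumInChild(1000), sumInChild(2000)]);
}

sumInParallel().then((sums) => console.log(sums));
```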
The Async library will not give you non-blocking I/O; Node.js gives you that. What Node.js does not give you itself is an easy way to deal with data coming in from multiple callbacks. The Async library can help a great deal with that.
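To show what "dealing with data from multiple callbacks" involves, here is a hand-rolled sketch of a hypothetical `parallel` helper. It is not the Async library's actual code or API, just the kind of bookkeeping such a library saves you from writing.

```javascript
// Hypothetical helper: run several callback-style tasks and collect
// their results in the original order, whichever finishes first.
function parallel(tasks, done) {
  const results = new Array(tasks.length);
  let pending = tasks.length;
  let failed = false;
  if (pending === 0) return done(null, results);

  tasks.forEach((task, i) => {
    task((err, value) => {
      if (failed) return;                 // an earlier task already failed
      if (err) {
        failed = true;
        return done(err);
      }
      results[i] = value;                 // keep results in task order
      if (--pending === 0) done(null, results);
    });
  });
}

// Usage: two simulated async operations that finish in reverse order.
parallel([
  (cb) => setTimeout(() => cb(null, 'slow result'), 20),
  (cb) => setTimeout(() => cb(null, 'fast result'), 5),
], (err, results) => {
  console.log(err, results);
});
```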
As I'm not an expert on Node.js internals, I welcome corrections!
Related questions:
asynchronous vs non-blocking
What's the difference between: Asynchronous, Non-Blocking, Event-Base architectures?
After all the literature I've read on node.js I still come back to the question: does node.js itself make use of multiple threads under the hood? I think the answer is yes. If we take the simple async file read example, something has to be doing the work of reading the file. If the main event loop of node is not doing this work, then there must be a POSIX thread running somewhere that takes care of the file reading and, upon completion, places the callback in the event loop to be executed. So when we say Node.js runs in one thread, do we actually mean that only the event loop of node.js is one thread? Or am I missing something here?
To a Javascript program on node.js, there is only one thread.
If you're looking for technicalities, node.js is free to use threads to solve asynchronous I/O if the underlying operating system requires it.
The important thing is to never break the "there is only one thread" abstraction to the Javascript program. If there are more threads, all they can do is queue up work for the main thread in the Javascript program, they can never execute any Javascript code themselves.
I was going through the details of node.js and came to know that it supports asynchronous programming, though essentially it provides a single-threaded model.
How is asynchronous programming handled in such cases? Is it like runtime itself creates and manages threads, but the programmer cannot create threads explicitly? It would be great if someone could point me to some resources to learn about this.
Say it with me now: async programming does not necessarily mean multi-threaded.
Javascript is a single-threaded runtime - you simply aren't able to create new threads in JS because the language/runtime doesn't support it.
Frank says it correctly (although obtusely). In plain English: there's a main event loop that handles things as they come into your app. So, "handle this HTTP request" will get added to the event queue, then handled by the event loop when appropriate.
When you call an async operation (a mysql db query, for example), node.js sends "hey, execute this query" to mysql. Since this query will take some time (milliseconds), node.js performs the query using the MySQL async library - getting back to the event loop and doing something else there while waiting for mysql to get back to us. Like handling that HTTP request.
Edit: By contrast, node.js could simply wait around (doing nothing) for mysql to get back to it. This is called a synchronous call. Imagine a restaurant, where your waiter submits your order to the cook, then sits down and twiddles his/her thumbs while the chef cooks. In a restaurant, like in a node.js program, such behavior is foolish - you have other customers who are hungry and need to be served. Thus you want to be as asynchronous as possible to make sure one waiter (or node.js process) is serving as many people as they can.
Edit done
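The waiter analogy can be sketched in a few lines. `fakeQuery` is a made-up stand-in for a database call (a timer, not a real MySQL client). Three "orders" of 100 ms each are in flight at the same time, so the total is roughly 100 ms and not 300 ms.

```javascript
// Hypothetical stand-in for a database query that takes `ms` milliseconds.
function fakeQuery(name, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(name), ms));
}

async function serveAll() {
  const start = Date.now();
  // All three queries are started before any of them has finished.
  const results = await Promise.all([
    fakeQuery('order A', 100),
    fakeQuery('order B', 100),
    fakeQuery('order C', 100),
  ]);
  return { results, elapsed: Date.now() - start };
}

serveAll().then(({ results, elapsed }) => {
  console.log(results, 'served in about ' + elapsed + ' ms');
});
```

The waits overlap because the single thread is free while each query is pending. If the waiting itself were synchronous, the times would add up.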
Node.js communicates with mysql using C libraries, so technically those C libraries could spawn off threads, but inside Javascript you can't do anything with threads.
Ryan said it best: sync/async is orthogonal to single/multi-threaded. In both the single-threaded and multi-threaded cases there is a main event loop that calls registered callbacks using the Reactor Pattern. In the single-threaded case the callbacks are invoked sequentially on the main thread. In the multi-threaded case they are invoked on separate threads (typically using a thread pool). It is really a question of how much contention there will be: if all requests require synchronized access to a single data structure (say a list of subscribers), then the benefits of having multiple threads may be diminished. It's problem dependent.
As far as implementation goes, if a framework is single-threaded then it is likely using a system call such as poll/select, i.e. the OS is triggering the asynchronous event.
To restate the waiter/chef analogy:
Your program is a waiter ("you") and the JavaScript runtime is a kitchen full of chefs doing the things you ask.
The interface between the waiter and the kitchen is mediated by queues so requests are not lost in instances of overcapacity.
So your program is assigned one thread of execution. You can only wait one table at a time. Each time you want to offload some work (like making the food/making a network request), you run to the kitchen and pin the order to a board (queue) for the chefs (runtime) to pick-up when they have spare capacity. The chefs will let you know when the order is ready (they will call you back). In the meantime, you go wait another table (you are not blocked by the kitchen).
So the accepted answer is misleading. The JavaScript runtime is definitionally multithreaded because I/O does not block your JavaScript program. As a waiter you can continue serving customers, while the kitchen cooks. That involves at least two threads of execution. The reality is that the runtime will maintain several threads of execution behind the scenes, in order to efficiently serve the single thread directly corresponding to your script.
By design, only one thread of execution is assigned to the synchronous running of your JavaScript program. This is a good thing because it makes your program easier to reason about than having to handle multiple threads of execution yourself. Don't worry: your JavaScript program can still get plenty complicated though!
Does an asynchronous call always create a new thread? What is the difference between the two?
Does an asynchronous call always create or use a new thread?
Wikipedia says:
In computer programming, asynchronous events are those occurring independently of the main program flow. Asynchronous actions are actions executed in a non-blocking scheme, allowing the main program flow to continue processing.
I know async calls can be done on a single thread. How is this possible?
Whenever the operation that needs to happen asynchronously does not require the CPU to do work, that operation can be done without spawning another thread. For example, if the async operation is I/O, the CPU does not have to wait for the I/O to complete. It just needs to start the operation, and can then move on to other work while the I/O hardware (disk controller, network interface, etc.) does the I/O work. The hardware lets the CPU know when it's finished by interrupting the CPU, and the OS then delivers the event to your application.
Frequently, higher-level abstractions and APIs don't expose the underlying asynchronous APIs available from the OS and the underlying hardware. In those cases it's usually easier to create threads to do asynchronous operations, even if the spawned thread is just waiting on an I/O operation.
If the asynchronous operation requires the CPU to do work, then generally that operation has to happen in another thread in order for it to be truly asynchronous. Even then, it will really only be asynchronous if there is more than one execution unit.
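For the CPU-bound case in Node.js specifically, the `worker_threads` module is one way to move such work onto another thread. The sketch below is a minimal example under that assumption. The helper name `fibInWorker` and the Fibonacci workload are invented for illustration.

```javascript
const { Worker } = require('worker_threads');

// Compute fib(n) on a separate thread so the main thread stays free.
function fibInWorker(n) {
  // Source for the worker, passed as a string via the `eval: true` option.
  const source = `
    const { parentPort, workerData } = require('worker_threads');
    function fib(k) { return k < 2 ? k : fib(k - 1) + fib(k - 2); }
    parentPort.postMessage(fib(workerData));
  `;
  return new Promise((resolve, reject) => {
    const worker = new Worker(source, { eval: true, workerData: n });
    worker.once('message', resolve);   // result is delivered as an event
    worker.once('error', reject);
  });
}

fibInWorker(20).then((value) => console.log('fib(20) =', value));
```

The result still arrives on the main thread as a message event, so the calling code keeps the usual callback or promise shape.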
This question is darn near too general to answer.
In the general case, an asynchronous call does not necessarily create a new thread. That's one way to implement it, with a pre-existing thread pool or external process being other ways. It depends heavily on language, object model (if any), and run time environment.
Asynchronous just means the calling thread doesn't sit and wait for the response, nor does the asynchronous activity happen in the calling thread.
Beyond that, you're going to need to get more specific.
No, asynchronous calls do not always involve threads.
They typically do start some sort of operation which continues in parallel with the caller. But that operation might be handled by another process, by the OS, by other hardware (like a disk controller), by some other computer on the network, or by a human being. Threads aren't the only way to get things done in parallel.
JavaScript is single-threaded and asynchronous. When you use XmlHttpRequest, for example, you provide it with a callback function that will be executed asynchronously when the response returns.
John Resig has a good explanation of the related issue of how timers work in JavaScript.
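A tiny sketch of the timer behaviour. The function name `timerOrderDemo` is made up. Even with a delay of 0, the timer callback cannot run until the currently executing code has finished, because there is only one thread.

```javascript
function timerOrderDemo() {
  return new Promise((resolve) => {
    const order = [];
    setTimeout(() => {
      order.push('timer callback');
      resolve(order);
    }, 0);
    // These lines run before the timer callback, despite the 0 ms delay.
    order.push('line after setTimeout');
    order.push('rest of the current code');
  });
}

timerOrderDemo().then((order) => console.log(order));
```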
Multi-threading refers to more than one operation happening in the same process, while async programming can spread across processes. For example, if my operation calls a web service, the thread need not wait until the web service returns. Here we use async programming, which allows the thread not to wait for a process on another machine to complete. When the response from the web service starts to arrive, it can interrupt the main thread to say that the web service has completed processing the request. Now the main thread can process the result.
Windows always had asynchronous processing since the non preemptive times (versions 2.13, 3.0, 3.1, etc) using the message loop, way before supporting real threads. So to answer your question, no, it is not necessary to create a thread to perform asynchronous processing.
Asynchronous calls don't even need to occur on the same system/device as the one invoking the call. So if the question is, does an asynchronous call require a thread in the current process, the answer is no. However, there must be a thread of execution somewhere processing the asynchronous request.
Thread of execution is a vague term. In cooperative tasking systems such as the early Macintosh and Windows OSes, the thread of execution could simply be the same process that made the request, running with another stack, instruction pointer, etc. However, when people talk about asynchronous calls, they typically mean calls that are handled by another thread if it is intra-process (i.e. within the same process) or by another process if it is inter-process.
Note that inter-process (or interprocess) communication (IPC) is commonly generalized to include intra-process communication, since the techniques for locking, and synchronizing data are usually the same regardless of what process the separate threads of execution run in.
Some systems allow you to take advantage of the concurrency in the kernel for some facilities using callbacks. For a rather obscure instance, asynchronous IO callbacks were used to implement non-blocking internet servers back in the non-preemptive multitasking days of Mac System 6-8.
This way you have concurrent execution streams "in" your program without threads as such.
Asynchronous just means that you don't block your program waiting for something (function call, device, etc.) to finish. It can be implemented in a separate thread, but it is also common to use a dedicated thread for synchronous tasks and communicate via some kind of event system and thus achieve asynchronous-like behavior.
There are examples of single-threaded asynchronous programs. Something like:
...do something
...send some async request
while (not done)
...do something else
...do async check for results
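In Node.js terms, the same pattern might look like the sketch below. The 20 ms timer stands in for a real async request, and `pollingDemo` is a made-up name. The loop body is re-scheduled with `setImmediate` and is not written as a literal `while` loop, because a busy `while` would never give the event loop a chance to deliver the result.

```javascript
function pollingDemo() {
  return new Promise((resolve) => {
    let result = null;   // set when the simulated async request completes
    let otherWork = 0;   // counts chunks of unrelated work done meanwhile

    // "send some async request": simulated with a 20 ms timer.
    setTimeout(() => { result = 'response'; }, 20);

    function step() {
      // "do async check for results"
      if (result !== null) return resolve({ result, otherWork });
      otherWork++;           // "do something else"
      setImmediate(step);    // yield to the event loop, then check again
    }
    step();
  });
}

pollingDemo().then(({ result, otherWork }) => {
  console.log(result, 'after', otherWork, 'chunks of other work');
});
```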
The nature of asynchronous calls is such that, if you want the application to continue running while the call is in progress, you will either need to spawn a new thread, or at least utilise another thread that you have created solely for the purpose of handling asynchronous callbacks.
Sometimes, depending on the situation, you may want to invoke an asynchronous method but make it appear to the user to be synchronous (i.e. block until the asynchronous method has signalled that it is complete). This can be achieved through Win32 APIs such as WaitForSingleObject.