When does ALooper_pollAll return ALOOPER_POLL_TIMEOUT? - android-ndk

Let's assume, that a couple of file descriptors with corresponding callbacks are added to the looper, and then ALooper_pollAll() is called with a timeout of 1000 milliseconds. Soon after that some of the file descriptors become available and the looper starts to calling callbacks on them. Let's say that the last callback called has finished exactly after 200ms ellapsed since the start of ALooper_pollAll. Now, if there's no more data on either of the descriptors, when will the function return ALOOPER_POLL_TIMEOUT? Is it after 800ms (remaining timeout time after the callbacks has finished), or after 1000ms (initial timeout)?

After looking at Looper's sources it became clear that it's the former. Internally ALooper_pollAll calls ALooper_pollOnce in a loop, updating (reducing) the timeout time after each consequent call, so eventually when it becomes 0, ALOOPER_POLL_TIMEOUT is returned.

Related

fs.readFile operation inside the callback of parent fs.readFile does not execute asynchronously

I have read that fs.readFile is an asynchronous operation and the reading takes place in a separate thread other than the main thread and hence the main thread execution is not blocked. So I wanted to try out something like below
// reads take almost 12-15ms
fs.readFile('./file.txt', (err, data) => {
console.log('FIRST READ', Date.now() - start)
const now = Date.now()
fs.readFile('./file.txt', (err, data) => {
// logs how much time it took from the beginning
console.log('NESTED READ CALLBACK', Date.now() - start)
})
// blocks untill 20ms more, untill the above read operation is done
// so that the above read operation is done and another callback is queued in the poll phase
while (Date.now() - now < 20) {}
console.log('AFTER BLOCKING', Date.now() - start)
})
I am making another fs.readFile call inside the callback of parent fs.readFile call. What I am expecting is that the log NESTED READ CALLBACK must arrive immediately after AFTER BLOCKING since, the reading must be completed asynchronously in a separate thread
Turns out the log NESTED READ CALLBACK comes 15ms after the AFTER BLOCKING call, indicating that while I was blocking in the while loop, the async read operation somehow never took place. By the way the while loop is there to model some task that takes 20ms to complete
What exactly is happening here? Or am I missing some information here? Btw I am using windows
During your while() loop, no events in Javascript will be processed and none of your Javascript code will run (because you are blocking the event loop from processing by just looping).
The disk operations can make some progress (as they do some work in a system thread), but their results will not be processed until your while loop is done. But, because the fs.readFile() actually consists of three or more operations, fs.open(), fs.read() and fs.close(), it probably won't get very far while the event loop is blocked because it needs events to be processed to advance through the different stages of its work.
Turns out the log NESTED READ CALLBACK comes 15ms after the AFTER BLOCKING call, indicating that while I was blocking in the while loop, the async read operation somehow never took place. By the way the while loop is there to model some task that takes 20ms to complete
What exactly is happening here?
fs.readFile() is not a single monolithic operation. Instead, it consists of fs.open(), fs.read() and fs.close() and the sequencing of those is run in user-land Javascript in the main thread. So, while you are blocking the main thread with your while() loop, the fs.readFile() can't make a lot of progress. Probably what happens is you initiate the second fs.readFile() operation and that kicks off the fs.open() operation. That gets sent off to an OS thread in the libuv thread pool. Then, you block the event loop with your while() loop. While that loop is blocking the event loop, the fs.open() completes and (internal to the libuv event loop) an event is placed into the event queue to call the completion callback for the fs.open() call. But, the event loop is blocked by your loop so that callback can't get called immediately. Thus, any further progress on completing the fs.readFile() operation is blocked until the event loop frees up and can process waiting events again.
When your while() loop finishes, then control returns to the event loop and the completion callback for the fs.open() call gets called while will then initiate the reading of the actual data from the file.
FYI, you can actually inspect the code yourself for fs.readFile() right here in the Github repository. If you follow its flow, you will see that, from its own Javascript, it first calls binding.open() which is a native code operation and then, when that completes and Javascript is able to process the completion event through the event queue, it will then run the fucntion readFileAfterOpen(...) which will call bind.fstat() (another native code operation) and then, when that completes and Javascript is able to process the completion event
through the event queue, it will then call `readFileAfterStat(...) which will allocate a buffer and initiate the read operation.
Here, the code gets momentarily harder to follow as flow skips over to a read_file_context object here where eventually it calls read() and again when that completes and Javascript is able to process the completion event via the event loop, it can advance the process some more, eventually reading all the bytes from the file into a buffer, closing the file and then calling the final callback.
The point of all this detail is to illustrate how fs.readFile() is written itself in Javascript and consists of multiple steps (some of which call code which will use some native code in a different thread), but can only advance from one step to the next when the event loop is able to process new events. So, if you are blocking the event loop with a while loop, then fs.readFile() will get stuck between steps and not be able to advance. It will only be able to advance and eventually finish when the event loop is able to process events.
An analogy
Here's a simple analogy. You ask your brother to do you a favor and go to three stores for you to pick up things for dinner. You give him the list and destination for the first store and then you ask him to call you on your cell phone after he's done at the first store and you'll give him the second destination and the list for that store. He heads off to the first store.
Meanwhile you call your girlfriend on your cell phone and start having a long conversation with her. Your brother finishes at the first store and calls you, but you're ignoring his call because you're still talking to your girlfriend. Your brother gets stuck on his mission of running errands because he needs to talk to you to find out what the next step is.
In this analogy, the cell phone is kind of like the event loop's ability to process the next event. If you are blocking his ability to call you on the phone, then he can't advance to his next step (the event loop is blocked). His visits to the three stores are like the individual steps involved in carrying out the fs.readfile() operation.

Can a setImmediate() function scheduled in an I/O callback recalculate timeout for other I/O notifications?

There is the poll stage of the Node js's event loop. Its aim is to blockingly wait for I/O notifications and then execute needed callbacks. A timeout time of the waiting is calculated before entering this stage. If there were operations scheduled through setImmediate than the timeout is set at 0, if timers took place than timeout is set with a view of them and if there are no such impediments the blocking would continue "forever".
What will happen if the timeout is initially set at "infinity" and an I/O callback schedules a function with setImmediate or setInterval? Would the timeout be recalculated immediately or only after the I/O queue got empty (in a next loop iteration)?
So if we wait for two launched asynchronous opearations with callbacks A and B and after receiving an I/O notification A is executed and it in turn schedules some C function with setImmediate what the function would execute first B or C?
Note that for some of the more common asynchronous API with timer support (eg. the select() or kqueue() system call) a timer value of zero (0) represents the value of infinity so you cannot implement a function like setImmediate() or nextTick() the usual way by calculating the next timeout value and calling the asynchronous API. Instead all setImmediate() or nextTick() callbacks must be called synchronously before calling the asynchronous API.
Therefore, since C is a callback to setImmediate() it will be called before we re-enter the event loop to wait for B to fire.
So if A and only a gets triggered then the sequence of function calls would be: A and C.
But if A and B happens at the same time the sequence of function calls would be: A, B, C.
Note that I said A and C and not A, C, B because we didn't say B gets triggered. For all I know B will only get triggered 8 months from now or even never.
But if A gets triggered and 1 millisecond later B happens then the sequence of function calls would be: A, C, B
Is it possible for two events to happen in the same event loop? Generally yes. If it's network I/O it is unlikely because IP packets are serialized but it can still happen especially if you have a machine with multiple network cards. If it's graphical events like DOM updates then two things happening in the same event loop is quite common eg: both page resize and content reflow can happen at the same time especially with responsive CSS using media query where page resize can trigger content reflow.

Doubts about event loop , multi-threading, order of executing readFile() in Node.js?

fs.readFile("./large.txt", "utf8", (err, data) => {
console.log('It is a large file')
//this file has many words (11X MB).
//It takes 1-2 seconds to finish reading (usually 1)
});
fs.readFile("./small.txt","utf8", (err, data) => {
for(let i=0; i<99999 ;i++)
console.log('It is a small file');
//This file has just one word.
//It always takes 0 second
});
Result:
The console will always first print "It is a small file" for 99999 times (it takes around 3 seconds to finish printing).
Then, after they are all printed, the console does not immediately print "It is a large file". (It is always printed after 1 or 2 seconds).
My thought:
So, it seems that the first readFile() and second readFile() functions do not run in parallel. If the two readFile() functions ran in parallel, then I would expect that after "It is a small file" was printed for 99999 times,
the first readFile() is finished reading way earlier (just 1 second) and the console would immediately print out the callback of the first readFile() (i.e. "It is a large file".)
My questions are :
(1a) Does this mean that the first readFile() will start to read file only after the callback of second readFile() has done its work?
(1b) To my understanding, in nodeJs, event loop passes the readFile() to Libuv multi-thread. However, I wonder in what order they are passed. If these two readFile() functions do not run in parallel, why is the second readFile() function always executed first?
(2) By default, Libuv has four threads for Node.js. So, here, do these two readFile() run in the same thread? Among these four threads, I am not sure whether there is only one for readFile().
Thank you very much for spending your time! Appreciate!
I couldn't possibly believe that node would delay the large file read until the callback for the small file read had completed, and so I did a little more instrumentation of your example:
const fs = require('fs');
const readLarge = process.argv.includes('large');
const readSmall = process.argv.includes('small');
if (readLarge) {
console.time('large');
fs.readFile('./large.txt', 'utf8', (err, data) => {
console.timeEnd('large');
if (readSmall) {
console.timeEnd('large (since busy wait)');
}
});
}
if (readSmall) {
console.time('small');
fs.readFile('./small.txt', 'utf8', (err, data) => {
console.timeEnd('small');
var stop = new Date().getTime();
while(new Date().getTime() < stop + 3000) { ; } // busy wait
console.time('large (since busy wait)');
});
}
(Note that I replaced your loop of console.logs with a 3s busy wait).
Running this against node v8.15.0 I get the following results:
$ node t small # read small file only
small: 0.839ms
$ node t large # read large file only
large: 447.348ms
$ node t small large # read both files
small: 3.916ms
large: 3252.752ms
large (since busy wait): 247.632ms
These results seem sane; the large file took ~0.5s to read on its own, but when the busy waiting callback interfered for 2s, it completed relatively soon (~1/4s) thereafter. Tweaking the length of the busy wait keeps this relatively consistent, so I'd be willing to just say this was some kind of scheduling overhead and not necessarily a sign that the large file I/O had not been running during the busy wait.
But then I ran the same program against node 10.16.3, and here's what I got:
$ node t small
small: 1.614ms
$ node t large
large: 1019.047ms
$ node t small large
small: 3.595ms
large: 4014.904ms
large (since busy wait): 1009.469ms
Yikes! Not only did the large file read time more than double (to ~1s), it certainly appears as if no I/O at all had been completed before the busy wait concluded! i.e., it sure looks like the busy wait in the main thread prevented any I/O at all from happening on the large file.
I suspect that this change from 8.x to 10.x is a result of this "optimization" in Node 10: https://github.com/nodejs/node/pull/17054. This change, which splits the read of large files into multiple operations, seems to be appropriate for smoothing performance of the system in general purpose cases, but it is likely exacerbated by the unnatural long main thread processing / busy wait in this scenario. Presumably, without the main thread yielding, the I/O is not getting a chance to advance to the next range of bytes in the large file to be read.
It would seem that, with Node 10.x, it is important to have a responsive main thread (i.e., one that yields frequently, and doesn't busy wait like in this example) in order to sustain I/O performance of large file reads.
(1a) Does this mean that the first readFile() will start to read file only after the callback of second readFile() has done its work?
No. Each readFile() actually consists of multiple steps (open file, read chunk, read chunk ... close file). The logic flow between steps is controlled by Javascript code in the node.js fs library. But, a portion of each step is implemented by native threaded code in libuv using a thread pool.
So, the first step of the first readFile() will be initiated and then control is returned back to the JS interpreter. Then, the first step of the second readFile() will be initiated and then control returned back to the JS interpreter. It can ping pong back and forth between progress in the two readFile() operations as long as the JS interpreter isn't kept busy. But, if the JS interpreter does get busy for awhile, it will stall further progress when the current step that's proceeding in the background completes. There's a full step-by-step chronology at the end of the answer if you want to follow the details of each step.
(1b) To my understanding, in nodeJs, event loop passes the readFile() to Libuv multi-thread. However, I wonder in what order they are passed. If these two readFile() functions do not run in parallel, why is the second readFile() function always executed first?
fs.readFile() itself is not implemented in libuv. It's implemented as a series of individual steps in node.js Javascript. Each individual step (open file, read chunk, close file) is implemented in libuv, but Javascript in the fs library controls the sequencing between steps. So, think of fs.readfile() as a series of calls to libuv. When you have two fs.readFile() operations in flight at the same time, each will have some libuv operation going at any given time and one step for each fs.readFile() can be proceeding in parallel due to the thread pool implementation in libuv. But, between each step in the process, control comes back the JS interpreter. So, if the interpreter gets busy for some extended portion of time, then further progress in scheduling the next step of the other fs.readFile() operation is stalled.
(2) By default, Libuv has four threads for Node.js. So, here, do these two readFile() run in the same thread? Among these four threads, I am not sure whether there is only one for readFile().
I think this is covered in the previous two explanations. readFile() itself is not implemented in native code of libuv. Instead, it's written in Javascript with calls to open, read, close operations that are written in native code and use libuv and the thread pool.
Here's a full accounting of what's going on. To fully understand, one needs to know about these:
Main Concepts
The single threaded, non-pre-emptive nature of node.js running your Javascript (assuming no WorkerThreads are manually coded here - which they aren't).
The multi-threaded, native code of the fs module's file I/O and how that works.
How native code asynchronous operations communicate completion via the event queue and how event loop scheduling works when the JS interpreter is busy doing something.
Asynchronous, Non-Blocking
I presume you know that fs.readFile() is asynchronous and non-blocking. That means when you call it, all it does is initiate an operation to read the file and then it goes right onto the next line of code at the top level after the fs.readFile() (not the code inside the callback you pass to it).
So, a condensed version of your code is basically this:
fs.readFile(x, funcA);
fs.readFile(y, funcB);
If we added some logging to this:
function funcA() {
console.log("funcA");
}
function funcB() {
console.log("funcB");
}
function spin(howLong) {
let finishTime = Date.now() + howLong;
// spin until howLong ms passes
while (Date.now() < finishTime) {}
}
console.log("1");
fs.readFile(x, funcA);
console.log("2");
fs.readFile(y, funcB);
console.log("3");
spin(30000); // spin for 30 seconds
console.log("4");
You would see either this order:
1
2
3
4
A
B
or this order:
1
2
3
4
B
A
Which of the two it was would just depend upon the indeterminate race between the two fs.readFile() operations. Either could happen. Also, notice that 1, 2, 3 and 4 are all logged before any asynchronous completion events can occur. This is because the single-threaded, non-pre-emptive JS interpreter main thread is busy executing Javascript. It won't pull the next event out of the event queue until it's done executing this piece of Javascript.
Libuv Thread Pool
As you appear to already know, the fs module uses a libuv thread pool for running file I/O. That's independent of the main JS thread so those read operations can proceed independently from further JS execution. Using native code, file I/O will communicate with the event queue when they are done to schedule their completion callback.
Indeterminate Race Between Two Asynchronous Operations
So, you've just created an indeterminate race between the two fs.readFile() operations that are likely each running in their own thread. A small file is much more likely to complete first before the larger file because the larger file has a lot more data to read from the disk.
Whichever fs.readFile() finishes first will insert its callback into the event queue first. When the JS interpreter is free, it will pick the next event out of the event queue. Whichever one finishes first gets to run its callback first. Since the small file is likely to finish first (which is what you are reporting), it gets to run its callback. Now, when it is running its callback, this is just Javascript and even though the large file may finish and insert its callback into the event queue, that callback can't run until the callback from the small file finishes. So, it finishes and THEN the callback from the large file gets to run.
In general, you should never write code like this unless you don't care at all what order the two asynchronous operations finish in because it's an indeterminate race and you cannot count on which one will finish first. Because of the asynchronous non-blocking nature of fs.readFile(), there is no guarantee that the first file operation initiated will finish first. It's no different than firing off two separate http requests one after the other. You don't know which one will complete first.
Step By Step Chronology
Here's a step by step chronology of what happens:
You call fs.readFile("./large.txt", ...);
In Javascript code, that initiates opening the large.txt file by calling native code and then returns. The opening of the actual file is handled by libuv in native code and when that is done, an event will be inserted into the JS event queue.
Immediately after that operation is initiated, then that first fs.readFile() returns (not yet done yet, still processing internally).
Now the JS interpreter picks up at the next line of code and runs fs.readFile("./small.txt", ...);
In Javascript code, that initiates opening the small.txt file by calling native code and then returns. The opening of the actual file is handled by libuv in native code and when that is done, an event will be inserted into the JS event queue.
Immediately after that operation is initiated, then that second fs.readFile() returns (not yet done yet, still processing internally).
The JS interpreter is actually free to run any following code or process any incoming events.
Then, some time later, one of the two fs.readFile() operations finishes its first step (opening the file), an event is inserted into the JS event queue and when the JS interpreter has time, a callback is called. Since opening each file is about the same operation time, it's likely that the open operation for the large.txt file finishes first, but that isn't guaranteed.
After the file open succeeds, it initiates an asynchronous operation to read the first chunk from the file. This again is asynchronous and is handled by libuv so as soon as this is initiated, it returns control back to the JS interpreter.
The second file open likely finises next and it does the same thing as the first, initiates reading the first chunk of data from disk and returns control back to the JS interpreter.
Then, one of these two chunk reads finishes and inserts an event into the event queue and when the JS interpreter is free, a callback is called to process that. At this point, this could be either the large or small file, but for purposes of simplicity of explanation, lets assume the first chunk of the large file finishes first. It will buffer that chunk, see that there is more data to read and will initiate another asynchronous read operation and then return control back to the JS interpreter.
Then, the other first chunk read finishes. It will buffer that chunk and see that there is no more data to read. At this point, it will issue a file close operation which is again handled by libuv and control is returned back to the JS interpreter.
One of the two previous operations completes (a second block read from large.txt or a file close of small.txt) and its callback is called. Since the close operation doesn't have to actually touch the disk (it just goes into the OS), let's assume the close operation finishes first for purposes of explanation. That close triggers the end of the fs.ReadFile() for small.txt and calls the completion callback for that.
So, at this point, small.txt is done and large.txt has read one chunk from its file and is awaiting completion of the second chunk to read.
Your code now executes the for loop that takes whatever time that takes.
By the point that finishes and the JS interpreter is free again, the 2nd file read from large.txt is probably done so the JS interpreter finds it's event in the event queue and executes a callback to do some more processing on reading more chunks from that file.
The process of reading a chunk, returning control back to the interpreter, waiting for the next chunk completion event and then buffering that chunk continues until all the data has been read.
Then a close operation is initiated for large.txt.
When that close operation is done, the callback for the fs.readFile() for large.txt is called and your code that is timing large.txt will measure completion.
So, because the logic of fs.readFile() is implemented in Javascript with a number of discrete asynchronous steps with each one ultimately handled by libuv (open file, read chunk - N times, close file), there will be an interleaving of the work between the two files. The reading of the smaller file will finish first just because it has fewer and smaller read operations. When it finishes, the large file will still have multiple more chunks to read and a close operation left. Because the multiple steps of fs.readFile() are controlled through Javascript, when you do the long for loop in the small.txt completion, you are stalling the fs.readFile() operation for the large.txt file too. Whatever chunk read was in progress when that loop happened will complete in the background, but the next chunk read won't get issued until that small file callback completes.
It appears that there would be an opportunity for node.js to improve the responsiveness of fs.readFile() in competitive circumstances like this if that operation was rewritten entirely in native code so that one native code operation could read the contents of the whole file rather than all these transitions back and forth between the single threaded main JS thread and libuv. If this was the case, the big for loop wouldn't stall the progress of large.txt because it would be progressing entirely in a libuv thread rather than waiting for some cycles from the JS interpreter in order to get to its next step.
We can theorize that if both files were able to be read in one chunk, then not much would get stalled by the long for loop. Both files would get opened (which should take approximately the same time for each). Both operations would initiate a read of their first chunk. The read for the smaller file would likely complete first (less data to read), but actually this depends upon both OS and disk controller logic. Because the actual reads are handed off to threaded code, both reads will be pending at the same time. Assuming the smaller read finishes first, it would fire completion and then during the busy loop the large read would finish, inserting an event in the event queue. When the busy loop finishes, the only thing left to do on the larger file (but still something that can was read in one chunk) would be to close the file which is a faster operation.
But, when the larger file can't be read in one chunk and needs multiple chunks of reading, that's why its progress really gets stalled by the busy loop because a chunk finishes, but the next chunk doesn't get scheduled until the busy loop is done.
Testing
So, let's test out all this theory. I created two files. small.txt is 558 bytes. large.txt is 255,194,500 bytes.
Then, I wrote the following program to time these and allow us to optionally do a 3 second spin loop after the small one finishes.
const fs = require('fs');
let doSpin = false; // -s will set this to true
let fname = "./large.txt";
for (let i = 2; i < process.argv.length; i++) {
let arg = process.argv[i];
console.log(`"${arg}"`);
if (arg.startsWith("-")) {
switch(arg) {
case "-s":
doSpin = true;
break;
default:
console.log(`Unknown arg ${arg}`);
process.exit(1);
break;
}
} else {
fname = arg;
}
}
function padDecimal(num, n = 3) {
let str = num.toFixed(n);
let index = str.indexOf(".");
if (index === -1) {
str += ".";
index = str.length - 1;
}
let zeroesToAdd = n - (str.length - index);
while (zeroesToAdd-- >= 0) {
str += "0";
}
return str;
}
let startTime;
function log(msg) {
if (!startTime) {
startTime = Date.now();
}
let diff = (Date.now() - startTime) / 1000; // in seconds
console.log(padDecimal(diff), ":", msg)
}
function funcA(err, data) {
if (err) {
log("error on large");
log(err);
return;
}
log("large completed");
}
function funcB(err, data) {
if (err) {
log("error on small");
log(err);
return;
}
log("small completed");
if (doSpin) {
spin(3000);
log("spin completed");
}
}
function spin(howLong) {
let finishTime = Date.now() + howLong;
// spin until howLong ms passes
while (Date.now() < finishTime) {}
}
log("start");
fs.readFile(fname, funcA);
log("large initiated");
fs.readFile("./small.txt", funcB);
log("small initiated");
Then (using node v12.13.0), I ran it both with and without the 3 second spin. Without the spin, I get this output:
0.000 : start
0.015 : large initiated
0.016 : small initiated
0.021 : small completed
0.240 : large completed
This shows a 0.219 second delta between the time to complete small and large (while running both at the same time).
Then, inserting the 3 second delay, we get this output:
0.000 : start
0.003 : large initiated
0.004 : small initiated
0.009 : small completed
3.010 : spin completed
3.229 : large completed
We have the exact same 0.219 second delta between the time to complete the small and the large (while running both at the same time). This shows that the large fs.readFile() essentially made no progress during the 3 second spin. It's progress was completely blocked. As we've theorized in the previous explanation, this is apparently because the progression from one chunked read to the next is written in Javascript and while the spin loop is running, that progression to the next chunk is blocked so it can't make any further progress.
How Big A File Makes Big File Finish Second?
If you look in the code for fs.readFile() in the source for node v12.13.0, you can find that the chunk size it reads is 512 * 1024 which is 512k. So, in theory, it's possible that the larger file might finish first if it can be read in one chunk. Whether that actually happens or not depends upon some OS and disk implementation details, but I thought I'd try it on my laptop running a current version of Windows 10 with an SSD drive.
I found that, for a 255k "large" file, it does finish before the small file (essentially in execution order). So, because the large file read is started before the small file read, even though it has more data to read, it will still finish before the small file.
0.000 : start
0.003 : large initiated
0.003 : small initiated
0.007 : large completed
0.008 : small completed
Keep in mind, this is OS and disk dependent so this is not guaranteed.
File I/O in Node.js runs in separate thread. But this does not matter. Node.js always executes all callbacks in the main thread. I/O callbacks are never executed in a separate thread (the file read operation is done in a separate thread then when it is finished will signal the main thread to run your callback). This essentially makes node.js single-threaded because all the code you write runs in the main thread (we're of course ignoring the worker_threads module/API which allows you to manually execute code in separate threads).
But the bytes in the files are read in parallel (or as parallel as your hardware allows - depending on the number of free DMA channels, which disk each file is from etc). What is parallel is the wait. Asynchronous I/O in any language (node.js, Java, C++, Python etc.) is basically an API that allows you to wait in parallel but handle events in a single thread. There is a word for this kind of parallel: concurrent. It is essentially parallel wait (while data is handled in parallel by your hardware) but not parallel code execution.
I think that you understand the behavior of event loop and libuv, don't lose your way.
My answers :
1a) Of course the two read file are executed in two different threads , I tried to run your code replacing a large file with a small one and the output was
It is a large file
It is a small file
1b) The second call just end before in your case and then the callback is invoked before.
2 ) As you said , libuv has four threads by default , but be sure that the default are not changed setting the env variable UV_THREADPOOL_SIZE ( http://docs.libuv.org/en/v1.x/threadpool.html )
I tried to work with a large and a big file , to read the big file my PC take 23/25 ms , to read the small file it take 8/10 ms.
When I try to read both the process terminate in 26/27 ms and this demonstrate that the two read file are executed in parallel .
Try to measure the time that your code take from small file callback to large file callback :
console.log(process.env.UV_THREADPOOL_SIZE)
const fs = require('fs')
const start = Date.now()
let smallFileEnd
fs.readFile("./alphabet.txt", "utf8", (err, data) => {
console.log('It is a large file')
console.log(`From the start to now are passed ${Date.now() - start} ms`)
console.log(`From the small file end to now are passed ${Date.now() - smallFileEnd} ms`)
//this file has many words (11X MB).
//It takes 1-2 seconds to finish reading (usually 1)
// 18ms to execute
});
fs.readFile("./stackoverflow.js","utf8", (err, data) => {
for(let i=0; i<99999 ;i++)
if(i === 99998){
smallFileEnd = Date.now()
console.log('is a small file ')
console.log(`From the start to now are passed ${Date.now() - start} ms`)
// 4/7 ms to execute
}
});

How to know when the Promise is actually resolved in Node.js?

When we are using Promise in nodejs, given a Promise p, we can't know when the Promise p is actually resolved by logging the currentTime in the "then" callback.
To prove that, I wrote the test code below (using CoffeeScript):
# the procedure we want to profile
getData = (arg) ->
new Promise (resolve) ->
setTimeout ->
resolve(arg + 1)
, 100
# the main procedure
main = () ->
beginTime = new Date()
console.log beginTime.toISOString()
getData(1).then (r) ->
resolveTime = new Date()
console.log resolveTime.toISOString()
console.log resolveTime - beginTime
cnt = 10**9
--cnt while cnt > 0
return cnt
main()
When you run the above code, you will notice that the resolveTime (the time your code run into the callback function) is much later than 100ms from the beginTime.
So If we want to know when the Promise is actually resolved, HOW?
I want to know the exact time because I'm doing some profiling via logging. And I'm not able to modify the Promise p 's implementation when I'm doing some profiling outside of the black box.
So, Is there some function like promise.onUnderlyingConditionFulfilled(callback) , or any other way to make this possible?
This is because you have a busy loop that apparently takes longer than your timer:
cnt = 10**9
--cnt while cnt > 0
Javascript in node.js is single threaded and event driven. It can only do one thing at a time and it will finish the current thing it's doing before it can service the event posted by setTimeout(). So, if your busy loop (or any other long running piece of Javascript code) takes longer than you've set your timer for, the timer will not be able to run until this other Javascript code is done. "single threaded" means Javascript in node.js only does one thing at a time and it waits until one thing returns control back to the system before it can service the next event waiting to run.
So, here's the sequence of events in your code:
It calls the setTimeout() to schedule the timer callback for 100ms from now.
Then you go into your busy loop.
While it's in the busy loop, the setTimeout() timer fires inside of the JS implementation and it inserts an event into the Javascript event queue. That event can't run at the moment because the JS interpreter is still running the busy loop.
Then eventually it finishes the busy loop and returns control back to the system.
When that is done, the JS interpreter then checks the event queue to see if any other events need servicing. It finds the timer event and so it processes that and the setTimeout() callback is called.
That callback resolves the promise which triggers the .then() handler to get called.
Note: Because of Javascript's single threaded-ness and event-driven nature, timers in Javascript are not guaranteed to be called exactly when you schedule them. They will execute as close to that as possible, but if other code is running at the time they fire or if their are lots of items in the event queue ahead of you, that code has to finish before the timer callback actually gets to execute.
So If we want to know when the Promise is actually resolved, HOW?
The promise is resolved when your busy loop is done. It's not resolved at exactly the 100ms point (because your busy loop apparently takes longer than 100ms to run). If you wanted to know exactly when the promise was resolved, you would just log inside the setTimeout() callback right where you call resolve(). That would tell you exactly when the promise was resolved though it will be pretty much the same as where you're logging now. The cause of your delay is the busy loop.
Per your comments, it appears that you want to somehow measure exactly when resolve() is actually called in the Promise, but you say that you cannot modify the code in getData(). You can't really do that directly. As you already know, you can measure when the .then() handler gets called which will probably be no more than a couple ms after resolve() gets called.
You could replace the promise infrastructure with your own implementation and then you could instrument the resolve() callback directly, but replacing or hooking the promise implementation probably influences the timing of things even more than just measuring from the .then() handler.
So, it appears to me that you've just over-constrained the problem. You want to measure when something inside of some code happens, but you don't allow any instrumentation inside that code. That just leaves you with two choices:
Replace the promise implementation so you can instrument resolve() directly.
Measure when .then() is triggered.
The first option probably has a heisenberg uncertainty issue in that you've probably influenced the timing by more than you should if you replace or hook the promise implementation.
The second option measures an event that happens just slightly after the actual .resolve(). Pick which one sounds closest to what you actually want.

How does the stack work on NodeJs?

I've recently started reading up on NodeJS (and JS) and am a little confused about how call backs are working in NodeJs. Assume we do something like this:
setTimeout(function(){
console.log("World");
},1000);
console.log("Hello");
output: "Hello World"
So far on what I've read JS is single threaded so the event loop is going though one big stack, and I've also been told not to put big calls in the callback functions.
1)
Ok, so my first question is assuming its one stack does the callback function get run by the main event loop thread? if so then if we have a site with that serves up content via callbacks (fetches from db and pushes request), and we have 1000 concurrent, are those 1000 users basically being served synchronously, where the main thread goes in each callback function does the computation and then continues the main event loop? If this is the case, how exactly is this concurrent?
2) How are the callback functions added to the stack, so lets say my code was like the following:
setTimeout(function(){
console.log("Hello");
},1000);
setTimeout(function(){
console.log("World");
},2000);
then does the callback function get added to the stack before the timeout even has occured? if so is there a flag that's set that notifies the main thread the callback function is ready (or is there another mechanism). If this is infact what is happening doesn't it just bloat the stack, especially for large web applications with callback functions, and the larger the stack the longer everything will take to run since the thread has to step through the entire thing.
The event loop is not a stack. It could be better thought of as a queue (first in/first out). The most common usage is for performing I/O operations.
Imagine you want to read a file from disk. You start by saying hey, I want to read this file, and when you're done reading, call this callback. While the actual I/O operation is being performed in a separate process, the Node application is free to keep doing something else. When the file has finished being read, it adds an item to the event loop's queue indicating the completion. The node app might still be busy doing something else, or there may be other items waiting to be dequeued first, but eventually our completion notification will be dequeued by node's event loop. It will match this up with our callback, and then the callback will be called as expected. When the callback has returned, the event loop dequeues the next item and continues. When there is nothing left in node's event loop, the application exits.
That's a rough approximation of how the event loop works, not an exact technical description.
For the specific setTimeout case, think of it like a priority queue. Node won't consider dequeuing the item/job/callback until at least that amount of time has passed.
There are lots of great write-ups on the Node/JavaScript event loop which you probably want to read up on if you're confused.
callback functions are not added to caller stack. There is no recursion here. They are called from event loop. Try to replace console.log in your example and watch result - stack is not growing.

Resources