npm wait.for not working as expected - node.js

I tried the code below as provided on the official site - https://www.npmjs.com/package/wait.for - but it is not working as expected.
Output:
before calling test
after calling test
reverse for 216.58.196.142: ["syd15s04-in-f14.1e100.net"]
Expected output:
before calling test
reverse for 216.58.196.142: ["syd15s04-in-f14.1e100.net"]
after calling test
What is that I can do to make it work?
var dns = require("dns"), wait = require('wait.for');

function test() {
    var addresses = wait.for(dns.resolve4, "google.com");
    for (var i = 0; i < addresses.length; i++) {
        var a = addresses[i];
        console.log("reverse for " + a + ": " + JSON.stringify(wait.for(dns.reverse, a)));
    }
}

console.log("before calling test");
wait.launchFiber(test);
console.log("after calling test");

wait.launchFiber(test);
means launch and forget: launchFiber starts a concurrent execution fiber. Inside the fiber you can use wait.for, but the fiber runs concurrently with the main execution thread, so the main thread prints "after calling test" before the fiber finishes.
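If you want "after calling test" to print only after the fiber completes, log it inside the fiber, after test() returns. A minimal sketch, reusing the question's test function:

console.log("before calling test");
wait.launchFiber(function() {
    test(); // wait.for calls inside test() block only this fiber
    console.log("after calling test"); // now runs after the DNS lookups finish
});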

Related

Loops with callbacks in node.js

I have the following code in node.js
var fs = require('fs');
for (var i = 0; i < allLetters.length; i++) {
    for (var k = 0; k < allLetters.length; k++) {
        var allFilesName = fs.readdirSync("/opt/" + allLetters[i] + "/" + allLetters[k]);
        for (var t = 0; t < allFilesName.length; t++) {
            dosomething(allFilesName[t]);
        }
    }
}
dosomething is a function with a callback, and it includes an I/O operation.
The problem is that my application doesn't execute the callbacks until it finishes the i, k and t loops. In other words, all the CPU time is spent completing the loops, and only after they finish do the callbacks execute and return.
I want the loops and the callbacks to execute in parallel, so that I get results from the callbacks while the loops are still running.
As stated in the comments, the each function of the async library does what you are looking for.
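For example, with the async library (a sketch, not the asker's exact code; it assumes dosomething is changed to take a completion callback as its second argument):

var async = require('async');

// run dosomething on every file, with the I/O happening in parallel
async.each(allFilesName, function(fileName, callback) {
    dosomething(fileName, callback); // assumed signature: dosomething(file, callback)
}, function(err) {
    if (err) return console.error(err);
    console.log('all files processed'); // fires once every callback has completed
});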

How to make node work concurrently?

Node.js is famous for concurrency; however, I'm confused about how to make it work concurrently. I started two requests from Chrome one after another very quickly, and I expected the output in the console to be:
"get a new request"
immediately after my second request, "get a new request" should be printed
after several seconds, "end the new request"
after several seconds, "end the new request"
However, what I saw is:
"get a new request"
after several seconds, "end the new request"
"get a new request"
after several seconds, "end the new request"
That means the second request is NOT handled until the first one is done. Below is my sample code; is there anything I missed?
var http = require("http");
var url = require("url");

function start(route) {
    http.createServer(function(request, response) {
        console.log('get a new request');
        // a time consuming loop
        for (var i = 0; i < 10000000000; ++i) {
        }
        route(url.parse(request.url).pathname);
        response.writeHead(200, {"Content-Type": "text/plain"});
        response.end();
        console.log('end the new request');
    }).listen(5858);
}

function saySomething(something) {
    console.log(something);
}

exports.start = start;
exports.saySomething = saySomething;
You don't have to do anything.
It's based on non-blocking I/O. Put simply, there is an event loop. A certain set of synchronous code is run; once done, the next iteration runs and picks up the next set of synchronous code. Any time an async operation runs (a db fetch, setTimeout, reading a file, etc.), the next tick of the event loop runs. This way there is never any code just waiting.
It's not threaded. In your example, the for loop is one continuous chunk of code, so JS will run the entire for loop before it can handle another HTTP request.
Try putting a setTimeout around the for loop so that node can switch to the next event-loop iteration and, in your case, handle a web request.
Node can't handle a loop like this:
for (var i = 0; i < 10000000000; ++i) {}
concurrently. But it handles I/O concurrently.
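As a sketch of that idea (the function name and chunk size are illustrative, not part of the question's code), split the loop into chunks and yield to the event loop between them with setImmediate (or setTimeout(fn, 0)):

function countInChunks(total, chunkSize, done) {
    var i = 0;
    function nextChunk() {
        var end = Math.min(i + chunkSize, total);
        for (; i < end; ++i) {
            // simulated CPU work
        }
        if (i < total) {
            setImmediate(nextChunk); // let node handle pending requests first
        } else {
            done();
        }
    }
    nextChunk();
}

countInChunks(10000000000, 1000000, function() {
    console.log('done counting');
});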
You might want to look at Clusters:
http://nodejs.org/api/cluster.html#cluster_how_it_works
http://rowanmanning.com/posts/node-cluster-and-express/
~~This is the expected behavior, we call this blocking. The solution for handling concurrent requests is making the code non-blocking. As soon as you called response.writeHead the code began blocking waiting for response.end.~~
EDIT 7/8/14:
Had to deal with this problem recently and found out you can use threads for this:
https://www.npmjs.org/package/webworker-threads
Webworker-threads provides an asynchronous API for CPU-bound tasks that's missing in Node.js:
var Worker = require('webworker-threads').Worker;

require('http').createServer(function (req, res) {
    var fibo = new Worker(function() {
        function fibo(n) {
            return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
        }
        this.onmessage = function (event) {
            postMessage(fibo(event.data));
        };
    });
    fibo.onmessage = function (event) {
        res.end('fib(40) = ' + event.data);
    };
    fibo.postMessage(40);
}).listen(port);
And it won't block the event loop because for each request, the fibo worker will run in parallel in a separate background thread.

Best way to execute parallel processing in Node.js

I'm trying to write a small node application that will search through and parse a large number of files on the file system.
In order to speed up the search, we are attempting to use some sort of map reduce. The plan would be the following simplified scenario:
Web request comes in with a search query
3 processes are started that each get assigned 1000 (different) files
once a process completes, it would 'return' its results back to the main thread
once all processes complete, the main thread would continue by returning the combined result as a JSON result
The questions I have with this are:
Is this doable in Node?
What is the recommended way of doing it?
I've been fiddling, but have gotten no further than the following example using child processes:
initiator:
var child_process = require('child_process');

function Worker() {
    return child_process.fork("myProcess.js");
}

for (var i = 0; i < require('os').cpus().length; i++) {
    var worker = new Worker();
    worker.send(workItems.slice(i * itemsPerProcess, (i + 1) * itemsPerProcess));
}
myProcess.js
process.on('message', function(msg) {
    var valuesToReturn = [];
    // Do file reading here
    // How would I return valuesToReturn?
    process.exit(0);
});
A few side notes:
I'm aware the number of processes should depend on the number of CPUs on the server
I'm also aware of speed restrictions in a file system. Consider it a proof of concept before we move this to a database or Lucene instance :-)
Should be doable. As a simple example:
// parent.js
var child_process = require('child_process');
var numchild = require('os').cpus().length;
var done = 0;

for (var i = 0; i < numchild; i++) {
    var child = child_process.fork('./child');
    child.send((i + 1) * 1000);
    child.on('message', function(message) {
        console.log('[parent] received message from child:', message);
        done++;
        if (done === numchild) {
            console.log('[parent] received all results');
            ...
        }
    });
}
// child.js
process.on('message', function(message) {
    console.log('[child] received message from server:', message);
    setTimeout(function() {
        process.send({
            child : process.pid,
            result : message + 1
        });
        process.disconnect();
    }, (0.5 + Math.random()) * 5000);
});
So the parent process spawns a number of child processes and passes each a message. It also installs an event handler to listen for any messages sent back from the child (with the result, for instance).
The child process waits for messages from the parent, and starts processing (in this case, it just starts a timer with a random timeout to simulate some work being done). Once it's done, it sends the result back to the parent process and uses process.disconnect() to disconnect itself from the parent (basically stopping the child process).
The parent process keeps track of the number of child processes started, and the number of them that have sent back a result. When those numbers are equal, the parent received all results from the child processes so it can combine all results and return the JSON result.
For a distributed problem like this, I've used zmq and it has worked really well. I'll describe a similar problem that I ran into and attempted to solve via processes (but failed), before turning to zmq.
Using bcrypt, or another expensive hashing algorithm, is wise, but it blocks the node process for around 0.5 seconds. We had to offload this to a different server, and as a quick fix I used essentially exactly what you did: run a child process, send messages to it, and get it to respond. The only issue we found was that, for whatever reason, our child process would pin an entire core when it was doing absolutely no work. (I still haven't figured out why this happened; we ran a trace and it appeared that epoll was failing on stdout/stdin streams. It would also only happen on our Linux boxes and worked fine on OS X.)
edit:
The pinning of the core was fixed in https://github.com/joyent/libuv/commit/12210fe and was related to https://github.com/joyent/node/issues/5504, so if you run into the issue and you're using CentOS + kernel v2.6.32: update node, or update your kernel!
Regardless of the issues I had with child_process.fork(), here's a nifty pattern I always use
client:
var child_process = require('child_process');

function FileParser() {
    this.__callbackById = [];
    this.__callbackIdIncrement = 0;
    this.__process = child_process.fork('./child');
    this.__process.on('message', this.handleMessage.bind(this));
}

FileParser.prototype.handleMessage = function handleMessage(message) {
    var error = message.error;
    var result = message.result;
    var callbackId = message.callbackId;
    var callback = this.__callbackById[callbackId];
    if (!callback) {
        return;
    }
    callback(error, result);
    delete this.__callbackById[callbackId];
};

FileParser.prototype.parse = function parse(data, callback) {
    this.__callbackIdIncrement = (this.__callbackIdIncrement + 1) % 10000000;
    this.__callbackById[this.__callbackIdIncrement] = callback;
    this.__process.send({
        data: data, // optionally you could pass in the path of the file, and open it in the child process.
        callbackId: this.__callbackIdIncrement
    });
};

module.exports = FileParser;
child process:
process.on('message', function(message) {
    var callbackId = message.callbackId;
    var data = message.data;

    function respond(error, response) {
        process.send({
            callbackId: callbackId,
            error: error,
            result: response
        });
    }

    // parse data..
    respond(undefined, "computed data");
});
We also need a pattern to synchronize the different processes: when each process finishes its task it responds to us, we increment a count for each process that finishes, and we call the Semaphore's callback when we've hit the count we want.
function Semaphore(wait, callback) {
    this.callback = callback;
    this.wait = wait;
    this.counted = 0;
}

Semaphore.prototype.signal = function signal() {
    this.counted++;
    if (this.counted >= this.wait) {
        this.callback();
    }
};

module.exports = Semaphore;
here's a use case that ties all the above patterns together:
var FileParser = require('./FileParser');
var Semaphore = require('./Semaphore');

var arrFileParsers = [];
for (var i = 0; i < require('os').cpus().length; i++) {
    var fileParser = new FileParser();
    arrFileParsers.push(fileParser);
}

function getFiles() {
    return ["file", "file"];
}

var arrResults = [];
function onAllFilesParsed() {
    console.log('all results completed', JSON.stringify(arrResults));
}

var lock = new Semaphore(arrFileParsers.length, onAllFilesParsed);

arrFileParsers.forEach(function(fileParser) {
    var arrFiles = getFiles(); // you need to decide how to split the files into 1k chunks
    fileParser.parse(arrFiles, function (error, result) {
        arrResults.push(result);
        lock.signal();
    });
});
Eventually I used http://zguide.zeromq.org/page:all#The-Load-Balancing-Pattern, where the client was using the nodejs zmq client, and the workers/broker were written in C. This allowed us to scale this across multiple machines, instead of just a local machine with sub processes.
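For flavor, here is a minimal sketch of what the Node side of such a client can look like with the classic zmq package (the broker address and message format are assumptions, not details from the original setup):

var zmq = require('zmq');

var requester = zmq.socket('req');
requester.connect('tcp://127.0.0.1:5559'); // hypothetical broker frontend

requester.on('message', function(reply) {
    console.log('received result:', reply.toString());
});

// hypothetical work item; the broker load-balances it to an idle worker
requester.send(JSON.stringify({ files: ['a.log', 'b.log'] }));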

Node.js - execute the anonymous function that was passed to another function

I understand that Node.js has the concept of event-driven, asynchronous callbacks, by utilizing an event loop.
database.query("SELECT * FROM hugetable", function(rows) { var result = rows; });
console.log("Hello World");
Here, instead of expecting database.query() to directly return a result to us, we pass it a second parameter, an anonymous function.
Now, Node.js can handle the database request asynchronously. Provided that database.query() is part of an asynchronous library, this is what Node.js does: just as before, it takes the query and sends it to the database. But instead of waiting for it to be finished, it makes a mental note that says "When at some point in the future the database server is done and sends the result of the query, then I have to execute the anonymous function that was passed to database.query()."
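You can simulate this behavior without a database. A minimal sketch (query here is a made-up stand-in for database.query, backed by setTimeout):

function query(sql, callback) {
    // pretend the database answers 100ms later, on a future event-loop tick
    setTimeout(function() {
        callback([{ id: 1 }, { id: 2 }]);
    }, 100);
}

query("SELECT * FROM hugetable", function(rows) {
    console.log("got " + rows.length + " rows"); // printed second
});
console.log("Hello World"); // printed first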
I am trying the same with some sample code (as I am a newbie and have not yet reached Node.js DB interactions):
[root@example]# cat server8.js
function myfun(noparm, afterend)
{
    for (var i = 0; i < 10; i++)
        console.log("The value is " + i);
}

function mynextfn()
{
    console.log("Hello World");
}

function afterend()
{
    console.log("Hello afterend");
}

myfun(0, afterend);
mynextfn();
[root@idc-bldtool01 example]# node server8.js
The value is 0
The value is 1
The value is 2
The value is 3
The value is 4
The value is 5
The value is 6
The value is 7
The value is 8
The value is 9
Hello World
[root@iexample]#
As such, I do not see the "concept of event-driven, asynchronous callbacks, by utilizing an event loop"?
Can anyone please help me with some basic examples?
You hand the afterend function over as a parameter, but you never call it.
Your function myfun must be:
function myfun(noparm, afterend)
{
    for (var i = 0; i < 10; i++) {
        console.log("The value is " + i);
    }
    afterend();
}
Then what you expect will happen ;-)
Second: your myfun function is completely synchronous, so there is no chance for Node.js to run mynextfn before the content of the myfun function.
Potentially afterend will be run before myfun finishes; that depends on timing, as neither does any heavy lifting.
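To actually see the event loop at work, make myfun asynchronous. A sketch using setTimeout (myfunAsync is a made-up name, not from the question):

function myfunAsync(noparm, afterend) {
    setTimeout(function() {
        for (var i = 0; i < 10; i++) {
            console.log("The value is " + i);
        }
        afterend();
    }, 0);
}

myfunAsync(0, afterend);
mynextfn(); // "Hello World" now prints before the loop output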

Stopping a Node.js callback by clearing setInterval

I need a sanity check on this situation. If I have this function:
var timerID;

function playBack(client, log) {
    if (timerID) clearInterval(timerID);
    timerID = setInterval(function() {
        var buffer = [],
            numberOfLogLines = log.length;
        while (numberOfLogLines > 0) {
            var l = log.pop().trim(); // pull from the top
            buffer.push(l);
            if (l.split(",")[0] === "$TIMETOSEND") { // flag for the timing signal
                client.send("{\"playback\":" + JSON.stringify(buffer) + "}");
                break;
            }
        }
    }, playBackSpeed);
}
and call
clearInterval(timerID)
at some other point, does this stop the callback? I am trying to create a pause/play situation: to start the playback again I would invoke playBack(client, log) to restart. It seems to work, but I am wondering whether clearing the interval stops the callback. I don't want to create a logjam for the heavy pause/play-happy people.
A call to clearInterval(timerID) will stop any future calls to your anonymous function. If previous calls of it are still running, those will finish.
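So a pause/play pair built on the question's code could look like this sketch (pause is a new helper, not from the question; playBack is unchanged and already clears any previous timer):

function pause() {
    if (timerID) {
        clearInterval(timerID); // no further ticks will be scheduled
        timerID = null;
    }
}

// resume by simply calling playBack(client, log) again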
