Node.js setTimeout() behaviour

I want a piece of code to repeat 100 times with 1 sec of delay in between. This is my code:
for (var i = 0; i < 100; i++) {
    setTimeout(function () {
        // do stuff
    }, 1000);
}
While this seems correct to me, it is not. Instead of running "do stuff" 100 times with a 1 sec wait in between, it waits 1 sec and then runs "do stuff" 100 times with no delay.
Does anybody have any idea why?

You can accomplish it by using setInterval().
It calls the function of our choice repeatedly until clearInterval() is called with the timer handle that setInterval() returned.
See the example below with comments (and remember to open your developer console (in Chrome: right click -> Inspect Element -> Console) to view the console.log output).
// Total count of times we have called doStuff()
var count = 0;

/**
 * Timer that calls doStuff() 100 times, once per second
 */
var timer = setInterval(function () {
    // If count (incremented by one) is still below 100, keep running
    if (count++ < 100) {
        return doStuff();
    }
    // Mission complete, clear the interval
    clearInterval(timer);
}, 1000); // One second in milliseconds

/**
 * Method for doing stuff
 */
function doStuff() {
    console.log("doing stuff");
}
As a bonus: your original approach won't work because you are basically scheduling 100 setTimeout calls as fast as possible. So instead of running with one-second gaps, they all fire as fast as the for loop placed them in the queue, starting 1000 milliseconds after the current time.
For instance, the following code logs timestamps when your approach is used:
for (var i = 0; i < 100; i++) {
    setTimeout(function () {
        // Current time in milliseconds
        console.log(new Date().getTime());
    }, 1000);
}
It will output something like (milliseconds):
1404911593267 (14 times called with this timestamp...)
1404911593268 (10 times called with this timestamp...)
1404911593269 (12 times called with this timestamp...)
1404911593270 (15 times called with this timestamp...)
1404911593271 (12 times called with this timestamp...)

You need to use a callback; node.js is asynchronous:
function call_1000times(callback) {
    var i = 0;

    function do_stuff() {
        // do stuff
        if (i < 1000) {
            i = i + 1;
            setTimeout(do_stuff, 1000); // wait one second between runs
        } else {
            callback();
        }
    }

    do_stuff();
}
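For example, a minimal usage sketch of the function above:
call_1000times(function () {
    console.log("all done");
});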
Or, more cleanly:
setInterval(function () {
    // do stuff
}, 1000);
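Note that this bare interval never stops on its own. A minimal sketch of capping it at 100 runs, to match the original question:
var runs = 0;
var handle = setInterval(function () {
    // do stuff
    if (++runs >= 100) {
        clearInterval(handle); // stop after 100 runs
    }
}, 1000);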

Now that you appreciate that the for loop is iterating in a matter of milliseconds, another way to do it would be to simply adjust the setTimeout delay according to the count.
for (var i = 0; i < 100; i++) {
    setTimeout(function () {
        // do stuff
    }, i * 1000);
}
For many use-cases, this could be seen as bad. But in particular circumstances where you know that you definitely want to run code x number of times after y number of seconds, it could be useful.
It's also worth noting there are some who believe using setInterval is bad practice.

I prefer the recursive function. Call the function initially with counter = 0, and then within the function check that counter is less than 100. If so, do your stuff, then call setTimeout with another call to doStuff, passing counter + 1. The function will run exactly 100 times, once per second, then quit:
const doStuff = counter => {
    if (counter < 100) {
        // do some stuff
        setTimeout(() => doStuff(counter + 1), 1000);
    }
    return;
};

doStuff(0);

Related

Minimum amount of execution time before proceeding

So I need to make my for loop wait AT LEAST 124ms before executing the next loop run. Note that it can, however, take more than 124ms to complete the stuff inside the loop, as I'm getting data from a website's API and that has to be received before moving on.
My idea was something like this:
for (i = 0; i < 1000; i++) {
    var startTime = Date.now();
    // Do some stuff

    executeTimeCheck();

    function executeTimeCheck() {
        setTimeout(executeTimeCheck, 1);
        if (((Date.now()) - startTime) >= 124) { return; }
        else { executeTimeCheck(); return; }
    }
}
The problem is that I keep running out of stack (RangeError: Maximum call stack size exceeded). If you have any idea how to make this work, please let me know.
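One non-blocking way to enforce the minimum, sketched below on the assumption that the API call returns a promise (fetchFromApi() is a hypothetical placeholder for the real request), is to await both the work and a 124ms timer before starting the next iteration:
// Minimal sketch: each iteration takes at least 124ms, or longer if the API is slow.
// fetchFromApi() is a hypothetical placeholder for the real API request.
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));
async function run() {
    for (let i = 0; i < 1000; i++) {
        await Promise.all([
            fetchFromApi(i), // assumed to return a promise with the API data
            delay(124)       // enforce the minimum time per iteration
        ]);
    }
}
run();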

phantomJS scraping with breaks not working

I'm trying to scrape some URLs from a web service. It's working perfectly, but I need to scrape something like 10,000 pages from the same web service.
I do this by creating multiple phantomJS processes, and each one opens and evaluates a different URL (it's the same service; all I change is one parameter in the URL of the website).
Problem is I don't want to open 10,000 pages at once, since I don't want their service to crash, and I don't want my server to crash either.
I'm trying to write logic that opens/evaluates/inserts-to-DB ~10 pages, then sleeps for a minute or so.
Let's say this is what I have now:
var numOfRequests = 10000; // Total requests
for (var dataIndex = 0; dataIndex < numOfRequests; dataIndex++) {
    phantom.create({ 'port': freeport }, function (ph) {
        ph.createPage(function (page) {
            page.open("http://..." + data[dataIncFirstPage], function (status) {
I want to insert somewhere in the middle something like:
if (dataIndex % 10 == 0) {
    sleep(60); // I can use the sleep module
}
Everywhere I try to place the sleep call, the program crashes/freezes/loops forever...
Any idea what I should try?
I've tried placing the above code as the first line inside the for loop, but this doesn't work (maybe because of the callback functions that are waiting to fire).
Placing it inside the phantom.create() callback doesn't work either.
Realize that Node.js runs asynchronously: in your for-loop, each method call executes one after the other, but the phantom.create call returns almost immediately, and then the next cycle of the for-loop kicks in.
To answer your question, you want the sleep command at the end of the phantom.create block, still inside the for-loop. Like this:
var numOfRequests = 10000; // Total requests

for (var dataIndex = 0; dataIndex < numOfRequests; dataIndex++) {
    phantom.create({ 'port': freeport }, function (ph) {
        // ..whatever in here
    });

    if (dataIndex % 10 == 0) {
        sleep(60); // I can use the sleep module
    }
}
Also, consider using a package to help with these control-flow issues. Async is a good one; it has a method, eachLimit, that will concurrently run a number of processes, up to a limit. Handy! You will need to create an array of input objects for the iterations you wish to run, like this:
var dataInputs = [{ id: 0, data: "/abc" }, { id: 1, data: "/def" }];

function processPhantom(dataItem, callback) {
    console.log("Starting processing for " + JSON.stringify(dataItem));
    phantom.create({ 'port': freeport }, function (ph) {
        // ..whatever in here.
        // When done, in the inner-most callback, call:
        // callback(null); // let the next parallel item into the queue
        // or
        // callback(new Error("Something went wrong")); // break the processing
    });
}

async.eachLimit(dataInputs, 10, processPhantom, function (err) {
    // Can check for err.
    // It is here that everything is finished.
    console.log("Finished with async.eachLimit");
});
Sleeping for a minute isn't a bad idea, but in groups of 10, that will take you 1000 minutes, which is over 16 hours! It would be more convenient to only issue a call when there is space in your queue - and be sure to log which requests are in process and which have completed.

How do I execute a piece of code no more than every X minutes?

Say I have a link aggregation app where users vote on links. I sort the links using hotness scores generated by an algorithm that runs whenever a link is voted on. However, running it on every vote seems excessive. How do I limit it so that it runs no more than, say, every 5 minutes?
a) use a cron job
b) keep track of the timestamp when the procedure was last run; when (current timestamp - stored timestamp) > 5 minutes, run the procedure and update the timestamp (a sketch follows).
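A minimal sketch of option (b), where recordVote() and recomputeHotness() are hypothetical placeholders for your own vote-handling and scoring logic:
var lastRun = 0;
var FIVE_MINUTES = 5 * 60 * 1000;
function onVote(link) {
    recordVote(link); // hypothetical: your existing vote-handling logic
    if (Date.now() - lastRun > FIVE_MINUTES) {
        lastRun = Date.now();
        recomputeHotness(); // hypothetical: your hotness-score algorithm
    }
}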
var yourVoteStuff = function () {
    ...
    setTimeout(yourVoteStuff, 5 * 60 * 1000);
};

yourVoteStuff();
Before asking why not to use setInterval, well, read the comment below.
Why ask "why setInterval" and not "why a cron job"? Am I that wrong?
First you build a receiver that receives all your link submissions.
Secondly, the receiver push()es each received link onto a queue (I strongly recommend redis).
Then you have an aggregator which loops at a time interval of your choosing. Within this loop each queued link is poll()ed and passed on to your business logic.
I have used this solution in production and I can tell you that it scales well and performs well.
Example of use:
var MIN = 5;         // don't run aggregation for a short queue, saves resources
var THROTTLE = 10;   // max links aggregated per tick
var queue = [];
var bucket = [];
var interval = 1000; // 1 sec

flow.on("submission", function (link) {
    queue.push(link);
});

___aggregationLoop(interval);

function ___aggregationLoop(interval) {
    setTimeout(function () {
        bucket = [];
        if (queue.length <= MIN) {
            ___aggregationLoop(100); // queue still short: poll again soon
            return;
        }
        for (var i = 0; i < THROTTLE; ++i) {
            bucket.push(queue.pop()); // move up to THROTTLE links into the bucket
        }
        ___aggregationLoop(interval);
    }, interval);
}
Cheers!

How to setTimeout in node.js?

I need to be able to make retries in node.js in the event of failure inside a function. I've set up a while loop as shown below, but I am getting slightly confused about how I should wrap the function call to make sure that it won't block my whole server.
What should I do?
while (retryCount < 10 && !success) {
    // Alternative one
    while (new Date().getTime() < now + 1000) {
        myFunction();
    }
    // Or:
    setTimeout(myFunction(), 1000);
}
You can store the number of tries on the function object itself. That will be fine for a cron job. If you need the same behaviour in a request context, you must store the attempt counter in the request scope (not on the function object).
var fnc = function () {
    console.log('try');
    if (true) { // Error condition
        // Error here
        if (!fnc.tries) fnc.tries = 0;
        fnc.tries++;
        console.log(fnc.tries);
        if (fnc.tries <= 10) {
            setTimeout(fnc, 1000);
        } else {
            fnc.tries = 0;
        }
        // Something went wrong
    } else {
        // We have a result
    }
};

fnc();
I'd say use the setTimeout method, that way the client won't be stuck inside the while loop that checks the time.
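A minimal sketch of that idea, where doWork() is a hypothetical stand-in for whatever call is being retried (assumed to return true on success):
var retryCount = 0;
function attempt() {
    if (doWork()) return;          // hypothetical: returns true on success
    if (++retryCount < 10) {
        setTimeout(attempt, 1000); // retry in one second without blocking
    }
}
attempt();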
That outer while loop is going to block; you'd have to refactor using only setTimeout. However, the fact that you want this sort of thing indicates to me that your code structure is really terrible and needs more reworking. What is it that you are retrying? How are you detecting an error condition? Does doing it 10 times really make the chances of success any higher?
I have a gist containing a generic function that will do this sort of thing for you, but I'm reluctant to share if this is an XY problem.

Node.js long poll logic help!

I'm trying to implement a long-polling strategy with node.js.
What I want is: when a request is made to node.js, it will wait a maximum of 30 seconds for some data to become available. If there is data, it will output it and exit; if there is no data, it will just wait out the 30 seconds max and then exit.
Here is the basic code logic I came up with:
var http = require('http');

var poll_function = function (req, res, counter) {
    if (counter > 30) {
        res.writeHeader(200, { 'Content-Type': 'text/html;charset=utf8' });
        res.end('Output after 5 seconds!');
    } else {
        var rand = Math.random();
        if (rand > 0.85) {
            res.writeHeader(200, { 'Content-Type': 'text/html;charset=utf8' });
            res.end('Output done because rand: ' + rand + '! in counter: ' + counter);
        }
    }
    setTimeout(
        function () {
            poll_function.apply(this, [req, res, counter + 1]);
        },
        1000
    );
};

http.createServer(
    function (req, res) {
        poll_function(req, res, 1);
    }
).listen(8088);
What I figure is: when a request is made, poll_function is called, which calls itself after 1 second via a setTimeout within itself. So it should remain asynchronous, meaning it will not block other requests, and it will provide its output when it's done.
I have used Math.random() logic here to simulate data becoming available at various intervals.
Now, what concerns me is:
1) Will there be any problem with it? I simply don't wish to deploy it without being sure it will not strike back!
2) Is it efficient? If not, any suggestion how I can improve it?
Thanks,
Anjan
All node.js code is non-blocking as long as you don't get stuck in a tight CPU loop (like while(true)) or use a library that has blocking I/O. Putting a setTimeout at the end of a function doesn't make it any more parallel; it just defers some CPU work until a later event.
Here is a simple demo chat server that randomly emits "Hello World" every 0 to 60 seconds to any and all connected clients.
// A simple chat server using long-poll and timeout
var Http = require('http');

// Array of open callbacks listening for a result
var listeners = [];

Http.createServer(function (req, res) {
    function onData(data) {
        clearTimeout(timeout); // data arrived, cancel the 30-second timeout
        res.end(data);
    }
    listeners.push(onData);

    // Set a timeout of 30 seconds
    var timeout = setTimeout(function () {
        // Remove our callback from the listeners array
        listeners.splice(listeners.indexOf(onData), 1);
        res.end("Timeout!");
    }, 30000);
}).listen(8080);

console.log("Server listening on 8080");

function emitEvent(data) {
    for (var i = 0, l = listeners.length; i < l; i++) {
        listeners[i](data);
    }
    listeners.length = 0;
}

// Simulate random events
function randomEvents() {
    emitEvent("Hello World");
    setTimeout(randomEvents, Math.random() * 60000);
}

setTimeout(randomEvents, Math.random() * 60000);
This will be quite fast. The only dangerous part is the splice: splice can be slow if the array gets very large. It could be made more efficient by, instead of closing each connection 30 seconds after it started, closing all the handlers at once every 30 seconds (or 30 seconds after the last event). But again, this is unlikely to be the bottleneck, since each of those array items is backed by a real client connection that is probably more expensive.
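As a variation on that idea, a Set makes the removal constant-time instead of a linear splice. A minimal sketch of just that bookkeeping (not the full server), assuming the same onData callbacks as above:
// Hypothetical sketch: track waiting callbacks in a Set so removal is O(1)
var listeners = new Set();
function addListener(onData) {
    listeners.add(onData);
    return setTimeout(function () {
        listeners.delete(onData); // constant-time removal on timeout
    }, 30000);
}
function emitEvent(data) {
    listeners.forEach(function (onData) {
        onData(data);
    });
    listeners.clear(); // everyone waiting has been answered
}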
