Parallel Processing on Azure - azure-web-app-service

We're trying to fire off two commands in parallel to speed up the return of data, but it looks like the processes are just being queued. The cumulative time to run the processes individually equals the time taken to run them both at the same time, and the second data set is not returned until the first has completed.
Can anyone shed some light on the situation please?
Thanks,
I'll attach an example.
Thread GetResult1Thread = new Thread(new ThreadStart(GetResult1));
GetResult1Thread.IsBackground = true;
GetResult1Thread.Start();
Thread GetResult2Thread = new Thread(new ThreadStart(GetResult2));
GetResult2Thread.IsBackground = true;
GetResult2Thread.Start();
}
public void GetResult1()
{
    // Blocks until the database request is complete
    // Total execution time: 1 min
    DataSet ds = db.getSPDataSet("getCBRWeeklyPercentageBody", "#JobID", JobID,
        "#StartDate", StartDate, "#ResourcingGradeIDs",
        ResourceID, "#StaffIds", StaffIds);
}
public void GetResult2()
{
    // Thread pending due to Result1 not yet being released by the database
    // Response delayed by 1 min
    DataSet ds = db.getSPDataSet("getCBRWeeklyPercentageHeader",
        "#JobID", JobID, "#StartDate", StartDate,
        "#ResourcingGradeIDs", ResourceID, "#StaffIds", StaffIds);
}
And the JS:
//Request 1
$.ajax({
    url: "api/SaveDefaultSettings",
    data: {
    },
    type: "POST",
    cache: false,
    async: true,
    success: function (data) {
        //Default save result
        //Get Active Activity List
        ResourceHours = [];
        GetDefaultData();
        selectedRows = SaveData;
        ResourceHours = data;
        var spl = StartDate.split("-");
        if (spl[1].length == 1) {
            spl[1] = "0" + spl[1];
        }
        StartDate = spl[2] + "/" + spl[1] + "/" + spl[0];
    },
    error: function (xhr, ajaxOptions, thrownError) {
    }
});
//Request 2
$.ajax({
    url: "api/GetResources",
    data: { JobID: JobID, StartDate: StartDate },
    type: "POST",
    cache: false,
    async: true,
    success: function (data) {
        // CallProgressDialog("Processing", "Please wait while data is loading.");
        Resources = [];
        Resources = data;
        Populate();
        $(".QuantimDialog_Button0").click();
    },
    error: function (xhr, ajaxOptions, thrownError) {
    }
});

For testing purposes, can you please create a simpler case? At the least, cutting the DB dependency would eliminate the need to establish a DB connection and execute a stored procedure, which might have some locking inside.
Also note that browsers have a limit on concurrent connections (How many concurrent AJAX (XmlHttpRequest) requests are allowed in popular browsers?), as do the .NET application (https://msdn.microsoft.com/en-us/library/system.net.servicepointmanager.defaultconnectionlimit(v=vs.110).aspx) and the server itself (Max Outgoing Socket Connections in .NET/Windows Server).
The fewer variables in the test, the more likely you are to narrow down the bottleneck.
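For illustration, a hypothetical stripped-down test along those lines (the sleeps stand in for the stored-procedure calls; nothing here comes from the question's codebase):
// If the two threads below still take ~2 min in total instead of ~1 min,
// the serialization happens in the application, not in SQL locking.
public void GetResult1() { Thread.Sleep(TimeSpan.FromMinutes(1)); }
public void GetResult2() { Thread.Sleep(TimeSpan.FromMinutes(1)); }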

That is a rather old-school way of doing parallel processing. Nowadays you can just use the async support built into .NET: start two tasks and do a WhenAll on them.
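A minimal sketch of that suggestion, assuming GetResult1 and GetResult2 (the names from the question) stay as-is:
// Run both stored-procedure calls on the thread pool and resume when both finish.
var t1 = Task.Run(() => GetResult1());
var t2 = Task.Run(() => GetResult2());
await Task.WhenAll(t1, t2);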

Related

NodeJS Running multiple fetch requests inside multiple Node-Schedule jobs

My goal is to be able to run different functions in each of multiple node-schedule jobs, which will run at specific times of the day and will call a specific function depending on the time of day.
I have the following code, reduced to a minimal example.
var f1 = function () {
    return fetch('url1.php', {
        method: 'post',
        headers: {
            'Content-Type': 'application/json'
        }
    });
};
var f2 = function () {
    return fetch('url2.php', {
        method: 'post',
        headers: {
            'Content-Type': 'application/json'
        }
    });
};
var tasks = [f1, f2];
const promesas = async (tasks) => {
    return await Promise.all(tasks.map(function (e, j) {
        rules[j] = new schedule.RecurrenceRule();
        rules[j].dayOfWeek = [new schedule.Range(1, 5)];
        rules[j].hour = hours[j];
        rules[j].minute = mins[j];
        return schedule.scheduleJob(rules[j], async function () {
            try {
                await tasks[j]();
            } catch (error) {
                console.log(error);
            }
        });
    })).then(response => {
        return Promise.resolve(response);
    });
};
promesas(tasks);
I can run these functions just fine outside the jobs, or all within a single job, but not in the setup that I want, which I described at the beginning.
The first iteration works fine; the second just fails. The difference between each job is one minute, for testing purposes.
I am not sure what you're doing here, especially with your odd use of Promises. (Why await the Promise.all()? Why return Promise.resolve() of the already-resolved response? Why await each individual task rather than just letting it execute (or fail) asynchronously in what amounts to a cronjob script with no output?)
However, if this is working for you for the first iteration, that's fine.
Can I assume you're using node-schedule? If so, just note that you're actually using times of day 00:00, 01:01, etc. rather than "one minute between each for testing purposes" as you claim.
Here's how I would do what you seem to be trying to do:
const tasks = [f1, f2];
let i = 0;
for (let task of tasks) {
    let rule = new schedule.RecurrenceRule();
    rule.dayOfWeek = [new schedule.Range(1, 5)];
    rule.minute = (new Date().getMinutes() + (++i)) % 60; // adjust to suit your needs
    schedule.scheduleJob(rule, task);
}
I have confirmed this works, i.e. it issues one task per minute starting from the minute after the script is started. No await, then, or Promise.all is necessary for the simple example you have given, though they may be required depending on what you plan to do with the results of each job function.
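If you do need the results, a hedged variant of the scheduleJob line inside the loop above (res.status assumes the tasks keep returning fetch Responses, as in the question):
schedule.scheduleJob(rule, async () => {
    try {
        const res = await task(); // run the scheduled fetch and wait for it
        console.log('job finished with HTTP status', res.status);
    } catch (err) {
        console.log(err);
    }
});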

Grouping redis.get for 2ms and then executing by mget

My application makes about 50 redis.get calls to serve a single HTTP request; it serves millions of requests daily and runs on about 30 pods.
When monitoring on New Relic I see an average redis.get time of 200 ms. To optimize this I wrote a simple pipelining system in Node.js, which is simply a wrapper over redis.get: it pushes all requests into a queue and then executes the queue using redis.mget (fetching all the keys in bulk).
Following is the code snippet:
class RedisBulk {
    constructor() {
        this.queue = [];
        this.processingQueue = {};
        this.intervalId = setInterval(() => {
            this._processQueue();
        }, 5);
    }
    clear() {
        clearInterval(this.intervalId);
    }
    get(key, cb) {
        this.queue.push({cb, key});
    }
    _processQueue() {
        if (this.queue.length > 0) {
            let queueLength = this.queue.length;
            logger.debug('Processing Queue of length', queueLength);
            let time = (new Date).getTime();
            this.processingQueue[time] = this.queue;
            this.queue = []; // empty the queue
            let keys = [];
            this.processingQueue[time].forEach((item) => {
                keys.push(item.key);
            });
            global.redisClient.mget(keys, (err, replies) => {
                if (err) {
                    captureException(err);
                    console.error(err);
                } else {
                    this.processingQueue[time].forEach((item, index) => {
                        item.cb(err, replies[index]);
                    });
                }
                delete this.processingQueue[time];
            });
        }
    }
}
let redis_bulk = new RedisBulk();
redis_bulk.get('a', (err, reply) => { /* ... */ });
redis_bulk.get('b', (err, reply) => { /* ... */ });
redis_bulk.get('c', (err, reply) => { /* ... */ });
redis_bulk.get('d', (err, reply) => { /* ... */ });
My question is: is this a good approach? Will it help optimize redis.get time? Is there any other solution for the above problem?
Thanks
I'm not a redis expert, but judging by the documentation:
MGET has a time complexity of
O(N) where N is the number of keys to retrieve.
And GET has a time complexity of
O(1)
which brings both scenarios to the same end result in terms of time complexity in your scenario. Having a bulk request with MGET can bring you some improvement in I/O, but apart from that it looks like you have the same bottleneck.
I'd ideally split my data into chunks, responding via multiple HTTP requests in an async fashion, if that's an option.
Alternatively, you can try calling GET with Promise.all() to run the GET requests in parallel, for all the GET calls you need.
Something like:
const asyncRedis = require("async-redis");
const client = asyncRedis.createClient();

function bulk(keys) {
    // map explicitly so client.get keeps its `this` binding
    return Promise.all(keys.map((key) => client.get(key)));
}
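A hypothetical usage of that sketch:
bulk(['a', 'b', 'c', 'd']).then((values) => {
    console.log(values); // replies arrive in the same order as the keys
});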

Firebase Nodejs : DEADLINE_EXCEEDED: Deadline exceeded on set() [duplicate]

I took one of the sample functions from the Firestore documentation and was able to successfully run it from my local firebase environment. However, once I deployed to my firebase server, the function completes, but no entries are made in the firestore database. The firebase function logs show "Deadline Exceeded." I'm a bit baffled. Anyone know why this is happening and how to resolve this?
Here is the sample function:
exports.testingFunction = functions.https.onRequest((request, response) => {
    var data = {
        name: 'Los Angeles',
        state: 'CA',
        country: 'USA'
    };
    // Add a new document in collection "cities" with ID 'LA'
    var db = admin.firestore();
    var setDoc = db.collection('cities').doc('LA').set(data);
    response.status(200).send();
});
Firestore has limits.
"Deadline Exceeded" probably happens because of those limits.
See https://firebase.google.com/docs/firestore/quotas:
Maximum write rate to a document: 1 per second
https://groups.google.com/forum/#!msg/google-cloud-firestore-discuss/tGaZpTWQ7tQ/NdaDGRAzBgAJ
In my own experience, this problem can also happen when you try to write documents over a bad internet connection.
I use a solution similar to Jurgen's suggestion of inserting documents in batches smaller than 500 at once, and this error appears if I'm using a not-so-stable wifi connection. When I plug in the cable, the same script with the same data runs without errors.
I have written this little script which uses batch writes (max 500) and only writes one batch after the other.
Use it by first creating a batch worker: let batch: any = new FbBatchWorker(db);
Then add anything to the worker: batch.set(ref.doc(docId), MyObject); and finish it via batch.commit().
The API is the same as for the normal Firestore batch (https://firebase.google.com/docs/firestore/manage-data/transactions#batched-writes); note that, as the code below shows, it supports only set and delete operations.
import { firestore } from "firebase-admin";

class FBWorker {
    callback: Function;

    constructor(callback: Function) {
        this.callback = callback;
    }

    work(data: {
        type: "SET" | "DELETE";
        ref: FirebaseFirestore.DocumentReference;
        data?: any;
        options?: FirebaseFirestore.SetOptions;
    }) {
        if (data.type === "SET") {
            // tslint:disable-next-line: no-floating-promises
            data.ref.set(data.data, data.options).then(() => {
                this.callback();
            });
        } else if (data.type === "DELETE") {
            // tslint:disable-next-line: no-floating-promises
            data.ref.delete().then(() => {
                this.callback();
            });
        } else {
            this.callback();
        }
    }
}

export class FbBatchWorker {
    db: firestore.Firestore;
    batchList2: {
        type: "SET" | "DELETE";
        ref: FirebaseFirestore.DocumentReference;
        data?: any;
        options?: FirebaseFirestore.SetOptions;
    }[] = [];
    elemCount: number = 0;
    private _maxBatchSize: number = 490;

    public get maxBatchSize(): number {
        return this._maxBatchSize;
    }

    public set maxBatchSize(size: number) {
        if (size < 1) {
            throw new Error("Size must be positive");
        }
        if (size > 490) {
            throw new Error("Size must not be larger than 490");
        }
        this._maxBatchSize = size;
    }

    constructor(db: firestore.Firestore) {
        this.db = db;
    }

    async commit(): Promise<any> {
        const workerProms: Promise<any>[] = [];
        const maxWorker = this.batchList2.length > this.maxBatchSize ? this.maxBatchSize : this.batchList2.length;
        for (let w = 0; w < maxWorker; w++) {
            workerProms.push(
                new Promise((resolve) => {
                    const A = new FBWorker(() => {
                        if (this.batchList2.length > 0) {
                            A.work(this.batchList2.pop());
                        } else {
                            resolve();
                        }
                    });
                    // tslint:disable-next-line: no-floating-promises
                    A.work(this.batchList2.pop());
                }),
            );
        }
        return Promise.all(workerProms);
    }

    set(dbref: FirebaseFirestore.DocumentReference, data: any, options?: FirebaseFirestore.SetOptions): void {
        this.batchList2.push({
            type: "SET",
            ref: dbref,
            data,
            options,
        });
    }

    delete(dbref: FirebaseFirestore.DocumentReference) {
        this.batchList2.push({
            type: "DELETE",
            ref: dbref,
        });
    }
}
I tested this by having 15 concurrent AWS Lambda functions write 10,000 requests into the database, into different collections/documents, milliseconds apart. I did not get the DEADLINE_EXCEEDED error.
Please see the documentation on firebase.
'deadline-exceeded': Deadline expired before operation could complete. For operations that change the state of the system, this error may be returned even if the operation has completed successfully. For example, a successful response from a server could have been delayed long enough for the deadline to expire.
In our case we are writing a small amount of data and it works most of the time, but losing data is unacceptable. I have not concluded why Firestore fails to write simple small bits of data.
SOLUTION:
I am using an AWS Lambda function that uses an SQS event trigger.
# This function receives requests from the queue and handles them
# by persisting the survey answers for the respective users.
QuizAnswerQueueReceiver:
  handler: app/lambdas/quizAnswerQueueReceiver.handler
  timeout: 180 # The SQS visibility timeout should always be greater than the Lambda function’s timeout.
  reservedConcurrency: 1 # optional, reserved concurrency limit for this function. By default, AWS uses account concurrency limit
  events:
    - sqs:
        batchSize: 10 # Wait for 10 messages before processing.
        maximumBatchingWindow: 60 # The maximum amount of time in seconds to gather records before invoking the function
        arn:
          Fn::GetAtt:
            - SurveyAnswerReceiverQueue
            - Arn
  environment:
    NODE_ENV: ${self:custom.myStage}
I am using a dead letter queue connected to my main queue for failed events.
Resources:
  QuizAnswerReceiverQueue:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: ${self:provider.environment.QUIZ_ANSWER_RECEIVER_QUEUE}
      # VisibilityTimeout MUST be greater than the lambda function's timeout https://lumigo.io/blog/sqs-and-lambda-the-missing-guide-on-failure-modes/
      # The length of time during which a message will be unavailable after a message is delivered from the queue.
      # This blocks other components from receiving the same message and gives the initial component time to process and delete the message from the queue.
      VisibilityTimeout: 900 # The SQS visibility timeout should always be greater than the Lambda function's timeout.
      # The number of seconds that Amazon SQS retains a message. You can specify an integer value from 60 seconds (1 minute) to 1,209,600 seconds (14 days).
      MessageRetentionPeriod: 345600
      RedrivePolicy:
        deadLetterTargetArn:
          "Fn::GetAtt":
            - QuizAnswerReceiverQueueDLQ
            - Arn
        maxReceiveCount: 5 # The number of times a message is delivered to the source queue before being moved to the dead-letter queue.
  QuizAnswerReceiverQueueDLQ:
    Type: "AWS::SQS::Queue"
    Properties:
      QueueName: "${self:provider.environment.QUIZ_ANSWER_RECEIVER_QUEUE}DLQ"
      MessageRetentionPeriod: 1209600 # 14 days in seconds
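For completeness, a hypothetical sketch of what such a receiver handler might look like (the collection name surveyAnswers, the message shape, and the userId field are assumptions, not code from the answer; admin is the firebase-admin instance used earlier):
// Processes the batch of SQS messages delivered to one invocation,
// persisting each answer one at a time to keep the Firestore write rate low.
exports.handler = async (event) => {
    const db = admin.firestore();
    for (const record of event.Records) {
        const answer = JSON.parse(record.body);
        await db.collection('surveyAnswers').doc(answer.userId).set(answer);
    }
};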
If the error appears after around 10 seconds, it's probably not your internet connection; it might be that your functions are not returning any promise. In my experience I got the error simply because I had wrapped a Firebase set operation (which returns a promise) inside another promise.
You can do this:
return db.collection("COL_NAME").doc("DOC_NAME").set(attribs).then(ref => {
    var SuccessResponse = {
        "code": "200"
    };
    var resp = JSON.stringify(SuccessResponse);
    return resp;
}).catch(err => {
    console.log('Quiz Error OCCURED ', err);
    var FailureResponse = {
        "code": "400",
    };
    var resp = JSON.stringify(FailureResponse);
    return resp;
});
instead of
return new Promise((resolve, reject) => {
    db.collection("COL_NAME").doc("DOC_NAME").set(attribs).then(ref => {
        var SuccessResponse = {
            "code": "200"
        };
        var resp = JSON.stringify(SuccessResponse);
        return resp;
    }).catch(err => {
        console.log('Quiz Error OCCURED ', err);
        var FailureResponse = {
            "code": "400",
        };
        var resp = JSON.stringify(FailureResponse);
        return resp;
    });
});

How to run asynchronous tasks synchronously?

I'm developing an app with the following Node.js stack: Express/Socket.IO + React. In React I have DataTables, wherein you can search, and with every keystroke the data gets dynamically updated! :)
I use Socket.IO for data fetching, so on every keystroke the client socket emits some parameters and the server then calls the callback to return data. This works like a charm, but it is not guaranteed that the returned data comes back in the same order the requests were sent.
To simulate: when I type 'a', the server responds with this same 'a', and so on for every character.
I found the async module for Node.js and tried to use its queue to return tasks in the same order it received them. For simplicity I delayed the second incoming task with setTimeout to simulate a slow database query:
Declaration:
const async = require('async');
var queue = async.queue(function(task, callback) {
    if (task.count == 1) {
        setTimeout(function() {
            callback();
        }, 3000);
    } else {
        callback();
    }
}, 10);
Usage:
socket.on('result', function(data, fn) {
    var filter = data.filter;
    if (filter.length === 1) { // TEST SYNCHRONOUSLY
        queue.push({name: filter, count: 1}, function(err) {
            fn(filter);
            // console.log('finished processing slow');
        });
    } else {
        // add some items to the queue
        queue.push({name: filter, count: filter.length}, function(err) {
            fn(data.filter);
            // console.log('finished processing fast');
        });
    }
});
But the way I receive it in the client console when I search for abc is as follows:
ab -> abc -> a (after 3 sec)
I want it to be returned like this: a (after 3 sec) -> ab -> abc
My thought is that the queue runs the setTimeout and then moves on, and the setTimeout eventually fires somewhere on the event loop later on, resulting in later search filters being returned earlier than the slow-performing one.
How can I solve this problem?
First, a few comments which might help clear up your understanding of async calls:
Using a timeout to try to align async calls is a bad idea; that is not what async calls are about. You will never know how long an async call will take, so you can never set an appropriate timeout.
I believe you are misunderstanding the usage of queue from the async library you described. The documentation for queue can be found here.
Copy-pasting the documentation here, in case things change or the link goes down:
Creates a queue object with the specified concurrency. Tasks added to the queue are processed in parallel (up to the concurrency limit). If all workers are in progress, the task is queued until one becomes available. Once a worker completes a task, that task's callback is called.
The above means that the queue can simply be used to prioritize the async tasks a given worker can perform. The different async tasks can still finish at different times.
Potential solutions
There are a few solutions to your problem, depending on your requirements:
1. You only send one async call at a time and wait for it to finish before sending the next one.
2. You store the results and only display them to the user when all calls have finished.
3. You disregard all calls except for the latest async call.
In your case I would pick solution 3, as you are searching for something. Why would you care about the results for "a" if the user is already searching for "abc" before the response for "a" arrives?
This can be done by giving each request a timestamp and then sorting based on the timestamp, taking the latest.
SOLUTION:
Server:
exports = module.exports = function(io) {
    io.sockets.on('connection', function (socket) {
        socket.on('result', function(data, fn) {
            var filter = data.filter;
            var counter = data.counter;
            if (filter.length === 1 || filter.length === 5) { // TEST SYNCHRONOUSLY
                setTimeout(function() {
                    fn({ filter: filter, counter: counter }); // return to client
                }, 3000);
            } else {
                fn({ filter: filter, counter: counter }); // return to client
            }
        });
    });
};
Client:
export class FilterableDataTable extends Component {
    constructor(props) {
        super(props);
        this.state = {
            endpoint: "http://localhost:3001",
            filters: {},
            counter: 0
        };
        this.onLazyLoad = this.onLazyLoad.bind(this);
    }

    onLazyLoad(event) {
        var offset = event.first;
        if (offset === null) {
            offset = 0;
        }
        var filter = ''; // filter is the search character
        if (event.filters.result2 != undefined) {
            filter = event.filters.result2.value;
        }
        this.state.counter++; // kept from the original; a setState call would be more idiomatic
        this.socket.emit('result', {
            offset: offset,
            limit: 20,
            filter: filter,
            counter: this.state.counter
        }, (returnedData) => { // arrow function so `this.state` refers to the component state
            console.log(returnedData);
            if (returnedData.counter === this.state.counter) {
                console.log('DATA: ' + JSON.stringify(returnedData));
            }
        });
    }
}
This does, however, send unneeded data to the client, which in turn ignores it. Does anybody have ideas for further optimizing this kind of communication? For example, a method to keep track at the server and only send the latest data?
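One possible sketch of that idea, hedged: keep the highest counter seen per socket on the server and drop stale responses there (runQuery is a hypothetical stand-in for the real data lookup; the counter field is the one from the solution above):
exports = module.exports = function(io) {
    io.sockets.on('connection', function (socket) {
        var latestCounter = 0; // highest request counter seen on this socket
        socket.on('result', function(data, fn) {
            latestCounter = Math.max(latestCounter, data.counter);
            runQuery(data.filter, function(result) { // hypothetical async DB call
                if (data.counter === latestCounter) {
                    fn({ filter: data.filter, counter: data.counter, result: result });
                }
                // otherwise a newer request has superseded this one; send nothing
            });
        });
    });
};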

node.js process out of memory in http.request loop

In my Node.js server I can't figure out why it runs out of memory. My Node.js server makes a remote HTTP request for each HTTP request it receives, so I've tried to replicate the problem with the sample script below, which also runs out of memory.
This only happens if the iterations in the for loop are very high.
From my point of view, the problem is related to the fact that Node.js is queueing the remote HTTP requests. How can this be avoided?
This is the sample script:
(function() {
    var http, i, mypost, post_data;
    http = require('http');
    post_data = 'signature=XXX%7CPSFA%7Cxxxxx_value%7CMyclass%7CMysubclass%7CMxxxxx&schedule=schedule_name_6569&company=XXXX';
    mypost = function(post_data, cb) {
        var post_options, req;
        post_options = {
            host: 'myhost.com',
            port: 8000,
            path: '/set_xxxx',
            method: 'POST',
            headers: {
                'Content-Length': post_data.length
            }
        };
        req = http.request(post_options, function(res) {
            var res_data;
            res.setEncoding('utf-8');
            res_data = '';
            res.on('data', function(chunk) {
                return res_data += chunk;
            });
            return res.on('end', function() {
                return cb();
            });
        });
        req.on('error', function(e) {
            return console.debug('TM problem with request: ' + e.message);
        });
        req.write(post_data);
        return req.end();
    };
    for (i = 1; i <= 1000000; i++) {
        mypost(post_data, function() {});
    }
}).call(this);
$ node -v
v0.4.9
$ node sample.js
FATAL ERROR: CALL_AND_RETRY_2 Allocation failed - process out of memory
Thanks in advance,
gulden PT
Constraining the flow of requests into the server
It's possible to prevent overload of the built-in Server and its HTTP/HTTPS variants by setting the maxConnections property on the instance. Setting this property will cause node to stop accept()ing connections and force the operating system to drop requests when the listen() backlog is full and the application is already handling maxConnections requests.
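For instance, a minimal sketch (the handler body and the limit of 100 are placeholder assumptions):
var http = require('http');
var server = http.createServer(function (req, res) {
    res.end('ok');
});
server.maxConnections = 100; // stop accepting new sockets beyond 100 concurrent connections
server.listen(8000);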
Throttling outgoing requests
Sometimes, it's necessary to throttle outgoing requests, as in the example script from the question.
Using node directly or using a generic pool
As the question demonstrates, unchecked use of the node network subsystem directly can result in out of memory errors. Something like node-pool makes the active pool management attractive, but it doesn't solve the fundamental problem of unconstrained queuing. The reason for this is that node-pool doesn't provide any feedback about the state of the client pool.
UPDATE: As of v1.0.7 node-pool includes a patch inspired by this post to add a boolean return value to acquire(). The code in the following section is no longer necessary and the example with the streams pattern is working code with node-pool.
Cracking open the abstraction
As demonstrated by Andrey Sidorov, a solution can be reached by tracking the queue size explicitly and mingling the queuing code with the requesting code:
var useExplicitThrottling = function () {
    var active = 0
    var remaining = 10
    var queueRequests = function () {
        while (active < 2 && --remaining >= 0) {
            active++;
            // `pool` is a client pool (e.g. node-pool) assumed to be created elsewhere
            pool.acquire(function (err, client) {
                if (err) {
                    console.log("Error acquiring from pool")
                    if (--active < 2) queueRequests()
                    return
                }
                console.log("Handling request with client " + client)
                setTimeout(function () {
                    pool.release(client)
                    if (--active < 2) {
                        queueRequests()
                    }
                }, 1000)
            })
        }
    }
    queueRequests()
    console.log("Finished!")
}
Borrowing the streams pattern
The streams pattern is a solution which is idiomatic in node. Streams have a write operation which returns false when the stream cannot buffer more data. The same pattern can be applied to a pool object with acquire() returning false when the maximum number of clients have been acquired. A drain event is emitted when the number of active clients drops below the maximum. The pool abstraction is closed again and it's possible to omit explicit references to the pool size.
var useStreams = function () {
    var queueRequests = function (remaining) {
        var full = false
        pool.once('drain', function() {
            if (remaining) queueRequests(remaining)
        })
        while (!full && --remaining >= 0) {
            console.log("Sending request...")
            full = !pool.acquire(function (err, client) {
                if (err) {
                    console.log("Error acquiring from pool")
                    return
                }
                console.log("Handling request with client " + client)
                setTimeout(pool.release, 1000, client)
            })
        }
    }
    queueRequests(10)
    console.log("Finished!")
}
Fibers
An alternative solution can be obtained by providing a blocking abstraction on top of the queue. The fibers module exposes coroutines that are implemented in C++. By using fibers, it's possible to block an execution context without blocking the node event loop. While I find this approach to be quite elegant, it is often overlooked in the node community because of a curious aversion to all things synchronous-looking. Notice that, excluding the callcc utility, the actual loop logic is wonderfully concise.
var Fiber = require('fibers')

/* This is the call-with-current-continuation found in Scheme and other
 * Lisps. It captures the current call context and passes a callback to
 * resume it as an argument to the function. Here, I've modified it to fit
 * JavaScript and node.js paradigms by making it a method on Function
 * objects and using function (err, result) style callbacks.
 */
Function.prototype.callcc = function(context /* args... */) {
    var that = this,
        caller = Fiber.current,
        fiber = Fiber(function () {
            that.apply(context, Array.prototype.slice.call(arguments, 1).concat(
                function (err, result) {
                    if (err)
                        caller.throwInto(err)
                    else
                        caller.run(result)
                }
            ))
        })
    process.nextTick(fiber.run.bind(fiber))
    return Fiber.yield()
}
var useFibers = function () {
    var remaining = 10
    while (--remaining >= 0) {
        console.log("Sending request...")
        try {
            var client = pool.acquire.callcc(this)
            console.log("Handling request with client " + client);
            setTimeout(pool.release, 1000, client)
        } catch (x) {
            console.log("Error acquiring from pool")
        }
    }
    console.log("Finished!")
}
Conclusion
There are a number of correct ways to approach the problem. However, for library authors or applications that require a single pool to be shared in many contexts it is best to properly encapsulate the pool. Doing so helps prevent errors and produces cleaner, more modular code. Preventing unconstrained queuing then becomes an evented dance or a coroutine pattern. I hope this answer dispels a lot of FUD and confusion around blocking-style code and asynchronous behavior and encourages you to write code which makes you happy.
Yes, you are trying to queue 1,000,000 requests before even starting them. This version keeps the number of active requests limited (to 100):
function do_1000000_req(cb)
{
    var num_active = 0;
    var num_finished = 0;
    var num_sheduled = 0;
    function shedule()
    {
        while (num_active < 100 && num_sheduled < 1000000) {
            num_active++;
            num_sheduled++;
            mypost(post_data, function() {
                num_active--;
                num_finished++;
                if (num_finished == 1000000) {
                    cb();
                    return;
                } else if (num_sheduled < 1000000) {
                    shedule();
                }
            });
        }
    }
    shedule(); // kick off the first batch of requests
}
do_1000000_req(function() {
    console.log('done!');
});
The node-pool module can help you. For more details, see this post (in French): http://blog.touv.fr/2011/08/http-request-loop-in-nodejs.html
