Apache Benchmark - real concurrency level - issue - node.js

Recently I created a simple node.js script to validate the concurrency of some long-lasting DB operations. The basic idea is to receive a web request, then wait around 10 seconds and return the result (code below).
/*jshint esversion: 6 */
const express = require('express');
const app = express();
var counter = 0;
app.get('/test', (req, res) => {
  counter++;
  console.log(counter);
  // simulate long db query
  setTimeout(() => {
    res.send();
  }, 10000);
});
app.listen(80, () => console.log('Example app listening on port 80!'));
The Apache Benchmark tool was used to run the initial test, by executing the following command:
ab -c 20 -n 200 -v 3 http://localhost/test
The problem is that the script shows only one (!) connection being sent by Apache Benchmark.
After deeper investigation, I observed that the first request sent by Apache Benchmark seems to be a kind of service status check and is not executed concurrently. Request concurrency is enabled after the first request. To illustrate this, I prepared a slightly modified version of the code, which simply responds immediately to the first request.
app.get('/test', (req, res) => {
  counter++;
  console.log(counter);
  if (counter == 1) {
    res.send();
    return;
  }
  // simulate long db query
  setTimeout(() => {
    res.send();
  }, 10000);
});
The modified version shows the Apache Benchmark concurrency level to be as expected. Of course, this testing approach is rather counterproductive. Do you know how to disable this unexpected behaviour of Apache Benchmark?
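Note that `counter` in the script counts requests received so far, not requests currently in flight. To observe the real concurrency level on the server side, one option is to track both the current and the peak number of open requests. The following is only a minimal sketch; `createConcurrencyTracker` and the wiring shown in the comments are hypothetical names, not part of Express or ab.

```javascript
// Tracks how many requests are in flight right now and the highest value seen.
// createConcurrencyTracker is a hypothetical helper, not an Express API.
function createConcurrencyTracker() {
  let current = 0;
  let peak = 0;
  return {
    enter() {
      current += 1;
      if (current > peak) peak = current;
    },
    leave() {
      current -= 1;
    },
    snapshot() {
      return { current, peak };
    },
  };
}

// Possible wiring into the app from the question (assumes a recent Node.js
// version, where the response emits 'close' once it completes or the
// connection drops):
// const tracker = createConcurrencyTracker();
// app.use((req, res, next) => {
//   tracker.enter();
//   res.on('close', () => tracker.leave());
//   console.log(tracker.snapshot());
//   next();
// });
```

With `ab -c 20`, the `peak` value should approach 20 if the requests really overlap.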

Related

Browsers are requesting twice on load

I have a script that plays music and because of the nature of the script it seems to be running 2 connections for every 1 listener.
What that means is it is creating 2 documents inside Mongodb.
I am wondering: is there a way to tell mongoose not to insert a document if the same insert has already run within the last 5 seconds?
I ran console.log(req.url)
and got the following:
/dance?uuid=6f70c645-4ef4-4042-854d-a6e87b61e45f
/dance?uuid=6f70c645-4ef4-4042-854d-a6e87b61e45f
Which means it is requesting the file twice on load.
Is there a way to allow only 1 request to go through and discard the other?
I have tried
router.get('/:stationname', (req, res, next) => {
  console.log(req.url)
});
I also tried the following
var doCSUpdateAlreadyCalled = false;
router.get('/:stationname', (req, res, next) => {
  if (!doCSUpdateAlreadyCalled) {
    doCSUpdateAlreadyCalled = true
    console.log(req.url)
  }
});
But it still runs twice.
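One way to express "ignore a repeat within 5 seconds" is to remember when each `uuid` was last accepted and skip the insert for repeats inside that window. This is a minimal in-memory sketch under that assumption; `createDeduper` is a hypothetical helper, and the map grows without bound unless old entries are pruned.

```javascript
// Returns a function that reports whether `key` was already accepted
// within the last `windowMs` milliseconds.
// createDeduper is a hypothetical helper, not part of Express or mongoose.
function createDeduper(windowMs) {
  const lastSeen = new Map(); // key -> timestamp of the last accepted request
  return function isDuplicate(key, now = Date.now()) {
    const previous = lastSeen.get(key);
    if (previous !== undefined && now - previous < windowMs) {
      return true; // same key seen too recently: treat as a duplicate
    }
    lastSeen.set(key, now);
    return false;
  };
}

// Possible use in the route from the question:
// const isDuplicate = createDeduper(5000);
// router.get('/:stationname', (req, res) => {
//   if (!isDuplicate(req.query.uuid)) {
//     // insert the document here
//   }
//   res.end();
// });
```

A per-key check like this differs from the single global flag tried above, which would block every later listener once the first request had been seen.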

Why do we need async callbacks in Node.js, since the event loop offers a worker pool to handle expensive tasks?

I was studying how Node.js improves performance for multiple concurrent requests. After reading a couple of blogs, I found out that:
When a request comes in, an event is triggered and the corresponding callback function is placed in the event queue.
The event loop (main thread) is responsible for handling all requests in the event queue. The event loop processes the request and sends back the response if the request uses non-blocking I/O.
If the request contains blocking I/O, the event loop internally assigns it to an idle worker from the worker pool, and when the worker sends back the result, the event loop sends the response.
My question is: since the event loop passes heavy blocking work internally to the worker pool using the libuv library, why do we need asynchronous callbacks?
For better understanding, please see the code below:
const express = require('express')
const app = express()
const port = 3000
function readUserSync(miliseconds) {
  var currentTime = new Date().getTime();
  while (currentTime + miliseconds >= new Date().getTime()) {
  }
  return "User"
}
async function readUserAsync(miliseconds) {
  var currentTime = new Date().getTime();
  while (currentTime + miliseconds >= new Date().getTime()) {
  }
  return "User"
}
app.get('/sync', (req, res) => {
  const user = readUserSync(80)
  res.send(user)
})
app.get('/async', async (req, res) => {
  const user = await readUserAsync(80)
  res.send(user)
})
app.listen(port, () => {
  console.log(`Example app listening at http://localhost:${port}`)
})
I checked the performance of both endpoints using the Apache Benchmark tool, assuming each I/O operation takes 80 ms.
ab -c 10 -t 5 "http://127.0.0.1:3000/async/"
ab -c 10 -t 5 "http://127.0.0.1:3000/sync/"
Surprisingly, the endpoint with the async callback had a higher number of requests per second.
So how do the event loop, thread pool, and async/await work internally to handle more concurrent requests?
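One detail worth checking here is that marking a function `async` does not move its body to another thread: the body runs synchronously on the main thread up to its first `await`. The sketch below illustrates this with a busy-wait loop like the one in the question; `spin`, `readUserAsyncTraced`, and `traceAsyncCall` are hypothetical names used only for the demonstration.

```javascript
// Busy-waits for `milliseconds`, like the loop in the question.
function spin(milliseconds) {
  const start = Date.now();
  while (Date.now() - start < milliseconds) {
    // burn CPU on the main thread
  }
}

// An async function whose body contains only synchronous work.
async function readUserAsyncTraced(milliseconds, trace) {
  trace.push('async body start');
  spin(milliseconds);
  trace.push('async body end');
  return 'User';
}

// Calls the async function WITHOUT awaiting it and records the order of events.
function traceAsyncCall() {
  const trace = [];
  readUserAsyncTraced(20, trace);
  trace.push('caller continues');
  return trace;
}

console.log(traceAsyncCall());
// → [ 'async body start', 'async body end', 'caller continues' ]
```

The caller only continues after the whole busy loop has finished, so under these assumptions both endpoints in the question block the event loop for about 80 ms per request. Only work that is actually handed off (timers, sockets, file system calls, and so on) lets other requests proceed in the meantime.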

Node.js 10000 concurrent HTTP requests every 10 seconds

I have a use case where I need to make more than 10,000 external HTTP requests (to one API) in an infinite loop every 10 seconds. This API takes anywhere from 200-800 ms to respond.
Goals:
Call an external API more than 10,000 times in 10 seconds
Have a failproof polling system that can run for months at a time without failing
Attempts:
I have attempted to use the Async library and limit requests to 300 concurrent calls (higher numbers fail quickly), but after about 300,000 requests I run into errors where I receive a connection refused (I am calling a local Node server on another port). I am running the local server to mimic the real API because our application has not scaled to more than 10,000 users yet and the external API requires 1 unique user per request. The local response server is a very simple Node server that has one route that waits between 200-800 ms to respond.
Am I getting this error because I am calling a server running locally on my machine or is it because Node is having an issue handling this many requests? Any insight into the best way to perform this type of polling would be appreciated.
Edit: My question is about how to create a client that can send more than 10,000 requests in a 10 second interval.
Edit2:
Client:
//arr contains 10,000 tokens
const makeRequests = arr => {
  setInterval(() => {
    async.eachLimit(arr, 300, (token, cb) => {
      axios.get(`http://localhost:3001/tokens/${token}`)
        .then(res => {
          //do something
          cb();
        })
        .catch(err => {
          //handle error
          cb();
        })
    })
  }, 10000);
}
Dummy Server:
const getRandomArbitrary = (min, max) => {
  return Math.random() * (max - min) + min;
}
// the path needs a slash before the parameter to match /tokens/<token> from the client
app.get('/tokens/:token', (req, res) => {
  setTimeout(() => {
    res.send('OK');
  }, getRandomArbitrary(200, 800))
});
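A quick back-of-the-envelope check of the numbers above may help when choosing the concurrency limit. The sketch below assumes an average latency of 500 ms (the midpoint of the 200-800 ms range) and ignores client and network overhead; `requiredConcurrency` and `batchDurationMs` are hypothetical helper names.

```javascript
// Average number of requests that must be in flight to finish `requests`
// calls of `latencyMs` each within `windowMs` (Little's law).
function requiredConcurrency(requests, windowMs, latencyMs) {
  return Math.ceil((requests * latencyMs) / windowMs);
}

// Approximate time needed to finish one batch at a fixed concurrency limit.
function batchDurationMs(requests, limit, latencyMs) {
  return (requests * latencyMs) / limit;
}

console.log(requiredConcurrency(10000, 10000, 500)); // 500
console.log(requiredConcurrency(10000, 10000, 800)); // 800 (worst case)
console.log(Math.round(batchDurationMs(10000, 300, 500))); // 16667
```

Under these assumptions, a limit of 300 needs roughly 16.7 seconds per batch, which is longer than the 10-second interval. Since `setInterval` starts a new batch regardless of whether the previous one has finished, batches may overlap and the number of open connections may keep growing over time.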

Express hold a request until last request is finished

I'm writing an application in node.js + express with which I want to achieve the following goals.
Users POST many requests at nearly the same time (for example with the command curl...&, where & makes it run in the background).
Process one request at a time and hold the other requests until it is finished. The order can be determined by request arrival time; if the times are equal, choose randomly. So if I POST 5 requests to add things to the database at nearly the same time, the first request will be added to the database first, while the other requests are held (not responding with anything yet) until the first request has been processed and answered with a 200 code; then processing moves on to the second request.
Is it possible to achieve this with express, so that when I send several requests at once, I don't run into issues such as something not being added to MongoDB properly?
You can set up middleware before/after your routes to queue up and dequeue requests if one is in progress. As people have mentioned, this is not really best practice, but it is a way to do it within a single process (it will not work for serverless models).
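const queue = [];
let inprogress = null; // response of the request currently being processed, if any
app.use((req, res, next) => {
  if (inprogress) {
    // another request is running: park this one until it finishes
    queue.push({req, res, next})
  } else {
    inprogress = res;
    next();
  }
})
app.get('/your-route', (req, res, next) => {
  // run your code
  res.json({ some: 'payload' })
  next();
})
app.use((req, res, next) => {
  // the current request is done: release the next queued one, if any
  inprogress = null;
  if (queue.length > 0) {
    const queued = queue.shift();
    inprogress = queued.res;
    queued.next();
  }
  // only fall through (e.g. to the 404 handler) if nothing was sent yet
  if (!res.headersSent) next();
})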
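If the part that must not overlap is an asynchronous operation (such as a database write), an alternative sketch is to chain the work onto a single promise so that each task starts only after the previous one has settled. This swaps the middleware queue for a promise chain; `createSerialQueue` and `saveToDatabase` are hypothetical names, and like the middleware version it only works within one process.

```javascript
// Serializes async tasks: each task starts only after the previous one settles.
// createSerialQueue is a hypothetical helper, not an Express API.
function createSerialQueue() {
  let tail = Promise.resolve();
  return function enqueue(task) {
    const result = tail.then(() => task());
    // Keep the chain usable even if a task rejects.
    tail = result.catch(() => {});
    return result;
  };
}

// Possible use in a route (saveToDatabase is a hypothetical async function):
// const enqueue = createSerialQueue();
// app.post('/things', (req, res) => {
//   enqueue(() => saveToDatabase(req.body))
//     .then(() => res.sendStatus(200))
//     .catch(() => res.sendStatus(500));
// });
```

Tasks run in the order in which `enqueue` is called, which corresponds to the arrival order of the requests.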

Current approach to reflect dynamic data from json in nodejs using setInterval

I am trying to develop a MERN stack application, and I have made numerous attempts at this. What I am trying to achieve is to pull some data from an API, dump it into a database, and then query the database to create a JSON file every 5 minutes (using Jenkins and Python, the best approach I can think of). Below is the method I am implementing, and it does not work. If I remove the setInterval() function and uncomment the callback function, the code works but does not update the data.
const express = require('express');
const fs = require('fs');
const app = express();
// Read JSON File
function readJSON(callback) {
  fs.readFile('./name.json', "utf8", function(err, result) {
    if (err) callback (err);
    callback(JSON.parse(result));
  });
}
// Process JSON File during callback
// readJSON(function(res) {
//   app.get('/api/customers', (request, response) => {
//     response.json(res);
//   });
// });
// Attempt to run every 5 minutes
setInterval(readJSON(function(res) {
  app.get('/api/customers', (request, response) => {
    response.json(res);
  })}, 60000 * 5); // 5 Minutes
const port = 5000
app.listen(port, () => `Server running on port ${port}`);
I thought of using sockets, but I don't want it to be real-time, only live data on an interval. I don't believe RESTful APIs are a good fit here either; I don't want two-way communication to modify/update the data. If my approach is bad, please let me know why you'd pick another approach. I am just trying to establish a foundation in full-stack web dev. Thanks!
A more logical approach would be the following.
On the server side:
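function readJSON(callback) {
  fs.readFile('./name.json', "utf8", function(err, result) {
    // return on error so the callback is invoked only once
    if (err) return callback(err);
    let data;
    try {
      data = JSON.parse(result);
    } catch (parseErr) {
      return callback(parseErr); // the file exists but is not valid JSON
    }
    callback(null, data);
  });
}
// the file is re-read on every request, so the response reflects the latest data
app.get('/api/customers', (request, response) => {
  readJSON((err, nameContent) => {
    if (err) {
      return response.status(500).send(err.message);
    }
    response.send(nameContent);
  })
});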
And on the client side, ask for the data every 5 minutes:
someAjaxMethod('/api/customers', (err, nameContent) => console.log(err, nameContent));
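The call above fetches the data once; repeating it on an interval could look roughly like the sketch below. `startPolling` is a hypothetical helper, and the `fetch` call in the usage comment assumes a browser (or another environment that provides `fetch`).

```javascript
// Calls `load` immediately and then every `intervalMs` milliseconds,
// passing each outcome to `onResult(err, data)`.
// Returns a function that stops the polling.
// startPolling is a hypothetical helper, not a library API.
function startPolling(load, intervalMs, onResult) {
  const tick = () => {
    load().then(
      (data) => onResult(null, data),
      (err) => onResult(err)
    );
  };
  tick(); // fetch once right away
  const timer = setInterval(tick, intervalMs);
  return () => clearInterval(timer);
}

// Possible use on the client side:
// const stop = startPolling(
//   () => fetch('/api/customers').then((r) => r.json()),
//   5 * 60 * 1000, // 5 minutes
//   (err, nameContent) => console.log(err, nameContent)
// );
```

Keeping the timer on the client means the server only needs the plain GET route and no `setInterval` of its own.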
