NodeJS can't get memcached values under high load

I have a Node.js application that uses Cluster, WS, and a memcached client to manage two memcached servers.
Under normal load it works like a charm,
but under high load my application stops fetching data from the memcached servers.
That is, the logging inside the client.get callback never runs and nothing is written to the console under high load, so the client never receives its cached value (even though the value is present on the memcached server, and sometimes it still works fine even under high load). For a while the application looks dead, doing nothing.
getValue = function(key, callback){
    console.log(`Calculated server for choose: ${strategy(key, client.servers.length)}`); // works under high load
    console.log(`Try to get from cache by key: ${key}.`); // works under high load
    client.get(key, function(err, data) {
        const isError = err || !data; // doesn't run under high load
        console.log('Data from cache is: ', data); // this callback is never executed
        if (!isError) {
            console.log(`Found data in cache key-value: ${key} - ${data}`);
        } else {
            console.log(`Not found value from cache by key: ${key}`);
        }
        const parsedData = isError ? null : JSON.parse(data.toString());
        callback(isError, parsedData); // and this never runs either
    });
}
And after some time the socket connection is simply closed (with code 1000, no errors; it looks like the user just left):
INFO [ProcessID-100930] Connection close [772003], type [ws], code [1000], message []
Then, after 5-10 seconds, all processes start working again as if nothing had happened, and the memcached client callback executes correctly.
I have been trying for a long time to catch this moment and understand why it happens, but I still don't. I have already switched between several memcached clients (memjs now, memcached, mc) and still get the same behaviour under high load.
When fetching data from the memcached server, the callback simply does not fire and the data is not returned (although, judging by the memcached logs, it was there at that moment).
Can anyone suggest what might be going on?

Related

Querying DB2 every 15 seconds causing memory leak in NodeJS

I have an application which checks for new entries in DB2 every 15 seconds on the iSeries using IBM's idb-connector. I have async functions which return the result of the query to socket.io, which emits an event with the data included to the front end. I've narrowed down the memory leak to the async functions. I've read multiple articles on common memory leak causes and how to diagnose them.
MDN: memory management
Rising Stack: garbage collection explained
Marmelab: Finding And Fixing Node.js Memory Leaks: A Practical Guide
But I'm still not seeing where the problem is. Also, I'm unable to get permission to install node-gyp on the system, which means most memory management tools are off limits, as memwatch, heapdump and the like need node-gyp to install. Here's an example of the functions' basic structure.
const { dbconn, dbstmt } = require('idb-connector'); // require idb-connector

async function queryDB() {
    const sSql = `SELECT * FROM LIBNAME.TABLE LIMIT 500`;
    // create new promise
    let promise = new Promise(function(resolve, reject) {
        // create new connection
        const connection = new dbconn();
        connection.conn("*LOCAL");
        const statement = new dbstmt(connection);
        statement.exec(sSql, (rows, err) => {
            if (err) {
                throw err;
            }
            let ticks = rows;
            statement.close();
            connection.disconn();
            connection.close();
            resolve(ticks.length); // resolve promise with varying data
        })
    });
    let result = await promise; // await promise
    return result;
};

async function getNewData() {
    const data = await queryDB(); // get new data
    io.emit('newData', data); // push to front end
    setTimeout(getNewData, 2000); // check again in 2 seconds
};
Any ideas on where the leak is? Am I using async/await incorrectly? Or am I creating/destroying DB connections improperly? Any help figuring out why this code leaks would be much appreciated!
Edit: Forgot to mention that I have limited control over the backend processes, as they are handled by another team. I'm only retrieving the data they populate the DB with and adding it to a web page.
Edit 2: I think I've narrowed it down to the DB connections not being cleaned up properly. But, as far as I can tell, I've followed the instructions suggested on their GitHub repo.
I don't know the answer to your specific question, but instead of issuing a query every 15 seconds I might go about this in a different way, simply because I don't like fishing expeditions when the environment can tell me an event occurred.
In that vein, you might want to try a database trigger that loads the key of the row into a data queue on insert (or on update or delete if necessary). Then you can make an async call that waits for a record on the data queue. This is more real-time, and the event handler is only called when a record shows up. The handler can get the specific record from the database since it knows its key. Data queues are much faster than database I/O and place very little overhead on the trigger. (A rough sketch of this follows the list below.)
I see a few potential advantages with this method:
- You aren't issuing dozens of queries that may or may not return data.
- The event fires the instant a record is added to the table, rather than up to 15 seconds later.
- You don't have to code for the possibility of one or more new records; it will always be exactly one, the one named in the data queue entry.
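A minimal sketch of that trigger-plus-data-queue flow might look like the following. Note that waitForDataQueueEntry and fetchRowByKey are hypothetical helpers standing in for whatever data-queue and database access your IBM i toolkit provides; io is the socket.io instance from the question.
    // Hypothetical event-driven watcher: the DB trigger pushes the new row's key
    // onto a data queue, and we block on the queue instead of polling every 15 seconds.
    async function watchForNewRows() {
        while (true) {
            const key = await waitForDataQueueEntry('MYLIB/NEWROWS'); // hypothetical: resolves when the trigger enqueues a key
            const row = await fetchRowByKey(key);                     // hypothetical: read just that one record
            io.emit('newData', row);                                  // push to the front end immediately
        }
    }

    watchForNewRows().catch(err => console.error('data queue watcher stopped:', err));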
Yes, you have to close the connection.
Don't create const data with a wrapper promise; you don't need the promise, because statement.exec is already asynchronous and hands you the result via its callback.
Keep the setTimeout(getNewData, 2000); // check again in 2 seconds
line outside getNewData, otherwise it becomes an infinite recursive loop.
Sample code:
const { dbconn, dbstmt } = require('idb-connector');

const sql = 'SELECT * FROM QIWS.QCUSTCDT';
const connection = new dbconn(); // Create a connection object.
connection.conn('*LOCAL'); // Connect to a database.
const statement = new dbstmt(connection); // Create a statement object of the connection.

statement.exec(sql, (result, error) => {
    if (error) {
        throw error;
    }
    console.log(`Result Set: ${JSON.stringify(result)}`);
    statement.close(); // Clean up the statement object.
    connection.disconn(); // Disconnect from the database.
    connection.close(); // Clean up the connection object.
    return result;
});
async function getNewData() {
    const data = await queryDB(); // get new data
    io.emit('newData', data); // push to front end
    setTimeout(getNewData, 2000); // check again in 2 seconds
};
change to
async function getNewData() {
    const data = await queryDB(); // get new data
    io.emit('newData', data); // push to front end
};
setTimeout(getNewData, 2000); // check again in 2 seconds
The first thing to notice is a possibly open database connection in case of an error:
if (err) {
    throw err;
}
Also, on success, connection.disconn() and connection.close() return boolean values that tell you whether the operation succeeded (according to the documentation).
Piling up connection objects in a 3rd-party library is always a possible scenario.
I would check those.
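As a rough sketch of that cleanup, assuming the same dbconn/dbstmt usage and callback signature as in the question, the statement and connection can be closed on the error path as well, rejecting instead of throwing inside the callback:
    const { dbconn, dbstmt } = require('idb-connector');

    function queryDB(sSql) {
        return new Promise(function(resolve, reject) {
            const connection = new dbconn();
            connection.conn('*LOCAL');
            const statement = new dbstmt(connection);
            statement.exec(sSql, (rows, err) => {
                // clean up in every case so connection objects cannot pile up
                statement.close();
                connection.disconn();
                connection.close();
                if (err) {
                    reject(err);          // reject instead of throwing inside the callback
                } else {
                    resolve(rows.length); // same "varying data" as the original example
                }
            });
        });
    }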
This was confirmed to be a memory leak in the idb-connector library that I was using. Link to the GitHub issue here. Basically, there was a C++ array that never had its memory deallocated. A new version was released, and the commit can be viewed here.

node process can not find setTimeout object in subsequent requests

I am trying to clear a timeout set with setTimeout by the node process in a subsequent request (using Express). Basically, I set the timeout when our live stream event starts (we get notified by a webhook) and aim to stop the stream for guest users after one hour. The one hour is handled via setTimeout, which works fine so far. However, if the event is stopped before one hour, I need to clear the timeout. I am trying to use clearTimeout, but it just can't find the same variable.
// Event starts
var setTimeoutIds = {};
var val = req.body.eventId;
setTimeoutIds[val] = setTimeout(function() {
    req.app.io.emit('disable_for_guest', req.body);
    live_events.update({event_id: req.body.eventId}, {guest_visibility: false}, function(err, data){
        // All ok
    });
}, disable_after_milliseconds);
console.log(setTimeoutIds);
req.app.io.emit('session_started', req.body);
When the event ends:
try {
    var event_id = req.body.eventId;
    clearTimeout(setTimeoutIds[event_id]);
    delete setTimeoutIds[event_id];
} catch(e) {
    console.log('Event ID could not be removed' + e);
}
req.app.io.emit('event_ended', req.body);
You are defining setTimeoutIds in the scope of the handler. You must define it at module level.
var setTimeoutIds = {};

router.post('/webhook', function(req, res) {
    ...
That makes the variable available until the next restart of the server.
Note: this approach only works as long as you only have a single server with a single node process serving your application. Once you go multi-process and/or multi-server, you need a completely different approach.
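A condensed sketch of that layout, with both webhook handlers sharing the module-level map. The route paths are placeholders, and disable_after_milliseconds and req.app.io come from the question's context:
    var setTimeoutIds = {}; // module level: visible to every request handled by this process

    router.post('/event-started', function(req, res) {       // placeholder path for the "event started" webhook
        setTimeoutIds[req.body.eventId] = setTimeout(function() {
            req.app.io.emit('disable_for_guest', req.body);
        }, disable_after_milliseconds);
        res.sendStatus(200);
    });

    router.post('/event-ended', function(req, res) {          // placeholder path for the "event ended" webhook
        clearTimeout(setTimeoutIds[req.body.eventId]);         // the same timer object is found now
        delete setTimeoutIds[req.body.eventId];
        req.app.io.emit('event_ended', req.body);
        res.sendStatus(200);
    });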

Socket.io loses data on server side

Yellow,
So, I'm making a multiplayer online game on Node (for funzies) and I've been stuck on a problem for over a week now. Perhaps the solution is simple, but I'm oblivious to it.
Long story short:
1. Data gets sent from the client to the server; this emit happens every 16.66 ms.
2. The server receives it correctly and we collect all the data (lots of fireballs in this case). We save it in the player.skills_to_execute array.
3. Every 5 seconds, we copy the data to a separate array (player_information), because we are going to clean the current one so it can keep collecting new data, and then we send all the collected data back to the client.
The problem is definitely on the server side. Sometimes this works, and sometimes it doesn't.
player_information is the array that I'm sending back to the front end, but before I send it I check with console.log on the server whether it actually contains the data, and it does! But somehow that data gets deleted/overwritten right before sending, and an empty array is sent (I check on the front end and receive an empty array).
The code is fairly more complex, but I've minimized it here so it's easier to understand.
This code stays on the client side and works as it should:
// front.js
socket.on("update-player-information", function(player_data_from_server){
    console.log( player_data_from_server.skills_to_execute );
});

socket.emit("update-player-information", {
    skills_to_execute: "fireball"
});
This code stays on the server side and works as it should:
// server.js
socket.on("update-player-information", function(data){
    // only update if there are actually skills received;
    // we don't want every request here to overwrite the actual array with an empty []
    // data.skills_to_execute will usually be 1 to a few skills that need to be executed in a single client cycle
    // naturally, we receive multiple requests in these 5 seconds,
    // so we save them all in the player object, which has an array for this
    if ( data.skills_to_execute.length > 0 ) {
        player.skills_to_execute.push( data.skills_to_execute );
    }
});
Now this is the code where shit hits the fan.
// server.js
// Update player information
setInterval(function(){
    // for every cycle, reset the bulk data that we are gona send, just to be safe
    var player_information = [];
    // collect the data from player
    player_information.push(
        {
            skills_to_execute: player.skills_to_execute
        }
    );
    // we reset the collected actions here, cause they are now gona be sent to front.js
    // and we want to keep collecting new skills_to_execute that come in
    player.skills_to_execute = [];
    socket.emit("update-player-information", player_information);
}, 5000);
Does anybody have any ideas?
Copy the array by value instead of by reference.
Try this:
player_information.push(
    {
        skills_to_execute: player.skills_to_execute.slice()
    }
);
Read more about copying arrays in JavaScript by value or by reference here: Copying array by value in JavaScript
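To see the difference, here is a small stand-alone illustration of reference versus copy (not taken from the question's code):
    var original = ['fireball'];
    var byReference = original;          // both names point at the same array object
    var byValue = original.slice();      // independent shallow copy

    original.length = 0;                 // empty the original array in place

    console.log(byReference);            // []              : the reference sees the change
    console.log(byValue);                // [ 'fireball' ]  : the copy keeps the data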

Request rate is large

I'm using Azure DocumentDB and accessing it through my node.js Express server. When I query in a loop with a low volume of a few hundred requests, there is no issue.
But when I query in a loop with a slightly larger volume, say around a thousand plus,
I get partial results (inconsistent: every time I run it the result values are not the same, maybe because of the asynchronous nature of Node.js),
and after a few results it crashes with this error:
body: '{"code":"429","message":"Message: {\"Errors\":[\"Request rate is large\"]}\r\nActivityId: 1fecee65-0bb7-4991-a984-292c0d06693d, Request URI: /apps/cce94097-e5b2-42ab-9232-6abd12f53528/services/70926718-b021-45ee-ba2f-46c4669d952e/partitions/dd46d670-ab6f-4dca-bbbb-937647b03d97/replicas/130845018837894542p"}' }
Does this mean DocumentDB fails to handle 1000+ requests per second?
Altogether this gives me a bad impression of NoSQL techniques. Is this a shortcoming of DocumentDB?
As Gaurav suggests, you may be able to avoid the problem by bumping up the pricing tier, but even if you go to the highest tier you should be able to handle 429 errors. When you get a 429 error, the response will include an 'x-ms-retry-after-ms' header, which contains the number of milliseconds you should wait before retrying the request that caused the error.
I wrote logic to handle this in my documentdb-utils node.js package. You can either use documentdb-utils or duplicate the logic yourself. Here is a snippet example.
createDocument = function() {
    client.createDocument(colLink, document, function(err, response, header) {
        if (err != null) {
            if (err.code === 429) {
                var retryAfterHeader = header['x-ms-retry-after-ms'] || 1;
                var retryAfter = Number(retryAfterHeader);
                return setTimeout(toRetryIf429, retryAfter);
            } else {
                throw new Error(JSON.stringify(err));
            }
        } else {
            log('document saved successfully');
        }
    });
};
Note that in the above example, document is within the scope of createDocument. This makes the retry logic a bit simpler, but if you don't like using widely scoped variables, you can pass document into createDocument and then pass it into a lambda function in the setTimeout call.
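A sketch of that variant (passing the document in explicitly and retrying through a closure) could look like this; client and colLink are assumed to exist as in the snippet above:
    function createDocumentWithRetry(document) {
        client.createDocument(colLink, document, function(err, response, header) {
            if (err != null) {
                if (err.code === 429) {
                    var retryAfter = Number(header['x-ms-retry-after-ms'] || 1);
                    setTimeout(function() {
                        createDocumentWithRetry(document); // retry the same document after the suggested delay
                    }, retryAfter);
                } else {
                    throw new Error(JSON.stringify(err));
                }
            } else {
                console.log('document saved successfully');
            }
        });
    }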

nodejs http response.write: is it possible out-of-memory?

If I have the following code to send data to the client repeatedly every 10 ms:
setInterval(function() {
    res.write(somedata);
}, 10); // every 10 ms
What happens if the client is very slow to receive the data?
Will the server get an out-of-memory error?
Edit:
Actually the connection is kept alive; the server sends JPEG data endlessly (HTTP multipart/x-mixed-replace header + body + header + body...).
Because node.js response.write is asynchronous,
some users guess that it may store the data in an internal buffer and wait until the lower layer says it can send,
so the internal buffer will grow. Am I right?
If I am right, then how do I resolve this?
The problem is that node.js does not notify me when the data has been sent for a single write call.
In other words, I cannot tell the user that this approach theoretically carries no risk of "out of memory", nor can I tell how to fix it.
Update:
Following the "drain" event keyword given by user568109, I studied the source of node.js and came to a conclusion:
it really can cause an "out-of-memory" error. I should check whether the return value of response.write(...) === false and then handle the "drain" event of the response.
http.js:
OutgoingMessage.prototype._buffer = function(data, encoding) {
    this.output.push(data); // -------- No check here, will cause "out-of-memory"
    this.outputEncodings.push(encoding);
    return false;
};

OutgoingMessage.prototype._writeRaw = function(data, encoding) { // this will be called by response.write
    if (data.length === 0) {
        return true;
    }

    if (this.connection &&
        this.connection._httpMessage === this &&
        this.connection.writable &&
        !this.connection.destroyed) {
        // There might be pending data in the this.output buffer.
        while (this.output.length) {
            if (!this.connection.writable) { // when not ready to send
                this._buffer(data, encoding); // ----------> save data into internal buffer
                return false;
            }
            var c = this.output.shift();
            var e = this.outputEncodings.shift();
            this.connection.write(c, e);
        }
        // Directly write to socket.
        return this.connection.write(data, encoding);
    } else if (this.connection && this.connection.destroyed) {
        // The socket was destroyed. If we're still trying to write to it,
        // then we haven't gotten the 'close' event yet.
        return false;
    } else {
        // buffer, as long as we're not destroyed.
        this._buffer(data, encoding);
        return false;
    }
};
Some gotchas:
- If you are sending this over HTTP, it is not a good idea. The browser may consider the request timed out if it is not finished within a specified amount of time, and the server will likewise close a connection that stays idle for too long. If the client cannot keep up, a timeout is almost certain.
- setInterval with 10 ms is also subject to some restrictions. It does not mean the callback repeats every 10 ms; 10 ms is the minimum it will wait before repeating, so it will run slower than the interval you set.
- If you do manage to overload the response with data, at some point the server will end the connection and respond with 413 Request Entity Too Large, depending on what limit is set.
- Node.js has a single-threaded architecture with a maximum memory limit of around 1.7 GB. If you set the above server limits too high and have many incoming connections, you will get a process out-of-memory error.
So with appropriate limits it will either give a timeout or a request-too-large error (assuming there are no other errors in your program).
Update
You need to use the drain event. The HTTP response is a writable stream with its own internal buffer. When the buffer is emptied, the drain event is triggered. You should learn more about streams as you go deeper; this will help you beyond just HTTP. You can find several resources about streams on the web.
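A minimal sketch of that pattern, assuming a hypothetical getNextJpegPart() that produces the next multipart chunk: keep writing while res.write() returns true, and when it returns false pause until 'drain' fires, so the internal buffer cannot grow without bound.
    function streamFrames(res) {
        function writeNext() {
            const chunk = getNextJpegPart();   // hypothetical producer of the next multipart chunk
            if (res.write(chunk)) {
                setTimeout(writeNext, 10);     // buffer accepted the chunk: schedule the next frame in ~10 ms
            } else {
                res.once('drain', writeNext);  // buffer is full: wait until the socket has flushed
            }
        }
        writeNext();
    }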
