Intermittent network timeouts while trying to fetch data in Node application - node.js

We have a NextJS app with an Express server.
The problem we're seeing is lots of network timeouts to the API we are calling (the underlying exception says "socket hang up"). However, that API does not show any errors or a slow response time. It's as if the API calls aren't even making it all the way to the API.
Theories and things we've tried:
Blocked event loop: we tried replacing synchronous logging with the asynchronous "winston" framework, to make sure we're not blocking the event loop. Not sure what else could be blocking (see the event-loop delay sketch below).
High CPU: the CPU can spike up to 60% sometimes. We're trying to minimize that spike by taking out some regexes we were using (since we heard those are expensive, CPU-wise).
Something about how big the JSON response is from the API? We're passing around a lot of data…
Too many complex routes in our Express routing structure: We minimized the number of routes by combining some together (which results in more complicated regexes in the route definitions)…
Any ideas why we would be seeing these fetch timeouts? They only appear during load tests and in production environments, but they can bring down the whole app with heavy load.
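To check the blocked-event-loop theory directly, here is a minimal sketch (using Node's built-in perf_hooks module; the resolution and reporting interval are illustrative choices) that logs event-loop delay so spikes can be correlated with the timeouts:

const { monitorEventLoopDelay } = require('perf_hooks');

// Samples event-loop delay; sustained high percentiles mean something
// is blocking the loop.
const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();

setInterval(() => {
  const p99ms = histogram.percentile(99) / 1e6; // nanoseconds -> milliseconds
  console.log(`event loop delay p99: ${p99ms.toFixed(1)} ms`);
  histogram.reset();
}, 10000);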

The code that emits the error:
function socketCloseListener() {
  const socket = this;
  const req = socket._httpMessage;
  debug('HTTP socket close');

  // Pull through final chunk, if anything is buffered.
  // the ondata function will handle it properly, and this
  // is a no-op if no final chunk remains.
  socket.read();

  // NOTE: It's important to get parser here, because it could be freed by
  // the `socketOnData`.
  const parser = socket.parser;
  const res = req.res;
  if (res) {
    // Socket closed before we emitted 'end' below.
    if (!res.complete) {
      res.aborted = true;
      res.emit('aborted');
    }
    req.emit('close');
    if (res.readable) {
      res.on('end', function() {
        this.emit('close');
      });
      res.push(null);
    } else {
      res.emit('close');
    }
  } else {
    if (!req.socket._hadError) {
      // This socket error fired before we started to
      // receive a response. The error needs to
      // fire on the request.
      req.socket._hadError = true;
      req.emit('error', connResetException('socket hang up'));
    }
    req.emit('close');
  }
}
The message is generated when the server does not send a response.
That's the easy bit.
But why would the API server not send a response?
Well, without seeing the minimal code that reproduces this, I can only give you some pointers.
This issue discusses at length the changes between version 6 and 8, in particular how a GET with a body can now cause it. This change of behaviour is more closely aligned with the REST specs.
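For illustration, a minimal sketch of the kind of request that can trigger it on newer Node versions (the host, path, and body here are made up):

const http = require('http');

const req = http.request({
  method: 'GET',
  host: 'api.example.com', // hypothetical API host
  path: '/data',
  headers: { 'Content-Type': 'application/json' },
});

// Some servers drop the connection when a GET carries a body, which
// surfaces in Node as an ECONNRESET / 'socket hang up' on the request.
req.on('error', (err) => {
  console.error(err.message); // e.g. "socket hang up"
});

req.write(JSON.stringify({ filter: 'all' })); // body on a GET
req.end();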

Related

How to add continuous running code into nodejs postgresql client?

I'm stuck on a problem of wiring some logic into a Node.js pg client. The main logic has two parts; the first one connects to the Postgres server and listens for notifications, as follows:
var rules = {} // a rules object we are monitoring...
const pg_cli = new Client({
  ....
})
pg_cli.connect()
pg_cli.query('LISTEN zone_rules') // listen to the zone_rules channel
pg_cli.on('notification', msg => {
  rules = msg.payload
})
This part is easy and runs without any issue. Now what I'm trying to implement is another function that keeps monitoring the rules: when an object is received and put into the rules, the function starts accumulating the time the object stays in the rules (it may be deleted by another notification from the pg server), and the monitoring function sends an alert to another server if the object's duration passes a certain threshold. I tried to write the code in the following style:
function check() {
  // watch and time accumulating code...
  process.nextTick(check)
}
check()
But I found that the event handler for the notification then never had a chance to run! Does anybody have any idea about my problem, or should I be doing it another way?
Thanks!!!
Well, I found that changing the nextTick to setImmediate solves the problem. Callbacks scheduled with process.nextTick run before the event loop is allowed to continue, so a recursive nextTick loop starves I/O events (including the pg notifications); setImmediate yields back to the event loop between iterations.
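A minimal sketch of the fix (the comment stands in for the original monitoring code):

function check() {
  // watch and time-accumulating code...
  setImmediate(check) // yields to the event loop, unlike process.nextTick
}
check()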

typescript fetch response streaming

I am trying to stream a response, but I want to be able to read the response (and work with the data) while it is still being sent. I basically want to send multiple messages in one response.
It works internally in Node.js, but when I tried to do the same thing in TypeScript it doesn't work anymore.
My attempt was to do the request via fetch in TypeScript; the response comes from a Node.js server that writes parts of the response to the response stream.
fetch('...', {
  ...
}).then((response) => {
  const reader = response.body.getReader();
  reader.read().then(({done, value}) => {
    if (done) {
      return response;
    }
    console.log(String.fromCharCode.apply(null, value)); // just for testing purposes
  })
}).then(...)...
On the Node.js side it basically looks like this:
// doing stuff with the request
response.write(first_message)
// do some more stuff
response.write(second_message)
// do even more stuff
response.end(last_message)
In Node.js, like I said, I can just read every message once it's sent via res.on('data', ...), but the reader.read in TypeScript only fires once, and only after the whole response has been sent.
Is there a way to make it work like I want, or do I have to look for another way?
I hope it is kinda understandable what I want to do; I noticed while writing this how much I struggled explaining it :D
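A side note on the fetch code above: a single reader.read() call only ever delivers the first available chunk, so incremental consumption needs a read loop. A minimal sketch, assuming the same endpoint:

fetch('...').then(async (response) => {
  if (!response.body) return;
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  // Each read() resolves with the next chunk as it arrives.
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    console.log(decoder.decode(value, { stream: true }));
  }
});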
I found the problem, and as usual it was sitting in front of the PC.
I forgot to write a header first, before writing the response.
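A minimal sketch of that fix on the Node.js side (the status code and content type are illustrative):

// Send headers explicitly before streaming body chunks, so the client
// can start consuming the response before it ends.
response.writeHead(200, { 'Content-Type': 'text/plain' });
response.write(first_message);
// do some more stuff
response.write(second_message);
// do even more stuff
response.end(last_message);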

Optimal method for nodejs to hand off image from database to browser

The end result that I need is to send multiple images to a web browser from a database.
The images are stored as blobs.
I know I can stream them out of the database into a file and then just give out the URL to the file.
I also know I can hand off a base64 string to the browser so it can render the image.
My question is: which option is the most optimal, or best practice? Keep in mind that if I go the stream method, I would have to check whether the image has changed since the last time I displayed it... and if it has changed, then I have to re-stream it out of the database.
I have been playing with node-oracledb and was able to successfully extract one blob into a file, but I am also having trouble streaming multiple files.
This is a two-question post.
Which is the most optimal:
1. Send a base64 string - I kind of like this method because I don't have to worry about streaming out the file and checking if it has changed, since it is coming straight from the database. My concern is: can the browser/Node.js handle it? I know those strings can be very large, and I could also be sending more than one image at a time.
2. Stream the blobs into files.
The second question is how I can get multiple blobs out. Below is my code for streaming just one file; I found this example on GitHub, lobstream1.js:
https://raw.githubusercontent.com/oracle/node-oracledb/master/examples/lobstream1.js
Focusing on the code:
// Stream a LOB to a file
var dostream = function(lob, cb) {
  if (lob.type === oracledb.CLOB) {
    console.log('Writing a CLOB to ' + outFileName);
    lob.setEncoding('utf8'); // set the encoding so we get a 'string' not a 'buffer'
  } else {
    console.log('Writing a BLOB to ' + outFileName);
  }

  var errorHandled = false;

  lob.on('error', function(err) {
    console.log("lob.on 'error' event");
    if (!errorHandled) {
      errorHandled = true;
      lob.close(function() {
        return cb(err);
      });
    }
  });
  lob.on('end', function() {
    console.log("lob.on 'end' event");
  });
  lob.on('close', function() {
    // console.log("lob.on 'close' event");
    if (!errorHandled) {
      return cb(null);
    }
  });

  var outStream = fs.createWriteStream(outFileName);
  outStream.on('error', function(err) {
    console.log("outStream.on 'error' event");
    if (!errorHandled) {
      errorHandled = true;
      lob.close(function() {
        return cb(err);
      });
    }
  });

  // Switch into flowing mode and push the LOB to the file
  lob.pipe(outStream);
};
I fixed spooling out the images with this method; I did change dostream a bit.
for (var x = 0; x < result.rows.length; x++) {
  outputFileName = x + '.jpg';
  console.log(outputFileName);
  console.log(x);
  var lob = result.rows[x][0];
  dostream(lob, outputFileName);
  // cb(null,lob);
}
Thank you for any help.
Given all the detail you provided in subsequent comments including the average image size, number of distinct images, memory available to Node.js, number of concurrent users, and the fact that it's "very critical to have the images up to date", here's my initial take...
For the first implementation, stick to the KISS principle and avoid over-engineering. Disable browser caching and don't cache images in Node.js. Instead, rely on the driver and Oracle Database to do the heavy lifting for you.
As for the table storing the images, try to use SecureFile LOBs over BasicFile LOBs (they are known to perform better) if possible. Also, look at the caching options available to both (CACHE, CACHE READS, and NOCACHE). Consider enabling the CACHE READS option based on your stated workload, but work with your DBA to ensure the buffer cache is sized appropriately so you will not impact others.
You can rely on the connection pool's connection request queue to help control how many people are fetching files concurrently. In fact, you might want to create a separate pool just for this purpose so that people fetching LOBs aren't blocking people doing other things in the application. For example, let's say you normally have one connection pool with 10 connections. You could create two connection pools with 5 connections each (use the connection pool cache to make this easy). Then, in the code path that fetches lobs, use the lob pool and use the other pool for everything else.
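A minimal sketch of that two-pool idea using node-oracledb's pool cache (the credentials, connect string, and pool sizes are placeholders):

const oracledb = require('oracledb');

async function initPools() {
  // Dedicated pool for LOB/image fetches so they can't starve other work.
  await oracledb.createPool({
    poolAlias: 'lobs',                // name in the pool cache
    user: 'app_user',                 // placeholder credentials
    password: 'app_password',
    connectString: 'dbhost/orclpdb1',
    poolMin: 0,
    poolMax: 5,
  });

  // Pool for everything else in the application.
  await oracledb.createPool({
    poolAlias: 'default',
    user: 'app_user',
    password: 'app_password',
    connectString: 'dbhost/orclpdb1',
    poolMin: 0,
    poolMax: 5,
  });
}

// In the code path that fetches images:
// const conn = await oracledb.getConnection('lobs');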
Given this setup, I'd also recommend NOT streaming the LOBs. Using the driver's ability to buffer the LOBs in Node.js will greatly simplify the code and you should have plenty of memory given such a small number of concurrent users/file fetches.
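For illustration, a sketch of the buffering approach with node-oracledb's fetchAsBuffer setting (the table and column names are made up):

const oracledb = require('oracledb');

// Fetch BLOB columns as Node.js Buffers instead of Lob streams.
oracledb.fetchAsBuffer = [oracledb.BLOB];

async function getImage(conn, id) {
  const result = await conn.execute(
    'SELECT image FROM images WHERE id = :id', // hypothetical table/column
    [id]
  );
  return result.rows[0][0]; // a Buffer, ready to send to the browser
}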
The biggest problem with this scenario that the images are pretty large and they'll always be flowing from the database through Node.js to the browser. But since you'll be on an internal network, this might not be much of a problem. If it does turn out to be a problem, you can start to add caching in either the browser or Node.js based on what makes the most sense.
Unless you do something like tiling or base64 inline encoding, each image needs its own URL, so each invocation of node-oracledb would return just one image. You could do some kind of caching by writing to disk, but this adds extra I/O - you will need to test to measure your own system's performance and memory requirements. Regarding accessing multiple images in node-oracledb, there's some code in https://github.com/oracle/node-oracledb/issues/1041#issuecomment-459002641 that may be useful.

Is there any risk to read/write the same file content from different 'sessions' in Node JS?

I'm new to Node.js and I wonder whether the snippets of code below have a multi-session problem.
Consider I have Node JS server (express) and I listen on some POST request:
// Define the handler before registering it (with `var`, only the declaration
// is hoisted, so registering first would pass `undefined` to app.post).
var onPostRequest = function(req, res) {
  // parse request and fetch email list
  var emails = [....]; // pseudocode
  doJob(emails);
  res.status(200).end('OK');
};

app.post('/sync/:method', onPostRequest);
function doJob(_emails) {
  try {
    // readFileSync returns a string; fall back to an empty object if the file is empty
    var emailsFromFile = fs.readFileSync(FILE_PATH, "utf8") || {};
    if (_.isString(emailsFromFile)) {
      emailsFromFile = JSON.parse(emailsFromFile);
    }
    _emails.forEach(function(_email) {
      if (!emailsFromFile[_email]) {
        emailsFromFile[_email] = 0;
      } else {
        emailsFromFile[_email] += 1;
      }
    });
    // write object back
    fs.writeFileSync(FILE_PATH, JSON.stringify(emailsFromFile));
  } catch (e) {
    console.error(e);
  }
}
So the doJob method receives an _emails list and I update (counter +1) these emails in the emailsFromFile object loaded from the file.
Suppose I get 2 requests at the same time and doJob is triggered twice. I'm afraid that while one request has loaded emailsFromFile from the file, the second request might change the file's content.
Can anybody shed some light on this issue?
Because the code in the doJob() function is all synchronous, there is no risk of multiple requests causing a concurrency problem.
If you were using async IO in that function, then there would be possible concurrency issues.
To explain, Javascript in node.js is single threaded. So, there is only one thread of Javascript execution running at a time and that thread of execution runs until it returns back to the event loop. So, any sequence of entirely synchronous code like you have in doJob() will run to completion without interruption.
If, on the other hand, you use any asynchronous operations such as fs.readFile() instead of fs.readFileSync(), then the thread of execution will return back to the event loop at the point you call fs.readFile(), and another request can run while the file is being read. If that were the case, then you could end up with two requests conflicting over the same file, and you would have to implement some form of concurrency protection (some sort of flag or queue). This is the type of thing that databases offer lots of features for.
I have a node.js app running on a Raspberry Pi that uses lots of async file I/O, and I can get conflicts in that code from multiple requests. I solved it by setting a flag any time I'm writing to a specific file; any other request that wants to write to that file first checks the flag, and if it is set, the request goes into my own queue and is served when the prior request finishes its write operation. There are many other ways to solve this too. If it happens in a lot of places, then it's probably worth just getting a database that offers features for this type of write contention.
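For illustration, a minimal sketch of one way to serialize async read-modify-write cycles on a single file (the helper name and usage are made up; it mirrors the question's email-counter case):

const fs = require('fs/promises');

// All updates are chained on one promise, so async read-modify-write
// cycles on the file can never interleave.
let chain = Promise.resolve();

function updateFile(path, mutate) {
  const run = async () => {
    const raw = await fs.readFile(path, 'utf8').catch(() => '{}');
    const data = JSON.parse(raw);
    mutate(data); // apply the change
    await fs.writeFile(path, JSON.stringify(data));
  };
  chain = chain.then(run, run); // run even if the previous update failed
  return chain;
}

// Usage, mirroring doJob():
// updateFile(FILE_PATH, (counts) => {
//   counts['user@example.com'] = (counts['user@example.com'] || 0) + 1;
// });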

How to avoid the need to delay event emission to the next tick of the event loop?

I'm writing a Node.js application using a global event emitter. In other words, my application is built entirely around events. I find this kind of architecture works extremely well for me, with the exception of one edge case, which I will describe here.
Note that I do not think knowledge of Node.js is required to answer this question. Therefore I will try to keep it abstract.
Imagine the following situation:
A global event emitter (called mediator) allows individual modules to listen for application-wide events.
An HTTP server is created, accepting incoming requests.
For each incoming request, an event emitter is created to deal with events specific to that request.
An example (purely to illustrate this question) of an incoming request:
mediator.on('http.request', function(request, response, emitter) {
  // deal with the new request here, e.g.:
  response.send("Hello World.");
});
So far, so good. One can now extend this application by identifying the requested URL and emitting appropriate events:
mediator.on('http.request', function(request, response, emitter) {
  // identify the requested URL
  if (request.url === '/') {
    emitter.emit('root');
  }
  else {
    emitter.emit('404');
  }
});
Following this one can write a module that will deal with a root request.
mediator.on('http.request', function(request, response, emitter) {
  // when root is requested
  emitter.once('root', function() {
    response.send('Welcome to the frontpage.');
  });
});
Seems fine, right? Actually, it is potentially broken code. The reason is that the line emitter.emit('root') may be executed before the line emitter.once('root', ...). The result is that the listener never gets executed.
One could deal with this specific situation by delaying the emission of the root event to the end of the event loop:
mediator.on('http.request', function(request, response, emitter) {
  // identify the requested URL
  if (request.url === '/') {
    process.nextTick(function() {
      emitter.emit('root');
    });
  }
  else {
    process.nextTick(function() {
      emitter.emit('404');
    });
  }
});
The reason this works is that the emission is now delayed until the current tick of the event loop has finished, and therefore all listeners have been registered.
However, there are many issues with this approach:
one of the advantages of such an event-based architecture is that emitting modules do not need to know who is listening to their events. Therefore it should not be necessary to decide whether the event emission needs to be delayed, because one cannot know what is going to listen for the event and whether it needs the emission to be delayed or not.
it significantly clutters and complicates the code (compare the two examples)
it probably worsens performance
As a consequence, my question is: how does one avoid the need to delay event emission to the next tick of the event loop, such as in the described situation?
Update 19-01-2013
An example illustrating why this behavior is useful: allowing an HTTP request to be handled in parallel.
mediator.on('http.request', function(req, res) {
  req.onceall('json.parsed', 'validated', 'methodoverridden', 'authenticated', function() {
    // the request has now been validated, parsed as JSON, the HTTP method has been
    // overridden when requested to, and it has been authenticated
  });
});
If each event like json.parsed emitted the original request, then the above would not be possible, because each event could be related to a different request, and you cannot listen for a combination of actions executed in parallel for a specific request.
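For reference, a minimal sketch of what an onceall() helper like the one assumed above could look like (the question attaches it to the request object; this free-function version is purely illustrative):

// Fire `callback` once each named event has been emitted at least once.
function onceall(emitter, events, callback) {
  let remaining = events.length;
  for (const name of events) {
    emitter.once(name, () => {
      if (--remaining === 0) callback(); // every event has now been seen
    });
  }
}

// Usage: onceall(emitter, ['json.parsed', 'validated'], function() { ... });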
Having both a mediator which listens for events and an emitter which also listens for and triggers events seems overly complicated. I'm sure there is a legitimate reason, but my suggestion is to simplify. We use a global eventBus in our Node.js service that does something similar. For this situation, I would emit a new event.
bus.on('http:request', function(req, res) {
  if (req.url === '/')
    bus.emit('ns:root', req, res);
  else
    bus.emit('404');
});
// note the use of namespace here to target specific subsystem
bus.once('ns:root', function(req, res) {
  res.send('Welcome to the frontpage.');
});
It sounds like you're starting to run into some of the disadvantages of the observer pattern (as mentioned in many books/articles that describe this pattern). My solution is not ideal – assuming an ideal one exists – but:
If you can make a simplifying assumption that the event is emitted only 1 time per emitter (i.e. emitter.emit('root'); is called only once for any emitter instance), then perhaps you can write something that works like jQuery's $.ready() event.
In that case, subscribing to emitter.once('root', function() { ... }) will check whether 'root' was emitted already, and if so, will invoke the handler anyway. And if 'root' was not emitted yet, it'll defer to the normal, existing functionality.
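For illustration, a minimal sketch of that idea (the class name is made up, and it assumes each event fires at most once per emitter):

const { EventEmitter } = require('events');

// Remembers fired events and replays them to late subscribers,
// in the spirit of jQuery's $.ready().
class StickyEmitter extends EventEmitter {
  constructor() {
    super();
    this.fired = new Map(); // event name -> the args it fired with
  }
  emit(name, ...args) {
    this.fired.set(name, args);
    return super.emit(name, ...args);
  }
  once(name, listener) {
    if (this.fired.has(name)) {
      listener(...this.fired.get(name)); // already fired: invoke now
      return this;
    }
    return super.once(name, listener);
  }
}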
That's all I got.
I think this architecture is in trouble, as you're doing sequential work (I/O) that requires a definite order of actions, but still plan to build the app on components that naturally allow a non-deterministic order of execution.
What you can do
Include a context selector in the mediator.on function, e.g. in this way:
mediator.on('http.request > root', function( .. ) { } )
Or define it as a submediator:
var submediator = mediator.yield('http.request > root');
submediator.on(function( ... ) {
  emitter.once('root', ... )
});
This would trigger the callback only if root was emitted from the http.request handler.
Another, trickier way is to do the ordering in the background, but it's not feasible with your current "one mediator rules them all" interface. Implement the code so that each .emit call does not actually send the event, but puts the produced event in a list. Each .once puts a consume-event record in the same list. When all mediator.on callbacks have been executed, walk through the list and sort it by dependency order (e.g. if the list has consume 'root' before produce 'root', swap them). Then execute the consume handlers in order. If you run out of events, stop executing.
Oi, this seems like a very broken architecture for a few reasons:
How do you pass around request and response? It looks like you've got global references to them.
If I answer your question, you will turn your server into a pure synchronous function and you'd lose the power of async node.js. (Requests would effectively be queued, and could only start executing once the last request is 100% finished.)
To fix this:
Pass request & response to the emit() call as parameters. Now you don't need to force everything to run synchronously anymore, because when the next component handles the event, it will have a reference to the right request & response objects.
Learn about other common solutions that don't need a global mediator. Look at the pattern that Connect was based on many Internet-years ago: http://howtonode.org/connect-it <- describes middleware/onion routing