Node.js streams - how to abort reading on error

I am using the 'multiparty' parser to read and process multipart form data in my application. My middleware reads the uploaded file contents, parses them on the fly, and, if successful, hands off to the next middleware. If not, I have to abort the request and return an error.
/**
 * Middleware to read and parse uploaded multipart form data
 * for campaign creation.
 * After a successful parse, hands off to the next layer.
 **/
var mp = require('multiparty');
module.exports = function(req, res, next) {
    console.log("File processor at work...");
    var form = new mp.Form();
    form.on('part', function(part) {
        if (!part.filename) {
            part.resume();
        } else {
            part.setEncoding('utf8');
            part.on('readable', function() {
                readFile(req, res, part);
            });
            part.on('error', function(e) {
                console.log("Am I reached??");
                //return res.status(400).json(e);
                form.emit('error', e);
            });
        }
    });
    form.on('error', function(err) {
        console.log("Error while processing upload.", 3);
        res.status(400).json(err);
    });
    form.on('close', function() {
        console.log("Reading finished. Forwarding to next layer...");
        return next();
    });
    form.parse(req);
};
In case of any errors during parsing (i.e. in readFile()), I want to return an HTTP error without having to consume the remaining part buffer. I am sure there is a decent way to do this, but I am not getting it right.
I tried throwing an exception from within my readFile() and catching it in form.on('part', ...). Even though I was able to catch the exception, it didn't abort the flow. A return from form.on('part', ...) would return from that event handler, but not from the outer function.
As per the Node.js streams documentation, I tried emitting an error event from within readFile() and handling the error in part.on('error', ...). This gives the same behaviour and does not end the processing.
What am I missing here? Is there a proper way to tell the stream that I don't want to process it any further?
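(One possible pattern, sketched below; it is not from the original post. The idea is to track an aborted flag, answer the client once, and destroy the request stream so the rest of the upload is never consumed.)

var mp = require('multiparty');
module.exports = function(req, res, next) {
    var form = new mp.Form();
    var aborted = false;
    form.on('part', function(part) {
        if (!part.filename) return part.resume();
        part.setEncoding('utf8');
        part.on('readable', function() {
            readFile(req, res, part); // readFile is from the original post
        });
        part.on('error', function(e) {
            form.emit('error', e); // funnel part errors into one handler
        });
    });
    form.on('error', function(err) {
        if (aborted) return; // respond only once
        aborted = true;
        res.status(400).json({ error: err.message });
        // Stop consuming the rest of the upload. Destroying the socket is
        // abrupt (the client may see a reset before reading the 400);
        // the gentler alternative is req.resume() to silently drain it.
        req.destroy();
    });
    form.on('close', function() {
        if (!aborted) next();
    });
    form.parse(req);
};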

Related

Formidable using "end" event with file upload

I am using Formidable with Express in Node.js in an attempt to have a simple single-file upload scheme. I have confirmed that a file is actually sent over from the client side, but it seems to run into trouble on the server side.
index.js
const formidable = require('formidable');
const fs = require('fs');

app.post('/', (req, res) => {
    const form = formidable();
    form.on('file', (filename, file) => {
        fs.rename(file.path, `./data/nodes.csv`, err => {
            if (err) {
                console.log(`There was an error in downloading a CSV file: ${err}`);
                return null;
            } else {
                console.log("CSV file has been uploaded correctly.");
            }
        });
    });
    form.on('error', err => {
        console.log(`There was an error in downloading a CSV file: ${err}`);
        return null;
    });
    form.on('end', () => {
        console.log(fs.readFileSync('./data/nodes.csv')); // test to see if file exists
        const nodes = assignMetrics();
        console.log(nodes);
        return nodes;
    });
    form.parse(req);
});
The main trouble seems to be that the form.on('end', ...) event does not wait until the file has finished uploading before it fires. I have confirmed this by trying to read the file in that event handler, but by that point it doesn't exist. The documentation, though, appears to suggest it should only fire "after all files have been flushed [from the API's pipe, it infers]".
There appear to be no other events available that wait until the file has been uploaded. I also don't want to start throwing in layers of promises and such unless it is the only option, as each new layer of promises is a chance for unintended effects to happen. (One promise-free option is sketched below.)
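A minimal sketch of that option (not from the original post): move the follow-up work into the fs.rename callback, which only runs once the file is actually in place, instead of relying on 'end'.

form.on('file', (filename, file) => {
    fs.rename(file.path, './data/nodes.csv', err => {
        if (err) {
            console.log(`There was an error in downloading a CSV file: ${err}`);
            return;
        }
        // The file is guaranteed to exist at this point.
        console.log(fs.readFileSync('./data/nodes.csv'));
        const nodes = assignMetrics(); // assignMetrics comes from the original post
        console.log(nodes);
    });
});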

How to auto send error of try catch block to Sentry in Node JS

I am building a Node.js application and using Sentry (https://docs.sentry.io/platforms/node/) to monitor and report errors. But I am having a problem with global reporting for try/catch blocks.
For example, I have a code block as follows:
const getUser = async (id) => {
    try {
        // do the database operation and return the user
    } catch (e) {
        return {
            data: null,
            message: e.message
        };
    }
};
As you can see in the code, I am catching the error in the try/catch block. If I want to report the error to Sentry, I have to put the following line in the catch block:
Sentry.captureException(e);
Basically, I am explicitly reporting the error. Is there a way to globally and automatically catch errors within catch blocks and report them to Sentry? For example, something like in PHP or Laravel, where we just initialize and configure Sentry in one centralized place of the application and the app reports any errors to Sentry.
Sentry starts monitoring the whole application just by adding the init function somewhere in a global scope. For example:
Sentry.init({
    debug: appConfig.env === 'staging',
    dsn: appConfig.sentryDSN,
    environment: appConfig.env,
    integrations: [new Integrations.BrowserTracing()],
    release: [pjson.name, pjson.version].join('#'),
    tracesSampleRate: 1.0,
    ignoreErrors: [],
    normalizeDepth: 10, // or however deep you want your state context to be
    // (breadcrumb: Breadcrumb, hint?: BreadcrumbHint | undefined) => Breadcrumb | null
    beforeBreadcrumb(breadcrumb, hint) {
        return breadcrumb.category === 'xhr' ? breadcrumb : null;
    },
});
You can stick with just that conf/init if you like; it captures some errors on its own, namely every error that inherits from the Error object (TypeError, RangeError, ReferenceError, SyntaxError, etc.). For more, see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Error
But it's better to handle errors explicitly and keep the power to control what you send to Sentry (add/filter breadcrumbs, add tags, extra data, etc.).
In my React app I have a middleware that all errors are sent to, with the handling logic inside it.
Similarly, I'd suggest an error middleware that all errors are sent to, where you explicitly handle them and send them to Sentry.
I assume the tech stack is Node + Express, so I'd suggest calling next(error) in the route's catch block:
router.get('/path', function(req, res, next) {
    const getUser = async (id) => {
        try {
            // do the database operation and return the user
        } catch (error) {
            // return {
            //     data: null,
            //     message: e.message
            // }
            next(error);
        }
    };
});
Or, as of Express 5, route handlers that return a Promise will call next(value) automatically when they reject or throw an error:
app.get('/path', async function(req, res, next) {
    var user = await getUser(id);
    res.send(user);
});
And in app.js, you add the error-handler middleware that handles the errors:
app.use(function(err, req, res, next) {
    // filter the error
    // send it to Sentry
    if (condition) {
        // add a custom breadcrumb
        Sentry.addBreadcrumb({
            type: Sentry.Severity.Error,
            category,
            message,
            level: Sentry.Severity.Error,
        });
        Sentry.configureScope((scope) => {
            scope.setTag('section', section); // add a tag
            Sentry.captureException(err, scope); // capture the error
        });
    }
});
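For Express specifically, @sentry/node (v7 and earlier) also ships request/error handler middleware that reports anything passed to next(error) automatically. A minimal sketch, assuming that Handlers API (the DSN source is a placeholder); note that with Express 4 you still need next(err) in async catch blocks, as shown above:

const express = require('express');
const Sentry = require('@sentry/node');

const app = express();
Sentry.init({ dsn: process.env.SENTRY_DSN });

// Must be the first middleware so request data is attached to events.
app.use(Sentry.Handlers.requestHandler());

// ... your routes; errors passed to next(err) end up below ...

// Must come after all routes: reports the error, then hands it on.
app.use(Sentry.Handlers.errorHandler());

// Your own error formatter still runs last.
app.use(function(err, req, res, next) {
    res.status(500).json({ message: err.message });
});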

Nodejs global variable scope issue

I'm quite new to Node.js. In the following code I am getting JSON data from an API.
let data_json = ''; // global variable
app.get('/', (req, res) => {
    request('http://my-api.com/data-export.json', (error, response, body) => {
        data_json = JSON.parse(body);
        console.log(data_json); // data prints successfully
    });
    console.log(data_json, 'Data Test - outside request code'); // no data is printed
});
data_json is my global variable, and I assign it the data returned by the request function. Within that callback the JSON data prints just fine, but when I try printing the same data outside the request function, nothing prints out.
What mistake am I making?
Instead of waiting for request to resolve (get data from your API), Node.js executes the code outside the callback first; it prints nothing because there is still nothing there at the moment of execution. Only after Node gets the data from your API (which takes a few milliseconds) does it execute the code inside the request callback. This is because Node.js is asynchronous and non-blocking: it will not halt the code until your API returns data, it just keeps going and finishes the callback later, when it gets the response.
It's good practice to do all of the data manipulation you want inside the callback function; unfortunately, you can't rely on the structure you have.
Here's your code again, with comments marking the order of operations:
let data_json = ''; // global variable
app.get('/', (req, res) => {
    // Node.js STARTS executing this code
    request('http://my-api.com/data-export.json', (error, response, body) => {
        // Node.js executes this code last, after the data is loaded from the server
        data_json = JSON.parse(body);
        console.log(data_json);
        // You should do all of your data_json manipulation here,
        // e.g. saving stuff to the database, processing data, the usual logic
    });
    // Node.js executes this code 2nd, before your server responds with data,
    // because it doesn't want to block the entire code until it gets a response
    console.log(data_json, 'Data Test - outside request code');
});
So let's say you want to make another request with the data from the first request - you will have to do something like this:
request('https://your-api.com/export-data.json', (err, res, body) => {
    request('https://your-api.com/2nd-endpoint.json', (err, res, body) => {
        // process data and repeat
    });
});
As you can see, that pattern can become very messy very quickly; this is called callback hell. To avoid having a lot of nested requests, there is syntactic sugar that makes this code look far more readable and maintainable: the async/await pattern. Here's how it works:
let data_json = '';
// note: this assumes a promise-returning request (e.g. request-promise-native, mentioned below)
app.get('/', async (req, res) => {
    try {
        let response = await request('https://your-api.com/endpoint');
        data_json = response.body;
    } catch (error) {
        // handle the error how you see fit
    }
    console.log(data_json); // it will work
});
This code does the same thing as the one you have, but the difference is that you can make as many await request(...) calls as you want, one after another, with no nesting.
The only differences are that you have to declare your function asynchronous (async (req, res) => {...}) and that every let var = await request(...) needs to sit inside a try-catch block, so you can catch errors. You can have all of your requests inside the same try block if you think that's appropriate.
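For instance, the two nested requests from the earlier example flatten out like this (a sketch; like the snippet above, it assumes request has been promisified to resolve with the full response object):

app.get('/', async (req, res) => {
    try {
        const first = await request('https://your-api.com/export-data.json');
        const second = await request('https://your-api.com/2nd-endpoint.json');
        res.send({ first: first.body, second: second.body });
    } catch (error) {
        res.status(500).send(error.message);
    }
});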
Hopefully this helped a bit :)
The console.log occurs before your request completes; check out the ways to get asynchronous data: callbacks, promises, or async/await. Node.js APIs are async (most of them), so the outer console.log executes before the request API call completes.
let data_json = ''; // global variable
app.get('/', (req, res) => {
    let pr = new Promise(function(resolve, reject) {
        request('http://my-api.com/data-export.json', (error, response, body) => {
            if (error) {
                reject(error);
            } else {
                data_json = JSON.parse(body);
                console.log(data_json); // data prints successfully
                resolve(data_json);
            }
        });
    });
    pr.then(function(data) {
        // data will also hold data_json
        // handle the response here
        console.log(data_json); // data prints successfully
    }).catch(function(err) {
        // handle the error here
    });
});
If you don't want to create a promise wrapper, you can use request-promise-native (uses native Promises) created by the Request module team.
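A minimal sketch of that approach (the endpoint URL is the placeholder from the question); by default, request-promise-native resolves directly with the response body:

const rp = require('request-promise-native');

app.get('/', async (req, res) => {
    try {
        const body = await rp('http://my-api.com/data-export.json');
        const data_json = JSON.parse(body);
        res.send(data_json);
    } catch (err) {
        res.status(500).send(err.message);
    }
});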
Learn callbacks, promises and of course async-await.

Send 'Received post' back to requester before async finishes (NodeJS, ExpressJS)

I have an API POST route where I receive data from a client and upload the data to another service. This upload is done inside the POST request (async) and takes a while. The client wants to know their POST request was received before the async createProject function finishes. How can I send something without ending the POST? (res.send stops the request; res.write doesn't send it out.)
I thought about making an HTTP request back to their server as soon as this POST route is hit...
app.post('/v0/projects', function postProjects(req, res, next) {
    console.log('POST notice to me');
    // *** HERE, I want to send the client a message
    // This is the async function
    createProject(req.body, function (projectResponse) {
        projectResponse.on('data', function (data) {
            parseString(data.toString('ascii'), function (err, result) {
                res.message = result;
            });
        });
        projectResponse.on('end', function () {
            if (res.message.error) {
                console.log('MY ERROR: ' + JSON.stringify(res.message.error));
                next(new Error(res));
            } else {
                // *** HERE is where they finally receive a message
                res.status(200).send(res.message);
            }
        });
        projectResponse.on('error', function (err) {
            res.status(500).send(err.message);
        });
    });
});
The internal system requires that this createProject function is called in the POST request (needs to exist and have something uploaded or else it doesn't exist) -- otherwise I'd call it later.
Thank you!
I don't think you can send a first response saying the POST was received and then send another when the internal job (createProject) has finished, regardless of success or failure.
But possibly, you can try:
createProject(payload, callback); // async; will let you know when done, and will push payload.jobId into jobsDone
Possibility 1: if the actual job response is not required:
app.post('/v0/projects', function (req, res, next) {
    // call any async job(s) here
    createProject(req.body);
    res.send('Hey Client! I have received the post request, stay tuned!');
    next();
});
Possibility 2: if the actual job response is required, try maintaining a queue:
var q = []; // try possibility 3 if this is not making sense
var jobsDone = []; // this will be updated by the `createProject` callback
app.post('/v0/projects', function (req, res, next) {
    // call the async job and push it to the queue
    let jobId = randomId(); // generates a random but unique id per request received
    q.push({ jobId: jobId });
    req.body.jobId = jobId;
    createProject(req.body);
    res.send('Hey Client! I have received the post request, stay tuned!');
    next();
});

// hit this api after some time to find out whether the job is done or not
app.get('/v0/status/:jobId', function (req, res, next) {
    // check if the job is done;
    // based on that, remove it from **q**, retry, or whatever is needed
    let result = jobsDone.indexOf(req.params.jobId) > -1 ? 'Done' : 'Still Processing';
    res.send(result);
    next();
});
Possibility 3: Redis can be used instead of the in-memory queue from possibility 2, as sketched below.
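A minimal sketch of possibility 3, assuming the node-redis v4 client and the same randomId()/createProject helpers as above (createProject's callback would mark completion with redis.sAdd('jobsDone', jobId)):

const { createClient } = require('redis');
const redis = createClient();
redis.connect(); // in real code, await this before serving requests

app.post('/v0/projects', function (req, res) {
    const jobId = randomId(); // same helper as in possibility 2
    req.body.jobId = jobId;
    createProject(req.body);
    res.send('Hey Client! I have received the post request, stay tuned!');
});

app.get('/v0/status/:jobId', async function (req, res) {
    const done = await redis.sIsMember('jobsDone', req.params.jobId);
    res.send(done ? 'Done' : 'Still Processing');
});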
P.S. There are other options available to achieve the desired result; the above are just some possible ones.

Node.js Streaming/Piping Error Handling (Change Response Status on Error)

I have millions of rows in my Cassandra db that I want to stream to the client in a zip file (I don't want a potentially huge zip file in memory). I am using the stream() function from the Cassandra-Node driver, piping to a Transformer which extracts the one field from each row that I care about and appends a newline, and piping that to archiver, which pipes to the Express Response object. This seems to work fine, but I can't figure out how to properly handle errors during streaming.
I have to set the appropriate headers/status before streaming for the client, but if there is an error during the streaming, on the dbStream for example, I want to clean up all of the pipes and reset the response status to something like 404. But if I try to reset the status after the headers are set and the streaming starts, I get "Can't set headers after they are sent." I've looked all over and can't find how to properly handle errors in Node when piping/streaming to the Response object. How can the client tell whether valid data was actually streamed if I can't send a proper response code on error? Can anyone help?
function streamNamesToWriteStream(query, res, options) {
    return new Promise((resolve, reject) => {
        let success = true;
        const dbStream = db.client.stream(query);
        const rowTransformer = new Transform({
            objectMode: true,
            transform(row, encoding, callback) {
                try {
                    const vote = row.name + '\n';
                    callback(null, vote);
                } catch (err) {
                    callback(null, err.message + '\n');
                }
            }
        });
        // Handle res events
        res.on('error', (err) => {
            logger.error(`res ${res} error`);
            return reject(err);
        });
        dbStream.on('error', function(err) {
            res.status(404).send(); // Can't set headers after they are sent.
            logger.debug(`dbStream error: ${err}`);
            success = false;
            //res.end();
            //return reject(err);
        });
        res.writeHead(200, {
            'Content-Type': 'application/zip',
            'Content-disposition': 'attachment; filename=myFile.zip'
        });
        const archive = archiver.create('zip');
        archive.on('error', function(err) { throw err; });
        archive.on('end', function(err) {
            logger.debug(`Archive done`);
            //res.status(404).end()
        });
        archive.pipe(res, {
            //end: false
        });
        archive.append(dbStream.pipe(rowTransformer), { name: 'file1.txt' });
        archive.finalize();
    });
}
Obviously it's too late to change the headers, so there will have to be application logic to detect a problem. Here are some ideas:
Write an unambiguous sentinel of some kind at the end of the stream when an error occurs. The consumer of the zip file will then need to look for that value to check for a problem. (A sketch of this follows below.)
Perhaps more simply, have the consumer verify the integrity of the zip archive. Presumably, if the stream fails, the zip will be corrupted.
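One way the sentinel idea might look, reusing archive and dbStream from the question (the ERROR.txt entry name and message format are assumptions); the key change is finalizing only once the outcome of the stream is known:

// Instead of calling archive.finalize() unconditionally:
dbStream.on('end', function () {
    archive.finalize();
});
dbStream.on('error', function (err) {
    logger.debug(`dbStream error: ${err}`);
    // Too late to change the status code, so ship a sentinel entry
    // that the consumer checks for before trusting file1.txt.
    archive.append('STREAM-ERROR: ' + err.message, { name: 'ERROR.txt' });
    archive.finalize();
});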
