Heroku terminates a request if the response takes more than 30 seconds to return. Is there any way to wait for as long as the response takes to come back?
The user uploads a file, I need to do some processing on it on my server, and once that is done I give the user a download link. Processing usually takes more than 30 seconds, so the user has to wait for the response.
From the official Heroku Helpcenter : https://devcenter.heroku.com/articles/request-timeout
The timeout value is not configurable. If your server requires longer than 30 seconds to complete a given request, we recommend moving that work to a background task or worker to periodically ping your server to see if the processing request has been finished. This pattern frees your web processes up to do more work, and decreases overall application response times.
The short answer is: no, you can't change this configuration. I suggest you investigate why your application needs more than 30 seconds to process that request. If it takes longer than 10 seconds, you really should consider the steps suggested in the Heroku Help Center 👆
Your Problem
You mention you need this for file processing. I understand that file processing could easily take longer than 30 seconds. Normally what I would do is create some sort of task reference and keep it in a database along with a status ("processing", "finished", "failed"), store the original file as well, and then end the user's request. This shouldn't take long. Then process the task in the background. With another endpoint or a websocket connection, the user can check whether the task has been fulfilled.
Use a Task Queue
The following is just a basic interpretation of a solution - it's not meant for copy & pasting as it depends on so many things.
Routes (Endpoints)
Basically you need to have 3 routes in your backend. One for uploading the file, one for downloading the processed file and one for checking the status of the task.
1. Upload
app.post('/files', /* some middleware e.g. multer */ async (req, res) => {
  // This is your upload controller.
  // I assume at this point the file has been uploaded and
  // req.file contains a reference to the uploaded file.

  // Create a new processing task and add it to the queue.
  const task = await createNewTask(req.file);
  queue.push(task);

  // A task has now been created, but the user doesn't need
  // to wait for it to finish, so let's end the request here.
  return res.status(200).json(task);
});
2. Check Status
app.get('/task/:id', async (req, res) => {
  // From uploading a file in the first step, you'll
  // get back a task id. Use the task id to check on
  // the status.
  const task = await getTask(req.params.id);
  if (!task) {
    return res.status(404).end();
  } else {
    return res.status(200).json(task);
  }
});
The task can include information such as status, progress percentage, original filename, new filename, or even a download link to the processed file once it's finished. Status could be something like pending, processing, finished or failed.
3. Download
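app.get('/file/:filename', (req, res) => {
  // sendFile needs an absolute path or a root option; using root also
  // keeps the requested name inside that directory.
  return res.status(200)
    .sendFile(req.params.filename, { root: './path/to/file/' });
});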
Notes
It might be a good idea to rename the incoming files with a random id like a uuid. So it's easier to work with them in the automation process. Also the random id could be used for the task id at the same time.
It's up to you how big you want to go with this. For the task queue there are many different libraries to help you out with it. It could be an in-memory queue or one that's backed with a database.
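As a rough illustration of the in-memory variant, here is a minimal sketch of what `createNewTask`, `getTask` and the queue from the routes above could look like. Everything here is an assumption made for illustration: the task shape, the `enqueue` helper (used instead of a bare `queue.push`) and the `processFile` callback are hypothetical, and `crypto.randomUUID()` needs a reasonably recent Node.js version. Tasks are lost on restart, so a database-backed queue is probably the safer choice on Heroku.

```javascript
const crypto = require('crypto');

// In-memory task store and queue (assumption: a single process, no persistence).
const tasks = new Map();
const pending = [];
let working = false;

// Registers a task for an uploaded file and returns it.
function createNewTask(file) {
  const task = { id: crypto.randomUUID(), file, status: 'pending', result: null };
  tasks.set(task.id, task);
  return task;
}

// Looks up a task by id; returns null when it is unknown.
function getTask(id) {
  return tasks.get(id) || null;
}

// Adds a task to the queue and starts the worker loop if it is idle.
// `processFile` is a hypothetical async function that does the actual work.
function enqueue(task, processFile) {
  pending.push({ task, processFile });
  if (!working) drain();
}

// Processes queued tasks one at a time and records the outcome on each task.
async function drain() {
  working = true;
  while (pending.length > 0) {
    const { task, processFile } = pending.shift();
    task.status = 'processing';
    try {
      task.result = await processFile(task.file);
      task.status = 'finished';
    } catch (err) {
      task.status = 'failed';
      task.error = String(err);
    }
  }
  working = false;
}
```

The upload route would call `createNewTask(req.file)` followed by `enqueue(task, processFile)` and respond immediately, while the status route only reads from `getTask`.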
Related
I'm currently looking to set up an endpoint that accepts a request, and returns the response data in increments as they load.
The application of this is that given one upload of data, I would like to calculate a number of different metrics for that data. As each metric gets calculated asynchronously, I want to return this metric's value to the front-end to render.
For testing, my controller looks as follows, trying to use res.write
uploadData = (req, res) => {
  res.write("test");
  setTimeout(() => {
    res.write("test 2");
    res.end();
  }, 3000);
}
However, I think the issue stems from my client-side which I'm writing in React-Redux, and calling that route through an Axios call. From my understanding, it's because the axios request closes once receiving the first response, and the connection doesn't stay open. Here is what my axios call looks like:
axios.post('/api', data)
  .then((response) => {
    console.log(response);
  })
  .catch((error) => {
    console.log(error);
  });
Is there an easy way to do this? I've also thought about streaming, but my concern is that I would like each connection to be direct and unique to a client, and open only for a short amount of time (i.e. only while the metrics are being calculated).
I should also mention that the resource being uploaded is a db, and I would like to avoid parsing and opening a connection multiple times as a result of multiple endpoints.
Thanks in advance, and please let me know if I can provide any more context
One way to handle this while still using a traditional API would be to store the metrics in an object somewhere, either a database or Redis for example, then simply poll the resource.
For a real world example, say you want to calculate the following metrics of foo, time completed, length of request, bar, foobar.
You could create an object in storage that looks like this:
{
  id: 1,
  lengthOfRequest: 123,
  .....
}
Then you would create an endpoint in your API, such as metrics/{id}, that returns the object. Just keep calling the route until everything completes.
There are some obvious drawbacks to this of course, but once you get enough information to know how long the metrics will take to complete on average you can tweak the time in between the calls to your API.
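A minimal sketch of the polling side might look like the following. The function name `pollMetrics`, the list of expected keys and the injected `fetchMetrics` function are all assumptions for illustration; in a browser `fetchMetrics` could be something like `id => axios.get('/metrics/' + id).then(r => r.data)`.

```javascript
// Polls a metrics resource until every expected metric is present.
// `fetchMetrics` is a hypothetical function that resolves to the stored object.
async function pollMetrics(id, expectedKeys, fetchMetrics, intervalMs = 1000, maxAttempts = 30) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const metrics = await fetchMetrics(id);
    // Done once every expected metric has a value.
    const done = expectedKeys.every((key) => metrics[key] !== undefined);
    if (done) return metrics;
    // Wait before asking again; tune this to the average calculation time.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Metrics did not complete in time');
}
```

Partially filled objects could also be rendered on each iteration, so the front end shows each metric as soon as it appears.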
I'm new to Node.js and I wonder whether the code below has a problem when several requests run at the same time.
Say I have a Node.js server (Express) that listens for a POST request:
// Define the handler before registering it; a `var` assigned later
// would still be undefined when app.post() is called.
var onPostRequest = function (req, res) {
  // parse request and fetch email list
  var emails = [....]; // pseudocode
  doJob(emails);
  res.status(200).end('OK');
};

app.post('/sync/:method', onPostRequest);

function doJob(_emails) {
  try {
    var emailsFromFile = fs.readFileSync(FILE_PATH, "utf8") || {};
    if (_.isString(emailsFromFile)) {
      emailsFromFile = JSON.parse(emailsFromFile);
    }
    _emails.forEach(function (_email) {
      if (!emailsFromFile[_email]) {
        emailsFromFile[_email] = 0;
      } else {
        emailsFromFile[_email] += 1;
      }
    });
    // write object back
    fs.writeFileSync(FILE_PATH, JSON.stringify(emailsFromFile));
  } catch (e) {
    console.error(e);
  }
}
So the doJob method receives the _emails list, and I update (counter +1) those emails in the emailsFromFile object loaded from the file.
Suppose I get 2 requests at the same time and doJob is triggered twice. I'm afraid that after one request has loaded emailsFromFile from the file, the second request might change the file content.
Can anybody shed some light on this issue?
Because the code in the doJob() function is all synchronous, there is no risk of multiple requests causing a concurrency problem.
If you were using async IO in that function, then there would be possible concurrency issues.
To explain, Javascript in node.js is single threaded. So, there is only one thread of Javascript execution running at a time and that thread of execution runs until it returns back to the event loop. So, any sequence of entirely synchronous code like you have in doJob() will run to completion without interruption.
If, on the other hand, you use any asynchronous operations such as fs.readFile() instead of fs.readFileSync(), then that thread of execution will return back to the event loop at the point you call fs.readFile() and another request can be run while it is reading the file. If that were the case, then you could end up with two requests conflicting over the same file. In that case, you would have to implement some form of concurrency protection (some sort of flag or queue). This is the type of thing that databases offer lots of features for.
I have a node.js app running on a Raspberry Pi that uses lots of async file I/O, and multiple requests can conflict in that code. I solved it by setting a flag any time I'm writing to a specific file. Any other request that wants to write to that file first checks the flag, and if it is set, the request goes into my own queue and is served when the prior request finishes its write operation. There are many other ways to solve that too. If this happens in a lot of places, then it's probably worth just getting a database that offers features for this type of write contention.
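To make the flag-or-queue idea concrete, here is one possible sketch, not the exact code from the Raspberry Pi app: a promise chain per file path, so each async read-modify-write waits for the previous one on the same file. The names `withFileLock` and `incrementCounters` are made up for this example, and the counter here starts at 1 on first sight, which differs slightly from the question's code.

```javascript
const fs = require('fs/promises');

// One promise chain per file path; each update waits for the previous one.
const chains = new Map();

function withFileLock(filePath, work) {
  const previous = chains.get(filePath) || Promise.resolve();
  const next = previous.then(() => work());
  // Store a version that never rejects so one failure does not block later updates.
  chains.set(filePath, next.catch(() => {}));
  return next;
}

// Async equivalent of doJob(): safe even though it yields to the event loop.
function incrementCounters(filePath, emails) {
  return withFileLock(filePath, async () => {
    let counters = {};
    try {
      counters = JSON.parse(await fs.readFile(filePath, 'utf8'));
    } catch (err) {
      if (err.code !== 'ENOENT') throw err; // a missing file means no counters yet
    }
    for (const email of emails) {
      counters[email] = (counters[email] || 0) + 1;
    }
    await fs.writeFile(filePath, JSON.stringify(counters));
    return counters;
  });
}
```

This only protects against conflicts inside one Node.js process; several processes writing the same file would still need a database or an OS-level lock.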
We have a NodeJS server with a Mongo DB. One feature generates a report JSON file from the DB, which can take a while (60 seconds or more, as it has to process hundreds of thousands of entries).
We want to run this as a background task. We need to be able to start a report build process, monitor it, and abort it if the user decides to change the params and rebuild it.
What is the simplest approach with node? We don't really want to get into the realms of separate worker servers processing jobs, message queues etc. We need to keep this on the same box with a fairly simple implementation.
1) Start the build as an async method, and return to the user, with socket.io reporting progress?
2) Spin off a child process for the build script?
3) Use something like https://www.npmjs.com/package/webworker-threads?
With the few approaches I've looked at I get stuck on the same two areas;
1) How to monitor progress?
2) How to abort an existing build process if the user re-submits data?
Any pointers would be greatly appreciated...
The best would be to separate this task from your main application. That said, it'd be easy to run it in the background.
To run it in the background and monitor it without a message queue etc., the easiest option would be a child_process.
1. Spawn the job from an endpoint (or URL) called by the user.
2. Next, set up a socket to return live monitoring of the child process.
3. Add another endpoint to stop the job, using a unique id returned by step 1 (or not, depending on your concurrency needs).
Some coding ideas:
var spawn = require('child_process').spawn
var job = null // keep the job in memory so it can be killed later

app.get('/save', function(req, res) {
  if (job && job.pid)
    return res.status(500).send('Job is already running')

  job = spawn('node', ['/path/to/save/job.js'], {
    detached: false, // if not detached and your main process dies, the child will be killed too
    stdio: [process.stdin, process.stdout, process.stderr] // these can be file streams for logs or whatever
  })

  job.on('close', function(code) {
    job = null
    // send socket information about the job ending
  })

  return res.status(201).end() // created; end the response so the client is not left waiting
})

app.get('/stop', function(req, res) {
  if (!job || !job.pid)
    return res.status(404).end()

  job.kill('SIGTERM')
  // or process.kill(job.pid, 'SIGTERM')
  job = null
  return res.status(200).end()
})

app.get('/isAlive', function(req, res) {
  // kill(0) sends no signal; it only reports whether the process can still be signalled
  if (job && job.kill(0))
    return res.status(200).end()
  return res.status(500).end()
})
To monitor the child process you could use pidusage; we use it in PM2, for example. Add a route to monitor a job and call it every second. Don't forget to release memory when the job ends.
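pidusage reports CPU and memory, not how far the report has got. For progress percentages, one option (an addition to the answer above, not part of it) is to have the child send messages over Node's built-in IPC channel. In this sketch the inline `childSource` string stands in for the real job script, and the message shape `{ progress }` and the name `startJob` are assumptions.

```javascript
const { spawn } = require('child_process');

// Stand-in for the report script; in practice this would be its own file.
// It reports progress to the parent through process.send().
const childSource = `
  let done = 0;
  const timer = setInterval(() => {
    done += 25;
    process.send({ progress: done });
    if (done >= 100) { clearInterval(timer); process.disconnect(); }
  }, 20);
`;

// Starts the job and calls onProgress(percent) for every progress message.
function startJob(onProgress) {
  const child = spawn(process.execPath, ['-e', childSource], {
    // The 'ipc' entry opens the channel that makes process.send() available in the child.
    stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
  });
  child.on('message', (msg) => onProgress(msg.progress));
  return child;
}
```

The `onProgress` callback is where a socket.io emit would go, and aborting a build when the user resubmits is a call to `kill('SIGTERM')` on the returned child before starting a new one.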
You might want to check out this library which will help you manage multi processing across microservices.
I would like to know how to use JavaScript to achieve my use case.
My app receives a POST request, increments a memcache key, and then immediately publishes the increased value to users (a mobile app) using a third-party API.
E.g. after the first request the value becomes 1, so publish 1.
After the second request the value becomes 2, so publish 2, and so on.
It works fine with fewer than 2k requests within 30 seconds.
If the number of requests goes up to 10k, users (the mobile app) may receive too many messages from the publisher, which drains the battery.
So I have to throttle the publishing calls: instead of publishing per request, I want to publish the value every second. After 1 second the value may be 1, so publish 1. After 2 seconds the value may be 100, so publish 100. That saves 99 publish calls.
When requests stop coming in, I don't want a worker to keep running every second.
Each time it increments, cache the new value to a global variable and post it to clients using setInterval. Here is a simple example:
var key = 0;

// Update the cache to the present
// value on application start
memcache.get('key', updateKey);

// Handle increment request and
// save the new value
app.post('/post', function(req, res){
  memcache.incr('key', updateKey);
  res.status(200).end(); // end the request; publishing happens on the timer
});

// Update the cached key
function updateKey(err, val){
  key = val;
}

// Publish to clients once
// a second
function publish(){
  clients.emit(key);
}

setInterval(publish, 1000);
Starting and stopping this routine is a little more involved and may depend on how you're serving requests / incrementing the value.
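One way to start and stop the routine, sketched under the assumption that a single Node.js process handles all requests: start the interval on the first update, publish only when the value changed since the last tick, and clear the interval after a tick with nothing new. The name `createThrottledPublisher` is made up for this example.

```javascript
// Publishes the latest value at most once per interval and stops its timer
// when no new value arrived during the last interval.
function createThrottledPublisher(publish, intervalMs = 1000) {
  let latest;
  let dirty = false;
  let timer = null;

  function tick() {
    if (dirty) {
      dirty = false;
      publish(latest);
    } else {
      clearInterval(timer); // idle: nothing new since the last tick
      timer = null;
    }
  }

  // Call this with every new value; it (re)starts the timer when needed.
  return function update(value) {
    latest = value;
    dirty = true;
    if (timer === null) timer = setInterval(tick, intervalMs);
  };
}
```

With the code above, `updateKey` would call `update(val)` instead of assigning `key`, and `publish` would be the function passed to `createThrottledPublisher`. The timer runs for one extra idle tick before it stops itself.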
Take a look at node-rate-limiter
You can implement it in a number of ways to solve your problem...
I am trying to upload multiple images simultaneously. On the server side I have to create one unique folder to hold those images. How can I make each request wait until the previous one completes before sending the response to the client? I tried the code below, but when I send the response I get this error: Can't set headers after they are sent. Please suggest a good solution.
When I upload five images, the server has to check whether the folder already exists and, if it does not, create a new folder for those five images. It then has to check whether a reference to that folder already exists in MongoDB and, if not, store it, and then send the response to the client. But uploading five images sends five requests to the server, and each request runs these steps before the first one has completed, so the same folder reference is stored several times and five folders are created for the five images.
function myMiddleware(req, res, next)
{
  console.info("inside myMiddleware");
  var handler = function()
  {
    console.info("middleware redundant. ActionDone, calling next");
    next();
  };
  EventManager.once("finished", handler);
  if (actionDone !== "working")
  {
    actionDone = "working";
    function doneWaiting(){
      // console.log("finished");
      actionDone = "finished";
      EventManager.emit("finished");
    }
    setTimeout(doneWaiting, 500);
  }
}
I think that what you need here is implementing a lock on the function that checks/creates the folder and does the database check/create.
Take a look at this article to see an example
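As one possible shape for such a lock, the sketch below shares a single in-flight promise per folder, so five concurrent requests all wait for the same folder creation and database write instead of each running their own. This is a variant of locking rather than a general mutex; `ensureFolder` and the `saveFolderReference` callback (which would store the reference in MongoDB) are hypothetical names.

```javascript
const fs = require('fs/promises');

// Folder path -> promise for the setup that is running or has completed.
const folderSetups = new Map();

// `saveFolderReference` is a hypothetical async function that stores the
// folder reference in the database if it is not there yet.
function ensureFolder(folderPath, saveFolderReference) {
  if (!folderSetups.has(folderPath)) {
    const setup = fs.mkdir(folderPath, { recursive: true }) // no error if it already exists
      .then(() => saveFolderReference(folderPath))
      .catch((err) => {
        folderSetups.delete(folderPath); // allow a later request to retry
        throw err;
      });
    folderSetups.set(folderPath, setup);
  }
  // Every concurrent caller gets the same promise.
  return folderSetups.get(folderPath);
}
```

In the upload middleware, each request would `await ensureFolder(...)` and then call `next()` exactly once. This works only within a single server process; with several processes, a unique index in MongoDB is probably the more reliable guard against duplicate references.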