I am working on a project in which I can create new files and move them to the trash. It also needs the functionality to automatically delete these trashed files after a specific time (in my case, 30 days). I searched the internet and found node-schedule. I created an API that runs the scheduler every day, but the problem is that every time I deploy the application I need to hit this API to start the scheduler. Here is my code:
const schedule = require('node-schedule');
const moment = require('moment');

export default async function handler(req, res) {
  const rule = new schedule.RecurrenceRule();
  rule.hour = 0; // run every day at midnight

  const job = schedule.scheduleJob(rule, function () {
    // delete files older than 30 days
  });

  res.status(200).end(); // res.send(200) would send "200" as the body
}
My question is: is there any way to get rid of hitting the API every time I deploy the application? Or is there a better idea for this task?
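For reference, here is a minimal sketch of registering the job once at process startup instead of inside a request handler, assuming the app runs as a single long-lived Node process (on serverless platforms there is no persistent process, so a platform-level cron trigger would be needed instead):

// scheduler.js: required once from the server entry point
const schedule = require('node-schedule');

const rule = new schedule.RecurrenceRule();
rule.hour = 0; // every day at midnight

schedule.scheduleJob(rule, () => {
  // delete trashed files older than 30 days
});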
Related
On Heroku, long requests can cause H12 timeout errors.
The request must then be processed...by your application...within 30 seconds to
avoid the timeout.
src
Heroku suggests moving long tasks to background jobs.
Sending an email...Accessing a remote API...
Web scraping / crawling...you should move this heavy lifting into a background job which can run asynchronously from your web request.
src
Heroku's docs say requests shouldn't take longer than 500ms to return a response.
It’s important for a web application to serve end-user requests as
fast as possible. A good rule of thumb is to avoid web requests which
run longer than 500ms. If you find that your app has requests that
take one, two, or more seconds to complete, then you should consider
using a background job instead.
src
So if I have a background job, how do I tell the frontend when the background job is done, and what the job returns?
On Heroku, the example code just returns the background job's id, but this won't give the frontend the information it needs.
app.post('/job', async (req, res) => {
  // Enqueue the work and respond immediately with the job id
  let job = await workQueue.add();
  res.json({ id: job.id });
});
For example, this method won't tell the frontend when an image is done being uploaded. Nor will the frontend know when a call to an external API, like an exchange rate API, returns a result, such as an exchange rate, and what that result is.
Someone suggested using job.finished(), but doesn't this get you back where you started? Now your requests are waiting for the queue to finish in order to respond, so your requests take the same length of time as when there was no queue, and this could lead to timeout errors again.
const result = await job.finished();
res.send(result);
This example uses Bull, Redis, and Node.js.
Someone suggested websockets. I didn't find an example of this yet.
The idea of using a queue for long tasks is that you post the task and
then return immediately. I guess you are updating the database as last
step in your job, and only use the completed event for notifying the
clients. What you need to do in this case is to implement either a
websocket or similar realtime communication and push the notification
to relevant clients. This can become complicated so you can save some
time with a solution like https://pusher.com/ or similar...
https://github.com/OptimalBits/bull/issues/1901
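Following that suggestion, here is a rough sketch of pushing the result with Bull's global completed event and socket.io. The httpServer variable, the job:<id> room naming, and the watch-job/job-done event names are assumptions made for illustration:

const Queue = require('bull');
const { Server } = require('socket.io');

const workQueue = new Queue('work', process.env.REDIS_URL);
const io = new Server(httpServer); // attach to your existing HTTP server

io.on('connection', (socket) => {
  // The client tells us which job id it is waiting on
  socket.on('watch-job', (jobId) => socket.join(`job:${jobId}`));
});

// Fires whenever any worker completes a job; the result arrives as a JSON string
workQueue.on('global:completed', (jobId, result) => {
  io.to(`job:${jobId}`).emit('job-done', { id: jobId, result: JSON.parse(result) });
});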
There is also a polling solution in Heroku's full example, which I didn't notice at first:
web server
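The server exposes a status route that the frontend polls. A sketch of what that endpoint might look like with Bull (the /job/:id route shape is inferred from the frontend's fetch calls below; this is not the verbatim Heroku code):

app.get('/job/:id', async (req, res) => {
  const job = await workQueue.getJob(req.params.id);

  if (job === null) {
    res.status(404).end();
  } else {
    const state = await job.getState(); // e.g. 'completed', 'failed'
    const result = job.returnvalue;     // set once the job completes
    res.json({ id: job.id, state, result });
  }
});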
frontend
// Fetch updates for each job
async function updateJobs() {
  for (let id of Object.keys(jobs)) {
    let res = await fetch(`/job/${id}`);
    let result = await res.json();
    if (!!jobs[id]) {
      jobs[id] = result;
    }
    render();
  }
}

// Attach click handlers and kick off background processes
window.onload = function() {
  document.querySelector("#add-job").addEventListener("click", addJob);
  document.querySelector("#clear").addEventListener("click", clear);
  setInterval(updateJobs, 200);
};
I am making a web application that scrapes news from a news site and saves it to my database (the scraping is done just for learning purposes). After the database is updated, all the stored data is sent to the user's frontend.
Here is the route responsible for the above action:
router.get('/news', postController.getNewsPost);
New news items are added to the site being scraped as the day passes. If no user logs in to my application, my database does not update, because the route mentioned above never fires.
I want my database to be updated periodically even when no users have logged into my web application.
I am new to backend development, so please guide me on how I can achieve this, and let me know if more information is required.
The simplest solution would be to use setInterval to execute a specific function every N milliseconds:
setInterval(function() {
  // scrape news and save it to the database every 5 seconds
}, 5000);
More versatile and robust solutions can be implemented using an external library for scheduling tasks, such as Agenda.
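For instance, a minimal sketch with Agenda; the job name 'scrape news', the interval, and the MongoDB address are placeholders:

const Agenda = require('agenda');

const agenda = new Agenda({ db: { address: 'mongodb://127.0.0.1/agenda-jobs' } });

// Define the job once...
agenda.define('scrape news', async () => {
  // scrape news and save it to the database
});

// ...then schedule it on an interval; the schedule is persisted in MongoDB
(async () => {
  await agenda.start();
  await agenda.every('30 minutes', 'scrape news');
})();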
You can use the node-cron module to schedule your scraping job.
Install node-cron using npm:
$ npm install --save node-cron
Import node-cron and schedule your scraping job:
const cron = require('node-cron');

// '* * * * *' runs every minute
cron.schedule('* * * * *', () => {
  console.log('Scraping....');
});
Here is the module link: https://www.npmjs.com/package/node-cron
You can use the following method:
$ npm install --save node-cron
const cron = require('node-cron');

// Field order: second (optional) minute hour day-of-month month day-of-week
cron.schedule('* * * * * *', () => {
  // update database code
});
Library link: https://www.npmjs.com/package/node-cron
I have a React app that wants to send a file to a Google Cloud Compute Engine VM instance, get it processed there, and then display it. This is currently working fine. To save money, I want to start the instance with an HTTP request from the React app and turn it off when the computations are done. I start the instance with the following function, taken from the official GCE documentation:
const functions = require('firebase-functions');
const compute = require('@google-cloud/compute');

// projectId, zone, and instanceName are defined elsewhere
async function startComputeEngine() {
  const instancesClient = new compute.InstancesClient();

  const [response] = await instancesClient.start({
    project: projectId,
    zone,
    instance: instanceName,
  });

  let operation = response.latestResponse;
  const operationsClient = new compute.ZoneOperationsClient();

  // Wait for the operation to complete.
  while (operation.status !== 'DONE') {
    operation = await operationsClient.wait({
      operation: operation.name,
      project: projectId,
      zone: operation.zone.split('/').pop(),
    });
    if (Array.isArray(operation)) {
      operation = operation[0];
    }
  }

  return 'Success: Instance started';
}
This also works, and the VM instance is started correctly.
When making an HTTP request to the VM instance after this, I get ERR_CONNECTION_TIMED_OUT. This is the same error I get if the VM is off.
If I delay the HTTP request to the VM by ~30 seconds, I get a different error: ERR_CONNECTION_REFUSED.
If I delay the HTTP request by ~60 seconds, it works as it should.
Even though the code that starts the VM instance waits for the startup operation to finish, it seems the VM needs additional time to boot up. Is there some way to detect when the HTTP endpoint is ready, preferably from within a Firebase function?
There is one way: you can use VM custom metadata. On your VM instance, once the HTTP server has started successfully, you can store a flag ServerRunning as true in the VM metadata:
// This code block is part of the VM
app.listen(PORT, function () {
  // Set the ServerRunning metadata flag to true here
});
See setMetadata.
And in your cloud function you can create a polling function that regularly queries the VM metadata and checks whether ServerRunning exists and is set to true. If it is, cancel the polling and execute the remaining code block.
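A minimal sketch of such a polling function, assuming the metadata is read through the InstancesClient from @google-cloud/compute; the flag name, the 2-second interval, and the attempt limit are placeholders:

const compute = require('@google-cloud/compute');

// Resolves once the ServerRunning metadata flag is "true";
// throws if the flag never appears within maxAttempts checks.
async function waitForServer(projectId, zone, instanceName, maxAttempts = 30) {
  const instancesClient = new compute.InstancesClient();

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const [instance] = await instancesClient.get({
      project: projectId,
      zone,
      instance: instanceName,
    });

    const items = (instance.metadata && instance.metadata.items) || [];
    const flag = items.find((item) => item.key === 'ServerRunning');
    if (flag && flag.value === 'true') {
      return; // server is up, safe to send the HTTP request
    }

    await new Promise((resolve) => setTimeout(resolve, 2000)); // wait 2 s between checks
  }

  throw new Error('Timed out waiting for the ServerRunning flag');
}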
I have several Selenium scripts that scrape websites. I need these scripts to run automatically every 10 minutes and then send the results to my MongoDB database. I know how to send and store data in the database, but I can't figure out how to run the scripts automatically every X amount of time and then update the database without having to do anything myself.
The backend uses Node, Express, and Mongoose. This is what I tried:
const router = require('express').Router();
const WebScript = require('../Scripts/WebScript.js');

router.get('/script/web-script', async (req, res) => {
  const results = await WebScript.Script();
  console.log(results);
  res.json(results); // respond so the request doesn't hang
});

module.exports = router;
The script runs if I call the route on my localhost, but otherwise it doesn't start automatically. I've set up a server.js that is connected to my MongoDB database, and I've set up a schema to store the results in MongoDB. console.log(results) prints the scraped data like I want it to, but I just can't figure out how to run this automatically when I start the server, and also make it run every 10 minutes after that.
There are several options:
Use a pinger
Pingers are basically robots that visit your site every set amount of time (usually to keep your site from sleeping). A good choice is Uptime Robot. Since the script executes every time the route is visited, pinging the route on a schedule effectively runs the script on a schedule.
Use setInterval
Set a simple repeating timer in JavaScript (note that setTimeout fires only once; setInterval repeats):
setInterval(function() {
  // Run your script here
}, 1000 * 60 * 10); // milliseconds in 10 minutes
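Alternatively, a sketch using node-cron (shown in the answers above) that starts the schedule as soon as the server boots, reusing the WebScript module from the question; require this file from server.js so it runs at startup:

const cron = require('node-cron');
const WebScript = require('../Scripts/WebScript.js'); // module from the question

// '*/10 * * * *' fires at every 10th minute
cron.schedule('*/10 * * * *', async () => {
  const results = await WebScript.Script();
  console.log(results); // save the results to MongoDB here
});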
I am currently developing my own desktop clock app, and after successfully receiving the current date and time via a custom API and locally, I've come to a point where serious complications may occur in the future.
With the current implementation (local time), the time is updated locally, every minute, per app instance.
It would be unnecessarily silly to try to achieve the same for the server time, i.e., to send a GET request to the server each minute from every existing app instance...
So, here comes my question: what are the more efficient alternatives?
P.S. The server environment is Node.js.
The received time is in JSON form.
You can try this code. It takes the server time once and then advances it locally every second, so no further requests are needed:
const serverTime = new Date(); // let's say this is the time from the server

const timer = (time) => {
  setTimeout(() => {
    time.setSeconds(time.getSeconds() + 1); // advance the clock by one second
    console.log(time);
    timer(time); // reschedule for the next second
  }, 1000);
};

timer(serverTime);