Website gets slow when cron job is executing - cron

I have a website that is live. I have a cron job that executes every 24 hours. the cron job fetches and analyzes the data from a database table.
The problem is that the website gets very slow during the time when cron job is running. And gets back to normal after that. It gives me error Too many connections during this time.
I set the maximum allowed connections to 500 in mysql. The number of active connections that I checked in mysql were less than limit during that time.
I am unable to find any relevant help or even a clue to think in a particular direction.
Update:
I noticed one thing. the number of mysql connection continuously increases in this time. Although still less than the maximum limit.

nice command can change priority of a process. You want to lower the priority of the background process so it will try not to execute be executing while the website is being busy. E.g.
0 3 * * * nice -n 20 myjob arg arg
to execute myjob arg arg with lowered priority every day at 3am.
EDIT: Although, if the job is spending most of its time in database queries, this will not affect it much. MySQL has LOW_PRIORITY flag for INSERT and UPDATE statements that will do kind of the same thing for those queries.

Related

How to schedule millions of jobs in a node js properly?

I am using NodeJS,MongoDB and node-cron npm module to schedule jobs. For 10K of jobs it is taking less time and less memory. But when i am scheduling 100k jobs it is taking more than 10 minutes to schedule jobs and taking nearly 1.5GB of RAM and some times out of memory. Is there any best way achieve this like using activemq or rabbitmq?
One strategy is that you only schedule the next job to run. When it runs, you query the database and find the next job and schedule it.
If you add a new job, you check if it wants to run sooner than the now current next job and, if so, you schedule it and deschedule the previous next job (it will get rescheduled later after this new job runs).
If you remove a job, you check if it is the current next job. If it is, you deschedule it and find the next job in the database and schedule it.
If your database is configured for efficiently querying by job run time, this can be very efficient, uses hardly any memory and scales to an infinitely large number of jobs.

Cron Job sometimes "missed | Too late for the schedule" sometimes running

Sometimes Crons is working sometimes getting missed. I have attached all setting and result. Anyone can check and revert.
It's completely normal behaviour. Some jobs are skipped caused the time frame is out of scheduled time for specified cron job. In your case the reindex process is scheduled every 1 minute. If there is more things to index (lot of changes on products, categories etc.) one minute is's not enough to complete. Also there is only one process per cron group, in your case index. Use Separate Process in cron configuration means that indexes process will run as separate process in relation to other cron groups.

Run cron at a different frequency in specific interval

I have a cron job that runs every 30 minutes, starting 10 minutes past a whole hour:
0+10/30+*+*+*+?
Now, this needs to be changed, so that in a specific time interval, it runs every 15 minutes instead. E.g. at 7.50, 8.05, 8.20 and 8.35. Then every 30 minutes again.
Is this possible with a single cron job and if so, how? Or do I need multiple jobs to accomplish this?
Thank you in advance.
not easy in a single cron, and that is also hard to read.
multiple jobs may work fine and show much clear
// This will start at 1:10am, and every 30minutes run once.
0+10/30+1-23/2+*+*+?
// This will start at 0:10am, and every 15minutes run once.
0+10/15+0-24/2+*+*+?
you may also consider to void the two job running at the same time.
As far as I've understood, this is not possible within a single cron job.
setup cron from morning to evening only points out that three different cron jobs are needed, so I am closing my question.

How does locking jobs work in agenda node.js?

We have been using agenda for sometime in our node.js server and are confused about how the job locking mechanism works in agenda.
Sometimes we see that the 'lockedAt' field for a job in the database has a non-null value and it changes to null suddenly.
Sometimes, the 'lockedAt' value stays as null and jobs get stuck.
Sometimes, after 10 minutes the 'lockedAt' value gets updated, but still the job is stuck.
And if I restart the node.js server, all the locked jobs get unlocked and get executed properly.
Why is this locking mechanism needed ? What does it do ?
When you running a Job, the db entry will automatically set a lockedAt timestamp, this ensure no other node instance will run the same job. It means "hey I am working on this job, don't touch it"
When another node instance read the job, it checks current time against the lock timestamp, if it has been less than 10 minutes, it will not run the job.
This "10 minutes" is stored in agenda global config or job config options, both are from code. You can change it by setting defaultlocklifetimenumber to a different value.

Best practice beanstalkd (queue) and node.js

I currently do service using beanstalkd and node.js.
I would like when jobs fail, retry n time before give up the job.
If the job succede i want do it the same job 10 time.
So, what is the best practice, stock in mongo db with the jobId the error and success count, or delete and put a new job with a an error and success count in the body.
I dont know if i'm clear? so tell me , thanks a lot
There is a stats-job <id>\r\n that should also be available via the API library that returns, among other things, how many times the specific job has been reserved, released, buried, and so on.
This allows for a number of retries of failed jobs by checking previous reservation/releases.
To run the same job multiple times, I would personally create either one additional job, with a success count that would then be incremented (into another new job) - or, all nine new jobs, with optional delays before they start.
You have a couple of ways to do this:
you can release the job, and obtain from stats the number of reserves
you can put a new job with a retry count, and keep track of history in the data payload
You should do the later, and you don't need MongoDB as a second dependency.

Resources