We have been using agenda for some time in our Node.js server and are confused about how its job locking mechanism works.
Sometimes we see that the 'lockedAt' field for a job in the database has a non-null value and it changes to null suddenly.
Sometimes, the 'lockedAt' value stays as null and jobs get stuck.
Sometimes, after 10 minutes the 'lockedAt' value gets updated, but still the job is stuck.
And if I restart the node.js server, all the locked jobs get unlocked and get executed properly.
Why is this locking mechanism needed? What does it do?
When a job starts running, agenda automatically sets a lockedAt timestamp on its database entry. This ensures that no other Node instance runs the same job. It means "hey, I am working on this job, don't touch it".
When another Node instance reads the job, it checks the current time against the lock timestamp; if less than 10 minutes have passed, it will not run the job.
This "10 minutes" comes from the agenda global config or from the job's config options, both of which are set in code. You can change it by setting defaultLockLifetime (a number of milliseconds) to a different value.
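The check described above can be modelled in a few lines. This is a simplified sketch of the idea, not agenda's actual source: the function name canRunJob and the job object shape are made up for illustration, and the 10 minute lifetime is the default mentioned above.

```javascript
// Simplified model of the lock check (not agenda's real implementation).
const DEFAULT_LOCK_LIFETIME_MS = 10 * 60 * 1000; // 10 minutes

// Hypothetical helper: may this instance pick up the job right now?
function canRunJob(job, now = new Date(), lockLifetimeMs = DEFAULT_LOCK_LIFETIME_MS) {
  if (job.lockedAt == null) {
    return true; // nobody has claimed the job
  }
  const lockAgeMs = now.getTime() - new Date(job.lockedAt).getTime();
  // A lock at least as old as the lifetime is treated as expired.
  return lockAgeMs >= lockLifetimeMs;
}

const now = new Date('2024-01-01T12:00:00Z');
console.log(canRunJob({ lockedAt: null }, now));                   // true
console.log(canRunJob({ lockedAt: '2024-01-01T11:55:00Z' }, now)); // false
console.log(canRunJob({ lockedAt: '2024-01-01T11:45:00Z' }, now)); // true
```

If a job legitimately runs longer than the lock lifetime, this model shows why a second instance may pick it up, so a longer lifetime for such jobs is worth considering.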
I am using Node.js, MongoDB and the node-cron npm module to schedule jobs. With 10K jobs, scheduling takes little time and memory. But when I schedule 100K jobs, it takes more than 10 minutes and nearly 1.5GB of RAM, and sometimes runs out of memory. Is there a better way to achieve this, such as using ActiveMQ or RabbitMQ?
One strategy is that you only schedule the next job to run. When it runs, you query the database and find the next job and schedule it.
If you add a new job, you check whether it is due sooner than the job currently scheduled as next. If so, you schedule the new job and deschedule the previous next job (it will be rescheduled later, after the new job runs).
If you remove a job, you check if it is the current next job. If it is, you deschedule it and find the next job in the database and schedule it.
If your database is set up to query efficiently by job run time (for example, with an index on that field), this can be very efficient: it uses hardly any memory, because only one job is scheduled in the process at a time, no matter how many jobs are stored.
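The strategy above can be sketched as follows. This is a minimal illustration, not a production scheduler: an in-memory array stands in for the database, and the class name NextJobScheduler, its methods, and the job shape { id, runAt } are all assumptions made for the example.

```javascript
// Sketch: keep only ONE timer alive, for the job that is due next.
// The jobs array stands in for a database table indexed by runAt (ms timestamp).
const MAX_TIMEOUT_MS = 2 ** 31 - 1; // largest delay setTimeout accepts

class NextJobScheduler {
  constructor(runJob) {
    this.runJob = runJob;    // callback invoked when a job is due
    this.jobs = [];
    this.timer = null;
    this.scheduledId = null; // id of the job the timer is waiting for
  }

  // Stand-in for a "find the job with the smallest runAt" query.
  findNext() {
    return this.jobs.reduce(
      (best, job) => (best === null || job.runAt < best.runAt ? job : best),
      null
    );
  }

  // (Re)arm the single timer for whichever job is due next.
  reschedule() {
    clearTimeout(this.timer);
    this.timer = null;
    this.scheduledId = null;
    const next = this.findNext();
    if (next === null) return;
    this.scheduledId = next.id;
    const delay = Math.min(Math.max(0, next.runAt - Date.now()), MAX_TIMEOUT_MS);
    this.timer = setTimeout(() => this.onTimer(next), delay);
  }

  onTimer(job) {
    // The delay may have been clamped, so only run the job if it is really due.
    if (job.runAt <= Date.now()) {
      this.jobs = this.jobs.filter((j) => j.id !== job.id);
      this.runJob(job);
    }
    this.reschedule(); // look up the following job
  }

  add(job) {
    this.jobs.push(job);
    const current = this.jobs.find((j) => j.id === this.scheduledId);
    // Only touch the timer if the new job is due sooner than the current next job.
    if (!current || job.runAt < current.runAt) this.reschedule();
  }

  remove(id) {
    this.jobs = this.jobs.filter((j) => j.id !== id);
    if (id === this.scheduledId) this.reschedule();
  }
}

const scheduler = new NextJobScheduler((job) => console.log('running', job.id));
const start = Date.now();
scheduler.add({ id: 'a', runAt: start + 60000 });
scheduler.add({ id: 'b', runAt: start + 30000 }); // sooner, so it replaces 'a' as next
console.log(scheduler.scheduledId); // 'b'
scheduler.remove('b');              // 'a' becomes the next job again
console.log(scheduler.scheduledId); // 'a'
scheduler.remove('a');              // nothing left, so the timer is cleared
```

With a real database, findNext would become a sorted query limited to one row, and add and remove would write to the database before touching the timer.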
Sometimes the cron jobs run and sometimes they get missed. I have attached all settings and results. Could anyone check and reply?
This is completely normal behaviour. Some jobs are skipped because they miss the time window scheduled for that cron job. In your case the reindex process is scheduled every minute. If there is a lot to index (many changes to products, categories, etc.), one minute is not enough to complete it. Also, only one process runs per cron group, in your case index. Use Separate Process in the cron configuration means that the index group runs as its own process, separate from the other cron groups.
I have a website that is live. I have a cron job that executes every 24 hours. The cron job fetches and analyzes data from a database table.
The problem is that the website gets very slow while the cron job is running and returns to normal afterwards. During this time it gives me the error "Too many connections".
I set the maximum allowed connections to 500 in MySQL. When I checked, the number of active connections in MySQL was below that limit during that time.
I am unable to find any relevant help or even a clue to think in a particular direction.
Update:
I noticed one thing: the number of MySQL connections increases continuously during this time, although it is still below the maximum limit.
The nice command can change the priority of a process. You want to lower the priority of the background process so that it yields the CPU while the website is busy. E.g.
0 3 * * * nice -n 20 myjob arg arg
to execute myjob arg arg with lowered priority every day at 3am.
EDIT: However, if the job spends most of its time in database queries, this will not help much. MySQL has a LOW_PRIORITY modifier for INSERT and UPDATE statements that does much the same thing for those queries, although it only affects storage engines that use table-level locking, such as MyISAM.
I am currently building a service using beanstalkd and Node.js.
When a job fails, I would like to retry it n times before giving up on it.
If the job succeeds, I want to run the same job 10 times.
So what is the best practice: store the error and success counts in MongoDB keyed by the jobId, or delete the job and put a new one with the error and success counts in the body?
I am not sure if that is clear, so tell me if not. Thanks a lot.
There is a stats-job <id>\r\n command, which should also be available through the API library, that returns, among other things, how many times the specific job has been reserved, released, buried, and so on.
This allows for a number of retries of failed jobs by checking previous reservation/releases.
To run the same job multiple times, I would personally either create one additional job carrying a success count, which is incremented each time it is put back as another new job, or create all nine remaining jobs up front, with optional delays before they start.
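As a rough sketch of the stats-based approach: the code below assumes the stats-job body is the flat "key: value" YAML that beanstalkd returns and that it includes a reserves counter. Both helper names (parseStats, actionAfterFailure) are hypothetical, and a client library may already hand you a parsed object.

```javascript
// Parse the flat "key: value" body of a stats-job response into an object.
function parseStats(body) {
  const stats = {};
  for (const line of body.split('\n')) {
    const match = /^([\w-]+):\s*(.*)$/.exec(line.trim());
    if (!match) continue; // skips the leading "---" and blank lines
    const [, key, value] = match;
    stats[key] = /^\d+$/.test(value) ? Number(value) : value;
  }
  return stats;
}

// Hypothetical policy: release the job for another attempt until it has been
// reserved maxAttempts times (the failed attempt included), then bury it.
function actionAfterFailure(stats, maxAttempts) {
  return stats.reserves < maxAttempts ? 'release' : 'bury';
}

const sample = '---\nid: 42\ntube: default\nstate: reserved\nreserves: 3\nreleases: 2\n';
const stats = parseStats(sample);
console.log(actionAfterFailure(stats, 5)); // release
console.log(actionAfterFailure(stats, 3)); // bury
```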
You have a couple of ways to do this:
You can release the job and obtain the number of reserves from stats.
You can put a new job with a retry count and keep track of the history in the data payload.
You should do the latter, and you don't need MongoDB as a second dependency.
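A minimal sketch of the payload approach follows. The payload shape { task, errors, successes }, the function name nextPayload, and the default limits are assumptions for illustration; here the error count is cumulative and is not reset after a success, which you may want to change.

```javascript
// Given the payload of the job that just ran and whether it succeeded,
// return the payload for the replacement job, or null when no more runs are needed.
function nextPayload(payload, succeeded, { maxErrors = 3, targetSuccesses = 10 } = {}) {
  const next = {
    ...payload,
    errors: (payload.errors || 0) + (succeeded ? 0 : 1),
    successes: (payload.successes || 0) + (succeeded ? 1 : 0),
  };
  if (next.errors >= maxErrors) return null;          // failed too often, give up
  if (next.successes >= targetSuccesses) return null; // ran successfully often enough
  return next;
}

const afterFailure = nextPayload({ task: 'send-email' }, false);
console.log(afterFailure); // { task: 'send-email', errors: 1, successes: 0 }
console.log(nextPayload({ task: 'x', errors: 2, successes: 0 }, false)); // null
console.log(nextPayload({ task: 'x', errors: 0, successes: 9 }, true));  // null
```

The worker would then delete the job it just processed and, if the result is not null, put a new job whose body is the serialized payload.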
I tried adding a job to SGE with qsub, but it seems to be stuck. The state is shown as 'dt'. What could be wrong? I cannot run any more jobs because of this. How can I remove the job from the queue?
That looks like a Grid Engine status. If so, the 'd' means the job is in the process of being deleted, while the 't' means the job is being transferred to the node where it is supposed to run. This combination usually occurs only if the node crashed while the job was being transferred to it.
You should be able to delete it with qdel -f JOBID if you are the administrator or the cluster is appropriately configured. If not, ask your sysadmin/support to do it for you.