preferred way to schedule job #Scheduled vs crontab

preferred way to schedule job #Scheduled vs crontab - cron

I have to run one utility periodically for instance say, every minute.
So, I have two option #Scheduled spring boot vs crontab of linux box that we are using to deploy the artifact.
SO, my question is which way should I use?
what are the pros and cons for each solution , any other solution if you can suggest.

Just for comparing between these two, I don't have much points, but only based on this situation which I faced now. I just built a new end point and am doing performance testing and stress testing for the same on production. I am yet to decide the cron schedule times, and those may need a slight tweaking over some more time of observation. Setting via #Scheduled needs me to deploy/restart application every time I make a change.
Application restart generally takes more time than crontab edit.
Other than this, a few points considering the aspects of availability and scalability -
Setting only via crontab on a single server would mean a single point of failure, if the server goes down.
Setting via #Scheduled also could mean the same.
If you have multiple instances of the server, this could mean endpoint getting triggered twice and you may not want to have the same. Worst case, is if the scaling up happens after a long time, and you wrote the #Scheduled endpoint long back, while it was only deployed on a single server and then you forgot. As soon as scaling up happens, the process will start getting hit twice.
So, none of these seem to be the best in terms of points of availability and scalability.
In such situations, ideally a distributed cron management system (I have heard about Rundeck) is needed, which manages which, out of the available servers is to be called to hit the desired end point and if needed to call the next server in case the first one is down.
In case of any need for investigation. logs of rundeck could be checked to find the server which was actually called.

Related

Looking for time based persistent scheduler - node js

I have been looking for a time based persistent scheduler. I looked into some applications (Agenda, node-cron, node-schedule). But I couldn't find anything that satisfies my criteria.
So my applications sends out reminders to our customers based on their event timings. I am hesitating to run a regular cronjob because I have to run every 15 mins or so in this case. And for each cronjob, I have to make a database call. I am trying not to use resources unnecessarily.
In addition to that, I am already running a lot of cronjobs. But in my case, when the job is completed, I want the cron to get cancelled/finished; not live on memory until the server restart happens.
I tried using the above specified applications by setting exact timestamps (agenda, node-cron, node-schedule). But the cron lives on forever even after the job is completed, and if i restart the server, all the scheduled jobs are cron. So persistence is also an issue I am facing.
My server uses node js. If there are any other languages/tools to make this work, I am all ears.
Looking forward to your help.
I tried following this solution. But this solution is for one predefined event. In my case, the number of reminders to be sent out are dynamic and jobs are to be scheduled on the fly.

Monitor cron jobs in real-time

I have an issue currently where I've got a cron job set to run at midnight each day to reset daily API requests for a service that I run. The job failed recently which caused me a whole bunch of headaches and I've been trying to find a solution to monitor all of my cron jobs so I don't have a situation like this happen again.
I haven't been able to find a sufficient solution however, and in response I am considering creating a platform that allows you to monitor cron jobs, see logs (and past logs), last run date, failure/success of the last run, etc... in real-time and would notify you if your job hasn't completed within a specified window of time or the job failed.
I believe this might be a pain point and a good solution for others as well.
What are you thoughts? Do you think that this would be useful, have any suggestions, or just think this would be a waste of time?

Did you hear about Rundeck? (https://www.rundeck.com/open-source)
It looks like it's exactly what you're looking for.
You install it on a server, and it's like a Web UI for a crontab.
You define jobs you want to run using the Web UI, how often you want them to run and you can see some history of the past executions, their status and their output. You can also see when the next execution will happen.
I think there are also some alerting features to notify you if a job is on failure. I'm not sure if it can notify you based on the job execution time though.
This might be a good fit for what you're looking for.

2 years later, I am asking myself exactly the same questions ) Definitely you should have created such service already, haven't you? Every backend coder needs this time from time, in theory. I'm surprised this question hasn't received enough activity/voting. I got an answer leading to this though: https://uptimerobot.com/cron-job-monitoring/ that might be a good solution. Need to test it out. It does not seem to be promoted enough, as it's not easy to find. Also there is https://cronitor.io/docs/cron-job-monitoring that has ability to transmit (somewhat limited) telemetry data, +a lot of SDKs to be used from within programming languages.

Simultaneous Lotus notes server-side agents

In my Lotus Notes workflow application, I have a scheduled server agent (every five minutes). When user's act on a document, a server-side agent is also triggered (this agent modifies the said document, server-side). In the production, we are receiving many complaints that the processing are incomplete or sometimes not being processed at all. I checked the server configuration and found out that only 4 agents can run concurrently. Being a global application with over 50,000 users, the only thing that I can blame with these issues are the volume of agent run, but I'm not sure if I'm correct (I'm a developer and lacks knowledge about these stuffs). Can someone help me find if my reasoning is correct (on simulteneous agents) and help me understand how I can solve this? Can you provide me references please. Thank you in advance!

Important thing to remember.
Scheduled agents on the server will only run one agent from the same database at any given time!
So if you have Database A with agent X (5 mins) and Y (10 mins). It will first run X. Once X completes which ever is scheduled next (X or Y) will run next. It will never let you run X + Y at the same time if they are in the same database.
This is intended behaviour to stop possible deadlocks within the database agents.
Also you have an schedule queue which will have a limit to the number of agents that can be scheduled up. For example if you have Agent X every 5 minutes, but it takes 10 minutes to complete, your schedule queue will slowly fill up and then run out of space.
So how to work around this? There is a couple of ways.
Option 1: Use Program Documents on the server.
Set the agent to scheduled "Never" and have a program document execute the agent with the command.
tell amgr run "dir/database.nsf" 'agentName'
PRO:
You will be able to run agents in <5 minute schedule.
You can run multiple agents in the same database.
CON:
You have to be aware of what the agent is interacting with, and code for it to handle other agents or itself running at the same time.
There can be serious performance implications in doing this. You need to be aware of what is going on in the server and how it would impact it.
If you have lots of databases, you have a messy program document list and hard to maintain.
Agents via "Tell AMGR" will not be terminated if they exceed the agent execution time allowed on the server. They have to be manually killed.
There is easy way to determine what agents are running/ran.
Option 2: Create an agent which calls out to web agents.
PRO:
You will be able to run agents in <5 minute schedule.
You can run multiple agents in the same database.
You have slightly better control of what runs via another agent.
CON:
You need HTTP running on the server.
There are performance implications in doing this and again you need to be aware of how it will interact with the system if multiple instances run or other agents.
Agents will not be terminated if they exceed the agent execution time allowed on the server.
You will need to allow concurrent web agents/web services on the server or you can potentially hang the server.
Option 3: Change from scheduled to another trigger.
For example "When new mail arrives". Overall this is the better option of the three.
...
In closing I would say that you should rarely use the "Execute every 5 mins" if you can, unless it is a critical agent that isn't going to be executed by multiple users across different databases.

Crons for Clusters

Just a quick question that has been bothering me today. I own five servers, all have the exact same image and run behind a load balancer. I want to run a process heavy cron on these servers every half an hour.
I don't want to put the cron on each machine, as it is resource heavy and would block all incoming connections for a good thirty seconds. In addition, I don't really want to put the cron on one machine, just to make sure it is redundant and it will be run.
My possible solutions to this would be to have a remote service that would run the cron, just by way of accessing a URL that would trigger it; I think that would be the most feasible at this point.
I'm really curious as to what other solutions might be available.
Thanks for your time!

You could set up staggered cron jobs on your 5 machines, so it runs every 2.5 hours on each of your 5 machines. Probably the cleanest way to do that is to schedule a job to run every 30 minutes, and have the job itself be a script that runs conditionally, depending on the current time and which machine it's on.
Or, if you have some kind of batch scheduling system, you could run a cron job on one system that submits a batch job, letting the scheduling system choose which server to use. This has the advantage that, assuming your batch system works properly, the job should still run if one of your servers is down. You'll likely need to set up some environment variables in your cron job to let it use the batch system properly.

How do you instruct a SharePoint Farm to run a Timer Job on a specific server?

We have an SP timer job that was running fine for quite a while. Recently the admins enlisted another server into the farm, and consequently SharePoint decided to start running this timer job on this other server. The problem is the server does not have all the dependencies installed (i.e., Oracle) on it and so the job is failing. I'm just looking for the path of least resistance here. My question is there a way to force a timer job to run on the server you want it to?
[Edit]
If I can do it through code that works for me. I just need to know what the API is to do this if one does exist.

I apologize if I'm pushing for the obvious; I just haven't seen anyone drill down on it yet.
Constraining a custom timer job (that is, your own timer job class that derives from SPJobDefinition) is done by controlling constructor parameters.
Timer jobs typically run on the server where they are submitted (as indicated by vinny) assuming no target server is specified during the creation of the timer job. The two overloaded constructors for the SPJobDefinition type, though, accept an SPServer and an SPJobLockType as the third and fourth parameters, respectively. Using these two parameters properly will allow you to dictate where your job runs.
By specifying your target server as the SPServer and an SPJobLockType of "Job," you can constrain the timer job instance you create to run on the server of your choice.
For documentation on what I've described, see MSDN: http://msdn.microsoft.com/en-us/library/microsoft.sharepoint.administration.spjobdefinition.spjobdefinition.aspx.
I don't know anything about the code you're running, but custom timer jobs are commonly setup during Feature activation. I got the sense that your codebase might not be your own (?); if so, you might want to look for the one or more types/classes that derive from SPFeatureReceiver. In the FeatureActivated method of such classes is where you might find the code that actually carries out the timer job instantiation.
Of course, you'll also want to look at the custom timer job class (or classes) themselves to see how they're being instantiated. Sometimes developers will build the instantiation of the class into the class itself (via Factory Method pattern, for example). Between the timer job class and SPFeatureReceiver implementations, though, you should be on the way towards finding what needs to change.
I hope that helps!

Servers in a farm need to be identical.
If you happen to use VMs for your web front ends, you can snap a server and provision copies so that you know they are all identical.

Timer jobs per definition run on all web front ends.
If you need scheduled logic to run on a specific server, you either need to specifically code this in the timer job, or to use a "standard" NT Service instead.

I think a side effect of setting SPJobLockType to 'Job' is that it'll execute on the server where the job is submitted.

You could implement a Web Service with the business logig and deploy that Web Service to one machine. Then your Timer Job could trigger your web service periodically.
The it sould be not that important wher your timer job is running. SharePoint decides itself where to run the timer job.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string