Cron Job Microservices


I am using spring cloud and have various microservices for an online shopping vendor. Everything is working as expected.
However, I have a new requirement: I need to run a cron job over the customer records, find the customers whose statement date matches the current date, and calculate the rate of interest to be paid. This job needs to run every day.
I am confused about how to fit this cron job into a microservices architecture. Do I need another server that runs only this cron job?

Depending on the platform (e.g., Cloud Foundry or Kubernetes) on which you orchestrate the batch jobs in Spring Cloud Data Flow (SCDF), you could write a simple Quartz-based Boot application that calls SCDF's REST endpoints to schedule the Task definitions defined in SCDF.
There are several write-ups online on the Quartz + Boot approach.
We are also working on a native scheduler integration for Cloud Foundry (via PCF Scheduler). Once it is ready, you will be able to schedule Tasks (i.e., with cron expressions) natively from SCDF's Dashboard.

As I understand it, you should have one centralized job supervisor, because multiple instances of a service could otherwise run the same job at the same time.
This supervisor can be a microservice that delegates job execution to other services via a REST call or a message queue and waits for the result.
This means the job supervisor becomes part of the infrastructure, like the message queue or the database.
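
To make the supervisor idea concrete, here is a minimal sketch in Python. It is not Spring code: an in-process `queue.Queue` stands in for the real message queue, and the names `worker`, `supervise` and the message fields are hypothetical. The point it illustrates is that the supervisor publishes exactly one message per job run, so only one of several worker instances picks it up.

```python
import queue
import threading
from datetime import date


def worker(jobs: queue.Queue, results: queue.Queue) -> None:
    """Stand-in for one microservice instance consuming job messages."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: the supervisor is shutting the worker down
            break
        # Hypothetical job body; a real service would calculate interest here.
        results.put((job["name"], job["run_date"], "done"))


def supervise(job_name: str, run_date: date, worker_count: int = 3) -> list:
    """Publish ONE message for the job run and wait for its result."""
    jobs: queue.Queue = queue.Queue()
    results: queue.Queue = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(jobs, results))
        for _ in range(worker_count)
    ]
    for t in threads:
        t.start()

    # One message per run: whichever worker takes it is the only one to run it.
    jobs.put({"name": job_name, "run_date": run_date.isoformat()})
    outcomes = [results.get(timeout=5)]  # wait for the result

    for _ in threads:
        jobs.put(None)
    for t in threads:
        t.join()

    # Collect anything unexpected, e.g. a duplicate execution.
    while not results.empty():
        outcomes.append(results.get())
    return outcomes


if __name__ == "__main__":
    print(supervise("interest-calculation", date.today()))
```

With three workers running, the returned list still contains a single result, which is the behaviour you want from a daily job.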

Related

AWS and NodeJS architecture for a scheduled/cron task in multi server setup

I am using AWS services to deploy my application. The production site currently sits behind an application load balancer and runs 2 instances of my NodeJS server.
My concern is that if I just set up node-cron to trigger a task at 5:00am, it will fire on every server I spin up.
I need to implement an email delivery system that, at 5:00am, queries a database table I made to generate customized emails (it needs to iterate over each individual's record, which has a unique array that helps build a list of items for each user). I then fire the object off to AWS SES.
What are some ways you have done this?
Based on my reading so far, I am looking at two options:
Set up a node-cron child process within one cluster (but if I have auto-scaling, wouldn't this create a duplicate node-cron task?). This would probably require Redis and tracking the process across servers.
OR
Set up an EventBridge rule which calls api.mybackendserver.com/send-email-event, where I then carry out my logic. (This seems like the simpler approach. The drawback would be potential CPU/RAM spikes, which would be fine as I'm regionally based and would do this in off-peak hours.)

EventBridge is definitely a good way to run a cron schedule. If you're worried about usage spikes, you could have the schedule invoke a Lambda function that pushes one event per job to SQS. Those events would then be polled by the EC2 instances.
Another option would be to schedule a task that increases the number of instances before the cron event occurs.
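
As a rough sketch of the Lambda-to-SQS fan-out described above (in Python rather than Node, and without calling AWS): the function below groups one message per user into batches of at most 10, which is the entry limit of SQS `SendMessageBatch`. The event shape (`user_ids`) and the injected `send_batch` callable are assumptions for illustration; a real function would load the users from the database and pass something like `lambda b: sqs.send_message_batch(QueueUrl=..., Entries=b)` using a boto3 SQS client.

```python
import json
from typing import Callable, Iterable, Iterator, Optional


def build_batches(user_ids: Iterable[int], batch_size: int = 10) -> Iterator[list]:
    """Yield lists of SQS batch entries, one entry per user, at most batch_size each."""
    batch = []
    for user_id in user_ids:
        batch.append({
            "Id": str(user_id),  # must be unique within a batch
            "MessageBody": json.dumps({"user_id": user_id}),
        })
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def handler(event: dict, context: object,
            send_batch: Optional[Callable[[list], None]] = None) -> dict:
    """Lambda-style entry point; send_batch is injected so the sketch runs offline."""
    # Assumption: the user ids arrive in the event; in practice you would query them.
    user_ids = event.get("user_ids", [])
    queued = 0
    for batch in build_batches(user_ids):
        if send_batch is not None:
            send_batch(batch)
        queued += len(batch)
    return {"queued": queued}
```

Because each user becomes a separate message, the EC2 instances can work through the queue at their own pace instead of one server building every email at 5:00am.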

Looking for time based persistent scheduler - node js

I have been looking for a time based persistent scheduler. I looked into some applications (Agenda, node-cron, node-schedule). But I couldn't find anything that satisfies my criteria.
My application sends out reminders to our customers based on their event timings. I am hesitant to run a regular cronjob because it would have to run every 15 minutes or so, and each run would need a database call. I am trying not to use resources unnecessarily.
In addition, I am already running a lot of cronjobs. In this case, when a job is completed, I want it to be cancelled/finished, not to stay in memory until the server restarts.
I tried the applications mentioned above (agenda, node-cron, node-schedule) by setting exact timestamps. But the cron lives on forever even after the job is completed, and if I restart the server, all the scheduled jobs are lost. So persistence is also an issue I am facing.
My server uses node js. If there are any other languages/tools to make this work, I am all ears.
Looking forward to your help.
I tried following this solution. But this solution is for one predefined event. In my case, the number of reminders to be sent out are dynamic and jobs are to be scheduled on the fly.
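
One way to read the requirements above (jobs created on the fly, survive a restart, disappear once done) is to keep the reminders in a table and let a single timer wake up only when the next one is due. The sketch below is in Python with SQLite purely for illustration; the table layout and the function names are hypothetical, and the same idea carries over to Node with any database.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the reminder store; use a file path so jobs survive a restart."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reminder ("
        " id INTEGER PRIMARY KEY,"
        " customer TEXT NOT NULL,"
        " due_at REAL NOT NULL,"  # Unix timestamp
        " sent INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def schedule(conn: sqlite3.Connection, customer: str, due_at: float) -> int:
    """Add a reminder on the fly; returns its row id."""
    cur = conn.execute(
        "INSERT INTO reminder (customer, due_at) VALUES (?, ?)", (customer, due_at)
    )
    conn.commit()
    return cur.lastrowid


def run_due(conn: sqlite3.Connection, now: float, send) -> int:
    """Send every unsent reminder that is due and mark it as finished."""
    rows = conn.execute(
        "SELECT id, customer FROM reminder WHERE sent = 0 AND due_at <= ? ORDER BY due_at",
        (now,),
    ).fetchall()
    for reminder_id, customer in rows:
        send(customer)
        conn.execute("UPDATE reminder SET sent = 1 WHERE id = ?", (reminder_id,))
    conn.commit()
    return len(rows)


def next_due(conn: sqlite3.Connection):
    """Timestamp of the earliest unsent reminder, or None if nothing is pending."""
    return conn.execute("SELECT MIN(due_at) FROM reminder WHERE sent = 0").fetchone()[0]
```

A small loop can then sleep until `next_due(...)` and call `run_due(...)`, so there is no fixed 15-minute polling and nothing lingers in memory after a reminder has been sent.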

Advice needed - Running Python code on Google Cloud Platform serverless

I have Python code which reads data from one cloud system via a REST API using the requests module and then writes the data to another cloud system via a REST API. This code runs for anywhere from 1 to 4 hours every week. Is there a place in Google Cloud Platform where I can execute this code on a periodic basis, sort of like a scheduled batch job? Is there a serverless option to do this in App Engine? I know about the App Engine cron service, but it seems to be only for calling a URL regularly. Any thoughts? I appreciate your help.

Google Cloud Scheduler could be the tool you are looking for. As it is mentioned in its documentation:
Cloud Scheduler is a fully managed enterprise-grade cron job scheduler. It allows you to schedule virtually any job, including batch, big data jobs, cloud infrastructure operations, and more. You can automate everything, including retries in case of failure to reduce manual toil and intervention.
Here you have the quickstart for Cloud Scheduler, and also another tutorial for Cron jobs.

You can use the Google Genomics API pipelines.run endpoint to run a long-running job on a Google Compute Engine virtual machine and then it will destroy the machine when it's done. If your job will run for less than 24 hours and it can handle a failure, then you can use a Preemptible VM to save cost.
Pipelines: Run
https://cloud.google.com/genomics/reference/rest/v2alpha1/pipelines/run
Preemptible Virtual Machines
https://cloud.google.com/preemptible-vms/
You could use Cloud Scheduler to kick off the job.
Pipelines may be preferable to one of the serverless technologies, because the serverless options don't tend to handle long-running jobs as well.

You can use AI Platform Training to run any arbitrary Python package — it doesn’t have to be a machine learning job.

Tasks that need to be performed on a certain date in Azure

I am developing an application using Azure Cloud Service and Web API. I would like to allow users who create a consultation session to change the price of that session. However, I would like to give all users 30 days to leave the session before the new price applies to all members currently signed up for it. My first thought is to use queue storage and set the visibility timeout to the 30-day limit, but it seems the queue could grow really fast over time, especially since each message should not run for 30 days, not to mention the ordering issues. I am looking at the task scheduler as well, but session pricing changes are not recurring; they happen at random. Is the queue idea a good approach, or is there a better and more efficient way to accomplish this?

What you are trying to do should be done with a relational database. You can use timestamps to record when the price of a session changed. I wouldn't use a queue at all for this. A queue is more for passing messages in a distributed system. Your problem is just about tracking which prices changed on which sessions and when. That data should be modeled in a database.
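
A minimal sketch of the timestamp approach, in Python with SQLite standing in for the relational database. The table and column names are made up for illustration. Each price change is stored with the time it takes effect (for existing members, 30 days after the change is made), and the current price is simply the latest row whose effective time has passed, so nothing has to fire on day 30.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE session_price ("
    " session_id INTEGER NOT NULL,"
    " price_cents INTEGER NOT NULL,"
    " effective_at TEXT NOT NULL)"  # ISO-8601 text sorts chronologically
)


def schedule_price(session_id: int, price_cents: int, effective_at: datetime) -> None:
    """Record a price together with the moment it starts to apply."""
    conn.execute(
        "INSERT INTO session_price VALUES (?, ?, ?)",
        (session_id, price_cents, effective_at.isoformat(timespec="seconds")),
    )


def price_at(session_id: int, when: datetime):
    """Return the price in force at `when`, or None if no price exists yet."""
    row = conn.execute(
        "SELECT price_cents FROM session_price"
        " WHERE session_id = ? AND effective_at <= ?"
        " ORDER BY effective_at DESC LIMIT 1",
        (session_id, when.isoformat(timespec="seconds")),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    changed_on = datetime(2024, 1, 1)
    schedule_price(1, 5000, changed_on)                        # original price
    schedule_price(1, 6000, changed_on + timedelta(days=30))   # new price after the notice period
    print(price_at(1, changed_on + timedelta(days=29)))
    print(price_at(1, changed_on + timedelta(days=30)))
```

Billing code then calls `price_at(session_id, now)` whenever it needs a price, and the 30-day grace period falls out of the query.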

I think Azure Scheduler is a better fit for this scenario. Programmatically create a Job with a one-time recurrence, set to run once 30 days later. When the scheduler triggers the job, have its action call back to one of your APIs/services to do the price and other required updates, and also remove the Job from the scheduler as part of this action to keep the jobs list clean. In any case, the premium plan of an Azure Scheduler Job Collection gives you an unlimited number of jobs to run.
Hope this is exactly what you were looking for...

I would consider using Azure WebJobs. A WebJob basically gives you the ability to run a .NET console application within the context of an Azure Web App. It can be run on demand, continuously, or on a recurring schedule. If your processing requirements are low and allow for it, WebJobs can also run in the same process as your Web App to save you money, as they are free that way.
You could schedule the WebJob to run once or twice per day and examine the situation and react as is appropriate. Since it's really just a .NET worker role you have ultimate flexibility.

Lotus Notes Agent

Where can I find a great online reference on Lotus Notes agents? I am currently having problems with having simultaneous agents and with understanding agents: how they work, best practices, etc. Thanks in advance!

I am currently having problems with having simultaneous agents
Based on this comment I take it you are running a scheduled agent?
The way scheduled agents work is that only one agent from a particular database can run at a time, even if you have multiple Agent Manager (AMGR) threads. Also, agents cannot be scheduled to run more often than every 5 minutes. The UI will let you put in a lower number, but it will change it.
Another factor to take into account is how long your agent runs. If it runs for longer than the interval you set up, you will end up backlogging the runs. The server can also be configured to kill agents that run over a certain time, so you need to make sure the agent finishes within that timeframe.
To bypass all this, you can execute an agent from the Domino console as follows.
tell amgr run "database.nsf" 'agentName'
This will run in its own thread outside of the scheduler. Because of this, you can create a program document that executes an agent at intervals of less than 5 minutes, and run multiple agents within the same database.
Doing this is dangerous, however, as you have to be aware of a number of issues.
As the agent is outside the control of the scheduler, you can't kill it as you would an agent in the scheduler.
Running multiple threads can tie up more processes. So while the scheduler will backlog everything if the agent runs longer than the schedule, running it from a program document in this situation will crash the server.
You need to be aware of what the agent is doing in the database so that it won't interfere with any other agents in the same database and can cope if it is run twice in parallel.
For more reading material on this:
Improving Agent Manager Performance.
http://publib.boulder.ibm.com/infocenter/domhelp/v8r0/topic/com.ibm.help.domino.admin.doc/DOC/H_AGENT_MANAGER_NOTES_INI_VARIABLES.html
Agent Manager trouble shooting.
http://publib.boulder.ibm.com/infocenter/domhelp/v8r0/topic/com.ibm.help.domino.admin.doc/DOC/H_ABOUT_TROUBLESHOOTING_AGENTS.html
Troubleshooting Agents (Old material but still relevant)
http://www.ibm.com/developerworks/lotus/library/ls-Troubleshooting_agents/index.html
... and related tech notes:
Title: How to run two agents concurrently in the same database using a wrapper agent
http://www.ibm.com/support/docview.wss?uid=swg21279847
Title: How to run multiple agents in the same database using a Program document
http://www.ibm.com/support/docview.wss?uid=swg21279832
