nifi processor cron scheduling - cron

I am trying to set up a cron schedule in NiFi that runs every minute from 9 AM to 4 PM, Sunday to Thursday.
I am using the following expression, but the NiFi flow does not trigger:
0 0/1 9,10,11,12,13,14,15,16 ? * SUN,MON,TUE,WED,THU *
I am using the containerized image of NiFi available from Docker Hub: https://hub.docker.com/r/apache/nifi/builds
I validated the expression with CronMaker and it appears to be valid. Please help.
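For checking the schedule against a clock, the intent of the expression can be written out as a small predicate. This is only a sketch in Python, not NiFi or Quartz code, and the name `in_window` is made up for illustration. It assumes the expression is evaluated in whatever time zone the NiFi process uses, which inside a container may differ from the local time zone, so that may be worth checking.

```python
from datetime import datetime

# Python's weekday(): Monday=0 ... Sunday=6, so Sunday to Thursday is {6, 0, 1, 2, 3}.
ALLOWED_WEEKDAYS = {6, 0, 1, 2, 3}
ALLOWED_HOURS = range(9, 17)  # hours 9..16, as listed in the expression


def in_window(ts: datetime) -> bool:
    """Return True if a once-a-minute trigger is expected to fire at ts.

    Mirrors the intent of: 0 0/1 9,10,...,16 ? * SUN,MON,TUE,WED,THU *
    (second 0 of every minute, hours 9 to 16, Sunday to Thursday).
    """
    return (
        ts.second == 0
        and ts.hour in ALLOWED_HOURS
        and ts.weekday() in ALLOWED_WEEKDAYS
    )


if __name__ == "__main__":
    print(in_window(datetime(2024, 1, 7, 9, 0)))    # a Sunday at 09:00
    print(in_window(datetime(2024, 1, 5, 9, 0)))    # a Friday at 09:00
    print(in_window(datetime(2024, 1, 8, 16, 59)))  # a Monday at 16:59
```

Note that listing hour 16 keeps the trigger firing until 16:59, not just until 16:00.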

Related

How to make job wait for cluster to become available

I have a workflow in Databricks called "score-customer", which I can run with a parameter called "--start_date". I want to make a job for each date this month, so I manually create 30 runs - passing a different date parameter for each run. However, after 5 concurrent runs, the rest of the runs fail with:
Unexpected failure while waiting for the cluster (1128-195616-z656sbvv) to be ready.
I want my jobs to wait for the cluster to become available instead of failing. How can this be achieved?

Nested Cron Job in AWS Lambda Function

Requirement: send a reminder to n users, each at their own time, e.g. user 1 at 9:10 AM, user 2 at 10:50 PM, user 3 at 4:20 AM, and so on.
Solution in Node.js
I have a Node.js cron job which runs at minute 55 of every hour (i.e. 9:55, 10:55, 11:55). It first deletes all the child cron jobs, then fetches data from the database and checks the reminder settings for each user. Based on those settings, it creates a child cron job per user to send the reminders.
Solution in AWS Lambda
I created a Lambda function and scheduled it the same way. Inside the Lambda I do the same thing as in Node.js, but because the Lambda's execution has already finished, the child cron jobs never get executed.
I thought about Step Functions, but I am not sure how to achieve this since the schedule is dynamic. Someone also suggested triggering SNS, but that will not work in my scenario either.
Can someone please help me achieve this with AWS Lambda?
Why not have 1 cron job that runs every minute that sends all reminders that need to be sent based on the database information? I don't really see why you need nested cron jobs?
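A minimal sketch of that single-job idea, in Python for brevity: on each one-minute tick, pick the reminders whose time of day matches the current minute. The `Reminder` class, the `due_reminders` function and the sample rows are all hypothetical stand-ins for whatever the database actually returns.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass
class Reminder:
    user_id: int
    send_at: time  # time of day at which the user wants the reminder


def due_reminders(reminders, now):
    """Return the reminders whose hour and minute match the current minute."""
    return [
        r for r in reminders
        if (r.send_at.hour, r.send_at.minute) == (now.hour, now.minute)
    ]


if __name__ == "__main__":
    # Stand-in for rows fetched from the database (hypothetical data).
    rows = [
        Reminder(1, time(9, 10)),
        Reminder(2, time(22, 50)),
        Reminder(3, time(4, 20)),
    ]
    for r in due_reminders(rows, datetime(2024, 1, 8, 9, 10, 30)):
        print(f"send reminder to user {r.user_id}")
```

In a real setup the filtering would more likely be done in the database query itself, so only the due rows are fetched.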
In any case, you could also use DynamoDB's time to live attribute and a stream that triggers a Lambda function. Create a record to send a reminder at X every Y, with X being the expiration time. The Lambda function triggers, and when done you create a new DDB record with as expiration time X+Y. You might not even need a cron job Lambda in this case.
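The X and X+Y bookkeeping could look roughly like the sketch below. It only builds the records as plain dictionaries and makes no AWS calls; the attribute names (`expires_at`, `interval_seconds`) and both function names are assumptions for illustration. DynamoDB's TTL attribute is a Unix epoch timestamp in seconds, which is why plain integers are used. As far as I know, TTL deletion is not guaranteed to happen at the exact expiration second, so the timing of this approach should be treated as approximate.

```python
def make_reminder_item(user_id, first_send_epoch, interval_seconds):
    """Build the first record: it expires at X and repeats every Y seconds."""
    return {
        "user_id": user_id,
        "interval_seconds": interval_seconds,  # Y
        "expires_at": first_send_epoch,        # X, used as the TTL attribute
    }


def next_reminder_item(expired_item):
    """Build the follow-up record: same reminder, expiration moved to X + Y."""
    item = dict(expired_item)  # copy, so the expired record is left untouched
    item["expires_at"] = expired_item["expires_at"] + expired_item["interval_seconds"]
    return item


if __name__ == "__main__":
    first = make_reminder_item(user_id=1, first_send_epoch=1_700_000_000, interval_seconds=86_400)
    print(next_reminder_item(first))
```

The Lambda triggered by the stream would send the reminder for the expired record and then write the result of `next_reminder_item` back to the table.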
I suppose you could use aws-sdk to create CloudWatch rules dynamically in your Node.js cron job.
For a cleaner separation, create two separate functions:
Main cron job (deletes the child cron rules in CloudWatch, retrieves data from the database, creates CloudWatch rules that invoke the child cron job at specific times)
Child cron job (only sends reminders)
To read more: Nodejs Create CloudWatch
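For the main cron job, the part that turns a user's reminder time into a rule schedule can be sketched as follows (Python here, though the same string formatting applies in Node.js). CloudWatch schedule expressions use six fields, `cron(minutes hours day-of-month month day-of-week year)`, and are evaluated in UTC. The helper names and the rule-naming scheme are hypothetical; the resulting string is what would be passed as `ScheduleExpression` when creating the rule (`putRule` in aws-sdk).

```python
from datetime import time


def schedule_expression(send_at: time) -> str:
    """Return a daily CloudWatch schedule expression for a UTC time of day.

    Field order: minutes hours day-of-month month day-of-week year.
    """
    return f"cron({send_at.minute} {send_at.hour} * * ? *)"


def rule_name(user_id: int) -> str:
    """Hypothetical naming scheme, so the main job can find and delete old rules."""
    return f"reminder-user-{user_id}"


if __name__ == "__main__":
    print(rule_name(1), schedule_expression(time(9, 10)))
    print(rule_name(2), schedule_expression(time(22, 50)))
```

Times stored in the database in local time would need converting to UTC before building the expression.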

App Engine Flexible cron is terminated after 120 seconds

My App Engine Flexible cron job sometimes takes more than 120 seconds. Whenever it exceeds 120 seconds, App Engine throws a 502 error. It doesn't terminate my Node.js task; it only terminates the HTTP request started by the App Engine cron job.
I also see a value of 240 seconds, and I don't understand where it comes from. My guess is that it is a retry request. It would be helpful if anyone could shed light on this as well.
According to the App Engine documentation, a cron job can run for an hour. Is this also true for HTTP requests started by a cron job?
To be clear, I want my cron job to run for more than 120 seconds and the HTTP request to stay active for up to 1 hour.
Even though you have switched to Kubernetes Engine, I would like to take the chance and clarify the purpose of cron jobs here.
As it is stated in the official documentation, cron jobs are used to perform jobs at regular time intervals. They involve invoking a URL through an HTTP request and run for up to 60 minutes while respecting the request's own limitations.
Some good uses for cron jobs: sending report emails on a daily basis, updating cached data at regular time intervals, or updating summary information every hour. When a task involves obtaining external information, especially when the number of operations is large enough that it may exceed the time an HTTP connection remains open, or when different types of data are coming from the external application, I would not consider it a good use of cron jobs.
If you are using Kubernetes now, and consider it to be more useful for the tasks you need to perform, go ahead and continue with it.

Azure Scheduled Web job is triggering twice sometimes

We have an Azure WebJob which is scheduled to run at 8:00 AM UTC daily (CRON - 0 00 08 * * *). On most days it triggers correctly, but on some days it triggers twice (the second run occurs ~10 seconds after the first run). In the WebJob history I can see that when it triggered twice, the first run's trigger property (from the WebJob History JSON) shows as "External - " and the second run's trigger property shows as "Schedule - 0 0 8 * * *", but we don't have any external services triggering this WebJob.
When I checked the job scheduler log for more details, the "Web Job Invoked" status is only present on the days when the WebJob was triggered twice.
Your problem appears to be that two different things are triggering your WebJob:
1. You probably have a settings.job (wiki) with a cron expression.
2. You may also have an Azure Scheduler Job Collection hitting your WebJob from the outside (possibly with a different schedule, which is why they don't always match).
The suggestion is to get rid of #2 and keep only the internal WebJobs scheduling via settings.job.
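For reference, the internal schedule lives in a small JSON file named settings.job with a `schedule` key holding a six-field CRON expression (seconds first). The sketch below just generates that content in Python; the `settings_job` helper is made up for illustration, and its field-count check is a sanity check rather than a full CRON validator.

```python
import json


def settings_job(schedule: str) -> str:
    """Return the JSON content of a settings.job file for the given schedule."""
    fields = schedule.split()
    if len(fields) != 6:
        # WebJobs schedules use six fields: second minute hour day month day-of-week.
        raise ValueError(f"expected 6 CRON fields, got {len(fields)}")
    return json.dumps({"schedule": schedule}, indent=2)


if __name__ == "__main__":
    # 8:00 AM daily, the schedule from the question.
    print(settings_job("0 0 8 * * *"))
```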

How to handle cron execution time and user wait

I have scheduled a cron job that is executed every minute.
This cron job generates a PDF file using a remote web service. This operation alone takes a few seconds (about 3 seconds), which means the cron job can generate approximately 20 PDF files per minute.
If a visitor requests 60 documents, it will take about 3 minutes for the server to generate all the PDF files.
Running parallel cron jobs for this task is not possible, because every file request must be handled individually for database relationship and integrity reasons. Basically, each file can only be handled one at a time.
Therefore, is there any logic I could apply in order to:
execute multiple occurrences of the same cron job to speed up the process and decrease the user waiting time
and make sure each file creation is handled by one cron job only, so that a specific creation process is not also picked up by another cron job doing the same task.
Thank you

Resources