How to make job wait for cluster to become available - databricks

I have a workflow in Databricks called "score-customer", which I can run with a parameter called "--start_date". I want to run it once for each date this month, so I manually create 30 runs, passing a different date parameter to each run. However, after 5 concurrent runs, the rest of the runs fail with:
Unexpected failure while waiting for the cluster (1128-195616-z656sbvv) to be ready.
I want my jobs to wait for the cluster to become available instead of failing. How can this be achieved?
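One client-side way to stay under the limit is to submit the runs through a small queue that keeps at most five in flight. The sketch below is only an illustration of that idea, not a Databricks feature: `runWithLimit` is a made-up helper, and `task` stands for a hypothetical function that starts one run for a given date and returns a promise that settles when that run has finished.

```javascript
// Run `task(item)` for every item, keeping at most `limit` tasks in flight.
// `task` is a hypothetical caller-supplied function that returns a promise
// which settles once the run for that item has finished.
async function runWithLimit(items, limit, task) {
  const results = new Array(items.length);
  let next = 0; // index of the next item nobody has claimed yet

  // Each worker keeps claiming the next unclaimed item until none are left.
  async function worker() {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index]);
    }
  }

  // Start up to `limit` workers and wait for all of them to drain the list.
  const workers = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results;
}
```

With the 30 date strings in an array, a call such as `runWithLimit(dates, 5, startRunAndWait)` would start the sixth run only after one of the first five has finished (`startRunAndWait` being whatever function you write to trigger a run and wait for it).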

Related

Hybris-Cronjob getting stuck in running state

I have a cronjob that processes lakhs of records through API calls and runs very frequently. The issue is that the API calls sometimes time out and the cronjob gets stuck in the running state. When the next trigger time comes, a new cronjob is started on a different thread. This results in duplicate data and multiple instances of the same job running on multiple threads. How do I stop this?

PM2 Cluster Mode. Execution on one process blocks the event loop on other processes as well

I am using PM2 in cluster mode and have 2 instances of my node.js application running. I have some long executing cron jobs (about 30 seconds) that I am trying to run. I am placing an if statement before the execution of the cron jobs to ensure that they only run on the first process via
// Environment variables are strings, so compare against '0', not 0.
if (process.env.NODE_APP_INSTANCE === '0') {
  myCronFunction();
}
The goal was that since there are two processes, and PM2 should be load balancing them, if the cron job executes on process one, then process two would still be available to respond to the request. I am not sure what's going on, if PM2 is not successfully load balancing them, or what. But when my cron job executes on instance one, instance two is still not responding to requests until after the job on instance one finishes executing.
I'm not sure why that is. It is my understanding that they are supposed to be completely independent of one another.
Anyone have any ideas?

Nested Cron Job in AWS Lambda Function

Requirement: Send a reminder to n users, each at their own time, e.g. user 1 at 9:10 AM, user 2 at 10:50 PM, user 3 at 4:20 AM, and so on.
Solution in Nodejs
I have a Nodejs cron job which runs at minute 55 of every hour (i.e. 9:55, 10:55, 11:55). It first deletes all the child cron jobs, then fetches data from the database and checks the reminder settings for each user. Based on those settings, it creates a child cron job per user to send the reminders.
Solution in AWS Lambda
I created a Lambda function and scheduled it the same way, at minute 55 of every hour. Inside the Lambda I do the same thing as in Nodejs, but since the Lambda's execution finishes, the child cron jobs never get executed.
I thought about Step Functions, but I'm not sure how to achieve this since the schedule is dynamic. Someone also suggested triggering SNS, but that will not work in my scenario either.
Can someone please help me achieve this with AWS Lambda?
Why not have 1 cron job that runs every minute that sends all reminders that need to be sent based on the database information? I don't really see why you need nested cron jobs?
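A minimal sketch of that single-job idea, assuming each reminder is stored with hypothetical `userId`, `hour` and `minute` fields in UTC. The every-minute job would load the reminders and send only those returned by this function:

```javascript
// Return the reminders scheduled for the same hour and minute as `now` (UTC).
// The field names `userId`, `hour` and `minute` are hypothetical.
function dueReminders(reminders, now) {
  const hour = now.getUTCHours();
  const minute = now.getUTCMinutes();
  return reminders.filter((r) => r.hour === hour && r.minute === minute);
}

// Example data mirroring the times in the question.
const exampleReminders = [
  { userId: 1, hour: 9, minute: 10 },
  { userId: 2, hour: 22, minute: 50 },
  { userId: 3, hour: 4, minute: 20 },
];
```

In practice the filter would more likely be a database query on the reminder time than an in-memory scan, but the selection logic is the same.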
In any case, you could also use DynamoDB's time to live attribute and a stream that triggers a Lambda function. Create a record to send a reminder at X every Y, with X being the expiration time. When the record expires, the Lambda function triggers, and when it is done you create a new DDB record with X+Y as its expiration time. You might not even need a cron job Lambda in this case.
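The X and X+Y bookkeeping could look roughly like this. It is a sketch with hypothetical attribute names: `expiresAt` stands for X as a Unix timestamp in seconds (the unit DynamoDB's TTL attribute expects) and `intervalSeconds` stands for Y.

```javascript
// Build the follow-up record for a recurring reminder after the current
// record has expired. `expiresAt` (X) and `intervalSeconds` (Y) are
// hypothetical attribute names; `expiresAt` is Unix epoch time in seconds.
function nextReminderRecord(expired) {
  return {
    ...expired,
    expiresAt: expired.expiresAt + expired.intervalSeconds, // X + Y
  };
}
```

Keep in mind that DynamoDB does not promise to delete an item at the exact moment it expires, so this pattern fits reminders that can tolerate some delay.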
I suppose you could use aws-sdk to create CloudWatch rules dynamically in your Nodejs cron job.
For a cleaner separation, create 2 separate functions:
Main Cron Job (deletes the child cron jobs in CloudWatch, retrieves data from the database, creates CloudWatch rules that invoke the child cron job at specific times)
Child Cron Job (only sends reminders)
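As a rough sketch of what the main cron job would compute for each user, the helper below turns a reminder time into a CloudWatch schedule expression. `toScheduleExpression` is a made-up name. The six-field `cron(...)` format with `?` in the day-of-week position is the one CloudWatch rules use, and rules are evaluated in UTC. The resulting string would be passed as the `ScheduleExpression` when the rule is created with aws-sdk.

```javascript
// Convert a daily reminder time (UTC) into a CloudWatch schedule expression.
// Fields: minutes hours day-of-month month day-of-week year.
function toScheduleExpression(hour, minute) {
  if (!Number.isInteger(hour) || hour < 0 || hour > 23) {
    throw new RangeError('hour must be an integer from 0 to 23');
  }
  if (!Number.isInteger(minute) || minute < 0 || minute > 59) {
    throw new RangeError('minute must be an integer from 0 to 59');
  }
  // Every day of every month and year; '?' leaves day-of-week unspecified.
  return `cron(${minute} ${hour} * * ? *)`;
}
```

For example, a reminder at 9:10 AM UTC maps to `cron(10 9 * * ? *)`.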
To read more: Nodejs Create CloudWatch

App Engine Flexible cron is terminated after 120 seconds

My App Engine Flexible cron sometimes takes more than 120 seconds. Whenever it exceeds 120 seconds, App Engine throws a 502 error. It doesn't terminate my nodejs task; it only terminates the HTTP request started by the App Engine cron job.
There is also a value of 240 seconds, and I don't understand where it's coming from. I guess this is a retry request. It would be helpful if anyone could shed light on this as well.
As per the App Engine documentation, a cron can run for an hour. Is this true for HTTP requests started by a cron job as well?
To be clear, I want my cron to run for more than 120 seconds and the HTTP request to stay active for up to 1 hour.
Even though you have switched to Kubernetes Engine, I would like to take the chance and clarify the purpose of cron jobs here.
As stated in the official documentation, cron jobs are used to perform jobs at regular time intervals. They involve invoking a URL through an HTTP request and can run for up to 60 minutes, subject to the request's own limitations.
Some good uses for cron jobs: sending report emails on a daily basis, updating cached data at regular time intervals, or updating summary information every hour. When a task involves obtaining external information, especially when there is a large number of operations involved that may exceed the time an HTTP connection remains open, or when there are different types of data coming from the external application, I would not consider it a good use of cron jobs.
If you are using Kubernetes now, and consider it to be more useful for the tasks you need to perform, go ahead and continue with it.

How to handle cron execution time and user wait

I have scheduled a cron job that is executed every minute.
This cron job generates a PDF file using a remote web service. This operation alone takes a few seconds (about 3 seconds), which means the cron job can generate approximately 20 PDF files per minute.
If the visitor requests 60 documents, it will take 3 minutes for the server to generate all the PDF files.
Executing parallel cron jobs to do this task is not possible, as all the file requests must be handled individually for database relationship and integrity reasons. Basically, the files can only be handled one by one.
Therefore, is there any logic I could apply in order to:
execute multiple occurrences of the same cron job to speed up the process and decrease the user waiting time
and make the file creation process handled by one cron job only so that a specific creation process is not handled by another cron job doing the same task.
Thank you
