I want to automatically start a job on an Azure Batch AI cluster once a week. The jobs are all identical except for the starting time. I thought of writing a PowerShell Azure Function to do this, but Azure Functions v2 doesn't support PowerShell, and I don't want to use v1 in case it gets phased out. I would prefer not to do this in C# or Java. How can I do this?
Currently, there's no option available to trigger a job on an Azure Batch AI cluster on a schedule. Maybe you want to run a shell script that sets up a regular schedule using the system's task scheduler. Please see if this doc by Said Bleik helps:
https://github.com/saidbleik/batchai_mm_ad#scheduling-jobs
I assume this way you can add multiple schedules for the job!
The Azure Batch portal has a "Job schedules" tab. You can go there, add a job, and set a schedule for it; you can specify the recurrence in the schedule.
Scheduled jobs
Job schedules enable you to create recurring jobs within the Batch service. A job schedule specifies when to run jobs and includes the specifications for the jobs to be run. You can specify the duration of the schedule--how long and when the schedule is in effect--and how frequently jobs are created during the scheduled period.
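If you prefer to script this instead of using the portal, here is a minimal sketch of creating a weekly job schedule with the azure-batch Python SDK. This is a hedged example, not the exact setup for Batch AI: the account name, key, URL, pool ID, and command line are placeholders you would replace, and the client constructor arguments may differ between SDK versions.

from datetime import timedelta

from azure.batch import BatchServiceClient, models
from azure.batch.batch_auth import SharedKeyCredentials

# Placeholders: replace with your Batch account name, key, URL and pool ID.
credentials = SharedKeyCredentials("mybatchaccount", "<account-key>")
client = BatchServiceClient(
    credentials, batch_url="https://mybatchaccount.<region>.batch.azure.com"
)

# A job schedule that creates a new job once a week on the given pool.
schedule = models.JobScheduleAddParameter(
    id="weekly-training",
    schedule=models.Schedule(recurrence_interval=timedelta(days=7)),
    job_specification=models.JobSpecification(
        pool_info=models.PoolInformation(pool_id="my-training-pool"),
        job_manager_task=models.JobManagerTask(
            id="weekly-task",
            command_line="/bin/bash -c 'python train.py'",
        ),
    ),
)
client.job_schedule.add(schedule)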
I have created an HTTP Cloud Scheduler job. I'm expecting it to have a maximum run time of 5 minutes. However, my job is reporting DEADLINE_EXCEEDED after exactly 1 minute.
When I run gcloud scheduler jobs describe MySyncTask to view the job, it reports attemptDeadline: 300s. The service I am calling is Cloud Run, and I have also set a 300s limit there.
I am running the job manually by clicking "force a job run" in the GUI.
After exactly 1 minute, the logs report DEADLINE_EXCEEDED.
When you execute a job from the GUI, it will be executed using the default attemptDeadline value, which is 60 seconds according to this question.
If you want to run it manually, I suggest running the job from Cloud Shell and passing the --attempt-deadline flag with the desired value, as shown in this answer:
gcloud beta scheduler jobs update http <job> --attempt-deadline=1800s --project <project>
We need to execute a long-running EXE on a Windows machine and are thinking of ways to integrate it with the workflow. The plan is to include the EXE as a task in the Databricks workflow.
We are considering a couple of approaches:
Create a DB table and insert a row when this particular task starts in the workflow. The EXE running on the Windows machine polls the table for new records. Once a new record is found, the EXE proceeds with the actual execution and updates the status after completion. Databricks constantly queries this table for the status, and once it shows completed, the task finishes.
Using the Databricks API, check whether the task has started executing and then continue with the EXE's execution. After the application finishes, update the task status to complete; until then the Databricks task would run like a while (true) loop. But the current API doesn't seem to support updating a task's execution status to complete (not 100% sure).
Please share your thoughts or alternative solutions.
This is an interesting problem. Is there a reason you must use Databricks to execute an EXE?
Regardless, I think you have the right kind of idea. Here is how I would do this with the Jobs API:
Have your EXE process write a file to a staging location, probably in DBFS, since that will be locally accessible inside Databricks.
Build a notebook to load this file; having a table is optional but may give you additional logging capabilities if needed. The notebook should return its result via the dbutils.notebook.exit method, which allows you to output a value such as a string (a sketch of such a notebook follows below). You could return "In Progress" and "Success", or the latest line from the file you've written.
Wrap that notebook in a Databricks job and execute it on an interval with a cron schedule (you said 1 minute), and you can retrieve the output value of the run via the get-output endpoint.
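As a rough sketch of that notebook, assuming the EXE appends its latest status as the last line of a file under a hypothetical DBFS staging path, it could look something like this (dbutils is only available inside a Databricks notebook):

# Hypothetical staging file written by the EXE; adjust the path and status values to your setup.
status_path = "/dbfs/staging/exe_status.txt"

try:
    with open(status_path) as f:
        lines = [line.strip() for line in f if line.strip()]
    status = lines[-1] if lines else "In Progress"
except FileNotFoundError:
    # The EXE has not reported anything yet.
    status = "In Progress"

# Return the status as the notebook's exit value so the job run exposes it.
dbutils.notebook.exit(status)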
Additional note: the benefit of abstracting this into return values from a notebook is that you could orchestrate it via other workflow tools, e.g. Databricks Workflows or Azure Data Factory, inside an Until condition. There are no limits as long as you can orchestrate a notebook in that tool.
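On the side that waits for completion, a hedged sketch of polling the run's exit value through the Jobs API get-output endpoint could look like this; the workspace URL, token and run ID are placeholders, and the field names assume Jobs API 2.1 (for a multi-task job you would pass the individual task run's ID):

import time

import requests

HOST = "https://<your-workspace>.azuredatabricks.net"  # placeholder workspace URL
TOKEN = "<personal-access-token>"                       # placeholder token
RUN_ID = 12345                                          # placeholder run ID

while True:
    resp = requests.get(
        f"{HOST}/api/2.1/jobs/runs/get-output",
        headers={"Authorization": f"Bearer {TOKEN}"},
        params={"run_id": RUN_ID},
    )
    resp.raise_for_status()
    result = resp.json().get("notebook_output", {}).get("result")
    if result == "Success":
        break
    time.sleep(60)  # check again in a minute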
If I have, for example, a (multi-task) Databricks job with 3 tasks in series and the second one fails, is there a way to start from the second task instead of running the whole pipeline again?
Right now this is not possible, but the Databricks Q3 2021 public roadmap had some items around improving multi-task jobs.
Update, September 2022: this functionality was released back in May 2022 under the name "repair & rerun".
If you are running Databricks on Azure, it is possible via Azure Data Factory.
How do I delay a job in Azure DevOps Pipelines? I have multiple jobs that will be running simultaneously, and the problem is that in the checkout phase I get an error saying files are being used by another process.
I found "delayForMinutes" and running a PowerShell script, but they only work for tasks, not for jobs.
My goal is to have the job's checkout phase delayed, not the tasks in it.
You can do something like this: after the checkout job, add an agentless job and include a Delay task in it. Then you can continue with the other tasks in a separate agent job.
I need to know how to stop an Azure Databricks cluster through configuration when it is running indefinitely while executing a job (without stopping it manually), and also how to create an email alert for when the job's running time exceeds its usual running time.
You can do this in the Jobs UI: select your job and, under Advanced, edit the Alerts and Timeout values.
This Databricks docs page may help you: https://docs.databricks.com/jobs.html
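If you'd rather set this through the API than the UI, a minimal sketch against the Jobs API could look like the following; the workspace URL, token, job ID, timeout and email address are placeholders, and the payload assumes the Jobs API 2.1 job-level settings timeout_seconds and email_notifications.

import requests

HOST = "https://<your-workspace>.azuredatabricks.net"  # placeholder workspace URL
TOKEN = "<personal-access-token>"                       # placeholder token

payload = {
    "job_id": 12345,  # placeholder job ID
    "new_settings": {
        # Cancel any run that exceeds the usual running time (here: 2 hours).
        "timeout_seconds": 7200,
        # Email on unsuccessful runs; a timed-out run counts as unsuccessful.
        "email_notifications": {"on_failure": ["me@example.com"]},
    },
}

resp = requests.post(
    f"{HOST}/api/2.1/jobs/update",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
)
resp.raise_for_status()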