ADF Pipelines and Triggers - azure

Let's assume a scenario where pipeline A runs every day and pipeline B runs once a month and depends on pipeline A (pipeline B should trigger only after pipeline A completes successfully).
With a scheduled trigger, we cannot have hard dependencies between two pipelines, whereas with a tumbling window trigger, we cannot specify the exact day on which pipeline B should run (it offers only two frequency options, minutes and hours, whereas the scheduled trigger also offers weeks and months).
Both triggers have disadvantages for this scenario.
What would be the best solution here?

You can run Pipeline A every day and add an If Condition that checks whether today is a specific date: run Pipeline B if TRUE, and do nothing if FALSE.
In the settings of the If Condition, you can use this expression if you want to run it on the 1st of every month:
@Contains('01',Substring(formatDateTime(utcnow()),8,2))
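To make the index arithmetic in that expression easier to follow, here is a minimal Python sketch of the same check. It assumes the timestamp is an ISO 8601 string such as 2021-03-01T03:00:00, so the two characters starting at index 8 are the day of the month. The function name is hypothetical and is not part of ADF.

```python
from datetime import datetime, timezone


def is_first_of_month(now: datetime) -> bool:
    """Mirror the If Condition: take 2 characters at index 8 and compare to '01'."""
    # Assumed timestamp layout: "YYYY-MM-DDTHH:MM:SS" (ISO 8601).
    iso = now.strftime("%Y-%m-%dT%H:%M:%S")
    day = iso[8:10]  # same slice as Substring(..., 8, 2)
    return day == "01"


if __name__ == "__main__":
    # Evaluated in UTC, like utcnow() in the expression.
    print(is_first_of_month(datetime.now(timezone.utc)))
```

Because the check is evaluated in UTC, a run shortly before or after midnight local time may see a different day than expected.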

Related

Customize pipelines list when a pipeline is scheduled multiple times a day, more frequently than the code changes

I need to run a GitLab pipeline at four specific times each day, which I have solved by setting up four schedules, one for each desired point in time. All pipelines run on the same branch, master.
In the list of pipelines, I get the following information for each pipeline:
status (success or not)
pipeline ID, a label indicating the pipeline was triggered by a schedule, and a label indicating the pipeline was run on the latest commit on that branch
the user that triggered the pipeline
branch and commit on which the pipeline was run
status (success, warning, failure) for each stage
duration and time at which the pipeline was run (X hours/days/... ago)
This seems optimized for pipelines which typically run no more than once after each commit: in such a scenario, it is relatively easy to identify a particular pipeline.
In my case, however, the code itself changes relatively rarely (the main purpose of the pipeline is to verify against external data which changes several times a day). As a result, I end up with a list of near-identical entries. In fact, the only difference is the time at which the pipeline was run, though for anything older than 24 hours I will get 4 pipelines that ran “2 days ago”.
Is there any way I can customize these entries? For a scheduled pipeline, I would like to have an indicator of the schedule which triggered the pipeline or the time of day (even for pipelines older than 24 hours), optionally a date (e.g. “August 16” rather than “5 days ago”).
To enable the use of absolute times in GitLab:
Click your Avatar in the top right corner.
Click Preferences.
Scroll to Time preferences and uncheck the box next to Use relative times.
Your pipelines will now show the actual date and time at which they were triggered rather than a relative time.
More info here: https://gitlab.com/help/user/profile/preferences#time-preferences

Azure logic apps - Schedulers

I have a job which needs to be scheduled daily at 8am, at 9am, and every 15 minutes from 10am to 4pm. How can I schedule this job in Azure Logic Apps?
You will not be able to schedule these individual times directly in a Logic App.
To work around this for your requirement:
You could have the Logic App triggered (scheduled) every 15 minutes. In the Logic App flow you could then extract the current time:
formatDateTime(utcNow(),'HH:mm')
You could compare it with your schedule and run the job if the conditions are met.
Pseudo logic:
if the time is 8:00 AM or 9:00 AM
    Yes: set the flag (a boolean variable) to true
    No: check if the time is between 10 AM and 4 PM
        Yes: set the flag to true
        No: set the flag to false
if the flag is true, run the job; otherwise do nothing.
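The pseudo logic above can be sketched in Python as a single predicate that is evaluated on every 15-minute tick. This is only an illustration under a few assumptions: the function name is hypothetical, 4 PM is treated as inclusive, and the time passed in is already in the time zone the schedule is defined in (utcNow() returns UTC, so a conversion may be needed first).

```python
from datetime import time


def should_run(now: time) -> bool:
    """Return True when a 15-minute tick matches the desired schedule."""
    # Single runs at exactly 08:00 and 09:00.
    if (now.hour, now.minute) in {(8, 0), (9, 0)}:
        return True
    # Every tick from 10:00 through 16:00 (inclusive by assumption).
    tick = now.replace(second=0, microsecond=0)
    return time(10, 0) <= tick <= time(16, 0)
```

For example, should_run(time(8, 15)) is False, so the 08:15 tick does nothing, while should_run(time(13, 45)) is True.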

Azure DataFactory: Start / End time of schedule Pipelines

I have a pipeline in Azure Data Factory which is scheduled to run hourly.
Every scheduled task has a start time and an end time (e.g. 1am - 2am) and copies the files within that interval. If an old task overruns, for example finishing at 2:15am, what will the behaviour of the next task be?
(a) running task with start time and end time 2am-4am
(b) running task with start time and end time 3am-4am
My aim is to make sure no files are missed when copying.
I have tested this in my ADF.
Conclusion:
The previous pipeline's status won't affect the next task's start time. So in your case, if the previous pipeline started at 1am and finished at 2:15am, your next task will still start at 2am.
My test:
I created a Schedule trigger which runs every 3 minutes. My pipeline runs for about 6 minutes.
Monitor pipeline runs and trigger runs:
My first task ended at 3/4/21, 3:32:41 PM, while the next task had already started at 3/4/21, 3:30:00 PM. So if an old task overruns, it won't affect the next task's start time.
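The observed behaviour can be illustrated with a small Python simulation. It is only a sketch: it assumes, as the test above suggests, that each run's start time depends solely on the trigger interval. The first start time of 15:27 and the function name are hypothetical values chosen to resemble the test.

```python
from datetime import datetime, timedelta


def simulate_runs(first_start, interval, duration, count):
    """Return (start, end) pairs for a fixed-interval schedule.

    Each start depends only on the schedule, never on the previous run's end.
    """
    return [
        (first_start + i * interval, first_start + i * interval + duration)
        for i in range(count)
    ]


runs = simulate_runs(
    first_start=datetime(2021, 3, 4, 15, 27),  # hypothetical first start time
    interval=timedelta(minutes=3),             # trigger fires every 3 minutes
    duration=timedelta(minutes=6),             # each run takes about 6 minutes
    count=3,
)
for start, end in runs:
    print(start.time(), "->", end.time())
```

In this simulation the second run starts before the first one ends, so consecutive runs overlap instead of waiting for each other.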

Azure Data Factory - Tumbling Window Trigger - Limit hours it is running

With an Azure Data Factory "Tumbling Window" trigger, is it possible to limit the hours of each day that it triggers during (adding a window you might say)?
For example I have a Tumbling Window trigger that runs a pipeline every 15 minutes. This is currently running 24/7 but I'd like it to only run during business hours (0700-1900) to reduce costs.
Edit:
I played around with this, and found another option which isn't ideal from a monitoring perspective, but it appears to work:
Create a new pipeline with a single "If Condition" step with a dynamic Expression like this:
@and(greater(int(formatDateTime(utcnow(),'HH')),6),less(int(formatDateTime(utcnow(),'HH')),20))
In the true case activity, add an Execute Pipeline step executing your original pipeline (with "Wait on completion" ticked)
In the false case activity, add a wait step which sleeps for X minutes
The longer you sleep for, the longer you can possibly encroach on your window, so adjust that to match.
I need to give it a couple of days before I check the billing on the portal to see if it has reduced costs. At the moment I'm assuming a job which just sleeps for 15 minutes won't incur the costs that one running and processing data would.
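To see which hours the If Condition expression in this answer lets through, here is a minimal Python sketch of the same comparison. The function name is hypothetical. Like utcnow(), it works on UTC hours, so any local-time offset is an assumption you would need to account for separately.

```python
from datetime import datetime, timezone


def in_window(now: datetime) -> bool:
    """Same test as and(greater(hour, 6), less(hour, 20)) on the 'HH' hour."""
    hour = int(now.strftime("%H"))
    return 6 < hour < 20


# Enumerate every hour of an arbitrary day and keep those that pass the check.
allowed_hours = [
    h for h in range(24)
    if in_window(datetime(2021, 3, 4, h, tzinfo=timezone.utc))
]
print(allowed_hours)
```

The hours that pass are 7 through 19, so runs continue until 19:59 UTC. If the pipeline must stop at 19:00 sharp, the upper bound in the expression would be 19 instead of 20.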
There is no easy way, but you can create two deployment pipelines for the same job in Azure DevOps, and as soon as your window (0700 to 1900) expires, you replace that job with a dummy job using an Azure DevOps pipeline.

How to re-try an ADF pipeline execution until conditions are met

An ADF pipeline needs to be executed on a daily basis, let's say at 03:00.
But prior to execution we also need to check whether the data sources are available.
Data is provided by an external agent, which periodically loads the corresponding data into each source table and lets us know when this process is completed using a flag table: if data source 1 is ready, it sets the flag to 1.
I can't find a way to implement this logic with ADF.
We would need something that, for instance, at 03:00 triggers an 'element' that checks the flags; if the flags are not up, it doesn't launch the pipeline. After, let's say, 10 minutes, it checks the flags again, and continues like this at most X times OR until the flags are up.
If the flags are up, launch the pipeline execution and stop trying to launch the pipeline any further.
How would you do it?
The logic per se is not complicated in any way, but I wouldn't know where to implement it. Should I develop an Azure Function that launches the pipeline, or is there a way to achieve it with an out-of-the-box ADF activity?
There is an Until iteration activity in which you can check your condition.
Example:
Your Azure Function (AF) checks the flag and returns 0 or 1.
Build an ADF pipeline with an Until activity where you check the output of the AF (if it's 1, do something). Your processing step can live inside the Until activity. For example, set a variable flag to 0 before the Until activity. Inside the Until, check whether it is 1: if it is, run your processing step; if it's not, add a Wait activity of 10 minutes or so.
So ADF lets you iterate until a condition is satisfied.
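The control flow of that Until loop can be sketched in Python as follows. This is an illustration of the logic only, not ADF code: flags_ready stands in for the Azure Function call, run_pipeline for the processing step, and all names and default values are hypothetical.

```python
import time
from typing import Callable


def run_when_ready(
    flags_ready: Callable[[], bool],
    run_pipeline: Callable[[], None],
    max_attempts: int = 6,
    wait_seconds: float = 600,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the flags are up or attempts run out.

    Returns True if the pipeline was launched, False otherwise.
    """
    for attempt in range(1, max_attempts + 1):
        if flags_ready():
            run_pipeline()  # launch once, then stop retrying
            return True
        if attempt < max_attempts:
            sleep(wait_seconds)  # plays the role of the Wait activity
    return False
```

The max_attempts parameter covers the "at most X times" requirement from the question, and passing sleep as a parameter makes the loop easy to try out without actually waiting 10 minutes.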
Hope that this will help you :)

Resources