Azure Data Factory - How can I trigger Scheduled/OneTime Pipelines?

Background: I have scheduled pipelines that copy data from a source to a destination. They are scheduled to run daily at a specific time.
Problem: The input dataset for the pipeline is external and not available at fixed time intervals. This means the copy activity has to wait until the Scheduled Start time set in the pipeline to kick off. Considering the volume of data, I don't want to waste time waiting.
Requirement: At any given time, I know when my input dataset becomes available. With this in hand, I want to trigger an ADF pipeline from C# even though it is scheduled to start only at a specific time.

I ran into this same issue; I needed to run my pipeline only when a local job had completed. To do that, I modified the local job to kick off the pipeline as its last step. I have a write-up here on how to start an ADF pipeline with C#. Here is the link to the ADF developer reference, which might also be helpful. I also have an example here on how to trigger ADF pipelines from Azure Functions, if you are interested. It uses the same code as the first example, but I get the benefit of running the whole process in the cloud and the ability to use the Azure Functions scheduler.
Here is the relevant method to modify the pipeline. You would need to change the start and end dates based on when you want the slice to run.
public void StartPipeline(string resourceGroup, string dataFactory, string pipelineName, DateTime slice)
{
    // Fetch the existing pipeline definition.
    var pipeline = inner_client.Pipelines.Get(resourceGroup, dataFactory, pipelineName);

    // Set the active period to cover only the requested slice, and unpause.
    pipeline.Pipeline.Properties.Start = DateTime.Parse($"{slice.Date:yyyy-MM-dd}T00:00:00Z");
    pipeline.Pipeline.Properties.End = DateTime.Parse($"{slice.Date:yyyy-MM-dd}T23:59:59Z");
    pipeline.Pipeline.Properties.IsPaused = false;

    // Push the updated definition back, which causes the slice to run.
    inner_client.Pipelines.CreateOrUpdate(resourceGroup, dataFactory, new PipelineCreateOrUpdateParameters()
    {
        Pipeline = pipeline.Pipeline
    });
}
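For context, inner_client above is the management client from the V1 ADF .NET SDK. A minimal sketch of wiring it up, assuming the Microsoft.Azure.Management.DataFactories NuGet package and an AAD access token acquired by your own authentication code:
// Assumptions: V1 ADF SDK (Microsoft.Azure.Management.DataFactories package);
// the subscription id is a placeholder; accessToken comes from AAD (e.g. via ADAL).
using Microsoft.Azure;
using Microsoft.Azure.Management.DataFactories;

var credentials = new TokenCloudCredentials("<subscription-id>", accessToken);
var inner_client = new DataFactoryManagementClient(credentials);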

To trigger an ADF pipeline, the input dataset needs to be in the 'Ready' state. If it is, you can go to the Monitoring tab and manually 'Re-Run' the slice; if the input dataset is not ready, you first need to get it into the ready state before you can manually start the pipeline.

If you want to trigger the job only once, you can set the start and end dates to the same time:
pipeline.Pipeline.Properties.Start = DateTime.Parse($"{someDate:yyyy-MM-dd}T00:00:00Z");
pipeline.Pipeline.Properties.End = DateTime.Parse($"{someDate:yyyy-MM-dd}T00:00:00Z");
pipeline.Pipeline.Properties.IsPaused = false;

Here is an example from the Microsoft docs (link for reference).
(This only applies to ADF V2.)
{
    "properties": {
        "name": "MyTrigger",
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Hour",
                "interval": 1,
                "startTime": "2017-11-01T09:00:00-08:00",
                "endTime": "2017-11-02T22:00:00-08:00"
            }
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "type": "PipelineReference",
                    "referenceName": "SQLServerToBlobPipeline"
                },
                "parameters": {}
            },
            {
                "pipelineReference": {
                    "type": "PipelineReference",
                    "referenceName": "SQLServerToAzureSQLPipeline"
                },
                "parameters": {}
            }
        ]
    }
}
Save the JSON to a file in your working directory and deploy it with the following command:
Set-AzureRmDataFactoryV2Trigger -ResourceGroupName resourceGroupName -DataFactoryName dataFactoryName -Name "ScheduleTriggerName" -DefinitionFile ".\ScheduleTriggerName.json"
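Note that a newly created trigger does not run until it is started. Assuming the same AzureRM V2 module, you then start it with:
Start-AzureRmDataFactoryV2Trigger -ResourceGroupName resourceGroupName -DataFactoryName dataFactoryName -Name "ScheduleTriggerName"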

Check this out: https://learn.microsoft.com/en-us/azure/data-factory/concepts-pipeline-execution-triggers.
As of today, I believe you can use this:
POST
https://management.azure.com/subscriptions/mySubId/resourceGroups/myResourceGroup/providers/Microsoft.DataFactory/factories/myDataFactory/pipelines/copyPipeline/createRun?api-version=2017-03-01-preview
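If you are calling it from C#, a rough sketch of that POST (the URL placeholders mySubId, myResourceGroup, myDataFactory, and copyPipeline are the same as above; the bearer token is assumed to come from your own AAD authentication code):
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static async Task CreateRunAsync(string accessToken)
{
    var url = "https://management.azure.com/subscriptions/mySubId/resourceGroups/myResourceGroup" +
              "/providers/Microsoft.DataFactory/factories/myDataFactory" +
              "/pipelines/copyPipeline/createRun?api-version=2017-03-01-preview";

    using (var client = new HttpClient())
    {
        client.DefaultRequestHeaders.Authorization =
            new AuthenticationHeaderValue("Bearer", accessToken);

        // An empty JSON body is enough when the pipeline takes no parameters.
        var response = await client.PostAsync(url, new StringContent("{}"));
        Console.WriteLine(await response.Content.ReadAsStringAsync()); // contains the runId
    }
}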

Related

Azure Data Factory Get Metadata activity returning "(404) not found" error when getting column count

I am trying to implement a Get Metadata activity to return the column count of files I have in a single blob storage container.
The Get Metadata activity is returning the "(404) not found" error (screenshot of the error omitted).
I'm fairly new to Azure Data Factory and cannot solve this. Here's what I have:
Dataset (source dataset):
- Name: ten_eighty_split_CSV
- Connection: Blob storage
- Schema: imported from blob storage file
- Parameters: "FileName"; string; "@pipeline().parameters.SourceFile"
Pipeline:
- Name: ten eighty split
- Parameters: "SourceFile"; string; "@pipeline().parameters.SourceFile"
- Settings: Concurrency: 1
Get Metadata activity: Get Metadata
- The only argument is "Column count"
It throws the error upon debugging. I am not sure what to do; "(404) not found" is so broad that I could not ascertain a specific solution. Thanks!
The error occurs because you have given an incorrect file name, or the name of a file that does not exist.
Since you are trying to use a blob-created event trigger to find the column count, you can use the procedure below:
After configuring the Get Metadata activity, create a storage event trigger: go to Add trigger -> Choose trigger -> Create new.
Click on Continue. You will get a Trigger Run Parameters tab. In it, give the value as @triggerBody().fileName.
Complete the trigger creation and publish the pipeline. Now, whenever a file is uploaded into your container (the one on which you created the storage event trigger), the pipeline will be triggered automatically (no need to debug). If the container is empty and you try to debug by giving some value for the sourceFile parameter, you will get the same error.
Upload a sample file to your container. It will trigger the pipeline and give the desired result.
The following is the trigger JSON that I created for my container:
{
    "name": "trigger1",
    "properties": {
        "annotations": [],
        "runtimeState": "Started",
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": "pipeline1",
                    "type": "PipelineReference"
                },
                "parameters": {
                    "sourceFile": "@triggerBody().fileName"
                }
            }
        ],
        "type": "BlobEventsTrigger",
        "typeProperties": {
            "blobPathBeginsWith": "/data/blobs/",
            "blobPathEndsWith": ".csv",
            "ignoreEmptyBlobs": true,
            "scope": "/subscriptions/b83c1ed3-c5b6-44fb-b5ba-2b83a074c23f/resourceGroups/<user>/providers/Microsoft.Storage/storageAccounts/blb1402",
            "events": [
                "Microsoft.Storage.BlobCreated"
            ]
        }
    }
}
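For this to work end to end, the dataset's file path has to actually consume the FileName parameter. A hedged fragment of what that might look like in the dataset JSON (property names assumed from the standard blob dataset shape; the container name is a placeholder):
"typeProperties": {
    "location": {
        "type": "AzureBlobStorageLocation",
        "fileName": "@dataset().FileName",
        "container": "data"
    }
}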

Azure Logic App : how to trigger logic app workflow 2000+ times?

"triggers": {
"manual": {
"inputs": {
"schema": {}
},
"kind": "Http",
"runtimeConfiguration": {
"concurrency": {
"maximumWaitingRuns": 99,
"runs": 1
}
},
"type": "Request"
}
}
Based on the above code, can I increase "maximumWaitingRuns" from 99 to 2000+?
If yes, then how?
I also want to trigger the workflow 2000+ times with a single click.
One workaround to make the workflow run multiple times is to use a Recurrence trigger instead of an HTTP trigger.
Another workaround is to specify the number of iterations and run the workflow through Postman. After saving your request to one of your collections, navigate to your collection >> Run >> specify the number of iterations >> Run Sample.
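If Postman is not an option, a rough C# sketch of the same idea, firing the HTTP trigger in a loop (the callback URL is a placeholder; with "runs": 1 above, the Logic App processes one run at a time and queues up to maximumWaitingRuns, so requests may be throttled):
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static async Task FireManyAsync()
{
    var triggerUrl = "<HTTP trigger callback URL from the Logic App designer>";
    using (var client = new HttpClient())
    {
        for (var i = 0; i < 2000; i++)
        {
            var response = await client.PostAsync(
                triggerUrl, new StringContent("{}", Encoding.UTF8, "application/json"));
            Console.WriteLine($"{i}: {(int)response.StatusCode}");
        }
    }
}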

Azure DevOps API - how to reference other pipeline as resource parameter

I have an Azure DevOps pipeline and want to reference another pipeline that my pipeline will fetch artefacts from. I am struggling to find a way to actually do this over the REST API.
https://learn.microsoft.com/en-us/rest/api/azure/devops/pipelines/runs/run%20pipeline?view=azure-devops-rest-6.1 mentions BuildResourceParameters and PipelineResourceParameters, but I cannot find a way to get them to work.
For example:
Source pipeline A produces an artefact B in run C. I want to tell API to reference the artefact B from run C of pipeline A rather than refer to the latest.
Anyone?
In your current situation, you can use the request body below to select the version of the referenced pipeline:
{
    "stagesToSkip": [],
    "resources": {
        "repositories": {
            "self": {
                "refName": "refs/heads/master"
            }
        },
        "pipelines": {
            "myresourcevars": {
                "version": "1313"
            }
        }
    },
    "variables": {}
}
Note: the name 'myresourcevars' is the pipeline resource name you defined in your YAML file (screenshot omitted).
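For completeness, a rough sketch of sending that body from C# (the organization, project, pipeline id, and PAT are placeholders; the endpoint shape follows the Runs - Run Pipeline documentation linked above):
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

public static async Task RunPipelineAsync(string pat)
{
    var url = "https://dev.azure.com/<organization>/<project>/_apis/pipelines/<pipelineId>/runs?api-version=6.1-preview.1";
    var body = @"{
        ""resources"": {
            ""pipelines"": { ""myresourcevars"": { ""version"": ""1313"" } }
        }
    }";

    using (var client = new HttpClient())
    {
        // Azure DevOps PATs go in the password half of a Basic auth header.
        var token = Convert.ToBase64String(Encoding.ASCII.GetBytes($":{pat}"));
        client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", token);

        var response = await client.PostAsync(url, new StringContent(body, Encoding.UTF8, "application/json"));
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}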

Azure data factory pipeline: conditional checking on variable activity

I have a Web activity that calls a REST API and saves its output into a table. But one of its values will not always be available, so we need to do a conditional check when setting the output into a Set Variable activity.
You can see how we have done that in the Set Variable activity.
This is the REST API's output:
{
    "value": {
        "id": "464a115fd3cb",
        "runId": "464a115fd3cb",
        "parameters": {},
        "invokedBy": {
            "id": "99448303872CU28",
            "name": "TRIGGER_TIMESHEET_API",
            "invokedByType": "ScheduleTrigger"
        },
        "isLatest": true
    },
    "continuationToken": "+RID:~sj5QALRCCB4w5hYAAAAADQ",
    "ADFWebActivityResponseHeaders": {
        "Pragma": "no-cache"
    }
}
Here "continuationToken" will not be a part of all the API responses. So if this value is available in the API response, we need to set that in the variable activity.
In the attached screenshot, you can see that we are setting the variable. But if that key is not available in the API response, it will throw an error.
So we are looking for a solution to check whether that key is existing in the JSON output.
Any help appreciated.
I think you are almost at your goal already; use a Set Variable activity together with an If Condition activity:
Set Variable activity: (screenshot omitted)
If Condition activity to judge whether the value is empty or not: (screenshot omitted)
Then you can configure the True activities and False activities accordingly; a sketch of the key-existence check follows.
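For the key-existence check itself, the contains() function can be used in the If Condition's expression. A sketch, assuming the Web activity is named Web1 (contains() returns true when a dictionary has the given key):
@contains(activity('Web1').output, 'continuationToken')
In the True activities, set the variable with @activity('Web1').output.continuationToken; in the False activities, set a default value or skip the assignment.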

Is there an option to get the Event Grid trigger URL + key as an output value from the deployment of an Azure Function?

The scenario we would like to achieve is as follows:
- We deploy a Function service in a VSTS release via ARM.
- With the Function service deployed, we deploy the Event Grid subscription.
Thanks,
Shraddha Agrawal
Yes, there is a way to obtain a function access code using the REST API. The following are the steps:
Let's assume the function is named EventGridTrigger2, with the following run.csx:
#r "Newtonsoft.Json"
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
public static void Run(JObject eventGridEvent, TraceWriter log)
{
log.Info(eventGridEvent.ToString(Formatting.Indented));
}
and the function.json file:
{
    "bindings": [
        {
            "type": "eventGridTrigger",
            "name": "eventGridEvent",
            "direction": "in"
        }
    ],
    "disabled": false
}
As you can see, the above binding is untyped, which will work for any output schema such as InputEventSchema, EventGridSchema (the default schema), and CloudEventV01Schema (after fixing some bug).
The destination property of the created Subscription looks like the following:
"destination": {
"properties": {
"endpointUrl": null,
"endpointBaseUrl": "https://myFunctionApp.azurewebsites.net/admin/extensions/EventGridExtensionConfig"
},
"endpointType": "WebHook"
},
Note that the full subscriberUrl for the Azure Event Grid trigger has the following format, where the query string contains the parameters for routing the request to the proper function:
https://{FunctionApp}.azurewebsites.net/admin/extensions/EventGridExtensionConfig?functionName={FunctionName}&code={masterKey}
To create a subscriber, we have to use the full subscriberUrl including the query string. At this point, the only unknown value is the masterKey.
To obtain the Function App (host) masterKey, we have to use a management REST API call:
https://management.azure.com/subscriptions/{mySubscriptionId}/resourceGroups/{myResGroup}/providers/Microsoft.Web/sites/{myFunctionApp}/functions/admin/masterkey?api-version=2016-08-01
The response has the following format:
{
    "masterKey": "*************************************************"
}
Note that an authentication Bearer token is required for this call.
Once we have a masterKey for the FunctionApp (host), we can use it for any function within this host.
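A hedged sketch of that call and of assembling the full subscriberUrl from C# (the management URL is the one quoted above with its placeholders unchanged; the bearer token acquisition is assumed to happen elsewhere):
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

public static async Task<string> GetSubscriberUrlAsync(string accessToken)
{
    var keyUrl = "https://management.azure.com/subscriptions/{mySubscriptionId}/resourceGroups/{myResGroup}" +
                 "/providers/Microsoft.Web/sites/{myFunctionApp}/functions/admin/masterkey?api-version=2016-08-01";

    using (var client = new HttpClient())
    {
        client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", accessToken);
        var json = JObject.Parse(await client.GetStringAsync(keyUrl));
        var masterKey = (string)json["masterKey"];

        // Plug the key into the subscriberUrl format described above.
        return "https://{FunctionApp}.azurewebsites.net/admin/extensions/EventGridExtensionConfig" +
               $"?functionName=EventGridTrigger2&code={masterKey}";
    }
}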
I think you are asking: "how can I deploy an Azure Function with a step in a VSTS Release using ARM and get its trigger url so that I can use the trigger url in the next VSTS Release step?"
It's not very well documented, but using the official docs, this blog post and some trial and error, we've figured out how.
This is what the ARM should look like:
{
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {},
    "variables": {},
    "resources": [],
    "outputs": {
        "triggerUrl": {
            "type": "string",
            "value": "[listsecrets(resourceId('Microsoft.Web/sites/functions', 'functionAppName', 'functionName'),'2015-08-01').trigger_url]"
        }
    }
}
You deploy it with an "Azure Resource Group Deployment" step; make sure that you enter a variable name in the "Deployment outputs" text box, let's say triggerUrl.
Example output:
{"triggerUrl":{"type":"String","value":"https://functionAppName.azurewebsites.net/api/functionName?code=1234"}}
Then you put a PowerShell step (or an Azure PowerShell step) afterwards that picks up the value from the variable.
$environmentVariableName = "triggerUrl"
$outputVariables = (Get-Item env:$environmentVariableName).Value
# The variable holds the outputs object as JSON; parse out the actual URL.
$triggerUrl = ($outputVariables | ConvertFrom-Json).triggerUrl.value
Then do something with it.
With the update of Functions App V2.0.12050, the URI of the Event Grid trigger is a little different. See also here.
