Is it possible to somehow package and execute already written azure function as a custom activity in azure data factory?
My workflow is as follows:
I want to use an Azure Function (which does some data processing) in an ADF pipeline as a custom activity. This custom activity is just one of the activities in the pipeline, but it is key that it gets executed.
Is it possible to somehow package and execute an already written Azure Function as a custom activity in Azure Data Factory?
As far as I know, there is no way to do that so far. In my opinion, you do not need to package the Azure Function. I suggest using a Web Activity to invoke the endpoint of your Azure Function, which would merge nicely into your existing pipeline.
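For instance, if your existing function is (or can be made) HTTP-triggered, the Web Activity only needs its URL and function key. Below is a minimal sketch of what such a function might look like in Python; the payload shape and the processing step are placeholders for your own logic:

```python
import json
import logging

import azure.functions as func


def main(req: func.HttpRequest) -> func.HttpResponse:
    """HTTP-triggered entry point that the ADF Web Activity would call."""
    logging.info("Request received from the Data Factory pipeline.")

    payload = req.get_json()  # body sent by the Web Activity (shape is up to you)
    # ... your existing data-processing logic goes here ...
    result = {"status": "processed", "rows": len(payload.get("rows", []))}

    # A 200 response lets the Web Activity (and hence the pipeline) report success.
    return func.HttpResponse(json.dumps(result), status_code=200,
                             mimetype="application/json")
```

The Web Activity then points at the function's endpoint, and its response is available to downstream activities in the pipeline.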
I have a scenario where I need to remove some characters from XML tags stored in ADLS. I am looking for an option with ADF. Can someone help me with the approach I should follow?
This is not possible with ADF alone, but you could write a small piece of code in Azure Functions to do it, since Azure Data Factory only covers data movement and data transformation; manipulating characters inside XML tags does not fall under either.
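As a rough illustration of that suggestion (not the poster's actual code), the sketch below downloads an XML file from ADLS Gen2, strips a couple of unwanted characters, and writes it back. It assumes the azure-storage-file-datalake package; the account URL, file system, path, and character list are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

ACCOUNT_URL = "https://<storage-account>.dfs.core.windows.net"  # placeholder
FILE_SYSTEM = "raw"                                             # placeholder
FILE_PATH = "input/sample.xml"                                  # placeholder
BAD_CHARS = ["&", "#"]                                          # example characters to strip


def clean_xml_file() -> None:
    """Download the XML file, remove the unwanted characters, upload it back."""
    service = DataLakeServiceClient(account_url=ACCOUNT_URL,
                                    credential=DefaultAzureCredential())
    file_client = service.get_file_system_client(FILE_SYSTEM).get_file_client(FILE_PATH)

    xml_text = file_client.download_file().readall().decode("utf-8")
    for ch in BAD_CHARS:
        xml_text = xml_text.replace(ch, "")

    file_client.upload_data(xml_text.encode("utf-8"), overwrite=True)
```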
You can use the Azure Function activity in a Data Factory pipeline to run Azure Functions. To launch an Azure Function, you must first set up an Azure Function linked service and an activity that specifies the Azure Function you want to run.
There is a Microsoft document with deeper insights into the Azure Function activity in ADF; see here.
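For completeness, the linked service and the Azure Function activity can also be created programmatically. The snippet below is a hedged sketch using the azure-mgmt-datafactory Python SDK; the resource names, function URL, function name, and key are placeholders, and exact model parameters can differ between SDK versions:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureFunctionActivity, AzureFunctionLinkedService, LinkedServiceReference,
    LinkedServiceResource, PipelineResource, SecureString,
)

SUB, RG, FACTORY = "<subscription-id>", "<resource-group>", "<factory-name>"  # placeholders

client = DataFactoryManagementClient(DefaultAzureCredential(), SUB)

# Linked service pointing at the Function App.
client.linked_services.create_or_update(
    RG, FACTORY, "AzureFunctionLS",
    LinkedServiceResource(properties=AzureFunctionLinkedService(
        function_app_url="https://<function-app>.azurewebsites.net",
        function_key=SecureString(value="<function-key>"),
    )),
)

# Pipeline with a single Azure Function activity calling a placeholder function.
activity = AzureFunctionActivity(
    name="RunProcessData",
    function_name="process_data",
    method="POST",
    body={"source": "adf"},
    linked_service_name=LinkedServiceReference(
        type="LinkedServiceReference", reference_name="AzureFunctionLS"),
)
client.pipelines.create_or_update(
    RG, FACTORY, "FunctionPipeline", PipelineResource(activities=[activity]))
```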
Is it possible to trigger Azure Functions or an App Service web app whenever an insert operation is performed against a table on Azure SQL MI?
If not, is there a way to trigger applications outside Azure SQL other than using a Logic App? I want to avoid Logic Apps because that requires one more application, and it still relies on polling.
The link below says it is not supported for Azure Functions:
https://feedback.azure.com/forums/355860-azure-functions/suggestions/16711846-sql-azure-trigger-support
The link below suggests using a Logic App:
Trigger Azure Function by inserting (adding) new row into table, SQL Server Database
Today, in Azure SQL, there is no such possibility. The closest option is to create a Timer Trigger Azure Function that checks whether there have been any changes in the table you want to monitor (using Change Tracking, for example).
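A rough sketch of that timer-trigger idea in Python, assuming Change Tracking is enabled on the monitored table and using pyodbc; the connection string, table name, key column, and how the last-synced version is persisted are placeholders you would need to adapt:

```python
import logging

import azure.functions as func
import pyodbc

CONN_STR = "<azure-sql-connection-string>"  # placeholder
LAST_VERSION = 0  # in practice, persist this between runs (e.g. in a blob or table)


def main(mytimer: func.TimerRequest) -> None:
    """Runs on a schedule and reacts to rows changed since the last synced version."""
    global LAST_VERSION
    with pyodbc.connect(CONN_STR) as conn:
        cursor = conn.cursor()
        # dbo.MyTable and the ID column are placeholders for your own table and key.
        cursor.execute(
            "SELECT CT.ID, CT.SYS_CHANGE_OPERATION "
            "FROM CHANGETABLE(CHANGES dbo.MyTable, ?) AS CT",
            LAST_VERSION,
        )
        for row in cursor.fetchall():
            logging.info("Row %s changed (%s)", row.ID, row.SYS_CHANGE_OPERATION)
            # ... react to the change here (call another service, queue a message, ...)

        cursor.execute("SELECT CHANGE_TRACKING_CURRENT_VERSION()")
        LAST_VERSION = cursor.fetchone()[0]
```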
If you are using Azure SQL MI instead, you could create a SQLCLR procedure that calls an Azure Function via an HTTP request or, as another option, publishes to Azure Event Hubs or Azure Event Grid.
There have been several feature requests for triggering Azure Functions based on changes to an Azure SQL database. For example:
https://github.com/Azure/azure-functions-python-worker/issues/365
It seems they are not prioritizing it, since it is possible to implement this functionality using Logic Apps.
We have a series of CSV files landing every day (a daily delta), and these need to be loaded into an Azure database using Azure Data Factory (ADF). We have created an ADF pipeline that moves data straight from an on-premises folder to an Azure DB table, and it is working.
Now we need this pipeline to be executed based on an event rather than on a scheduled time: specifically, on the creation of a specific file in that same local folder. This file is created when the daily delta file landing is complete. Let's call it SRManifest.csv.
The question is, how do I create a trigger to start the pipeline when SRManifest.csv is created? I have looked into Azure Event Grid, but it seems it doesn't work with on-premises folders.
You're right that you cannot configure an Event Grid trigger to watch local files, since you're not writing to Azure Storage. You'd need to generate your own signal after writing your local file content.
Aside from timer-based triggers, Event-based triggers are tied to Azure Storage, so the only way to use that would be to drop some type of "signal" file in a well-known storage location, after your files are written locally, to trigger your ADF pipeline to run.
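For the signal-file option, whatever process writes your local files could finish by dropping a small marker blob into a storage container that an ADF storage event trigger watches. A minimal sketch with the azure-storage-blob package (the account URL, container, and blob name are placeholders):

```python
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

ACCOUNT_URL = "https://<storage-account>.blob.core.windows.net"  # placeholder
CONTAINER = "triggers"  # container watched by the ADF storage event trigger


def drop_signal_blob() -> None:
    """Upload an empty marker blob; its creation fires the storage event trigger."""
    service = BlobServiceClient(account_url=ACCOUNT_URL,
                                credential=DefaultAzureCredential())
    blob = service.get_blob_client(container=CONTAINER, blob="SRManifest.signal")
    blob.upload_blob(b"", overwrite=True)
```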
Alternatively, you can trigger an ADF pipeline programmatically (.NET and Python SDKs support this; maybe other ones do as well, plus there's a REST API). Again, you'd have to build this, and run your trigger program after your local content has been created. If you don't want to write a program, you can use PowerShell (via Invoke-AzDataFactoryV2Pipeline).
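And a sketch of the programmatic option with the azure-mgmt-datafactory Python SDK; the subscription, resource group, factory, and pipeline names are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

SUBSCRIPTION_ID = "<subscription-id>"  # placeholder
RESOURCE_GROUP = "<resource-group>"    # placeholder
FACTORY_NAME = "<data-factory-name>"   # placeholder
PIPELINE_NAME = "<pipeline-name>"      # placeholder


def run_pipeline_after_local_copy() -> str:
    """Kick off the ADF pipeline once the local files (and SRManifest.csv) are written."""
    client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)
    run = client.pipelines.create_run(
        RESOURCE_GROUP, FACTORY_NAME, PIPELINE_NAME, parameters={}
    )
    return run.run_id
```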
There are other tools/services that integrate with Data Factory as well; I wasn't attempting to provide an exhaustive list.
Have a look at the triggers for the Azure Logic Apps File System connector. More details here.
Is there a way to create an Azure ADF pipeline to ingest incoming POST requests? I have a gateway app (outside Azure) that publishes data via REST as it arrives from the application, and this data needs to be ingested into a Data Lake. I use REST calls from another pipeline to pull data, but this basically needs to do the reverse: the data will be pushed, and I need to be constantly 'listening' for those calls...
Is this something an ADF pipeline should do, or are there other Azure components able to do it?
The previous comment is right and is one approach to get this working, but it would need a bit of coding (for the Azure Function).
An alternative solution for your requirement is to combine Azure Logic Apps with Azure Data Factory.
Step 1: Create an HTTP-triggered Logic App that will be invoked by your gateway app; the data will be posted to this REST-callable endpoint.
Step 2: Create an ADF pipeline with a parameter; this parameter holds the data that needs to be pushed to the data lake. It could be raw data that is transformed as a step within the pipeline before being pushed to the data lake.
Step 3: Once the Logic App is triggered, simply use the Azure Data Factory action to invoke the pipeline created in Step 2 and pass the posted data to it as a pipeline parameter.
That should be it; with this, you can spin up a code-less solution.
If your outside application is already pushing via REST, why not have it make calls directly to the Data Lake REST APIs? This would cut out the middle steps and bring everything under your control.
Azure Data Factory is a batch data movement service. If you want to push the data over HTTP, you can implement a simple Azure Function to accept the data and write it to the Azure Data Lake.
See the Azure Functions HTTP triggers and bindings overview.
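A minimal sketch of that function in Python, writing each pushed payload to ADLS Gen2 with the azure-storage-file-datalake package; the storage account, file system, and folder layout are placeholders:

```python
import json
import uuid

import azure.functions as func
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

ACCOUNT_URL = "https://<storage-account>.dfs.core.windows.net"  # placeholder
FILE_SYSTEM = "landing"                                         # placeholder


def main(req: func.HttpRequest) -> func.HttpResponse:
    """Accept a pushed JSON payload from the gateway app and persist it to the Data Lake."""
    payload = req.get_json()

    service = DataLakeServiceClient(account_url=ACCOUNT_URL,
                                    credential=DefaultAzureCredential())
    file_client = service.get_file_system_client(FILE_SYSTEM).get_file_client(
        f"incoming/{uuid.uuid4()}.json"  # one file per request; layout is a placeholder
    )
    file_client.upload_data(json.dumps(payload).encode("utf-8"), overwrite=True)

    return func.HttpResponse(status_code=202)
```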
Can we create a file (preferably JSON) and store it in one of ADF's supported storage sinks (like Blob or Azure Data Lake Storage) using the parameters that are passed to an Azure Data Factory v2 pipeline at run time? I suppose it could be done via Azure Batch, but that seems like overkill for such a trivial task. Is there a better way to do it?
Here are all the transformation activities ADFv2 is currently equipped with; I'm afraid there isn't a direct way to create a file in ADFv2. You could leverage a Custom activity to achieve this by running your customized code logic on an Azure Batch pool of virtual machines. Hope this helps a little.
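If you do end up using a Custom activity, the code that runs on the Batch node can pick up the pipeline parameters that ADF passes through the activity's extendedProperties (which land in the activity.json file in the task's working directory) and write the JSON file to Blob storage. A hedged sketch with hypothetical property names and an assumed activity.json layout:

```python
import json

from azure.storage.blob import BlobServiceClient

ACCOUNT_URL = "https://<storage-account>.blob.core.windows.net"  # placeholder
CONTAINER = "output"                                             # placeholder


def main() -> None:
    # ADF drops activity.json into the Batch task's working directory;
    # the extendedProperties section (assumed layout) is where pipeline
    # parameter values can be surfaced to the custom code.
    with open("activity.json") as f:
        activity = json.load(f)
    props = activity["typeProperties"]["extendedProperties"]

    # "storageKey" and "outputFileName" are hypothetical property names
    # you would define on the Custom activity yourself.
    service = BlobServiceClient(account_url=ACCOUNT_URL, credential=props["storageKey"])
    blob = service.get_blob_client(container=CONTAINER, blob=props["outputFileName"])
    blob.upload_blob(json.dumps(props).encode("utf-8"), overwrite=True)


if __name__ == "__main__":
    main()
```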