Copy data from a private S3 implementation using Azure Data Factory

I'm looking into using Azure Data Factory for copying files from an S3 bucket that is not hosted at Amazon. The current solution we are using includes some Azure Functions, a Logic App and a Web App. With Azure Data Factory this should all be much simpler and easier to maintain.
I've looked into the properties that can be given for an Amazon S3 linked service:
"name": "AmazonS3LinkedService",
"properties": {
"type": "AmazonS3",
"typeProperties": {
"accessKeyId": "<access key id>",
"secretAccessKey": {
"type": "SecureString",
"value": "<secret access key>"
}
},
"connectVia": {
"referenceName": "<name of Integration Runtime>",
"type": "IntegrationRuntimeReference"
}
}
But I do not see a property in the documentation for setting a different host.
My question is: is this possible at all?

I think it is just a connector for Amazon S3. At least at the model layer, there is no property for a different host.
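For what it's worth, later revisions of the connector documentation describe a serviceUrl property on the Amazon S3 linked service for pointing the connector at an S3-compatible endpoint. A minimal sketch, assuming your Data Factory version supports it (the endpoint URL is a placeholder):
{
    "name": "S3CompatibleLinkedService",
    "properties": {
        "type": "AmazonS3",
        "typeProperties": {
            "serviceUrl": "https://<your S3-compatible endpoint>",
            "accessKeyId": "<access key id>",
            "secretAccessKey": {
                "type": "SecureString",
                "value": "<secret access key>"
            }
        }
    }
}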

How to create a folder inside of an Azure Data Lake container using an ARM template?

I have an Azure ADLS storage account called eventcoadltest, and I have a container called eventconnector-transformed-data-fs.
I have deployed this ADLS account through an ARM template, but I need to create a directory inside eventconnector-transformed-data-fs (the folder debugging was created through the UI, but I need to achieve the same with an ARM template).
I have found some posts that indicate this is not possible, but that it can be bypassed with some workarounds:
How to create empty folder in azure blob storage
Use ARM template to create directories in Azure Storage Containers?
How to create a folder inside container in Azure Data Lake Storage Gen2 with the help of 'azure-storage' Package
ARM template throws incorrect segments lengths for array of storage containers types
how to create blob container inside Storage Account using ARM templates
Microsoft Azure: How to create sub directory in a blob container
How to create an azure blob storage using Arm template?
How to create directories in Azure storage container without creating extra files?
I have tried to modify my ARM template as well to achieve a similar result, but I haven't had any success.
{
    "$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "storageAccountDLName": {
            "type": "string"
        },
        "sku": {
            "type": "string"
        },
        "contact": {
            "type": "string"
        },
        "directoryOutput": {
            "type": "string"
        }
    },
    "resources": [
        {
            "type": "Microsoft.Storage/storageAccounts",
            "apiVersion": "2021-02-01",
            "sku": {
                "name": "[parameters('sku')]",
                "tier": "Standard"
            },
            "kind": "StorageV2",
            "name": "[parameters('storageAccountDLName')]",
            "location": "[resourceGroup().location]",
            "tags": {
                "Contact": "[parameters('contact')]"
            },
            "scale": null,
            "properties": {
                "isHnsEnabled": true,
                "networkAcls": {
                    "bypass": "AzureServices",
                    "virtualNetworkRules": [],
                    "ipRules": [],
                    "defaultAction": "Allow"
                }
            },
            "dependsOn": [],
            "resources": [
                {
                    "type": "storageAccounts/blobServices/containers",
                    "name": "[concat('default/', 'eventconnector-raw-data-fs/test')]",
                    "apiVersion": "2021-02-01",
                    "properties": {},
                    "dependsOn": [
                        "[resourceId('Microsoft.Storage/storageAccounts', parameters('storageAccountDLName'))]"
                    ]
                }
            ]
        }
    ]
}
The following snippet is the part that was modified to try to create the folders inside the containers:
"type": "storageAccounts/blobServices/containers",
"name": "[concat('default/', 'eventconnector-raw-data-fs/test')]"
The reason I am trying to solve this problem is that I won't have access to create folders in our production environment, which is why the deployment needs to be done fully through ARM. How can I create this folder with the deployment script? Is there another alternative for achieving my desired result? Any idea or suggestion is welcome :)
This doesn't make any sense, as you cannot create folders in Azure Storage. It doesn't have folders; blobs are individual objects/entities. You are confused into believing folders exist because the UI renders them as folders, but THERE ARE NO FOLDERS in an Azure Storage blob container.
TL;DR: you cannot do this at all, no matter how hard you try.
After some research I found out that it is possible to create a folder via Databricks with the following command:
dbutils.fs.mkdirs("dbfs:/mnt/folder_desktop/test/uploads")
I had to configure Databricks with my Azure Data Factory in order to run this command.
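If the deployment really has to stay inside ARM, another avenue that may be worth investigating is the Microsoft.Resources/deploymentScripts resource type, which runs an Azure CLI or PowerShell script as part of the deployment. A minimal sketch, assuming a pre-existing user-assigned identity with data rights on the storage account (the identityResourceId parameter is a placeholder, not part of the original template):
{
    "type": "Microsoft.Resources/deploymentScripts",
    "apiVersion": "2020-10-01",
    "name": "createAdlsDirectory",
    "location": "[resourceGroup().location]",
    "kind": "AzureCLI",
    "identity": {
        "type": "UserAssigned",
        "userAssignedIdentities": {
            // placeholder: resource ID of an identity with Storage Blob Data Contributor
            "[parameters('identityResourceId')]": {}
        }
    },
    "properties": {
        "azCliVersion": "2.30.0",
        "scriptContent": "az storage fs directory create --name debugging --file-system eventconnector-transformed-data-fs --account-name eventcoadltest --auth-mode login",
        "retentionInterval": "P1D",
        "timeout": "PT30M"
    },
    "dependsOn": [
        "[resourceId('Microsoft.Storage/storageAccounts', parameters('storageAccountDLName'))]"
    ]
}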

ARM Deployment: Get Azure Function API Key

As part of a Stream Analytics deployment solution I want to retrieve the API key for an Azure Function App in an ARM template, e.g. via the listkeys() function. Is there a way to retrieve this key via an ARM template, i.e. during an ARM deployment, and if so, how?
Thanks
The new Key Management API of Azure Functions has gone live. It's possible via the following ARM snippet; also check this GitHub issue.
"variables": {
"functionAppId": "[concat(parameters('functionAppResourceGroup'),'/providers/Microsoft.Web/sites/', parameters('functionAppName'))]"
},
"resources": [
{
"type": "Microsoft.KeyVault/vaults/secrets",
"name": "[concat(parameters('keyVaultName'),'/', parameters('functionAppName'))]",
"apiVersion": "2015-06-01",
"properties": {
"contentType": "text/plain",
"value": "[listkeys(concat(variables('functionAppId'), '/host/default/'),'2016-08-01').functionKeys.default]"
},
"dependsOn": []
}
]
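If the key is to be consumed directly by a Stream Analytics output rather than parked in Key Vault, the same listkeys() expression can be used inline in the output's datasource. A sketch, assuming the Azure Functions output adapter for Stream Analytics (jobName, functionAppName and functionName are placeholder parameters; verify the property names against the current schema):
{
    "type": "Microsoft.StreamAnalytics/streamingjobs/outputs",
    "apiVersion": "2016-03-01",
    "name": "[concat(parameters('jobName'), '/FunctionOutput')]",
    "properties": {
        "datasource": {
            "type": "Microsoft.AzureFunction",
            "properties": {
                "functionAppName": "[parameters('functionAppName')]",
                "functionName": "[parameters('functionName')]",
                // same expression as in the Key Vault secret above
                "apiKey": "[listkeys(concat(variables('functionAppId'), '/host/default/'), '2016-08-01').functionKeys.default]"
            }
        }
    }
}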
This question has already been answered here:
listKeys for Azure function app
Get Function & Host Keys of Azure Function In Powershell
What is important in this context is to set the 'Minimum TLS Version' to '1.0' before deploying the job; otherwise you will get failures when testing the connection health.
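For reference, that TLS setting lives on the function app's site configuration in ARM; a minimal sketch, reduced to the relevant property (so not a complete site definition):
{
    "type": "Microsoft.Web/sites",
    "apiVersion": "2018-11-01",
    "name": "[parameters('functionAppName')]",
    "location": "[resourceGroup().location]",
    "kind": "functionapp",
    "properties": {
        "siteConfig": {
            // '1.0' per the note above; '1.2' is the usual default
            "minTlsVersion": "1.0"
        }
    }
}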

How to delete an Azure Table after it is copied to another storage using ADF

I have 100 tables that I want to copy to another storage account frequently. After the copy is over, I want to delete the source tables. I am able to copy entities inside tables to another storage account using the ADF Copy Activity, but I couldn't figure out a way to delete the source tables after a successful copy.
I am using the Data Factory .NET API to create pipelines, datasets, etc. I thought of a Custom Activity as the solution, but I am not sure how to plug this activity into a pipeline through the API.
Any code samples or suggestions are highly appreciated.
As mentioned, we could do that with a Custom Activity.
but not sure how to plug this activity into a pipeline through the API?
We could use the Create or Update Pipeline API to create or update the pipeline.
We could get more info about how to use custom activities in Azure Data Factory from this tutorial.
The following steps are from the tutorial:
1. Create a custom activity .NET class library project that implements the IDotNetActivity interface.
2. Launch Windows Explorer and navigate to the bin\debug or bin\release folder.
3. Zip all of the files under the bin\release folder and upload the archive to the Azure Storage container customactivitycontainer.
4. Create an Azure Storage linked service.
5. Create an Azure Batch linked service (a sketch follows this list).
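The Azure Batch linked service from step 5 might look like the following (a sketch in the ADF version 1 JSON style of the tutorial; all account values are placeholders):
{
    "name": "AzureBatchLinkedService",
    "properties": {
        "type": "AzureBatch",
        "typeProperties": {
            "accountName": "<batch account name>",
            "accessKey": "<batch account key>",
            "poolName": "<batch pool name>",
            "batchUri": "https://<region>.batch.azure.com",
            "linkedServiceName": "AzureStorageLinkedService"
        }
    }
}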
We could then use the Create or Update Pipeline API to create a pipeline that uses the custom activity:
{
    "name": "ADFTutorialPipelineCustom",
    "properties": {
        "description": "Use custom activity",
        "activities": [
            {
                "Name": "MyDotNetActivity",
                "Type": "DotNetActivity",
                "Inputs": [
                    {
                        "Name": "InputDataset"
                    }
                ],
                "Outputs": [
                    {
                        "Name": "OutputDataset"
                    }
                ],
                "LinkedServiceName": "AzureBatchLinkedService",
                "typeProperties": {
                    "AssemblyName": "MyDotNetActivity.dll",
                    "EntryPoint": "MyDotNetActivityNS.MyDotNetActivity",
                    "PackageLinkedService": "AzureStorageLinkedService",
                    "PackageFile": "customactivitycontainer/MyDotNetActivity.zip",
                    "extendedProperties": {
                        "SliceStart": "$$Text.Format('{0:yyyyMMddHH-mm}', Time.AddMinutes(SliceStart, 0))"
                    }
                },
                "Policy": {
                    "Concurrency": 2,
                    "ExecutionPriorityOrder": "OldestFirst",
                    "Retry": 3,
                    "Timeout": "00:30:00",
                    "Delay": "00:00:00"
                }
            }
        ],
        "start": "2016-11-16T00:00:00Z",
        "end": "2016-11-16T05:00:00Z",
        "isPaused": false
    }
}
For how to operate on Azure Storage tables (for example, deleting a table once the copy has succeeded), please refer to the Azure Table storage documentation.

Is it possible to create a SendGrid account through Azure CLI?

Every tutorial and resource I've seen has you create a SendGrid account through the GUI, but I want to be able to use the CLI. Is it possible?
Something like:
az sendgrid create
Although you cannot create a SendGrid account using the Azure CLI, you can create one using an ARM template, as follows:
{
    "$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "name": {
            "type": "string"
        },
        "location": {
            "type": "string"
        },
        "plan_name": {
            "type": "string"
        },
        "plan_publisher": {
            "type": "string"
        },
        "plan_product": {
            "type": "string"
        },
        "plan_promotion_code": {
            "type": "string"
        },
        "password": {
            "type": "secureString"
        },
        "email": {
            "type": "string"
        },
        "firstName": {
            "type": "string"
        },
        "lastName": {
            "type": "string"
        },
        "company": {
            "type": "string"
        },
        "website": {
            "type": "string"
        },
        "acceptMarketingEmails": {
            "type": "string"
        }
    },
    "resources": [
        {
            "apiVersion": "2015-01-01",
            "name": "[parameters('name')]",
            "type": "Sendgrid.Email/accounts",
            "location": "[parameters('location')]",
            "plan": {
                "name": "[parameters('plan_name')]",
                "publisher": "[parameters('plan_publisher')]",
                "product": "[parameters('plan_product')]",
                "promotionCode": "[parameters('plan_promotion_code')]"
            },
            "properties": {
                "password": "[parameters('password')]",
                "acceptMarketingEmails": "[parameters('acceptMarketingEmails')]",
                "email": "[parameters('email')]",
                "firstName": "[parameters('firstName')]",
                "lastName": "[parameters('lastName')]",
                "company": "[parameters('company')]",
                "website": "[parameters('website')]"
            }
        }
    ]
}
Then you can use az group deployment create to provision your template.
but I want to be able to use the CLI. Is it possible?
As far as I know, Azure does not support creating SendGrid accounts via the CLI at this time.
C:\Users>az --help
For version info, use 'az --version'
Group
az
Subgroups:
account : Manage subscriptions.
acs : Manage Azure Container Services.
ad : Synchronize on-premises directories and manage Azure Active Directory resources.
appservice: Manage your Azure Web apps and App Service plans.
batch : Manage Azure Batch.
cloud : Manage the registered Azure clouds.
component : Manage and update Azure CLI 2.0 (Preview) components.
container : Set up automated builds and deployments for multi-container Docker applications.
disk : Manage Azure Managed Disks.
documentdb: Manage your Azure DocumentDB (NoSQL) database accounts.
feature : Manage resource provider features, such as previews.
group : Manage resource groups and template deployments.
image : Manage custom Virtual Machine Images.
iot : Connect, monitor, and control millions of IoT assets.
keyvault : Safeguard and maintain control of keys, secrets, and certificates.
lock : Manage Azure locks.
network : Manages Azure Network resources.
policy : Manage resource policies.
provider : Manage resource providers.
redis : Access to a secure, dedicated cache for your Azure applications.
resource : Manage Azure resources.
role : Use role assignments to manage access to your Azure resources.
snapshot : Manage point-in-time copies of managed disks, native blobs, or other snapshots.
sql : Manage Azure SQL Databases and Data Warehouses.
storage : Durable, highly available, and massively scalable cloud storage.
tag : Manage resource tags.
vm : Provision Linux or Windows virtual machines in seconds.
vmss : Create highly available, auto-scalable Linux or Windows virtual machines.
Commands:
configure : Configure Azure CLI 2.0 Preview or view your configuration. The command is
interactive, so just type `az configure` and respond to the prompts.
feedback : Loving or hating the CLI? Let us know!
find : Find Azure CLI commands based on a given query.
login : Log in to access Azure subscriptions.
logout : Log out to remove access to Azure subscriptions.
No, it's not possible.
Here you can see all available commands: https://learn.microsoft.com/en-us/cli/azure/reference-index?view=azure-cli-latest

Since it's not possible to create a blob container in Azure ARM, how can I enable Archive using ARM?

According to the documentation I can enable the Azure Event Hubs Archive feature using an Azure Resource Manager template. The template takes a blobContainerName argument:
"The blob container where you want your event data be archived."
But AFAIK it's not possible to create a blob container using an ARM template, so how am I supposed to enable the Archive feature on an Event Hub?
The purpose of the ARM template is to provision everything from scratch, not to manually create some of the resources using the portal.
It wasn't possible before to create containers in your storage account, but this has changed: new functionality has been added to the ARM template schema for storage accounts which enables you to create containers.
To create a storage account with a container called theNameOfMyContainer, add this to the resources block of your ARM template:
{
    "name": "[parameters('storageAccountName')]",
    "type": "Microsoft.Storage/storageAccounts",
    "apiVersion": "2018-02-01",
    "location": "[resourceGroup().location]",
    "kind": "StorageV2",
    "sku": {
        "name": "Standard_LRS",
        "tier": "Standard"
    },
    "properties": {
        "accessTier": "Hot"
    },
    "resources": [
        {
            "name": "[concat('default/', 'theNameOfMyContainer')]",
            "type": "blobServices/containers",
            "apiVersion": "2018-03-01-preview",
            "dependsOn": [
                "[parameters('storageAccountName')]"
            ],
            "properties": {
                "publicAccess": "Blob"
            }
        }
    ]
}
To my knowledge, you can use None, Blob or Container for publicAccess.
It's still not possible to create queues and tables this way, but hopefully this will be added soon.
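With the container in place, the Capture (formerly Archive) settings can then reference it from the Event Hub resource itself. A sketch of the relevant block, assuming the Microsoft.EventHub schema (namespace, hub and container names are placeholders):
{
    "type": "Microsoft.EventHub/namespaces/eventhubs",
    "apiVersion": "2017-04-01",
    // placeholder names; wire these to your own parameters
    "name": "myNamespace/myEventHub",
    "properties": {
        "messageRetentionInDays": 1,
        "partitionCount": 2,
        "captureDescription": {
            "enabled": true,
            "encoding": "Avro",
            "intervalInSeconds": 300,
            "sizeLimitInBytes": 314572800,
            "destination": {
                "name": "EventHubArchive.AzureBlockBlob",
                "properties": {
                    "storageAccountResourceId": "[resourceId('Microsoft.Storage/storageAccounts', parameters('storageAccountName'))]",
                    "blobContainer": "theNameOfMyContainer",
                    "archiveNameFormat": "{Namespace}/{EventHub}/{PartitionId}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}"
                }
            }
        }
    }
}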
Just like you said, there is no way to create a blob container in an ARM template, so the only logical answer to this question is: supply an existing container at deployment time. One way to do that would be to create the container with PowerShell and pass its name as a parameter to the ARM deployment.
