Is it possible to copy files from blob storage / file share to an FTP server with ADF?

I have a couple of pipelines in ADF that ultimately produce some files. These files are currently stored in file shares and blob storage. However, I'd like to move them to an FTP server.
So far, I have created a linked service to that FTP server and a dataset that points to the FTP server and the folder I want to upload the files to. However, when I use this dataset in a Copy Data activity, I get the error "the linked service in sink dataset does not support sink".
As far as I understand, this is only possible with SFTP, which is not an option for me; it must be FTP (technical limitations).
Can you provide me some guidance here?
Best regards!

You can call an Azure Logic App from ADF; Logic Apps provide SFTP and FTP connectors as well as Azure Storage connectors:
http://microsoft-bitools.blogspot.com/2018/06/execute-logic-apps-in-azure-data.html
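For reference, the file movement such a Logic App performs is roughly what this Python sketch does (a minimal sketch assuming the azure-storage-blob package; the connection string, container/blob names, and FTP details are placeholders):

    # Download a blob and push it to a plain-FTP server.
    from io import BytesIO
    from ftplib import FTP

    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string("<storage-connection-string>")
    blob = service.get_blob_client(container="output", blob="report.csv")

    # Pull the blob content into memory.
    data = blob.download_blob().readall()

    # Push it to the FTP server (plain FTP, not SFTP).
    with FTP("ftp.example.com") as ftp:
        ftp.login(user="ftpuser", passwd="ftppassword")
        ftp.cwd("/upload")  # target folder on the FTP server
        ftp.storbinary("STOR report.csv", BytesIO(data))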

Related

Moving files among Azure blob containers without downloading

Currently, I have a blob container with about 5 TB of archive-tier files. I need to move some of those files to another container. Is there a way to avoid downloading and re-uploading the files? I do not need to access the data in those files, and I do not want to be billed for reading archive-tier files either.
Thanks.
I suggest you use Data Factory; it is commonly used to transfer large amounts of data. The copy performance and scalability achievable using ADF are documented, and you can learn more from the tutorial below:
Copy and transform data in Azure Blob storage by using Azure Data Factory
Hope this helps.
You can use AzCopy for that. It is a command-line utility that you can use to initiate server-to-server transfers:
AzCopy uses server-to-server APIs, so data is copied directly between storage servers. These copy operations don't use the network bandwidth of your computer.
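The same service-side copy is also available from the Python SDK; here is a minimal sketch assuming the azure-storage-blob package (container and blob names are placeholders, and archive-tier sources may carry additional restrictions):

    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string("<storage-connection-string>")
    src = service.get_blob_client(container="source-container", blob="file1.dat")
    dst = service.get_blob_client(container="target-container", blob="file1.dat")

    # start_copy_from_url asks the storage service to copy the blob itself;
    # the bytes never pass through your machine.
    result = dst.start_copy_from_url(src.url)
    print(result["copy_status"])  # 'success', or 'pending' for async copies

    # Once the copy has completed, delete the source to finish the "move".
    # src.delete_blob()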

Azure Blob Storage as Source and FTP as destination

Is there any way I can transfer .txt files from my Azure Blob Storage to an FTP server directly, going serverless?
If possible, using SSIS or Azure Data Factory.
Thanks!
You can use an Azure Logic App:
Connectors to blob storage
Connectors to FTP
A simple Logic App to push a blob to an FTP server would be a Blob Storage trigger (for example, "When a blob is added or modified") followed by the FTP connector's "Create file" action.
SSIS has a lot of connectors that can talk directly to Azure Storage. As for FTP, you may have to use third-party software (WinSCP) to accomplish uploading the file to FTP (if the built-in FTP Task doesn't accomplish it already). If you are looking to go directly from Azure to FTP, you may have to rely on custom C# code; I am not even sure that is possible.
You could use SSIS. The Azure Data Factory Copy activity doesn't support FTP as a sink.

Azure storage sync mechanisms

I have a problem that I have been racking my brain about, and I figured I would need some perspective and insight from people who are a lot more knowledgeable about this.
What I have currently: a web-based application hosted in Azure uses Azure Blob Storage to store files that are generated as part of data import processes. We have a separate application that extends the original web application and allows users to upload files; these files are currently also stored in Azure Blob Storage.
Where I am trying to go: I have a requirement to map network file shares on a user's laptop and be able to access the files that currently reside in the blob store.
Since Azure Blob Storage does not support SMB, I have no way of actually doing this with a blob store.
I could use Azure Files in conjunction with a file server running the sync agent. However, this requires a lot of work in terms of refactoring, setup, and a custom service that adds/removes permissions on the file server.
I'm wondering: is there a service or a piece of software on the market that allows me to continue using blob storage and sync the blob files onto a file server, so users can access and open the files using Windows File Explorer? I found one that looks like an open-source project, but it only does a one-way sync from the blob store to the file share. Ideally I'd like a solution that does a two-way sync, as Azure File Sync does.
Any thoughts and ideas will be appreciated.
Since the maximum number of blob containers and file shares is unlimited, per my understanding you could leverage the following approaches:
Migrate the data from blob storage to an Azure file share, so that files are subsequently stored in Azure Files instead of blob storage.
Note: Currently you must specify the storage account key when mounting file shares (for details, follow this feedback item), so I recommend that you do not map network file shares on users' laptops.
You could still use blob storage: create a blob container for each user and generate a container-level SAS token for each user; the users could then leverage Azure Storage Explorer to manage their blob files, or use AzCopy and other command-line tools to download the blob files to their laptop file system.
Note: For security, you could combine a stored access policy with the SAS; to revoke the permissions, you just need to invalidate the related access policy instead of regenerating the account key. For details, follow Controlling a SAS with a stored access policy and Shared Access Signatures, Part 2: Create and use a SAS with Blob storage.
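To illustrate the stored-access-policy approach, here is a minimal Python sketch using the azure-storage-blob package (container name, policy id, and keys are placeholders):

    from datetime import datetime, timedelta, timezone

    from azure.storage.blob import (
        AccessPolicy,
        BlobServiceClient,
        ContainerSasPermissions,
        generate_container_sas,
    )

    service = BlobServiceClient.from_connection_string("<storage-connection-string>")
    container = service.get_container_client("user1-files")

    # 1. Define a stored access policy on the container. Revoking access later
    #    only requires removing this policy, not regenerating the account key.
    policy = AccessPolicy(
        permission=ContainerSasPermissions(read=True, list=True),
        expiry=datetime.now(timezone.utc) + timedelta(days=30),
    )
    container.set_container_access_policy(signed_identifiers={"user1-policy": policy})

    # 2. Issue a SAS token that references the policy by id.
    sas = generate_container_sas(
        account_name=service.account_name,
        container_name="user1-files",
        account_key="<account-key>",
        policy_id="user1-policy",
    )
    print(f"{container.url}?{sas}")  # hand this URL to the user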

Can Azure Data Factory write to FTP

I want to write the output of a pipeline to an FTP folder. ADF seems to support on-premises file systems but not FTP folders.
How can I write the output in text format to an FTP folder?
Unfortunately, FTP servers are not a supported data store for ADF as of right now, so there is no OOTB way to interact with an FTP server for either reading or writing.
However, you can make it possible with a custom activity, though it will require some custom development. A fellow Cloud Solution Architect within MS put together a blog post about how he did it for one of his customers. Please take a look at the following:
https://blogs.msdn.microsoft.com/cloud_solution_architect/2016/07/02/creating-ftp-data-movement-activity-for-azure-data-factory-pipeline/
I hope that this helps.
On reflection, you might be able to achieve what you want in a mildly convoluted way by writing the output to an Azure Blob storage account and then either:
1) manually downloading the file from the Blob storage account and pushing it to the "FTP" site, or
2) automatically using the Azure CLI to pull the file locally and then pushing it to the "FTP" site with a batch or shell script as appropriate.
As a lighter-weight approach than custom activities (which are certainly the better option for heavy work), you may wish to consider using Azure Functions to write to FTP (note there is a timeout when using a Consumption plan, but not in other plans, so it will depend on how big the files are).
https://learn.microsoft.com/en-us/azure/azure-functions/functions-create-storage-blob-triggered-function
You could instruct Data Factory to write to an intermediary blob storage account, and use a blob storage trigger in Azure Functions to upload the files to FTP as soon as they appear in blob storage, along the lines of the sketch below.
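As a rough illustration, a blob-triggered function could look like this Python sketch (v2 programming model; the container name and the FTP settings pulled from app settings are placeholders, and the Consumption-plan timeout mentioned above still applies):

    import os
    from io import BytesIO
    from ftplib import FTP

    import azure.functions as func

    app = func.FunctionApp()

    # Fires whenever Data Factory drops a file into the 'outbox' container.
    @app.blob_trigger(arg_name="myblob", path="outbox/{name}",
                      connection="AzureWebJobsStorage")
    def blob_to_ftp(myblob: func.InputStream):
        filename = os.path.basename(myblob.name)  # myblob.name is 'outbox/<file>'
        with FTP(os.environ["FTP_HOST"]) as ftp:
            ftp.login(user=os.environ["FTP_USER"], passwd=os.environ["FTP_PASSWORD"])
            ftp.storbinary(f"STOR {filename}", BytesIO(myblob.read()))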
Alternatively, write to blob storage and then use a recurrence (timer) trigger in Logic Apps to upload from blob storage to FTP. Logic Apps hide a tremendous amount of power behind their friendly exterior.
You can write a Logic App that picks your file up from Azure Storage and sends it to an FTP site, then call the Logic App using a Data Factory Web Activity.
Make sure you do some error handling in your Logic App so it returns a 400 if the FTP step fails.
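The Web Activity treats any non-2xx response as a failure, which is what makes the 400 useful. Conceptually, the call it makes is equivalent to this Python sketch (the trigger URL and payload are placeholders):

    import requests

    # HTTP trigger URL of the Logic App (contains its own SAS signature).
    LOGIC_APP_URL = "https://<region>.logic.azure.com/workflows/<id>/triggers/manual/paths/invoke?<sig>"

    resp = requests.post(LOGIC_APP_URL, json={"blobPath": "output/report.csv"})
    # If the Logic App returns 400 because the FTP step failed, this raises,
    # just as the Web Activity would mark the pipeline activity as failed.
    resp.raise_for_status()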

Azure Blob - Multiple files into one zip file before downloading

I'm currently using Azure Blob Storage to store files, and I upload/download them from an ASP.NET application hosted outside of Azure. (I do not have a Web Role or Worker Role.)
Is it possible to zip multiple files into one zip file within Azure Blob Storage before downloading?
Thanks in advance!
The only way to achieve this would be to use a Windows Azure compute role in the cloud. You obviously wouldn't want to do it on your on-premises servers, as you'd round-trip the files twice.
One approach you might consider would be to build a download 'client' in Silverlight. This could handle the communication with blob storage, pull down the blobs (maybe in parallel), and then create the zip client-side for saving.
But the short answer is that this is not possible using Windows Azure Storage alone.
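For what the compute tier would actually do, here is a minimal Python sketch of building the zip server-side from several blobs (assuming the azure-storage-blob package; container and blob names are placeholders):

    import io
    import zipfile

    from azure.storage.blob import ContainerClient

    container = ContainerClient.from_connection_string(
        "<storage-connection-string>", container_name="documents")

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name in ["a.txt", "b.txt", "c.txt"]:
            zf.writestr(name, container.download_blob(name).readall())

    buffer.seek(0)
    # 'buffer' now holds the finished zip; stream it back as the HTTP response.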
