Azure data factory sftp - azure

I copied a zip file from an SFTP location; the zip contains a CSV file. When the file arrived in Azure Blob storage, its extension was .zip.txt. Can someone explain why this happens and how I can get the CSV as it is?

Have you tried using the "compression type" option?
This will work for legacy zip. If the zip uses AES encryption or is password-protected, you will need a custom activity that does the unzipping in an Azure Function with some code inside.
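As a rough illustration of the "some code inside" part, here is a minimal Python sketch of the unzip step such a function might run. It is not an ADF or Azure Functions API: the function name `extract_csv_members` is hypothetical, and reading the zip bytes from the blob and writing the results back is left out. Note that the standard-library `zipfile` module can only decrypt legacy ZipCrypto passwords; AES-encrypted archives would need a third-party library.

```python
import io
import zipfile


def extract_csv_members(zip_bytes, password=None):
    """Return {member_name: file_bytes} for every .csv entry in a zip archive.

    Hypothetical helper: `password` must be bytes and only works for
    legacy ZipCrypto encryption, not AES.
    """
    extracted = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            # Skip directory entries and anything that is not a CSV file.
            if info.is_dir() or not info.filename.lower().endswith(".csv"):
                continue
            extracted[info.filename] = archive.read(info, pwd=password)
    return extracted
```

Each returned entry keeps its original member name, so the CSV can be written to the sink with a .csv extension rather than .zip.txt.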

Related

Generate a password protected CSV file using Azure Data Factory

I want to password-protect a CSV file using Azure Data Factory. The file is located in a Data Lake, and I have full access to the Data Lake folders.
Is there any way that I can do this in ADF?
Thanks in advance.
Kind regards,
Arjun Rathinam
I'm pretty sure you can't password-protect a CSV file by itself, as mentioned here: https://www.codeproject.com/Questions/778720/How-I-can-make-csv-file-password-protected. It's just a text file, like .txt files.
It would need to be wrapped in a file format that does support encryption and password protection, a couple options mentioned here: https://www.import2.com/csv/how-to-encrypt-a-csv-file.
Option 1: Compress your CSV file to a zip folder, and then encrypt that folder.
Option 2: Import the CSV file into a spreadsheet software (like Excel or Numbers) and then add your password encryption.

XLSX files in azure blob storage get downloaded as zip files

We have some files in our Azure blob storage - they are all xlsx files.
When we download them via Azure portal (we navigate to the storage account, then to the container, and then select a file and download it) it downloads and saves as zip file.
If after downloading we change its extension to xlsx then Excel will recognize it and open without issues. However, something is forcing that extension to change from xlsx (as we see it in the container) to the .zip whilst it is downloaded.
The same happens when we access the files programmatically (via c# code) or generate a shared access signature.
What could it be and how to fix it?
Thanks!
My workaround when accessing xlsx files programmatically with C# is to set the MIME type explicitly for the xlsx file type, as those were the ones giving me issues (PDFs and pictures work fine). P.S. I store each file in my DB with its corresponding file name, i.e.:
if (YourModel.FileName.EndsWith("xlsx"))
{
    return File(YourModel.FileData, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
}
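An xlsx file is itself a zip package, which is one plausible reason a client falls back to .zip when the content type is missing or generic. The same idea as the C# workaround can be sketched in Python as a small lookup from extension to MIME type. The table `CONTENT_TYPES` and the function `content_type_for` are hypothetical names for illustration; how the value is then attached to the blob or the HTTP response depends on the SDK or framework in use.

```python
import os

# Hypothetical lookup table; extend it with whatever file types you store.
CONTENT_TYPES = {
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".csv": "text/csv",
    ".zip": "application/zip",
}


def content_type_for(filename, default="application/octet-stream"):
    """Pick a MIME type from the file extension, case-insensitively."""
    extension = os.path.splitext(filename)[1].lower()
    return CONTENT_TYPES.get(extension, default)
```

Matching on the full extension (".xlsx") rather than a bare suffix ("xlsx") avoids accidentally matching names that merely end in those letters.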

How to unzip .gz file from blob storage in Azure Data Factory?

I have a folder (say folder A) in blob storage that contains several compressed files (.gz format). I want to unzip all the files and save them to another folder (say folder B) in blob storage.
This is the approach I was trying: GetMetadata --> ForEach loop.
Inside the ForEach loop, I tried a Copy activity. However, the unzipped file is corrupted.
Glad to hear the issue is resolved now:
"Actually the issue was file extension. I added file extension and it works for me now."
I am posting it as an answer so that others can find it. This can be beneficial to other community members.
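To check outside ADF whether a .gz file really decompresses cleanly, and to see what the output name should look like once the extension is handled, a minimal Python sketch could look like this. The function name `gunzip_blob` and the example paths are hypothetical; downloading from folder A and uploading to folder B are left out.

```python
import gzip


def gunzip_blob(name, data):
    """Decompress one gzip payload and derive the output name by dropping '.gz'."""
    # "sales.csv.gz" becomes "sales.csv"; names without ".gz" are kept unchanged.
    out_name = name[:-3] if name.lower().endswith(".gz") else name
    return out_name, gzip.decompress(data)
```

If `gzip.decompress` succeeds here but the file still looks corrupted after the pipeline runs, the output file name and extension on the sink are a likely place to look, as in the resolution quoted above.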

Uploading Excel to Azure storage is corrupting the file and providing a Security warning as well

I am uploading an Excel memorystream to Azure Storage as a blob. Blob is saved successfully but corrupted while opening or downloading. Tested once with Excel
This produces a security warning every time for the .csv files, but the file opens normally after that.
The same MemoryStream works fine locally, as I am able to convert it into Excel/CSV with no errors.
Any Help!!
Got the answer after some Google.
I was uploading an Excel/CSV file to Azure Storage, and opening the file, especially a .csv, produced a security warning. But the same MemoryStream was working fine locally.
Got some interesting answer here:
"It is possible for .csv files to contain potentially malicious code, so we purposely don't include it in our list of safe-to-open files.
In a future update, we could provide a way to customize the list of files a user would consider safe to open."
The link is: https://github.com/microsoft/AzureStorageExplorer/issues/164

How can I decompress my .zip file and store in ADL/Blob storage?

I have an FTP source connection where some files are zipped and others are not compressed. I want to copy the files from FTP, decompress the zip files, and put all the files into Azure Data Lake or Azure Blob storage, wherever it is possible to decompress them.
I'm using a Copy Data activity where the source is FTP with the properties ZipDeflate, Fastest, and binary copy; on the sink side, I'm just defining the destination ADL path. The files are copied to ADL, but they arrive still in compressed form.
Please let me know if it's possible to achieve this using the Copy activity.
Using binary copy is your issue here: Data Factory won't interpret the data it is moving, so it cannot decompress it. Try the same setup without binary copy!
Hope this helped!
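For the mixed case in the question (some files zipped, others not), the decision itself can be sketched in a few lines of Python, for example in a custom step outside the Copy activity. This is only an illustration under that assumption: `expand_if_zip` is a hypothetical helper, and the FTP download and ADL upload are not shown.

```python
import io
import zipfile


def expand_if_zip(name, data):
    """Return a list of (name, bytes): the zip members if data is a zip archive,
    otherwise the original file unchanged."""
    # Detect by content rather than by extension, so mislabeled files are handled too.
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return [(name, data)]
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return [
            (info.filename, archive.read(info))
            for info in archive.infolist()
            if not info.is_dir()
        ]
```

Be aware that formats built on zip, such as xlsx, would also be detected as archives by this check, so an extension filter may be needed in front of it.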

Resources