How to unzip a .gz file from blob storage in Azure Data Factory?

I have a folder (say folder A) in blob storage that contains more than one zipped file (.gz format). I want to unzip all the files and save them back to another folder (say folder B) in blob storage.
This is the approach I was trying: GetMetadata --> ForEach loop.
Inside the ForEach loop, I tried a Copy activity. However, the unzipped file comes out corrupted.

Glad to hear the issue is resolved now:
"Actually the issue was file extension. I added file extension and it
works for me now. "
I help you post it and others can know that. This can be beneficial to other community members.
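If the pipeline route keeps producing corrupted output, the same decompression can also be done outside Data Factory. A minimal sketch using the Azure.Storage.Blobs SDK and GZipStream; the connection string, container, and folder names are placeholders:

// Decompress every .gz blob in folderA and write the result to folderB.
// Placeholder names throughout; assumes the Azure.Storage.Blobs package.
using System.IO;
using System.IO.Compression;
using Azure.Storage.Blobs;

var container = new BlobContainerClient("<connection-string>", "mycontainer");

await foreach (var item in container.GetBlobsAsync(prefix: "folderA/"))
{
    if (!item.Name.EndsWith(".gz")) continue;

    var source = container.GetBlobClient(item.Name);
    var target = container.GetBlobClient("folderB/" + Path.GetFileNameWithoutExtension(item.Name));

    using var compressed = await source.OpenReadAsync();
    using var gzip = new GZipStream(compressed, CompressionMode.Decompress);
    using var buffer = new MemoryStream();
    await gzip.CopyToAsync(buffer);
    buffer.Position = 0;
    await target.UploadAsync(buffer, overwrite: true);
}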

Related

XLSX files in Azure blob storage get downloaded as zip files

We have some files in our Azure blob storage - they are all xlsx files.
When we download them via the Azure portal (we navigate to the storage account, then to the container, and then select a file and download it), it downloads and saves as a zip file.
If, after downloading, we change its extension to xlsx, then Excel will recognize it and open it without issues. However, something is forcing that extension to change from xlsx (as we see it in the container) to .zip while it is downloaded.
The same happens when we access the files programmatically (via C# code) or generate a shared access signature.
What could it be and how to fix it?
Thanks!
My workaround when accessing xlsx files programmatically with C# is to manually add the MIME type specifically for the xlsx file type, as those were the ones giving me issues (PDFs and pictures work fine). PS: I store file names in my DB along with the corresponding file data, i.e.
// Serve xlsx downloads with the correct MIME type so the browser keeps the extension.
if (YourModel.FileName.EndsWith("xlsx"))
{
    return File(YourModel.FileData, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
}
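A likely root cause is that the blobs were uploaded without an explicit ContentType, so the service serves them with a generic type and the browser renames the download. A minimal sketch of fixing the content type on the blob itself, assuming the Azure.Storage.Blobs SDK (connection string, container, and blob names are placeholders):

// Set the correct Content-Type on an existing xlsx blob.
// Placeholder names; assumes the Azure.Storage.Blobs package.
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

var blob = new BlobClient("<connection-string>", "mycontainer", "report.xlsx");

await blob.SetHttpHeadersAsync(new BlobHttpHeaders
{
    ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
});

Note that SetHttpHeadersAsync replaces all of the blob's HTTP headers, so set any other headers you rely on in the same call.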

Unzip and Rename underlying File using Azure Logic App

Is it possible to rename the underlying file while unzipping using a Logic App? I am calling an HTTP action to download a ZIP file. That ZIP contains only one underlying file, with some value appended to its name. I want to store the unzipped file under a better name so that it can be used further. Is it possible?
Incoming ZIP File --> SAMPLEFile.ZIP
Underlying File --> SampleTextFile20200824121212.TXT
Desired File --> SampleTextFile.TXT
Suggestions?
As far as I know, we can't implement this requirement directly in the "Extract archive to folder" action. We can only rename the file by copying it from one folder to another after extraction.
You can create a ticket on the Azure feedback page to ask the Azure team for this feature.
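If stepping outside the Logic App is acceptable (for example, in an Azure Function called from the workflow), the extract-and-rename is a few lines of C#. A minimal sketch using HttpClient and ZipArchive; the URL and file names are placeholders taken from the question:

// Download a ZIP with a single entry and save that entry under a fixed name.
// Placeholder URL and names; uses only the built-in System.IO.Compression.
using System.IO;
using System.IO.Compression;
using System.Net.Http;

using var http = new HttpClient();

// ZipArchive needs a seekable stream, so buffer the download first.
using var buffer = new MemoryStream(await http.GetByteArrayAsync("https://example.com/SAMPLEFile.ZIP"));

using var archive = new ZipArchive(buffer, ZipArchiveMode.Read);
var entry = archive.Entries[0]; // e.g. SampleTextFile20200824121212.TXT

using var source = entry.Open();
using var target = File.Create("SampleTextFile.TXT");
await source.CopyToAsync(target);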

How to rename file in BLOB container

Can you please let me know if it is possible to rename a file in a BLOB container using the SDK. In most of the suggestions it is recommended to create a new BLOB, copy the content of the old BLOB, and then delete the old BLOB. However, due to the size of the files, I do not think that will be an ideal solution in the architecture I am working with. So I am searching for an alternate way of achieving the result.
Kind Regards,
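There is no true rename operation in the Blob service, but the copy in the copy-then-delete approach can be a server-side copy, so the file contents never pass through your client regardless of size. A minimal sketch assuming the Azure.Storage.Blobs SDK (connection string and blob names are placeholders):

// "Rename" a blob with a server-side copy followed by a delete.
// Placeholder names; the data itself never leaves the storage service.
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

var container = new BlobContainerClient("<connection-string>", "mycontainer");
var source = container.GetBlobClient("old-name.dat");
var target = container.GetBlobClient("new-name.dat");

// Within the same account the service can read the source directly.
var operation = await target.StartCopyFromUriAsync(source.Uri);
await operation.WaitForCompletionAsync();

await source.DeleteAsync(DeleteSnapshotsOption.IncludeSnapshots);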

How to create a zip file from Azure blob storage container files using Pipeline

I have some dynamically created files in a blob storage container. I want to send them through email as a single attachment.
The total file size is less than 5 MB.
But the difficulty I am facing is that when I try to compress the files using the Copy Data options, the compressed/zipped file is not created properly when there are multiple files.
If I try to zip a single file by giving its full path and file name, it works fine. But when I give a folder name to compress all the files in that folder, it does not work correctly.
Please note that here I am not using any kind of external C# code or libraries.
Any help appreciated
Thank you
You can reference my settings in the Data Factory Copy activity:
[The original answer showed screenshots of the source settings, source dataset settings, sink settings, sink dataset settings, the successful pipeline run, and the resulting zip file in the container containerleon.]
Hope this helps.

How can I decompress my .zip file and store in ADL/Blob storage?

I have an FTP server as the source connection, where some files are zip files and others are not compressed. I want to copy the files from FTP, decompress the zip files, and put all the files into Azure Data Lake or Azure Blob Storage, whichever makes it possible to get them decompressed.
I'm using a Copy Data activity with FTP as the source, the compression properties set to ZipDeflate and Fastest, and binary copy enabled; on the sink side, I'm just defining the destination ADL path. The files are getting copied to ADL, but only in compressed form.
Please let me know if it's possible to achieve the above objective using the Copy activity.
Using binary copy is your issue here; Data Factory won't inspect the data it is moving, so it can't decompress it. Try the same setup without binary copy!
Hope this helped!
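If some archives still land compressed, another option is to expand the zip blobs in code after the copy. A minimal sketch assuming the Azure.Storage.Blobs SDK and System.IO.Compression; container, folder, and blob names are placeholders:

// Expand every entry of a zipped blob into separate decompressed blobs.
// Placeholder names; assumes the Azure.Storage.Blobs package.
using System.IO;
using System.IO.Compression;
using Azure.Storage.Blobs;

var container = new BlobContainerClient("<connection-string>", "landing");
var zipBlob = container.GetBlobClient("incoming/archive.zip");

// ZipArchive needs a seekable stream, so buffer the blob first.
using var buffer = new MemoryStream();
await zipBlob.DownloadToAsync(buffer);
buffer.Position = 0;

using var archive = new ZipArchive(buffer, ZipArchiveMode.Read);
foreach (var entry in archive.Entries)
{
    var target = container.GetBlobClient("decompressed/" + entry.Name);
    using var entryBuffer = new MemoryStream();
    using (var entryStream = entry.Open())
        await entryStream.CopyToAsync(entryBuffer);
    entryBuffer.Position = 0;
    await target.UploadAsync(entryBuffer, overwrite: true);
}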
