Generate a password-protected CSV file using Azure Data Factory

I want to password-protect a CSV file using Azure Data Factory. The file is located in a Data Lake, and I have full access to the Data Lake folders.
Is there any way I can do this in ADF?
Thanks in advance.
Kind regards,
Arjun Rathinam

I'm pretty sure you can't actually password-protect a CSV file by itself, as mentioned here: https://www.codeproject.com/Questions/778720/How-I-can-make-csv-file-password-protected. It's just a text file, like a .txt file.
It would need to be wrapped in a file format that does support encryption and password protection; a couple of options are mentioned here: https://www.import2.com/csv/how-to-encrypt-a-csv-file.
Option 1: Compress your CSV file into a zip archive, and then encrypt that archive (see the sketch below).
Option 2: Import the CSV file into spreadsheet software (like Excel or Numbers) and then add your password encryption.
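For option 1, note that ADF itself has no activity that produces an encrypted zip, so the compression step would have to run in code, for example in an Azure Function or a Databricks notebook called from the pipeline. A minimal Python sketch using the pyzipper library; the file names and the password are placeholders, not anything ADF provides:

    import pyzipper

    # Wrap the CSV in an AES-encrypted, password-protected zip archive.
    # "report.csv", "report.zip" and the password are placeholders.
    with pyzipper.AESZipFile("report.zip", "w",
                             compression=pyzipper.ZIP_DEFLATED,
                             encryption=pyzipper.WZ_AES) as zf:
        zf.setpassword(b"my-secret-password")
        zf.write("report.csv")

Anyone opening report.zip then needs the password, which is about as close as you can get to a "password-protected CSV".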

Related

Azure data factory sftp

I have copied a zip file from an SFTP location; inside the zip there is a CSV file. But when the file arrived in Azure Blob storage, the extension came out as .zip.txt. Can someone suggest why this is happening and how I can get the CSV as it is?
Have you tried using the "compression type" option?
This will work for legacy zip. If the zip uses AES encryption or a password, you will need a custom activity and will have to do the unzipping in an Azure function with some code inside.
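To illustrate that custom route, here is a hedged sketch of the unzip step in Python with the pyzipper library, assuming the zip bytes have already been downloaded from blob storage (the function name and parameters are placeholders):

    import io
    import pyzipper

    def extract_csvs(zip_bytes: bytes, password: bytes) -> dict:
        """Return {file name: file bytes} for every CSV inside a password/AES-protected zip."""
        extracted = {}
        with pyzipper.AESZipFile(io.BytesIO(zip_bytes)) as zf:
            zf.setpassword(password)
            for name in zf.namelist():
                if name.lower().endswith(".csv"):
                    extracted[name] = zf.read(name)
        return extracted

The extracted bytes can then be written back to blob storage as plain .csv files for the rest of the pipeline to pick up.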

Use Azure Data Factory to copy files and place a csv of files copied

I am trying to implement the following flow in an Azure Data Factory pipeline:
Copy files from an SFTP to a local folder.
Create a comma-separated file in the local folder with the list of files and their sizes.
The first step was easy enough, using a 'Copy Data' step with 'SFTP' as source and 'File System' as sink.
The files are being copied, but in the output of this step, I don't see any file information.
I also don't see an option to create a file using data from a previous step.
Maybe I'm using the wrong technology?
One of the reasons I'm using Azure Data Factory is the integration runtime, which allows us to have a single fixed IP to connect to the external SFTP (easier firewall configuration).
Is there a way to implement step 2?
Thanks for any insight!
There is no built-in feature to achieve this.
You need to use ADF together with another service; I suggest you first use an Azure Function to check the files and then do the copy.
The structure would be: the ADF pipeline calls an Azure Function, and the function code lists the files and writes the CSV.
You can get the sizes of the files and save them to the CSV file:
Get size of files (Python):
How to fetch sizes of all SFTP files in a directory through Paramiko
And use pandas to save the results as CSV (Python):
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html
Writing a pandas DataFrame to CSV file
Simple HTTP trigger of an Azure Function (Python):
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-http-webhook-trigger?tabs=python
(Put the processing logic in the body of the Azure Function. Basically, you can do anything you want in the body of the function, apart from graphical interfaces and a few unsupported things. You can choose the language you are familiar with, but in short, there is no feature in ADF that satisfies your idea.)
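To make that concrete, here is a minimal sketch of such an HTTP-triggered Azure Function that combines the Paramiko and pandas approaches linked above; the host, credentials and folder name are placeholders:

    import paramiko
    import pandas as pd
    import azure.functions as func

    def main(req: func.HttpRequest) -> func.HttpResponse:
        # Connect to the SFTP server (host and credentials are placeholders).
        transport = paramiko.Transport(("sftp.example.com", 22))
        transport.connect(username="user", password="password")
        sftp = paramiko.SFTPClient.from_transport(transport)

        # listdir_attr returns SFTPAttributes objects, which carry st_size in bytes.
        entries = sftp.listdir_attr("/outbound")
        rows = [{"file_name": e.filename, "size_bytes": e.st_size} for e in entries]
        sftp.close()
        transport.close()

        # Build the comma-separated listing with pandas and return it to the caller.
        csv_text = pd.DataFrame(rows).to_csv(index=False)
        return func.HttpResponse(csv_text, mimetype="text/csv")

The ADF pipeline can call this function with an Azure Function activity and then persist the returned CSV, for example with a subsequent Copy activity.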

Uploading Excel to Azure storage is corrupting the file and providing a Security warning as well

I am uploading an Excel memorystream to Azure Storage as a blob. The blob is saved successfully but is corrupted when opening or downloading it. I tested once with Excel.
This produces a security warning every time for the .csv files, but the file opens normally after that.
The same memorystream works fine locally, as I am able to convert the memorystream into Excel/CSV with no errors.
Any help would be appreciated!
Got the answer after some Googling.
I was uploading an Excel/CSV to Azure Storage, and opening the file (especially the .csv) produced a security warning, even though the same memorystream was working fine locally.
I got an interesting answer here:
"It is possible for .csv files to contain potentially malicious code, so we purposely don't include it in our list of safe-to-open files.
In a future update, we could provide a way to customize the list of files a user would consider safe to open."
The link is: https://github.com/microsoft/AzureStorageExplorer/issues/164
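For reference (the thread doesn't show the upload code), here is a minimal Python sketch of uploading an in-memory workbook with an explicit content type; the connection string, container and blob names are placeholders. Forgetting to rewind the stream before uploading is a common cause of the "corrupted file" symptom:

    import io
    from azure.storage.blob import BlobClient, ContentSettings
    from openpyxl import Workbook

    # Build a workbook entirely in memory.
    wb = Workbook()
    wb.active.append(["col1", "col2"])
    stream = io.BytesIO()
    wb.save(stream)
    stream.seek(0)  # rewind, otherwise the blob may be written empty or truncated

    blob = BlobClient.from_connection_string(
        "<connection-string>", container_name="reports", blob_name="report.xlsx")
    blob.upload_blob(
        stream,
        overwrite=True,
        content_settings=ContentSettings(
            content_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"))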

Databricks File Save

I'm using Databricks on Azure and am using a library called OpenPyXl.
I'm running the sample code shown here, and the last line of the code is:
wb.save('document.xlsx', as_template=False)
The code seems to run so I'm guessing it's storing the file somewhere on the cluster. Does anyone know where so that I can then transfer it to BLOB?
To save a file to the FileStore, put it in the /FileStore directory in DBFS:
dbutils.fs.put("/FileStore/my-stuff/my-file.txt", "Contents of my file")
Note: The FileStore is a special folder within the Databricks File System (DBFS) where you can save files and have them accessible from your web browser.
For more details, refer to "Databricks - The FileStore".
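For the OpenPyXl case specifically, a common pattern (sketched here with placeholder paths) is to save the workbook to the driver's local filesystem and then copy it into DBFS or a mounted blob container with dbutils:

    # Runs inside a Databricks notebook; dbutils is provided by the runtime.
    from openpyxl import Workbook

    wb = Workbook()
    wb.active["A1"] = "hello"

    # openpyxl writes to the driver's local disk, not to DBFS.
    local_path = "/tmp/document.xlsx"
    wb.save(local_path)

    # Copy from the driver into the FileStore (or into a mounted blob container).
    dbutils.fs.cp("file:" + local_path, "dbfs:/FileStore/document.xlsx")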
Hope this helps.

how to load local file into Azure SQL DB

I have not been able to find a solution to this, so I will ask the experts.
A co-worker has a .txt file on his laptop that we want to load into Azure SQL DB using SSMS and BULK INSERT. We can open the local file easily enough, but we don't know how to reference this file in the FROM clause.
Assuming a file named myData.txt is saved to
c:\Users\Someone
how do we tell Azure SQL DB where that file is?
You don't. :) You have to upload a file to an Azure Blob Store and then, from there, you can use BULK INSERT or OPENROWSET to open the file.
https://learn.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql?view=sql-server-2017
I've written an article that describes the steps to open a JSON file here:
https://medium.com/#mauridb/work-with-json-files-with-azure-sql-8946f066ddd4
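As a hedged illustration of that pattern from client code, the following Python sketch runs the BULK INSERT through pyodbc; it assumes an external data source named MyAzureBlobStorage has already been created in the Azure SQL database (CREATE EXTERNAL DATA SOURCE ... TYPE = BLOB_STORAGE with a SAS credential), and the connection string and table name are placeholders:

    import pyodbc

    conn = pyodbc.connect(
        "Driver={ODBC Driver 17 for SQL Server};"
        "Server=tcp:myserver.database.windows.net,1433;"
        "Database=mydb;Uid=myuser;Pwd=<password>;Encrypt=yes;")
    cursor = conn.cursor()

    # myData.txt is read from blob storage via the external data source,
    # not from the co-worker's laptop.
    cursor.execute("""
        BULK INSERT dbo.MyTable
        FROM 'myData.txt'
        WITH (DATA_SOURCE = 'MyAzureBlobStorage',
              FORMAT = 'CSV',
              FIRSTROW = 2);
    """)
    conn.commit()
    conn.close()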
I fixed this problem by uploading the file to a local database and then using a linked server to my Azure DB to insert or update the records. Much easier than creating Blob Storage. However, if the file is very big or you have a lot of files to upload, you might not want to use my method, as linked servers are not the quickest connection.
