Want to set up settings data in Windows Azure Stream Analytics - azure

I need help setting up Reference data in Stream Analytics. I want to add my application's settings (default) data to Stream Analytics. I can add the reference data, and via "Upload sample file" I can upload a JSON or CSV file. However, when I run a join query it returns 0 rows, as if no reference data has been stored (so NULL values with a LEFT OUTER JOIN).
I investigated the issue and I think it is caused by the Path Pattern, but I do not know much about it.

Based on your description, you are sure that the issue is caused by the Path Pattern / Path Prefix Pattern, but I cannot give a concrete suggestion without more details, such as a screenshot of your Path Pattern setting.
So I will just list some resources as references; I hope these help you resolve your issue.
Two screenshots about the Path Prefix Pattern / Path Pattern, which are introduced in Links 1 & 2.
The sample Use Stream Analytics to process exported data from Application Insights shows how to read stream data from Blob Storage in its section Create an Azure Stream Analytics instance; the steps are similar for Reference data.
Hope it helps.

The issue was due to an improperly formatted JSON file.
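For anyone hitting the same symptom, here is a minimal sketch of the kind of query involved, assuming a streaming input aliased inputstream, a reference input aliased refdata, an output aliased output, and a join key DeviceId (all of these names are hypothetical). If the reference blob cannot be located via the Path Pattern, or is not valid JSON (typically an array of objects), the reference side of the join simply comes back as NULL:

SELECT
    i.DeviceId,
    i.Reading,
    r.DefaultSetting  -- NULL here usually means the reference blob was not found or could not be parsed
INTO
    [output]
FROM
    [inputstream] i
LEFT OUTER JOIN
    [refdata] r ON i.DeviceId = r.DeviceId

Note that a reference data join in Stream Analytics does not need the DATEDIFF condition that stream-to-stream joins require, so if the refdata columns are always NULL, the first things to check are the Path Pattern and the formatting of the reference file.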

Related

How to configure path for Delta Live Table in cloud_files

I am new to Databricks Delta Live Tables. I have some small doubts and need your help to understand the concept behind them; I am unable to proceed without this.
I have a file in an Azure Data Lake container, and I know that I need to give the path under "cloud_files" so that the Delta Live table can read the files from this folder and show them. But my doubt is: if I give only the path, how do I specify the storage account name and container name? Also, do I need to provide an access key in order to read the data securely?
I think I am missing something. I have gone through various articles and YouTube demo videos, and everywhere they just mention the path but do not explain how to configure it.
Please help me to understand this concept.
Thank You.
This is my code for the Delta Live table:
CREATE LIVE TABLE customers_raw
COMMENT "This is raw table"
AS
SELECT *
FROM cloud_files("/raw_data/customers.csv", "csv")
You need to specify the full URL for this folder, like abfss://<container>@<storage>.dfs.core.windows.net/raw_data/customers.csv. Otherwise, if you specify it as /raw_data/customers.csv, it will be treated as a folder on DBFS and will fail. Please note that in this case you will need to set up the corresponding Spark properties so DLT can access the data - you can find them in the following answer.
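For illustration, here is a minimal sketch of the table definition with the full URL, assuming a hypothetical container named raw and a hypothetical storage account named mystorage; cloud_files (Auto Loader) is normally pointed at a folder rather than a single file and declared as a streaming live table:

-- Hypothetical container and storage account names; adjust to your environment
CREATE OR REFRESH STREAMING LIVE TABLE customers_raw
COMMENT "This is the raw table"
AS SELECT *
FROM cloud_files("abfss://raw@mystorage.dfs.core.windows.net/raw_data/", "csv")

The credentials themselves are not part of the path; they are supplied as Spark properties in the DLT pipeline configuration (for example, an account key under fs.azure.account.key.<storage>.dfs.core.windows.net), as mentioned in the answer referenced above.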

How to use a tab-delimited UTF-16LE file as source in a Microsoft Azure Data Factory dataflow

I am working for a customer in the medical business (so excuse the many redactions in the screenshots). I am pretty new here, so please excuse any mistakes I might make.
We are trying to fill a SQL database table with data coming from two different sources (CSV files). Both are delivered to a Blob storage account where we have read access.
The first flow I built to do this with Azure Data Factory works perfectly, so I thought I would just clone that flow and point it at the second source. However, the CSV files from the second source are tab-delimited and UTF-16LE encoded. Luckily, you can set these parameters when you create a dataset:
Dataset Settings
When I verify the dataset by using the "Preview Data" option, I see a nice list with data coming from the CSV file (Output from preview data), so it appears to work fine!
Now I create a new dataflow, and in the source I use the newly created data source. I left all settings at their defaults. (data flow settings)
Now when I open Data Preview and click refresh, I get garbage and NULL outputs instead of the nice data I received when testing the data source (output from source block in dataflow). The first dataflow I created does produce the expected data from the CSV file, but somehow the data is now scrambled?
Could someone please help me with what I am missing or doing wrong here?
I tried to repro this, and as you can see, if you set the dataset's Encoding to UTF-8 instead of UTF-16, you will be able to preview the data.
Data Preview inside the Dataflow:
Even when I enable UTF-16LE for the encoding, I still get such issues:
Hence, for now, you could change the Encoding and use the pipeline.

Copying data using Data Copy into individual files for blob storage

I am entirely new to Azure, so if this is easy please just tell me to RTFM, but I'm not used to the terminology yet so I'm struggling.
I've created a data factory and pipeline to copy data, using a simple query, from my source data. The target data is a .txt file in my blob storage container. This part is all working quite well.
Now, what I'm attempting to do is to store each row that's returned from my query into an individual file in blob storage. This is where I'm getting stuck, and I'm not sure where to look. This seems like something that'll be pretty easy, but as I said I'm new to Azure and so far am not sure where to look.
You can set Max rows per file to 1 in the Sink settings and leave the file name unset in the sink dataset. If you need to, you can specify a file name prefix in the File name prefix setting.
Screenshots:
The dataset of sink
Sink setting in the copy data activity
Result:

Azure Data Factory - Recording file name when reading all files in folder from Azure Blob Storage

I have a set of CSV files stored in Azure Blob Storage. I am reading the files into a database table using the Copy Data task. The Source is set to the folder where the files reside, so it grabs each file and loads it into the database. The issue is that I can't seem to map the file name in order to read it into a column. I'm sure there are more complicated ways to do it, for instance first reading the metadata and then reading the files in a loop, but surely the file metadata should be available to use while traversing the files?
Thanks
This is not possible in a regular copy activity. Mapping Data Flows has this possibility; it's still in preview, but maybe it can help you out. If you check the documentation, you will find an option to specify a column to store the file name.
It looks like this:

Export Sharepoint list to .csv and upload to Azure Data Lake Using Flow

I am trying to use Microsoft Flow to export a SharePoint list to Azure Data Lake.
I want it so that any time a particular online list is changed, its entire contents are loaded into a file in Data Lake. If the file already exists, I want to overwrite it. Can someone please explain how I can go about doing this? I have tried multiple ways, but they are not getting the job done.
Thanks
I was able to get the items in the SharePoint list to near perfection. I will post the Flow here in case anyone needs it in the future.
What I did is that every 5 minutes I "create" a file in Azure Data Lake, which overwrites the file if it already exists. The content of the file cannot be blank, so I added a newline as its content. Then I use Get items to retrieve all the items in the SharePoint list. From there, using an Apply to each loop, I append the content of the current row of the SharePoint list to the Data Lake file (fields separated by | and ending with a newline after all the content of the row is added). This works to near perfection, with the only caveat being the newline at the beginning of the file, which I eliminate using Power Query.
This is exactly what I needed. If anybody sees a way to make this better, please post so that we can get this to perfection.
