What is the purpose of having two folders in Azure Data Lake Analytics?

I am a newbie to Azure Data Lake.
The screenshot below shows two folders (Storage Account and Catalog), one for Data Lake Analytics and the other for Data Lake Store.
My question is: what is the purpose of each folder, and why do we use U-SQL for transformations when this can be done in Data Factory?
Please explain the data flow process from the data store to the data lake.
Thank you,
Addy

I have addressed your query on the MSDN thread:
https://social.msdn.microsoft.com/Forums/en-US/f8405bdb-0c85-4d37-8f2e-0dab983c7f94/what-is-the-purpose-of-having-two-folders-in-azure-datalake-analytics?forum=AzureDataLake
Hope this helps.

Related

Using Azure Data Factory to migrate Salesforce data to Dynamics 365

I'm looking for some advice around using Azure Data Factory to migrate data from Salesforce to Dynamics 365.
My research has turned up plenty of articles about moving Salesforce data to sinks such as Azure Data Lake or Blob storage, and also articles that describe moving data from Azure Data Lake or Blob storage into D365.
I haven't found any examples where the source is Salesforce and the sink is D365.
Is it possible to do it this way, or do I need to copy the Salesforce data to an intermediate sink such as Azure Data Lake or Blob storage and then use that as the source of a copy/data flow that sends it on to D365?
I will need to perform transformations on the SF data before storing it in D365.
Thanks
I would recommend adding ADLS Gen2 as a staging layer between Salesforce and D365.
I am afraid a direct copy with D365 as the sink cannot be done.

Azure Lake to Lake transfer of files

My company has two Azure environments. The first one was a temporary environment and is being repurposed or decommissioned, I'm not sure which. All I know is I need to get files from a Data Lake in one environment to a Data Lake in another. I've looked at AdlCopy and AzCopy, and neither seems like it will do what I need. Has anyone encountered this before, and if so, what did you use to solve it?
Maybe you can think about Azure Data Factory; it can help you transfer files or data from one Azure Data Lake to another.
You can reference Copy data to or from Azure Data Lake Storage Gen2 using Azure Data Factory.
This article outlines how to use Copy Activity in Azure Data Factory to copy data to and from Data Lake Storage Gen2. It builds on the Copy Activity overview article that presents a general overview of Copy Activity.
For example, you can learn from this tutorial: Quickstart: Use the Copy Data tool to copy data.
In this quickstart, you use the Azure portal to create a data factory. Then, you use the Copy Data tool to create a pipeline that copies data from a folder in Azure Blob storage to another folder.
Hope this helps.
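If you prefer to script a one-off transfer yourself rather than build a pipeline, the sketch below shows the same idea in Python, assuming both lakes are ADLS Gen2 and using the azure-storage-file-datalake SDK. The account URLs, keys, and the "raw" filesystem name are placeholders, not values from the question.

```python
# Rough sketch (not the Copy Activity itself): copy every file in one ADLS Gen2
# filesystem into another account. Account URLs, keys, and the filesystem name
# are placeholders.
from azure.storage.filedatalake import DataLakeServiceClient

src = DataLakeServiceClient(
    account_url="https://sourceaccount.dfs.core.windows.net",
    credential="<source-account-key>",
)
dst = DataLakeServiceClient(
    account_url="https://destaccount.dfs.core.windows.net",
    credential="<dest-account-key>",
)

src_fs = src.get_file_system_client("raw")
dst_fs = dst.get_file_system_client("raw")

for item in src_fs.get_paths(recursive=True):
    if item.is_directory:
        continue  # parent folders are created implicitly when the file is written
    data = src_fs.get_file_client(item.name).download_file().readall()
    dst_fs.get_file_client(item.name).upload_data(data, overwrite=True)
    print(f"copied {item.name} ({len(data)} bytes)")
```

For anything large, the Data Factory Copy Activity (or AzCopy) will be faster and restartable; this sketch is only meant to show the shape of the operation.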

Are we able to use SnappyData to update a record in Azure Data Lake, or is Azure Data Lake append-only?

I am currently working on an Azure Data Lake integration with SnappyData. My question: are we able to update data written from SnappyData to Azure Data Lake storage, or can we only append to Azure Data Lake storage? I searched the forums but couldn't find a proper solution. If anyone knows about this, please share. Thank you.
Azure Data Lake Store, much like HDFS, is an append-only store. You can append to a file or replace it altogether. There is no way to update an existing file.
I've achieved MERGE-style behaviour in U-SQL by using an Azure Data Lake table as the middle ground between input and output. Check out my blog post with the code showing how I did it with a series of joins.
https://www.purplefrogsystems.com/paul/2016/12/writing-a-u-sql-merge-statement/
This will give you append behaviour in your output.
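To make the append-versus-replace point concrete, here is a minimal Python sketch using the azure-datalake-store (Gen1) SDK rather than U-SQL; the tenant/client IDs, store name, and file path are placeholders. The second half shows that an "update" has to be emulated by rewriting the file.

```python
# Sketch of ADLS (Gen1) append-only semantics with the azure-datalake-store SDK.
# All credentials, the store name, and the file path are placeholders.
from azure.datalake.store import core, lib

token = lib.auth(
    tenant_id="<tenant-id>",
    client_id="<client-id>",
    client_secret="<client-secret>",
)
adls = core.AzureDLFileSystem(token, store_name="mydatalakestore")

# Appending new bytes to the end of an existing file is supported...
with adls.open("/logs/events.csv", "ab") as f:
    f.write(b"2017-01-01,new-row\n")

# ...but an in-place update is not: to "update" a record you read the file,
# change it in memory, and write the whole file back (or write a new file).
with adls.open("/logs/events.csv", "rb") as f:
    rows = f.read().decode("utf-8").splitlines()
rows = [r.replace("old-value", "new-value") for r in rows]
with adls.open("/logs/events.csv", "wb") as f:
    f.write(("\n".join(rows) + "\n").encode("utf-8"))
```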

Where are Azure Data Lake Analytics databases stored?

I created a database with some tables through a U-SQL script run through the Azure Data Lake Tools for Visual Studio (see screenshot below). Is that database stored in the Data Lake Store?
The file structure as shown in the Azure portal
Some of the artifacts related to databases are stored in the Azure Data Lake Store. However, not all of the artifacts related to databases are stored in the associated ADLS account. More specifically, some of the metadata associated with the databases is stored in an ADL service-managed internal location that is not directly accessible to you. What you will see in the ADLS account is the data associated with the tables and databases, in an internal format. Hope this information is useful.
Thanks,
Amit
In addition to Amit's answer:
The data that is stored in the store is kept in the \catalog folder of your default ADLS account. It will be charged at the same rate as the remaining data.
The cost of the data that is stored in the internal metadata service is internalized into the ADLA COGS calculations.
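If you want to look at that \catalog folder yourself, here is a small sketch using the azure-datalake-store Python SDK; the credentials and store name are placeholders, and the contents under /catalog are in an internal format, so treat them as read-only.

```python
# Sketch: list the root and the \catalog folder of the default ADLS account
# that backs an ADLA account. Credentials and the store name are placeholders.
from azure.datalake.store import core, lib

token = lib.auth(
    tenant_id="<tenant-id>",
    client_id="<client-id>",
    client_secret="<client-secret>",
)
adls = core.AzureDLFileSystem(token, store_name="mydatalakestore")

print(adls.ls("/"))          # your own folders plus system folders such as /catalog
print(adls.ls("/catalog"))   # internal-format data for U-SQL databases and tables
```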

How to transfer CSV files from Google Cloud Storage to Azure Data Lake Store

I'd like to have our daily CSV log files transferred from GCS to Azure Data Lake Store, but I can't really figure out the easiest way to do it.
Is there a built-in solution for that?
Can I do that with Data Factory?
I'd rather avoid running a scheduled VM that does this through the APIs. The idea comes from the GCS -> (Dataflow ->) BigQuery solution.
Thanks for any ideas!
Yes, you can move data from Google Cloud Storage to Azure Data Lake Store using Azure Data Factory by developing a custom copy activity. However, in this activity you will still be using APIs to transfer the data. See details in this article.
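Whether the logic runs inside a Data Factory custom activity or as a small scheduled job, the underlying API calls look roughly like this Python sketch, which pulls CSVs from a GCS bucket with google-cloud-storage and writes them to an ADLS (Gen1) store with azure-datalake-store. The bucket name, prefix, store name, and credentials are placeholders.

```python
# Sketch of the "use the APIs directly" approach: download each CSV from a GCS
# bucket and upload it to Azure Data Lake Store (Gen1). All names are placeholders.
from google.cloud import storage
from azure.datalake.store import core, lib

gcs = storage.Client()  # picks up GOOGLE_APPLICATION_CREDENTIALS
token = lib.auth(
    tenant_id="<tenant-id>",
    client_id="<client-id>",
    client_secret="<client-secret>",
)
adls = core.AzureDLFileSystem(token, store_name="mydatalakestore")

bucket = gcs.bucket("daily-logs")
for blob in bucket.list_blobs(prefix="logs/"):
    if not blob.name.endswith(".csv"):
        continue
    with adls.open(f"/raw/gcs/{blob.name}", "wb") as out:
        out.write(blob.download_as_bytes())
    print(f"transferred {blob.name}")
```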
