Lineage not created when scanning Delta table in Azure Purview - Databricks

A Delta table is created from Databricks under an Azure Blob Storage container by providing its mount path. When it is scanned in Azure Purview using the Azure Blob Storage asset, no lineage is generated.
Any suggestion on how to achieve this would be helpful.
Alternatively, support for scanning a Databricks Delta table directly would also be appreciated.

Purview does not support lineage from Azure Blob Storage or Databricks yet.
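For context, the setup described in the question (writing a Delta table from a Databricks notebook into a mounted blob container) would look roughly like the sketch below; the storage account, container, mount point, and secret scope are all hypothetical, not values from the question.

```python
# Runs inside a Databricks notebook, where `spark` and `dbutils` are provided.
# The storage account, container, mount point, and secret scope are placeholders.
storage_account = "mystorageaccount"
container = "mycontainer"
mount_point = "/mnt/delta-demo"

# Mount the blob container once (skipped if it is already mounted).
if not any(m.mountPoint == mount_point for m in dbutils.fs.mounts()):
    dbutils.fs.mount(
        source=f"wasbs://{container}@{storage_account}.blob.core.windows.net/",
        mount_point=mount_point,
        extra_configs={
            f"fs.azure.account.key.{storage_account}.blob.core.windows.net":
                dbutils.secrets.get(scope="demo-scope", key="storage-key")
        },
    )

# Write a small DataFrame as a Delta table under the mount path.
df = spark.range(0, 100).withColumnRenamed("id", "value")
df.write.format("delta").mode("overwrite").save(f"{mount_point}/sales_delta")
```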

Related

Copy Data from Azure Data Lake to Snowflake without stage using Azure Data Factory

All the Azure Data Factory examples of copying data from Azure Data Lake Gen2 to Snowflake use a storage account as a stage. If the stage is not configured (as shown in the picture), I get this error in Data Factory even when my source is a CSV file in Azure Data Lake: "Direct copying data to Snowflake is only supported when source dataset is DelimitedText, Parquet, JSON with Azure Blob Storage or Amazon S3 linked service, for other dataset or linked service, please enable staging".
At the same time, the Snowflake documentation says that the external stage is optional. How can I copy data from Azure Data Lake to Snowflake using Data Factory's Copy Data activity without having an external storage account as a stage?
If staging storage is needed to make it work, we shouldn't say that copying data from Data Lake to Snowflake is supported. It works only when the Data Lake data is first copied to a storage blob and then to Snowflake.
Though Snowflake supports Blob Storage, Data Lake Storage Gen2, and general-purpose v1 & v2 storage accounts, loading data into Snowflake is supported through Blob Storage only.
The source linked service must be Azure Blob Storage with shared access signature (SAS) authentication. If you want to copy data directly from Azure Data Lake Storage Gen2 in a supported format, you can create an Azure Blob linked service with SAS authentication against your ADLS Gen2 account to avoid using a staged copy to Snowflake.
Select Azure Blob Storage in the linked service and provide the SAS URI details of the Azure Data Lake Gen2 source file. That gives you a Blob Storage linked service pointing at the Data Lake Gen2 file.
You'll have to configure Blob Storage and use it as staging. As an alternative, you can use an external stage: create a FILE FORMAT and a STORAGE INTEGRATION (or SAS credentials), point an external stage at the ADLS container, and load the data into Snowflake with the COPY command. Let me know if you need more help on this.
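A minimal sketch of that external-stage route, assuming the snowflake-connector-python package; the account, credentials, storage URL, SAS token, and table name below are placeholders, not values from the question.

```python
# pip install snowflake-connector-python
import snowflake.connector

# All connection details and object names below are placeholders.
conn = snowflake.connector.connect(
    account="my_account",
    user="my_user",
    password="my_password",
    warehouse="MY_WH",
    database="MY_DB",
    schema="PUBLIC",
)

cur = conn.cursor()
try:
    # CSV file format matching the source files in ADLS Gen2.
    cur.execute("""
        CREATE FILE FORMAT IF NOT EXISTS my_csv_format
        TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1
    """)

    # External stage pointing at the ADLS Gen2 container; authorized here
    # with a SAS token, though a storage integration works as well.
    cur.execute("""
        CREATE STAGE IF NOT EXISTS my_adls_stage
        URL = 'azure://mystorageaccount.blob.core.windows.net/mycontainer/path/'
        CREDENTIALS = (AZURE_SAS_TOKEN = '<sas-token>')
        FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
    """)

    # Load the staged files into the target table without any ADF staging.
    cur.execute("COPY INTO my_table FROM @my_adls_stage")
finally:
    cur.close()
    conn.close()
```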

Migrate data from Azure Data Lake in one subscription to another

I have been looking for options to migrate data present in my ADLS in one subscription to ADLS in another subscription within Azure. I tried ADF for this purpose and it worked fine.
But the copy speed in ADF is too slow; it copies at 10-15 KB/sec. Is there some way to increase the copy speed while using ADF?
Yes, there is a way you can migrate data from Azure Data Lake between different subscriptions: Data Factory.
Whether it is Data Lake Gen1 or Gen2, Data Factory supports both as connectors. Please refer to these tutorials:
Copy data to or from Azure Data Lake Storage Gen1 using Azure Data Factory.
Copy and transform data in Azure Data Lake Storage Gen2 using Azure Data Factory.
You can create the source and sink datasets in different subscriptions through linked services.
But this option may cost you some money. You could also refer to the AzCopy tutorial: Copy blobs between Azure storage accounts by using AzCopy.
Here is another blog, How To Copy Files From One Azure Storage Account To Another:
In this post, the author outlines how to copy data from one Azure Storage Account in one subscription to another Storage Account in another subscription.
These may be what you're looking for.
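If ADF throughput stays that low, the AzCopy route mentioned above is easy to script. A minimal sketch, assuming the azcopy CLI is installed and both accounts are authorized with SAS tokens; the account names, filesystems, paths, and tokens are placeholders.

```python
# Requires the azcopy CLI on PATH; all URLs and SAS tokens are placeholders.
import subprocess

source = "https://sourceaccount.dfs.core.windows.net/sourcefs/data?<source-sas>"
destination = "https://destaccount.dfs.core.windows.net/destfs/data?<dest-sas>"

# Server-to-server copy of the whole directory tree between the two
# ADLS Gen2 accounts; the data does not pass through the machine running this.
subprocess.run(
    ["azcopy", "copy", source, destination, "--recursive"],
    check=True,
)
```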

Connection between Azure Data Factory and Databricks

I'm wondering what the most appropriate way is to access Databricks from Azure Data Factory.
Currently I've got Databricks as a linked service, which I access via a generated token.
What do you want to do?
Do you want to trigger a Databricks notebook from ADF?
Do you want to supply Databricks with data? (Blob Storage or Azure Data Lake Store)
Do you want to retrieve data from Databricks? (Blob Storage or Azure Data Lake Store)
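For reference, the token-based linked service the question mentions is defined in ADF with a workspace URL, an access token, and cluster settings. A rough sketch of that definition, written as a Python dict for illustration; the workspace URL, token, and cluster ID are placeholders.

```python
# Rough shape of an Azure Databricks linked service definition in ADF,
# expressed as a Python dict for illustration; all values are placeholders.
databricks_linked_service = {
    "name": "AzureDatabricksLinkedService",
    "properties": {
        "type": "AzureDatabricks",
        "typeProperties": {
            # Workspace URL of the Databricks deployment.
            "domain": "https://adb-1234567890123456.7.azuredatabricks.net",
            # Personal access token generated in the workspace; in practice
            # this is usually stored in Azure Key Vault rather than inline.
            "accessToken": {"type": "SecureString", "value": "<generated-token>"},
            # Run notebooks on an existing interactive cluster
            # (alternatively, new-cluster properties can be supplied).
            "existingClusterId": "1234-567890-abcde123",
        },
    },
}
```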

If I delete an Azure Data Lake Analytics account, will it delete its default data source?

I'm fairly new to Azure, and just trying out Azure Data Lake Analytics.
I created a new Azure Data Lake Analytics account for testing purposes and would like to delete it now; however, I used an existing Azure Data Lake Storage (ADLS) account as the default storage account during setup. I now know I probably should have added the existing ADLS account as an associated data store instead.
I assume I can safely delete the Azure Data Lake Analytics account now without affecting the underlying default storage account, but I want to check before I do this, as it would be a massive problem if the existing ADLS account gets deleted.
Any pointers would be much appreciated. Thanks.
The two are separate. Deleting the Azure Data Lake Analytics service will not affect the Azure Data Lake Store.
As a disclaimer, test test test. Set up another instance of both in the same way and then confirm the delete behaviour, just to be 110% sure.
Azure Data Lake Team here. I can positively confirm that deleting the Azure Data Lake Analytics account will NOT delete the default or any linked Azure Data Lake Store account associated with it.

HDInsight and Azure Table Storage

I'm wondering if Azure Table Storage can be used as a data source for Map/Reduce tasks on an HDInsight cluster.
Obviously, data can be exported from Table Storage into a flat file and then imported into HDInsight, but it would be good to have more seamless integration.
This article was published by Mostafa Elhemali from the HDInsight team: http://blogs.msdn.com/b/mostlytrue/archive/2014/04/04/analyzing-azure-table-storage-data-with-hdinsight.aspx
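As for the export-to-flat-file workaround mentioned in the question, here is a minimal sketch using the azure-data-tables package; the connection string, table name, and output path are placeholders.

```python
# pip install azure-data-tables
# Exports an Azure Table Storage table to a CSV file that HDInsight can consume.
# The connection string, table name, and output path are placeholders.
import csv
from azure.data.tables import TableServiceClient

service = TableServiceClient.from_connection_string("<storage-connection-string>")
table = service.get_table_client("mytable")

# Each entity is dict-like; PartitionKey and RowKey come back as properties too.
rows = [dict(entity) for entity in table.list_entities()]

if rows:
    # Table Storage is schema-less, so take the union of all property names.
    fieldnames = sorted({key for row in rows for key in row})
    with open("mytable_export.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```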
