Azure Data Catalog gen1 allowed us to get metadata from tables inside a Databricks workspace by using an ODBC connection. Do we have ODBC connectivity for Azure Purview too? If so, can someone share the details?
Unfortunately, ODBC connectivity for Azure Purview is not supported.
Reference: Supported data sources and file types in Azure Purview
I would suggest providing feedback on this:
https://feedback.azure.com/forums/932437-azure-purview
All of the feedback you share in these forums will be monitored and reviewed by the Microsoft engineering teams responsible for building Azure.
Related
Unity Catalog is the Azure Databricks data governance solution for the Lakehouse, whereas Microsoft Purview provides a unified data governance solution to help manage and govern your on-premises, multicloud, and software as a service (SaaS) data.
Question: Within the same Azure cloud project, can we use Unity Catalog for the Azure Databricks Lakehouse, and use Microsoft Purview for the rest of our Azure project?
Update: In our current Azure subscription, we have divided the workload as follows:
1. SQL-related workload: we are doing all our SQL database work using Databricks only (no Azure SQL databases are involved). That is, we are using Databricks Lakehouse, Delta Lake, Databricks SQL etc. to perform ETL and all data analytics work.
2. All non-SQL workload: all other assets (Excel files, CSV files, PDFs, media files etc.) are stored in various Azure storage accounts.
MS Purview is doing a good job in scanning assets in scenario 2 above, and it easily creates a holistic, up-to-date map of our data landscape with automated data discovery, sensitive data classification, and end-to-end data lineage. It also enables our data consumers to access valuable, trustworthy data management.
However, almost 50% of our work (SQL, ETL, data analytics etc.) is done in Azure Databricks, where we have significant challenges with Purview. We were wondering if it's possible to keep Purview and Unity Catalog separate as follows: Unity Catalog does the data governance work for scenario 1 only and Purview does the data governance work for scenario 2 only.
This recently released update may resolve our issue of making Purview work better with Azure Databricks, but we have not tried it yet: Connect to and manage Azure Databricks in Microsoft Purview (Preview)
As of right now, there is no official integration between Unity Catalog and Purview, but it may come in the future. You may join the Azure Databricks roadmap webinar tomorrow to get more information.
Regarding the actual question - imho, nothing prevents you from using UC & Purview in the same Azure project.
P.S. You can get metadata & lineage information into Purview by loading data from information schema tables and using Purview APIs to store it in Purview.
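As a rough illustration of that P.S., here is a minimal sketch of turning rows read from `information_schema.columns` into an Atlas-style bulk entity payload and posting it to Purview. Several things here are assumptions, not taken from the answer above: the type names `custom_databricks_table` and `custom_databricks_column` are hypothetical and would need matching type definitions in your Purview account, the `databricks://...` qualified-name scheme is invented for illustration, and the `/catalog/api/atlas/v2/entity/bulk` path should be checked against the current Purview REST documentation before use. The rows are assumed to have been fetched already, for example with a Databricks SQL query.

```python
import json
import urllib.request
from itertools import groupby

# Hypothetical type names: Purview must already contain matching type definitions.
TABLE_TYPE = "custom_databricks_table"
COLUMN_TYPE = "custom_databricks_column"


def build_entities(rows, workspace):
    """Convert information_schema.columns rows into an Atlas-style bulk payload."""
    def table_key(row):
        return (row["table_catalog"], row["table_schema"], row["table_name"])

    entities = []
    next_guid = -1  # negative GUIDs are placeholders for entities created in this request
    for (catalog, schema, table), cols in groupby(sorted(rows, key=table_key), key=table_key):
        # Illustrative qualified-name scheme, not an official Purview convention.
        table_qn = f"databricks://{workspace}/{catalog}/{schema}/{table}"
        table_guid = str(next_guid)
        next_guid -= 1
        entities.append({
            "typeName": TABLE_TYPE,
            "guid": table_guid,
            "attributes": {"qualifiedName": table_qn, "name": table},
        })
        for col in cols:
            entities.append({
                "typeName": COLUMN_TYPE,
                "guid": str(next_guid),
                "attributes": {
                    "qualifiedName": f"{table_qn}#{col['column_name']}",
                    "name": col["column_name"],
                    "data_type": col["data_type"],
                    # Link the column back to its table via the placeholder GUID.
                    "table": {"guid": table_guid},
                },
            })
            next_guid -= 1
    return {"entities": entities}


def upload(payload, endpoint, token):
    """POST the payload to Purview. Endpoint path is an assumption to verify."""
    request = urllib.request.Request(
        f"{endpoint}/catalog/api/atlas/v2/entity/bulk",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

In practice you would call `build_entities(rows, "<workspace-name>")` and pass the result to `upload` together with your Purview account endpoint and an Azure AD bearer token. Lineage would need additional process entities, which this sketch does not cover.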
Is there a way to access the metadata of Azure Data Catalog? I looked up the documentation and went through the Azure Activity log of Azure Data Catalog. However, it seems there is no log of access activity (i.e. who accessed Azure Data Catalog and at what time) that I can use. Is such a log available anywhere in Azure at the moment?
Unfortunately, there is no way to check such activity logs. I would recommend having a look at Azure Purview, which has updated Data Catalog features.
You can refer to this document, which describes how to configure metrics, alerts, and diagnostic settings for Azure Purview using Azure Monitor: Azure Purview metrics in Azure Monitor
Can I connect Power BI to Azure HDInsight directly? And how do I go about doing it?
I have tried searching online, but there isn't any article with clear instructions on this. The cluster is deployed in an Azure VNet as well.
As of today, connecting Power BI to Azure HDInsight HBase is not supported.
I would suggest voting up the ideas submitted by other customers:
https://ideas.powerbi.com/ideas/idea/?ideaid=9d8134df-19be-4608-80df-0b08f9c140d4
https://ideas.powerbi.com/ideas/idea/?ideaid=9c08cf4a-7326-493f-bff2-3967f305b6ca
All of the feedback you share in these forums will be monitored and reviewed by the Microsoft engineering teams responsible for building Power BI connectors.
I am trying to test out Azure Purview and connect it to an Azure SQL Server. Since the SQL server is hosted in the cloud, I want to use the default AutoResolve Integration Runtime to get connected, but there isn't one set up, nor an option to set up a new one. Has anyone else using Purview been able to set up (or needed to set up) an AutoResolve IR?
To connect to Azure SQL DB/MI, go directly to the Azure Purview portal, register a new data source, and select Azure SQL DB/MI.
The article Manage data sources in Azure Purview (Preview) explains how to register new data sources, manage collections of data sources, and view sources in Azure Purview (Preview).
You only need to set up a self-hosted integration runtime to scan the data source when connecting to an on-premises SQL Server.
If the data source is located on Azure, you don't need any integration runtime to scan the data source.
Reference: Register and scan an Azure SQL Database.
CHEEKATLAPRADEEP-MSFT is absolutely correct. To go a step further: since you know what an AutoResolve integration runtime is, you are probably using Azure Data Factory. So, in addition to registering your SQL Server, you can also link your Azure Data Factory for data lineage purposes. Purview will then create the data lineage automatically, based on the pipelines that are executed.
Navigation to Link Data Factory
Data lineage created by linking Data Factory
Keep in mind that you will have to execute pipelines after linking for Purview to pick up the data lineage. Also, lineage is not captured for sources or destinations that are not yet supported.
May I ask what security protocol (HTTPS, TCP/IP, etc.) is applied in the following scenarios in Azure? I need these details to write my design document.
Between Azure Services
Azure Data Factory interacting with Azure Storage
Azure Databricks interacting with Azure Storage
Azure Python SDK connecting to Storage Account (Is it TCP/IP?)
If there is a support page in the MS Azure documentation that covers this, please direct me there.
Inside the Azure data centers, TLS/SSL is used for communication between
services; you can read about it in the "Encryption of data in transit"
section on this page.
The main SDK implementations, including the Python SDK, are wrappers around
the REST API, so they talk to the service over HTTPS (HTTP over TLS, which itself runs on TCP/IP).
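If you want to confirm this for your design document, a small sketch like the one below may help. It uses only the Python standard library, not the Azure SDK. `blob_endpoint` builds the Blob service URL for a storage account, assuming the public Azure cloud suffix `blob.core.windows.net` (sovereign clouds use different suffixes), and `negotiated_tls_version` opens a TLS connection to a host and reports which protocol version was negotiated. Both function names are made up for this example.

```python
import socket
import ssl


def blob_endpoint(account_name):
    """Blob service URL for a storage account (public Azure cloud suffix assumed)."""
    return f"https://{account_name}.blob.core.windows.net"


def negotiated_tls_version(host, port=443, timeout=10):
    """Open a TLS connection over TCP and return the negotiated protocol version."""
    context = ssl.create_default_context()  # verifies the server certificate
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls_sock:
            return tls_sock.version()
```

For example, calling `negotiated_tls_version("<your-account>.blob.core.windows.net")` from a machine with network access to the account returns the protocol string reported by Python's `ssl` module for that connection.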