How to connect Azure Data Lake Store to Azure Analysis Services - azure-hdinsight

How to connect Azure Data Lake Store to Azure Analysis Services
Can we use Hive ODBC, or is there any other option?

I assume you want to use Azure Data Lake as a data source for Azure Analysis Services (e.g. you have fact and dimension files in the Data Lake).
There is no connector in Azure Analysis Services to pull data directly from Azure Data Lake at present, although hopefully this is something Microsoft will address soon.
As a workaround you could try the following:
Azure Analysis Services will allow you to use Azure Blob Storage as a data source. So once you have transformed your data in Azure Data Lake, you then need to copy the fact and dimension files into Azure Blob Storage (e.g. using Azure Data Factory), and then you should be able to use Azure Analysis Services to build your model.
Note that the Blob Storage data source option is only available if you build a model at the 1400 compatibility level in Azure Analysis Services, which in turn requires the latest version of SQL Server Data Tools for Visual Studio (you may need to upgrade to version 17.1 of SSDT).
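If you would rather script the interim copy than set up an ADF pipeline, a minimal sketch of the hop from Data Lake Store (Gen1) into Blob Storage with the Python SDKs (azure-datalake-store and azure-storage-blob) might look like the following; the account names, container, and file paths are placeholders, not values from the question.

```python
from azure.datalake.store import core, lib
from azure.storage.blob import BlobServiceClient

# Placeholder credentials and names -- replace with your own.
TENANT_ID, CLIENT_ID, CLIENT_SECRET = "<tenant>", "<app-id>", "<secret>"
ADLS_ACCOUNT = "<adls-gen1-account>"        # Data Lake Store (Gen1) account name
BLOB_CONN_STR = "<blob-connection-string>"  # Blob Storage connection string

# Authenticate against Data Lake Store and open the transformed source file.
adls_token = lib.auth(tenant_id=TENANT_ID, client_id=CLIENT_ID, client_secret=CLIENT_SECRET)
adls = core.AzureDLFileSystem(adls_token, store_name=ADLS_ACCOUNT)

# Copy the fact/dimension file into a Blob container that the
# 1400-compatibility Analysis Services model can use as a data source.
blob_service = BlobServiceClient.from_connection_string(BLOB_CONN_STR)
blob_client = blob_service.get_blob_client(container="model-data", blob="FactSales.csv")

with adls.open("/curated/FactSales.csv", "rb") as source:
    blob_client.upload_blob(source, overwrite=True)
```

For a recurring load ADF is the more robust choice; the script only illustrates that the hop through Blob Storage is what makes the data reachable from the 1400 model.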
I hope this helps

Related

How to store data from Azure Analysis services into Azure Datalake using Azure Data Factory?

How to store data from Azure Analysis services into Azure Datalake using Azure Data Factory?
I have two tables in Azure Analysis Services, and I need to copy and store that data in Azure Data Lake using Azure Data Factory.
Could you please help me or share a reference URL?
There is no direct connector for Azure Analysis Services in ADF, but there are some ways you can copy the data.
One such way is to create a linked server between an on-premises/IaaS or SQL Managed Instance database and Analysis Services, and then pull the data from that SQL instance through ADF.
Can we copy data from Azure Analysis Services using Azure Data Factory?
The link above and the post below explain the setup in detail:
https://datasharkx.wordpress.com/2021/03/16/copy-data-from-ssas-aas-through-azure-data-factory/
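As a rough illustration of that route: once a linked server to Analysis Services exists on the SQL instance (say it is named AAS_LINK and uses the MSOLAP provider), a DAX query can be passed through OPENQUERY and read back like any ordinary rowset, which is exactly what ADF then consumes via the SQL connector. The sketch below checks the query from Python with pyodbc; the server, database, table, and linked-server names are hypothetical.

```python
import pyodbc

# Hypothetical connection to the SQL instance that hosts the linked server.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=sqlvm.example.com;DATABASE=Staging;UID=etl_user;PWD=<password>"
)

# Pass a DAX query through the linked server (here named AAS_LINK) to the
# Analysis Services model; EVALUATE returns the table's rows as a rowset.
dax = "EVALUATE 'Sales'"
rows = conn.execute(f"SELECT * FROM OPENQUERY(AAS_LINK, '{dax}')").fetchall()

for row in rows[:5]:
    print(row)
```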

Different ways to access data from Azure Data Share to Azure SQL Database

We are working on implementing a new project in Azure. The idea is to move out of on-premises systems into the cloud, as our vendors, partners and clients are moving into the cloud. The option we are trying out is to use Azure Data Share and have Azure SQL Database subscribe to the data.
The thing we are now trying to explore is once a new data snapshot is created how do we import this data into Azure SQL Database?
For instance, we have partner information, and this information is made available via Azure Data Share with a new data snapshot created daily.
The part that I am not sure of is how to synchronize this data between Azure Data Share and Azure SQL Database.
Also, is there an API available to expose this data to external vendors, partners or clients from Azure SQL Database once the data has been synced from Azure Data Share?
Azure Data Share -> Azure SQL Database
Yes, Azure SQL Database is supported.
Azure Data Share -> SQL Server Database (on-prem)? Is this option supported?
No, SQL Server Database (on-prem) is not supported.
Is there an API that could be consumed to read data?
Unfortunately, there is no such API that could be consumed to read data.
Azure Data Share enables organizations to simply and securely share data with multiple customers and partners. In just a few clicks, you can provision a new data share account, add datasets, and invite your customers and partners to your data share. Data providers are always in control of the data that they have shared. Azure Data Share makes it simple to manage and monitor what data was shared, when and by whom.
Azure Data Share helps enhance insights by making it easy to combine data from third parties to enrich analytics and AI scenarios. Easily use the power of Azure analytics tools to prepare, process, and analyze data shared using Azure Data Share.
Which Azure data stores does Data Share support?
Data Share supports data sharing to and from Azure Synapse Analytics, Azure SQL Database, Azure Data Lake Storage, Azure Blob Storage, and Azure Data Explorer. Data Share will support more Azure data stores in the future.
How to synchronize this data between Azure Data Share and Azure SQL Database.
You need to choose “Snapshot setting” to refresh data automatically.
A data provider can configure a data share with a snapshot setting. This allows incremental updates to be received on a regular schedule, either daily or hourly. Once configured, the data consumer has the option to enable the schedule.
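Once a snapshot has run, the mapped table is just an ordinary table in the consumer's Azure SQL Database, so importing or consuming it needs no extra step: any SQL client can read it. A minimal check from Python with pyodbc might look like this; the server, database, and table names (dbo.Partner) are placeholders.

```python
import pyodbc

# Placeholder connection details for the consumer-side Azure SQL Database.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=consumer-server.database.windows.net;DATABASE=SharedData;"
    "UID=reader;PWD=<password>"
)

# After the scheduled snapshot runs, the share-mapped table is refreshed
# in place and can be queried like any other table.
count = conn.execute("SELECT COUNT(*) FROM dbo.Partner").fetchval()
print(f"Rows available after the latest snapshot: {count}")
```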

Need solution to integrate Grafana with Azure data lake

I want to integrate Azure Data Lake Storage with Grafana for visualization of time-series data. I need to know what tools I can use to make this possible.
I used ADF to extract data from CSV files stored in the data lake and move it to a table in Azure Data Explorer. After that, I used the Azure Data Explorer plugin in Grafana to visualize it, and it worked fine. But I would like to know whether there is another approach that may be better or more cost-effective.
Integrating Grafana with Azure Data Lake via Azure Data Explorer (the approach you already used) is the best option compared to the others, because the alternatives involve data movement with ADF and the additional cost of Azure SQL Data Warehouse along with the cost of Power BI.
Reason:
Grafana is a leading open source software designed for visualizing time series analytics. It is an analytics and metrics platform that enables you to query and visualize data and create and share dashboards based on those visualizations. Combining Grafana’s beautiful visualizations with Azure Data Explorer’s snappy ad hoc queries over massive amounts of data, creates impressive usage potential.
The Grafana and Azure Data Explorer teams have created a dedicated plugin which enables you to connect to and visualize data from Azure Data Explorer using its intuitive and powerful Kusto Query Language. In just a few minutes, you can unlock the potential of your data and create your first Grafana dashboard with Azure Data Explorer.
For more details on visualizing data from Azure Data Explorer in Grafana please visit our documentation, “Visualize data from Azure Data Explorer in Grafana”.
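To sanity-check what ADF has landed in Azure Data Explorer before (or alongside) building the Grafana dashboard, you can run the same Kusto query a panel would issue using the azure-kusto-data Python package; the cluster URL, database, table, and column names below are placeholders.

```python
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

# Placeholder cluster and service-principal details -- replace with your own.
CLUSTER = "https://mycluster.westeurope.kusto.windows.net"
kcsb = KustoConnectionStringBuilder.with_aad_application_key_authentication(
    CLUSTER, "<app-id>", "<app-secret>", "<tenant-id>"
)
client = KustoClient(kcsb)

# The kind of KQL a Grafana time-series panel would run: average per minute.
query = "Telemetry | summarize avg(Value) by bin(Timestamp, 1m) | order by Timestamp asc"
response = client.execute("TimeSeriesDb", query)

for row in response.primary_results[0]:
    print(row["Timestamp"], row["avg_Value"])
```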
Other options:
For Azure Data Lake Gen1:
You can use a mix of services to create visual representations of data stored in Data Lake Storage Gen1.
You can start by using Azure Data Factory to move data from Data Lake Storage Gen1 to Azure SQL Data Warehouse.
After that, you can integrate Power BI with Azure SQL Data Warehouse to create visual representation of the data.
For Azure Data Lake Gen2:
You can use a mix of services to create visual representations of data stored in Data Lake Storage Gen2.
You can start by using Azure Data Factory to move data from Data Lake Storage Gen2 to Azure SQL Data Warehouse.
After that, you can integrate Power BI with Azure SQL Data Warehouse to create visual representation of the data.
Hope this helps.
They just released a new guide. It is for Grafana 5.3:
https://learn.microsoft.com/en-us/azure/data-explorer/grafana
You can test this by running Grafana in a Docker container (or for real, if you want). I followed the guide, and it works almost exactly as expected. The only issue I am having is that Grafana concatenates the column name and the data in the column, making reading and formatting tricky.

Azure Data Lake Analytics Vs Azure SQL Data Warehouse

I am using ADF to connect to sources and get data into the Azure Data Lake Store. After getting data into the Data Lake Store, I want to do some transformation and aggregation and use that data in SSRS reports and also for creating cubes.
Can anyone suggest which would be the better option (Azure Data Lake Analytics or Azure SQL DW)?
I am looking to decide which one to use after the Data Lake.
Azure SQL DW no longer exists as a separate service. What we have now is Azure Synapse (the same as Azure SQL DW) within Azure Synapse Analytics (which takes the place of Azure Data Lake Analytics). Microsoft has stopped developing U-SQL and Azure Data Lake Analytics. If the volume of your data is huge and you want to use PolyBase, the best choice is Azure Synapse and Azure Synapse Analytics. You can also enrich your ADF pipelines by using Databricks for the analytics work. By using PolyBase you can do ELT instead of ETL.
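As a sketch of what the ELT pattern looks like in practice, the statement below loads files straight from the lake into a Synapse dedicated SQL pool table so the transformation happens in SQL after the load (shown here with the COPY statement; PolyBase external tables are the other route). It is submitted from Python with pyodbc, and the storage path, table name, and credentials are hypothetical.

```python
import pyodbc

# Hypothetical connection to a Synapse dedicated SQL pool (formerly SQL DW).
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=mysynapse.sql.azuresynapse.net;DATABASE=SalesDW;"
    "UID=loader;PWD=<password>",
    autocommit=True,
)

# ELT: land the raw files from the lake first, then transform inside the pool
# with plain SQL (CTAS, stored procedures, etc.).
copy_sql = """
COPY INTO dbo.StageSales
FROM 'https://mylake.blob.core.windows.net/curated/sales/*.csv'
WITH (FILE_TYPE = 'CSV', FIRSTROW = 2)
"""
conn.execute(copy_sql)
```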
Microsoft is no longer investing in Azure Data Lake Analytics (ADLA); you can see that there have been almost no enhancements or updates to ADLA in the last couple of years. On the other side, Azure SQL Data Warehouse is their flagship service (recently renamed Azure Synapse Analytics) and is therefore being enhanced and updated very quickly. Synapse is based on an MPP architecture and provides all the required capabilities of big data computing.
What is the size of your data? Azure Data Lake is more suited to petabyte-scale big data processing, and Azure SQL Data Warehouse to large relational DWH solutions (starting from roughly 250-500 GB and up).
With Azure Data Lake you can even have the data from the data lake feed a NoSQL database, an SSAS cube, a data mart, or go right into Power BI. With Azure SQL Data Warehouse you can have cubes, Power BI reports, and SSRS.
If you need SQL Server Reporting Services, Integration Services (and you have complex SSIS logic), and Analysis Services (SSAS), you may be better off considering an Azure SQL VM.

Best way to extract data from Azure Data Lake to SQL Server

I am looking for the best programmatic way to extract data from Azure Data Lake into an MSSQL database installed on a VM within Azure.
Currently I am considering following options:
Azure Data Factory
SSIS (Using Azure Data Lake Store Connection Manager)
User-Defined Outputter Example1, Example2
Custom C# code that reads Azure Data Lake data and inserts it into SQL Server DB
Any other good ways I am missing?
Data Factory v2 (currently in public preview) also supports hosting SSIS, giving you both a Data Factory and an SSIS option.
It is not necessarily a good idea for many scenarios, but Azure Logic Apps has both a Data Lake Store connector and a SQL Server connector, which could be useful in scenarios such as writing lots of small files on a schedule or trigger.
You also may not need to go full-on C#; you could instead use PowerShell, as there are PowerShell modules for both Data Lake Store and SQL Server.
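If the custom-code option appeals but you would rather not go full C# (or PowerShell), the same idea is only a few lines of Python: read the file out of Data Lake Store with the azure-datalake-store package and batch-insert it into the SQL Server VM with pyodbc. All names, paths, and credentials below are placeholders.

```python
import csv
import io

import pyodbc
from azure.datalake.store import core, lib

# Placeholder authentication and account names -- replace with your own.
token = lib.auth(tenant_id="<tenant>", client_id="<app-id>", client_secret="<secret>")
adls = core.AzureDLFileSystem(token, store_name="<adls-gen1-account>")

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=sqlvm.example.com;DATABASE=Staging;UID=etl_user;PWD=<password>"
)
cursor = conn.cursor()
cursor.fast_executemany = True  # speeds up the batched INSERT below

# Pull the CSV out of the lake and batch-insert it into the target table.
with adls.open("/raw/events/2017-10-01.csv", "rb") as f:
    text = f.read().decode("utf-8")
rows = [tuple(r) for r in csv.reader(io.StringIO(text))][1:]  # skip header row

cursor.executemany(
    "INSERT INTO dbo.Events (EventTime, DeviceId, Value) VALUES (?, ?, ?)",
    rows,
)
conn.commit()
```

For anything beyond modest volumes, ADF or SSIS will handle retries, parallelism, and monitoring for you, which is why they are listed first.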
