Lake database as sink in Synapse pipeline - Azure

We are blocked on a Synapse pipeline: we want to create a sink on a lake database from a workflow, but it is impossible to select the lake database we created; only the default database is displayed. I looked on some forums but did not find much, and they say the feature is still in development at Microsoft. Does anyone have an idea, please?

Posting this as an answer for other community members.
First publish your lake database to Azure Synapse, and then try to add it as the sink in your pipeline.
As in the image below, Database 1 is created and published, so it is displayed in the sink database list, while Database 2 is created but not published, hence it is not displayed.
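If you want to verify that the publish succeeded, one option is to check whether the lake database is visible through the workspace's built-in serverless SQL endpoint, since published lake databases are exposed there. Below is a minimal, untested sketch assuming pyodbc, the Microsoft ODBC Driver 18, and a hypothetical workspace name (myworkspace):

    import pyodbc  # pip install pyodbc; needs the Microsoft ODBC Driver 18

    # Hypothetical workspace name; sign in with an Azure AD account that can
    # use the workspace's serverless (built-in) SQL endpoint.
    conn = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=myworkspace-ondemand.sql.azuresynapse.net;"
        "Database=master;"
        "Authentication=ActiveDirectoryInteractive;"
    )

    # Published lake databases show up in this list; an unpublished one will
    # be missing, which matches what the pipeline sink drop-down shows.
    for (name,) in conn.execute("SELECT name FROM sys.databases"):
        print(name)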

Related

What to do when my Data source is not supported by Azure Synapse's Data Flow?

I am trying to transform data from Salesforce before loading it into a dedicated SQL pool.
When I try to create a dataset from Synapse's Data Flow, I am not able to choose Salesforce as a data store:
Can anyone suggest how to transform data from Salesforce or any other Datasource that is not supported by Dataflow?
As per the official documentation, Data Flows currently do not support Salesforce data as a source or sink.
If you want, you can raise a feature request in the Synapse portal.
As an alternative, you can use the Copy activity in Azure Data Factory to copy data from Salesforce to the dedicated SQL pool, and then transform it using Data Flows in Synapse, from dedicated SQL DB to dedicated SQL DB.
Follow the steps below to achieve this:
First, create a Data Factory workspace.
Select the Author hub and create a pipeline. Drag the Copy activity onto the canvas and select the source. You can see that Salesforce is supported when you create a new source dataset. Select it and create a linked service for it.
Now, select the sink dataset and click on Azure Synapse Analytics.
Create a linked service for the dedicated SQL database and select it.
Then select the target table in the dedicated SQL pool and copy your data by running the pipeline.
After the copy completes, go to the Synapse workspace and click on the source of the Data Flow.
Select Azure Synapse Analytics as the source and click Continue.
Now, click New to create a linked service for the SQL DB. Give the subscription and server name, and authenticate with your database.
After creating the linked service, select it and choose the table that the copy produced in the DB.
Now, go to the sink, select Azure Synapse Analytics, create another linked service for it the same way as above, and select the table in the DB where you want the transformed result.
By following the process above, you can transform Salesforce data within the dedicated SQL DB.
Can anyone suggest how to transform data from Salesforce or any other Datasource that is not supported by Dataflow?
You can try this approach for any data store that is not supported by Data Flows; please refer to this list of data stores supported by the Copy activity before applying the process to other data stores.
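If you prefer a fully programmatic route outside ADF, the same Salesforce-to-pool copy can be sketched in Python. This is only an illustrative sketch: the credentials, Salesforce object, and staging table (dbo.SalesforceAccount) are hypothetical, and it assumes the simple-salesforce and pyodbc packages:

    from simple_salesforce import Salesforce   # pip install simple-salesforce
    import pyodbc                              # pip install pyodbc

    # Hypothetical credentials and names; replace with your own.
    sf = Salesforce(username="user@example.com", password="...",
                    security_token="...")

    # Pull the records you would otherwise move with the Copy activity.
    records = sf.query_all("SELECT Id, Name FROM Account")["records"]

    conn = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=myworkspace.sql.azuresynapse.net;Database=mypool;"
        "UID=loaduser;PWD=..."
    )
    cur = conn.cursor()
    cur.fast_executemany = True  # bulk-friendly inserts into the pool

    # The staging table dbo.SalesforceAccount is assumed to exist already.
    cur.executemany(
        "INSERT INTO dbo.SalesforceAccount (Id, Name) VALUES (?, ?)",
        [(r["Id"], r["Name"]) for r in records],
    )
    conn.commit()

Either way, once the rows land in the dedicated SQL pool, the Data Flow transformation proceeds exactly as described above.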

Loading data into Azure Synapse Analytics from Azure SQL Database

I am following this tutorial to move data from SQL to Azure Synapse: https://learn.microsoft.com/en-us/azure/data-factory/load-azure-sql-data-warehouse?tabs=data-factory
However, once I get to step 5c I cannot select a database name. Do I have to create an Azure Synapse database first to copy data over there? I thought that is what this tutorial would do.
I have a SQL database and I want to move the data into Azure Synapse.
Thanks
Yes, in Azure Data Factory your source and sink need to already exist in database scenarios.
So it is expected that you already have an Azure SQL Database and an Azure Synapse dedicated SQL pool (formerly SQL Data Warehouse) in place before proceeding with the Copy activity.
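A quick way to confirm that both ends exist before returning to step 5c is to test connectivity to each. The following is a rough sketch, assuming pyodbc and hypothetical server, database, and credential names:

    import pyodbc  # pip install pyodbc; also needs the Microsoft ODBC driver

    def can_connect(server, database, user, password):
        """Return True if the database is reachable, i.e. it already exists."""
        try:
            pyodbc.connect(
                "Driver={ODBC Driver 18 for SQL Server};"
                f"Server={server};Database={database};UID={user};PWD={password}",
                timeout=10,
            ).close()
            return True
        except pyodbc.Error:
            return False

    # Hypothetical names; both must succeed before step 5c will offer
    # the database in the drop-down.
    print(can_connect("mysqlserver.database.windows.net", "SourceDb", "u", "p"))
    print(can_connect("myworkspace.sql.azuresynapse.net", "SqlPool1", "u", "p"))

If the second check fails, create the dedicated SQL pool first; the database name should then appear in the drop-down.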

Azure Synapse Studio - Workflow

I am new to Azure Synapse Studio.
I am working with Synapse Analytics; I loaded the NYTaxi data and successfully created a database using a loading user, etc.
But once I created a workspace in Synapse Analytics and launched Azure Synapse Studio, I could not see any database.
I would like to know how to create a dataset.
I would like to know how to work with Power BI within Studio.
I also need help with Apache Spark, etc.
Thanks in Advance
Vijay Perepa
With an Azure Synapse Analytics (workspace preview) deployment, no SQL pool is deployed. You can create a new SQL pool in the Synapse workspace, optionally with sample data.
A dataset can be created in the Data area (Linked tab). Its main purpose is metadata (e.g., for Parquet files in your attached Azure Data Lake Store or for a SQL pool table) that can be used in a Data Flow.
Power BI: you can link a Power BI workspace to Azure Synapse Analytics (Manage - Linked services). With this you can create Power BI datasets that access data in your SQL pool.
As a good starting point I would recommend the documentation, where you will also find some useful tutorials. Lots of samples are available on GitHub. Hope this helps.
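For the Apache Spark part of the question, a typical first step is reading the NYTaxi files straight from the attached storage in a Synapse Spark notebook. A minimal sketch, assuming a hypothetical storage account (mydatalake), container (nyctaxi), and Parquet layout:

    # Run inside a Synapse Spark notebook, where a "spark" session is provided.
    # Hypothetical storage account, container, and path; replace with your own.
    path = "abfss://nyctaxi@mydatalake.dfs.core.windows.net/yellow/*.parquet"

    df = spark.read.parquet(path)   # read the Parquet files from the linked lake
    df.printSchema()
    display(df.limit(10))           # Synapse notebook helper for a tabular preview

    # Optionally expose the data as a Spark table for later use.
    df.write.mode("overwrite").saveAsTable("nyctaxi_yellow")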

Is it possible to read an Azure Databricks table from Azure Data Factory?

I have a table in an Azure Databricks cluster, and I would like to replicate this data into an Azure SQL Database so that other users can analyze it from Metabase.
Is it possible to access Databricks tables through Azure Data Factory?
No, unfortunately not. Databricks tables are typically temporary and last only as long as your job/session is running. See here.
You would need to persist your Databricks table to some storage in order to access it. Change your Databricks job to dump the table to Blob storage as its final action. In the next step of your Data Factory job, you can then read the dumped data from the storage account and process it further.
Another option may be Databricks Delta, although I have not tried this yet...
If you register the table in the Databricks Hive metastore, then ADF could read from it using the ODBC source in ADF, though this would require an IR.
Alternatively, you could write the table to external storage such as Blob storage or a data lake. ADF can then read that file and push it to your SQL database.
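The "dump to storage as the job's final action" step might look like this in the Databricks notebook. A hedged sketch: the table, container, and account names are hypothetical, and it assumes storage authentication is already configured on the cluster:

    # Databricks notebook (Python); table, account, and container names are
    # hypothetical. Assumes storage-account auth is already configured, e.g.:
    # spark.conf.set("fs.azure.account.key.myaccount.blob.core.windows.net", "<key>")

    df = spark.table("my_database.my_table")   # the Databricks table to replicate

    # Persist the table as Parquet in Blob storage as the job's final action.
    (df.write
       .mode("overwrite")
       .parquet("wasbs://export@myaccount.blob.core.windows.net/my_table/"))

From there, a plain Copy activity in ADF can move the Parquet files into the Azure SQL Database that Metabase queries.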

Add SQL Server as a data source in Azure Data Lake Analytics

I'm doing some tests with Azure Data Lake Analytics and I can’t add a new SQL Server database as a Data Source. When I click on "Add data source", the only two available options are: "Azure Data Lake Storage Gen1" and "Azure Storage".
What I want is to add one SQL Server database so that I can run U-SQL queries against it.
Our SQL Server firewall is correctly configured to allow access to Azure Services, but I am not allowed to add it as a data source.
How can this be done? Is it a matter of other configuration issues?
Any help would be greatly appreciated.
Per my research, there is no other configuration issue for a SQL Server data source in Data Lake Analytics. Based on this official doc, DLA only supports two data sources: Data Lake Store and Azure Storage.
As a workaround, I suggest using Azure Data Factory to transfer data from the SQL Server database to Azure Storage, so that you can run your U-SQL script against that data.
If you have any concerns, please let me know.
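If you would rather script the transfer yourself instead of setting up a Data Factory pipeline, a small Python export job can land the table in Azure Storage as CSV, which U-SQL's EXTRACT can then read. A rough sketch with hypothetical server, table, container, and key values, assuming the pyodbc and azure-storage-blob packages:

    import csv, io
    import pyodbc                                  # pip install pyodbc
    from azure.storage.blob import BlobClient      # pip install azure-storage-blob

    # Hypothetical connection details; replace with your own.
    conn = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=myserver.database.windows.net;Database=MyDb;UID=u;PWD=p"
    )

    # Extract the table and serialize it as CSV in memory.
    buf = io.StringIO()
    writer = csv.writer(buf)
    for row in conn.execute("SELECT Id, Name, Amount FROM dbo.Sales"):
        writer.writerow(list(row))

    # Land the file in Azure Storage, where U-SQL EXTRACT can read it.
    blob = BlobClient.from_connection_string(
        "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;",
        container_name="usql-input", blob_name="sales.csv",
    )
    blob.upload_blob(buf.getvalue(), overwrite=True)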
