Export data from Azure SQL Managed Instance to Azure Data Lake Storage as JSON

I have a requirement to export data from Azure SQL Managed Instance to Data Lake Storage as JSON documents, and I have to use SQL Server Integration Services to accomplish this. I tried the Flexible File Destination Data Flow task, but JSON is not among its supported file formats. What other options do I have to accomplish this?

Azure Data Factory supports data movement between Azure SQL Managed Instance and a Data Lake account, but unfortunately, when the destination is Azure Data Lake Storage, the JSON format is not supported from SSIS either:
Azure Data Lake Store Destination
The Azure Data Lake Store Destination component enables an SSIS package to write data to an Azure Data Lake Store. The supported file formats are: Text, Avro, and ORC.
Workaround: A possible workaround is to use the Data Flow activity in Azure Data Factory: load the data from the Managed Instance, transform it (for example with a Pivot transformation), and store the processed data in the Data Lake as JSON. This approach doesn't involve SSIS. A similar request and approach is described here.
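If SSIS isn't a hard requirement, another option is to script the export directly: build the JSON on the database side with FOR JSON and upload the result to ADLS Gen2 yourself. Below is a minimal sketch, assuming the pyodbc and azure-storage-file-datalake packages; the server, credential, table, and path names are all illustrative.

```python
# Minimal sketch: query Azure SQL Managed Instance with FOR JSON and
# upload the resulting document to ADLS Gen2. Server, credential,
# table, and path names below are illustrative.
import pyodbc
from azure.storage.filedatalake import DataLakeServiceClient

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=my-mi.public.abc123.database.windows.net,3342;"
    "Database=SalesDb;Uid=loader;Pwd=<password>"
)
cursor = conn.cursor()
# FOR JSON PATH streams the document back split across rows of a
# single column, so the chunks are concatenated here.
cursor.execute("SELECT OrderId, CustomerId, Total FROM dbo.Orders FOR JSON PATH")
json_doc = "".join(row[0] for row in cursor.fetchall())

service = DataLakeServiceClient(
    account_url="https://mydatalake.dfs.core.windows.net",
    credential="<account-key>",
)
file_client = service.get_file_system_client("raw").get_file_client("orders/orders.json")
file_client.upload_data(json_doc, overwrite=True)
```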

Related

Using Azure Data Factory to migrate Salesforce data to Dynamics 365

I'm looking for some advice around using Azure Data Factory to migrate data from Salesforce to Dynamics 365.
My research has turned up plenty of articles about moving Salesforce data to sinks such as Azure Data Lake or Blob Storage, and also articles that describe moving data from Azure Data Lake or Blob Storage into D365.
I haven't found any examples where the source is Salesforce and the sink is D365.
Is it possible to do it this way, or do I need to copy the Salesforce data to an intermediate sink such as Azure Data Lake or Blob Storage and then use that as the source of a copy/dataflow to send it on to D365?
I will need to perform transformations on the SF data before storing it in D365.
Thanks
I would recommend adding ADLS Gen2 as a stage between Salesforce and D365.
I am afraid a direct copy with D365 as the sink cannot be done.
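If you go with the staged approach and want it scripted, the ADF management SDK for Python can define the first hop (Salesforce to Parquet in ADLS Gen2) as a Copy activity. This is a rough sketch, assuming the azure-mgmt-datafactory and azure-identity packages and that the referenced datasets already exist in the factory; every name here is illustrative, and the model signatures vary slightly between SDK versions.

```python
# Rough sketch: create an ADF pipeline whose Copy activity lands
# Salesforce data in an ADLS Gen2 stage. The referenced datasets
# ("SalesforceAccounts", "StagedAccounts") are assumed to exist
# already; all names and IDs are illustrative.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    CopyActivity, DatasetReference, ParquetSink, PipelineResource, SalesforceSource,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

copy = CopyActivity(
    name="SalesforceToStage",
    inputs=[DatasetReference(type="DatasetReference", reference_name="SalesforceAccounts")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="StagedAccounts")],
    source=SalesforceSource(),
    sink=ParquetSink(),
)
adf.pipelines.create_or_update(
    "my-resource-group", "my-data-factory", "StageSalesforce",
    PipelineResource(activities=[copy]),
)
```

A second pipeline (or a Data Flow) would then pick up the staged Parquet, apply the transformations, and write to the Dynamics 365 sink.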

Can Azure Synapse and Azure Data Factory be used to transform and convert CSV to XML?

I have a concern about Azure Synapse and Azure Data Factory:
We are using Azure Data Lake Gen2 to export data from D365FO. After that, we use Azure Data Factory or Azure Synapse to apply (complex) transformations and create an XML file.
My customer decided to use Azure Synapse for the transformation, but from what I checked, it looks like Azure Synapse only supports XML as a source, not as a sink, and doesn't support flexible transformations (like XSLT)?
Do you have any opinions about this? Please help, thanks!
Currently, the XML format is only supported as a source in Azure Data Factory, not as a sink.
You can raise a feature request here.
https://feedback.azure.com/d365community/post/
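In the meantime, the CSV-to-XML conversion itself is easy to script outside Data Factory, for example in an Azure Function or a custom activity. Here is a minimal sketch using only Python's standard library; the file paths and element names are illustrative.

```python
# Minimal sketch: convert a CSV file to XML with the standard library.
# File paths and element names are illustrative.
import csv
import xml.etree.ElementTree as ET

root = ET.Element("Orders")
with open("orders.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        order = ET.SubElement(root, "Order")
        # One child element per CSV column.
        for column, value in row.items():
            ET.SubElement(order, column).text = value

ET.ElementTree(root).write("orders.xml", encoding="utf-8", xml_declaration=True)
```

For XSLT-style transformations, a third-party library such as lxml would be needed.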

When to use Data Factory (Copy) over a direct pull in Azure Synapse

I am just going through some Microsoft documentation and doing hands-on exercises for data engineering.
I have a couple of queries about a scenario: "copy CSV file(s) from Blob Storage to Synapse Analytics (stage table(s))":
I read that we can do a direct data pull in Synapse by creating external tables. (https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/load-data-wideworldimportersdw)
If the above is possible, in what cases do we use the Azure Data Factory Copy or Data Flow methods?
While working with Azure Data Factory, is it a good idea to use PolyBase, given that it will use Blob Storage again for staging in this scenario (i.e. I am copying the file from Blob and then using Blob again for staging)?
I searched for answers to my queries but haven't found a satisfactory one yet.
If you're just straight loading data from CSV into the DW, use Copy. PolyBase is recommended, but not always needed for small files.
If you need to transform that data or perform updates, then use Data Flows.
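For reference, the "direct pull" route the question mentions also has a lighter-weight variant: the COPY statement in a dedicated SQL pool, which loads straight from Blob Storage without external-table DDL. Here is a minimal sketch driving it from Python via pyodbc; the server, table, and storage paths are illustrative.

```python
# Minimal sketch: load CSVs from Blob Storage into a Synapse dedicated
# SQL pool with the T-SQL COPY statement. Server, table, and storage
# names are illustrative.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=mysynapse.sql.azuresynapse.net;"
    "Database=SalesDW;Uid=loader;Pwd=<password>",
    autocommit=True,
)
conn.execute("""
    COPY INTO dbo.StageSales
    FROM 'https://myaccount.blob.core.windows.net/landing/sales/*.csv'
    WITH (
        FILE_TYPE = 'CSV',
        FIRSTROW = 2,
        CREDENTIAL = (IDENTITY = 'Managed Identity')
    )
""")
```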

Extracting and Transforming Data from local MySQL to Azure Synapse Data Warehouse

I'm trying to set up a demo data warehouse in Azure Synapse. I would like to extract data from a local MySQL database, transform and aggregate some of the data, and store it in fact/dimension tables in Azure Synapse Analytics.
Currently I have an instance of Azure SQL Data Warehouse and Data Factory. I created a connection to my MySQL database in Data Factory, and my thought was that I could use this connector as the input for a new Data Flow, which transforms the dataset and stores it to my destination dataset, which is linked to my Azure Synapse Data Warehouse.
The problem is that Data Factory only supports certain Azure services, such as Azure Data Lake or Azure SQL Database, as the source for a new Data Flow.
What would be the best practice for solving this problem? Create an instance of Azure SQL Database, copy the data from the local MySQL database to the Azure SQL Database, and then use it as the source for a new Data Flow?
Best practice here is to use the Copy activity in an ADF pipeline to land the data from MySQL as Parquet in Blob Storage or ADLS Gen2, then transform the data using Data Flows.
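Outside of ADF, that landing step is also easy to reproduce locally for testing. A minimal sketch, assuming the pandas, SQLAlchemy, PyMySQL, and pyarrow packages; the connection details and table names are illustrative.

```python
# Minimal sketch: land a MySQL table as a Parquet file, mirroring the
# Copy-activity staging step. Connection details and names are
# illustrative.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("mysql+pymysql://user:password@localhost:3306/salesdb")
df = pd.read_sql("SELECT * FROM orders", engine)
df.to_parquet("orders.parquet", engine="pyarrow", index=False)
```

The resulting file can then be uploaded to Blob Storage or ADLS Gen2 and used as the source of the Data Flow.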

Best way to extract data from Azure Data Lake to SQL Server

I am looking for the best programmatic way to extract data from Azure Data Lake to an MSSQL database installed on a VM within Azure.
Currently I am considering following options:
Azure Data Factory
SSIS (Using Azure Data Lake Store Connection Manager)
User-Defined Outputter Example1, Example2
Custom C# code that reads Azure Data Lake data and inserts it into SQL Server DB
Any other good ways I am missing?
Data Factory v2 (currently in public preview) also supports hosting SSIS, which gives you both a Data Factory and an SSIS option.
It's not necessarily a good idea for many scenarios, but Azure Logic Apps has both a Data Lake Store connector and a SQL Server connector, which could be useful in scenarios such as writing lots of small files on a schedule or trigger.
You also may not need to go full-on C#; PowerShell has modules for both Data Lake Store and SQL Server.
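To illustrate the scripting route, here is a minimal sketch of the read-and-insert round trip in Python rather than C#, assuming the azure-datalake-store (Gen1) and pyodbc packages; the store, path, server, and table names are illustrative.

```python
# Minimal sketch: read a CSV from Azure Data Lake Store (Gen1) and
# insert the rows into SQL Server. Store, path, server, and table
# names, and the service-principal credentials, are illustrative.
import csv
import io

import pyodbc
from azure.datalake.store import core, lib

token = lib.auth(tenant_id="<tenant-id>", client_id="<app-id>", client_secret="<secret>")
adl = core.AzureDLFileSystem(token, store_name="mydatalake")

with adl.open("/raw/orders.csv", "rb") as f:
    text = f.read().decode("utf-8")
rows = list(csv.reader(io.StringIO(text)))[1:]  # skip the header row

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=myvm.westeurope.cloudapp.azure.com;"
    "Database=SalesDb;Uid=loader;Pwd=<password>"
)
cursor = conn.cursor()
cursor.executemany(
    "INSERT INTO dbo.Orders (OrderId, CustomerId, Total) VALUES (?, ?, ?)",
    rows,
)
conn.commit()
```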
