I am trying to copy data from Azure Data Lake Gen2 to Azure Synapse (SQL Data Warehouse) through Azure Data Factory. Here are some details:
source (ADLS) linked service authentication type: service principal
sink (Synapse) linked service authentication type: managed identity
Copy method selected: PolyBase
While validating, I am getting this error: "Source linked service should not have authentication method as Service principal".
When I select the "bulk insert" copy method, it works fine. Can anyone help me understand this? Is it documented anywhere that PolyBase requires a particular authentication method on the linked service?
This is because direct copy by using PolyBase from Azure Data Lake Gen2 only supports account key authentication or managed identity authentication. You can refer to this documentation.
So if you want to do a direct copy by using PolyBase, you need to change your authentication method to account key or managed identity.
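As a sketch (the names below are placeholders, not from the question), an ADLS Gen2 linked service using managed identity authentication simply omits the service principal properties, and ADF then authenticates with the factory's managed identity:

```json
{
    "name": "AdlsGen2MsiLinkedService",
    "properties": {
        "type": "AzureBlobFS",
        "typeProperties": {
            "url": "https://<yourstorageaccount>.dfs.core.windows.net"
        }
    }
}
```

Note the factory's managed identity still needs an appropriate role (e.g. Storage Blob Data Reader) on the storage account for this to work.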
There is also a workaround: staged copy by using PolyBase. You can refer to this documentation about it.
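With staged copy, ADF first copies the data to an interim Blob storage account and then loads it into Synapse with PolyBase, so the source linked service can keep service principal authentication. A rough sketch of the relevant copy activity settings (the staging linked service name and path are placeholders):

```json
"typeProperties": {
    "source": { "type": "DelimitedTextSource" },
    "sink": { "type": "SqlDWSink", "allowPolyBase": true },
    "enableStaging": true,
    "stagingSettings": {
        "linkedServiceName": {
            "referenceName": "BlobStagingLinkedService",
            "type": "LinkedServiceReference"
        },
        "path": "staging-container/path"
    }
}
```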
I am trying to create an ADF linked service connection to a Synapse Link serverless SQL pool connected to ADLS storage. I can successfully get a connection, but when I try to use a dataset to access the data I get a permission issue.
I can successfully access the data via Synapse Studio.
This is the error I get when I use the data set in ADF.
I can also look at the schemas in SSMS, where they appear as external tables, but I get a similar credential error at the same point.
Has anyone come across this issue, please?
There are a few pieces of information you haven’t supplied in your question but I believe I know what happened. The external table worked in Synapse Studio because you were connected to the Serverless SQL pool with your AAD account and it passed through your AAD credentials to the data lake and succeeded.
However, when you set up the linked service to the serverless SQL pool, I'm guessing you used a SQL auth account for the credentials. With SQL auth it doesn't know how to authenticate with the data lake, so it looked for a server-scoped credential but couldn't find one.
The same happened when you connected from SSMS with a SQL auth account, I'm guessing.
You have several options. If it’s important to be able to access the external table with SQL auth you can execute the following to tell it how to access the data lake. This assumes the Synapse Workspace Managed Service Identity has Storage Blob Data Reader or Storage Blob Data Contributor role on the data lake.
CREATE CREDENTIAL [https://<YourDataLakeName>.dfs.core.windows.net]
WITH IDENTITY = 'Managed Identity';
Or you could change the authentication on the linked service to use the Managed Service Identity.
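A hedged sketch of that second option (the server and database names are placeholders): point an Azure SQL Database linked service at the serverless endpoint and leave the credentials out of the connection string, which makes ADF authenticate with the factory's managed identity. You would also need to create a user for that managed identity in the pool (CREATE USER [your-factory-name] FROM EXTERNAL PROVIDER) and grant it access.

```json
{
    "name": "ServerlessSqlPoolLinkedService",
    "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": "Data Source=<workspace>-ondemand.sql.azuresynapse.net;Initial Catalog=<database>"
        }
    }
}
```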
I have been trying to connect my SSIS package (on-premises) to my Data Lake Store. I have installed the Azure Feature Pack, which has worked fine.
But when I create a Data Lake connection in my SSIS package, I need the following.
Image of SSIS Azure Data Lake connector Manager
ADLS Host – which is fine, I know how to get that.
Authentication (Azure AD User Identity)
Username & Password – which I am having issues with.
My question is how do I define a username and password for my data lake?
You can find them in the Azure AD user which is within the same subscription as your Azure Data Lake. Usually, it is the email address and password you use to log in to the Azure portal.
For more details, you can refer to this documentation.
I recently secured my Azure Functions with Azure Active Directory, so an access token must be set in the auth header to call them. I can successfully do that from my front-end Angular apps. But in my back end I also have Azure Data Factory. How can I enable Azure Data Factory to use Azure AD when calling the functions, instead of the host key?
You can use the Web activity to get the bearer token and then pass it to the subsequent calls.
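One way to avoid handling the token yourself is to let the Web activity authenticate with the factory's managed identity. A sketch (URL is a placeholder; the resource is the client ID or App ID URI of the AAD app registration that protects your functions):

```json
{
    "name": "CallSecuredFunction",
    "type": "WebActivity",
    "typeProperties": {
        "url": "https://<yourfunctionapp>.azurewebsites.net/api/<yourfunction>",
        "method": "GET",
        "authentication": {
            "type": "MSI",
            "resource": "<aad-app-client-id>"
        }
    }
}
```

For this to succeed, the AAD app securing the functions must accept tokens issued to the data factory's managed identity.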
Azure Data Factory supports Managed Identity:
Managed identity for Data Factory
When creating a data factory, a managed identity can be created along with factory creation. The managed identity is a managed application registered in Azure Active Directory, and it represents this specific data factory.
Managed identity for Data Factory benefits the following features:
Store credential in Azure Key Vault, in which case data factory managed identity is used for Azure Key Vault authentication.
Connectors including Azure Blob storage, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, Azure SQL Database, and Azure SQL Data Warehouse.
Web activity.
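For example, a Key Vault linked service needs no stored credential at all. A minimal sketch (the vault name is a placeholder), assuming the factory's managed identity has been granted secret Get permission on the vault:

```json
{
    "name": "AzureKeyVaultLinkedService",
    "properties": {
        "type": "AzureKeyVault",
        "typeProperties": {
            "baseUrl": "https://<yourkeyvault>.vault.azure.net"
        }
    }
}
```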
I'm copying CSV files from Azure Blob to Azure Data Lake with Azure Data Factory, using the Copy Data tool.
I'm following this link: https://learn.microsoft.com/en-us/azure/data-factory/quickstart-create-data-factory-copy-data-tool
From the Copy Data tool, my source configuration and test connection succeeded. However, the destination connection (that is, the Data Lake) is creating a problem.
I'm getting the error: "Make sure the ACL and firewall rule is correctly configured in the Azure Data Lake Store account."
I followed this link for the firewall setting: https://learn.microsoft.com/en-us/azure/data-lake-store/data-lake-store-secure-data (Set IP address range for data access).
I enabled the firewall and set "Allow access to Azure services" to ON.
Still, I'm getting the same error. Could anyone please suggest how to fix this?
Get your Managed Identity Application ID from Azure Data Factory properties.
Go to Azure Data Lake Storage and navigate to Data Explorer -> Access -> Add and then provide the ID in the 'Select User or group' field.
It will identify your Azure Data Factory instance/resource; then grant ACLs (R/W/X) as per your requirement.
Besides the firewall setting, please also make sure your account has the necessary permissions on the target ADLS account. Please refer to this doc for more details: https://learn.microsoft.com/en-us/azure/data-factory/connector-azure-data-lake-store#linked-service-properties
Your account and the ADF application need to have permission to work on ADLS. Also check whether you gave permission to the child folders as well.
The client 'abc#abc.com' with object id 'abcabcabcabcabc' does not
have authorization to perform action
'Microsoft.Resources/deployments/write' over scope
'/subscriptions/abcabcabc/resourcegroups/abc-01-east/providers/Microsoft.Resources/deployments/publishing-123123123123'
I was trying to create a pipeline using Azure Data Factory to pull data from SQL Server into Azure Blob, but I am facing the above issue when trying to use my integration runtime, which already exists in my Azure portal.
At present I have the Data Factory Contributor role assigned to me; what other roles should I have to avoid this issue?
I had a similar issue being a contributor for an ADF. With this role, you seem to be able to open the ADF UI, but the moment you try to publish anything, you get the above error. Making me a Data Factory Contributor on that ADF alone didn't help.
What did help was making me a Data Factory Contributor at the resource group level. So go to the resource group that contains the ADF, go to IAM, and add yourself as a Data Factory Contributor.
I also noticed you need to close the Data Factory UI before IAM changes take effect.
Azure's roles are a bit of a mystery to me so it would be useful if someone could provide an explanation of how and why.
Steps
1 - Register an Enterprise APP in your Azure Active Directory
2 - Create a key in the Enterprise APP and save the value somewhere
3 - Go to your Azure SQL Database through Management Console and
CREATE USER [your application name] FROM EXTERNAL PROVIDER;
4 - Change the authentication method to Service Principal and use the application ID and key in the form
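Put together, an Azure SQL Database linked service using the service principal from the steps above might look like this sketch (all values are placeholders; storing the key in Key Vault instead of inline is the safer choice):

```json
{
    "name": "AzureSqlServicePrincipalLinkedService",
    "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": "Data Source=<server>.database.windows.net;Initial Catalog=<database>",
            "servicePrincipalId": "<application-id>",
            "servicePrincipalKey": {
                "type": "SecureString",
                "value": "<application-key>"
            },
            "tenant": "<tenant-id>"
        }
    }
}
```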
For more information:
https://learn.microsoft.com/en-us/azure/data-factory/connector-azure-sql-database