Self-hosted Integration Runtime fails on Sink but Auto Resolve (Default) Integration Runtime succeeds

Good day,
ADF v2 data loads that had been working for months are suddenly failing on the Sink activity (writing to an Azure SQL Database).
The error message points to a firewall entry for the target database:
Check the linked service configuration is correct, and make sure the SQL Database firewall allows the integration runtime to access.
I have confirmed that the IP address of my on-premises integration runtime server is specified in the firewall rules of the Azure database.
When I test the linked service connectivity with the Integration Runtime parameter set to 'AutoResolveIntegrationRuntime' the test succeeds, but when I set it to my self-hosted IR I get the error message above.
Also, when the data factory job executes, it processes 'Lookup' and 'Stored Proc' activities against my Azure database without any problems (and I can see it uses both IRs at that point), but it fails on 'Sink' activities, where I can see it uses my self-hosted IR.
I have only one IR node and have confirmed it is active, running, and on the latest version.

OK, the firewall engineer fixed it. Apparently the traffic to the Azure database was subject to a general internet traffic firewall rule, and the connection was failing on MSSQL traffic. He created a firewall rule specifically for that traffic and now it is sorted.
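In case it helps anyone else hitting the same symptom, a quick way to tell whether outbound MSSQL traffic is being blocked from the self-hosted IR machine (as opposed to the Azure SQL server-side firewall) is to test the TCP handshake to the logical server on port 1433. A minimal sketch; the server name is a placeholder:

```python
# Minimal connectivity check from the self-hosted IR host to Azure SQL (port 1433).
# The server name below is a placeholder; substitute your own logical server.
import socket

server = "myserver.database.windows.net"  # hypothetical server name
port = 1433  # default port for Azure SQL Database (MSSQL/TDS traffic)

try:
    with socket.create_connection((server, port), timeout=10):
        print(f"TCP connection to {server}:{port} succeeded")
except OSError as err:
    # A timeout or refusal here usually points to a corporate firewall/proxy rule
    # on the outbound side, not the Azure SQL server-side firewall.
    print(f"TCP connection to {server}:{port} failed: {err}")
```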

Related

Copy data from self-hosted Integration Runtime to Azure Data Lake?

I'm trying to copy data, using the copy activity in a Synapse pipeline, from a REST API call made through a self-hosted integration runtime to an Azure Data Lake Gen2 account. Using preview I can see the data from the REST API call, but when I run the copy activity it is queued endlessly. Any idea why this happens? The source uses a self-hosted integration runtime and the sink an Azure integration runtime. Could this be the problem? Otherwise, both connections are tested and working...
Edit: When trying the web call, it shows as processing for a long time, but I know I can connect to the REST API source, since the preview feature in the copy activity shows me the response...
Running the diagnostic tool, I receive the following error:
It seems to be a problem with the certificate. Any ideas?
Integration runtime
If you use a self-hosted Integration Runtime (IR) and the copy activity waits a long time in the queue until the IR has resources available to execute it, consider scaling your IR out or up.
If you use an Azure Integration Runtime in a non-optimal region, resulting in slow reads and writes, consider configuring an IR in another region.
You can also try the performance tuning tips.
How do I scale up the IR?
Scale Considerations
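If it helps to check before scaling, the node status and concurrency settings of a self-hosted IR can be inspected programmatically. A rough sketch, assuming the azure-identity and azure-mgmt-datafactory packages; all resource names are placeholders. Scaling out means registering additional nodes with the IR's authentication key; scaling up means raising a node's concurrent jobs limit or moving it to a bigger machine.

```python
# Sketch: inspect a self-hosted IR's node status before deciding to scale out/up.
# Assumes azure-identity and azure-mgmt-datafactory are installed; names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

subscription_id = "<subscription-id>"
resource_group = "<resource-group>"
factory_name = "<data-factory-name>"
ir_name = "<self-hosted-ir-name>"

client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# get_status returns node-level details (state, version, concurrency) for a self-hosted IR.
status = client.integration_runtimes.get_status(resource_group, factory_name, ir_name)
props = status.properties
print("IR state:", props.state)
for node in getattr(props, "nodes", []) or []:
    print(node.node_name, node.status, "concurrent jobs limit:", node.concurrent_jobs_limit)
```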

Self-Hosted Integration Runtime Copy Activity Timeout

I’m trying to implement a pipeline in ADF where I copy data from a Function App to an on-prem SQL Server. I have installed the Self-Hosted Integration Runtime to access the on-prem database and set my copy activity to use the self-hosted IR.
First I was getting a firewall error, so I added a rule to allow the node where the IR is installed to call the Function App, but now I am getting a timeout error.
Any ideas why the timeout?
Please check the General settings of the copy activity and try increasing its timeout. By default it is 7 days.
Also, try increasing the retry count of the copy activity. The default is zero (no retry). Increasing the count and the retry interval should allow it to attempt to regain the connection.
Please refer to this Microsoft documentation: Troubleshoot copy activity on Self-hosted IR.
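For reference, both settings live in the copy activity's policy block in the pipeline JSON. A sketch of what that fragment looks like, written out here as a Python dict; the values are illustrative examples, not recommendations:

```python
# Sketch of the copy activity "policy" block (as it appears in the pipeline JSON),
# shown as a Python dict. Values are illustrative only.
copy_activity_policy = {
    "timeout": "0.02:00:00",        # e.g. 2 hours instead of the 7-day default ("7.00:00:00")
    "retry": 3,                     # default is 0 (no retry)
    "retryIntervalInSeconds": 60,   # wait between retry attempts
    "secureInput": False,
    "secureOutput": False,
}
```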

False negative in Azure Application Insights Availability Test

I have set up an availability test using Application Insights and also using uptrends.com. I am pinging a simple HTTP resource that does redirects before it returns 200.
I am getting a lot of errors in Azure Application Insights, but only from certain regions. I cannot confirm in any way (not via VMs deployed in those regions, nor via other services like uptrends.com) that there are actual failures.
The error is:
A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond [...]:443
The failure occurs after just 20 seconds, even though I have the timeout configured at 120 seconds.
It appears Availability Tests are giving me false negatives ... any ideas?
This can happen when the IP addresses of the availability test clients in the particular Azure region (from where the failure is frequently reported) are not added to the allow list on your firewall/security system. Please refer to the Availability tests documentation and the 'Addresses grouped by location (Azure Public Cloud)' table in that link to get all the addresses used by the availability test clients of a particular Azure region, and ensure they are added to your firewall's allow list.
For more details, you may also refer to the discussion here on similar topic: https://learn.microsoft.com/en-us/answers/questions/755004/index.html
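If maintaining that list by hand is painful, the availability test client ranges are also published under the ApplicationInsightsAvailability service tag, which can be pulled programmatically. A rough sketch, assuming azure-identity and azure-mgmt-network; the subscription ID and region are placeholders:

```python
# Sketch: list the IP ranges behind the "ApplicationInsightsAvailability" service tag,
# which covers the availability test clients, so they can be added to a firewall allow list.
# Assumes azure-identity and azure-mgmt-network; subscription ID and region are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

subscription_id = "<subscription-id>"
location = "westeurope"  # the discovery call is regional, but returns the full tag list

client = NetworkManagementClient(DefaultAzureCredential(), subscription_id)
tags = client.service_tags.list(location)

for tag in tags.values:
    if tag.name.startswith("ApplicationInsightsAvailability"):
        for prefix in tag.properties.address_prefixes:
            print(prefix)
```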

Azure Batch within a VNET that has a Service endpoint policy for Storage

I am struggling to get my Azure Batch nodes to start in a pool that is configured to use a virtual network. The virtual network has been configured with a service endpoint policy that has a "Microsoft.Storage" policy definition pointing at a single storage account. Without the service endpoint policy defined on the virtual network the Azure Batch pool works as expected, but with it the following error occurs and the nodes never start.
I have tried creating the Batch account in both pool allocation modes. This did not seem to make a difference: the pool resizes successfully and then the nodes are stuck in the "Starting" state. In "User Subscription" mode I found the start-up error, because I can see the VM instance in my own subscription:
VM has reported a failure when processing extension 'batchNodeExtension'. Error message: "Enable failed: processing file downloads failed: failed to download file[0]: failed to download file: unexpected status code: actual=403 expected=200" More information on troubleshooting is available at https://aka.ms/VMExtensionCSELinuxTroubleshoot
From what I can determine, this is an Azure VM extension that runs to configure the VM for Azure Batch. My base image is Canonical, ubuntuserver, 18.04-lts (batch.node.ubuntu 18.04). I can see that the extension is attempting to download from:
https://a52a7f3c745c443e8c2cac69.blob.core.windows.net/nodeagentpackage-version9-22-0-2/Ubuntu-18.04/batch_init-ubuntu-18.04-1.8.7.tar.gz (note I removed the SAS token from this URL for posting here)
There are eight further files that are downloaded, and it looks like this is configuring the Batch agent on the node.
The 403 error indicates that the node cannot access this storage account, which makes sense given the service endpoint policy: the policy does not include this storage account, and the account is external to my Azure subscription. I thought I might be able to add it to the service endpoint policy, but I have no way of determining which Azure subscription it belongs to. If I knew that, I thought I could add it like this:
Endpoint policy allows you to add specific Azure Storage accounts to allow list, using the resourceID format. You can restrict access to all storage accounts in a subscription
E.g. /subscriptions/subscriptionId (from https://learn.microsoft.com/en-us/azure/virtual-network/virtual-network-service-endpoint-policies-overview)
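For completeness, a subscription-scoped entry like the one quoted above can be added to a service endpoint policy definition through the SDK. A rough sketch using azure-identity and azure-mgmt-network; every name and ID is a placeholder, and of course this only helps if you know which subscription the storage account actually lives in:

```python
# Sketch: a service endpoint policy definition that allows all storage accounts
# in one subscription (the /subscriptions/<id> resource scope from the docs quote above).
# Assumes azure-identity and azure-mgmt-network; every name below is a placeholder.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import (
    ServiceEndpointPolicy,
    ServiceEndpointPolicyDefinition,
)

subscription_id = "<my-subscription-id>"
client = NetworkManagementClient(DefaultAzureCredential(), subscription_id)

policy = ServiceEndpointPolicy(
    location="westeurope",
    service_endpoint_policy_definitions=[
        ServiceEndpointPolicyDefinition(
            name="allow-storage-in-subscription",
            service="Microsoft.Storage",
            # Subscription-level scope; a single storage account's full resource ID works too.
            service_resources=["/subscriptions/<other-subscription-id>"],
        )
    ],
)

poller = client.service_endpoint_policies.begin_create_or_update(
    "<resource-group>", "<policy-name>", policy
)
print(poller.result().provisioning_state)
```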
I tried adding network security group rules using service tags for Azure Storage, but this did not help. The node still cannot connect, which makes sense given the description of service endpoint policies.
The reason for my interest in this is the following post:
https://github.com/Azure/Batch/issues/66
I am trying to minimise the bandwidth charges from my storage account by using service endpoints.
I have also tried to create my own VM, but I am not sure whether the "batchNodeExtension" script is run automatically for VMs that you're using with Batch.
I would really appreciate any pointers because I am running out of ideas to try!
Batch requires a generic rule for all of Storage (can be regional variant) as specified at https://learn.microsoft.com/en-us/azure/batch/batch-virtual-network#network-security-groups-specifying-subnet-level-rules. Currently it is mainly used to download our agent and maintain state/get information needed to run tasks.
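To illustrate that answer, the subnet-level rule it refers to is an outbound allow rule whose destination is the Storage service tag (or a regional variant such as Storage.NorthEurope). A sketch using azure-identity and azure-mgmt-network; the names and IDs are placeholders:

```python
# Sketch: outbound NSG rule allowing traffic from the Batch subnet to Azure Storage
# over HTTPS, using the "Storage" service tag (a regional variant such as
# "Storage.NorthEurope" also works). All names/IDs below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import SecurityRule

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

rule = SecurityRule(
    protocol="Tcp",
    source_address_prefix="VirtualNetwork",
    source_port_range="*",
    destination_address_prefix="Storage",   # or "Storage.<region>"
    destination_port_range="443",
    access="Allow",
    direction="Outbound",
    priority=200,
)

poller = client.security_rules.begin_create_or_update(
    "<resource-group>", "<nsg-name>", "allow-storage-outbound", rule
)
print(poller.result().provisioning_state)
```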
I am facing the same problem with Azure Machine Learning. We are trying to prevent data exfiltration by using service endpoint policies to stop data from being sent to any storage account outside our subscription.
Since Azure ML compute depends on the Batch service, we were unable to run any ML compute when the service endpoint policy was associated with the compute subnet.
Microsoft states the following:
Filtering traffic on Azure services deployed into Virtual Networks: At this time, Azure Service Endpoint Policies are not supported for any managed Azure services that are deployed into your virtual network.
https://learn.microsoft.com/en-us/azure/virtual-network/virtual-network-service-endpoint-policies-overview#scenarios
I understand from this kind of restriction that any service that uses Azure Batch (which is almost every managed service in Azure?) cannot be used with a service endpoint policy, which makes it a fairly useless feature...
In the end we removed the service endpoint policy completely from our network architecture and now consider it only for scenarios where you want to restrict customers to accessing specific storage accounts.

Copy cannot be started because the gateway was offline; how do I run my pipelines?

We have SSIS packages running on a server with SQL Server Agent. However, we want to move this job to a cloud solution. One option is to use a PowerShell script, but we also tried replacing SSIS with Azure Data Factory.
However, as stated above, the gateway requires my computer to be online and can't be installed on a domain controller (server). Does this mean that Data Factory cannot be used to load our database at night (when the PCs are shut down) and is therefore not a good replacement for SSIS?
Are there any other solutions for this problem?
The Data Gateway can be installed on any computer in your network that has access to the SQL Server. Obviously both the gateway and the SQL server need to be up at the time the activity runs.
