How do I connect an Azure self-hosted Integration Runtime to a Data Factory private endpoint?

I have set up a self-hosted Integration Runtime on an on-premises Windows server and have successfully connected it to a data factory instance using the Public endpoint option (found under Networking in the data factory). However, when I try to connect it to a data factory with the Private endpoint option, I get the error message: "Integration Runtime (self-hosted) Node is not registered". It is definitely not an issue with the authentication key, as the key works with the Public endpoint option.
Can anybody please point me to the steps needed to allow the on-premises IR to connect to the private endpoint in the data factory? I cannot find an adequately detailed description online, so any pointers will be appreciated.
Here’s my current setup:
1 vnet, 2 subnets
1 private endpoint to df, 1 private endpoint to storage
1 private DNS link to core windows, 1 private DNS link to data factory
I have NOT created a virtual network gateway as I don’t think it’s required
I’m fairly new to Azure and have only basic knowledge of networking principles.

I believe that connecting your on-premises Windows server to your Azure virtual network requires setting up an Internet Protocol security (IPsec) VPN (site-to-site) connection or an Azure ExpressRoute (private peering) connection.
Technically, by using Azure Private Link, you can connect to various platform as a service (PaaS) deployments in Azure via a private endpoint. A private endpoint is a private IP address within a specific virtual network and subnet. This means a self-hosted Integration Runtime running inside your Azure VNets can use the private endpoint without a virtual network gateway. Here are detailed steps for this scenario, which secures Azure data services using a VNet and private endpoints.
However, you need a VPN connection between the on-premises network and the Azure VNet, because a private IP address cannot be routed over the public Internet.
From the official documentation:
You can also connect an on-premises network to your virtual network by
setting up an Internet Protocol security (IPsec) VPN (site-to-site)
connection or an Azure ExpressRoute (private peering) connection.
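One way to check this from the on-premises server is to look at what the Data Factory hostname resolves to there. The following is a minimal sketch using only the Python standard library; the helper names are illustrative assumptions and not part of any Azure tooling. If the hostname resolves to a public address, the private DNS zone is not being used from on-premises; if it resolves to a private address, that address is only reachable once a VPN or ExpressRoute path to the VNet exists.

```python
import ipaddress
import socket
from typing import List


def is_private_address(ip: str) -> bool:
    """Return True if ip lies in a private (non-Internet-routable) range."""
    return ipaddress.ip_address(ip).is_private


def resolve_addresses(hostname: str, port: int = 443) -> List[str]:
    """Resolve hostname to IPv4 addresses using the machine's configured DNS."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, proto=socket.IPPROTO_TCP
    )
    return sorted({info[4][0] for info in infos})


def describe(hostname: str) -> None:
    """Print each resolved address and whether it is private or public."""
    for ip in resolve_addresses(hostname):
        kind = "private" if is_private_address(ip) else "public"
        print(f"{hostname} -> {ip} ({kind})")
```

Calling `describe()` with your data factory's service hostname on the on-premises server prints each resolved address and its classification.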

Related

Azure Data Factory route to external SFTP without SHIR

We have ADF with an Azure Integration Runtime (AIR) connecting to an external SFTP server to grab a file. The SFTP server has a firewall that must whitelist the IP address the connection comes from. Traffic should be routed via a Managed VNet private endpoint and then somehow to a NAT gateway and public IP.
Is it possible to implement such a thing?
We want to avoid using any IaaS/VMs.
• Yes, you can implement this in your Azure environment, whether the SFTP server you need to grab the file from is hosted in Azure behind an Azure firewall or hosted on-premises behind a hardware firewall.
For this purpose, you will have to deploy a Private Link service and a load balancer with a public IP address enabled in your tenant. Also ensure that the load balancer is hosted in a virtual network subnet created in the same region and resource group where Azure Data Factory with the Azure Integration Runtime is deployed. Since you have already deployed ADF with AIR in the managed virtual network with private endpoints, a managed private endpoint in an approved state can send traffic to a given Private Link resource.
• Once you create a private endpoint connection, it stays in a ‘Pending’ state until approved by the resource owner. If the owner approves the connection, a private link is established. Otherwise, the private link won't be established. In either case, the managed private endpoint will be updated with the status of the connection. Once the private endpoint is set up, you can set up the load balancer with the SFTP server’s public IP address as the backend and the private endpoint of the ADF with AIR as the frontend, so that you can initiate the connection from ADF and fetch the file from the SFTP server. Kindly refer to the reference diagram and documentation link below for more information on how to implement this setup using the Azure resources described above.
https://learn.microsoft.com/en-us/azure/data-factory/tutorial-managed-virtual-network-on-premise-sql-server
The tutorial linked above connects to an on-premises SQL Server rather than an SFTP server, and it shows the ADF managed private endpoint connecting to other allowed Azure service resources. In your case, you can instead connect it directly to the Private Link service deployed as described earlier.
I found two options to do this:
Run SHIR on a VMSS with a custom extension installation.
Run SHIR in a Windows container on AKS. However, AKS uses VMSS behind the scenes, so VMSS looks simpler in some cases.
My goal was to avoid using IaaS services to connect to an external (non-Azure) SFTP server.

Is it possible to use the Azure Integration Runtime for on-premises data sources?

In most cases, the answer is to install a self-hosted integration runtime in the on-premises network, or on another host machine (e.g. a VM, even a VM in the cloud) that can reach the on-premises data source (network settings, firewall rules). I am curious whether it is possible to use the Azure integration runtime (for the managed service experience) to connect to on-premises data sources. Is it possible if I set up a site-to-site VPN with the Azure VNet? Or ExpressRoute? Do I have to expose the endpoints of the on-premises data source? And how can the Azure integration runtime connect to it if it is not in a VNet?
I am not strong in networking, but I have read all the documentation. I just cannot get my head around it.
It's possible to use the Azure Integration Runtime for on-premises data sources because:
By enabling Managed Virtual Network, Azure Integration Runtime supports connecting to data stores using private link service in a private network environment. Refer here.
By using a private link, you can connect to various platforms as a service (PaaS) deployments in Azure via a private endpoint. With a private endpoint, you can connect to PaaS via a private IP address within a specific virtual network and subnet.
By setting up an Internet Protocol security (IPsec) VPN (site-to-site) connection or an Azure ExpressRoute (private peering) connection, you can connect an on-premises network to your Azure virtual network.
Also, there is a similar feature request here, currently in the STARTED state, for reference.

Why do we have a lot of connections between app services in the same resource group?

We have three App Services in Azure (API1, API2, API3).
API2 is getting data from CosmosDB.
API3 is getting data from other CosmosDB.
Main API1 calls API2 to get some data. Then using this data calls API3.
API1 performs poorly and we are trying to figure out why. We noticed that there are too many connections in the metrics. We also have an issue with SNAT ports.
We tried to put these APIs in the same VNet, but it didn't help, and we are not sure how to set it up correctly.
Do you have any idea what we should set up?
UPDATE:
It seems the VNet helped us with the SNAT ports issue, but API performance was still very poor.
What really helped us was changing from Windows to Linux. When all APIs run on Linux servers, we don't see any connections anymore.
I'm not sure about the specific configuration of the three APIs on your side. If you want to use an IP from the VNet instead of an external one, you can use a separate App Service Environment (ASE).
Alternatively, you can use a private link to the app service. By using Private Endpoint, you can connect privately to your web app. Read Connect privately to a web app by using Azure Private Endpoint (Preview).
Today, you can secure this connection using VNet service endpoints
which keep the traffic within the Microsoft backbone network and allow
the PaaS resource to be locked down to just your VNet. However, the
PaaS endpoint is still served over a public IP address and therefore
not reachable from on-premises through Azure ExpressRoute private
peering or VPN gateway. With today’s announcement of Azure Private
Link, you can simply create a private endpoint in your VNet and map it
to your PaaS resource (Your Azure Storage account blob or SQL Database
server). These resources are then accessible over a private IP address
in your VNet, enabling connectivity from on-premises through Azure
ExpressRoute private peering and/or VPN gateway and keep the network
configuration simple by not opening it up to public IP addresses.
For more information, you could read here.

Unable to connect Azure Function with Azure SQL using private endpoint

I've created a SQL Server and then created a private link to my TESTVNET/SUBNET1 with private IP 10.1.1.4. I've now disabled public access for the SQL server.
I have an Azure Function running on App Service, which I've VNet-integrated with VNET/SUBNET2.
Subnet 2 shows it's delegated to server farms. (Also, could someone explain what "delegated to" means? I found I cannot create any VM in that subnet either, so it probably can't be used for any other purpose.)
Now when my Azure Function tries to connect to the DB, it fails with the error below:
2020-08-30T15:25:45.216 [Error] Unhandled rejection SequelizeAccessDeniedError: Cannot open server "10.1.1.4" requested by the login. The login failed.
However, if I give the public FQDN, I get the error below:
2020-08-30T15:29:43.654 [Error] Unhandled rejection SequelizeAccessDeniedError: Reason: An instance-specific error occurred while establishing a connection to SQL Server. The public network interface on this server is not accessible. To connect to this server, use the Private Endpoint from inside your virtual network.
Ideally, the private DNS zone created by the private endpoint should have been used to get the private IP of the SQL database, but it seems the function is not using the private DNS, probably because it is not running in an isolated environment.
In my Azure Function application settings, I've added WEBSITE_VNET_ROUTE_ALL =1, which should mean that all requests are routed to the VNet. If I now enable public access and allow Azure services to access the DB (I think Azure added the public IP by default), the function connects to the DB.
Now I want to understand where I'm going wrong and why the private endpoint connection is not working. Any help is appreciated.
In the DB firewall settings, I've allowed traffic from the two subnets below:
Network Configuration
TESTVNET: 10.1.0.0/16
SUBNET 1: 10.1.1.0/24
SUBNET 2: 10.1.2.0/24
I've disabled Service endpoint for SQL in both SUBNET 1 and SUBNET 2. My NSG has default settings i.e.
AllowVnetInBound, AllowAzureLoadBalancerInBound, DenyAllInBound
AllowVnetOutBound, AllowInternetOutBound, DenyAllOutBound.
Since my private link has a private IP present in the same VNET I don't think NSG should have any impact.
New to Azure, testing things out. Thank you for your patience.
To make an Azure Function connect to a private endpoint, you will need to use VNet integration.
After your app integrates with your VNet, it uses the same DNS server that your VNet is configured with. By default, your app won't work with Azure DNS Private Zones. To work with Azure DNS Private Zones you need to add the following app settings:
WEBSITE_DNS_SERVER with value 168.63.129.16
WEBSITE_VNET_ROUTE_ALL with value 1
These settings will send all of your outbound calls from your app into your VNet in addition to enabling your app to use Azure DNS private zones. Reference here.
Then you could set up Private Link for Azure SQL Database. To check connectivity, you can create an Azure VM in a new subnet of the same VNet and connect with SQL Server Management Studio (SSMS). With the private endpoint enabled, the connection from that Azure VM to the Azure SQL database via its FQDN should show a private client IP address.
For more information, you could read private endpoint VS service endpoint in this blog.
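As a quick self-check of the two app settings named above, the sketch below compares a dictionary of application settings against the recommended values. The function name and the example dictionary are hypothetical and only illustrate the check; the sketch does not read settings from Azure.

```python
from typing import Dict

# App settings named in the answer above, with the values it recommends.
REQUIRED_SETTINGS: Dict[str, str] = {
    "WEBSITE_DNS_SERVER": "168.63.129.16",
    "WEBSITE_VNET_ROUTE_ALL": "1",
}


def find_setting_problems(app_settings: Dict[str, str]) -> Dict[str, str]:
    """Return {setting name: description} for each missing or mismatched setting."""
    problems = {}
    for name, expected in REQUIRED_SETTINGS.items():
        actual = app_settings.get(name)
        if actual is None:
            problems[name] = f"missing (expected {expected})"
        elif actual.strip() != expected:
            problems[name] = f"is {actual!r} (expected {expected})"
    return problems


# Hypothetical example: only the route-all setting from the question is present.
example_settings = {"WEBSITE_VNET_ROUTE_ALL": "1"}
print(find_setting_problems(example_settings))
```

With the example above, the check reports WEBSITE_DNS_SERVER as missing, which matches the situation described in the question.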

Connecting Azure public services from on-premise

How does network traffic flow from on-premise to the Azure public service if there's a site-to-site VPN-tunnel between Azure and on-premise?
Does the VPN-connection route traffic only to VNET? What if there's a service that does not reside in the VNET? Does the traffic still enter the VPN-tunnel or it goes straight to public network from the on-premise?
Sorry if this is a bit of a vague question, but I'm trying to understand how traffic flows from on-prem to Azure Storage, Azure SQL Database, Azure Data Factory, and other public services.
From what is VPN gateway,
A VPN gateway is a specific type of virtual network gateway that is
used to send encrypted traffic between an Azure virtual network and an
on-premises location over the public Internet.
As this shows, a VPN connection routes traffic only to VNets. A Site-to-Site VPN gateway connection is used to connect your on-premises network to an Azure virtual network over an IPsec/IKE (IKEv1 or IKEv2) VPN tunnel.
Generally, without a reverse proxy, traffic from on-prem to Azure Storage, Azure SQL Database, Azure Data Factory, and other public services that do not reside in the VNet is routed directly to the public DNS name or IP address of those services.
In particular, Private Link (a preview feature) allows you to connect to various PaaS services in Azure via a private endpoint. With Private Link, customers can enable cross-premises access to the private endpoint using ExpressRoute private peering or VPN tunneling. Read Using cases of Private Link for Azure SQL Database and Using Private Endpoints for Azure Storage (Preview).
Hope this helps.
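The routing decision described above can be sketched as a prefix match. This is only an illustration: it assumes the on-premises router sends nothing but the VNet address prefixes into the tunnel, and the address space and IP addresses used are hypothetical.

```python
import ipaddress
from typing import Iterable


def takes_vpn_tunnel(destination_ip: str, vnet_prefixes: Iterable[str]) -> bool:
    """Return True if destination_ip falls inside any VNet address prefix."""
    destination = ipaddress.ip_address(destination_ip)
    return any(
        destination in ipaddress.ip_network(prefix) for prefix in vnet_prefixes
    )


# Hypothetical VNet address space reachable over the site-to-site tunnel.
VNET_PREFIXES = ["10.1.0.0/16"]

# A private endpoint IP inside the VNet matches a prefix, so it uses the tunnel.
print(takes_vpn_tunnel("10.1.1.4", VNET_PREFIXES))
# A public service IP (documentation-range placeholder) matches no prefix,
# so it goes out over the public Internet instead.
print(takes_vpn_tunnel("203.0.113.10", VNET_PREFIXES))
```

This is why a private endpoint changes the path: the service gets an address inside the VNet prefix, so on-premises traffic to it follows the tunnel instead of the public route.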
