Azure Application Gateway pointing to Azure CDN? - azure

Does anyone have experience with pointing an Application Gateway to an Azure CDN? We can add internal and public services to our App Gateway, but as soon as we point it to an Azure CDN, problems start to arise.
Here is what we are seeing:
When adding a health probe pointing to the CDN it says everything is fine and it got a 200 reply.
When using the Connection Troubleshoot tool and pointing it to the CDN, it says it is reachable.
However, when checking the Backend Health, it says
The backend health status could not be retrieved. This happens when an
NSG/UDR/Firewall on the application gateway subnet is blocking traffic
on ports 65503-65534 in case of v1 SKU, and ports 65200-65535 in case
of the v2 SKU or if the FQDN configured in the backend pool could not
be resolved to an IP address. To learn more visit -
https://aka.ms/UnknownBackendHealth.
For one, the backend health is not "unknown" but "unhealthy". And the tips there are not really useful. We don't have a blocking firewall and there are no NSGs.
To make it even more confusing, the endpoint actually appears to function when accessed through the App Gateway listener, but only sporadically; sometimes it doesn't work.
Any suggestions on how to debug this? The available tools all seem to indicate that everything is fine, until the backend is configured and the Backend Health says it is not.
Update:
It does in fact work if we use an IP from the CDN directly. This could indicate a DNS issue; however, our DNS log shows that the App Gateway does resolve the name.
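Since a fixed CDN IP works while the hostname is sporadic, one thing worth checking is whether the hostname resolves to a changing set of addresses. The following is a minimal sketch, not an official diagnostic: it resolves a name several times from a machine in the same virtual network and collects the IPv4 addresses seen. The function name `resolve_ipv4` and the use of port 443 are assumptions made for illustration; replace "localhost" with your CDN endpoint hostname.

```python
import socket


def resolve_ipv4(hostname, attempts=3):
    """Resolve hostname several times; return the sorted list of distinct IPv4 addresses seen."""
    seen = set()
    for _ in range(attempts):
        try:
            infos = socket.getaddrinfo(
                hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
            )
        except socket.gaierror:
            # Resolution failed on this attempt; keep trying.
            continue
        for info in infos:
            # info[4] is the socket address tuple (ip, port).
            seen.add(info[4][0])
    return sorted(seen)


if __name__ == "__main__":
    # "localhost" is a placeholder; use the CDN endpoint hostname here.
    print(resolve_ipv4("localhost"))
```

If the list differs between runs, or does not contain the IP that worked when configured directly, that would support the DNS theory.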

Related

Azure Application Gateway: Cannot connect to backend server in

Due to a recent layoff I was thrown into the Azure admin role out of the blue. I am pretty new to this and haven't yet had the chance to follow an admin course.
I am facing the following issue: we host a couple of websites on an Azure Windows Server VM running IIS. These are accessible through an application gateway with a public IP. I was asked to add two new listeners for a new part of the website. I created the appropriate targets in the backend pool, created HTTP and HTTPS settings, and added the listeners and rules. However, when browsing to the site, it throws a 502 error, and when I check the backend health, it gives the error below.
Cannot connect to backend server. Check whether any NSG/UDR/Firewall is blocking access to server. Check if application is running on correct port.
I opened up the appropriate inbound ports on the NSG of the AZ Web interface on the VM and also on the local firewall of the server hosting IIS. AFAIK there are no additional NSG rules on the application gateway.
What am I missing here?
I have extensive experience working with Application Gateways and I can tell you that a 502 Bad Gateway means something is definitely wrong at the backend or misconfigured AGW settings - that's what the error says, so nothing surprising. From my experience here are different scenarios I've faced for this error:
Backend server can't be reached due to an NSG Rule controlling access from the AGW subnet to the backend subnet.
Backend server can be reached but the port is not opened at the server's firewall.
Backend server can be reached, port is opened but application is not listening on those ports or application is not even running.
AGW listeners were misconfigured.
Here's what you can try:
First validate whether the Application and VM are fine by trying to access the application from another VM in the same subnet.
Next, try to get a VM in a different subnet and try to access the application, to mimic the AGW trying to connect to the backend. This will help you validate whether your NSGs are properly configured.
Lastly, revisit all the AGW settings and look for any misconfiguration in the listeners or other settings. (Added this based on your comments).
Taking this approach to troubleshooting will quickly help you identify which layer is causing the issue. Also, it would be a good practice to start documenting all AGW errors you get along your journey and also the remediation steps etc. This will help you tremendously in the future - this is not the last time you'll face issues with your AGW!
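The first two checks above can be sketched as a small TCP reachability test that you run once from a VM in the same subnet as the backend and once from a VM in a different subnet. This is only an illustration: the helper name `can_connect` and the example address in the comment are hypothetical, and a successful TCP connection only shows that the port is reachable, not that the application answers correctly.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example usage (10.0.1.4 is a placeholder for the backend VM's private IP):
#   can_connect("10.0.1.4", 80)
#   can_connect("10.0.1.4", 443)
```

If the check succeeds from the same subnet but fails from a different one, the NSG rules between the subnets are the likely culprit.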
If you've checked your firewall issues and it's not solving the problem it could be user permissions on your VM.
I ran the following command in PowerShell and it fixed it for me:
netsh http add urlacl url="http://*:{port}/" user="Everyone"
A good test to see whether this will help: you can access your app with Invoke-WebRequest using a localhost URI, but not using the server's NIC private IP.
You'll also need to set your host address to use the wildcard in your config file.

Stopping Default IIS website causes Azure Application Gateway '502 Bad Gateway' Error for ALL websites in IIS

I'm having an issue with hosting multiple .NET websites on Windows Server/IIS and Azure Application Gateway.
We host multiple sites on a single Azure Windows VM running IIS, sitting behind Azure Application Gateway WAFv2. The VM is connected to App Gateway using a backend pool configured to point to the private IP of the VM, with the VNets peering configured between the App Gateway and VM VNets.
When I stop the default website in IIS, ALL websites then return a '502 Bad Gateway' error from Azure Application Gateway, and the backend health status changes to 'Unhealthy' for the backend pool where the VM resides.
Can anyone tell me why stopping the Default site would cause Application Gateway to error for all sites?
EDIT:
Screenshot of IIS bindings as requested
EDIT 2: Apparently I can't answer my own question; however, after working through this with our CSP I have the answer. By default the App Gateway Backend Health check looks at the default IIS site. If you stop that, the Backend Health check fails and goes Unhealthy. At this point App Gateway will no longer even ATTEMPT to route any requests, regardless of URL, to that backend pool.
If the application gateway has no VMs or virtual machine scale set configured in the back-end address pool, it can’t route any customer request and sends a bad gateway error.
Run the command below to show the back-end address pool in the JSON result.
Get-AzApplicationGateway -Name "SampleGateway" -ResourceGroupName "ExampleResourceGroup"
Here is an official guideline for troubleshooting the 502 error.
https://learn.microsoft.com/en-us/azure/application-gateway/application-gateway-troubleshooting-502#overview
Also, here is a simple troubleshooter.
https://support.microsoft.com/en-us/help/4504111/azure-application-gateway-with-bad-gateway-502-errors
If I were to try and troubleshoot this, I would likely start with a brand new "test" instance of IIS and set up a reverse proxy on port 80 whose only job is to listen to incoming requests to port 80. Those requests would then be forwarded by your reverse proxy to your actual websites bound to different ports (e.g. 81, 82, 83, etc).
The idea here is to have all of your websites running on different ports such that when you stop one of your sites, the others continue to run without a problem.
Given your setup with up to 40 sites hosted in a single instance of IIS, I would only attempt this type of troubleshooting with a brand new "test" instance of IIS.
Create a brand new "test" instance of IIS.
Create a reverse proxy. To do this, create a new site and name it (e.g. rev-proxy) and give it a binding of port 80.
Deploy one actual site (e.g. myfirstsite). Give it a port binding of something other than 80 (e.g. 81).
Double click your rev-proxy site and add a URL Rewrite -> Inbound Rules -> Blank rule. See attached picture. Add a rule such that when a user requests "myfirstsite" that request is forwarded onto port 81. Use the "Test Pattern" button to test your pattern. The image is only a suggestion and your pattern should correspond to the URL your users are using to request your site and not necessarily to the name you give your site in IIS.
An example of a reverse proxy with a URL Rewrite
Found the answer to this after many months of messing about!
With Azure Application Gateway, the default health probes for each backend pool ping and look for a response on the configured IP address or FQDN in the backend pool itself.
In my case this is set to the local IP address of the Virtual machine (when I configured this 18-24 months ago I recall our Azure CSP telling me there was a bug with using the FQDN in the backend pool configuration).
This means, that when the Health Probe is attempting to communicate with the VM, the Default Website in IIS is the only thing configured to respond to any requests on this IP address.
If you stop the Default Site, the Health Probe gets no response to its requests and the Backend Pool status goes to Unhealthy, as you would expect.
The really interesting thing here is that as soon as the Backend Pool Health Probe status goes Unhealthy, Azure Application Gateway ceases to even attempt to route any traffic to the affected backend pool. Instead it immediately reports the 502 Bad gateway error, and will continue to do so until the Health Probe status is corrected and goes back to healthy!
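To see what such a probe would get back from the VM, you can imitate it by hand. The sketch below sends a single GET to the backend IP with a chosen Host header and treats a 200-399 status as healthy. Both the default Host value of 127.0.0.1 and the 200-399 range are assumptions about how the default probe behaves, and `probe` is a hypothetical helper, not part of any Azure SDK.

```python
import http.client


def probe(backend_ip, port=80, host_header="127.0.0.1", path="/", timeout=5.0):
    """Send one GET request; return (healthy, status). healthy means status 200-399."""
    conn = http.client.HTTPConnection(backend_ip, port, timeout=timeout)
    try:
        # skip_host=True lets us set the Host header ourselves (assumed probe value).
        conn.putrequest("GET", path, skip_host=True)
        conn.putheader("Host", host_header)
        conn.endheaders()
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # No listener, timeout, or malformed reply: treat as unhealthy.
        return (False, None)
    finally:
        conn.close()
    return (200 <= status <= 399, status)


# Example usage (10.0.1.4 and www.example.com are placeholders):
#   probe("10.0.1.4")                                  # what only the Default Site would answer
#   probe("10.0.1.4", host_header="www.example.com")   # what a host-bound site would answer
```

If the first call fails once the Default Site is stopped while the second still succeeds, a custom probe with an explicit host name pointing at one of the real sites is one way to avoid depending on the Default Site.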

Set kubernetes VM with nodeports as backend for application gateway

I have two VMs that are part of a kubernetes cluster. I have a single service that is exposed as NodePort (30001). I am able to reach this service on port 30001 through curl on each of these VMs. When I create an Azure application gateway, the gateway is not directing traffic to these VMs.
I've followed the steps for setting up the application gateway as listed in the Azure documentation.
I constantly get a 502 from the gateway.
In order for the Azure Application Gateway to redirect or route traffic to the NodePort you need to add the Backend servers to the backend pool inside the Azure Application Gateway.
There are options to choose Virtual Machines as well.
A good tutorial explaining how to configure an application gateway in azure and direct web traffic to the backend pool is:
https://learn.microsoft.com/en-us/azure/application-gateway/quick-create-portal
I hope this solves your problem.
So I finally ended up getting on a call with the support folks. It turned out that the UI on Azure's portal is slightly temperamental.
For the gateway to be able to determine which of your backends are healthy it needs to have a health probe associated with the HTTP setting (the HTTP Setting is the one that determines how traffic from the gateway flows to your backends).
Now, when you are configuring the HTTP setting, you need to select the "Use Custom Probe" but when you do that it doesn't show the probe that you have already created. Hence, I figured that wasn't required.
The trick is to first check the box below "Use Custom probe" which reads "Pick hostname from backend settings", and then click on custom probe; your custom probe will then show up and things will work.

azure gateway as priority traffic manager

We have set up an Azure gateway of tier WAF V2 (so it would be zone-redundant). It has a backend pool containing 2 WebApps (App Services), intended as a Primary and a Secondary.
The idea behind it was to use the gateway similarly to priority traffic manager: Routing usually to the primary WebApp, and only routing to the secondary WebApp in case the first one goes down.
The problem is that the only way I found to do that is to order the rules associated with the listeners of the backend pool (because I believe Azure prioritizes them according to the order they are listed). But given that both Apps are in the same backend pool, I'm unsure of how to do that.
So now the gateway randomly routes to either the first or second WebApp.
Any advice or suggestions would be much appreciated,
Thank you
Note: We have also tried placing a Traffic Manager between the gateway and the WebApps, but the gateway keeps connecting to the primary WebApp even when it's down, and its probe's health status becomes unknown.
Application Gateway is a layer 7 load balancer, which means it works with web traffic only (HTTP/HTTPS/WebSocket). It supports capabilities such as SSL termination, cookie-based session affinity, and round robin for load balancing traffic. This means the application gateway distributes incoming traffic across the endpoints in a backend pool in round-robin fashion when both are healthy, which is why the gateway appears to route randomly to either the first or second WebApp. See the application gateway FAQ. The app gateway does not work like a priority-based traffic manager, which always sends requests to the primary web app unless the primary web app is unhealthy.
About the health status unknown, the most common reason is that access to the backend is being blocked by an NSG or custom DNS. Ref: Troubleshooting bad gateway errors in Application Gateway
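The difference between the two behaviours can be illustrated with a small model. This is only a sketch of the selection logic, not how Application Gateway or Traffic Manager is implemented; the function names and the "primary"/"secondary" labels are made up for the example.

```python
from itertools import count


def round_robin_picker(backends):
    """Return a pick(healthy) function that cycles through the healthy backends in order."""
    counter = count()

    def pick(healthy):
        candidates = [b for b in backends if healthy.get(b, False)]
        if not candidates:
            return None
        return candidates[next(counter) % len(candidates)]

    return pick


def priority_pick(backends, healthy):
    """Return the first healthy backend in priority order, or None if none is healthy."""
    for backend in backends:
        if healthy.get(backend, False):
            return backend
    return None


if __name__ == "__main__":
    backends = ["primary", "secondary"]
    health = {"primary": True, "secondary": True}
    pick = round_robin_picker(backends)
    print([pick(health) for _ in range(4)])   # alternates between both apps
    print(priority_pick(backends, health))    # always the primary while it is healthy
```

With both apps healthy, the round-robin picker alternates between them, while the priority picker only falls back to the secondary once the primary is marked unhealthy.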

Azure private DNS zone resolving

We have a private DNS zone setup for the zone project.local. For app service instances living in an app service environment, each service has its own record pointing to the load balancer in front of the service (so all have the same IP).
We have an App Gateway instance linked to a public IP in front to make this all publicly available. The gateway is available via a public URL and routes the request to the load balancer.
Now what we see is the following:
From external, everything is fine. We can get to the services using the external URL, gateway forwards it and all is well.
From internal, we want to use the internal DNS address set in the private zone. This is not working, calls from service to service throw an error stating that the host URL could not be resolved.
When I log into a VM in the same vNET or use the Kudu console, I'm able to resolve the DNS address to the correct IP. What I do notice is that when using nslookup, it says it's getting a non authoritative answer.
It's very hard to get any more information for debug purposes. We're not sure why resolving isn't working as per documentation these records should work for all of the components in the same vNET. The authoritative error might be related, but again: not sure. So any ideas on what else to check would be highly appreciated.
Disclaimer: I also have a support ticket open for the same question, but wanted to put this out there to see if there's anyone else who might have encountered the same since this is pretty new tech.
Azure DNS Private Zones are able to resolve names between VMs and Cloud services. It does not look like it can be used by Azure Web Apps or Azure App Services at this time. 
You can see more information on name resolution for resources in Azure Virtual Networks Here.
If you would like to request this feature be added to DNS Private Zones, you can leave your feedback Here.
