Azure Traffic Manager Priority Routing is not working - azure

I created an Azure Traffic Manager profile that uses the Priority routing method. According to the documentation:
The Traffic Manager profile contains a prioritized list of service
endpoints. By default, Traffic Manager sends all traffic to the
primary (highest-priority) endpoint. If the primary endpoint is not
available, Traffic Manager routes the traffic to the second endpoint.
If both the primary and secondary endpoints are not available, the
traffic goes to the third, and so on
My Traffic Manager monitoring
Low Priority
High Priority
I tried raising and lowering the priority, but nothing changed.
As you can see, Traffic Manager still points to the teststatic site only.
Another question about the documentation quoted above:
If the primary endpoint is not available
What does "not available" mean here? I am using Azure Web Apps for testing, so I assumed that stopping my web app would make it unavailable. That turned out to be wrong: even after I stop the web app, Traffic Manager still points to the stopped web app. So I'm confused about what "not available" means here.

In your screenshots, the monitor status of the test endpoint is always Degraded. This indicates that the endpoint is not included in DNS responses and does not receive traffic, which is why Traffic Manager still points to the teststatic site only. If the monitoring protocol is HTTP or HTTPS, Traffic Manager considers an endpoint to be ONLINE only when the probe receives an HTTP 200 response back from the probe path. Any non-200 response is a failure.
You need to troubleshoot the degraded state on Azure Traffic Manager; see Traffic Manager shows monitor status is degraded – Resolution.
What does "not available" mean here?
Traffic Manager chooses an endpoint based on the configured status of each endpoint (disabled endpoints are not returned), the current health of each endpoint, and the chosen traffic-routing method. An endpoint that is "not available" is one that is not included in the DNS response, either because it is disabled or because it is unhealthy. The exception is when all endpoints are degraded, in which case all of them are treated as eligible to be returned in the query response. You can get more details from the endpoint monitor status.
An endpoint is unhealthy when any of the following events occur: A
non-200 response is received (including a different 2xx code, or a
301/302 redirect); Request for client authentication; Timeout (the
timeout threshold is 10 seconds); Unable to connect.
Also, run ipconfig /flushdns to flush the DNS resolver cache when you verify the Traffic Manager settings.
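The selection rules described in this answer can be sketched in a few lines of Python. This is only an illustrative model, not Traffic Manager's actual implementation: the `Endpoint` class and the `pick_priority_endpoint` function are hypothetical names, and the sketch assumes that a lower priority number means a higher priority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    name: str
    priority: int          # assumption: lower number = higher priority
    enabled: bool = True   # disabled endpoints are never returned
    healthy: bool = True   # False models a Degraded monitor status


def pick_priority_endpoint(endpoints):
    """Return the endpoint a Priority-routed profile would answer with, or None."""
    enabled = [e for e in endpoints if e.enabled]
    if not enabled:
        return None
    candidates = [e for e in enabled if e.healthy]
    if not candidates:
        # Every enabled endpoint is degraded: treat all of them as eligible.
        candidates = enabled
    return min(candidates, key=lambda e: e.priority)


# The situation from the question: the higher-priority endpoint is degraded.
endpoints = [
    Endpoint("test", priority=1, healthy=False),
    Endpoint("teststatic", priority=2, healthy=True),
]
print(pick_priority_endpoint(endpoints).name)  # teststatic
```

Changing the priority numbers alone does not change the result while the test endpoint stays degraded, which matches what the question describes.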

Related

active passive load balancing in Azure for VM

What's a good native Azure service that I can use for active/passive load balancing on VMs with private endpoints? The application on these servers will cause issues if more than one node is active and we'd. The VMs are in availability zones and are connected via private endpoints only. We need connectivity to TCP ports, so it's not just port 443 access.
Thank you
what's a good native Azure service that I can use for Active/Passive
load balancing on VM's with private endpoints?
You can use Azure Traffic Manager with private endpoints to load balance the Azure VMs.
If you use Azure Traffic Manager, keep in mind that the health monitoring feature is not available for Traffic Manager with private endpoints.
Understanding Traffic Manager probes
Traffic Manager considers an endpoint to be ONLINE only when the probe receives an HTTP 200 response back from the probe path. If your application returns any other HTTP response code, you should add that response code to the Expected status code ranges of your Traffic Manager profile.
A 30x redirect response is treated as failure unless you have specified this as a valid response code in Expected status code ranges of your Traffic Manager profile. Traffic Manager does not probe the redirection target.
For HTTPS probes, certificate errors are ignored.
The actual content of the probe path doesn't matter, as long as a 200 is returned. Probing a URL to some static content like "/favicon.ico" is a common technique. Dynamic content, like the ASP pages, may not always return 200, even when the application is healthy.
A best practice is to set the probe path to something that has enough logic to determine whether the site is up or down. In the previous example, by setting the path to "/favicon.ico", you are only testing that w3wp.exe is responding. This probe may not indicate that your web application is healthy. A better option would be to set the path to something such as "/Probe.aspx" that has logic to determine the health of the site. For example, you could use performance counters to check CPU utilization or measure the number of failed requests. Or you could attempt to access database resources or session state to make sure that the web application is working.
If all endpoints in a profile are degraded, then Traffic Manager treats all endpoints as healthy and routes traffic to all endpoints. This behavior ensures that problems with the probing mechanism do not result in a complete outage of your service.
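The probe rules quoted above can be modeled as a small predicate. This is a sketch under stated assumptions rather than Traffic Manager's real logic: `probe_is_healthy` is a hypothetical helper, `None` stands in for a timeout or a failed connection, and the default range mirrors the "only HTTP 200" behavior.

```python
def probe_is_healthy(status_code, expected_ranges=((200, 200),)):
    """Return True when a probe result counts as healthy.

    status_code is None when the probe timed out or could not connect.
    expected_ranges models the "Expected status code ranges" setting as
    inclusive (low, high) pairs.
    """
    if status_code is None:
        return False
    return any(low <= status_code <= high for low, high in expected_ranges)


print(probe_is_healthy(200))                                 # True
print(probe_is_healthy(302))                                 # False
print(probe_is_healthy(302, ((200, 202), (301, 302))))       # True
print(probe_is_healthy(None))                                # False
```

A redirect only counts as healthy once its code is listed in the expected ranges, and the redirect target itself is never checked.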
Alternatively, you can use Azure Front Door Premium, which supports routing traffic over Private Link. In that setup, an Application Gateway or load balancer with private IPs acts as the backend, and Front Door handles the routing.

Azure Traffic Manager monitoring status is 'degraded' for Azure Application Gateway

The Azure Traffic Manager monitoring status for the endpoints (Azure Application Gateway/WAF) is Degraded. The web app behind the Application Gateway is healthy and can be accessed through Traffic Manager.
Any help will be appreciated.
thanks.
You can press F12 and check the Network tab for your web page to see which status is returned. You can also use other tools to show the HTTP status code returned from the probe URL. An endpoint is unhealthy when any of the following events occur:
A non-200 response is received (including a different 2xx code, or a 301/302 redirect), or a response that is not within the configured Expected status code ranges.
Request for client authentication
Timeout
Unable to connect
Also, if all endpoints in a profile are degraded, then Traffic Manager treats all endpoints as healthy and routes traffic to all endpoints. This behavior ensures that problems with the probing mechanism do not result in a complete outage of your service.
Verify whether any of the above events occurs on your side and adjust the health probe configuration accordingly. For example, the health probe path should have enough logic to determine whether the endpoint is up or down. You can also edit the expected status code ranges and the probe timeout. See the documentation on configuring endpoint monitoring for more information.
I found the solution. The issue occurs only when the Application Gateway listener is a multi-site listener. With a basic listener, it works as expected.
The fix is to configure the custom header settings with the hostname, like below:
hostname:web1.com,newheader:web2.com
You also need to set the custom status code range.
Refer: https://sakaldeep.com.np/1156/troubleshooting-azure-traffic-manager-monitoring-status-is-degraded-for-azure
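To check locally what a probe with such headers would look like, the header setting can be parsed and attached to a request. The sketch below is a rough illustration, not an official tool: `parse_custom_headers` and `build_probe_request` are hypothetical helpers, the URL is a placeholder, and the request is only constructed, never sent. The header string is copied from the answer above as-is.

```python
import urllib.request


def parse_custom_headers(setting):
    """Split a 'name:value,name:value' setting into a dict of headers."""
    headers = {}
    for pair in setting.split(","):
        name, _, value = pair.partition(":")  # split on the first colon only
        if name.strip():
            headers[name.strip()] = value.strip()
    return headers


def build_probe_request(url, setting):
    """Build (but do not send) a GET request carrying the custom headers."""
    return urllib.request.Request(
        url, headers=parse_custom_headers(setting), method="GET"
    )


setting = "hostname:web1.com,newheader:web2.com"
print(parse_custom_headers(setting))

# Placeholder URL; pass the request to urllib.request.urlopen to send it.
request = build_probe_request("https://example.com/", setting)
print(request.get_method(), request.full_url)
```

Sending such a request against the gateway and comparing the returned status code with the expected status code ranges is one way to reproduce what the probe sees.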

Azure Traffic Manager make sure no traffic is flowing after disabling endpoint

I am trying to find a PowerShell command that confirms there are no open connections or traffic flowing to endpoint1, or that traffic is moving smoothly to endpoint2, after disabling endpoint1:
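# Assumes $e holds the profile's endpoints; pass the single modified endpoint, not the whole collection
$e[0].EndpointStatus = "Disabled"
Set-AzureRmTrafficManagerEndpoint -TrafficManagerEndpoint $e[0]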
Is there a command to do this? I could not find anything on Google. Or should I use a wait command to pause for about a minute so that all open connections are flushed out?
*Basically looking for a way to make sure all in-flight connections are drained from one endpoint before disabling it.
Traffic does not flow through your Traffic Manager instance. Therefore, the functionality you are asking for from Traffic Manager does not exist. Traffic Manager simply resolves DNS queries to an IP address of one of your endpoints using the routing method (priority, weighted, performance, etc) you configured it for.
After disabling an endpoint, you could still see traffic going to the disabled endpoint for a period of time determined by the DNS TTL setting of your Traffic Manager profile. For example, if you disable an endpoint at 3:01:00 and your DNS TTL setting is 90 seconds, then you could see traffic until 3:02:30, because that's how long it could take for any client's DNS cache to expire. One way to monitor this is through the Queries by Endpoint Returned metric described here. This should work in most cases. However, it's not 100% reliable. Disabling an endpoint in Traffic Manager won't stop a client that already knows the IP address of your endpoint from calling it. You can decide whether or not this scenario is likely for your application and clients. So, to be absolutely certain there are no active clients using the endpoint, you will need some monitoring in place at the endpoint itself.
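The arithmetic in the example above can be checked with a short sketch. The function name `last_possible_cached_hit` is hypothetical and the calendar date is arbitrary; only the time of day and the 90-second TTL come from the answer.

```python
from datetime import datetime, timedelta


def last_possible_cached_hit(disabled_at, dns_ttl_seconds):
    """Latest moment a client's cached DNS answer may still point at the endpoint."""
    return disabled_at + timedelta(seconds=dns_ttl_seconds)


disabled_at = datetime(2024, 1, 1, 3, 1, 0)  # arbitrary date, 3:01:00
cutoff = last_possible_cached_hit(disabled_at, 90)
print(cutoff.strftime("%H:%M:%S"))  # 03:02:30
```

In a script, waiting at least the TTL after disabling the endpoint covers DNS caches that honor the TTL, but it does not cover clients that keep using an already known IP address or an already open connection.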
Finally, if you gracefully stop your web app, virtual machine, or other service hosting the endpoint you want disabled, then any active requests to your application will complete before the service shuts down, assuming your application completes requests in a reasonable time (a few seconds).
Documentation on how to test and verify your Traffic Manager settings is available here.

Azure Website doesn't detect Traffic Manager Change

I have an Azure website (website.mycompany.com) that uses a WCF service for some data. The WCF Service sits behind an Azure Traffic Manager (service.mycompany.com) running in "priority mode", with 2 instances of the service for failover handling. With priority mode, the primary always serves up the data first, unless it's unavailable. If unavailable, the 2nd instance will reply.. and so on down the line.
We've had a few instances recently where the primary endpoint for service.mycompany.com was offline. For "partnerships" who point to service.mycompany.com, they detected the switch and all was fine. Lately however, our own site (website.mycompany.com) does NOT detect the traffic manager switch, and the website has errors since the service fails to reply.
Our failover endpoint is up in these instances, and in the past the Azure website detected the switch. We've only recently encountered this issue. Has anyone experienced similar issues? Are there perhaps any DNS settings that we need to tweak in our Azure website to help it honor TTLs?
Has anyone experienced similar issues?
Do you mean that Traffic Manager can't switch to another endpoint immediately?
Traffic Manager works at the DNS level. Here are the reasons why it can't switch immediately:
The duration of the cache is determined by the 'time-to-live' (TTL) property of each DNS record. Shorter values result in faster cache expiry and thus more round-trips to the Traffic Manager name servers. Longer values mean that it can take longer to direct traffic away from a failed endpoint.
Traffic Manager endpoint monitoring also affects the response time. For more information about how Azure Traffic Manager works, please refer to the link.
The following timeline is a detailed description of the monitoring process.
We can also check the Traffic Manager profile using nslookup and ipconfig on Windows. For how to verify Traffic Manager settings, please refer to the link.
By the way, because traffic manager works at the DNS level, it cannot influence existing connections to any endpoint. When it directs traffic between endpoints (either by changed profile settings, or during failover or failback), Traffic Manager directs new connections to available endpoints. However, other endpoints might continue to receive traffic via existing connections until those sessions are terminated. To enable traffic to drain from existing connections, applications should limit the session duration used with each endpoint.
I'm going to refer you to my answer here because while the situation isn't exactly the same, it seems like it could have the same solution. To summarize, I find it likely that you have a connection left open to the down service that isn't being properly closed. This connection is independent of TTL, which only deals with DNS caching, and as such bypasses Traffic Manager completely.

Do I need a Site Down page for Azure Traffic Manager in Priority Mode

I'm setting up an Azure Traffic Manager, in Priority Mode, for my website. I have a primary and a failover location, both monitored by a "FailoverMonitor.aspx" page. If any resources are down for the given resource\region, I return a 500 error. I also wanted to make sure an error message was returned to the user if all locations were down.
In my testing, I decided to break both my primary (priority 1) and failover (priority 2), and in doing so, I saw that the primary location was served up.
This kind of surprised me, I half expected the site to not return anything at all.. but instead it served up a site that is considered to be in a "degraded" status.
I added a 3rd endpoint to the traffic manager that returns a "sorry we're down" page - but is this the intended methodology to return such a message? I just want to make sure I'm going through all intended steps and not misusing the service. Thanks!
When all the endpoints being monitored by Traffic Manager for a given profile are down, it makes a "best case effort" and responds as if all the endpoints are actually in an online state, instead of not returning any endpoint at all.
More details of this and other endpoint monitoring details can be found at: https://azure.microsoft.com/en-us/documentation/articles/traffic-manager-monitoring/
Relevant section copy pasted below:
What happens if all Traffic Manager endpoints (excluding endpoints with a Disabled or Stopped status) are failing their health checks, and show a Degraded status?
This most commonly is caused by an error in the configuration of the service (such as an access control list [ACL] blocking the Traffic Manager health checks), or an error in the configuration of the Traffic Manager profile (such as an incorrect monitoring path).
In this case, Traffic Manager makes a "best effort" attempt and responds as if all the Degraded status endpoints actually are in an online state. This is preferable to the alternative, which would be to not return any endpoint in the DNS response.
A consequence of this behavior is that if Traffic Manager health checks are not configured correctly, it might appear from the traffic routing as though Traffic Manager is working properly. However, in this case, endpoint failover will not happen if an endpoint fails, and this affects overall application availability. To ensure that this does not occur, it is important to check that the profile shows an Online status, and not a Degraded status. An Online status shows that the Traffic Manager health checks are working as expected.
I wanted to also make sure an error message was returned to the user if all locations were down.
Since Traffic Manager is a DNS-only solution, I'm not sure what else would be supposed to serve the "We're down" page.
A 3rd endpoint serving a static page should do the job.
