I have 2 Azure VMs (Linux) being load balanced by a public Azure Cloud Service. Both instances show in the Azure Management Portal for the same cloud service. I want to take down one instance and perform some maintenance. However, since the instance still shows even though the VM has been shut down, the Cloud Service is still directing traffic to it. How do I delete an instance from the Cloud Service, or stop the Cloud Service from directing traffic to a particular VM instance? And afterwards, how does one re-associate an existing VM with that service (i.e. move it from one Cloud Service to another)?
Note: SSH into the VM works, but other ports used by the VM do not; they act as if traffic is being sent to the other VM, even though the correct endpoints have been created for the active VM.
The purpose of a port probe in a load-balanced set is for the load balancer to be able to detect whether or not a VM is able to accept traffic. When configuring the load-balanced endpoint you can specify a webpage or a TCP endpoint for the probe - and this should be present on each instance. Traffic will be directed to the VM as long as the webpage returns 200 OK or the TCP endpoint accepts the connection when the load balancer probes. You can specify the time interval between probes and the number of probes that must fail before the endpoint is deemed dead and should be taken out of rotation (defaults are every 15 seconds and 2 probes).
You can take a VM out of load-balancer rotation by ensuring that the configured probe page returns something other than 200 OK, and then bring it back into rotation by having it return 200 OK once again.
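As a minimal sketch of that idea (assuming a self-hosted Python probe endpoint on each Linux VM; the port and flag-file path here are made up for illustration), you could serve the probe page yourself and flip it to a non-200 status whenever you want the instance drained:

```python
# health_probe.py - toy probe endpoint for an Azure load-balanced set.
# Assumption: the load balancer probe points at http://<vm>:8080/health.
# Touch /tmp/maintenance to fail the probe and drain this VM; remove the
# file to bring it back into rotation.
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

MAINTENANCE_FLAG = "/tmp/maintenance"  # hypothetical flag-file path

class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        # Anything other than 200 makes the probe mark this instance unhealthy.
        self.send_response(503 if os.path.exists(MAINTENANCE_FLAG) else 200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), ProbeHandler).serve_forever()
```

With the defaults quoted above (a probe every 15 seconds, 2 failed probes), the VM should stop receiving new connections roughly 30 seconds after you create the flag file.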
When I have needed to keep my web service running and returning a status of 200, I have had to resort to removing the endpoint from the load-balanced set. It is pretty simple to do, but it usually takes a minute for the web portal to remove the endpoint, and again to recreate it and put it back in the set.
What's a good native Azure service that I can use for active/passive load balancing on VMs with private endpoints? The application on these servers will cause issues if more than one node is active. The VMs are in availability zones and are connected via private endpoints only. We need connections to TCP ports, so it's not just port 443 access.
Thank you
What's a good native Azure service that I can use for active/passive load balancing on VMs with private endpoints?
You can use Azure Traffic Manager with private endpoints to load balance the Azure VMs.
If you are using Azure Traffic Manager, keep one thing in mind: the health-monitoring feature is not available for Azure Traffic Manager with private endpoints.
Understanding Traffic Manager probes
Traffic Manager considers an endpoint to be ONLINE only when the probe receives an HTTP 200 response back from the probe path. If your application returns any other HTTP response code, you should add that response code to the Expected Status Code Ranges of your Traffic Manager profile.
A 30x redirect response is treated as a failure unless you have specified it as a valid response code in the Expected Status Code Ranges of your Traffic Manager profile. Traffic Manager does not probe the redirection target.
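To make those semantics concrete, here is a small Python sketch of how a probe response is classified against the configured status code ranges (the function is mine for illustration, not a Traffic Manager API; the default range accepts only 200):

```python
def probe_is_online(status_code, expected_ranges=((200, 200),)):
    """Return True if the probe's HTTP status falls in any configured range.

    expected_ranges mirrors the profile's "Expected status code ranges",
    e.g. ((200, 202), (301, 302)) to also accept redirects.
    """
    return any(lo <= status_code <= hi for lo, hi in expected_ranges)

assert probe_is_online(200)                            # default: only 200 is ONLINE
assert not probe_is_online(302)                        # a redirect fails by default
assert probe_is_online(302, ((200, 200), (301, 302)))  # unless explicitly allowed
```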
For HTTPS probes, certificate errors are ignored.
The actual content of the probe path doesn't matter, as long as a 200 is returned. Probing a URL to static content like "/favicon.ico" is a common technique. Dynamic content, like ASP pages, may not always return 200, even when the application is healthy.
A best practice is to set the probe path to something that has enough logic to determine whether the site is up or down. In the previous example, by setting the path to "/favicon.ico", you are only testing that w3wp.exe is responding. This probe may not indicate that your web application is healthy. A better option would be to set the path to something such as "/Probe.aspx" that has logic to determine the health of the site. For example, you could use performance counters such as CPU utilization, or measure the number of failed requests. Or you could attempt to access database resources or session state to make sure that the web application is working.
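A hedged Python sketch of such a "deep" probe (standing in for the hypothetical /Probe.aspx; the database path and load threshold are illustrative assumptions, not anything Azure-specific):

```python
# deep_probe.py - probe with real health logic, not just "is the process alive".
# Returns 200 only if CPU load looks sane and the database answers a query.
import os
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer

DB_PATH = "/var/data/app.db"   # hypothetical database file
MAX_LOAD_PER_CPU = 4.0         # illustrative threshold

def healthy():
    # 1-minute load average, normalized per CPU (Linux/Unix only).
    if os.getloadavg()[0] / (os.cpu_count() or 1) > MAX_LOAD_PER_CPU:
        return False
    try:
        conn = sqlite3.connect(DB_PATH, timeout=2)
        try:
            conn.execute("SELECT 1")  # trivial query as a liveness check
        finally:
            conn.close()
        return True
    except sqlite3.Error:
        return False

class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200 if healthy() else 503)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), ProbeHandler).serve_forever()
```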
If all endpoints in a profile are degraded, then Traffic Manager treats all endpoints as healthy and routes traffic to all endpoints. This behavior ensures that problems with the probing mechanism do not result in a complete outage of your service.
Alternatively, you can use Azure Front Door Premium, as it supports traffic routing to Private Link: you use an Application Gateway or Load Balancer with private backend IPs, with Front Door handling the routing.
I deployed a new ACI using the option 'mcr.microsoft.com/azuredocs/aci-helloworld:latest (linux)' from the Azure portal. Once that's deployed and running, visiting the FQDN for the container loads the page below. Makes sense.
However, if I stop the ACI instance and wait a few minutes, I get the following page for about the next 15 minutes (except mine says Functions 3.0). After those 15 minutes, I then get a DNS probe error message, which makes sense. If my ACI is stopped, why is there a Function App responding to requests?
I can only speculate, but this may still be valuable information for you.
The 15-minute gap
The 15-minute gap sounds very much like DNS caching. When I deploy a container instance in the West Europe region with the hostname "my-important-container" and a public IP, I get a publicly available DNS record for it like this:
my-important-container.westeurope.azurecontainer.io
In this case, the DNS record is created for you by the Azure platform. Microsoft engineers have probably set 15 minutes of caching as the default value.
When creating a DNS record by hand, you can specify the number of seconds for which it will be cached across the global network of DNS servers, so that they don't have to resolve it against the authoritative server every single time someone uses that name to reach a web service. A 15-minute cache means serving 1 lookup instead of 1000 if there are 1000 requests to a website within a 15-minute window (from the same area, using the same non-authoritative server).
If you want to experiment with DNS caching, it is very easy using Azure. For example, you can use Azure DNS zones, or, if you don't want to buy a domain, Azure Private DNS zones on a private VNET, and see how caching works.
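For instance, a small Python script (assuming the third-party dnspython package, installed with pip install dnspython; the FQDN is the example record from above) lets you watch a record's TTL count down as resolvers cache it:

```python
# ttl_check.py - observe DNS caching on an ACI hostname.
# Assumes dnspython (pip install dnspython); FQDN is the example from above.
import time
import dns.resolver

FQDN = "my-important-container.westeurope.azurecontainer.io"

for _ in range(3):
    answer = dns.resolver.resolve(FQDN, "A")
    # rrset.ttl is the remaining cache lifetime reported by your resolver;
    # a fresh answer for a 15-minute record would show roughly 900 seconds.
    print(f"{answer[0].address}  TTL={answer.rrset.ttl}s")
    time.sleep(30)
```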
The "Function app is up and running" phenomenon
This implies that Azure hosts Container Instances on a common serverless platform together with Azure Functions. That IP address is, at that time, allocated to a serverless instance, but of course you have removed/stopped your container, so the underlying layer responds with a default placeholder message. It is a misleading response, because you are not actually using Functions, and your serverless workload is not actually 'up and running' at that time.
Microsoft could prevent this issue by injecting context information when creating a serverless instance. That way, the instance would know whether it is currently serving a container instance or a function, and could respond with a more informative placeholder message.
We have a Web App hosted on multiple (scaled-out) Premium Dv2 instances using Azure App Service.
Occasionally our application fails to start up after a restart. This results in a 503 Service Unavailable response for requests to that instance. But when this happens, requests still get routed evenly between this instance and the healthy instances.
Shouldn't the load-balancer rather route requests away from this instance? Can this be achieved?
NOTE: We are not using API Management or App Service Environment.
Shouldn't the load-balancer rather route requests away from this instance?
Azure Load Balancer can probe the health of the various server instances. When a probe fails to respond, the load balancer stops sending new connections to the unhealthy instances.
AFAIK, before you get the 503 error, requests are still routed to that instance.
But when this happens, requests still get routed evenly between this instance and the healthy instances.
I found the following possible scenarios in which traffic is still routed to an unhealthy instance (see the sketch after this list):
1. The timeout and frequency values set in SuccessFailCount determine whether an instance is confirmed to be running or not running. In the Azure portal, the timeout is set to two times the value of the frequency.
2. The HTTP server doesn't respond at all within the timeout period. Depending on the timeout value that is set, multiple probe requests might go unanswered before the probe gets marked as not running.
3. If you have web roles that use w3wp.exe, you also get automatic monitoring of your website. Failures in your website code return a non-200 status to the load balancer probe; consequently, the load balancer doesn't take that instance out of rotation.
4. The TCP server doesn't respond at all within the timeout period. When the probe is marked as not running depends on the number of failed probe requests that were configured to go unanswered before marking the probe as not running.
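To illustrate points 1, 2 and 4, here is a rough Python model of the probe loop's failure counting (the 15-second interval and 2-probe threshold mirror the defaults mentioned earlier in this thread; the probe URL and timing are assumptions, not the load balancer's actual implementation):

```python
# probe_loop.py - toy model of health-probe failure counting.
import time
import urllib.error
import urllib.request

PROBE_URL = "http://10.0.0.4/health"  # hypothetical instance probe URL
INTERVAL_SECONDS = 15                 # probe frequency
FAILURE_THRESHOLD = 2                 # failures before leaving rotation

failures = 0
in_rotation = True
while True:
    try:
        with urllib.request.urlopen(PROBE_URL, timeout=5) as resp:
            ok = resp.status == 200
    except (urllib.error.URLError, TimeoutError):
        # A non-200 response raises HTTPError (a URLError subclass), and a
        # silent server simply times out; both count as failed probes.
        ok = False

    failures = 0 if ok else failures + 1
    if failures >= FAILURE_THRESHOLD and in_rotation:
        in_rotation = False
        print("instance marked unhealthy: no new connections sent")
    elif ok and not in_rotation:
        in_rotation = True
        print("instance healthy again: back in rotation")
    time.sleep(INTERVAL_SECONDS)
```

With these values, an instance that stops answering is only taken out of rotation after roughly interval x threshold seconds, which is why requests can still land on it in the meantime.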
For more details, you can refer to this article.
We have a third-party product that runs as a Windows service and is exposed as a web service. The goal is to dynamically provision service instances during business peak hours.
Just to run the thought by you guys:
- I've already deployed the service on multiple VMs, configured the VMs in the same cloud service availability set, and configured Azure to turn VM instances on/off based on CPU usage
- I plan to configure a separate VM running IIS ARR, pointed at the endpoints on the VMs configured above, in the hope that ARR balances requests across the back-end VMs dynamically
Will this work? What's the best practice for IaaS scaling? Any thoughts? Truly appreciate the input.
If I have understood correctly, you just need to use the built-in load balancer of the cloud service. Create a load-balanced set for your endpoint. For example, if you want to balance the incoming traffic to port 80 in your application, all you have to do is create an LB set for this port and configure that set on all the VMs in the Cloud Service.
The Azure Load Balancer randomly distributes a specific type of incoming traffic across multiple virtual machines or services in a configuration known as a load-balanced set. For example, you can spread the load of web request traffic across multiple web servers or web roles.
Configure a load-balanced set
Azure load balancing for virtual machines
No matter whether VMs are up or down: once a VM turns on, and its endpoint is configured in the same LB set, it will automatically start receiving requests once port 80 is online (IIS has started and is returning status 200 OK, for example). So, answering your question: yes, it will work with auto-scale or with manually turning VMs on/off.
I have 3 Ubuntu VMs configured (they are not websites, but configured as a cloud service) with HTTP endpoints in a load-balanced set. This is working really well, with the probe configured to check port 80 every 15 seconds. When the URL does not return a status of 200, the VM is removed from the load-balanced set until the next time it returns a status of 200. I have used the web portal to configure the endpoints and probe settings.
When an unhealthy instance is taken out of the load-balanced set, I would like to be notified (preferably via email) so that I can fix the issue.
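Classic endpoints didn't alert on this out of the box, so one hedged workaround is a small watchdog script run from cron on a separate machine, probing the same URLs the load balancer does and emailing you on failure. Everything here (probe URLs, SMTP host, addresses) is a placeholder:

```python
# probe_watchdog.py - email an alert when an instance's probe stops returning 200.
# All hosts and addresses below are placeholders; run from cron, e.g. every minute.
import smtplib
import urllib.error
import urllib.request
from email.message import EmailMessage

INSTANCES = [
    "http://vm1.example.com/health",  # hypothetical per-instance probe URLs
    "http://vm2.example.com/health",
    "http://vm3.example.com/health",
]
SMTP_HOST = "smtp.example.com"
ALERT_FROM = "watchdog@example.com"
ALERT_TO = "ops@example.com"

def probe_ok(url):
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, TimeoutError):
        return False

unhealthy = [url for url in INSTANCES if not probe_ok(url)]
if unhealthy:
    msg = EmailMessage()
    msg["Subject"] = f"{len(unhealthy)} instance(s) failing their probe"
    msg["From"] = ALERT_FROM
    msg["To"] = ALERT_TO
    msg.set_content("Probes failing:\n" + "\n".join(unhealthy))
    with smtplib.SMTP(SMTP_HOST) as server:
        server.send_message(msg)
```

This only approximates the load balancer's view (it probes from outside), but it should email you within a minute of an instance starting to fail its probe.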