How are Azure ASE front ends themselves load balanced? - azure

I'm researching Azure App Services and App Service Environments. I can see that the "front end" acts as a load balancer for the workers. I can also see that there is a default of 2 front ends, with more being added as the number of workers increases.
My question is, if the front ends act as a load balancer for the workers, what decides which of the multiple front ends serves a request? I'd always assumed a load balancer would need to be a single instance, or you'd end up with the same problem it set out to solve.
As a follow-up question, I'm also curious how the load is balanced across the workers. Is it simple round robin?

The front end is a Layer 7 load balancer that acts as a proxy, distributing incoming HTTP requests between different applications and their respective workers. Currently, the App Service load-balancing algorithm is a simple round robin between the set of servers allocated to a given application.
Reference: https://msdn.microsoft.com/en-us/magazine/mt793270.aspx?f=255&MSPPError=-2147217396
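
As a rough illustration of what "simple round robin" means here, the following is a minimal Python sketch. The class name `RoundRobinBalancer` and the worker names are invented for the example; it only shows the rotation idea and is not App Service's actual implementation.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out workers in a fixed, repeating order."""

    def __init__(self, workers):
        workers = list(workers)
        if not workers:
            raise ValueError("at least one worker is required")
        # cycle() repeats the list forever: a, b, c, a, b, c, ...
        self._rotation = cycle(workers)

    def next_worker(self):
        """Return the worker that should receive the next request."""
        return next(self._rotation)


# Hypothetical worker names, purely for illustration.
balancer = RoundRobinBalancer(["worker-a", "worker-b", "worker-c"])
picks = [balancer.next_worker() for _ in range(5)]
print(picks)
```

Each request simply goes to the next worker in the rotation, regardless of how busy that worker currently is.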

Related

Does Azure (Standard) load-balancing require two or more nodes in backend pool?

I'm configuring/testing Azure (Standard) load balancer, currently with a backend pool that has a single VM; in the future, additional VMs will be added.
With only a single VM in the BP, I assumed my app can still be configured to use the LB. However, I'm finding that the app is not able to connect to the VM in the BP e.g. winhttp timeout (12002).
The only reason I can think of as to why the LB is not sending traffic to the VM is because maybe there is an unwritten requirement that a backend pool is required to have at least two VMs/nodes. I cannot find documentation that confirms or denies.
Of course I can just test this myself by adding a second VM to the BP, but I'm not quite ready to do that yet, so I thought I'd ask.
FYI - the LB has two backend pools: #1 has two VMs for that component of the app, #2 has one VM for that component of the app.
#1 works fine; the LB is spreading the load across both VMs.
#2 does not work
Just really wanting to know if Azure LB can work when the backend pool has a single node, or are two or more nodes required.
Any thoughts/details on this topic?
"Just really wanting to know if Azure LB can work when the backend pool has a single node, or are two or more nodes required."
As far as I know, you can place a single VM in the backend pool. See the Load Balancer SKU comparison for details.
For example, I have a single VM that hosts a default website on port 8080, and I configured the load balancer like this:
Backend pool setting
Health probes
Load balancer rules
Access the backend website via load balancer public IP address
For the error message, check that your configuration is correct and read "Troubleshoot Azure Load Balancer" for more details.
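
Since the load balancer only forwards traffic to backends whose health probe succeeds, one quick check is whether the probed port accepts TCP connections at all. Below is a minimal Python sketch of such a check, assuming a TCP-style probe; the function name `tcp_probe` is made up for this example, and any address you pass in is your own.

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This mimics what a TCP health probe checks: that something is
    listening on the port. It does not validate any HTTP response.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For instance, running `tcp_probe("<vm-private-ip>", 8080)` from another machine in the same virtual network would show whether the website port from the example above is reachable. If it returns False, the probe is likely failing too, and the load balancer will not send traffic to that VM.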

High concurrency system on Google App Engine

Here is my situation.
I have a project hosted on Google Cloud, more specifically GAE (NodeJS) and Firestore.
I have a queue stored in Firestore that could hold up to 30-40k entries.
Each entry is basically an object with which I'll have to make an API call to an external service.
That external service allows only 10 requests/s for one IP.
At the moment I take batches of 10 and make an API call for each entry, but it's too slow.
I already tried running multiple instances of the GAE service, but I still hit the limit (do the instances use the same IP?).
Another option would be to move the API call into a Cloud Function and make it from there, but I think I would get the same outcome as with the GAE instances.
So, what do you think?
Many thanks!
In my opinion, the requests-per-second-per-IP limit is in place to throttle the overall number of incoming requests, and gaming this rule may cause issues for that service. The best way to handle this situation is either to get a paid subscription or to discuss the issue directly with the service provider.
Regarding the App Engine instances and IP addresses the short answer is:
No, GAE instances don't have their own dynamic IPs.
You can confirm this in the App Engine FAQ:
App Engine does not currently provide a way to map static IP addresses to an application. In order to optimize the network path between an end user and an App Engine application, end users on different ISPs or geographic locations might use different IP addresses to access the same App Engine application. DNS might return different IP addresses to access App Engine over time or from different network locations.
Running tcptraceroute to a Google service shows one of these points:
lga34s14-in-f14.1e100.net
According to the description of Google Edge Network:
Our Edge Points of Presence (PoPs) are where we connect Google's network to the rest of the internet via peering. We are present on over 90 internet exchanges and at over 100 interconnection facilities around the world.
To sum up: your application should exit Google's network from the edge point closest to its target. It makes sense that this is always the same point with the same IP, and given the number of services and client applications GCP hosts, you can expect Google to use a reverse proxy.
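
If you stay within the limit, the arithmetic is fixed: at 10 requests/s, 40,000 entries take at least 4,000 seconds (roughly 67 minutes) no matter how many instances you run. The following is a minimal Python sketch of a client-side throttle that respects such a limit. The original project is Node.js; Python is used here only for illustration, and the names `Throttle`, `drain`, and `estimated_seconds` are made up for the example.

```python
import time


class Throttle:
    """Spaces calls so that at most `rate` calls start per second."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._next_slot = clock()  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed to start."""
        now = self._clock()
        if now < self._next_slot:
            self._sleep(self._next_slot - now)
            start = self._next_slot
        else:
            start = now
        self._next_slot = start + self._interval


def drain(entries, call, throttle):
    """Apply `call` to every entry, never exceeding the throttle's rate."""
    results = []
    for entry in entries:
        throttle.wait()
        results.append(call(entry))
    return results


def estimated_seconds(queue_size, rate):
    """Lower bound on the time needed to process the queue at `rate`/s."""
    return queue_size / rate


print(estimated_seconds(40_000, 10))  # 4000.0
```

In real use, `call` would be whatever function sends one entry to the external service. Because calls are spaced at least 1/rate seconds apart, the throttle stays under the limit even when individual calls return quickly.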

How do you set up Azure load balancing for micro-services?

We've got an API micro-services infrastructure hosted on Azure VMs. Each VM will host several APIs which are separate sites running on Kestrel. All external traffic comes in through an RP (running on IIS).
We have some APIs that are designed to accept external requests and some that are internal only.
The internal APIs are hosted on scale sets, with each scale set VM being a replica that hosts all of the internal APIs. There is an internal load balancer (ILB)/VIP in front of the scale set. The root issue is that we have internal APIs that call other internal APIs hosted on the same scale set. Ideally these calls would go to the VIP (using internal DNS) and the VIP would route to one of the machines in the scale set. But it looks like Azure doesn't allow this. Per the documentation:
You cannot access the ILB VIP from the same Virtual Machines that are being load-balanced
So how do people set this up with micro-services? I can see three ways, none of which are ideal:
Separate out the APIs to different scale sets. Not ideal, as the services are very lightweight and I don't want to triple my Azure VM expenses.
Convert the internal LB to an external LB (add a public IP address), then put that LB in its own network security group/subnet to only allow calls from our Azure IP range. I would expect more latency here, and exposing the endpoints externally in any way creates more attack surface area as well as more configuration complexity.
Set up the VM to loop back if it needs to call the ILB, meaning any request originating from a VM will be handled by the same VM. This defeats the purpose of micro-services behind a VIP. An internal micro-service may be down on the same machine for some reason and available on another; that's the reason we set up health probes on the ILB for each service separately. If it just goes back to the same machine, you lose resiliency.
Any pointers on how others have approached this would be appreciated.
Thanks!
I think your problem is related to service discovery.
Load balancers are obviously not designed for that. You should consider dedicated software such as Eureka (which can work outside of AWS).
Service discovery lets your microservices call each other directly once they have been discovered.
Also take a look at client-side load balancing tools such as Ribbon.
@Cdelmas's answer on service discovery is awesome. Please allow me to add my thoughts:
For services such as yours, you can also look into Netflix's Zuul proxy for server-side and client-side load balancing. You could even use Hystrix on top of Eureka for latency and fault tolerance. Netflix is way ahead of the game on this.
You may also look into the Consul.io product if you want to use the Go language. It has scriptable configuration for better managing your services, and allows advanced security configurations and the use of non-REST endpoints. Eureka also does these but requires you to add a configuration server (Netflix Archaius, Apache ZooKeeper, Spring Cloud Config), coded security, and support for access using Zuul/Sidecar.
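
To make the idea of service discovery plus client-side load balancing concrete, here is a minimal in-memory Python sketch. It is not the Eureka, Ribbon, or Consul API; the classes `ServiceRegistry` and `ClientSideBalancer`, the service name, and the addresses are all invented for illustration.

```python
class ServiceRegistry:
    """In-memory stand-in for a discovery service."""

    def __init__(self):
        # service name -> {address: healthy flag}
        self._instances = {}

    def register(self, service, address):
        self._instances.setdefault(service, {})[address] = True

    def set_health(self, service, address, healthy):
        self._instances[service][address] = healthy

    def healthy_instances(self, service):
        return [addr for addr, ok in self._instances.get(service, {}).items() if ok]


class ClientSideBalancer:
    """Picks a healthy instance per call, rotating between candidates."""

    def __init__(self, registry):
        self._registry = registry
        self._counters = {}  # service name -> number of picks so far

    def choose(self, service):
        candidates = self._registry.healthy_instances(service)
        if not candidates:
            raise LookupError(f"no healthy instance of {service!r}")
        index = self._counters.get(service, 0)
        self._counters[service] = index + 1
        return candidates[index % len(candidates)]


# Hypothetical service name and addresses.
registry = ServiceRegistry()
registry.register("orders", "10.0.0.4:5001")
registry.register("orders", "10.0.0.5:5001")
balancer = ClientSideBalancer(registry)
print(balancer.choose("orders"), balancer.choose("orders"))
```

The caller resolves an address itself and connects directly, so a service on one VM can reach the same service on another VM without going through the ILB VIP, and an instance marked unhealthy is simply skipped.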

Scaling of Azure service fabric Stateless services

Can you please give me a better understanding of how we can scale stateless services without partitioning?
Say we have 5 nodes in a cluster and 5 instances of the service. In simple testing, a node behaves as sticky: all the requests I send are served by only one node. When a high volume of requests comes in, can other instances automatically be used to serve the traffic? How do we handle such scale-out situations in Service Fabric?
Thanks!
Usually there's no need to use partitioning for stateless SF services, so avoid that if you can:
More on SF partitioning, including why it's not normally used for stateless services
If you're using the ServiceProxy API, it will maintain sticky connections to a given physical node in the cluster. If you're (say) exposing HTTP endpoints, you'll have one for each physical instance in the cluster (meaning you'll end up talking to one at a time, unless you manually cycle through them). You can avoid this by:
Creating a new proxy instance for each call, which tends to be expensive if you do it a lot (or manually cycling through the list of instance endpoint URLs, which can be tedious and/or expensive)
Putting a load balancer in front of your cluster and configuring all traffic from your clients to SF nodes to be forwarded through it. The load balancer can be configured for round-robin or similar semantics:
Azure Load Balancer
Azure Traffic Manager
Good luck!
You can send the request through the reverse proxy installed on each node; see https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-reverseproxy
The reverse proxy then resolves the endpoint for you. If you have multiple instances of a stateless service, it will forward your request to a random one.
During heavy load you can increase the instance count of your service, and the proxy then includes the new instances automatically.
I will assume you are calling your services from outside your cluster. If so, your problem is not specific to Service Fabric; it is Azure VMSS + LB.
Service Fabric runs on top of a virtual machine scale set, and these VMs are created behind a load balancer. When a client connects to your service, it creates a connection through the load balancer. Whenever a connection is opened, the load balancer assigns one target VM to handle it, and every request the client makes over that same connection (keep-alive) is handled by the same node. This is why your load goes to a single node.
The LB won't round-robin the requests because they use the same connection. This is a limitation (or feature) of the LB. To work around it, open multiple connections or use multiple clients (instances).
This applies to the default distribution mode (hash-based). Also check the load-balancing rules in the LB to see whether the distribution mode is hash-based (5-tuple: source IP, source port, destination IP, destination port, protocol) or source IP affinity (2-tuple or 3-tuple, without the source port). In source IP affinity mode, multiple connections from the same IP will still be sent to the same node.
Source: Azure Load Balancer Distribution Mode
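
The difference between the two distribution modes can be sketched in a few lines of Python. This is only an illustration of the idea: SHA-256 is a stand-in for whatever hash the load balancer actually uses, and the function name `pick_backend`, the node names, and the IP addresses are made up for the example.

```python
import hashlib


def pick_backend(backends, src_ip, src_port, dst_ip, dst_port, protocol,
                 mode="five_tuple"):
    """Map a connection to a backend by hashing selected fields."""
    if mode == "five_tuple":
        # Default mode: the source port is part of the key, so separate
        # connections from one client can land on different nodes.
        key = (src_ip, src_port, dst_ip, dst_port, protocol)
    elif mode == "source_ip":
        # Affinity mode (2-tuple variant): the source port is ignored,
        # so one client IP always lands on the same node.
        key = (src_ip, dst_ip)
    else:
        raise ValueError(f"unknown mode: {mode}")
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


nodes = ["node0", "node1", "node2", "node3", "node4"]  # hypothetical names

# One client opening 200 connections from different source ports.
five = {pick_backend(nodes, "203.0.113.7", port, "10.0.0.10", 80, "tcp")
        for port in range(50000, 50200)}
sticky = {pick_backend(nodes, "203.0.113.7", port, "10.0.0.10", 80, "tcp",
                       mode="source_ip")
          for port in range(50000, 50200)}
print(len(five), len(sticky))
```

A single keep-alive connection has one fixed 5-tuple, so all of its requests map to the same node in either mode. Only new connections (new source ports) can spread across nodes, and only in the 5-tuple mode.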

Azure load balancing with three websites

I have three websites like web1.azurewebsites.net, web2..., web3... in the same region and want a load balancer that divides the traffic evenly among them (a so-called round-robin configuration). How can I accomplish this in Azure?
I know that I can use Traffic Manager, but only if the websites are in different regions.
Sorry, but I'm very new to Azure...
/Joe
Do you need to have three different web sites? Or can you have a single web site scaled out to multiple instances? (guide here)
If you scale out, it should use a built-in load balancer that is based on a hash of the user's IP + port, as described here.
And yes, Traffic Manager isn't really intended for true load balancing; it's a DNS-level redirection, mainly for improving performance based on geography.
