How do I implement a custom load-balancing decision method that specifies exactly which server should process a request?
Currently I am working with Azure, so the MS solutions (ARR or WLBS) are preferable.
Each server instance may have several unique resources ("unique" meaning that only this particular server instance has them).
The application creates a unique ResourceID for each resource and gives this ResourceID to a client "on demand".
The client's further requests are specified by the ResourceID.
The custom load-balancing decision method should allow me to specify how:
To extract the ResourceID from the request (this must work at Layer 7).
To look up the ServerInstanceID (or IP, or whatever is required) for that ResourceID in my custom table.
To tell the load balancer exactly which application server instance should process this request (i.e. pass it the ServerInstanceID).
P.S. Maybe I should say "proxy" here instead of "load balancer". But for the sake of high availability it would require several proxy servers plus a load balancer to spread traffic between them, so a pure proxy solution just adds another tier to the application.
I have found two useful threads on the IIS.NET Forums:
Custom load balancing decision function
using URL Rewrite Module for custom load balancing
Two main approaches are recommended:
To use custom load balancing.
To use custom Application Request Routing (ARR).
The most interesting thing is that each thread recommends the approach from the other one :)
Nevertheless, I will review both suggested approaches.
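To make the requirement concrete, here is a minimal Layer-7 routing sketch in Node/TypeScript, not tied to ARR or WLBS; the resourceId query parameter, the backend addresses, and the in-memory table are illustrative assumptions standing in for what the real decision method would need to do:

```typescript
import http from "http";
import { URL } from "url";

// My custom table: ResourceID -> server instance that owns the resource (illustrative values).
const resourceToServer = new Map<string, { host: string; port: number }>([
  ["res-42", { host: "10.0.0.11", port: 8080 }],
  ["res-43", { host: "10.0.0.12", port: 8080 }],
]);

// Layer-7 entry point: read the ResourceID, look up the owning instance, forward the request.
http
  .createServer((clientReq, clientRes) => {
    const url = new URL(clientReq.url ?? "/", "http://placeholder");
    const resourceId = url.searchParams.get("resourceId"); // assumption: ID travels as a query parameter
    const target = resourceId ? resourceToServer.get(resourceId) : undefined;

    if (!target) {
      clientRes.writeHead(404).end("Unknown ResourceID");
      return;
    }

    // Forward the original request to the owning server instance.
    const upstream = http.request(
      {
        host: target.host,
        port: target.port,
        path: clientReq.url,
        method: clientReq.method,
        headers: clientReq.headers,
      },
      (upstreamRes) => {
        clientRes.writeHead(upstreamRes.statusCode ?? 502, upstreamRes.headers);
        upstreamRes.pipe(clientRes);
      }
    );
    upstream.on("error", () => clientRes.writeHead(502).end("Upstream error"));
    clientReq.pipe(upstream);
  })
  .listen(8000);
```

With ARR/URL Rewrite, the same three steps would presumably be expressed as a rewrite-map lookup plus a rewrite rule targeting the chosen server, rather than explicit proxy code as above.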
Related
I am using an IIS web garden with 15 worker processes for long-running requests.
With, for example, 3 browsers, multiple worker processes are typically used.
With Apache JMeter, all requests use the same worker process.
Is there a way to force the use of multiple worker processes?
This may have at least 2 explanations:
You have some hard-coded ID or session ID in your test plan. Check for their presence and remove them, and add a Cookie Manager to your test plan.
You have a load balancer that works in Source IP mode; in this case you need to either change the policy to Round Robin or add 2 more machines.
If you are using 1 thread with X iterations and expecting different workers, then check that:
A Cookie Manager is present in your Test Plan and configured appropriately.
In the Thread Group, "Same user on each iteration" is unchecked.
If the issue persists, please share your plan and check that you don't have a hardcoded ID somewhere in a Header Manager that would lead to only one worker being used.
A well-behaved JMeter script should produce the same network footprint as the real browser does, so if you're observing inconsistencies, your JMeter configuration most probably does not match the requests being sent by the real browser.
Make sure that your JMeter test is doing what it is supposed to be doing by inspecting request/response details with the View Results Tree listener.
Use a 3rd-party tool like Wireshark or Fiddler to capture the requests originating from the browser and from JMeter, detect the differences, and amend your JMeter configuration to eliminate them.
More information: How to make JMeter behave more like a real browser
In the vast majority of cases a JMeter script does not work as expected because of missing or improperly implemented correlation of dynamic values.
I am trying to find a good way to horizontally scale a stateful NodeJS service.
The Problem
The problem is that most of the options I find online assume the service is stateless. The NodeJS cluster documentation says:
Node.js [Cluster] does not provide routing logic. It is, therefore important to design an application such that it does not rely too heavily on in-memory data objects for things like sessions and login.
https://nodejs.org/api/cluster.html
We are using Kubernetes so scaling across multiple machines would also be easy if my service was stateless, but it is not.
Current Setup
I have a list of objects that stay in memory; each object alone is a transaction boundary. Requests to this service always have the object ID in the URL. Requests to the same object ID are put into a queue and processed one at a time.
Desired Setup
I would like to keep this interface to the external world but internally spread the list of objects across multiple nodes, with each request routed to the appropriate node based on the ID in its URL.
What is the usual way to do this in NodeJS? I've seen people use the user session to make sure a given user always goes to the same node; I would like to do the same thing, but keyed on the ID in the URL instead of the user session.
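As a sketch of the routing idea (not a claim about "the" usual way), one option is a thin gateway that deterministically maps the ID from the URL onto a fixed set of internal workers, so the same ID always lands on the same node. The worker list, the /objects/<id> path shape, and the hashing choice below are all assumptions:

```typescript
import { createHash } from "crypto";

// Assumed static set of internal workers (e.g. Kubernetes pod or headless-service addresses).
const workers = ["worker-0:3000", "worker-1:3000", "worker-2:3000"];

// Deterministically map the object ID taken from the URL to one worker,
// so requests for the same ID always reach the same node.
function ownerOf(objectId: string): string {
  const digest = createHash("sha1").update(objectId).digest();
  return workers[digest.readUInt32BE(0) % workers.length];
}

// A gateway would extract the ID from /objects/<id> and proxy the request to ownerOf(id).
console.log(ownerOf("order-1234")); // always resolves to the same worker for this ID
```

Note that plain modulo hashing reshuffles most IDs whenever the number of workers changes; for stateful objects you would likely want consistent/rendezvous hashing or an explicit ID-to-node registry instead, so that scaling only moves a small fraction of the objects.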
I am running a load test using JMeter on my Azure web services.
I scale my service to 4 S2 instances and run 4 JMeter instances with 500 threads each.
It starts perfectly fine, but after a while calls start failing with timeout errors (HTTP status 500).
I have checked the HTTP request queue on Azure and found that it is very high on the 2nd instance and very low on two of the instances.
Please help me get my load test to succeed.
I assume you are using Azure App Service. If you check the settings of your App, you will notice that ARR's Instance Affinity is enabled by default. A brief explanation:
ARR cleverly keeps track of connecting users by giving them a special cookie (known as an affinity cookie), which allows it to know, upon subsequent requests, which server instance they were talking to. This way, we can be sure that once a client establishes a session with a specific server instance, it will keep talking to the same server as long as its session is active.
This is an important feature for session-sensitive applications, but if that's not your case, then you can safely disable it to improve the load balancing between your instances and avoid situations like the one you've described.
Disabling ARR’s Instance Affinity in Windows Azure Web Sites
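Alternatively, if you would rather opt out from the application itself than flip the portal setting, App Service also honors an Arr-Disable-Session-Affinity response header that suppresses the affinity cookie. A minimal Express sketch, assuming a Node backend (everything besides the header name is illustrative):

```typescript
import express from "express";

const app = express();

// Ask the ARR front end not to issue its affinity cookie for responses from this app,
// so subsequent requests are free to land on any instance.
app.use((_req, res, next) => {
  res.setHeader("Arr-Disable-Session-Affinity", "true");
  next();
});

app.get("/", (_req, res) => {
  res.send("ok");
});

const port = Number(process.env.PORT) || 3000;
app.listen(port, () => console.log(`listening on ${port}`));
```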
It might be due to caching of DNS name resolution at the JVM or OS level, so that all your requests hit only one server. If that is the case, add a DNS Cache Manager to your Test Plan and it should resolve your issue.
See The DNS Cache Manager: The Right Way To Test Load Balanced Apps article for a more detailed explanation and configuration instructions.
I can't seem to find any documentation for it.
If connection draining is not available, how is one supposed to do zero-downtime deployments?
Rick Rainey answered essentially the same question on Server Fault. He states:
The recommended way to do this is to have a custom health probe in your load balanced set. For example, you could have a simple healthcheck.html page on each of your VMs (in wwwroot, for example) and direct the probe from your load balanced set to this page. As long as the probe can retrieve that page (HTTP 200), the Azure load balancer will keep sending user requests to the VM.
When you need to update a VM, you can simply rename healthcheck.html to a different name such as _healthcheck.html. This will cause the probe to start receiving HTTP 404 errors and will take that machine out of the load balanced rotation because it is not getting HTTP 200. Existing connections will continue to be serviced, but the Azure LB will stop sending new requests to the VM.
After your updates on the VM have been completed, rename _healthcheck.html back to healthcheck.html. The Azure LB probe will start getting HTTP 200 responses and as a result start sending requests to this VM again.
Repeat this for each VM in the load balanced set.
Note, however, that Kevin Williamson from Microsoft states in his MSDN blog post Heartbeats, Recovery, and the Load Balancer, "Make sure your probe path is not a simple HTML page, but actually includes logic to determine your service health (e.g. try to connect to your SQL database)." So you may actually want an aspx page that can check several factors, including a custom "drain" flag you put somewhere.
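As a sketch of that idea (a probe endpoint that checks a real dependency and also honors a manual drain flag), here is a hedged Node/TypeScript version rather than an aspx page; the /healthcheck path, the drain-file location, and checkDatabase are all assumptions:

```typescript
import http from "http";
import { existsSync } from "fs";

// Placeholder for a real dependency check (e.g. open a SQL connection and run SELECT 1).
async function checkDatabase(): Promise<boolean> {
  return true; // assumption: replace with an actual connectivity test
}

http
  .createServer(async (req, res) => {
    if (req.url === "/healthcheck") {
      // Manual drain switch: create this file on the VM before deploying to it,
      // and the probe starts failing so the LB stops sending new requests here.
      const draining = existsSync("/var/run/app.drain");
      const healthy = !draining && (await checkDatabase());
      res.writeHead(healthy ? 200 : 503).end(healthy ? "OK" : "DRAINING");
      return;
    }
    res.writeHead(200).end("app response");
  })
  .listen(8080); // illustrative port; point the probe at it
```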
Your clients need to simply retry.
The load balancer only forwards a request to an instance that is alive (determined by pings); it doesn't keep track of connections. So if you have long-standing connections, it is your responsibility to clean them up on restart events, or to leave it to the OS to clean them up on restarts (which is obviously not graceful in most cases).
Zero-downtime means that you'll always be able to reach an instance that is alive, nothing more; it gives you no guarantees on long-running requests.
Note that when a probe is down, only new connections will go to other VMs; existing connections are not impacted.
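In practice "simply retry" just means wrapping client calls in a small retry loop; a hedged sketch assuming a global fetch (Node 18+ or a browser), with an arbitrary retry count and delay:

```typescript
// Retry a request a few times with a short delay, since a restart event may drop
// in-flight connections even though another healthy instance is still reachable.
async function fetchWithRetry(url: string, attempts = 3, delayMs = 500): Promise<Response> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetch(url);
      if (res.ok) return res;
      lastError = new Error(`HTTP ${res.status}`);
    } catch (err) {
      lastError = err; // connection reset, timeout, etc.
    }
    await new Promise((r) => setTimeout(r, delayMs));
  }
  throw lastError;
}

// Usage: fetchWithRetry("https://myapp.example.com/api/data").then((r) => r.json());
```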
In every paper I have read that proposes a crawler design, I see that one important component is the DNS resolver.
My question is:
Why is it necessary? Can't we just make a request to http://www.some-domain.com/?
DNS resolution is a well-known bottleneck in web crawling. Due to the distributed nature of the Domain Name Service, DNS resolution may entail multiple requests and round-trips across the internet, requiring seconds and sometimes even longer. Right away, this puts in jeopardy our goal of fetching several hundred documents a second.
There is another important difficulty in DNS resolution; the lookup implementations in standard libraries (likely to be used by anyone developing a crawler) are generally synchronous. This means that once a request is made to the Domain Name Service, other crawler threads at that node are blocked until the first request is completed. To circumvent this, most web crawlers implement their own DNS resolver as a component of the crawler.
http://nlp.stanford.edu/IR-book/html/htmledition/dns-resolution-1.html
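To make the point concrete, a crawler typically issues asynchronous lookups through its own cache rather than calling a blocking getaddrinfo-style API per URL. A minimal Node/TypeScript sketch using dns.promises (the cache policy and TTL are arbitrary choices for illustration):

```typescript
import { promises as dns } from "dns";

type CacheEntry = { address: string; expiresAt: number };

const TTL_MS = 5 * 60 * 1000; // arbitrary cache lifetime for the sketch
const cache = new Map<string, CacheEntry>();

// Resolve a hostname asynchronously, reusing cached answers so thousands of
// concurrent fetches don't each pay a full DNS round-trip (or block a thread).
async function resolveHost(hostname: string): Promise<string> {
  const hit = cache.get(hostname);
  if (hit && hit.expiresAt > Date.now()) return hit.address;

  const addresses = await dns.resolve4(hostname); // non-blocking lookup
  const address = addresses[0];
  cache.set(hostname, { address, expiresAt: Date.now() + TTL_MS });
  return address;
}

// Example: resolve many hosts concurrently without blocking one another.
Promise.all(["example.com", "example.org"].map(resolveHost)).then(console.log);
```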