Taper off connections nicely in IIS 8.5 - iis

Is there any mechanism in IIS 8.5 where you can taper off the current connections?
The scenario is that you have two servers that are Network Load Balanced by forward facing hardware. You do not have access or direct control over the NLB. You wish to take a server offline, but you do not wish to kill any connections, just prevent new connections from occurring and wait for current connections to die.

Related

Load balancing sockets on a horizontally scaling WebSocket server?

Every few months when thinking through a personal project that involves sockets I find myself having the question of "How would you properly load balance sockets on a dynamic horizontally scaling WebSocket server?"
I understand the theory behind horizontally scaling the WebSockets and using pub/sub models to get data to the right server that holds the socket connection for a specific user. I think I understand ways to effectively identify the server with the fewest current socket connections that I would want to route a new socket connection too. What I don't understand is how to effectively route new socket connections to the server you've picked with low socket count.
I don't imagine this answer would be tied to a specific server implementation, but rather could be applied to most servers. I could easily see myself implementing this with vert.x, node.js, or even perfect.
First off, you need to define the bounds of the problem you're asking about. If you're truly talking about dynamic horizontal scaling where you spin up and down servers based on total load, then that's an even more involved problem than just figuring out where to route the latest incoming new socket connection.
To solve that problem, you have to have a way of "moving" a socket from one host to another so you can clear connections from a host that you want to spin down (I'm assuming here that true dynamic scaling goes both up and down). The usual way I've seen that done is by engaging a cooperating client where you tell the client to reconnect and when it reconnects it is load balanced onto a different server so you can clear off the one you wanted to spin down. If your client has auto-reconnect logic already (like socket.io does), you can just have the server close the connection and the client will automatically re-connect.
As for load balancing the incoming client connections, you have to decide what load metric you want to use. Ultimately, you need a score for each server process that tells you how "busy" you think it is so you can put new connections on the least busy server. A rudimentary score would just be number of current connections. If you have large numbers of connections per server process (tens of thousands) and there's no particular reason in your app that some might be lots more busy than others, then the law of large numbers probably averages out the load so you could get away with just how many connections each server has. If the use of connections is not that fair or even, then you may have to also factor in some sort of time moving average of the CPU load along with the total number of connections.
If you're going to load balance across multiple physical servers, then you will need a load balancer or proxy service that everyone connects to initially and that proxy can look at the metrics for all currently running servers in the pool and assign the connection to the one with the most lowest current score. That can either be done with a proxy scheme or (more scalable) via a redirect so the proxy gets out of the way after the initial assignment.
You could then also have a process that regularly examines your load score (however you decided to calculate it) on all the servers in the cluster and decides when to spin a new server up or when to spin one down or when things are too far out of balance on a given server and that server needs to be told to kick several connections off, forcing them to rebalance.
What I don't understand is how to effectively route new socket connections to the server you've picked with low socket count.
As described above, you either use a proxy scheme or a redirect scheme. At a slightly higher cost at connection time, I favor the redirect scheme because it's more scalable when running and creates fewer points of failure for an existing connection. All clients connect to your incoming connection gateway server which is responsible for knowing the current load score for each of the servers in the farm and based on that, it assigns an incoming connection to the host with the lowest score and this new connection is then redirected to reconnect to one of the specific servers in your farm.
I have also seen load balancing done purely by a custom DNS implementation. Client requests IP address for farm.somedomain.com and that custom DNS server gives them the IP address of the host it wants them assigned to. Each client that looks up the IP address for farm.somedomain.com may get a different IP address. You spin hosts up or down by adding or removing them from the custom DNS server and it is that custom DNS server that has to contain the logic for knowing the load balancing logic and the current load scores of all the running hosts.
Route the websocket requests to a load balancer that makes the decision about where to send the connections.
As an example, HAProxy has a leastconn method for long connections that picks the least recently used server with the lowest connection count.
The HAProxy backend server weightings can also be modified by external inputs, #jfriend00 detailed the technicalities of weighting in their answer.
I found this project that might be useful:
https://github.com/apundir/wsbalancer
A snippet from the description:
Websocket balancer is a stateful reverse proxy for websockets. It distributes incoming websockets across multiple available backends. In addition to load balancing, the balancer also takes care of transparently switching from one backend to another in case of mid session abnormal failure.
During this failover, the remote client connection is retained as-is thus remote client do not even see this failover. Every attempt is made to ensure none of the message is dropped during this failover.
Regarding your question : that new connection will be routed by the load balancer if configured to do so.
As #Matt mentioned, for example with HAProxy using the leastconn option.

Too many persistent TCP connections

We have around 500 clients connected to a Linux RedHat ES 5 server.
Recently it occurs, that the server still holds connections to clients which have been rebootet without stopping the application, which communicates with the server, before.
A netstat on the client always returns only one established connection to the server. After a client reboot, communication runs over a new established connection. On server side sometimes the old connection is closed, sometimes it stays in state established so that we have a growing number of established connections to each client.
Because various client operating systems are affected, I think that this isn't an application issue, but one of the Linux OS of the server.
I tried to tune the values of
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 9
without success.
Also I tried to set the maximum file handles value from 1024 to 2048, but connections still never get closed, not even after the TCP keepalive time expires.
Does somebody have an idea what could cause that strange behaviour?
Those settings allow you to configure the default keep-alive behavior (when keep-alives are enabled). However, they do not make keep-alives automatic. The feature must still be explicitly enabled on a per-socket basis via the SO_KEEPALIVE socket option.
See http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/ for details. From section 3:
Remember that keepalive support, even if configured in the kernel, is not the default behavior in Linux. Programs must request keepalive control for their sockets using the setsockopt interface.

IIS Zero Downtime Update ARR / Reverse Proxy

I have a C# console application / Windows sevice that uses the HttpListener stuff to handle requests, IIS is setup to reverse proxy to this via ARR.
My problem is that when I update this application there is a short downtime between the old instance being shut down and the new one being ready.
The approach I'm thinking about would be to add 2 servers to the server farm via local hostnames with 2 ports and on update I'd start the new instance which would listen on the unused port, stop listening for new requests on the old instance and gracefully shut it down (ie process the current requests). Those last 2 steps would be started by the new instance to ensure that it is ready to handle the requests.
Is IIS ARR load balancing smart enough to try the other instance and mark the now shut down one as unavailable without losing any requests until the new one is updated or do I have to add health checks etc (would that again lead to a short downtime period?)
One idea that I believe could work (especially if your IIS is only being used for this purpose) is to leverage the IIS overlapped recycling capabilities that are built-in when you make a configuration change. In this case what you could do is:
start a new instance of your app running listening in a different
port,
edit the configuration in ARR to point to the new port.
IIS should allow any existing requests running in the application pool within the recycling timeout to drain successfully while new requests will be sent to the new application pool.
Maybe if you share a bit more on the configuration you are using in ARR (like a snippet of %windir%\system32\inetsrv\config\applicationHost.config and the webFarms section)

IIS network load balancing

I have a clustered server with 4 nodes running Win server 2008 r2 with IIS 7.
Fail over kicks in when one of nodes fails but is there a way to have it round robin distribute incoming calls to different server?
This happens when incoming requests come from different client but our investigation shows that if there is one client that is making many requests, they all go to the same server.
I would like to the server to round robin request so that node 1 receives first request, node 2 receives second request and so on.
Each request could take a long time and having all requests go to the same node when I have 3 others idling is causing us perf issue. Thanks
NLB port rules have a couple of properties that control how requests are routed. The relevant properties seem to be:
Filtering mode - specifies whether a single host or multiple hosts in the cluster handle traffic for the given port
Affinity - controls how traffic is routed to hosts in the cluster
It is likely you need to set the Affinity value to none, which allows requests to be routed to multiple hosts within the cluster. The docs do not state whether round-robin or another algorithm is used for load balancing.
For more on Filtering Mode and Affinity: Network Load Balancing Manager Properties
How to: Edit a Network Load Balancing Port Rule
Round Robin Load Balancing will not distribute traffic coming from one destination. You will need to configure your load balancer to 'Least Connections'
Basically the NLB passes a new connection to the pool member or node that has the least number of active connections.

Does a software load balancer manage a two-way SSL connection? If so, how?

I don't have the faintest clue on how a software or hardware load balancer works. I guess the hardware load balancer is basically a switch and based on some algorithm decides which node to switch to for a incoming request. On the software load balancer front, I guess the software picks up a node and uses a reverse proxy connection to it. In such a scenario, 2-way SSL wont work as the load balancer cannot have the client's private key.
Again, I don't how a software load balancer works but as my application would need a load balancer and as the application uses 2-way SSL connection, I wanted to know how does a software load balancer take care of a 2-way SSL connection.
No, SSL works with a load balancer. They typically work at the TCP level, so the clients connect to the LB IP address, but it NATs the connections on to the real servers. The connection persists to the same real server for its lifetime, but if the same client makes another one, it can (and typically would) go to a different server.
For HTTPS this works fine, except that if you have a web server which supports SSL session caching, then the SSL session cache will be lost if the client comes back to a different server. In practice this is not a big problem. Of course HTTP keep-alive sessions aren't affected because they are a single TCP connection so they stay on the same realserver.
Generally speaking, a software load balancer will note that there is a new incoming connection request, assess the workload on the machines available, and allocate the new request to the most appropriate machine. When there is a session-based service, that connection will last for the duration of the session; rebalancing would only occur if a server went down, and would probably establish new connections in a newly balanced configuration.
So, as Jon implied, the SSL session would be established with a server, and would continue with that server until the session terminates.
If you want to route connections more dynamically, then it may be that the SSL session has to be terminated (decrypted) in front of the software that dynamically sends requests to different servers.
All these are possible - they are not necessarily efficient or implemented.
A software load balancer will distribute sessions evenly across multiple servers.
So, if a user hits your load balancer, it will send him to a specific server and that server will negotiate the SSL. The user will continually talk to this server until his session expires. At that point, he will hit the load balancer again.

Resources