Azure sticky-session/sourceIP load balancing and instance probing - azure

We have a cluster of instances running as workerroles on Azure. They contain services that rely on state. Users connect via Socket.IO.
We need stickySessions, so that a client can stay with a single instance and the service that is running on it. If the user is re-routed to another instance, the client breaks.
My questions:
The LB distribution mode sourceIP will allow sticky sessions. But only as long the balanced instance set does not change (instance added/removed). When an instance is probed unhealthy, does the set change? And what does that mean for current users? Will they be re-routed?
What is a "new user" anyway? SourceIP uses a 2-tuple hash. Can i rely on clients sitting on the same system not being marked "new" when probes flag an instance unhealthy? Will they still be routed to that instance?
And lastly. Is there any sure and easy way to achieve sticky sessions?

Short story:
Any flow already established will persist unless of course the destination itself dies; this is about new flows only. The ability to pin all flows from a specific source to a specific VM irrespective of the health status of the backend pool is not supported in Azure Load Balancer and not what sourceIP distribution mode is. If that is what you need, you can either use another tool in the arsenal or you can solve this by having each node reach out to wherever state is actually kept.
Read long story below for details.
Long story:
When you create an Azure Load Balancer function, the SDN stack uses the protocol header values to compute a hash. The distribution mode choice governs which protocol header values are used to compute the hash:
5-tuple: source IP, source port, destination IP, destination port,
protocol
2-tuple: source IP, destination IP
2-tuple is what is referred to as distribution mode sourceIP. Since the destination IP is always that of the VIP when this is evaluated, only the source IP controls the hash. And because port numbers aren't considered, any session from a given source IP address will result in the same hash.
The hash determines where a new session lands. The session will persist on that destination until the flow terminates.
If the pool changes because a node becomes healthy or sick, the hash will change and impact new flows only.
Any session already in progress stays wherever it is. You just cannot guarantee a given source will always end up the same destination over time with a changing pool.
Here's an example:
Let's assume you have a scenario where you have VM's A, B, C in a pool behind an LB configuration. You turn on sourceIP.
First session from client X comes in, lands on B. As long as the pool stays the same (all nodes remain healthy and sick nodes aren't healed), any additional sessions from X will also land on B.
When the pool changes because a node changes health status from healthy to sick or vice versa, a new session from client X may land on a node other than B. Any sessions already on B remain on B.
Unless of course B itself is dead, in which case those established sessions will time out.

Related

How to add a new independent IP to a running Hazelcast cluster without restarting existed node?

Hazelcast cluster running in different hosts IP1, IP2...
hazelcast.xml configure the TCP-IP members
enter image description here
Now I want to expanding the cluster to support more service.
I install a new hazelcast in new IP3
How can I add the new IP3 to the exsiting cluster without restarting IP1, IP2?
The members section in TCP is for finding the cluster.
You list some places where cluster members may be. The process starting tries those locations, and if it gets a response the response includes the locations of all cluster members.
When scaling up you frequently won't know the location in advance. The TCP list is one solution, but there's other ways if running on the cloud, etc.
For your specific question: You don't need to add IP3 to your XML. Or you can and it be picked up the next time the processes are restarted.
If you're new to Hazelcast, why not join the community slack
Usually, you don't need to change anything, and just starting a new member (IP3) with listed IP1 and IP2 will work. The third member will join the cluster.
How does it look like under the hood (simplified):
the newly started member tries to contact addresses in its member list; after a successful connection, it sends a "Who Is Master" request and reads the master node address from the response;
the new node makes a connection to the master address (if not established already) and asks it to join the cluster;
if the join is successful, the master replies with an updated member list (cluster view) and it also sends the updated list to all other cluster members;
There were significant improvements in handling corner cases for these "incomplete address configuration" issues in Hazelcast 5.2. So if you are on an older version I strongly recommend switching to an up-to-date one.
If for any reason the default behavior is not sufficient in your case, you can also use the /hazelcast/rest/config/tcp-ip/member-list REST endpoint to make member-list changes. The endpoint was also introduced in 5.2. Find details in the documentation: https://docs.hazelcast.com/hazelcast/5.2/network-partitioning/split-brain-recovery#eliminating-unsuccessful-cluster-merges

Node script - Failover from one server to another server

I have a nodejs script - lets call it "process1" on server1, and same script is running on server2 - "process2" (just with flag=false).
Process1 will be preforming actions and will be in "running" state at the beginning. process2 will be running but in "block" state with flag programmed within it.
What i want to acomplish is to, implement failover/fallback for this process. If process1 goes down flag on process2 will change, and process2 will take over all tasks from process1 (and vice versa when process1 cames back - fallback).
What is the best approach to do this? TCP connection between those?
NOTE: Even its not too much relevant, but i want to mention that these processes are going to work internally, establishing tcp connection with third server and parsing data we are getting from that server. Both of the processes will be running on both of the servers, but only ONE process at the time can be providing services - running with flag true (and not both of them)
Update: As per discussions bellow and internal research/test and monitoring of solution, using reverse proxy will save you a lot of time. Programming fail-over based on 2 servers only will cover 70% of the cases related with the internal process which is used on the both machines - but you will not be able to detect others 30% of the issues caused because of the issues with the network (especially if you are having a lot of traffic towards DATA RECEIVER).
This is more of an infrastructure problem than it is a Node one, and the same situation can be applied to almost any server.
What you basically need is some service that monitors Server 1 and determines whether it's "healthy" or "alive" and if so continue to direct traffic to it. If the service determines that the server is no longer in a stable condition (e.g. it takes too long to respond, returns an error) it will redirect any incoming traffic to Server 2. When it's happy Server 1 has returned to normal operating conditions it will redirect the traffic back onto it.
In most cases, the "service" in this scenario is a reverse proxy like Nginx or CloudFlare. In your situation, this server would act as a buffer between Data Reciever and your network (Server 1 / Server 2) and route the incoming traffic to the relevant server.
That looks like a classical use case for a reverse proxy. Using a well tested server such as nginx should provide plenty reliability the proxy won't fail (other than hardware failure) and you could put that infront of whatever cluster size you want. You'd even get the benefit of load-balancing if that is applicable and configured properly.
Alternatively and also leaning towards a load-balancing solution, you could have a front server push requests into a queue (ZMQ for example) and either push from the queue to the app server(s) or have your app-server(s) pull tasks from the queue independently.
In both solutions, if it's a requirement not to "push" 2 simultaneous results to your data receiver, you could use an outbound queue that all app-servers push into.

Does Azure load balancer allow connection draining

I cant seem to find any documentation for it.
If connection draining is not available how is one supposed to do zero-downtime deployments?
Rick Rainey answered essentially the same question on Server Fault. He states:
The recommended way to do this is to have a custom health probe in
your load balanced set. For example, you could have a simple
healthcheck.html page on each of your VM's (in wwwroot for example)
and direct the probe from your load balanced set to this page. As long
as the probe can retrieve that page (HTTP 200), the Azure load
balancer will keep sending user requests to the VM.
When you need to update a VM, then you can simply rename the
healthcheck.html to a different name such as _healthcheck.html. This
will cause the probe to start receiving HTTP 404 errors and will take
that machine out of the load balanced rotation because it is not
getting HTTP 200. Existing connections will continue to be serviced
but the Azure LB will stop sending new requests to the VM.
After your updates on the VM have been completed, rename
_healthcheck.html back to healthcheck.html. The Azure LB probe will start getting HTTP 200 responses and as a result start sending
requests to this VM again.
Repeat this for each VM in the load balanced set.
Note, however, that Kevin Williamson from Microsoft states in his MSDN blog post Heartbeats, Recovery, and the Load Balancer, "Make sure your probe path is not a simple HTML page, but actually includes logic to determine your service health (eg. Try to connect to your SQL database)." So you may actually want an aspx page that can check several factors, including a custom "drain" flag you put somewhere.
Your clients need to simply retry.
The load balancer only forwards a request to an instance that is alive (determined by pings), it doesn't keep track of the connections. So if you have long-standing connections, it is your responsibility to clean them up on restart events or leave it to the OS to clean them up on restarts (which is obviously not gracefully in most of the cases).
Zero-downtime means that you'll always be able to reach an instance that is alive, nothing more- it gives you no guarantees on long running requests.
Note that when a probe is down, only new connections will go to other VMs
Existing connections are not impacted.

Security implications of including worker port number in session ID

I wrote a multi-process realtime WebSocket server which uses the session id to load-balance traffic to the relevant worker based on the port number that it is listening on. The session id contains the hostname, source port number, worker port number and the actual hash id which the worker uses to uniquely identify the client. A typical session id would look like this:
localhost_9100_8000_0_AoT_eIwV0w4HQz_nAAAV
I would like to know the security implications for having the worker port number (in this case 9100) as part of the session id like that.
I am a bit worried about Denial of Service (DoS) threats - In theory, this could allow a malicious user to generate a large number of HTTP requests targeted at a specific port number (for example by using a fake sessionID which contains that port number) - But is this a serious threat? (assuming you have decent firewalls)? How do big companies like Google handle dealing with sticky sessions from a security perspective?
Are there any other threats which I should consider?
The reason why I designed the server like this is to account for the initial HTTP handshake and also for when the client does not support WebSocket (in which case HTTP long-polling is used - And hence subsequent HTTP requests from a client need to go to the same worker in the backend).
So there are several sub-questions in your question. I'll try to split them up and answer them accordingly:
Is DoS-Attack on a specific worker a serious threat?
It depends. If you will have 100 users, probably not. But you can be sure, that there are people out there, which will have a look at your application and will try to figure out the weaknesses and exploit those.
Now is a DoS-Attack on single workers a serious possibility, if you can just attack the whole server? I would actually say yes, because it is a more precise attack => you need less resources to kill the workers when you do it one by one. However if you allow connection from the outside only on port 80 for HTTP and block everything else, this problem will be solved.
How do big companies like Google handle dealing with sticky sessions?
Simple answer - who says, they do? There are multiple other ways to solve the problem of sessions, when you have a distributed system:
don't store anything session based on the server, just have a key in the cooky with which you can identify the user again, similar as with automatic login.
store the session state in a data base or object storage (this will add a lot of overhead)
store session information in the proxy (or broker, http endpoint, ...) and send them together with the request to the next worker
Are there any other threats which I should consider?
There are always unforeseen threats, and that's the reason, why you should never publish more information than necessary. In that case, most big companies don't even publish the correct name and version of their WebServer (for google it is gws for instance)
That being said, I see your point why you might want to keep your implementation, but maybe you can modify it slightly to store in your load balancer a dictionary with a hashed value of hostname, source port number, worker port number and have as a session id a collection of two hashes. Than the load balancer knows, by looking into the dictionary, to which worker it needs to be sent. This info should be saved together with a timestamp, when the info was retrieved the last time, and every minute you can delete unused data.

Peer to Peer: Methods of Finding Peers

Are there any known methods of finding peers without using a dedicated central server?
ie: If I have peers which are disconnecting and reconnecting to the internet but getting a new IP address each time, and I want to connect to them without setting up a dedicated server to register with.
I was thinking about using peers email address to send a manifest of connected peers periodically, with some sort of timecode, negating the need for a dedicated server. This would be a fallback if none of the peers could be connected to after trying all the previously known peer addresses. But existing models of finding peers would be preferable.
There's no way around having to know at least one initial peer to discover more.
Fully P2P protocols, such as Gnutella or Gnutella2, or the simpler Overnet (made famous by Storm Worm), are based on each client having a start-up list of a few peers. These can come off a web-based automated tracker for example. The client will discover the whole network or portions of it by asking other peers for more addresses, for example when delegating a file search.
If you truly can't have any kind of a centralized resource, the best you can do is find the first peer through broadcasted messages and ultimately IP address scanning. The first approach is well-meaning but in at least 98% of cases won't yield any results. The later approach, of course, is abusing the internet, as well as illegal in most countries.
I really would rethink having some kind of a central tracker. It can be something as simple as a PHP script on a webserver (the gnutella network, today, is held up by ten-twenty such scripts, hosted by people who don't even know each other). And this sure is more lightweight than email (which, due to spam filters at the very least, would not work anyway).
In the limited case of peers within an intranet, it is possible to send a broadcast UDP message to a known port asking for peers to report back.
The BitcoinQT client uses a variety of methods to find nodes, some of them might be useful to you.
Satoshi Client Node Discovery
IRC is no longer used, but might be the most easy to implement:
As of version 0.6.x the Bitcoin client no longer uses IRC bootstrapping by default, and as of version 0.8.2 support for IRC bootstrapping has been removed completely. This documentation below is accurate for most prior versions.
In addition to learning and sharing its own address, the node learned about other node addresses via an IRC channel. See irc.cpp.
After learning its own address, a node encoded its own address into a string to be used as a nickname. Then, it randomly joined an IRC channel named between #bitcoin00 and #bitcoin99. Then it issued a WHO command. The thread read the lines as they appeared in the channel and decoded the IP addresses of other nodes in the channel. It did this in a loop, forever, until the node was shutdown.
When the client discovered an address from IRC, it set the timestamp on the address to the current time, but it used a "penalty" of 51 minutes, which means it looked like it was actually seen almost an hour earlier.
Take advantage of any existing forum where data can posted. Think secret IRC channel, embedding data in photos and posting to photo sharing sites 4chan?, any site that would allow your application to login and post data without captia logins etc.
http://chatzilla.hacksrus.com/faq/#password
Another strategy might be to embedded messages in digital currency transactions. Pick a cheap coin that's likely to hang around ... DOGE or MOON coin maybe. Build wallet functionality into your app. such that you can post micro transactions back and forth between addresses that your app controls. There would still be a miners fee, but this is only fractions of pennies. Even if they later prohibit adding metadata to transactions, you could make a transaction equivalent to your IP address in MOON, and use vanity addresses in MOON coin for your app. such that when a new node comes online it knows what to search the blockchain for -- 2daMOON%bootStr#pM3. SEND - 104.003021133 MOON IP = 104.3.21.133 not an expensive proposition.
Old question but I've been thinking about this problem myself so will ad my 2-cents. In short, a central server is not required if a node is aware of at least one valid peer. New nodes must be added to the network by any current member (e.g. invited, or node spawns another node, depending on your application).
Assuming that:
agents keep track of peers; the size of this address book and how entries are managed will depend on the nature of the system; e.g. how long peers remain connected, if peers use stable addresses
agents share peer information with other peers
at least some agents remain available for relatively long periods of time relative to frequency node connects to network to update it's address book (or nodes have stable addresses)
in addition to peer addresses, availability information is also tracked (many options here depending on your system. examples include: whether peer has a stable address, when last seen, some availability metric, content/service type information, address valid-until time if known)
new agents are initialized with at least one valid peer (doesn't have to be a central node, can be any valid node)
trust mechanisms shall be required if malicious peers are a possibility
When a peer comes online, it queries the peers in it's peer table to discover which are active and perhaps removes expired dynamic addresses. Nodes exchange peer information and may become linked themselves. This peer discovery/exchange may continue a certain number of hops or via random walk until peer list if of sufficient size and/or quality.
A few more details:
Nodes connect and share peer information with frequency related to how often node addresses change, so address book doesn't become stale and node becomes disconnected because none of it's former peers are available at their last known addresses
Nodes may need to limit the number of peers they accept, to avoid tendency towards centralization around the most stable nodes.
Nodes should be selective about the peers they keep; i.e. ones in which they are more likely to exchange data (e.g. weight based upon history)
Node links may be asymmetric or symmetric depending on the application
Three ways, off the top of my head, though you're always going to need some central server to start the connection unless you went with option 3.
Central server that maintains known list of peers, with keep-alive.
One or more central servers that maintain some common resource peers can use to discover one another, but once connected no longer need the central server as long as the peer remains connected (something like BitTorrent); can chain peered connections as well.
Port/IP scanning (strongly not recommended).
In your example, you'd still have some kind of central server where the peers would be registered; the protocol is the only difference.
To put it simply no, there is no way to do this without a central sever.
If you want to do this you simply need one or more central servers, whether by dynamic dns or not. The clients need a method to discover where they should connect to, and the only truly sensible way to do this is with your own server, in the simplest scenario it only needs to send an IP address in response.
Virtual severs can be had for around $15/month, which IMO is considerably cheaper than trying to use or abuse someone else's bandwidth.
[Edit].
To put it simply, there is another way, as follows.
Upon reflection I think what I'd do is to designate a set of peers as cluster controllers and use a dynamic DNS service to allow other peers to discover the cluster controllers.
Choose a dynamic DNS provider I'll call it myc.ath.cx (I Use http://www.dyndns.com/).
Each peer has to be capable of becoming a cluster controller. A cluster controller will contain a list of all the other peers connected.
When a peer is started it looks up myc.ath.cx and attempts to connect. If connection cannot be made within a period, say 30 seconds, it takes over the registration of the DNS entry.
Any peer wishing to discover other peers can simply query myc.ath.cx and a list will be provided
All peers are responsible for periodically downloading the list of peers, in case they need to cluster controller.
The cluster controller will periodically query the DNS entry - if has changed from it's IP address then it knows that it is no longer the cluster controller - so it will contact the cluster controller that currently has the DNS entry and provide it's list of known hosts.
The cluster controller will periodically contact hosts on the list to ensure that they are still valid.
Your method of sending email does use a dedicated server, though; the peer's email server, to be precise.
Roughly, I don't think it's possible without using some sort of dedicated storage or server (which the email approach does, albeit obliquely) UNLESS you are able to characterize the connectivity to the internet that your peers are using.
Basically, if you have a set of X number of peers, that connect for Y amount of time, and they are then off the grid for Z amount of time... essentially, you can construct a probability equation about how likely it is that the set of peers that you last contacted is still available; where that probability approaches 1 (for a given set of X, Y, and Z above), you can most likely sustain a peer-to-peer network without using storage.
Possibly more in the spirit; instead of having a "dedicated central server", use simple online free service to specify a peer list. Set up a yahoo group, or something like that; clients can automatically look it up and get a peer address from which to query a set of peers; the client can be coded with the authentication to post to the group, and can post periodically its IP address so that others can request the set of known active peers.
If you want to get really tricky, you can start using basically steganographic methods to hide peer location information. I.e. get a google search for "blah"; find the first site listed in the results that has an unprotected (no CAPTCHA) message board; find the third (or whatever) post that starts with "Indubitably" (or whatever), and find the header of the first message there, and there's the IP address of a peer. If that doesn't work, go down the list of search terms to the next one.
But that's sneaky. :-)
Could you re-use an existing dedicated server for the purpose?
I am thinking in particular of registering each of the peers with a Dynamic DNS, but if you were willing to get a bit uglier, sharing access to a known Hotmail account or Google Doc or the like.
You can either use a central directory or some sort of broadcast protocol for service discovery. Assuming that you could get them indexed by Google, you could conceive of a system whereby each peer runs a web site with some unique, rare words contained on a specific page. You could then use Google search results based on these words to identify potential peers. This would essentially be a (noisy and slow) internet broadcast.
If the page structure was a well known pattern or contained identifiable connection information for that peer, it would be easy to distinguish them in the search results. Using such a public directory leaves you open to compromised nodes in the network that is formed, but this is pretty much true of any P2P network absent some security mechanism.
Getting the web sites crawled and highly ranked by Google (or some other search engine) for your particular arcane set of search terms would be the trick. I can think of a couple of ways, but they aren't ones that I would use. For a legitimate service, I'd rather spend the money or find a free web site that could function as a directory.
What about another P2P system built specifically to track online peers of other P2P systems?
Then we reduce the problem of finding peers for any new P2P system to simply finding peers for the 'main' P2P system, which will give you the addresses of online peers for the system you're interested in using...
This is a typical use of a distributed hash table algorithm. I'd suggest looking at something like pastry. It uses a overlay network (Application layer network) on top of other layers.
Each node has a GUID which is used to route requests across the peer network.
If you're loooking for an already established central server then see the metaserver entry on page here:
http://martindevans.appspot.com/
You can register peers on there and then other peers can find them. Obviously this is a central server, but it requires no maintenance on your part.

Resources