Leader address/location in Raft - discovery

This may be a very simple question but I've not been able to find a good answer to this yet. Maybe someone can help me.
Once a leader is elected -
The clients will send all requests ONLY to the leader. Is this correct?
Given that the location (for all practical purposes the IP address) of the leader is dynamic, how will the client know this IP address in a cluster?

The client attempts to submit requests to any host in the cluster. If the client guesses wrong the server either forwards the request to the leader or returns an error with the address of who it thinks is the leader.
The first method (forwarding to the leader) is a better experience for the client. Be sure to limit the number of forwarding hops and timeout on the client.
The second method (error with leader address) is much easier to implement. Since the leader is typically long-lived except during troubled times, it usually only takes one retry to get the request to the leader.

Related

How to do broadcast in Pulsar

I'm investigating a tech for our cluster. Pulsar looks good, but the usage looks more like a queueing system. Of course, queueing system is good to have, but I have a specific requirement: broadcasting.
We would like to use one machine to generate the data and publish it to a Pulsar topic. Then we use a group of servers, forming a replica. Each server consumes the message flow on that topic, and serves clients via WebSocket.
This is different than the Shared subscription, because each server needs to receive all messages, not a fraction of it.
I came to this post: https://kafkaesque.io/subscriptions-multiple-groups-of-consumers-on-pulsar-topic/ , which explains how to do such a job: each server needs create a new exclusive subscription, say use a UUID as its subscription name, from the unique exclusive subscription you can get the full message flow of that topic.
But since our server replica can be dynamic, so once some of the server restart, they will create new UUID subscription again, which will leave many orphan subscriptions on the topic, which eventually would become maintenance headache.
Anyone has the experience to setup a broadcast use case using Pulsar?
Actually, I found that the "Reader Interface" is exactly for this kind of use case:
https://pulsar.apache.org/docs/en/concepts-clients/#reader-interface
Using an exclusive subscription for each consumer is the only way to ensure that each of your consumers receives ALL of the messages on the topic, and Pulsar handles multiple subscriptions quite well.
The issue it seems is the server restart use case, and I don't think that simply connecting with a new UUID subscription is the right approach (putting aside the orphaned subscriptions). You really want to have the server reuse the previous subscription after it restarts. This is because each subscription keeps track of the last message in the topic that it had processed and acknowledged, so you can pick up exactly where you had left off before the server crashed if you reconnect with the same subscription UUID. If you connect with a new UUID, then you will start processing messages produced from that point in time forward, and all messages produced during the restart period will be "lost"
Therefore, you will need to find a mechanism to share these UUIDs across server failures and return them to the restarting server. One approach would be to have a mechanism similar to zookeeper leader election, in which each server is granted an exclusive lease that expires periodically. The server must then periodically refresh the lease to retain it. Then if the server were to crash, it would fail to refresh the lease on that UUID and the restarting server would then be granted the lease when it attempts to reconnect.
See https://curator.apache.org/curator-recipes/leader-election.html for a better explanation of the pattern.

Security implications of including worker port number in session ID

I wrote a multi-process realtime WebSocket server which uses the session id to load-balance traffic to the relevant worker based on the port number that it is listening on. The session id contains the hostname, source port number, worker port number and the actual hash id which the worker uses to uniquely identify the client. A typical session id would look like this:
localhost_9100_8000_0_AoT_eIwV0w4HQz_nAAAV
I would like to know the security implications for having the worker port number (in this case 9100) as part of the session id like that.
I am a bit worried about Denial of Service (DoS) threats - In theory, this could allow a malicious user to generate a large number of HTTP requests targeted at a specific port number (for example by using a fake sessionID which contains that port number) - But is this a serious threat? (assuming you have decent firewalls)? How do big companies like Google handle dealing with sticky sessions from a security perspective?
Are there any other threats which I should consider?
The reason why I designed the server like this is to account for the initial HTTP handshake and also for when the client does not support WebSocket (in which case HTTP long-polling is used - And hence subsequent HTTP requests from a client need to go to the same worker in the backend).
So there are several sub-questions in your question. I'll try to split them up and answer them accordingly:
Is DoS-Attack on a specific worker a serious threat?
It depends. If you will have 100 users, probably not. But you can be sure, that there are people out there, which will have a look at your application and will try to figure out the weaknesses and exploit those.
Now is a DoS-Attack on single workers a serious possibility, if you can just attack the whole server? I would actually say yes, because it is a more precise attack => you need less resources to kill the workers when you do it one by one. However if you allow connection from the outside only on port 80 for HTTP and block everything else, this problem will be solved.
How do big companies like Google handle dealing with sticky sessions?
Simple answer - who says, they do? There are multiple other ways to solve the problem of sessions, when you have a distributed system:
don't store anything session based on the server, just have a key in the cooky with which you can identify the user again, similar as with automatic login.
store the session state in a data base or object storage (this will add a lot of overhead)
store session information in the proxy (or broker, http endpoint, ...) and send them together with the request to the next worker
Are there any other threats which I should consider?
There are always unforeseen threats, and that's the reason, why you should never publish more information than necessary. In that case, most big companies don't even publish the correct name and version of their WebServer (for google it is gws for instance)
That being said, I see your point why you might want to keep your implementation, but maybe you can modify it slightly to store in your load balancer a dictionary with a hashed value of hostname, source port number, worker port number and have as a session id a collection of two hashes. Than the load balancer knows, by looking into the dictionary, to which worker it needs to be sent. This info should be saved together with a timestamp, when the info was retrieved the last time, and every minute you can delete unused data.

MQTT what is the purpose or usage of Last Will Testament?

I'm surely missing something about how the whole MQTT protocol works, as I can't grasp the usage pattern of Last Will Testament messages: what's their purpose?
One example I often see is about informing that a device has gone offline. It doesn't make very much sense to me, since it's obvious that if a device isn't publishing any data it may be offline or there could be some network problems.
So, what are some practical usages of the LWT? What was it invented for?
LWT messages are not really concerned about detecting whether a client has gone offline or not (that task is handled by keepAlive messages).
LWT messages are about what happens after the client has gone offline.
The analogy is that of a real last will:
If a person dies, she can formulate a testament, in which she declares what actions should be taken after she has passed away. An executor will heed those wishes and execute them on her behalf.
The analogy in the MQTT world is that a client can formulate a testament, in which it declares what message should be sent on it's behalf by the broker, after it has gone offline.
A fictitious example:
I have a sensor, which sends crucial data, but very infrequently.
It has formulated a last will statement in the form of [topic: '/node/gone-offline', message: ':id'], with :id being a unique id for the sensor. I also have a emergency-subscriber for the topic 'node/gone-offline', which will send a SMS to my phone every time a message is published on that channel.
During normal operation, the sensor will keep the connection to the MQTT-broker open by sending periodic keepAlive messages interspersed with the actual sensor readings. If the sensor goes offline, the connection to the broker will time out, due to the lack of keepAlives.
This is where LWT comes in: If no LWT is specified, the broker doesn't care and just closes the connection. In our case however, the broker will execute the sensor's last will and publish the LWT-message '/node/gone-offline: :id'. The message will then be consumed to my emergency-subscriber and I will be notified of the sensor's ID via SMS so that I can check up on what's going on.
In short:
Instead of just closing the connection after a client has gone offline, LWT messages can be leveraged to define a message to be published by the broker on behalf of the client, since the client is offline and cannot publish anymore.
Just because a device is not publishing does not mean it is not online or there is a network problem.
Take for example a sensor that monitors a value that only changes very infrequently, good design says that the sensor should only publish the changes to help reduce bandwidth usage as periodically publishing the same value is wasteful. If the value is published as a retained value then any new subscriber will always get the current value without having to wait for the sensor value to change and it publish again.
In this case the LWT is used to published when the sensor fails (or there is a network problem) so we know of the problem as soon at the client keep alive times out.
A in-depth article about Last-Will-and-Testament messages is available in the MQTT Essentials Blog Post series: http://www.hivemq.com/mqtt-essentials-part-9-last-will-and-testament/.
To summarize the blog post:
The Last Will and Testament feature is used in MQTT to notify other clients about an ungracefully disconnected client.
MQTT is often used in scenarios were unreliable networks are very common. Therefore it is assumed that some clients will disconnect ungracefully from time to time, because they lost the connection, the battery is empty or any other imaginable case. It would be good to know if a connected client has disconnected gracefully (which means with a MQTT DISCONNECT message) or not, in order to take appropriate action.

connection scheduling for LVS hash table

LVS supports a connection hash table, where a request message will firstly find out whether the connection is hashed in the LVS, and if so, the message will go to a fixed node.
LVS also supports some connection scheduling methods, like Round Robin. From the description of Round Robin, every request will be round-robin, which doesn't make sense to me. If the request finds an existing hashed connection, it will be delivered to a fixed node, and cannot be balanced with Round Robin.
This question confused me a lot and I cannot continue, thanks for your help.
When IPVS receives a new connection it selects a real server to handle that connection.
That selection is done using on of the schedulers. If the connection is not new it's handed to the real server, no scheduler is used.
Round Robin is on of the schedulers that distributes all new connection requests round-robin and not the packets for the existing connections.

Peer to Peer: Methods of Finding Peers

Are there any known methods of finding peers without using a dedicated central server?
ie: If I have peers which are disconnecting and reconnecting to the internet but getting a new IP address each time, and I want to connect to them without setting up a dedicated server to register with.
I was thinking about using peers email address to send a manifest of connected peers periodically, with some sort of timecode, negating the need for a dedicated server. This would be a fallback if none of the peers could be connected to after trying all the previously known peer addresses. But existing models of finding peers would be preferable.
There's no way around having to know at least one initial peer to discover more.
Fully P2P protocols, such as Gnutella or Gnutella2, or the simpler Overnet (made famous by Storm Worm), are based on each client having a start-up list of a few peers. These can come off a web-based automated tracker for example. The client will discover the whole network or portions of it by asking other peers for more addresses, for example when delegating a file search.
If you truly can't have any kind of a centralized resource, the best you can do is find the first peer through broadcasted messages and ultimately IP address scanning. The first approach is well-meaning but in at least 98% of cases won't yield any results. The later approach, of course, is abusing the internet, as well as illegal in most countries.
I really would rethink having some kind of a central tracker. It can be something as simple as a PHP script on a webserver (the gnutella network, today, is held up by ten-twenty such scripts, hosted by people who don't even know each other). And this sure is more lightweight than email (which, due to spam filters at the very least, would not work anyway).
In the limited case of peers within an intranet, it is possible to send a broadcast UDP message to a known port asking for peers to report back.
The BitcoinQT client uses a variety of methods to find nodes, some of them might be useful to you.
Satoshi Client Node Discovery
IRC is no longer used, but might be the most easy to implement:
As of version 0.6.x the Bitcoin client no longer uses IRC bootstrapping by default, and as of version 0.8.2 support for IRC bootstrapping has been removed completely. This documentation below is accurate for most prior versions.
In addition to learning and sharing its own address, the node learned about other node addresses via an IRC channel. See irc.cpp.
After learning its own address, a node encoded its own address into a string to be used as a nickname. Then, it randomly joined an IRC channel named between #bitcoin00 and #bitcoin99. Then it issued a WHO command. The thread read the lines as they appeared in the channel and decoded the IP addresses of other nodes in the channel. It did this in a loop, forever, until the node was shutdown.
When the client discovered an address from IRC, it set the timestamp on the address to the current time, but it used a "penalty" of 51 minutes, which means it looked like it was actually seen almost an hour earlier.
Take advantage of any existing forum where data can posted. Think secret IRC channel, embedding data in photos and posting to photo sharing sites 4chan?, any site that would allow your application to login and post data without captia logins etc.
http://chatzilla.hacksrus.com/faq/#password
Another strategy might be to embedded messages in digital currency transactions. Pick a cheap coin that's likely to hang around ... DOGE or MOON coin maybe. Build wallet functionality into your app. such that you can post micro transactions back and forth between addresses that your app controls. There would still be a miners fee, but this is only fractions of pennies. Even if they later prohibit adding metadata to transactions, you could make a transaction equivalent to your IP address in MOON, and use vanity addresses in MOON coin for your app. such that when a new node comes online it knows what to search the blockchain for -- 2daMOON%bootStr#pM3. SEND - 104.003021133 MOON IP = 104.3.21.133 not an expensive proposition.
Old question but I've been thinking about this problem myself so will ad my 2-cents. In short, a central server is not required if a node is aware of at least one valid peer. New nodes must be added to the network by any current member (e.g. invited, or node spawns another node, depending on your application).
Assuming that:
agents keep track of peers; the size of this address book and how entries are managed will depend on the nature of the system; e.g. how long peers remain connected, if peers use stable addresses
agents share peer information with other peers
at least some agents remain available for relatively long periods of time relative to frequency node connects to network to update it's address book (or nodes have stable addresses)
in addition to peer addresses, availability information is also tracked (many options here depending on your system. examples include: whether peer has a stable address, when last seen, some availability metric, content/service type information, address valid-until time if known)
new agents are initialized with at least one valid peer (doesn't have to be a central node, can be any valid node)
trust mechanisms shall be required if malicious peers are a possibility
When a peer comes online, it queries the peers in it's peer table to discover which are active and perhaps removes expired dynamic addresses. Nodes exchange peer information and may become linked themselves. This peer discovery/exchange may continue a certain number of hops or via random walk until peer list if of sufficient size and/or quality.
A few more details:
Nodes connect and share peer information with frequency related to how often node addresses change, so address book doesn't become stale and node becomes disconnected because none of it's former peers are available at their last known addresses
Nodes may need to limit the number of peers they accept, to avoid tendency towards centralization around the most stable nodes.
Nodes should be selective about the peers they keep; i.e. ones in which they are more likely to exchange data (e.g. weight based upon history)
Node links may be asymmetric or symmetric depending on the application
Three ways, off the top of my head, though you're always going to need some central server to start the connection unless you went with option 3.
Central server that maintains known list of peers, with keep-alive.
One or more central servers that maintain some common resource peers can use to discover one another, but once connected no longer need the central server as long as the peer remains connected (something like BitTorrent); can chain peered connections as well.
Port/IP scanning (strongly not recommended).
In your example, you'd still have some kind of central server where the peers would be registered; the protocol is the only difference.
To put it simply no, there is no way to do this without a central sever.
If you want to do this you simply need one or more central servers, whether by dynamic dns or not. The clients need a method to discover where they should connect to, and the only truly sensible way to do this is with your own server, in the simplest scenario it only needs to send an IP address in response.
Virtual severs can be had for around $15/month, which IMO is considerably cheaper than trying to use or abuse someone else's bandwidth.
[Edit].
To put it simply, there is another way, as follows.
Upon reflection I think what I'd do is to designate a set of peers as cluster controllers and use a dynamic DNS service to allow other peers to discover the cluster controllers.
Choose a dynamic DNS provider I'll call it myc.ath.cx (I Use http://www.dyndns.com/).
Each peer has to be capable of becoming a cluster controller. A cluster controller will contain a list of all the other peers connected.
When a peer is started it looks up myc.ath.cx and attempts to connect. If connection cannot be made within a period, say 30 seconds, it takes over the registration of the DNS entry.
Any peer wishing to discover other peers can simply query myc.ath.cx and a list will be provided
All peers are responsible for periodically downloading the list of peers, in case they need to cluster controller.
The cluster controller will periodically query the DNS entry - if has changed from it's IP address then it knows that it is no longer the cluster controller - so it will contact the cluster controller that currently has the DNS entry and provide it's list of known hosts.
The cluster controller will periodically contact hosts on the list to ensure that they are still valid.
Your method of sending email does use a dedicated server, though; the peer's email server, to be precise.
Roughly, I don't think it's possible without using some sort of dedicated storage or server (which the email approach does, albeit obliquely) UNLESS you are able to characterize the connectivity to the internet that your peers are using.
Basically, if you have a set of X number of peers, that connect for Y amount of time, and they are then off the grid for Z amount of time... essentially, you can construct a probability equation about how likely it is that the set of peers that you last contacted is still available; where that probability approaches 1 (for a given set of X, Y, and Z above), you can most likely sustain a peer-to-peer network without using storage.
Possibly more in the spirit; instead of having a "dedicated central server", use simple online free service to specify a peer list. Set up a yahoo group, or something like that; clients can automatically look it up and get a peer address from which to query a set of peers; the client can be coded with the authentication to post to the group, and can post periodically its IP address so that others can request the set of known active peers.
If you want to get really tricky, you can start using basically steganographic methods to hide peer location information. I.e. get a google search for "blah"; find the first site listed in the results that has an unprotected (no CAPTCHA) message board; find the third (or whatever) post that starts with "Indubitably" (or whatever), and find the header of the first message there, and there's the IP address of a peer. If that doesn't work, go down the list of search terms to the next one.
But that's sneaky. :-)
Could you re-use an existing dedicated server for the purpose?
I am thinking in particular of registering each of the peers with a Dynamic DNS, but if you were willing to get a bit uglier, sharing access to a known Hotmail account or Google Doc or the like.
You can either use a central directory or some sort of broadcast protocol for service discovery. Assuming that you could get them indexed by Google, you could conceive of a system whereby each peer runs a web site with some unique, rare words contained on a specific page. You could then use Google search results based on these words to identify potential peers. This would essentially be a (noisy and slow) internet broadcast.
If the page structure was a well known pattern or contained identifiable connection information for that peer, it would be easy to distinguish them in the search results. Using such a public directory leaves you open to compromised nodes in the network that is formed, but this is pretty much true of any P2P network absent some security mechanism.
Getting the web sites crawled and highly ranked by Google (or some other search engine) for your particular arcane set of search terms would be the trick. I can think of a couple of ways, but they aren't ones that I would use. For a legitimate service, I'd rather spend the money or find a free web site that could function as a directory.
What about another P2P system built specifically to track online peers of other P2P systems?
Then we reduce the problem of finding peers for any new P2P system to simply finding peers for the 'main' P2P system, which will give you the addresses of online peers for the system you're interested in using...
This is a typical use of a distributed hash table algorithm. I'd suggest looking at something like pastry. It uses a overlay network (Application layer network) on top of other layers.
Each node has a GUID which is used to route requests across the peer network.
If you're loooking for an already established central server then see the metaserver entry on page here:
http://martindevans.appspot.com/
You can register peers on there and then other peers can find them. Obviously this is a central server, but it requires no maintenance on your part.

Resources