Network programming with multiple sockets and multithreading

I have the idea of developing a game server (not an MMO server, but a server that can handle many game instances, like many chess matches at the same time).
I was thinking it could be interesting for the server to spread clients across many sockets at a time. Is there any benefit to doing this?
For example:
5 game instances on port 1234
5 game instances on port 1235
etc ...
I was also thinking about firewalls. Will a firewall do its job faster with heavy traffic on a single port, or is it faster to handle smaller amounts of traffic on many ports? And does this behavior differ between firewalls (iptables, others)?
Can bandwidth be optimized by splitting the traffic across multiple sockets, or does it not matter, with all the traffic on a single socket giving the same result?
Do you think a multi-port setup can give better latency for the clients?
Is there a security gain? If a vulnerability (not one that grants root access) is found on the server, could an attacker only cheat against a few players, because the traffic is partitioned across many different sockets?
Do you think it could bring some benefit I haven't thought of (leaving aside the programming difficulty, of course)?
Thank you for your responses.

Separating the traffic does not necessarily improve the performance of the network handling. Just because you're allocating more network resources doesn't mean the overall performance will improve; you'll most likely just end up with a bottleneck problem somewhere else. You could have multiple ports and still have performance issues if you only use a single thread, and conversely have a single port and get really good, proven performance if you use multiple threads (e.g. a thread pool) for that socket (using epoll, kqueue, IOCP, etc.). It all depends on how much resource you allocate overall. I strongly suggest you use asynchronous sockets for your server: epoll or kqueue on Unix (epoll on Linux, kqueue on FreeBSD) or IOCP on Windows. They have the best performance, period.
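To make that concrete, here is a minimal sketch of such an event loop, assuming Linux/epoll and a single thread; the port number (1234) and buffer size are arbitrary choices for the example, and error handling is omitted for brevity:

    /* Minimal single-port, epoll-based echo loop (Linux). */
    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void) {
        int listener = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = {0};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = INADDR_ANY;
        addr.sin_port = htons(1234);              /* example port */
        bind(listener, (struct sockaddr *)&addr, sizeof(addr));
        listen(listener, SOMAXCONN);

        int ep = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = listener };
        epoll_ctl(ep, EPOLL_CTL_ADD, listener, &ev);

        struct epoll_event events[64];
        for (;;) {
            int n = epoll_wait(ep, events, 64, -1);
            for (int i = 0; i < n; i++) {
                int fd = events[i].data.fd;
                if (fd == listener) {
                    /* New client: make it non-blocking and watch it too. */
                    int client = accept(listener, NULL, NULL);
                    fcntl(client, F_SETFL, O_NONBLOCK);
                    struct epoll_event cev = { .events = EPOLLIN, .data.fd = client };
                    epoll_ctl(ep, EPOLL_CTL_ADD, client, &cev);
                } else {
                    /* Readable client: echo its data back (stand-in for game logic). */
                    char buf[4096];
                    ssize_t len = read(fd, buf, sizeof(buf));
                    if (len <= 0) { close(fd); continue; }
                    write(fd, buf, (size_t)len);
                }
            }
        }
    }

A real server would add error handling and, if one core is not enough, run several such loops in a thread pool.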

My gut feeling is that using a dedicated port per game will slow down a server. It's better to have a single port, and to denote the game at an application-level protocol.
It also limits you to ~64K games and can introduce issues with address reuse. On the other hand, I can't see what it gains either; to be honest it looks like unnecessary complexity.
I am the author of hellepoll so my gut feeling is considered but not authoritative.
Specifying the game at the application level also makes it easier to move in and out of games and a lobby on the same connection.
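As an illustration of what "denote the game at an application-level protocol" could look like, here is a hedged sketch of a length-prefixed frame that carries a game ID; the field names and sizes are assumptions made up for the example, not an established format:

    /* Hypothetical wire frame: every message carries the game it belongs to,
     * so one port (and one connection) can serve a lobby plus any number of games. */
    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>

    struct frame_header {
        uint32_t game_id;   /* which chess game / lobby this message targets */
        uint16_t msg_type;  /* e.g. MOVE, CHAT, JOIN, LEAVE (application-defined) */
        uint16_t length;    /* payload bytes that follow the header */
    };

    /* Serialize a header into an 8-byte, network-byte-order buffer. */
    static void frame_header_pack(const struct frame_header *h, unsigned char out[8]) {
        uint32_t gid = htonl(h->game_id);
        uint16_t typ = htons(h->msg_type);
        uint16_t len = htons(h->length);
        memcpy(out + 0, &gid, 4);
        memcpy(out + 4, &typ, 2);
        memcpy(out + 6, &len, 2);
    }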

What exactly is the question?
You'll need a multiplexing socket system call like poll (or ppoll or pselect) on Linux.
Your concern might be related to the C10k problem
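For reference, a minimal hedged sketch of the poll-based variant (the function name and the convention that fds[0] is the listener are just assumptions for the example):

    /* Block until at least one socket in the set is readable (POSIX poll(2)).
     * By convention here, fds[0] is the listening socket and the rest are clients. */
    #include <poll.h>

    int wait_for_events(struct pollfd *fds, nfds_t nfds) {
        for (nfds_t i = 0; i < nfds; i++)
            fds[i].events = POLLIN;      /* we only care about incoming data */
        return poll(fds, nfds, -1);      /* -1 = block with no timeout */
    }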

Where to host my NodeJS powered socket.io API?

I'm currently working on an API for an app I'm developing, which I don't want to say too much about. I'm a solo dev with no patents, so it's probably a good idea to keep it anonymous. It uses socket.io together with node.js to interact with clients and vice versa, which I might swap out later for Elixir and its sockets, but that isn't relevant for now. Now I'm trying to look into cloud hosting, but I'm having a rough time finding a good service to use.
These are my requirements:
24/7 uptime
Low memory and performance requirements (at least to start with). About 1+ GB with 2+ cores will most likely suffice (I'd want 2 threads or more for node to handle async programming well)
Preferably free for maybe even a year, or just really cheap, but that might be much to ask
Must somehow be able to run some sort of database. I haven't really settled on this yet, but I want to implement a custom currency at some point, and probably have the ability to add some cooldowns. So it can be fairly simple and small. If anybody has any tips on which database I should use, that would also be very welcome. I was thinking of Cassandra because of its blazing fast performance and expandability, but I also want to look into remote databases, especially if I'm going to go international with the product
Ability to keep open socket.io connections, as you've probably guessed :P
Low ping, decently high bandwidth internet. The socket.io connections are lightweight and not a lot of data has to be sent. Mostly packets of a few kilobytes every now and then for all of the clients.
If this information is too vague or you want to know some other requirements I haven't thought of, let me know.
Check out Heroku (PaaS); they have a free tier to start with.

Inter-child-process communication options in node.js cluster

So I'm working on a node.js game server application at the moment, and I've hit a bit of a wall here. My issue is that I'm using socket.io to accept inbound connections from game clients. These clients might be connected to one of several Zones or areas of the game world.
The basic architecture is described below. The master process forks a child process for each Zone of the game, which runs the Zone Manager process; a process dedicated to maintaining the Zone data (3D models, positions of players/entities, etc.). The master process then forks multiple "Communication Threads" for each Zone Manager it creates. These threads create an instance of socket.io and listen on the port for that Zone (multiple threads listening on a single port). They will handle the majority of the game logic in their own process as well as communicate with the database backing the game server. The only issue is that in some circumstances they may need to communicate with the Zone Manager to receive information about the Zone, players, etc.
As an example: A player wants to buy/sell/trade with a non-player character (NPC) in the Zone. The Zone Communication thread needs to ask the Zone Manager thread if the player is close enough to the NPC to make the trade before it allows the trade to take place.
The issue I'm running into here is that I was planning to make use of the node.js cluster functionality and use the send() and on() methods of the processes to handle passing messages back and forth. That would be fine except for one caveat I've run into: all child processes spawned with cluster.fork() can only communicate with the "master" process, so the node.js root process becomes a bottleneck for all communication. I ran some benchmarks on my system using a script that literally just bounced a message back and forth using the cluster's inter-process communication (IPC) and kept track of how many relays per second were being carried out. It seems that node eventually caps out at about 20k relays per second. This number was consistent on both a 1.8GHz quad-core Phenom II laptop and a 4.0GHz 8-core FX-8350 desktop.
Now that sounds pretty decently high, except that this basically means that regardless of how many Zones or Communication Threads there are, all IPC is still bottlenecking through a single process that acts as a "relay" for the entire application. Which means that although it seems each individual thread can relay > 20k IPCs per second, the entire application as a whole will never relay more than that even if it were on some insane 32 core system since all the communication goes through a single thread.
So that's the problem I'm having. Now the dilemma. I've read a lot about the various other options out there and read like 20 different questions here on stack about this topic and I've seen a couple things popping up regularly:
Redis:
I'm actually running Redis on my server at the moment, using it as the socket.io datastore so that socket.io instances in multiple threads can share connection data. That way a user can connect to any of the N socket.io threads for their Zone, and the server can automatically load balance the incoming connections.
My concern with this is that it runs through the network stack. Hardly ideal for communication between multiple processes on the same server. I feel like the latency would be a major issue in the long run.
0MQ (zeromq/zmq):
I've never used this one for anything before, but I've been hearing a bit about it lately. Based on the reading I've done, I've found a lot of examples of people using it with TCP sockets, but there's not a lot of buzz about people using it for IPC. I was hoping perhaps someone here had worked with 0MQ for IPC before (possibly even in node.js?) and could shine some light on this option for me.
dnode:
Again I've never used this, but from what I've seen it looks like it's another option that is designed to work over TCP which means the network stack gets in the way again.
node-udpcomm:
Someone linked this in another question on here (which I can't seem to find again unfortunately). I've never even heard of it, and it looks like a very small solution that opens up and listens on UDP connections. Although this would probably still be faster than TCP options, we still have the network stack in the way right? I'm definitely like about a mile outside of my "programmer zone" as is here and well into the realm of networking/computer architecture stuff that I don't know much about lol
Anyway the bottom line is that I'm completely stuck here and have no idea what the best option would be for IPC in this scenario. I'm assuming at the moment that 0MQ is the best option of the ones I've listed above since it's the only one that seems to offer an "IPC" option for communication protocol which I presume means it's using a UNIX socket or something that's not going through the network stack, but I can't confirm that or anything.
I guess I'm just hoping some people here might know enough to point me in the right direction or tell me I'm already headed there. The project I'm working on is a multiplayer game server designed to work "out of the box" with a multiplayer game client both powering their 3D graphics/calculations with Three.js. The client and server will be made open source to everyone once I get them all working to my satisfaction, and I want to make sure that the architecture is as scalable as possible so people don't build a game on this and then go to scale up and end up hitting a wall.
Anyway thanks for your time if you actually read all this :)
I think that 0MQ would be a very good choice, but I admit that I don't know the others :D
With 0MQ, the transport you decide to use is transparent; the library calls are the same. It's just about choosing a particular endpoint (and thus transport) in the call to zmq_bind and zmq_connect at the beginning. There are basically 4 paths you may decide to take:
"inproc://<id>" - in-process communication endpoint between threads via memory
"ipc://<filepath>" - system-dependent inter-process communication endpoint
"tcp://<ip-address>" - clear
"pgm://..." or "epgm://..." - an endpoint for Pragmatic Reliable Multicast
So to put it simply, the higher in the list you are, the faster it is and the fewer latency and reliability problems you have to face. So you should try to stay as high as possible. Since your components are processes, you should naturally go with the IPC transport. If you later need to change something, you can just change the endpoint definition and you are fine.
Now what is in fact more important than the transport you choose is the socket type, or rather the pattern you decide to use. Your situation is a typical request-response kind of communication, so you can either do
Manager: REP socket, Threads: REQ socket; or
Manager: ROUTER, Threads: REQ or DEALER
The Threads would then connect their sockets to the single Manager socket assigned to them, and that's all; they can start to send their requests and wait for responses. All they have to decide on is the path they use as the endpoint.
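For what it's worth, here is a minimal hedged sketch of the Manager side in C with libzmq (the endpoint path ipc:///tmp/zone0.ipc and the request/reply strings are made-up examples; the node.js bindings expose the same socket types and endpoint strings):

    /* Zone Manager side: REP socket bound to an IPC endpoint (libzmq).
     * The endpoint path and message contents are illustrative only. */
    #include <zmq.h>

    int main(void) {
        void *ctx = zmq_ctx_new();
        void *rep = zmq_socket(ctx, ZMQ_REP);
        zmq_bind(rep, "ipc:///tmp/zone0.ipc");

        for (;;) {
            char req[256];
            int n = zmq_recv(rep, req, sizeof(req) - 1, 0);   /* e.g. "NEAR player42 npc7" */
            if (n < 0) break;
            req[n] = '\0';
            /* ... look up positions in the Zone data, decide yes/no ... */
            zmq_send(rep, "yes", 3, 0);                        /* reply to the asking thread */
        }
        zmq_close(rep);
        zmq_ctx_term(ctx);
        return 0;
    }

A Communication Thread would create a REQ socket, zmq_connect it to the same endpoint, zmq_send its question and then zmq_recv the answer.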
Describing in detail what all those socket types and patterns mean is definitely out of the scope of this post, but you can and should read more about them in the ZeroMQ Guide. There you can learn not only about all the socket types, but also about many different ways to connect your components and let them talk to each other. The one I mentioned is just a very simple one. Once you understand it, you can build arbitrary hierarchies. It's like LEGO ;-)
Hope it helped a bit, cheers!

Dedicated server vs combined client-server?

I'm writing a game server and clients. The server generates a world and serves that up to various player clients. It is not massively multiplayer.
The client (at least for now) runs a fairly demanding software renderer, as well as doing the standard client side simulation (which is then subject to approval by the authoritative server), so any processing scalability available on this end would be good.
The server will also be processor intensive, and given the nature of the project, is likely to only become more so as the project progresses. The world is procedural and some of that happens during play; world options are set before a game begins, based on system capabilities at that time (ideally with minimal outside processes running). So scalability on the number of hardware threads is an absolute necessity for the server.
Again, as this is not an MMO, players will all have access to both client and server, so that they can set up their own servers. Many players will have only one machine on which to both host a server, and run the client, in order to be able to play with their friends. Based on this fact, which would serve me better?
A combined client and server process
Separate client and server processes
...And why?
PS. Even if some of the above complexity only comes to pass later, I need to get the architecture (in this regard) right, as early on as possible.
I'd say the most important aspect to help you with the decision is: are there any computation-intensive tasks that are shared between client and server? If you can do some computation once for both client and server instead of twice, then it makes sense to have one process (but this may be quite hard to implement). Otherwise I'd say separate them.
I see a few advantages to separate processes:
when the client crashes (I'd say more probable with complicated renderers and/or buggy graphics drivers), the server stays up
accidentally (or angrily) exiting the game won't kill the game for the other players
(if needed) it's easier to bind the server to a few cores and the client to others
you probably need to write a dedicated server anyway (so it can be hosted on remote machines, possibly without graphics), and porting it to other OSes might be easier ...
it makes the client code less complicated overall
(I may add to the list later if I come up with something else)
Separation of concerns is always an important factor, so it's better to keep them as separate processes. It's also better for performance. If I only want to use the client functionality, there is no reason to also keep the server-related code in the same application. I will only launch it if I need it.

flow-based traffic classification for traffic shaping

I'm wondering if there are ways to achieve flow-based traffic shaping with Linux.
Traditional traffic shaping approaches seem to be based on creating classes for specific protocols or types of packets (such as ssh, http, SYN or ACK) that need high throughput.
Here I want to see every TCP connection as a flow characterized by a certain data-rate.
There’ll be
quick flows such as interactive ssh or IRC chat and
slow flows (bulk data) such as scp or http file transfers
Now I’m looking for a way to characterize / classify an incoming packet to one of these classes, so I can run a tc based traffic shaper on it. Any hints?
Since you mention a dedicated machine I'll assume that you are managing from a network bridge and, as such, have access to the entirety of the packet for the lifetime it is in your system.
First and foremost: throttling at the receiving side of a connection is meaningless when you are speaking of link saturation. By the time you see the packet it has already consumed resources. This is true even if you are a bridge; you can only realistically do anything intelligent on the egress interface.
I don't think you will find an off-the-shelf product that does exactly what you want. You are either going to have to modify something like dummynet to be dynamic according to rules you derive during execution, or you are going to have to program a dynamic software router using some existing infrastructure. One I am familiar with is the Click modular router, but there are others. I really don't know how things like tc and ipfw will react to being configured/reconfigured with high frequency - I suspect poorly.
There are things that you should address ahead of time, however. Things that are going to make this task difficult regardless of the implementation. For instance,
How do you plan on differentiating between scp bulk and ssh interactive behavior? Will you monitor initial behavior and apply a rule based on that?
You mention HTTP-specific throttling; this implies DPI. Will you be able to support that on this bridge/router? How many classes of application traffic will you support?
How do you plan on handling contention? (you allot each 'bulk' flow 30% of the capacity, but then get 10 'bulk' flows trying to consume it)
Will you hard-code the link capacity or measure it? Is it fixed or will it vary?
In general, you can get a fairly rough idea of 'flow' by just hashing the networking 5-tuple. Once you start dealing with applications semantics, however, all bets are off and you need to plow through packet contents to get what you want.
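As a hedged sketch of that rough per-flow classification (the struct layout, the FNV-1a hash and the class count are illustrative choices, not a standard):

    /* Map a connection's 5-tuple to one of N shaping classes via a stable hash,
     * so every packet of the same flow lands in the same class. */
    #include <stdint.h>

    struct five_tuple {
        uint32_t src_ip, dst_ip;      /* IPv4 addresses, network byte order */
        uint16_t src_port, dst_port;
        uint8_t  protocol;            /* IPPROTO_TCP, IPPROTO_UDP, ... */
    };

    static uint32_t fnv1a_step(uint32_t h, const void *data, unsigned len) {
        const unsigned char *p = data;
        while (len--) { h ^= *p++; h *= 16777619u; }   /* FNV prime */
        return h;
    }

    static uint32_t flow_class(const struct five_tuple *t, uint32_t num_classes) {
        uint32_t h = 2166136261u;                      /* FNV-1a offset basis */
        h = fnv1a_step(h, &t->src_ip, sizeof t->src_ip);
        h = fnv1a_step(h, &t->dst_ip, sizeof t->dst_ip);
        h = fnv1a_step(h, &t->src_port, sizeof t->src_port);
        h = fnv1a_step(h, &t->dst_port, sizeof t->dst_port);
        h = fnv1a_step(h, &t->protocol, sizeof t->protocol);
        return h % num_classes;                        /* same flow -> same class */
    }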
If you had a more specific purpose it might render some of these points moot.

How many open udp or tcp/ip connections can a linux machine have?

There are limits imposed by available memory, bandwidth, CPU, and of course, the network connectivity. But those can often be scaled vertically. Are there any other limiting factors on linux? Can they be overcome without kernel modifications? I suspect that, if nothing else, the limiting factor would become the gigabit ethernet. But for efficient protocols it could take 50K concurrent connections to swamp that. Would something else break before I could get that high?
I'm thinking that I want a software udp and/or tcp/ip load balancer. Unfortunately nothing like that in the open-source community seems to exist, except for the http protocol. But it is not beyond my abilities to write one using epoll. I expect it would go through a lot of tweaking to get it to scale, but that's work that can be done incrementally, and I would be a better programmer for it.
The one parameter you will probably have some difficulty with is jitter. As you scale the number of connections per box, you will undoubtedly put strain on all the resources of said system. As a result, the jitter characteristics of the forwarding function will likely suffer.
Depending on your target requirements, that might or might not be an issue: if you plan to support mainly elastic traffic (traffic which does not suffer much from jitter and latency) then it's OK. If the proportion of inelastic traffic is high (e.g. interactive voice/video), then this might be more of an issue.
Of course you can always over-engineer in this case ;-)
If you intend to have a server which holds one socket open per client, then it needs to be designed carefully so that it can efficiently check for incoming data from 10k+ clients. This is known as the C10K problem.
Modern Linux kernels can handle a lot more than 10k connections, generally at least 100k. You may need some tuning, particularly of the many TCP timeouts (if using TCP), to avoid closing/stale sockets using up lots of resources if many clients connect and disconnect frequently.
If you are using netfilter's conntrack module, that may also need tuning to track that many connections (this is independent of tcp/udp sockets).
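One more limit that usually has to be raised before you get anywhere near those numbers is the per-process file-descriptor limit, since every socket consumes a descriptor. A hedged sketch of checking and raising it from within the process (the helper name and the target value are made up for the example; raising the hard limit still needs root or a limits.conf change):

    /* Raise this process's open-file-descriptor limit toward a target (Linux/Unix).
     * Each TCP/UDP socket consumes one descriptor, so this caps concurrent connections. */
    #include <sys/resource.h>
    #include <stdio.h>

    int raise_fd_limit(rlim_t target) {
        struct rlimit rl;
        if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
            return -1;
        if (rl.rlim_cur < target) {
            rl.rlim_cur = (target < rl.rlim_max) ? target : rl.rlim_max;
            if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
                return -1;                   /* hard limit hit: needs root / limits.conf */
        }
        printf("fd limit: soft=%llu hard=%llu\n",
               (unsigned long long)rl.rlim_cur, (unsigned long long)rl.rlim_max);
        return 0;
    }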
There are lots of technologies for load balancing; the most well-known is LVS (Linux Virtual Server), which can act as the front end to a cluster of real servers. I don't know how many connections it can handle, but I think we use it with at least 50k in production.
To your question: you are only constrained by hardware limitations, and that was the design philosophy for Linux. You have described exactly what your limiting factors would be.
Try HAProxy software load balancer:
http://haproxy.1wt.eu/
