Communication between two processes running node.js

I'm writing a server, and decided to split up the work between different processes running node.js, because I heard node.js was single threaded and figured this would parallelize better. The application is going to be a game. I have one process serving html pages, and then other processes dealing with the communication between clients playing the game. The clients will be placed into "rooms" and then use sockets to talk to each other, relayed through the server.

The problem I have is that the html server needs to be aware of how full the different rooms are to place people correctly. The socket servers need to update this information so that an accurate representation of the various rooms is maintained. So, as far as I can see, the html server and the room servers need to share some objects in memory. I am planning to run it on one (multicore) machine.

Does anyone know of an easy way to do this? Any help would be greatly appreciated.

Node currently doesn't support shared memory directly, and that's a reflection of JavaScript's complete lack of semantics or support for threading/shared memory handling.
With node 0.7, only recently usable and still experimental, the ability to run multiple event loops and JS contexts in a single process has become a reality (using V8's concept of isolates and large changes to libuv to allow multiple event loops per process). In this case it's possible, but still neither directly supported nor easy, to have some kind of shared memory. To do that you'd need to use a Buffer or ArrayBuffer (both of which represent a chunk of memory outside JavaScript's heap, accessible from it in a limited manner) and then some way to share a pointer to the underlying V8 representation of the foreign object. I know it can be done from a minimal native node module, but I'm not sure if it's possible from JS alone yet.
Regardless, the scenario you described is best served by simply using child_process.fork and sending the (seemingly minimal) amount of data through the communication channel it provides (which uses serialization).
http://nodejs.org/docs/latest/api/child_processes.html
Edit: it'd be possible from JS alone assuming you used node-ffi to bridge the gap.
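For reference, a minimal sketch of the fork approach; the file names, message shape, and room-tracking logic here are illustrative assumptions, not part of the docs linked above:

```js
// parent.js -- the html server keeps the authoritative room counts,
// updated via messages from forked room-server processes.
const { fork } = require('child_process');

const roomCounts = {};                        // roomId -> player count

const roomServer = fork('./room-server.js');  // hypothetical worker script
roomServer.on('message', (msg) => {
  if (msg.type === 'occupancy') {
    roomCounts[msg.roomId] = msg.count;       // arrives serialized over IPC
  }
});

// room-server.js -- reports occupancy changes back to the parent.
process.send({ type: 'occupancy', roomId: 'lobby', count: 3 });
```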

You may want to try using a database like Redis for this. You can have a process subscribed to a channel listening for new connections, and publish from the web server every time you need to.
You can also have multiple processes waiting for users, using a list and BRPOP to block until a player arrives.
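A rough sketch of both ideas, assuming the node_redis client (v3-style callback API) and illustrative channel/list names:

```js
const redis = require('redis');

const pub = redis.createClient();
const sub = redis.createClient();   // subscribers need their own connection

// Room servers publish occupancy updates...
pub.publish('rooms', JSON.stringify({ roomId: 'lobby', count: 3 }));

// ...and the web server subscribes to keep its view current.
sub.on('message', (channel, message) => {
  const update = JSON.parse(message);
  console.log('room %s now has %d players', update.roomId, update.count);
});
sub.subscribe('rooms');

// Queue variant: push waiting players onto a list; each room process
// blocks on BRPOP (timeout 0 = wait forever) until one arrives.
const worker = redis.createClient();
worker.brpop('waiting-players', 0, (err, reply) => {
  if (err) throw err;
  const [, playerId] = reply;       // reply is [listName, value]
  // place playerId into a room here
});
```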

Sounds like you want to not do that.
Serving and message-passing are both IO-bound, which Node is very good at doing with a single thread. If you need long-running calculations on those messages, those might be worth handing off separately, but even so, you might be surprised at how well you do with a single thread.
If not, look into Workers.

zeromq is also becoming quite popular as an inter-process communication method. Might be worth a look: http://www.zeromq.org/ and https://github.com/JustinTulloss/zeromq.node
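A quick push/pull sketch against the zeromq.node bindings linked above; the endpoint and payload are illustrative:

```js
const zmq = require('zmq');

// In the room-server process: push occupancy updates.
const push = zmq.socket('push');
push.bindSync('tcp://127.0.0.1:5555');
push.send(JSON.stringify({ roomId: 'lobby', count: 3 }));

// In the html-server process: pull them in and apply.
const pull = zmq.socket('pull');
pull.connect('tcp://127.0.0.1:5555');
pull.on('message', (msg) => {
  const update = JSON.parse(msg.toString());
  // update the in-memory room table here
});
```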

Related

Efficient way to process many threads of same Application

I have a multi-client, single-server application where clients and server are connected through sockets. The clients and the server are on different machines.
In the client application, the client socket connects to the server and sends data periodically.
In the server application, the server socket listens for clients to connect. When a client connects, a new thread is created for that client to receive its data.
For example: 1 client = 1 thread created by the server for receiving data. If there are 10000 clients, the server creates 10000 threads. This seems neither good nor scalable.
My application is in Java.
Is there an alternate method for this problem?
Thanks in advance
This is a typical C10K problem. There are patterns to solve it; one example is the Reactor pattern.
Java NIO is another way, in which incoming requests can be processed in a non-blocking fashion. See a reference implementation here.
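To make the pattern concrete in the language the rest of this page is about, here is a rough Node.js sketch of the reactor idea: one event loop multiplexes every client socket, so no thread per client is needed (port and handler logic are illustrative):

```js
const net = require('net');

const server = net.createServer((socket) => {
  // Each connection is just a callback registration, not a thread.
  socket.on('data', (chunk) => {
    // process the client's periodic data here
  });
  socket.on('error', () => socket.destroy());
});

server.listen(9000, () => console.log('listening on 9000'));
```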
Yes, you should not need a separate thread for each client. There's a good tutorial here that explains how to use await to handle asynchronous socket communication. Once you receive data over the socket, you can use a fixed number of threads. The tutorial also covers techniques for handling thousands of simultaneous communications.
Unfortunately, given the complexity, it's not possible to post the code here, so although link-only answers are frowned upon ...
I would say it's a perfect candidate for an Erlang/Elixir application. WhatsApp, RabbitMQ...
Erlang processes are cheap and fast to start; Erlang manages the scheduling for you, so you don't have to think about the number of threads, CPUs, or even machines; and Erlang manages garbage collection for each process, reclaiming it once you don't need it anymore.
Haskell is slow; Erlang is fast enough for most applications that are not doing heavy calculations, and even then you can use it and hand off the heavy lifting to a C process.
What are you writing in?
Yes, you can use the Actor model, with e.g. Akka or Akka.net. This allows you to create millions of actors that run on e.g. 4 threads. Erlang is a programming language that implements the actor model natively.
However, actors and non-blocking code won't do you much good if you depend on blocking library calls for the backend services you use, such as (the most prominent example in the JVM world) JDBC calls.
There is also a rather interesting approach that Haskell uses, called green threads. It means that the runtime threads are very lightweight and are dynamically mapped to OS threads. It also means that you get a certain amount of scalability "for free", with no need to write non-blocking IO code. It does however require a good IO manager in the runtime to schedule the IO operations efficiently, and GHC Haskell has had a substantial amount of work put into that in recent years.

Node.js - Multiple websocket servers?

I've got a system running mainly PHP for server-side logic; however, I am using Node.js for a few parts of the system, and I've got a question about the best way to handle this.
The Node.js server is used solely as a websocket server. I'm using Socket.IO as the API for the sockets.
Currently I have three different uses for the sockets: a queue to handle incoming requests, a chat server, and an announcements system.
My question is: is this the best approach? The file looks messy; all my logic is in the single file. Should I spread this out into separate files to handle each part, OR should I create multiple socket servers to handle the different uses?
With the current implementation, I'm finding it very hard to debug any failures, as there seems to be too much happening in the one script.
Any help would be appreciated,
Thanks!
I think this is down to preference and the size of your system.
Personally, I would at least separate the logic for each component into separate modules. That way each bit is kind of self-contained and the system can become modular. In my opinion this makes it far easier to maintain and to add/remove components.
Running multiple socket servers could be a bit of overkill if your app is small; however, if you're having trouble separating the data to be handled by each part, it could be worth considering running more than one.
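For example, one way to keep a single socket server but separate the logic is to give each concern its own module and its own Socket.IO namespace (Socket.IO 1.x-style API; the file and namespace names below are illustrative):

```js
// app.js -- one socket server, one module per concern.
const io = require('socket.io')(3000);
require('./queue')(io.of('/queue'));
require('./chat')(io.of('/chat'));
require('./announcements')(io.of('/announcements'));

// chat.js -- self-contained chat logic, easier to debug in isolation.
module.exports = (nsp) => {
  nsp.on('connection', (socket) => {
    socket.on('message', (msg) => nsp.emit('message', msg));
  });
};
```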

How should I implement server push so that the browser is updated with DB updates?

I am reading about various ways of doing server push to the client side (browser). I would like to understand the best approach out of these:
Long polling -- to be avoided, as it holds up resources longer on the server side.
Node.js async delegation using callbacks -- con: it is single threaded.
Writing callbacks in Java, using threads to do the task in the background, and later using a callback to push the result, like node.js does.
The advantage here is that we will have multiple threads running in parallel, utilizing the CPU efficiently.
Can anyone suggest the best way of implementation? Any other way is also appreciated.
You seem to misunderstand a few things. You cannot compare, for example, long polling to a server-side technology.
Long polling means that the client (i.e. the browser) makes an AJAX request to the server. The server keeps that request alive until there is a notification. Then it responds to that request, and the client, after receiving the update, immediately makes a new AJAX request.
You can choose whatever technology you want to handle that on the server side. People made NodeJS with this in mind, and thus I would recommend using it for that. But use whatever suits you better.
Now another misunderstanding is that threads increase performance and are thus better than single-threaded applications. Actually the opposite is true: with threads, performance gets worse (here I assume that we are working on a one-core CPU). Threads increase responsiveness, not performance. There is however a problem with single-threaded apps: if the thing you're trying to do is CPU intensive, then it will block your server. However, if you are talking about simple notifications, that's not an issue at all (you don't need CPU power for that). SIDE NOTE: you can fire up as many instances of NodeJS as you have cores to take advantage of multiple cores (you will need a bit more complex code, though; see the sketch below).
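A minimal sketch of that side note, using the cluster module that ships with node (port and response body are illustrative):

```js
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // Fork one worker per CPU core; the master only supervises.
  os.cpus().forEach(() => cluster.fork());
} else {
  // Workers share the listening port; each runs its own event loop.
  http.createServer((req, res) => {
    res.end('handled by worker ' + process.pid + '\n');
  }).listen(8000);
}
```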
Also, you should consider using WebSockets (thus implementing a simple TCP server on the server side). Long polling is inefficient, and most modern browsers do support WebSockets (in particular IE10+, as this was always an issue with IE).
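For the WebSocket route, here is a minimal server-push sketch; the `ws` npm package used below is an assumption, since the answer doesn't name a specific library:

```js
const WebSocket = require('ws');

const wss = new WebSocket.Server({ port: 8080 });

// Call this from your DB-update hook to notify every connected browser.
function broadcast(update) {
  const payload = JSON.stringify(update);
  wss.clients.forEach((client) => {
    if (client.readyState === WebSocket.OPEN) client.send(payload);
  });
}

// e.g. broadcast({ table: 'scores', id: 42 });
```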
Concluding: on the server side use the technology you're most familiar with.

Multithreaded server using threadpool

I'm planning a multithreaded server (SCGI, to be precise). Now, I know that the traditional approach of one thread per connection is not very scalable. I also don't want to use something fancy like libevent, as this is a hobby project and I prefer not to have lots of dependencies in my codebase.
The approach I'm thinking of is to use a threadpool and let one thread listen to the network, queueing up any requests coming in. The threads managed by the pool then dequeue the requests, receive the data, and respond respectively.
This way, I wouldn't have the overhead of constant thread creation while still being able to serve many requests in parallel.
Is there some fundamental problem with this architecture that I'm not aware of, or is it a not-ideal-but-still-ok solution?
Thanks!
Looks good. Depending on how much load this server is expected to see, this might be overkill.
One thread per connection is good enough until you start handling dozens, if not hundreds, of requests in parallel. The advantage of one thread per connection is simplicity, and it might not be worthwhile to give that up.
On the other hand, if you are looking for something that needs to handle tons of traffic (either external, like a web proxy, or internal, like memcache), you probably should just use libevent. AFAIK all the big boys are using it or something very similar (memcache, haproxy and so on).
Finally, if you are doing this just for fun, use whatever you want :) It's possible to get good performance with all those architectures.

Node.js event vs thread programming on server side

We are planning to start a fairly complex web portal which is expected to attract good local traffic, and I've been told by my boss to consider/analyse node.js for the server side.
I think scalability and multi-core support can be handled with an Nginx or Cherokee in front.
1) Is node.js ready for some serious/big business?
2) Does this 'event/asynchronous' paradigm on the server side have the potential to support heavy traffic and data operations, considering the fact that 'everything' is processed in a single thread and all live connections would be lost if it crashed (though it's easy to restart)?
3) What are the advantages of event-based programming compared to the thread-based style, or vice versa?
(I know of the higher cost associated with thread switching, but hardware can be squeezed harder with the event model.)
The following are interesting but (to some extent) contradictory papers:
1) http://www.usenix.org/events/hotos03/tech/full_papers/vonbehren/vonbehren_html
2) http://pdos.csail.mit.edu/~rtm/papers/dabek:event.pdf
Node.js is developing extremely rapidly, and most of its functionality is sturdy and ready for business. However, there are a lot of places where it's lacking, like database drivers, jQuery and DOM, multiple http headers, etc. There are plenty of modules coming up to tackle every aspect, but for a production environment you'll have to be careful to pick ones that are stable.
It's actually much, MUCH more efficient to use a single thread than a thousand (or even fifty) from an operating-system perspective, and benchmarks I've read (sorry, don't have them on hand -- will try to find and link them later) show that it's able to support heavy traffic -- not sure about file-system access, though.
Event-based programming is:
Cleaner-looking code than threaded code (in JavaScript, that is). The JavaScript engine is extremely efficient at processing events and handling callbacks, and it's easily one of the languages seeing the most runtime optimization right now.
Harder to fit when you are thinking in terms of control flow. With events, you can never be sure of the flow. However, you can also come to think of it as more dynamic programming, and treat each event being fired as independent.
Something that forces you to be more security-conscious when programming, for the above reason. In that sense, it's better than linear systems, where you sometimes take sanitized input for granted.
As for the two papers, both are relatively old. The first benchmarks against this, which, as you can see, carries a more recent note about those studies:
http://www.eecs.harvard.edu/~mdw/proj/seda/
It also cites the second paper you linked about what they have done, but refuses to comment on its relevance to the comparison between event-based systems and thread-based ones :)
Try it yourself to discover the truth.
See What is Node.js? where we cover exactly that:
Node in production is definitely possible, but far from the "turn-key" deployment seemingly promised by the docs. With Node v0.6.x, "cluster" has been integrated into the platform, providing one of the essential building blocks, but my "production.js" script is still ~150 lines of logic to handle stuff like creating the log directory, recycling dead workers, etc. For a "serious" production service, you also need to be prepared to throttle incoming connections and do all the stuff that Apache does for PHP. To be fair, Rails has this exact problem. It is solved via two complementary mechanisms:
1) Putting Rails/Node behind a dedicated webserver (written in C and tested to hell and back) like Nginx (or Apache / Lighttpd). The webserver can efficiently serve static content, do access logging, rewrite URLs, terminate SSL, enforce access rules, and manage multiple sub-services. For requests that hit the actual node service, the webserver proxies the request through.
2) Using a framework like "Unicorn" that will manage the worker processes, recycle them periodically, etc. I've yet to find a Node serving framework that seems fully baked; it may exist, but I haven't found it yet and still use ~150 lines in my hand-rolled "production.js".
