Autobahn vs Einaros - WebSockets with Node.js

I need to write a WebSocket server and I am learning Node.js by reading some books I purchased. This server is for a very fast game, so I need to stream small messages to groups of clients as quickly as possible.
What is the difference between:
Autobahn | JS : http://autobahn.ws/js/
and
Einaros : https://github.com/einaros/ws
?
I have heard that Autobahn is very powerful and capable of dealing with 200k clients without a load balancer, so I was wondering if someone with more experience could advise me whether there is any advantage in opting for one or the other library.

The functional difference is: Einaros is a WebSocket library, whereas Autobahn provides WebSocket implementations (e.g. AutobahnPython), plus WAMP on top of WebSocket.
WAMP provides higher-level communication for apps (RPC + PubSub - please see the WAMP website). And AutobahnJS is a WAMP implementation for browsers (and Node.js) on top of WebSocket.
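To make the WAMP side concrete, here is a minimal AutobahnJS PubSub sketch (just an illustration: the router URL, realm and topic name are placeholders I made up, and a WAMP router must be running for it to connect):

```javascript
// Sketch: WAMP PubSub with the "autobahn" npm package.
// 'ws://localhost:9000/ws', 'realm1' and 'game.state' are made-up placeholders.
var autobahn = require('autobahn');

var connection = new autobahn.Connection({
  url: 'ws://localhost:9000/ws',
  realm: 'realm1'
});

connection.onopen = function (session) {
  // Subscribe to a topic ...
  session.subscribe('game.state', function (args) {
    console.log('update:', args[0]);
  });

  // ... and publish an event to the same topic.
  session.publish('game.state', [{ x: 1, y: 2 }]);
};

connection.open();
```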
Now, say you don't care about WAMP, and hence only need a raw WebSocket server. Then you can compare AutobahnPython with Einaros primarily based on non-functional characteristics, like protocol compliance, security and performance.
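By contrast, a raw WebSocket broadcast server with the einaros ws package can be as small as the sketch below (the port and the "relay to everyone else" policy are assumptions for the game scenario, not part of either library's documentation):

```javascript
// Sketch: minimal broadcast server using the einaros "ws" module (npm install ws).
// Port 8080 and relaying raw messages verbatim are arbitrary illustrative choices.
var WebSocketServer = require('ws').Server;

var wss = new WebSocketServer({ port: 8080 });

wss.on('connection', function (ws) {
  ws.on('message', function (message) {
    // Relay each incoming message to every other connected client.
    wss.clients.forEach(function (client) {
      if (client !== ws && client.readyState === 1 /* OPEN */) {
        client.send(message);
      }
    });
  });
});
```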
Autobahn has best-in-class protocol compliance. I dare say that, since the Autobahn project also provides the quasi industry-standard WebSocket test suite - used by most projects, including Einaros. Autobahn has 100% strict passes on all tests. Einaros probably does too, but I don't know.
Performance: yes, a single AutobahnPython based WebSocket server (4GB RAM, 2 cores, PyPy, FreeBSD in a VirtualBox VM) can handle 200k connected clients. To give you some more data points: here is a post with performance benchmarks on the RaspberryPi.
In particular, this post highlights the most important (IMO) metric: 95%/99% quantile messaging latency. You shouldn't look only at average latency, since there can be big skews and massive outliers. What you want is consistent low latency.
Achieving consistent low latency is non-trivial. E.g. one factor for languages/run-times like Node.js or PyPy (a JITted Python implementation) is the garbage collector. Every time the GC runs, it'll slow things down - potentially introducing large latencies in messaging. I have done extensive benchmarking (unpublished) which indicates that PyPy's incremental GC is very good in this regard. Better than HotSpot (JVM) and Node.js (Google V8). When in doubt, and since I haven't (yet) published numbers, you shouldn't believe me, but measure yourself.
The one thing I'd strongly recommend: don't rely on average latency, measure quantiles, do histograms.
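To make that concrete, a rough Node.js sketch for collecting round-trip latencies and reporting the 95th/99th percentiles might look like this (it assumes an echo-style server at a placeholder URL; the sample count is arbitrary):

```javascript
// Sketch: measure round-trip latency quantiles against an echoing WebSocket server.
// 'ws://localhost:8080' and N are placeholders, not real endpoints or recommendations.
var WebSocket = require('ws');

var samples = [];
var N = 1000;
var ws = new WebSocket('ws://localhost:8080');

ws.on('open', function ping() {
  if (samples.length >= N) {
    samples.sort(function (a, b) { return a - b; });
    console.log('p95:', samples[Math.floor(N * 0.95)], 'ms');
    console.log('p99:', samples[Math.floor(N * 0.99)], 'ms');
    return ws.close();
  }
  var start = Date.now();
  ws.once('message', function () {
    samples.push(Date.now() - start); // round-trip time for this message
    ping();                           // send the next probe
  });
  ws.send('ping');
});
```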
Disclosure: I am the original author of Autobahn and work for Tavendo.

Related

Can I use Node to build a payment gateway software?

I know it is a subjective question, but the reason I ask is because:
Node.js is not good with heavy computational tasks
Node.js has some issues with memory leaks
Given the problems above, would Node be a good choice for building payment gateway software?
I'm very comfortable with Node, but many people have said that it's better to use another language like Go or Scala for this type of system.
Let me know what you think about whether I should use Node or another language.
Yes, node.js would be perfectly fine for payment gateway software. An appropriate design using clustering or off-loading computation tasks to child processes could easily help optimize heavy computational tasks.
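For illustration, off-loading a heavy computation to a child process could look roughly like this (a sketch only: './heavy-worker.js' is a hypothetical script, not part of any real gateway codebase):

```javascript
// Sketch: push a CPU-heavy task into a forked child so the main event loop
// stays responsive. './heavy-worker.js' is hypothetical; it would do the work
// and reply with process.send(result).
var fork = require('child_process').fork;

function runHeavyTask(payload, callback) {
  var worker = fork('./heavy-worker.js');   // spawn a separate Node process
  worker.once('message', function (result) {
    callback(null, result);                 // hand the result back to the caller
    worker.kill();                          // clean up the child
  });
  worker.send(payload);                     // ship the task to the child
}
```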
Node.js is also being used by many heavy-traffic commercial sites without memory leak issues. Memory leaks are an issue with faulty software design, not with the platform.
Further, the very nature of payment gateway software (being the middleman in a transaction between two other networking endpoints) is very well set up for the node.js async design that handles lots of in-flight transactions very efficiently.
As with pretty much any major back-end system these days, you just have to design your app to work the way the platform performs best and you could probably use any of the systems you mention just fine.

Resource comparison between socket.io and REST API

I am building an application with socket.io and Node.js Express.js 4.
In my opinion, socket.io takes more resources than REST APIs.
I want to know how I can compare REST APIs vs socket.io in terms of memory and CPU usage, and which is best for large applications.
Thanks
To compare memory and CPU usage, I suggest executing a large number of queries of each one separately (like a thousand) and watching the memory and CPU of the process.
Now, which is better is not that simple. It all depends. Socket.io is designed to be a real-time tool, so in large applications with a lot of real-time operations, it's better. But in large applications where real-time is really not an issue, I believe (never really tested to know the numbers) you can get lower memory usage using RESTful APIs, mainly because WebSocket is stateful, so the server needs to keep every connection in its memory. Another thing to keep in mind is that the HTTP protocol has a lot of goodies that WebSocket doesn't, like gzipping, caching, routing, SEO, proxying, and much more.
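As a rough sketch of the kind of measurement suggested above (the helper name and the way you fire the requests are up to you; process.cpuUsage() needs Node 6.1+):

```javascript
// Sketch: sample heap and CPU usage around a burst of requests.
// runManyRequests is a placeholder for your own code that issues ~1000 REST
// calls or socket.io emits and invokes its callback when they have all finished.
function measure(label, runManyRequests, done) {
  var memBefore = process.memoryUsage().heapUsed;
  var cpuBefore = process.cpuUsage();

  runManyRequests(function () {
    var memAfter = process.memoryUsage().heapUsed;
    var cpu = process.cpuUsage(cpuBefore); // user/system time since cpuBefore, in microseconds
    console.log(label,
      'heap delta:', ((memAfter - memBefore) / 1048576).toFixed(1), 'MB,',
      'cpu:', ((cpu.user + cpu.system) / 1000).toFixed(0), 'ms');
    done();
  });
}
```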
A good article about that: REST vs WebSocket
You might get help from the blogs mentioned below:
http://www.pubnub.com/blog/websockets-vs-rest-api-understanding-the-difference/
http://blog.arungupta.me/rest-vs-websocket-comparison-benchmarks/
These contain the complete details of the differences.

Which of Yesod's Warp and snap-server should I choose for a high-performance application server?

I've seen benchmarks on Yesod's homepage, but they are mostly for static files. And the benchmarks on Snap's website are outdated.
I'm trying to expose a Haskell module as a service. The server's logic is to receive the function name and arguments in JSON, invoke the Haskell function and deliver the output again as JSON. The referential transparency guarantees thread safety and the ability to memoize and cache functions.
If I were to support concurrent connections in the order of 2k - 5k, how would I go about implementing it? How scalable can this approach be?
I would highly recommend making the choice between Warp/Yesod and Snap based on which system provides you with the best set of tools for creating your application. Both Warp and Snap are using the same underlying GHC I/O manager, and both are highly optimized. I would be surprised if a well-written application for each system, doing anything non-trivial, showed a significant performance gap.
Your last paragraph is a bit vague, but I think the basic answer for either Warp or Snap is to just write your code, and the I/O manager will scale as well as possible. If you really find concurrent connections to be the bottleneck, you could consider trying out the prefork technique, using GHC 7.8 (not yet released, but has a much improved I/O manager), or using multiple servers.
Most of the benchmarks published for Warp and Snap are based on an extremely simple and very contrived benchmark that returns a static string of "pong". This is great for benchmarking how efficient a web server is at parsing HTTP requests, constructing HTTP responses, etc, but in most applications the time spent doing that stuff will be negligible. Secondly, my guess is that any performance differences there may be between Warp and Snap right now are likely to diminish in the future as both servers continue to improve and approach the theoretical limit. Also, I expect that both servers will also benefit significantly from the performance improvements in GHC 7.8.
Haskell is an excellent choice for getting high performance with a large number of concurrent connections. Haskell has green threads which are extremely cheap compared to threads in most other languages. This gives Haskell web frameworks a huge advantage. We can fire off a new thread for every connection and leverage the massive amount of effort that has gone into optimizing GHC to get great performance while still maintaining a nice programming model.
Regarding Yesod vs Snap, there's a reason that the two exist as separate projects. They are approaching the problem of web development in Haskell from two quite different directions. They both benefit from the performance that Haskell gets you, so you should choose between them based on which approach you prefer. Here are some resources to get you started:
A StackOverflow question with some pretty good answers
My very generalized comparison matrix
A more detailed and objective comparison I wrote
Haskell wiki page comparing web frameworks

Thrift and other RPC frameworks vs MS RPC

What is the difference between RPC frameworks like Thrift or gSOAP and the built-in MS RPC, if we talk about security configurations? MSDN describes some aspects at http://msdn.microsoft.com/en-us/library/windows/desktop/aa379441(v=vs.85).aspx, so I can presume that there is support from Microsoft in RPC. Does this mean that if I would like to use frameworks other than MS's, I need to take care of security by myself?
This is a very broad question. I'm not quite sure what you really expect, but I'll try to do my best to answer your question.
First, of course you have to take care of the security of whatever you are writing, be it server or client code. Security with regard to RPC services is a wide field, and any sophisticated security feature made available to you by a framework is still just a tool, and still only one part of the overall security concept of your service. To put it in another way: Using SSL will not protect your server from SQL-Injection.
Next, Thrift, SOAP and MS-RPC each have different design goals. Thrift is designed with performance and portability in mind. It is focused more on basic RPC, providing efficiency and portability to any application, for any purpose, in the simplest possible way that works. Of course this approach implies that there are not many higher-level features, because these are considered out of the scope of Thrift and left to the user. However, for some of the languages TLS (SSL) transports are available.
In contrast, SOAP is a much richer protocol, based on XML as a machine-readable, standardized and extensible format which can be extended to support higher-level features like WS-Security, WS-ReliableMessaging and so on. The downside is that I have seen many frameworks and development tools which - despite the fact that SOAP was standardized years ago - are still not able to deal with SOAP correctly in the simplest fashion, let alone support WS-Security. Yet, even in spite of this, and even though SOAP messages tend to produce a lot of traffic and give bad performance, SOAP is still widely used in the industry.
MS-RPC as one of the foundations of DCOM is bound very much to the Windows environment and to Windows development tools. If you can live with that limitation and want to use DCOM, then DCOM offers a very high-level abstraction with good and proven support in today's IDEs.

Node.js event vs thread programming on server side

We are planning to start a fairly complex web portal which is expected to attract good local traffic, and I've been told by my boss to consider/analyse Node.js for the server side.
I think scalability and multi-core support can be handled with Nginx or Cherokee in front.
1) Is this node.js ready for some serious/big business?
2) Does this 'event/asynchronous' paradigm on the server side have the potential to support heavy traffic and data operations, considering the fact that 'everything' is processed in a single thread and all live connections would be lost if it crashed (though it's easy to restart)?
3) What are the advantages of event based programming compared to thread based style ? or vice-versa.
(I know of higher cost associated with thread switching but hardware can be squeezed with event model.)
The following are interesting but somewhat contradictory papers:
1) http://www.usenix.org/events/hotos03/tech/full_papers/vonbehren/vonbehren_html
2) http://pdos.csail.mit.edu/~rtm/papers/dabek:event.pdf
Node.js is developing extremely rapidly, and most of its functionality is sturdy and ready for business. However, there are a lot of places where it's lacking, like database drivers, jQuery and DOM, multiple HTTP headers, etc. There are plenty of modules coming up to tackle every aspect, but for a production environment you'll have to be careful to pick ones that are stable.
It's actually much, MUCH more efficient to use a single thread than a thousand (or even fifty) from an operating system perspective, and benchmarks I've read (sorry, I don't have them on hand -- I'll try to find and link them later) show that it's able to support heavy traffic -- not sure about file-system access, though.
Event based programming is:
Cleaner-looking code than threaded code (in JavaScript, that is)
The JavaScript engine is extremely efficient at processing events and handling callbacks, and it's easily one of the languages seeing the most runtime optimization right now.
Harder to fit when you are thinking in terms of control flow. With events, you can never be sure of the flow (see the sketch after this list). However, you can also come to think of it as more dynamic programming. You can treat each event being fired as independent.
It forces you to be more security-conscious when programming, for the above reason. In that sense, it's better than linear systems, where sometimes you take sanitized input for granted.
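A tiny illustration of the control-flow point above (fetchUser, fetchOrders, handleError and render are made-up names, used only to show the shape of callback-driven flow):

```javascript
// Sketch: with events/callbacks, the "next step" is a function you hand over,
// not the next line of a sequential thread. All function names are hypothetical.
fetchUser(userId, function (err, user) {
  if (err) return handleError(err);
  fetchOrders(user.id, function (err, orders) {
    if (err) return handleError(err);
    render(user, orders); // runs only once both async results are in
  });
});
```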
As for the two papers, both are relatively old. The first one benchmarks against SEDA (linked below), which, as you can see, carries a more recent note about these studies:
http://www.eecs.harvard.edu/~mdw/proj/seda/
It also cites the second paper you linked about what they have done, but refuses to comment on its relevance to the comparison between event-based systems and thread-based ones :)
Try it yourself to discover the truth.
See What is Node.js? where we cover exactly that:
Node in production is definitely possible, but far from the "turn-key" deployment seemingly promised by the docs. With Node v0.6.x, "cluster" has been integrated into the platform, providing one of the essential building blocks, but my "production.js" script is still ~150 lines of logic to handle stuff like creating the log directory, recycling dead workers, etc. For a "serious" production service, you also need to be prepared to throttle incoming connections and do all the stuff that Apache does for PHP. To be fair, Rails has this exact problem. It is solved via two complementary mechanisms:
1) Putting Rails/Node behind a dedicated webserver (written in C and tested to hell and back) like Nginx (or Apache / Lighttpd). The webserver can efficiently serve static content, do access logging, rewrite URLs, terminate SSL, enforce access rules, and manage multiple sub-services. For requests that hit the actual Node service, the webserver proxies the request through.
2) Using a framework like "Unicorn" that will manage the worker processes, recycle them periodically, etc. I've yet to find a Node serving framework that seems fully baked; it may exist, but I haven't found it yet and still use ~150 lines in my hand-rolled "production.js".
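For reference, a bare-bones sketch of the cluster building block described above (the restart policy and './app.js' entry point are assumptions; the real production.js described above clearly does much more):

```javascript
// Sketch: master forks one worker per CPU and recycles workers that die.
// './app.js' is a hypothetical entry point that starts the actual HTTP server.
var cluster = require('cluster');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
  cluster.on('exit', function (worker) {
    console.log('worker ' + worker.process.pid + ' died, restarting');
    cluster.fork(); // recycle dead workers
  });
} else {
  require('./app.js'); // worker: run the app
}
```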
