When, if at all, is it more appropriate to use http over web sockets? - node.js

I am using Socket.IO with a MEAN stack and it's been excellent for low latency and bidirectional communication, but what would be the major drawback of using it for relatively static data as well as dynamic data?
My assumption is that it would be more apt for sending dynamic content. That being said, once a socket connection is established, how relevant is the amount of communication being done? Is there a time when it would be more appropriate to use HTTP instead, even when a connection is kept open throughout the user's direct interaction with the application?
Thanks!

WebSockets are a bidirectional data exchange that starts inside an HTTP connection. So the question is not whether you use HTTP or WebSockets, because there are no WebSockets without HTTP. WebSockets are often confused with plain (BSD) sockets, but a WebSocket is actually a socket-like layer inside an HTTP connection, which is inside a TCP connection, which uses "real" sockets. Or, for anybody familiar with the OSI model: it is like a layer 4 (transport) channel encapsulated inside layer 7 (application), and the main reason for doing it this strange way instead of using layer 4 directly is that plain socket connections to ports other than those used by HTTP, SMTP and a few other protocols are often no longer possible because of all the port-blocking firewalls.
So the question should rather be whether you use simple HTTP or whether you need to use WebSockets (inside HTTP).
With simple HTTP the client sends a request and the server sends the response back. The format is well defined, and browsers and servers transparently support compression, caching and other optimizations. But this simple request-response pattern is limited, because there is no way to push data from server to client or to get a more (BSD-)socket-like behavior where both client and server can send data at any time. There are various more or less good workarounds for this, like long polling.
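To make the long-polling workaround concrete, here is a minimal sketch using only the core http module; the /poll route, the 30-second timeout and the pushToClients() helper are invented for illustration:

    const http = require('http');

    const waiting = []; // responses currently held open, waiting for data

    http.createServer((req, res) => {
      if (req.url === '/poll') {
        // Hold the response open; answer as soon as there is something to send.
        waiting.push(res);
        setTimeout(() => {
          const i = waiting.indexOf(res);
          if (i !== -1) {
            waiting.splice(i, 1);
            res.writeHead(204);
            res.end(); // nothing new; the client simply polls again
          }
        }, 30000);
      } else {
        res.writeHead(404);
        res.end();
      }
    }).listen(3000);

    // Call this whenever the server has fresh data to "push".
    function pushToClients(data) {
      while (waiting.length) {
        const res = waiting.pop();
        res.writeHead(200, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify(data));
      }
    }

The client simply re-issues the /poll request each time a response arrives, so the server always has a pending request it can answer when new data shows up.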
WebSockets give you bidirectional communication, which makes it possible for the server to push data to the client or to send data in both directions at any time. And once the WebSocket connection is established by upgrading an existing HTTP connection, the overhead for the data itself is very small, much smaller than with a full new HTTP request. While this sounds good, you lose all the advantages of simple request-response HTTP, like caching at the client or in proxies. And because client and server need resources to keep the underlying TCP connection open, it consumes more resources, which can be relevant for a busy server. Also, WebSockets might give you more trouble with middleboxes (like proxies or firewalls) than simple HTTP does.
In summary: if you don't need the advantages of WebSockets stay with simple request-response HTTP.
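To make the summary concrete, a common pattern is to keep static or cacheable data on plain HTTP routes and reserve the socket for server-initiated pushes. A rough sketch, assuming Express and a Socket.IO 4.x server; the /api/config route and the 'welcome'/'tick' event names are made up for illustration:

    const express = require('express');
    const { createServer } = require('http');
    const { Server } = require('socket.io');

    const app = express();
    const httpServer = createServer(app);
    const io = new Server(httpServer);

    // Static-ish data: a normal HTTP GET, so browsers and proxies can cache it.
    app.get('/api/config', (req, res) => {
      res.set('Cache-Control', 'public, max-age=3600');
      res.json({ appName: 'demo', version: 1 });
    });

    // Dynamic data: pushed to connected clients whenever it changes.
    io.on('connection', (socket) => {
      socket.emit('welcome', { time: Date.now() });
    });

    setInterval(() => {
      io.emit('tick', { time: Date.now() }); // server-initiated push
    }, 5000);

    httpServer.listen(3000);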

Related

Why do many websocket libraries implement their own application-level heartbeats?

For example, socket.io has pingInterval and pingTimeout settings, and nes for hapi has similar heartbeat interval settings. This is ostensibly to prevent intermediaries such as over-zealous proxies from closing what looks like an inactive connection.
But ping/pong frames are part of the WebSocket protocol and seem to serve the same purpose. So why do WebSocket library implementors add another layer of ping/pong at the application level?
If I were pushed to guess, it would be to handle the case where the WebSocket server is dealing with a client that doesn't respond to or support the protocol-level ping/pongs.
I did some reading up and made some tests and I think it comes down to this:
WebSocket pings are initiated by the server only.
The browser WebSocket API isn't able to send ping frames, and incoming pings from the server are not exposed in any way.
These pings are all about keepalive, not presence.
Therefore, if the server goes away without a proper TCP teardown (network lost, crash, etc.), the client doesn't know whether the connection is still open.
Adding a heartbeat at the application level is a way for the client to establish the server's presence, or lack thereof. These heartbeats must be sent as normal data messages, because that's all the browser WebSocket API is capable of.
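A minimal sketch of such an application-level heartbeat, assuming the ws package (8.x API) on the server and the plain browser WebSocket API on the client; the 'ping'/'pong' message strings and the 25-second interval are arbitrary choices:

    // server.js -- echo heartbeat messages back as ordinary data frames
    const { WebSocketServer } = require('ws');

    const wss = new WebSocketServer({ port: 8080 });
    wss.on('connection', (socket) => {
      socket.on('message', (data) => {
        if (data.toString() === 'ping') socket.send('pong');
      });
    });

    // client.js (browser) -- detect a vanished server without protocol-level pings
    const socket = new WebSocket('ws://localhost:8080');
    let alive = true;

    socket.onmessage = (event) => {
      if (event.data === 'pong') alive = true;
    };

    setInterval(() => {
      if (!alive) {
        socket.close();    // server presumed gone; trigger reconnect logic here
        return;
      }
      alive = false;
      socket.send('ping'); // a plain data message, not a ping control frame
    }, 25000);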

http.createserver vs net.createserver in node.js

I am having trouble understanding the difference between net.createServer and http.createServer in Node.js.
I have read the documentation for both methods located at these two urls
https://nodejs.org/api/net.html#/net_net,
https://nodejs.org/api/http.html#/http_class_http_server.
I understand that http.createServer creates an HTTP server. However, the documentation says that net.createServer creates a TCP server. I understand that TCP is the transport protocol that HTTP runs on top of, and that HTTP servers are set up to read HTTP request headers. I also understand the concept of event emitters in Node.js pretty well. However, I don't understand this notion of a TCP server and why one would be made in Node.js. The context is that I am coding the chat application example in the "Node.js in Action" book.
http.createServer() sets up a server that understands the HTTP protocol, which is indeed transmitted over TCP. net.createServer() creates a server that simply knows when a TCP connection has happened and when data has been transmitted, and so on, but doesn't know anything about whether a valid HTTP request has been received.
If you are writing a web server, favor http.createServer() over net.createServer() as it will save you a lot of work. If you are writing some other kind of server, do not use http.createServer().
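A short sketch of the difference (port numbers arbitrary): the net server only ever sees raw bytes, while the http server parses them into a request object for you.

    const net = require('net');
    const http = require('http');

    // net.createServer(): you get raw TCP data; parsing any protocol is up to you.
    net.createServer((socket) => {
      socket.on('data', (chunk) => {
        console.log('raw bytes received:', chunk.toString());
        socket.end('hello from the TCP server\n');
      });
    }).listen(4000);

    // http.createServer(): Node parses the request and exposes method, url, headers.
    http.createServer((req, res) => {
      console.log(req.method, req.url, req.headers['user-agent']);
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      res.end('hello from the HTTP server\n');
    }).listen(5000);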
I don't know much about Node.js, but I know something about networks. HTTP is a protocol that works on the 7th (application) layer of the OSI model. TCP is a protocol that works on the 4th (transport) layer. As you said, HTTP works on top of TCP. The option of creating an HTTP server with http.createServer() is there so you don't have to implement it yourself on top of net.createServer(). TCP is used by lots of applications; you might create your own protocol, or implement a protocol other than HTTP, for example FTP, DNS, SMTP, Telnet and much more.
Straight from the Node net documentation: net is the basic, bare-bones server you can create. It's particularly useful for things like setting up a cluster of servers, and it allows simple raw connections; on top of that you'll want a communication protocol, namely HTTP, and an HTTP server is in fact a net server at its core.
The net module provides an asynchronous network API for creating stream-based TCP or IPC servers (net.createServer()) and clients (net.createConnection()).
And from the HTTP documentation: HTTP is the common way to transmit data as requested by a client, with the server generating a response. It's the standard way of communicating over the web, following the usual request-and-response pattern, and it's what REST APIs are typically built on.
The HTTP interfaces in Node.js are designed to support many features of the protocol which have been traditionally difficult to use. In particular, large, possibly chunk-encoded, messages. The interface is careful to never buffer entire requests or responses — the user is able to stream data.
WebSockets are an upgrade negotiated over HTTP headers; they offer low latency, less server load, and a much more minimal conversation per message. If you're talking about ongoing two-way communication between peers, that's the way you'll want to go.

Node.js: HTTP/REST requests using existing libraries over proprietary transport protocol

Given a standard Node.js HTTP library, or an existing REST client library, what would be the most feasible way to allow such a library to perform those HTTP requests over the top of my own protocol?
To put this another way: I aim to provide a module which looks like an HTTP client. It accepts HTTP request headers and returns HTTP responses. What options should I consider to adapt an existing REST library to work with my 'pseudo' HTTP client module, as opposed to the standard Node library HTTP client?
Further background information
I wish to create a server application (based on Node.js) which makes HTTP REST requests to a remote embedded device. However, due to NAT, it is not possible for the application server to make client TCP connections directly to the remote device. Therefore, to get around NAT, I will devise my own proprietary protocol which involves the remote device initiating a persistent connection to the application server. Then, once that persistent connection is established, the Node.js application shall be able to make HTTP requests back over that persistent connection to the networked device.
My objective is therefore to create a Node.js module which acts as a 'bridge' layer between incoming socket connections from the networked devices, and the main application which makes REST requests. The aim is that the application would make REST requests as if it were making HTTP client requests to a server, when in fact the HTTP requests and responses are being conveyed on top of the proprietary protocol.
An option I'm presently considering is for my 'bridge' module to implement an interface that mimics that of http.request(options,[callback]) and somehow make a REST client library use this interface instead of the Node HTTP client. Presumably, at a minimum, I'd have to lightly modify whichever REST client library I use to achieve this.
As explained above, I'm essentially trying to create my own form of NAT traversal using an intermediary server. The intermediary server would provide the front-end UI to users, and make back-end data requests to the embedded networked devices. Connections between embedded devices and application server would be persistent, and initiated from the embedded devices, to avoid the usual NAT headaches (i.e. the requirement to configure port forwarding).
Though I mentioned earlier I'd achieve the device-to-server connection using my own protocol over a raw socket connection, the mechanism I'm actually experimenting with right now is to use plain HTTP together with long polling. The embedded device initiates an HTTP connection to the application server, and delayed responses are used to convey data back to the device when the server has something to send. I would then 'tunnel' HTTP requests going in the reverse direction over the top of this.
Therefore, in simple terms, my 'bridge' layer is something that accepts HTTP connections inwards from both sides (outside device connections, and inside web application REST requests). By using long-polling it would effectively convey requests and responses between the connected clients.
Instead of replacing the HTTP layer, create a man-in-the-middle. Create an HTTP server in Node that is the target for all of the REST requests. It then transfers each request onto the proprietary protocol and handles the response by translating it back into an HTTP/REST response.
This way you don't have to hack the REST client code, and you can even swap it out for another library if needed.
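A rough sketch of that man-in-the-middle idea; sendOverDeviceLink() is a hypothetical placeholder for the proprietary transport, not a real API, and the device id and ports are made up:

    const http = require('http');

    // Placeholder for the proprietary transport: forward the request over the
    // persistent device connection and resolve with { status, headers, body }.
    function sendOverDeviceLink(deviceId, request) {
      return Promise.reject(new Error('not implemented: proprietary transport goes here'));
    }

    // Local bridge server: the REST client library talks ordinary HTTP to this,
    // unaware that each request is tunnelled over the device link.
    http.createServer((req, res) => {
      const chunks = [];
      req.on('data', (c) => chunks.push(c));
      req.on('end', () => {
        sendOverDeviceLink('device-123', {
          method: req.method,
          path: req.url,
          headers: req.headers,
          body: Buffer.concat(chunks),
        })
          .then((reply) => {
            res.writeHead(reply.status, reply.headers);
            res.end(reply.body);
          })
          .catch(() => {
            res.writeHead(502);
            res.end('device link error');
          });
      });
    }).listen(8080);

The REST library then just points at http://localhost:8080, so it never needs to be modified.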

Why does engine.io use polling first to establish the connection and then use WebSocket? [duplicate]

When I read the engine.io protocol, I found that it uses polling to establish the connection and then upgrades the transport to WebSocket. I don't know why; could you give me some idea?
It's because the WebSocket upgrade could fail, so having polling as a fallback mechanism is useful.
It doesn't really use polling. The initial HTTP URL may look like it's a setup for polling, but that only comes into play if the server doesn't agree to upgrade the connection to the WebSocket protocol.
A socket.io connection starts with a single TCP connection carrying an HTTP request with certain WebSocket headers set; when the server responds that the WebSocket protocol is supported, the connection is "upgraded" from HTTP to WebSocket and both sides switch the protocol being used. This is how the WebSocket protocol is specified.
If the client/server combination does not support WebSocket, then and only then does socket.io resort to using long polling.
This particular design allows both WebSocket and HTTP to share the same port, and the socket.io design allows for a graceful fallback to long polling if the two sides don't agree on a WebSocket upgrade.
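For reference, the upgrade described above is just an ordinary HTTP exchange, roughly like the following; the path and query string vary by Socket.IO/engine.io version, and the key/accept values are computed per connection:

    GET /socket.io/?EIO=4&transport=websocket HTTP/1.1
    Host: example.com
    Connection: Upgrade
    Upgrade: websocket
    Sec-WebSocket-Version: 13
    Sec-WebSocket-Key: <random base64 nonce>

    HTTP/1.1 101 Switching Protocols
    Connection: Upgrade
    Upgrade: websocket
    Sec-WebSocket-Accept: <hash derived from the key>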
engine.io/socket.io 1.x+ starts with polling first because that pretty much always works with all types of clients, allowing them to get connected very quickly. Then, in the background, an attempt is made to upgrade the connection (to WebSockets or whatever else). That way, if the upgrade fails, nothing is lost, because polling is still working as before, so there is no downtime.
The reason for this change from the old behavior of downgrading instead of upgrading is that WebSockets can be troublesome to get going correctly in some situations (e.g. problems with load balancers, proxies, etc.), and even when they do connect there can be some extra delay involved. Also, the Flash fallback for WebSockets took some time to get connected because it involves extra round trips and additional delays.
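If you want to observe or change this behaviour, the Socket.IO 4.x client exposes a transports option; a small sketch (verify the option names against the version you actually use):

    const { io } = require('socket.io-client');

    // Default: start on HTTP long-polling, then try to upgrade to WebSocket.
    const socket = io('https://example.com');

    // Opt out of the polling phase entirely (loses the graceful fallback):
    const wsOnly = io('https://example.com', { transports: ['websocket'] });

    socket.on('connect', () => {
      // Typically 'polling' right after connect, switching to 'websocket'
      // once the background upgrade succeeds.
      console.log('connected via', socket.io.engine.transport.name);
    });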

Full-duplex messaging between remote autonomous Node.js applications over WebSockets?

There will be no human being in the loop, and both endpoints are autonomous Node.js applications operating as independent services.
Endpoint A is responsible for contacting Endpoint B via secure web socket, and maintaining that connection 24/7/365.
Both endpoints will initiate messages independently (without human intervention), and both endpoints will have an API (RESTful or otherwise) to receive and process messages. You might say that each endpoint is both a client of, and a server to, the other endpoint.
I am considering frameworks like Sails.js and LoopBack (implemented on both endpoints), as well as simply passing JSON messages over ws, but I remain unclear on what the most idiomatic approach would be.
WebSockets have a fair amount of connection overhead because the handshake stays compatible with HTTP for the sake of browsers and the like. If you're just connecting a pair of servers, a simple TCP connection will suffice. You can use the net module for this.
Now, once you have that connection, how do you initiate communication? You could go through the trouble of making your own protocol, but I don't recommend it. I found that a simple RPC was easiest. You can use the rpc-stream package over any duplex stream (including your TCP socket).
For my own application, I actually installed socket.io-client and let my servers use it for RPC. Although, if I were to do it again, I would use rpc-stream to skip all the overhead required for setting up a WebSocket connection.
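A sketch of that suggestion, pairing the net module with rpc-stream's pipe-based usage as documented in its README; the ping method and port are invented for illustration, so check the package docs for the exact API:

    // server.js -- expose methods over a raw TCP connection via rpc-stream
    const net = require('net');
    const rpc = require('rpc-stream');

    net.createServer((socket) => {
      const service = rpc({
        ping: (msg, cb) => cb(null, 'pong: ' + msg),
      });
      socket.pipe(service).pipe(socket);
    }).listen(9000);

    // client.js -- call the remote method as if it were local
    const net = require('net');
    const rpc = require('rpc-stream');

    const socket = net.connect(9000);
    const client = rpc();
    socket.pipe(client).pipe(socket);

    const remote = client.wrap(['ping']);
    remote.ping('hello', (err, reply) => {
      if (err) throw err;
      console.log(reply); // "pong: hello"
    });

Because rpc-stream works over any duplex stream, the same code would also run over a TLS socket if the connection between the services needs to be secured.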
