I want to proxy WebSocket connections to multiple node.js servers using Amazon Elastic Load Balancer. Since Amazon ELB does not provide actual WebSocket support, I would need to use its vanilla TCP messaging. However, I'm trying to understand how this would work without some sort of sticky session functionality.
I understand that WebSockets work by first sending an HTTP Upgrade request from the client, which is handled by the server by sending a response which correctly handles key authentication. After the server sends that response and it is approved by the client, there is a bidirectional connection between that client and server.
However, let's say the client, after approving the server response, sends data to the server. If it sends the data to the load balancer, and the load balancer then relays that data to a different server that did not handle the original WebSocket Upgrade request, how will this new server be aware of the WebSocket connection? Or will the client automatically bypass the load balancer and send data directly to the server that handled the initial upgrade?
To answer this question, we need to understand how exactly the underlying TCP connection evolves during the whole WebSocket creation process. You will see that the sticky part of a WebSocket connection is the underlying TCP connection itself. I am not sure what you mean by "session" in the context of WebSockets.
At a high level, initiating a "WebSocket connection" requires the client to send an HTTP GET request to an HTTP server, where the request includes the Upgrade header field. For this request to happen, the client must already have established a TCP connection to the HTTP server (that might be obvious, but it is important to point this out explicitly here). The subsequent HTTP server response is then sent through the same TCP connection.
Note that now, after the server response has been sent, the TCP connection is still open/alive if not actively closed by either the client or the server.
Now, according to RFC 6455, the WebSocket standard, at the end of section 4.1:
If the server's response is validated as provided for above, it is
said that The WebSocket Connection is Established and that the
WebSocket Connection is in the OPEN state
I read from here that the same TCP connection that was initiated by the client before sending the initial HTTP GET (Upgrade) request will just be left open and will from now on serve as the transport layer for the full-duplex WebSocket connection. And this makes sense!
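To make the "validated as provided for above" part concrete, here is a minimal Node.js sketch of the value the server has to send back during the handshake. The function names are my own; the GUID and the SHA-1/base64 steps come from RFC 6455. The point to notice is that the 101 response is written to the very same TCP socket the Upgrade request arrived on.

```javascript
const crypto = require('crypto');

// GUID fixed by RFC 6455; the server appends it to the client's key.
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key.
// (Function name is illustrative, not part of any library.)
function computeAcceptKey(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WS_GUID)
    .digest('base64');
}

// Build the 101 response that the server writes back on the SAME TCP
// socket that carried the Upgrade request.
function buildUpgradeResponse(secWebSocketKey) {
  return [
    'HTTP/1.1 101 Switching Protocols',
    'Upgrade: websocket',
    'Connection: Upgrade',
    `Sec-WebSocket-Accept: ${computeAcceptKey(secWebSocketKey)}`,
    '',
    '',
  ].join('\r\n');
}
```

The client performs the same computation on the key it sent and compares the result; once that matches, both sides simply keep using the already open TCP connection.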
With respect to your question, this means that a load balancer only plays a role before the initial HTTP GET (Upgrade) request is made, i.e. before the one and only TCP connection involved in creating said WebSocket connection is established between the two communication endpoints. Thereafter, the TCP connection stays established and cannot be "redirected" by a network device in between.
We can conclude that, in your session terminology, the TCP connection defines the session. As long as a WebSocket connection is alive (i.e. is not terminated), it by definition provides and lives in its own session. Nothing can change this session. In this picture, however, two independent WebSocket connections cannot share the same session.
If you referred to something else with "session", then it probably is a session that is introduced by the application layer and we cannot comment on that one.
Edit with respect to your comments:
so you're saying that the load balancer is not involved in the TCP
connection
No, that is not true, at least in general. It can definitely influence TCP connection establishment, in the sense that it can decide what to do with the client's connection attempt. The specifics depend on the exact type of load balancer (*, see below). Important: after the connection is established between the two endpoints (by which I mean the WebSocket client and the WebSocket server; I don't consider the load balancer to be an endpoint), the two endpoints will not change anymore for the lifetime of the WebSocket connection. The load balancer might* still be in the network path, but it can be assumed not to interfere anymore.
Therefore the full-duplex connection is between the client and the
end server?
Yes!
* There are different types of load balancing. Depending on the type, the role of the load balancer after connection establishment between the two endpoints is different. Examples:
If the load balancing happens on a DNS basis, then the load balancer is not involved in the final TCP connection at all. It just tells the client which host it has to connect to directly.
If the load balancer works like the Layer 4 ELB from AWS (docs here), then it effectively proxies the TCP connection. So the client would actually see the ELB itself as the server. What happens, however, is that the ELB just forwards the packets in both directions, without change. Hence, it is still heavily involved in the TCP connection, just transparently. In this case there are actually two permanent TCP connections involved: one from you to the ELB, and one from the ELB to the server. These are again permanent for the lifetime of your WebSocket connection.
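The Layer 4 case can be sketched in a few lines of Node.js. This is only an illustration of the idea, not how ELB is implemented; createRoundRobin, createTcpProxy and the backend list format are names I made up. The backend is chosen once, when the client connection arrives, and from then on bytes are just copied between the two TCP connections.

```javascript
const net = require('net');

// Round-robin picker: called once per NEW client connection, never per packet.
function createRoundRobin(backends) {
  let next = 0;
  return function pick() {
    const backend = backends[next];
    next = (next + 1) % backends.length;
    return backend;
  };
}

// Hypothetical minimal L4 proxy. `backends` is an array of { host, port }.
function createTcpProxy(backends) {
  const pick = createRoundRobin(backends);
  return net.createServer((clientSocket) => {
    // Second TCP connection: proxy -> chosen backend.
    const backend = pick();
    const upstream = net.connect(backend.port, backend.host);

    // Copy bytes unchanged in both directions for the connection's lifetime.
    clientSocket.pipe(upstream);
    upstream.pipe(clientSocket);

    // If either side fails, tear down both connections.
    const teardown = () => {
      clientSocket.destroy();
      upstream.destroy();
    };
    clientSocket.on('error', teardown);
    upstream.on('error', teardown);
  });
}
```

Something like createTcpProxy([{ host: '10.0.0.1', port: 8080 }, { host: '10.0.0.2', port: 8080 }]).listen(80) would start it (addresses are placeholders). Because the WebSocket handshake and all later frames travel over the same client connection, they all reach the same backend without any extra session logic.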
WebSocket uses a persistent TCP connection, and hence requires all IP packets for that TCP connection to be forwarded to the same backend server (for the lifetime of the TCP connection).
It needs to be sticky. This is different from L7 HTTP LBs which are able to dispatch on a per HTTP-request basis.
An LB can be made sticky in different ways, e.g.
hash the source IP/port to the set of alive backend servers
upon TCP connection establishment, choose a backend server and remember that
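The first approach might look roughly like this in Node.js. It is a sketch under my own assumptions: the function name is invented and SHA-256 is just a convenient hash, real load balancers use their own schemes.

```javascript
const crypto = require('crypto');

// Map a client's source IP/port to one of the currently alive backends.
// The same (address, port) pair always yields the same backend as long as
// the list of alive backends does not change.
function pickBackendByHash(remoteAddress, remotePort, aliveBackends) {
  if (aliveBackends.length === 0) {
    throw new Error('no backend available');
  }
  const digest = crypto
    .createHash('sha256')
    .update(`${remoteAddress}:${remotePort}`)
    .digest();
  // Use the first 4 bytes of the digest as an unsigned integer.
  const index = digest.readUInt32BE(0) % aliveBackends.length;
  return aliveBackends[index];
}
```

The trade-off between the two approaches: hashing needs no per-connection state in the LB, but the mapping shifts when a backend joins or leaves, whereas remembering the choice per connection costs memory but survives changes to the backend set.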
Objective:
Never close connection between client and SOCKS proxy + reuse it to send multiple HTTPS requests to different targets (example targets: google.com, cloudflare.com) without closing the socket during the switch to different target.
Step 1:
I have a client which connects to a SOCKS proxy server over a TCP connection. That is the client socket (and the only socket (file descriptor) used in this project).
client -> proxy
Step 2:
After the connection is established and verified, the client does a TLS connect to the target server, which can be for example google.com (the DNS lookup is done before this).
Now we have connection:
client -> proxy -> target
Step 3:
Then client sends HTTPS request over it and receives response successfully.
Issue appears:
After that I want to explicitly close the connection between proxy and target so I can send a request to another target. For this I need to close the TLS connection, and I don't know how to do that without closing the connection between client and proxy, which is not acceptable.
Possible solutions?:
1:
Would sending a Connection: close\r\n header to the current target close only the connection between proxy and target, and not the client socket?
2:
If I added Connection: close\r\n to the headers of every request, would that close the socket, making it not a valid solution?
Question:
(NodeJS) I made a custom https Agent which handles the Agent's callback(req, opts) method, where the opts argument holds the request options that the client sent to the target (through the proxy). This callback returns the TLS socket after it's connected; I built the TLS socket connection outside of the callback and passed it to the agent. Is it possible to use this to close the connection between proxy and target using req.close(), and would this close the socket? Also, what is the point of req in the Agent's callback, and can it be used in this case?
Any help is appreciated.
If you spin up Wireshark and look at what is happening through your proxy, you should quickly see that HTTP/S requests are connection-oriented, end-to-end (for HTTPS) and also time-boxed. If you stop and think about it, they are necessarily so, to avoid issues such as the confused deputy problem.
So the first bit to note is that for HTTPS, the proxy will only see the initial CONNECT request, and then from there on everything is just a TCP stream of TLS bytes. Which means that the proxy won't be able to see the headers (that is, unless your proxy is a MITM that intercepts the TLS handshake, and you haven't mentioned this, so I've assumed not).
The next bit is that the agent/browser will open connections in parallel (typically a half-dozen for a browser) and will also use pipelining and keep-alive to send multiple requests down the same connection.
Then there are connection limits imposed by the browser, and servers. These typically cap the number of requests, and the duration that they are held open, before speculatively closing them. If they didn't, any reasonably busy server would quickly exhaust all their TCP sockets.
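On a Node.js server, for instance, these caps are plain properties of the http server object. A small sketch, where the function name and the numbers are illustrative only:

```javascript
const http = require('http');

// Create an HTTP server with explicit keep-alive limits.
function createLimitedServer(handler) {
  const server = http.createServer(handler);
  // Close a keep-alive connection after 5 s without a new request.
  server.keepAliveTimeout = 5000;
  // Serve at most 100 requests per connection (available since Node.js 16.10).
  server.maxRequestsPerSocket = 100;
  return server;
}
```

Whatever the client does, a server configured like this will eventually close the connection from its side.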
So all-in, what you are looking to achieve isn't going to work.
That said, if you are looking to improve performance, the node client has a few things you can enable and tweak:
Enable TLS session reuse, which will make connections much more efficient to establish.
Enable keep-alive, which will funnel multiple requests through the same connection.
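Both tweaks are options of Node's https.Agent. A minimal sketch, assuming the standard https module; createReusableAgent is my own name and the numbers are only examples:

```javascript
const https = require('https');

// Agent that reuses sockets and caches TLS sessions.
function createReusableAgent() {
  return new https.Agent({
    keepAlive: true,        // keep idle sockets open for later requests to the same host
    maxSockets: 6,          // illustrative cap on parallel sockets per host
    maxCachedSessions: 100, // TLS sessions kept for resumption (100 is the documented default)
  });
}
```

You would then pass it per request, e.g. https.get(url, { agent: createReusableAgent() }, callback), ideally creating the agent once and sharing it. Note that the agent pools sockets per target host and port, so this speeds up repeated requests to the same target; it does not let one socket serve different targets.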
I read the docs, concerning the .listen() method, used in express. I can USE the method and setup a server that is listening to HTTP requests.
However, since I am fairly new to coding, I find it difficult to grasp what's really happening when using the .listen() method. The high-level explanation "listening for connections" didn't help me.
I think, this could be made easier if I could actually see the function instead of only calling it.
Any help is very much appreciated
In a nutshell, the Express app.listen() method creates an http server object and then configures it to receive incoming TCP connections on a specific port and IP address so that when clients request a connection to that port and send an http request, the server can receive that http request and process it, sending a response. The code for app.listen() is shown later in this answer, though all it does is call one layer down into the http server object.
Here are the lower level details for how that works.
When a server wishes to start listening for incoming connections, it informs the local TCP stack by creating a socket and binding to a particular port and IP address. That essentially reserves that incoming port for this particular server (no other server will be allowed to also bind to that port). So, for example, on a regular http server on the default port, you would bind to port 80. This type of bound socket is used for incoming connections only, not for two-way communications with a client.
Then, the server informs the TCP stack that it is ready for incoming connections. At the TCP level, this is referred to as listen. Within nodejs, the bind and listen steps are combined into the one step called listen.
From then on, whenever the local TCP stack receives an incoming connection request whose destination is the IP address and port that the server bound to, that incoming connection will be accepted and inserted into a queue for the server that is configured for that IP address and port. There will typically be a maximum number of incoming connections that can be queued in this way and, if that number is exceeded, the connection will be refused. This manages load and protects the host if the server gets "backed up" and is behind on processing incoming connections.
The server will then be informed by the TCP stack for each new incoming connection. Once the server accepts that connection, then it can start reading any data that the client has sent over the socket. In the case of an HTTP server working with the HTTP protocol, this would be the initial request protocol, method, version, headers and any body data. For different types of servers, the data would be in a different format.
Here's a useful diagram of the server:
Source: https://medium.com/javarevisited/fundamentals-of-socket-programming-in-java-bc9acc30eaf4
The server creates a socket used for the server to accept new connections.
It binds that socket to a specific IP address and port so it will only be informed about incoming connections targeted to that IP address and port.
It listens on that port to inform the TCP stack it is ready to accept incoming connections.
When it is notified of an incoming connection, it accepts that incoming connection.
Then it can read and write to that new connection over the new socket.
Then, sometime later, the incoming socket is closed to complete the client transaction.
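To see those steps one layer below HTTP, here is a small sketch using the nodejs net module directly. It is a plain echo server, not Express code; createEchoServer and start are names I made up, and the comments map each part to the steps listed above.

```javascript
const net = require('net');

function createEchoServer() {
  // Step 1: create the server object that wraps the listening socket.
  // Nothing is bound to a port yet.
  return net.createServer((connection) => {
    // Step 4 has happened by the time this callback runs: the connection
    // was accepted and `connection` is a NEW socket for this one client.

    // Step 5: read from and write to the per-client socket.
    connection.on('data', (chunk) => connection.write(chunk));

    // Step 6: when the client finishes, close our side too.
    connection.on('end', () => connection.end());
    connection.on('error', () => connection.destroy());
  });
}

// Steps 2 and 3: nodejs combines bind and listen into one listen() call.
// `backlog` is the maximum length of the queue of pending connections.
function start(server, port, host, backlog) {
  return server.listen({ port, host, backlog });
}
```

For example, start(createEchoServer(), 7000, '127.0.0.1', 128) would give you something you can poke at with telnet or nc (the port and backlog values are arbitrary).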
The app.listen() method in Express encapsulates these steps and a few others. Internally (within Express), the code looks like this:
app.listen = function listen() {
  var server = http.createServer(this);
  return server.listen.apply(server, arguments);
};
You can see that method here in the open source repository.
To get an http server ready for steps 1-6 above, this creates the http server object within nodejs and then registers the app as the request listener for that server object (so it will be notified of incoming http requests).
Then, the call to server.listen() encapsulates steps 1-3 above.
Step 4 happens inside the http server object implementation and the app object is called when a new connection has been established and a new HTTP request is available. The http server reads the initial request and parses the http protocol and that initial request is already made available to the app for routing to the appropriate handler.
Then, subsequent calls such as res.send() or res.json() write a response back on the http socket and end the response, or res.end() ends it directly (steps 5 and 6 above). The socket is then closed, unless HTTP keep-alive holds it open for further requests.
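If it helps to see this without Express in the picture, the same pattern can be reproduced with only the nodejs http module. This is a sketch: the app function below is a stand-in I wrote for the Express app object, which is itself just a function taking (req, res).

```javascript
const http = require('http');

// Stand-in for the Express app: any (req, res) function can be a request listener.
function app(req, res) {
  res.setHeader('Content-Type', 'text/plain');
  // Ends the response; the socket is then closed or kept alive for reuse.
  res.end('hello\n');
}

// Same shape as Express's app.listen(): create the http server with the
// app as request listener, then forward all arguments to server.listen().
app.listen = function listen(...args) {
  const server = http.createServer(this);
  return server.listen(...args);
};
```

Calling app.listen(3000, () => console.log('listening')) then behaves like the Express version: one http server, bound to port 3000, handing every parsed request to app.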
Some other useful references:
Why is bind() used in TCP? Why is it used only on server side and not in client side? - Helps explain how a port and IP address define the TCP endpoint represented by a server. This port has to be known by the client so it can specifically request to connect to that port. The client end of the socket also has an IP address and a port, but its port can be dynamically assigned, thus the client does not have to bind to a specific port itself. The four pieces of data [server IP, server port, client IP, client port] define a specific TCP connection.
How TCP sockets work - has a good section about how new connections to a server work.
Understanding socket and port in TCP - talks about active and passive sockets. Passive sockets are sockets in "listen" mode used to accept incoming connections. Active sockets are two-way communications channels between two TCP endpoints.
Transmission Control Protocol (TCP) - more details on the various aspects of TCP from initiating a listening server, initiating a client connection to that server, through packet transmission to closing the socket.
There are a gazillion other references on the topic on the web. You can probably find 1000 articles on any single aspect of TCP that you might want more info about.
I think, this could be made easier if I could actually see the function instead of only calling it.
The underlying code for listen is inside the operating system's TCP stack and is not part of nodejs or Express. Express relies on the nodejs http server object as its interface to that and the nodejs http server object uses native code (built into nodejs) to call libuv (which is a cross platform C library that nodejs uses for networking and other things). Then, libuv talks to the underlying operating system APIs to reach the actual TCP stack on that target host. All of this is to put the server socket into listen mode so it can be notified of new incoming client connections to that target IP address and port.
Here's some doc on the related portions of the Linux TCP API if you want to see what the underlying TCP interface and description of that interface is:
socket() - https://linux.die.net/man/7/socket
bind() - https://linux.die.net/man/2/bind
listen() - https://linux.die.net/man/2/listen
And, portions of the libuv library that nodejs uses for networking:
TCP handles - http://docs.libuv.org/en/v1.x/tcp.html
Server listen() and accept() - http://docs.libuv.org/en/v1.x/stream.html#c.uv_listen
I have a nodejs TLS client socket on my laptop, connected to a TLS server socket on a different computer (server). The server cannot connect to my laptop. The laptop needs to initiate the connection.
Now I want the server to make requests to my laptop. The idea is to reuse the HTTP protocol. Is there a way to create a HTTP server using the existing TLS client socket?
This way, the server machine can make a HTTP request, and the client TLS receives it, and the HTTP server would parse it? Or am I missing something?
Once you have a TCP socket open between laptop and server, you can send data either way over that socket. So, if the server wants to send some query to the laptop, it can do so just fine. You will have to invent your own protocol on top of TCP to do that, but it could be as simple as a text/line based protocol if you want.
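As a rough idea of how small such a text/line based protocol can be, here is a sketch that frames messages as one JSON document per line. The wire format and both function names are my own invention, so adapt them as needed.

```javascript
const { StringDecoder } = require('string_decoder');

// Turns an arbitrary sequence of incoming chunks into complete lines.
// TCP/TLS gives you a byte stream, so one 'data' event may contain half a
// message or several messages; this buffers until a newline is seen.
function createLineParser(onLine) {
  const decoder = new StringDecoder('utf8'); // handles split multi-byte characters
  let buffered = '';
  return function feed(chunk) {
    buffered += decoder.write(chunk);
    let newlineAt;
    while ((newlineAt = buffered.indexOf('\n')) !== -1) {
      onLine(buffered.slice(0, newlineAt));
      buffered = buffered.slice(newlineAt + 1);
    }
  };
}

// Hypothetical wire format: one JSON document per line.
function encodeMessage(message) {
  return JSON.stringify(message) + '\n';
}
```

On the laptop you would attach it with something like tlsSocket.on('data', createLineParser((line) => { const request = JSON.parse(line); /* handle it, then */ tlsSocket.write(encodeMessage(reply)); })), and the server does the mirror image on its end of the socket.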
Or, instead of making a plain TCP connection, you can make a webSocket or socket.io connection from the laptop to the server (instead of the plain TCP connection) and then either side can send messages either way and the protocol part is already taken care of. If you use socket.io, it will automatically reconnect if the connection is interrupted too.
There is no simple way to attach an HTTP server to an existing TCP socket, and it would be fraught with difficulties too, because an HTTP connection is generally not a continuous connection over which you send many separate requests (advanced versions of http can do that, but I doubt you want to get into implementing all that logic on both ends). You could use the HTTP protocol over your existing TCP socket, but that would probably be substantially more work to implement than just using the webSocket/socket.io idea above.
For example socket.io has pingInterval and pingTimeout settings, nes for hapi has similar heartbeat interval settings. This is ostensibly to prevent any intermediates such as over-zealous proxies from closing what seems to be an inactive connection.
But ping/pong frames are part of the websocket protocol and seem to serve the same purpose. So why do websocket library implementors add another layer of ping/pong at the application level?
If I was pushed to guess it would be in case the websocket server is dealing with a client that doesn't respond/support the websocket protocol level ping-pongs.
I did some reading up and made some tests and I think it comes down to this:
With browser clients, Websocket pings are initiated by the server only (the protocol itself allows either endpoint to send them)
The browser Websocket API isn't able to send ping frames, and the incoming pings from the server are not exposed in any way
These pings are all about keepalive, not presence
Therefore if the server goes away without a proper TCP teardown (network lost/crash etc), the client doesn't know if the connection is still open
Adding a heartbeat at application level is a way for the client to establish the server's presence, or lack thereof. These must be sent as normal data messages because that's all the Websocket API (browser) is capable of.
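A minimal sketch of the client-side bookkeeping such a heartbeat needs. This is my own illustration, not how socket.io or nes implement it; the clock is injectable only to make it easy to test.

```javascript
// Tracks when the server was last heard from.
function createHeartbeatMonitor({ timeoutMs, now = Date.now }) {
  let lastSeen = now();
  return {
    // Call for every message received, heartbeat or not.
    markAlive() {
      lastSeen = now();
    },
    // True once nothing has arrived for longer than timeoutMs.
    isStale() {
      return now() - lastSeen > timeoutMs;
    },
  };
}
```

In the browser you would call monitor.markAlive() from ws.onmessage and, on a timer, either send a normal data message such as 'ping' or, if monitor.isStale() returns true, close the socket and reconnect. The server has to answer those data messages itself, since they are not protocol-level ping frames.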
I have to implement a server-to-server communication protocol using a SINGLE PERSISTENT TCP connection. The servers at both ends of this connection are implemented using a "multi-threaded and asynchronous event-driven model". Both servers are implemented in C++ and Pthreads on Linux. Server A always sends requests to Server B, and Server B responds with a response. Server B doesn't send any requests to Server A; it just responds to the requests it receives. Could someone post sample code for this communication? Could you help me with the code for both Server A and Server B? Or please point me to any old answers or any websites where I can find prototype code. Thanks in advance.
TCP servers cannot open connections to TCP servers. There is no IP protocol for that. One of the two servers must run a TCP client as a subsystem. The exact mechanics of how you do that depend on your client<>server protocol - the 'server-client' could log in to the 'client-server' with a unique username/password, or could use a different server listening port.
It's up to you:)