Reverse proxy and Socket.io, failed websocket handshake - node.js

I am running a little website using IIS 7.5 on Windows Server 2008 R2.
I've got a node.js application too running on port 3000.
Http calls from the website (client browser) are reverse proxied from http://example.com/node/whatever to http://localhost:3000/whatever. Everything works fine so far.
The problem appears when I try to use socket.io.
I am receiving:
WebSocket connection to 'ws://example.com/socket.io/?EIO=3&transport=websocket&sid=adb9WRpoMFYRoS0vAAAB'
failed: Error during WebSocket handshake: Unexpected response code: 502
Unless I am mistaken, this is what happens:
IIS forwards the initial request to my server, because the opening request to a websocket server is a standard HTTP request (with some additional headers). IIS understands that and simply forwards it. However, upon receiving the websocket request, the websocket server sends a 101 response and switches into websocket mode. IIS does not understand the websocket traffic and is not able to proxy it.
Is there a trick or solution to configure the reverse proxy for the ws:// addresses?

I'll expand a bit on my comments.
As far as I understand, you have several options:
- Bypass the reverse proxy after the connection (an ugly hack, if you ask me): http://www.guyellisrocks.com/2014/06/using-websockets-when-your-reverse.html
- Upgrade to a more recent version of IIS that supports native WebSocket reverse proxying. See this related serverfault thread and this technet features announcement.
- Force socket.io not to upgrade to websockets. Socket.io (post v1.0) should already handle that, I think: it should stay in comet/long-polling and not upgrade if the WebSocket handshake doesn't work. But you can force it manually anyway.
So, which version of socket.io are you using? See the docs for allowed transports (extract below):
io.enable('browser client minification'); // send minified client
io.enable('browser client etag'); // apply etag caching logic based on version number
io.enable('browser client gzip'); // gzip the file
io.set('log level', 1); // reduce logging
// enable all transports (optional if you want flashsocket support, please note that some hosting
// providers do not allow you to create servers that listen on a port different than 80 or their
// default port)
io.set('transports', [
    'websocket'
  , 'flashsocket'
  , 'htmlfile'
  , 'xhr-polling'
  , 'jsonp-polling'
]);
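The extract above uses the pre-1.0 configuration API. For Socket.IO 1.0 and later, the third option could be applied on the client instead. The following is a minimal sketch, assuming the `transports` client option and the transport name 'polling' from socket.io-client 1.x and later. `connectPollingOnly` is a hypothetical helper name, and the `io` factory is passed in as a parameter so the sketch does not assume how the library is loaded.

```javascript
// Hypothetical helper: `io` is the factory exported by socket.io-client (v1.0+).
function connectPollingOnly(io, url) {
  // 'polling' is the HTTP long-polling transport. Leaving 'websocket' out of
  // the list means the client never attempts the upgrade that the proxy rejects.
  return io(url, { transports: ['polling'] });
}
```

In a Node script this would be called as `connectPollingOnly(require('socket.io-client'), 'http://example.com')`; in a browser, the global `io` loaded from /socket.io/socket.io.js would be passed instead. The trade-off is that every message then travels over HTTP polling, even on networks where WebSocket would have worked.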

Related

URL generated by SocketIO in NodeJS running locally

I'm using Socket.IO to run a WebSocket server locally in NodeJS using the following code:
import express = require('express');
const path = require('path');
import http = require('http');
import { Socket } from 'socket.io';
const app = express();
const server = http.createServer(app);
const socketio = require('socket.io')(server);
app.get('/', (req, res) => {
  res.send("Node Server is running");
});
server.listen(3000, function () {
  console.log('Example app listening on port 3000!');
});
socketio.on("connection", (socket: Socket) => {
  console.log(`connect ${socket.id}`);
  console.log(`connect ${socket.handshake.url}`);
  socket.on("disconnect", () => {
    console.log(`disconnect ${socket.id}`);
  });
});
Using a tool like Firecamp, I try to establish a connection on ws://localhost:3000, but to no avail. I eventually use the Socket.IO client to connect from a simple web page by running let socket = io(). It seems the only reason this works is because that call connects to the host serving the page by default, as stated here. Running console.log(socket) and looking at the output, I eventually find that the URL inside the engine field is ws://localhost:3000/socket.io/?EIO=4&transport=websocket&sid=qerg3iHm3IKMOjdNAAAA.
My question is why is the URL so complicated rather than simply ws://localhost:3000? And is there no easier way to get the URL instead of having to access it through dev tools?
A socket.io server does not accept generic webSocket connections. It only accepts socket.io connections, because socket.io goes through an extra handshake (over http) before establishing the actual webSocket connection. It then also adds a layer on top of the regular webSocket packet format to support some of its features (such as message names).
When a socket.io client connects to a socket.io server in the default configuration, it first makes a few regular http requests to the socket.io server, and with those http requests it sends a few parameters. In your URL:
ws://localhost:3000/socket.io/?EIO=4&transport=websocket&sid=qerg3iHm3IKMOjdNAAAA
The path:
/socket.io/
is the path on which the socket.io server looks for requests destined for it. Since this is a unique path that is not generally used by other requests, it allows you to share an http server between socket.io and other http requests. In fact, this is a common way to deploy a socket.io server (hooking into an http server that you are already using for http requests).
The path /socket.io/socket.io.js is also served by the socket.io server; it returns the client-side socket.io.js file. So, clients often use this in their HTML files:
<script src="/socket.io/socket.io.js"></script>
as a means of getting the socket.io client code. Again you see the use of the path prefix /socket.io on all socket.io related URLs.
In your original URL, you can see parameters for:
EIO=4 // engine.io protocol version
transport=websocket // desired transport once both sides agree
sid=qerg3iHm3IKMOjdNAAAA // client identifier so the server knows which client this
// is before the actual webSocket connection is established
Once both sides agree that the connection looks OK, then the client will make a webSocket connection to the server. In cases where webSocket connections are blocked (by network equipment that doesn't support them or blocks them), then socket.io will use a form of http polling where it repeatedly "polls" the server asking for any more data and it will attempt to simulate a continuous connection. The client configuration can avoid this http polling and go straight to a webSocket connection if you want, but you would give up the fallback behavior in case continuous webSocket connections are blocked.
And is there no easier way to get the URL instead of having to access it through dev tools?
Not really. This URL is not something you have to know at all. The socket.io client will construct this URL for you. You just specify http://localhost:3000 as the URL you want to connect to and the socket.io client will add the other parameters to it.
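To make the anatomy of that URL concrete, here is a small sketch that splits it into the pieces described above using Node's built-in URL parser. The function name `describeSocketIoUrl` and the field names of the returned object are made up for illustration; only the query parameter names come from the URL in the question.

```javascript
const { URL } = require('url');

// Split a Socket.IO connection URL into the parts discussed above.
function describeSocketIoUrl(rawUrl) {
  const url = new URL(rawUrl);
  return {
    host: url.host,                                 // server the client connects to
    path: url.pathname,                             // path reserved for socket.io traffic
    engineIoVersion: url.searchParams.get('EIO'),   // engine.io protocol version
    transport: url.searchParams.get('transport'),   // transport in use for this request
    sessionId: url.searchParams.get('sid'),         // id assigned during the handshake
  };
}

console.log(describeSocketIoUrl(
  'ws://localhost:3000/socket.io/?EIO=4&transport=websocket&sid=qerg3iHm3IKMOjdNAAAA'
));
```

Everything except the host is filled in by the socket.io client, which is why application code only ever supplies http://localhost:3000.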

Socket.io connection event not fired on server

I am trying to build a command-line chat room using Node.js and Socket.io.
This is my server-side code so far. I have tried it with both http initialisations (with Express, as in the official website's tutorial, and without it):
#app = require('express')()
#http = require('http').Server(app)
http = require('http').createServer()
io = require('socket.io')(http)
io.sockets.on 'connect', (socket) ->
  console.log 'a user connected'
http.listen 3000, () ->
  console.log 'listening on *:3000'
I start this with nodejs server.js, and the "listening on" message shows up.
If I run lsof -i tcp:3000, the server.js process shows up.
However, when I start this client-side code:
socket = require('socket.io-client')('localhost:3000', {})
socket.on 'connect', (socket) ->
  console.log "Connected"
No luck... When I run nodejs client.js, neither "connect" event, on the server or on the client, is fired!
My questions are:
- What am I doing wrong?
- Is it necessary to start a HTTP server to use it? Sockets are on the transport layer, right? So in theory I don't need a HTTP protocol to trade messages.
If this is a server to server connection and you're only making a socket.io connection (not also setting it up for regular HTTP connections), then this code shows the simple way for just a socket.io connection:
Listening socket.io-only server
// Load the library and initialize a server on port 3000
// This will create an underlying HTTP server, start it and bind socket.io to it
const io = require('socket.io')(3000);
// listen for incoming client connections and log connect and disconnect events
io.on('connection', function (socket) {
  console.log("socket.io connect: ", socket.id);
  socket.on('disconnect', function() {
    console.log("socket.io disconnect: ", socket.id);
  });
});
Node.js socket.io client - connects to another socket.io server
// load the client-side library
const io = require('socket.io-client');
// connect to a server and port
const socket = io('http://localhost:3000');
// listen for successful connection to the server
socket.on('connect', function() {
  console.log("socket.io connection: ", socket.id);
});
This code works on my computer. I can run two separate node.js apps on the same host and they can talk to one another and both see the connect and disconnect events.
Some Explaining
The socket.io protocol is initiated by making an HTTP connection to an HTTP server. So, anytime you have a socket.io connection, there is an HTTP server listening somewhere. That HTTP connection is initially sent with some special headers that indicate to the server that this is a request to "upgrade" to the webSocket protocol and some additional security info is included.
This is a pretty great reference on how a webSocket connection is initially established. It will show you step by step what happens.
Once both sides agree on the "upgrade" in protocol, then the protocol is switched to webSocket (socket.io is then an additional protocol layer on top of the base webSocket protocol, but the connection is all established at the HTTP/webSocket level). Once the upgrade is agreed upon, the exact same TCP connection that was originally the incoming HTTP connection is repurposed and becomes the webSocket/socket.io connection.
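Part of the "additional security info" in that upgrade exchange is a key check: the client sends a random Sec-WebSocket-Key header, and the server must answer with a Sec-WebSocket-Accept header derived from it. As a sketch of that one step (the function name `computeAcceptKey` is made up here; the fixed GUID and the sample key are taken from RFC 6455), it can be computed with Node's crypto module:

```javascript
const crypto = require('crypto');

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Derive the Sec-WebSocket-Accept value the server returns in its 101 response.
function computeAcceptKey(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest('base64');
}

// Sample key from RFC 6455, section 1.3.
console.log(computeAcceptKey('dGhlIHNhbXBsZSBub25jZQ==')); // s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A client that does not see the expected value in the 101 response treats the handshake as failed, which is one reason an intermediary that does not understand the upgrade cannot simply fake it.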
With the socket.io server-side library, you can either create the HTTP server yourself and then pass that to socket.io or you can have socket.io just create one for you. If you're only using socket.io on this server and not also sharing the http server for regular http requests, then you can go either way. The minimal code example above just lets socket.io create the http server for you transparently and then socket.io binds to it. If you are also fielding regular web requests from the http server, then you would typically create the http server first and then pass it to socket.io so socket.io could bind to the http server you already have.
Then, keep in mind that socket.io is using the webSocket transport. It's just some additional packet structure on top of the webSocket transport. It would be akin to agreeing to send JSON across an HTTP connection. HTTP is the host transport and underlying data format. Both sides then agree to format some data in JSON format and send it across HTTP. The socket.io message format sits on top of webSocket in that way.
Your Questions
Is it necessary to start a HTTP server to use it?
Yes, an HTTP server must exist somewhere because all socket.io connections start with an HTTP request to an HTTP server.
Sockets are on the transport layer, right?
The initial connection protocol stack works like this:
TCP <- HTTP protocol
Then, after the protocol upgrade:
TCP <- webSocket <- socket.io
So after the protocol upgrade from HTTP to the webSocket transport, you then have socket.io packet format sitting on top of the webSocket format sitting on top of TCP.
So in theory I don't need a HTTP protocol to trade messages.
No, that is not correct. All connections are initially established with HTTP. Once the upgrade happens to the webSocket transport, HTTP is no longer used.

Node.js server for Socket.IO explanation?

I have the following code:
express = require('express');
app = express();
http = require('http').createServer(app);
io = require('socket.io')(http);
app.use(express.static(__dirname + '/'));
http.listen(80);
I know it creates a server that clients can connect to and it works. But I don't know what exactly happens. Can you explain in detail?
Also, why don't things work when I forget about Express.js and just use this line:
io = require('socket.io').listen(80);
It appears to listen for connections. However, inside the browser when I go to http://localhost/, nothing happens. My guess is that I don't specify the directory for my app like that:
app.use(express.static(__dirname + '/'));
Is that why I need Express? To specify the directory?
At the client, I use:
socket = io('http://localhost/'); // this
socket = io(); // or this
None of them work with the single line code at the server-side.
Also, why do I need an HTTP server when Socket.IO uses the WebSocket protocol?
When your browser goes to http://localhost/, you need a web server that's going to respond back to the browser with a web page. That's what Express and the express.static() lines were doing. When you remove those, you do indeed have a server listening for webSocket connections on a specific path, but you don't have anything serving web pages. So, when the browser goes to http://localhost/, there's nothing responding back with a plain web page.
Also, why do I need an HTTP server when Socket.IO uses the WebSocket protocol?
All socket.io connections start with an HTTP request. socket.io is based on the webSocket protocol and all webSocket connections are initiated with an HTTP request. So, to accept a socket.io connection, you need a web server that responds to an HTTP request and you then need a web server that is smart enough to recognize a request for a webSocket connection so it can "upgrade" the protocol from HTTP to webSocket.
For a well-written overview of how a webSocket connection is established, see this overview on MDN.
The socket.io infrastructure then runs on top of that webSocket once it is connected.
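Recognizing "a request for a webSocket connection" comes down to inspecting a handful of request headers. The following is a rough sketch of that check, not how socket.io itself does it. `isWebSocketUpgradeRequest` is a hypothetical helper, and it assumes header names have already been lower-cased, as Node's http module does for `req.headers`.

```javascript
// Decide whether an incoming HTTP request is asking to switch to WebSocket.
// A complete implementation would also validate Sec-WebSocket-Version.
function isWebSocketUpgradeRequest(method, headers) {
  const upgrade = (headers.upgrade || '').toLowerCase();
  const connectionTokens = (headers.connection || '')
    .toLowerCase()
    .split(',')
    .map((token) => token.trim());
  return (
    method === 'GET' &&
    upgrade === 'websocket' &&
    connectionTokens.includes('upgrade') &&
    typeof headers['sec-websocket-key'] === 'string'
  );
}

console.log(isWebSocketUpgradeRequest('GET', {
  upgrade: 'websocket',
  connection: 'keep-alive, Upgrade',
  'sec-websocket-key': 'dGhlIHNhbXBsZSBub25jZQ==',
})); // true
console.log(isWebSocketUpgradeRequest('GET', { accept: 'text/html' })); // false
```

A request for http://localhost/ from the address bar carries none of these headers, so it is handled as an ordinary page request, which is the part Express was covering.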
I know it creates a server that clients can connect to and it works. But I don't know what exactly happens. Can you explain in detail?
Here's a line-by-line explanation of your code:
express = require('express');
This loads the Express library.
app = express();
This creates an Express app object which can be used as a webServer request handler.
http = require('http').createServer(app);
This creates a web server and passes it the Express app object as the webServer request handler.
io = require('socket.io')(http);
This hooks socket.io into your web server as another request handler so it can see any incoming http requests that are actually the first stage of starting a webSocket/socket.io connection.
app.use(express.static(__dirname + '/'));
This tells Express that if any request is made for a web page, it should look in the __dirname directory for a file that matches the requested path. If found, it should return that file.
http.listen(80);
This starts the web server listening on port 80.
None of them work with the single line code at the server-side.
Both of those lines of code to create a socket.io connection will work when used properly. You don't say how this code is being run. If you're trying to run this code from a web page that the browser loads from http://localhost/, then I've already explained why that web page won't load if you don't start Express. If you're trying to run those lines of code from a web page loaded some other way, then you're probably having a same-origin security issue where the browser by default won't let you access a domain that is different than the one the web page came from.
You need the Express http server to deliver the socket client to the browser:
- The Express server starts on port 80.
- The browser connects to Express on port 80, and the socket.io server component delivers the socket client JavaScript to the browser (http://localhost:80/socket.io/socket.io.js).
- The socket client (running in the browser) can then connect to the socket.io server.

can I use socket.io-client to connect to a standard websocket?

Trying to use socket.io-client to connect to a websocket server that is written in Go. I've successfully connected using the node WebSocket library (npm). So the working Websocket code looks like:
goSocketPort = 6060
url = "ws://localhost:#{goSocketPort}/streamresults/"
ws = new WebSocket(url)
ws.on('open', ->
  log "socket opened"
)
ws.on('message', (message) ->
  console.log('received: %s', message)
  #log "Socket message: #{JSON.stringify message}"
)
Pretty easy and it works -- the socket on the other end sends messages on a set frequency. But I initially tried with socket.io-client (npm) and just couldn't get it to go. It certainly lists websocket as its first-preference transport, but damn if I can get it to connect:
socket = ioClient.connect("#{url}", {port: goSocketPort, transports: ['xhr-polling', 'websocket']})
socket.on("connect", (r) ->
  log "connected to #{url}"
)
The connection never happens, so none of the on events are fired and the code exits right away. I've tried: leaving the port off the url and adding it in the options, leaving off the transports option (which means "all" according to the docs) and using an http url. Is socket.io-client not capable of connecting to a "standard" websocket?
Based on our chat, it looks like you were misled by this quote:
The socket.io client is basically a simple HTTP Socket interface implementation. It looks similar to WebSocket while providing additional features and leveraging other transports when WebSocket is not supported by the user's browser.
What this means is that it looks similar to WebSocket from the perspective of client/server code that interacts with the Socket.io client/server. However, the network traffic looks very different from a simple WebSocket - there's an initial handshake in addition to a more robust protocol built on top of WebSocket once that's connected. The handshake is described here and the message protocol here (both are links to the Socket.IO protocol spec).
If you're writing a WebSocket server, you're better off just using the bare WebSocket interface rather than the Socket.io client, unless you intend to implement all of the Socket.io protocol.
Not sure if this was the case at the time, but socket.io's website now states this directly in the docs.
Although Socket.IO indeed uses WebSocket as a transport when possible, it adds additional metadata to each packet. That is why a WebSocket client will not be able to successfully connect to a Socket.IO server, and a Socket.IO client will not be able to connect to a plain WebSocket server either.
https://socket.io/docs/
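To give a feel for that "additional metadata", here is a simplified sketch of how an emitted event is laid out on the wire according to the Engine.IO and Socket.IO protocol specs. It only covers the simplest case (default namespace, no acknowledgement id, no binary attachments), and `encodeSocketIoEvent` is a made-up name, not a function exported by the library.

```javascript
// Simplified wire encoding of socket.emit(eventName, ...args).
function encodeSocketIoEvent(eventName, ...args) {
  const ENGINE_IO_MESSAGE = '4'; // Engine.IO packet type "message"
  const SOCKET_IO_EVENT = '2';   // Socket.IO packet type "EVENT"
  return ENGINE_IO_MESSAGE + SOCKET_IO_EVENT + JSON.stringify([eventName, ...args]);
}

console.log(encodeSocketIoEvent('chat message', 'hi')); // 42["chat message","hi"]
```

A plain WebSocket server such as the Go one in the question would receive that string verbatim and would also need to answer the handshake requests that precede it, which is why the two clients are not interchangeable.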

Why do nodejs WebSocket implementations not use net.Server?

I am currently experimenting with WebSockets.
While reviewing some active projects/implementations like einaros/ws (and others as well), I found that they implement the server on their own instead of using the node net module, which provides a TCP server. Is there a reason for this approach?
https://github.com/einaros/ws/blob/master/lib/WebSocketServer.js
Regards
Update:
var net = require('net');
var server = net.createServer(function(c) {
  c.on('data', function(data) {
    // data is a websocket fragment which has to get parsed
  });
  // transformToSingleUtfFragment is building a websocket valid
  // byte fragment which contains hello as application payload
  // and sets the right flags so the receiver knows we have a single text fragment
  c.write(transformToSingleUtfFragment('hello'));
  c.pipe(c);
});
server.listen(8124, function() { //'listening' listener
  console.log('server bound');
});
WebSocket is a protocol layered on top of normal HTTP.
How it works is basically that the browser sends an HTTP Upgrade request and, once the server accepts it, the underlying TCP socket of that HTTP 1.1 connection is kept open rather than closed.
The data is then sent via the WebSocket Protocol (RFC 6455, a rather large document), which itself is built on top of TCP.
Since the HTTP part is required, and you need to re-use the TCP connection from that one, it makes sense to go with the normal HTTP server instead of net.Server. Otherwise you'd have to implement the HTTP handling part yourself.
Implementing the WebSocket Protocol needs to be done in either case, and since any HTTP connection can be upgraded, you can, in theory, simply connect your WebSocket "server" to the normal HTTP Server on Port 80 and thus handle both normal HTTP requests and WebSockets on the same port.
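As a rough illustration of that arrangement (not the way einaros/ws is written), the sketch below lets Node's http module parse the request and hand over the socket through its 'upgrade' event, leaving only the WebSocket-specific parts to write by hand. `encodeShortTextFrame` is a hypothetical stand-in for the question's `transformToSingleUtfFragment`, limited to payloads under 126 bytes, and `createCombinedServer` is likewise a made-up name. Incoming frames are not parsed here.

```javascript
const http = require('http');
const crypto = require('crypto');

const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'; // fixed value from RFC 6455

// Build one unmasked, final text frame (server-to-client direction).
function encodeShortTextFrame(text) {
  const payload = Buffer.from(text, 'utf8');
  if (payload.length > 125) {
    throw new RangeError('this sketch only handles payloads under 126 bytes');
  }
  // 0x81 = FIN bit + opcode 0x1 (text); second byte = length with mask bit clear.
  return Buffer.concat([Buffer.from([0x81, payload.length]), payload]);
}

function createCombinedServer() {
  // Ordinary requests get an ordinary HTTP response.
  const server = http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('plain HTTP\n');
  });

  // Upgrade requests arrive here together with the raw TCP socket.
  server.on('upgrade', (req, socket) => {
    const key = req.headers['sec-websocket-key'];
    if ((req.headers.upgrade || '').toLowerCase() !== 'websocket' || !key) {
      socket.destroy();
      return;
    }
    const accept = crypto.createHash('sha1').update(key + WEBSOCKET_GUID).digest('base64');
    socket.write(
      'HTTP/1.1 101 Switching Protocols\r\n' +
      'Upgrade: websocket\r\n' +
      'Connection: Upgrade\r\n' +
      `Sec-WebSocket-Accept: ${accept}\r\n\r\n`
    );
    // Same TCP connection, now carrying WebSocket frames.
    socket.write(encodeShortTextFrame('hello'));
  });

  return server;
}
```

Calling `createCombinedServer().listen(8124)` would serve plain HTTP and WebSocket handshakes on the same port. A real server would still need frame parsing, unmasking of client frames, longer payload lengths, and close and ping handling, which is the bulk of what the existing libraries implement.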
