Why should I use Restify? - node.js

I had the requirement to build up a REST API in node.js and was looking for a more light-weight framework than express.js which probably avoids the unwanted features and would act like a custom-built framework for building REST APIs. Restify from its intro is recommended for the same case.
Reading Why use restify and not express? seemed like restify is a good choice.
But the surprise came when I tried out both with a load.
I made a sample REST API on Restify and flooded it with 1000 requests per second. Surprise to me the route started not responding after a while. The same app built on express.js handled all.
I am currently applying the load to API via
var FnPush = setInterval(function() {
for(i=0;i<1000;i++)
SendMsg(makeMsg(i));
}, 1000);
function SendMsg(msg) {
var post_data = querystring.stringify(msg);
var post_options = {
host: target.host,
port: target.port,
path: target.path,
agent: false,
method: 'POST',
headers: {
'Content-Type': 'application/x-www-form-urlencoded',
'Content-Length': post_data.length,
"connection": "close"
}
};
var post_req = http.request(post_options, function(res) {});
post_req.write(post_data);
post_req.on('error', function(e) {
});
post_req.end();
}
Does the results I have got seem sensible? And if so is express more efficient than restify in this scenario? Or is there any error in the way I tested them out?
updated in response to comments
behavior of restify
when fed with a load of more than 1000 req.s it stopped processing in just 1 sec receiving till 1015 req.s and then doing nothing. ie. the counter i implemented for counting incoming requests stopped increment after 1015.
when fed with a load of even 100 reqs. per second it received till 1015 and gone non responsive after that.

Corrigendum: this information is now wrong, keep scrolling!
there was an issue with the script causing the Restify test to be conducted on an unintended route. This caused the connection to be kept alive causing improved performance due to reduced overhead.
This is 2015 and I think the situation has changed a lot since. Raygun.io has posted a recent benchmark comparing hapi, express and restify.
It says:
We also identified that Restify keeps connections alive which removes the overhead of creating a connection each time when getting called from the same client. To be fair, we have also tested Restify with the configuration flag of closing the connection. You’ll see a substantial decrease in throughput in that scenario for obvious reasons.
Looks like Restify is a winner here for easier service deployments. Especially if you’re building a service that receives lots of requests from the same clients and want to move quickly. You of course get a lot more bang for buck than naked Node since you have features like DTrace support baked in.

This is 2017 and the latest performance test by Raygun.io comparing hapi, express, restify and Koa.
It shows that Koa is faster than other frameworks, but as this question is about express and restify, Express is faster than restify.
And it is written in the post
This shows that indeed Restify is slower than reported in my initial
test.

According to the Node Knockout description:
restify is a node.js module purpose built to create REST web services in Node. restify makes lots of the hard problems of building such a service, like versioning, error handling and content-negotiation easier. It also provides built in DTrace probes that you get for free to quickly find out where your application’s performance problems lie. Lastly, it provides a robust client API that handles retry/backoff for you on failed connections, along with some other niceties.
Performance issues and bugs can probably be fixed. Maybe that description will be adequate motivation.

I ran into a similar problem benchmarking multiple frameworks on OS X via ab. Some of the stacks died, consistently, after around the 1000th request.
I bumped the limit significantly, and the problem disappeared.
You can you check your maxfiles is at with ulimit, (or launchctl limit < OS X only) and see what the maximum is.
Hope that helps.

In 2021, benchmarking done by Fastify (https://www.fastify.io/benchmarks/) indicates that Restify is now slightly faster than Express.
The code used to run the benchmark can be found here https://github.com/fastify/benchmarks/.

i was confused with express or restify or perfectAPI. even tried developing a module in all of them. the main requirement was to make a RESTapi. but finally ended up with express, tested my self with the request per second made on all the framework, the express gave better result than others. Though in some cases restify outshines express but express seams to win the race. I thumbs up for express. And yes i also came across locomotive js, some MVC framework build on top of express. If anyone looking for complete MVC app using express and jade, go for locomotive.

Related

nodejs multithread for the same resource

I'm quite new to nodejs and I'm doing some experiments.
What I get from them (and I hope I'm wrong!) is that nodejs couldn't serve many concurrent requests for the same resource without putting them in sequence.
Consider following code (I use Express framework in the following example):
var express = require('express');
var app = express();
app.get('/otherURL', function (req, res) {
res.send('otherURL!');
});
app.get('/slowfasturl', function (req, res)
{
var test = Math.round(Math.random());
if(test == "0")
{
var i;
setTimeout
(
function()
{
res.send('slow!');
}, 10000
);
}
else
{
res.send('fast!');
}
});
app.listen(3000, function () {
console.log('app listening on port 3000!');
});
The piece of code above exposes two endpoints:
http://127.0.0.1:3000/otherurl , that just reply with "otherURL!" simple text
http://127.0.0.1:3000/slowfasturl , that randomly follow one of the two behaviors below:
scenario 1 : reply immediately with "fast!" simple text
or
scenario 2 : reply after 10 seconds with "slow!" simple text
My test:
I've opened several windows of chrome calling at the same time the slowfasturl URL and I've noticed that the first request that falls in the "scenario 2", causes the blocking of all the other requests fired subsequentely (with other windows of chrome), indipendently of the fact that these ones are fallen into "scenario 1" (and so return "slow!") or "scenario 2" (and so return "fast!"). The requests are blocked at least until the first one (the one falling in the "scenario 2") is not completed.
How do you explain this behavior? Are all the requests made to the same resource served in sequence?
I experience a different behavior if while the request fallen in the "scenario 2" is waiting for the response, a second request is done to another resource (e.g. the otherurl URL explained above). In this case the second request is completed immediately without waiting for the first one
thank you
Davide
As far as I remember, the requests are blocked browser side.
Your browser is preventing those parallel requests but your server can process them. Try in different browsers or using curl and it should work.
The behavior you observe can only be explained through any sequencing which browser does. Node does not service requests in sequence, instead it works on an event driven model, leveraging the libuv framework
I have ran your test case with non-browser client, and confirmed that requests do not influence each other.
To gain further evidence, I suggest the following:
Isolate the problem scope. Remove express (http abstraction) and use either http (base http impl), or even net (TCP) module.
Use non-browser client. I suggest ab (if you are in Linux) - apache benchmarking tool, specifically for web server performance measurement.
I used
ab -t 60 -c 100 http://127.0.0.1:3000/slowfasturl
collect data for 60 seconds, for 100 concurrent clients.
Make it more deterministic by replacing Math.random with a counter, and toggling between a huge timeout and a small timeout.
Check result to see the rate and pattern of slow and fast responses.
Hope this helps.
Davide: This question needs an elaboration, so adding as another answer rather than comment, which has space constraints.
If you are hinting at the current node model as a problem:
Traditional languages (and runtimes) caused code to be run in sequence. Threads were used to scale this but has side effects such as:
i) shared data access need sequencing, ii) I/O operations block. Node is the result of a careful integration between three entities
libuv(multiplexer), v8 (executor), and node (orchestrator) to address those issues. This model ensures improved performance and scalability under web and cloud deployments. So there is no problem with this approach.
If you are hinting at further improvements to manage stressful CPU bound operations in node where there will be waiting period yes, leveraging the multi-core and introducing more threads to share the CPU intensive workload would be the right way.
Hope this helps.

Async profiling nodejs server to review the code?

We encountered performance problem on our nodejs server holding 100k ip everyday.
Now we want to review the code and find the bottle-neck.
#jfriend00 from what we can see now, the problem seems to be DB access and file access. But we don't know what logic caused this access.
We are still looking for good ways to do the async profiling of nodejs server.
Here's what we tried
Nodetime
This works for us to some extent. It can give the executing time of code specified to the lines. However, we can't locate the error because the server works async and no stacking and calling info can be determined.
Async-profiling
This works with async and is said to be the first of this kind.
Problem is, we've integrated it's js code with our server-side code.
var AsyncProfile = require('async-profile')
AsyncProfile.profile(function () {
///// OUR SERVER-SIDE CODE RESIDES HERE
setTimeout(function () {
// doAsyncStuff
});
});
We can only record the profile of one time of server execution for one request. Can we use this code with things like forever? I've no idea with this.
dtrace
This is too general for us to locate problem in nodejs code.
Do you have any idea on profiling nodejs server code? Any hints or suggestions are appreciated. Thanks.

“Proxying” a lot of HTTP requests with Node.js + Express 2

I'm writing proxy in Node.js + Express 2. Proxy should:
decrypt POST payload and issue HTTP request to server based on result;
encrypt reply from server and send it back to client.
Encryption-related part works fine. The problem I'm facing is timeouts. Proxy should process requests in less than 15 secs. And most of them are under 500ms, actually.
Problem appears when I increase number of parallel requests. Most requests are completed ok, but some are failed after 15 secs + couple of millis. ab -n5000 -c300 works fine, but with concurrency of 500 it fails for some requests with timeout.
I could only speculate, but it seems thant problem is an order of callbacks exectuion. Is it possible that requests that comes first are hanging until ETIMEDOUT because of node's focus in latest ones which are still being processed in time under 500ms.
P.S.: There is no problem with remote server. I'm using request for interactions with it.
upd
The way things works with some code:
function queryRemote(req, res) {
var options = {}; // built based on req object (URI, body, authorization, etc.)
request(options, function(err, httpResponse, body) {
return err ? send500(req, res)
: res.end(encrypt(body));
});
}
app.use(myBodyParser); // reads hex string in payload
// and calls next() on 'end' event
app.post('/', [checkHeaders, // check Content-Type and Authorization headers
authUser, // query DB and call next()
parseRequest], // decrypt payload, parse JSON, call next()
function(req, res) {
req.socket.setTimeout(TIMEOUT);
queryRemote(req, res);
});
My problem is following: when ab issuing, let's say, 20 POSTs to /, express route handler gets called like thousands of times. That's not always happening, sometimes 20 and only 20 requests are processed in timely fashion.
Of course, ab is not a problem. I'm 100% sure that only 20 requests sent by ab. But route handler gets called multiple times.
I can't find reasons for such behaviour, any advice?
Timeouts were caused by using http.globalAgent which by default can process up to 5 concurrent requests to one host:port (which isn't enough in my case).
Thouthands of requests (instead of tens) were sent by ab (Wireshark approved fact under OS X; I can not reproduce this under Ubuntu inside Parallels).
You can have a look at node-http-proxy module and how it handles the connections. Make sure you don't buffer any data and everything works by streaming. And you should try to see where is the time spent for those long requests. Try instrumenting parts of your code with conosle.time and console.timeEnd and see where is taking the most time. If the time is mostly spent in javascript you should try to profile it. Basically you can use v8 profiler, by adding --prof option to your node command. Which makes a v8.log and can be processed via a v8 tool found in node-source-dir/deps/v8/tools. It only works if you have installed d8 shell via scons(scons d8). You can have a look at this article to help you further to make this working.
You can also use node-webkit-agent which uses webkit developer tools to show the profiler result. You can also have a look at my fork with a bit of sugar.
If that didn't work, you can try profiling with dtrace(only works in illumos-based systems like SmartOS).

Node.js API choking with concurrent connections

This is the first time I've used Node.js and Mongo, so please excuse any ignorance. I come from a PHP background. It was my understanding that Node.js scaled well because of the event-driven nature of it. As such, I built my API in node and have been testing it on a localhost. Today, I deployed it to my cloud server and everything works great, except...
As the requests start to pile up, they start to take a long time to fulfill. With just 2 clients connecting to the API, already I'm seeing 30sec+ page load times when both clients are trying to make several requests at once (which does sometimes happen).
Most of the work done by the API is either (a) reading/writing to MongoDB, which resides on a 2nd server on the cloud (b) making requests to other APIs, websites, etc. and returning the results. Both of these operations should not be blocking, but I can imagine the problem being something to do with a bottleneck either on the Mongo DB server (a) or to the external APIs (b).
Of course, I will have multiple application servers in the end, but I would expect each one to handle more than a couple concurrent clients without choking.
Some considerations:
1) I have some console.logs that I left in my node code, and I have a SSH client open to monitor the cloud server. I suspect that this could cause slowdown
2) I use express, mongoose, Q, request, and a handful of other modules
Thanks for taking the time to help a node newb ;)
Edit: added some pics of performance graphs after some responses below...
EDIT: here's a typical callback -- it is called by the express router, and it uses the Q module and OAuth to make a Post API call to Facebook:
post: function(req, links, images, callback)
{
// removed some code that calculates the target (string) and params (obj) variables
// the this.request function is a simple wrapper around the oauth.getProtectedResource function
Q.ncall(this.request, this, target, 'POST', params)
.then(function(res){
callback(null, res);
})
.fail(callback).end();
},
EDIT: some "upsert" code
upsert: function(query, callback)
{
var id = this.data._id,
upsertData = this.data.toObject(),
query = query || {'_id': id};
delete upsertData._id;
this.model.update(query, upsertData, {'upsert': true}, function(err, res, out){
if(err)
{
if(callback) callback(new Errors.Database({'message':'the data could not be upserted','error':err, 'search': query}));
return;
}
if(callback) callback(null);
});
},
Admittedly, my knowledge of Q/promises is weak. But, I think I have consistently implemented them in a way that does not block...
Your question has provided half of the relevant data: the technology stack. However, when debugging performance issues, you also need the other half of the data: performance metrics.
You're running some "cloud servers", but it's not clear what these servers are actually doing. Are they spiked on CPU? on Memory? on IO?
There are lots of potential issues. Are you running Express in production mode? Are you taking up too much IO on your MongoDB server? Are you legitimately downloading too much data? Did you get caught in an infinite Node.JS loop? (it happens)
I would like to provide better advice, but without knowing the status of the servers involved it's really impossible to start picking at any specific underlying technology. You may be a "Node newb", but basic server monitoring is pretty standard across programming languages.
Thank you for the extra details, I will re-iterate the most important part of my comments above: Where are these servers blocked?
CPU? (clearly not from your graph)
Memory? (doesn't seem likely here)
IO? (where are the IO graphs, what is your DB server doing?)

Differences between socket.io and websockets

What are the differences between socket.io and websockets in
node.js?
Are they both server push technologies?
The only differences I felt was,
socket.io allowed me to send/emit messages by specifying an event name.
In the case of socket.io a message from server will reach on all clients, but for the same in websockets I was forced to keep an array of all connections and loop through it to send messages to all clients.
Also,
I wonder why web inspectors (like Chrome/firebug/fiddler) are unable to catch these messages (from socket.io/websocket) from server?
Please clarify this.
Misconceptions
There are few common misconceptions regarding WebSocket and Socket.IO:
The first misconception is that using Socket.IO is significantly easier than using WebSocket which doesn't seem to be the case. See examples below.
The second misconception is that WebSocket is not widely supported in the browsers. See below for more info.
The third misconception is that Socket.IO downgrades the connection as a fallback on older browsers. It actually assumes that the browser is old and starts an AJAX connection to the server, that gets later upgraded on browsers supporting WebSocket, after some traffic is exchanged. See below for details.
My experiment
I wrote an npm module to demonstrate the difference between WebSocket and Socket.IO:
https://www.npmjs.com/package/websocket-vs-socket.io
https://github.com/rsp/node-websocket-vs-socket.io
It is a simple example of server-side and client-side code - the client connects to the server using either WebSocket or Socket.IO and the server sends three messages in 1s intervals, which are added to the DOM by the client.
Server-side
Compare the server-side example of using WebSocket and Socket.IO to do the same in an Express.js app:
WebSocket Server
WebSocket server example using Express.js:
var path = require('path');
var app = require('express')();
var ws = require('express-ws')(app);
app.get('/', (req, res) => {
console.error('express connection');
res.sendFile(path.join(__dirname, 'ws.html'));
});
app.ws('/', (s, req) => {
console.error('websocket connection');
for (var t = 0; t < 3; t++)
setTimeout(() => s.send('message from server', ()=>{}), 1000*t);
});
app.listen(3001, () => console.error('listening on http://localhost:3001/'));
console.error('websocket example');
Source: https://github.com/rsp/node-websocket-vs-socket.io/blob/master/ws.js
Socket.IO Server
Socket.IO server example using Express.js:
var path = require('path');
var app = require('express')();
var http = require('http').Server(app);
var io = require('socket.io')(http);
app.get('/', (req, res) => {
console.error('express connection');
res.sendFile(path.join(__dirname, 'si.html'));
});
io.on('connection', s => {
console.error('socket.io connection');
for (var t = 0; t < 3; t++)
setTimeout(() => s.emit('message', 'message from server'), 1000*t);
});
http.listen(3002, () => console.error('listening on http://localhost:3002/'));
console.error('socket.io example');
Source: https://github.com/rsp/node-websocket-vs-socket.io/blob/master/si.js
Client-side
Compare the client-side example of using WebSocket and Socket.IO to do the same in the browser:
WebSocket Client
WebSocket client example using vanilla JavaScript:
var l = document.getElementById('l');
var log = function (m) {
var i = document.createElement('li');
i.innerText = new Date().toISOString()+' '+m;
l.appendChild(i);
}
log('opening websocket connection');
var s = new WebSocket('ws://'+window.location.host+'/');
s.addEventListener('error', function (m) { log("error"); });
s.addEventListener('open', function (m) { log("websocket connection open"); });
s.addEventListener('message', function (m) { log(m.data); });
Source: https://github.com/rsp/node-websocket-vs-socket.io/blob/master/ws.html
Socket.IO Client
Socket.IO client example using vanilla JavaScript:
var l = document.getElementById('l');
var log = function (m) {
var i = document.createElement('li');
i.innerText = new Date().toISOString()+' '+m;
l.appendChild(i);
}
log('opening socket.io connection');
var s = io();
s.on('connect_error', function (m) { log("error"); });
s.on('connect', function (m) { log("socket.io connection open"); });
s.on('message', function (m) { log(m); });
Source: https://github.com/rsp/node-websocket-vs-socket.io/blob/master/si.html
Network traffic
To see the difference in network traffic you can run my test. Here are the results that I got:
WebSocket Results
2 requests, 1.50 KB, 0.05 s
From those 2 requests:
HTML page itself
connection upgrade to WebSocket
(The connection upgrade request is visible on the developer tools with a 101 Switching Protocols response.)
Socket.IO Results
6 requests, 181.56 KB, 0.25 s
From those 6 requests:
the HTML page itself
Socket.IO's JavaScript (180 kilobytes)
first long polling AJAX request
second long polling AJAX request
third long polling AJAX request
connection upgrade to WebSocket
Screenshots
WebSocket results that I got on localhost:
Socket.IO results that I got on localhost:
Test yourself
Quick start:
# Install:
npm i -g websocket-vs-socket.io
# Run the server:
websocket-vs-socket.io
Open http://localhost:3001/ in your browser, open developer tools with Shift+Ctrl+I, open the Network tab and reload the page with Ctrl+R to see the network traffic for the WebSocket version.
Open http://localhost:3002/ in your browser, open developer tools with Shift+Ctrl+I, open the Network tab and reload the page with Ctrl+R to see the network traffic for the Socket.IO version.
To uninstall:
# Uninstall:
npm rm -g websocket-vs-socket.io
Browser compatibility
As of June 2016 WebSocket works on everything except Opera Mini, including IE higher than 9.
This is the browser compatibility of WebSocket on Can I Use as of June 2016:
See http://caniuse.com/websockets for up-to-date info.
Its advantages are that it simplifies the usage of WebSockets as you described in #2, and probably more importantly it provides fail-overs to other protocols in the event that WebSockets are not supported on the browser or server. I would avoid using WebSockets directly unless you are very familiar with what environments they don't work and you are capable of working around those limitations.
This is a good read on both WebSockets and Socket.IO.
http://davidwalsh.name/websocket
tl;dr;
Comparing them is like comparing Restaurant food (maybe expensive sometimes, and maybe not 100% you want it) with homemade food, where you have to gather and grow each one of the ingredients on your own.
Maybe if you just want to eat an apple, the latter is better. But if you want something complicated and you're alone, it's really not worth cooking and making all the ingredients by yourself.
I've worked with both of these. Here is my experience.
SocketIO
Has autoconnect
Has namespaces
Has rooms
Has subscriptions service
Has a pre-designed protocol of communication
(talking about the protocol to subscribe, unsubscribe or send a message to a specific room, you must all design them yourself in websockets)
Has good logging support
Has integration with services such as redis
Has fallback in case WS is not supported (well, it's more and more rare circumstance though)
It's a library. Which means, it's actually helping your cause in every way. Websockets is a protocol, not a library, which SocketIO uses anyway.
The whole architecture is supported and designed by someone who is not you, thus you dont have to spend time designing and implementing anything from the above, but you can go straight to coding business rules.
Has a community because it's a library (you can't have a community for HTTP or Websockets :P They're just standards/protocols)
Websockets
You have the absolute control, depending on who you are, this can be very good or very bad
It's as light as it gets (remember, its a protocol, not a library)
You design your own architecture & protocol
Has no autoconnect, you implement it yourself if yo want it
Has no subscription service, you design it
Has no logging, you implement it
Has no fallback support
Has no rooms, or namespaces. If you want such concepts, you implement them yourself
Has no support for anything, you will be the one who implements everything
You first have to focus on the technical parts and designing everything that comes and goes from and to your Websockets
You have to debug your designs first, and this is going to take you a long time
Obviously, you can see I'm biased to SocketIO. I would love to say so, but I'm really really not.
I'm really battling not to use SocketIO. I dont wanna use it. I like designing my own stuff and solving my own problems myself.
But if you want to have a business and not just a 1000 lines project, and you're going to choose Websockets, you're going to have to implement every single thing yourself. You have to debug everything. You have to make your own subscription service. Your own protocol. Your own everything. And you have to make sure everything is quite sophisticated. And you'll make A LOT of mistakes along the way. You'll spend tons of time designing and debugging everything. I did and still do. I'm using websockets and the reason I'm here is because they're unbearable for a one guy trying to deal with solving business rules for his startup and instead dealing with Websocket designing jargon.
Choosing Websockets for a big application ain't an easy option if you're a one guy army or a small team trying to implement complex features. I've wrote more code in Websockets than I ever wrote with SocketIO in the past, for ten times simpler things than I did with SocketIO.
All I have to say is ... Choose SocketIO if you want a finished product and design. (unless you want something very simple in functionality)
Im going to provide an argument against using socket.io.
I think using socket.io solely because it has fallbacks isnt a good idea. Let IE8 RIP.
In the past there have been many cases where new versions of NodeJS has broken socket.io. You can check these lists for examples... https://github.com/socketio/socket.io/issues?q=install+error
If you go to develop an Android app or something that needs to work with your existing app, you would probably be okay working with WS right away, socket.io might give you some trouble there...
Plus the WS module for Node.JS is amazingly simple to use.
Using Socket.IO is basically like using jQuery - you want to support older browsers, you need to write less code and the library will provide with fallbacks. Socket.io uses the websockets technology if available, and if not, checks the best communication type available and uses it.
https://socket.io/docs/#What-Socket-IO-is-not (with my emphasis)
What Socket.IO is not
Socket.IO is NOT a WebSocket implementation. Although Socket.IO indeed uses WebSocket as a transport when possible, it adds some metadata to each packet: the packet type, the namespace and the packet id when a message acknowledgement is needed. That is why a WebSocket client will not be able to successfully connect to a Socket.IO server, and a Socket.IO client will not be able to connect to a WebSocket server either. Please see the protocol specification here.
// WARNING: the client will NOT be able to connect!
const client = io('ws://echo.websocket.org');
I would like provide one more answer in 2021. socket.io has become actively maintained again since 2020 Sept. During 2019 to 2020 Aug(almost 2 years) there was basically no activity at all and I had thought the project may be dead.
Socket.io also published an article called Why Socket.IO in 2020?, except for a fallback to HTTP long-polling, I think these 2 features are what socket.io provides and websocket lacks of
auto-reconnection
a way to broadcast data to a given set of clients (rooms/namespace)
One more feature I find socket.io convenient is for ws server development, especially I use docker for my server deployment. Because I always start more than 1 server instances, cross ws server communication is a must and socket.io provide https://socket.io/docs/v4/redis-adapter/ for it.
With redis-adapter, scaling server process to multiple nodes is easy while load balance for ws server is hard. Check here https://socket.io/docs/v4/using-multiple-nodes/ for further information.
Even if modern browsers support WebSockets now, I think there is no need to throw SocketIO away and it still has its place in any nowadays project. It's easy to understand, and personally, I learned how WebSockets work thanks to SocketIO.
As said in this topic, there's a plenty of integration libraries for Angular, React, etc. and definition types for TypeScript and other programming languages.
The other point I would add to the differences between Socket.io and WebSockets is that clustering with Socket.io is not a big deal. Socket.io offers Adapters that can be used to link it with Redis to enhance scalability. You have ioredis and socket.io-redis for example.
Yes I know, SocketCluster exists, but that's off-topic.
Socket.IO uses WebSocket and when WebSocket is not available uses fallback algo to make real time connections.
TLDR:
'Socket.io' is an application layer specification that can be implemented on top of/using the application layer specification 'websockets'.
websocket spec
socket.io spec
I think the simple answer here is in basic web technology definitions:
Specification: A documented standard detailing the requirements for a program to achieve in order to be labeled as "an implimentation of some sepc." It is important to achieve this rubber stamp when building programs, because any program is only as good at the machine executing the code. Programming is fundamentally built upon specifications, and if, they are not followed code will not execute correctly. However, a specification does nothing. It is just a text document.
Implementation: This is actual, executable code that accomplishes what the specification says to do.
Application Layer - System that defines messages and handshakes sent over transport. This is the stuff you have to know when working with HTTP/Websockets/Socketio. It defines how the connections will be made, authenticated, data will be sent, and how it will arrive.

Resources