I'm creating a TCP server in Node.js as follows:
var net = require('net');
var clients = [];

// Start a TCP server
net.createServer(function (socket) {
  socket.name = socket.remoteAddress + ":" + socket.remotePort;
  clients.push(socket);

  socket.write("Welcome " + socket.name + "\n");
  broadcast(socket.name + " joined the chat\n", socket);

  socket.on('data', function (data) {
    broadcast(socket.name + "> " + data, socket);
  });

  socket.on('end', function () {
    clients.splice(clients.indexOf(socket), 1);
    broadcast(socket.name + " left the chat.\n");
  });

  function broadcast(message, sender) {
    clients.forEach(function (client) {
      if (client === sender) return;
      client.write(message);
    });
  }
}).listen(5000);
Coming from architectures like PHP + Nginx, no matter how I mess up the code, there is (in most cases) no way I can crash my Nginx server; worst case, one of my users gets a 500 error page and life continues. But in Node.js, if I for example forget to check that a denominator sent by the user must be greater than zero and then try to do something like something/0, the whole server is going to crash, since I'm actually creating the server plus the app, and not only the app as in PHP. What are the best practices for writing Node.js effectively to mitigate the possibility of crashing your server? Just one big ugly try/catch that wraps the whole code?
I personally found the following article by joyent quite illuminating:
https://www.joyent.com/developers/node/design/errors
You should read it!
The author distinguishes between operational errors and programmer errors.
Operational errors are "run-time problems experienced by correctly-written programs." These are mostly things going wrong in external 'stuff', including users. Any robust program should try to deal with these problems as well as possible through proper error-handling. In your case, your program should check for valid user input, including value-ranges.
Programmer errors "are bugs in the program". The author argues that the program shouldn't even try to recover from bugs: if you could have anticipated the bug, it wouldn't be there in the first place, so how can you expect to write code to correct a situation you didn't anticipate? If there is a situation that the programmer didn't anticipate (correctly) which leads to problems, just crash. This is less risky than continuing to run your software, which might now be in an undefined state.
Since you don't want downtime, this also means that you should run your software inside a 'restarter', something that will restart your software once it crashes. For this, I've used pm2 in the past, which works well imho.
Simply don't allow your server to ever crash. Or, put another way: write fault-tolerant code.
If you have code like this on your server:
function calculateValue(a, b) {
  return a / b;
}
... then don't let b ever be zero. Safeguard it by either validating the inputs (e.g., check b !== 0 and spit back an HTTP 400 if it's false) or by defaulting (e.g., b = b <= 0 ? 1 : b).
Then, and this is the most important part: TEST YOUR CODE. Better yet, test your code first (classic test-driven development) by writing every kind of "happy path" and "edge case" tests you can think of. That will force you to write high quality, stable, predictable code that greatly lessens the possibility of application-crashing bugs.
Related
I have made a Node.js script which checks for new entries in a MySQL database and uses socket.io to send data to the client's web browser. The script is meant to check for new entries approximately every 2 seconds. I am using Forever to keep the script running as this is hosted on a VPS.
I believe what's happening is that the for loop is looping infinitely (more on why I think that's the issue below). There are no error messages in the Forever-generated log file and the script is "running" even when it has started to hang up. Specifically, when it hangs, the script stops accepting browser requests at port 8888 and doesn't serve the client-side socket.io js files. I've done some troubleshooting and identified a few key components that may be causing this issue, but at the end of the day, I'm not sure why it's happening and can't seem to find a workaround.
Here is the relevant part of the code:
http.listen(8888, function () {
  console.log("Listening on 8888");
});

function checkEntry() {
  pool.getConnection(function (err, connection) {
    connection.query("SELECT * FROM `data_alert` WHERE processtime > " + (Math.floor(new Date() / 1000) - 172800) + " AND pushed IS NULL", function (err, rows) {
      connection.release();
      if (!err) {
        if (Object.keys(rows).length > 0) {
          var x;
          for (x = 0; x < Object.keys(rows).length; x++) {
            connection.query("UPDATE `data_alert` SET pushed = 1 WHERE id = " + rows[x]['id'], function () {
              connection.release();
              io.emit('refresh feed', 'refresh');
            });
          }
        }
      }
    });
  });
  setTimeout(function () { checkEntry(); var d = new Date(); console.log(d.getTime()); }, 1000);
}

checkEntry();
checkEntry();
Just a few interesting things I've discovered while troubleshooting...
This only happens when I run the script under Forever. It works completely fine if I use a shell and just leave my terminal open.
It starts to happen after 5-30 minutes of running the script, it does not immediately hang up on the first execution of the checkEntry function.
I originally tried this with setInterval instead of setTimeout, the issue has remained exactly the same.
If I remove the setInterval/setTimeout function and run the checkEntry function only once, it does not hang up.
If I take out the javascript for loop in the checkEntry function, the hang ups stop (but obviously, that for loop controls necessary functionality so I have to at least find another way of using it).
I've also tried using a for-in loop for the rows object and the performance is exactly the same.
Any ideas would be immensely helpful at this point. I started working with Node.js just recently so there may be a glaringly obvious reason that I'm missing here.
Thank you.
So I just wanted to come back to this and address what the issue was. It took me quite some time to figure out and it can only be explained by my own inexperience. There is a section to my script where my code contained the following:
app.get("/", (request, response) => {
  // Some code to log things to the console here.
});
The issue was that I was not sending a response. The new code looks as follows and has resolved my hang up issues:
app.get("/", (request, response) => {
  // Some code to log things to the console here.
  response.send("OK");
});
The issue had nothing to do with the part of the code I presented in the initial question.
I found the following on the ExpressJS guide:
var mysql = require('mysql');
var connection = mysql.createConnection({
  host     : 'localhost',
  user     : 'dbuser',
  password : 's3kreee7'
});

connection.connect();

connection.query('SELECT 1 + 1 AS solution', function (err, rows, fields) {
  if (err) throw err;
  console.log('The solution is: ', rows[0].solution);
});

connection.end();
Isn't this supposed to be bad practice? The way I see it, it is possible for the connection to end before the query can be executed. Wouldn't that give an error?
As stated here:
Every method you invoke on a connection is queued and executed in sequence.
Closing the connection is done using end() which makes sure all remaining queries are executed before sending a quit packet to the mysql server.
So even though the call to the end() method can be made before the query has completed, it won't actually be executed until the query has finished executing.
This has to do more with the mysql package than NodeJS itself.
Your questions, "How does async work in Express?" and "Isn't this supposed to be bad practice?", can be answered in many ways, but for clarity: it depends!
It generally is very bad practice if you don't know the actual implementation.
If the implementation is really simple and does exactly what you ask, i.e. closes the connection the moment end() is executed, then it could lead to rather ugly race conditions where it may or may not work depending on the load of the machines.
However, a clever implementation that does reference counting, where end() does not actually close the connection but just sets a flag saying "when the last callback is done, then close", may work.
If the mysql connector is implemented using reference counting then this may well work fine, but that is not the same as saying it is good practice for everything you find as a plugin.
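A toy model of that queueing behaviour shows why end() issued right after query() is still safe. The class and method names here are hypothetical, not the driver's real internals; the point is only that commands run strictly in the order they were enqueued:

```javascript
// Toy command queue: query() and end() just enqueue; the quit command
// can never overtake a query issued before it.
function FakeConnection() {
  this.queue = [];
  this.log = [];
}

FakeConnection.prototype.query = function (sql) {
  this.queue.push({ type: 'query', sql: sql });
};

FakeConnection.prototype.end = function () {
  this.queue.push({ type: 'quit' });
  this.flush(); // drain the queue in order
};

FakeConnection.prototype.flush = function () {
  var cmd;
  while ((cmd = this.queue.shift())) {
    this.log.push(cmd.type === 'query' ? 'ran: ' + cmd.sql : 'closed');
  }
};

var conn = new FakeConnection();
conn.query('SELECT 1 + 1');
conn.end();
console.log(conn.log); // the query runs before the connection closes
```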
(Using Sails.js)
I am testing webworker-threads ( https://www.npmjs.com/package/webworker-threads ) for long running processes on Node and the following example looks good:
var Worker = require('webworker-threads').Worker;

var fibo = new Worker(function () {
  function fibo(n) {
    return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
  }

  this.onmessage = function (event) {
    try {
      postMessage(fibo(event.data));
    } catch (e) {
      console.log(e);
    }
  };
});

fibo.onmessage = function (event) {
  // my return callback
};

fibo.postMessage(40);
But as soon as I add any code to query Mongodb, it throws an exception:
(not using the Sails model in the query, just to make sure the code could run on its own -- db has no password)
var Worker = require('webworker-threads').Worker;

var fibo = new Worker(function () {
  function fibo(n) {
    return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
  }

  // MY DB TEST -- THIS WORKS FINE OUTSIDE THE WORKER
  function callDb(event) {
    var db = require('monk')('localhost/mydb');
    var users = db.get('users');
    users.find({ "firstName": "John" }, function (err, docs) {
      console.log("serviceSuccess");
      return fibo(event.data);
    });
  }

  this.onmessage = function (event) {
    try {
      postMessage(callDb(event.data)); // calling the db function now
    } catch (e) {
      console.log(e);
    }
  };
});

fibo.onmessage = function (event) {
  // my return callback
};

fibo.postMessage(40);
Since the DB code works perfectly fine outside the Worker, I think it has something to do with the require. I've tried something that also works outside the Worker, like
var moment = require("moment");
var deadline = moment().add(30, "s");
And the code also throws an exception. Unfortunately, console.log only shows this for all types of errors:
{Object}
{/Object}
So, the questions are: is there any restriction or guideline for using require inside a Worker? What could I be doing wrong here?
UPDATE
It seems Threads will not allow external modules:
https://github.com/xk/node-threads-a-gogo/issues/22
TL;DR: I think that if you need to require, you should use a node's cluster or child process. If you want to offload some cpu busy work, you should use tagg and the load function to grab any helpers you need.
Upon reading this thread, I see that this question is similar to this one:
Load Nodejs Module into A Web Worker
To which Audreyt, the webworker-threads author answered:
author of webworker-threads here. Thank you for using the module!
There is a default native_fs_ object with the readFileSync you can use to read files.
Beyond that, I've mostly relied on onejs to compile all required modules in package.json into a single JS file for importScripts to use, just like one would do when deploying to a client-side web worker environment. (There are also many alternatives to onejs -- browserify, etc.)
Hope this helps!
So it seems importScripts is the way to go. But at this point, it might be too hacky for what I want to do, so probably KUE is a more mature solution.
I'm a collaborator on the node-webworker-threads project.
You can't require in node-webworker-threads
You are correct in your update: node-webworker-threads does not (currently) support requiring external modules.
It has limited support for some of the built-ins, including file system calls and a version of console.log. As you've found, the version of console.log implemented in node-webworker-threads is not identical to the built-in console.log in Node.js; it does not, for example, automatically make nice string representations of the components of an Object.
In some cases you can use external modules, as outlined by audreyt in her response. Clearly this is not ideal, and I view the incomplete require as the primary "dealbreaker" of node-webworker-threads. I'm hoping to work on it this summer.
When to use node-webworker-threads
node-webworker-threads allows you to code against the WebWorker API and run the same code in the client (browser) and the server (Node.js). This is why you would use node-webworker-threads over node-threads-a-gogo.
node-webworker-threads is great if you want the most lightweight possible JavaScript-based workers, to do something CPU-bound. Examples: prime numbers, Fibonacci, a Monte Carlo simulation, offloading built-in but potentially-expensive operations like regular expression matching.
When not to use node-webworker-threads
node-webworker-threads emphasizes portability over convenience. For a Node.js-only solution, this means that node-webworker-threads is not the way to go.
If you're willing to compromise on full-stack portability, there are two ways to go: speed and convenience.
For speed, try a C++ add-on. Use NaN. I recommend Scott Frees's C++ and Node.js Integration book to learn how to do this; it'll save you a lot of time. You'll pay for it in needing to brush up on your C++ skills, and if you want to work with MongoDB then this probably isn't a good idea.
For convenience, use a Child Process-based worker pool like fork-pool. In this case, each worker is a full-fledged Node.js instance. You can then require to your heart's content. You'll pay for it in a larger application footprint and in higher communication costs compared to node-webworker-threads or a C++ add-on.
I'm about to start coding a chat bot. However, I plan on running more than one, using a wrapper to communicate and restart them. I have done this in the past with child_process.fork(), but it was incredibly inefficient. I've looked into spawn and cluster as well, but they all seem to focus on running the same thing, not unique bots. As for plugins, I've looked into fleet, forkfriend, and workerfarm, but none seem to fit my needs.
Is there any plugin or way I'm not seeing to help me do this? Or am I just going to have to wing it again?
You can have as many chat bots as you wish in a single process. The rule of thumb in Node.js is one process per processor core, since Node has a rather different multithreading model than you might be used to.
Assuming you still need some multithreading on top of this, here are a couple of Node modules you might find fitting your needs:
node-webworker-threads, dnode.
UPDATE:
Now I see what you need. There is a nice example in Node.js docs, which I saw recently. I just copy & paste it here:
var normal = require('child_process').fork('child.js', ['normal']);
var special = require('child_process').fork('child.js', ['special']);

// Open up the server and send sockets to the children
var server = require('net').createServer();
server.on('connection', function (socket) {
  // if this is a VIP
  if (socket.remoteAddress === '74.125.127.100') {
    special.send('socket', socket);
    return;
  }
  // just the usual dudes
  normal.send('socket', socket);
});
server.listen(1337);
child.js looks like this:
process.on('message', function (m, socket) {
  if (m === 'socket') {
    socket.end('You were handled as a ' + process.argv[2] + ' person');
  }
});
I believe it's pretty much what you need. Launch several processes with different configs (if number of configs is relatively low) and pass socket to a particular one from master process.
I've run into an issue with NodeJS where, due to some middleware, I need to directly return a value which requires knowing the last modified time of a file. Obviously the correct way would be to do
getFilename: function (filename, next) {
  fs.stat(filename, function (err, stats) {
    // Do error checking, etc...
    next('', filename + '?' + new Date(stats.mtime).getTime());
  });
}
however, due to the middleware I am using, getFilename must return a value, so I am doing:
getFilename: function (filename) {
  var stats = fs.statSync(filename);
  return filename + '?' + new Date(stats.mtime).getTime();
}
I don't completely understand the nature of the Node.js event loop, so what I was wondering is this: does statSync have any special sauce that somehow pumps the event loop (or whatever it's called in Node, the queue of instructions waiting to be performed) while the file information is loading? Or is it really blocking, meaning this code is going to cause performance nightmares down the road and I should rewrite the middleware I'm using to take a callback? If it does have special sauce that lets the event loop continue while it waits on the disk, is that available anywhere else (through some promise library or something)?
Nope, there is no magic here. If you block in the middle of the function, everything is blocked.
If performance becomes an issue, I think your only option is to rewrite that part of the middleware, or get creative with how it is used.