I'm using NodeJS to run a socket server (using socket.io). When a client connects, I open and run a module which does a bunch of stuff. Even though I am careful to try and catch as much as possible, when this module throws an error, it takes down the entire socket server with it.
Is there a way I can separate the two so if the connected clients module script fails, it doesn't necessarily take down the entire server?
I'm assuming this is what child process is for, but the documentation doesn't mention starting other node instances.
I'd obviously need to kill the process if the client disconnected too.
I'm assuming these modules you're talking about are JS code. If so, you might want to try the vm module. This lets you run code in a separate context, and also gives you the ability to do a try / catch around execution of the specific code.
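A minimal sketch of that idea, assuming the module's code is available as a string. The helper name runIsolated and the one second timeout are made up for illustration. Note that the try / catch only sees errors thrown synchronously while the script runs; anything thrown later from a callback scheduled by that code will not be caught here.

```javascript
const vm = require('vm');

// runIsolated is a hypothetical helper, not part of the vm module.
function runIsolated(source, sandbox = {}) {
  try {
    // The code runs in a fresh context; `sandbox` becomes its global object.
    const value = vm.runInNewContext(source, sandbox, { timeout: 1000 });
    return { ok: true, value };
  } catch (err) {
    // Synchronous errors (and timeouts) land here instead of killing the server.
    return { ok: false, error: err };
  }
}

console.log(runIsolated('x * 2', { x: 21 }));
console.log(runIsolated('throw new Error("boom")'));
```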
You can run node as a separate process and watch the data go by using spawn, then watch the stderr/stdout/exit events to track any progress. Then kill can be used to kill the process if the client disconnects. You're going to have to map clients and spawned processes though so their disconnect event will trigger the process close properly.
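Roughly, the client-to-process mapping could look like the sketch below. The names workers, startWorkerFor and stopWorkerFor are invented for this example, and client-module.js in the commented wiring is a placeholder for whatever script you actually run per client.

```javascript
const { spawn } = require('child_process');

// clientId -> ChildProcess, so a disconnect can find the right process to kill.
const workers = new Map();

function startWorkerFor(clientId, nodeArgs) {
  // process.execPath is the node binary that is running this server.
  const child = spawn(process.execPath, nodeArgs, { stdio: ['ignore', 'pipe', 'pipe'] });
  workers.set(clientId, child);

  child.stdout.on('data', (chunk) => console.log(`[${clientId}] ${chunk}`));
  child.stderr.on('data', (chunk) => console.error(`[${clientId}] ${chunk}`));
  child.on('error', (err) => console.error(`[${clientId}] failed to start: ${err.message}`));
  child.on('exit', (code, signal) => {
    // Only forget the entry if it still points at this child.
    if (workers.get(clientId) === child) workers.delete(clientId);
    console.log(`[${clientId}] exited (code=${code}, signal=${signal})`);
  });
  return child;
}

function stopWorkerFor(clientId) {
  const child = workers.get(clientId);
  if (!child) return false;
  workers.delete(clientId);
  return child.kill(); // sends SIGTERM by default
}

// Hypothetical socket.io wiring:
// io.on('connection', (socket) => {
//   startWorkerFor(socket.id, ['client-module.js']);
//   socket.on('disconnect', () => stopWorkerFor(socket.id));
// });
```

A crash inside the child only produces an exit event in the server, so the socket server itself keeps running.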
Finally the uncaughtException event can be used as a "catch-all" for any missed exceptions, making it so that the server doesn't get completely killed (signals are a bit of an exception of course).
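A bare-bones version of such a handler might look like this (logAndKeepRunning is just a name picked for the example). The Node.js documentation cautions that the process can be left in an undefined state after an uncaught exception, so this is best treated as a last resort for logging rather than as normal error handling.

```javascript
// Hypothetical catch-all: log the error instead of letting the process die.
function logAndKeepRunning(err) {
  console.error('Uncaught exception:', (err && err.stack) || err);
}

process.on('uncaughtException', logAndKeepRunning);
```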
As the other poster noted, you could leverage the 'vm' module, but as you might be able to tell from the rest of the response, doing so adds significant complexity.
Also, from the 'vm' doc:
Note that running untrusted code is a tricky business requiring great care.
To prevent accidental global variable leakage, vm.runInNewContext is quite
useful, but safely running untrusted code requires a separate process.
While I'm sure you could run a new nodejs instance in a child process, the best practice here is to understand where your application can and will fail, and then program defensively to handle all possible error conditions.
If some part of your code "take(s) down the entire ... server", then you really need to understand why this occurred and solve that problem rather than rely on another process to shield you from the work required to design and build a production-quality service.
Related
I have a question regarding the examples out there when using Nodejs, Express and Jade for templates.
All the examples show how to build some sort of a user administrative interface where you can add user profiles, delete them and manage them.
Those are considered beginner's guides to NodeJs. My question is around the fact that if I have 10 users concurrently accessing the same interface and doing the same operations, surely NodeJs will block the requests for the other users as they are running on the same port.
So let's say I am pulling out a list of users which may be something like 10000. Yes I can do paging, but that is not the point. While I am getting the list from the server another 4 users want to access the application. They have to wait for my process to end. That is my question - how can one avoid that using NodeJS & Express?
I have been on this issue for a couple of months! I currently have something in place that does the following:
Run the main processing of stuff on a port
Run a Socket.io process on a different port
Use a sticky session
The idea is that I do a request (like getting a list of items), and immediately respond with some request reference but without the requested items, thus releasing the port.
In the background, "asynchronously", I then do the work of getting the items. When that has completed, I make an HTTP request from the main node process to the socket node's port, sending the items through.
When that is done I then perform a socket.io emit WITH the data and the initial request reference so that the correct user gets the message.
On the client side I have an event listening for the socket which then completes the ajax request by populating the list.
I have SOME success in doing this! It actually works to a degree! I have an issue online which complicates matters due to ip addresses, and socket.io playing funny.
I also have multiple workers using clustering. I use it in the following manner:
I create a master worker
I spawn workers
I take any connection request and pass it to the relevant worker.
I do that for the main node request as well as for the socket requests. Like I said I use 2 ports!
As you can see I have had a lot of work done on this and I am not getting a proper solution!
My question is this - have I gone all around the world 10 times only to have missed something simple? This sounds way too complicated to achieve a non-blocking nodejs only website.
I asked myself - surely all these tutorials would not have missed something as important as this! But they did!
I have researched, read, and tested a lot of code - this is the very first time I have asked anything on stackoverflow!
Thank you for any assistance.
P.S. One example of the same approach is this: I request a report using jasper, I pass parameters, and with the "delayed ajax response" approach as described above I simply release the port, and in the background a very intensive report is being generated (and this can be very intensive process as a lot of calculations are being performed)..! I really don't see a better approach - any help will be super appreciated!
Thank you for taking the time to read!
I'm sorry to say it, but yes, you have been going around the world 10 times only to have been missing something simple.
It's obvious that your previous knowledge/experience with webservers is from a blocking point of view, and if node worked that way, your concerns would be valid.
Node.js is built around a single thread executing your code, which means that if it performs a blocking operation, no one else can get anything done until it finishes.
Some operations in node can block like this, such as the synchronous versions of the disk read/write calls (for example fs.readFileSync). However, most node operations are asynchronous.
I believe you are familiar with the term, so I won't go into details. What asynchronous operations allow node to do is keep this single thread idle as much as possible. By idle I mean open for other work. If your code is fully asynchronous, then handling 4 concurrent users (or even 400) shouldn't be a problem, even for a single thread.
Now, in regards to your initial problem of ports: once a request is received on a given port, node.js executes whatever code you have written for it until it encounters an asynchronous operation. As soon as that happens, it is available to pick up more requests on the same port.
The second problem you ask about is the database operation. In this case, node.js sends the query to the database (which takes almost no time at all) and the database does the actual execution of the query. In the meantime, node is free to do whatever it wants until the database is finished and lets node know there is a result to fetch.
You can recognize async operations by their structure: my_function(..., ..., callback). A function that takes a callback function is, in most cases, async.
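To make the interleaving concrete, here is a small sketch. fakeDbQuery is a stand-in I made up (it just uses setTimeout to pretend the database takes 50 ms), and handleRequest plays the role of a route handler.

```javascript
const events = [];

// Stand-in for a database call: the result arrives later via the callback.
function fakeDbQuery(sql, callback) {
  setTimeout(() => callback(null, ['alice', 'bob']), 50);
}

function handleRequest(requestId) {
  events.push(`start ${requestId}`);
  fakeDbQuery('SELECT name FROM users', (err, rows) => {
    if (err) throw err;
    events.push(`done ${requestId} (${rows.length} rows)`);
  });
  // Returns immediately; the thread is free to pick up the next request.
}

handleRequest(1);
handleRequest(2);
console.log(events); // both requests have started, neither has finished yet
setTimeout(() => console.log(events), 100); // by now both callbacks have run
```

The second request starts before the first one's "query" comes back, which is exactly why the other users do not have to wait.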
So bottom line: Don't worry about the problems around blocking IO, as you will hardly encounter any in node. Use a single port if you want (By creating multiple child processes, you can even have multiple node instances on the same port).
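For the multiple-instances-on-one-port part, a rough sketch with the built-in cluster module could look like this. startClusteredServer and port 3000 are assumptions for the example, and cluster.isPrimary needs a reasonably recent node (older versions call it cluster.isMaster).

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Hypothetical helper: one primary process forks workers that share `port`.
function startClusteredServer(port, workerCount = os.cpus().length) {
  if (cluster.isPrimary) {
    for (let i = 0; i < workerCount; i += 1) cluster.fork();
    cluster.on('exit', (worker) => {
      console.log(`worker ${worker.process.pid} exited, starting a replacement`);
      cluster.fork();
    });
    return;
  }
  // Each worker re-runs this file and ends up here, listening on the same port.
  http.createServer((req, res) => {
    res.end(`handled by pid ${process.pid}\n`);
  }).listen(port);
}

// Usage (not called here so the snippet has no side effects):
// startClusteredServer(3000);
```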
Hope this explains it well enough. If you have any further questions, let me know :)
We have a C# Web API server and a Node Express server. We make hundreds of requests from the C# server to a route on the Node server. The route on the Node server does intensive work and often doesn't return for 6-8 seconds.
Making hundreds of these requests simultaneously seems to cause the Node server to fail. Errors in the Node server output include either socket hang up or ECONNRESET. The error from the C# side says
No connection could be made because the target machine actively refused it.
This error occurs after processing an unpredictable number of the requests, which leads me to think it is simply overloading the server. Using a Thread.Sleep(500) on the C# side allows us to get through more requests, and fiddling with the wait there leads to more or less success, but thread sleeping is rarely if ever the right answer, and I think this case is no exception.
Are we simply putting too much stress on the Node server? Can this only be solved with Load Balancing or some form of clustering? If there is an another alternative, what might it look like?
One path I'm starting to explore is the node-toobusy module. If I return a 503 though, what should be the process in the following code? Should I Thread.Sleep and then re-submit the request?
It sounds like your node.js server is getting overloaded.
The route on the Node server does intensive work and often doesn't return for 6-8 seconds.
This is a bad smell - if your node process is doing intense computation, it will halt the event loop until that computation is completed, and won't be able to handle any other requests. You should probably have it doing that computation in a worker process, which will run on another cpu core if available. cluster is the node builtin module that lets you do that, so I'll point you there.
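As a rough illustration of moving the computation out of the event loop, the sketch below runs a deliberately slow Fibonacci in a separate node process and gets the result back over the IPC channel. Everything here (fibInChildProcess, the inline worker source) is invented for the example; to keep it in one file it uses spawn with an 'ipc' stdio entry, whereas with a separate worker file you would more likely use child_process.fork, which sets up the same channel for you.

```javascript
const { spawn } = require('child_process');

// Source of the worker process, inlined so the example is a single file.
const workerSource = `
  function fib(n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }
  process.once('message', (n) => {
    // Reply, then close the channel so the worker can exit.
    process.send(fib(n), () => process.disconnect());
  });
`;

function fibInChildProcess(n) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', workerSource], {
      stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
    });
    child.once('message', resolve);
    child.once('error', reject);
    child.once('exit', (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
    child.send(n);
  });
}

// The event loop stays free while the child is busy computing.
fibInChildProcess(25).then((value) => console.log('fib(25) =', value));
```

Starting a process per request is expensive, so for hundreds of simultaneous requests you would probably keep a small pool of workers around; that part is left out here.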
One path I'm starting to explore is the node-toobusy module. If I return a 503 though, what should be the process in the following code? Should I Thread.Sleep and then re-submit the request?
That depends on your application and your expected load. You may want to retry once or twice if it's likely that things will cool down enough during that time, but for your API you probably just want to return a 503 in C# too - better to let the client know the server's too busy and let them make their own decision than to keep retrying on its behalf.
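On the Node side, shedding load with a 503 can be sketched without any library. This is a hand-rolled stand-in, not the node-toobusy API: as far as I know that module looks at event loop lag, while this example simply counts in-flight requests. createLimiter, the threshold of 50 and the setTimeout placeholder for the real work are all assumptions.

```javascript
const http = require('http');

// Hypothetical in-flight request counter.
function createLimiter(maxInFlight) {
  let inFlight = 0;
  return {
    tryAcquire() {
      if (inFlight >= maxInFlight) return false;
      inFlight += 1;
      return true;
    },
    release() {
      if (inFlight > 0) inFlight -= 1;
    },
  };
}

const limiter = createLimiter(50); // 50 is an arbitrary threshold

const server = http.createServer((req, res) => {
  if (!limiter.tryAcquire()) {
    // Tell the caller to back off instead of queueing more work.
    res.writeHead(503, { 'Retry-After': '1' });
    res.end('Server too busy\n');
    return;
  }
  res.once('close', () => limiter.release());
  setTimeout(() => res.end('done\n'), 100); // placeholder for the real work
});

// server.listen(3000);
```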
In the parent process, I have started the tiny-lr (livereload) server and then spawned a child process which watches for changes to the CSS files. How can I pass the livereload server on to the child process? Or is it possible to query, from the child process, for the livereload server that is currently running, so that I don't create it again and get an "already in use" error for the port?
The same applies to a node http server: can I find out whether the server is already running and use that instead of creating a new one?
is it possible to query for the livereload - it is possible and may be implemented in more than one way.
Use stdout/stdin to communicate with the child process. For detailed description look HERE. Basically you can send messages from one process to the other and reply to them.
Use http.request to check if the port is in use.
You can use a file: the process with the server keeps the file open in the write mode - the content of the file stores the port on which the server runs (if needed).
You can use sockets for inter-process communication, as well.
Basically, none of the above guarantees 100% certainty, so you have to try/catch for errors anyway: the server may die just after your check, but before you do something with it.
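For the http.request option, a probe could look roughly like this. isServerRunning is a name made up for the example, and it only checks 127.0.0.1; a true result just means something answered HTTP on that port at the time of the check.

```javascript
const http = require('http');

// Hypothetical probe: calls back with true if something answers HTTP on `port`.
function isServerRunning(port, callback) {
  let finished = false;
  const finish = (running) => {
    if (finished) return;
    finished = true;
    callback(running);
  };

  const req = http.request(
    { host: '127.0.0.1', port, method: 'HEAD', path: '/', agent: false, timeout: 1000 },
    (res) => {
      res.resume(); // discard any body
      finish(true);
    }
  );
  req.on('timeout', () => req.destroy(new Error('probe timed out')));
  req.on('error', () => finish(false)); // e.g. ECONNREFUSED: nothing is listening
  req.end();
}

// isServerRunning(35729, (running) => console.log('already running?', running));
```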
how to pass on the livereload server to the child process - if you mean sharing an object between different processes, then it is for sure out of the question; if you mean changing the ownership of the object, then I am some 99.99% sure it is not possible either.
What is the problem with having just one process responsible for running the server? And why not use, let's say, forever to take care of running and restarting the server, if needed?
Say I have a rest end point which when called starts a long running process server side e.g.
http://host/api/program/start
and I want to push any updates / output from that process from the server side to a client.
I'm thinking the rest call would return some sort of unique id which the client could then use when connecting to the websocket to only receive updates about that particular process.
I'd have to think about buffering the output / updates from the process to send to the client if they didn't connect before the first output from the process but irrespective of that, what would be the best way of achieving the socket data handling for this? Could I make use of the socket.io rooms / namespaces in some way?
If you really want to do it this way, I would suggest generating the ID via the initial start call, then passing that to the long running process as an argument. Then that process publishes all messages to that ID (which appropriate clients are listening to as well).
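Here is a sketch of the ID idea including the buffering mentioned in the question, kept in memory with a plain EventEmitter so it runs on its own. startJob, publish and subscribe are invented names. With socket.io you would presumably swap the emitter for rooms: the client's socket joins a room named after the ID (socket.join(id)) and the server emits with io.to(id).emit(...).

```javascript
const { EventEmitter } = require('events');
const { randomUUID } = require('crypto');

const bus = new EventEmitter();
const backlog = new Map(); // jobId -> messages produced before anyone subscribed

// Called by the REST handler: returns the ID the client will later subscribe with.
function startJob() {
  const jobId = randomUUID();
  backlog.set(jobId, []);
  return jobId;
}

// Called by the long running process for every update.
function publish(jobId, message) {
  if (bus.listenerCount(jobId) > 0) {
    bus.emit(jobId, message);
  } else if (backlog.has(jobId)) {
    backlog.get(jobId).push(message); // nobody is listening yet, keep it
  }
}

// Called when the client connects: replay the buffer, then go live.
function subscribe(jobId, onMessage) {
  for (const message of backlog.get(jobId) || []) onMessage(message);
  backlog.delete(jobId);
  bus.on(jobId, onMessage);
}

const id = startJob();
publish(id, 'step 1 done');
subscribe(id, (message) => console.log(id, message));
publish(id, 'step 2 done');
```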
However, I would discourage you from going with this approach. There are plenty of ways to go about handling a child process in Node, so you might want to look into these options a little more so you don't end up dealing with zombie processes all over the place.
The first that comes to mind is ChildProcess. Another option would be something like WebWorker Threads. Either of these would be right in the vein of what (I think) you're trying to do, but allow you to maintain much more control over the child processes.
I want to be able to kill my old process after a code update without any downtime.
In Ruby, I do this by using Unicorn. When you send the Unicorn master process a USR1 kill signal, it spawns a new copy of itself, which means all the libraries get loaded from file again. There's a callback available when a Unicorn master has loaded its libraries and its worker processes are ready to handle requests; you can then put code in here to kill the old master by its PID. The old master will then shut down its worker processes systematically, waiting for them to conclude any current requests. This means you can deploy code updates without dropping a single request, with 0 downtime.
Is it possible to do this in Node? There seem to be a lot of libraries vaguely to do with this sort of thing - frameworks that seem to just do mindless restarting of the process after a crash, and stuff like that - but I can't find anything that's a stripped-down implementation of this basic pattern. If possible I'd like to do it myself, and it wouldn't be that hard - I just need to be able to do http.createServer().listen() and specify a socket file (which I'll configure nginx to send requests to), rather than a port.
Both the net and http modules have versions of listen that take a path to a socket and fire their callback once the server has been bound.
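For example, something along these lines (listenOnSocket and the /tmp/app.sock path in the usage comment are placeholders; this assumes a Unix-like system):

```javascript
const http = require('http');

// Hypothetical helper: bind an HTTP server to a Unix socket file.
function listenOnSocket(socketPath, onReady) {
  const server = http.createServer((req, res) => res.end('ok\n'));
  // The callback fires once the server is bound to the socket path.
  server.listen(socketPath, () => onReady(server));
  return server;
}

// listenOnSocket('/tmp/app.sock', (server) => {
//   console.log('listening on', server.address());
// });
```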
Furthermore, you can use the new child_process.fork to launch a new Node process. This new process has a communication channel built in to its parent, so could easily tell its parent to exit once initialized.
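The handshake could be sketched like this. startReplacement, announceReady and the 'ready' message are names I picked, and the commented usage refers to a server variable that is not defined here; details such as detaching the child or draining in-flight requests are left out.

```javascript
const { fork } = require('child_process');

// Old process: launch the new copy and wait until it reports that it is ready.
function startReplacement(scriptPath, onReady) {
  const child = fork(scriptPath);
  child.on('message', function handleMessage(message) {
    if (message !== 'ready') return;
    child.off('message', handleMessage);
    onReady(child);
  });
  return child;
}

// New process: call this from the listen() callback once the server is bound.
function announceReady() {
  if (typeof process.send !== 'function') return false; // not started via fork
  process.send('ready');
  return true;
}

// Hypothetical usage in the old process:
// startReplacement(__filename, () => {
//   server.close(() => process.exit(0)); // stop accepting, finish current requests, exit
// });
```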
net documentation
http documentation
child_process.fork documentation
(For the first two links, look just under the linked-to method, since they are all the same method name.)