I've been working on hooking up MongoDB to a Node.js server. I've got the code all neatly put away, but it takes about 5 seconds to connect, and if a request for an insert or a query comes in before that, the server will crash.
My first instinct was to use try/catch to filter out any requests that error out. I want the server to keep going regardless of what errors an individual request throws, so why not use it?
Everywhere I look it's touted as a bad idea, and I'm not sure I understand why.
A try/catch block around something that simply ignores the error is generally considered bad practice. However, if that is the behavior you want, then there is nothing wrong with it. Just consider that it may not actually be the best behavior. You may, at a minimum, want to log the fact that an exception occurred.
Now, due to the asynchronous nature of Node.js, try/catch blocks are sometimes not useful. I don't know what part of the MongoDB API you are using, but if there is a callback, you will want to instead check the err parameter, which should be the first parameter of your callback function in most cases.
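As a minimal sketch of checking the err parameter: findUser below is a hypothetical stand-in for whatever asynchronous call you are making (it is not part of the MongoDB driver), and handleRequest is an illustrative wrapper. A try/catch around findUser would not see the error, because the callback runs after the surrounding code has already returned.

```javascript
// Hypothetical stand-in for an asynchronous driver call (NOT the MongoDB API).
// It reports failure through the first callback argument instead of throwing.
function findUser(id, callback) {
  setImmediate(function () {
    if (typeof id !== 'number') {
      callback(new Error('invalid id: ' + id));
      return;
    }
    callback(null, { id: id, name: 'user' + id });
  });
}

// Illustrative request wrapper: check err first, log it, and keep the server going.
function handleRequest(id, done) {
  findUser(id, function (err, user) {
    if (err) {
      console.error('query failed:', err.message);
      done({ status: 500 });
      return;
    }
    done({ status: 200, body: user });
  });
}

// A bad request is logged and answered with a 500 rather than crashing the process.
handleRequest('oops', function (result) {
  console.log(result.status);
});
```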
Finally, in all of my applications, I connect to any necessary DBs first and only start listening on ports once those connections are up. This is only relevant if a persistent connection makes sense for your project, and you still have to watch for errors, since connection failures do happen.
I have a question regarding the examples out there when using Nodejs, Express and Jade for templates.
All the examples show how to build some sort of a user administrative interface where you can add user profiles, delete them and manage them.
Those are considered beginner's guides to Node.js. My question is this: if I have 10 users concurrently accessing the same interface and doing the same operations, surely Node.js will block the requests of the other users, as they are all served on the same port.
So let's say I am pulling out a list of users which may be something like 10000. Yes I can do paging, but that is not the point. While I am getting the list from the server another 4 users want to access the application. They have to wait for my process to end. That is my question - how can one avoid that using NodeJS & Express?
I have been on this issue for a couple of months! I currently have something in place that does the following:
Run the main processing of stuff on a port
Run a Socket.io process on a different port
Use a sticky session
The idea is that I do a request (like getting a list of items), and immediately respond with some request reference but without the requested items, thus releasing the port.
In the background, "asynchronously", I then do the work of getting the items. Once that completes, I make an HTTP request from the main node process to the socket node's port, sending the items through.
When that is done I then perform a socket.io emit WITH the data and the initial request reference so that the correct user gets the message.
On the client side I have an event listening for the socket which then completes the ajax request by populating the list.
I have had SOME success doing this! It actually works to a degree! I have an issue in the online deployment which complicates matters, due to IP addresses and socket.io playing funny.
I also have multiple workers using clustering. I use it in the following manner:
I create a master worker
I spawn workers
I take any connection request and pass it to the relevant worker.
I do that for the main node request as well as for the socket requests. Like I said I use 2 ports!
As you can see I have had a lot of work done on this and I am not getting a proper solution!
My question is this - have I gone all around the world 10 times only to have missed something simple? This sounds way too complicated for achieving a non-blocking, Node.js-only website.
I asked myself - surely all these tutorials would not have missed something as important as this! But they did!
I have researched, read, and tested a lot of code - this is the very first time I have asked anything on Stack Overflow!
Thank you for any assistance.
P.S. One example of the same approach is this: I request a report using Jasper and pass parameters, and with the "delayed ajax response" approach described above I simply release the port while a report is generated in the background (this can be a very intensive process, as a lot of calculations are performed). I really don't see a better approach - any help will be super appreciated!
Thank you for taking the time to read!
I'm sorry to say it, but yes, you have been going around the world 10 times only to have been missing something simple.
It's obvious that your previous knowledge of and experience with web servers comes from a blocking point of view, and if that were the case here, your concerns would have been valid.
Node.js is built around using a single thread to execute your code, which means that if it performs a blocking operation, no one else can get anything done until it finishes.
Some operations in Node can do this, such as the synchronous variants of reading/writing to disk (for example fs.readFileSync). However, most Node operations are asynchronous.
I believe you are familiar with the term, so I won't go into details. What asynchronous operations allow Node to do is keep this single thread idle as much as possible. By idle I mean open for other work. If your code is fully asynchronous, then handling 4 concurrent users (or even 400) shouldn't be a problem, even for a single thread.
Now, in regards to your initial problem of ports: once a request is received on a given port, Node.js executes whatever code you have written for it until it encounters an asynchronous operation. As soon as that happens, it is available to pick up more requests on the same port.
The second problem you ask about is the database operation. In this case, Node.js sends the query to the database (which takes almost no time) and the database does the actual execution of the query. In the meantime, Node is free to do other work until the database is finished and lets Node know there is a result to fetch.
You can often recognize async operations by their structure: my_function(..., ..., callback). A function that takes a callback is, in most cases, asynchronous.
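To make this concrete, here is a small sketch in which two "requests" share the single thread. fakeQuery is a hypothetical stand-in for a database call (it only uses setTimeout), and handle is an illustrative request handler. The second request starts, and even finishes, while the first is still waiting on its query.

```javascript
// Hypothetical stand-in for a database query that takes `ms` milliseconds.
function fakeQuery(name, ms, callback) {
  setTimeout(function () {
    callback(null, name + ' rows');
  }, ms);
}

const order = [];

// Illustrative request handler: record when it starts and when its query returns.
function handle(user, ms, done) {
  order.push(user + ' started');
  fakeQuery(user, ms, function (err, rows) {
    order.push(user + ' finished');
    done();
  });
}

let pending = 2;
function finished() {
  pending -= 1;
  if (pending === 0) {
    console.log(order.join(', '));
  }
}

handle('A', 50, finished); // slow request: its query takes 50 ms
handle('B', 10, finished); // arrives while A is still waiting on the database
```

Neither handler blocks the other; the thread is only busy for the few microseconds it takes to start each query and to run each callback.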
So bottom line: Don't worry about the problems around blocking IO, as you will hardly encounter any in node. Use a single port if you want (By creating multiple child processes, you can even have multiple node instances on the same port).
Hope this explains it well enough. If you have any further questions, let me know :)
We have a C# Web API server and a Node Express server. We make hundreds of requests from the C# server to a route on the Node server. The route on the Node server does intensive work and often doesn't return for 6-8 seconds.
Making hundreds of these requests simultaneously seems to cause the Node server to fail. Errors in the Node server output include either socket hang up or ECONNRESET. The error from the C# side says
No connection could be made because the target machine actively refused it.
This error occurs after processing an unpredictable number of the requests, which leads me to think it is simply overloading the server. Using a Thread.Sleep(500) on the C# side allows us to get through more requests, and fiddling with the wait there leads to more or less success, but thread sleeping is rarely if ever the right answer, and I think this case is no exception.
Are we simply putting too much stress on the Node server? Can this only be solved with load balancing or some form of clustering? If there is another alternative, what might it look like?
One path I'm starting to explore is the node-toobusy module. If I return a 503 though, what should be the process in the following code? Should I Thread.Sleep and then re-submit the request?
It sounds like your node.js server is getting overloaded.
The route on the Node server does intensive work and often doesn't return for 6-8 seconds.
This is a bad smell - if your node process is doing intense computation, it will halt the event loop until that computation is completed, and won't be able to handle any other requests. You should probably have it doing that computation in a worker process, which will run on another cpu core if available. cluster is the node builtin module that lets you do that, so I'll point you there.
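As a rough sketch of moving computation off the event loop: note that this uses Node's built-in worker_threads module rather than cluster, because it fits in a single self-contained snippet; the idea of keeping heavy work out of the request-handling thread is the same. runInWorker and the summing loop are hypothetical placeholders for your real computation.

```javascript
const { Worker } = require('node:worker_threads');

// Run a CPU-heavy job on a separate thread and resolve with its result.
// The worker source is passed inline (eval: true) only to keep the sketch in one file.
function runInWorker(n) {
  return new Promise(function (resolve, reject) {
    const worker = new Worker(
      `
      const { parentPort, workerData } = require('node:worker_threads');
      // Placeholder for the expensive computation: sum the integers 1..n.
      let sum = 0;
      for (let i = 1; i <= workerData; i++) sum += i;
      parentPort.postMessage(sum);
      `,
      { eval: true, workerData: n }
    );
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// The event loop stays free to accept other requests while the worker computes.
runInWorker(1000000).then(function (sum) {
  console.log('result from worker:', sum);
});
```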
One path I'm starting to explore is the node-toobusy module. If I return a 503 though, what should be the process in the following code? Should I Thread.Sleep and then re-submit the request?
That depends on your application and your expected load. You may want to retry once or twice if it's likely that things will cool down enough during that time, but for your API you probably just want to return a 503 from C# too - better to let the client know the server is too busy and let them make their own decision than to keep retrying on their behalf.
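For the Node side, a hand-rolled sketch of the "answer 503 when overloaded" idea might look like the following. This is not the node-toobusy API (that module measures event loop lag); it simply counts requests in flight. shedLoad and MAX_IN_FLIGHT are assumed names, and the threshold is arbitrary.

```javascript
// Hand-rolled load shedding; NOT the node-toobusy API, only a similar idea.
// MAX_IN_FLIGHT is an assumed threshold; tune it for your own workload.
const MAX_IN_FLIGHT = 2;
let inFlight = 0;

// Express/Connect-style middleware: (req, res, next).
function shedLoad(req, res, next) {
  if (inFlight >= MAX_IN_FLIGHT) {
    // Too many requests already being processed: refuse quickly instead of queueing.
    res.statusCode = 503;
    res.setHeader('Retry-After', '1');
    res.end('Server too busy');
    return;
  }

  inFlight += 1;
  let released = false;
  function release() {
    // Guard against decrementing twice when both events fire.
    if (!released) {
      released = true;
      inFlight -= 1;
    }
  }
  res.on('finish', release); // response fully sent
  res.on('close', release);  // connection closed, possibly before finishing
  next();
}
```

In an Express app this would be registered with app.use(shedLoad) ahead of the expensive route. The Retry-After header gives the caller a hint about when retrying is reasonable.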
I just started trying out SailsJS a few days ago.
I've realized that the Node process is terminated whenever I have an uncaught exception.
I have a list of controllers and each of them calls a specific service JS file (Containing logics and DB calls) in services/.
Can I write a global error handler for all services, so that any error raised by these services is handled in one place and an appropriate error response is sent to the front end?
I tried using process.on('uncaughtException') and some basic exception handling, but it needs to be added to each service method.
Also, can I have one common point through which all service calls from client to server, i.e. all io.socket.post() and io.socket.get() calls, go?
I would appreciate any pointer/article that would show me the common best practices for handling uncaught exceptions in SailsJS and using shorter code rather than writing redundant code in all services.
Best practice is using Domains in your controller. This will handle exceptions in async code, and it's relatively straightforward.
You can use something like trycatch to simplify things a little, but domain-based exception handling will be most effective. It'll ensure that exceptions do not crash your application. Just create a new domain in your controller and run your controller methods inside of that domain.
Since Sails.js is based on Express, you can use Connect middleware, and you can seamlessly create a new domain from middleware. There is such a thing as express-domain-middleware. This might be the most aesthetic and most convenient option.
Update:
As mentioned by Benjamin Gruenbaum, Domains are planned to become deprecated in v1 of node. Perhaps you should read through Joyent's Error Handling Best Practices. It's agnostic to the framework you are using.
Additionally, you can still use Domains for now, since there isn't otherwise a way to globally handle errors in node.js. Once they are deprecated, you could remove your dependence on Domains relatively easily. That said, it may be best not to rely solely on domains.
Strongloop also provides a library inspired by domains called Zone. This is also an option.
It's OK to let a node instance error out due to a programming error; otherwise it may continue in an inconsistent state and mess up business logic. In a production environment the server can be restarted on crash, which resets its state and keeps it available, as long as the error is not frequent. Throughout all of this, it's very important to log everything. This applies to most Node setups, including SailsJS.
The following approach can be taken:
Use a logger: A dedicated logger should be accessible to server components. It should be connected to a service that notifies the developer (email?) of very serious errors.
Propagate per-request errors to the end: Carefully forward errors from any step in request processing. In ExpressJs/ConnectJs/middleware-based setups, next(err) can be used to pass an error down the middleware chain. An error-catching middleware at the end of the chain will get this error, log it verbosely, and send a 500 status back. You can use Domains or Zones or Promises or async or whatever you like to process requests and catch errors.
Shut down on process.on('uncaughtException'): Log the error, do any necessary clean-up, and throw the same error again to shut the process down.
Use PM2/Forever or Upstart/init.d on Linux: Now when the process shuts down due to the bad exception, these tools will restart it and track how many times the server has crashed. If the server is crashing too many times, it's good to stop it and take immediate action.
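The steps above could be sketched roughly as follows. The logger object, reportRoute, and loadReport are hypothetical names, and no real framework is required here; the handlers only follow the Express/Connect calling convention. One deviation from the list: this sketch calls process.exit(1) in the last-resort handler instead of rethrowing, which also ends the process so that the supervisor can restart it.

```javascript
// Hypothetical logger; swap in whatever logging service you actually use.
const logger = {
  error: function (msg, err) {
    console.error(msg, err && err.stack);
  }
};

// A request step that forwards failures with next(err) rather than throwing.
// `loadReport` is a hypothetical callback-style service call supplied by the caller.
function reportRoute(loadReport) {
  return function (req, res, next) {
    loadReport(function (err, report) {
      if (err) return next(err);
      res.end(JSON.stringify(report));
    });
  };
}

// Error-catching middleware, registered last in the chain.
// Express recognises error handlers by their four-argument signature.
function errorHandler(err, req, res, next) {
  logger.error('request failed:', err);
  res.statusCode = 500;
  res.end('Internal Server Error');
}

// Last resort for programming errors: log, clean up, and stop the process
// so that PM2/Forever/Upstart can restart it in a clean state.
process.on('uncaughtException', function (err) {
  logger.error('uncaught exception, shutting down:', err);
  process.exit(1);
});
```

With Express, the pieces would be wired up along the lines of app.get('/report', reportRoute(loadReport)) followed by app.use(errorHandler) after all other routes.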
I have not tried this, but I believe you should be able to set a catch-all exception handler in bootstrap.js using process.on('uncaughtException').
Personally, I use promises via the bluebird library, and put a catch statement that passes all errors to a global error handling function.
I didn't want DB requests to run accidentally before connection, so the connect method returns a promise and every single DB method uses connectPromise.then().
It seems like my app is leaking memory, so I'm wondering if that could be the cause. The top offender in the v8 heap memory snapshot is titled sql and contains a lot of stuff from bluebird promises and domains. I don't really know what to make of it, but that one single connect promise came to mind.
EDIT: I have confirmed that the source of the problem is indeed my practice of keeping around a promise from sequelize. For the sake of testing, I tried refreshing that promise every 30 seconds and my application stopped gathering more and more memory.
I have opened an issue in Sequelize.
Assuming you are writing a server, why not make DB connections before listening on a port?
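// Connect first; only start accepting requests once the DB is ready.
db.connect().then(function () {
  server.listen(function () {
    // Safe to handle requests from here on.
  });
}).catch(function (err) {
  console.error('DB connection failed:', err);
});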
I am not saying the issue with Sequelize is not real, but I think it would be a good idea not to call connect().then() every time you make a DB request.
There are places for this type of pattern and this is not one of them, I think.
It looks like Bluebird is not working well with domains, and that is where the memory leak is occurring. Bluebird is the only promise library that I know of that makes an attempt to properly propagate domains in Node.js.
I want to write a callback that takes a bit of time to complete an external IO operation, but I do not want it to interfere when sending data back to the client. I don't care about waiting for callback completion for purposes of the reply back to the client, but if the callback results in an error, I would like to log it. About 80% of executions will result in this callback executing after the response is sent back to the client and the connection is closed.
My approach works well and I have not seen any problems, but I would like to know whether there are any pitfalls in this approach that I may be unaware of. I would think that node's evented IO would handle this without issue, but I want to make sure before I commit this architecture to production. Any issues that should make me reconsider this approach?
As long as you're not trying to reference that response object after the response is sent, this will not cause any problems. There's nothing special about a request handler that cares one bit about callbacks in its code being invoked after the response is generated.
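A minimal sketch of that pattern, assuming a hypothetical slowExternalCall for the external IO operation and a plain (req, res) handler: the reply goes out immediately, and the callback only logs.

```javascript
// Hypothetical slow external IO call (for example, writing to an audit service).
function slowExternalCall(payload, callback) {
  setTimeout(function () {
    if (!payload) return callback(new Error('empty payload'));
    callback(null);
  }, 20);
}

function handler(req, res) {
  // Copy whatever the background work needs before responding.
  const payload = req.body;

  slowExternalCall(payload, function (err) {
    // This usually runs after the response has been sent; do not touch `res` here.
    if (err) console.error('background task failed:', err.message);
  });

  // Reply immediately, without waiting for the callback.
  res.end('ok');
}
```

The one thing to keep in mind is that an exception thrown inside the late callback is no longer tied to any request, so it should be caught and logged there rather than allowed to escape.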