Debugging node in a per request strategy - node.js

I've been working with Node for quite a while now, and one thing keeps annoying me in my production environment: debugging!
So I thought about a system that would work as follows:
An error above a certain level, or an uncaught one, occurs.
Log a very long stack trace tied to the request, containing all function calls AND variable values since the request started.
Send that to a service or a simple (monitored) log file that would inform me that an error happened, with a clear idea of the context.
I don't know how to build something like that, or whether something already exists that does the job.
My strategy for now: use long-stack-trace when an error occurs, then crash the worker, which is restarted by the cluster parent (only responsible for redirecting HTTP requests and monitoring its children).
Thanks!
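One possible way to get the request-scoped context described above is to collect breadcrumbs per request and attach them to the error report. This is only a minimal sketch, assuming a Node version that ships AsyncLocalStorage in async_hooks; runWithContext, trace, and buildErrorReport are hypothetical helper names, and the sketch records only what the code explicitly traces, not every function call and variable value.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');

// One store per request; it follows the request across async calls.
const requestContext = new AsyncLocalStorage();

// Hypothetical helper: run `fn` inside a fresh per-request context.
function runWithContext(requestId, fn) {
  return requestContext.run({ requestId, breadcrumbs: [] }, fn);
}

// Hypothetical helper: record a step and the values worth keeping.
function trace(label, data) {
  const store = requestContext.getStore();
  if (store) {
    store.breadcrumbs.push({ label, data, at: Date.now() });
  }
}

// Hypothetical helper: combine the error with the current request's breadcrumbs.
function buildErrorReport(err) {
  const store = requestContext.getStore() || { requestId: null, breadcrumbs: [] };
  return {
    requestId: store.requestId,
    message: err.message,
    stack: err.stack,
    breadcrumbs: store.breadcrumbs,
  };
}
```

The resulting report could then be written to the monitored log file or sent to a service before the worker is crashed and restarted.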

Related

Manage concurrent request processing in Node.js

Different users are accessing different application routes that do heavy data manipulation. In the middle of this, one of the requests failed with an internal server error and my whole application crashed. As a result, the other requests failed as well, because the application had crashed. Is there any way to handle this situation?
If your program has thrown an uncaught error and crashed, then there's nothing you can really do other than start it back up again. You could use something like pm2 to automatically restart your node process when it crashes, and then at least future requests that come in should work (although you will lose any in-memory data from before the last crash).
Another thing that I think would help you would be to move your backend onto a serverless architecture where each invocation of your code is independent of the others.
And of course try to fix the code so that it handles things gracefully and doesn't actually throw errors. :)
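As a rough illustration of handling things gracefully, a request handler can be wrapped so that a failure in one request becomes a 500 response for that request only. This is a sketch, not a complete solution: safeHandler is a hypothetical name, and it only catches errors thrown synchronously or through awaited promises; an error thrown later from an unrelated callback or timer would still be uncaught.

```javascript
// Hypothetical wrapper: turns a failure in one request handler into a 500
// response for that request instead of an uncaught error for the process.
function safeHandler(handler, log = console.error) {
  return async function wrapped(req, res) {
    try {
      await handler(req, res);
    } catch (err) {
      log('request failed:', req.url, err.stack);
      // Only send the error response if nothing has been sent yet.
      if (!res.headersSent) {
        res.statusCode = 500;
        res.end('Internal Server Error');
      }
    }
  };
}
```

With the built-in http module this could be used as http.createServer(safeHandler(async (req, res) => { ... })).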

Whole app in MEAN stack crashes for a single bug

I am developing a web application using the MEAN stack. When an error occurs in the code, the whole application crashes and I have to restart the server (re-run the application).
This behavior may not be a problem during development. But when the application goes to production and an unexpected error occurs, the whole application will crash. Do I have to keep an eye on the system constantly and restart it whenever an error occurs? What am I missing here?
This is one solution that I have seen being used in production to ensure that a node program is always running (even after server restarts).
Use Forever (https://www.npmjs.com/package/forever). You can run it through code or through command line.
$ [sudo] npm install forever -g
$ forever start app.js
Where app.js has the code for instantiating the web server (in the MEAN stack it's the express initialization).
If an unhandled error bubbles to the top of the stack without being caught, crashing is the intended behavior. An unhandled exception means that your app is in an undefined state.
Think of it this way. If you lose control of your car and drive off the road, the best thing to do is to slam on the brakes and stop (AKA a controlled program crash or halt) rather than continue blindly blundering through foliage, flower beds, backyards, swimming pools, toddlers, and whatever other obstacles may be in the way.
I'd recommend using a tool like forever to run your app in production, which will monitor and restart your app when it crashes. This sort of thing is standard practice. Obviously you don't want it to crash, and you should handle errors in context where you know how to recover from them. And some frameworks do a better job than others of handling errors smoothly without crashing. Restarting the process is mainly for things that catch you completely off guard. Check out this for more error handling tips:
https://www.joyent.com/developers/node/design/errors
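The "controlled crash" idea can be sketched as a last-resort handler that logs the error and exits, leaving the restart to a supervisor such as forever. The name installFatalHandler is hypothetical, and the log and exit parameters exist only so the sketch can be exercised without actually ending the process.

```javascript
// Hypothetical helper: log the fatal error, then exit with a non-zero code
// so that a supervisor such as forever can start a fresh process.
function installFatalHandler({ log = console.error, exit = process.exit } = {}) {
  function onFatal(err) {
    log('fatal error, shutting down:', err && err.stack ? err.stack : err);
    exit(1);
  }
  // 'uncaughtException' fires when an error bubbles all the way up uncaught.
  process.on('uncaughtException', onFatal);
  return onFatal;
}
```

Calling installFatalHandler() once at startup is enough; errors that you know how to recover from should still be handled where they occur.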
The issue you mention is something to keep in mind while developing applications, because error handling is something you can't skip. There are a few useful solutions (just a small part of the whole 'Error Handling' world) you can use to protect your applications.
But let's start with your issue. Since you are already in this situation, I recommend integrating the Node domain module into your application so that it doesn't crash on such exceptions. Please refer to the link below:
https://nodejs.org/api/domain.html
You can wrap your server creation in a domain and catch all the unhandled exceptions.
Example:
var domain = require('domain').create();
// Called for any error thrown or emitted inside domain.run() below
domain.on('error', function (err) {
  // track the error in your log file
  // do something else with the error (send a message to the admin, send an error response, etc.)
});
domain.run(function () {
  // run the HTTP server here
});
You can find a good example here as well:
https://engineering.gosquared.com/error-handling-using-domains-node-js
As for error handling in general, I recommend keeping to the following rules:
use domain to catch exceptions
use Node's EventEmitter to emit and catch errors
always think about the possible outcomes when you handle the results of functions
develop a single error-handling strategy (throw the error / return the error / send the error to the user, etc.)
use try/catch blocks to protect code from unhandled exceptions
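Two of the rules above (try/catch and EventEmitter) can be combined in a small sketch. RecordParser is a hypothetical class made up for illustration: it catches the parsing exception locally and reports it through the 'error' event, so the caller decides what to do with it.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical component: reports failures through the 'error' event
// instead of throwing, so the caller decides how to handle them.
class RecordParser extends EventEmitter {
  parse(raw) {
    let record;
    try {
      record = JSON.parse(raw);
    } catch (err) {
      // Invalid input: emit the error rather than letting it escape.
      this.emit('error', err);
      return;
    }
    this.emit('record', record);
  }
}
```

An 'error' listener must be attached before calling parse; an EventEmitter with no 'error' listener throws the error when 'error' is emitted.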
There are many more solutions you can find.
Here are the links I recommend checking:
https://www.joyent.com/developers/node/design/errors
http://derickbailey.com/2014/09/06/proper-error-handling-in-expressjs-route-handlers/
http://expressjs.com/guide/error-handling.html
http://shapeshed.com/uncaught-exceptions-in-node/

Ghost (NodeJS blog) on Azure: Periodic 500 error troubleshooting

Background / Issue
Having a strange issue running a Ghost blog on Azure. The site seems to run fine for a while, but every once in a while, I'll receive a 500 error with no further information. The next request always appears to succeed (in tests so far).
The error seems to happen after a period of inactivity. Since I'm currently just getting set up, I'm utilizing an Azure "Free" instance, so I'm wondering if some sort of resource conservation is causing it behind the scenes (which will be alleviated when I upgrade).
Any idea what could be causing this issue? I'm sort of at a loss for where to start since the logs don't necessarily help me in this case. I'm new to NodeJS (and nodeJS on Azure) and since this is my first foray, any tips/tricks on where to look would be helpful as well.
Some specific questions:
When receiving an error like this, is there anywhere I can go to see any output, or is it pretty much guaranteed that Node actually didn't output something?
On Azure free instances, does some sort of resource conservation take place which might cause the app to be shut down (and thus for me to see these errors only after a period of inactivity)?
The Full Error
The full text of the error is below (I've turned debugging on for this reason):
iisnode encountered an error when processing the request.
HRESULT: 0x2
HTTP status: 500
HTTP reason: Internal Server Error
You are receiving this HTTP 200 response because system.webServer/iisnode/#devErrorsEnabled configuration setting is 'true'.
In addition to the log of stdout and stderr of the node.exe process, consider using debugging and ETW traces to further diagnose the problem.
The node.exe process has not written any information to stderr or iisnode was unable to capture this information. Frequent reason is that the iisnode module is unable to create a log file to capture stdout and stderr output from node.exe. Please check that the identity of the IIS application pool running the node.js application has read and write access permissions to the directory on the server where the node.js application is located. Alternatively you can disable logging by setting system.webServer/iisnode/#loggingEnabled element of web.config to 'false'.
I think it might be something in the Azure web config rather than Ghost itself. So look for logs based on that because Ghost is not throwing that error. I found this question that might help you out:
How to debug Azure 500 internal server error
Good luck!

Why NodeJS domains documentation code tries to terminate the process?

In the official NodeJS documentation there is a code example where the process tries to exit gracefully when there is an exception in a domain (it closes connections, waits some time for other requests to finish, and then exits).
But why not just send the 500 error and continue working?
In my application I want to throw some expected errors (like FrontEndUserError) when user input is not valid, and catch these exceptions somewhere in middleware to send a pretty error message to the client. With domains this is very easy to implement, but are there any pitfalls?
app.use (err, req, res, next) ->
  if err instanceof FrontEndUserError
    res.send {error: true, message: err.message}
  else
    log err.trace
    res.send 500
From the domain module's official documentation:
By the very nature of how throw works in JavaScript, there is almost never any way to safely "pick up where you left off", without leaking references, or creating some other sort of undefined brittle state.
The safest way to respond to a thrown error is to shut down the process ...
To me that means that once something throws an error in your NodeJS application, you're unfortunately done. If you care about how your application works and the result is important to you, then your best bet is to kill the process and start it again. However, in those last milliseconds, you can be nicer to other clients: let them finish their work, say sorry to new clients, log a couple of things if you want, and then kill the process and start it again.
That's exactly what is happening in the example in the NodeJS domain module's documentation.
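The shutdown sequence described above can be sketched roughly as follows. This is a simplified illustration rather than the documentation's own code: shutdownGracefully is a hypothetical name, the 30-second default is an assumption, and exit is injectable only so the sketch can be tested.

```javascript
// Hypothetical helper; `server` is anything with an http.Server-style close(callback).
function shutdownGracefully(server, { timeoutMs = 30000, exit = process.exit } = {}) {
  // Failsafe: force the exit if open connections do not finish in time.
  const killTimer = setTimeout(() => exit(1), timeoutMs);
  // Do not keep the process alive just for this timer.
  killTimer.unref();
  // Stop accepting new connections; the callback runs once existing ones end.
  server.close(() => {
    clearTimeout(killTimer);
    exit(1);
  });
}
```

The supervisor (cluster parent, forever, pm2) is then responsible for starting a replacement process.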
Let's look at your web application/server as a state machine.
Unless your application is very small, it is very unlikely that you happen to know every state that your machine can possibly be in.
When you get an error, you have two choices:
1) Examine the error and decide what to do, or
2) ignore it.
In the first case, you gracefully change from one state to another. In the second case, you don't have any clue what state your machine is in, since you didn't bother seeing what the error was. Essentially, your machine's state is now 'undefined'.
It is for this reason that NodeJS recommends killing the process if an error propagates all the way to the event loop. Then again, this level of strictness may be overkill for pet projects and small apps, so your solution is quite fine too.
But imagine you were writing banking software: someday you get an error you've never seen before, your app simply ignores it and sends a 500, but each time someone loses $100k. Here, I would want to make sure no error ever reaches the event loop, and if one does, kill the process with a detailed stack trace for later analysis.
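The two choices above can be made explicit in an Express-style error middleware: expected errors are a known state transition, anything else is treated as an undefined state. This is a sketch under assumptions: makeErrorHandler and onUnexpected are hypothetical names, the 400 status for user errors is an illustrative choice, and onUnexpected stands for whatever shutdown path the application uses.

```javascript
// Hypothetical error type for expected, user-facing failures (invalid input).
class FrontEndUserError extends Error {}

// Builds an Express-style error middleware with the (err, req, res, next) signature.
// `onUnexpected` is a hypothetical hook, e.g. one that starts a graceful shutdown.
function makeErrorHandler(onUnexpected) {
  return function errorHandler(err, req, res, next) {
    if (err instanceof FrontEndUserError) {
      // Known state transition: report the problem and keep serving.
      res.status(400).json({ error: true, message: err.message });
      return;
    }
    // Unknown state: answer this request, then hand over to the shutdown path.
    res.status(500).json({ error: true, message: 'Internal Server Error' });
    onUnexpected(err);
  };
}
```

In an Express app this would be registered last, for example app.use(makeErrorHandler(logAndShutDown)), where logAndShutDown is whatever function logs the stack trace and ends the process.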

Something making NServiceBus lose messages

I have an NServiceBus configuration that is working great on developers' machines and in my Development Environment.
However, when I move it to my Test Environment my messages just start getting tossed.
Here is the system:
An app gets a TCP message from a Mainframe system and sends it to a MSMQ (call it FromMainframe).
An application hosted in IIS has a "Handle" method for that MSMQ and processes the messages from the mainframe.
In my Test Environment, step two only halfway happens. The message is popped off the MSMQ, but not processed by my application.
Effectively my data is LOST! NServiceBus removes them from the Queue but I never get to process them. They are not even in the error queue!
These are the things I have tried in an attempt to figure out what is happening:
Check the Config files
Attach a remote debugger to the process to see what the Handle method is doing
The Handle method is never called (but when I attach to the Development Environment my breakpoint in my Handle method is hit and it all works flawlessly).
Redeploy my Dev version to the Test Environment and try step 2 again (just in case the versions were not exactly the same.)
Check the Config files again
Check that the Error queue is not filling up
The error queue stays empty (I wish it would fill up, then my data would not be LOST).
Check for any other process that may be pulling stuff from my MSMQs
I turned off my IIS website and the messages in the FromMainframe queue started to back up.
When I turn it back on, the messages disappear fairly fast (but still not all at once). The speed that they disappear is too fast for them to be processed by my Handle method.
Check Config files yet again.
Run the NServiceBusTools\MsmqUtils\Runner.exe \i
I ran it, rebooted, ran it again and again for good measure!
Check the Configs again (I must have missed SOMETHING right?)
Check the Development Environment Configs are not pointing to the Test Environment
I don't think it is possible to use another computer's MSMQ as your input queue, but it does not hurt to check.
Look for any catch blocks that could be silently killing my message.
One last check of the Config files.
Recreate my Test Environment on another machine (it worked flawlessly)
Run my stuff outside of IIS.
When I host outside of IIS (using NServiceBus.Host.exe) it all works fine. So it has to be an IIS thing right?
Go crazy and hope that Stack Overflow can offer any kind of insight.
So I know enough about what happened to throw out an "Answer".
When I set up my NServiceBus self-hosting, I had a call that loaded the message handlers.
NServiceBus.Configure.With().LoadMessageHandlers()
(There are more configurations, but I omitted them for brevity)
When you call this, NServiceBus scans the assemblies for a class that implements IHandleMessages<T>.
So, somehow, on my Test Environment machine, the NServiceBus scan of the directory for a class that implements IHandleMessages<T> was failing to find my class (even though the assembly was absolutely there).
Turns out that if NServiceBus does not find something that handles a message it will THROW IT AWAY!!!
This is a total design bug in my opinion. The whole idea of NServiceBus is to not lose your data, but in this case it does just that!
Now, once you know about this pitfall, there are several ways around it.
Expressly state what your handler(s) should be:
NServiceBus.Configure.With().LoadMessageHandlers<First<MyMessageType>>()
For even further protection, add another handler that handles "everything else". IMessage is the base for all message payloads, so if you put a handler on it, it will pick up everything.
If you set the IMessage handler to run after your other message handlers, then it will handle everything that NServiceBus can't find a handler for. If you throw an exception in that Handle method, NServiceBus will move the message to the error queue. (This is what I think should be the default behavior.)
