Ghost (NodeJS blog) on Azure: Periodic 500 error troubleshooting - node.js

Background / Issue
Having a strange issue running a Ghost blog on Azure. The site seems to run fine for a while, but every once in a while, I'll receive a 500 error with no further information. The next request always appears to succeed (in tests so far).
The error seems to happen after a period of inactivity. Since I'm currently just getting set up, I'm using an Azure "Free" instance, so I'm wondering if some sort of resource conservation is causing it behind the scenes (which would be alleviated when I upgrade).
Any idea what could be causing this issue? I'm sort of at a loss for where to start since the logs don't necessarily help me in this case. I'm new to NodeJS (and nodeJS on Azure) and since this is my first foray, any tips/tricks on where to look would be helpful as well.
Some specific questions:
When receiving an error like this, is there anywhere I can go to see any output, or is it pretty much guaranteed that Node actually didn't output something?
On Azure free instances, does some sort of resource conservation take place which might cause the app to be shut down (and thus for me to see these errors only after a period of inactivity)?
The Full Error
The full text of the error is below (I've turned debugging on for this reason):
iisnode encountered an error when processing the request.
HRESULT: 0x2
HTTP status: 500
HTTP reason: Internal Server Error
You are receiving this HTTP 200 response because system.webServer/iisnode/#devErrorsEnabled configuration setting is 'true'.
In addition to the log of stdout and stderr of the node.exe process, consider using debugging and ETW traces to further diagnose the problem.
The node.exe process has not written any information to stderr or iisnode was unable to capture this information. Frequent reason is that the iisnode module is unable to create a log file to capture stdout and stderr output from node.exe. Please check that the identity of the IIS application pool running the node.js application has read and write access permissions to the directory on the server where the node.js application is located. Alternatively you can disable logging by setting system.webServer/iisnode/#loggingEnabled element of web.config to 'false'.

I think the cause is in the Azure web configuration rather than in Ghost itself: the error page comes from iisnode, not from Ghost. So look for logs on the IIS/Azure side. I found this question that might help you out:
How to debug Azure 500 internal server error
Good luck!
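On the question of where to see output: if iisnode cannot capture stderr, one option is for the app to write its own errors to a file. The following is only a minimal sketch under that assumption. The names formatError and installCrashLogger and the file name ghost-errors.log are hypothetical and are not part of Ghost or iisnode.

```javascript
const fs = require('fs');

// Hypothetical helper: turns an error (or any thrown value) into one log entry.
function formatError(err, when = new Date()) {
  const detail = err instanceof Error ? (err.stack || err.message) : String(err);
  return `[${when.toISOString()}] ${detail}\n`;
}

// Hypothetical helper: appends uncaught errors to a log file of our own,
// so they are recorded even when iisnode captures nothing from stderr.
function installCrashLogger(logFile) {
  const write = (err) => {
    try {
      fs.appendFileSync(logFile, formatError(err));
    } catch (writeErr) {
      // The directory may not be writable (the permission problem the
      // iisnode message describes), so fall back to stderr.
      console.error('could not write to', logFile, writeErr.message);
    }
  };
  process.on('uncaughtException', (err) => {
    write(err);
    process.exit(1); // state is unknown after an uncaught exception
  });
  process.on('unhandledRejection', (reason) => write(reason));
  return write;
}

// Usage (early in the app's entry file):
// installCrashLogger(require('path').join(__dirname, 'ghost-errors.log'));
```

If the file stays empty while the 500 still appears, that suggests node.exe never ran the request, which would point back at the IIS/Azure side.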

Related

HTTP 413 Error When App is Deployed on Azure

There is a strange problem that we have when we deploy our application on the Azure environment. When I start the application on my laptop (no Azure, no Docker, nothing else) and send requests, some of which are fairly large, I don't face any issues.
Our test and production environments are all on Azure right now. So when the application is deployed on it, I get this strange error:
log4javascript error: AjaxAppender.append: XMLHttpRequest request to URL ./common/logToServer.jsp?controllerName=6c3eaf3e-897d-4b30-a15e-62f9d3d3ce78 returned status code 413
Now, I know what the HTTP 413 error code means, but I'm not sure why my local setup does not show the same error. That leads me to believe that there is some Azure configuration I need to change, but I don't know which.
It is a simple Java web application using Servlets and running on Tomcat.
log4javascript is a logging framework for JavaScript with no runtime dependencies. As the error message shows, its AjaxAppender request was rejected because the payload was too large.
The HTTP status code 413 ("Payload Too Large") indicates that the request entity is larger than the limits defined by the server; the server might close the connection.
Fix:
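In application.properties (these are Spring Boot property names), add these two lines:
# maximum size of the request body/payload that Tomcat will swallow
server.tomcat.max-swallow-size=***MB
# maximum size of the entire POST request
server.tomcat.max-http-post-size=***MB
NOTE:
*** is your desired size in megabytes, as an integer. Comments go on their own lines starting with #, because a properties file treats trailing text as part of the value.
For a plain Tomcat deployment without Spring Boot, the equivalent settings are the maxSwallowSize and maxPostSize attributes of the Connector element in server.xml (both in bytes).
See the reference article for more information and the solution.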
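Before raising the limits, it may help to confirm how large the rejected request body actually is. Here is a rough sketch that measures a body in UTF-8 bytes and compares it with a limit. The function names are hypothetical, and the 2 MiB figure is Tomcat's documented default for maxPostSize. Whether that particular limit is what triggers the 413 in this deployment is an assumption, not something the question confirms.

```javascript
// Tomcat's documented default for maxPostSize is 2 MiB.
// ASSUMPTION: the deployed server may be configured differently.
const DEFAULT_LIMIT_BYTES = 2 * 1024 * 1024;

// Size of the body as sent over the wire: UTF-8 bytes, not characters.
// TextEncoder is available both in browsers and in Node.
function payloadSizeInBytes(body) {
  return new TextEncoder().encode(body).length;
}

// True when the body is larger than the given limit.
function exceedsLimit(body, limitBytes = DEFAULT_LIMIT_BYTES) {
  return payloadSizeInBytes(body) > limitBytes;
}
```

Logging payloadSizeInBytes for the body that log4javascript sends, both locally and on Azure, would show whether the two environments really receive the same request size.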

Azure Function getting 503

I am trying to run an Azure Function App, that we already have running in a different resource group / service plan / storage account. The original app works fine. But when I try to run this one, I get a 503.
The problem is that all I know is that I'm getting the 503. There is no other information. I turned on tracing in the app, but I still get no messages. I have tried to execute the app from both the Azure Portal Function App Code / Test section, and from Postman, with the same results. It spins for a long time, and then I get the 503.
When I try to execute the function, it is showing me the following in the logs:
Request successfully matched the route with name 'IngestRfidScan' and template 'api/v1/rfidScan'
Executing 'Functions.IngestRfidScan' (Reason='This function was programmatically called via the host APIs.', Id=a9c37c44-6a27-41e0-bff8-74fbb4275ecc)
Sending invocation id:a9c37c44-6a27-41e0-bff8-74fbb4275ecc
Posting invocation id:a9c37c44-6a27-41e0-bff8-74fbb4275ecc on workerId:7195f57f-b8ff-4613-84e4-9d4bc5dd7c4a
I don't see any log messages after this. I tried adding logging to the app, but I am not seeing my messages in the log anywhere. So this leads me to believe that it's not executing the function at all. But I can't seem to find any way to determine why. At first I thought it could be a firewall issue, but I don't think I'd see those messages in the log above.
Any ideas how to diagnose this?
See one of my earlier workarounds for the common causes of the Azure Functions 503 Service Unavailable error.
It is definitely timing out. But I don't know why that is, I don't have enough info in the logs. I checked App Insights, but again, it just tells me the request is timing out, but no explanation.
The workaround referenced above lists the timeout limits and the resolution, so check those as well.
To get logs and more information, check the Diagnose and solve problems menu of the Function App in the Azure Portal. My workaround also shows different ways to view the Function App and host logs.
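Since the log stops right after the invocation is posted and the request eventually times out, one way to narrow it down is to give each downstream call its own time budget, so the function fails fast with a message you control instead of hanging until the platform returns 503. This is only a sketch: withTimeout, slowDependency, and handleRequest are hypothetical names, and slowDependency stands in for whatever the real function calls (database, HTTP API, queue).

```javascript
// Rejects if `work` has not settled within `ms` milliseconds.
function withTimeout(work, ms, label = 'operation') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} exceeded ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is always cleaned up.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical stand-in for a slow downstream call.
function slowDependency(delayMs) {
  return new Promise((resolve) => setTimeout(() => resolve('done'), delayMs));
}

// Hypothetical handler: returns a plain { status, body } object.
async function handleRequest(delayMs, budgetMs) {
  try {
    const result = await withTimeout(slowDependency(delayMs), budgetMs, 'slowDependency');
    return { status: 200, body: result };
  } catch (err) {
    console.error(err.message); // names the call that ran out of time
    return { status: 504, body: err.message };
  }
}
```

Wrapping each external call separately means the log names the dependency that stalls, which the platform-level timeout alone does not tell you.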

Microsoft Azure sign in stuck in an infinite loading screen

The errors are as shown in the image... I followed the guide and downloaded the newest tools, but the error still persists... I'm on school wifi right now, and since I'm new to this, I have no idea on how to change Azure's environment variables on an existing project... I cannot sign in in the first place. I was just following through the guide :(
Please check whether the steps below help as a workaround:
Error in the Azure portal: HTTP response code 503 Service Unavailable
This happens because of network connectivity or service availability issues.
The better approach is to retry the operation and, if the issue persists, contact Azure Support as referenced here.
Alternatively, navigate to Diagnose and solve problems to find the root cause of the 503 error, as there can be multiple reasons for it.
Common causes of a 503 error include:
request taking a long time
application crashing due to an exception.
average response time is long
A Function App is also an App Service app, and App Service enforces limits on the number of outbound connections
Error in browser / Postman: 502 - Web server received an invalid response while acting as a gateway or proxy server.
This 502 error was already addressed in your MSFT Q&A thread.
Normally, a 502 error occurs when HTTP is used instead of HTTPS in the connection, but I know that the Azure Functions endpoint has this format:
http://<APP_NAME>.azurewebsites.net/api/<FUNCTION_NAME>
A 502 error also occurs when execution exceeds the maximum timeout. Check the function app's timeout value, its logs, and the requestTime and responseTime metrics in Application Insights. If the timeout is the cause, increase the timeout value.
References:
Troubleshooting Reason for a 502 Error
How do I fix this 502 Error on my Azure Function?
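The advice above to retry the operation could be sketched as a small retry helper with exponential backoff. This is an illustration under assumptions, not an Azure API: retryWithBackoff is a hypothetical name, and it expects the operation to throw an error carrying a numeric status property when the call fails.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retries `operation` while it fails with a retryable status (502/503 by default).
// ASSUMPTION: failures are thrown as errors with a numeric `status` property.
async function retryWithBackoff(operation, options = {}) {
  const { attempts = 4, baseDelayMs = 500, retryOn = [502, 503] } = options;
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      const retryable = retryOn.includes(err.status);
      // Give up on non-retryable errors or once the attempts are used up.
      if (!retryable || attempt === attempts - 1) break;
      // Wait longer after each failure: base, 2x base, 4x base, ...
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

The operation passed in would typically wrap the HTTP call to the function endpoint and throw when the response status is not successful. If every attempt still fails, the last error is rethrown, which is the point at which contacting support makes sense.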

Azure Service Fabric Activation Error 7148

I have a service fabric cluster which hosts numerous applications. One of the applications has a service type where the service is created, runs for a bit, and then is deleted. Everything works great, but the cluster virtually always has its state set to error because there will be a few of these in the "Unhealthy evaluations" section.
Error event: SourceId='System.Hosting', Property='CodePackageActivation:Code:EntryPoint'.
There was an error during CodePackage activation.The service host terminated with exit code:7148
I've wrapped both the program's main and RunAsync in exception handlers, but never see anything in analytics. Is there any way to look up what exit code 7148 means? Thanks.
7148 is a general error code that indicates that something failed in SF in the process of setting up or activating your service's host process. So that's the reason that you're not seeing any errors or exceptions - your code is never getting a chance to run.
Examples of things I've seen that led to 7148:
The exe was not actually a windows exe due to corruption
The service's manifest had a reference to a cert or some other pre-req like an endpoint that was incorrectly configured (like a port that was already in use or the wrong thumbprint for a cert)
Something blew up inside Windows that caused the process creation to fail, like a failure to correctly configure host networking for a container
Most of the time when I see this, I have to look at the Windows error logs to see what's really happening. The SF folks are also trying to capture more of the common causes of failure and report them as better health errors rather than relying on 7148.

Debugging node in a per request strategy

I've been working with Node for quite a while now, and something keeps annoying me in my production environment: debugging!
So I thought about a system that would work as follows:
An error occurs with a certain level or an uncaught one.
Log a super long stack trace related to the request, containing all function calls AND variable values since the request happened.
Send that to a service or a simple log file (monitored) that would inform me that an error happened but with a clear idea of the context.
I don't know how to do something like that, or whether there are existing tools out there that do that job.
My strategy for now: use long-stack-trace when an error occurs, then crash the worker, which is restarted by the cluster parent (responsible only for redirecting HTTP requests and monitoring its children).
Thanks!
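One possible direction for the per-request context described above is Node's AsyncLocalStorage (from the built-in async_hooks module), which keeps a store attached to everything that runs on behalf of one request. The sketch below is hedged accordingly: it only records what the code explicitly logs as breadcrumbs, so it does not capture every function call or variable value automatically. The names runWithContext, breadcrumb, and errorReport are hypothetical.

```javascript
const { AsyncLocalStorage } = require('async_hooks');
const { randomUUID } = require('crypto');

const requestContext = new AsyncLocalStorage();

// Runs `handler` with a fresh per-request store.
// Fields in `info` (for example url, or an explicit id) override the defaults.
function runWithContext(info, handler) {
  const store = { id: randomUUID(), startedAt: Date.now(), breadcrumbs: [], ...info };
  return requestContext.run(store, handler);
}

// Records a step, plus any values worth keeping, against the current request.
function breadcrumb(message, data) {
  const store = requestContext.getStore();
  if (store) store.breadcrumbs.push({ at: Date.now(), message, data });
}

// Builds the report to send to a monitored log file or a service.
function errorReport(err) {
  const store = requestContext.getStore() || {};
  return {
    requestId: store.id,
    url: store.url,
    breadcrumbs: store.breadcrumbs || [],
    error: err.message,
    stack: err.stack,
  };
}
```

In an HTTP server, the request listener would call runWithContext({ url: req.url }, ...) around the rest of the handling, and the error handler would pass errorReport(err) to the logger before the worker is restarted.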
