NServiceBus & IIS

NSB 4.6.1
IIS 7.5 - Recycling for this application pool is turned off
I am hosting a handler inside of an IIS web application which forwards messages to web clients over SignalR. These messages are sent from other NSB services. The problem I run into is that, in production, the handler for these messages will randomly stop consuming messages from the queue, and I have to restart IIS to get it to start again. I can't reproduce this issue outside of production, and I cannot find any information in any of my logs.
Anyone have any ideas on what may be causing this?
My NSB configuration
var bus = NServiceBus.Configure.With()
    .StructureMapBuilder(container)
    .UseTransport<Msmq>()
    .UnicastBus()
    .PurgeOnStartup(false)
    .CreateBus();
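A minimal sketch of the kind of handler described above, assuming SignalR 2.x hosted in the same web application (the message type, hub, and client method names are illustrative, not the actual production code):

using Microsoft.AspNet.SignalR;
using NServiceBus;

// Hypothetical event published by the other NSB services.
public class StatusChanged : IEvent
{
    public string Text { get; set; }
}

// Hub the web clients are connected to (illustrative).
public class NotificationHub : Hub
{
}

// Handler hosted inside the IIS web application; it forwards each
// incoming message to all connected SignalR clients.
public class StatusChangedHandler : IHandleMessages<StatusChanged>
{
    public void Handle(StatusChanged message)
    {
        var hub = GlobalHost.ConnectionManager.GetHubContext<NotificationHub>();
        hub.Clients.All.statusChanged(message.Text);
    }
}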

Can you try to configure IIS as always running and see if the behavior changes?
http://msdn.microsoft.com/en-us/library/ee677261(v=azure.10).aspx

Related

SignalR long polling repeatedly calls /negotiate and /hub POST and returns 404 occasionally on Azure Web App

We have enabled SignalR on our ASP.NET Core 5.0 web project running on an Azure Web App (Windows App Service Plan). Our SignalR client is an Angular client using the @microsoft/signalr npm package (version 5.0.11).
We have a hub located at /api/hub/notification.
Everything works as expected for most of our clients, the web socket connection is established and we can call methods from client to server and vice versa.
For a few of our clients, we see a massive number of requests to POST /api/hub/notification/negotiate and POST /api/hub/notification within a short period of time (multiple requests per minute per client). It seems like those clients switch to long polling instead of using web sockets, since we see the POST /api/hub/notification requests.
We suspect that the affected clients might sit behind a proxy or a firewall that blocks WebSockets, which is why the connection switches to long polling in the first place.
The following screenshot shows requests to the hub endpoints for one single user within a short period of time. The list is very long since this pattern repeats for as long as the user keeps our website open. We see two strange things:
The client repeatedly calls /negotiate twice every 15 seconds.
The call to POST /notification?id=<connectionId> takes exactly 15 seconds and the following call with the same connection ID returns a 404 response. Then the pattern repeats and /negotiate is called again.
For testing purposes, we enabled only long polling in our client. This works for us as expected too. Unfortunately, we currently don't have access to the browsers or the network of the users where this behavior occurs, so it is hard for us to reproduce the issue.
Some more notes:
We currently have just one single instance of the Web App running.
We use the Redis backplane for a future scale-out scenario.
The ARR affinity cookie is enabled and Web Sockets in the Azure Web App are enabled too.
The Web App instance doesn't suffer from high CPU usage or high memory usage.
We didn't change any SignalR options except for adding the Redis backplane. We just use services.AddSignalR().AddStackExchangeRedis(...) and endpoints.MapHub<NotificationHub>("/api/hub/notification"), as sketched after these notes.
The website runs on HTTPS.
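For reference, the registration described in the notes above looks roughly like this (a sketch using the hub path from the question and a placeholder Redis connection string, not the exact production code):

using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.SignalR;
using Microsoft.Extensions.DependencyInjection;

public class NotificationHub : Hub
{
}

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Redis backplane added for a future scale-out scenario.
        services.AddSignalR()
            .AddStackExchangeRedis("<redis-connection-string>");
    }

    public void Configure(IApplicationBuilder app)
    {
        app.UseRouting();
        app.UseEndpoints(endpoints =>
        {
            endpoints.MapHub<NotificationHub>("/api/hub/notification");
        });
    }
}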
What could cause these repeated calls to /negotiate and the 404 returns from the hub endpoint?
How can we further debug the issue without having access to the clients where this issue occurs?
Update
We now implemented a custom logger for the @microsoft/signalr package which we pass to the configureLogging() overload. This logger writes to our Application Insights, which allows us to track the client-side logs of those clients where our issue occurs.
The following screenshot shows a short snippet of the log entries for one single client.
We see that the WebSocket connection fails (Failed to start the transport "WebSockets" ...) and the fallback transport ServerSentEvents is used. We see the log message The HttpConnection connected successfully, but almost exactly 15 seconds after selecting the ServerSentEvents transport, a handshake request is sent which fails with the server message Server returned handshake error: Handshake was canceled. After that, several follow-on errors occur and the connection is closed. Then the connection is established again and everything starts over: a new handshake error occurs after those 15 seconds, and so on.
Why does it take so long for the client to send the handshake request? It seems like those 15 seconds are the problem, since this is too long for the server and the server cancels the connection due to a timeout.
We still think that this may have something to do with the client's network (proxy, firewall, etc.).
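For what it's worth, the 15-second window matches the default HandshakeTimeout on the ASP.NET Core SignalR server, so one way to test whether that timeout is what cancels the handshake is to raise it when registering SignalR (a sketch under that assumption, not a confirmed fix):

using System;
using Microsoft.Extensions.DependencyInjection;

public static class SignalRSetup
{
    public static void Configure(IServiceCollection services)
    {
        services.AddSignalR(options =>
        {
            // HandshakeTimeout defaults to 15 seconds; a larger value gives
            // slow clients (e.g. behind a proxy) more time to send the handshake.
            options.HandshakeTimeout = TimeSpan.FromSeconds(30);
        });
    }
}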
Fiddler
We used Fiddler to block the WebSockets for testing. As expected, the fallback mechanism kicks in and ServerSentEvents is used as the transport. Unlike in the logs from our issue, the handshake request is sent immediately and not after 15 seconds. Then everything works as expected.
You should check which pricing tier you use in your project, Free or Standard.
You should change the connection string to one from a Standard-tier instance. If you still use the Free tier, there are some restrictions.
Official doc: Azure SignalR Service limits
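If this answer refers to the Azure SignalR Service (which the question itself does not mention using), the connection string it talks about is the one supplied when registering SignalR; a sketch assuming the Microsoft.Azure.SignalR package and a placeholder connection string:

using Microsoft.Extensions.DependencyInjection;

public static class AzureSignalRSetup
{
    public static void Configure(IServiceCollection services)
    {
        // The connection string comes from the Azure SignalR Service instance;
        // its tier (Free or Standard) determines the service limits.
        services.AddSignalR()
            .AddAzureSignalR("<azure-signalr-connection-string>");
    }
}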

Getting 500 server error. My site works fine and only throws these errors occasionally. How can I better diagnose the problem?

My site works fine locally. It even works fine with my backend using Azure web services and the front end using Netlify, but occasionally, after several API calls (I'm not overloading the server because these API calls are done one by one), I get LOTS of errors that are all the same: 500 Internal Server Error. I look at the logs and they give me some numbers: 500 1013 109 329 2144 391
Reasons for this could be:
A network issue on your server
A server request timeout
The web app takes too long to respond to a request when connecting to a resource (database, a different server), etc.
To resolve that, I would suggest you increase the idle timeout of your app.
In the app settings of your web app, add SCM_COMMAND_IDLE_TIMEOUT = 3600.
By default, Web Apps are unloaded if they are idle for some period of time. This lets the system conserve resources. In Basic or Standard mode, you can enable ‘Always On’ to keep the app loaded all the time.
You may also check the diagnostic log stream to get more details on this issue and the blog post for Troubleshooting Azure App Service Apps Using Web Server Logs.
Hope it helps.

Keep a node application running on azure app service

I have deployed a Node.js web application on App Service in Azure. The issue is that my application occasionally gets killed for an unknown reason. I have done an exhaustive search through all the log files using Kudu.
If I restart the App Service, the application starts working.
Is there any way I can restart my node application once it has crashed, so it kind of runs forever no matter what? For example, if any error happens in ASP.NET code deployed in IIS, IIS never crashes; it keeps serving other incoming requests.
Something like using forever/pm2 in Azure App Service.
node.js in Azure App Services is powered by IISNode, which takes care of everything you described, including monitoring your process for failures and restarting it.
Consider the following POC:
var http = require('http');

http.createServer(function (req, res) {
    if (req.url == '/bad') {
        throw 'bad';
    }
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end('bye');
}).listen(process.env.PORT || 1337);
If I host this in a Web App and issue the following sequence of requests:
GET /
GET /bad
GET /
Then the first will yield HTTP 200, the second will throw on the server and yield HTTP 500, and the third will yield HTTP 200 without me having to do anything. IISNode will just detect the crash and restart the process.
So you shouldn't need PM2 or a similar solution, because this is built in with App Service. However, if you really want to, there is now an App Service on Linux preview, which is powered by PM2 and lets you configure PM2. More on this here. But again, you get this out of the box already.
Another thing to consider is the Always On setting, which is off by default:
By default, web apps are unloaded if they are idle for some period of time. This lets the system conserve resources. In Basic or Standard mode, you can enable Always On to keep the app loaded all the time. If your app runs continuous web jobs, you should enable Always On, or the web jobs may not run reliably.
This is another possible root cause for your issue, and the solution is to enable Always On for your Web App (see the link above).
I really want to thank itaysk for your support for this issue.
The issue was not what I was suspecting. Actually, the node server was getting restarted on failure correctly.
There was a different issue; my website was becoming unresponsive for a different reason. Here is what was happening:
We used rethinkdbdash to connect to a RethinkDB database with a connection pool. There was a coding/design issue: we have around 15 changefeeds implemented along with socket.io, and a changefeed was getting initialised for every logged-in user. This kept increasing the number of active connections in the pool. rethinkdbdash has a default limit of 1000 connections in the pool, and with so many live changefeeds all available connections were eventually exhausted. Requests then waited for an open connection that never became available, waiting forever and blocking any new requests from being served.

Azure app service node.js deployment - handling failover

I have deployed a node application in Azure running under an App Service. Now the issue is that the site goes down occasionally and stops responding. Once I restart the site, it starts working.
If I look at the logs, it says IISNode has encountered an error.
My question is: is there any way to log the error and restart the node process gracefully?
What is the best-practice approach for a node website deployed in App Service?
This is the only error I get:
According to the list at https://azure.microsoft.com/en-us/documentation/articles/app-service-web-nodejs-best-practices-and-troubleshoot-guide/:
500 1004-1018: There was some error while sending the request or processing the response to/from node.exe. Check if node.exe crashed; check d:\home\LogFiles\logging-errors.txt for the stack trace.
For more best practices and troubleshooting scenarios, you can refer to https://azure.microsoft.com/en-us/documentation/articles/app-service-web-nodejs-best-practices-and-troubleshoot-guide/.

Using RabbitMQ to capture web application log

I'm trying to set up RabbitMQ to ship web application logs to a log server.
My log server will listen to one channel and store the logs that come in.
There are several web applications that need to send info to the log server.
With many connections (users) hitting the web server, what is the best design to publish messages to RabbitMQ without locking each other? Is it a good idea to keep opening a new connection to the MQ for each web request? Is there some sort of message queue pool?
I'm using IIS for a web server.
I assume you’re leveraging the .NET framework to build your application, given that it’s hosted in IIS. If so, you can also leverage Daishi.AMQP, which has a built-in QueuePool feature. Here is a tutorial that outlines the mechanism in full.
To answer your question, you should initially establish a single connection to RabbitMQ from your application server. You can then open a Channel (a lightweight, logical channel multiplexed over the underlying connection) to serve each HTTP request. It is not a good idea to establish a new connection for each request.
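A minimal sketch of that pattern using the plain RabbitMQ .NET client rather than Daishi.AMQP (host name, queue name, and helper class are placeholders):

using System;
using System.Text;
using RabbitMQ.Client;

public static class LogPublisher
{
    // One connection for the whole web application, created lazily on first use.
    private static readonly Lazy<IConnection> Connection = new Lazy<IConnection>(
        () => new ConnectionFactory { HostName = "log-server-host" }.CreateConnection());

    // Called per HTTP request: channels are cheap to open, connections are not.
    public static void Publish(string logLine)
    {
        using (var channel = Connection.Value.CreateModel())
        {
            channel.QueueDeclare(queue: "weblogs", durable: true, exclusive: false,
                                 autoDelete: false, arguments: null);
            var body = Encoding.UTF8.GetBytes(logLine);
            channel.BasicPublish(exchange: "", routingKey: "weblogs",
                                 basicProperties: null, body: body);
        }
    }
}

Channels must not be shared across threads, which is why one is opened per request here while the connection itself is reused.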
RabbitMQ has a built-in queue feature. It is well documented; have a look at the official docs: http://www.rabbitmq.com/getstarted.html
