I have inherited a basic Azure Web App as part of a system that needs to be very responsive. The system sends an HTTP POST to my Azure server, which then processes the message and sends a response. Most of the time the server responds in about 0.4 s, which is fine; however, there are occasions when the response time jumps to several seconds.
I have instrumented the code and captured the time from when the POST is sent by my system to when the App is first kicked off by the POST arriving, and the delay appears to lie between the POST being sent and the App starting (not in the App itself). This all happens inside Azure's machinery, and I do not know of any way of getting further metrics.
My Web App has Always On enabled and auto-scaling, and I see no obvious problems in the Azure console. My tests send a POST every few seconds, so I would not expect the App to go to sleep.
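For illustration, here is a minimal sketch (inside Startup.Configure of an ASP.NET Core app) of the kind of instrumentation described above; the X-Sent-At header name is an assumption for the sketch, not part of the original system:

    // Hypothetical instrumentation middleware: the client stamps each POST
    // with an "X-Sent-At" header (assumed name) holding its UTC tick count,
    // and this middleware, registered first in the pipeline, logs how long
    // the request spent before reaching application code.
    app.Use(async (context, next) =>
    {
        if (context.Request.Headers.TryGetValue("X-Sent-At", out var sentAt)
            && long.TryParse(sentAt, out var sentTicks))
        {
            // Note: this delta also includes any clock skew between machines.
            var delayMs = (DateTime.UtcNow.Ticks - sentTicks)
                          / TimeSpan.TicksPerMillisecond;
            Console.WriteLine($"Pre-app delay: {delayMs} ms");
        }
        await next();
    });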
Related
We have enabled SignalR on our ASP.NET Core 5.0 web project running on an Azure Web App (Windows App Service Plan). Our SignalR client is an Angular client using the @microsoft/signalr NPM package (version 5.0.11).
We have a hub located at /api/hub/notification.
Everything works as expected for most of our clients: the WebSocket connection is established, and we can call methods from client to server and vice versa.
For a few of our clients, we see a massive number of requests to POST /api/hub/notification/negotiate and POST /api/hub/notification within a short period of time (multiple requests per minute per client). It seems that those clients switch to long polling instead of using WebSockets, since we see the POST /api/hub/notification requests.
We suspect that the affected clients may sit behind a proxy or firewall that blocks WebSockets, which is why the connection falls back to long polling in the first place.
The following screenshot shows requests to the hub endpoints for one single user within a short period of time. The list is very long, since this pattern repeats for as long as the user keeps our website open. We see two strange things:
The client repeatedly calls /negotiate twice every 15 seconds.
The call to POST /notification?id=<connectionId> takes exactly 15 seconds and the following call with the same connection ID returns a 404 response. Then the pattern repeats and /negotiate is called again.
For testing purposes, we enabled only long polling in our client. This works for us as expected too. Unfortunately, we currently don't have access to the browsers or the network of the users where this behavior occurs, so it is hard for us to reproduce the issue.
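For reference, the server-side analogue of this test is to restrict the transports offered by the hub endpoint; a sketch assuming ASP.NET Core 5.0, placed in Startup.Configure:

    using Microsoft.AspNetCore.Http.Connections;

    // Limit the hub endpoint to long polling only, mirroring the
    // client-side test described above.
    app.UseEndpoints(endpoints =>
    {
        endpoints.MapHub<NotificationHub>("/api/hub/notification", options =>
        {
            options.Transports = HttpTransportType.LongPolling;
        });
    });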
Some more notes:
We currently have just one single instance of the Web App running.
We use the Redis backplane to prepare for a future scale-out scenario.
The ARR affinity cookie is enabled, and WebSockets are enabled in the Azure Web App as well.
The Web App instance doesn't suffer from high CPU usage or high memory usage.
We didn't change any SignalR options except for adding the Redis backplane. We just use services.AddSignalR().AddStackExchangeRedis(...) and endpoints.MapHub<NotificationHub>("/api/hub/notification") (see the sketch after this list).
The website runs on HTTPS.
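For context, a minimal sketch of the Startup wiring just described (the Redis connection string is a placeholder):

    using Microsoft.AspNetCore.Builder;
    using Microsoft.Extensions.DependencyInjection;

    public class Startup
    {
        public void ConfigureServices(IServiceCollection services)
        {
            // SignalR with the Redis backplane, as noted above.
            services.AddSignalR()
                .AddStackExchangeRedis("<redis-connection-string>");
        }

        public void Configure(IApplicationBuilder app)
        {
            app.UseRouting();
            app.UseEndpoints(endpoints =>
            {
                endpoints.MapHub<NotificationHub>("/api/hub/notification");
            });
        }
    }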
What could cause these repeated calls to /negotiate and the 404 responses from the hub endpoint?
How can we further debug the issue without having access to the clients where this issue occurs?
Update
We have now implemented a custom logger for the @microsoft/signalr package, which we pass to the configureLogging() overload. This logger writes to our Application Insights, which allows us to track the client-side logs of the clients where our issue occurs.
The following screenshot shows a short snippet of the log entries for one single client.
We see that the WebSocket connection fails (Failed to start the transport "WebSockets" ...) and the fallback transport ServerSentEvents is used. We see the log The HttpConnection connected successfully, but almost exactly 15 seconds after the ServerSentEvents transport is selected, a handshake request is sent, which fails with the server message Server returned handshake error: Handshake was canceled. After that, some follow-on errors occur and the connection gets closed. Then the connection gets established again and everything starts from the beginning: a new handshake error occurs after those 15 seconds, and so on.
Why does it take so long for the client to send the handshake request? It seems that those 15 seconds are the problem: this is too long for the server, which cancels the connection due to a timeout.
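For what it's worth, 15 seconds is also the default HandshakeTimeout in ASP.NET Core SignalR's HubOptions, which would explain why the server cancels at exactly that point. One way to test that theory is to raise the timeout on the server; a sketch (not a fix for the client-side delay itself):

    // In ConfigureServices: give slow clients more time to complete
    // the handshake (the default HandshakeTimeout is 15 seconds).
    services.AddSignalR(options =>
    {
        options.HandshakeTimeout = TimeSpan.FromSeconds(30);
    })
    .AddStackExchangeRedis("<redis-connection-string>");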
We still think that this may have something to do with the client's network (proxy, firewall, etc.).
Fiddler
We used Fiddler to block WebSockets for testing. As expected, the fallback mechanism kicks in and ServerSentEvents is used as the transport. Unlike the logs we see for our issue, the handshake request is sent immediately rather than after 15 seconds, and then everything works as expected.
You should check which pricing tier of Azure SignalR Service your project uses, Free or Standard. If you are still on the Free tier, there are some restrictions (such as connection and message limits); switch to the Standard tier and update the connection string accordingly.
Official doc: Azure SignalR Service limits
I've encountered an interesting problem when trying to make an HTTP request from an Azure VM. It appears that when the request is run from this VM, the response never arrives. I tried using custom C# code that makes an HTTP request, and Postman. In both cases we can see in the logs on the target API side that the response has been sent, but no data is received on the origin VM. The exact same C# request and Postman request work outside of this VM on multiple networks and machines. The only tool that actually works for this request on the VM side is curl in a Bash terminal, but that is not an option given the current requirements.
Tried on multiple Azure VM sizes, on Windows 10 and Windows Server 2019.
The target API is hosted on-premises, and it takes around 5 minutes for the data to be sent back. The payload is very small, but due to the computation performed on the API side it takes a while to generate. Modifying this API is not an option.
So to be clear: the requests are perpetually stuck until the client-side timeout is reached (if one was configured). Does anybody know what could cause this?
If these transfers take longer than 4 minutes without keep-alives, Azure will typically close the connection (4 minutes is the default idle timeout).
You should be able to see this by monitoring the connection with Wireshark.
TCP idle timeouts can be configured when using a Load Balancer, but you can also try adding keep-alives in your API server, or on the client, if possible; see the sketch below.
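On the client side, a sketch of enabling TCP keep-alives, assuming the classic HttpWebRequest/ServicePoint stack (where ServicePointManager settings are honored); the URL is a placeholder:

    using System;
    using System.Net;

    // Send TCP keep-alive probes so the connection isn't considered idle
    // while the on-premises API spends ~5 minutes computing its response.
    ServicePointManager.SetTcpKeepAlive(
        enabled: true,
        keepAliveTime: 50_000,       // first probe after 50 s of inactivity
        keepAliveInterval: 30_000);  // then every 30 s

    var request = (HttpWebRequest)WebRequest.Create("https://on-prem-api.example/compute");
    request.Timeout = 6 * 60 * 1000; // allow more than the ~5 minute computation
    using var response = (HttpWebResponse)request.GetResponse();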
My site works fine locally. It even works fine with my backend on Azure Web Services and my front end on Netlify, but occasionally, after several API calls (I'm not overloading the server; these API calls are made one by one), I get LOTS of errors that are all the same: 500 Internal Server Error. I look at the logs and they give me some numbers: 500 1013 109 329 2144 391
Reasons for this could be:
- A network issue on your server
- A server request timeout
- The web app taking too long to respond to a request when connecting to a resource (database, a different server, etc.)
To resolve that, I would suggest you increase the idle timeout of your app.
In the app settings of your web app, add SCM_COMMAND_IDLE_TIMEOUT = 3600.
By default, Web Apps are unloaded if they are idle for some period of time. This lets the system conserve resources. In Basic or Standard mode, you can enable ‘Always On’ to keep the app loaded all the time.
You may also check the diagnostic log stream to get more details on this issue, and see the blog post Troubleshooting Azure App Service Apps Using Web Server Logs.
Hope it helps.
I have created a bot for Slack and deployed it to Azure. The bot makes some API calls to another server. My client wants to measure the time taken by a request to reach the server and the time taken by the response to come back to the bot (only the time taken by the request/response to travel to either side). I have been exploring Azure Application Insights for three days but could not find any helpful service. I cannot change my bot code. Is there any Azure service that lets me monitor this latency?
Here is a simple diagram:
Bot ----t1----> Server
Bot <---t2----- Server
I don't want the time taken to process on the server side (no calculation time), just the request/response travel time.
Ganesh,
What you seem to be asking is how long it takes the API to process a request and return a response; that's nothing to do with your bot.
My suggestion would be to create performance tests against the API directly using a tool such as JMeter. This will give you average response times for, say, 10,000 requests and plot them out on nice graphs, etc.
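If a full JMeter setup is more than you need, a stand-alone probe along these lines (the URL is a placeholder; this is not part of the bot) collects the same raw numbers for averaging:

    using System;
    using System.Diagnostics;
    using System.Net.Http;
    using System.Threading.Tasks;

    class LatencyProbe
    {
        static async Task Main()
        {
            using var client = new HttpClient();
            for (var i = 0; i < 100; i++)
            {
                var sw = Stopwatch.StartNew();
                await client.GetAsync("https://api.example/endpoint"); // placeholder
                sw.Stop();
                // Round trip = t1 + server processing + t2; the travel time
                // alone can't be separated without instrumenting the server.
                Console.WriteLine($"Round trip {i}: {sw.ElapsedMilliseconds} ms");
            }
        }
    }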
If you need help doing this, I could write it up for you in step-by-step instructions.
Let me know.
Thanks,
Tim
I am using ServiceStack to create RESTful services, though I don't have deep knowledge of it. It works by sending a request and getting a response back. I have a scenario, and my questions depend on it.
Scenario: I send a request from a browser (or any client able to send requests to the server). Suppose the server takes 3 seconds to process a single request and send the response back to the browser. After one second, I send another request to the server from the same browser (client). Now I am getting the response to the second request, which I sent later.
Question 1: What is happening behind the scenes with the first request, for which I did not get a response?
Question 2: How can I stop the processing of the orphaned request?
Edit: I have used IIS to host the services.
ServiceStack executes requests concurrently on multithreaded web servers, whether you're hosting on ASP.NET/IIS or self-hosting, so 2 concurrent requests run concurrently on different threads. Different scenarios are possible if your Services execute async tasks, in which case threads are freed up to execute other work, but the implementation details are largely irrelevant here.
HTTP web requests are each executed to the end; even when a client connection is lost, your Services are never notified and no Exceptions are raised.
But for long-running Services you can enable ServiceStack's high-level Cancellable Requests feature, which gives clients a way to cancel long-running requests.
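A sketch of how that looks, going by the ServiceStack documentation (the request DTO and the work loop are placeholders):

    // In your AppHost: register the feature.
    public override void Configure(Funq.Container container)
    {
        Plugins.Add(new CancellableRequestsFeature());
    }

    // In a long-running Service: tie the work to a cancellable request.
    public object Any(LongRunningRequest request) // placeholder DTO
    {
        using (var cancellableRequest = base.Request.CreateCancellableRequest())
        {
            for (var i = 0; i < 100; i++)
            {
                // Throws, and aborts the request, once the client cancels.
                cancellableRequest.Token.ThrowIfCancellationRequested();
                DoStep(i); // placeholder for a unit of work
            }
            return new LongRunningResponse(); // placeholder DTO
        }
    }

Per the docs, clients tag a request via the X-Tag HTTP header and cancel it later by sending ServiceStack's built-in CancelRequest DTO with the same tag.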