I’m currently building an application using NServiceBus and Azure.
The regular processes are working, but now I’d like to do more about the management and monitoring aspect of the application.
The customer wants to see a dashboard where he can see the health of the application and also be able to correct issues.
What I’d like to do is:
Detect when things are sent to an error queue (to be able to send an alert to an admin)
Allow an admin to handle messages on the error queue from the management application, without resorting to the provided command-line tool.
Is there a way to do error handling programmatically in NServiceBus? I know which errors are transient and which errors might need manual intervention.
Is it possible to plug custom logic into the error handling of NServiceBus?
Is it possible to handle messages on the error queue programmatically?
Thanks,
Erwin
Regarding "dashboard where he can see the health of the application and also be able to correct issues":
Please take a look at ServicePulse (http://particular.net/ServicePulse) for production and online monitoring.
This provides both endpoint health indicators and Failed message indicators (including "Retry" capabilities).
For advanced debugging and visualization of your process you should also consider ServiceInsight (http://particular.net/ServiceInsight).
Behind the scenes of ServicePulse there's the ServiceControl server, which exposes a REST HTTP API with programmatic access to audited and error messages.
HTH,
Danny.
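For the "I know which errors are transient" part, here is a minimal sketch of the triage step a management application might run over failed messages before deciding what to retry and what to show an admin. Everything in it is an assumption for illustration: the `FailedMessage` shape, the exception type names, and the `max_auto_retries` limit are hypothetical and are not taken from NServiceBus or ServiceControl.

```python
from dataclasses import dataclass

# Hypothetical list of exception type names the application considers transient.
TRANSIENT_EXCEPTION_TYPES = {"TimeoutException", "ServerBusyException"}


@dataclass
class FailedMessage:
    # Hypothetical record shape; a real error-queue entry will look different.
    message_id: str
    exception_type: str
    retry_count: int


def triage(failed, max_auto_retries=3):
    """Split failed messages into (auto_retry, needs_admin)."""
    auto_retry, needs_admin = [], []
    for msg in failed:
        is_transient = msg.exception_type in TRANSIENT_EXCEPTION_TYPES
        if is_transient and msg.retry_count < max_auto_retries:
            auto_retry.append(msg)
        else:
            # Non-transient, or transient but retried too often: alert a human.
            needs_admin.append(msg)
    return auto_retry, needs_admin
```

The `needs_admin` list is what would feed the alert and the dashboard; the `auto_retry` list would be handed to whatever retry mechanism is in place.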
I have an Azure function with ServiceBusTrigger which will post the message content to a webservice behind an Azure API Manager. In some cases the load of the (3rd party) webserver backend is too high and it collapses returning error 500.
I'm looking for a proper way to implement circuit breaker here.
I've considered the following:
Disable the azure function, but it might result in data loss due to multiple messages in memory (serviceBus.prefetchCount)
Implement API Manager with a rate-limit policy, but this seems counterproductive as it runs fine in most cases
Re-architecting the 3rd party webservice is out of scope :)
Set the queue to ReceiveDisabled. This is the preferred solution, but it results in my input binding throwing a huge number of MessagingEntityDisabledExceptions, which I'm (so far) unable to catch and handle myself. I've checked the docs for host.json, ServiceBusTrigger and the Run parameters but was unable to find a useful setting there.
Keep some sort of responsecode resultset and increase retry time, not ideal in a serverless scenario with multiple parallel functions.
Let API manager map 500 errors to 429 and reschedule those later, will probably work but since we send a lot of messages it will hammer the service for some time. In addition it's hard to distinguish between a temporary 500 error or a consecutive one.
Note that this question is not about deciding whether or not to trip the circuit breaker, only about how to carry out the appropriate action afterwards.
Additional info
Azure Functions V2, .NET Core 3.1, running on the Consumption plan
API Manager runs Basic SKU
Service Bus runs in premium tier
Messagecount: 300.000
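To make the "appropriate action afterwards" concrete, here is a minimal sketch of a circuit breaker whose state transitions fire callbacks. The `on_open` and `on_close` hooks are hypothetical placeholders for whatever pause/resume action is chosen (for example, changing the queue status); the sketch makes no Azure calls, and the threshold and timeout values are assumptions.

```python
import time


class CircuitBreaker:
    """Closed -> open after N consecutive failures; half-open after a timeout."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic, on_open=None, on_close=None):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock          # injectable so tests can control time
        self.on_open = on_open      # hypothetical hook: pause message intake
        self.on_close = on_close    # hypothetical hook: resume message intake
        self.state = "closed"
        self.failures = 0
        self.opened_at = None

    def allow_request(self):
        if self.state == "open":
            if self.clock() - self.opened_at >= self.reset_timeout:
                # Simplification: every caller may probe while half-open;
                # a fuller version would let through a single probe only.
                self.state = "half_open"
                return True
            return False
        return True

    def record_success(self):
        was_tripped = self.state != "closed"
        self.state = "closed"
        self.failures = 0
        if was_tripped and self.on_close:
            self.on_close()

    def record_failure(self):
        self.failures += 1
        if self.state == "half_open" or self.failures >= self.failure_threshold:
            already_open = self.state == "open"
            self.state = "open"
            self.opened_at = self.clock()
            if not already_open and self.on_open:
                self.on_open()
```

With several parallel function instances, the breaker state would have to live in shared storage rather than in process memory; this sketch does not address that.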
Our API app is in UAT on Azure, on an App Service plan (Standard 3 large). What should we do when app availability drops to zero? Responses become slow or time out. When I restart the application, it goes back to normal. (We are using asynchronous programming with async/await.)
How can we find the root cause of the slowness?
Ensure that Always On feature is enabled.
Such problems may be caused by application level issues, such as:
network requests taking a long time
application code or database queries being inefficient
application using high memory/CPU
application crashing due to an exception
You could enable web server diagnostics to fetch more details on the issue.
Detailed Error Logging - Detailed error information for HTTP status codes that indicate a failure (status code 400 or greater). This may contain information that can help determine why the server returned the error code.
Failed Request Tracing - Detailed information on failed requests, including a trace of the IIS components used to process the request and the time taken in each component. This can be useful if you are attempting to improve web app performance or isolate what is causing a specific HTTP error.
Web Server Logging - Information about HTTP transactions using the W3C extended log file format. This is useful when determining overall web app metrics, such as the number of requests handled or how many requests are from a specific IP address.
Also, Azure Application Insights collects telemetry from your application to help analyze its operation and performance. You can use this information to identify problems that may be occurring or to identify improvements to the application that would most impact users. This tutorial takes you through the process of analyzing the performance of both the server components of your application and the perspective of the client: https://learn.microsoft.com/en-us/azure/application-insights/app-insights-tutorial-performance
Ref: https://learn.microsoft.com/en-us/azure/app-service/app-service-web-troubleshoot-performance-degradation
I have timer-triggered Azure functions running in production, but now I want to be notified if the function fails.
In my case, access to various connected services can cause crashes, and there are many to troubleshoot. The crash is the type of error I need notification for.
When the function does fail, the log entry indicates failure, so I wonder if there is a hook in the system that would allow me to cause the system to generate a notification.
I know that blob and queue bindings, for instance, support the creation of poison queue entries, but timer trigger binding doesn't say anything about any trigger outputs of that nature.
I see that functions can pass their $return status as input to other functions, but that operation is not explained in depth in the docs. Also, in that case, I need to write another function to process the error status, and I was looking for something built-in.
I have inquired with #AzureSupport on this, but their answer had nothing to do with Azure Functions, instead referring me to DLL notification hooks, then recommending I file on uservoice.
I'm sure there must be people here who have implemented some sort of error status notification. I prefer a solution that doesn't require code.
The recommended way to monitor and alert on failures is to use AppInsights, which now integrates fully with Azure Functions:
https://blogs.msdn.microsoft.com/appserviceteam/2017/04/06/azure-functions-application-insights/
Since all the logs are available in AppInsights, it's easy to monitor for failures and set up alerts based on your own criteria.
However, if you only care about alerting and not things like monitoring etc, you could use Azure Monitor instead: https://learn.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-get-started
When the function does fail, the log entry indicates failure, so I wonder if there is a hook in the system that would allow me to cause the system to generate a notification.
...
I prefer a solution that doesn't require code.
This is a zero-code solution:
I poked #AzureFunctions once before on this topic, and a suggested response was to use Application Insights. It can handle the alerts upon failure and also can use webhooks.
See the Azure Functions App-Insights documentation on how to link your function app to App Insights. Then set up any alerts you want.
Unfortunately this hook doesn't exist.
Can you switch from a timer trigger to a queue trigger?
You can get retries (if you want them), and after the specified number of attempts the message is sent to a poison queue.
To schedule executions you can add queue messages with a visibility timeout to match your schedule.
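The retry-then-poison behavior described above can be sketched as an in-memory simulation. This is not the Azure Storage queue API; `process_queue` and its parameters are hypothetical names, and the default of 5 attempts is an assumption that should be checked against your own host configuration.

```python
from collections import deque


def process_queue(messages, handler, max_dequeue_count=5):
    """Retry each failing message; park it after max_dequeue_count attempts."""
    queue = deque((msg, 0) for msg in messages)
    done, poison = [], []
    while queue:
        msg, attempts = queue.popleft()
        attempts += 1
        try:
            handler(msg)
            done.append(msg)
        except Exception:
            if attempts >= max_dequeue_count:
                # Out of attempts: move to the poison list for inspection.
                poison.append(msg)
            else:
                # Put it back so it is picked up again later.
                queue.append((msg, attempts))
    return done, poison
```

Anything that lands in `poison` is the signal to alert on, which is the point of switching away from a timer trigger.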
In order to get alerts on failure you have two options:
A timer trigger that scans the execution logs (via SFTP) for failures.
Wrap the whole function in a try/catch block and in the catch block write a few lines to send you an email with the error details.
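The try/catch option can be sketched as a small wrapper. This is only an illustration in Python: `notify_on_failure` and the `notify` callback are hypothetical names, and the callback stands in for whatever actually sends the email.

```python
import functools
import traceback


def notify_on_failure(notify):
    """Decorator: call notify(subject, body) when the wrapped function raises."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                # Send the error details, then re-raise so the run is
                # still recorded as failed in the logs.
                notify(f"{func.__name__} failed: {exc!r}", traceback.format_exc())
                raise
        return wrapper
    return decorator
```

Re-raising after the notification matters: swallowing the exception would make the execution look successful.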
Hope this helps.
No code:
Go to your Azure account
From the menu, select Monitor
Then select Add New Rule
Then select your condition and action, and add the alert details.
I have a web application that interfaces with another application through a message queue. So, my web application has a service-activator that is bound to an inbound message-driven channel adapter; currently it is always listening for messages on the queue.
However, there may be times when it is desirable to turn that listening off without bouncing the application itself. For example, the queue may get a backlog of messages while the web application that is listening for these messages begins to have performance issues, and we want to isolate the application from the queue to help identify whether that is the source of the performance problem.
The bottom line is that we are trying to proactively look for ways to help our support staff when they need to diagnose potential inter-system issues, without necessarily having to bounce the servers for a configuration change.
Then if it is determined that the interface to the external system should be turned back on then we would want to be able to re-start the service activator.
Is anything like this possible? Or is there an approach that I'm not thinking of that would allow this type of runtime start/stop capability?
Yes, it is possible.
All endpoints in Spring Integration implement org.springframework.context.SmartLifecycle.
In addition, Spring Integration has a component for this purpose: the Control Bus.
So it is very simple:
<channel id="controlBusChannel"/>
<control-bus input-channel="controlBusChannel"/>
<service-activator input-channel="stopMyServiceActivatorChannel"
    output-channel="controlBusChannel" expression="'@myServiceActivator.stop()'"/>
<service-activator id="myServiceActivator" input-channel="myInputChannel"
    output-channel="myOutputChannel"/>
I am working with the Transient Fault Handling Application Block (TFHAB) to define a retry policy when interfacing with an Azure Database. I'm wondering if there is a way to invoke a throttling response in order to plan and handle likely production scenarios?
I could place an SQL command in a loop and run it until I invoke a response; however, presumably this is not considered to be "best practice"?
Can anyone suggest some practical ways in which I can test my transient error handling logic?
Look at combining Testing Transient errors in Azure and the list of error codes returned by Windows Azure SQL Database, and see if you can mock in the behavior that you are testing. Beyond unit testing I don't think that you are going to be able to 'simulate' errors because these errors are coming from SQL via the TDS protocol, which will be difficult to intercept. Your need would be a good candidate for a fork of the application block where you could inject a simulator.
I've managed to develop some code that allows you to reliably generate transient errors in Azure SQL Databases. You can find the code at: https://github.com/robdmoore/SQLAzureTransientDemo
My recommendation to you is dependency injection. To elaborate: your higher-level client code should not take a direct dependency on Azure. Put the client SDK behind an interface and pass that interface as a dependency to your higher-level client code. For production code, pass the actual SDK as the interface implementation; for your tests, pass a test implementation of the interface that can return any error code or response you want.
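A minimal sketch of that idea, in Python for brevity: the retry logic depends only on an interface, and the test passes in a fake that fails a set number of times. All names here (`Database`, `FlakyDatabase`, `TransientError`, `execute_with_retry`) are hypothetical and are not part of the Transient Fault Handling Application Block.

```python
import time
from typing import Callable, Protocol


class TransientError(Exception):
    """Stands in for a throttling or other transient database error."""


class Database(Protocol):
    def execute(self, sql: str) -> int: ...


class FlakyDatabase:
    """Test double: raises TransientError a fixed number of times, then succeeds."""

    def __init__(self, failures_before_success: int):
        self.remaining = failures_before_success
        self.calls = 0

    def execute(self, sql: str) -> int:
        self.calls += 1
        if self.remaining > 0:
            self.remaining -= 1
            raise TransientError("simulated throttling")
        return 1


def execute_with_retry(db: Database, sql: str, max_attempts: int = 3,
                       base_delay: float = 0.1,
                       sleep: Callable[[float], None] = time.sleep) -> int:
    """Retry on TransientError with a linearly growing delay."""
    for attempt in range(1, max_attempts + 1):
        try:
            return db.execute(sql)
        except TransientError:
            if attempt == max_attempts:
                raise  # retries exhausted: surface the error to the caller
            sleep(base_delay * attempt)
    raise ValueError("max_attempts must be at least 1")
```

Injecting `sleep` as well lets the test assert on the back-off delays without actually waiting.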