How do I simulate transient errors with Azure Database? - azure

I am working with the Transient Fault Handling Application Block (TFHAB) to define a retry policy when interfacing with an Azure Database. I'm wondering if there is a way to invoke a throttling response in order to plan and handle likely production scenarios?
I could place an SQL command in a loop and run it until I invoke a response; however, presumably this is not considered to be "best practice"?
Can anyone suggest some practical ways in which I can test my transient error handling logic?

Look at combining "Testing Transient errors in Azure" with the list of error codes returned by Windows Azure SQL Database, and see if you can mock the behavior that you are testing. Beyond unit testing, I don't think that you are going to be able to 'simulate' errors, because these errors come from SQL via the TDS protocol, which will be difficult to intercept. Your need would be a good candidate for a fork of the application block where you could inject a simulator.

I've managed to develop some code that allows you to reliably generate transient errors in Azure SQL Databases. You can find the code at: https://github.com/robdmoore/SQLAzureTransientDemo

My recommendation to you is dependency injection :) To elaborate, your higher-level client code should not take a direct dependency on Azure. Put the client SDK behind an interface and pass that interface as a dependency to your higher-level client code. For production code, pass the actual SDK as the interface implementation; for your tests, pass a test implementation of the interface that can return any error code or response you want.
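A minimal sketch of that idea follows, in Python rather than .NET purely for illustration. Every name here (`Database`, `TransientError`, `FlakyDatabase`, `run_with_retry`) is hypothetical and not part of TFHAB or any Azure SDK; the retry loop is a simplified stand-in for a real retry policy.

```python
from typing import Callable, Protocol


class TransientError(Exception):
    """Hypothetical stand-in for a throttling or connection-loss error."""


class Database(Protocol):
    """Interface the higher-level code depends on instead of the real SDK."""

    def execute(self, sql: str) -> str: ...


class FlakyDatabase:
    """Test double: raises TransientError a fixed number of times, then succeeds."""

    def __init__(self, failures: int) -> None:
        self.failures_left = failures
        self.calls = 0

    def execute(self, sql: str) -> str:
        self.calls += 1
        if self.failures_left > 0:
            self.failures_left -= 1
            raise TransientError("simulated throttling")
        return f"ok: {sql}"


def run_with_retry(
    db: Database,
    sql: str,
    max_attempts: int = 3,
    sleep: Callable[[float], None] = lambda seconds: None,
) -> str:
    """Retry on TransientError only; re-raise once max_attempts is reached."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return db.execute(sql)
        except TransientError:
            if attempt == max_attempts:
                raise
            # Linear back-off; the sleep function is injected so tests do not wait.
            sleep(0.1 * attempt)
    raise AssertionError("unreachable")
```

In a test, `FlakyDatabase(failures=2)` lets you check that the third attempt succeeds, and `FlakyDatabase(failures=5)` lets you check that the error surfaces after three attempts.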

Related

Splitting up Azure Functions without creating new function app

Our existing system uses App Services with API controllers.
This is not a good setup because our scaling support is poor; it's basically all or nothing.
I am looking at changing over to use Azure Functions.
So effectively each method in a controller would become a new function.
Let's say that we have a taxi booking system.
So we have the following:
Taxis
    GetTaxis
    GetTaxiDrivers
Drivers
    GetDrivers
    GetDriversAvailableNow
In the App Service approach we would simply have a TaxiController and a DriverController with the methods as routes.
How can I achieve the same thing with Azure Functions?
Ideally, I would have 2 function apps, Taxis and Drivers, with functions inside for each.
The problem with that approach is that 2 function apps means 2 config settings, and if that is expanded throughout the system it's far too big a change to make right now.
Some of our routes are already quite long, so I can't really add the "controller" name to my function name because I will exceed the 32 character limit.
Has anyone had similar issues migrating from App Services to Azure Functions?
Paul
"The problem with that approach is that 2 function apps means 2 config settings, and if that is expanded throughout the system its far too big a change to make right now"
This is why application settings should be part of the release process. You should compile once, then deploy as many times as you want, to different environments, using the same binaries from that build. If you're not there yet, I strongly recommend you start by automating the CI/CD pipeline.
Now, answering your question: the proper way (IMHO) is to decouple taxis and drivers. When a taxi is requested, your controller should add a message to a queue. An Azure Function listening to that queue is triggered automatically to dequeue and process whatever needs to be processed.
Advantages:
Your controller response time will get faster as it will pass the processing to another process
The more messages there are in the queue, the more instances of the function are started to consume them, so it will scale only when needed.
HTTP requests (from one controller to another) are not reliable unless you properly implement a circuit breaker and a retry policy. With the proposed architecture, if something goes wrong, the message will remain in the queue, or it won't get completed by the Azure Function and will return to the queue.
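The hand-off described above can be sketched with an in-memory queue. This is only an illustration in Python: the `deque` stands in for a real queue service, and `Message`, `request_taxi`, `drain`, and `max_deliveries` are hypothetical names, not Azure APIs.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Message:
    customer: str
    deliveries: int = 0  # how many times the message has been handed to a worker


def request_taxi(queue: Deque[Message], customer: str) -> str:
    """Controller side: enqueue the work and respond immediately."""
    queue.append(Message(customer))
    return "accepted"


def drain(
    queue: Deque[Message],
    handler: Callable[[Message], None],
    max_deliveries: int = 3,
) -> List[Message]:
    """Worker side: process messages until the queue is empty.

    A message whose handler raises goes back on the queue. After
    max_deliveries failed attempts it is returned to the caller instead,
    as a stand-in for a dead-letter queue.
    """
    dead: List[Message] = []
    while queue:
        msg = queue.popleft()
        msg.deliveries += 1
        try:
            handler(msg)
        except Exception:
            if msg.deliveries >= max_deliveries:
                dead.append(msg)
            else:
                queue.append(msg)  # not completed, so it becomes available again
    return dead
```

The controller returns as soon as the message is enqueued, and a failing handler does not lose the message: it is retried until it succeeds or reaches the delivery limit.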

Azure Function with ServiceBusTrigger circuit breaker pattern

I have an Azure function with ServiceBusTrigger which will post the message content to a webservice behind an Azure API Manager. In some cases the load of the (3rd party) webserver backend is too high and it collapses returning error 500.
I'm looking for a proper way to implement circuit breaker here.
I've considered the following:
Disable the Azure Function, but it might result in data loss due to multiple messages in memory (serviceBus.prefetchCount)
Implement API Manager with a rate-limit policy, but this seems counterproductive as it runs fine in most cases
Re-architecting the 3rd party webservice is out of scope :)
Set the queue to ReceiveDisabled. This is the preferred solution, but it results in my InputBinding throwing a huge number of MessagingEntityDisabledExceptions which I'm (so far) unable to catch and handle myself. I've checked the docs for host.json, ServiceBusTrigger and the Run parameters but was unable to find a useful setting there.
Keep some sort of response code result set and increase the retry time; not ideal in a serverless scenario with multiple parallel functions.
Let API Manager map 500 errors to 429 and reschedule those later. This will probably work, but since we send a lot of messages it will hammer the service for some time. In addition, it's hard to distinguish between a temporary 500 error and a persistent one.
Note that this question is not about deciding whether or not to trigger the circuit breaker, merely about how to handle the appropriate action afterwards.
Additional info
Azure Functions V2, .NET Core 3.1, running on the Consumption plan
API Manager runs Basic SKU
Service Bus runs in premium tier
Message count: 300,000

How to find/cure source of function app throughput issues

I have an Azure function app triggered by an HttpRequest. The function app reads the request, tosses one copy of it into a storage table for safekeeping and sends another copy to a queue for further processing by another element of the system. I have a client running an ApacheBench test that reports approximately 148 requests per second processed. That rate of processing will not be enough for our expected load.
My understanding of function apps is that they should spawn as many instances as are needed to handle the load sent to them. But this function app might not be scaling out quickly enough, as it's only handling that 148 requests per second. I need it to handle at least 200 requests per second.
I’m not 100% sure the problem is on my end, though. In analyzing the performance of my function app I found a LOT of 429 errors. What I found online, particularly https://learn.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-request-limits, suggests that these errors could be due to too many requests being sent from a single IP. Would several ApacheBench 10K and 20K request load tests within a given day cause the 429 error?
However, if that’s not it, if the problem is with my function app, how can I force my function app to spawn more instances more quickly? I assume this is the way to get more throughput per second. But I’m still very new at working with function apps so if there is a different way, I would more than welcome your input.
Maybe the Premium app service plan that’s in public preview would handle more throughput? I’ve thought about switching over to that and running a quick test but am unsure if I’d be able to switch back?
Maybe EventHub is something I need to investigate? Is that something that might increase my apparent throughput by catching more requests and holding on to them until the function app could accept and process them?
Thanks in advance for any assistance you can give.
You don't provide much context about your app, but here are a few steps you can take to improve it:
If you want more control, you need to use an App Service plan with Always On to avoid cold starts. You will also need to configure autoscaling, since you are responsible for it in this plan and autoscale is not enabled by default in an App Service plan.
Your Azure Function must be fully async, as you have external dependencies and you don't want to block a thread while you are calling them.
Look at the limits. You can tweak them using host.json.
A 429 error means that the function is too busy to process your request, so when you write to the table you are probably not using async and are blocking a thread.
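To illustrate why non-blocking calls matter, here is a small sketch in Python's asyncio (the original context is .NET async/await, so this is an analogy rather than the function's actual code). `write_to_table` and `send_to_queue` are hypothetical stand-ins that simulate I/O latency with `asyncio.sleep`.

```python
import asyncio
import time
from typing import List


async def write_to_table(payload: str) -> str:
    await asyncio.sleep(0.05)  # simulated storage latency
    return f"table:{payload}"


async def send_to_queue(payload: str) -> str:
    await asyncio.sleep(0.05)  # simulated queue latency
    return f"queue:{payload}"


async def handle_request(payload: str) -> List[str]:
    # Both I/O calls are awaited together, so no thread sits blocked
    # while waiting for either dependency to respond.
    results = await asyncio.gather(write_to_table(payload), send_to_queue(payload))
    return list(results)


async def handle_many(count: int) -> float:
    """Run `count` requests concurrently and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(handle_request(f"req{i}") for i in range(count)))
    return time.perf_counter() - start
```

Because every wait yields control instead of blocking, many requests overlap on a single thread, and the total time for a batch stays close to the latency of one request rather than growing with the number of requests.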
Function apps work very well and scale as advertised. The problem could be that the requests are coming from a single IP and Azure could be treating them as a DDoS attack. You can do the following:
Azure DevOps Load Test
You can load test using one of the Azure services. I am very sure they have better criteria for handling IPs. See Azure DevOps Load Test.
Provision a VM in Azure
What I normally do is provision a VM (Windows 10 Pro) in Azure and use JMeter to load test. I have used this method to test and it works fine. You can provision a couple of them and subdivide the load.
Use professional load testing services
If possible, you may use services like Loader.io. They use sophisticated algorithms to run the load test and provision a bunch of VMs to run the same test.
Use Application Insights
If you are not already, you should be using Application Insights to get a better look from the server's perspective. Go to the live stream and see how many instances it provisions to handle the load test. You can easily look into events and error logs that may be arising and investigate. You can deep dive into each associated dependency and investigate the problem.

Transactions in NServicebus using Azure Service Bus Transport

I have several message handlers in a particular endpoint that do their work against a SQL Azure database (at the moment still using a local SQL 2012 instance). I have a command handler that publishes 2 events, call them X and Y. In the same endpoint I have a subscriber to X and a subscriber to Y. Both of these subscribers are internally using the same data access component, call that Z. Dependency injection is configured on a per-call basis, not shared.
Component Z is using Entity Framework 6 under the curtains. The issue I am having is that just opening the database is throwing a SqlException and complaining about MSDTC escalations.
I have temporarily wrapped the handlers in a TransactionScope.Suppress and that has stopped the error but I believe I'm missing something more fundamental.
Is it a simple matter of configuring the endpoint to be non-transactional? I would have thought this would just work seeing as I've configured to use Azure Service Bus as the transport mechanism. If I do this will NServiceBus still retry if an exception is thrown within the message handler? (Up to the SLR limits -- not part of the question, I also understand the idempotency issues).
@Phil,
First, you shouldn't be using MSDTC with SQL Azure - it's not supported. The feature has been suggested, but it is only under review. DTC is not supported on Azure. Alternatively, you could look into the following suggestion to use the SqlTransaction approach.
Second, the transport you're using has nothing to do with your data access. Since you're using Azure Service Bus, it will not be part of your handler code. Making a handler transactional forces an atomic change or rollback. NServiceBus will retry regardless of whether your handler is transactional. The challenge is that when the handler/endpoint is not transactional, and within the handler the first write to the DB succeeds and the second fails, the first write won't be reverted. As for Azure Service Bus as a transport, it's not transactional in its nature (i.e. no DTC).
Which version of NServiceBus.Azure are you on? Do you have a stack trace of the exception? Where does it come from?
We push the sends and publishes out of the scope of the receive transaction scope explicitly to prevent promotion to the DTC, so that the transaction is local to the sql, so I doubt that is what is happening here.
From your description it looks like you are using a different data access instance for each handler (per-call container config) and you have multiple handlers on the same message. If both of these open a new connection to SQL, you would see promotion as well (even if it is the same server).
Could that be it? That it throws on the second open?

NServicebus: Programmatic reading of error queue

I’m currently building an application using NServicebus and Azure.
The regular processes are working, but now I’d like to do more about the management and monitoring aspect of the application.
The customer wants to see a dashboard where he can see the health of the application and also be able to correct issues.
What I’d like to do is:
Detect when things are sent to an error queue (to be able to send an alert to an admin)
Allow admin to handle messages on error queue from management application, without resorting to the provided command line tool.
Is there a way to programmatically do error handling in NServicebus? I know which errors are transient and which errors might need manual intervention.
Is it possible to plug custom logic into the error handling of NServiceBus?
Is it possible to handle messages on the error queue programmatically?
Thanks,
Erwin
Regarding "dashboard where he can see the health of the application and also be able to correct issues":
Please take a look at ServicePulse (http://particular.net/ServicePulse) for production and online monitoring.
This provides both endpoint health indicators and Failed message indicators (including "Retry" capabilities).
For advanced debugging and visualization of your process you should also consider ServiceInsight (http://particular.net/ServiceInsight).
Behind the scenes of ServicePulse there's the ServiceControl server, which exposes a REST HTTP API with programmatic access to audited and error messages.
HTH,
Danny.

Resources