Why do we need Transient Fault Handling for storage? - Azure

I saw a thread saying the storage library already has a retry policy, so why should we use this Transient Fault Handling block?
Can anyone show me some samples of how to use the Transient Fault Handling block for my blob and Table storage properly?
Thanks!

For implementation examples, read the Key Scenarios section of the link you posted. If you aren't having problems connecting, then don't implement it. We use it, but it hasn't helped as far as we know: every fault we've encountered has been a longer-term, Azure-internal network issue that caused faults the TFHB couldn't handle.

One reason (and the reason I use it in my application) is that the Transient Fault Handling Application Block provides retry logic not only for storage (tables, blobs and queues) but also for SQL Azure and Service Bus Queues. If your project makes use of these additional resources (namely SQL Azure and Service Bus Queues) and you want a single library to handle transient faults, I would recommend using this over the storage client library.
Another reason I would give for using this library is its extensibility. You could extend it to handle other error scenarios (not covered by the storage client library retry policies) or use it against other web resources like the Service Management API.
If you're just using blob and table storage, you could very well use the retry policies which come with the storage client library.
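For a feel of what the block does: the TFHB separates a detection strategy (is this error transient?) from a retry strategy (how long to wait before trying again). The block itself is .NET; below is a language-neutral sketch of the idea in Python, with illustrative error codes and names, not the block's actual API:

```python
import time

class TransientErrorDetectionStrategy:
    """Decides whether an error is transient (worth retrying).
    Mirrors the TFHB notion of a detection strategy; the error
    codes here are illustrative, not the block's actual list."""
    TRANSIENT = ("ServerBusy", "OperationTimedOut", "InternalError")

    def is_transient(self, exc):
        return getattr(exc, "error_code", None) in self.TRANSIENT

def execute_with_retry(action, detector, max_retries=3, base_delay=0.5):
    """Retry `action` with exponential backoff while the failures
    are judged transient; re-raise anything else immediately."""
    attempt = 0
    while True:
        try:
            return action()
        except Exception as exc:
            attempt += 1
            if attempt > max_retries or not detector.is_transient(exc):
                raise
            # back off: base_delay, 2*base_delay, 4*base_delay, ...
            time.sleep(base_delay * (2 ** (attempt - 1)))
```

The real block wires the equivalent of `execute_with_retry` into a `RetryPolicy` object you call `ExecuteAction` on; the split between detection and retry strategy is what makes it reusable across storage, SQL Azure and Service Bus.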

I don't use the Transient Fault Handling block for blob storage; it may be more applicable to table storage or when transmitting larger chunks of data. Given that I use blob storage containers to archive debug information (in the form of short txt files) on certain areas of the site, it seems a little convoluted. I've never once witnessed a failure writing to storage, and we write tens of thousands of logs a week. Of course, different usage of storage may yield different reliability.

For tables and blobs you do not need to use any external transient retry blocks, afaik. The retry policies implemented within the SDK are fairly robust. If you think you need a special retry policy, the way to do that is to implement your own policy via the Azure Storage IRetryPolicy interface and pass it to your storage requests through the TableRequestOptions.RetryPolicy property.
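As a rough illustration of the shape of such a policy: the .NET interface essentially asks, given the retry count so far and the last outcome, whether to retry and how long to wait. A Python sketch of that shape (the class name, status codes and defaults below are illustrative, not the SDK's API):

```python
import random

class LinearRetryPolicy:
    """Sketch of the shape of a custom retry policy: given the
    current retry count and the HTTP status of the last attempt,
    decide (should_retry, delay_seconds). Illustrative only."""
    RETRYABLE = {408, 429, 500, 503}

    def __init__(self, delay_seconds=2.0, max_attempts=3):
        self.delay_seconds = delay_seconds
        self.max_attempts = max_attempts

    def should_retry(self, current_retry_count, status_code):
        if current_retry_count >= self.max_attempts:
            return (False, 0.0)
        if status_code in self.RETRYABLE:
            # add jitter so concurrent clients don't retry in lockstep
            return (True, self.delay_seconds + random.uniform(0, 0.5))
        return (False, 0.0)
```

In the .NET SDK you would attach the real equivalent to a request via `TableRequestOptions.RetryPolicy`; the key design point is that the policy only decides, while the client executes.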

Related

Dedicated or shared Storage Account for Azure Function Apps with names shorter than 32 characters

Short Version
We want to migrate to v4, and our app names are shorter than 32 characters.
Should we migrate to dedicated Storage Accounts or not?
Long Version
We use Azure Functions v3. From the start, one Storage Account has been shared between 10+ Azure Function Apps. It may have been luck, but the names are shorter than 32 characters and that is not going to change. We are not using slots, as they were initially not recommended and were then made generally available with no adoption time or updated recommendation.
Pre-question research revealed this question, but it looks more related to Durable Functions. Another question looks more on point but is outdated, and the accepted answer states that one Storage Account can be used.
Firstly, the official documentation has a page with storage considerations that states (props to ijabit for pointing to it):
It's possible for multiple function apps to share the same storage account without any issues. For example, in Visual Studio you can develop multiple apps using the Azure Storage Emulator. In this case, the emulator acts like a single storage account. The same storage account used by your function app can also be used to store your application data. However, this approach isn't always a good idea in a production environment.
Unfortunately it does not elaborate further on the rationale behind the last sentence.
The page with best practices for Azure Function mentions:
To improve performance in production, use a separate storage account for each function app. This is especially true with Durable Functions and Event Hub triggered functions.
To my greater confusion there was a subsection on this page that said "Avoid sharing storage accounts". But it was later removed.
This issue is superficially related to the question, as it mentions the recommendation in the thread.
Secondly, we had contacted Azure Support about different issues unrelated to this question, and two different support engineers shared different opinions on the current issue. One said that we can share a Storage Account among Function Apps, and the other said that we should not. So the recommendation from support was mixed.
Thirdly, we want to migrate to v4 and in the migration notes it is stated:
Function apps that share storage accounts will fail to start if their computed hostnames are the same. Use a separate storage account for each function app. (#2049)
Digging deeper into the topic, the only concrete issue is the collision of the function host names used to obtain the lock, which was known as far back as Oct 2017. One can follow the thread and see how in Jan 2020 a recommendation was made to update the official Azure naming guidance, but that was only done in late Nov 2021. I also see that a non-intrusive solution, i.e. without renaming, is to manually set the host ID. The two arguments raised by balag0 are: single point of failure and better isolation. They sound good from the perspective of cleaner architecture, but pragmatically I personally find Storage Accounts reliable, especially if you read about redundancy or consider that MS is dog-fooding them for other services. So it looks more like a backbone of Azure to me.
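To illustrate why names shorter than 32 characters matter: as I understand the linked issue, the default host ID is derived from the function app name, roughly by lowercasing it and truncating to 32 characters. A simplified sketch (the app names are made up, and the real derivation in the Functions host sanitizes further):

```python
def default_host_id(app_name: str) -> str:
    """Approximation of how the Functions runtime derives a default
    host ID: lowercase the app name and truncate to 32 characters.
    Simplified; the runtime also trims characters like a trailing '-'."""
    return app_name.lower()[:32].rstrip("-")

# Two apps whose names agree on the first 32 characters collide
# (hypothetical names for illustration):
a = default_host_id("myproject-orders-processing-func-east")
b = default_host_id("myproject-orders-processing-func-west")
```

Here `a == b`, so both apps would compute the same host ID and fight over the same lock blob in a shared Storage Account. Manually setting the host ID, as mentioned above, avoids the collision without renaming.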
Finally, as we want to migrate to v4, should we migrate to dedicated Storage Accounts or not?
For the large project with 30+ Azure Functions I work on, we have gone with dedicated Storage Accounts. The reason is Azure Storage account service limits. As the docs mention, this really comes into play with Durable Functions, but it can also come into play in other high-volume scenarios. There's a hard limit of 20k requests per second on a Storage Account. Hit that limit, and requests will fail with HTTP 429 responses, which means your Azure Function invocation fails too. We're running some high-volume scenarios and ran into this.
It can also cause problems with Durable Task Functions if two functions have the same TaskHub ID in host.json. This causes a collision when Durable Task Framework does its internal bookkeeping using Storage Queues and Table Storage, and there's lots of pain and agony as things fail in spectacular fashion.
Note that the 20k requests per second service limit can be raised with a support ticket to Azure. If approved, the max they'll raise it to is 50k requests/second.
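When you do get throttled, the 429 responses typically carry a Retry-After header. As a rough, language-neutral sketch of how a client might back off when it hits the limit (the `request` callable and its (status, headers, body) return shape are illustrative, not an SDK API):

```python
import time

def call_with_throttle_backoff(request, max_attempts=5):
    """Retry a storage call when throttled (HTTP 429/503),
    honoring the Retry-After header when the service sends one,
    otherwise falling back to exponential backoff."""
    for attempt in range(max_attempts):
        status, headers, body = request()
        if status not in (429, 503):
            return status, body
        retry_after = float(headers.get("Retry-After", 2 ** attempt))
        time.sleep(retry_after)
    raise RuntimeError("still throttled after %d attempts" % max_attempts)
```

Backoff only smooths over brief spikes, though; if you are sustainably above the account limit, splitting into dedicated Storage Accounts is the real fix.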
So avoid the potential headaches and go with a Storage Account per Function.

How can I find the source of my Hot LRS Write Operations on Azure Storage Account?

We are using an Azure Storage account to store some files that are downloaded by our app on the user's demand.
Even though there should be no write operations (at least none I can think of), we are exceeding the included write operations just a few days into the billing period (see image).
Price-wise it's still within limits, but I'd still like to know whether this is normal and how I can analyze the matter. Besides the storage we are using
Functions and
App Service (mobile app)
but none of them should cause that many write operations. I've checked the logs of our Functions, and none of those that access the queues or the blobs have been active lately. There are some functions that run every now and then, but only once every few minutes, and those do not access the storage at all.
I don't know if this is related, but there is a kind of periodic ingress on our blob storage (see the image below). The period is roughly 1 h, but there is a baseline of 100 kB per 5 min.
Analyzing the metrics of the storage account further, I found that there is a constant stream of 1.90k transactions per hour for blobs and 1.3k transactions per hour for queues, which seems quite exceptional to me. (Please note that the resolution of this graph is 1 h, while the former has a resolution of 5 minutes.)
Is there anything else I can do to analyze where the write operations come from? It kind of bothers me, since it does not seem as if it's supposed to be like that.
I've had the exact same problem; after enabling Storage Analytics and inspecting the $logs container, I found many log entries indicating that upon every request towards my Azure Functions, these write operations occur against the following container object:
https://[function-name].blob.core.windows.net:443/azure-webjobs-hosts/locks/linkfunctions/host?comp=lease
In my Azure Functions code I do not explicitly write to any container or file as such, but I have the following two Application Settings configured:
AzureWebJobsDashboard
AzureWebJobsStorage
So I filed a support ticket in Azure with the following questions:
Are the write operations triggered by these application settings? I believe so, but could you please confirm?
Will the write operations stop if I delete these application settings?
Could you please describe, at a high level, in what context these operations occur (e.g. logging? resource locking? other?)
and I got the following answers from Azure support team, respectively:
Yes, you are right. According to the logs information, we can see “https://[function-name].blob.core.windows.net:443/azure-webjobs-hosts/locks/linkfunctions/host?comp=lease”.
This azure-webjobs-hosts folder is associated with the function app and is created by default when the function app is created. When the function app is running, it records these logs in the storage account configured with AzureWebJobsStorage.
You can't stop the write operations, because these operations record logs necessary to the Azure Functions runtime in the storage account it uses. Please do not remove the application setting AzureWebJobsStorage. The Azure Functions runtime uses this storage account connection string for all functions except HTTP-triggered functions; removing this application setting will leave your function app unable to start. By the way, you can remove AzureWebJobsDashboard; that will stop the Monitor feature, but not the operations above.
These operations record runtime logs of the function app. They occur when the backend allocates an instance to run the function app.
The best place to find information about storage usage is Storage Analytics, especially Storage Analytics Logging.
There's a special blob container called $logs in the same storage account which will have detailed information about every operation performed against that storage account. You can view the blobs in that blob container and find the information.
If you don't see this blob container in your storage account, then you will need to enable storage analytics on your storage account. However considering you can see the metrics data, my guess is that it is already enabled.
Regarding the source of these write operations, have you enabled diagnostics for your Functions and App Service? Those write diagnostics logs to blob storage. Also, storage analytics itself writes to the same account, and that will also cause write operations.
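Once you have downloaded a few blobs from $logs, tallying operation types is a quick way to find the culprit. Storage Analytics v1.0 log entries are semicolon-delimited, with the operation type in the third field; here is a small sketch (the set of write operations is an illustrative subset, and you should verify the field order against the log format docs for your analytics version):

```python
from collections import Counter

# Illustrative subset of operation names that count as writes
WRITE_OPS = {"PutBlob", "PutBlock", "PutBlockList", "PutMessage",
             "SetBlobMetadata", "CreateContainer"}

def tally_operations(log_lines):
    """Count operation types in Storage Analytics v1.0 log lines.
    Fields are semicolon-delimited; the operation type is the third
    field, after the version and the request start time."""
    counts = Counter()
    for line in log_lines:
        fields = line.split(";")
        if len(fields) > 2:
            counts[fields[2]] += 1
    return counts
```

Sorting the resulting counter and intersecting it with something like `WRITE_OPS` usually points straight at whichever component (runtime lease renewals, diagnostics, analytics itself) is generating the billed write operations.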
In my case, I had an Azure App Insights instance that generated 10K transactions per minute on its storage for Functions and App Services, even though there were only a few HTTPS requests among them. I'm not sure what triggered them, but once I removed App Insights, everything went back to normal.

Azure EventHubs EventProcessorHost tries to access Azure storage queue

After enabling App Insights on a WebJob which listens for events on an EventHub using the EventProcessor class, we see that it continuously tries to access a set of non-existent queues in the configured blob storage account. We have not configured any queues on this account.
There's no reference to a queue anywhere in my code, and it is my understanding that the EventProcessorHost uses blob storage and not queues in order to maintain state. So: Why is it trying to access queues?
The queue access that you're seeing comes from the JobHost itself, not from any specific trigger type like EventHubs. The WebJobs SDK uses some storage resources itself behind the scenes for its own operation, e.g. control queues to track its own work, blobs for storage of log information shown in the Dashboard, etc.
In the specific case you mention above, those control queues that are being accessed are part of our Dashboard Invoke/Replay/Abort support. We have an open issue here in our repo tracking potential improvements we can make in this area. Please feel free to chime in on that issue.

About Windows Azure blob storage: the implementation of a project should not depend on the cloud platform

We plan to migrate our existing website to Windows Azure, and I have been told that we need to store files in blob storage.
My questions is:
If we want to use blob storage, that means I need to rewrite the file storage function (we use the file system for now) to call the blob service API to store files. That seems very strange to me, just because we want to use Windows Azure. What if in the future we want to use Amazon EC2 or another cloud platform? They might have their own way to store files, and then maybe I'd need to rewrite the file storage function again. In my opinion, the implementation of a project should not depend on the cloud platform (or cloud server)! Can anybody correct me? Thanks!
I won't address the commentary about whether an app should have a dependency on a particular cloud environment (or specific ways to deal with that particular issue), as that's subjective and it's a nice debate to have somewhere else. What I will address is the actual storage in Azure, as your info is a bit out-of-date.
One reason to use blob storage directly (and possibly the reason you were told to use blob storage) is that it provides access from multiple instances of your app. Also, blob storage provides 500TB of storage per storage account, and it's triple-replicated within the deployed region (and optionally geo-replicated). With attached storage (either with local disk or blob-backed Azure Disk), the access is specific to a particular instance of your app. Shifting from file system access to blob storage access does require app modification.
If you choose not to modify your app's file I/O operations, then you can also consider the new Azure File Service, which provides SMB access to storage (backed by blob storage). Using File Service, your app would (hopefully) not need to be modified, although you might need to change your root path.
More information on Azure File Service may be found here.
Why does it seem strange? You need to store your files somewhere, and the cloud is as good a place as any if it suits your needs. The obvious advantages are redundancy and geo-replication, and sharing files across multiple projects and servers; the list goes on. It's difficult to advise on whether it would be a good idea or not without hearing some specifics.
You could use Windows Azure storage with Amazon in the future if you wanted to (you'd just need to set up the access for it), obviously with slightly longer latency. Then again, that slight performance drop may be significant, and you may end up rewriting it.
Most importantly, swapping from one cloud provider to another is not trivial, depending on just how much you use it or how much data you've got in it, so I would strongly suggest looking closely at the advantages and disadvantages of each platform before putting your lot in with either one, and then fully learning that platform.
Personally, I went for Azure cloud services + storage etc., even though it was slightly more expensive at the time, because I'm a Microsoft person (not that I didn't do my research). It was annoying in the early days when key features were missing, but it has really matured now, and I like the pace at which it's improving.
It's cheap to test, why not try both and see which one suits you? A small price to pay when you have big decisions to make.
Disclaimer: I don't know the current state of Amazon web services.
Nice question. We are in the middle of a migration of an old PHP/MySQL/LocalShare ERP application to WebRole/SQLAzure/AzureStorage. We faced the same problem and decision. Let me write some thoughts about the issue:
It is a good option to be able to just switch the storage provider, but is it reasonable? You can always build the abstraction, but do you plan how to do the actual change of storage provider - migration/sync while in production? What kind of argument would actually drive the transition to another storage provider? How many users and how much data do you have? Do you plan to shard or rebalance the storage in the future? How reliable must this system be during the storage provider switch? Do you want to move the data completely when you switch, or just shard it so that you start using the different provider? Does the development cost of these (reliable) storage layers, plus the cost of developing reliable transitions (or bi-directional syncs), outweigh the price difference between any two storage providers?
Just switching the storage mechanism from Azure Blob to Amazon will incur a heavy latency penalty if your other services are on Azure - when you create Storage and Services on Azure, you set affinity groups by region precisely to minimize network latency.
These are only a few of the questions to answer before doing all the heavy lifting. We have abstracted the file repository (blob) because we planned to move from a local NFS to Blob transparently and gradually, and it answers our needs.

Is it possible to create a public queue in Windows Azure?

In Windows Azure it's possible to create public Blob Container. Such a container can be accessed by anonymous clients via the REST API.
Is it also possible to create a publicly accessible Queue?
The documentation for the Create Container operation explains how to specify the level of public access (with the x-ms-blob-public-access HTTP header) for a Blob Container. However, the documentation for the Create Queue operation doesn't list a similar option, leading me to believe that this isn't possible - but I'd really like to be corrected :)
At this time, Azure Queues cannot be made public.
As you have noted, this "privacy" is enforced by requiring all Storage API calls regarding queues to be authenticated with a request signed with your key. There is no "public" concept similar to public containers in the blob store.
This follows best practice, in that even in the cloud you would not want to expose the internals of your infrastructure to the outside world. If you want this functionality, you can expose a very thin/simple "layer" app on top of queues: a simple WCF REST app in a web role could expose the queuing operations to your consumers while handling the signing of API requests internally, so the queues would not need to be public.
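To sketch the idea of such a thin layer (illustrative names; an in-memory queue stands in for the private Azure queue, and a real facade would make authenticated REST calls with the account key kept server-side):

```python
import queue

class PublicQueueFacade:
    """Thin public layer in front of a private queue. Clients call
    enqueue() without any storage credentials; the facade validates
    input and performs the authenticated call internally."""

    MAX_MESSAGE_BYTES = 64 * 1024  # Azure queue message size limit

    def __init__(self, backing_queue):
        self._queue = backing_queue  # credentials stay server-side

    def enqueue(self, message: str) -> bool:
        # Validate before touching the private queue
        if not message or len(message.encode()) > self.MAX_MESSAGE_BYTES:
            return False
        self._queue.put(message)  # real impl: signed REST request
        return True
```

The design point is that the facade is the only component holding the storage key, so "public" access never means public queues.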
You are right, Azure storage queues aren't publicly accessible the way blobs are (via URIs). However, you may still be able to achieve a publicly consumable messaging infrastructure with the AppFabric Service Bus.
I think the best option would be to set up a worker role and provide access to the queue publicly in that manner, maybe with AppFabric Service Bus for extra connectivity/interactivity with external sources.
Otherwise, it's not really clear what the scope might be. The queue itself appears to be locked away at this time. :(