About windows azure blob storage, the implementation of a project should not depends on the cloud platform - azure

We plan to migrate the existing website to Windows azure, and i have been told that we need to store files to blob storage.
My questions is:
If we want to use blob storage, that means i need to re-write the file storage function(we use file system for now), call blob service api to store files, that's very strange for me just because we want to use windows azure, how about in the future we want to use Amazon EC2 or other cloud platform, they might have there own way to store file, then may be i need to re-write the file storage function again, in my opinion , the implementation of a project should not depends on the cloud platform(or cloud server)! Can any body correct me, thanks!

I won't address the commentary about whether an app should have a dependency on a particular cloud environment (or specific ways to deal with that particular issue), as that's subjective and it's a nice debate to have somewhere else. What I will address is the actual storage in Azure, as your info is a bit out-of-date.
One reason to use blob storage directly (and possibly the reason you were told to use blob storage) is that it provides access from multiple instances of your app. Also, blob storage provides 500TB of storage per storage account, and it's triple-replicated within the deployed region (and optionally geo-replicated). With attached storage (either with local disk or blob-backed Azure Disk), the access is specific to a particular instance of your app. Shifting from file system access to blob storage access does require app modification.
If you choose not to modify your app's file I/O operations, then you can also consider the new Azure File Service, which provides SMB access to storage (backed by blob storage). Using File Service, your app would (hopefully) not need to be modified, although you might need to change your root path.
More information on Azure File Service may be found here.

Why does it seem strange? You need to store your files somewhere and the cloud is a good a place as any IF it suits your needs. The obvious advantages are redundancy and geo replication, sharing files across multiple projects and servers, The list goes on. It's difficult to advise on whether it would be a good idea or not without hearing some specifics.
You could use windows azure storage with amazon in the future if you wanted to (you'd just need to set up the access for it), obviously with slighter longer delay. Then again that slight performance drop may be significant and you may end up re-writing it.
Most importantly, swapping over from one cloud provider to another is not trivial depending on just how much you use it or how much data you've got in it, so I would strongly suggest looking at the advantages / disadvantages of each platform closely before putting your lot in with either one and then fully learn that platform.
Personally, I went for Azure cloud services + storage etc even though it was slightly more expensive at the time, because i'm a Microsoft Person (not that I didn't do my research). It was annoying in the early days when key features were missing, but it's really matured now and I like the pace that it's improving.
It's cheap to test, why not try both and see which one suits you? A small price to pay when you have big decisions to make.
Disclaimer: I don't know the current state of Amazon web services.

Nice question. We are in the middle of a migration of an old PHP/MySQL/LocalShare to WebRole/SQLAzure/AzureStorage ERP application. We faced the same problem and decision. Let me write some thoughts about the issue :
It is a good option to just be able to switch the storage provider but is it reasonable? You can always build the abstraction but do you plan how to do the actual change of storage provider - migration/sync while in production? What kind of argument will exactly drive the transition to another storage provider? How much users and data do you have? Do you plan to shard-rebalance the storage in the future? How reliable must be this system during this storage provider switch? Do you want to totally move the data when you want to switch or you just want to shard it so that you start using this different provider? Does the cost development of these (reliable) storage layers and the cost of development of reliable transitions (or bi-directional syncs) outweighs the money difference between any two storage providers?
Just switching storage mechanism from Azure Blob to Amazon will incur heavy latency penalty if your other services are on Azure - When you create Storage and Services on Azure you set affinity groups by region so that you minimize the network latency.
These are only a few of the questions to answer before doing all the weightlifting. We have abstracted the file repository (blob) because we planned to move from local NFS to Blob transparently and gradually and it answers our needs.

Related

Dedicated or shared Storage Account for Azure Function Apps with the names less than 32 symbols

Short Version
We want to migrate to v4 and our app names are less than 32 symbols.
Should we migrate to dedicated Storage Accounts or not?
Long Version
We use Azure Functions v3. From start one Storage Account was shared between 10+ Azure Function Apps. It could be by luck but the names are less than 32 symbols and it is not going to change. We are not using slots as they were initially not recommended and then with no adoption time or recommendation made generally available.
Pre-question research revealed this question but it looks like more related to the durable functions. Another question looks more up the point but outdated and the accepted answer states that one Storage Account can be used.
Firstly, the official documentation has a page with storage considerations and it states (props to ijabit for pointing to it.):
It's possible for multiple function apps to share the same storage account without any issues. For example, in Visual Studio you can develop multiple apps using the Azure Storage Emulator. In this case, the emulator acts like a single storage account. The same storage account used by your function app can also be used to store your application data. However, this approach isn't always a good idea in a production environment.
Unfortunately it does not elaborate further on the rationale behind the last sentence.
The page with best practices for Azure Function mentions:
To improve performance in production, use a separate storage account for each function app. This is especially true with Durable Functions and Event Hub triggered functions.
To my greater confusion there was a subsection on this page that said "Avoid sharing storage accounts". But it was later removed.
This issue is somehow superficially related to the question as it mentions the recommendation in the thread.
Secondly, we had contacted Azure Support for different not-related to this question issues and the two different support engineers shared different opinions on the current issue. One said that we can share a Storage Account among Functions Apps and another one said that we should not. So the recommendation from the support was mixed.
Thirdly, we want to migrate to v4 and in the migration notes it is stated:
Function apps that share storage accounts will fail to start if their computed hostnames are the same. Use a separate storage account for each function app. (#2049)
Digging deeper into the topic, the only issue is the collision of the function host names that are used to obtain the lock that was known even in Oct 2017. One can follow the thread and see how in Jan 2020 the recommendation was made to update the official Azure naming recommendation but it was made only on late Nov 2021. I also see that a non-intrusive, i.e. without renaming, solution is to manually set the host id. The two arguments raised by balag0 are: single point of failure and better isolation. They sound good from the perspective of cleaner architecture but pragmatically I personally find Storage Accounts reliable, especially if read about redundancy or consider that MS is dog-fooding it for other services. So it looks more like a backbone of Azure for me.
Finally, as we want to migrate to v4, should we migrate to dedicated Storage Accounts or not?
For the large project with 30+ Azure Functions I work on, we have gone with dedicated Storage Accounts. The reason why is Azure Storage account service limits. As the docs mention, this really comes into play with Durable Task Functions, but can also come into play in other high volume scenarios. There's a hard limit of 20k requests per second for a Storage Account. Hit that limit, and requests will fail and will return HTTP 429 responses. This means that your Azure Function invocation will fail too. We're running some high-volume scenarios and ran into this.
It can also cause problems with Durable Task Functions if two functions have the same TaskHub ID in host.json. This causes a collision when Durable Task Framework does its internal bookkeeping using Storage Queues and Table Storage, and there's lots of pain and agony as things fail in spectacular fashion.
Note that the 20k requests per second service limit can be raised with a support ticket to Azure. If approved, the max they'll raise it to is 50k requests/second.
So avoid the potential headaches and go with a Storage Account per Function.

Can I store and access any temporary storage on cloud/ container?

I am relatively new to cloud, so please guide me through complete process.
I have an application that will be hosted on containers in cloud environment. I want some temporary storage on the container or the cloud environment, and access it via my web application (written in C#), meaning I will generate a file and keep it there. First of all, is it possible without costing me extra? Secondly, if it is possible, how can I access the area with C# code? And even if it costs me extra, will I have any access issues?? Also, please let me know the limitations of that free space, in terms of storage, accessibility and cost.
Using an App Service you can store temporal files inside %TMP% folder which is maped to %SYSTEMDRIVE%\local\Temp with no extra cost.
Depending on your App Service Plan you will have from 10GB to 140GB free space to store files. Beware, your files will dissapear if App service is restarted.
Refer to this link:
https://github.com/projectkudu/kudu/wiki/Understanding-the-Azure-App-Service-file-system

How to store (and query) the MaxMind GeoIP2 database in Azure?

In an Azure Web App I need to efficiently query the MaxMind GeoIP2 City Database (due to the volume of queries and the latency requirements we cannot use the MaxMind's rest API).
I'm wondering what's the best approach for storing the db (binary MMDB format, accessed via the official .NET api) so that it's easy to update with minimal downtime (we are going to subscribe Monthly updates) and still cost effective as to what regards Azure storage and transactions.
Apparently block blobs are the way to go, but I'm not sure about the monthly updates and the fact that the GeoIP2 api load in memory the whole db (I do not know if this would be a problem for the Web App, if I need a web worker to keep it up or I need something else), but actually I do not know yet how large the file is.
What's the most cost effective solution that preserve low latency over a huge volume?
According to the API docs you must have the database available in a file system (the API doesn't know anything about Azure storage and related REST API). So, regardless where you permanently store it, you'll need to have it on a disk somewhere.
I have no idea how large the database footprint is, but Web Apps, Cloud Services (web/worker roles) and Virtual Machines (whether Linux or Windows) all have local disks. And you have read/write access to these disks. So, you'd need to copy the database binary file (or csv) to local disk from somewhere. At this point, when you initialize the SDK, you'd create a DatabaseReader and point it to your locally-downloaded copy of the database file.
You mentioned storing the database in blob storage. There's nothing stopping you from doing so and simply downloading a copy to local disk. And there's nothing stopping you from storing multiple versions in multiple blobs. Note: You may also take advantage of Azure File storage (an SMB share). Which you choose is up to you.
As far as most cost effective solution: You'll need to do the pricing workup yourself to see what's most effective. You'd also need to evaluate how much RAM is available for the given size VM/role instance/Web App you choose. You mentioned Web Apps in your question: Web App instances scale from 0.5GB to 14GB, depending on the tier you choose (again, you'll need to evaluate this).

Windows Azure and multiple storage accounts

I have an ASP.NET MVC 2 Azure application that I am trying to switch from being single tenant to multi-tenant. I have been reviewing many blogs and posts and questions here on Stack Overflow, but am still trying to wrap my head around the specifics of what's right for this particular app.
Currently the application stores some information in a SQL Azure database, as well as some other info in an Azure Storage Account. I'm considering writing the tenant provisioning code to simply create a new database for a new tenant, along with a new azure storage account. This brings me to the following question:
How will I go about testing this approach locally? As far as I can tell, the local Azure Storage Emulator only has 1 storage account. I'm not sure if I'm able to create others locally. How will I be able to test this locally? Or will it be possible?
There are many aspects to consider with multitenancy, one of which is data architecture. You also have billing, performance, security and so forth.
Regarding data architecture, let's first explore SQL storage. You have the following options available to you: add a CustomerID (or other identifyer) that your code will use to filter records, use different schema containers for different customers (each customer has its own copy of all the database objects owned by a dedicated schema in a database), linear sharding (in which each customer has its own database) and Federation (a feature of SQL Azure that offers progressive sharding based on performance and scalability needs). All these options are valid, but have different implications on performance, scalability, security, maintenance (such as backups), cost and of course database design. I couldn't tell you which one to choose based on the information you provided; some models are easier to implement than others if you already have a code base. Generally speaking a linear shard is the simplest model and provides strong customer isolation, but perhaps the most expensive of all. A schema-based separation is not too hard, but requires a good handle on security requirements and can introduce cross-customer performance issues because this approach is not shared-nothing (for customers on the same database). Finally Federations requires the use of a customer identifyer and has a few limitations; however this technology gives you more control over performance distribution and long-term scalability (because like a linear shard, Federation uses a shared-nothing architecture).
Regarding storage accounts, using different storage accounts per customer is definitively the way to go. The primary issue you will face if you don't use separate storage accounts is performance limitations, such as the maximum number of transactions per second that can be executed using a single storage account. As you are pointing out however, testing locally may be a problem; however consider this: the local emulator does not offer 100% parity with an Azure Storage Account (some functions are not supported in the emulator). So I would only use the local emulator for initial development and troubleshooting. Any serious testing, including multitenant testing, should be done using real storage accounts. This is the only way you can fully test an application.
You should consider not creating separate databases, but instead creating different object namespaces within a single SQL database. Each tenant can have their own set of tables.
Depending on how you are using storage, you can create separate storage containers or message queues per client.
Given these constraints you should be able to test locally with the storage emulator and local SQL instance.
Please let me know if you need further explanation.

Share data between users in metro application

I would like to create a Metro application that allows a group of people to interact. One person would create data and serve as the owner, and multiple others would be invited in and be allow to modify that data. I heard from Build talks that each Metro application will get per-user Azure storage, but will it be possible to share that data between multiple users? Does anyone have a link they could share where I could research this?
I think that you are confusing SkyDrive with Azure Blob Storage.
SkyDrive
Personal to a Live ID
Not really meant as a base for collaborative work
Azure Blob Storage
You can have public files that anyone can view and update
You can have a lease on file that only allows certain people to edit it
Since you own the Azure account you also control the content
You can learn the basics here
If you want to share private app data between users, the best way to do so would be via a shared server of some sort. You should have a server (running on Azure, Amazon EC2, or anything really) that exposes a REST-ful web service which each application connects to. The shared state then lives on that server.
This is better than trying to use skydrive or some file-based system for storing shared data. With a file on skydrive and multiple users trying to access it, you would run into concurrency issues when more than 1 person tries to write to it.
You don't get Azure with Metro.
With Live you get a free SkyDrive that is a personal cloud storage. Like 10 GB. Can share files but it is via sending an email link. It is not file storage that would readily support a server type application to manage that sharing.
Azure is a cloud platform for file and data sharing. Azure is not free but storage cost is only $0.125 / GB per month. 10 GB = $1.25 / month. Using SkyDrive as shared storage you are giving up a lot of developer and hosting tools that come with Azure to save $1.25 / month.
It looks like there is a more formal definition of this with the updated help now available. They were referring to roaming application data. I found the following links that provide guidance:
http://msdn.microsoft.com/en-us/library/windows/apps/hh464917.aspx
http://msdn.microsoft.com/en-us/library/windows/apps/hh465094.aspx
The general is that a small amount of temporary application data is provided on a per-app, per-user basis. The actual size you get is not detailed, but the guidance is pretty clear - app settings only, no large data sets, and don't use it for instant synchronization. Given this guidance, my plan is not a good one and will change.

Resources