I am beginning to use Azure Storage (blob specifically) in my application but wanted to know what the norm was in the case of testing versus production storage.
So is it routine to create one storage account? i.e.:
http://<storage-account-name>.blob.core.windows.net/
and then have different containers for each environment? i.e.:
http://<storage-account-name>.blob.core.windows.net/testContainer
http://<storage-account-name>.blob.core.windows.net/productionContainer
so then it would end up looking like with populated data:
http://<storage-account-name>.blob.core.windows.net/testContainer/<whateverkey>
http://<storage-account-name>.blob.core.windows.net/productionContainer/<whateverkey>
Or should I be creating two different storage accounts? I had assumed that the connection string generated was for just the storage account name, and that later in my logic I would specify the containers and keys when adding data.
Thanks
There is no standard way, but keep in mind: Azure storage isn't multi-level regarding subfolders (though the paths can be simulated). So, using containers to separate test from production will hinder your ability to use containers properly within your app (e.g. if you want /images/foo.png, you must now have /productioncontainer/images/foo.png).
Remember that storage accounts are free: You pay only for storage used. So it costs nothing extra to have both a test and a production storage account. And then, the only thing that changes is the base address (storage account name).
You're correct regarding connection string: You just have accountname.blob.core.windows.net/container/object .
You should use different Storage Accounts - that way in addition to having storage isolation you can also ensure you have different security protection for accessing your development environment vs your production environment.
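As a rough illustration of the account-per-environment approach suggested in these answers, here is a minimal Python sketch. The account names `myapptest` and `myappprod` and the helper `blob_url` are made up for the example; only the `<account>.blob.core.windows.net/<container>/<blob>` URL shape comes from the thread.

```python
# Hypothetical account names: one storage account per environment.
ACCOUNTS = {
    "test": "myapptest",
    "production": "myappprod",
}


def blob_url(environment: str, container: str, blob_name: str) -> str:
    """Build a blob URL; only the account name differs between environments."""
    account = ACCOUNTS[environment]
    return f"https://{account}.blob.core.windows.net/{container}/{blob_name}"


# The container and blob path stay identical in both environments.
print(blob_url("test", "images", "foo.png"))
print(blob_url("production", "images", "foo.png"))
```

With this layout the application code never needs an environment-specific container name; switching environments only swaps the account (and its connection string).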
Related
We have stored 200.000+ images in a classic azure blob account with standard performance. We include the blob URLs in the HTML of our application so the browser downloads the images directly from the blob storage. However, this is really slow. A simple 2kb image can take up to 200ms to download. Download speeds are irregular.
I made a new storage account, now V2 with premium performance. However, now I can't make any public containers anymore. The portal returns the error: 'This is a premium 'StorageV2 (general purpose v2)' account. Containers within this storage account must be set to 'private' access level.'
How can I host images in an Azure environment with good performance without having to deploy them on my web role?
Premium general-purpose v2 storage accounts only support the private access level. In your case you should consider a premium BlockBlobStorage account, which supports public access.
Here is the benefit of BlockBlobStorage accounts:
Compared with general-purpose v2 and BlobStorage accounts, BlockBlobStorage accounts provide low and consistent latency, and higher transaction rates.
Here is a screenshot of creating a premium BlockBlobStorage account:
Azure storage accounts have certain limits (for example, a 20,000 IOPS limit per account) which might interfere with performance at the scale you are talking about. To check whether this is the root cause, split your images across several storage accounts and see if that fixes the performance.
Alternatively (and probably better) you should use Azure CDN attached to the storage account to fix this performance issue (and even make it faster).
https://learn.microsoft.com/en-us/azure/cdn/cdn-create-a-storage-account-with-cdn
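The "split your images into several storage accounts" experiment from the answer above could be sketched as follows. This is only one possible approach, assuming a fixed list of hypothetical account names and a hash of the blob name to choose among them; nothing here is an Azure API.

```python
import hashlib

# Hypothetical storage account names used as shards.
SHARD_ACCOUNTS = ["myimages0", "myimages1", "myimages2", "myimages3"]


def account_for(blob_name: str) -> str:
    """Pick a shard deterministically from a hash of the blob name."""
    digest = hashlib.sha256(blob_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARD_ACCOUNTS)
    return SHARD_ACCOUNTS[index]


def image_url(container: str, blob_name: str) -> str:
    """Build the public URL of an image on whichever shard holds it."""
    account = account_for(blob_name)
    return f"https://{account}.blob.core.windows.net/{container}/{blob_name}"


print(image_url("images", "products/1234.png"))
```

Because the mapping depends only on the blob name, the same function can be used when uploading and when rendering URLs into the HTML. Changing the number of shards later would remap most blobs, so this is better suited to a quick experiment than to a long-term layout.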
It has a default selection of "functionb7be452dbab0" in my case, but I can change it to select other storage accounts. There is no documentation that I can see which explains the storage account setting.
It is used for several things:
In Consumption mode, it holds your files, using Azure Files, i.e. all your function files live there.
In addition, the script runtime (based on the WebJobs SDK) uses Blobs, Queues and Tables as part of its infrastructure. e.g. it uses that to synchronize the work between multiple instances. It also stores logging information there.
Note that you can easily see all this by using Microsoft Azure Storage Explorer and looking at all the things in there.
As an aside, you can optionally also make use of this storage account for your own queues and blobs that you want to use in your functions.
I want to create a couple of cloud services - Int, QA, and Prod. Each of these will connect to separate Db's.
Do these cloud services require "storage accounts"? Conceptually the cloud services have executables and they must be physically located somewhere.
Note: I do not use any blobs/queues/tables.
If so, must I create 3 separate storage accounts or link them up to one?
Storage accounts are more like storage namespaces - it has a url and a set of access keys. You can use storage from anywhere, whether from the cloud or not, from one cloud service or many.
As @sharptooth pointed out, you need storage for diagnostics with Cloud Services. Storage is also used for attached disks (Azure Drives for cloud services) and for deployments themselves (storing the cloud service package and configuration).
Storage accounts are free: That is, create a bunch, and still only pay for consumption.
There are some objective reasons why you'd go with separate storage accounts:
You feel that you could exceed the 20,000 transaction/second advertised limit of a single storage account (remember that storage diagnostics are using some of this transaction rate, which is impacted by your logging-aggressiveness).
You are concerned about security/isolation. You may want your dev and QA folks using an entirely different subscription altogether, with their own storage accounts, to avoid any risk of damaging a production deployment
You feel that you'll exceed 500TB (the limit of a single storage account)
Azure Diagnostics uses Azure Table Storage under the hood (and it's more convenient to use one storage account for every service, but it's not required). Other dependencies your service has might also use some of the Azure Storage services. If you're sure that you don't need Azure Storage (and so you don't need persistent storage of data dumped through Azure Diagnostics) - okay, you can go without it.
The service package of your service will be stored and managed by Azure infrastructure - that part doesn't require a storage account.
I think we have gone slightly wrong in the way we have used Azure storage in a SaaS system. We created a storage account per client (security was the prime consideration) and containers per system area, e.g. Vehicle, Work etc.
Having done further reading it seems a suggestion would be that we should have used one account for all clients. Each client should have a container (so we can programmatically create it) which we then secure. Then files should just be structured using "virtual" folder structure e.g. Container called "Client A". Then Files for the Jobs (in Work area of system) stored like Work/Jobs/{entity id}/blah.pdf. Does this sound sensible?
If so, we now have about 10 accounts that we need to restructure. Are there any tools that will let us easily copy one account's contents into a container in another account? I appreciate we probably can't move the files between accounts (we set them up ages ago, so we can't use the native copy function), so I guess it would be some sort of copy. There are GBs of files across all the accounts.
It may not be such a bad idea to keep different storage accounts per client. The benefits of doing that (to me) are:
Better security as mentioned by you.
You'll be able to achieve better throughput / client as each client will have their own storage account. If you keep one storage account for all clients, and if one client starts hitting that account badly other clients will be impacted.
Better scalability. Each storage account can hold up to 200 TB of data. So if you keep just one storage account and assuming each client consumes 100 GB of data, you'll be able to accommodate only 2000 clients (I hope my math is right :)). With individual storage accounts, you won't be restricted in that sense.
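As a quick check of the arithmetic in the last point, here is a small Python calculation. The 200 TB limit and 100 GB per client are the figures quoted above; the helper `max_clients` is just for illustration.

```python
def max_clients(account_limit_tb: int, per_client_gb: int, gb_per_tb: int) -> int:
    """Number of clients that fit into a single storage account."""
    return (account_limit_tb * gb_per_tb) // per_client_gb


# Decimal units (1 TB = 1000 GB) give the figure quoted above.
print(max_clients(200, 100, 1000))  # 2000
# Binary units (1 TB = 1024 GB) give slightly more.
print(max_clients(200, 100, 1024))  # 2048
```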
There're some downsides as well. Some of them are:
Management would be a nightmare. Imagine you have 2000 customers: you would end up managing 2000 storage accounts.
You may be limited by Windows Azure. Currently by default you get about 10 or 20 storage accounts per subscription and you would need to contact support to manually up that limit. They can do that for you but I would imagine you would want this to be a self-service model where you would be able to create as many storage accounts as you want without contacting support.
Now, coming to your question about tooling: you could write something of your own that makes use of the Copy Blob functionality. This functionality allows you to copy blob data across storage accounts asynchronously. Basically, this is what you would do:
First create a blob container for each client in the target storage account.
Enumerate all blob containers in source storage account.
For each blob container in source storage account, enumerate the blobs.
Copy each blob asynchronously to target storage account in the client's blob container.
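The four steps above can be sketched in Python roughly as follows. This is a sketch under assumptions, not the Cerebrata tooling mentioned below: it assumes the two service objects behave like `BlobServiceClient` from the current `azure-storage-blob` Python SDK (`list_containers`, `get_container_client`, `list_blobs`, `get_blob_client`, `start_copy_from_url`), and the function name, the `source_sas` parameter and the "old container name becomes a virtual folder" naming are choices made for this example.

```python
def copy_account_into_container(source_service, target_service,
                                client_container, source_sas=""):
    """Copy every blob of a source account into one target container.

    Both service arguments are assumed to behave like
    azure.storage.blob.BlobServiceClient. source_sas is an optional SAS
    query string (without the leading "?") that lets the copy operation
    read private source blobs.
    """
    # Step 1: create the client's container in the target account.
    # Note: this raises if the container already exists.
    target_container = target_service.get_container_client(client_container)
    target_container.create_container()

    started = []
    # Step 2: enumerate all containers in the source account.
    for container in source_service.list_containers():
        source_container = source_service.get_container_client(container.name)
        # Step 3: enumerate the blobs in each source container.
        for blob in source_container.list_blobs():
            source_url = source_container.get_blob_client(blob.name).url
            if source_sas:
                source_url = f"{source_url}?{source_sas}"
            # Keep the old container name as a virtual folder prefix.
            target_name = f"{container.name}/{blob.name}"
            # Step 4: start the server-side, asynchronous copy.
            target_blob = target_container.get_blob_client(target_name)
            target_blob.start_copy_from_url(source_url)
            started.append(target_name)
    return started
```

With the real SDK, the two service objects would typically come from `BlobServiceClient.from_connection_string(...)`, one per account. The copies run on the service side, so the function returns as soon as they are started; a real tool would also want to check the copy status of each target blob afterwards.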
If you're a PowerShell fan, you can look into Cerebrata's Azure Management Cmdlets (http://www.cerebrata.com/Products/AzureManagementCmdlets) as well which wraps this functionality. I could have recommended Cerebrata's Azure Management Studio as well but I haven't tried this functionality just yet there [Disclosure: I'm one of the devs on Cerebrata team].
Hope this helps.
Adding to Gaurav Mantri's answer...
You can have a shared storage account for all customers and use a Shared Access Signature (SAS) to limit access to a particular container or to particular blobs (the same works for tables and queues):
http://msdn.microsoft.com/en-us/library/windowsazure/hh508996.aspx
I know Azure will geo-replicate a copy of my storage account to another location.
My question is: can I access that other location from my program, even if only for reads?
I ask because this would let me build a second deployment in a different geo-location for performance and disaster recovery, like Azure itself does. With the current setup, if I use the same storage source from a different geo-location, I have to pay extra bandwidth costs.
You can only access your storage account by its primary name. In the event of failover, that name will be mapped to the alternate datacenter. You cannot access the failover storage directly, nor can you choose when to trigger a failover. For a multi-site setup as you described, you'd need to duplicate your data (which would then add the cost of storage in datacenter #2). This does give you ultimate flexibility in your DR and performance planning, but at an added cost of storage and bandwidth (egress-only).
Last week the storage team announced read-only access to the failover storage: Windows Azure Storage Redundancy Options and Read Access Geo Redundant Storage.
This means you can now deploy your application in a different datacenter which can be used for "full" failover (meaning that the storage will also be available there). Even if it's only read-only, your application will still be online - but simply in "degraded" mode.
The steps on how you can implement this with traffic manager are described here: http://fabriccontroller.net/blog/posts/adding-failover-to-your-application-with-read-access-geo-redundant-storage-and-the-windows-azure-traffic-manager/
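To make the "degraded mode" idea concrete, here is a minimal Python sketch of reading from the primary endpoint and falling back to the read-only secondary. With read-access geo-redundant storage the secondary is exposed under the account name plus a `-secondary` suffix; the `fetch` callable, the function names and the returned `degraded` flag are assumptions of this example, not part of any Azure SDK.

```python
def endpoints(account: str):
    """Return the (primary, secondary) blob endpoints of an RA-GRS account."""
    primary = f"https://{account}.blob.core.windows.net"
    # The read-only replica is exposed under the "-secondary" suffix.
    secondary = f"https://{account}-secondary.blob.core.windows.net"
    return primary, secondary


def read_blob(fetch, account: str, container: str, blob_name: str):
    """Try the primary first, then fall back to the read-only secondary.

    fetch is any callable that takes a URL and returns bytes (hypothetical,
    e.g. a thin wrapper around an HTTP client) and raises OSError on
    network failure. Returns (data, degraded).
    """
    primary, secondary = endpoints(account)
    path = f"/{container}/{blob_name}"
    try:
        return fetch(primary + path), False
    except OSError:
        # Primary unreachable: serve reads from the replica instead.
        return fetch(secondary + path), True
```

A caller could use the `degraded` flag to disable write features in the UI while the primary is unavailable. Keep in mind that replication to the secondary is asynchronous, so data read from it may lag slightly behind the primary.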