Azure blobs, what are they for?

I'm reading about Azure blobs and storage, and there are things I don't understand.
First, you can pay Azure for just hosting, but when you create a web role... do you need storage for the .dlls and other files (.js and .css)? Or is there a small storage quota in a worker role you can use? How large is it? I can't believe you'd get charged every time a browser downloads a CSS file, so I guess I can store those things in another kind of storage.
Second, you get charged for transactions and bandwidth, so it's not a good idea to provide direct links to the blobs on your websites. Then what do you do? Download them in your website code and write them to the client output stream on the fly from ASP.NET? I think I've read that internal traffic/transactions are free, so it looks like a too-good-to-be-true solution :D
Is the traffic between hosting and storage also free?
Thanks in advance.

First, to answer your main question: blobs are best used for dynamic data files. If you ran a YouTube-style site, you would use blobs to store the videos in every compressed state, plus the thumbnail images generated from those videos. Tables within table storage are best for dynamic data that does not involve files; for example, comments on YouTube videos would be best stored as rows in Azure Table Storage (ATS).
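To illustrate that split, here is a hedged sketch of the "comments in a table" side, using the current Azure.Data.Tables SDK (which post-dates this thread); the table name, key scheme, and connection string are assumptions, not from the question:

```csharp
// Sketch only: store one comment as a row in Azure Table Storage.
using Azure.Data.Tables;

var table = new TableClient("<storage connection string>", "comments");
await table.CreateIfNotExistsAsync();

// PartitionKey groups all comments for one video; RowKey must be
// unique within that partition (a GUID is the easy choice).
var comment = new TableEntity(partitionKey: "video-123", rowKey: Guid.NewGuid().ToString())
{
    ["Author"] = "someuser",
    ["Text"] = "Nice video!"
};
await table.AddEntityAsync(comment);
```

The video files themselves, by contrast, would go into blob storage, with only their blob names stored in table rows.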
You generally want a storage account for at least two things: publishing your deployments into Azure, and receiving diagnostic data from your compute nodes so you can monitor them once deployed.
Even though you publish your deployments THROUGH a storage account, the deployment code lives on your compute nodes. The .css/.html files served by your app are served from your node's local storage, of which you get plenty (it is NOT a good place for your dynamic data, however).
You pay for traffic/data that crosses the Azure data center boundary, regardless of where it came from. Furthermore, transactions (reads or writes) between your Azure table storage and anywhere else are not free. You also pay for storing the data in the storage account (storing data on compute nodes themselves is not metered). Data that does not leave the data center is not subject to transfer fees. In reality, though, the costs are so low that you have to be pushing gigabytes per day to start noticing them.
Don't store any dynamic data only on compute instances. That data will get purged whenever you redeploy your app, or whenever Azure decides to move your app onto a different node.
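On the asker's idea of streaming blobs through the web app instead of linking to them directly: mechanically that works, and a minimal sketch is below. It's written with ASP.NET Core and the current Azure.Storage.Blobs SDK (both of which post-date this thread), and the container name and connection string are made up. Keep in mind that, per the billing notes above, this doesn't avoid egress charges; bytes leaving the data center are billed either way.

```csharp
// Sketch: relay a blob to the client instead of exposing the blob URL.
using Azure.Storage.Blobs;
using Microsoft.AspNetCore.Mvc;

public class AssetsController : Controller
{
    private readonly BlobContainerClient _assets =
        new BlobContainerClient("<storage connection string>", "assets");

    [HttpGet("/assets/{name}")]
    public async Task<IActionResult> Get(string name)
    {
        // Opens the blob as a stream; no temp file on the web server.
        var download = await _assets.GetBlobClient(name).DownloadStreamingAsync();
        return File(download.Value.Content, download.Value.Details.ContentType);
    }
}
```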
Hope this helps

Related

SharePoint - external storage for documents

I am extending some features of our corporate SharePoint, and one of my customer's wishes is the following:
The SharePoint farm has a small space allocation, about 2 GB per department. The department I am working with has hundreds of mid-sized documents (4-10 MB on average; pptx, doc, pdf...) per month to be stored permanently. Currently they store them on a NAS file share. However, they want to store them in SharePoint for accessibility, while avoiding the 2 GB limit. Is there a way to integrate external file storage with SharePoint?
Alternatively, is it possible to store the file data in a database (e.g. Access or MS SQL) while avoiding farm-wide installations of frameworks like RBS?
You can use Remote BLOB Storage (RBS) as part of your data storage solution.
Check the article: Manage RBS in SharePoint Server
Before you migrate your BLOBs out of the database, you’ll need to choose an option. Here are several typical options:
1. File System: You can use a normal file system (perhaps a large disk partition on a file server) to store your BLOBs.
2. SAN/NAS Storage: Storage area networks (SAN) and network attached storage (NAS) are usually high-end storage options. They're expensive, but well suited if the business value of your documents and their size can justify the cost. Both SAN and NAS provide data replication and mirroring and seamless growth into terabytes of data.
3. Cloud Storage: This is useful in at least two situations. First, when you're running SharePoint in the cloud but still want to externalize the BLOBs, the natural choice is to store them in a nearby, vendor-provided cloud storage. Second, when you're running SharePoint within your own datacenter but want to store or archive all BLOBs in the cloud due to space limitations or reliability issues in your datacenter; archiving is the most common reason here. Make sure that when a user creates or modifies a document bound for cloud storage, your EBS or RBS provider does the transfer in the background, as it could otherwise degrade performance. You also want to make sure your third-party EBS or RBS provider supports storing BLOBs in cloud storage.
More information is here: Optimize SharePoint with External Storage
RBS is the way to leverage non-SQL Server storage for your BLOBs. Any other approach you take will be a hack/workaround. Only your customer's requirements can tell us whether the limitations of those workarounds are acceptable. For example, you could store the files elsewhere (a network share or cloud share) and have SharePoint Search index them. In that case you lose out on a consistent UI for managing content, and you'd still need the SP hosting team's help to set up the search crawling.
The real answer is to work with your customer to document their business needs and why IT's offering doesn't meet them, so that IT can give you more space.

Cut videos from Azure Blob Storage

I have a web app hosted in Azure; one of its functions is to make a few cuts from a video (generate 2 or 3 small videos of 5-10 seconds each from a larger video).
The videos are persisted in Azure Blob Storage.
How do you suggest to accomplish this in the Azure environment?
The actual cutting of the videos will be initiated by a web job. I'm also concerned about the pricing (within the Azure environment), since I'm taking into account the possibility of high traffic.
Any feedback is appreciated.
Thank you.
Assuming you have video-cutting code that operates on files through normal I/O: you'd need to download the video file from blob storage, process it in code (or with whatever library you've employed), and then store the result back in blob storage. You cannot reference a blob directly with standard I/O libraries.
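A rough sketch of that download → cut → re-upload flow, using the current Azure.Storage.Blobs SDK; the container and blob names are made up, and CutVideo stands in for whatever editing library you actually use (it is not a real API):

```csharp
using Azure.Storage.Blobs;

var container = new BlobContainerClient("<storage connection string>", "videos");

// Blob -> local temp file (blobs can't be opened with normal file I/O).
string localIn = Path.Combine(Path.GetTempPath(), "full-video.mp4");
await container.GetBlobClient("input/full-video.mp4").DownloadToAsync(localIn);

// Hypothetical editing step: cut a 10-second clip starting at 00:05.
string localOut = Path.Combine(Path.GetTempPath(), "clip1.mp4");
CutVideo(localIn, localOut, startSeconds: 5, lengthSeconds: 10);

// Local result -> back into blob storage.
await container.GetBlobClient("clips/clip1.mp4").UploadAsync(localOut, overwrite: true);
```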
If, however, the videos are stored in Azure File storage (which is an SMB layer on top of blob storage), then you will be able to manipulate your video files directly.
Web Jobs run within an App Service (just like Web Apps), so you have access to a certain amount of local disk space (depending on the App Service tier). You should have no problem temporarily storing a video file within your web app's disk space for editing operations.
You asked about cost: again, assuming you're talking about running code within a Web Job (App Service), you're just paying for whatever App Service tier you've chosen.
How you actually do those edit operations is entirely up to you (language, library, etc).
Azure Blob Storage is simply an object store; it does not have the capability you're looking for.
Azure Media Services, however, is the service you should look into. The media served by this service is stored in Azure Blob Storage.
For editing video, may I suggest you take a look at Video Editor Plugin for Azure Media Player. You can read more about this plugin here: https://azure.microsoft.com/en-in/blog/video-editor-plugin/. You can also try it out here: http://ampdemo.azureedge.net/amp_editor.html.

How to clone blob container and contents

Summary
I have picked up support for a fairly old website which stores a bunch of blobs in Azure. What I would like to do is duplicate all of my blobs from live to the test environment so I can use them without affecting users.
Architecture
The website is a mix of VB webforms and MVC, communicating with an Azure blob service (e.g. https://x.blob.core.windows.net/LiveBlobs).
The test site mirrors the live setup, except it points to a different blob container in the same storage account (e.g. https://x.blob.core.windows.net/TestBlobs)
Questions
Can I copy all of the blobs from live to test without downloading them? They would need to maintain the same names.
How do I work out what it will cost to do this? The live blob storage is roughly 130GB, but it should just be copying the data within the same data centre, right?
Things I've investigated
I've spent quite some time searching for an answer, but what I've found deals with copying between storage accounts or copying single blobs.
I've also found AzCopy which looks promising but it looks like it would copy the files one by one so I'm worried it would end up taking a long time and costing a lot.
I am fairly new to Azure so please forgive me if this is a silly question or I've missed out some important details. I'm more than happy to add any extra information should you need it.
Can I copy all of the blobs from live to test without downloading them? They would need to maintain the same names.
Yes, you can. Copying a blob is an asynchronous server-side operation. You simply tell the blob service which blobs to copy and the destination details, and it will do the job for you. There's no need to download them first and upload them to the destination.
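As a hedged sketch, cloning one container into another within the same account looks like this with the current Azure.Storage.Blobs SDK (the container names follow the question, lowercased as Azure requires; the connection string is assumed). Each copy is kicked off server-side, so nothing is downloaded:

```csharp
using Azure.Storage.Blobs;

var service = new BlobServiceClient("<storage connection string>");
var live = service.GetBlobContainerClient("liveblobs");
var test = service.GetBlobContainerClient("testblobs");
await test.CreateIfNotExistsAsync();

await foreach (var item in live.GetBlobsAsync())
{
    var source = live.GetBlobClient(item.Name);
    var dest = test.GetBlobClient(item.Name); // same name on the test side
    // Server-side copy: within one account, shared-key auth covers the source.
    await dest.StartCopyFromUriAsync(source.Uri);
}
```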
How do I work out what it will cost to do this? The live blob storage is roughly 130GB, but it should just be copying the data within the same data centre, right?
So there are three things to consider when it comes to costing: 1) storage costs, 2) transaction costs, and 3) data egress costs.
Since the copied blobs will be stored somewhere, they will consume storage and you will incur storage costs.
The copy operation performs read operations on the source blobs and write operations on the destination blobs (to create them), so you will incur transaction costs. At a minimum, expect two transactions per blob copied: a read on the source and a write on the destination (though there can be more).
You incur data egress costs if the destination storage account is not in the same region as your source storage account. As long as both storage accounts are in the same region, you will not incur this cost.
You can use Azure Storage Pricing Calculator to get an idea about how much it is going to cost you.
I've also found AzCopy which looks promising but it looks like it would copy the files one by one so I'm worried it would end up taking a long time and costing a lot.
Blobs are always copied one by one. Copying across storage accounts is always an asynchronous server-side operation, so you can't really predict how long the copy will take to complete, but in my experience it is quite fast. If you want to control when the blobs are copied, you would need to download them first and then upload them; AzCopy supports this mode as well.
As far as costs are concerned, "a lot" is a relative term, but in general Azure Storage is very cheap and 130 GB is not a whole lot of data.

Is this a sensible Azure Blob Storage setup and are there restructuring tools to help me migrate to it?

I think we have gone slightly wrong in the way we have used Azure storage in a SaaS system. We created a storage account per client (security was the prime consideration) and containers per system area, e.g. Vehicle, Work, etc.
Having done further reading, it seems the suggestion is that we should have used one account for all clients. Each client would have a container (which we can create programmatically) that we then secure, and files would be structured using a "virtual" folder structure: e.g. a container called "Client A", with files for Jobs (in the Work area of the system) stored like Work/Jobs/{entity id}/blah.pdf. Does this sound sensible?
If so, we now have about 10 accounts that we need to restructure. Are there any tools that will let us easily copy one account's contents to another account's containers? I appreciate we probably can't move the files between accounts (as we set them up ages ago, so can't use the native copy function), so I guess some sort of copy. There are gigabytes of files across all the accounts.
It may not be such a bad idea to keep different storage accounts per client. The benefits of doing that (to me) are:
Better security, as you mentioned.
Better throughput per client, as each client will have their own storage account. If you keep one storage account for all clients and one client starts hitting that account hard, the other clients will be impacted.
Better scalability. Each storage account can hold up to 200 TB of data, so if you keep just one storage account and assume each client consumes 100 GB of data, you'll only be able to accommodate 2000 clients (I hope my math is right :)). With individual storage accounts, you won't be restricted in that sense.
There're some downsides as well. Some of them are:
Management would be a nightmare. Imagine you have 2000 customers: you would end up managing 2000 storage accounts.
You may be limited by Windows Azure. Currently, by default you get about 10 or 20 storage accounts per subscription, and you would need to contact support to raise that limit manually. They can do that for you, but I would imagine you want this to be a self-service model where you can create as many storage accounts as you want without contacting support.
Now, coming to your question about tooling: you could write something of your own that makes use of the Copy Blob functionality, which lets you copy blob data across storage accounts asynchronously. Basically, this is what you would do (a sketch follows the list):
First create a blob container for each client in the target storage account.
Enumerate all blob containers in the source storage account.
For each blob container in the source storage account, enumerate the blobs.
Copy each blob asynchronously to the target storage account, into the client's blob container.
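Here is a hedged sketch of those four steps using the current Azure.Storage.Blobs SDK (which post-dates this answer; at the time you would have used the Copy Blob REST API or the older storage client library). The connection strings are placeholders, and the source SAS assumes your clients use shared-key credentials:

```csharp
using Azure.Storage.Blobs;
using Azure.Storage.Sas;

var source = new BlobServiceClient("<source connection string>");
var target = new BlobServiceClient("<target connection string>");

await foreach (var container in source.GetBlobContainersAsync())
{
    var src = source.GetBlobContainerClient(container.Name);
    var dst = target.GetBlobContainerClient(container.Name);
    await dst.CreateIfNotExistsAsync(); // step 1: container per client

    await foreach (var blob in src.GetBlobsAsync()) // steps 2-3: enumerate
    {
        // Step 4: async server-side copy. Cross-account copies need a
        // readable source URI, so grant a short-lived read SAS on it.
        var srcBlob = src.GetBlobClient(blob.Name);
        var srcUri = srcBlob.GenerateSasUri(BlobSasPermissions.Read,
                                            DateTimeOffset.UtcNow.AddHours(1));
        await dst.GetBlobClient(blob.Name).StartCopyFromUriAsync(srcUri);
    }
}
```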
If you're a PowerShell fan, you can look into Cerebrata's Azure Management Cmdlets (http://www.cerebrata.com/Products/AzureManagementCmdlets) as well, which wrap this functionality. I could have recommended Cerebrata's Azure Management Studio too, but I haven't tried this functionality there just yet. [Disclosure: I'm one of the devs on the Cerebrata team.]
Hope this helps.
Adding to Gaurav Mantri's answer...
You can have a shared storage account for customers and use a Shared Access Signature (SAS) to limit access to a particular container or blobs (this works for tables and queues as well)...
http://msdn.microsoft.com/en-us/library/windowsazure/hh508996.aspx
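For illustration, a minimal SAS sketch with the current Azure.Storage SDKs (the account name, key, and container name are placeholders): it produces a short-lived, read-only URL scoped to one client's container.

```csharp
using Azure.Storage;
using Azure.Storage.Sas;

var credential = new StorageSharedKeyCredential("<account name>", "<account key>");

var sas = new BlobSasBuilder
{
    BlobContainerName = "clienta",              // one container per client
    Resource = "c",                             // "c" = SAS covers the whole container
    ExpiresOn = DateTimeOffset.UtcNow.AddMinutes(30)
};
sas.SetPermissions(BlobContainerSasPermissions.Read); // read-only access

string token = sas.ToSasQueryParameters(credential).ToString();
string url = $"https://<account name>.blob.core.windows.net/clienta/Work/Jobs/42/blah.pdf?{token}";
```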

Where to store things like user pictures using Azure? Blob Storage?

I have just migrated a project of mine to Microsoft's Azure for testing.
But for functionality like avatar uploads, I need write access to files on the hard drive. This is a cloud, though, so that is not possible.
How can I build such functionality instead? Should I use Blob Storage, or is there a better solution?
Does it make sense to store all website images (e.g. layout images) in Blob Storage? That way I would have a cookie-free domain for my static content.
Blob storage is definitely the place to put dynamic images like avatars. While you can write to the disk on the VM you'll be running on, you can't rely on it to persist: if your app gets moved to another machine (which could happen for any number of reasons), that storage will be erased.
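For instance, replacing a local-disk avatar write with a blob upload might look like this minimal sketch (current Azure.Storage.Blobs SDK; the container, file, and blob names are made up):

```csharp
using Azure.Storage.Blobs;

var avatars = new BlobContainerClient("<storage connection string>", "avatars");
await avatars.CreateIfNotExistsAsync();

// e.g. the temp file your upload handler saved for the posted image
using var upload = File.OpenRead("uploaded-avatar.png");
await avatars.GetBlobClient("user-42.png").UploadAsync(upload, overwrite: true);
```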
One thing you could do is store your images in blob storage and cache them on the local VM disk (using the standard file I/O mechanisms). This way you'll get pretty good performance and save a few storage transactions, while still ensuring the authoritative copy doesn't live in volatile storage.
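A sketch of that cache-on-miss pattern (again with assumed names): blob storage stays the source of truth, and the local disk is treated as disposable.

```csharp
using Azure.Storage.Blobs;

// Returns a local path for the image, downloading it from blob storage on a cache miss.
async Task<string> GetCachedImagePathAsync(string blobName)
{
    string localPath = Path.Combine(Path.GetTempPath(), "image-cache", blobName);
    if (!File.Exists(localPath)) // miss: hydrate the cache from blob storage
    {
        Directory.CreateDirectory(Path.GetDirectoryName(localPath)!);
        var blob = new BlobClient("<storage connection string>", "avatars", blobName);
        await blob.DownloadToAsync(localPath);
    }
    return localPath; // safe to lose: it will simply be re-downloaded
}
```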
If you've got images which are completely static, these can simply be bundled with your application and referenced like normal files. But if you ever need to change them you'd have to redeploy the application, so only use this technique for images which won't need to change.
Be aware there are two types of blobs in Windows Azure: block blobs and page blobs. Block blobs are appropriate for serving media files, whereas page blobs are optimized for random read/write access patterns (they're what VM disks are built on).
Also consider use of the Azure Content Distribution Network (CDN) for lowering latency to clients.
Azure also has streaming capabilities which work in concert with Silverlight Smooth Streaming (http://blog.smarx.com/posts/smooth-streaming-with-windows-azure-blobs-and-cdn if interested).
"Does it make sense to store all website images (f.e. layout images) in the Blob Storage? So I would have a Cookie-free Domain for my static content?"
Yes I think so - this is what I'm rolling out right now actually.
