Strategy to minimize Azure storage outbound data costs

I am building a web site that (among other things) allows the user to upload photos via Web API. The user images will be stored in Azure blob storage, to be displayed in user albums and shared with social media. The site will be hosted as an Azure web site. I am eager to minimize data transfer costs. I understand that data transfer between an Azure web site and table/blob storage incurs no data transfer charge (as it is not considered "outbound"), while data requested from outside the Azure web site does. In response to this, I have 2 strategies for exposing the images to the browser:
1.) Via the URI to the image blob in azure storage e.g. with local storage account http://ipv4.fiddler:10000/devstoreaccount1/bcb2ad7581.jpg
2.) Via web api that downloads the image bytes from storage and returns them. e.g. with local host http://localhost:58559/api/image/bcb2ad7581.jpg
These are my assumptions. Direct-to-storage access (method 1 above) is more efficient. Accessing the images via Web API (method 2 above) must incur overheads that direct access doesn't, right? Each Web API request must consume an ASP.NET thread plus CPU cycles. For each Web API image request being processed, that is one less thread available for other Web API requests on the site, which must instead be queued. On the other hand, any external site the image is shared with would add a data transfer cost (among other costs) for each image request if accessed via method 1.
So my strategy is to access the images within the site via a direct link to storage (method 1), e.g. when the user opens an album, all img tags have the Azure blob URI in their src attribute. However, when the user clicks the Facebook icon to share, I will provide a link to the image via Web API (method 2). I realise the user can bypass all of that with plugins like the "Pin It" button etc, but that's OK.
I am only learning this stuff, so I could be way off.
Am I wrong about outbound transfer costs not being applied to azure web sites? I don't think I am but the whole pricing model is confusing, to say the least.
Is accessing blob storage from a browser HTML page with an img tag and src attribute considered outbound data transfer, even if the HTML page comes from an Azure web site domain? I mean, is it only free when the server-side code accesses the storage, not the HTML client?
Is any data transfer cost saved via method 2 (if indeed there is one) simply cancelled out by a different cost associated with the Web API method (like bandwidth cost)?
Am I wrong about the performance benefit of direct access to the blob storage, or possibly wrong about the overhead of the web api requests?
It is early days in the design, so I can dump Azure if I have to. I would rather not though, as I think it is what I'm looking for. I don't want something for nothing and am happy to pay for the services I consume. Naturally, though, I don't want my ignorance to cost me.
I could do with your advice on this, and truly appreciate your help.

To answer your questions:
Am I wrong about outbound transfer costs not being applied to azure web sites?
Sadly, Yes :) Any data that goes out of an Azure Datacenter (DC) incurs an outbound transfer cost and that includes data served through your websites.
Is accessing blob storage from a browser HTML page with an img tag and src attribute considered outbound data transfer, even if the HTML page comes from an Azure web site domain? I mean, is it only free when the server-side code accesses the storage, not the HTML client?
Yes. Remember that the browser consuming the data is sitting outside of the Azure DC, so it is outbound transfer even though the HTML page itself came from an Azure web site.
Is any data transfer cost saved via method 2 (if indeed there is one) simply cancelled out by a different cost associated with the Web API method (like bandwidth cost)?
No, because no data transfer cost is actually saved in the first place: the data eventually flows out of the Azure DC either way (it doesn't matter whether it is served from storage directly or via the Web API).
Am I wrong about the performance benefit of direct access to the blob storage, or possibly wrong about the overhead of the web api requests?
You will certainly get better performance by providing direct access to blob storage rather than transferring the data through the Web API; routing everything through the Web API also adds latency.
Solution Recommendation
For your application, may I recommend that you look at Shared Access Signature functionality offered by Azure Blob Storage. I believe this will significantly improve the performance of your application.
For uploads, you could create a SAS URL with upload permission and have your web application upload files directly into blob storage. That way the upload data won't be routed through your servers (see the sketch after the links below). I wrote some blog posts on the same which you may find useful:
http://gauravmantri.com/2013/02/16/uploading-large-files-in-windows-azure-blob-storage-using-shared-access-signature-html-and-javascript/
http://gauravmantri.com/2013/12/01/windows-azure-storage-and-cors-lets-have-some-fun/
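A minimal sketch of the upload side, using the classic WindowsAzure.Storage SDK (the container name "images" and the "StorageConnectionString" setting are placeholders, not anything from the question). The server only issues a short-lived, write-only SAS URL; the browser then uploads the photo straight to blob storage:

    using System;
    using System.Configuration;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Blob;

    public static class UploadSas
    {
        // Returns a URL the browser can upload to directly; the account key never leaves the server.
        public static string GetUploadUrl(string blobName)
        {
            var account = CloudStorageAccount.Parse(
                ConfigurationManager.AppSettings["StorageConnectionString"]);
            var blob = account.CreateCloudBlobClient()
                              .GetContainerReference("images")
                              .GetBlockBlobReference(blobName);

            var sas = blob.GetSharedAccessSignature(new SharedAccessBlobPolicy
            {
                Permissions = SharedAccessBlobPermissions.Write,
                SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddMinutes(15)
            });

            return blob.Uri.AbsoluteUri + sas;
        }
    }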
For downloading images, again have your Web API return a SAS URL instead of reading the image data from blob storage and streaming it back to the client browser.
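For example, a Web API action along these lines (again only a sketch; the container name and the 30-minute lifetime are assumptions) returns the SAS URL and lets the browser pull the image bytes straight from storage:

    using System;
    using System.Configuration;
    using System.Web.Http;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Blob;

    public class ImageController : ApiController
    {
        // GET api/image/bcb2ad7581.jpg -> time-limited read URL for that blob
        public string Get(string id)
        {
            var account = CloudStorageAccount.Parse(
                ConfigurationManager.AppSettings["StorageConnectionString"]);
            var blob = account.CreateCloudBlobClient()
                              .GetContainerReference("images")
                              .GetBlockBlobReference(id);

            var sas = blob.GetSharedAccessSignature(new SharedAccessBlobPolicy
            {
                Permissions = SharedAccessBlobPermissions.Read,
                SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddMinutes(30)
            });

            return blob.Uri.AbsoluteUri + sas;
        }
    }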

Related

Files stored under Azure Blob storage [duplicate]

In my application, hosted in Azure App Service, we upload some images which are stored directly in Azure blob storage.
After some time we can view the images from the application, but a user can fetch the blob URL and then simply view the images without going through the application.
Is there any way to make the images viewable only from my application and not from the direct URL?
I tried keeping the container private, but then the images were not visible to the user from my application either.
Is there some way to store these images securely so that only authenticated users of my application can access or view them?
No, there is no way to restrict the Azure Storage APIs so that only your application can access the content but not a user who happens to get the blob link. I don't think adding a CDN into the mix would change that either, since content from the CDN can also be accessed via its endpoint.
If you have a two-tier architecture with a web portal and a Web API, then your Web API can fetch the content from Azure Storage upon a user request and stream that content back as an HTTP response to your web portal. This way the user only ever sees the final content as an image, without knowing where the image came from.
But this can make your application slower because of the extra hop through the Web API to get the content. The whole point of the web portal linking to the blob directly is to get it faster.
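If you do go the proxy route, a rough sketch of it in Web API looks like the following (the container name, content type and the [Authorize] check are assumptions); the container stays private and only the application's storage key ever reads it:

    using System.Configuration;
    using System.IO;
    using System.Net;
    using System.Net.Http;
    using System.Net.Http.Headers;
    using System.Threading.Tasks;
    using System.Web.Http;
    using Microsoft.WindowsAzure.Storage;

    public class ImageProxyController : ApiController
    {
        [Authorize]  // only authenticated users of the application receive the bytes
        public async Task<HttpResponseMessage> Get(string id)
        {
            var account = CloudStorageAccount.Parse(
                ConfigurationManager.AppSettings["StorageConnectionString"]);
            var blob = account.CreateCloudBlobClient()
                              .GetContainerReference("images")
                              .GetBlockBlobReference(id);

            var stream = new MemoryStream();
            await blob.DownloadToStreamAsync(stream);  // read the private blob server-side
            stream.Position = 0;

            var response = new HttpResponseMessage(HttpStatusCode.OK)
            {
                Content = new StreamContent(stream)
            };
            response.Content.Headers.ContentType = new MediaTypeHeaderValue("image/jpeg");
            return response;
        }
    }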
Regarding security, if the blobs are open to read then anyone can read them. If you want to restrict access to certain users then you have to come up with logic in your web app to issue SAS tokens to users for a limited time.

Cut videos from Azure Blob Storage

I have a web app that is hosted in Azure; one of its functionalities is to make a few cuts from a video (generate 2 or 3 small videos of 5-10 seconds from a larger video).
The videos are persisted in Azure Blob Storage.
How do you suggest to accomplish this in the Azure environment?
The actual cutting of the videos will be initiated by a web job. I'm also concerned about the pricing (within the Azure environment), taking into account the possibility of high traffic.
Any feedback is appreciated.
Thank you.
Assuming you have video-cutting code that operates on files through normal I/O: you'd need to download the video file from blob storage, process it via code (or whatever library you've employed), and then store the result back in blob storage. You cannot reference a blob directly with standard file I/O libraries.
If, however, the videos are stored in Azure File storage (which is an SMB layer on top of blob storage), then you will be able to manipulate your video files directly.
Web Jobs run within an App Service (just like Web Apps), so you have access to a certain amount of local disk space (depending on App Service tier) for use. You should have no problem temporarily storing a video file within your web app's disk space, for editing operations.
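A loose outline of that round trip, assuming the classic WindowsAzure.Storage SDK and leaving the actual cutting to whichever library you pick (CutSegment below is a hypothetical stand-in, and the container names are placeholders too):

    using System.Configuration;
    using System.IO;
    using Microsoft.WindowsAzure.Storage;

    public static class ClipJob
    {
        public static void ProduceClip(string sourceBlobName, string clipBlobName)
        {
            var account = CloudStorageAccount.Parse(
                ConfigurationManager.AppSettings["StorageConnectionString"]);
            var client = account.CreateCloudBlobClient();

            // 1. Pull the source video down to the web job's temporary local disk.
            string localSource = Path.Combine(Path.GetTempPath(), sourceBlobName);
            using (var fs = File.Create(localSource))
            {
                client.GetContainerReference("videos")
                      .GetBlockBlobReference(sourceBlobName)
                      .DownloadToStream(fs);
            }

            // 2. Cut the short segment with your video library of choice.
            string localClip = CutSegment(localSource, startSeconds: 30, lengthSeconds: 10);

            // 3. Push the result back into blob storage.
            using (var fs = File.OpenRead(localClip))
            {
                client.GetContainerReference("clips")
                      .GetBlockBlobReference(clipBlobName)
                      .UploadFromStream(fs);
            }
        }

        // Hypothetical stand-in for the real cutting code (ffmpeg, a managed library, etc.).
        private static string CutSegment(string sourcePath, int startSeconds, int lengthSeconds)
        {
            return Path.ChangeExtension(sourcePath, ".clip.mp4");
        }
    }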
You asked about cost: Again, assuming you're talking about running code within a Web Job (app service), you're just paying for whatever App Service tier you've chosen.
How you actually do those edit operations is entirely up to you (language, library, etc).
Azure Blob Storage is simply an object store which stores the data. It does not have the capability you're looking for.
Azure Media Service however is the service you should look into. The media served by this service makes use of Azure Blob Storage.
For editing video, may I suggest you take a look at Video Editor Plugin for Azure Media Player. You can read more about this plugin here: https://azure.microsoft.com/en-in/blog/video-editor-plugin/. You can also try it out here: http://ampdemo.azureedge.net/amp_editor.html.

Azure storage locality

I am somewhat confused by Azure storage accounts. I do not understand why a storage account can't span multiple geo-locations, and why a request can't be automatically handled by the geo-local Azure storage.
To make it clear, consider below:
I have two data centers, West US and East Europe; each has web servers and blob storage, and the web servers are stateless.
For example:
Region West-US : webserver 1, Blob1
Region East-Europe : webserver2, Blob2
I want my East Europe webserver2 to access "Region East-Europe Blob2" and my West US webserver1 to access "Region West-US Blob1", due to geo-locality.
I do not want webserver1 to access Blob2, because of the extra latency, unless Blob1 is inaccessible.
But Blob1 and Blob2 are in different regions, so they have different URLs and access keys, and I do not see an easy way to achieve what I want.
I know there is Azure Traffic Manager, but it looks like it only supports "Cloud Services" and "Websites", not to mention there is also the access key issue.
So, my question, am I doing something wrong?
Thanks in advance!
Blobs are accessible via REST APIs, so it should not matter where your web server is; you can reference the dependent blobs using the appropriate blob's URI. One thing you do of course have to do is ensure the blob is actually publicly accessible. Take a look at the Blob service REST API documentation for more information.
Of course they will have different URLs and access keys, and you will need separate code/configuration in web server 1 and web server 2 to access these two blob accounts (illustrated in the sketch below).
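One way to read that: keep the code identical in both regions and let each deployment's configuration point at its own storage account (the setting name below is an assumption):

    using System.Configuration;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Blob;

    public static class RegionalStorage
    {
        // The West-US deployment's config holds Blob1's connection string,
        // the East-Europe deployment's config holds Blob2's.
        public static CloudBlobClient CreateClient()
        {
            var account = CloudStorageAccount.Parse(
                ConfigurationManager.AppSettings["RegionalStorageConnectionString"]);
            return account.CreateCloudBlobClient();
        }
    }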
A completely different thing is Azure CDN. I mention this because you were referring to a Traffic Manager kind of mechanism for Azure storage. CDN is not exactly that, but it certainly comes to mind as something that might be relevant for you.
You can make these blobs the origin for the CDN, and the CDN will cache the content at different edge servers. In your web application, instead of accessing the blob URL directly, you access the CDN URL, and the CDN decides which edge server the requested content (blob) should be served from.
Take a look at https://azure.microsoft.com/en-in/documentation/articles/cdn-serve-content-from-cdn-in-your-web-application/

Windows Azure: how to set up front-end and back-end with a shared image folder

I'm trying to find the best setup for my website on Windows Azure.
I have a front-end and a back-end website made in ASP.NET MVC4.
Both websites must use the same shared images: the front-end for displaying them, the back-end for CRUD actions. The image files are stored in a folder in the front-end web application, and the URLs to those images are stored in a MySQL database.
Currently I have 2 Windows Azure websites, but I can't access the images from the back-end website because they are stored in a folder in the front-end application.
What's the best and cheapest setup for this type of application?
2 websites with shared BLOB storage ?
A cloud service containing 2 webroles (front- and back-end) ?
... ?
Thanks
First, you should not use the web application's folder for anything other than temporary operations. Since Azure means a multi-machine environment, a resource (image) stored on one instance won't be available to a request served from another instance once you use more than one instance (machine).
I would go with 2 blob containers (not 2 blob storage accounts).
We do not have IP-based restrictions on blobs yet, so as long as you don't share those addresses you will be fine. If you really need a restriction you can use a shared access policy; you can find more details in "Use a Stored Access Policy", and you should also review "Restrict Access to Containers and Blobs".
I think that using a shared blob storage account is the right direction.
Using a local folder is not a good idea - on web sites and cloud services local storage is not persistent and you may lose your files. Either way, this is not a scalable solution - if you add additional instances in the future they will not have access to the files.
Using blob storage will give you a store that is accessible from both websites and indeed from the client's browser directly.
You do not specify whether the images need to be accessed securely from the front end or not; if not, blob storage is particularly useful, as the images can be served directly from a public container on Azure storage.
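As a rough sketch of that setup (the container name and connection string setting are placeholders): the back-end uploads images to a container both sites can reach, and the returned URL is what you would store in the MySQL database for the front-end to put in its img tags:

    using System.Configuration;
    using System.IO;
    using Microsoft.WindowsAzure.Storage;

    public static class SharedImages
    {
        public static string Upload(Stream imageStream, string fileName)
        {
            var account = CloudStorageAccount.Parse(
                ConfigurationManager.AppSettings["StorageConnectionString"]);
            var container = account.CreateCloudBlobClient().GetContainerReference("images");
            container.CreateIfNotExists();  // public read access can be enabled on the container if needed

            var blob = container.GetBlockBlobReference(fileName);
            blob.UploadFromStream(imageStream);

            // Save this URL in the database; the front-end can reference it directly.
            return blob.Uri.AbsoluteUri;
        }
    }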

Windows azure requests

I have an application that is deployed on Windows Azure; in the application there is a Report part, and the reports work as shown below.
The application generates the report as a PDF file and saves it in a certain folder in the application.
I have a PDF viewer in the application that takes the URL of the file and displays it.
As you know, in Windows Azure I will have several VMs handled through a load balancer, so I cannot ensure that the request in step 2 will go to the same VM as in step 1, and this causes a problem for me.
Any help is very appreciated.
I know that I can use blob storage, but this is not the problem.
The problem is that after creating the file on a certain VM, I give the PDF viewer the URL of the PDF as "http://..../file.pdf". This will generate a new request that I cannot control, and I cannot know which VM will serve it, so even if I saved the file in the blob it would not solve my problem.
As in any farm environment, you have to consider saving files in storage that is common to all machines in the farm. In Windows Azure, that common storage is Windows Azure Blob Storage.
You have to make some changes to your application so that it saves the files to blob storage. If these are public files, then you just mark the blob container as public and provide the full URL of the file in the blob to the PDF viewer.
If your PDF files are private, you have to mark your container as private. The second step is to generate a Shared Access Signature URL for the PDF and provide that URL to the PDF viewer (a rough sketch of both steps follows).
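A rough sketch of both steps (the container name and the one-hour SAS lifetime are assumptions): upload the generated PDF to a private container, then hand the viewer a short-lived SAS URL:

    using System;
    using System.Configuration;
    using System.IO;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Blob;

    public static class ReportStore
    {
        public static string SaveAndGetSasUrl(Stream pdfStream, string fileName)
        {
            var account = CloudStorageAccount.Parse(
                ConfigurationManager.AppSettings["StorageConnectionString"]);
            var container = account.CreateCloudBlobClient().GetContainerReference("reports");
            container.CreateIfNotExists();  // created without public access, i.e. private

            var blob = container.GetBlockBlobReference(fileName);
            blob.Properties.ContentType = "application/pdf";
            blob.UploadFromStream(pdfStream);

            var sas = blob.GetSharedAccessSignature(new SharedAccessBlobPolicy
            {
                Permissions = SharedAccessBlobPermissions.Read,
                SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddHours(1)
            });

            return blob.Uri.AbsoluteUri + sas;  // give this URL to the PDF viewer
        }
    }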
Furthermore, while developing you can explore your Azure storage using any of the (freely and not so freely) available tools for Windows Azure Storage. Here are some:
Azure Storage Explorer
Azure Cloud Storage Studio
There are a lot of samples showing how to upload a file to Azure Storage. Just search with your favorite search engine, or check out these resources:
http://msdn.microsoft.com/en-us/library/windowsazure/ee772820.aspx
http://blogs.msdn.com/b/windowsazurestorage/archive/2010/04/11/using-windows-azure-page-blobs-and-how-to-efficiently-upload-and-download-page-blobs.aspx
http://wely-lau.net/2012/01/10/uploading-file-securely-to-windows-azure-blob-storage-with-shared-access-signature-via-rest-api/
The Windows Azure Training Kit also has a great lab named "Exploring Windows Azure Storage".
Hope this helps!
UPDATE (following question update)
The problem is that after creating the file on a certain VM, I give the PDF viewer the URL of the PDF as "http://..../file.pdf". This will generate a new request that I cannot control, and I cannot know which VM will serve it, so even if I saved the file in the blob it would not solve my problem.
Try changing your logic a bit and follow my suggestion: when your VM creates the PDF, upload the file to a blob, then give the full blob URL for your PDF file to the PDF viewer. Thus the request will not go to any VM, but straight to the blob, and the full blob URL will be something like http://youraccount.blob.core.windows.net/public_files/file.pdf
Or am I missing something? As I understand it, your process flow is as follows:
User makes a special request which would cause PDF file generation
File is generated on the server
Full URL to the file is sent back to the client so that the client PDF viewer can render it
If this is the flow, then with the suggested changes it will look like the following:
User makes a special request which would cause PDF file generation
File is generated on the server
File is uploaded to blob storage
Full URL for the file in blob storage is returned to the client, so that it can be rendered on the client.
What is not clear? Or what is different in your process flow? I do exactly the same thing for on-the-fly report generation and it works quite well. The only difference is that my app is Silverlight-based and I force a file download instead of displaying inline.
An alternative approach is not to persist the file at all.
Rather, generate it in memory, set the content type of the response to "application/pdf" and return the binary content of the report. This is particularly easy if you're using ASP.NET MVC, but you can use an HttpHandler instead. It is a technique I regularly use in similar circumstances (though lately with Excel reports rather than PDF).
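For instance, a quick sketch of that in ASP.NET MVC (GenerateReportPdf is a hypothetical stand-in for whatever reporting/PDF library you use):

    using System.Web.Mvc;

    public class ReportController : Controller
    {
        public ActionResult Report(int id)
        {
            byte[] pdfBytes = GenerateReportPdf(id);

            // Returned inline; pass a file name as a third argument to force a download instead.
            return File(pdfBytes, "application/pdf");
        }

        // Hypothetical: build the PDF entirely in memory with your library of choice.
        private byte[] GenerateReportPdf(int id)
        {
            return new byte[0];
        }
    }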
The usefulness of this approach does depend on how you're generating the PDF, how big it is and what the load is on your application.
But if the report is to be served only once, persisting it merely so that the browser can make another request to retrieve it is wasteful (and you have to provide the persistence mechanism).
If the same file is to be served multiple times and it is resource-intensive to create, then it makes sense to persist it.
You want to save your PDF to centralized, persistent storage; a VM's hard drive is neither. Azure Blob Storage is likely the simplest and best solution. It is dirt cheap to store and access, and the API for storing and accessing files is very simple.
There are two things you could consider.
Windows Azure Blob + Queue Storage
Blob Storage is a cost-effective way of storing binary data and sharing it between instances. You would most likely use a worker role to create the report, which would store the report in Blob Storage and drop a "completed" message on the queue.
Your web role instance could monitor the queue looking for reports that are ready to be displayed.
It would be similar to the concept used in the Windows Azure Guest Book app.
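A loose sketch of that hand-off (the queue name and message contents are assumptions, not anything prescribed by the Guest Book sample): the worker role drops the finished report's blob URL on a queue, and the web role polls for it:

    using System.Configuration;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Queue;

    public static class ReportQueue
    {
        private static CloudQueue GetQueue()
        {
            var account = CloudStorageAccount.Parse(
                ConfigurationManager.AppSettings["StorageConnectionString"]);
            var queue = account.CreateCloudQueueClient().GetQueueReference("completed-reports");
            queue.CreateIfNotExists();
            return queue;
        }

        // Worker role: after uploading the PDF to blob storage, announce it.
        public static void Announce(string reportBlobUrl)
        {
            GetQueue().AddMessage(new CloudQueueMessage(reportBlobUrl));
        }

        // Web role: poll for a report that is ready to display; returns null if none yet.
        public static string TryGetCompletedReport()
        {
            var queue = GetQueue();
            var message = queue.GetMessage();
            if (message == null) return null;

            queue.DeleteMessage(message);
            return message.AsString;  // the blob URL dropped by the worker
        }
    }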
Windows Azure Caching Service
Similarly (and much more expensively) you could share the binary using the Caching Service. This gives you a common layer between your VMs in which to store things; however, you won't be able to provide a URL to the PDF, so you'd have to retrieve the binary and either use an HttpHandler or change the content type of the response.
This would be much harder to implement, very expensive to run, and is not guaranteed to work in your scenario. I'd still suggest blobs over any other means.
Another option would be to implement a sticky session handler of your own. Take a look at:
http://dunnry.com/blog/2010/10/14/StickyHTTPSessionRoutingInWindowsAzure.aspx
