How to access XML hosted as an Azure blob from an Azure website

I'm currently looking into the following problem:
We have a backend application that creates XML files with content and stores them as Azure Blobs (we cannot change this).
Blob sample url: http://mytestaccount.blob.core.windows.net
We are implementing a webpage that consumes those XML files.
Our current solution is a static webpage (no IIS or any other server required) hosted in the same blob storage as the XML files mentioned above.
Question 1: Is there a way to redirect a domain name to our webpage hosted currently as blob?
We thought about hosting our web application not as Blob but using Azure Web Sites. This creates a problem of cross-domain requests (we have to get those XML files).
Website sample url: http://mytestpage.azurewebsites.net
Question 2: Is there a way to download XML stored as a Blob with a jQuery AJAX call from such a webpage (hosted as an Azure Web Site)?

Question 1: Is there a way to redirect a domain name to our webpage hosted currently as blob?
Yes, it's certainly possible; however, there are currently some limitations. Please see this link on configuring a sub-domain that points to blob storage: http://msdn.microsoft.com/en-us/library/windowsazure/ee795179.aspx. Please note that with this approach you currently don't get an option to specify a default document for your website.
Question 2: Is there a way to download XML by jquery ajax call from such a webpage (hosted as Azure Web Page) to xml stored as Blob?
Reading the blob would not cause cross-domain request issues. A cross-domain issue would arise when you're trying to POST/PUT content to your blob storage.
As long as your blob container is public, you should be able to access the blob without any issues. It's a simple GET. However, if your blob container is private, you would need to create a shared access signature URI on the blob with "Read" permission and use that URI in your jQuery AJAX call.
UPDATE: Contrary to what I wrote above, reading a blob via a jQuery AJAX call does give a "Cross Domain" error.
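For server-side code, where the browser's same-origin policy does not apply, the difference between a public and a private container comes down to the URL that is requested. The following is a minimal Python sketch of that idea, not a definitive implementation. The container name `xmlfiles`, the blob name, and the helper `blob_url` are hypothetical, and the SAS token is treated as an opaque query string that has already been generated elsewhere.

```python
from urllib.parse import quote


def blob_url(account, container, blob_name, sas_token=None):
    """Build the HTTP URL of a blob, optionally appending a SAS query string."""
    # Blob names may contain "/" (virtual folders), so leave it unescaped.
    url = "http://{}.blob.core.windows.net/{}/{}".format(
        account, container, quote(blob_name, safe="/")
    )
    if sas_token:
        # Assumption: the token is an already-encoded query string.
        # Strip a leading "?" so that exactly one is present.
        url += "?" + sas_token.lstrip("?")
    return url


# Public container: a plain GET on this URL is enough.
public_url = blob_url("mytestaccount", "xmlfiles", "data/feed.xml")

# Private container: the same URL plus a SAS token with Read permission.
private_url = blob_url("mytestaccount", "xmlfiles", "data/feed.xml", "?sig=PLACEHOLDER")
```

Either URL could then be fetched with `urllib.request.urlopen(url).read()`.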

Azure storage options to serve content on Azure Web App

I am a total newbie to Azure Web Apps and storage, and I need some clarification/confirmation. The main thing to note is that my application (described below) requires a folder hierarchy. Blob is out of the question, and file share doesn't allow anonymous access unless I use a Shared Access Signature (SAS).
Am I understanding Azure storage correctly: it's either you fit into the Azure storage model or you don't?
Can anyone advise how I can achieve what's required by the CMS application as described below by using Blobs?
The only option I see is to find a way to change the CMS application so that it always has the SAS in the URL to every file it requests from storage in order to serve content on my Web App? If so, is it a problem if I set my SAS to expire sometime in the distant future?
https://<appname>.file.core.windows.net/instance1/site1/file1.jpg?<SAS>
Problem with using Blob
So far my understanding is that Blob storage doesn't allow "sub folders" as it's a container that holds unstructured data, therefore I'm unable to use this based on my application (described below) as it requires folder structure.
The problem with using File Share
File share seemed perfect as it allows for folder hierarchy, naturally that's what I've used.
However, no anonymous access is allowed for files stored in file storage, the access needs to be authorised. One way of authorising the access is to create a SAS on a file/share level with Read permission and then using that SAS URL to access the file.
Cannot access Windows azure file storage document
My application
I've created a Linux Web App running an open source CMS application. This application allows the creation of multiple websites, with each website's content (images, docs, multimedia) stored on a file server. These files are then served to the website via a defined URL.
The CMS application has a setting for the location where it should save its files; this would be a folder on the file server. It then creates a new sub folder in that location for every site it hosts.
Example folder hierarchy
/instance1
    /site1
        /file1
        /file2
    /site2
        /file1
        /file2
Am I understanding Azure storage correctly, it's either you fit into
the Azure storage model or you don't?
You can use the Azure storage model for your CMS application, with either Blob Storage or File Share.
Can anyone advise how I can achieve what's required by the CMS
application as described below by using Blobs?
You can use a Data Lake Gen 2 storage account if you want to use Azure Blob Storage.
Data Lake Gen 2 storage enables a hierarchical namespace, so you can use subfolders in Blob Storage as your requirements dictate.
Problem with using Blob
Blob Storage allows subfolders if you use a Data Lake Gen 2 storage account. You can also enable public anonymous access for blobs.
The problem with using File Share
Azure File Share supports a folder hierarchy but does not allow public anonymous access. You can use an Azure Managed Identity (system-assigned) for your web app to access the Azure File Share.
Your application would then be able to access the Azure File Share without a SAS token.
The issue of not having real folders in a blob storage shouldn't be any issue for your use case. Just because it doesn't have your traditional folders doesn't mean it can't serve content on e.g. instance1/site1/file1. That's still possible but the instance1/site1/ will just be part of the name of the blob.
Tools like the Azure Portal or Storage Explorer will actually show folders by using the delimiter / and querying data that appears to be inside a folder by using the path as prefix.
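The prefix-and-delimiter behaviour described above can be illustrated with a small Python sketch that works on a plain list of blob names. It is only a model of the idea, not the storage API itself; the function `list_by_prefix` and the sample names are hypothetical.

```python
def list_by_prefix(blob_names, prefix="", delimiter="/"):
    """Return (virtual_folders, blobs) that sit directly under prefix."""
    folders, blobs = set(), []
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter is displayed as a "folder".
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            blobs.append(name)
    return sorted(folders), sorted(blobs)


# Flat blob names; the "folders" exist only as part of each name.
names = [
    "instance1/site1/file1",
    "instance1/site1/file2",
    "instance1/site2/file1",
    "instance1/site2/file2",
]

folders, _ = list_by_prefix(names, "instance1/")
_, files = list_by_prefix(names, "instance1/site1/")
```

Here `folders` holds the two site "folders" and `files` holds the two blobs under `instance1/site1/`, even though the container itself stores only four flat names.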

Multipart byteranges on Azure static website

As it seems, Azure now has an option to publicly serve the contents of Blob storage via HTTP, mainly intended for static websites. It is rather new and currently tagged as "preview".
I'd like to store binary releases (about 3 GB per version) of a game on Azure storage, and allow players to perform a differential update to any version using a zsync-like algorithm. For this to work, it is crucial to be able to download only specified chunks of a large file. Normally, this is achieved over HTTP by sending a multipart byteranges GET request.
The question is: is it possible to enable HTTP requests with multiple byteranges on the Azure "static website"?
UPDATE: As Itay Podhajcer mentioned, I don't need the "static website" feature to serve my blob storage over HTTP; I can directly open my storage container to the public. However, multipart byteranges requests do not work with direct access to blob storage either.
The Azure Static Website service is intended more for Single Page Applications (such as Angular and React apps).
If you only need to store binary content to be downloaded by clients, I think you should be fine using the "regular" Blob container.
To specify a Range header on a GET request, you can follow "Specifying the Range Header for Blob Service Operations".
Hope it helps!
I managed to get multipart byteranges working by using CDN.
The full list of actions is:
1.) Screw the "static website" feature: as Itay Podhajcer wrote, it is absolutely unnecessary.
2.) Upgrade DefaultServiceVersion using Cloud Shell, as said here (to ensure that Accept-Ranges is included in HTTP headers).
3.) Enable "Standard Akamai" CDN to serve Blob Storage directly.
After that I can send multipart byteranges requests to the CDN endpoint, and it gives back multipart response with exactly the data I requested.
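For illustration, a multi-range request of this kind might be built as in the Python sketch below. It only constructs the request and does not send it; the helper names and the URL are hypothetical. A server that honors multiple ranges replies with status 206 and a `multipart/byteranges` body, while one that does not may simply return the whole file.

```python
from urllib.request import Request


def build_range_header(ranges):
    """Format inclusive (start, end) byte ranges as an HTTP Range header value."""
    parts = []
    for start, end in ranges:
        if start < 0 or end < start:
            raise ValueError("invalid byte range: {}-{}".format(start, end))
        parts.append("{}-{}".format(start, end))
    return "bytes=" + ",".join(parts)


def make_range_request(url, ranges):
    """Create (but do not send) a GET request for the given byte ranges."""
    return Request(url, headers={"Range": build_range_header(ranges)})


# Hypothetical endpoint; ask for two separate chunks of the file.
request = make_range_request(
    "http://example.invalid/releases/game.bin", [(0, 99), (200, 299)]
)
```

Passing `request` to `urllib.request.urlopen` would send it; parsing the multipart response is left out of this sketch.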

Files stored under Azure Blob storage [duplicate]

In my application hosted in Azure App Service, we upload some images which are stored directly to Azure blob.
Later we can view the images from the application, but if a user somehow obtains the blob URL, he/she can view the images directly, without going through the application.
Is there any way to make the images viewable only through my application and not via the direct URL?
I tried making the container private, but then the images were no longer visible to users from my application.
Is there some way to store these images securely so that only authenticated users of my application can access or view them?
No, there is no way to restrict the Azure Storage APIs so that only your application, and not a user who happens to get the blob link, can access the content. I don't think adding a CDN into the mix would change that either, since content on a CDN can also be accessed via an endpoint.
If you have a two-tier architecture with a web portal and a web API, then your web API can fetch the content from Azure Storage upon user request and stream it as the HTTP response to your web portal. This way the user sees only the final content as an image, without knowing where the image came from.
But this can make your application slow because of an extra hop to get content from web api. The whole point of web portal having the link to blob is to get it faster.
Regarding security, if the blobs are open to read then anyone can read them. If you want to restrict access to certain users, you have to come up with logic in your web app that issues SAS tokens to users for a limited time.
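To make the "limited time" idea concrete, here is a generic Python sketch of an expiring, HMAC-signed link. It is explicitly not the Azure SAS format or algorithm; it only illustrates the general principle that a link carries an expiry time plus a signature that the holder of a secret key can verify. All names, parameters, and the message layout are assumptions made for this example.

```python
import hashlib
import hmac
import time


def sign_url(path, secret, expires_at):
    """Append an expiry time and an HMAC-SHA256 signature to a path."""
    message = "{}\n{}".format(path, expires_at).encode("utf-8")
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return "{}?expires={}&sig={}".format(path, expires_at, signature)


def is_valid(path, expires_at, signature, secret, now=None):
    """Check that the link has not expired and that the signature matches."""
    now = time.time() if now is None else now
    if now > expires_at:
        return False
    message = "{}\n{}".format(path, expires_at).encode("utf-8")
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the value matched.
    return hmac.compare_digest(expected, signature)
```

The web app would hand such a link only to an authenticated user, and a short expiry limits how long a leaked link remains usable.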

Strategy to minimize Azure storage outbound data costs

I am building a web site that (among other things) allows the user to upload photos via web api. The user images will be stored in azure storage blob to be displayed in user albums, and shared with social media. The site will be hosted as an azure web site. I am eager to minimize data transfer costs. I understand that data transfer between an azure web site and table/blob storage incurs no data transfer charge (as it is not considered "outbound") while any data requested from outside the azure web site does. In response to this, I have 2 strategies for exposing the images to the browser:
1.) Via the URI to the image blob in azure storage e.g. with local storage account http://ipv4.fiddler:10000/devstoreaccount1/bcb2ad7581.jpg
2.) Via web api that downloads the image bytes from storage and returns them. e.g. with local host http://localhost:58559/api/image/bcb2ad7581.jpg
These are my assumptions. Direct-to-storage access (method 1 above) is more efficient. Accessing the images via web api (method 2 above) must incur overheads that direct access doesn't, right? Each web api request must consume an asp .net thread plus cpu cycles. Each web api image request being processed leaves one less slot for requests to other web api resources on the site, which then have to be queued. On the other hand, any external site the image is shared with would add a data transfer cost (among other costs) for each image request if accessed via method 1.
So my strategy is to access the images within the site via a direct link to the storage (method 1), e.g. when the user opens an album all <img> tags have an azure blob uri in their src attribute. However, when the user clicks on the Facebook icon to share, I will provide a link to the image via web api (method 2). I realise the user can bypass all of that with plugins like the "PinIt" button etc, but that's OK.
I am only learning this stuff, so I could be way off.
Am I wrong about outbound transfer costs not being applied to azure web sites? I don't think I am but the whole pricing model is confusing, to say the least.
Is accessing blob storage from a browser html page with an <img> tag and src attribute considered outbound data transfer, even if the html page comes from an azure website domain? I mean, is it only free when the server side code accesses the storage, not the html client?
Is any data transfer cost saved via method 2 (if indeed there is one) simply cancelled out by a different cost associated with the web api method (like bandwidth cost)?
Am I wrong about the performance benefit of direct access to the blob storage, or possibly wrong about the overhead of the web api requests?
It is early days in the design, so I can dump Azure if I have to. I would rather not though, as I think it is what I'm looking for. I don't want something for nothing and am happy to pay for the services I consume. Naturally, though, I don't want my ignorance to cost me.
I could do with your advice, on this, and truly appreciate your help.
To answer your questions:
Am I wrong about outbound transfer costs not being applied to azure
web sites?
Sadly, Yes :) Any data that goes out of an Azure Datacenter (DC) incurs an outbound transfer cost and that includes data served through your websites.
Is accessing blob storage from a browser html page with an <img> tag and src
attribute, considered outbound data transfer; even if the html page
comes from an azure website domain? I mean is it only free when the
server side code accesses the storage, not the html client?
Yes. Remember the browser is consuming the data which is sitting outside of Azure DC.
Is any data transfer cost saved via method 2 (if indeed there is one),
simply cancelled out by a different cost associated with the web api
method (like bandwidth cost)?
No. Because data eventually flows out of Azure DC (doesn't matter if it is via storage directly or via web api).
Am I wrong about the performance benefit of direct access to the blob
storage, or possibly wrong about the overhead of the web api requests?
You will certainly get better performance by providing direct access to the blob storage than by transferring data through web api, which would also add latency.
Solution Recommendation
For your application, may I recommend that you look at Shared Access Signature functionality offered by Azure Blob Storage. I believe this will significantly improve the performance of your application.
For uploads, you could create a SAS URL with upload permission and have your web application upload files directly to blob storage. That way the upload data won't be routed through your servers. I wrote some blog posts on this which you may find useful:
http://gauravmantri.com/2013/02/16/uploading-large-files-in-windows-azure-blob-storage-using-shared-access-signature-html-and-javascript/
http://gauravmantri.com/2013/12/01/windows-azure-storage-and-cors-lets-have-some-fun/
For downloading images, again have your Web API return a SAS URL instead of reading the image data from blob storage and then streaming that data back to the client browser.

Windows azure requests

I have an application that is deployed on Windows Azure. The application has a Reports part, and the reports work as shown below.
1.) The application generates the report as a PDF file and saves it in a certain folder in the application.
2.) I have a PDF viewer in the application that takes the URL of the file and displays it.
As you know, in Windows Azure I will have several VMs that will be handled through a load balancer, so I cannot ensure that the request in step 2 will go to the same VM as in step 1, and this will cause a problem for me.
Any help is much appreciated.
I know that I can use BLOB, but this is not the problem.
The problem is that after creating the file on a certain VM, I give the PDF viewer the URL of the PDF file as "http://..../file.pdf". This will generate a new request that I cannot control, and I cannot know which VM will serve it, so even if I saved the file in the BLOB it would not solve my problem.
As in any farm environment, you have to consider saving files in storage that is common to all machines in the farm. In Windows Azure, such common storage is Windows Azure Blob Storage.
You have to make some changes to your application so that it saves the files to Blob storage. If these are public files, then you just mark the Blob Container as public and provide the full URL of the file in blob storage to the PDF viewer.
If your PDF files are private, you have to mark your container as private. The second step is to generate a Shared Access Signature URL for the PDF and provide that URL to the PDF viewer.
Furthermore, while developing you can explore your Azure storage using any of the (free and not so free) tools available for Windows Azure Storage. Here are some:
Azure Storage Explorer
Azure Cloud Storage Studio
There are a lot of samples showing how to upload a file to Azure Storage. Just search with your favorite search engine. Check out these resources:
http://msdn.microsoft.com/en-us/library/windowsazure/ee772820.aspx
http://blogs.msdn.com/b/windowsazurestorage/archive/2010/04/11/using-windows-azure-page-blobs-and-how-to-efficiently-upload-and-download-page-blobs.aspx
http://wely-lau.net/2012/01/10/uploading-file-securely-to-windows-azure-blob-storage-with-shared-access-signature-via-rest-api/
The Windows Azure Training Kit has a great lab named "Exploring Windows Azure Storage".
Hope this helps!
UPDATE (following question update)
The problem is that after creating the file on a certain VM, I give
the PDF viewer the url of the pdf viewer as "http://..../file.pdf".
This will generate a new request that I cannot control, and I cannot
know which VM will server, so even I saved the file in the BLOB it
will not solve
Try changing your logic a bit, and follow my instructions. When your VM creates the PDF, upload the file to a blob. Then give the full blob URL of your PDF file to the PDF viewer. Thus the request will not go to any VM, but just to the blob. The full blob URL will be something like http://youraccount.blob.core.windows.net/public_files/file.pdf
Or am I missing something? As I understand it, your process flow is as follows:
User makes a special request which would cause PDF file generation
File is generated on the server
full URL to the file is sent back to the client so that a client PDF viewer could render it
If this is the flow, then with the suggested changes it will look like the following:
User makes a special request which would cause PDF file generation
File is generated on the server
File is uploaded to a BLOB storage
Full URL for the file in the BLOB is returned back to the client, so that it can be rendered on the client.
What is not clear? Or what is different in your process flow? I do exactly the same for on-the-fly report generation and it works quite well. The only difference is that my app is Silverlight-based and I force a file download instead of displaying the file inline.
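The revised flow can be sketched roughly as follows, in Python, with an in-memory class standing in for Blob storage. The class `InMemoryBlobStore`, the function `handle_report_request`, the container name `reports`, and the placeholder PDF bytes are all hypothetical; a real application would call the storage API at the upload step.

```python
class InMemoryBlobStore:
    """Hypothetical stand-in for Blob storage shared by every VM."""

    def __init__(self, account):
        self.account = account
        self.blobs = {}

    def upload(self, container, name, data):
        self.blobs[(container, name)] = data
        # The URL points at storage, not at the VM that produced the file.
        return "http://{}.blob.core.windows.net/{}/{}".format(
            self.account, container, name
        )


def handle_report_request(store, report_id):
    """Generate a report, upload it, and return the URL for the PDF viewer."""
    # Placeholder bytes; a real report generator would go here.
    pdf_bytes = b"%PDF-1.4 placeholder for report " + report_id.encode("ascii")
    return store.upload("reports", report_id + ".pdf", pdf_bytes)


store = InMemoryBlobStore("youraccount")
viewer_url = handle_report_request(store, "r42")
```

Because `viewer_url` refers to the storage account rather than to any particular VM, it does not matter which instance the load balancer picked for the original request.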
An alternative approach is not to persist the file at all.
Rather, generate it in memory, set the content type of the response to "application/pdf" and return the binary content of the report. This is particularly easy if you're using ASP.NET MVC, but you can use an HttpHandler instead. It is a technique I regularly use in similar circumstances (though lately with Excel reports rather than PDF).
The usefulness of this approach does depend on how you're generating the PDF, how big it is and what the load is on your application.
But if the report is to be served just once, persisting it just so that another request can be made by the browser to retrieve it is just wasteful (and you have to provide the persistence mechanism).
If the same file is to be served multiple times and it is resource-intensive to create, it makes sense to persist it, then.
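As a rough illustration of the in-memory approach, here is a minimal Python WSGI sketch, used as a stand-in for the ASP.NET MVC action or HttpHandler mentioned above. The function `generate_report` is hypothetical and returns placeholder bytes rather than a real PDF.

```python
def generate_report():
    """Hypothetical report generator; returns the PDF content as bytes."""
    return b"%PDF-1.4\n% placeholder content, not a valid PDF\n"


def report_app(environ, start_response):
    """WSGI application that serves a freshly generated report."""
    body = generate_report()
    start_response("200 OK", [
        ("Content-Type", "application/pdf"),
        ("Content-Length", str(len(body))),
    ])
    # Nothing is written to disk, so any instance can answer any request.
    return [body]
```

For a local try-out, `wsgiref.simple_server.make_server("", 8000, report_app).serve_forever()` serves it on port 8000.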
You want to save your PDF to centralized, persistent storage. A VM's hard drive is neither. Azure Blob Storage is likely the simplest and best solution. It is dirt cheap to store and access, and the API for storing and accessing files is very simple.
There are two things you could consider.
Windows Azure Blob + Queue Storage
Blob Storage is a cost-effective way of storing binary data and sharing it between instances. You would most likely use a worker role to create the report, which would store the report in Blob Storage and drop a completion message on the queue.
Your web role instance could monitor the queue looking for reports that are ready to be displayed.
It would be similar to the concept used in the Windows Azure Guest Book app.
Windows Azure Caching Service
Similarly [and much more expensively] you could share the binary using the Caching Service. This gives a common layer between your VMs in which to store things; however, you won't be able to provide a URL to the PDF. You'd have to download the binary and either use an HttpHandler or set the content type of the response.
This would be much harder to implement, very expensive to run, and is not guaranteed to work in your scenario. I'd still suggest Blobs over any other means.
Another option would be to implement a sticky session handler of your own. Take a look at:
http://dunnry.com/blog/2010/10/14/StickyHTTPSessionRoutingInWindowsAzure.aspx

Resources