Windows azure requests - azure

I have an application that is deployed on Windows Azure, in the application there is a Report part, the reports works as shown below.
The application generates the report as a PDF file and save it in a certain folder in the application.
I have a PDF viewer in the application that takes the URL of the file and displays it.
As you know, in windows azure I will have several VMs that will handled through a Load balancer so I can not ensure that the request in step 2 will go to the same VM in step 1, and this will cause a problem for me.
Any help is very appreciated.
I know that I can use BLOB, but this is not the problem.
The problem is that after creating the file on a certain VM, I give the PDF viewer the url of the pdf viewer as "http://..../file.pdf". This will generate a new request that I cannot control, and I cannot know which VM will server, so even I saved the file in the BLOB it will not solve my problem.

as in any farm environment, you have to consider saving files in a storage that is common for all machines in the farm. In Windows Azure, such common storage is Windows Azure Blob Storage.
You have to make some changes to your application, so that it saves the files to a Blob stroage. If these are public files, then you just mark the Blob Container as public and provide the full URL to the file in blob to the PDF viewer.
If your PDF files are private, you have to mark your container as private. Second step is to generate a Shared Access Signature URL for the PDF and provide that URL to the PDF viewer.
Furthermore, while developing you can explore your Azure storage using any of the (freely and not so freely) available tools for Windows Azure Storage. Here are some:
Azure Storage Explorer
Azure Cloud Storage Studio
There are a lot of samples how to upload file to Azure Storage. Just search it with your favorite search engine. Check out these resources:
http://msdn.microsoft.com/en-us/library/windowsazure/ee772820.aspx
http://blogs.msdn.com/b/windowsazurestorage/archive/2010/04/11/using-windows-azure-page-blobs-and-how-to-efficiently-upload-and-download-page-blobs.aspx
http://wely-lau.net/2012/01/10/uploading-file-securely-to-windows-azure-blob-storage-with-shared-access-signature-via-rest-api/
The Windows Azure Training Kit has great lab named "Exploring Windows Azure Storage"
Hope this helps!
UPDATE (following question update)
The problem is that after creating the file on a certain VM, I give
the PDF viewer the url of the pdf viewer as "http://..../file.pdf".
This will generate a new request that I cannot control, and I cannot
know which VM will server, so even I saved the file in the BLOB it
will not solve
Try changing a bit your logic, and follow my instructions. When your VM create the PDF, upload the file to a blob. Then give the full blob URL for your pdf file to the PDF viewer. Thus the request will not go to any VM, but just to the blob. And the full blob URL will be something like http://youraccount.blob.core.windows.net/public_files/file.pdf
Or I am missing something? What I understand, your process flow is as follows:
User makes a special request which would cause PDF file generation
File is generated on the server
full URL to the file is sent back to the client so that a client PDF viewer could render it
If this is the flow, that with suggested changes will look like the following:
User make a special request which would cause PDF file generation
File is generated on the server
File is uploaded to a BLOB storage
Full URL for the file in the BLOB is returned back to the client, so that it can be rendered on the client.
What is not clear? Or what is different in your process flow? I do exaclty the same for on-the-fly report generation and it works quite well. The only difference is that my app is Silverlight based and I force file download instead of inline-displaying.

An alternative approach is not to persist the file at all.
Rather, generate it in memory, set the content type of the response to "application/pdf" and return the binary content of the report. This is particularly easy if you're using ASP.NET MVC, but you can use a HttpHandler instead. It is a technique I regularly use in similar circumstances (though lately with Excel reports rather than PDF).
The usefulness of this approach does depend on how you're generating the PDF, how big it is and what the load is on your application.
But if the report is to be served just once, persisting it just so that another request can be made by the browser to retrieve it is just wasteful (and you have to provide the persistence mechanism).
If the same file is to be served multiple times and it is resource-intensive to create, it makes sense to persist it, then.

You want to save your PDF to a centralized persisted storage. VM's hard drive is neither. Azure Blob Storage is likely the simplest and best solution. It is dirt cheap to store and access. API for storing files and access them is very simple

There are two things you could consider.
Windows Azure Blob + Queue Storage
Blob Storage is a cost effective way of storing binary and sharing that information between instances. You would most likely use a worker role to create the Report which would store the report to Blob Storage and drop a completed message on the Queue.
Your web role instance could monitor the queue looking for reports that are ready to be displayed.
It would be similar to the concept used in the Windows Azure Guest Book app.
Windows Azure Caching Service
Similarly [and much more expensive] you could share the binary using the Caching Service. This gives a common layer between your VMs in which to store things, however you won't be able to provide a url to the PDF you'd have to download the binary and use either an HttpHandler or change the content-type of the request.
This would be much harder to implement, very expensive to run, and is not guaranteed to work in your scenario. I'd still suggest Blobs over any other means

Another option would be to implement a sticky session handler of your own. Take a look at:
http://dunnry.com/blog/2010/10/14/StickyHTTPSessionRoutingInWindowsAzure.aspx

Related

Scanning for malware in files uploaded to Azure

I am building a .net app (Azure app service) with file upload feature that would upload pdf/docx files to Azure blob storage.
I was just wondering if a malicious user uploads a virus infected file, bad macro in word file, is Azure able to scan and remove/quarantine that file?
In the app, admin user will be able to download the file using a url. Will that file be virus free? Or do I need to explicitly call antivirus like Symantec End Point after the admin downloads the file.
Please let me know your expert thoughts.
Looks like Azure Advanced Threat Protection is now available for Blob Storage (and other Azure resources).
At this moment, there's no solution for that available on Azure. You'd need to use 3rd party or build one yourself. In the future, you'll be able to use https://learn.microsoft.com/en-us/azure/security-center/threat-protection
There is an open source solution that my team implemented, Its a small antivirus system that sends all uploaded blobs from a specific container to antivirus scan (using VM with Microsoft defender) and the blob is moved based on the scan results to a different container.
Also the remediation can be modified.
You can use that solution to send to scan each blob uploaded and download blobs only from a "clean-container".
here's a link to the repo azure-storage-av-automation.

Cut videos from Azure Blob Storage

I have a web app that is hosted in Azure; one of it's functionalities is to be able to make a few cuts from the video(generate 2 or 3 small videos of 5-10 seconds from a larger video).
The videos are persisted in Azure Blob Storage.
How do you suggest to accomplish this in the Azure environment?
The actual cutting of the videos will be initiated by a web job. I'm also concerned about the pricing(within the Azure environment), I'm taking into account the possibility of high traffic.
Any feedback is appreciated.
Thank you.
Assuming you have video-cutting code that operates on files through normal I/O: You'd need to download the video file from blob, process it via code (or whatever library you've employed), and then store the result back in blob storage. You cannot reference a blob directly with normal standard IO libraries.
If, however, videos are stored in Azure File storage (which is an SMB layer on top of blob storage, then you will be able to directly manipulate your video files.
Web Jobs run within an App Service (just like Web Apps), so you have access to a certain amount of local disk space (depending on App Service tier) for use. You should have no problem temporarily storing a video file within your web app's disk space, for editing operations.
You asked about cost: Again, assuming you're talking about running code within a Web Job (app service), you're just paying for whatever App Service tier you've chosen.
How you actually do those edit operations is entirely up to you (language, library, etc).
Azure Blob Storage is simply an object store which stores the data. It does not have the capability you're looking for.
Azure Media Service however is the service you should look into. The media served by this service makes use of Azure Blob Storage.
For editing video, may I suggest you take a look at Video Editor Plugin for Azure Media Player. You can read more about this plugin here: https://azure.microsoft.com/en-in/blog/video-editor-plugin/. You can also try it out here: http://ampdemo.azureedge.net/amp_editor.html.

Any ideas about how to check Azure's blob storage for viruses?

Our application stores files uploaded from our customers to blob storage. These files are exchanged between different parties (our customers and their suppliers). Is there a way to check the uploaded files for viruses? The Antimalware service seems to just check virtual machines, but I cannot get any information about using it to check files as a service.
A great solution would be if we could store such a file in Azure Storage as an "on hold" file till it is checked. Then we would need a service to check this file and returns the result. If the file is virus-free we could then move it to the final destination.
Azure Storage is just... storage. There are no utilities built in, such as antivirus. You'd need to do your antivirus check on your own. Since antivirus tools typically only work with local OS storage, you'd need to place your "on hold" content (as you referred to it) on a local disk somewhere that you have antivirus installed and then copy to blob storage once your antivirus check is done.
How you accomplish managing this, and which software you use, is up to you. But VMs, App Services, and Cloud Services (web/worker roles) all have local disks available.
As the other answer states Azure Storage is just storage. There are a couple of ways you could do this though,
The first solution would be to run your own anti-virus and use this either as a gateway or programatically download the file from the Blob storage, check the file and then take the appropriate action. It's possible to run something like ClamAV to do this yourself.
Alternatively you could use a third party service like AttachmentScanner (which is exactly what you mention in your comment) which will accept a URL or a direct file upload. With Azure you can generate a temporary url pointing to the file with an expiration of a few minutes, pass the URL to AttachmentScanner and then take the appropriate action depending on the result.
I read an article about virus scanning for blob storage. Might be useful for you.
This guy is using an azure function trigger for the blob to catch the changes and sending the blob file to a virus scanner. The virus scanner is running in a docker container.
Full implementation details are available in the link below
https://peterrombouts.nl/2019/04/15/scanning-blob-storage-for-viruses-with-azure-functions-and-docker/
You can use Azure Defender for Storage to detect following:
Suspicious access patterns - such as successful access from a Tor exit node or from an IP considered suspicious by Microsoft Threat Intelligence
Suspicious activities - such as anomalous data extraction or unusual change of access permissions
Upload of malicious content - such as potential malware files (based on hash reputation analysis) or hosting of phishing content
And to enable it you need to go to Advanced security:
I setup an "azinbox" folder on an azure file storage container. I setup a console application (job) on a VM to check every 30 seconds for a file in that folder. If the job finds it, it moves the file from azinbox to a vminbox folder on the VM. As soon as the files shows up on the VM, if it has a virus, it gets quarantined and the file is deleted from the vminbox. The job on the vm then checks 30 seconds later to see if the file is still in the vminbox. If it is, it must be OK. The job moves the validated file to an azoutbox folder on the azure file storage container. From the Web Site perspective, 1) upload the file to azinbox 2) wait a minute and check the azoutbox. If the file is found, the website moves the file from the azoutbox to its final destination.
I admit it is a crappy solution because it takes a LONG time to complete a file upload. A minute or two can seem like a long time to upload a simple PDF to the user especially if they have more than one to upload.
Also, this requires you setup an entire VM server JUST to validate a file for a virus.
If anyone has a better option, please let me know.

What is the best strategy for using Windows Azure as a file storage system - with http download capabilities

I need to store multiple files that users upload, and then provide these users with the capability of accessing their files via http. There are two key considerations:
- Storage (which is my primary concern here)
- Security (which let's leave aside for now)
The question is:
What is the most cost efficient and performant way of storing all these files and giving access to them later? I believe the answer is:
- Store files within Azure Storage Account, and have a key that references them in an SQL Azure database.
I am correct on this?
Is a blob storage flat? Or can I create something like folders inside it to better organize my files?
The idea of using SQL Azure to store metadata for your blobs is a pretty common scenario, which allows you to take advantage of SQL for searching, and blobs for storage.
Blobs are organized by container. So you'd have something like:
http://mystorage.blob.core.windows.net/mycontainer/myfile.doc
You can also simulate a hierarchy using a delimiter, but in reality there's just container plus blob.
If you keep the container or blob private, the user would either have to go through your web front end (or web service), or you'd have to provide them with a special URL with a Shared Access Signature appended, which is a time-limited URL.
I would recommend you to take a look at BlobShare Sample which is a simple file sharing application that demonstrates the storage services of the Windows Azure Platform, together with the authentication and authorization capabilities of Access Control Service (ACS). The full sample code is located at following link:
http://blobshare.codeplex.com/
You can use this sample code immediately, just by adding proper reference to your Windows Azure Account credentials. The best thing with this sample is that you can provide blob access directly through Access Control Services. You can also modify the code to add SAS support as well as blob download from public containers. Once you have it working and understood the concept you can tweak to make it the way you would want.

Where to store things like user pictures using Azure? Blob Storage?

I have just migrated a project of mine for test cases to Microsoft's azure.
But for functionalities similar to an avatar upload I need write access to the files on the harddrive. But this is a cloud, so this is not possible.
How can I build such functionalities instead? Should I use the Blob Storage or is there a better solution?
Does it make sense to store all website images (f.e. layout images) in the Blob Storage? So I would have a Cookie-free Domain for my static content?
Blob storage is definitely the place to put dynamic images like avatars. While you can write to the disk on the VM you'll be running in, you can't rely on this to be present - if your app gets moved off to another machine (which could happen for any number of reasons) this storage will be erased.
One thing you could do is store your images in blob storage, and cache them on the local VM disk (using the standard file IO mechanisms). This way you'll get pretty good performance and will save on a few storage transactions while still making sure you're not storing in volatile storage.
If you've got static images which will be completely static, these are just bundled with your application and can be referenced like a normal file. But, if you will ever need to change them, you'd need to redeploy the application - so only use this technique for images which won't need to change.
Be aware there are two types of Blobs in Windows Azure: Block Blobs and Page Blobs. Block Blobs are appropriate for media file serving, whereas Page Blobs are optimized for other work patterns.
Also consider use of the Azure Content Distribution Network (CDN) for lowering latency to clients.
Azure also has streaming capabilities which work in concert with Silverlight Smooth Streaming (http://blog.smarx.com/posts/smooth-streaming-with-windows-azure-blobs-and-cdn if interested).
"Does it make sense to store all website images (f.e. layout images) in the Blob Storage? So I would have a Cookie-free Domain for my static content?"
Yes I think so - this is what I'm rolling out right now actually.

Resources