Moving locally stored documents to Azure

I want to spike whether Azure and the cloud are a good fit for us.
We have a website, currently hosted by us, to which users upload documents.
Every document has an equivalent record in a database.
I am using Terraform to create the Azure infrastructure.
What is the best way of migrating the documents from the local file path on the server to Azure?
Should I be using File Storage or Blob Storage? I am confused about the difference.
Is there anything in Terraform that can help with this?

Based on your comments, I would recommend storing them in Blob Storage. This service is suited to storing and serving unstructured data like files and images, and it offers many other features, such as redundancy and archiving, that you may find useful in your scenario.
File Storage is more suitable for lift-and-shift scenarios, where you're moving an on-premises application to the cloud and the application writes data to either a local or network-attached disk.
You may also find this article useful: https://learn.microsoft.com/en-us/azure/storage/common/storage-decide-blobs-files-disks
UPDATE
Regarding uploading files from your local computer to Azure Storage, there are actually many options available:
Use a storage explorer tool like Microsoft's Azure Storage Explorer.
Use the AzCopy command-line tool.
Use the Azure PowerShell cmdlets.
Use the Azure CLI.
Write your own code using any of the available storage client libraries, or consume the REST API directly (see the sketch below).
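For that last option, here is a minimal sketch of a one-off migration script using the current Python SDK (azure-storage-blob). The connection string, container name, and upload directory are placeholders, not details from the question:

```python
import os
from azure.storage.blob import BlobServiceClient  # pip install azure-storage-blob

# Placeholder values -- substitute your own storage account details.
CONNECTION_STRING = os.environ["AZURE_STORAGE_CONNECTION_STRING"]
LOCAL_DOCS_DIR = "/var/www/uploads"   # hypothetical path on your web server
CONTAINER_NAME = "documents"          # hypothetical container (assumed to already exist)

service = BlobServiceClient.from_connection_string(CONNECTION_STRING)
container = service.get_container_client(CONTAINER_NAME)

# Walk the local folder and upload each file as a block blob,
# preserving the relative path as the blob name.
for root, _, files in os.walk(LOCAL_DOCS_DIR):
    for name in files:
        path = os.path.join(root, name)
        blob_name = os.path.relpath(path, LOCAL_DOCS_DIR).replace(os.sep, "/")
        with open(path, "rb") as data:
            container.upload_blob(name=blob_name, data=data, overwrite=True)
        print(f"uploaded {blob_name}")
```

Since every document already has a database record, you could extend the loop to store each blob's new URL back onto its record as you go. Terraform can create the storage account and container for you, but the data copy itself is better done with one of the tools above.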

Related

When should we use a file share in Azure as compared to Azure Blobs?

Could someone please give some examples of where we would use an Azure file share instead of Azure Blobs? Whenever I search the internet, I only learn that it can be mounted or that it follows the SMB protocol, but I still don't understand a single concrete case where we would use an Azure file share.
I tried looking into "When to use Azure blob storage versus Azure file share?", which is a similar question but doesn't answer mine.
Azure provides a variety of storage tools and services, including Azure Storage. To determine which Azure technology is best suited for your scenario, see Review your storage options in the Azure Cloud Adoption Framework.
For detailed information and examples refer to this article: https://learn.microsoft.com/en-us/azure/storage/common/storage-introduction
It depends mostly on your use case and how you plan to access the data. If you simply want to mount and access your files, Azure Files will be your best fit. If you are looking for the lowest cost and want to access your data programmatically through your application, Azure Blob Storage would be a better fit. Both are accessible through the portal or Azure Storage Explorer.
I also recommend this Learn module which covers the difference in data types and solutions.
Additional information: Azure Blob Storage vs Azure File Storage
Cost details of Azure Blob Storage pricing & Azure Files pricing
In short: if you ...
have an application that needs to store or access files in the cloud, use Blob Storage
need a file share that can be used by, for instance, a server, use File Shares
Azure Files shares can be mounted concurrently by cloud or on-premises deployments of Windows, Linux, and macOS. Azure Files shares can also be cached on Windows Servers with Azure File Sync for fast access near where the data is being used.
This means a File Share is, somewhat simplified, similar to a network share you would have in a local environment.
Azure Blob Storage helps you create data lakes for your analytics needs, and provides storage to build powerful cloud-native and mobile apps. Optimize costs with tiered storage for your long-term data, and flexibly scale up for high-performance computing and machine learning workloads.
This means Blob Storage is what you need when your application itself reads and writes the data programmatically, as in cloud-native and mobile apps.
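To make the contrast concrete, here is a hedged sketch of reaching the same file through each service's Python SDK (the connection string, share, container, and file names are made up for illustration). With Files you could equally skip the SDK, mount the share, and use ordinary file I/O:

```python
from azure.storage.blob import BlobClient            # pip install azure-storage-blob
from azure.storage.fileshare import ShareFileClient  # pip install azure-storage-file-share

conn = "<your storage account connection string>"

# Blob Storage: object storage addressed as container + blob name,
# always accessed programmatically (no mounting).
blob = BlobClient.from_connection_string(
    conn, container_name="docs", blob_name="report.pdf")
pdf_bytes = blob.download_blob().readall()

# Azure Files: an SMB file share addressed as share + file path.
# The same share could instead be mounted and read like a local drive.
share_file = ShareFileClient.from_connection_string(
    conn, share_name="docs", file_path="report.pdf")
pdf_bytes = share_file.download_file().readall()
```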

Which Azure Storage method is best for a temporary file transfer?

I want to automate the transfer of files from a website not hosted in Azure to my client’s premises.
I am considering having an API on the website send the files to Azure Blob Storage, and then having another API running at the client site download them.
Both would make use of the Azure storage API, which I like because it is easy to implement.
The files do not need to stay in Azure and can be deleted from storage once they are downloaded.
However I am wondering if there is a faster way.
Should I perhaps be using hot-tier Blob Storage, or File Storage?
I looked at https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blob-storage-tiers but am still unclear as to the fastest method for my use case.
I suggest you use a file share, which can be mapped locally as a network drive, making operations like read and delete easy and fast.
If you go the code-only route, the published throughput targets for blobs and files are both up to 60 MiB/s, so I cannot say which is faster. There is also the Azure Storage Data Movement Library, which is designed for high-performance uploading, downloading, and copying of Azure Storage blobs and files; you could use it for your purpose.
I would recommend Blob Storage for this application. Logic Apps can also be used to automate this pipeline based on timer triggers or some other trigger.
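To illustrate the blob approach, here is a minimal sketch of the client-site half of the pipeline: download each waiting blob, then delete it so nothing lingers in storage. The container name and local folder are placeholders, and flat blob names are assumed:

```python
import os
from azure.storage.blob import ContainerClient  # pip install azure-storage-blob

# Hypothetical container used as a temporary drop box.
container = ContainerClient.from_connection_string(
    os.environ["AZURE_STORAGE_CONNECTION_STRING"], "transfer")

os.makedirs("incoming", exist_ok=True)

# Download every pending blob to the client premises, then delete it.
for blob in container.list_blobs():
    with open(os.path.join("incoming", blob.name), "wb") as f:
        container.download_blob(blob.name).readinto(f)
    container.delete_blob(blob.name)
    print(f"transferred and removed {blob.name}")
```

Running this on a timer at the client site keeps storage costs near zero, since each blob exists only between upload and pickup.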

SQLite on Azure website

I've been trying to get SQLite to work on an Azure website. I have deployed everything successfully but I need to point it to a file name for the database. I have looked at creating Blob storage but I'm unsure how to convert this into a file name that SQLite will accept.
I'm sure this has been done as I can see references to other issues related to SQLite on Azure.
I have read http://www.sqlite.org/whentouse.html.
In my experience, if you want to use SQLite with Azure Websites, you can keep the database file within your deployment package so it stays on the same server as your website. Azure Websites provide 1 GB of application storage, which is plenty for a database file. Your content will persist with the website, and access to the SQLite DB will be fast. This is super easy to do with an ASP.NET web application or any other.
The problem with choosing Azure Blob Storage is that if the database file is stored there, there is no API through which SQLite can write to that file. So one option would be to write locally first and then sync to Azure Blob Storage back and forth, though others on SO may have other suggestions. If you want to back up your database file to Azure Blob Storage for any reason, you certainly can do that separately; however, if you choose SQLite, I think the simplest approach is to keep the database file with the website.
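If you do want that separate backup, here is a hedged sketch of the "write locally, sync to blob" idea using the current Python SDK (the local path, container, and blob names are illustrative):

```python
from azure.storage.blob import BlobClient  # pip install azure-storage-blob

# SQLite keeps writing to the local file as usual; we only copy it out.
DB_PATH = "app_data/site.db"  # hypothetical local SQLite database file

blob = BlobClient.from_connection_string(
    "<your storage connection string>",
    container_name="backups", blob_name="site.db")

# Push the current database file to Blob Storage as a backup copy.
# For a consistent snapshot, run this while the database is idle
# (or use SQLite's online backup API to produce the file first).
with open(DB_PATH, "rb") as data:
    blob.upload_blob(data, overwrite=True)
```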

Upload 650,000 documents to Azure

I can't seem to find any reference to bulk-uploading data to Azure.
I have a document store with 650,000 PDF documents that take up about 1.2 TB of disk space.
Uploading those files to Azure via the web will be difficult. Is there a way I can mail a hard drive and have your team upload them for me?
If not can you recommend the best way to upload this many documents?
Maybe not the answer you expected, but you could use Amazon's AWS Import/Export (this allows you to mail them an HDD and they'll import it into your S3 account).
To transfer the data to a Windows Azure storage account, you can leverage one of the new features in the 1.7.1 SDK: the StartCopyFromBlob method. This method allows you to copy a file at a specific URL asynchronously (you could use this to copy all files from S3 to your Azure storage account).
Read the following blogpost for a fully working example: How to Copy a Bucket from Amazon S3 to Windows Azure Blob Storage using “Copy Blob”
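StartCopyFromBlob belongs to the old .NET SDK; for readers on the current Python SDK, the equivalent server-side copy is start_copy_from_url. A minimal sketch, assuming the source URL is publicly readable or pre-signed (all names here are made up):

```python
from azure.storage.blob import BlobClient  # pip install azure-storage-blob

# Destination blob in your Azure storage account.
dest = BlobClient.from_connection_string(
    "<your storage connection string>",
    container_name="docs", blob_name="file1.pdf")

# Asynchronous server-side copy: Azure pulls the bytes straight from the
# source URL (e.g. a pre-signed S3 URL), so nothing flows through your machine.
dest.start_copy_from_url("https://example-bucket.s3.amazonaws.com/file1.pdf")

# Poll the copy status until it finishes.
print(dest.get_blob_properties().copy.status)  # 'pending' -> 'success'
```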
While Azure doesn't offer a physical ingestion process today, if you talk nicely to the Azure team they can do this as a one-off. If you like, I can get you a contact on the product team (dave at greenbutton dot com).
Alternatively, there are solutions such as Aspera, which provides accelerated data transfers over UDP and is being beta-tested in Azure along with the Azure Media Services offering.
We have some tools that help with this as well http://www.greenbutton.com and leverage Aspera's technology.
As disk shipment is not supported by Windows Azure, your best bet is to use a third-party application (or write your own) that supports parallel upload; this way you can still upload much faster. Third-party applications like Gladinet and CloudBerry could be used to upload the data, but I am not sure how configurable they are for maximizing parallel uploads to achieve the fastest transfer.
If you decide to write it yourself, here is a starting point: Asynchronous Parallel Block Blob Transfers with Progress Change Notification.
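That article is .NET-based; as a sketch of the same parallel idea in Python (assuming azure-storage-blob; the folder and container names are placeholders):

```python
import os
from concurrent.futures import ThreadPoolExecutor
from azure.storage.blob import ContainerClient  # pip install azure-storage-blob

container = ContainerClient.from_connection_string(
    os.environ["AZURE_STORAGE_CONNECTION_STRING"], "pdfs")

def upload(path: str) -> str:
    # Each worker uploads one file as a block blob.
    name = os.path.basename(path)
    with open(path, "rb") as data:
        container.upload_blob(name=name, data=data, overwrite=True)
    return name

pdf_dir = "/data/documents"  # hypothetical local document store
paths = [os.path.join(pdf_dir, f) for f in os.listdir(pdf_dir)]

# Several concurrent uploads to keep the uplink saturated; tune
# max_workers to whatever your bandwidth sustains.
with ThreadPoolExecutor(max_workers=16) as pool:
    for name in pool.map(upload, paths):
        print(f"done: {name}")
```

Even fully parallelized, 1.2 TB over a typical office uplink will take days, which is why the Import/Export service described in the next answer is worth considering.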
I know this is a bit too late for the OP, but in the Azure Management Portal, under Storage, pick your storage instance, then click the Import/Export link at the top. At the bottom of that screen, there is a "Create Import Job" link and icon. Also, if you click the blue help icon on the far right side, it says this:
You can use the Windows Azure Import/Export service to transfer large amounts of file data to Windows Azure Blob storage in situations where uploading over the network is prohibitively expensive or infeasible. You can also use the Import/Export service to transfer large quantities of data resident in Blob storage to your on-premises installations in a timely and cost-effective manner. Use the Windows Azure Import/Export Service to Transfer Data to Blob Storage
To transfer a large set of file data into Blob storage, you can send one or more hard drives containing that data to a Microsoft data center, where your data will be uploaded to your storage account. Similarly, to export data from Blob storage, you can send empty hard drives to a Microsoft data center, where the Blob data from your storage account will be copied to your hard drives and then returned to you. Before you send in a drive that contains data, you'll encrypt the data on the drive; when Microsoft exports your data to send to you, the data will also be encrypted before shipping.
Both the Windows Azure Storage PowerShell cmdlets and AzCopy can bulk-upload data to Azure.
For Azure Storage PowerShell, you could use ls -File -Recurse | Set-AzureStorageBlobContent -Container upload.
You can refer to http://msdn.microsoft.com/en-us/library/dn408487.aspx for more details.
For AzCopy, you can refer to this article: http://blogs.msdn.com/b/windowsazurestorage/archive/2012/12/03/azcopy-uploading-downloading-files-for-windows-azure-blobs.aspx

Is it possible to mount blob storage to my local machine for deployment?

I have a build script that it would be very useful to configure to dump some files into Azure blob storage so they can be picked up by my Azure web role.
My preferred plan was to find some way of mounting the blob storage on my build server as a mapped drive and simply using Robocopy to copy the files over. This would involve the least amount of friction, as I already deploy some files like this to other web servers using WebDrive.
I found a piece of software that will allow me to do that: http://www.gladinet.com/
However, on further investigation I found that it needs port 80 to run, short of some hairy-looking hacking about on the server.
So is there another piece of software I could use or perhaps another way I haven't considered, such as deploying the files to a local folder that is automagically synced with blob storage?
Update in response to @David Makogon
I am using http://waacceleratorumbraco.codeplex.com/ this performs 2 way synchronisation between the blob storage and the web roles. I have tested this with http://cloudberrylab.com/ and I can deploy files manually to the blob and they are deployed correctly to the web roles. Also I have done the reverse and updated files in the web roles which have then been synced back to the blob and I have subsequently edited/downloaded them from blob storage.
What I'm really looking for is a way to automate the cloudberry side of things. So I don't have a manual step to copy a few files over. I will investigate the Powershell solutions in the meantime.
I know this is an old post - but in case someone else comes here... the answer is now "yes". I've been working on a CodePlex project to do exactly that. (All source code is available).
http://azuredrive.codeplex.com/
If you're comfortable using PowerShell in your build process, then you could use the Cerebrata cmdlets to upload the files. If that doesn't work for you, you could write a custom activity (but this sounds quite a bit more involved).
Mounting a cloud drive from a non-Windows Azure compute instance (e.g. your local build machine) is not supported.
Having said that: Even if you could mount a Cloud Drive from your build machine, your compute instances would need access to it too, and there can only be one writer. If your compute instances only needed read-only access, they'd need to create a snapshot after you upload new files.
This really doesn't sound like a good idea though. As knightpfhor suggested, the Cerebrata cmdlets provide this capability (look at Import-File). This allows you to push individual files into their own blobs. You can optimize further by pushing a single ZIP file into a blob. You can then use a technique similar to the one described by Nate Totten in his multi-tenant web role sample, to detect new zip files and expand them to your local storage. Nate's blog post is here.
Oh, and if you don't want to use the Cerebrata cmdlets, you can upload blobs directly with the Windows Azure Storage REST API (though the cmdlets are very simple to use and integrate seamlessly with PowerShell).
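If the Cerebrata cmdlets don't fit your build process, the upload step is also easy to script yourself. A hedged sketch of the single-ZIP approach using the modern Python SDK (which postdates these answers; all names are illustrative):

```python
from azure.storage.blob import BlobClient  # pip install azure-storage-blob

# Hypothetical artifact produced by your build script.
ZIP_PATH = "build/output.zip"

blob = BlobClient.from_connection_string(
    "<your storage connection string>",
    container_name="deploy", blob_name="output.zip")

# One blob per build: the web role watches for a new ZIP and expands it
# to local storage, as in the multi-tenant sample mentioned above.
with open(ZIP_PATH, "rb") as data:
    blob.upload_blob(data, overwrite=True)
```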
