Azure blob versioning

Is there a way I can version the blobs being stored in Azure storage account, so that the blobs can be picked up using their version or the latest blob can be picked up?

Versioning for blobs is accomplished by taking a snapshot of a blob, which creates a read-only copy of the blob based on its contents at the time the snapshot was taken.
When a snapshot of a blob is taken, Azure Storage returns a date/time value indicating when the snapshot was taken. You can access that snapshot by appending this value to the blob's URL, e.g. https://myaccount.blob.core.windows.net/mycontainer/myblob?snapshot=2017-06-09T00:00:00.0000000Z
However, this snapshot date/time value is not stored anywhere by Azure on your behalf.
What you can do is store this date/time value in your database; whenever you need to present that version of the blob in your application, you simply append the value to the blob's URL.
Please note that snapshots exist alongside the base blob, i.e. if you delete the base blob, all snapshots of that blob are deleted as well.
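As an illustration, here is a minimal sketch of this snapshot workflow using the current azure-storage-blob (v12) Python SDK; the connection string, container, and blob names are placeholders:

from azure.storage.blob import BlobServiceClient

# Placeholder connection details - replace with your own.
service = BlobServiceClient.from_connection_string("<connection_string>")
blob_client = service.get_blob_client("mycontainer", "myblob")

# Take a snapshot; the response includes the snapshot's date/time value.
snapshot = blob_client.create_snapshot()
snapshot_time = snapshot["snapshot"]  # e.g. '2017-06-09T00:00:00.0000000Z'

# Persist snapshot_time in your database, then read that version later
# by creating a client that points at the snapshot.
snapshot_client = service.get_blob_client("mycontainer", "myblob", snapshot=snapshot_time)
data = snapshot_client.download_blob().readall()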

Related

Tracking the changes in data of azure blob storage

Azure provides tracking of activities via the activity log, but I am working on a use case in which I have to track changes in a JSON file that is in Azure blob storage, and I have to figure out how I can track changes to the file.
You can enable blob versioning to automatically keep track of changes to a file in Blob storage. When blob versioning is enabled, you can access earlier versions of a blob to recover your data if it is modified or deleted.
Each blob version is identified by a unique version ID. The value of the version ID is the timestamp at which the blob was updated. The version ID is assigned at the time that the version is created.
When you call a write operation to create or modify a blob, Azure Storage returns the x-ms-version-id header in the response. This header contains the version ID for the current version of the blob that was created by the write operation.
The version ID remains the same for the lifetime of the version.
Reference: https://learn.microsoft.com/en-us/azure/storage/blobs/versioning-overview
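As a sketch, assuming versioning is already enabled on the storage account and using the azure-storage-blob (v12) Python SDK (the file name data.json and the connection details are placeholders):

from azure.storage.blob import BlobServiceClient

# Placeholder connection details - replace with your own.
service = BlobServiceClient.from_connection_string("<connection_string>")
container = service.get_container_client("mycontainer")

# List all versions of the JSON file; each entry carries a version_id.
for blob in container.list_blobs(name_starts_with="data.json", include=["versions"]):
    print(blob.name, blob.version_id, blob.is_current_version)

# Download a specific earlier version by its version ID.
blob_client = container.get_blob_client("data.json")
old_data = blob_client.download_blob(version_id="<version_id>").readall()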

Azure: Unable to copy Archive blobs from one storage account to another?

Whenever I try to copy Archive blobs to a different storage account, changing their tier at the destination, I get the following error:
Copy source blob has been modified. ErrorCode: CannotVerifyCopySource
I have tried copying Hot/Cool blobs to Hot/Cool/Archive; I am facing the issue only while copying Archive to Hot/Cool/Archive. Also, there is no issue when copying within the same storage account.
I am using the Azure Python SDK:
blob_url = source_block_blob_service.make_blob_url(copy_from_container, blob_name, sas_token=sas)
dest_blob_service.copy_blob(copy_to_container, blob_name, blob_url, requires_sync=True, standard_blob_tier='Hot')
The reason you're getting this error is that copying an archived blob is only supported within the same storage account, and you're trying it across storage accounts.
From the REST API documentation page:
Copying Archived Blob (version 2018-11-09 and newer)
An archived blob can be copied to a new blob within the same storage account. This will still leave the initially archived blob as is. When copying an archived blob as source, the request must contain the header x-ms-access-tier indicating the tier of the destination blob. The data will eventually be copied to the destination blob.
While a blob is in the archive access tier, it's considered offline and can't be read or modified.
https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blob-rehydration
To read the blob, you first need to rehydrate it. Alternatively, as described in the link above, you can use the Copy Blob operation within the same account. I am not sure whether the Python SDK's copy_blob() uses that API behind the scenes; possibly not, given that it did not work that way for you.
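If you go the rehydrate-first route, a minimal sketch with the azure-storage-blob (v12) Python SDK might look like the following; the connection strings, names, and SAS URL are placeholders, and rehydration itself can take several hours:

import time
from azure.storage.blob import BlobClient

# Placeholder clients - replace with your own accounts, containers and blobs.
src = BlobClient.from_connection_string("<source_conn_str>", "src-container", "myblob")
dst = BlobClient.from_connection_string("<dest_conn_str>", "dst-container", "myblob")

# Step 1: rehydrate the archived source blob to an online tier.
src.set_standard_blob_tier("Hot")

# Step 2: wait until rehydration completes (archive_status becomes None).
while src.get_blob_properties().archive_status is not None:
    time.sleep(600)

# Step 3: the cross-account copy can now read the source via a SAS URL.
dst.start_copy_from_url("<source_blob_url_with_sas>")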

Azure blob copy in cloud

In AWS, the "upload-part-copy" command has an option for byte ranges. If I wanted to copy portions of two objects to a new object within the cloud, I could copy them using the "upload-part-copy" command.
I could not find any such method or mechanism to copy portions of blobs to a new blob in Azure. I tried AzCopy, but it does not have any option to select a portion of a blob.
Can anyone please tell me if there is any method like that?
As of today, this feature does not exist in Azure Blob Storage; a copy operation copies the entire source blob to the destination blob.
A workaround would be to download the byte ranges (blocks) from the source blobs to your local machine and then create a new blob by uploading those blocks.
If you were using the Blob Service REST API, these are the operations you would need to perform (sketched in code below):
1. Read source blob 1, specifying the byte range you want to read in the Range or x-ms-range request header. Store the fetched data somewhere in your application.
2. Repeat the same for source blob 2.
3. Create a new blob by uploading the data fetched from the 1st source blob using Put Block.
4. Repeat the same for the 2nd source blob.
5. Create the destination blob by committing the block list.
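A rough sketch of those five steps with the azure-storage-blob (v12) Python SDK might look like this; all names and byte ranges are placeholder examples:

import uuid
from azure.storage.blob import BlobClient, BlobBlock

# Placeholder clients - replace with your own containers and blobs.
src1 = BlobClient.from_connection_string("<conn_str>", "mycontainer", "source1")
src2 = BlobClient.from_connection_string("<conn_str>", "mycontainer", "source2")
dest = BlobClient.from_connection_string("<conn_str>", "mycontainer", "combined")

block_ids = []
# Read the desired byte range of each source blob (the SDK sends the
# Range header under the hood) and stage it as a block on the destination.
for src, offset, length in [(src1, 0, 1024), (src2, 2048, 4096)]:
    data = src.download_blob(offset=offset, length=length).readall()
    block_id = str(uuid.uuid4())
    dest.stage_block(block_id=block_id, data=data)
    block_ids.append(block_id)

# Commit the staged blocks in order to create the destination blob.
dest.commit_block_list([BlobBlock(block_id=bid) for bid in block_ids])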

Could not verify the copy source within the specified time. RequestId: (blank)

I am trying to copy some blob files from one storage account to another one. I am using AzCopy in order to fulfill this goal.
The process works for copying files between containers within the same storage account, but not between different storage accounts.
The command I am issuing is:
AzCopy /Source:https://<storage_account1>.blob.core.windows.net/<container_name1>/<path_to_desired_blobs> /Dest:https://<storage_account2>.blob.core.windows.net/<container_name2>/<path_to_store>/ /SourceKey:<source_key> /DestKey:<dest_key> /Pattern:<some_pattern> /S
The error I am getting is the following:
The remote server returned an error: (400) Bad Request.
Could not verify the copy source within the specified time.
RequestId:
Time:2016-04-01T19:33:01.0527460Z
The only difference between the two storage accounts is that one is Standard, whereas the other one is Premium.
Any help will be appreciated!
From your description, you're trying to copy a block blob from the source account into a page blob in the destination account, which is not supported by the Azure Storage service or AzCopy.
To work around it, you can first use AzCopy to download the block blobs from the source account to the local file system, and then upload them from the local file system to the destination account with the option /BlobType:Page (this option is only valid when uploading from local to blob).
Premium Storage only supports page blobs. Please confirm that you are copying page blobs from the standard to the premium storage account. Also, set the /BlobType parameter to "page" in order to copy the data as page blobs into the destination premium storage account.
From the description, I am assuming your source blob is a block blob. Azure's asynchronous Copy Blob operation (which AzCopy uses by default) preserves the blob type; that is, you cannot convert a blob from block to page through an async copy.
Instead, can you try AzCopy again with the /SyncCopy option along with the /BlobType:Page parameter? That might help change the destination blob type to page.
(If that doesn't work, the only other solution would be to first download the blob and then upload it with /BlobType:Page.)
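For reference, the suggested /SyncCopy attempt, based on the command in the question, might look like this (untested, same placeholders):

AzCopy /Source:https://<storage_account1>.blob.core.windows.net/<container_name1>/<path_to_desired_blobs> /Dest:https://<storage_account2>.blob.core.windows.net/<container_name2>/<path_to_store>/ /SourceKey:<source_key> /DestKey:<dest_key> /Pattern:<some_pattern> /SyncCopy /BlobType:Page /S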

Check if Blob of unknown Blob type exists

I've inherited a project built using the Azure Storage Client 1.7 and am upgrading it, as Microsoft have announced that this version will no longer be supported from December this year.
References to the files in Blob storage are stored in a database with the following fields:
FilePath - a string in the form of uploadfiles/xxx/yyy/Image-20140117170146.jpg
FileURI - A string in the form of https://zzz.blob.core.windows.net/uploadfiles/xxx/yyy/Image-20140117170146.jpg
GetBlobReferenceFromServer will throw an exception if the file doesn't exist, so it seems you should use GetBlockBlobReference if you know the container and the Blob type.
So my question(s):
Can I assume any Blobs currently uploaded (using StorageClient 1.7) will be BlockBlobs?
As I need to know the container name to call GetBlockBlobReference, can I reliably say that in the examples above my container would always be uploadfiles?
Can I assume any Blobs currently uploaded (using StorageClient 1.7) will be BlockBlobs?
You can't be 100% sure that the blobs uploaded via Storage Client library 1.7 are block blobs, because 1.7 also supported page blobs; however, you can make some intelligent guesses. For example, if the files are images or other commonly used file types (PDFs, documents, etc.), you can assume they are block blobs. Typically, it is VHD files that you would see uploaded as page blobs. And if these files were uploaded by the users of your application, they are more than likely block blobs.
Having said this, I think you should use the GetBlobReferenceFromServer method. You could list all blobs from the database and call GetBlobReferenceFromServer for each of them. If the blob exists, you will get its blob type; if it doesn't, the method will throw an error. This is the quickest way to identify the blob type of the existing entries in the database. If you find both block and page blobs, you may want to store the blob type back in the database alongside each record, so that if in the future you need to decide between creating a CloudBlockBlob or CloudPageBlob reference, you can look at this field.
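For illustration, here is the equivalent server-side check sketched in Python with the azure-storage-blob (v12) SDK, where get_blob_properties plays the role of GetBlobReferenceFromServer (the connection string and blob path are placeholders):

from azure.core.exceptions import ResourceNotFoundError
from azure.storage.blob import BlobClient

# Placeholder client - build one per FileURI/FilePath record in the database.
blob = BlobClient.from_connection_string("<conn_str>", "uploadfiles", "xxx/yyy/Image-20140117170146.jpg")

try:
    # Round-trips to the service and reports the actual blob type
    # ('BlockBlob' or 'PageBlob'); store it back in the database.
    props = blob.get_blob_properties()
    print(props.blob_type)
except ResourceNotFoundError:
    print("blob does not exist")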
As I need to know the container name to call GetBlockBlobReference, can I reliably say that in the examples above my container would always be uploadfiles?
Yes. In the examples you listed above, you can say that the blob container would always be uploadfiles.
