Issue while copying existing blob to Azure Media Services - azure

We are trying to copy an existing blob to AMS, but it is not getting copied. The blob resides in storage account 1 and AMS is associated with storage account 2. All the accounts, including AMS, are in the same location.
await destinationBlob.StartCopyAsync(new Uri(sourceBlob.Uri.AbsoluteUri + signature));
When viewing the AMS storage account in a blob storage explorer, the asset folders are being created but contain no blobs. Also, within the Media explorer we can see the assets listed in AMS, but clicking one throws a not-found exception. Basically, they are not getting fully copied into AMS.
However, when we use the same code and attach a new AMS instance to the storage account (storage account 1) where the actual blob resides, the copy works fine.

I have not been able to reproduce your issue, but there is a code sample that copies an existing blob into Azure Media Services via the .NET SDK. Please try copying the blob using StartCopyFromBlob or StartCopyFromBlobAsync (Azure Storage Client Library 4.3.0). Below is the code snippet from that sample:
destinationBlob.StartCopyFromBlob(new Uri(sourceBlob.Uri.AbsoluteUri + signature));

while (true)
{
    // The StartCopyFromBlob is an async operation,
    // so we want to check if the copy operation is completed before proceeding.
    // To do that, we call FetchAttributes on the blob and check the CopyStatus.
    destinationBlob.FetchAttributes();
    if (destinationBlob.CopyState.Status != CopyStatus.Pending)
    {
        break;
    }
    // It's still not completed, so wait for some time.
    System.Threading.Thread.Sleep(1000);
}
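The signature value appended to the source URI above is a SAS on the source blob. As a rough sketch only (the read-only permission and the 24-hour expiry are assumptions, not part of the original sample), it could be generated with the same client library like this:

// Sketch (assumption): generate a read-only SAS for the source blob so the copy
// running against the destination account can authenticate against the source.
// Requires Microsoft.WindowsAzure.Storage.Blob.
var readPolicy = new SharedAccessBlobPolicy
{
    Permissions = SharedAccessBlobPermissions.Read,
    SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddHours(24)   // arbitrary lifetime
};
string signature = sourceBlob.GetSharedAccessSignature(readPolicy);
// signature is then appended to sourceBlob.Uri.AbsoluteUri exactly as in the snippet above.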

Related

Azure: Unable to copy Archive blobs from one storage account to another?

Whenever I try to copy Archive blobs to a different storage account and change their tier at the destination, I get the following error:
Copy source blob has been modified. ErrorCode: CannotVerifyCopySource
I have tried copying Hot/Cool blobs to Hot/Cool/Archive. I am facing the issue only while copying Archive to Hot/Cool/Archive. Also, there is no issue while copying within the same storage account.
I am using the Azure Python SDK:
blob_url = source_block_blob_service.make_blob_url(copy_from_container, blob_name, sas_token = sas)
dest_blob_service.copy_blob(copy_to_container, blob_name, blob_url, requires_sync = True, standard_blob_tier = 'Hot')
The reason you're getting this error is that copying an archived blob is only supported within the same storage account, and you're trying it across different storage accounts.
From the REST API documentation page:
Copying Archived Blob (version 2018-11-09 and newer)
An archived blob can be copied to a new blob within the same storage account. This will still leave the initially archived blob as is. When copying an archived blob as source the request must contain the header x-ms-access-tier indicating the tier of the destination blob. The data will be eventually copied to the destination blob.
While a blob is in the archive access tier, it's considered offline and can't be read or modified.
https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blob-rehydration
To read the blob, you either need to rehydrate it first or, as described in the link above, use the Copy Blob operation. I am not sure whether the Python SDK's copy_blob() operation uses that API behind the scenes; perhaps not, if it did not work that way for you.
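If you go the rehydration route first, here is a rough C# sketch with the older WindowsAzure.Storage SDK used elsewhere on this page (the connection string, container/blob names, and polling interval are placeholders, and rehydration itself can take many hours):

using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

// Sketch (assumption): rehydrate an archived blob in place, then wait until it is online.
var account = CloudStorageAccount.Parse(connectionString);        // placeholder
var blob = account.CreateCloudBlobClient()
                  .GetContainerReference("source-container")      // placeholder
                  .GetBlockBlobReference("archived-blob");        // placeholder

blob.SetStandardBlobTier(StandardBlobTier.Hot);   // starts rehydration

do
{
    System.Threading.Thread.Sleep(TimeSpan.FromMinutes(5));       // arbitrary interval
    blob.FetchAttributes();
} while (blob.Properties.RehydrationStatus == RehydrationStatus.PendingToHot);

// Once the rehydration status is cleared, the blob can be read or copied to another account.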

Get the file from Azure Blob Storage when it is only updated through C#?

I have to get the file contents from Azure Blob Storage, but only when the file is updated or created in Azure Storage. This has to be done through C#.
Based on your description, I suggest you try using an Azure WebJobs or Azure Functions blob trigger to get the file content from blob storage.
The blob trigger will trigger a process when an Azure blob is created or updated.
For more details, you could refer to this article and the code sample below.
public static void WriteLog([BlobTrigger("input/{name}")] string logMessage,
    string name,
    string blobTrigger,
    TextWriter logger)
{
    logger.WriteLine("Full blob path: {0}", blobTrigger);
    logger.WriteLine("Content:");
    logger.WriteLine(logMessage);
}
Notice: The SDK scans log files to watch for new or changed blobs. This process is not real-time; a function might not get triggered until several minutes or longer after the blob is created.
If the speed and reliability limitations of blob triggers are not acceptable for your application, the recommended method is to create a queue message when you create the blob, and use the QueueTrigger attribute instead of the BlobTrigger attribute on the function that processes the blob.
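Just as an illustration, a minimal sketch of that queue-based pattern (the queue name is an assumption; the uploader is expected to enqueue the blob name right after creating the blob in the input container) could look like this:

using System.IO;
using Microsoft.Azure.WebJobs;

// Sketch (assumption): a queue message carrying the blob name triggers the function,
// and the Blob attribute binds the matching blob in the "input" container for reading.
public static void ProcessUploadedBlob(
    [QueueTrigger("uploaded-blobs")] string blobName,                  // placeholder queue name
    [Blob("input/{queueTrigger}", FileAccess.Read)] Stream blobStream,
    TextWriter logger)
{
    using (var reader = new StreamReader(blobStream))
    {
        logger.WriteLine("Blob {0} content:", blobName);
        logger.WriteLine(reader.ReadToEnd());
    }
}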

Avoid over-writing blobs AZURE on the server

I have a .NET app which uses WebClient and a SAS token to upload a blob to the container. The default behaviour is that a blob with the same name is replaced/overwritten.
Is there a way to change this on the server, i.e. prevent an already existing blob from being replaced?
I've seen Avoid over-writing blobs AZURE, but it is about the client side.
My goal is to secure the server from overwriting blobs.
AFAIK the file is uploaded directly to the container without a chance to intercept the request and check, for example, the existence of the blob.
Edited
Let me clarify: My client app receives a SAS token to upload a new blob. However, an evil hacker can intercept the token and upload a blob with an existing name. Because of the default behavior, the new blob will replace the existing one (effectively deleting the good one).
I am aware of different approaches to deal with the replacement on the client. However, I need to do it on the server, somehow even against the client (which could be compromised by the hacker).
You can issue the SAS token with "create" permissions, and without "write" permissions. This will allow the user to upload blobs up to 64 MB in size (the maximum allowed Put Blob) as long as they are creating a new blob and not overwriting an existing blob. See the explanation of SAS permissions for more information.
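As a rough illustration with the older .NET storage client library (the container reference and the 30-minute lifetime are assumptions), such a SAS could be issued like this:

using System;
using Microsoft.WindowsAzure.Storage.Blob;

// Sketch (assumption): issue a container SAS that only allows creating new blobs.
// Without Write permission, an upload targeting an existing blob name is rejected.
var policy = new SharedAccessBlobPolicy
{
    Permissions = SharedAccessBlobPermissions.Create,
    SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddMinutes(30)   // arbitrary lifetime
};
string sasToken = container.GetSharedAccessSignature(policy);
// Hand sasToken to the client; it can create new blobs but cannot overwrite existing ones.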
There is no configuration on the server side, but you can implement something like the following using the storage client SDK.
// retrieve a reference to a previously created container
var container = blobClient.GetContainerReference(containerName);
// retrieve a reference to the blob
var blobReference = container.GetBlockBlobReference(blobName);
// upload only if the blob does not exist yet
// (note: this check-then-upload is not atomic; the conditional-header approach below avoids the race)
if (!blobReference.Exists())
{
    blobReference.UploadFromFile(filePath);   // filePath: local file to upload
}
You could do something similar using the REST API:
https://learn.microsoft.com/en-us/rest/api/storageservices/fileservices/blob-service-rest-api
Get Blob Properties will return 404 if the blob does not exist.
Is there a way to change this on the server, i.e. prevent an already existing blob from being replaced?
Azure Storage exposes the Blob Service REST API for doing operations against blobs. To upload or update a blob (file), you need to invoke the Put Blob REST API, which states the following:
The Put Blob operation creates a new block, page, or append blob, or updates the content of an existing block blob. Updating an existing block blob overwrites any existing metadata on the blob. Partial updates are not supported with Put Blob; the content of the existing blob is overwritten with the content of the new blob.
To avoid overwriting existing blobs, you need to explicitly specify conditional headers for your blob operations. As a simple approach, you could leverage the Azure Storage SDK for .NET (essentially a wrapper over the Azure Storage REST API) to upload your blob (file) as follows:
try
{
    var container = new CloudBlobContainer(new Uri($"https://{storageName}.blob.core.windows.net/{containerName}{containerSasToken}"));
    var blob = container.GetBlockBlobReference("{blobName}");
    //bool isExist=blob.Exists();
    blob.UploadFromFile("{filepath}", accessCondition: AccessCondition.GenerateIfNotExistsCondition());
}
catch (StorageException se)
{
    var requestResult = se.RequestInformation;
    if (requestResult != null)
        // 409, The specified blob already exists.
        Console.WriteLine($"HttpStatusCode:{requestResult.HttpStatusCode},HttpStatusMessage:{requestResult.HttpStatusMessage}");
}
Also, you could combine your blob name with the MD5 hash of your blob file before uploading to Azure Blob Storage.
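If you prefer the MD5 approach, a purely illustrative sketch (the file path is a placeholder; this is only a naming convention, not an SDK feature) might be:

using System;
using System.IO;
using System.Security.Cryptography;

// Sketch (assumption): derive a content-based suffix so different file contents
// never end up targeting the same blob name.
string filePath = @"C:\data\report.pdf";                              // placeholder
string md5Hex;
using (var md5 = MD5.Create())
using (var stream = File.OpenRead(filePath))
{
    md5Hex = BitConverter.ToString(md5.ComputeHash(stream)).Replace("-", "").ToLowerInvariant();
}
string blobName = Path.GetFileNameWithoutExtension(filePath) + "-" + md5Hex + Path.GetExtension(filePath);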
As far as I know, there is no configuration in the Azure Portal or the storage tools to achieve this purpose on the server side. You could post your feedback to the Azure Storage team.

Deleted blobs still showing in Azure Portal

I've run a process to delete approximately 1500 blobs from my Azure storage service. The code I've used to do this (in a loop) is essentially this:
var blob = BlobStorageContainer.GetBlockBlobReference(blobName);
if (await blob.ExistsAsync(cancellationToken))
{
    await blob.DeleteAsync(cancellationToken);
}
I went through both the Azure Portal and Azure Storage Explorer, and it looks like all the blobs that should have been deleted are still there. However, when I try to actually access the file via the URL, I get a ResourceNotFound error. So it seems the data has been deleted, but the storage service seems to think that the blob should still be there. Am I doing something wrong, or does the storage service need time to catch up, in a sense, to all the delete operations I performed?
You can try doing a list blobs operation on the container; that will give you an up-to-date view of which blobs are still present in your account. Accessing the blob from the internet URI will come back as ResourceNotFound if the blob isn't public, even if it is still present in the container. Is it possible your calls are failing but your code is eating the exceptions?
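For example, a rough sketch of that listing with the same client library (reusing the BlobStorageContainer reference from the question) might be:

// Sketch (assumption): enumerate what is actually left in the container,
// rather than relying on a possibly stale portal/explorer view.
// Requires Microsoft.WindowsAzure.Storage.Blob.
BlobContinuationToken continuationToken = null;
do
{
    var segment = await BlobStorageContainer.ListBlobsSegmentedAsync(continuationToken);
    foreach (var item in segment.Results)
    {
        Console.WriteLine(item.Uri);
    }
    continuationToken = segment.ContinuationToken;
} while (continuationToken != null);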

Copying storage data from one Azure account to another

I would like to copy a very large storage container from one Azure storage account into another (which also happens to be in another subscription).
I would like an opinion on the following options:
Write a tool that would connect to both storage accounts and copy blobs one at a time using CloudBlob's DownloadToStream() and UploadFromStream(). This seems to be the worst option because it will incur costs when transferring the data and also be quite slow because data will have to come down to the machine running the tool and then get re-uploaded back to Azure.
Write a worker role to do the same - this should theoretically be faster and not incur any cost. However, this is more work.
Upload the tool to a running instance bypassing the worker role deployment and pray the tool finishes before the instance gets recycled/reset.
Use an existing tool - have not found anything interesting.
Any suggestions on the approach?
Update: I just found out that this functionality has finally been introduced (REST APIs only for now) for all storage accounts created on July 7th, 2012 or later:
http://msdn.microsoft.com/en-us/library/windowsazure/dd894037.aspx
You can also use AzCopy that is part of the Azure SDK.
Just click the download button for Windows Azure SDK and choose WindowsAzureStorageTools.msi from the list to download AzCopy.
After installing, you'll find AzCopy.exe here: %PROGRAMFILES(X86)%\Microsoft SDKs\Windows Azure\AzCopy
You can get more information on using AzCopy in this blog post: AzCopy – Using Cross Account Copy Blob
You could also remote desktop into an instance and use this utility for the transfer.
Update:
You can also copy blob data between storage accounts using Microsoft Azure Storage Explorer. Reference link
Since there's no direct way to migrate data from one storage account to another, you'd need to do something like what you were thinking. If this is within the same data center, option #2 is the best bet, and will be the fastest (especially if you use an XL instance, giving you more network bandwidth).
As far as complexity, it's no more difficult to create this code in a worker role than it would be with a local application. Just run this code from your worker role's Run() method.
To make things more robust, you could list the blobs in your containers, then place specific file-move request messages into an Azure queue (and optimize by putting more than one object name per message). Then use a worker role thread to read from the queue and process objects. Even if your role is recycled, at worst you'd reprocess one message. For a performance increase, you could then scale to multiple worker role instances. Once the transfer is complete, you simply tear down the deployment.
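A rough sketch of that queue-driven pattern with the storage client library (the queue name, container references, and visibility timeout are assumptions; the copy itself streams through the worker, as in option #2):

using System;
using System.Linq;
using Microsoft.WindowsAzure.Storage.Blob;
using Microsoft.WindowsAzure.Storage.Queue;

// Sketch (assumption): producer lists the source blobs and enqueues one move request per blob.
// queueClient, sourceContainer and destinationContainer are assumed, pre-built references.
var queue = queueClient.GetQueueReference("blob-move-requests");      // placeholder name
queue.CreateIfNotExists();
foreach (var blob in sourceContainer.ListBlobs(useFlatBlobListing: true).OfType<CloudBlockBlob>())
{
    queue.AddMessage(new CloudQueueMessage(blob.Name));
}

// Worker role thread: dequeue a name, copy the blob, then delete the message.
CloudQueueMessage message;
while ((message = queue.GetMessage(TimeSpan.FromMinutes(10))) != null)
{
    var source = sourceContainer.GetBlockBlobReference(message.AsString);
    var target = destinationContainer.GetBlockBlobReference(message.AsString);
    using (var targetStream = target.OpenWrite())
    {
        source.DownloadToStream(targetStream);
    }
    queue.DeleteMessage(message);
}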
UPDATE - On June 12, 2012, the Windows Azure Storage API was updated, and now allows cross-account blob copy. See this blog post for all the details.
Here is some code that leverages the .NET SDK for Azure, available at http://www.windowsazure.com/en-us/develop/net:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.WindowsAzure.StorageClient;
using System.IO;
using System.Net;

namespace benjguinAzureStorageTool
{
    class Program
    {
        private static Context context = new Context();

        static void Main(string[] args)
        {
            try
            {
                string usage = string.Format("Possible Usages:\n"
                    + "benjguinAzureStorageTool CopyContainer account1SourceContainer account2SourceContainer account1Name account1Key account2Name account2Key\n"
                    );

                if (args.Length < 1)
                    throw new ApplicationException(usage);

                int p = 1;

                switch (args[0])
                {
                    case "CopyContainer":
                        if (args.Length != 7) throw new ApplicationException(usage);
                        context.Storage1Container = args[p++];
                        context.Storage2Container = args[p++];
                        context.Storage1Name = args[p++];
                        context.Storage1Key = args[p++];
                        context.Storage2Name = args[p++];
                        context.Storage2Key = args[p++];
                        CopyContainer();
                        break;
                    default:
                        throw new ApplicationException(usage);
                }

                Console.BackgroundColor = ConsoleColor.Black;
                Console.ForegroundColor = ConsoleColor.Yellow;
                Console.WriteLine("OK");
                Console.ResetColor();
            }
            catch (Exception ex)
            {
                Console.WriteLine();
                Console.BackgroundColor = ConsoleColor.Black;
                Console.ForegroundColor = ConsoleColor.Yellow;
                Console.WriteLine("Exception: {0}", ex.Message);
                Console.ResetColor();
                Console.WriteLine("Details: {0}", ex);
            }
        }

        private static void CopyContainer()
        {
            CloudBlobContainer container1Reference = context.CloudBlobClient1.GetContainerReference(context.Storage1Container);
            CloudBlobContainer container2Reference = context.CloudBlobClient2.GetContainerReference(context.Storage2Container);

            if (container2Reference.CreateIfNotExist())
            {
                Console.WriteLine("Created destination container {0}. Permissions will also be copied.", context.Storage2Container);
                container2Reference.SetPermissions(container1Reference.GetPermissions());
            }
            else
            {
                Console.WriteLine("destination container {0} already exists. Permissions won't be changed.", context.Storage2Container);
            }

            foreach (var b in container1Reference.ListBlobs(
                new BlobRequestOptions(context.DefaultBlobRequestOptions)
                { UseFlatBlobListing = true, BlobListingDetails = BlobListingDetails.All }))
            {
                var sourceBlobReference = context.CloudBlobClient1.GetBlobReference(b.Uri.AbsoluteUri);
                var targetBlobReference = container2Reference.GetBlobReference(sourceBlobReference.Name);

                Console.WriteLine("Copying {0}\n to\n{1}",
                    sourceBlobReference.Uri.AbsoluteUri,
                    targetBlobReference.Uri.AbsoluteUri);

                using (Stream targetStream = targetBlobReference.OpenWrite(context.DefaultBlobRequestOptions))
                {
                    sourceBlobReference.DownloadToStream(targetStream, context.DefaultBlobRequestOptions);
                }
            }
        }
    }
}
It's very simple with AzCopy. Download the latest version from https://azure.microsoft.com/en-us/documentation/articles/storage-use-azcopy/
and in AzCopy type:
Copy a blob within a storage account:
AzCopy /Source:https://myaccount.blob.core.windows.net/mycontainer1 /Dest:https://myaccount.blob.core.windows.net/mycontainer2 /SourceKey:key /DestKey:key /Pattern:abc.txt
Copy a blob across storage accounts:
AzCopy /Source:https://sourceaccount.blob.core.windows.net/mycontainer1 /Dest:https://destaccount.blob.core.windows.net/mycontainer2 /SourceKey:key1 /DestKey:key2 /Pattern:abc.txt
Copy a blob from the secondary region
If your storage account has read-access geo-redundant storage enabled, then you can copy data from the secondary region.
Copy a blob to the primary account from the secondary:
AzCopy /Source:https://myaccount1-secondary.blob.core.windows.net/mynewcontainer1 /Dest:https://myaccount2.blob.core.windows.net/mynewcontainer2 /SourceKey:key1 /DestKey:key2 /Pattern:abc.txt
I'm a Microsoft Technical Evangelist and I have developed a sample and free tool (no support/no guarantee) to help in these scenarios.
The binaries and source-code are available here: https://blobtransferutility.codeplex.com/
The Blob Transfer Utility is a GUI tool to upload and download thousands of small/large files to/from Windows Azure Blob Storage.
Features:
Create batches to upload/download
Set the Content-Type
Transfer files in parallel
Split large files in smaller parts that are transferred in parallel
The 1st and 3rd features are the answer to your problem.
You can learn from the sample code how I did it, or you can simply run the tool and do what you need to do.
Write your tool as a simple .NET Command Line or Win Forms application.
Create and deploy a dummy web/worker role with RDP enabled
Login to the machine via RDP
Copy your tool over the RDP connection
Run the tool on the remote machine
Delete the deployed role.
Like you, I am not aware of any off-the-shelf tools supporting a copy-between function.
You may like to consider just installing Cloud Storage Studio into the role, dumping to disk, and then re-uploading. http://cerebrata.com/Products/CloudStorageStudiov2/Details.aspx?t1=0&t2=7
You could use 'Azure Storage Explorer' (free) or some other such tool. These tools provide a way to download and upload content. You will need to manually create containers and tables - and of course this will incur a transfer cost - but if you are short on time and your contents are of reasonable size then this is a viable option.
I recommend using azcopy; you can copy the whole storage account, a container, a directory, or a single blob. Here is an example of cloning an entire storage account:
azcopy copy 'https://{SOURCE_ACCOUNT}.blob.core.windows.net{SOURCE_SAS_TOKEN}' 'https://{DESTINATION_ACCOUNT}.blob.core.windows.net{DESTINATION_SAS_TOKEN}' --recursive
You can get the SAS tokens from the Azure Portal. Navigate to each storage account's overview (source and destination), then in the side navigation click on "Shared access signature" and generate your own.
More examples here
I had to do something similar to move 600 GB of content from a local file system to Azure Storage. After a couple of iterations of code, I ended up taking 'Azure Storage Explorer' and extending it with the ability to select folders instead of just files, recursively drill into the selected folders, and load a list of source/destination copy-item statements into an Azure queue. The upload section of 'Azure Storage Explorer' then has a Queue section to pull from the queue and execute the copy operation.
I then launched about 10 instances of the 'Azure Storage Explorer' tool and had each one pulling from the queue and executing the copy operation. I was able to move the 600 GB of items in just over 2 days. I also added smarts to use the modified timestamps on files so that it skips files that have already been copied from the queue and does not add a file to the queue if it is already in sync. Now I can run "updates" or syncs across the whole library of content within an hour or two.
Try CloudBerry Explorer. It copies blobs within and between subscriptions.
For copying between subscriptions, change the storage account container's access level from Private to Public Blob.
The copying process took a few hours to complete. If you choose to reboot your machine, the process will continue. You can check the status by refreshing the target storage account's container in the Azure management UI and checking the timestamp; the value keeps updating until the copy process completes.
