Azure copy blob to another account: invalid blob type

I want to copy a 12GB page blob from one storage account to another. At the moment, both containers are set to "public container" access. But it doesn't work; the request fails with HTTP/1.1 409 The blob type is invalid for this operation.
Copying it the same way but within the same storage account works without errors.
What am I missing?
Thanks!
//EDIT: This is how I'm trying to copy blob.dat from account1 to account2 (using the Casablanca C++ REST SDK):
http_client client(L"https://account2.blob.core.windows.net");
http_request request(methods::PUT);
request.headers().add(L"Authorization", L"SharedKey account2:*************************************");
request.headers().add(L"x-ms-copy-source", L"http://account1.blob.core.windows.net/dir/blob.dat");
request.headers().add(L"x-ms-date", L"Sat, 23 Nov 2013 16:50:00 GMT"); // I'm keeping this updated
request.headers().add(L"x-ms-version", L"2012-02-12");
request.set_request_uri(L"/dir/blob.dat");
auto ret = client.request(request).then([](http_response response)
{
    std::wcout << response.status_code() << std::endl << response.to_string() << std::endl;
});
The storage accounts were created a few days ago, so no restrictions apply.
Also, the destination dir is empty (account2 /dir/blob.dat does not exist).
//EDIT2:
I did more testing and found out this: uploading a new page blob (a few MB) and then copying it to another storage account worked!
Then I tried renaming the 12GB page blob that I wasn't able to copy (from mydisk.vhd to test.dat), and suddenly the copy to the other storage account worked as well!
But the next problem: after renaming test.dat back to mydisk.vhd in the destination storage account, I cannot create a disk from it (an error like "not a valid vhd file"), even though the copy has already finished (x-ms-copy-status: success).
What could be the problem now?
(Oh I forgot: the source mydisk.vhd lease status was "unlocked" before copying)
//EDIT3:
Well, it seems that the problem has solved itself... even with the original mydisk.vhd I wasn't able to create a disk again (invalid vhd). I don't know why, as I didn't alter it, but I created it on the Xbox One launch day, when everything was quite slow, so maybe something went wrong there. Now that I have created a new VM, I can copy the .vhd over to another storage account without problems (after deleting the disk).

I would suggest using AzCopy - Cross Account Copy Blob.
Check it out here:
http://blogs.msdn.com/b/windowsazurestorage/archive/2013/04/01/azcopy-using-cross-account-copy-blob.aspx
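For a cross-account copy, the AzCopy command from that era takes roughly the following shape. This is only a sketch: the account names and container path come from the question, and the keys are placeholders you would substitute with your own.
AzCopy /Source:https://account1.blob.core.windows.net/dir /Dest:https://account2.blob.core.windows.net/dir /SourceKey:<account1-key> /DestKey:<account2-key> /Pattern:blob.dat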

Related

ASP.NET Core: Higher memory use uploading files to Azure Blob Storage with SDK v12 compared to v11

I am building a service with an endpoint that images and other files will be uploaded to, and I need to stream the files directly to Blob Storage. This service will handle hundreds of images per second, so I cannot buffer the images into memory before sending them to Blob Storage.
I was following the article here and ran into this comment:
Next, using the latest version (v12) of the Azure Blob Storage libraries and a Stream upload method. Notice that it’s not much better than IFormFile! Although BlobStorageClient is the latest way to interact with blob storage, when I look at the memory snapshots of this operation it has internal buffers (at least, at the time of this writing) that cause it to not perform too well when used in this way.
But, using almost identical code and the previous library version that uses CloudBlockBlob instead of BlobClient, we can see a much better memory performance. The same file uploads result in a small increase (due to resource consumption that eventually goes back down with garbage collection), but nothing near the ~600MB consumption like above
I tried this and found that yes, v11 has considerably lower memory usage than v12! When I ran my tests with a ~10MB file, each new upload (after the initial POST) increased memory usage by about 40MB on v12, while v11 increased by only 20MB.
I then tried a 100MB file. On v12 each request seemed to consume about 100MB almost instantly and then climb slowly; memory was over 700MB after my second upload. Meanwhile, v11 didn't really jump, though it still climbed slowly, ending at around 430MB after the second upload.
I tried experimenting with the BlobUploadOptions properties (InitialTransferSize, MaximumConcurrency, etc.), but that only seemed to make things worse.
It seems unlikely that v12 would be outright worse in performance than v11, so I am wondering what I could be missing or misunderstanding.
Thanks!
Sometimes this issue can occur with the Azure Blob Storage (v12) libraries.
Try uploading large files in chunks (a technique called file chunking, which breaks a large file into smaller pieces that are uploaded individually) instead of uploading the whole file in one call. Please refer to this link.
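To illustrate the idea against the v12 SDK that the question uses, below is a minimal sketch that stages fixed-size blocks from a stream with BlockBlobClient and commits the block list at the end, so only one block's worth of data is buffered at a time. The UploadInChunksAsync helper name and the 4 MB block size are my own choices for this example, not anything from the original post.
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Azure.Storage.Blobs.Specialized;

public static class ChunkedUploader
{
    // Hypothetical helper: stages the input stream as ~4 MB blocks and commits them as a single blob.
    public static async Task UploadInChunksAsync(BlockBlobClient blob, Stream input, int blockSize = 4 * 1024 * 1024)
    {
        var blockIds = new List<string>();
        var buffer = new byte[blockSize];
        int blockNumber = 0;
        int bytesRead;

        while ((bytesRead = await input.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            // Block IDs must be base64-encoded and the same length for every block.
            string blockId = Convert.ToBase64String(BitConverter.GetBytes(blockNumber++));
            blockIds.Add(blockId);

            // Upload just this chunk.
            using var blockData = new MemoryStream(buffer, 0, bytesRead, writable: false);
            await blob.StageBlockAsync(blockId, blockData);
        }

        // Assemble the staged blocks into the final blob.
        await blob.CommitBlockListAsync(blockIds);
    }
}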
I tried reproducing the scenario in my lab:
public void uploadfile()
{
    string connectionString = "connection string";
    string containerName = "fileuploaded";
    string blobName = "test";
    string filePath = "filepath";

    BlobContainerClient container = new BlobContainerClient(connectionString, containerName);
    container.CreateIfNotExists();

    // Get a reference to a blob named "test" in the "fileuploaded" container
    BlobClient blob = container.GetBlobClient(blobName);

    // Upload the local file
    blob.Upload(filePath);
}
The output after uploading the file (screenshot not included here).

Backup ADLS gen2

I have a data lake and a data warehouse containing about 5-10 TB of data in Azure ADLS Gen2, in CSV and Delta formats. The ADLS account is Performance/Tier = Standard/Hot, replication = GRS, type = StorageV2.
What is the best way to back up my ADLS Gen2 data?
From a data-corruption perspective, I want to back up the raw ingested data. This can be done incrementally (small amounts of data) on a regular basis.
From a PROD-availability perspective, I want to back up all 5-10 TB occasionally, before complex PROD migrations. Yes, the data can be derived from the raw data, but that may take a few days or even a week (including reconciliations and testing, possibly even more).
Considerations:
Azure Backup doesn't support ADLS
Copying data with Azure Storage Explorer is slow because the speed is unstable, fluctuating between 50 and 1000 Mbps. It may take days or a week at my data volumes. Am I right that Azure Storage Explorer's speed doesn't depend on my local internet speed?
I haven't tried AzCopy, but I expect it to have the same speed as Azure Storage Explorer.
Mounting data_container and archive_container in DBFS and trying to copy data with Databricks' dbutils.fs.cp works even slower than Azure Storage Explorer: 3 GB in 10 minutes on a big 10-node, 30-DBU cluster. Why?
I haven't tried ADF, but I dislike the fact that the Copy activity requires details at the table/format level. I would like to back up the whole container without implementing logic that depends on the number and naming of folders.
For raw data/folder backups I use the Microsoft Azure Storage Data Movement library to copy a blob directory from ADLS Gen2 into another storage account.
For this, create a daily timer-triggered function that does an incremental copy of the blob directory.
You can configure something like this:
Create a new folder for each Monday's full backup (named by date) and keep the incremental changes until Sunday. After a month, remove the old backup folders.
Here is my implementation:
public async Task<string> CopyBlobDirectoryAsync(BlobConfiguration sourceBlobConfiguration, BlobConfiguration destBlobConfiguration, string blobDirectoryName)
{
    CloudBlobDirectory sourceBlobDir = await GetCloudBlobDirectoryAsync(sourceBlobConfiguration.ConnectionString, sourceBlobConfiguration.ContainerName, blobDirectoryName);
    CloudBlobDirectory destBlobDir = await GetCloudBlobDirectoryAsync(destBlobConfiguration.ConnectionString, destBlobConfiguration.ContainerName, destBlobConfiguration.BlobDirectoryPath + "/" + blobDirectoryName);

    // You can also replace the source directory with a CloudFileDirectory instance to copy data from Azure File Storage. If so:
    // 1. If Recursive is set to true, SearchPattern is not supported. The Data Movement library simply transfers all Azure files
    //    under the source CloudFileDirectory and its sub-directories.
    CopyDirectoryOptions options = new CopyDirectoryOptions()
    {
        Recursive = true
    };

    DirectoryTransferContext context = new DirectoryTransferContext();
    context.FileTransferred += FileTransferredCallback;
    context.FileFailed += FileFailedCallback;
    context.FileSkipped += FileSkippedCallback;

    // Create a CancellationTokenSource that can be used to cancel the transfer
    CancellationTokenSource cancellationSource = new CancellationTokenSource();

    TransferStatus transferStatus = await TransferManager.CopyDirectoryAsync(sourceBlobDir, destBlobDir, CopyMethod.ServiceSideAsyncCopy, options, context, cancellationSource.Token);
    return TransferStatusToString(blobDirectoryName, transferStatus);
}
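The GetCloudBlobDirectoryAsync helper referenced above isn't shown in the post, so here is a hedged sketch of what such a helper might look like, assuming the Microsoft.Azure.Storage.Blob client types that the Data Movement library works with:
// Assumes Microsoft.Azure.Storage and Microsoft.Azure.Storage.Blob (the client packages the Data Movement library builds on).
private static async Task<CloudBlobDirectory> GetCloudBlobDirectoryAsync(string connectionString, string containerName, string directoryPath)
{
    CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
    CloudBlobClient client = account.CreateCloudBlobClient();
    CloudBlobContainer container = client.GetContainerReference(containerName);

    // Make sure the container exists before handing back a directory reference.
    await container.CreateIfNotExistsAsync();

    // A blob "directory" is just a name prefix, so the reference itself doesn't need to exist yet.
    return container.GetDirectoryReference(directoryPath);
}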

Azure Function App copy blob from one container to another using startCopy in java

I am using Java to write an Azure Function App with an Event Grid trigger that fires on BlobCreated events. Whenever a blob is created, the function is triggered and copies the blob from one container to another. I am using the startCopy function from com.microsoft.azure.storage.blob. It was working fine, but sometimes it copies files as zero bytes even though they actually contain data in the source location, so the destination sometimes ends up with zero-byte files. I would like a little help with this so that I can understand how to handle this situation.
CloudBlockBlob cloudBlockBlob = container.getBlockBlobReference(blobFileName);
CloudStorageAccount storageAccountdest = CloudStorageAccount.parse("something");
CloudBlobClient blobClientdest = storageAccountdest.createCloudBlobClient();
CloudBlobContainer destcontainer = blobClientdest.getContainerReference("something");
CloudBlockBlob destcloudBlockBlob = destcontainer.getBlockBlobReference(blobFileName);
destcloudBlockBlob.startCopy(cloudBlockBlob);
Copying blobs across storage accounts is an asynchronous operation. When you call the startCopy method, it just signals Azure Storage to copy a file. The actual copy happens asynchronously and may take some time depending on how large a file you're transferring.
I would suggest checking the copy operation's progress on the target blob to see how many bytes have been copied and whether the copy failed. You can do so by fetching the properties of the target blob. A copy operation can fail if the source blob is modified after Azure Storage has started the copy.
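As a hedged sketch of that check (written here in C# against the classic storage SDK; the Java client used in the question exposes the same information via downloadAttributes() and getCopyState()), you would fetch the destination blob's attributes and inspect its copy state:
// Assumes the classic Microsoft.Azure.Storage.Blob types; destBlob stands for the destination blob the copy was started on.
destBlob.FetchAttributes();               // refresh properties, including CopyState
CopyState copyState = destBlob.CopyState;

if (copyState.Status == CopyStatus.Pending)
{
    Console.WriteLine($"Copy in progress: {copyState.BytesCopied} of {copyState.TotalBytes} bytes");
}
else if (copyState.Status == CopyStatus.Failed || copyState.Status == CopyStatus.Aborted)
{
    Console.WriteLine($"Copy did not complete: {copyState.StatusDescription}");
}
else if (copyState.Status == CopyStatus.Success)
{
    Console.WriteLine("Copy completed successfully.");
}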
I had the same problem, and later figured it out from the docs:
Event Grid isn't a data pipeline, and doesn't deliver the actual
object that was updated
Event Grid will tell you that something has changed, but the message itself has a size limit; as long as the data you are copying is within that limit the copy will succeed, otherwise you get 0 bytes. I was able to copy up to 1 MB, and beyond that it resulted in 0 bytes. You can check whether Azure has increased the size limit recently.
However, if you want to copy the complete data, you need to use Event Hubs or Service Bus. For mine, I went with Service Bus.

AzCopy blob download throwing errors on local machine

I am running the following command while learning how to use AzCopy.
azcopy /Source:https://storeaccountname.blob.core.windows.net/container /Dest:C:\container\ /SourceKey:Key /Pattern:"tdx" /S /V
Some files are downloaded, but most files result in an error like the following. I have no idea why this is happening and wondered if somebody has encountered this and knows the cause and the fix.
[2016/05/31 21:27:13][ERROR] tdx/logs/site-visit/archive/1463557944558/visit-1463557420000: Failed to open file C:\container\tdx\logs\site-visit\archive\1463557944558\visit-1463557420000: Access to the path 'C:\container\tdx\logs\site-visit\archive\1463557944558\visit-1463557420000' is denied..
My ultimate goal is to back up the blobs in a container of one storage account to a container in another storage account, so I am starting out with the basics, which already seem to fail.
Here is a list of folder names from an example path pulled from Azure Portal:
storeaccountname > Blob service > container > app-logs > hdfs > logs
application_1461803569410_0008
application_1461803569410_0009
application_1461803569410_0010
application_1461803569410_0011
application_1461803569410_0025
application_1461803569410_0027
application_1461803569410_0029
application_1461803569410_0031
application_1461803569410_0033
application_1461803569410_0035
application_1461803569410_0037
application_1461803569410_0039
application_1461803569410_0041
application_1461803569410_0043
application_1461803569410_0045
There is an error in the log for each one of these folders that looks like this:
[2016/05/31 21:29:18.830-05:00][VERBOSE] Transfer FAILED: app-logs/hdfs/logs/application_1461803569410_0008 => app-logs\hdfs\logs\application_1461803569410_0008.
[2016/05/31 21:29:18.834-05:00][ERROR] app-logs/hdfs/logs/application_1461803569410_0008: Failed to open file C:\container\app-logs\hdfs\logs\application_1461803569410_0008: Access to the path 'C:\container\app-logs\hdfs\logs\application_1461803569410_0008' is denied..
The folder application_1461803569410_0008 contains two files. Those two files were successfully downloaded. From the logs:
[2016/05/31 21:29:19.041-05:00][VERBOSE] Finished transfer: app-logs/hdfs/logs/application_1461803569410_0008/10.2.0.5_30050 => app-logs\hdfs\logs\application_1461803569410_0008\10.2.0.5_30050
[2016/05/31 21:29:19.084-05:00][VERBOSE] Finished transfer: app-logs/hdfs/logs/application_1461803569410_0008/10.2.0.4_30050 => app-logs\hdfs\logs\application_1461803569410_0008\10.2.0.4_30050
So it appears that the problem is related to copying folders, which are themselves blobs, but I can't be certain yet.
There are several known issues when using AzCopy, such as the one below, which will cause an error:
If there are two blobs named “a” and “a/b” under a storage container, copying the blobs under that container with /S will fail. Windows will not allow the creation of folder name “a” and file name “a” under the same folder.
Refer to https://blogs.msdn.microsoft.com/windowsazurestorage/2012/12/03/azcopy-uploadingdownloading-files-for-windows-azure-blobs/ and scroll down to the bottom to see the details under Known Issues.
In my container con2, there is a folder named abc.pdf and also a file named abc.pdf; when executing the AzCopy download command with /S, it throws an error message.
Please check whether your container contains folders with the same name as a file.

Error mounting CloudDrive snapshot in Azure

I've been running a cloud drive snapshot in dev for a while now with no probs. I'm now trying to get this working in Azure.
I can't for the life of me get it to work. This is my latest error:
Microsoft.WindowsAzure.Storage.CloudDriveException: Unknown Error HRESULT=D000000D --->
Microsoft.WindowsAzure.CloudDrive.Interop.InteropCloudDriveException: Exception of type 'Microsoft.WindowsAzure.CloudDrive.Interop.InteropCloudDriveException' was thrown.
at ThrowIfFailed(UInt32 hr)
at Microsoft.WindowsAzure.CloudDrive.Interop.InteropCloudDrive.Mount(String url, SignatureCallBack sign, String mount, Int32 cacheSize, UInt32 flags)
at Microsoft.WindowsAzure.StorageClient.CloudDrive.Mount(Int32 cacheSize, DriveMountOptions options)
Any idea what is causing this? I'm running both the WorkerRole and Storage in Azure so it's nothing to do with the dev simulation environment disconnect.
This is my code to mount the snapshot:
CloudDrive.InitializeCache(localPath.TrimEnd('\\'), size);

var container = _blobStorage.GetContainerReference(containerName);
var blob = container.GetPageBlobReference(driveName);

CloudDrive cloudDrive = _cloudStorageAccount.CreateCloudDrive(blob.Uri.AbsoluteUri);

string snapshotUri;
try
{
    snapshotUri = cloudDrive.Snapshot().AbsoluteUri;
    Log.Info("CloudDrive Snapshot = '{0}'", snapshotUri);
}
catch (Exception ex)
{
    throw new InvalidCloudDriveException(string.Format(
        "An exception has been thrown trying to create the CloudDrive '{0}'. This may be because it doesn't exist.",
        cloudDrive.Uri.AbsoluteUri), ex);
}

cloudDrive = _cloudStorageAccount.CreateCloudDrive(snapshotUri);
Log.Info("CloudDrive created: {0}", snapshotUri, cloudDrive);

string driveLetter = cloudDrive.Mount(size, DriveMountOptions.None);
The .Mount() method at the end is what's now failing.
Please help as this has me royally stumped!
Thanks in advance.
Dave
I finally got this to work last night. All I did was create a new container and upload my VHD to it so I'm not sure if there was something weird going on with the old container...? Can't think what. The old container must've been getting a bit long in the tooth..!?!
2 days of my life I'll never get back. Debugging live Azure issues is an excruciatingly tedious process.
It's a shame the Azure CloudDrive dev simulation doesn't more closely replicate the live environment.
One source of the D000000D InteropCloudDriveException is when the drive (or snapshot) being mounted is expandable rather than fixed size. Unfortunately, the MSDN documentation provides minimal information on restrictions, but this note is an excellent source of information:
http://convective.wordpress.com/2010/02/27/azure-drive/
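If the VHD being mounted was created locally by a tool that defaults to dynamically expanding disks, one workaround consistent with that note is to let the SDK create the drive itself, since CloudDrive.Create produces a fixed-size VHD page blob. A minimal sketch, assuming the classic Microsoft.WindowsAzure.StorageClient API used in the question (the 1024 MB size is a placeholder, and blob, size and _cloudStorageAccount are the variables from the code above):
CloudDrive drive = _cloudStorageAccount.CreateCloudDrive(blob.Uri.AbsoluteUri);
try
{
    // Creates a fixed-size VHD page blob of the requested size (in MB).
    drive.Create(1024);
}
catch (CloudDriveException)
{
    // The drive already exists; it is fine to ignore this and mount the existing VHD.
}
string driveLetter = drive.Mount(size, DriveMountOptions.None);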
I can confirm Dave's findings regarding the BLOB container (Love you Dave, I only spent one evening).
I also had problems debugging before changing the BLOB container.
The error message I had was "there was an error attaching the debugger to the IIS worker process for url ...".
Hope this helps some poor Azure dev having a challenging time with the debugger.
