Writing to Azure Block Blobs

I am using PutBlock and PutBlockList to upload data to a block blob. The code I am using is below:
CloudBlobContainer container = blobStorage.GetContainerReference("devicebackups");
var permissions = container.GetPermissions();
permissions.PublicAccess = BlobContainerPublicAccessType.Container;
container.SetPermissions(permissions);
CloudBlockBlob blob = container.GetBlockBlobReference(serialNo.ToLower() + " " + dicMonths[DateTime.Now.Month]);
try
{
    var serializer = new XmlSerializer(typeof(List<EnergyData>));
    var stringBuilder = new StringBuilder();
    using (XmlWriter writer = XmlWriter.Create(stringBuilder))
    {
        try
        {
            serializer.Serialize(writer, deviceData);
            byte[] byteArray = Encoding.UTF8.GetBytes(stringBuilder.ToString());
            List<string> blockIds = new List<string>();
            try
            {
                blockIds.AddRange(blob.DownloadBlockList(BlockListingFilter.Committed).Select(b => b.Name));
            }
            catch (StorageClientException e)
            {
                if (e.ErrorCode != StorageErrorCode.BlobNotFound)
                {
                    throw;
                }
                blob.Container.CreateIfNotExist();
            }
            var newId = Convert.ToBase64String(Encoding.UTF8.GetBytes(blockIds.Count().ToString()));
            blob.PutBlock(newId, new MemoryStream(byteArray), null);
            blockIds.Add(newId);
            blob.PutBlockList(blockIds);
        }
        catch (Exception ex)
        {
            UT.ExceptionReporting(ex, "Error in Updating Backup Blob - writing byte array to blob");
        }
    }
}
catch (Exception ex)
{
    UT.ExceptionReporting(ex, "Error in Updating Backup Blob - creating XmlWriter");
}
}
catch (Exception ex)
{
    UT.ExceptionReporting(ex, "Error in Updating Backup Blob - getting container and blob references, serial no -" + serialNo);
}
This works for 10 blocks, but on the 11th block it fails with the following error:
StorageClientException - The specified block list is invalid.
InnerException = {"The remote server returned an error: (400) Bad Request."}
I have searched the internet for reports of the same error, but had no luck.
Any help would be much appreciated.

For a given blob, the length of the value specified for the blockid
parameter must be the same size for each block.
http://msdn.microsoft.com/en-us/library/windowsazure/dd135726.aspx
The first 10 blocks are numbered 0 through 9. The 11th block is number 10, which is longer by one character. So you should change your numbering scheme to always use the same length. One solution would be to convert the count to a zero-padded string that's long enough to hold the number of blocks you expect to have.
But if you don't need the benefits of using blocks, you're probably better off just writing the whole blob in one go instead of using blocks.
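The zero-padding idea above can be sketched as follows. This is a minimal illustration, not the asker's code: the helper name BlockIds.Create and the default width of 6 are assumptions (a block blob holds at most 50,000 blocks, so six digits leave headroom). Note that the Base64 strings for "9" and "10" happen to have the same length, so the sketch keeps the length of the ID before encoding constant, which is what the failure on the 11th block suggests the service compares.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class BlockIds
{
    // Hypothetical helper: builds a block ID whose pre-encoding length is
    // always `width` characters, regardless of how many digits `index` has.
    public static string Create(int index, int width = 6)
    {
        if (index < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(index));
        }

        string padded = index.ToString(CultureInfo.InvariantCulture).PadLeft(width, '0');
        if (padded.Length != width)
        {
            // The index has more digits than the chosen width can hold.
            throw new ArgumentOutOfRangeException(nameof(index));
        }

        // Block IDs must be Base64-encoded before they are sent to the service.
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(padded));
    }
}
```

In the question's code, this would replace the line that computes newId, for example var newId = BlockIds.Create(blockIds.Count);. Blocks already committed with the old numbering scheme would still have IDs of a different length, so the existing blob may need to be deleted first.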

Set your block ID as in the code below:
var blockIdBase64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(blockId.ToString(CultureInfo.InvariantCulture).PadLeft(32, '0')));

My problem was that after 10 PutBlock calls I received a bad request (error 400).
I replaced Encoding.UTF8.GetBytes with System.BitConverter.GetBytes:
string blockIdBase64 = Convert.ToBase64String(System.BitConverter.GetBytes(x++));
The block IDs must all be the same size. BitConverter.GetBytes takes care of that, because it always returns the same number of bytes for a given integer type.
I still received the bad request (error 400). I solved it by deleting the temp blob with Azure Storage Explorer, which seems to reset the uncommitted blocks left over from my previous failed attempts.

Related

Access Denied exception being thrown in UWP

I'm trying to save a video file to a specific folder instead of a library, which is where it is saved by default. I'm using StorageFolder.GetFolderFromPathAsync to get the location. When execution reaches that line, it throws the exception 'Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED))'.
private async Task StartRecordingAsync()
{
    try
    {
        // Create storage file for the capture
        var videoFile = await _captureFolder.CreateFileAsync("SimpleVideo.mp4", CreationCollisionOption.GenerateUniqueName);
        var encodingProfile = MediaEncodingProfile.CreateMp4(VideoEncodingQuality.Auto);
        // Calculate rotation angle, taking mirroring into account if necessary
        Debug.WriteLine("Starting recording to " + videoFile.Path);
        await _mediaCapture.StartRecordToStorageFileAsync(encodingProfile, videoFile);
        _isRecording = true;
        _isPreviewing = true;
        Debug.WriteLine("Started recording!");
    }
    catch (Exception ex)
    {
        // File I/O errors are reported as exceptions
        Debug.WriteLine("Exception when starting video recording: " + ex.ToString());
    }
}
Code in between
private async Task SetupUiAsync()
{
    var lvmptVid = await StorageFolder.GetFolderFromPathAsync("C:\\Users\\Nano\\Documents\\lvmptVid");
    // var videosLibrary = await StorageLibrary.GetLibraryAsync(KnownLibraryId.Videos);
    // var picturesLibrary = await StorageLibrary.GetLibraryAsync(KnownLibraryId.Pictures);
    // Fall back to the local app storage if the Pictures Library is not available
    _captureFolder = lvmptVid;
}
I've tried different saving techniques and am currently revamping the process.
I've also tried using a public file location instead of a user-specific one.

When you use GetFolderFromPathAsync to get a folder directly from a path, make sure that you've enabled the broadFileSystemAccess capability in the manifest file.
Like this:
<Package
    xmlns:rescap="http://schemas.microsoft.com/appx/manifest/foundation/windows10/restrictedcapabilities"
    IgnorableNamespaces="uap mp rescap">
    <Capabilities>
        <rescap:Capability Name="broadFileSystemAccess" />
    </Capabilities>
Please also remember to enable the file system settings in Settings > Privacy > File system.

Azure Durable function removes files from local storage after it is downloaded

I am struggling a lot with this task. I have to download files from SFTP and then parse them. I am using Durable Functions like this:
[FunctionName("MainOrch")]
public async Task<List<string>> RunOrchestrator(
    [OrchestrationTrigger] IDurableOrchestrationContext context, ILogger log)
{
    try
    {
        var filesDownloaded = new List<string>();
        var filesUploaded = new List<string>();
        var files = await context.CallActivityAsync<List<string>>("SFTPGetListOfFiles", null);
        log.LogInformation("!!!!FilesFound*******!!!!!" + files.Count);
        if (files.Count > 0)
        {
            foreach (var fileName in files)
            {
                filesDownloaded.Add(await context.CallActivityAsync<string>("SFTPDownload", fileName));
            }
            var parsingTasks = new List<Task<string>>(filesDownloaded.Count);
            foreach (var downlaoded in filesDownloaded)
            {
                var parsingTask = context.CallActivityAsync<string>("PBARParsing", downlaoded);
                parsingTasks.Add(parsingTask);
            }
            await Task.WhenAll(parsingTasks);
        }
        return filesDownloaded;
    }
    catch (Exception ex)
    {
        throw;
    }
}
SFTPGetListOfFiles: This function connects to SFTP, gets the list of files in a folder, and returns it.
SFTPDownload: This function is supposed to connect to SFTP, download each file to the Azure Function's temp storage, and return the download path. (Each file is between 10 and 60 MB.)
[FunctionName("SFTPDownload")]
public async Task<string> SFTPDownload([ActivityTrigger] string name, ILogger log, Microsoft.Azure.WebJobs.ExecutionContext context)
{
    var downloadPath = "";
    try
    {
        using (var session = new Session())
        {
            try
            {
                session.ExecutablePath = Path.Combine(context.FunctionAppDirectory, "winscp.exe");
                session.Open(GetOptions(context));
                log.LogInformation("!!!!!!!!!!!!!!Connected For Download!!!!!!!!!!!!!!!");
                TransferOptions transferOptions = new TransferOptions();
                transferOptions.TransferMode = TransferMode.Binary;
                downloadPath = Path.Combine(Path.GetTempPath(), name);
                log.LogInformation("Downloading " + name);
                var transferResult = session.GetFiles("/Receive/" + name, downloadPath, false, transferOptions);
                log.LogInformation("Downloaded " + name);
                // Throw on any error
                transferResult.Check();
                log.LogInformation("!!!!!!!!!!!!!!Completed Download !!!!!!!!!!!!!!!!");
            }
            catch (Exception ex)
            {
                log.LogError(ex.Message);
            }
            finally
            {
                session.Close();
            }
        }
    }
    catch (Exception ex)
    {
        log.LogError(ex.Message);
        _traceService.TraceException(ex);
    }
    return downloadPath;
}
PBARParsing: This function has to get the stream of that file and process it. (Processing a 60 MB file might take a few minutes on an S2 plan scaled out to 10 instances.)
[FunctionName("PBARParsing")]
public async Task PBARParsing([ActivityTrigger] string pathOfFile,
    ILogger log)
{
    var theSplit = pathOfFile.Split("\\");
    var name = theSplit[theSplit.Length - 1];
    try
    {
        log.LogInformation("**********Starting" + name);
        Stream stream = File.OpenRead(pathOfFile);
I want the download of all files to be completed using SFTPDownload, which is why "await" is in a loop. Then I want parsing to run in parallel.
Question 1: Does the code in the MainOrch function seem correct for doing these three things: 1) getting the names of the files, 2) downloading them one by one and not starting the parsing function until all files are downloaded, and then 3) parsing the files in parallel?
I observed that what I mentioned in Question 1 is working as expected.
Question 2: 30% of the files are parsed, and for the 80% I see errors that "Could not find file 'D:\local\Temp\fileName'". Is Azure Functions removing the files after I place them? Is there any other approach I can take? If I change the path to "D:\home" I might see a "File is being used by another process" error, but I haven't tried it yet. Out of the 68 files on SFTP, weirdly, the last 20 ran and the first 40 files were not found at that path, and this is in sequence.
Question 3: I also see this error: "Singleton lock renewal failed for blob 'func-eres-integration-dev/host' with error code 409: LeaseIdMismatchWithLeaseOperation. The last successful renewal completed at 2020-08-08T17:57:10.494Z (46005 milliseconds ago) with a duration of 155 milliseconds. The lease period was 15000 milliseconds." Does it tell us something? It came up just once, though.
Update:
After using "D:\home" I am not getting file-not-found errors.

For others coming across this, the temporary storage is local to an instance of the function app, which will be different when the function scales out.
For such scenarios, D:\home is a better alternative as Azure Files is mounted here, which is the same across all instances.
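As a rough sketch of that suggestion, the download activity could build its target path under the shared mount instead of Path.GetTempPath(). Everything named here is an assumption for illustration: the helper SharedDownloadPath.For, the "data/sftp-downloads" folder, and the reliance on the HOME environment variable pointing at the shared D:\home mount on a Windows plan.

```csharp
using System;
using System.IO;

public static class SharedDownloadPath
{
    // Hypothetical helper, not part of the original post.
    // root: base directory to use. When null or empty, falls back to the HOME
    // environment variable (assumed to be the shared D:\home mount) and, if
    // that is not set either, to the instance-local temp path.
    public static string For(string fileName, string root = null)
    {
        if (string.IsNullOrEmpty(root))
        {
            root = Environment.GetEnvironmentVariable("HOME");
        }
        if (string.IsNullOrEmpty(root))
        {
            root = Path.GetTempPath();
        }

        // "data/sftp-downloads" is an assumed folder name.
        string directory = Path.Combine(root, "data", "sftp-downloads");
        Directory.CreateDirectory(directory); // does nothing if it already exists

        // GetFileName drops any directory part of the incoming name.
        return Path.Combine(directory, Path.GetFileName(fileName));
    }
}
```

In SFTPDownload, the line that sets downloadPath would then become downloadPath = SharedDownloadPath.For(name);, so that whichever instance later runs PBARParsing looks for the file in the same shared location.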
As for the lock renewal error observed here, this issue tracks it but shouldn't cause issues as mentioned. If you do see any issue because of this, it would be best to share details in that issue.

How to reduce Azure web app temp file utilization

I have a web app developed in ASP.Net MVC 5 hosted in Azure. I am using a shared app service, not VMs. Recently Azure has started showing warnings that I need to reduce my app's usage of temporary files on workers.
[Screenshot: Temp file utilization]
After restarting the app, the problem went away. It seems that the temporary files were cleared by the restart.
How can I detect and prevent unexpected growth of temporary file usage? I am not sure what generated 20 GB of temporary files. What should I look for to reduce the app's usage of temporary files? I am not explicitly storing anything in temporary files in code, and data is stored in the database, so I am not sure what to look for.
What are the best practices to follow in order to keep temp file usage in a healthy state and prevent any unexpected growth?
Note: I have multiple virtual paths with the same physical path in my web app.
[Screenshot: Virtual path]
try
{
    if (file != null && file.ContentLength > 0)
    {
        var fileName = uniqefilename;
        CloudStorageAccount storageAccount = AzureBlobStorageModel.GetConnectionString();
        if (storageAccount != null)
        {
            CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
            string containerName = "storagecontainer";
            CloudBlobContainer container = blobClient.GetContainerReference(containerName);
            bool isContainerCreated = container.CreateIfNotExists(BlobContainerPublicAccessType.Blob);
            CloudBlobDirectory folder = container.GetDirectoryReference("employee");
            CloudBlockBlob blockBlob = folder.GetBlockBlobReference(fileName);
            UploadDirectory = String.Format("~/upload/{0}/", "blobfloder");
            physicalPath = HttpContext.Server.MapPath(UploadDirectory + fileName);
            file.SaveAs(physicalPath);
            isValid = IsFileValid(ext, physicalPath);
            if (isValid)
            {
                using (var fileStream = System.IO.File.OpenRead(physicalPath))
                {
                    blockBlob.Properties.ContentType = file.ContentType;
                    blockBlob.UploadFromFile(physicalPath);
                    if (blockBlob.Properties.Length >= 0)
                    {
                        docURL = blockBlob.SnapshotQualifiedUri.ToString();
                        IsExternalStorage = true;
                        System.Threading.Tasks.Task T = new System.Threading.Tasks.Task(() => deletefile(physicalPath));
                        T.Start();
                    }
                }
            }
        }
    }
}
catch (Exception ex)
{
}
//Delete File
public void deletefile(string filepath)
{
    try
    {
        if (!string.IsNullOrWhiteSpace(filepath))
        {
            System.GC.Collect();
            System.GC.WaitForPendingFinalizers();
            System.IO.File.Delete(filepath);
        }
    }
    catch(Exception e) { }
}

Your problem may be caused by using temporary files to process uploads or downloads. The solution would be either to process files using a memory stream instead of a file stream, or to delete the temporary files after you have finished processing. This SO exchange has some relevant suggestions:
Azure Web App Temp file cleaning responsibility
Given your update, it looks like your file upload code lets temp files accumulate in line 39, because you are not waiting for your async call to delete the file to finish before you exit. I assume that this code block is tucked inside an MVC controller action, which means that, as soon as the code block is finished, it will abandon the un-awaited async action, leaving you with an undeleted temp file.
Consider updating your code to await your Task action. Also, you may want to update to Task.Run. E.g.,
var deleted = await Task.Run(() =>
{
    // perform your deletion in here
    deletefile(physicalPath);
    return true; // return a value only if you need one
});
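One way to make the "delete the temporary files after you are finished" advice hard to forget is to scope the temp file to a single call. The sketch below is an illustration under assumptions, not the asker's code: TempFileScope.Use and its parameters are hypothetical names, and the temp location is simply Path.GetTempPath(). It closes the write handle before processing and deletes the file in a finally block, so the cleanup runs synchronously even when processing throws.

```csharp
using System;
using System.IO;

public static class TempFileScope
{
    // Hypothetical helper: copies `upload` to a temp file, hands the path to
    // `process`, and always deletes the file before returning.
    public static T Use<T>(Stream upload, Func<string, T> process)
    {
        string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        try
        {
            using (FileStream file = File.Create(path))
            {
                upload.CopyTo(file);
            } // the write handle is closed here, before anything else touches the file

            return process(path);
        }
        finally
        {
            // File.Delete does not throw when the file is already gone.
            File.Delete(path);
        }
    }
}
```

A call might look like var length = TempFileScope.Use(file.InputStream, p => new FileInfo(p).Length);, with validation and the blob upload placed inside the callback. Any stream the callback opens on the path needs to be disposed before the callback returns, otherwise the delete can fail because the file is still in use.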

How can I cleanup files & directories in Azure Local Storage after a file begins streaming to the browser?

BACKGROUND: I'm making use of Azure Local Storage. This is supposed to be treated as "volatile" storage. First of all, how long do the files & directories that I create persist on the Web Role Instances (there are 2, in my case)? Do I need to worry about running out of storage if I don't do cleanup on those files/directories after each user is done with it? What I'm doing is I'm pulling multiple files from a separate service, storing them in Azure Local Storage, compressing them into a zip file and storing that zip file, and then finally file streaming that zip file to the browser.
THE PROBLEM: This all works beautifully except for one minor hiccup. The file seems to stream to the browser asynchronously. So what happens is that an exception gets thrown when I try to delete the zipped file from azure local storage afterward since it is still in the process of streaming to the browser. What would be the best approach to forcing the deletion process to happen AFTER the file is completely streamed to the browser?
Here is my code:
using (Service.Company.ServiceProvider CONNECT = new eZ.Service.CompanyConnect.ServiceProvider())
{
    // Iterate through all of the files chosen
    foreach (Uri fileId in fileIds)
    {
        // Get the int file id value from the uri
        System.Text.RegularExpressions.Regex rex = new System.Text.RegularExpressions.Regex(@"e[B|b]://[^\/]*/\d*/(\d*)");
        string id_str = rex.Match(fileId.ToString()).Groups[1].Value;
        int id = int.Parse(id_str);
        // Get the file object from eB service from the file id passed in
        eZ.Data.File f = new eZ.Data.File(CONNECT.eZSession, id);
        f.Retrieve("Header; Repositories");
        string _fileName = f.Name;
        try
        {
            using (MemoryStream stream = new MemoryStream())
            {
                f.ContentData = new eZ.ContentData.File(f, stream);
                // After the ContentData is created, hook into the event
                f.ContentData.TransferProgressed += (sender, e) => { Console.WriteLine(e.Percentage); };
                // Now do the transfer, the event will fire as blocks of data is read
                int bytesRead;
                f.ContentData.OpenRead();
                // Open the Azure Local Storage file stream
                using (azure_file_stream = File.OpenWrite(curr_user_path + _fileName))
                {
                    while ((bytesRead = f.ContentData.Read()) > 0)
                    {
                        // Write the chunk to azure local storage
                        byte[] buffer = stream.GetBuffer();
                        azure_file_stream.Write(buffer, 0, bytesRead);
                        stream.Position = 0;
                    }
                }
            }
        }
        catch (Exception e)
        {
            throw e;
            //Console.WriteLine("The following error occurred: " + e);
        }
        finally
        {
            f.ContentData.Close();
        }
    } // end of foreach block
} // end of eB using block
string sevenZipDllPath = Path.Combine(Utilities.GetCurrentAssemblyPath(), "7z.dll");
Global.logger.Info(string.Format("sevenZipDllPath: {0}", sevenZipDllPath));
SevenZipCompressor.SetLibraryPath(sevenZipDllPath);
var compressor = new SevenZipCompressor
{
    ArchiveFormat = OutArchiveFormat.Zip,
    CompressionLevel = CompressionLevel.Fast
};
// Compress the user directory
compressor.CompressDirectory(webRoleAzureStorage.RootPath + curr_user_directory, curr_user_package_path + "Package.zip");
// stream Package.zip to the browser
httpResponse.BufferOutput = false;
httpResponse.ContentType = Utilities.GetMIMEType("BigStuff3.mp4");
httpResponse.AppendHeader("content-disposition", "attachment; filename=Package.zip");
azure_file_stream = File.OpenRead(curr_user_package_path + "Package.zip");
azure_file_stream.CopyTo(httpResponse.OutputStream);
httpResponse.End();
// Azure Local Storage cleanup
foreach (FileInfo file in user_directory.GetFiles())
{
    file.Delete();
}
foreach (FileInfo file in package_directory.GetFiles())
{
    file.Delete();
}
user_directory.Delete();
package_directory.Delete();
}

Can you simply run a job on the machine that cleans up files after say a day of their creation? This could be as simple as a batch file in the task scheduler or a separate thread started from WebRole.cs.
You can even use AzureWatch to auto-re-image your instance if the local space drops below a certain threshold.
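A cleanup job of the kind described above could look roughly like this. It is a sketch under assumptions: StaleFileCleaner.DeleteOlderThan is a hypothetical name, file age is judged by last write time rather than creation time, and files that are still locked (for example, a zip that is still streaming) are skipped so a later run can pick them up.

```csharp
using System;
using System.IO;

public static class StaleFileCleaner
{
    // Hypothetical helper: deletes files under `directory` whose last write
    // time is at least `maxAge` before `nowUtc`. Returns the number deleted.
    public static int DeleteOlderThan(string directory, TimeSpan maxAge, DateTime nowUtc)
    {
        int deleted = 0;

        // GetFiles returns a snapshot, so deleting while looping is safe.
        foreach (string path in Directory.GetFiles(directory, "*", SearchOption.AllDirectories))
        {
            if (nowUtc - File.GetLastWriteTimeUtc(path) < maxAge)
            {
                continue; // too recent, possibly still being downloaded
            }

            try
            {
                File.Delete(path);
                deleted++;
            }
            catch (IOException)
            {
                // File is still in use; leave it for the next run.
            }
            catch (UnauthorizedAccessException)
            {
                // No permission or read-only file; leave it for the next run.
            }
        }

        return deleted;
    }
}
```

Called periodically from a timer or background thread, for example StaleFileCleaner.DeleteOlderThan(rootPath, TimeSpan.FromDays(1), DateTime.UtcNow), this avoids having to delete the zip at the exact moment the response finishes streaming.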

Could you place the files (esp. the final compressed one that the users download) in Windows Azure blob storage? The file could be made public, or create a Shared Access Signature so that only the persons you provide the URL to could download it. Placing the files in blob storage for download could alleviate some pressures on the web server.

"The specified block list is invalid" while uploading blobs in parallel

I've a (fairly large) Azure application that uploads (fairly large) files in parallel to Azure blob storage.
In a few percent of uploads I get an exception:
The specified block list is invalid.
System.Net.WebException: The remote server returned an error: (400) Bad Request.
This is when we run a fairly innocuous looking bit of code to upload a blob in parallel to Azure storage:
public static void UploadBlobBlocksInParallel(this CloudBlockBlob blob, FileInfo file)
{
    blob.DeleteIfExists();
    blob.Properties.ContentType = file.GetContentType();
    blob.Metadata["Extension"] = file.Extension;
    byte[] data = File.ReadAllBytes(file.FullName);
    int numberOfBlocks = (data.Length / BlockLength) + 1;
    string[] blockIds = new string[numberOfBlocks];
    Parallel.For(
        0,
        numberOfBlocks,
        x =>
        {
            string blockId = Convert.ToBase64String(Guid.NewGuid().ToByteArray());
            int currentLength = Math.Min(BlockLength, data.Length - (x * BlockLength));
            using (var memStream = new MemoryStream(data, x * BlockLength, currentLength))
            {
                var blockData = memStream.ToArray();
                var md5Check = System.Security.Cryptography.MD5.Create();
                var md5Hash = md5Check.ComputeHash(blockData, 0, blockData.Length);
                blob.PutBlock(blockId, memStream, Convert.ToBase64String(md5Hash));
            }
            blockIds[x] = blockId;
        });
    byte[] fileHash = _md5Check.ComputeHash(data, 0, data.Length);
    blob.Metadata["Checksum"] = BitConverter.ToString(fileHash).Replace("-", string.Empty);
    blob.Properties.ContentMD5 = Convert.ToBase64String(fileHash);
    data = null;
    blob.PutBlockList(blockIds);
    blob.SetMetadata();
    blob.SetProperties();
}
All very mysterious; I'd think the algorithm we're using to calculate the block list should produce strings that are all the same length...

We ran into a similar issue, however we were not specifying any block ID or even using the block ID anywhere. In our case, we were using:
using (CloudBlobStream stream = blob.OpenWrite(condition))
{
    //// [write data to stream]
    stream.Flush();
    stream.Commit();
}
This would cause "The specified block list is invalid." errors under parallelized load. Switching this code to use the UploadFromStream(…) method while buffering the data into memory fixed the issue:
using (MemoryStream stream = new MemoryStream())
{
    //// [write data to stream]
    stream.Seek(0, SeekOrigin.Begin);
    blob.UploadFromStream(stream, condition);
}
Obviously this could have negative memory ramifications if too much data is buffered into memory, but this is a simplification. One thing to note is that UploadFromStream(...) uses Commit() in some cases, but checks additional conditions to determine the best method to use.

This exception can also happen when multiple threads open a stream to a blob with the same name and try to write to that blob simultaneously.

NOTE: this solution is based on the Azure SDK for Java, but I think we can safely assume that the pure REST version will have the very same effect (as will any other language, actually).
Since I have spent entire work day fighting this issue, even if this is actually a corner case, I'll leave a note here, maybe it will be of help to someone.
I did everything right. I had block IDs in the right order, I had block IDs of the same length, I had a clean container with no leftovers of some previous blocks (these three reasons are the only ones I was able to find via Google).
There was one catch: I've been building my block list for commit via
CloudBlockBlob.commitBlockList(Iterable<BlockEntry> blockList)
with use of this constructor:
BlockEntry(String id, BlockSearchMode searchMode)
passing
BlockSearchMode.COMMITTED
in the second argument. And THAT proved to be the root cause. Once I changed it to
BlockSearchMode.UNCOMMITTED
and eventually landed on the one-parameter constructor
BlockEntry(String id)
which uses UNCOMMITTED by default, committing the block list worked and the blob was successfully persisted.
