Azure Blob Storage - upload file with progress - node.js

I have the following code, which is fairly standard for uploading files to Azure Blob Storage. However, when I upload a file, instead of onProgress being executed many times, it is executed exactly once, with the file.size value. The file is (slowly) sent to Azure, but the progress callback only fires once, when the upload has finished.
const requestOptions = this.mergeWithDefaultOptions(perRequestOptions);
const client = this.getRequestClient(requestOptions);
const containerClient = await client.getContainerClient(this.options.containerName);
const blobClient = await containerClient.getBlockBlobClient(file.name);
const uploadStatus = await blobClient.upload(file.buffer, file.size, {onProgress: progressCallBack});
What I would love to know is whether that outcome is normal for this library (for downloading files from Azure, the same approach works correctly).

According to my test, upload is a non-parallel upload method: it sends a single Put Blob request to the Azure Storage server, so onProgress is only reported once, when that request completes. For more details, please refer to the Put Blob documentation.
So if you want onProgress to be executed many times, I suggest you use the method uploadStream instead. It uploads with the Put Block and Put Block List operations, so progress is reported as the blocks are uploaded. For more details, please refer to the uploadStream documentation.
For example:
const fs = require("fs");
const { BlobServiceClient, StorageSharedKeyCredential } = require("@azure/storage-blob");

// inside an async function:
try {
  var creds = new StorageSharedKeyCredential(accountName, accountKey);
  var blobServiceClient = new BlobServiceClient(
    `https://${accountName}.blob.core.windows.net`,
    creds
  );
  var containerClient = blobServiceClient.getContainerClient("upload");
  var blob = containerClient.getBlockBlobClient(
    "spark-3.0.1-bin-hadoop3.2.tgz"
  );
  var maxConcurrency = 20; // max uploading concurrency
  var blockSize = 4 * 1024 * 1024; // the block size in the uploaded block blob
  var res = await blob.uploadStream(
    fs.createReadStream("d:/spark-3.0.1-bin-hadoop3.2.tgz", {
      highWaterMark: blockSize,
    }),
    blockSize,
    maxConcurrency,
    { onProgress: (ev) => console.log(ev) }
  );
  console.log(res._response.status);
} catch (error) {
  console.log(error);
}
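If, as in the question, the file is already in memory as a buffer (for example from an upload middleware), you can wrap the buffer in a readable stream and feed it to uploadStream. A minimal sketch under that assumption, reusing containerClient, file, and progressCallBack from the question; the block size and concurrency values are arbitrary:

const { Readable } = require("stream");

const blockBlobClient = containerClient.getBlockBlobClient(file.name);
const blockSize = 4 * 1024 * 1024; // stage the buffer in 4 MiB blocks
const maxConcurrency = 5;

const uploadStatus = await blockBlobClient.uploadStream(
  Readable.from(file.buffer), // emit the buffer as a stream so it is staged block by block
  blockSize,
  maxConcurrency,
  { onProgress: progressCallBack } // now fires as each block is uploaded
);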

Related

How to check existence of soft deleted file in Azure blob container with node js?

I have a file which was stored in an Azure blob directory, "folder1/folder2/file.txt". This file was soft deleted; I can see it in the Azure web console. I need a function which checks whether this file exists.
I tried the library "azure-storage". It works perfectly with NOT deleted files:
const blobService = azure.createBlobService(connectingString);
blobService.doesBlobExist(container, blobPath, callback)
Maybe anyone knows how to use the same approach with soft-deleted files?
I also tried the lib "@azure/storage-blob".
But I got stuck with the endless entities there (BlobServiceClient, ContainerItem, BlobClient, ContainerClient, etc.) and couldn't find a way to see a particular file in a particular blob directory.
Following this MSDOC, I was able to restore the soft-deleted blobs and get their names with the code snippet below.
const { BlobServiceClient } = require('@azure/storage-blob');

const connstring = "DefaultEndpointsProtocol=https;AccountName=kvpstorageaccount;AccountKey=<Storage_Account_Key>;EndpointSuffix=core.windows.net"
if (!connstring) throw Error('Azure Storage Connection string not found');
const blobServiceClient = BlobServiceClient.fromConnectionString(connstring);

async function main(){
  const containerName = 'kpjohncontainer';
  const blobName = 'TextFile05.txt';
  const containerClient = blobServiceClient.getContainerClient(containerName);
  undeleteBlob(containerClient, blobName)
}

main()
  .then(() => console.log(`done`))
  .catch((ex) => console.log(ex.message));

async function undeleteBlob(containerClient, blobName){
  const blockBlobClient = await containerClient.getBlockBlobClient(blobName);
  await blockBlobClient.undelete(); //to restore the deleted blob
  console.log(`undeleted blob ${blobName}`);
}
To check whether a blob exists, including when it is in the soft-deleted state, I found relevant code provided by @Gaurav Mantri, but it is in C#. To achieve the same in NodeJS, refer here.
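As a rough Node.js equivalent, you can list the container's blobs with the includeDeleted option and match on the name. A minimal sketch (the helper name and parameter values are illustrative, not part of the SDK):

const { BlobServiceClient } = require('@azure/storage-blob');

// Sketch: resolves to true if a blob with this exact name exists in a soft-deleted state
async function softDeletedBlobExists(connstring, containerName, blobPath) {
  const blobServiceClient = BlobServiceClient.fromConnectionString(connstring);
  const containerClient = blobServiceClient.getContainerClient(containerName);
  // includeDeleted lists soft-deleted blobs as well; prefix narrows the listing to the "directory"
  for await (const blob of containerClient.listBlobsFlat({ includeDeleted: true, prefix: blobPath })) {
    if (blob.name === blobPath && blob.deleted) {
      return true;
    }
  }
  return false;
}

// usage (hypothetical values)
// softDeletedBlobExists(connstring, 'container1', 'folder1/folder2/file.txt').then(console.log);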

Authorization error from beginCopyFromURL API from JavaScript library (@azure/storage-blob) when executed from minikube

I have an application registered in Azure and it has the Storage Account Contributor role. I am trying to copy content from one account to another in the same subscription by using a SAS token. Below is a code snippet for testing purposes. This code works perfectly fine from standalone Node.js, but it fails with authorization error code 403 when deployed in a minikube pod. Any suggestions/thoughts will be appreciated.
I have verified the start and end dates of the signature.
The permissions are broader than needed, but they seem to be correct.
For testing, the expiry is kept at 24 hrs.
If I copy the SAS URL generated by the failing code, I can download the file from my host machine using the azcopy command line. It looks like the code fails only when executed from the minikube pod.
const { ClientSecretCredential } = require("@azure/identity");
const { BlobServiceClient, UserDelegationKey, ContainerSASPermissions, generateBlobSASQueryParameters } = require("@azure/storage-blob");

module.exports = function () {
  /*
  This function will receive an input that conforms to the schema specified in
  activity.json. The output is a callback function that follows node's error first
  convention. The first parameter is either null or an Error object. The second parameter
  of the output callback should be a JSON object that conforms to the schema specified
  in activity.json
  */
  this.execute = async function (input, output) {
    try {
      if (input.connection) {
        const containerName = input.sourcecontainer.trim()
        const credential = new ClientSecretCredential(input.connection.tenantId, input.connection.clientid, input.connection.clientsecret);
        const { BlobServiceClient } = require("@azure/storage-blob");
        // Enter your storage account name
        const account = input.sourceaccount.trim();
        const accounturl = 'https://'.concat(account).concat('.blob.core.windows.net')
        const blobServiceClient = new BlobServiceClient(
          accounturl,
          credential);
        const keyStart = new Date()
        const keyExpiry = new Date(new Date().valueOf() + 86400 * 1000)
        const userDelegationKey = await blobServiceClient.getUserDelegationKey(keyStart, keyExpiry);
        console.log(userDelegationKey)
        const containerSAS = generateBlobSASQueryParameters({
            containerName,
            permissions: ContainerSASPermissions.parse("racwdl"),
            startsOn: new Date(),
            expiresOn: new Date(new Date().valueOf() + 86400 * 1000),
          },
          userDelegationKey, account).toString();
        const target = '/' + containerName + '/' + input.sourcefolder.trim() + '/' + input.sourcefilename.trim()
        const sastoken = accounturl + target + '?' + containerSAS
        console.log(sastoken)
        let outputData = {
          "sourcesas": sastoken
        }
        //Testing second action execution from same action for testing purpose.
        const containerName2 = 'targettestcontainer'
        const credential2 = new ClientSecretCredential(input.connection.tenantId, input.connection.clientid, input.connection.clientsecret);
        // Enter your storage account name
        const blobServiceClient2 = new BlobServiceClient(
          accounturl,
          credential2);
        const destContainer = blobServiceClient2.getContainerClient(containerName2);
        const destBlob = destContainer.getBlobClient('testfolder01' + '/' + 'test-code.pdf');
        const copyPoller = await destBlob.beginCopyFromURL(outputData.sourcesas);
        const result = await copyPoller.pollUntilDone();
        return output(null, outputData)
      }
    } catch (e) {
      console.log(e)
      return output(e, null)
    }
  }
}
Thank you EmmaZhu-MSFT for providing the solution. A similar issue was also raised on GitHub. Posting this as an answer to help other community members.
From service side log, seems there's time skew between Azure Storage Service and the client, the start time used in source SAS token was later than server time. We'd suggest not using start time in SAS token to avoid this kind of failure caused by time skew.
Reference : https://github.com/Azure/azure-sdk-for-js/issues/21977
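In other words, dropping startsOn (or back-dating it) when building the SAS avoids the problem. A minimal sketch based on the question's code, assuming userDelegationKey, containerName, and account are already set up as above:

// Generate the user delegation SAS without a start time, so clock skew between
// the pod and the storage service cannot make the token "not yet valid".
const containerSAS = generateBlobSASQueryParameters(
  {
    containerName,
    permissions: ContainerSASPermissions.parse("racwdl"),
    // no startsOn: the token is usable immediately from the server's point of view
    expiresOn: new Date(Date.now() + 86400 * 1000), // 24 hours
  },
  userDelegationKey,
  account
).toString();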

Issue in blob storage to fileShare big file transfer: using uploadRange (Node.js cloud function), it transfers only partial files

We are transferring big files from blob storage to a file share using uploadRange in a Node.js cloud function, and it transfers only partial files.
When we transfer a file of size 10 MB, it transfers only 9.7 MB.
When we transfer a file of size 50 MB, it transfers only 49.5 MB.
It then fails with: Stack: RangeError: contentLength must be > 0 and <= 4194304 bytes
Code snippet:
const fileName = path.basename('master/test/myTestXml.xml')
const fileClient = directoryClient.getFileClient(fileName);
const fileContent = await streamToString(downloadBlockBlobResponse.readableStreamBody)
await fileClient.uploadRange(fileContent, 0, fileContent.length, {
  rangeSize: 50 * 1024 * 1024, // 4MB range size
  parallelism: 20, // 20 concurrency
  onProgress: (ev) => console.log(ev)
});
After transferring the partial file it gives the error below; any suggestions on how we can transfer big files using rangeSize?
Stack: RangeError: contentLength must be > 0 and <= 4194304 bytes
If you just want to transfer some files from Azure blob storage to Azure File share, you can generate a blob URL with SAS token on your server-side and use the startCopyFromURL function to let your File share copy this file instead of downloading from Azure blob storage and upload to file share. It eases the pressure of your server and the copy process is quick as it uses Azure's internal network.
Just try code below:
const { ShareServiceClient } = require("@azure/storage-file-share");
const storage = require('azure-storage');

const connStr = "";
const shareName = "";
const sharePath = "";
const srcBlobContainer = "";
const srcBlob = "";

const blobService = storage.createBlobService(connStr);

// Create a SAS token that expires in 1 hour
// Set start time to five minutes ago to avoid clock skew.
var startDate = new Date();
startDate.setMinutes(startDate.getMinutes() - 5);
var expiryDate = new Date(startDate);
expiryDate.setMinutes(startDate.getMinutes() + 60);

// grant read permission
var permissions = storage.BlobUtilities.SharedAccessPermissions.READ;

var sharedAccessPolicy = {
  AccessPolicy: {
    Permissions: permissions,
    Start: startDate,
    Expiry: expiryDate
  }
};

var srcBlobURL = blobService.getUrl(srcBlobContainer, srcBlob)
var sasToken = blobService.generateSharedAccessSignature(srcBlobContainer, srcBlob, sharedAccessPolicy)
var srcCopyURL = srcBlobURL + "?" + sasToken

const serviceClient = ShareServiceClient.fromConnectionString(connStr);
const fileClient = serviceClient.getShareClient(shareName).getDirectoryClient(sharePath).getFileClient(srcBlob);

fileClient.startCopyFromURL(srcCopyURL).then(function(){ console.log("done") })
I have tested this on my side in my storage account; copying a file of about 11 MB took about 5 s.
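If you do need to push the bytes through your own code instead of using server-side copy, keep in mind that uploadRange accepts at most 4 MiB per call (which is exactly what the RangeError says). A rough sketch of splitting the transfer into 4 MiB ranges, assuming blockBlobClient (source blob) and fileClient (target file) are already constructed:

const RANGE_SIZE = 4 * 1024 * 1024; // uploadRange accepts at most 4 MiB per call

// Sketch: copy a blob to an Azure file by moving it range by range
async function copyBlobToFileInRanges(blockBlobClient, fileClient) {
  const props = await blockBlobClient.getProperties();
  const totalSize = props.contentLength;

  // The target file must be created with its full size before ranges can be written
  await fileClient.create(totalSize);

  for (let offset = 0; offset < totalSize; offset += RANGE_SIZE) {
    const count = Math.min(RANGE_SIZE, totalSize - offset);
    // Download just this slice of the blob into a buffer...
    const chunk = Buffer.alloc(count);
    await blockBlobClient.downloadToBuffer(chunk, offset, count);
    // ...and write it to the same offset in the target file
    await fileClient.uploadRange(chunk, offset, count);
  }
}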

How to handle Optimistic Concurrency in Azure.Storage.Blobs v12.x.x of Azure dll

I am trying to implement the example shared in this Learn path:
https://github.com/MicrosoftDocs/mslearn-support-concurrency-blob-storage/blob/master/src/OptimisticNewsEditor/Program.cs
I am trying to use the v12 dll, which is Azure.Storage.Blobs.
This is the code I have.
public static async Task Main()
{
    BlobContainerClient container;
    try
    {
        container = new BlobServiceClient(connectionString).GetBlobContainerClient(containerName);
        await container.CreateIfNotExistsAsync(PublicAccessType.None);
    }
    catch (Exception)
    {
        var msg = $"Storage account not found. Ensure that the environment variable " +
            " is set to a valid Azure Storage connection string and that the storage account exists.";
        Console.WriteLine(msg);
        return;
    }
    // First, the newsroom chief writes the story notes to the blob
    await SimulateChief();
    Console.WriteLine();
    await Task.Delay(TimeSpan.FromSeconds(2));
    // Next, two reporters begin work on the story at the same time, one starting soon after the other
    var reporterA = SimulateReporter("Reporter A", writingTime: TimeSpan.FromSeconds(12));
    await Task.Delay(TimeSpan.FromSeconds(4));
    var reporterB = SimulateReporter("Reporter B", writingTime: TimeSpan.FromSeconds(4));
    await Task.WhenAll(reporterA, reporterB);
    await Task.Delay(TimeSpan.FromSeconds(2));
    Console.WriteLine();
    Console.WriteLine("=============================================");
    Console.WriteLine();
    Console.WriteLine("Reporters have finished, here's the story saved to the blob:");
    BlobDownloadInfo story = await container.GetBlobClient(blobName).DownloadAsync();
    Console.WriteLine(new StreamReader(story.Content).ReadToEnd());
}

private static async Task SimulateReporter(string authorName, TimeSpan writingTime)
{
    // First, the reporter retrieves the current contents
    Console.WriteLine($"{authorName} begins work");
    var blob = new BlobContainerClient(connectionString, containerName).GetBlobClient(blobName);
    var contents = await blob.DownloadAsync();
    Console.WriteLine($"{authorName} loads the file and sees the following content: \"{new StreamReader(contents.Value.Content).ReadToEnd()}\"");
    // Store the current ETag
    var properties = await blob.GetPropertiesAsync();
    var currentETag = properties.Value.ETag;
    Console.WriteLine($"\"{contents}\" has this ETag: {properties.Value.ETag}");
    // Next, the author writes their story. This takes some time.
    Console.WriteLine($"{authorName} begins writing their story...");
    await Task.Delay(writingTime);
    Console.WriteLine($"{authorName} has finished writing their story");
    try
    {
        // Finally, they save their story back to the blob.
        var story = $"[[{authorName.ToUpperInvariant()}'S STORY]]";
        await uploadDatatoBlob(blob, story);
        Console.WriteLine($"{authorName} has saved their story to Blob storage. New blob contents: \"{story}\"");
    }
    catch (RequestFailedException e)
    {
        // Catch error if the ETag has changed its value since opening the file
        Console.WriteLine($"{authorName} sorry cannot save the file as server returned an error: {e.Message}");
    }
}

private static async Task SimulateChief()
{
    var blob = new BlobContainerClient(connectionString, containerName).GetBlobClient(blobName);
    var notes = "[[CHIEF'S STORY NOTES]]";
    await uploadDatatoBlob(blob, notes);
    Console.WriteLine($"The newsroom chief has saved story notes to the blob {containerName}/{blobName}");
}

private static async Task uploadDatatoBlob(BlobClient blob, string notes)
{
    byte[] byteArray = Encoding.UTF8.GetBytes(notes);
    MemoryStream stream = new MemoryStream(byteArray);
    await blob.UploadAsync(stream, overwrite: true);
}
I need to modify UploadAsync to check the ETag before uploading.
In the old version of the Azure .NET SDK we had the Microsoft.Azure.Storage.Blob dll, which handled optimistic concurrency with:
await blob.UploadTextAsync(story, null, accessCondition: AccessCondition.GenerateIfMatchCondition(currentETag), null, null);
How do I do this with the v12 dll?
Any help appreciated.
Please use the following overload of the UploadAsync method:
UploadAsync(Stream, BlobHttpHeaders, IDictionary<String,String>, BlobRequestConditions, IProgress<Int64>, Nullable<AccessTier>, StorageTransferOptions, CancellationToken)
You can define the access conditions as part of the BlobRequestConditions parameter.
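For example, a minimal sketch of how the uploadDatatoBlob helper from the question could pass the stored ETag; this uses the BlobUploadOptions overload, which carries the same BlobRequestConditions, and assumes currentETag is the value captured in SimulateReporter:

private static async Task uploadDatatoBlob(BlobClient blob, string notes, ETag currentETag)
{
    byte[] byteArray = Encoding.UTF8.GetBytes(notes);
    using MemoryStream stream = new MemoryStream(byteArray);
    var options = new BlobUploadOptions
    {
        // The upload succeeds only if the blob's ETag still matches the one we read;
        // otherwise the service returns 412 and a RequestFailedException is thrown.
        Conditions = new BlobRequestConditions { IfMatch = currentETag }
    };
    await blob.UploadAsync(stream, options);
}

The existing catch (RequestFailedException) block in SimulateReporter then handles the case where the other reporter saved first.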

Azure Blob Storage Compressing files by default?

I am uploading JSONs to Azure Blob storage using the Azure Blob storage API's function:
const response = await blobClient.upload(content, content.length);
There is absolutely no compression logic in the code, nor are any encoding headers added, but the files seem to be around 60% of their original size when they reach the storage. Also, monitoring the PUT requests using Fiddler, it seems that the file is compressed and then uploaded by the API.
My question is, does Azure do compression by default?
EDIT:
I was stringifying the JSON objects and then uploading them. JSON.stringify removes all the whitespace, hence the reduced size.
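To illustrate with a hypothetical object, the size difference comes purely from the dropped formatting:

const obj = { name: "example", values: [1, 2, 3] };

// pretty-printed, roughly how the JSON looked before
const pretty = JSON.stringify(obj, null, 4);
// compact, what a plain JSON.stringify produces before upload
const compact = JSON.stringify(obj);

console.log(Buffer.byteLength(pretty));  // larger
console.log(Buffer.byteLength(compact)); // smaller: same data, no whitespace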
Based on my test, there is no compression problem. Here is my sample:
const { BlobServiceClient } = require("@azure/storage-blob");
var fs = require('fs');

async function main() {
  const AZURE_STORAGE_CONNECTION_STRING = "Your_Storage_Account_Connection_String";
  const blobServiceClient = BlobServiceClient.fromConnectionString(AZURE_STORAGE_CONNECTION_STRING);
  const containerName = 'demo';
  const blobName = 'test.txt';
  const containerClient = blobServiceClient.getContainerClient(containerName);
  if (!await containerClient.exists()) {
    await containerClient.create();
  }
  const contents = fs.readFileSync('test.txt');
  const blockBlobClient = containerClient.getBlockBlobClient(blobName);
  await blockBlobClient.upload(contents, contents.length);
}

main().then(() => console.log('Done')).catch((ex) => console.log(ex.message));
The test.txt file's size is about 99.9 KB.
And, from the portal, the uploaded file's size is 99.96 KB, which is in line with our expectations.
You should also use the byte length when uploading, as the Storage Blob API expects the number of bytes; the string length can be different:
const content = "Hello 世界!";
console.log(`length: ${content.length}`);
console.log(`byteLength: ${Buffer.byteLength(content)}`);
the output:
length: 9
byteLength: 15
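Applied to the upload call from the question, a minimal sketch (blobClient and content as in the question):

// pass the byte length, not the string length, as the content length
const response = await blobClient.upload(content, Buffer.byteLength(content));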
