Azure Blob Storage Compressing files by default? - node.js

I am uploading JSONs to Azure Blob storage using the Azure Blob storage API's function:
const response = await blobClient.upload(content, content.length);
There is absolutely no compression logic in the code nor any encoding headers being added but the files seem to be around 60% of their original size when they reach the storage. Also, monitoring the PUT requests using fiddler it seems that the file is compressed and then uploaded by the API.
My question is, does Azure do compression by default?
EDIT:
I was stringifying and then uploading the json objects. They get all the white-spaces remove and hence the reduced size.

Based on my test, there is no compression problem. Here is my sample:
const { BlobServiceClient } = require("#azure/storage-blob");
var fs = require('fs');
async function main() {
const AZURE_STORAGE_CONNECTION_STRING = "Your_Stroage_Account_Connection_String";
const blobServiceClient = BlobServiceClient.fromConnectionString(AZURE_STORAGE_CONNECTION_STRING);
const containerName = 'demo';
const blobName = 'test.txt';
const containerClient = blobServiceClient.getContainerClient(containerName);
if(!await containerClient.exists()){
await containerClient.create();
}
const contents = fs.readFileSync('test.txt');
const blockBlobClient = containerClient.getBlockBlobClient(blobName);
await blockBlobClient.upload(contents,contents.length);
}
main().then(() => console.log('Done')).catch((ex) => console.log(ex.message));
The test.txt file's size is about 99.9KB.
And, from the portal, the uploaded file's size is 99.96KB,which is in line with our expectations.

You should also use byte length when uploading, as storage blob api expects number of bytes, the string length can be different
const content = "Hello 世界!";
console.log(`length: ${content.length}`);
console.log(`byteLength: ${Buffer.byteLength(content)}`);
the output:
length: 9
byteLength: 15

Related

How to check existence of soft deleted file in Azure blob container with node js?

I have file which was stored in some Azure blob directory "folder1/folder2/file.txt". This file was soft deleted - I can see it in Azure web console. I need to have function which checks this file existence.
I tried library "azure-storage". It perfectly works with NOT removed files:
const blobService = azure.createBlobService(connectingString);
blobService.doesBlobExist(container, blobPath, callback)
May be anyone knows how use same approach with soft removed files?
I tied with lib "#azure/storage-blob".
But I stuck with endless entities there (BlobServiceClient, ContainerItem, BlobClient, ContainerClient, etc) and couldn't find way to see particular file in particular blob directory.
Following this MSDOC, I got to restore the Soft deleted blobs and their names with the below code snippet.
const { BlobServiceClient } = require('#azure/storage-blob');
const connstring = "DefaultEndpointsProtocol=https;AccountName=kvpstorageaccount;AccountKey=<Storage_Account_Key>;EndpointSuffix=core.windows.net"
if (!connstring) throw Error('Azure Storage Connection string not found');
const blobServiceClient = BlobServiceClient.fromConnectionString(connstring);
async function main(){
const containerName = 'kpjohncontainer';
const blobName = 'TextFile05.txt';
const containerClient = blobServiceClient.getContainerClient(containerName);
undeleteBlob(containerClient, blobName)
}
main()
.then(() => console.log(`done`))
.catch((ex) => console.log(ex.message));
async function undeleteBlob(containerClient, blobName){
const blockBlobClient = await containerClient.getBlockBlobClient(blobName);
await blockBlobClient.undelete(); //to restore the deleted blob
console.log(`undeleted blob ${blobName}`);
}
Output:
To check if the blob exists and if exists but in Soft-deleted state, I found the relevant code but it’s in C# provided by #Gaurav Mantri. To achieve the same in NodeJS refer here.

compress and write a file in another container with azure blob storage trigger in nodejs

i have to make an api call passing a compressed file as input. i have a working example on premise but I would like to move the solution in cloud. i was thinking to use azure blob storage and the azure function trigger. i have the below code that works for file but I don't know how to do the same with azure blob storage and azure function in nodejs
const zlib = require('zlib');
const fs = require('fs');
const def = zlib.createDeflate();
input = fs.createReadStream('claudio.json')
output = fs.createWriteStream('claudio-def.json')
input.pipe(def).pipe(output)
this code read a file as stream , compress the file and write another file as stream.
what I would like to do is reading the file any time I upload it in a container of azure blob storage, then I want to compress it and save in a different container with different name, then make an API call passing as input the compressed file saved in the other container
I tried this code for compressing the incoming file
const fs = require("fs");
const zlib = require('zlib');
const {Readable, Writable} = require('stream');
module.exports = async function (context, myBlob) {
context.log("JavaScript blob trigger function processed blob \n Blob:", context.bindingData.blobTrigger, "\n Blob Size:", myBlob.length, "Bytes");
// const fin = fs.createReadStream(context.bindingData.blobTrigger);
const def = zlib.createDeflate();
const s = Readable.from(myBlob.toString())
context.log(myBlob);
context.bindings.outputBlob = s.pipe(def)
};
the problem with this approach is that in the last line of the code
context.bindings.outputBlob = s.pipe(def)
i don't have the compressed file, while if i use this
s.pipe(def).pipe(process.stdout)
i can read the compressed file
as you can see above i also tried to use the fs.createReadStream(context.bindingData.blobTrigger) that contains the name of the uploaded file with the container name, but it doesn't work
any idea?
thank you
this is the solution
var input = context.bindings.myBlob;
var inputBuffer = Buffer.from(input);
var deflatedOutput = zlib.deflateSync(inputBuffer);
context.bindings.myOutputBlob = deflatedOutput;
https://learn.microsoft.com/en-us/answers/questions/500368/compress-and-write-a-file-in-another-container-wit.html

Copy blob from one storage account to another using #azure/storage-blob

What would be the best way to copy a blob from one storage account to another storage account using #azure/storage-blob?
I would imagine using streams would be best instead of downloading and then uploading, but would like to know if the code below is the correct/optimal implementation for using streams.
const srcCredential = new ClientSecretCredential(<src-ten-id>, <src-client-id>, <src-secret>);
const destCredential = new ClientSecretCredential(<dest-ten-id>, <dest-client-id>, <dest-secret>);
const srcBlobClient = new BlobServiceClient(<source-blob-url>, srcCredential);
const destBlobClient = new BlobServiceClient(<dest-blob-url>, destCredential);
const sourceContainer = srcBlobClient.getContainerClient("src-container");
const destContainer = destBlobClient.getContainerClient("dest-container");
const sourceBlob = sourceContainer.getBlockBlobClient("blob");
const destBlob = destContainer.getBlockBlobClient(sourceBlob.name)
// copy blob
await destBlob.uploadStream((await sourceBlob.download()).readableStreamBody);
Your current approach downloads the source blob and then re-uploads it which is not really optimal.
A better approach would be to make use of async copy blob. The method you would want to use is beginCopyFromURL(string, BlobBeginCopyFromURLOptions). You would need to create a Shared Access Signature URL on the source blob with at least Read permission. You can use generateBlobSASQueryParameters SDK method to create that.
const sourceBlob = sourceContainer.getBlockBlobClient("blob");
const destBlob = destContainer.getBlockBlobClient(sourceBlob.name);
const sourceBlobSasUrl = GenerateSasUrlWithReadPermissionOnSourceBlob(sourceBlob);
// copy blob
await destBlob.beginCopyFromURL(sourceBlobSasUrl);

Azure Blob Storage - upload file with progress

I have following code - quite normal for uploading files into Azure-Blob-Storage but, when i upload files instead of getting onProgress executed many times, i only have it executed (and always) once with the file.size value (so it is sending - slowly) file to the azure but progress executes only once when finished.
const requestOptions = this.mergeWithDefaultOptions(perRequestOptions);
const client = this.getRequestClient(requestOptions);
const containerClient = await client.getContainerClient(this.options.containerName);
const blobClient = await containerClient.getBlockBlobClient(file.name);
const uploadStatus = await blobClient.upload(file.buffer, file.size, {onProgress: progressCallBack});
What i would love to know is if that outcome is normal for this library (for downloading files from azure, the same approach works correctly).
According to my test, the method is a non-parallel uploading method and it just sends a single Put Blob request to Azure Storage server. For more details, please refer to here.
So if you want to get onProgress executed many times, I suggest you use the method uploadStream. It uses Put Block operation and Put Block List operation to upload. For more details, please refer to here
For example
try {
var creds = new StorageSharedKeyCredential(accountName, accountKey);
var blobServiceClient = new BlobServiceClient(
`https://${accountName}.blob.core.windows.net`,
creds
);
var containerClient = blobServiceClient.getContainerClient("upload");
var blob = containerClient.getBlockBlobClient(
"spark-3.0.1-bin-hadoop3.2.tgz"
);
var maxConcurrency = 20; // max uploading concurrency
var blockSize = 4 * 1024 * 1024; // the block size in the uploaded block blob
var res = await blob.uploadStream(
fs.createReadStream("d:/spark-3.0.1-bin-hadoop3.2.tgz", {
highWaterMark: blockSize,
}),
blockSize,
maxConcurrency,
{ onProgress: (ev) => console.log(ev) }
);
console.log(res._response.status);
} catch (error) {
console.log(error);
}

How to Upload PDF File on Azure Blob Storage via Node.js?

const blobServiceClient = await BlobServiceClient.fromConnectionString(connectionString);
const containerClient = await blobServiceClient.getContainerClient(container);
const blockBlobClient = containerClient.getBlockBlobClient(fileName);
const uploadBlobResponse = await blockBlobClient.upload(content, content.length);
console.log(uploadBlobResponse);
console.log(`FIle upload successfully on cloud ${uploadBlobResponse.requestId}`);
i am trying like this, but blockBlobClient.upload() needs content, i converted file in base64 and sent it into content, but i am having issue, file is uplaoded but corrupted. any help please.
Check the SDK, the upload method construct is upload(HttpRequestBody, number, BlockBlobUploadOptions), the content is HttpRequestBody, check the parameter it requires
Blob, string, ArrayBuffer, ArrayBufferView or a function which returns
a new Readable stream whose offset is from data source beginning.
So maybe you could try uploadFile, just use the file path to upload, I have tried this way it works.
Also, you could use uploadStream to upload the file readable stream.

Resources