Read content from Azure blob storage in node API - node.js

I am new to azure and working on the storage account for one my application.Basically I have json files stored in azure blob storage.
I want to read the data from these files in Node JS application and do some filtering on the data, which is eventually secured REST end point to view data in the UI/Client as HTTP response.
I have gone through the docs about different operations on the blob storage which is exposed as NODE SDK, we can see find them in below link,
https://github.com/Azure/azure-storage-node
But the question I have is "How to read the json files". I see one method getBlobToStream. Is this going to give me json content in the callback, so that I can do further processing on the data and send as response to clients who requested.
Please some one explain how to do this in better way or is this the only option we have.
Thanks for the help.

To use getBlobToStream, you have to define a writable stream.
So I recommend you to use getBlobToText to avoid trouble.
If no error occurs, this method will get blob content into text in callback. You can then parse it to a JSON string. A simple example is as below.
blobService.getBlobToText(container, blobname, function(error, text){
if(error){
console.error(error);
res.status(500).send('Fail to download blob');
} else {
var data = JSON.parse(text);
res.status(200).send('Filtered Data you want to send back');
}
});

Related

Trying to use HttpClient.GetStreamAsync straight to the adls FileClient.UploadAsync

I have an Azure Function that will call an external API via HttpClient. The external API returns a JSON response. I want to save the response directly to an ADLS File.
My simplistic code is:
public async Task UploadFileBulk(Stream contentToUpload)
{
await this._theClient.FileClient.UploadAsync(contentToUpload);
}
The this._theClient is a simple wrapper class around the various Azure Data Lake classes such as DataLakeServiceClient, DataLakeFileSystemClient, DataLakeDirectoryClient, DataLakeFileClient.
I'm happy this wrapper calls works as I expect, I spin one up, set the service, filesystem, directory and then a filename to create. I've used this wrapper class to create directories etc. so it works as I expect.
I am calling the above method as follows:
await dlw.UploadFileBulk(await this._httpClient.GetStreamAsync("<endpoint>"));
I see the file getting created in the Lake directory with the name I want, however if I then download the file using Sorage Explorer and then try to open it in say VS Code it's not in a recognisable format (I can "force" code to open it but it looks like binary format to me).
If I sniff the traffic with fiddler I can see the content from the external API is JSON, content-type is application/json and the body shows in fiddler as JSON.
If I look at the calls to the ADLS endpoint I can see a PUT call followed by two PATCH calls.
The first PATCH call looks like it is the one sending the content, it has a content-header of application/octet-stream and the request body is the "binary looking content".
I am using HttpClient.GetStreamAsync as I don't want my Function to have to load the entire API payload into memory (some of the external API endpoints return very large files over 100mb). I am thinking I can "stream the response from the external API straight into ADLS".
Is there a way to change how the ADLS FileClient.UploadAsync(Stream stream) method works so I can tell it to upload the file as a JSON file with a content type of application/json?
EDIT:
So turns out the External API was sendng back zipped content and so once I added the following extra AutomaticDecompression code to my functions startup I got the files uploaded to ADLS as expected.
public override void Configure(IFunctionsHostBuilder builder)
{
builder.Services.AddHttpClient("default", client =>
{
client.DefaultRequestHeaders.Add("Accept-Encoding", "gzip, deflate");
}).ConfigurePrimaryHttpMessageHandler(() => new HttpClientHandler
{
AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
});
}
#Gaurav Mantri has given me some pointers on if the pattern of "streaming from an output to an input" is actually correct, I will research this further.
Regarding the issue, please refer to the following code
var uploadOptions = new DataLakeFileUploadOptions();
uploadOptions.HttpHeaders = new PathHttpHeaders();
uploadOptions.HttpHeaders.ContentType ="application/json";
await fileClient.UploadAsync(stream, uploadOptions);

Firebase upload raw data to bucket

I am creating an application in nodejs/typescript that uses Firebase Functions and I basically need to upload a JSON object to a bucket. I am having issues because the JSON I am creating exists in memory, and is not an actual file - as the application is a serverless one.
I know firebase is just a wrapper for Google Cloud Functions and have looked for solutions everywhere however cannot seem to get this working. Is anyone able to give me any guidance or suggestions please?
If I cannot upload the in-memory file to a bucket, does anyone know if its possible to programatically export a database document as a json to a firestore bucket using firebase? (as I can easily just upload the json to a database document).
Below is one example of what I have tried. However the code is obviously invalid.
await storage()
.bucket()
.file('test.json') // A random string filename and not an existing file
.createWriteStream()
.write(JSON.stringify(SOME_VALID_JSON))
Thanks!
You can use save() to write data in memory to a bucket.
await storage()
.bucket()
.file('test.json')
.save(JSON.stringify(SOME_VALID_JSON))

MariaDB - pipe blob in Node.js

I'm requesting a 16 Mbyte blob from MariaDB in Node.js web service. Is it possible to pipe the blob data directly to ServerResponse (res) object?
I see two approaches:
connection.query("SELECT blob FROM files WHERE id = ?", [id]);
This method saves the whole blob into memory which I'd like to avoid.
The second option is queryStream method which seems to stream rows in small batches. I need to select one row blob only and this method behaves the same way as query.
Is there a possibility to pipe the blob data directly to ServerResponse (res) object?

Nodejs transform image data back to actual image

My server receives a file from a HTTP request and uploads this file to IBM Cloud Object Storage.
Moreover, the server allows to recover this file. Recovery is triggered by a get http request that should return said file.
It works fine for "basic" data format, such as text files. However, I encounter problems with more complex types such as images and the "reformating".
Image is uploaded to the datastore. The element stored is the buffer itself:
req.files[0].buffer
When getting the image back from the datastore, how can I transform it back to a readable format for my computer?
The data look like this and it is, on the server, a string:
If you are using ExpressJS you can do this:
const data = req.files[0].buffer;
res.contentType('image/jpeg'); // don't know what type is
res.send(data);

How to create a blob in node (server side) from a stream, a file or a base64 string?

I am trying to create a blob from a pdf I am creating from pdfmake so that I can send it to a remote api that only handles blobs.
This is how I get my PDF file:
var docDefinition = { content: 'This is an sample PDF printed with pdfMake' };
pdfDoc.pipe(fs.createWriteStream('./pdfs/test.pdf'));
pdfDoc.end();
The above lines of code do produce a readable pdf.
Now how can I get a blob from there? I have tried many options (creating the blob from the stream with the blob-stream module, creating from the file with fs, creating it from a base64 string with b64toBlob) but all of them require at some point to use the constructor Blob for which I always get an error even if I require the module blob:
TypeError: Blob is not a constructor
After some research I found that it seems that the Blob constructor is only supported client-side.
All the npm packages that I have found and which seem to deal with this issue seem to only work client-side: blob-stream, blob, blob-util, b64toBlob, etc.
So, how can I create a blob server-side on Node?
I don't understand why almost nobody also needs to create a blob server-side? The only thread I could find on the subject is this one.
According to that thread, apparently:
The Solution to this problem is to create a function which can convert between Array Buffers and Node Buffers. :)
Unfortunately this does not help me much as I clearly seem to lack some important knowledge here to be able to comprehend this.
use node-blob npm package
const Blob = require('node-blob');
let myBlob = new Blob(["something"], { type: 'text/plain' });

Resources