Read a JSON file directly from Google Cloud Storage (using Cloud Functions) - Node.js

I created a function that extracts a specific attribute from a JSON file, but that file was deployed together with the function in Cloud Functions. In that case I simply bundled the file with the source and could refer to a specific attribute:
const jsonData = require('./data.json');
const result = jsonData.responses[0].fullTextAnnotation.text;
return result;
Ultimately, I want to read this file directly from Cloud Storage. I have tried several solutions here, but without success. How can I read a JSON file directly from Google Cloud Storage so that, as in the first case, I can read its attributes correctly?

As mentioned in the comments, the Cloud Storage API lets you do many things through its client libraries. Here's an example from the documentation on how to download a file from Cloud Storage, for your reference.
/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// The ID of your GCS bucket
// const bucketName = 'your-unique-bucket-name';
// The ID of your GCS file
// const fileName = 'your-file-name';
// The path to which the file should be downloaded
// const destFileName = '/local/path/to/file.txt';

// Imports the Google Cloud client library
const {Storage} = require('@google-cloud/storage');

// Creates a client
const storage = new Storage();

async function downloadFile() {
  const options = {
    destination: destFileName,
  };

  // Downloads the file
  await storage.bucket(bucketName).file(fileName).download(options);

  console.log(
    `gs://${bucketName}/${fileName} downloaded to ${destFileName}.`
  );
}

downloadFile().catch(console.error);
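Once the file is on local disk it can be parsed just like the bundled file in the question. A minimal sketch, assuming the same bucketName/fileName placeholders from the sample above (uncommented) and an assumed destination under /tmp:

// Hypothetical follow-up: parse the downloaded JSON and read the attribute
// from the original question. '/tmp/data.json' is an assumed destination.
const fs = require('fs');

async function readAttribute() {
  const destFileName = '/tmp/data.json';
  await storage.bucket(bucketName).file(fileName).download({destination: destFileName});
  const jsonData = JSON.parse(fs.readFileSync(destFileName, 'utf8'));
  return jsonData.responses[0].fullTextAnnotation.text;
}

readAttribute().then(console.log).catch(console.error);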

To clearly answer the question: you can't!
You need to download the file locally first, and then process it. You can't read it directly from GCS.
With Cloud Functions you can only store files in the /tmp directory; it's the only writable one. In addition, it's an in-memory file system, which means several things:
The size is limited by the memory allocated to the Cloud Function. That memory space is shared between your app's memory footprint and your file storage in /tmp (you won't be able to download a 10 GB file, for example).
The contents are lost when the instance goes down.
Each Cloud Functions instance has its own memory space, so you can't share files between instances.
The /tmp directory isn't cleaned between two invocations (on the same instance). Remember to clean this directory up yourself (see the sketch below).
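A minimal sketch of that cleanup, assuming the file was downloaded to an arbitrary path under /tmp:

// Hypothetical cleanup after processing a downloaded file in a Cloud Function.
const fs = require('fs');
const tmpPath = '/tmp/data.json';   // assumed download destination

try {
  const jsonData = JSON.parse(fs.readFileSync(tmpPath, 'utf8'));
  // ... work with jsonData ...
} finally {
  // /tmp survives between invocations on the same instance, so free the memory
  if (fs.existsSync(tmpPath)) fs.unlinkSync(tmpPath);
}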

Related

How do I send multiple files from gcloud storage to the end user using a node server?

Here's the problem: I am trying to send multiple files from Google Storage to my Express server and, from there, to the user of my website.
Google Storage >> Express Server >> Website User
To send the files to the end user, I will have to zip them. But I can't write any of the files to the hard drive, because App Engine does not allow it. I would have to hold the files in memory until they all arrive from Google Storage, then zip them and stream the zipped file to the user.
The files are large, so this is not possible. Is there a solution where I can pipe the file streams into something that zips them, and then pipe that stream to the response? Or some other solution?
// fs is needed for the local write; myBucket is assumed to be an
// already-configured @google-cloud/storage bucket handle.
const fs = require('fs');

const downloadVideoFromCloud = (videoId) => {
  const location = `videos/${videoId}.mp4`;
  const file = myBucket.file(location);
  file.createReadStream().pipe(fs.createWriteStream(`./videos/${videoId}.mp4`));
};

const downloadPlaylistFromCloud = async (playlistId) => {
  const playlist = await getPlaylist(playlistId);
  for (const video of playlist) {
    downloadVideoFromCloud(video.videoId);
  }
};
This is the code I am using to save the videos to local storage while testing.
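One way to avoid writing to disk is to pipe the GCS read streams straight into a zip stream and pipe that zip stream to the response. A sketch under the assumption that the archiver package is used and that res is the Express response object; function and variable names are illustrative, myBucket and getPlaylist are the same helpers as above:

// Hypothetical streaming zip: GCS read streams -> archiver -> HTTP response.
const archiver = require('archiver');

const streamPlaylistAsZip = async (playlistId, res) => {
  const playlist = await getPlaylist(playlistId);
  const archive = archiver('zip');
  archive.on('error', (err) => res.status(500).end(err.message));

  res.attachment(`${playlistId}.zip`);
  archive.pipe(res);                                       // zip output goes straight to the user

  for (const video of playlist) {
    const file = myBucket.file(`videos/${video.videoId}.mp4`);
    archive.append(file.createReadStream(), { name: `${video.videoId}.mp4` });
  }
  archive.finalize();                                      // nothing is buffered to disk
};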

What is the best way to upload larger files to S3 with the Node.js aws-sdk? MultipartUpload vs ManagedUpload vs getSignedURL, etc.

I'm trying to review the options AWS offers for uploading files to S3. When I looked into their docs, it thoroughly confused me. Looking through various resources, I learned a bit more about options like s3.upload vs s3.putObject, and realised there are hard limits on API Gateway and on using a Lambda function to upload a file.
In particular, for uploading large files of around 1-100 GB, AWS suggests multiple methods of uploading to S3. Among them are createMultipartUpload, ManagedUpload, getSignedURL and tons of others.
So my question is:
What is the best and easiest way to upload large files to S3, where I can also cancel the upload process? Multipart upload seems tedious.
There's no single best way to upload a file to S3.
It depends on what you want, especially on the sizes of the objects you want to upload.
putObject - Ideal for objects which are under 20 MB.
Presigned URL - Allows you to bypass API Gateway and put an object of up to 5 GB into an S3 bucket (a small sketch follows at the end of this answer).
Multipart upload - Allows you to upload files in chunks, which means you can continue your upload even if the connection drops temporarily. The maximum file size you can upload via this method is 5 TB.
Use streams to upload to S3; this way the Node.js server doesn't consume too many resources.
const AWS = require('aws-sdk');
const fs = require('fs');
const stream = require('stream');

const S3 = new AWS.S3();

// BUCKET and KEY are placeholders for your bucket name and object key.
function upload(S3) {
  let pass = new stream.PassThrough();
  let params = {
    Bucket: BUCKET,
    Key: KEY,
    Body: pass
  };

  S3.upload(params, function (error, data) {
    console.error(error);
    console.info(data);
  });

  return pass;
}

const readStream = fs.createReadStream('/path/to/your/file');
readStream.pipe(upload(S3));
This streams a local file; the stream can come from a request as well.
If you want to listen to the progress, you can use ManagedUpload:
const manager = S3.upload(params);
manager.on('httpUploadProgress', (progress) => {
  console.log('progress', progress);
  // { loaded: 6472, total: 345486, part: 3, key: 'large-file.dat' }
});
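And for the presigned URL route mentioned above, a minimal sketch using the aws-sdk v2 getSignedUrl call; the bucket, key and expiry values are placeholders:

// Hypothetical presigned PUT URL: the client uploads directly to S3,
// bypassing your server / API Gateway. Values below are placeholders.
const url = S3.getSignedUrl('putObject', {
  Bucket: 'your-bucket',
  Key: 'large-file.dat',
  Expires: 15 * 60            // URL is valid for 15 minutes
});
// Hand `url` to the client, which then issues: PUT <url> with the file body.
console.log(url);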

How to update a file when hosting in Google App Engine?

I have a Node.js server service running on Google Cloud App Engine.
There is a JSON file in the assets folder of the project that needs to be updated by the process.
I was able to read the file and the configs inside it, but when writing to the file I get a read-only filesystem error from GAE.
Is there a way I could write the information to the file without using the Cloud Storage option?
It's a very small file, and using Cloud Storage for it would be like using a big drill machine for an Allen screw.
Thanks
Nope, in App Engine Standard there is no such writable file system. The docs mention the following:
The runtime includes a full filesystem. The filesystem is read-only except for the location /tmp, which is a virtual disk storing data in your App Engine instance's RAM.
With this in mind, you can write to /tmp, but I suggest Cloud Storage, because if scaling shuts down all the instances, the data will be lost.
You can also consider App Engine Flex, which offers a persistent disk (because its backend is a VM), but the minimum size is 10 GB, so it would be worse than using Storage.
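A minimal sketch of the /tmp option, with the caveat above that the data only lives in the instance's RAM; the file name is illustrative:

// Hypothetical read/update cycle against /tmp in App Engine Standard.
// The data disappears when the instance is shut down or scaled away.
const fs = require('fs');
const tmpConfigPath = '/tmp/config.json';

function updateConfig(changes) {
  const current = fs.existsSync(tmpConfigPath)
    ? JSON.parse(fs.readFileSync(tmpConfigPath, 'utf8'))
    : {};
  const updated = Object.assign(current, changes);
  fs.writeFileSync(tmpConfigPath, JSON.stringify(updated, null, 2));
  return updated;
}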
Thanks once again for steering me away from wasting time on a hacky solution to the problem.
Anyway, there was no clear code showing how to use the /tmp directory and download/upload the file from an App Engine hosted Node.js application.
Here is the code, if someone needs it:
const {
  Storage
} = require('@google-cloud/storage');
const path = require('path');

class gStorage {
  constructor() {
    this.storage = new Storage({
      keyFilename: 'Please add path to your key file'
    });
    this.bucket = this.storage.bucket('your-bucket-name');
    this.fileName = 'YourFileDetails';
    this.filePath = path.join('/tmp', this.fileName);
    // I am using the same file path and the same file to download and upload
  }

  async uploadFile() {
    try {
      await this.bucket.upload(this.filePath, {
        contentType: "application/json"
      });
    } catch (error) {
      throw new Error(`Error when saving the config. Message: ${error.message}`);
    }
  }

  async downloadFile() {
    try {
      await this.bucket.file(this.fileName).download({
        destination: this.filePath
      });
    } catch (error) {
      throw new Error(`Error when loading the config. Message: ${error.message}`);
    }
  }
}
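A possible usage of that class; the method order assumes the object already exists in the bucket, and the function name is just illustrative:

// Hypothetical usage: pull the current file into /tmp, change it, push it back.
const store = new gStorage();

async function refreshConfig() {
  await store.downloadFile();   // bucket object -> /tmp
  // ... modify the file at store.filePath here ...
  await store.uploadFile();     // /tmp -> bucket object
}

refreshConfig().catch(console.error);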

Dialogflow, nodejs: File System module - Error: EROFS: read-only file system, open 'filename.pdf' at Error (native)

I'm using the File System module to stream a PDF file and then access that file for later use. It works when I run it locally, but when I deploy it in Dialogflow, it fails with this error:
Error: EROFS: read-only file system, open 'filename.pdf' at Error (native)
This is the code:
const fs = require('fs');
const request = require('request');

var downloadRequest = {
  url: "http://www.axmag.com/download/pdfurl-guide.pdf",
  method: 'GET'
}

var file = fs.createWriteStream("filename.pdf");
request(downloadRequest).pipe(file);

file.on('finish', function(){
  var downloadedFile = fs.createReadStream("filename.pdf");
  // Other code which accesses 'downloadedFile' takes place below
  ...
});
Is there anything that can be done to handle this error?
As we discussed, you're deploying your code with Firebase Functions. Their filesystem is read-only as part of their stateless design, meaning that you cannot save files persistently (that would conflict with statelessness, and those files persisting across every execution server/environment couldn't be guaranteed).
To host content dynamically, you would need another system for hosting files. This could be Firebase Cloud Storage instead of saving inside the process, or simply keeping the file contents in memory.
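A minimal sketch of the in-memory option, reusing the request module and the sample PDF URL from the question; setting encoding to null makes request hand back the body as a Buffer, so nothing is written to the read-only filesystem:

// Hypothetical in-memory download: no fs.createWriteStream needed.
const request = require('request');

request({
  url: "http://www.axmag.com/download/pdfurl-guide.pdf",
  method: 'GET',
  encoding: null              // body is returned as a Buffer
}, function (error, response, body) {
  if (error) return console.error(error);
  var downloadedFile = body;  // Buffer holding the whole PDF, kept in memory
  // Other code which accesses 'downloadedFile' takes place below
});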

Export Firebase data node to PDF report

What I have is a mobile app emergency message system which uses Firebase as a backend. When an emergency event ends, I would like to capture the message log in a PDF document. I have not been able to find any report editors that work with Firebase, which means I may have to export the data to PHP/MySQL. The Firebase PHP SDK looks like overkill for this task. I have been googling "php get from firebase" and most responses involve the Firebase PHP SDK. Is this the only way it can be accomplished?
You could use PDF Kit (...) on Cloud Functions (it's all Node.js, no PHP available there).
On npmjs.com there are several packages for @firebase-ops, googleapis and @google-cloud.
In order to read from Firebase and write to a Storage bucket or Datastore, the example script below would still require a database reference and a storage destination, so it can render the PDF content (possibly from a template) and put it where it belongs. Also see firebase/functions-samples (especially the package.json, which defines the dependencies). npm install -g firebase-tools installs the tools required for deployment; the required packages also need to be installed locally so they can be resolved (much like Composer - remotely they are made known during the deployment process).
You'd need to a) use the Firebase onUpdate() event as the trigger, b) check the endTime of the returned DeltaSnapshot for a value, and c) then render and store the PDF document. The code may vary; this is just to give a coarse idea of how it works within the given environment (a sketch of the trigger wiring follows after the snippet):
'use strict';

const admin = require('firebase-admin');
const functions = require('firebase-functions');
const PDFDocument = require('pdfkit');
const gcs = require('@google-cloud/storage')();
const bucket = gcs.bucket('some-bucket');
const fs = require('fs');

// TODO: obtain a handle to the delta snapshot
// TODO: render the report
var pdf = new PDFDocument({
  size: 'A4',
  info: {Title: 'Title of File', Author: 'Author'}
});
pdf.text('Emergency Incident Report');
pdf.pipe(
  // TODO: figure out how / where to store the file
  fs.createWriteStream('./path/to/file.pdf')
).on('finish', function () {
  console.log('PDF closed');
});
pdf.end();
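A hedged sketch of how the snippet above might be wired into the onUpdate() trigger mentioned earlier. The '/incidents/{incidentId}' path, the endTime field and the report destination are assumptions for illustration, and it uses the current firebase-functions and @google-cloud/storage APIs, which pass a Change object rather than the DeltaSnapshot mentioned above:

'use strict';
// Hypothetical wiring: render the report when an incident gains an endTime.
const functions = require('firebase-functions');
const PDFDocument = require('pdfkit');
const {Storage} = require('@google-cloud/storage');
const bucket = new Storage().bucket('some-bucket');

exports.incidentReport = functions.database.ref('/incidents/{incidentId}')
  .onUpdate((change, context) => {
    const incident = change.after.val();
    if (!incident.endTime) return null;          // only act once the event has ended

    const pdf = new PDFDocument({size: 'A4'});
    const file = bucket.file(`reports/${context.params.incidentId}.pdf`);
    const out = pdf.pipe(file.createWriteStream({metadata: {contentType: 'application/pdf'}}));

    pdf.text('Emergency Incident Report');
    pdf.text(JSON.stringify(incident.messages || {}, null, 2));
    pdf.end();

    return new Promise((resolve, reject) => {
      out.on('finish', resolve).on('error', reject);
    });
  });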
Externally running PHP code would, in this case, still not run server-side. The problem with it is that an external server won't deliver any realtime trigger, so the file will not appear instantly upon the timestamp update (as one would expect from a Realtime Database). One could also add external webhooks (or interface them with PHP), e.g. to obtain these PDF files through HTTPS (or even have them generated upon an HTTPS request, for externally triggered generation). For local testing one can use the command firebase serve, which saves much time vs. firebase deploy.
The point is that one can teach a Cloud Function what the PDF files shall look like, when they shall be created and where to put them - a micro-service which does nothing else but render these files. Writing one such script should still be within an acceptable amount of effort, given all the clues provided.
