Run a script only when a file is 100% uploaded to GCS - Node.js

The script seems to run before the file is fully uploaded to GCS.
I have a blobStream.on handler that is meant to run only when the data has finished uploading to GCS. The issue is that it works sometimes, and other times it runs too soon and forgets that another file is still uploading (normally it's the audio file that is still in progress).
I am wondering how I can improve the script below.
index is the count of files uploaded so far; it counts up 0, 1, 2, etc.
fileLength is the number of the file being uploaded, also 0, 1, 2, etc.
What seems to happen is that this part triggers too soon, as soon as index and fileLength are equal. It is not taking the data into account, which could be because, as far as I can tell, console.log(data) returns nothing; it's undefined.
There seem to be a few errors here.
Is there any way to watch the data in this function and, once it has finished, run the correct script? There are also clearly way too many delays; that is because I am trying to slow the script down so it runs at the right time.
What I can say is that the delay just after the blobStream.on('finish') does seem to help it a little.
blobStream.on("finish", async (data) => {
const delay = ms => new Promise(resolve => setTimeout(resolve, ms))
await delay(10000);
const publicUrl = format(
`https://storage.googleapis.com/${bucket.name}/${blob.name}`
);
try {
await bucket.file(newfileName).makePublic();
} catch {
message.push({
message:
`Uploaded the file successfully: ${newfileName}, but public access is denied!`,
url: publicUrl,
});
}
console.log(index);
if(index == fileLength){
await delay(10000);
message.push({
originalname: file.originalname,
mimeType: file.mimetype,
message: "Uploaded the file successfully: " + newfileName,
url: publicUrl,
});
console.log(JSON.stringify(message))
await delay(10000)
console.log(JSON.stringify(message))
await delay(10000)
submitToDB(req, res, message);
//res.status(200).send(message);
}
else{
console.log("this is the first index."+ index +"file name "+ file.originalname);
const delay = ms => new Promise(resolve => setTimeout(resolve, ms))
message.push({
originalname: file.originalname,
mimeType: file.mimetype,
message: "Uploaded the file successfully: " + newfileName,
url: publicUrl,
})
await delay(1000)
console.log(JSON.stringify(message));
}
});
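One way to avoid the counters and the delays entirely is to wrap each upload in a Promise that resolves on that stream's finish event, then wait for all of them with Promise.all before calling submitToDB. A rough sketch of that idea, reusing bucket, the message objects, and submitToDB from the question; the multer-style req.files and the newfileName naming are assumptions, not the asker's exact setup:

// Sketch: one Promise per file, resolved only when that file's stream finishes.
const uploadOne = (file) => new Promise((resolve, reject) => {
  const newfileName = Date.now() + "-" + file.originalname; // hypothetical naming scheme
  const blob = bucket.file(newfileName);
  const blobStream = blob.createWriteStream({ resumable: false });

  blobStream.on("error", reject);
  blobStream.on("finish", async () => {
    const publicUrl = `https://storage.googleapis.com/${bucket.name}/${blob.name}`;
    try {
      await blob.makePublic();
    } catch (err) {
      // Uploaded, but could not be made public; still resolve with the URL.
    }
    resolve({
      originalname: file.originalname,
      mimeType: file.mimetype,
      message: "Uploaded the file successfully: " + newfileName,
      url: publicUrl,
    });
  });

  blobStream.end(file.buffer); // assumes multer memory storage
});

// No index/fileLength bookkeeping: Promise.all resolves once every upload has finished.
const message = await Promise.all(req.files.map(uploadOne));
submitToDB(req, res, message);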

Related

How to set up a Node.js server to handle resumable file downloads

After I upload a file to my Google Drive, I save information about it in the database, and when someone wants to download that particular file, the only information that needs to be provided is the id of the saved file in the database.
The sample code below works very well, but the problem is that when there is an internet connection problem the download is terminated, and when the user tries to resume the download, it starts afresh.
Note: when requesting a file from Google Drive I can also provide ranges, but I don't know how to tell when the client is requesting a partial file so that I can include them in the request.
app.get("/download", async (req, res) => {
try {
const fileId = req.query.file;
if (!fileId) return res.status(400).json({ msg: "file is needed" });
const file = await File.findById(fileId);
if (!file) return res.status(404).json({ msg: "not found" });
const title = file.title
.replace(/[-&\/\\#, +()$~%.'":*?<>{}]/g, " ")
.trim();
const ext = file.file_type == "audio" ? ".mp3" : ".mp4";
const resp = await drive.files.get(
{
fileId: file.file_id,
alt: "media",
},
{ responseType: "stream" }
);
res.set({
"Content-Length": file.file_size,
"Content-Disposition": `attachment; filename=${title}${ext}`,
});
resp.data.pipe(res);
} catch (error) {
console.log(error.message);
res.status(500).send("something went wrong");
}
})
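A resumed download shows up on the server as an HTTP Range request header. One approach is to detect that header, forward the byte range to Drive, and answer with 206 Partial Content. This is a sketch of the idea rather than tested code; it assumes Drive honours Range headers on alt=media downloads and that file.file_size is accurate:

app.get("/download", async (req, res) => {
  try {
    const file = await File.findById(req.query.file);
    if (!file) return res.status(404).json({ msg: "not found" });

    const range = req.headers.range; // e.g. "bytes=1000-" when a client resumes
    const total = Number(file.file_size);

    // Forward the client's byte range to Drive (assumption: Drive honours Range on alt=media).
    const resp = await drive.files.get(
      { fileId: file.file_id, alt: "media" },
      { responseType: "stream", headers: range ? { Range: range } : {} }
    );

    if (range) {
      // Simplified: assumes an open-ended range like "bytes=START-".
      const start = Number(range.replace(/bytes=/, "").split("-")[0]);
      res.status(206).set({
        "Content-Range": `bytes ${start}-${total - 1}/${total}`,
        "Accept-Ranges": "bytes",
        "Content-Length": total - start,
      });
    } else {
      res.set({ "Content-Length": total, "Accept-Ranges": "bytes" });
    }
    resp.data.pipe(res);
  } catch (error) {
    console.log(error.message);
    res.status(500).send("something went wrong");
  }
});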

Read an .mp4 file from Firebase Storage using fs to send the video to a TensorFlow model

My graduation project is to convert video into text.
I'm trying to read a video that was uploaded to Firebase Storage from an Android app, so that I can send it to a TensorFlow model, but I can't read the video.
Here is my function:
exports.readVideo = functions.storage
  .object()
  .onFinalize(async (object) => {
    const bucket = admin.storage().bucket(object.bucket);
    const tempFilePath = path.join(os.tmpdir(), object.name);
    console.log(tempFilePath);

    console.log('download');
    // note download
    await bucket
      .file(object.name!)
      .download({
        destination: tempFilePath,
      })
      .then()
      .catch((err) => {
        console.log({
          type: 'download',
          err: err,
        });
      });

    console.log('read');
    // note read
    let stream = await bucket
      .file(object.name!)
      .createReadStream({
        start: 10000,
        end: 20000,
      })
      .on('error', function (err) {
        console.log('error 1');
        console.log({ error: err });
      });

    await new Promise((resolve, reject) => {
      console.log('error 2');
      stream.on('finish', resolve);
      console.log('error 3');
      stream.on('error', reject);
      console.log("end!");
      stream.on('end', resolve);
    }).catch((error) => {
      // successMessage is whatever we passed in the resolve(...) function above.
      // It doesn't have to be a string, but if it is only a succeed message, it probably will be.
      console.log("oups! " + error);
    });

    console.log('tempFile size2', fs.statSync(tempFilePath).size);
    return fs.unlinkSync(tempFilePath);
  });
and I got this error:
Function execution took 60008 ms, finished with status: 'timeout'
As the error message shows, the regular file system on Cloud Functions is read only. The only place you can write to is /tmp, as also shown in the documentation on file system access in Cloud Functions. I'm not sure why os.tmpdir() doesn't give you a location at that path, but you might want to hard-code the directory.
One thing to keep in mind: /tmp is a RAM disk and not a physical disk, so your allocated memory will need to have enough space for the files you write to it.
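Separately, the promise in the question never settles because nothing consumes the read stream: a readable stream's 'end' event only fires once the data is actually read, and 'finish' belongs to writable streams. One way around that is to pipe the chunk into a file under /tmp and wait for the destination's finish event. A minimal sketch of that idea (the chunk path and the 10000-20000 byte range are placeholders, not tested in the asker's project):

const fs = require('fs');
const os = require('os');
const path = require('path');

// Consume the GCS read stream by piping it somewhere; its events only fire once data flows.
async function readChunkToTmp(bucket, objectName) {
  const chunkPath = path.join(os.tmpdir(), 'chunk.bin'); // hypothetical destination
  await new Promise((resolve, reject) => {
    bucket
      .file(objectName)
      .createReadStream({ start: 10000, end: 20000 })
      .on('error', reject)
      .pipe(fs.createWriteStream(chunkPath))
      .on('error', reject)
      .on('finish', resolve); // fires once the chunk is fully written
  });
  return fs.readFileSync(chunkPath); // Buffer containing the requested bytes
}

Since the earlier bucket.file(object.name).download() call already wrote the whole object to tempFilePath, reading the local file directly, for example with fs.createReadStream(tempFilePath, { start, end }), is an even simpler alternative.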

Get file buffer using Google Drive API and async/await

I'm trying to get the buffer of some Drive PDF files so I can parse them and use the data.
I've managed to get the file names and ids using async/await and a drive.files.list call wrapped in a promise. Now I need to use the file ids to get the buffer and then read it.
The function I need should return a promise that I can await to get a buffer. (My parser works fine when I get a PDF buffer from website responses.)
function getBuffer(drive, file) {
  return new Promise((resolve, reject) => {
    ///// Google Auth
    var jwToken = new google.auth.JWT(
      key.client_email,
      null,
      key.private_key, ["https://www.googleapis.com/auth/drive"],
      null
    );
    jwToken.authorize((authErr) => {
      if (authErr) {
        return reject([false, "Auth Error: " + authErr]);
      }
    });

    drive.files.get({
      auth: jwToken,
      fileId: file.id,
      alt: 'media',
      supportsAllDrives: true
    }, function (err, res) {
      if (err) {
        return reject('The API returned an error: ' + err);
      }
      console.log(res);
      const buffer = res;
      resolve(buffer);
    });
  });
}
And I use it this way:
var buffer = await getBuffer(drive,files[i]);
The output I get in "console.log(res)" is something like this:
...
��M�7�|�ı�[��Ξ�A����EBS]��P��r�����j�3�|�I.��i�+ϢKU���U�:[�═�,^߻t덲�v��=}'*8���ѻ��#ғ�s��No��-��q8E9�/f� �(�`�j'3
"╚�-��� ������[jp&��╚k��M��vy� In�:a�զ�OlN��u����6�n���q�/Y�i4�?&%��q�,��p╚.ZV&n�Ɨ��2G������X����Y
D],�ggb�&�N���G����NS�Lח\U�^R|_f<��f*�|��]�{�3�-P�~�CS��t��>g�Y��#�#7Wjۋ╗=�5�����#ջ���5]>}&v�╝═�wg��eV�^>�#�{��Ѿ��ޤ��>O�� z�?{8Ij�0╗B�.�Cjm�4������║��m�,╗�������O���fS��ӂcE��g�3(�G��}d^O������7����|�
H�N��;
{��x�bȠ�׮�i]=���~��=��ٟ<��C��
wi��'a�-��p═M�6o��ϴ��ve��+��'
...
And when I try to use the parser (pdf2json) I get this error:
"An error occurred while parsing the PDF: stream must have data"
Thanks in advance :D
You want to download a file from Google Drive.
You want to convert the downloaded data to the buffer.
You have already been able to download files from Google Drive using googleapis with Node.js.
If my understanding is correct, how about this modification? In this modification, the file is downloaded as a stream and the data is converted to a buffer.
Modified script:
From:
drive.files.get({
  auth: jwToken,
  fileId: file.id,
  alt: 'media',
  supportsAllDrives: true
}, function (err, res) {
  if (err) {
    return reject('The API returned an error: ' + err);
  }
  console.log(res);
  const buffer = res;
  resolve(buffer);
});
To:
drive.files.get(
  {
    auth: jwToken,
    fileId: file.id,
    alt: "media",
    supportsAllDrives: true
  },
  { responseType: "stream" },
  function (err, { data }) {
    if (err) {
      return reject("The API returned an error: " + err);
    }
    let buf = [];
    data.on("data", function (e) {
      buf.push(e);
    });
    data.on("end", function () {
      const buffer = Buffer.concat(buf);
      console.log(buffer);
      // fs.writeFile("filename", buffer, err => console.log(err)); // For testing
      resolve(buffer);
    });
  }
);
Note:
As a test case, I could confirm that when the buffer is saved to a file using fs.writeFile("filename", buffer, err => console.log(err));, the downloaded file is created correctly.
Reference:
google-api-nodejs-client
If I misunderstood your question and this was not the direction you want, I apologize.
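For completeness, once getBuffer resolves with a real Buffer it can be handed to pdf2json; the "stream must have data" error in the question is typically what the parser reports when it receives something that is not a non-empty Buffer. A small usage sketch, assuming pdf2json's PDFParser.parseBuffer API and the getBuffer function above:

const PDFParser = require("pdf2json");

async function parseDrivePdf(drive, file) {
  const buffer = await getBuffer(drive, file); // Buffer from the modified script above
  return new Promise((resolve, reject) => {
    const pdfParser = new PDFParser();
    pdfParser.on("pdfParser_dataError", errData => reject(errData.parserError));
    pdfParser.on("pdfParser_dataReady", pdfData => resolve(pdfData));
    pdfParser.parseBuffer(buffer); // expects a non-empty Buffer
  });
}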

Node.js - AWS - Program Terminates Before Upload to S3 Bucket Completes

I've written a program that creates HTML files. I then attempt to upload the files to my S3 bucket at the end of the program. It seems that the problem is that my program terminates before the upload function can complete or return its callback.
Here is the gist of my code:
let aws = require('aws-sdk');

aws.config.update({
  //Censored keys for security
  accessKeyId: '*****',
  secretAccessKey: '*****',
  region: 'us-west-2'
});

let s3 = new aws.S3({
  apiVersion: "2006-03-01",
});

function upload(folder, platform, browser, title, data) {
  s3.upload({
    Bucket: 'html',
    Key: folder + platform + '/' + browser + '/' + title + '.html',
    Body: data
  }, function (err, data) {
    if (err) {
      console.log("Error: ", err);
    }
    if (data) {
      console.log("Success: ", data.Location);
    }
  });
}

/*
 *
 * Here is where the program generates HTML files
 *
 */

upload(folder, platform, browser, title, data);
If I call the upload() function (configured with test/dummy data) before the HTML generation section of my code, the upload succeeds. The test file successfully uploads to S3. However, when the function is called at the end of my code, I do not receive an error or success response. Rather, the program simply terminates and the file isn't uploaded to S3.
Is there a way to wait for the callback from my upload() function before continuing the program? How can I prevent the program from terminating before uploading my files to S3? Thank you!
Edit: After implementing Deiv's answer, I found that the program is still not uploading my files. I still am not receiving a success or error message of any kind. In fact, it seems like the program just skips over the upload() function. To test this, I added a console.log("test") after calling upload() to see if it would execute. Sure enough, the log prints successfully.
Here's some more information about the project: I'm utilizing WebdriverIO v4 to create HTML reports of various tests passing/failing. I gather the results of the tests via multiple event listeners (e.g. this.on('test:start'), this.on('suite:end'), etc.). The final event is this.on('end'), which is called when all of the tests have completed execution. It is here where the test results are sorted based on which operating system they were run on, which browser, etc.
I'm now noticing that my program won't do anything S3-related in the this.on('end') event handler even if I put it at the very beginning of the handler, though I'm still convinced it's because it isn't given enough time to execute, since the handler is able to process the results and create HTML files very quickly. I have this bit of code that lists all buckets in my S3:
s3.listBuckets(function (err, data) {
  if (err) {
    console.log("Error: ", err);
  } else {
    console.log("Success: ", data.Buckets);
  }
});
Even this doesn't return a result of any kind when run at the beginning of this.on('end'). Does anyone have any ideas? I'm really stumped here.
Edit: Here is my new code, which implements Naveen's suggestion:
this.on('end', async (end) => {
  /*
   * Program sorts results and creates variable 'data', the contents of the HTML file.
   */
  await s3.upload({
    Bucket: 'html',
    Key: key,
    Body: data
  }, function (err, data) {
    if (err) {
      console.log("Error: ", err);
    }
    if (data) {
      console.log("Success: ", data.Location);
    }
  }).on('httpUploadProgress', event => {
    console.log(`Uploaded ${event.loaded} out of ${event.total}`);
  });
});
The logic seems sound, but still I get no success or error message, and I do not see the upload progress. The HTML file does not get uploaded to S3.
You can use promises to wait for your upload function to finish. Here's what it will look like:
function upload(folder, platform, browser, title, data) {
  return new Promise((resolve, reject) => {
    s3.upload({
      Bucket: 'html',
      Key: folder + platform + '/' + browser + '/' + title + '.html',
      Body: data
    }, function (err, data) {
      if (err) {
        console.log("Error: ", err);
        return reject(err);
      }
      if (data) {
        console.log("Success: ", data.Location);
        return resolve(); //potentially return resolve(data) if you need the data
      }
    });
  });
}
/*
 *
 * Here is where the program generates HTML files
 *
 */

upload(folder, platform, browser, title, data)
  .then(data => { //if you don't care for the data returned, you can also do .then(() => {
    //handle success, do whatever else you want, such as calling callback to end the function
  })
  .catch(error => {
    //handle error
  });
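With the AWS SDK for JavaScript v2 there is also a shorter built-in option: the request object returned by s3.upload exposes a .promise() method, so the hand-written wrapper can be skipped. A sketch under the same assumptions as above; whether WebdriverIO v4 actually waits for an async 'end' handler is a separate question:

// Sketch: same upload using the SDK's built-in promise support (AWS SDK v2).
async function uploadHtml(folder, platform, browser, title, data) {
  const result = await s3.upload({
    Bucket: 'html',
    Key: folder + platform + '/' + browser + '/' + title + '.html',
    Body: data
  }).promise(); // resolves once the upload completes, rejects on error
  console.log("Success: ", result.Location);
  return result;
}

// The event handler must also await it, otherwise the process can still exit early.
this.on('end', async () => {
  await uploadHtml(folder, platform, browser, title, data);
});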

Async upload of multiple files to a Google Cloud Storage bucket

I'm trying to upload multiple files to a Google Cloud Storage bucket using NodeJS. I want all files to be uploaded before continuing. I tried several approaches but I can't seem to get it right.
const jpegImages = await fs.readdir(jpegFolder);

console.log('start uploading');
await jpegImages.forEach(async fileName => {
  await bucket.upload(
    path.join(jpegFolder, fileName),
    { destination: fileName }
  ).then(() => {
    console.log(fileName + ' uploaded');
  });
});
console.log('finished uploading');
This gives me the following output, which is not what I expect. Why is the 'finished uploading' log not executed after uploading the files?
start uploading
finished uploading
image1.jpeg uploaded
image2.jpeg uploaded
image3.jpeg uploaded
await doesn't work inside forEach (or most other array methods): forEach ignores the promise returned by an async callback, so nothing waits for the uploads.
If you don't need sequential uploading (files can be uploaded in parallel), you can create an array of Promises and use Promise.all() to execute them all at once; a sequential alternative is sketched after the code.
const jpegImages = await fs.readdir(jpegFolder);

console.log('start uploading');
await Promise
  .all(jpegImages.map(fileName => {
    return bucket.upload(path.join(jpegFolder, fileName), { destination: fileName });
  }))
  .then(() => {
    console.log('All images uploaded');
  })
  .catch(error => {
    console.error(`Error occured during images uploading: ${error}`);
  });
console.log('finished uploading');
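If the files do need to go up one at a time (for example to cap concurrency), a plain for...of loop cooperates with await where forEach does not. A small sketch using the same fs, bucket, and jpegFolder as above:

// Sequential variant: for...of respects await, unlike forEach.
const jpegImages = await fs.readdir(jpegFolder);

console.log('start uploading');
for (const fileName of jpegImages) {
  await bucket.upload(path.join(jpegFolder, fileName), { destination: fileName });
  console.log(fileName + ' uploaded');
}
console.log('finished uploading'); // now runs only after every file has uploaded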
