Download from a source and upload to gcloud using a Cloud Function - Node.js

I'm using a service called OpenTok, which stores videos on its cloud and gives me a callback URL when a file is ready so that I can download it and store it with my own cloud provider.
We use gcloud where I work, and I need to download the file and then store it in my gcloud bucket from a Firebase Cloud Function.
Here is my code:
const archiveFile = await axios.get(
'https://sample-videos.com/video701/mp4/720/big_buck_bunny_720p_2mb.mp4'
);
console.log('file downloaded from opentokCloud !');
fs.writeFile('archive.mp4', archiveFile, err => {
if (err) throw err;
// success case, the file was saved
console.log('File Saved in container');
});
await firebaseBucket.upload('archive.mp4', {
gzip: true,
// destination: `archivedStreams/${archiveInfo.id}/archive.mp4`,
destination: 'test/lapin.mp4',
metadata: {
cacheControl: 'no-cache',
},
});
I tried to pass the downloaded file directly to upload(), but it does not work; I have to provide a string (the path of my file).
How can I get the path of my downloaded file in the Cloud Function? Is it still in the RAM of my container, or in a cache folder?
As you can see, I tried to write it with fs, but I have no write access in the Cloud Function's container.
Thanks in advance to the community.

If someone is looking for this in the future, here is how I solved it:
With Firebase Cloud Functions you can write temporary files to /tmp (see https://cloud.google.com/functions/docs/concepts/exec#file_system for more information).
I solved the problem by using the node-fetch package and the createWriteStream function of Node.js:
const fetch = require('node-fetch');
const fs = require('fs');
await fetch(archiveInfo.url).then(res => {
console.log('start writing data');
const dest = fs.createWriteStream('/tmp/archive.mp4');
res.body.pipe(dest);
//Listen when the writing of the file is done
dest.on('finish', async () => {
console.log('start uploading in gcloud !');
await firebaseBucket.upload('/tmp/archive.mp4', {
gzip: true,
destination: `${pathToFileInBucket}/archive.mp4`,
metadata: {
cacheControl: 'no-cache',
},
});
console.log('uploading finished');
});
});
firebaseBucket is my gcloud bucket, already configured elsewhere; define your own bucket using @google-cloud/storage.
As my function was triggered via an HTTP link, don't forget to send a response and catch errors to avoid timed-out Cloud Functions (running for nothing, billed for nothing :D).
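For reference, here is a minimal sketch of the same idea with the stream completion wrapped in a Promise, so the Cloud Function can await the whole download-then-upload sequence before sending its response (the helper name and the hard-coded /tmp/archive.mp4 path are just for illustration):
const fetch = require('node-fetch');
const fs = require('fs');
// Sketch only: download to /tmp, wait for the write to finish, then upload.
// Assumes `firebaseBucket` is the @google-cloud/storage bucket configured elsewhere.
async function archiveToBucket(url, destination) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Download failed with status ${res.status}`);
  // Wrap the write stream so its completion can be awaited.
  await new Promise((resolve, reject) => {
    const dest = fs.createWriteStream('/tmp/archive.mp4');
    res.body.pipe(dest);
    dest.on('finish', resolve);
    dest.on('error', reject);
  });
  await firebaseBucket.upload('/tmp/archive.mp4', {
    gzip: true,
    destination,
    metadata: { cacheControl: 'no-cache' },
  });
}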

Related

File upload to S3 bucket via a Node.js console app using aws-sdk doesn't get completed

I have a one-time-running JS file, which runs as a command-line tool, not as a REST server.
The issue is that I have the following function, which accepts the arguments and uploads a file to a specified S3 bucket.
const uploadToAWSS3Bucket = (stream, fileName, bucketName) =>{
const params = {
Bucket: bucketName || '',
Key: fileName,
Body: stream
};
console.log(`Using Bucket ${bucketName} for uploading the file ${fileName}`);
return s3.upload(params, (err, data) => {
if (err) {
console.log(err);
}
console.log(data.stringify);
console.log(`File uploaded successfully. ${data.Location}`);
console.log(`Finished uploading the file ${fileName} to Bucket ${bucketName}.`);
}).promise();
// await sleep(80000);
};
This is called/implemented by the following method.
(async()=>{
const result = await uploadToAWSS3Bucket(stream, 'filename.json', 'mybucketname');
console.log(result);
});
However, the node index.js command exits without giving any command-line output, and it appears that the file upload never gets completed because of that.
Is there anything I am missing, or any trick that would work in this case?
The command exits without doing anything because your IIFE is missing a () at the end.
(async () => {
console.log('do something');
})();
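Applied to the snippet in the question, the corrected invocation would look roughly like this (assuming stream and uploadToAWSS3Bucket are defined as above):
(async () => {
  // The trailing () is what actually runs the IIFE, so the process stays
  // alive until the awaited upload promise settles.
  const result = await uploadToAWSS3Bucket(stream, 'filename.json', 'mybucketname');
  console.log(result);
})();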

Downloading a file from a GCP Storage bucket is very slow with the npm library @google-cloud/storage in a Node.js/TypeScript app

I'm downloading files from a GCP Cloud Storage bucket in a Node.js/Express app with TypeScript, using the official library @google-cloud/storage. I'm running the application locally, inside a Docker image that runs on docker-compose. A standard local environment, I guess.
The problem is that the file download takes a lot of time, and I really don't understand why that is.
In fact, I tried to download files with the GCP REST API (through the media link URL), using a simple fetch request: in this case, everything goes well and the download time is fine.
Below is a download time comparison for a couple of files of different sizes:
1KB: @google-cloud/storage 621 ms, fetch 224 ms
587KB: @google-cloud/storage 4.1 s, fetch 776 ms
28MB: @google-cloud/storage 2 minutes and 4 seconds, fetch 4 s
@google-cloud/storage authentication is managed through the GOOGLE_APPLICATION_CREDENTIALS environment variable. I have the same problems with both the 5.8.5 and 5.14.0 versions of the @google-cloud/storage library.
To be precise, I need to get the file as a buffer in order to directly manage its content in the Node application; the code is below.
import fetch from 'node-fetch'
import { Storage as GoogleCloudStorageLibrary } from '@google-cloud/storage'
export interface GoogleCloudStorageDownload {
fileBuffer: Buffer;
fileName: string;
}
// this method takes long time to retrieve the file and resolve the promise
const downloadBufferFile = async (filePath: string, originalName: string): Promise<GoogleCloudStorageDownload> => {
const storage = new GoogleCloudStorageLibrary()
const bucket = storage.bucket('...gcp_cloud_storage_bucket_name...')
return new Promise<GoogleCloudStorageDownload>((resolve, reject) => {
bucket
.file(filePath)
.download()
.then((data) => {
if (Array.isArray(data) && data.length > 0 && data[0]) {
resolve({ fileBuffer: data[0], fileName: originalName })
}
})
.catch((e) => {
if (e.code === 404) {
reject(new Error(`CloudStorageService - ${e.message} at path: ${filePath}`))
}
reject(new Error(`Error in downloading file from Google Cloud bucket at path: ${filePath}`))
})
})
}
// this method takes normal time to retrieve the file and resolve
const downloadBufferFileFetch = async (filePath: string, originalName: string): Promise<GoogleCloudStorageDownload> => {
const fetchParams = {
headers: {
Authorization: 'Bearer ...oauth2_bearer_token...'
}
}
const fetchResponse = await fetch(filePath, fetchParams)
if (!fetchResponse.ok) {
throw new Error(`Error in fetch request: ${filePath}`)
}
const downloadedFile = await fetchResponse.buffer()
const result = {
fileBuffer: downloadedFile,
fileName: originalName
}
return result
}
const filePath = '...complete_file_path_at_gcp_bucket...'
const originalName = 'fileName.csv'
const slowResult = await downloadBufferFile(filePath, originalName)
const fastResult = await downloadBufferFileFetch(filePath, originalName)
The bucket has standard configuration.
You may suggest just using the REST API with fetch, but that would be suboptimal and/or annoying, since I would have to manage the Authorization Bearer token and its refresh for each environment the application runs on.
Am I doing something wrong? What may be the cause of the very/extremely slow download?

How to download files into the /tmp folder of a Google Cloud Function and then upload them to Google Cloud Storage

So I need to deploy a Google Cloud Function that allows me to do two things.
The first is to DOWNLOAD any file from an SFTP/FTP server into the local /tmp directory of the Cloud Function. Then, the second step is to UPLOAD this file to a bucket on Google Cloud Storage.
I actually know how to upload, but I don't get how to DOWNLOAD files from the FTP server to my local /tmp directory.
So, I have written a GCF that receives as parameters (in the body) the configuration (config) that allows me to connect to the FTP server, the filename and the path.
For my test I used the following test FTP server: https://www.sftp.net/public-online-sftp-servers, with this configuration:
{
config:
{
hostname: 'test.rebex.net',
username: 'demo',
port: 22,
password: 'password'
},
filename: 'FtpDownloader.png',
path: '/pub/example'
}
After my DOWNLOAD, I start my UPLOAD. For that, I check whether the DOWNLOADED file is in '/tmp/filename' before the UPLOAD, but the file is never there.
See the following code:
exports.transferSFTP = (req, res) =>
{
let body = req.body;
if(body.config)
{
if(body.filename)
{
//DOWNLOAD
const Client = require('ssh2-sftp-client');
const fs = require('fs');
const client = new Client();
let remotePath
if(body.path)
remotePath = body.path + "/" + body.filename;
else
remotePath = "/" + body.filename;
let dst = fs.createWriteStream('/tmp/' + body.filename);
client.connect(body.config)
.then(() => {
console.log("Client is connected !");
return client.get(remotePath, dst);
})
.catch(err =>
{
res.status(500);
res.send(err.message);
})
.finally(() => client.end());
//UPLOAD
const {Storage} = require('@google-cloud/storage');
const storage = new Storage({projectId: 'my-project-id'});
const bucket = storage.bucket('my-bucket-name');
const file = bucket.file(body.filename);
fs.stat('/tmp/' + body.filename,(err, stats) =>
{
if(stats.isDirectory())
{
fs.createReadStream('/tmp/' + body.filename)
.pipe(file.createWriteStream())
.on('error', (err) => console.error(err))
.on('finish', () => console.log('The file upload is completed !!!'));
console.log("File exist in tmp directory");
res.status(200).send('Successfully executed !!!')
}
else
{
console.log("File is not on the tmp Google directory");
res.status(500).send('File is not loaded in tmp Google directory')
}
});
}
else res.status(500).send('Error: no filename on the body (filename)');
}
else res.status(500).send('Error: no configuration elements on the body (config)');
}
So, I receive the following message: "File is not loaded in tmp Google directory", because after the fs.stat() method, stats.isDirectory() is false. Before I added the fs.stat() check, I just got files written with the same filenames but without content.
So I conclude that my upload works, but without the DOWNLOADED files it is really hard to copy them to Google Cloud Storage.
Thanks for your time, and I hope I will find a solution.
The problem is that you're not waiting for the download to be completed before your code which performs the upload starts running. While you do have a catch() statement, that is not sufficient.
Think of the first part (the download) as a separate block of code. You have told JavaScript to go off and do that block asynchronously. As soon as your script has done that, it immediately goes on to the rest of your script. It does not wait for the 'block' to complete. As a result, your code to do the upload is running before the download has been completed.
There are two things you can do. The first would be to move all the code which does the uploading into a 'then' block following the get() call (BTW, you could simplify things by using fastGet()), e.g.
client.connect(body.config)
.then(() => {
console.log("Client is connected !");
return client.fastGet(remotePath, localPath);
})
.then(() => {
// do the upload
})
.catch(err => {
res.status(500);
res.send(err.message);
})
.finally(() => client.end());
The other alternative would be to use async/await, which will make your code look a little more 'synchronous'. Something along the lines of (untested):
async function doTransfer(remotePath, localPath) {
try {
let client = new Client();
await client.connect(config);
await client.fastGet(remotePath, localPath);
await client.end();
uploadFile(localPath);
} catch(err) {
....
}
}
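For illustration only, a hypothetical uploadFile helper (not part of the original answer) could look like this with @google-cloud/storage; the same code could also fill the "// do the upload" step in the first option above:
const { Storage } = require('@google-cloud/storage');
// Hypothetical helper: 'my-project-id' and 'my-bucket-name' are assumptions,
// reusing the names from the question. Uploads the local file, keeping its basename.
async function uploadFile(localPath) {
  const storage = new Storage({ projectId: 'my-project-id' });
  const bucket = storage.bucket('my-bucket-name');
  await bucket.upload(localPath);
  console.log(`${localPath} uploaded to the bucket`);
}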
Here is a GitHub project that answers a similar issue to yours.
There they deploy a Cloud Function that downloads the files from the FTP server and uploads them directly to the bucket, skipping the step of having the temporary file.
The code works; the deployment instructions in that GitHub repo are not up to date, so I'll put the deploy steps as I suggest them, and I verified that they work:
Activate Cloud Shell and run:
Clone the repository from github: git clone https://github.com/RealKinetic/ftp-bucket.git
Change to the directory: cd ftp-bucket
Adapt your code as needed
Create a GCS bucket; if you don't have one already, you can create one with gsutil mb -p [PROJECT_ID] gs://[BUCKET_NAME]
Deploy: gcloud functions deploy importFTP --stage-bucket [BUCKET_NAME] --trigger-http --runtime nodejs8
In my personal experience this is more efficient than splitting it into two functions, unless you need to do some file editing within the same Cloud Function.
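The general idea of that repository (streaming straight from SFTP into the bucket, with no /tmp file) can be sketched roughly like this; this is not the repository's actual code, just an illustration using the same libraries as in the question:
const Client = require('ssh2-sftp-client');
const { Storage } = require('@google-cloud/storage');
// Sketch only: pipe the SFTP download directly into a GCS write stream.
async function ftpToBucket(config, remotePath, bucketName, destName) {
  const storage = new Storage();
  const dest = storage.bucket(bucketName).file(destName).createWriteStream();
  const client = new Client();
  try {
    await client.connect(config);
    await client.get(remotePath, dest); // get() accepts a writable stream as destination
  } finally {
    await client.end();
  }
}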

Node.js Cloud Function - Stream CSV data directly to Google Cloud Storage file

I have a script that can call a RESTful API and retrieve CSV data from a report in chunks. I'm able to concatenate, parse, and display this data in the console. I am also able to write this CSV data to a local file and store it.
What I am trying to figure out is how to skip creating a file to store this data before uploading it to GCS and instead transfer it directly into Google Cloud Storage to save as a file. Since I am trying to make this a serverless cloud function, I am trying to stream it directly from memory into a Google Cloud Storage file.
I found the 'Streaming Transfers' documentation on Google, but it only references doing this with 'gsutil', and I am struggling to find any examples or documentation on how to do this with Node.js. I also tried to follow this answer on Stack Overflow, but it's from 2013 and the methods seem a little outdated. My script also isn't user-facing, so I don't need to hit any routes.
I am able to upload local files directly to my bucket using the function below, so authentication isn't an issue. I'm just unsure how to convert a CSV blob or object in memory into a file in GCS. I haven't been able to find many examples, so I wasn't sure if anyone else has solved this issue in the past.
const { Storage } = require('@google-cloud/storage');
const storage = new Storage({
projectId,
keyFilename
});
function uploadCSVToGCS() {
const localFilePath = './test.csv';
const bucketName = "Test_Bucket";
const bucket = storage.bucket(bucketName);
bucket.upload(localFilePath);
};
I also found a third-party tool that Google references, called 'boto', that seems to do what I want, but it is for Python, not Node.js, unfortunately.
Streaming object data to Cloud Storage is illustrated in the documentation. You will need to understand how Node streams work and make use of createWriteStream. The sample code is not exactly what you want, but you'll use the same pattern:
function sendUploadToGCS (req, res, next) {
if (!req.file) {
return next();
}
const gcsname = Date.now() + req.file.originalname;
const file = bucket.file(gcsname);
const stream = file.createWriteStream({
metadata: {
contentType: req.file.mimetype
},
resumable: false
});
stream.on('error', (err) => {
req.file.cloudStorageError = err;
next(err);
});
stream.on('finish', () => {
req.file.cloudStorageObject = gcsname;
file.makePublic().then(() => {
req.file.cloudStoragePublicUrl = getPublicUrl(gcsname);
next();
});
});
stream.end(req.file.buffer);
}
@doug-stevenson thanks for pushing me in the right direction. I was able to get it to work with the following code:
const { Storage } = require('@google-cloud/storage');
const storage = new Storage();
const bucketName = 'test_bucket';
const blobName = 'test.csv';
const bucket = storage.bucket(bucketName);
const blob = bucket.file(blobName);
const request = require('request');
function pipeCSVToGCS(redirectUrl) {
request.get(redirectUrl)
.pipe(blob.createWriteStream({
metadata: {
contentType: 'text/csv'
}
}))
.on("error", (err) => {
console.error(`error occurred`);
})
.on('finish', () => {
console.info(`success`);
});
};
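Since the request package is now deprecated, the same streaming pattern could be written with node-fetch and stream.pipeline instead; a rough sketch, assuming blob is the bucket file defined above and Node 15+ for stream/promises:
const fetch = require('node-fetch');
const { pipeline } = require('stream/promises');
// Sketch only: stream the CSV response straight into the GCS write stream.
async function pipeCSVToGCS(redirectUrl) {
  const response = await fetch(redirectUrl);
  await pipeline(
    response.body,
    blob.createWriteStream({ metadata: { contentType: 'text/csv' } })
  );
  console.info('success');
}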

Node.js get image URL

I want to:
1- choose an image from my filesystem and upload it to the server/local storage
2- get its URL back using a Node.js service
I managed to do step 1, and now I want to get the image URL instead of the success message in res.end.
Here is my code:
app.post("/api/Upload", function(req, res) {
upload(req, res, function(err) {
if (err) {
return res.end("Something went wrong!");
}
return res.end("File uploaded sucessfully!.");
});
});
I'm using multer to upload the image.
You can do something like this using AWS S3; it returns the URL of the uploaded image:
const AWS = require('aws-sdk')
AWS.config.update({
accessKeyId: <AWS_ACCESS_KEY>,
secretAccessKey: <AWS_SECRET>
})
const uploadImage = file => {
const replaceFile = file.data_uri.replace(/^data:image\/\w+;base64,/, '')
const buf = Buffer.from(replaceFile, 'base64')
const s3 = new AWS.S3()
s3.upload({
Bucket: <YOUR_BUCKET>,
Key: <NAME_TO_SAVE>,
Body: buf,
ACL: 'public-read'
}, (err, data) => {
if (err) throw err;
return data.Location; // this is the URL
})
}
You can also check this Express generator, which has a route to upload images to AWS S3: https://www.npmjs.com/package/speedbe
I am assuming that you are saving the image on the server file system and not in a storage solution like AWS S3 or Google Cloud Storage, where you get the URL after upload.
Since you are storing it on the filesystem, you can rename the file with a unique identifier, like a UUID or something else.
Then you can make a GET route that takes that ID as a query or path parameter, reads the file having that ID as its name, and sends it back.
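A minimal sketch of that idea with Express and multer (the uploads/ folder, the field name 'image', and the /api/images route are assumptions for illustration):
const express = require('express');
const multer = require('multer');
const path = require('path');
const { randomUUID } = require('crypto');
const app = express();
// Rename each upload to a UUID so it can be fetched back by ID later.
const storage = multer.diskStorage({
  destination: 'uploads/',
  filename: (req, file, cb) => cb(null, randomUUID() + path.extname(file.originalname)),
});
const upload = multer({ storage });
app.post('/api/Upload', upload.single('image'), (req, res) => {
  // Return the URL where the image can be fetched, instead of a plain message.
  res.json({ url: `/api/images/${req.file.filename}` });
});
// Serve the stored file back by its generated name.
app.get('/api/images/:id', (req, res) => {
  res.sendFile(path.resolve('uploads', req.params.id));
});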
