There is a site that contains data I want to parse in my application. The JSON file is inside a tar.gz archive. My code issues a request to that site, downloads the tar.gz file, extracts the JSON, and then parses the information.
This is how the code looks so far, but I have not added it to my backend yet.
const fs = require("fs");
const rp = require("request-promise");
const tar = require("tar");

(async function main() {
  try {
    const url = "https://statics.koreanbuilds.net/bulk/latest.tar.gz";
    const arcName = "latest.tar.gz";
    const response = await rp.get({ uri: url, encoding: null });
    fs.writeFileSync(arcName, response, { encoding: null });
    tar.x({ file: arcName, cwd: ".", sync: true });
    let text = fs.readFileSync("latest1.json");
    let fullText = JSON.parse(text);
    let championsObj = {};
    // Following logic that parses the JSON file
    .......
  } catch (err) {
    console.error(err);
  }
})();
I plan on storing my parsed JSON object in MongoDB. I also want to perform the above operation and update the JSON and tar.gz files every 24 hours.
I am worried that these operations will cause problems when deploying this project. This is my first time deploying a full-stack application, and I am almost positive that code that modifies the file structure of the overall project will cause some issues. I just don't know exactly what I should be worried about or how to tackle it. I believe there may be a problem with CORS, but I am more worried about the application actually working and updating correctly. The entire application is built with the MERN stack.
When you deploy your code on a VPS, saving to and reading from the filesystem is completely fine. When you deploy to a PaaS like Heroku, you have to keep in mind that the filesystem is ephemeral, which means you get a fresh copy on each deploy. Files that are not part of version control will disappear after a release. You can't rely on the filesystem for storage, and you have to use an external service to store images/files (e.g. AWS S3).
Having said that, your code will work on Heroku because you're saving and reading from the file right away. One thing I'd do is add a date/timestamp to the downloaded file name so you don't get an error on the second run when a file with that name already exists. You could also research the possibility of extracting the archive in memory so you don't have to use the filesystem at all.
Other than that you shouldn't be worried. CORS is not relevant in this context.
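A minimal sketch of the timestamp idea, assuming you build the archive name before downloading (the helper name is mine, not from the original code):

```javascript
// Build a unique archive name per run, e.g. "latest-2024-01-01T00-00-00-000Z.tar.gz".
// The base name and extension here are just illustrative.
function timestampedName(base, ext) {
  const stamp = new Date().toISOString().replace(/[:.]/g, "-");
  return `${base}-${stamp}${ext}`;
}

const arcName = timestampedName("latest", ".tar.gz");
console.log(arcName);
```

Each run then writes to its own file, so nothing from a previous run is overwritten or reused by accident.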
Related
In my node.js server I have a post request making 2 uploads to google cloud storage--each into a different bucket.
When I tested the functionality, I was able to successfully upload 2 files to 2 different buckets, but upon the next test, 1 of the 2 uploads is failing and throwing an error: Error: Could not load the default credentials.
Why would it fail on the second test on only 1 of the uploads?
Why would it say the credentials can't be loaded if it's a fully public bucket (all users have full object access to read/write/admin)?
app.post("/upload", upload.single("image"), (req, res) => {
  // this takes an image file and uploads it
  async function uploadFile() {
    await storage.bucket(bucketName).upload(imagePath, { destination: imagePath });
  }
  uploadFile().catch(console.error);

  // this resizes the image
  Jimp.read(imagePath, (err, img) => {
    if (err) throw err;
    img.resize(Jimp.AUTO, 300).write(imgTitle + "-thumb.jpg");
  });

  // this uploads the resized image
  async function uploadThumb() {
    await storage.bucket(thumbs).upload(imgTitle + "-thumb.jpg", {
      destination: cat + "/" + subCat + "/" + imgTitle + "-thumb.jpg"
    });
  }
  setTimeout(() => { // this timeout waits 2 seconds for JIMP to finish processing the image
    uploadThumb().catch(console.error);
  }, 2000);
});
I'm hoping someone can explain why this stopped working after the first test. The function that uploads the resized image works in both tests, but the function that uploads the original file fails on the 2nd test throwing the error: Error: Could not load the default credentials
UPDATE
After many tests, I have tentatively deduced that this is a file-size issue. The thumbnail upload works every time, while the full-size image fails when its size reaches ~2-3 MB. Reading the GCS docs, the maximum single-file upload limit is 5 TB, so I don't know why there is an issue with a few MB. I do not want to lower the image size/resolution, as these are artworks that will need to be viewed at full size (that's exactly why I'm creating the thumbnails in the first place).
I resolved this issue myself through thorough research and trial and error (weeks, I'm not proud). I knew I needed to add authentication parameters to my const storage = new Storage(); (that line is not in my question, and I now know it should have been).
Originally I was trying to use a .env file to pass in the project ID and the client_secret.json file in every way I knew how. But after rereading the documentation for the nth time, I came across the correct syntax.
I needed to create constants for the project ID and the path of the .json key file, and pass those in as parameters like this:
const projectId = 'project-id';
const keyFileName = "client_secret.json";
const storage = new Storage({projectId, keyFileName});
The reason the uploads were failing some of the time seems to be that smaller objects were getting some sort of free pass, but as soon as the uploaded file size reached ~3 MB, authentication was required and not present. I still don't fully understand it, but this is how I solved it.
It looks like your credentials are not loaded. Service credentials, that is.
Have you already tried running?
gcloud auth application-default login
Also, are the bucket permissions identical for both?
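As an alternative to passing the key file path in code, the client libraries can also pick up credentials from the standard GOOGLE_APPLICATION_CREDENTIALS environment variable (the paths below are placeholders):

```shell
# Point the Google Cloud client libraries at a service account key file.
# The path is a placeholder; use the location of your own key.
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/client_secret.json"
node server.js
```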
I have been using GCS to store my images and also use the NodeJS package to upload these images to my bucket. I have noticed that if I frequently change an image, it does one of the following:
It changes
It serves an old image
It doesn't change
This seems to happen pretty randomly despite setting all of the options properly and even cross-referencing that with GCS.
I upload my images like this:
const options = {
  destination,
  public: true,
  resumable: false,
  metadata: {
    cacheControl: 'no-cache, max-age=0',
  },
};
const file = await this.bucket.upload(tempImageLocation, options);
const { bucket, name, generation } = file[0].metadata;
const imageUrl = `https://storage.googleapis.com/${bucket}/${name}`;
I have debated whether to use the base URL you see there or use this one: https://storage.cloud.google.com.
I can't seem to figure out what I am doing wrong and how to always serve a fresh image. I have also tried ?ignoreCache=1 and other query parameters.
As the official API documentation (accessible here) shows, you should not need the await. This might be affecting your upload sometimes. If you do want to use await, the function needs to be declared async, as shown in the second example from the documentation. Your code should look like this:
const bucketName = 'Name of a bucket, e.g. my-bucket';
const filename = 'Local file to upload, e.g. ./local/path/to/file.txt';

const {Storage} = require('@google-cloud/storage');
const storage = new Storage();

async function uploadFile() {
  // Uploads a local file to the bucket
  await storage.bucket(bucketName).upload(filename, {
    // Support for HTTP requests made with `Accept-Encoding: gzip`
    gzip: true,
    // By setting the option `destination`, you can change the name of the
    // object you are uploading to a bucket.
    metadata: {
      // Enable long-lived HTTP caching headers
      // Use only if the contents of the file will never change
      // (If the contents will change, use cacheControl: 'no-cache')
      cacheControl: 'public, max-age=31536000',
    },
  });

  console.log(`${filename} uploaded to ${bucketName}.`);
}

uploadFile().catch(console.error);
While this is untested, it should help you avoid the issue of the images not always being uploaded.
Besides that, as explained in the official documentation on Editing Metadata, you can change the way that metadata - which includes the cache control - is used and managed by your project. This way, you can change your cache configuration as well.
I would also like to include the link below to a complete tutorial on how to send images to Cloud Storage with Node.js, in case you want to check a different approach.
Image Upload With Google Cloud Storage and Node.js
Let me know if the information helped you!
You can try changing ?ignoreCache=1 to ?ignoreCache=0.
I want to get an array of file names from my project's public/logos folder. I am using the create-react-app template and, as you can guess, I cannot use const fs = require('fs') in the browser after the project starts.
So, is there any way to fill an array with the file names from that folder right after the npm start command, or am I out of context?
const fs = require('fs')
const path = require('path')
const appRoot = require('app-root-path').path

const getNames = () => {
  fs.readdir(path.join(appRoot, "public", "logos"), (err, files) => {
    return files
  })
}
Although Sathishkumar's answer is correct, it's not the only way: running an application server just for reading static images can be too much in many situations.
What you can do is handle this by changing the webpack configuration (this requires you to eject first, so be really careful).
From webpack you have all of the Node.js features available, but you must make those changes static for the web app.
An idea:
manually copy with html-copy-plugin every image in the dist folder
read every image file in that folder from node and generate a list of image names
put the list of images as a global variable in your bundle by using webpack DefinePlugin
Now you will be able to read images names from this new global.
Note: this will not be a dynamic read of resources in a folder. If add/remove images you will be forced to repeat the build process of the app.
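The DefinePlugin step above could be sketched like this in the ejected config; the folder location and the global name `LOGO_FILES` are my own assumptions, and the folder is read once at build time only:

```javascript
// webpack.config.js (fragment) - this runs in Node at build time, so fs is available.
const fs = require('fs');
const path = require('path');
const webpack = require('webpack');

const logoDir = path.join(__dirname, 'public', 'logos'); // assumed location
const logoFiles = fs.readdirSync(logoDir); // file names captured at build time

module.exports = {
  // ...rest of the ejected config...
  plugins: [
    // Inlines the list into the bundle as a global constant,
    // so browser code can read LOGO_FILES without fs.
    new webpack.DefinePlugin({
      LOGO_FILES: JSON.stringify(logoFiles),
    }),
  ],
};
```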
Yes, it is out of context. This is not possible in a browser-based JS application: you can't access the file system using JavaScript in the browser.
You can use Node.js (or any other language) to create a REST API, as you mentioned, which returns the file list, and then consume it in the frontend (with an API like fetch, or a package like axios). This is the preferred way of doing it.
If you need to read the files from the file system, you need to start a server (e.g. Express) and then read the files on the server, in response to a request from the frontend or to a link pasted into the browser's address bar.
I'm working on a project using Google Cloud Storage to allow users to upload media files into a predefined bucket using Node.js. I've been testing with small .jpg files. I also used gsutil to set bucket permissions to public.
At first, all files generated links that downloaded the file. Upon investigation of the docs, I learned that I could explicitly set the Content-Type of each file after upload using the gsutil CLI. When I used this procedure to set the filetype to 'image/jpeg', the link behavior changed to display the image in the browser. But this only worked if the link had not been previously clicked prior to updating the metadata with gsutil. I thought that this might be due to browser caching, but the behavior was duplicated in an incognito browser.
Using gsutil to set the mime type would be impractical at any rate, so I modified the code in my node server POST function to set the metadata at upload time using an npm module called mime. Here is the code:
app.post('/api/assets', multer.single('qqfile'), function (req, res, next) {
  console.log(req.file);
  if (!req.file) {
    return res.status(400).send('No file uploaded.');
  }

  // Create a new blob in the bucket and upload the file data.
  var blob = bucket.file(req.file.originalname);
  var blobStream = blob.createWriteStream();
  var metadata = {
    contentType: mime.lookup(req.file.originalname)
  };

  blobStream.on('error', function (err) {
    return next(err);
  });

  blobStream.on('finish', function () {
    blob.setMetadata(metadata, function (err, response) {
      console.log(response);
      // The public URL can be used to directly access the file via HTTP.
      var publicUrl = format(
        'https://storage.googleapis.com/%s/%s',
        bucket.name, blob.name);
      res.status(200).send({
        'success': true,
        'publicUrl': publicUrl,
        'mediaLink': response.mediaLink
      });
    });
  });

  blobStream.end(req.file.buffer);
});
This seems to work, from the standpoint that it does actually set the Content-Type on upload, and that is correctly reflected in the response object as well as the Cloud Storage console. The issue is that some of the links returned as publicUrl cause a file download, and others cause a browser load of the image. Ideally I would like to have both options available, but I am unable to see any difference in the stored files or their metadata.
What am I missing here?
Google Cloud Storage makes no assumptions about the content-type of uploaded objects. If you don't specify, GCS will simply assign a type of "application/octet-stream".
The command-line tool gsutil, however, is smarter, and will attach the right Content-Type to files being uploaded in most cases, JPEGs included.
Now, there are two reasons why your browser is likely to download images rather than display them. First, if the Content-Type is set to "application/octet-stream", most browsers will download the results as a file rather than display them. This was likely happening in your case.
The second reason is if the server responds with a 'Content-Disposition: attachment' header. This doesn't generally happen when you fetch GCS objects from the host "storage.googleapis.com" as you are doing above, but it can if you, for instance, explicitly specified a contentDisposition for the object that you've uploaded.
For this reason I suspect that some of your objects don't have an "image/jpeg" content type. You could go through and set them all with gsutil like so: gsutil -m setmeta 'Content-Type:image/jpeg' gs://myBucketName/**
I'm quite new to node.js and would like to do the following:
user can upload one file
upload should be saved to amazon s3
file information should be saved to a database
script shouldn't be limited to specific file size
As I've never used S3 or done uploads before, I might have some wrong ideas - please correct me if I'm wrong.
So, in my opinion, the original file name should be saved in the db and returned for download, but the file on S3 should be renamed to my database entry id to prevent overwriting files. Next, should the files be streamed or something? I've never done this, but it just doesn't seem smart to cache files on the server only to then push them to S3, does it?
Thanks for your help!
First, I recommend looking at the knox module for NodeJS. It is from a quite reliable source. https://github.com/LearnBoost/knox
I wrote the code below for the Express module, but if you do not use it or use another framework, you should still understand the basics. Take a look at the CAPS_CAPTIONS in the code; you will want to change them according to your needs/configuration. Please also read the comments to understand the pieces of code.
app.post('/YOUR_REQUEST_PATH', function(req, res, next){
  var fs = require("fs")
  var knox = require("knox")

  var s3 = knox.createClient({
      key: 'YOUR PUBLIC KEY HERE' // take it from AWS S3 configuration
    , secret: 'YOUR SECRET KEY HERE' // take it from AWS S3 configuration
    , bucket: 'YOUR BUCKET' // create a bucket on AWS S3 and put the name here. Configure it to your needs beforehand. Allow uploads (in the AWS management console) and possibly view/download. This can be done via bucket policies.
  })

  fs.readFile(req.files.NAME_OF_FILE_FIELD.path, function(err, buf){ // read the file submitted from the form on the fly
    var s3req = s3.put("/ABSOLUTE/FOLDER/ON/BUCKET/FILE_NAME.EXTENSION", { // configure putting a file. Write an algorithm to name your file
        'Content-Length': buf.length
      , 'Content-Type': 'FILE_MIME_TYPE'
    })

    s3req.on('response', function(s3res){ // handle the response
      if (200 == s3res.statusCode) {
        // play with the database here, use s3req and s3res variables here
      } else {
        // handle errors here
      }
    })

    s3req.end(buf) // execute the upload
  })
})