Uploading and requesting images from Meteor app - node.js

I want to upload images from the client to the server. The client must see a list of all images he or she has and see the image itself (a thumbnail or something like that).
I saw people using two methods (generically speaking)
1- Upload image and save the binaries to MongoDB
2- Upload an image and move it to a folder, save the path somewhere (the classic method, and the one I implemented so far)
What are the pros and cons of each method, and how can I retrieve the data and show it in a template in each case (getting the path and writing it to the src attribute of an img tag, or sending the binaries)?
Problems found so far: when I request foo.jpg (localhost:3000/uploads/foo.jpg) that I uploaded and the server moved to a known folder, my router (iron router) fails to find how to deal with the request.

1- Upload image and save the binaries to MongoDB
Either you limit the file size to 16 MB and use plain MongoDB documents, or you use GridFS and can store files of any size. This method has pros and cons, but IMHO it is much better than storing files on the file system:
Files don't touch your file system; they are piped to your database.
You keep all the benefits of Mongo and can scale up without worries.
Files are chunked, so you can send just a specific byte range (useful for streaming or resuming downloads).
Files are accessed like any other Mongo document, so you can use allow/deny rules, pub/sub, etc.
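To make the byte-range point concrete, here is a minimal sketch of how a Range header could be mapped onto GridFS chunk numbers. It assumes the default GridFS chunk size of 255 KiB and handles only single-range headers; the function names parseRange and chunksForRange are illustrative, not part of any driver API.

```javascript
// GridFS splits each file into fixed-size chunks; 255 KiB is the default size.
const DEFAULT_CHUNK_SIZE = 255 * 1024;

// Parse a single-range header such as "bytes=0-499", "bytes=500-" or "bytes=-200".
// Returns { start, end } (inclusive) or null when the header cannot be satisfied.
function parseRange(header, fileSize) {
  const match = /^bytes=(\d*)-(\d*)$/.exec(header || '');
  if (!match || (match[1] === '' && match[2] === '')) return null;

  let start;
  let end;
  if (match[1] === '') {
    // Suffix form: the last N bytes of the file.
    const suffix = Number(match[2]);
    if (suffix === 0) return null;
    start = Math.max(fileSize - suffix, 0);
    end = fileSize - 1;
  } else {
    start = Number(match[1]);
    end = match[2] === '' ? fileSize - 1 : Math.min(Number(match[2]), fileSize - 1);
  }
  if (start > end || start >= fileSize) return null;
  return { start, end };
}

// Which chunk numbers (the "n" field of a GridFS chunk document) cover the range?
function chunksForRange(range, chunkSize = DEFAULT_CHUNK_SIZE) {
  return {
    first: Math.floor(range.start / chunkSize),
    last: Math.floor(range.end / chunkSize),
  };
}
```

Only the chunks between first and last need to be read from the database to answer the request, which is what makes streaming and resumable downloads cheap.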
2- Upload an image and move it to a folder, save the path somewhere
(the classic method, and the one I implemented so far)
In this case, either you store everything in your public folder and make it all publicly accessible by file name and path, or you use a dedicated asset delivery system, such as an nginx server. Either way, you end up with something less secure and less maintainable than the first option.
That said, have a look at the file collection package. It is much simpler than collection-fs and offers everything you are looking for out of the box (including a file API, GridFS storage, resumable uploads, and many other things).
Problems found so far: when I request foo.jpg
(localhost:3000/uploads/foo.jpg) that I uploaded and the server moved
to a known folder, my router (iron router) fails to find how to deal
with the request.
Do you know that this URL maps to public/uploads/foo.jpg under your project's root folder? If you put the file there, you should be able to request it.

Related

How to display PDF files which were indexed by Solr in an Angular app with a Node Express API

I want my Angular app to show a list of PDF files that were previously indexed by a Solr server. The PDF file should then open in my app (pdf-viewer is installed and works with external PDF files). Since Angular can't access or display local files, I thought I might use the Node API that my Angular app already uses to get data from a DB to also get the list of PDF files.
I just don't know how...
The indexed files have three fields (fileName, fileDir, fileAbsolutePath) which I can get by using the solr query (https://myServerAdress/solr/CoreName/select?fl=fileDir%2C%20fileName%2C%20fileAbsolutePath&q=*%3A*) in case it's relevant.
I don't need an exact tutorial on how to do this. A rough approach would be sufficient and very helpful!
You have to map the indexed file name to a path that your Node application has access to, then make a request to your Node application that returns the file, either directly or through something like X-Sendfile.
Exactly how you do this will depend on the framework you're using in node.
Solr does not have any method to retrieve the actual raw file content for serving, so the absolute path you've indexed (or the file name, if you have a static dir) will have to be used. Be careful not to serve files or documents outside of your intended path.
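One way to guard against serving files outside the intended path is to resolve the requested name against a fixed root and reject anything that lands elsewhere. The sketch below uses only Node's path module; the root /srv/pdfs and the name resolveInsideRoot are hypothetical, and symlinks are not followed here (fs.realpath would be needed for that).

```javascript
const path = require('node:path');

// Hypothetical root directory that holds the indexed PDFs.
const PDF_ROOT = path.resolve('/srv/pdfs');

// Resolve a requested name against the root and refuse anything that escapes it.
// Returns the absolute path, or null if the request points outside the root.
function resolveInsideRoot(requested, root = PDF_ROOT) {
  const target = path.resolve(root, requested);
  const relative = path.relative(root, target);
  const escapes =
    relative === '' ||                      // the root itself, not a file
    relative === '..' ||
    relative.startsWith('..' + path.sep) || // climbs above the root
    path.isAbsolute(relative);              // e.g. a different drive on Windows
  return escapes ? null : target;
}
```

The Node route would call resolveInsideRoot with the fileName (or a path derived from fileAbsolutePath) returned by Solr and respond with 404 when it gets null.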

Where to save an uploaded avatar

I can't find an answer to my question.
I'm building a React app using NodeJS and CRA, and I need to implement an avatar upload system, but I'm not sure where to save the uploaded image. My Node server serves a static folder, 'public', so do I need to save images in /public/avatar? But each time I update the app and rebuild the client-side folder, won't that overwrite the public folder and remove all the previously uploaded avatars? Am I right? What do you suggest?
Thanks,
There are multiple locations where you can store user-uploaded images, though your public folder is probably not the best one.
If you are using a database like MongoDB, you could store the image inside Mongo using GridFS and serve the data from a route when you retrieve the user information. Similarly, you can store a path to the file in the database and return either the path or the file data from the route.
Be careful with user uploads, however: accepting arbitrary uploaded data can lead to unanticipated results.
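As a rough sketch of both points (keeping uploads out of the rebuilt public folder, and not trusting what the client sends), the server can choose the file name itself and keep only an allow-listed extension. The directory uploads/avatars, the extension list and the name avatarPathFor are assumptions made for illustration.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

// Assumption: uploads live outside the build output, in their own directory.
const UPLOAD_DIR = path.resolve('uploads', 'avatars');
const ALLOWED_EXTENSIONS = new Set(['.jpg', '.jpeg', '.png', '.gif', '.webp']);

// Build a storage path from a server-generated name. Only the extension of the
// client-supplied file name is kept, and only if it is on the allow-list.
function avatarPathFor(originalName, uploadDir = UPLOAD_DIR) {
  const ext = path.extname(originalName).toLowerCase();
  if (!ALLOWED_EXTENSIONS.has(ext)) return null;
  return path.join(uploadDir, crypto.randomUUID() + ext);
}
```

The returned path (or just its base name) is what would be saved in the user record, and a dedicated route would read the file back from UPLOAD_DIR.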
You could also use Gravatar (https://gravatar.com/).
Users can choose an avatar tied to the hash of their email address, or get an automatically generated one by default.
With this solution, though, users cannot change their avatar directly on your website.
It is widely used on well-known websites such as Stack Overflow.
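For reference, a Gravatar image URL can be built from the email address alone. This sketch uses the long-standing MD5 form of the hash; check Gravatar's documentation for the currently recommended hash before relying on it. The function name gravatarUrl is illustrative.

```javascript
const crypto = require('node:crypto');

// Gravatar identifies an account by a hash of the trimmed, lower-cased address.
function gravatarUrl(email, size = 80) {
  const normalized = email.trim().toLowerCase();
  const hash = crypto.createHash('md5').update(normalized).digest('hex');
  // d=identicon asks for a generated image when the address has no avatar.
  return `https://www.gravatar.com/avatar/${hash}?s=${size}&d=identicon`;
}
```

The resulting URL can go straight into the src attribute of an img tag, so nothing has to be stored on your own server.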

NodeJS, how to handle image uploading with MongoDB?

I would like to know the best way to handle image uploading and saving the reference to the database. What I'm mostly interested in is the order in which you do the steps.
Should you upload the images first in the front-end (say Cloudinary), and then call the API with result links to the images and save it to the database?
Or should you upload the images to the server first, and upload them from the back-end and save the reference afterwards?
OR, should you do the image uploading after you save the record in the database and then update it once the images were uploaded?
It really depends on the resources, timeline, and number of images you need to upload daily.
Basically, if you have very few images to upload, you can upload each image to your server and then upload it from there to whatever cloud storage (S3, Cloudinary, ...) you are using. This is very easy to implement (you can find code snippets on the internet), and you can keep the secret keys/credentials for your cloud platform securely on the server side.
But in my opinion the best way of doing this is something like the following. I am taking user registration as an example.
1- Make a server call to get a temporary credential for uploading files to the cloud (generally, all providers offer this, e.g. STS or signed URLs in AWS).
2- The user fills in the form and selects the image on the client side. When the user clicks the submit button, make one call to save the user in the database and start the upload with the credentials. If possible, keep a predictable path for the upload, such as /users/:userId for a user upload. This depends heavily on your use case.
3- When the upload finishes, make a server call to acknowledge it and store a flag in the database.
The advantages of this approach are:
You completely offload file operations, which are pretty heavy on I/O, from your server and distribute that load across all clients.
If you want to post-process the files after upload, you can easily integrate this with a serverless platform, do the processing there, and offload that as well.
You can easily offer users a retry mechanism in case the file upload fails; they won't need to refill the form, just upload the image/file again.
You don't need to expose the upload URL directly to the client, as you are using temporary credentials.
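The three steps above can be sketched without any cloud SDK. In the sketch below an HMAC-signed, expiring ticket stands in for a provider's temporary credential or signed URL, and a Map stands in for the database; SECRET, the ticket shape and all function names are assumptions made for illustration, not a real provider API.

```javascript
const crypto = require('node:crypto');

// Hypothetical server-side secret; a real provider signs with its own credentials.
const SECRET = 'replace-me';
const users = new Map(); // stand-in for the database

function sign(key, expiresAt) {
  return crypto.createHmac('sha256', SECRET).update(`${key}:${expiresAt}`).digest('hex');
}

// Step 1: hand the client a short-lived permission to upload to one predictable key.
function issueUploadTicket(userId, now = Date.now(), ttlMs = 5 * 60 * 1000) {
  const key = `users/${userId}/avatar`;
  const expiresAt = now + ttlMs;
  return { key, expiresAt, signature: sign(key, expiresAt) };
}

// What the storage side would check before accepting the bytes.
function isTicketValid(ticket, now = Date.now()) {
  const expected = Buffer.from(sign(ticket.key, ticket.expiresAt));
  const given = Buffer.from(String(ticket.signature));
  return now < ticket.expiresAt
    && given.length === expected.length
    && crypto.timingSafeEqual(given, expected);
}

// Step 2: save the record straight away, flagged as still waiting for its image.
function saveUser(userId, profile) {
  users.set(userId, { ...profile, avatarKey: null, avatarUploaded: false });
}

// Step 3: the client reports success; the server records the flag.
function acknowledgeUpload(userId, key) {
  const user = users.get(userId);
  if (!user || key !== `users/${userId}/avatar`) return false;
  user.avatarKey = key;
  user.avatarUploaded = true;
  return true;
}
```

Because the user record exists before the upload finishes, a failed upload only requires a new ticket and another upload attempt, not a second form submission.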
If the images matter a lot in your app, then ideally you should not complete the transaction until the image is saved. Create the object in your code that you will eventually insert into MongoDB, start the upload of the image to the cloud, and then add the link to this object. Only then insert the object into MongoDB in one go. Do not make repeated calls. If anything fails before that point, raise an error and catch the exception.
There are several possible answers.
If you are working with files larger than 16 MB, go with GridFS and multer.
If your files are smaller than 16 MB, you can instead try a converter that changes a JPEG or PNG image into a format that can be saved directly in a MongoDB document; this is an easy alternative to GridFS.
See the converter's GitHub repo for more details.
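The 16 MB figure comes from MongoDB's limit on the size of a single BSON document. A minimal sketch of the decision follows, assuming some headroom for the other fields in the document and noting that base64 text is about a third larger than the raw bytes; METADATA_HEADROOM and the function names are illustrative choices.

```javascript
// MongoDB caps a single BSON document at 16 MiB.
const MAX_BSON_BYTES = 16 * 1024 * 1024;
// Assumed allowance for the other fields stored next to the image bytes.
const METADATA_HEADROOM = 64 * 1024;

// Base64 text is about a third larger than the bytes it encodes.
function base64Length(byteLength) {
  return Math.ceil(byteLength / 3) * 4;
}

// Decide whether an image fits inside one document or needs GridFS.
function chooseStorage(byteLength, { asBase64 = false } = {}) {
  const stored = asBase64 ? base64Length(byteLength) : byteLength;
  return stored + METADATA_HEADROOM <= MAX_BSON_BYTES ? 'document' : 'gridfs';
}
```

Note that an image stored as a base64 string reaches the limit at roughly 12 MB of original data, whereas one stored as binary gets close to the full 16 MB.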

For a web app that allows simple image uploads, how should I store the images? Confused about file system vs. cdn

Every search result says something about storing the images in the file system but store the paths in the database, but I'm not sure exactly what "file system" means. Would that mean you have something like:
/public (assets)
    /js
    /css
    /img
/app (frontend)
/server (backend)
and you'd upload directly to that /public/img directory?
I remember trying something like that in the past with a Node.js app hosted on Heroku, and it wouldn't let me. I had to set up Amazon S3 and upload the images THERE, which leads to my confusion.
Is using something like Amazon S3 the usual practice or do people upload directly to the /img directory (assuming this is the "file system"?) and it just happened to be the case that Heroku doesn't allow this but other hosts do?
I'd characterize the pattern as "store the data in a blob storage service, store a pointer in your database". The uploaded file is the "blob" - once it has left the user's computer and filesystem, is it really a file anymore? :) On the server, a file system can store that "blob". S3 can store that blob. In the first case, you are storing a path. In the second case, you are storing the URL to the S3 object. A database could even store that blob (not at all recommended, though...)
In any case, the question to ask is: "what happens when I need two app servers to support my traffic?". Wherever that blob goes, both app servers need access to it.
In a data center under your control, there are many ways to share a filesystem across servers - network attached storage (NFS- or SMB-mounted volumes), or storage area networks (iSCSI, Fibre Channel). With more limited network/hardware configuration options in cloud-based Infrastructure/Platform-as-a-Service providers, the de facto standard is S3 because it is inexpensive, reliable, easy to use, and can completely offload serving the file from your servers.
For Heroku, though, you don't have much control over the file system. And, know that the file system for each of your dynos is "ephemeral" - it goes away when the dyno restarts. Which will happen when your app goes idle, or every 24 hours, whichever comes first. So that forces the choice a little.
Final point - S3 comes with the ancillary benefit of taking the burden of serving the blob off of your servers. You can also store files directly to S3 from the browser, without routing it through your app (see https://devcenter.heroku.com/articles/s3-upload-node). The benefit in both cases is that those downloads/uploads can take up lots of your application's precious time for stuff that's pretty rote.
Uploading directly to a host file system is generally not a best practice. This is one reason services like S3 are so popular.
If you're using the host file system and ever need more than one instance of a server, the file systems will grow out of sync. Imagine one user uploads 'foo.jpg' to server A (A/app/uploads) and another uploads 'bar.jpg' to server B (B/app/uploads). When either of these images is later requested, the request has a 50% chance of failing, depending on whether the load balancer routes the request to server A or server B.
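The failure mode is easy to reproduce in a few lines. This toy simulation assumes a strict round-robin load balancer (real balancers vary) and models each server's local uploads directory as a Set; every name in it is made up for the illustration.

```javascript
// Two app servers, each with its own local uploads directory.
const servers = {
  A: new Set(['foo.jpg']), // foo.jpg was uploaded through server A
  B: new Set(['bar.jpg']), // bar.jpg was uploaded through server B
};

// A round-robin balancer that alternates between the servers.
function makeBalancer(names) {
  let next = 0;
  return () => names[next++ % names.length];
}

// Returns the HTTP status a client would see for each of `count` requests.
function requestFile(file, count) {
  const pick = makeBalancer(Object.keys(servers));
  const statuses = [];
  for (let i = 0; i < count; i++) {
    statuses.push(servers[pick()].has(file) ? 200 : 404);
  }
  return statuses;
}
```

Half of the requests for either file come back as 404, purely because of which server the balancer happened to pick. With shared storage such as S3, both servers would resolve the same object.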
There are several ancillary benefits to avoiding the host filesystem. For instance, you can set the filesystem serving your app to read-only for increased security. Files are a form of state, and stateless web servers allow you to do things like blow away one instance and deploy another instance to take over its work.
You might find this of help:
https://codeforgeek.com/2014/11/file-uploads-using-node-js/
I used multer in my Node.js server file to handle uploads from the front end. Basically, I had an HTML form that submitted the image to the server file, where it was handled by multer. This saved it in the file system (to answer your question concretely: yes, this was to something like the /img directory right in your project file structure). My application is running on Heroku, and this feature works there as well. However, I would not recommend using the file system to store your images like this (I doubt you will have enough space for a large number of images/files); using AWS storage or a DB would be better.

Expressjs File Upload Customization

Expressjs has bodyParser middleware which can handle file-uploads and can even store them in a directory given in options. But in my app I want to store the files in Amazon S3, so I basically want to stream the file straight to S3 without having to store it locally at all.
But the problem is validating the file. How can I be sure that these files are all images? Checking the content type isn't a good enough option because it can be faked. Is it OK to do the validation after streaming the file to S3? I am asking from a security point of view.
After storing the image, I need to retrieve it to create thumbnails. How can I do that asynchronously, after sending the response to the file upload?
You have contradictory goals of not wanting to store it locally during upload but then also wanting to download it needlessly again to make thumbnails. If you want to go for technical slickness awards, you can simultaneously stream the file upload request body to a local temporary file as well as S3. Or you can do what the rest of the industry does and store it in a local temporary file and then thumbnail it, and then upload all sizes to S3. Either of these approaches alleviates any need to immediately download it from S3 to make thumbnails.
How exactly do you intend to validate that it's really an image? You could look at the first chunk of file data and validate for the file type's magic number if that gives you warm fuzzies, but ultimately it's untrusted user data. The second half of the supposed image file could be virus code, and the magic number is just as easily faked as the Content-Type header. Sounds like your security concerns are mostly driven by FUD as opposed to specific threats you intend to defend against. As long as you don't take the user's uploaded data, mark it executable and run it as root on your server, any non-image data is just going to be corrupt and fail to render correctly in a browser (and/or cause your thumbnailer program to exit with an error or perhaps crash in an extreme case).
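For completeness, a magic-number check on the first chunk looks roughly like this. It covers only JPEG, PNG and GIF, and, as noted above, it says nothing about the rest of the file; the function name sniffImageType is illustrative.

```javascript
// Leading bytes ("magic numbers") of a few common image formats.
const SIGNATURES = [
  { type: 'image/jpeg', bytes: [0xff, 0xd8, 0xff] },
  { type: 'image/png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { type: 'image/gif', bytes: [0x47, 0x49, 0x46, 0x38] }, // "GIF8" (GIF87a / GIF89a)
];

// Inspect the first chunk of an upload; returns a MIME type or null.
function sniffImageType(firstChunk) {
  for (const { type, bytes } of SIGNATURES) {
    if (firstChunk.length >= bytes.length && bytes.every((b, i) => firstChunk[i] === b)) {
      return type;
    }
  }
  return null;
}
```

Since only the first few bytes are needed, the check can run on the first chunk of the request stream before anything is forwarded to S3.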
Regarding validation, can I just try to create a thumbnail and, if I can't, treat it as an invalid image and delete it? Is this approach fine?
Most of the time, yes. There will be edge cases where your thumbnailer cannot process an image but a browser can as thumbnailers are not perfect and some images are partially corrupt. For example, I have found some animated GIFs that render and animate fine in a web browser but graphicsmagick crashes trying to process them. Not sure there's anything that can be done about those 0.01% edge cases.
And for the upload part, can I send a response to the user and then carry on with creating the thumbnail and storing it in S3?
Yes, that is generally the best approach, so the user knows their upload succeeded. Image processing is usually architected as a "work queue" model: you just record that there's work to do and then proceed, and a separate process or processes take work off the queue and complete it.
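A minimal in-memory sketch of that work-queue shape follows. A real deployment would use a durable queue (a database table, a message broker, and so on) and a separate worker process; the array, the retry limit and the makeThumbnail callback here are assumptions for illustration.

```javascript
// In-memory stand-in for a durable queue.
const queue = [];

// Called by the upload handler: record the work, then respond to the user.
function handleUpload(imageId) {
  queue.push({ imageId, attempts: 0 });
  return { status: 202, imageId };
}

// Called by a separate worker: take jobs off the queue and process them.
// `makeThumbnail` is a hypothetical function that throws on unreadable images.
function drainQueue(makeThumbnail, maxAttempts = 3) {
  const failed = [];
  while (queue.length > 0) {
    const job = queue.shift();
    try {
      makeThumbnail(job.imageId);
    } catch (err) {
      job.attempts += 1;
      if (job.attempts < maxAttempts) queue.push(job); // retry later
      else failed.push(job.imageId);                   // give up on this image
    }
  }
  return failed;
}
```

The list returned by drainQueue is where the "try to thumbnail it and delete it if that fails" policy discussed above would be applied.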

Resources