stream large file upload into database using pg-promise - node.js

I would like to allow my users to upload large files (<1 GB) to my database. I am using a database since storing raw files can be dangerous, and I would like to have a single source of state in my system since it's meant to be serverless.
Now, the VPS I am planning to run it on has limited RAM, and multiple users should of course be able to upload simultaneously.
So in order to not exceed this RAM, I would need to either
stream the image into the database as it is being uploaded by the user,
or first stream it into a file using something like multer and then stream it from the file into PostgreSQL as a BLOB.
So is there a way to do this using pg-promise? That is, stream a file into the database without ever loading the whole thing into RAM?
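For what it's worth, one way to keep memory bounded with plain pg-promise is to insert the upload in fixed-size chunks as they arrive, so only the current chunk is ever held in RAM. This is only a sketch of that idea, not the questioner's setup: the table file_chunks, the connection string, and the helper storeStream are assumptions made here for illustration.

```js
// Assumes a table like:
//   CREATE TABLE file_chunks(file_id text, chunk_index int, data bytea,
//                            PRIMARY KEY(file_id, chunk_index));
const pgp = require('pg-promise')();
const db = pgp('postgres://user:pass@localhost:5432/mydb'); // placeholder connection string

// Writes a readable stream (e.g. the upload stream) to the database one chunk per row.
// Awaiting each INSERT applies back-pressure, so only the current chunk (a Buffer)
// sits in memory at any time; node-postgres maps Buffers to bytea.
async function storeStream(fileId, readable) {
  return db.tx(async t => {
    let index = 0;
    for await (const chunk of readable) {
      await t.none(
        'INSERT INTO file_chunks(file_id, chunk_index, data) VALUES($1, $2, $3)',
        [fileId, index++, chunk]
      );
    }
    return index; // number of chunks written
  });
}
```

The same pattern works for the second option (reading back from a multer temp file with fs.createReadStream). PostgreSQL large objects (e.g. via the pg-large-object package) are another streaming-friendly route, but they require working with the raw client inside a transaction.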

Related

How to send multipart file upload straight to mongodb in node

I can save the file to disk with formidable and then send the file bits to mongo with node, but how can I just handle streaming the file bits directly to mongo?
I don't need GridFS; these are small files. I just want to write them to the normal store.
Use options.fileWriteStreamHandler to set up your own stream, then write to MongoDB if the API accepts a stream.
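A rough sketch of that suggestion, assuming formidable's fileWriteStreamHandler option and the official mongodb driver; the exact handler argument differs between formidable versions, and the URI, database, and collection names here are made up. Since the files are small, buffering each upload inside the custom Writable before a single insertOne is acceptable.

```js
const { Writable } = require('stream');
const formidable = require('formidable');
const { MongoClient } = require('mongodb');

const client = new MongoClient('mongodb://localhost:27017'); // placeholder URI
const files = client.db('mydb').collection('files');         // placeholder db/collection
// Note: call `await client.connect()` during app startup (older driver versions require it).

// Returns a Writable that buffers the (small) upload and stores it as one document.
function mongoWriteStream(filename) {
  const chunks = [];
  return new Writable({
    write(chunk, _enc, cb) { chunks.push(chunk); cb(); },
    final(cb) {
      files.insertOne({ filename, data: Buffer.concat(chunks) })
        .then(() => cb(), cb);
    }
  });
}

// Request handler: formidable parses the multipart body and pipes each file
// into the Writable returned by fileWriteStreamHandler instead of onto disk.
function handleUpload(req, res) {
  const form = formidable({
    // the handler's argument differs between formidable versions; guard for it
    fileWriteStreamHandler: file =>
      mongoWriteStream(file && file.originalFilename ? file.originalFilename : 'upload')
  });
  form.parse(req, err => {
    if (err) return res.writeHead(500).end('upload failed');
    res.end('stored');
  });
}
```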

When uploading file chunks are they guaranteed to be received in the same order?

JavaScript front end, ServiceStack back end.
I'm using the latest version of dropzone.js to upload large image files (up to 50GB). The file is broken into many chunks and the server receives them one by one. When I receive the last chunk I know I have the complete file and can begin processing. But what if the chunks don't arrive in order? Once the data leaves the client is it possible, due to internet routing, that the chunks could be received out of order?
The server side (ServiceStack) has no persistence between calls (that I'm aware of), so I can't count chunks received (at least not without writing to a database or something).
Is this something I need to be concerned with and what is the best way to handle it?
First you need to know how the file chunks are sent in order to know how to handle them, e.g. whether they're using standard HTTP multipart/form-data file uploads, in which case they'll be available in ServiceStack's Request.Files collection, or some other way like sending raw bytes, in which case your Request DTO will need to implement IRequiresStream to access the raw unserialized bytes.
The server can't guarantee how clients will send the chunks. If it's guaranteed that clients only send them sequentially, then the server can assume that's how they will always arrive; but for all the server knows, the chunks could be sent concurrently, unordered and in parallel, which it may need to support.
I'd personally avoid uploading files in chunks over independent HTTP API requests as it adds a tonne of complexity, but if the files can be up to 50GB then you're going to need to come up with a bespoke solution.
You would handle the chunks just as you would any chunked data (e.g. imagine if you had to stitch responses from several services together manually). Because the files can be so large, storing them in memory (like a ConcurrentDictionary) is not an option. If you have access to a cloud storage service, you may want to upload the temporary chunks there; otherwise you'd need to store them on disk. Ideally your solution should take advantage of the final data storage solution where the file will persist.
Otherwise, a naive solution would be for the server to generate a unique key (like a Guid) before the client uploads the file, which the client would then send along with the chunk index and the total number of chunks. Each service would write that chunk directly to disk, first to a temp file path (Path.GetTempFileName()), then, after the file is written, move it to a path like /uploads/{unique-id}/{chunk-index}.dat.
At the end of every chunk upload request, you can check whether your /uploads/{unique-id}/ directory has all the chunks; if it does, start the process of stitching them together into a single file. A more robust way, though, would be for the client to initiate the file stitching after it has finished uploading all the chunks; that way, if the stitch fails, you can just manually call the service that stitches the files again instead of needing the client to re-upload the file.
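The original answer targets ServiceStack/C#, but the save-then-stitch idea is language-agnostic. Here is a rough Node equivalent for illustration only; the uploads directory layout and helper names are invented, not from the answer.

```js
const fs = require('fs');
const fsp = require('fs/promises');
const path = require('path');

// Save one chunk: write to a temp name first, then rename, so a chunk only
// appears under its final name once it has been fully written.
async function saveChunk(uploadId, chunkIndex, chunkStream) {
  const dir = path.join('uploads', uploadId);
  await fsp.mkdir(dir, { recursive: true });
  const tmpPath = path.join(dir, `${chunkIndex}.part`);
  await new Promise((resolve, reject) =>
    chunkStream.pipe(fs.createWriteStream(tmpPath))
      .on('finish', resolve)
      .on('error', reject));
  await fsp.rename(tmpPath, path.join(dir, `${chunkIndex}.dat`));
}

// Stitch: once all totalChunks files exist, append them in index order,
// streaming each chunk so memory use stays bounded.
async function stitchChunks(uploadId, totalChunks, outputPath) {
  const dir = path.join('uploads', uploadId);
  const out = fs.createWriteStream(outputPath);
  for (let i = 0; i < totalChunks; i++) {
    await new Promise((resolve, reject) => {
      const src = fs.createReadStream(path.join(dir, `${i}.dat`));
      src.on('error', reject);
      src.on('end', resolve);
      src.pipe(out, { end: false }); // keep the output open between chunks
    });
  }
  await new Promise((resolve, reject) => {
    out.on('error', reject);
    out.end(resolve);
  });
}
```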

Streaming multiple files in one response- Node.js

In Node.js how can I stream multiple files in one response stream? I want to make a single API call from the browser application, and in the response I should be able to send multiple files from the server. How can I do this in Node.js? Please share some hints or code samples.
The files are stored as blob data in MongoDB and I use the module called gridfs-stream to read the files.
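No answer is included in this excerpt; one common approach is to bundle the files into a single archive stream, for example with the archiver package. The sketch below assumes an Express-style response and gridfs-stream read streams; the ids, entry names, and function name are illustrative only.

```js
const archiver = require('archiver');

// Zips several GridFS files into one response stream; entries are streamed
// out as they are appended, so nothing is buffered in full.
function sendFiles(res, gfs, fileIds) {
  res.set('Content-Type', 'application/zip');
  res.set('Content-Disposition', 'attachment; filename="files.zip"');

  const archive = archiver('zip');
  archive.on('error', err => res.destroy(err));
  archive.pipe(res);

  fileIds.forEach((id, i) => {
    archive.append(gfs.createReadStream({ _id: id }), { name: `file-${i}.bin` });
  });
  archive.finalize();
}
```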

How to serve binary (bytea) data from postgres using node?

I'm testing out Postgres's binary abilities by storing some mp3 data in a table. I've read you're supposed to store them in an external filesystem like S3, but for various reasons I don't want to do that right now.
So, for now I'd like to test storing files in the DB. The mp3 files are TTS mp3 files from a third party, and I've stored them in a Postgres table. This is working OK. But how do I serve them to the client? In other words:
The client requests the files over HTTP.
Node requests the records (one or many) from the database via pg-promise.
The data arrives from the DB to Node in binary format.
??? Do I have to convert it to an mp3 file before sending? Can I send the binary data directly? Which would be better?
The client receives the file(s).
The client queues the files in order for playing the audio.
My main question is whether I need to convert the binary record I receive from Postgres before sending it, and if so, how to do that.
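The excerpt stops at the question, but as far as bytea goes, node-postgres hands the column back as a Node Buffer, so no conversion is needed; it can be sent as-is with the right Content-Type. A minimal sketch, assuming Express, pg-promise, and a hypothetical tts_audio(id, mp3 bytea) table:

```js
const express = require('express');
const pgp = require('pg-promise')();

const app = express();
const db = pgp('postgres://user:pass@localhost:5432/mydb'); // placeholder connection string

// GET /audio/:id — serves one stored mp3 straight from the bytea column.
app.get('/audio/:id', async (req, res) => {
  try {
    const row = await db.one('SELECT mp3 FROM tts_audio WHERE id = $1', [req.params.id]);
    res.set('Content-Type', 'audio/mpeg');
    res.send(row.mp3); // bytea arrives as a Buffer, so it can be sent as-is
  } catch (err) {
    res.sendStatus(404); // db.one rejects when no row is found
  }
});

app.listen(3000);
```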

Node.js files on multiple servers

I have 3 servers running Node.js and a 4th server for the nginx load balancer (reverse proxy) and front-end code. Everything works perfectly, but I want to manage file uploads. How can I manage this under this infrastructure?
For example: if one of these 3 Node.js servers receives a file upload and stores it locally, how can I access that file?
The Node.js servers are under the example.com/api link, but because of the reverse proxy a request goes to one server, and I don't know which server a particular file is on.
Should I upload the file to all Node.js servers?
If you have three separate Node.js servers that are physically on separate machines with their own storage, then the usual way to share access to files is to have some shared storage that all three servers can access; then, when any Node.js server takes a file upload, it puts the data on the shared storage where everyone can access it.
If your three separate Node.js servers are just separate processes on the same box, then they can all already access the same disk storage.
When sharing storage from separate processes or servers, you will have to make sure that your file management code is concurrency-safe: proper file locking when writing, concurrency-safe mechanisms for generating unique file names, safe caching, etc...
Or, you could use a database server for storage that all Node.js servers have access to, though if you're just storing data files, don't have lots of metadata associated with them that you want to query, and shared file system access is all you really need, then a database may not be the most efficient means of storing the data.
Should I upload the file to all Node.js servers?
Usually not, since that's just not very efficient. Typically, you would upload once to a shared storage location, server, or database that all servers can access.
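To make the shared-storage point concrete, a small sketch: each Node.js server writes uploads to a common mount (NFS, or swap in S3 or similar) under a collision-free name, so any server behind the proxy can later serve the file. The mount path and helper name are invented for illustration.

```js
const fs = require('fs');
const path = require('path');
const crypto = require('crypto');

const SHARED_DIR = '/mnt/shared-uploads'; // a mount every app server can reach (assumption)

// Streams an incoming upload to shared storage under a unique, concurrency-safe name.
function saveToSharedStorage(uploadStream, originalName) {
  const safeName = `${crypto.randomUUID()}${path.extname(originalName)}`;
  const target = path.join(SHARED_DIR, safeName);
  return new Promise((resolve, reject) => {
    uploadStream
      .pipe(fs.createWriteStream(target, { flags: 'wx' })) // 'wx' fails if the name already exists
      .on('finish', () => resolve(safeName)) // store safeName in your DB so any server can find the file
      .on('error', reject);
  });
}
```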
