How to save documents (PDF, etc.) and images in SQLite - node.js

I am using the Sequelize ORM with Node/Express.js and would like to save images or documents in a SQLite database.

Have you checked out the BLOB column datatype? It allows you to store binary data, so files of any type can be kept as blobs in the database.
By default a SQLite BLOB field can hold about 1 GB (the SQLITE_MAX_LENGTH limit defaults to 1,000,000,000 bytes). The limit can be raised by compiling SQLite with a larger SQLITE_MAX_LENGTH.

Related

How to create a PDF from a huge MongoDB dataset (about 10 million rows)

I want to create a PDF from a huge MongoDB dataset (about 10 million rows).
There is no specific data format; you can assume an employee database.
I am using the MEAN stack (open to new tech if applicable).
Approaches tried:
Using a Node.js library like pdfkit to convert the MongoDB result (an array of objects) to PDF with a for loop over the result (this causes a heap out-of-memory issue and is very slow).
Creating a temporary collection -> doing a mongoexport to CSV -> CSV to HTML using awk -> HTML to PDF using the wkhtmltopdf tool (this is still very slow).
After I run the Mongo query, I cannot store the result in a variable because that causes a heap out-of-memory issue, so I cannot do any further processing on the data.
I can query using limit and skip to get the data in chunks and create HTML and then PDF from it, but that seems like a very slow process.
A possible approach, I think, could be to create small PDFs and then merge them together, or to use streams.
What is the most efficient way to create a PDF from a huge dataset?
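The chunked idea in the question can be sketched in plain Node (names like fetchPages are hypothetical): render one fixed-size page of rows at a time so only that page is ever in memory, and prefer range pagination on an indexed field over skip/limit, which gets slower the deeper you go. The generator below simulates the paged query:

```javascript
// Sketch of batch processing for a huge export. fetchPages stands in for a
// MongoDB query; in a real setup each page would come from something like
//   db.collection.find({ _id: { $gt: lastId } }).limit(pageSize)
// so the server never materializes the full result set.
function* fetchPages(totalRows, pageSize) {
  for (let start = 0; start < totalRows; start += pageSize) {
    const end = Math.min(start + pageSize, totalRows);
    // Fake rows standing in for query results.
    yield Array.from({ length: end - start }, (_, i) => ({ id: start + i }));
  }
}

let pagesRendered = 0;
let rowsRendered = 0;
for (const page of fetchPages(10000, 1000)) {
  // A real renderer (e.g. pdfkit writing to a file stream) would append
  // this page here, then drop it before fetching the next one.
  pagesRendered += 1;
  rowsRendered += page.length;
}
console.log(pagesRendered, rowsRendered); // 10 10000
```

Because each page is released before the next is fetched, peak memory is one page of rows rather than the whole collection.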

How to save unlimited data to a MongoDB collection

I am trying to save data from the Coinbase Pro API. The loop runs until all data is fetched and saved to a MongoDB collection, but the main issue is that when we reach 16 MB, the script fails.
I need a viable solution to save unlimited data to a MongoDB collection and make use of it.
MongoDB documents have a maximum size of 16 MB, according to the docs:
"The maximum BSON document size is 16 megabytes.
The maximum document size helps ensure that a single document cannot use excessive amount of RAM or, during transmission, excessive amount of bandwidth. To store documents larger than the maximum size, MongoDB provides the GridFS API..."
(https://docs.mongodb.com/manual/reference/limits/)
It might be worth checking out that GridFS API (but I haven't yet).
Are you trying to insert ONE document that is 16 MB+? Or are you trying to insert MULTIPLE documents that add up to 16 MB+?
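That distinction matters because the fixes differ. If one document is growing past 16 MB, the usual move is to store one record per document and insert the records in batches. A rough sketch of size-budgeted batching, using JSON byte length as a stand-in for BSON size (batchBySize and the 500-byte budget are made up for the demo):

```javascript
// Split an array of small documents into batches that each stay under a byte
// budget, so no single insert payload grows without bound. JSON length only
// approximates BSON size, but it is close enough for budgeting.
function sizeOf(doc) {
  return Buffer.byteLength(JSON.stringify(doc));
}

function batchBySize(docs, maxBytes) {
  const batches = [];
  let current = [];
  let currentBytes = 0;
  for (const doc of docs) {
    const s = sizeOf(doc);
    if (current.length > 0 && currentBytes + s > maxBytes) {
      batches.push(current);       // budget reached: start a new batch
      current = [];
      currentBytes = 0;
    }
    current.push(doc);
    currentBytes += s;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

// Fake ticker records; each batch would be one insertMany() call.
const docs = Array.from({ length: 100 }, (_, i) => ({ i, price: '123.45' }));
const batches = batchBySize(docs, 500);
console.log(batches.length > 1, batches.flat().length); // true 100
```

Each batch would then go to `collection.insertMany(batch)`, so no individual document (and no single request) approaches the 16 MB ceiling.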

Storing PDF files as blobs in a Cassandra table?

I have a task to create a metadata table for my time-series Cassandra DB. This metadata table should store over 500 PDF files, each 5-10 MB in size.
I have thought of storing them as blobs. Can Cassandra do that?
Cassandra isn't a perfect fit for such blobs; at the least, DataStax recommends keeping them smaller than 1 MB for best performance.
But just try it for yourself and do some testing. Problems arise when partitions become large and there are updates in them, because the coordinator then has a lot of work to do joining them.
A simple way to go: store each blob separately as a uuid key-value pair in its own table, and store only the uuid alongside your data. When the blob is updated, insert a new one under a new uuid and update your records. With this trick you never have different (and possibly large) versions of a blob in the same partition, and performance will not suffer as much. I think I read that Walmart did this successfully with images, some of them around 10 MB and some smaller.
Just try it out - if you have Cassandra already.
If not, you might have a look at Ceph or something similar - but that needs its own deployment.
You can serialize the file and store it as a blob. The cost is deserialization when reading the file back; there are many serialization/deserialization libraries that do this efficiently. Another way is to do what @jasim waheed suggested, but that will result in network I/O. So you can decide where you want to pay the cost.

Should I store it in the file system or in a Cassandra table?

I'm new to Cassandra and I want to create a social network website. I would like to know how I should store images: in the file system or in a Cassandra table?
If storing images in a table, how should I structure the table?
Should I store it in the file system or in a Cassandra table?
It depends on the size of your images. Cassandra is a database designed primarily to store structured data; raw files are not structured data.
However, one can still want to use Cassandra for binary blob storage because of its ability to handle multiple data centers and its high availability. That is a valid reason too.
If storing images in a table, how should I structure the table?
If the maximum possible size of your images is around 1-2 MB, you can store the image in a regular blob column like this:
CREATE TABLE images(
    image_id uuid,
    name text,
    size_in_bytes bigint,
    author text,
    ...
    content blob,
    PRIMARY KEY(image_id)
);

// Load the image by id
SELECT * FROM images WHERE image_id=xxx;
Now, if you think the image size can grow arbitrarily, your best bet is to split it manually in your application into fixed-size chunks (say 64 KB, for example) and store all the chunks in a wide partition:
CREATE TABLE images(
    image_id uuid,
    name text static,
    size_in_bytes bigint static,
    author text static,
    ...
    chunk_count int static,
    chunk_id uuid,
    content blob,
    PRIMARY KEY(image_id, chunk_id)
);

// Load all the chunks of the image
// Use an iterator to fetch chunks page by page
SELECT chunk_id, content FROM images WHERE image_id=xxx;
Please notice that in this case all metadata columns (name, size_in_bytes, author, ...) should be static, i.e. stored only once per partition and not repeated for every chunk.
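The chunking scheme above can be sketched on the client side like this (the 64 KB chunk size is the example value from the answer; reassembly is just concatenating the chunks in order):

```javascript
// Split a file into fixed 64 KB pieces; each piece would be one INSERT of
// (image_id, chunk_id, content) into the wide partition, and reading back
// means fetching the chunks in clustering order and concatenating them.
const CHUNK_SIZE = 64 * 1024;

function splitIntoChunks(buffer, chunkSize = CHUNK_SIZE) {
  const chunks = [];
  for (let offset = 0; offset < buffer.length; offset += chunkSize) {
    chunks.push(buffer.subarray(offset, offset + chunkSize));
  }
  return chunks;
}

const image = Buffer.alloc(200 * 1024, 0xab); // fake 200 KB image
const chunks = splitIntoChunks(image);
console.log(chunks.length); // 4 (3 full 64 KB chunks + one 8 KB tail)

const reassembled = Buffer.concat(chunks);
console.log(reassembled.equals(image)); // true
```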
Save the images of a social website in a Cassandra table, unless your file system is as network-distributed, highly available, and scalable as Cassandra.
You can create your own image table with chunks and manage it yourself.
Or you can try https://github.com/simpleisbeauty/SimpleStore, which uses input/output streams to read/write images in Cassandra like a local file system.

MongoDB: How can I store files (Word, Excel, etc.)?

I've yet to look seriously into storing files such as Word or Excel documents in MongoDB, and I want to know: am I able to store whole docx or Excel files in MongoDB and then RETRIEVE them via querying?
Using GridFS, yes.
GridFS is a storage specification. It is not built into the DB but into the drivers.
You can find out more here: http://www.mongodb.org/display/DOCS/GridFS.
Its normal implementation is to break your big documents down into smaller ones and store those parts in a chunks collection, mastered by an fs.files collection which you query for your files.
MongoDB is a document database that stores JSON-like documents (called BSON). The maximum size of a BSON object is 16 megabytes, which may be too little for some use cases.
If you want to store binary data of arbitrary size, you can use GridFS (http://www.mongodb.org/display/DOCS/GridFS+Specification). GridFS automatically splits your documents (or any binary data) into several BSON objects (usually 256 KB in size), so you only need to worry about storing and retrieving complete documents, whatever their sizes are.
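To make the splitting concrete, here is a simplified sketch of what GridFS does under the hood; in practice the driver handles all of this for you (via GridFSBucket in current drivers, where the default chunk size is 255 KB), and the structure below mirrors the fs.chunks documents (files_id, n, data):

```javascript
// Simplified model of GridFS chunking: one fs.files document describes the
// file, and each chunk becomes an fs.chunks document keyed by (files_id, n).
// Reassembly sorts the chunks by n and concatenates their data.
const CHUNK_SIZE = 255 * 1024;

function toGridFSChunks(fileId, buffer) {
  const chunks = [];
  for (let n = 0; n * CHUNK_SIZE < buffer.length; n++) {
    chunks.push({
      files_id: fileId, // points back at the fs.files document
      n,                // chunk ordinal, used to reassemble in order
      data: buffer.subarray(n * CHUNK_SIZE, (n + 1) * CHUNK_SIZE),
    });
  }
  return chunks;
}

const doc = Buffer.alloc(600 * 1024, 1); // fake 600 KB Word document
const chunks = toGridFSChunks('file-1', doc);
console.log(chunks.length); // 3
console.log(Buffer.concat(chunks.map(c => c.data)).equals(doc)); // true
```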
As far as I know, Mongoose doesn't support GridFS. However, you can use GridFS via the native driver's GridStore. Just run npm install mongodb and start hacking!
