Storing big text data in database [closed] - node.js

I am trying to build a blogging site (sort of). Users can write big blogs (or text) and also have the facility to customise them, with fonts, sizes, colours of text, etc. (kind of like posts on Stack Overflow, and a little more). I am looking to use MongoDB or Couchbase for the database part. Now I am confused about a few things.
Where should I store the blogs or posts: in the database or in text files? If in the database, how will I store the fonts, sizes, and colours (a user can have different fonts and sizes for different parts of a post)? The posts can sometimes be very big, so is it advisable to store such large texts in a database? I see the easier option being to store them as text files, but I am worried about the performance of the site, since loading text files can be slow on websites. Just for knowledge's sake, how does Google store Google Docs files?
Should I use a different database that is better suited to handling the kind of things I mentioned?
Full-text search of the posts is not a feature I am looking at right now, but it might be later, so take that into consideration for your answer as well.
Please help me.

Honestly, MongoDB has been the best database for our Node.js projects. It originally had a 4 MB maximum BSON document size; that was raised to 8 MB and is now 16 MB in the latest versions. This is actually a fair amount of text: roughly 16 million single-byte characters in a 16 MB document (though that includes the BSON overhead).
Be aware that you are able to split up text into separate BSON documents very easily using GridFS.
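A minimal sketch of that with the official `mongodb` Node.js driver's GridFSBucket (the database, bucket, and file names are placeholders, not anything from the question):

```js
// Sketch only: streaming a large post body into GridFS and reading it back.
// GridFS splits the content into chunk documents automatically.
const { MongoClient, GridFSBucket } = require('mongodb');

async function main() {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const db = client.db('blog');
  const bucket = new GridFSBucket(db, { bucketName: 'posts' });

  // Write the post body as a GridFS file.
  await new Promise((resolve, reject) => {
    const upload = bucket.openUploadStream('post-123.html');
    upload.on('finish', resolve).on('error', reject);
    upload.end(Buffer.from('<p style="color:red">A very long post...</p>'));
  });

  // Read the chunks back and reassemble the text.
  const chunks = [];
  for await (const chunk of bucket.openDownloadStreamByName('post-123.html')) {
    chunks.push(chunk);
  }
  console.log(Buffer.concat(chunks).toString('utf8'));

  await client.close();
}

main().catch(console.error);
```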
I saw you were entertaining the idea of using flat files. While this may be easy and fast, you will have a hard time indexing the text for later use. MongoDB has the ability to index all your text and implementing search will be a fairly easy feature to add.
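For example, a sketch of what that could look like with a text index and the $text operator (the `posts` collection and `title`/`body` fields are assumptions; `db` is a connected handle like the one in the previous sketch):

```js
// Sketch: a text index on the post fields plus a $text query with
// relevance scoring.
async function searchPosts(db, terms) {
  const posts = db.collection('posts');

  // Safe to call repeatedly; MongoDB ignores it if the index already exists.
  await posts.createIndex({ title: 'text', body: 'text' });

  return posts
    .find({ $text: { $search: terms } })
    .project({ score: { $meta: 'textScore' } })
    .sort({ score: { $meta: 'textScore' } })
    .toArray();
}
```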
MongoDB is pretty fast and I have no doubt it will be the fastest database solution for you. Development in Node.js + MongoDB has taken months off projects for my firm compared to SQL-based databases. I have also seen some pretty impressive performance reviews for it. Keep in mind that those reviews are from last year; I have seen even more impressive ones since, but this was what I could find easily today.

Related

Does storing object / JSON data in chunks improve performance in MongoDB?

I have an application API written in Node.js that uses MongoDB as its database. Part of this system is a messaging service, consisting of 'servers', which have 'channels', similar to the structure of Discord.
Each channel stores an array of references, which correspond to messages.
This is so that it's not all one monolithic array of objects. My question is: will storing this array in chunks (for example, by date) improve read/write performance on the database?
I assume it would improve memory usage, which is important for this project, but I don't know much about MongoDB performance and speed.
If I formatted this wrong, I'm sorry; I don't normally use Stack Overflow, so I don't really know the formatting rules beyond the basics.
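The chunking described in this question is essentially MongoDB's bucket pattern. A rough sketch of a per-day message bucket (all names are hypothetical, and `db` is assumed to be a connected handle from the official `mongodb` driver):

```js
// Sketch of the bucket pattern: one document per channel per day holding
// that day's messages, instead of one monolithic array per channel.
async function appendMessage(db, channelId, message) {
  const day = new Date().toISOString().slice(0, 10); // "YYYY-MM-DD"
  await db.collection('messageBuckets').updateOne(
    { channelId, day },
    { $push: { messages: message }, $inc: { count: 1 } },
    { upsert: true }
  );
}

// Reading one day's messages touches one small document instead of the
// whole channel history.
async function messagesForDay(db, channelId, day) {
  const bucket = await db.collection('messageBuckets').findOne({ channelId, day });
  return bucket ? bucket.messages : [];
}
```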

Is it advisable to use short field names in a schema in Mongoose? [duplicate]

This question already has answers here:
Is shortening MongoDB property names worthwhile?
(7 answers)
Perhaps related to the unanswered question I asked here yesterday is the question of whether it is advisable to use short field names in a Mongoose schema. According to the article here, longer field names do result in a larger database on the server's hard disk, as well as in any cache used in memory.
Is that true? Why didn't the designers of Mongoose use some kind of mapping mechanism for the field names to save space in the database?
Yes, property names do increase the size of documents, as the article mentions. If I were you, I honestly wouldn't worry about it. Mongo is a pretty liberal database as far as resources are concerned; it's not known for conserving disk space or RAM. If you have severely limited resources, MongoDB is probably not the database you want. Most people don't care that much about space, since it only becomes an issue at a very large scale, where Hadoop-based databases become much better alternatives to Mongo. As for why the devs didn't implement some sort of mapping: who knows? Probably performance considerations, and the fact that, as I mentioned, it wasn't designed for deployments where saving a few bytes per document would be a deal breaker.
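For what it's worth, Mongoose has since gained a per-field alias option (4.10+), which gives a form of this mapping: short names on disk, readable names in code. A sketch with made-up field names:

```js
// Sketch: short field names are stored, aliases are used in application code.
const mongoose = require('mongoose');

const userSchema = new mongoose.Schema({
  n:   { type: String, alias: 'name' },      // stored as "n"
  dob: { type: Date,   alias: 'birthDate' }, // stored as "dob"
});

const User = mongoose.model('User', userSchema);

const u = new User({ name: 'Ada', birthDate: new Date('1815-12-10') });
console.log(u.toObject()); // { n: 'Ada', dob: ..., _id: ... }
```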

Pros and cons of using an Excel file as a database [closed]

I'm looking for detailed answers to the question: what are the pros and cons of using an Excel file as a database?
One of the pros seems to be that users are familiar with Excel and can work with the tables without needing to know about databases. There are however many reasons not to use Excel as a database.
- Even though you can do some validation in Excel, it is no match for any good database program.
- When importing data from an Excel file into, for instance, a SQL database, you often run into problems because of misinterpreted value types.
- When importing dates, the interpretation may also fail.
- Strings like 000234 will most likely be read as numbers and end up as 234.
- As stated before, sharing of the database is very limited.
- One of my main concerns with using Excel as a database is that it is a single file that can easily be copied to various locations, which may leave you with several versions containing different data.
Cons: size/performance, sharing
Pro: none
P.S. If VBA is an issue, why not Access?
I wouldn't really suggest that Excel is, or can properly act like, a database, as it lacks the features, data protection, and security to act as such.
If the reason to use it is ease of use and end-user familiarity, it is quite easy to connect Excel as a front end to a database, using it as a reading and writing device while taking advantage of the speed and stability of a 'true' database.
Pros:
Very familiar
VBA makes it easy to create fairly simple-to-use sheets
Lots of functions to manipulate data
Cons:
Slow and VERY clunky with large data sets
Hard to validate imported data
Prone to crashing with large datasets
Lacks the ability to use intelligent queries or views
Many more..

Ideas for full text search MongoDB & node.js [closed]

I am developing a search engine for my website and I want to add the following features to it:
Full text search
Did you mean feature
Data store in MongoDB
I want to build a RESTful backend. I will be adding data to MongoDB manually and it will be indexed (which should I prefer: MongoDB indexing or some other search indexing library like Lucene?). I also want to use Node.js. This is what I found from my research. Any ideas about the architecture would be appreciated.
Thanks in advance
I'm using Node.js / MongoDB / Elasticsearch (based on Lucene). It's an excellent combination. The flow is stunning as well, since all three packages (can) deal with JSON as their native format, so there is no need for transforming DTOs, etc.
Have a look:
http://www.elasticsearch.org/
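A rough sketch of that flow (assuming a recent 8.x `@elastic/elasticsearch` client and the official `mongodb` driver; the index, collection, and field names are placeholders):

```js
// Sketch: mirror a MongoDB document into Elasticsearch, then search it.
// Everything stays JSON end to end: Mongo document in, ES document out.
const { MongoClient } = require('mongodb');
const { Client } = require('@elastic/elasticsearch');

async function main() {
  const mongo = await MongoClient.connect('mongodb://localhost:27017');
  const posts = mongo.db('blog').collection('posts');
  const es = new Client({ node: 'http://localhost:9200' });

  const post = await posts.findOne({});
  await es.index({
    index: 'posts',
    id: post._id.toString(),
    document: { title: post.title, body: post.body },
  });

  const result = await es.search({
    index: 'posts',
    query: { match: { body: 'full text search' } },
  });
  console.log(result.hits.hits);

  await mongo.close();
}

main().catch(console.error);
```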
I personally use Sphinx and MongoDB; it is a great pair and I have no problems with it.
I back MongoDB onto a MySQL instance, which Sphinx quickly indexes. You should never need to actively index _id (I have no idea who is going to know the _id of one of your objects to search for), so you can just stash it in MySQL as a string field and it will work just fine.
When I pull the results back out of Sphinx, all I do is convert them (in PHP) to a new MongoId, or in your case an ObjectId, and then simply query on that object id for the rest of the data. It couldn't be simpler: no problems, no hassle, nothing. And I can offload reindexing the delta indexes to my MySQL instance, keeping my MongoDB instance focused on what it needs to do: serving up tasty data for the user.
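In Node.js, the last step described above would look roughly like this (a sketch; the Sphinx query itself is omitted, the id is assumed to come back as a string, and the database and collection names are made up):

```js
// Sketch: Sphinx returns the stored _id as a string; convert it back to
// an ObjectId and fetch the full document from MongoDB.
const { MongoClient, ObjectId } = require('mongodb');

async function fetchFromSphinxHit(idString) {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  try {
    const posts = client.db('blog').collection('posts');
    // Equivalent of PHP's `new MongoId($id)` mentioned in the answer.
    return await posts.findOne({ _id: new ObjectId(idString) });
  } finally {
    await client.close();
  }
}
```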

Lucene or Mysql Full text search [closed]

Nowadays, when starting a web/mobile app project in which search is going to be an important factor, is it better to go with Lucene from the start or to quickly deploy a MySQL-based solution and hope for the best?
I faced the same decision in November 2010. I'm a fan of MySQL and tried to build a search application on MySQL first, which worked well...
...and fast (or so I thought): searching 200,000 documents took no more than 2-3 seconds.
I avoided spending time on Lucene/Solr, because I wanted to use that time for developing the application. And Lucene was new to me... I didn't know whether it was good enough; I didn't really know what it was...
Finally: You can't change the habits of a lifetime.
However, I ran into various problems with fuzzy search (which is difficult to implement in MySQL) and "more like this" (which has to be coded from scratch in an application using MySQL, or you can simply use Solr's "more like this" feature out of the box).
Eventually the number of documents rose to a million, and MySQL now needs more than 15 seconds to search them.
So I decided to start with Lucene, and it felt like opening a door to a new world.
Lots of features (which I would otherwise have had to code into the application) are now provided by Solr and work out of the box. The full-text searches are much, much faster: less than 50 ms across 1 million documents, and less than 1 ms when cached.
So the invested time has paid off.
So if you are thinking about full-text search: go with Lucene if you have more than a small amount of data.
By the way, I'm using a hybrid construct: the data is held in MySQL, and Lucene is only an index with (nearly) no stored data (to keep the index small and fast).
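A minimal sketch of that hybrid in Node.js (assuming Node 18+ for the global fetch, the `mysql2` package, and a Solr core; the core, table, and field names are made up, and the answer's own setup may differ):

```js
// Sketch: Solr holds only a small index; MySQL holds the real data.
const mysql = require('mysql2/promise');

async function search(term) {
  // 1. Ask Solr (which stores almost nothing) for matching document ids.
  const url = 'http://localhost:8983/solr/documents/select?' +
    new URLSearchParams({ q: `body:"${term}"`, fl: 'id', rows: '20', wt: 'json' });
  const solr = await (await fetch(url)).json();
  const ids = solr.response.docs.map((d) => d.id);
  if (ids.length === 0) return [];

  // 2. Load the full rows from MySQL, where the data actually lives.
  const db = await mysql.createConnection({ host: 'localhost', user: 'app', database: 'blog' });
  const [rows] = await db.query('SELECT * FROM documents WHERE id IN (?)', [ids]);
  await db.end();
  return rows;
}
```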
Generically speaking, if you are going to have full-text searches, you will almost surely need Lucene or Sphinx + MySQL (or Lucene + MySQL, storing the indexable fields in Lucene and returning an id for a MySQL row). Either of them is an excellent choice.
If you are only going to do "normal" searches (i.e. on integer, char, or date columns), MySQL partitioning will suffice.
You also need to specify what you are going to search for, and how often you will be reindexing your DB (if you are going to reindex a lot, I'd go with Sphinx).
You are asking whether to go with Lucene or MySQL. But Lucene is a library, and MySQL is a server. You should really be deciding between SOLR search engine and MySQL. In that case, the right answer is likely to be both. Manage all the data in MySQL. Run processes to regularly extract changed data, transform it into SOLR search format, and load it into the search engine. Using SOLR is much more straightforward than using Lucene directly, and if you need to modify the behavior in some way, you can still write plugins for SOLR so there is no loss of flexibility.
But it would be the kiss of death to try to manage data with SOLR. The read-edit-update cycle works great with SQL DBs, but it is not what SOLR is about. SOLR is fast, flexible text search. You can stick image URLs into SOLR as a non-indexed field for convenience when preparing search results.
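A minimal sketch of the extract-and-load process described above, assuming Node 18+ (for global fetch), the `mysql2` package, and Solr's JSON update handler; the table, columns, core name, and updated_at bookkeeping are assumptions for illustration:

```js
// Sketch: extract rows changed since the last run and load them into Solr.
const mysql = require('mysql2/promise');

async function syncChangedRows(since) {
  const db = await mysql.createConnection({ host: 'localhost', user: 'app', database: 'blog' });
  const [rows] = await db.query(
    'SELECT id, title, body FROM documents WHERE updated_at > ?',
    [since]
  );
  await db.end();
  if (rows.length === 0) return;

  // Solr's JSON update handler accepts an array of documents.
  const res = await fetch('http://localhost:8983/solr/documents/update?commit=true', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(rows),
  });
  if (!res.ok) throw new Error(`Solr update failed: ${res.status}`);
}
```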
