Mongoose speed up the search on multiple fields - node.js

I am working on a search feature over Mongoose documents, where I have to search across 250,000 documents.
For this feature I have to add indexes over multiple fields.
Some of the fields in the documents are strings,
and some are multi-level objects.
I have indexed all the possible fields.
Locally I have 100,000 documents, and a search over them takes around 300-400 ms.
But on the server the same search takes around 10-15 seconds to respond.
The search query is built conditionally, but here is a small code snippet:
$and: [
  {
    $or: [
      { 'field1': { $regex: re } },
      { 'field2': { $regex: re } },
      { 'level1.level2.value': { $regex: re } }
    ]
  },
  {
    $and: [
      { lowAge: { $lte: parseInt(age) } },
      { highAge: { $gte: parseInt(age) } },
      {
        $or: [
          { gender: gender },
          { gender: "N/A" }
        ]
      }
    ]
  }
]
Can someone advise me on how I can speed up the search on the server?

To speed this up further, you can use a text index.
But text indexes come with the following storage requirements and performance costs:
Text indexes can be large. They contain one index entry for each unique post-stemmed word in each indexed field for each document inserted.
Building a text index is very similar to building a large multi-key index and will take longer than building a simple ordered (scalar) index on the same data.
When building a large text index on an existing collection, ensure that you have a sufficiently high limit on open file descriptors. See the recommended settings.
Text indexes will impact insertion throughput because MongoDB must add an index entry for each unique post-stemmed word in each indexed field of each new source document.
Additionally, text indexes do not store phrases or information about the proximity of words in the documents. As a result, phrase queries will run much more effectively when the entire collection fits in RAM.
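As a rough sketch of how the query from the question might look with a text index in place of the three $regex clauses: the field names come from the question, while the index definition in the comment and the helper name buildSearchFilter are assumptions for illustration. Note that $text matches whole (stemmed) words rather than arbitrary substrings, so results can differ from the regex version, and a collection can have only one text index.

```javascript
// Assumed index, declared once on the Mongoose schema before the model is compiled:
//   schema.index({ field1: 'text', field2: 'text', 'level1.level2.value': 'text' });

// Hypothetical helper: builds the filter object to pass to Model.find().
function buildSearchFilter(term, age, gender) {
  const ageNumber = parseInt(age, 10);
  return {
    // $text is answered from the text index instead of running three regex matches
    $text: { $search: term },
    lowAge: { $lte: ageNumber },
    highAge: { $gte: ageNumber },
    // same meaning as the $or on gender in the question
    gender: { $in: [gender, 'N/A'] },
  };
}

const filter = buildSearchFilter('john', '30', 'M');
console.log(JSON.stringify(filter));
```

The resulting object could then be passed to something like Model.find(filter).lean(), and running explain() on that query on the server should show whether the text index is actually used.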
Please see the references below:
https://docs.mongodb.com/manual/core/index-text/
https://www.tutorialspoint.com/mongodb/mongodb_text_search.htm
Hope it helps!

Related

Can mongoose batch update based on an array of objects that matches the collection?

I am working on a project in Express/Node, and I am utilizing a MongoDB database that has a collection of Course documents that represent a course in my school system that changes in real-time. The Course documents in my database each look like this:
Course Document
{
courseID: Number,
restrictions: String,
status: String,
}
My program has to check for changes in the school's course system and apply any changes it sees to my private MongoDB database. To accomplish this, I currently have a script that looks at all the courses in the school system and records them in an array of objects, with each object corresponding to a course.
var allCourses =
[
{
courseID: 123456,
restrictions: "A and B",
status: "OPEN"
},
{
courseID: 678990,
restrictions: "A",
status: "FULL",
}
]
The goal now is to be able to go through my database, and skip the documents that are the same as the corresponding javascript object in the array, and update those that are not.
Obviously, I could just iterate through my array with forEach, and update every single course by filtering by 'courseID' and updating both fields one document at a time, but I can foresee that this would take a large amount of time.
I was wondering if there was a batch update function, similar to the insertMany operation, that can take my array of objects and update my database documents that correspond to an object within the array?
These are helpful links
Trying to do a bulk upsert with Mongoose. What's the cleanest way to do this?
https://docs.mongodb.com/manual/reference/method/db.collection.insertMany/
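One possible approach, in the spirit of the bulk upsert link above, is to turn the array into a list of updateOne operations and send them in a single Model.bulkWrite call. The sketch below only builds the operations; the helper name buildCourseUpdateOps, the model name Course, and the choice to upsert missing courses are assumptions, not something stated in the question.

```javascript
// Hypothetical helper: turns the scraped array into bulkWrite operations.
function buildCourseUpdateOps(courses) {
  return courses.map((course) => ({
    updateOne: {
      filter: { courseID: course.courseID },
      update: { $set: { restrictions: course.restrictions, status: course.status } },
      upsert: true, // assumption: insert courses that are not in the database yet
    },
  }));
}

const allCourses = [
  { courseID: 123456, restrictions: 'A and B', status: 'OPEN' },
  { courseID: 678990, restrictions: 'A', status: 'FULL' },
];

const ops = buildCourseUpdateOps(allCourses);
console.log(ops.length);

// With a Mongoose model (assumed here to be called Course):
//   const result = await Course.bulkWrite(ops, { ordered: false });
```

Documents whose fields already hold the same values are matched but not modified, so unchanged courses are effectively skipped without a separate comparison step.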

Node Mongoose Separate Indexes for Separate Queries

Can we create completely separate indexes for completely separate queries on the same collection?
I want an efficient query for users retrieving their activities using an index like so
index{ userDBID: 1 }
Example query
ActivityModel.find({ userDBID }).lean();
I want a separate efficient query for entire app statistics which also gets activities, but needs to use a separate compound index like so
index{season: 1, matchID: 1}
Example queries
ActivityModel.find({ season, matchID }).lean()
ActivityModel.find({ season }).lean();
I am finding it hard to find a solid high-quality answer. I know hint() seems to be a solution, but I am sceptical about that one.
Daniel
Of course you can.
You can just add:
schema.index({ userDBID: 1 });
schema.index({ season: 1, matchID: 1 });
right after your schema declaration, before saving the Model with mongoose.model('Model', schema);.
You will see (after a while) the new indexes added in the DB. If you use an inspection tool like MongoDB Compass you'll even have a visual representation.
I am using this efficiently in a production app so I am certain of this (just today's usage):
http://prntscr.com/qj1n2o
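To illustrate why these two indexes cover all three queries from the question: MongoDB can use a compound index for queries on a prefix of its fields. The helper below is a deliberately simplified model of that rule for equality filters only. The function name is made up, and the real query planner takes much more into account.

```javascript
// Simplified illustration of the index prefix rule; this is not MongoDB's query planner.
function canUseIndexPrefix(indexFields, queryFields) {
  if (queryFields.length === 0 || queryFields.length > indexFields.length) {
    return false;
  }
  // The queried fields must be exactly the first N fields of the index, in any order.
  const prefix = indexFields.slice(0, queryFields.length);
  return queryFields.every((field) => prefix.includes(field));
}

const statsIndex = ['season', 'matchID'];
const userIndex = ['userDBID'];

console.log(canUseIndexPrefix(statsIndex, ['season', 'matchID']));
console.log(canUseIndexPrefix(statsIndex, ['season']));
console.log(canUseIndexPrefix(statsIndex, ['matchID']));
console.log(canUseIndexPrefix(userIndex, ['userDBID']));
```

Under this rule, find({ season, matchID }) and find({ season }) can both be served by the compound index, find({ userDBID }) by the single-field index, and only a query on matchID alone would be left without a suitable prefix.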

Elasticsearch js with Node.js: How to return aggregated results from multiple indexes?

We have two indexes: posts and users. We'd like to make queries on these two indexes, search for a post in the index "posts" and then go to the index "users" to get the user info, to eventually return an aggregated result of both the user info and the post we found.
Let me clarify it a bit with an example:
posts:
[
{
post: "this is a post about stack overflow",
username: "james_bond",
user_id: "007"
},
{...}
]
users:
[
{
username: "james_bond",
user_id: "007",
bio: "My name's James. James Bond.",
nb_posts: "7"
},
{...}
]
I want to search for all the posts which contain "stack overflow", and then display all the users who are talking about it and their info (from the "users" index), it could look something like this:
result: {
username: "james_bond",
user_id: "007",
post: "this is a post about stack overflow",
bio: "My name's James. James Bond"
}
I hope this is clear enough, I'm sorry if this question has already been answered but I honestly didn't find any answer anywhere.
So is it possible to do so with only ES js?
I don't believe it is possible to do exactly what you are asking, as it would be very costly to join across two indexes which are potentially sharded across different nodes (this is not a main use case for Elasticsearch). But if you have control of the data within Elasticsearch, you could structure the data so that you can achieve a different type of joining.
You can either use:
nested query
Where documents may contain fields of type nested. These fields are used to index arrays of objects, where each object can be queried (with the nested query) as an independent document.
has_child and has_parent queries
A join field relationship can exist between documents within a single index. The has_child query returns parent documents whose child documents match the specified query, while the has_parent query returns child documents whose parent document matches the specified query.
Denormalisation
Alternatively, you could store the user denormalised within the post document when you insert the document into the index. This becomes a balancing act between the cost of doing multiple reads every time a post is viewed (fully normalised) and the cost of updating all posts from user 007 every time his details change (denormalised). There is a tradeoff here: you don't need to denormalise everything, and as you have it, you have already denormalised the username from users to posts.
Here is a Question/Answer that gives more details on the options.
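As a minimal sketch of the denormalisation option, the post document could be enriched with the user fields it needs before it is indexed. The helper name denormalisePost is hypothetical, and which user fields to copy (here only bio) is an assumption based on the desired result shown in the question.

```javascript
// Hypothetical helper: copies the user fields needed at read time into the post document.
function denormalisePost(post, user) {
  return {
    post: post.post,
    user_id: post.user_id,
    username: user.username,
    bio: user.bio, // duplicated from the "users" index; must be refreshed when the user changes
  };
}

const post = {
  post: 'this is a post about stack overflow',
  username: 'james_bond',
  user_id: '007',
};
const user = {
  username: 'james_bond',
  user_id: '007',
  bio: "My name's James. James Bond.",
  nb_posts: '7',
};

const doc = denormalisePost(post, user);
console.log(JSON.stringify(doc));
```

A single search on the "posts" index then returns the combined result directly, at the price of re-indexing a user's posts whenever the copied fields change.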

Is there any way other than just nesting a tag array in MongoDB for a Blog-Tag system?

I am trying to write a blog engine for myself with node.js/express/mongodb (also a start to learn node.js). To go a little further than the tutorials on the Internet, I want to add tags support to the blog engine.
I want to do the following things with tags:
Viewers could see all the tags as a tag cloud on a "tag cloud page"
Viewers could see the tags that an article has on article list page and single article page
Viewers are able to click on a single tag to show the article list
What's more, viewers are able to search articles with particular tags in the SO way: [tag1][tag2] --> /tags/tag1+tag2 --> list of articles that have both tag1 and tag2
In a relational database, a post_tag table would be used for this. But how should this be designed in MongoDB?
I have checked MongoDB design - tags
But as efdee comments, the design
db.movies.insert({
name: "The Godfather",
director: "Francis Ford Coppola",
tags: [ "mafia", "wedding", "violence" ]
})
has a problem:
This doesn't seem to actually answer his question. How would you go about getting a distinct list of tags used in the entire movie collection?
That's also my concern: in my design, I need to show a list of all the tags; I also need to know how many articles each tag has. So is there a better way than the design shown above?
My concern with the design above is: if I want to show a list of the tags, the query will go over all the article items in the database. Is there a more efficient way?
You'd need to create a multikey index on tags to start with.
Then you will be able to find documents matching the tags using this syntax
db.movies.find({ "tags": { $all : [ /^this/, /^that/ ] }})
Because you're using the ^ (start of string) anchor in the regex, Mongo can still use the index.
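For the /tags/tag1+tag2 style of URL from the question, the filter could be built from the path segment roughly as follows. This is a sketch that uses exact tag matches rather than the prefix regexes above, and the helper name buildTagFilter is an assumption.

```javascript
// Hypothetical helper: "tag1+tag2" -> filter for articles that carry every listed tag.
function buildTagFilter(tagPath) {
  const tags = tagPath
    .split('+')
    .map((tag) => tag.trim())
    .filter((tag) => tag.length > 0); // ignore empty segments such as "tag1++tag2"
  return { tags: { $all: tags } };
}

console.log(JSON.stringify(buildTagFilter('tag1+tag2')));
// prints {"tags":{"$all":["tag1","tag2"]}}
```

The returned object can be passed to find() as is, and the multikey index on tags applies to it in the same way.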
To get keyword densities, you could simply get a count using the aggregation framework.
db.movies.aggregate([
  { $project: { _id: 0, tags: 1 } },
  { $unwind: "$tags" },
  { $group: { _id: "$tags", occur: { $sum: 1 } } }
])
Sorry formatting difficult from iPad.
You would end up with a collection of docs looking like:
{
_id: "mytag",
occur: 383
},
{
_id: "anothertag",
occur: 23
},
Using the aggregate command you get an inline result back, so it would be down to the client app (or server) to serialise or cache the result if it's frequently used.
Let me know how you get on with that.
Hth
Sam
How would you go about getting a distinct list of tags used in the entire movie collection?
db.movies.distinct("tags")
For efficient queries, I'd probably duplicate data. tags are very unlikely to ever be edited, so I'd put the tags array in the article object, and then also put the tags in a tags collection, and tags has either a count of articles containing that tag, or an array of article ids.
db.movies.insert({
id: 1,
name: "The Godfather",
director: "Francis Ford Coppola",
tags: [ "mafia", "wedding", "violence" ]
});
db.tags.insert([
{name: "mafia", movie_count: 1},
{name: "wedding", movie_count: 1},
{name: "violence", movie_count: 1}
]);
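Keeping movie_count in step with the movies collection could be done with $inc upserts each time a movie is inserted. A minimal sketch, assuming the tags collection layout shown above; the helper name buildTagCountOps is made up for illustration.

```javascript
// Hypothetical helper: one upsert per tag, incrementing its counter by one.
function buildTagCountOps(tags) {
  return tags.map((name) => ({
    updateOne: {
      filter: { name },
      update: { $inc: { movie_count: 1 } },
      upsert: true, // creates { name, movie_count: 1 } the first time a tag is seen
    },
  }));
}

const tagOps = buildTagCountOps(['mafia', 'wedding', 'violence']);
console.log(tagOps.length);

// In the shell or a driver, the operations could then be sent in one round trip:
//   db.tags.bulkWrite(tagOps)
```

The tag cloud page can then read the tags collection directly instead of scanning every article.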
You could perform your 4 tasks using MapReduce functions. For example, for the list of all tags you'd emit the tag as the key and then in the reduce function you'd count them all up and return the count. That would be the route I'd go down. It may require a little more thought, but it's definitely powerful.
http://cookbook.mongodb.org/patterns/count_tags/

MongoDB C# Logging Search Results

I am working on a mobile site that lets you search for tags of a MongoDB collection of articles.
Basically, each article object has a tags property, which stores an array of tag strings.
The search works fine, but I also want to add logging to the searches.
The reason is that I want to see what visitors are searching for and what results they are getting in order to optimize the tags.
For example, if the user enters the tag grocery, then I want to save the query results.
Hope my question is clear. thank you!
You can't optimize something without measuring. You'll need to be able to compare new results with old results. So you'll have to save a snapshot of all the information crucial to a search query. This obviously includes the search terms themselves, but also an accurate snapshot of the result.
You could create snapshots of entire products, but it's probably more efficient to save only the information involved in determining the search results. In your case these are the article tags, but perhaps also the article description if this is used by your search engine.
After each search query you'll have to build a document similar to the following, and save this in a searchLog collection in MongoDB.
{
query: "search terms",
timestamp: new Date(), // time of the search
results: [ // array of articles in the search result
{
articleId: 123, // _id of the original article
name: "Lettuce", // name of the article, for easier analysis
tags: [ "grocery", "lettuce" ] // snapshot of the article tags
// snapshots of other article properties, if relevant
},
{
articleId: 456,
name: "Bananas",
tags: [ "fruit", "banana", "yellow" ]
}
]
}
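Building that document from a search result could look roughly like this. The sketch is written in JavaScript to match the document shape above rather than in C#, and the helper name buildSearchLogEntry and the article fields (_id, name, tags) are assumptions.

```javascript
// Hypothetical helper: snapshots the parts of each article that influenced the search.
function buildSearchLogEntry(query, articles, now = new Date()) {
  return {
    query,
    timestamp: now,
    results: articles.map((article) => ({
      articleId: article._id,
      name: article.name,
      tags: [...article.tags], // copy, so later edits to the article do not alter the log
    })),
  };
}

const found = [
  { _id: 123, name: 'Lettuce', tags: ['grocery', 'lettuce'] },
  { _id: 456, name: 'Bananas', tags: ['fruit', 'banana', 'yellow'] },
];

const entry = buildSearchLogEntry('grocery', found);
console.log(entry.results.length);
```

The entry would then be inserted into the searchLog collection right after the search query returns.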
