Mongoose, Sort Results by Reference Attributes

I will attempt to address the problem as clearly as possible.
Let's say I have two collections, Col_A and Col_B.
Col_A has an attribute, attribute C, that references our other collection, Col_B, via ObjectID.
I want to query for documents from Col_A and sort by attribute E in Col_B.
A practical example, if the above is too broad, would be Users and Listings on a site like eBay. For this example, let's say Col_A is "Listings", Col_B is "Users", attribute C is "seller", and attribute E is "rating".
Query: grab all listings matching some criteria, sorted by sellers with the highest rating
That would mean "grab all documents from Listings matching some given criteria, and sort the results by the rating attribute of the Users document each listing references."
How would one go about doing this? I understand how to apply basic sorting logic via mongoose, such as:
Listings.find({"attribute A" : "something"}).sort({"some_field": "uhh"}).exec(function(err, docs) { ... });
, but can I nest in the attributes for referenced documents and sort by those? Do I need to somehow apply queries within the sort parameters or something? Is Mongoose capable of something like this? I looked at Mongoose's populate functionality, but I can't get anything working, and apparently there's a bug that discourages folks from actually using populated documents for sorting.
Github Issue
I apologize if this question has been answered already. I've been digging through the documentation, SO, and various other sites, but I can't get anything going. Thanks in advance!

You could use populate method to populate seller and the query would return an array like this:
[
{
title: 'Listing #1',
seller: {
name: 'John',
rating: 100
}
},
{
title: 'Listing #2',
seller: {
name: 'John',
rating: 20
}
}
]
then you can sort the data like this:
Listings.find({"attribute A" : "something"}).populate('seller').exec(function(err, docs) {
  if (err) {
    // handle the error
    return;
  }
  // sort in memory by the populated seller's rating, highest first
  docs.sort(function compare(a, b) {
    return b.seller.rating - a.seller.rating;
  });
  // return or respond with docs
});
It might not be what you are looking for, but I tried :)
Neither MongoDB nor Mongoose will do this kind of sort with populate, since populate makes two separate queries and MongoDB doesn't know there is any relation between them.
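If sorting on the server is preferred, one alternative that the answer above does not use is an aggregation with $lookup (available since MongoDB 3.2), which joins the two collections in a single query. The following is only a sketch: the collection name 'users' and the helper name buildListingsBySellerRating are assumptions, and listings whose seller cannot be found are dropped by $unwind.

```javascript
// Builds an aggregation pipeline that joins each listing to its seller
// and sorts by the seller's rating, highest first.
// Assumptions: the referenced collection is named 'users' and the
// 'seller' field of a listing holds that user's _id.
function buildListingsBySellerRating(criteria) {
  return [
    { $match: criteria },
    // Replace the seller ObjectId with an array holding the matching user document.
    { $lookup: { from: 'users', localField: 'seller', foreignField: '_id', as: 'seller' } },
    // Turn the one-element array into a plain embedded document.
    { $unwind: '$seller' },
    { $sort: { 'seller.rating': -1 } }
  ];
}

const pipeline = buildListingsBySellerRating({ 'attribute A': 'something' });
```

With a Mongoose model, the pipeline would be passed to Listings.aggregate(pipeline). Note that aggregate returns plain objects rather than Mongoose documents.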

Related

Get number of products from each category in mongodb database

I'm new to mongodb and to overall databases side of development.
I'm trying to make a product listing site where all the categories would be displayed with the number of products within that particular category and when clicked on a particular category, it would get me all the products in that category.
Some things to note are:
every product will have only one category
each category will have multiple products
I don't know how to go about this problem and tried searching it online but couldn't exactly find what I was looking for. I've also tried making the schema for this but I do not know if it's the right approach or not and this is how it looks:
const productsSchema = {
category: String,
name: String,
price: String,
description: String,
thumbnail: String,
};
Side note: I'm using the MERN stack (if it's of any help).

If I've understood your question correctly, you can use something like this:
db.collection.aggregate([
{
"$match": {
"category": "category1"
}
},
{
"$count": "total"
}
])
With this query you will get the total count of products in the matched category (here, category1).
Example here
In your frontend you will need one call for every category.
If your DB has a lot of different categories this may not be a good approach, but if the number is small you can run this query once per category and get the result you want.
MongoDB Documentation reference here
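If one call per category becomes too many, a $group stage can return the counts for all categories in a single aggregation. The sketch below shows such a pipeline, together with an in-memory function that only illustrates the shape of the result; the names countPerCategoryPipeline and countPerCategory and the sample data are made up for illustration.

```javascript
// Pipeline to pass to aggregate(): one output document per category.
const countPerCategoryPipeline = [
  { $group: { _id: '$category', total: { $sum: 1 } } },
  { $sort: { _id: 1 } }
];

// In-memory equivalent, only to show what the result looks like.
function countPerCategory(products) {
  const totals = new Map();
  for (const product of products) {
    totals.set(product.category, (totals.get(product.category) || 0) + 1);
  }
  return Array.from(totals, ([category, total]) => ({ _id: category, total }))
    .sort((a, b) => (a._id < b._id ? -1 : a._id > b._id ? 1 : 0));
}

// Hypothetical sample data following the schema from the question.
const sample = [
  { category: 'tools', name: 'Hammer' },
  { category: 'paint', name: 'Primer' },
  { category: 'tools', name: 'Saw' }
];
const counts = countPerCategory(sample);
```

Fetching the products of one category after a click is then a plain find on the category field.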

I would say you should have a product schema and a product category schema, where the product category schema has an array of product ids that belong to that category.
In the product schema, you could also have a pointer to the category object that a product is linked to (as opposed to just the name of the category as a string).
Maybe take a look at mongoose populate https://mongoosejs.com/docs/populate.html

Storing and querying PostgreSQL database entities with multiple related entities?

I'm designing a PostgreSQL database that will be queried by a Node API using Sequelize. Currently, I have a table called recipes that has columns called ingredients and instructions. For a given recipe, each of those columns is stored as an array of strings like {Tomatoes, Onions}.
That method of storage worked fine for simply fetching and rendering a recipe on the client side. But it wasn't working well for fuzzy search querying because, using Sequelize all I could do was ingredients: { [Op.contains] : [query] }. So if a user typed tomatoes there was no way to write a "fuzzy" search query that would return a recipe with an ingredient Tomatoes.
And then I read this in the PostgreSQL documentation:
Arrays are not sets; searching for specific array elements can be a sign of database misdesign. Consider using a separate table with a row for each item that would be an array element. This will be easier to search, and is likely to scale better for a large number of elements.
Now I'm considering storing ingredients and instructions as separate tables, but I have a couple of questions.
1) As a recipe can have multiple ingredients related to it, should I just use a foreign key for each ingredient and the Sequelize hasMany relationship? That seems correct to me, except that I'm now potentially duplicating common ingredients each time a new recipe is created that uses that ingredient.
2) What would be the best way to write a fuzzy search query so that a user could search the main columns of the recipes table (e.g. title, description) and additionally apply their query to the instructions and ingredients tables?
Essentially I'd like to end up with a fuzzy search query applied to the three tables that looks something like this...
const recipes = await req.context.models.Recipe.findAll({
where: {
[Op.or]: [
{ title: { [Op.iLike]: '%' + query + '%' } },
{ description: { [Op.iLike]: '%' + query + '%' } },
{ ingredients: { ingredient: { [Op.iLike]: '%' + query + '%' } } },
{ instructions: { instruction: { [Op.iLike]: '%' + query + '%' } } }
]
}
});
Thanks!

I have done this. I happen to use GraphQL in my Node layer with Sequelize, and I have filter objects that do this type of thing. You'll just need some include statements in your Recipie.findAll, after your initial where clause where you evaluate whether you are searching title, description, or both. I sent my search params in with prefixes that I could strip off and that told me which Sequelize ops I wanted to use on them, then ran my args through a utility method to create my where clause, but I know there are many ways to skin that cat. I just did not want to clutter up my resolvers with tons of hardcoded ops and conditional clauses. Your include might look something like this:
include: [{
  model: models.Ingredient,
  as: 'Ingredients',
  through: { /* join table specifying keys where necessary, since this is many-to-many */ },
  where: { /* some conditional code around your search param */ },
}, {
  model: models.Instruction,
  as: 'Instructions',
  where: { /* some conditional code around your search param */ },
}],
There is good documentation on multiple and nested includes in the Sequelize docs, but from what I see above you have a fairly good understanding of what you need to do. To simplify things a bit, I'd start by searching only on the fields of Recipie itself (title, description) and get that working before you add the includes; then it will be a little clearer how you want to form your where clauses.
Alternatively, you can skip the includes, write associations in your models, call them through getters, and pass the where clauses to those. I do that as well, and again it is well documented now. Sequelize has really upped their game.
Recipie.associate = function (models) {
  // "through" is only valid on belongsToMany (many-to-many), not on hasMany
  models.Recipie.belongsToMany(models.Ingredient, { as: 'Ingredients', through: 'recipie_ingredient', foreignKey: 'recipie_id' });
};
Now you have a getter for Ingredients, and if you declare belongsToMany targeting back at Recipie in the Ingredient model then you'll have a getter there as well. You can pass your search string to that via a where clause and get all recipes that have a given ingredient or list of ingredients. Clear as mud?
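To tie this back to the query in the question, here is a minimal sketch of findAll options that search the recipe columns and the two included models in one OR. It relies on Sequelize's '$alias.column$' syntax for referring to a column of an included model from the top-level where. The function name buildRecipeSearchOptions, the aliases 'Ingredients' and 'Instructions', and the column names ingredient and instruction are assumptions taken from the snippets above, not a confirmed schema.

```javascript
// Op is passed in so the sketch does not depend on how Sequelize is loaded;
// in an application it would be require('sequelize').Op.
function buildRecipeSearchOptions(Op, models, query) {
  const pattern = '%' + query + '%';
  return {
    where: {
      [Op.or]: [
        { title: { [Op.iLike]: pattern } },
        { description: { [Op.iLike]: pattern } },
        // '$alias.column$' refers to a column of an included model.
        { '$Ingredients.ingredient$': { [Op.iLike]: pattern } },
        { '$Instructions.instruction$': { [Op.iLike]: pattern } }
      ]
    },
    include: [
      // required: false keeps recipes that match only on title or description.
      { model: models.Ingredient, as: 'Ingredients', required: false },
      { model: models.Instruction, as: 'Instructions', required: false }
    ]
  };
}
```

It would be used as Recipie.findAll(buildRecipeSearchOptions(Op, models, query)). The search string is placed into the LIKE pattern as-is, so any % or _ typed by the user acts as a wildcard.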

MongoDB: How to find a document sharing a list element with specified list

I'm currently using mongoose schemas where some of the values hold lists. For example:
var dinerSchema = mongoose.Schema({
restaurants: [String]
});
I'm looking for a way to write a mongo query that finds other documents which have at least one shared element between the two values. For example if I'm given a list of restaurants which is
[McDonalds, Burger King, Wendy's]
I want to find other documents which have restaurant values such as
[Sonic, Taco Bell, Burger King]
but not
[Red Lobster, Olive Garden, Legal Sea Foods]
I'm aware if I wanted to find documents given a single value I could do something like
dinerModel.find({ restaurants: "McDonalds" }, ...);
To return all documents which contain McDonalds in their restaurant list. However, I want to find any documents which contain ANY of the elements in a certain list. Is there a way to query for this? I don't think I can just do "or" queries because I don't know the size of the list of restaurants that I'll be looking for, and it could change from query to query.
Thanks!

Do a find with an $in clause:
dinerModel.find({
  'restaurants': { $in: [
    'Sonic',
    'Taco Bell',
    'Burger King'
  ]}
}, function(err, docs){
  if (err) {
    console.log(err); // deal somehow
    return;
  }
  console.log(docs);
});
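Because $in takes an array, the list can be any length and can change from query to query, so no chain of "or" conditions is needed. The sketch below builds the filter from a variable list and mimics in memory how $in matches an array field (a document matches if any of its elements is in the list). The helper names buildSharedRestaurantQuery and matchesQuery are made up for illustration.

```javascript
// Builds the filter to pass to dinerModel.find() from a list of any length.
function buildSharedRestaurantQuery(restaurants) {
  return { restaurants: { $in: restaurants } };
}

// In-memory mimic of $in on an array field: true if at least one
// element of doc.restaurants appears in the $in list.
function matchesQuery(doc, query) {
  const wanted = new Set(query.restaurants.$in);
  return doc.restaurants.some((name) => wanted.has(name));
}

const query = buildSharedRestaurantQuery(['McDonalds', 'Burger King', "Wendy's"]);

const sharesOne = matchesQuery(
  { restaurants: ['Sonic', 'Taco Bell', 'Burger King'] }, query); // true
const sharesNone = matchesQuery(
  { restaurants: ['Red Lobster', 'Olive Garden', 'Legal Sea Foods'] }, query); // false
```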

How should I model my MongoDB collection for nested documents?

I'm managing a MongoDB database for a building products store. The most obvious collection is products, right?
There are quite a few products; however, each one belongs to one of a set of 5-8 categories and then to one subcategory among a small set of subcategories.
For example:
-Electrical
*Wires
p1
p2
..
*Tools
p5
pn
..
*Sockets
p11
p23
..
-Plumber
*Pipes
..
*Tools
..
*PVC
..
I will use Angular at web site client side to show whole products catalog, I think about AJAX for querying the right subset of products I want.
Then, I wonder whether I should manage one only collection like:
{
MainCategory1: {
SubCategory1: {
{},{},{},{},{},{},{}
}
SubCategory2: {
{},{},{},{},{},{},{}
}
SubCategoryn: {
{},{},{},{},{},{},{}
}
},
MainCategory2: {
SubCategory1: {
{},{},{},{},{},{},{}
}
SubCategory2: {
{},{},{},{},{},{},{}
}
SubCategoryn: {
{},{},{},{},{},{},{}
}
},
MainCategoryn: {
SubCategory1: {
{},{},{},{},{},{},{}
}
SubCategory2: {
{},{},{},{},{},{},{}
}
SubCategoryn: {
{},{},{},{},{},{},{}
}
}
}
Or a single collection per each category. The number of documents might not be higher than 500. However I care about a balance for:
quick DB answer,
easy server side DB querying, and
client-side Angular code for rendering results to html.
I'm using mongodb node.js module, not Mongoose now.
What CRUD operations will I do?
Inserts of products, I'd also like to have a way to obtain autogenerated ids (maybe sequential) per each new register. However, as it might seem natural I wouldn't offer the _id to the user.
Querying the whole documents set of a subcategory. Maybe just obtaining a few attributes at first.
Querying whole or a specific subset of attributes of a document (product) in particular.
Modifying a product's attributes values.

I agree the client side should get the result that is easiest to render. However, nesting categories into products is still a bad idea. The trade-off is that once you want to change, for example, the name of a category, it will be a disaster. And think about the possible use cases, for example:
list all categories
find all subcategories of a certain category
find all products in a certain category
You'll find it hard to do these things with your data structure.
I had the same situation in my current project, so here's what I did, for your reference.
First, categories should be in a separate collection. DON'T nest categories into each other, as it will complicate the procedure to find all subcategories. The traditional way of finding all subcategories is to maintain an idPath property. For example, say your categories are divided into 3 levels:
{
  _id: 100,
  name: "level1 category",
  parentId: 0, // means it's the top category
  idPath: "0-100"
}
{
  _id: 101,
  name: "level2 category",
  parentId: 100,
  idPath: "0-100-101"
}
{
  _id: 102,
  name: "level3 category",
  parentId: 101,
  idPath: "0-100-101-102"
}
Note with idPath, parentId is not necessary anymore. It's for you to understand the structure easier.
Once you need to find all subcategories of category 100, simply do the query:
db.collection("category").find({idPath: /^0-100-/}).toArray(function(err, docs) {
  // whatever you want to do
});
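To see why the prefix match works, here is the same lookup done in memory on the sample categories. The trailing "-" in the pattern matters: it keeps an unrelated category such as 1001 (idPath "0-1001") from matching the prefix "0-100". The function name descendantsOf and the two extra sample categories are made up for illustration.

```javascript
const categories = [
  { _id: 100, name: 'level1 category', parentId: 0, idPath: '0-100' },
  { _id: 101, name: 'level2 category', parentId: 100, idPath: '0-100-101' },
  { _id: 102, name: 'level3 category', parentId: 101, idPath: '0-100-101-102' },
  // Hypothetical extra categories that must NOT match the prefix "0-100-".
  { _id: 200, name: 'another level1 category', parentId: 0, idPath: '0-200' },
  { _id: 1001, name: 'yet another level1 category', parentId: 0, idPath: '0-1001' }
];

// Returns every category below the one with the given idPath
// (the category itself is not included).
function descendantsOf(idPath, all) {
  const prefix = new RegExp('^' + idPath + '-');
  return all.filter((category) => prefix.test(category.idPath));
}

const subcategoryIds = descendantsOf('0-100', categories).map((c) => c._id);
// subcategoryIds is [101, 102]
```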
With category stored in a separate collection, in your product you'll need to reference them by _id, just like when we use RDBMS. For example:
{
... // other fields of product
categories: [100, 101, 102, ...]
}
Now if you want to find all products in a certain category:
db.collection("category").find({idPath: new RegExp("^" + idPath + "-")}).toArray(function(err, categories) {
  var cateIds = _.pluck(categories, "_id"); // I'm using underscore to pluck category ids
  db.collection("product").find({categories: { $in: cateIds }}).toArray(function(err, products) {
    // products are here
  });
});
Fortunately, the category collection is usually very small, with only hundreds (or thousands) of records inside, and it doesn't change much. So you can always keep a live copy of the categories in memory, and it can be constructed as nested objects like:
[{
id: 100,
name: "level 1 category",
... // other fields
subcategories: [{
id: 101,
... // other fields
subcategories: [...]
}, {
id: 103,
... // other fields
subcategories: [...]
},
...]
}, {
// another top1 category
}, ...]
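One way to construct that nested copy from the flat category documents is sketched below. It assumes each document still carries parentId (with 0 or an unknown id meaning "top level"); the function name buildCategoryTree is made up for illustration.

```javascript
// Turns the flat list of category documents into nested objects
// of the form { id, name, subcategories: [...] }.
function buildCategoryTree(flatCategories) {
  const nodesById = new Map();
  for (const category of flatCategories) {
    nodesById.set(category._id, { id: category._id, name: category.name, subcategories: [] });
  }
  const roots = [];
  for (const category of flatCategories) {
    const node = nodesById.get(category._id);
    const parent = nodesById.get(category.parentId);
    if (parent) {
      parent.subcategories.push(node);
    } else {
      roots.push(node); // no known parent: treat as a top-level category
    }
  }
  return roots;
}

const tree = buildCategoryTree([
  { _id: 100, name: 'level 1 category', parentId: 0 },
  { _id: 101, name: 'level 2 category', parentId: 100 },
  { _id: 103, name: 'another level 2 category', parentId: 100 },
  { _id: 102, name: 'level 3 category', parentId: 101 }
]);
```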
You may want to refresh this copy periodically, for example every hour:
setInterval(function() {
  // refresh your memory copy of categories.
}, 3600000);
That's all I have in mind right now. Hope it helps.
EDIT:
To provide an int ID for each user, $inc and findAndModify are very useful. You may have an idSeed collection:
{
_id: ...,
seedValue: 1,
forCollection: "user"
}
When you want to get a unique ID:
db.collection("idSeed").findAndModify({forCollection: "user"}, {}, {$inc: {seedValue: 1}}, {}, function(err, doc) {
var newId = doc.seedValue;
});
findAndModify is an atomic operation provided by MongoDB: the find and the modify happen as a single step on the document, so two concurrent callers never receive the same seed value.
The 2nd question is covered in my answer already.
Querying subsets of properties is described in the MongoDB manual, and the Node.js API is almost the same. Read the documentation for the projection parameter.
Updating a subset of properties is also supported, through MongoDB's $set operator.

How to populate Comment count to an Item list?

I have two models in my app: Item and Comment. An Item can have many Comments, and a Comment instance contains a reference to an Item instance with key 'comment', to keep track of the relationship.
Now I have to send a JSON list of all Items with their Comment count when user requests on a particular URL.
function(req, res){
return Item.find()
.exec(function(err, items) {
return res.send(items);
});
};
I am not sure how can I "populate" comment count to the items. This seems to be a common problem and I tend to think there should be some nicer way of doing this job than brute force.
So please share your thoughts. How would you "populate" the Comment count to the Items?

check the MongoDB documentation and look for the method findAndModify() -- with it you can atomically update a document, e.g. add a comment and increment the document counter at the same time.
findAndModify
The findAndModify command atomically modifies and returns a single document. By default, the returned document does not include the modifications made on the update. To return the document with the modifications made on the update, use the new option.
Example
Use the update option, with update operators $inc for the counter, and $addToSet for adding the actual comment to an embedded array of comments.
db.runCommand(
{
findAndModify: "item",
query: { name: "MyItem", state: "active", rating: { $gt: 10 } },
sort: { rating: 1 },
update: { $inc: { commentCount: 1 },
$addToSet: {comments: new_comment} }
}
)
See:
MongoDB: findAndModify
MongoDB: Update Operators

I did some research on this issue and came up with following results. First, MongoDB docs suggest:
In general, use embedded data models when:
you have “contains” relationships between entities.
you have one-to-many relationships where the “many” objects always appear with or are viewed in the context of their parent documents.
So in my situation, it makes much more sense if Comments are embedded into Items, instead of having independent existence.
Nevertheless, I was curious to know the solution without changing my data model. As mentioned in MongoDB docs:
Referencing provides more flexibility than embedding; however, to
resolve the references, client-side applications must issue follow-up
queries. In other words, using references requires more roundtrips to
the server.
As multiple roundtrips are kosher now, I came up with following solution:
var showList = function(req, res){
// first DB roundtrip: fetch all items
return Item.find()
.exec(function(err, items) {
// second DB roundtrip: fetch comment counts grouped by item ids
Comment.aggregate([{
$group: {
_id: '$item',
count: {
$sum: 1
}
}
}], function(err, agg){
// iterate over comment count groups (yes, that little dash is underscore.js)
_.each(agg, function( itr ){
// for each aggregated group, search for corresponding item and put commentCount in it
var item = _.find(items, function( item ){
return item._id.toString() == itr._id.toString();
});
if ( item ) {
item.set('commentCount', itr.count);
}
});
// send items to the client in JSON format
return res.send(items);
})
});
};
Agree? Disagree? Please enlighten me with your comments!
If you have a better answer, please post here, I'll accept it if I find it worthy.
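The merge step above searches the whole items array once per aggregated group. A small variation is to index the counts by item id first and then walk the items once. This is only a sketch: it assumes the items are plain objects (for example fetched with .lean()), and the function name attachCommentCounts is made up for illustration. Items without any comments get a commentCount of 0.

```javascript
// items: plain item objects, each with an _id
// agg:   result of the $group stage, e.g. [{ _id: <item id>, count: 3 }, ...]
function attachCommentCounts(items, agg) {
  // Index the counts by the string form of the item id for direct lookup.
  const countByItemId = new Map(agg.map((group) => [String(group._id), group.count]));
  return items.map((item) => Object.assign({}, item, {
    commentCount: countByItemId.get(String(item._id)) || 0
  }));
}

const itemsWithCounts = attachCommentCounts(
  [{ _id: 'a', name: 'first' }, { _id: 'b', name: 'second' }],
  [{ _id: 'a', count: 3 }]
);
```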
