How to perform nested query on two different conditions in Sequelize? - node.js

Here's my case.
I'm building a system in which users (consider them admins) create employees, and a performance report is created for every employee. Users can only view employees that were created by users from the same company. However, a user from another company can find an employee from a different company through a search field and create a report for them. Once such a report exists, that employee appears in the other company's employee list; without the report, the employee would not be listed there.
Note: The model names have been changed, but the models have the same characteristics.
I'm trying to have an endpoint which returns all employees based on either of these two scenarios:
Return all employees created by a user whose company is the same as the company of the user initiating the request.
Return all employees that (when they were not created by a user from that company) have a report associated with them, where the report was created by a user from the same company as the user initiating the request.
Here's a brief ERD that explains the relationship between the models:
Here's the code which uses nested joins:
async function getAll(company) {
  return await Employee.findAll({
    where: {
      [Op.or]: [
        {
          '$User.company$': { [Op.eq]: company }
        },
        {
          '$Report.User.company$': { [Op.eq]: company }
        }
      ]
    },
    include: [
      {
        attributes: [],
        model: User.scope('withoutPassword'),
      },
      {
        attributes: [],
        model: Report,
        include: {
          attributes: [],
          model: User.scope('withoutPassword'),
        }
      }
    ]
  });
}

Related

mongoose update millions of records while extracting information

We have a production database with over 5 million customer records. Each customer document has an embedded array of the licenses they have applied for. An example customer document is as follows:
{
  _id: ObjectId('...'),
  phoneNumber: 'xxxx',
  // Other customer fields
  licenses: [
    {
      _id: ObjectId('...'),
      state: 'PENDING',
      expired: false,
      createdAt: ISODate(''),
      // Other license fields
    },
    // More Licenses for this customer
  ]
}
I have been tasked with changing the state of every PENDING license applied for during the month of September to REJECTED and sending an SMS to every customer whose pending permit just got rejected.
Using model.where(condition).countDocuments(), I have found that there are over 3 million customers (not licenses) matching the aforementioned criteria. Each customer has an average of 9 licenses.
I need assistance coming up with a strategy that won't slow down the system when performing this action. Furthermore, this is around 17GB of data.
Sending SMS is fine, I can queue details for SMS service. My challenge is processing the licenses while extracting relevant information for SMS.
First of all you have to create an index on the collection:
db.collection.createIndex( { "licenses.state": 1 } )
Then you should do something like this:
model.updateMany({ 'licenses.state': 'PENDING' }, {
  '$set': {
    'licenses.$[elem].state': 'REJECTED'
  }
}, {
  arrayFilters: [{
    // only touch licenses that are still pending
    'elem.state': 'PENDING',
    'elem.createdAt': { $gte: ISODate(....) }
  }]
}).then(function (result) {})
If you have a replica set and your updates run on the primary instance, reads served by the secondary instances should not be affected.
If you want to split the update into several batches, you can use the _id (already indexed). Of course, it depends on your _id format.
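A minimal sketch of the batching idea, assuming a Mongoose model (passed in as `Customer`, a hypothetical name) and `_id` values that sort in a stable order. It walks the collection in `_id` order, a fixed number of customers at a time, and applies the same array-filtered update to each slice. The batch size, the pause between batches, and the `from`/`to` date parameters are illustrative assumptions, not values from the question.

```javascript
// Build the filter, update and options for one batch of customers.
// afterId is the last _id of the previous batch (null for the first batch),
// lastId is the last _id of the current batch.
function buildBatchUpdate(afterId, lastId, from, to) {
  const idRange = { $lte: lastId };
  if (afterId !== null) idRange.$gt = afterId;
  return {
    filter: { _id: idRange, 'licenses.state': 'PENDING' },
    update: { $set: { 'licenses.$[elem].state': 'REJECTED' } },
    options: {
      arrayFilters: [{
        'elem.state': 'PENDING',
        'elem.createdAt': { $gte: from, $lt: to }
      }]
    }
  };
}

// Walk the collection in _id order, one batch at a time.
// Customer is assumed to be a Mongoose model.
async function rejectInBatches(Customer, from, to, batchSize = 1000, pauseMs = 100) {
  let afterId = null;
  for (;;) {
    const idFilter = afterId === null ? {} : { _id: { $gt: afterId } };
    const ids = await Customer.find(idFilter)
      .sort({ _id: 1 })
      .limit(batchSize)
      .select('_id')
      .lean();
    if (ids.length === 0) break;
    const lastId = ids[ids.length - 1]._id;
    const { filter, update, options } = buildBatchUpdate(afterId, lastId, from, to);
    await Customer.updateMany(filter, update, options);
    afterId = lastId;
    // Short pause so the update does not monopolise the primary.
    await new Promise(resolve => setTimeout(resolve, pauseMs));
  }
}
```

Before each `updateMany`, the same batch filter could be used to read the phone numbers of the affected customers and queue them for the SMS service.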

checks on associated models from top model where query

I have a model Booking, which has a hasMany relation with hotels, and each hotel has a one-to-one relation with suppliers.
What I need is to get all bookings where supplier_id = 33333.
I am trying this
BOOKINGS.findAll({
  where: {
    'hotels.supplier.supplier_id' : '32',
  },
  include: [
    {
      model: HOTELS,
      include: [
        {
          model: SUPPLIERS,
        }
      ],
    }
  ],
  limit : 30,
  offset: 0
})
It throws an error like "hotels.supplier... column not found". I tried everything, because the Sequelize docs only give the solution of adding a where inside the include, which I can't use because it adds sub queries.
I don't want to add a where check to the supplier model inside the include array, because it adds sub queries: if I have 1000 bookings, it will add a sub query for every booking, which crashes my APIs.
I need a solution like this query in Sequelize.
SELECT col1, col2, col3 FROM BOOKINGS LEFT JOIN HOTELS ON BOOKINGS.booking_id = HOTELS.booking_id INNER JOIN SUPPLIERS ON BOOKINGS.supplier_id = SUPPLIERS.supplier_id
Adding a where in the include object will not add a sub query. It will just add a where clause to the JOIN that is applied to the supplier model. It will not crash your API in any way. You can test it on your local machine as many times as you like to make sure.
BOOKINGS.findAll({
  include: [
    {
      model: HOTELS,
      include: [
        {
          model: SUPPLIERS,
          where: { supplier_id: 32 }
        }
      ]
    }
  ],
  limit: 30,
  offset: 0
})
If you still want to put the condition at the top level, you can use sequelize.where + sequelize.literal, but you will need to use the table aliases that Sequelize assigns. For example, hotels.supplier.supplier_id will not work as an alias for the supplier table. Sequelize assigns table aliases like the one in the example below:
BOOKINGS.findAll({
  where: sequelize.where(sequelize.literal("`hotels->suppliers`.supplier_id = 32")),
  include: [
    {
      model: HOTELS,
      include: [SUPPLIERS]
    }
  ],
  limit: 30,
  offset: 0
})
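As an alternative to a raw literal, Sequelize also accepts the `$nested.column$` syntax in a top-level where, and the `subQuery: false` option stops it from wrapping the main table in a sub query when limit is combined with a hasMany include. The sketch below only builds the options object. The alias path `hotels.suppliers` is an assumption taken from the literal above and has to match the aliases Sequelize generates for your associations, and the helper name is hypothetical.

```javascript
// Build findAll options that filter on a nested include from the top-level where.
// HOTELS and SUPPLIERS are the models from the question; the alias path
// "hotels.suppliers" is an assumption and must match the generated aliases.
function bookingsBySupplierOptions(HOTELS, SUPPLIERS, supplierId) {
  return {
    where: { '$hotels.suppliers.supplier_id$': supplierId },
    include: [{
      model: HOTELS,
      required: true, // INNER JOIN, so bookings without a matching hotel are dropped
      include: [{ model: SUPPLIERS, required: true }]
    }],
    subQuery: false, // keep everything in one flat JOIN query
    limit: 30,
    offset: 0
  };
}

// Usage (not run here):
// BOOKINGS.findAll(bookingsBySupplierOptions(HOTELS, SUPPLIERS, 32))
```

Note that with `subQuery: false` the limit is applied to the joined rows, so a booking with several matching hotels counts more than once towards the 30.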

What if there is a name collision when two models are associated?

When I associate two models, how do I prevent name collisions?
// Find all projects with at least one task where task.state === project.state
Project.findAll({
  include: [{
    model: Task,
    where: { state: Sequelize.col('project.state') }
  }]
})
In this example, what if there was a name property in both project and task?

Get only associations of associations

I'm currently working on a project where I have 3 models with a child and parent structure.
City has multiple Locations which has multiple Stones. Each Stone only has one Location as parent and each Location only has one City as parent.
Now, I want to create a list of all Stones that 'belong' (through a Location) to a specific City. How would I retrieve all these associations, without having to do the following:
City.find({
  where: {
    id: 1337
  },
  include: [{
    model: Location,
    include: [{
      model: Stone
    }]
  }]
})
.then((city) => {
  city.stones = [].concat.apply([], city.locations.map(location => location.stones));
});
I'm trying to find out if there's a "SQL only" solution, so not retrieving data / having to execute JavaScript to generate this array.

How should I model my MongoDB collection for nested documents?

I'm managing a MongoDB database for a building products store. The most obvious collection is products, right?
There are quite a few products; however, each of them belongs to one of a set of 5-8 categories and then to one subcategory from a small set of subcategories.
For example:
-Electrical
  *Wires
    p1
    p2
    ..
  *Tools
    p5
    pn
    ..
  *Sockets
    p11
    p23
    ..
-Plumber
  *Pipes
    ..
  *Tools
    ..
  *PVC
    ..
I will use Angular on the client side of the web site to show the whole product catalog, and I am thinking of using AJAX to query the subset of products I want.
So I wonder whether I should manage a single collection like:
{
MainCategory1: {
SubCategory1: {
{},{},{},{},{},{},{}
}
SubCategory2: {
{},{},{},{},{},{},{}
}
SubCategoryn: {
{},{},{},{},{},{},{}
}
},
MainCategory2: {
SubCategory1: {
{},{},{},{},{},{},{}
}
SubCategory2: {
{},{},{},{},{},{},{}
}
SubCategoryn: {
{},{},{},{},{},{},{}
}
},
MainCategoryn: {
SubCategory1: {
{},{},{},{},{},{},{}
}
SubCategory2: {
{},{},{},{},{},{},{}
}
SubCategoryn: {
{},{},{},{},{},{},{}
}
}
}
Or one collection per category. The number of documents will probably not exceed 500. However, I care about a balance between:
quick DB answer,
easy server side DB querying, and
client-side Angular code for rendering results to html.
I'm currently using the mongodb Node.js module, not Mongoose.
What CRUD operations will I do?
Inserts of products. I'd also like to have a way to obtain autogenerated ids (maybe sequential) for each new record. However, as might seem natural, I wouldn't expose the _id to the user.
Querying the whole documents set of a subcategory. Maybe just obtaining a few attributes at first.
Querying whole or a specific subset of attributes of a document (product) in particular.
Modifying a product's attributes values.
I agree that the client side should get the result that is easiest to render. However, nesting categories and products in a single document is still a bad idea. The trade-off is that once you want to change something, for example the name of a category, it will be a disaster. Also think about the possible use cases, for example:
list all categories
find all subcategories of a certain category
find all products in a certain category
You'll find it hard to do these things with your data structure.
I had the same situation in my current project, so here's what I do, for your reference.
First, categories should be in a separate collection. DON'T nest categories inside each other, as that complicates finding all subcategories. The traditional way to find all subcategories is to maintain an idPath property. For example, suppose your categories are divided into 3 levels:
{
  _id: 100,
  name: "level1 category",
  parentId: 0, // means it's the top category
  idPath: "0-100"
}
{
  _id: 101,
  name: "level2 category",
  parentId: 100,
  idPath: "0-100-101"
}
{
  _id: 102,
  name: "level3 category",
  parentId: 101,
  idPath: "0-100-101-102"
}
Note that with idPath, parentId is no longer necessary. It is only there to make the structure easier to understand.
Once you need to find all subcategories of category 100, simply run this query on idPath:
db.collection("category").find({idPath: /^0-100-/}).toArray(function(err, docs) {
  // whatever you want to do
})
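To see why the trailing dash in the pattern matters, here is a small in-memory sketch of the same prefix match against idPath values. The helper names and the extra category 1000 are made up for illustration; only the idPath format comes from the answer.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build the regular expression matching every descendant of a category.
function descendantPattern(idPath) {
  return new RegExp('^' + escapeRegExp(idPath) + '-');
}

const categories = [
  { _id: 100, idPath: '0-100' },
  { _id: 101, idPath: '0-100-101' },
  { _id: 102, idPath: '0-100-101-102' },
  { _id: 1000, idPath: '0-1000' } // shares the "0-100" prefix but is not a descendant
];

const pattern = descendantPattern('0-100');
const descendantIds = categories
  .filter(category => pattern.test(category.idPath))
  .map(category => category._id);

console.log(descendantIds); // [ 101, 102 ]
```

Without the trailing dash, category 1000 would match as well, because its path also starts with "0-100".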
With category stored in a separate collection, in your product you'll need to reference them by _id, just like when we use RDBMS. For example:
{
... // other fields of product
categories: [100, 101, 102, ...]
}
Now if you want to find all products in a certain category:
db.collection("category").find({idPath: new RegExp("^" + idPath + "-")}).toArray(function(err, categories) {
  // this matches descendants only; add the category's own _id if you need it too
  var cateIds = _.pluck(categories, "_id"); // I'm using underscore to pluck category ids
  db.collection("product").find({categories: { $in: cateIds }}).toArray(function(err, products) {
    // products are here
  });
});
Fortunately, the category collection is usually very small, with only hundreds (or thousands) of records, and it doesn't change often. So you can always keep a live copy of the categories in memory, and it can be constructed as nested objects like:
[{
id: 100,
name: "level 1 category",
... // other fields
subcategories: [{
id: 101,
... // other fields
subcategories: [...]
}, {
id: 103,
... // other fields
subcategories: [...]
},
...]
}, {
// another top1 category
}, ...]
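One way to build such a nested copy from the flat category documents is sketched below. It assumes each document carries _id, name and parentId as in the earlier example, with parentId 0 marking a top-level category; the function name is hypothetical.

```javascript
// Turn flat category documents into the nested structure kept in memory.
// Assumes parentId 0 (or any id that is not in the list) marks a top-level category.
function buildCategoryTree(docs) {
  const nodes = new Map();
  for (const doc of docs) {
    nodes.set(doc._id, { id: doc._id, name: doc.name, subcategories: [] });
  }
  const roots = [];
  for (const doc of docs) {
    const node = nodes.get(doc._id);
    const parent = nodes.get(doc.parentId);
    if (parent) {
      parent.subcategories.push(node);
    } else {
      roots.push(node);
    }
  }
  return roots;
}

const tree = buildCategoryTree([
  { _id: 100, name: 'level1 category', parentId: 0 },
  { _id: 101, name: 'level2 category', parentId: 100 },
  { _id: 102, name: 'level3 category', parentId: 101 }
]);
console.log(JSON.stringify(tree, null, 2));
```

Two passes are used so that a child can appear before its parent in the input without breaking the result.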
You may want to refresh this copy periodically, for example every hour:
setInterval(function() {
  // refresh your memory copy of categories.
}, 3600000);
That's all I have in mind right now. Hope it helps.
EDIT:
To provide an integer ID for each user, $inc and findAndModify are very useful. You may have an idSeed collection:
{
_id: ...,
seedValue: 1,
forCollection: "user"
}
When you want to get a unique ID:
db.collection("idSeed").findAndModify({forCollection: "user"}, {}, {$inc: {seedValue: 1}}, {}, function(err, doc) {
var newId = doc.seedValue;
});
findAndModify is an atomic operation provided by MongoDB. It is safe under concurrent access, because the find and the modify happen together as a single atomic step on the document.
The 2nd question is already covered in my answer.
Querying a subset of properties is described in the MongoDB manual; the Node.js API is almost the same. Read the documentation for the projection parameter.
Updating a subset of properties is supported by MongoDB's $set operator.
