Bulk delete in Sequelize using join query - node.js

I have two tables in a 1-many relationship, Device and TestResult. A Device can have many TestResults.
My application presents the user with a filterable search page that returns a list of TestResults; it's effectively a query builder on the back end. The user can add filters based on both Device attributes and TestResult attributes. For instance, a user may search for all TestResults that passed and were performed during a time range, for Devices whose serial number falls within a specific range. The results should be grouped by Device, so the query is issued from the Device model.
Here is an example of what a FindOptions object might look like, which I would pass to the Device.findAndCountAll() method:
let results = Device.findAndCountAll(
  {
    include: [
      {
        model: TestResult,
        attributes: [
          'id',
          'testBlockId',
          'deviceId',
          'type',
          'fieldResponses',
          'stationName',
          'summary',
          'createdAt'
        ],
        where: {
          createdAt: { [Op.gte]: '2020-03-27T11:54:43.100Z' },
          stationName: { [Op.in]: [ 'Red', 'Green' ] },
          fieldResponses: {
            [Op.and]: [
              { [Op.like]: '%"Customer":"CustomerA"%' },
              { [Op.like]: '%"Batch":"4"%' }
            ]
          },
          testBlockId: { [Op.in]: [ 2, 3 ] },
          summary: 'True'
        },
        as: 'testResults'
      }
    ],
    attributes: [ 'id', 'serialNumber', 'createdAt' ],
    limit: 100,
    offset: 0,
    order: [ [ 'serialNumber', 'ASC' ] ],
    where: {
      serialNumber: { [Op.between]: [ '000000001000', '000000200000' ] }
    }
  }
)
I'm now trying to add an option to delete all TestResult records that are returned by one of these searches, but I'm not sure what the proper way to do this with Sequelize is.
The DestroyOptions type does not have an include[] attribute, so I don't know how to add an INNER JOIN to a DELETE query in Sequelize.
It might be possible to call TestResult.findAll() and destroy records in the .then() callback, but I haven't figured out how to do this. Without a LIMIT set, the query might return hundreds of thousands of rows, far too many to hold in memory after Sequelize turns them all into complex objects. I also don't want to delete the records one at a time.
Ideally, a query would look like this:
DELETE FROM testResult
WHERE summary = 'True'
AND deviceId IN (
SELECT id FROM device
WHERE serialNumber BETWEEN '0000000010000' AND '000000002000'
);
But I don't know how to achieve a subquery like that in Sequelize.
Is there a proper way, using Sequelize, to do bulk deletes with complex WHERE clauses?
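One way to get the DELETE above without pulling rows into memory is to hand Sequelize the raw statement via sequelize.query() (an alternative, not shown, is to pass the subquery through sequelize.literal inside a TestResult.destroy() where clause). A minimal sketch, assuming tables named testResults and devices; the SQL-building helper is hypothetical and runs in plain Node, and only the final commented-out call needs a live Sequelize instance:

```javascript
function buildBulkDeleteSql({ summary, serialFrom, serialTo }) {
  // Parameterized DELETE with the device subquery; table/column names are assumptions.
  const sql =
    'DELETE FROM testResults ' +
    'WHERE summary = :summary ' +
    'AND deviceId IN (' +
    'SELECT id FROM devices WHERE serialNumber BETWEEN :serialFrom AND :serialTo)';
  return { sql, replacements: { summary, serialFrom, serialTo } };
}

const { sql, replacements } = buildBulkDeleteSql({
  summary: 'True',
  serialFrom: '000000001000',
  serialTo: '000000200000'
});
// Then, without ever materializing the matched rows:
// await sequelize.query(sql, { replacements });
```

Because the subquery runs entirely in the database, the row count never matters on the Node side.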

Related

Mongo db - how to join and sort two collection with pagination

I have 2 collections:
Office -
{
  _id: ObjectId(someOfficeId),
  name: "some name",
  ...other fields
}
Documents -
{
  _id: ObjectId(SomeId),
  name: "Some document name",
  officeId: ObjectId(someOfficeId),
  ...etc
}
I need to get a list of offices sorted by the count of documents that refer to each office. Pagination should also be implemented.
I tried to do this with an aggregation using $lookup:
const aggregation = [
  {
    $lookup: {
      from: 'documents',
      let: {
        id: '$id'
      },
      pipeline: [
        {
          $match: {
            $expr: {
              $eq: ['$officeId', '$id']
            },
            // sent_at: {
            //   $gte: start,
            //   $lt: end,
            // },
          }
        }
      ],
      as: 'documents'
    },
  },
  { $sortByCount: "$documents" },
  { $skip: (page - 1) * limit },
  { $limit: limit },
];
But this doesn't work for me.
Any ideas how to achieve this?
P.S. I need to show offices with 0 documents, so going from documents to offices doesn't work for me.
Query
You can use $lookup to join on that field, with a pipeline that groups so you count the documents of each office (instead of putting the documents into an array, because you only care about the count).
$set is used to bring that count up to a top-level field.
Sort on the noffices field.
You can use the skip/limit approach for pagination, but if your collection is very big it will be slow; see this. Alternatively you can paginate using the natural _id order, or retrieve more documents in each query and keep them in memory (instead of retrieving just one page's documents).
Test code here
offices.aggregate([
  { "$lookup": {
      "from": "documents",
      "localField": "_id",
      "foreignField": "officeId",
      "pipeline": [ { "$group": { "_id": null, "count": { "$sum": 1 } } } ],
      "as": "noffices" } },
  { "$set": {
      "noffices": {
        "$cond": [
          { "$eq": ["$noffices", []] }, 0,
          { "$arrayElemAt": ["$noffices.count", 0] } ] } } },
  { "$sort": { "noffices": -1 } }
])
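A minimal sketch of the _id-based ("keyset") alternative to $skip/$limit mentioned above, simulated here on a plain array so it runs anywhere; against MongoDB the equivalent would be find({ _id: { $gt: lastSeenId } }).sort({ _id: 1 }).limit(pageSize):

```javascript
function nextPage(docs, lastSeenId, pageSize) {
  // Keyset pagination: only return documents whose _id comes after the last one seen.
  return docs
    .filter(d => lastSeenId === null || d._id > lastSeenId)
    .sort((a, b) => a._id - b._id)
    .slice(0, pageSize);
}

const offices = [{ _id: 3 }, { _id: 1 }, { _id: 4 }, { _id: 2 }];
const page1 = nextPage(offices, null, 2);          // _id 1 and 2
const page2 = nextPage(offices, page1[1]._id, 2);  // _id 3 and 4
```

Unlike $skip, the database never has to walk past the already-returned documents, so page cost stays constant as you go deeper.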
As the other answer pointed out, you forgot the _ of _id. But with the above lookup you don't need the let, or the $match with $expr inside the pipeline. Also, $sortByCount doesn't count the members of an array; for that you would need $size ($sortByCount is just group-and-count, it's not for arrays). But you don't need $size either, because you can count inside the pipeline, like above.
Edit
Query
You can add to the pipeline whatever stages you need, or just leave it empty.
This keeps all the documents and counts the array size,
and then sorts.
Test code here
offices.aggregate([
  { "$lookup": {
      "from": "documents",
      "localField": "_id",
      "foreignField": "officeId",
      "pipeline": [],
      "as": "alldocuments" } },
  { "$set": { "ndocuments": { "$size": "$alldocuments" } } },
  { "$sort": { "ndocuments": -1 } }
])
There are two errors in your lookup.
While passing the variable in with let, you forgot the _ of the $_id local field:
let: {
id: '$id'
},
In the $expr, since you are using a variable id and not a field of the Documents collection, you should use $$ to reference the variable:
$expr: {
$eq: ['$officeId', '$$id']
},

Faunadb how do I paginate an index sorted by Timestamp

I created an index on faunadb to sort by the time stamp in order to get the most recent items first, and I am trying to retrieve 100 items at a time. My problem is that when I enter the "after" parameter from the result, I receive the same results as the initial query.
This is the index I created:
CreateIndex({
  name: "all_school_queries",
  source: Collection('<school_queries_reversed>'),
  values: values: [
    {
      field: ["ts", {reverse: true}]
    },
    {
      field: ["ref"]
    }
  ]
})
This is how I am querying the database:
Map(
  Paginate(Match(Index("school_query_reverse")), {
    after: [ Ref(Collection("collection_name"), 'collection ref from first query') ],
  }),
  Lambda(
    ['ts', "ref"],
    Get(Var("ref"))
  )
)
and this is the first result:
{
  before: [ Ref(Collection("collection_name"), "275484304279077376") ],
  after: [
    1598907150720000,
    Ref(Collection("school_queries"), "12345"),
    Ref(Collection("school_queries"), "12345")
  ],
}
I have used both the timestamp for the after, 1598907150720000, and the ref, 12345. I tried the console first to make sure I could get the right response, but upon entering either result from the after, I get the same result.
I'll try to answer your question (I'm the dev adv at FaunaDB). I have to say that I'm quite confused by your question due to syntax that doesn't seem to make sense to me, so I apologize if it's not the answer you are looking for.
Things that I'm confused by.
The index syntax is wrong; did you copy this from somewhere, or did you rewrite this manually? If you copied it, then we might be displaying it wrongly, so do let me know if that's the case. The index name also does not match the name you are using in the query, so I assume that's a typo.
<school_queries_reversed>: "reversed" in the collection name doesn't seem to make sense to me, since reverse is defined on the index, not on the collection.
It doesn't matter though; I tried to reproduce your issue, and since I don't know how your data looks, I kept it simple.
The index I used looks as follows:
CreateIndex({
  name: "all_school_queries",
  source: Collection('school_queries'),
  values: [
    {
      field: ["ts"],
      reverse: true
    },
    {
      field: ["ref"]
    }
  ]
})
If I then query this index as follows:
Map(
  Paginate(Match(Index("all_school_queries")), { size: 1 }),
  Lambda(
    ['ts', "ref"],
    Get(Var("ref"))
  )
)
I do get the last element I added first (reverse index)
{
  after: [
    1599220462170000,
    Ref(Collection("school_queries"), "275735235372515847"),
    Ref(Collection("school_queries"), "275735235372515847")
  ],
  data: [
    {
      ref: Ref(Collection("school_queries"), "275735244842205703"),
      ts: 1599220471200000,
      data: {
        query: "bli"
      }
    }
  ]
}
and when I use the returned after cursor to get the next page (I have specified pages of only one element here):
Map(
  Paginate(Match(Index("all_school_queries")), {
    size: 1,
    after: [
      1599220462170000,
      Ref(Collection("school_queries"), "275735235372515847"),
      Ref(Collection("school_queries"), "275735235372515847")
    ]
  }),
  Lambda(
    ['ts', "ref"],
    Get(Var("ref"))
  )
)
I do get (as expected) the other element.
{
  before: [
    1599220462170000,
    Ref(Collection("school_queries"), "275735235372515847"),
    Ref(Collection("school_queries"), "275735235372515847")
  ],
  data: [
    {
      ref: Ref(Collection("school_queries"), "275735235372515847"),
      ts: 1599220462170000,
      data: {
        query: "bla"
      }
    }
  ]
}
Is that not working for you?
I had the same problem and created a forum post on how to deal with it.
https://forums.fauna.com/t/filter-by-timestamp-with-gql-resolver/3302
I guess most people are missing
paginated: true in the GQL schema, or
Map(Var("page"), Lambda(["ts", "ref"], Get(Var("ref")))) to pass ["ts", "ref"] into the Lambda after pagination.

mongodb sort by dynamic property [duplicate]

I have a Mongo collection of messages that looks like this:
{
  'recipients': [],
  'unRead': [],
  'content': 'Text'
}
Recipients is an array of user ids, and unRead is an array of all users who have not yet opened the message. That's working as intended, but I need to query the list of all messages so that it returns the first 20 results, prioritizing the unread ones first, something like:
db.messages.find({ recipients: { $elemMatch: userID } })
  .sort({ unRead: { $elemMatch: userID } })
  .limit(20)
But that doesn't work. What's the best way to prioritize results based on whether they fit a certain criteria?
If you want to "weight" results by certain criteria, or use any kind of "calculated value" within a "sort", then you need the .aggregate() method instead. This allows "projected" values to be used in the $sort operation, which can only sort on a field that is present in the document:
db.messages.aggregate([
  { "$match": { "recipients": userId } },
  { "$project": {
      "recipients": 1,
      "unRead": 1,
      "content": 1,
      "readYet": {
        "$setIsSubset": [ [userId], "$unRead" ]
      }
  }},
  { "$sort": { "readYet": -1 } },
  { "$limit": 20 }
])
Here the $setIsSubset operator compares the "unRead" array with the converted array [userId] to see if there are any matches. The result is either true where the userId exists or false where it does not.
This is then passed to $sort, which orders the results with preference to the matches (a descending sort puts true on top), and finally $limit just returns the results up to the amount specified.
So in order to use a calculated term for the "sort", the value needs to be "projected" into the document so it can be sorted upon. The aggregation framework is how you do this.
Also note that $elemMatch is not required just to match a single value within an array, and you need only specify the value directly. It's purpose is where "multiple" conditions need to be met on a single array element, which of course does not apply here.
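To make the sort semantics concrete, here is a small in-memory illustration in plain JavaScript (the sample messages are invented for the example): the projected readYet is true exactly when userId is still in unRead, and the descending sort puts those messages first:

```javascript
const userId = 'u1';
const messages = [
  { content: 'read already', unRead: [] },
  { content: 'still unread', unRead: ['u1', 'u2'] },
  { content: 'unread by someone else', unRead: ['u2'] }
];

// Compute the same boolean the pipeline projects, then sort descending on it.
const sorted = messages
  .map(m => ({ ...m, readYet: m.unRead.includes(userId) }))
  .sort((a, b) => Number(b.readYet) - Number(a.readYet));
// sorted[0] is the message still unread by u1
```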

Sort using a field in an embedded doc in an array without considering other equivalent fields in mongodb

I have collection called Products.
Documents of Products look like this:
{
  id: 123456,
  recommendationByCategory: [
    { categoryId: 'a01', recommendation: 3 },
    { categoryId: '0a2', recommendation: 8 },
    { categoryId: '0b10', recommendation: 99 },
    { categoryId: '0b5', recommendation: 1 }
  ]
}
{
  id: 567890,
  recommendationByCategory: [
    { categoryId: 'a7', recommendation: 3 },
    { categoryId: '0a2', recommendation: 1 },
    { categoryId: '0b10', recommendation: 999 },
    { categoryId: '0b51', recommendation: 12 }
  ]
}
I want to find all the docs that contain categoryId: '0a2' in recommendationByCategory, sorted in ascending order by the recommendation of category '0a2' alone, ignoring the recommendations of the other categories. I need id: 567890 followed by id: 123456.
I cannot use aggregation. Is this possible using MongoDB/Mongoose? I tried giving the sort option 'recommendationByCategory.recommendation': 1, but it's not working.
Expected Query: db.collection('products').find({'recommendaionByCategory.categoryId': categoryId}).sort({'recommendationByCategory.recommendation: 1'})
Expected Result:
[
{doc with id:567890},
{doc with id: 123456}
]
If you cannot use mapReduce or the aggregation pipeline, there is no easy way to both search for the matching embedded document and sort on that document's prop.
I would recommend doing the find as you do above (note the typo in the find nested key), and then sorting in-memory:
const categoryId = '0a2';
const findRec = doc =>
  doc.recommendationByCategory.find(rec => rec.categoryId === categoryId).recommendation;

db.collection('products')
  .find({ 'recommendationByCategory.categoryId': categoryId })
  .toArray()
  .then(docs => docs.sort((a, b) => findRec(a) - findRec(b)));
In regard to the Aggregation Pipeline being resource-intensive: it is several orders of magnitude more efficient than a Map-Reduce query, and solves your particular issue. Either you accept that this task will be run at a certain cost and frequency, taking into account Mongo's built-in caching, or you restructure your document schema to allow you to make this query more efficiently.
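Applying that in-memory sort to the two sample documents from the question (trimmed here to two categories each) gives the expected order:

```javascript
const categoryId = '0a2';
const products = [
  { id: 123456, recommendationByCategory: [
      { categoryId: 'a01', recommendation: 3 },
      { categoryId: '0a2', recommendation: 8 } ] },
  { id: 567890, recommendationByCategory: [
      { categoryId: 'a7', recommendation: 3 },
      { categoryId: '0a2', recommendation: 1 } ] }
];

// Pull out the 0a2 recommendation for each doc and sort ascending on it.
const findRec = doc =>
  doc.recommendationByCategory.find(rec => rec.categoryId === categoryId).recommendation;
const ids = [...products].sort((a, b) => findRec(a) - findRec(b)).map(d => d.id);
// ids is [567890, 123456]
```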

Ember data - return two resources in one request

I want to implement search for an e-shop. The user enters text, and the API returns the products AND categories that match the search phrase.
How do I get products and categories in one request ?
I'm aware I could do
return Ember.RSVP.hash({
  products: this.store.find("product", { searchTerm: "banana" }),
  categories: this.store.find("category", { searchTerm: "banana" })
});
but isn't there a way to do it in a single request, for better performance?
If you can modify your backend, just create a new method for search: this.store.find("searchResult", { searchTerm: "banana" }),
where searchResult would be something like
{ searchResult: { products: [ ... ], categories: [ ... ] } }
