Filter elements within object in DynamoDB - python-3.x

I have a DynamoDB table which contains objects looking like the following:
{
    'username': ...,
    'subscriptions': [
        ...
    ],
    ...
}
And I would like to filter each user's subscriptions against some criteria, and get back the objects whose subscriptions match the criteria, trimmed so that only the matching subscriptions are present.
If a user has subscribed to 50 things, but only 3 of them match, I would like to get back an object whose subscriptions field is only 3 elements long. I also need the other information contained in the object.
To give a more specific example, suppose that I have two elements in my table:
{
    'username': 'alice',
    'subscriptions': [
        1,
        2
    ],
    'email': 'alice@a.com'
},
{
    'username': 'bob',
    'subscriptions': [
        2,
        3
    ],
    'email': 'bob@a.com'
}
And I would like to filter to get subscription 1. I would like to get back
{
    'username': 'alice',
    'subscriptions': [
        1
    ],
    'email': 'alice@a.com'
}
Or perhaps some other structure which contains all the necessary information (and nothing else, to save bandwidth).
I believe I can do this with scan(), but I do not know the specifics.
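For reference, here is a minimal boto3 sketch of that scan() approach (the table name 'users' is an assumption; everything else follows the example above). A scan FilterExpression decides which whole items come back, but DynamoDB cannot trim elements out of a list attribute server-side, so the subscriptions list has to be narrowed client-side:
import boto3
from boto3.dynamodb.conditions import Attr

table = boto3.resource('dynamodb').Table('users')  # hypothetical table name

def users_subscribed_to(target):
    # Server side: keep only items whose subscriptions list contains target.
    kwargs = {'FilterExpression': Attr('subscriptions').contains(target)}
    items = []
    while True:
        response = table.scan(**kwargs)
        items.extend(response['Items'])
        if 'LastEvaluatedKey' not in response:  # scan() paginates
            break
        kwargs['ExclusiveStartKey'] = response['LastEvaluatedKey']
    # Client side: trim each item's subscriptions down to the matches.
    for item in items:
        item['subscriptions'] = [s for s in item['subscriptions'] if s == target]
    return items

# users_subscribed_to(1) -> [{'username': 'alice', 'subscriptions': [1], 'email': 'alice@a.com'}]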

Related

Query Azure Cosmos for items that have all elements

Let's say I have a couple of sample items in my DB that look like this:
Example 1:
{
    'id': 'someid',
    'columns': ['apple', 'banana'],
    'filters': {'car': 'red', 'truck': 'blue'}
}
Example 2:
{
    'id': 'someid',
    'columns': ['apple', 'banana', 'carrots'],
    'filters': {'truck': 'blue', 'car': 'red', 'boat':'green'}
}
Fast forward to the present: I have a new set of columns and filters that might look like:
newcolumns=['banana','apple'] and newfilters={'truck':'blue', 'car':'red'}
I want to do a query like
select *
from f
where f.columns=newcolumns and f.filters=newfilters
but I don't care about the order, and they have to be a complete match: neither set can be a super- or subset of the other. So in this case my query should return Example 1 but not Example 2.
As a note, there are two parts to my question: the columns matching is answered by this, but the filters field isn't a list, so the syntax isn't the same.
You can't really match by doing an equality-comparison on two arrays like you had:
where f.columns=newcolumns and f.filters=newfilters
Instead, use Cosmos DB's built-in ARRAY_CONTAINS(), combined with ARRAY_LENGTH().
Since columns is a scalar array, it would look something like:
WHERE ARRAY_CONTAINS(c.columns, "banana")
AND ARRAY_CONTAINS(c.columns, "apple")
For filters, that isn't an array, so... first check to make sure the key is defined, then check if the value is correct:
WHERE IS_DEFINED(c.filters.truck) AND c.filters.truck="blue"
AND IS_DEFINED(c.filters.car) AND c.filters.car="red"
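Putting the two halves together, and adding the ARRAY_LENGTH() guard mentioned above so that supersets like Example 2 are rejected, the combined query might look something like this (note it still tolerates extra keys inside filters; the restructuring suggested below handles that case more cleanly):
SELECT *
FROM c
WHERE ARRAY_CONTAINS(c.columns, "banana")
  AND ARRAY_CONTAINS(c.columns, "apple")
  AND ARRAY_LENGTH(c.columns) = 2
  AND IS_DEFINED(c.filters.truck) AND c.filters.truck = "blue"
  AND IS_DEFINED(c.filters.car) AND c.filters.car = "red"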
note: probably a good idea to turn your filters into subdocuments, since your current schema is an anti-pattern (using values as keys). Something like:
{
    "filters": [
        { "vehicleType": "truck", "color": "blue" },
        { "vehicleType": "car", "color": "red" }
    ]
}
at that point you can compare with ARRAY_CONTAINS() again. Something like:
WHERE ARRAY_CONTAINS(c.filters, { "vehicleType": "truck", "color": "blue"}, true)

Bulk delete in Sequelize using join query

I have two tables in a 1-many relationship, Device and TestResult. A Device can have many TestResults.
My application presents the user with a filterable search page that returns a list of TestResults; it's effectively a query builder in the back end. The user can add filters based on both Device attributes and TestResult attributes. For instance, a user may search for all TestResults that passed and were performed during a time range, for Devices whose serial number falls within a specific range. The results should be grouped by Device, hence querying from the Device class.
Here is an example of what a FindOptions object might look like, which I would pass to the Device.findAndCountAll() method:
const { Op } = require('sequelize');

let results = Device.findAndCountAll({
    include: [
        {
            model: TestResult,
            attributes: [
                'id',
                'testBlockId',
                'deviceId',
                'type',
                'fieldResponses',
                'stationName',
                'summary',
                'createdAt'
            ],
            where: {
                createdAt: { [Op.gte]: '2020-03-27T11:54:43.100Z' },
                stationName: { [Op.in]: [ 'Red', 'Green' ] },
                fieldResponses: {
                    [Op.and]: [
                        { [Op.like]: '%"Customer":"CustomerA"%' },
                        { [Op.like]: '%"Batch":"4"%' }
                    ]
                },
                testBlockId: { [Op.in]: [ 2, 3 ] },
                summary: 'True'
            },
            as: 'testResults'
        }
    ],
    attributes: [ 'id', 'serialNumber', 'createdAt' ],
    limit: 100,
    offset: 0,
    order: [ [ 'serialNumber', 'ASC' ] ],
    where: {
        serialNumber: { [Op.between]: [ '000000001000', '000000200000' ] }
    }
})
I'm now trying to add an option to delete all TestResult records that are returned by one of these searches, but I'm not sure of the proper way to do this with Sequelize.
The DestroyOptions type does not have an include[] attribute, so I don't know how to add an INNER JOIN to a DELETE query in Sequelize.
It might be possible to call TestResult.findAll() and destroy the records in the .then() function, but I haven't figured out how to do this. Without a LIMIT set, the query might return hundreds of thousands of rows, far too many to hold in memory after Sequelize turns them all into complex objects. I also don't want to delete the records one at a time.
Ideally, a query would look like this:
DELETE FROM testResult
WHERE summary = 'True'
  AND deviceId IN (
    SELECT id FROM device
    WHERE serialNumber BETWEEN '000000001000' AND '000000200000'
  );
But I don't know how to achieve a subquery like that in Sequelize.
Is there a proper way, using Sequelize, to do bulk deletes with complex WHERE clauses?
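One approach that maps directly onto that SQL (a sketch only: the table name 'devices' and the sequelize instance are assumptions) is to keep the subquery as raw SQL via sequelize.literal() inside destroy(), which issues a single bulk DELETE without loading rows into memory:
const { Op } = require('sequelize');

// Sketch: bulk DELETE with a raw subquery. sequelize.literal() is embedded
// in the generated SQL verbatim, so only interpolate trusted values (or use
// bind parameters) to avoid SQL injection.
await TestResult.destroy({
    where: {
        summary: 'True',
        deviceId: {
            [Op.in]: sequelize.literal(
                "(SELECT id FROM devices WHERE serialNumber BETWEEN '000000001000' AND '000000200000')"
            )
        }
    }
});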

And condition on Join table on the same field multiple times in Sequelize NodeJS?

I have one table called Users. One user can be present in multiple Email Groups.
User Table
id, name
1, Sudhir Roy
2, Rahul Singh
Email Group Table
id, emailType, userID
1, Promotional, 1
2, Advertisement, 2
3, Advertisement, 1
I need to get all users based on Email Type. (The email type is dynamic and sent in array form, like ["Promotional", "Advertisement"].)
How can we find users based on the emailGroup?
["Advertisement"] should return [1, 2] from the user table.
["Advertisement", "Promotional"] should return [1] from the user table.
I tried to use [Op.and] and [Op.in]; neither works.
Here is my sample code.
const user = User.findAll({
    where: {
        "$emailGroup.emailType$": {
            [Op.in]: emailGroup // Array from client
        },
    },
    include: {
        model: EmailGroup
    }
})
It works well when the email group array has a single element, but not when we try to find more than one email group.
You can also add where clauses to the include object.
The resulting findAll would then become:
const user = User.findAll({
    include: [
        {
            model: EmailGroup,
            where: {
                emailType: {
                    [Op.in]: emailGroup, // Array from client
                },
            },
        },
    ],
});
As shown here in the docs: https://sequelize.org/v5/manual/querying.html#relations---associations
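One caveat: with [Op.in] this still matches users having any of the given types. If the requirement is users having all of them (the ["Advertisement", "Promotional"] => [1] case above), a common pattern is to group by user and require the match count to equal the array length. A sketch only: the 'emailGroup' alias is taken from the question, and the sequelize instance and column names are assumptions:
const { Op } = require('sequelize');

const users = await User.findAll({
    include: [{
        model: EmailGroup,
        as: 'emailGroup',   // association alias assumed from the question
        attributes: [],     // join only for filtering
        where: { emailType: { [Op.in]: emailGroup } },
    }],
    group: ['User.id'],
    // keep only users matched by every requested type
    having: sequelize.where(
        sequelize.fn('COUNT', sequelize.fn('DISTINCT', sequelize.col('emailGroup.emailType'))),
        emailGroup.length
    ),
    subQuery: false,
});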

Handling errors with bulkinsert in Mongo NodeJS [duplicate]

This question already has answers here:
How to Ignore Duplicate Key Errors Safely Using insert_many
(3 answers)
Closed 5 years ago.
I'm using NodeJS with MongoDB and Express.
I need to insert records into a collection where email field is mandatory.
I'm using the insertMany function to insert records. It works fine when unique emails are inserted, but when duplicate emails are entered, the operation breaks abruptly.
I tried using try/catch to print the error message, but the execution fails as soon as a duplicate email is inserted. I want the execution to continue and to capture the duplicates, so that I end up with a final list of the records that were inserted and those that failed.
Error Message:
Unhandled rejection MongoError: E11000 duplicate key error collection: testingdb.gamers index: email_1 dup key: 
Is there any way to handle the errors or is there any other approach apart from insertMany?
Update:
Email is a unique field in my collection.
If you want to continue inserting all the non-duplicate documents rather than stopping on the first error, consider setting the {ordered: false} option on insertMany(), e.g.
db.collection.insertMany(
    [ , , ... ],
    { ordered: false }
)
According to the docs, unordered operations will continue to process any remaining write operations in the queue but still show your errors in the BulkWriteError.
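From Node.js the same idea looks roughly like this (a sketch; db and docs are assumed from the surrounding code, the collection name 'gamers' comes from the error message above, and the exact error shape varies slightly across driver versions). With { ordered: false } the driver attempts every document, the non-duplicates are inserted, and the duplicates are reported together in the thrown BulkWriteError:
try {
    const result = await db.collection('gamers').insertMany(docs, { ordered: false });
    console.log(`all ${result.insertedCount} documents inserted`);
} catch (err) {
    // Non-duplicate documents were still inserted; the failures are
    // collected on the error rather than thrown one at a time.
    console.log('failed indexes:', err.writeErrors.map(e => e.index));
}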
I can't comment yet, so this goes as an answer:
Is your database collection using a unique index for this field, or does your schema have a unique attribute for the field? Please share more information about your code.
From the MongoDB docs:
"Inserting a duplicate value for any key that is part of a unique index, such as _id, throws an exception. The following attempts to insert a document with an _id value that already exists:"
try {
    db.products.insertMany( [
        { _id: 13, item: "envelopes", qty: 60 },
        { _id: 13, item: "stamps", qty: 110 },
        { _id: 14, item: "packing tape", qty: 38 }
    ] );
} catch (e) {
    print(e);
}
Since _id: 13 already exists, the following exception is thrown:
BulkWriteError({
    "writeErrors" : [
        {
            "index" : 0,
            "code" : 11000,
            "errmsg" : "E11000 duplicate key error collection: restaurant.test index: _id_ dup key: { : 13.0 }",
            "op" : {
                "_id" : 13,
                "item" : "envelopes",
                "qty" : 60
            }
        }
    ],
(some code omitted)
Hope it helps.
Since you know that the error occurs due to duplicate key insertions, you can separate the initial array of objects into two parts: one with unique keys and the other with the duplicates. That way you have a list of duplicates you can manipulate, and a list of originals to insert.
let a = [
    {'email': 'dude@gmail.com', 'dude': 4},
    {'email': 'dude@yahoo.com', 'dude': 2},
    {'email': 'dude@hotmail.com', 'dude': 2},
    {'email': 'dude@gmail.com', 'dude': 1}
];
// Partition into first-seen originals and duplicates, keyed on email.
let i = a.reduce((acc, item) => {
    acc.original.map(o => o.email).indexOf(item.email) == -1
        ? acc.original.push(item)
        : acc.duplicates.push(item);
    return acc;
}, {'original': [], 'duplicates': []});
console.log(i);
EDIT: I just realised that this won't work if the keys are already present in the DB, so you should probably not use this answer. But I'll leave it here as a reference for someone else who may think along the same lines.
Nic Cottrell's answer is right.

mongodb sort by dynamic property [duplicate]

I have a Mongo collection of messages that looks like this:
{
    'recipients': [],
    'unRead': [],
    'content': 'Text'
}
Recipients is an array of user ids, and unRead is an array of all users who have not yet opened the message. That's working as intended, but I need to query the list of all messages so that it returns the first 20 results, putting the unread ones first, something like:
db.messages.find({ recipients: { $elemMatch: userID } })
    .sort({ unRead: { $elemMatch: userID } })
    .limit(20)
But that doesn't work. What's the best way to prioritize results based on whether they fit a certain criteria?
If you want to "weight" results by certain criteria or have any kind of "calculated value" within a "sort", then you need the .aggregate() method instead. This allows "projected" values to be used in the $sort operation, for which only a present field in the document can be used:
db.messages.aggregate([
    { "$match": { "recipients": userId } },
    { "$project": {
        "recipients": 1,
        "unRead": 1,
        "content": 1,
        "readYet": {
            "$setIsSubset": [ [userId], "$unRead" ]
        }
    }},
    { "$sort": { "readYet": -1 } },
    { "$limit": 20 }
])
Here the $setIsSubset operator compares the "unRead" array with a converted array of [userId] to see if there are any matches. The result will be either true where the userId exists or false where it does not.
This can then be passed to $sort, which orders the results with preference to the matches (a descending sort puts true on top), and finally $limit just returns the results up to the amount specified.
So in order to use a calculated term for "sort", the value needs to be "projected" into the document so it can be sorted upon. The aggregation framework is how you do this.
Also note that $elemMatch is not required just to match a single value within an array; you need only specify the value directly. Its purpose is where "multiple" conditions need to be met on a single array element, which of course does not apply here.
