Handling errors with bulkinsert in Mongo NodeJS [duplicate] - node.js

I'm using NodeJS with MongoDB and Express.
I need to insert records into a collection where the email field is mandatory.
I'm using the insertMany function to insert records. It works fine when the emails are unique, but when a duplicate email is inserted, the operation aborts.
I tried using try/catch to print the error message, but execution fails as soon as a duplicate email is inserted. I want execution to continue and to keep track of the duplicates, so that I end up with a final list of the records that were inserted and those that failed.
Error Message:
Unhandled rejection MongoError: E11000 duplicate key error collection: testingdb.gamers index: email_1 dup key: 
Is there any way to handle the errors or is there any other approach apart from insertMany?
Update:
Email is a unique field in my collection.

If you want to continue inserting all the non-duplicate documents rather than stopping on the first error, consider setting the {ordered: false} option on insertMany(), e.g.
db.collection.insertMany(
[ <document 1>, <document 2>, ... ],
{
ordered: false
}
)
According to the docs, unordered operations will continue to process any remaining write operations in the queue but still show your errors in the BulkWriteError.
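As a rough sketch of how the final inserted/failed lists could be assembled after an unordered insert: the helper below assumes the caught error exposes a writeErrors array whose entries carry the index of the failing document in the input array, as in the BulkWriteError output shown in the docs. The name partitionByWriteErrors and the sample data are hypothetical, and the exact error shape may differ between driver versions.

```javascript
// Hypothetical helper: split the input documents into inserted and failed
// lists, given the write errors reported for an unordered insertMany().
// Assumes each write error has a numeric `index` into the input array.
function partitionByWriteErrors(docs, writeErrors) {
  const failedIndexes = new Set(writeErrors.map((e) => e.index));
  const inserted = [];
  const failed = [];
  docs.forEach((doc, i) => {
    (failedIndexes.has(i) ? failed : inserted).push(doc);
  });
  return { inserted, failed };
}

// Hand-written error object standing in for the error thrown by the driver.
const sampleDocs = [
  { email: 'a@example.com' },
  { email: 'a@example.com' }, // duplicate of the first
  { email: 'b@example.com' },
];
const fakeError = { writeErrors: [{ index: 1, code: 11000 }] };
const partitioned = partitionByWriteErrors(sampleDocs, fakeError.writeErrors);
console.log(partitioned.inserted.length, partitioned.failed.length); // 2 1
```

In real code the helper would be called from the catch block around insertMany(docs, { ordered: false }), passing the original array and the error's writeErrors.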

I can't comment yet, so this goes as an answer:
Is your database collection using a unique index on this field, or does your schema have a unique attribute for it? Please share more information about your code.
From the MongoDB docs:
"Inserting a duplicate value for any key that is part of a unique index, such as _id, throws an exception. The following attempts to insert a document with a _id value that already exists:"
try {
db.products.insertMany( [
{ _id: 13, item: "envelopes", qty: 60 },
{ _id: 13, item: "stamps", qty: 110 },
{ _id: 14, item: "packing tape", qty: 38 }
] );
} catch (e) {
print (e);
}
Since _id: 13 already exists, the following exception is thrown:
BulkWriteError({
"writeErrors" : [
{
"index" : 0,
"code" : 11000,
"errmsg" : "E11000 duplicate key error collection: restaurant.test index: _id_ dup key: { : 13.0 }",
"op" : {
"_id" : 13,
"item" : "envelopes",
"qty" : 60
}
}
],
(some code omitted)
Hope it helps.

Since you know that the error is occurring due to duplicate key insertions, you can separate the initial array of objects into two parts. One with unique keys and the other with duplicates. This way you have a list of duplicates you can manipulate and a list of originals to insert.
let a = [
{'email': 'dude@gmail.com', 'dude': 4},
{'email': 'dude@yahoo.com', 'dude': 2},
{'email': 'dude@hotmail.com', 'dude': 2},
{'email': 'dude@gmail.com', 'dude': 1}
];
let i = a.reduce((i, j) => {
i.original.map(o => o.email).indexOf(j.email) == -1? i.original.push(j): i.duplicates.push(j);
return i;
}, {'original': [], 'duplicates': []});
console.log(i);
EDIT: I just realised that this won't work if the keys are already present in the DB, so you should probably not use this answer. I'll leave it here as a reference for anyone else who may think along the same lines.
Nic Cottrell's answer is right.

Related

MongoDB query to find most recently added object in an array within a document then making a further query based on that result

I have two collections in MongoDB; users & challenges.
The structure of the users collection looks like this:
name: "John Doe"
email: "john@doe.com"
progress: [
{
_id : ObjectId("610be25ae20ce4872b814b24")
challenge: ObjectId("60f9629edd16a8943d2cab9b")
completed: true
date_completed: 2021-08-06T12:15:32.129+00:00
}
{
_id : ObjectId("611be24ae32ce4772b814b32")
challenge: ObjectId("60g6723efd44a6941l2cab81")
completed: true
date_completed: 2021-08-07T12:15:32.129+00:00
}
]
date: 2021-08-05T13:06:34.129+00:00
The structure of the challenges collection looks like this:
_id: ObjectId("610be25ae20ce4872b814b24")
section_no: 1
section_name: "Print Statements"
challenge_no: 1
challenge_name: "Hello World!"
default_code: "public class Main {public static void main(String[] args) {}}"
solution: "Hello World!"
What I want to be able to do is find the most recent entry in a particular user's 'progress' array within the users collection and based on that result I want to query the challenges collection to find the next challenge for that user.
So say the most recent challenge entry in that user's 'progress' array is...
{
_id : ObjectId("611be24ae32ce4772b814b32")
challenge: ObjectId("60g6723efd44a6941l2cab81")
completed: true
date_completed: 2021-08-07T12:15:32.129+00:00
}
...which is Section 1 Challenge 2. I want to be able to query the challenges collection to return Section 1 Challenge 3, and if that doesn't exist then return Section 2 Challenge 1.
Apologies if this is worded poorly, I am fairly new to MongoDb and unsure of how to create complex queries in it.
Thanks in advance!
One approach:
[
{ // Unwind all arrays
"$unwind":"$progress"
},
{ // Sort in descending order all documents
"$sort":{
"progress.date_completed":-1
}
},
{ // Group them together again but pick only the most recent array element
"$group":{
"_id":"$_id",
"latestProgress":{
"$first":"$progress"
}
}
},
{ // Join with other collection
"$lookup":{
"from":"challenges",
"localField":"latestProgress.challenge",
"foreignField":"_id",
"as":"Progress"
}
},
{ // Only pick the first array element (since there will be just one)
"$set":{
"Progress":{
"$first":"$Progress"
}
}
}
]
I have added a comment to each stage to make the idea easier to follow. I'm not confident it's the best approach, but it does work; I have tested it.
Note that the Progress field can be missing from the result. In that case, no matching challenge document exists.
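The pipeline above finds the latest completed challenge; the "next challenge" rule from the question (the next challenge in the same section, otherwise the first challenge of a later section) can be sketched in plain JavaScript. This is an in-memory illustration only: the function name findNextChallenge and the sample data are hypothetical, and it assumes every challenge carries numeric section_no and challenge_no fields as in the question.

```javascript
// In-memory sketch of the "next challenge" rule. Assumes every challenge has
// numeric section_no and challenge_no fields, as in the question.
function findNextChallenge(challenges, current) {
  const ordered = [...challenges].sort(
    (a, b) => a.section_no - b.section_no || a.challenge_no - b.challenge_no
  );
  // The first challenge that sorts strictly after the current one is either
  // the next challenge in the same section or the first of a later section.
  return (
    ordered.find(
      (c) =>
        c.section_no > current.section_no ||
        (c.section_no === current.section_no &&
          c.challenge_no > current.challenge_no)
    ) || null
  );
}

const sampleChallenges = [
  { section_no: 2, challenge_no: 1 },
  { section_no: 1, challenge_no: 1 },
  { section_no: 1, challenge_no: 2 },
];
console.log(findNextChallenge(sampleChallenges, { section_no: 1, challenge_no: 2 }));
// { section_no: 2, challenge_no: 1 }
```

The same ordering could presumably be expressed on the server with a filter on the two fields, a sort on section_no and challenge_no, and a limit of 1.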

Get 1st & last item from array of unknown length - MongoDB

I have documents in a collection. Each document may or may not have a log field. If it does, this log field will be an array. This array is of unknown length. I've been trying to use the $slice operator here as best I can & I have gotten it to return the last item in the array with log: { $slice: -1 } or the 1st item in the array with log: { $slice: 1 } but I cannot figure out how to get both from a single db find query. My query to the db is this:
db.collection('entities').find({}, {
name: 1,
log: {
$slice: -1 // returning the last item
}
})
Is this possible with a simple find query or will I have to use an aggregation query?
I did attempt something like:
db.collection('entities').find({}, {
name: 1,
"log.0": 1,
log: {
$slice: -1
}
})
This failed, apparently due to a conflict with $slice, but I imagine it would fail anyway, as the log field may or may not exist.
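One possible direction, offered as an untested sketch rather than a confirmed answer, is an aggregation with $arrayElemAt, which accepts a negative index to count from the end of the array. The output field names firstLog and lastLog and the helper firstAndLast are hypothetical; the helper shows the equivalent client-side logic, including the case where log is missing.

```javascript
// Aggregation projection that picks the first and last log entries.
// $arrayElemAt accepts a negative index to count from the end of the array.
const firstAndLastPipeline = [
  {
    $project: {
      name: 1,
      firstLog: { $arrayElemAt: ['$log', 0] },
      lastLog: { $arrayElemAt: ['$log', -1] },
    },
  },
];

// Client-side equivalent, tolerant of a missing or empty log field.
function firstAndLast(entity) {
  const log = Array.isArray(entity.log) ? entity.log : [];
  return {
    name: entity.name,
    firstLog: log.length ? log[0] : null,
    lastLog: log.length ? log[log.length - 1] : null,
  };
}

console.log(firstAndLast({ name: 'a', log: ['start', 'middle', 'end'] }));
// { name: 'a', firstLog: 'start', lastLog: 'end' }
```

The pipeline would be passed to db.collection('entities').aggregate(firstAndLastPipeline). How documents without a log field come back should be checked against the server version in use.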

how to remove object in array by index mongodb / mongoose [duplicate]

In the following example, assume the document is in the db.people collection.
How can I remove the 3rd element of the interests array by its index?
{
"_id" : ObjectId("4d1cb5de451600000000497a"),
"name" : "dannie",
"interests" : [
"guitar",
"programming",
"gadgets",
"reading"
]
}
This is my current solution:
var interests = db.people.findOne({"name":"dannie"}).interests;
interests.splice(2,1)
db.people.update({"name":"dannie"}, {"$set" : {"interests" : interests}});
Is there a more direct way?
There is no straight way of pulling/removing by array index. In fact, this is an open issue http://jira.mongodb.org/browse/SERVER-1014 , you may vote for it.
The workaround is using $unset and then $pull:
db.lists.update({}, {$unset : {"interests.3" : 1 }})
db.lists.update({}, {$pull : {"interests" : null}})
Update: as mentioned in some of the comments this approach is not atomic and can cause some race conditions if other clients read and/or write between the two operations. If we need the operation to be atomic, we could:
Read the document from the database
Update the document and remove the item in the array
Replace the document in the database. To ensure the document has not changed since we read it, we can use the update if current pattern described in the mongo docs
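The three steps above can be sketched as a small helper that builds the filter and update for the write step. The idea is that the filter repeats the array exactly as it was read, so the update matches only if nobody has changed it in the meantime. The name buildRemoveAtIndex is hypothetical, and the interests field is taken from the question.

```javascript
// Build the filter and update for an "update if current" removal.
// The filter repeats the array exactly as it was read, so the update only
// matches if no other client has changed it since the read.
function buildRemoveAtIndex(doc, index) {
  const next = doc.interests.filter((_, i) => i !== index);
  return {
    filter: { _id: doc._id, interests: doc.interests },
    update: { $set: { interests: next } },
  };
}

const dannie = {
  _id: 1,
  interests: ['guitar', 'programming', 'gadgets', 'reading'],
};
const op = buildRemoveAtIndex(dannie, 2);
console.log(op.update.$set.interests);
// [ 'guitar', 'programming', 'reading' ]
```

The caller would pass op.filter and op.update to the update call and check how many documents matched; if none did, the document changed in between and the read-modify-write cycle should be retried.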
You can use the $pull modifier of the update operation to remove a particular element from an array. For the case you provided, the query would look like this:
db.people.update({"name":"dannie"}, {'$pull': {"interests": "guitar"}})
Also, you may consider using $pullAll for removing all occurrences. More about this on the official documentation page - http://www.mongodb.org/display/DOCS/Updating#Updating-%24pull
This doesn't use the index as a criterion for removing an element, but it still might help in cases similar to yours. IMO, using indexes to address elements inside an array is not very reliable, since MongoDB isn't consistent about element order as far as I know.
In MongoDB 4.2 you can do this:
db.example.update({}, [
{$set: {field: {
$concatArrays: [
{$slice: ["$field", P]},
{$slice: ["$field", {$add: [1, P]}, {$size: "$field"}]}
]
}}}
]);
P is the index of the element you want to remove from the array.
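For readers less familiar with $slice and $concatArrays, the same logic in plain JavaScript looks roughly like this. It is only a client-side mirror of the pipeline above; removeAt is a hypothetical name.

```javascript
// Plain JavaScript mirror of the pipeline above: keep the first p elements,
// then append everything after index p.
function removeAt(field, p) {
  return field.slice(0, p).concat(field.slice(p + 1));
}

console.log(removeAt(['guitar', 'programming', 'gadgets', 'reading'], 2));
// [ 'guitar', 'programming', 'reading' ]
```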
If you want to remove from P till end:
db.example.update({}, [
{ $set: { field: { $slice: ["$field", P] } } },
]);
Starting in Mongo 4.4, the $function aggregation operator allows applying a custom javascript function to implement behaviour not supported by the MongoDB Query Language.
For instance, in order to update an array by removing an element at a given index:
// { "name": "dannie", "interests": ["guitar", "programming", "gadgets", "reading"] }
db.collection.update(
{ "name": "dannie" },
[{ $set:
{ "interests":
{ $function: {
body: function(interests) { interests.splice(2, 1); return interests; },
args: ["$interests"],
lang: "js"
}}
}
}]
)
// { "name": "dannie", "interests": ["guitar", "programming", "reading"] }
$function takes 3 parameters:
body, which is the function to apply, whose parameter is the array to modify. The function here simply consists in using splice to remove 1 element at index 2.
args, which contains the fields from the record that the body function takes as parameter. In our case "$interests".
lang, which is the language in which the body function is written. Only js is currently available.
Rather than using the unset (as in the accepted answer), I solve this by setting the field to a unique value (i.e. not NULL) and then immediately pulling that value. A little safer from an asynch perspective. Here is the code:
var update = {};
var key = "ToBePulled_"+ new Date().toString();
update['feedback.'+index] = key;
Venues.update(venueId, {$set: update});
return Venues.update(venueId, {$pull: {feedback: key}});
Hopefully mongo will address this, perhaps by extending the $position modifier to support $pull as well as $push.
I would recommend using a GUID (I tend to use ObjectID) field, or an auto-incrementing field for each sub-document in the array.
With this GUID it is easy to issue a $pull and be sure that the correct one will be pulled. Same goes for other array operations.
For people who are looking for an answer using Mongoose with Node.js, this is how I do it:
exports.deletePregunta = function (req, res) {
let codTest = req.params.tCodigo;
let indexPregunta = req.body.pregunta; // the index that come from frontend
let inPregunta = `tPreguntas.0.pregunta.${indexPregunta}`; // my field in my db
let inOpciones = `tPreguntas.0.opciones.${indexPregunta}`; // my other field in my db
let inTipo = `tPreguntas.0.tipo.${indexPregunta}`; // my other field in my db
Test.findOneAndUpdate({ tCodigo: codTest },
{
'$unset': {
[inPregunta]: 1, // put the field with []
[inOpciones]: 1,
[inTipo]: 1
}
}).then(()=>{
return Test.findOneAndUpdate({ tCodigo: codTest }, { // return the promise so the outer catch sees its errors
'$pull': {
'tPreguntas.0.pregunta': null,
'tPreguntas.0.opciones': null,
'tPreguntas.0.tipo': null
}
}).then(testModificado => {
if (!testModificado) {
res.status(404).send({ accion: 'deletePregunta', message: 'No se ha podido borrar esa pregunta ' });
} else {
res.status(200).send({ accion: 'deletePregunta', message: 'Pregunta borrada correctamente' });
}
})}).catch(err => { res.status(500).send({ accion: 'deletePregunta', message: 'error en la base de datos ' + err }); });
}
I can rewrite this answer if it is not easy to understand, but I think it is okay.
Hope this helps you; I lost a lot of time on this issue.
It is a little late, but those who are using Robo 3T may find this useful:
db.getCollection('people').update(
{"name":"dannie"},
{ $pull:
{
interests: "guitar" // you can change value to
}
},
{ multi: true }
);
If you have values like this:
property: [
{
"key" : "key1",
"value" : "value 1"
},
{
"key" : "key2",
"value" : "value 2"
},
{
"key" : "key3",
"value" : "value 3"
}
]
and you want to delete the record where the key is key3, then you can use something like this:
db.getCollection('people').update(
{"name":"dannie"},
{ $pull:
{
property: { key: "key3"} // you can change value to
}
},
{ multi: true }
);
The same goes for the nested property.
This can be done using the $pop operator, but only for the last element (value 1) or the first element (value -1) of the array:
db.getCollection('collection_name').updateOne( {}, {$pop: {"path_to_array_object":1}})

Mongoose count by subobjects

I am trying to count the number of models in a collection based on a property:
I have an upvote model, that has: post (objectId) and a few other properties.
First, is this good design? Posts could get many upvotes, so I didn’t want to store them in the Post model.
Regardless, I want to count the number of upvotes on posts with a specific property with the following and it’s not working. Any suggestions?
upvote.count({'post.specialProperty': mongoose.Types.ObjectId("id")}, function (err, count) {
console.log(count);
});
Post Schema Design
In regards to design. I would design the posts collection for documents to be structured as such:
{
"_id" : ObjectId(),
"property1" : "some value",
"property2" : "some value",
"voteCount" : 1,
"votes": [
{
"voter": ObjectId(), // voter id
// other properties...
}
]
}
You will have an array that will hold objects that can contain info such as voter id and other properties.
Updating
When a post is updated, you could simply increment or decrement the voteCount accordingly. You can increment by 1 like this:
db.posts.update(
{"_id" : postId},
{
$inc: { voteCount: 1},
$push : {
"votes" : {"voter":ObjectId, "otherproperty": "some value"}
}
}
)
The $inc modifier can be used to change the value of an existing key or to create a new key if it does not already exist. It's very useful for updating votes.
Totaling votes of particular Post Criteria
If you want to total the votes for posts matching certain criteria, you must use the Aggregation Framework.
You can get the total like this:
db.posts.aggregate(
[
{
$match : {property1: "some value"}
},
{
$group : {
_id : null,
totalNumberOfVotes : {$sum : "$voteCount" }
}
}
]
)

MongoDB update/insert document and Increment the matched array element

I use Node.js and MongoDB with monk.js, and I want to do logging in a minimal way, with one document per hour, like this:
final doc:
{ time: YYYY-MM-DD-HH, log: [ {action: action1, count: 1 }, {action: action2, count: 27 }, {action: action3, count: 5 } ] }
The complete document should be created by incrementing one value.
E.g., when someone visits a web page for the first time in this hour, incrementing action1 should create the following document with a single query:
{ time: YYYY-MM-DD-HH, log: [ {action: action1, count: 1} ] }
When another user visits another web page in the same hour, the document should be extended to:
{ time: YYYY-MM-DD-HH, log: [ {action: action1, count: 1}, {action: action2, count: 1} ] }
and the values in count should be incremented as the different web pages are visited.
At the moment I create a document for each action:
tracking.update({
time: moment().format('YYYY-MM-DD_HH'),
action: action,
info: info
}, { $inc: {count: 1} }, { upsert: true }, function (err){});
Is this possible with monk.js / mongodb?
EDIT:
Thank you. Your solution looks clean and elegant, but it looks like my server can't handle it, or I am too inexperienced to make it work.
I wrote an extremely dirty solution with the action name as the key:
tracking.update({ time: time, ts: ts }, JSON.parse('{ "$inc": {"' + action + '": 1}}'), { upsert: true }, function (err) {});
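The JSON.parse string concatenation breaks as soon as the action name contains a quote character. A computed property name builds the same update object directly. The helper name buildIncrement is hypothetical, and this sketch only covers building the update, not the monk.js call itself.

```javascript
// Build the $inc update with a computed property name instead of JSON.parse.
function buildIncrement(action) {
  return { $inc: { [action]: 1 } };
}

console.log(JSON.stringify(buildIncrement('action1')));
// {"$inc":{"action1":1}}
```

The result can be passed as the second argument to the update call in place of the JSON.parse expression. Note that MongoDB treats dots in field names as path separators, so action names containing dots would need separate handling.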
Yes, it is very possible, and it is a well-considered question. The only variation I would make on the approach is to calculate the "time" value as a real Date object (quite useful in MongoDB, and easy to manipulate as well) and simply "round" the values with basic date math. You could use "moment.js" for the same result, but I find the math simple.
The other main consideration here is that mixing array "push" actions with possible "upsert" document actions can be a real problem, so it is best to handle this with "multiple" update statements, where only the condition you want is going to change anything.
The best way to do that is with MongoDB Bulk Operations.
Consider that your data comes in something like this:
{ "timestamp": 1439381722531, "action": "action1" }
Here "timestamp" is an epoch timestamp value accurate to the millisecond. The handling then looks like this:
// Just adding for the listing, assuming already defined otherwise
var payload = { "timestamp": 1439381722531, "action": "action1" };
// Round to hour
var hour = new Date(
payload.timestamp - ( payload.timestamp % ( 1000 * 60 * 60 ) )
);
// Initialize an ordered bulk operation (a batch, not a transaction)
var bulk = db.collection.initializeOrderedBulkOp();
// Try to increment where array element exists in document
bulk.find({
"time": hour,
"log.action": payload.action
}).updateOne({
"$inc": { "log.$.count": 1 }
});
// Try to upsert where document does not exist
bulk.find({ "time": hour }).upsert().updateOne({
"$setOnInsert": {
"log": [{ "action": payload.action, "count": 1 }]
}
});
// Try to "push" where array element does not exist in matched document
bulk.find({
"time": hour,
"log.action": { "$ne": payload.action }
}).updateOne({
"$push": { "log": { "action": payload.action, "count": 1 } }
});
bulk.execute();
So if you look through the logic there, you will see that it is only ever possible for "one" of those statements to change anything for any given state of the document, whether it exists or not. Technically speaking, the statement with the "upsert" can actually match a document when it exists; however, the $setOnInsert operation used makes sure that no changes are made unless the action actually "inserts" a new document.
Since all operations are fired in "Bulk", the only time the server is contacted is on the .execute() call. So there is only "one" request to the server and only "one" response, despite the multiple operations.
In this way the conditions are all met:
Create a new document for the current period where one does not exist and insert initial data to the array.
Add a new item to the array where the current "action" classification does not exist and add an initial count.
Increment the count property of the specified action within the array upon execution of the statement.
All in all, yes, it is possible, and it is also a great idea for storage as long as the action classifications do not grow too large within a period (500 array elements should be used as a maximum guide). The updating is very efficient and self-contained within a single document for each time sample.
The structure is also well suited to other queries and possible additional aggregation purposes.
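To make the "only one statement changes anything" argument concrete, here is a small in-memory model of the three statements applied in order to a plain array. It is an illustration of the matching logic only, not driver code: applyAction and roundToHour are hypothetical names, and time is kept as an epoch number rather than a Date so that equality checks stay simple.

```javascript
// In-memory model of the three bulk statements, applied in order to a plain
// array of documents. It only illustrates the matching logic.
function roundToHour(timestamp) {
  return timestamp - (timestamp % (1000 * 60 * 60));
}

function applyAction(store, payload) {
  const hour = roundToHour(payload.timestamp);
  let changes = 0;

  // 1. Increment where the array element already exists.
  let doc = store.find((d) => d.time === hour);
  const entry = doc && doc.log.find((l) => l.action === payload.action);
  if (entry) {
    entry.count += 1;
    changes += 1;
  }

  // 2. Upsert where no document exists for the hour ($setOnInsert).
  doc = store.find((d) => d.time === hour);
  if (!doc) {
    store.push({ time: hour, log: [{ action: payload.action, count: 1 }] });
    changes += 1;
  }

  // 3. Push where the document exists but has no entry for the action.
  doc = store.find((d) => d.time === hour);
  if (doc && !doc.log.some((l) => l.action === payload.action)) {
    doc.log.push({ action: payload.action, count: 1 });
    changes += 1;
  }
  return changes;
}

const store = [];
console.log(applyAction(store, { timestamp: 1439381722531, action: 'action1' })); // 1
console.log(applyAction(store, { timestamp: 1439381722531, action: 'action1' })); // 1
console.log(applyAction(store, { timestamp: 1439381722531, action: 'action2' })); // 1
```

Each call reports exactly one change, whichever state the document was in beforehand.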
