MongoDB (mongoose) retrieve substring - node.js

Let's say I have a blog with some very long posts.
So, I want to display a list of my posts in "preview mode", for instance only first 50 chars of text.
Simple answer is to do this:
Post.find((err, posts) => {
  if (err) return console.log(err);
  posts.forEach(post => {
    console.log('post preview:', post.data.substr(0, 50) + '...');
  });
});
This way we retrieve all of the data in the collection.
If each post holds more than 3 KB of data, retrieving 30 posts this way seems very inefficient in terms of data transfer and processing.
So I wondered: is there a way to retrieve an already-sliced string from the DB?
Or at least, do you have a better solution for my issue?

Yes, you can use the $substr operator with a query like this:
db.collection.aggregate([
  {
    $project: {
      preview: { $substr: ["$data", 0, 50] }
    }
  }
])
Edit:
$substr is deprecated as of MongoDB 3.4 because it only has well-defined behavior for strings of ASCII characters. If you're facing UTF-8 issues, consider upgrading to MongoDB 3.4 in order to use the $substrCP operator instead.
So your query becomes:
db.collection.aggregate([
  {
    $project: {
      preview: { $substrCP: ["$data", 0, 50] }
    }
  }
])
As of today, MongoDB 3.4 is only available as a development release, but a production version should follow soon.
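For the mongoose setup in the question, the same pipeline can be run through Model.aggregate(). A minimal sketch, assuming the Post model and data field from the question:
// Sketch: fetch only 50-character previews instead of full posts.
Post.aggregate([
  { $project: { preview: { $substrCP: ['$data', 0, 50] } } }
])
  .then(posts => {
    posts.forEach(post => console.log('post preview:', post.preview + '...'));
  })
  .catch(err => console.log(err));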

Related

Speed issue and memory error when querying in Mongo

I have a collection that contains over 100,000 records. Server: node.js/express.js. DB: mongo.
On the client, a table with a pager is implemented, requesting 10 records at a time.
With 10,000 records everything of course worked faster, but now there is a speed problem, and not only that.
My aggregation:
import { concat } from 'lodash';
...
let query = [{ $match: {} }];
query = concat(query, [{ $sort: { createdAt: -1 } }]);
query = concat(query, [
  { $skip: (pageNum - 1) * perPage }, // 0
  { $limit: perPage } // 10
]);
return User.aggregate(query)
  .collation({ locale: 'en', strength: 2 })
  .then((users) => ...;
Two cases:
the first fetch is very slow
when I click to the last page I get an error:
MongoError: Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting. Aborting operation. Pass allowDiskUse:true to opt in.
Please tell me: am I building the aggregation incorrectly? Is there a memory problem on the server, as the error says, so that additional nginx settings are needed (another person is handling that)? Is the problem complex, or is it perhaps something else altogether?
Added:
I noticed that the index is not used when sorting, although it should be, shouldn't it?
The aggregation pipeline that actually executes (from console.log):
[
  { "$match": {} },
  { "$lookup": { ... } },
  { "$project": { ..., createdAt: 1, ... } },
  { "$match": {} },
  { "$sort": { "createdAt": -1 } },
  { "$skip": 0 },
  { "$limit": 10 }
]
Thanks for any answers, and sorry for my English :)
It does say that you've hit the memory limit, which makes sense considering that you're trying to sort through 100,000 records. I'd try using return User.aggregate(query, { allowDiskUse: true }) // etc., and see if that helps your issue.
Whilst this isn't the documentation on the Node.js driver specifically, this link summarizes what the allowDiskUse option does (in short, it allows MongoDB to go past the 100 MB memory limit and use your system storage to temporarily store some information while it performs the query).
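With mongoose specifically, the option can also be set through the aggregate builder's allowDiskUse() helper. A minimal sketch based on the aggregation from the question:
// Sketch: opt in to external sorting so $sort can spill to disk.
return User.aggregate(query)
  .allowDiskUse(true) // lifts the 100 MB in-memory sort limit
  .collation({ locale: 'en', strength: 2 })
  .then((users) => { /* ... */ });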

How can I do multiple queries to mongo in one request

Let's say I have a collection of Person { email: 'actual email', ...other data } and want to query whether a Person exists with a given email, retrieving its data if so, or getting null if not.
If I want to do that once, no problem: just do a query through mongoose using Person.findOne() or whatever.
But what if I have to do that check for 25-100 given emails? Of course I can just send a ton of requests to MongoDB and retrieve the data, but that seems like a waste of network.
Is there a good and performant way to query MongoDB with multiple clauses in a single batch, like findBatch([{email: 'email1'}, {email: 'email2'}...{email: 'emailN'}]), and get as a result [document1, null, document3, null, documentN], where null stands for an unmatched find criterion?
Currently I see only one option:
One huge find with a single { email: { $in: [] } } query, and then do the matching on the server side in application logic. Cons: quite cumbersome and error-prone if you have more than one search criterion.
Is there any better way to implement such a thing?
Try this:
Replace arrayOfEmails with your query array.
Replace emailField with the actual field name in your db documents.
db.collName.aggregate([
  {
    "$match": {
      "emailField": { "$in": arrayOfEmails }
    }
  },
  {
    "$group": {
      "_id": null,
      "docs": {
        "$push": {
          "$cond": [
            { "$in": ["$emailField", arrayOfEmails] },
            "$$ROOT",
            null
          ]
        }
      }
    }
  }
])
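If you want the aligned [document1, null, document3, ...] shape from the question, the plain $in find plus a little application code also works (findBatch does not exist as an API). A minimal sketch, assuming the Person model from the question:
// Sketch: one $in query, then align the results to the input order,
// with null for emails that matched no document.
const emails = ['email1', 'email2', 'emailN'];
Person.find({ email: { $in: emails } }).then(docs => {
  const byEmail = new Map(docs.map(doc => [doc.email, doc]));
  return emails.map(email => byEmail.get(email) || null);
});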

how to remove object in array by index mongodb / mongoose [duplicate]

In the following example, assume the document is in the db.people collection.
How do I remove the 3rd element of the interests array by its index?
{
  "_id" : ObjectId("4d1cb5de451600000000497a"),
  "name" : "dannie",
  "interests" : [
    "guitar",
    "programming",
    "gadgets",
    "reading"
  ]
}
This is my current solution:
var interests = db.people.findOne({"name":"dannie"}).interests;
interests.splice(2,1)
db.people.update({"name":"dannie"}, {"$set" : {"interests" : interests}});
Is there a more direct way?
There is no straightforward way of pulling/removing by array index. In fact, this is an open issue (http://jira.mongodb.org/browse/SERVER-1014); you may vote for it.
The workaround is using $unset and then $pull:
db.lists.update({}, {$unset : {"interests.3" : 1 }})
db.lists.update({}, {$pull : {"interests" : null}})
Update: as mentioned in some of the comments, this approach is not atomic and can cause race conditions if other clients read and/or write between the two operations. If we need the operation to be atomic, we could:
Read the document from the database.
Update the document in memory, removing the item from the array.
Replace the document in the database. To ensure the document has not changed since we read it, we can use the "update if current" pattern described in the mongo docs.
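A minimal shell sketch of that read-modify-replace flow, reusing the original array value as the "update if current" guard (collection and field names taken from the question):
// Read, modify in memory, then replace only if the array is unchanged.
var doc = db.people.findOne({ name: 'dannie' });
var interests = doc.interests.slice(); // copy before mutating
interests.splice(2, 1);                // drop the 3rd element
// The filter repeats the array as we read it, so this update matches
// nothing (and should be retried) if another client changed it meanwhile.
db.people.update(
  { _id: doc._id, interests: doc.interests },
  { $set: { interests: interests } }
);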
You can use the $pull modifier of the update operation to remove a particular element from an array. For the document you provided, the query would look like this:
db.people.update({"name":"dannie"}, {'$pull': {"interests": "guitar"}})
Also, you may consider using $pullAll for removing all occurrences. More about this on the official documentation page - http://www.mongodb.org/display/DOCS/Updating#Updating-%24pull
This doesn't use the index as a criterion for removing an element, but it still might help in cases similar to yours. IMO, using indexes for addressing elements inside an array is not very reliable, since MongoDB isn't consistent about element order as far as I know.
In MongoDB 4.2 you can do this:
db.example.update({}, [
  { $set: { field: {
    $concatArrays: [
      { $slice: ["$field", P] },
      { $slice: ["$field", { $add: [1, P] }, { $size: "$field" }] }
    ]
  } } }
]);
P is the index of the element you want to remove from the array.
If you want to remove everything from index P to the end (i.e. keep only the first P elements):
db.example.update({}, [
  { $set: { field: { $slice: ["$field", P] } } }
]);
Starting in Mongo 4.4, the $function aggregation operator allows applying a custom javascript function to implement behaviour not supported by the MongoDB Query Language.
For instance, in order to update an array by removing an element at a given index:
// { "name": "dannie", "interests": ["guitar", "programming", "gadgets", "reading"] }
db.collection.update(
{ "name": "dannie" },
[{ $set:
{ "interests":
{ $function: {
body: function(interests) { interests.splice(2, 1); return interests; },
args: ["$interests"],
lang: "js"
}}
}
}]
)
// { "name": "dannie", "interests": ["guitar", "programming", "reading"] }
$function takes 3 parameters:
body, which is the function to apply, whose parameter is the array to modify. Here the function simply uses splice to remove 1 element at index 2.
args, which contains the fields from the record that the body function takes as parameters. In our case, "$interests".
lang, which is the language the body function is written in. Only js is currently available.
Rather than using $unset (as in the accepted answer), I solve this by setting the field to a unique value (i.e. not null) and then immediately pulling that value. A little safer from an async perspective. Here is the code:
var update = {};
var key = "ToBePulled_"+ new Date().toString();
update['feedback.'+index] = key;
Venues.update(venueId, {$set: update});
return Venues.update(venueId, {$pull: {feedback: key}});
Hopefully mongo will address this, perhaps by extending the $position modifier to support $pull as well as $push.
I would recommend using a GUID (I tend to use ObjectID) field, or an auto-incrementing field for each sub-document in the array.
With this GUID it is easy to issue a $pull and be sure that the correct one will be pulled. Same goes for other array operations.
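As a sketch of that idea, with a hypothetical items array whose entries carry their own ObjectId (the field name and id below are made up for illustration):
// Pull a subdocument by its own _id instead of by position.
db.people.update(
  { name: 'dannie' },
  { $pull: { items: { _id: ObjectId('4d1cb5de451600000000497b') } } }
);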
For people searching for an answer using mongoose with Node.js, this is how I do it:
exports.deletePregunta = function (req, res) {
  let codTest = req.params.tCodigo;
  let indexPregunta = req.body.pregunta; // the index that comes from the frontend
  let inPregunta = `tPreguntas.0.pregunta.${indexPregunta}`; // my field in my db
  let inOpciones = `tPreguntas.0.opciones.${indexPregunta}`; // my other field in my db
  let inTipo = `tPreguntas.0.tipo.${indexPregunta}`; // my other field in my db
  Test.findOneAndUpdate({ tCodigo: codTest },
    {
      '$unset': {
        [inPregunta]: 1, // computed property names, so the index is interpolated
        [inOpciones]: 1,
        [inTipo]: 1
      }
    }).then(() => {
      Test.findOneAndUpdate({ tCodigo: codTest }, {
        '$pull': {
          'tPreguntas.0.pregunta': null,
          'tPreguntas.0.opciones': null,
          'tPreguntas.0.tipo': null
        }
      }).then(testModificado => {
        if (!testModificado) {
          res.status(404).send({ accion: 'deletePregunta', message: 'The question could not be deleted' });
        } else {
          res.status(200).send({ accion: 'deletePregunta', message: 'Question deleted successfully' });
        }
      });
    }).catch(err => { res.status(500).send({ accion: 'deletePregunta', message: 'database error: ' + err }); });
}
I can rewrite this answer if it isn't easy to understand, but I think it's okay.
Hope this helps you; I lost a lot of time facing this issue.
It is a little bit late, but those who are using Robo 3T may find this useful:
db.getCollection('people').update(
  { "name": "dannie" },
  { $pull:
    {
      interests: "guitar" // the value to remove
    }
  },
  { multi: true }
);
If you have values something like this:
property: [
  {
    "key" : "key1",
    "value" : "value 1"
  },
  {
    "key" : "key2",
    "value" : "value 2"
  },
  {
    "key" : "key3",
    "value" : "value 3"
  }
]
and you want to delete a record where the key is key3, then you can use something like:
db.getCollection('people').update(
  { "name": "dannie" },
  { $pull:
    {
      property: { key: "key3" } // match the subdocument to remove
    }
  },
  { multi: true }
);
The same goes for the nested property.
This can also be done using the $pop operator, though note that $pop only removes the first (-1) or last (1) element of an array:
db.getCollection('collection_name').updateOne({}, { $pop: { "path_to_array_object": 1 } })

How to update a field using its previous value in MongoDB/Mongoose

For example, I have some documents that look like this:
{
  id: 1,
  name: "foo"
}
And I want to append another string to the current name field value.
I tried the following using Mongoose, but it didn't work:
Model.findOneAndUpdate({ id: 1 }, { $set: { name: +"bar" } }, ...);
Edit:
From Compatibility Changes in MongoDB 3.6:
MongoDB 3.6.1 deprecates the snapshot query option.
For MMAPv1, use hint() on the { _id: 1} index instead to prevent a cursor from returning a document more than once if an intervening write operation results in a move of the document.
For other storage engines, use hint() with { $natural : 1 } instead.
Original 2017 answer:
You can't refer to the values of the document you want to update, so you will need one query to retrieve the document and another one to update it. It looks like there's a feature request for that in OPEN state since 2016.
If you have a collection with documents that look like:
{ "_id" : ObjectId("590a4aa8ff1809c94801ecd0"), "name" : "bar" }
Using the MongoDB shell, you can do something like this:
db.test.find({ name: "bar" }).snapshot().forEach((doc) => {
  doc.name = "foo-" + doc.name;
  db.test.save(doc);
});
The document will be updated as expected:
{ "_id" : ObjectId("590a4aa8ff1809c94801ecd0"), "name": "foo-bar" }
Note the .snapshot() call.
This ensures that the query will not return a document multiple times because an intervening write operation moves it due to the growth in document size.
Applying this to your Mongoose example, as explained in this official example:
Cat.findById(1, (err, cat) => {
  if (err) return handleError(err);
  cat.name = cat.name + "bar";
  cat.save((err, updatedCat) => {
    if (err) return handleError(err);
    ...
  });
});
It's worth mentioning that there's a $concat operator in the aggregation framework, but unfortunately you can't use that in an update query.
Anyway, depending on what you need to do, you can use that together with the $out operator to save the results of the aggregation to a new collection.
With that same example, you will do:
db.test.aggregate([{
  $match: { name: "bar" }
}, {
  $project: { name: { $concat: ["foo", "-", "$name"] } }
}, {
  $out: "prefixedTest"
}]);
And a new collection prefixedTest will be created with documents that look like:
{ "_id" : ObjectId("XXX"), "name": "foo-bar" }
Just as a reference, there's another interesting question about this same topic with a few answers worth reading: Update MongoDB field using value of another field
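Update: since MongoDB 4.2, update operations accept an aggregation pipeline, so the $concat approach can be applied in place rather than going through $out. A sketch with the same example:
// With MongoDB 4.2+, the new value can be computed from the current one.
db.test.update(
  { name: 'bar' },
  [{ $set: { name: { $concat: ['foo', '-', '$name'] } } }]
);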
If this is still relevant, I have a solution for MongoDB 4.2.
I had the same problem, where the projectDeadline fields of my project documents were of Array type (["2020","12","1"]).
Using Robo 3T, I connected to my MongoDB Atlas DB using the SRV link, then executed the following code, and it worked for me.
Initial document:
{
  _id : 'kjnolqnw.KANSasdasd',
  someKey : 'someValue',
  projectDeadline : ['2020','12','1']
}
CLI Command:
db
  .getCollection('mainData')
  .find({ projectDeadline: { $not: { $eq: "noDeadline" } } })
  .forEach((doc) => {
    var deadline = doc.projectDeadline;
    var deadlineDate = new Date(deadline);
    db
      .mainData
      .updateOne(
        { _id: doc._id },
        { "$set": { "projectDeadline": deadlineDate } }
      );
  });
Resulting document:
{
  _id : 'kjnolqnw.KANSasdasd',
  someKey : 'someValue',
  projectDeadline : '2020-12-01 21:00:00.000Z'
}

Combine multiple query with one single $in query and specify limit for each array field?

I am using mongoose with Node.js for MongoDB. I need to make 20 parallel find queries against my database, each with a limit of 4 documents, just as shown below; only the brand_id changes for each brand.
areamodel.find({ brand_id: brand_id }, { '_id': 1 }, { limit: 4 }, function(err, docs) {
  if (err) {
    console.log(err);
  } else {
    console.log('fetched');
  }
});
Now, to run all these queries in parallel, I thought about putting all 20 brand_ids in an array of strings and then using a single $in query to get the results, but I don't know how to specify the limit of 4 for each matched brand.
I wrote the code below with aggregation, but I don't know where to specify the limit for each element of my array.
var brand_ids = ["brandid1", "brandid2", "brandid3", "brandid4", "brandid5", "brandid6", "brandid7", "brandid8", "brandid9", "brandid10", "brandid11", "brandid12", "brandid13", "brandid14", "brandid15", "brandid16", "brandid17", "brandid18", "brandid19", "brandid20"];
areamodel.aggregate([
  { $match: { 'brand_id': { $in: brand_ids } } },
  { $project: { _id: 1 } }
], function(err, docs) {
  if (err) {
    console.error(err);
  } else {
  }
});
Can anyone please tell me how I can solve my problem using only one query?
UPDATE - why I don't think $group will help me.
Suppose my brand_ids array contains these strings:
brand_ids = ["id1", "id2", "id3", "id4", "id5"]
and my database has the documents below:
{
  "brand_id": "id1",
  "name": "Levis",
  "loc": "india"
},
{
  "brand_id": "id1",
  "name": "Levis",
  "loc": "america"
},
{
  "brand_id": "id2",
  "name": "Lee",
  "loc": "india"
},
{
  "brand_id": "id2",
  "name": "Lee",
  "loc": "america"
}
Desired JSON output
{
  "name": "Levis"
},
{
  "name": "Lee"
}
For the above example, suppose I have 25,000 documents with "name" as "Levis" and 25,000 documents where "name" is "Lee". If I use $group, all 50,000 documents will be read and grouped by "name".
But with the solution I want, once the first documents with "Levis" and "Lee" are found, I shouldn't have to look at the remaining thousands of documents.
Update - I think if anyone can answer this, I can probably get to my solution.
Consider a case where I have 1,000 total documents in my MongoDB, and suppose that out of those 1,000, 100 will pass my match query.
If I apply limit 4 to this query, will it take the same time to execute as the query without any limit, or not?
Why I am thinking about this case:
Because if my query takes the same time, then I don't think $group will increase my time, as all documents will be read anyway.
But if the limited query takes less time than the query without the limit, then:
If I can apply limit 4 to each array element, my question is solved.
If I cannot apply a limit to each array element, then I don't think $group will be useful, as in that case I would have to scan all the documents to get the results.
FINAL UPDATE - As I read in the answer below and in the MongoDB docs, $limit does not affect the time taken by the query; it is the network bandwidth that is saved. So I think if anyone can tell me how to apply a limit on array fields (using $group or anything else), my problem will be solved.
mongodb: will limit() increase query speed?
Solution
Actually, my thinking about MongoDB was very wrong: I thought adding a limit to a query decreases the time it takes, but that is not the case, which is why I stumbled for so many days before trying the answer that Gregory NEUT and JohnnyHK gave me. Thanks a lot, both of you; I would have found the solution on day one if I had known about this. I really appreciate the help.
I propose you use the $group aggregation stage to group all the data you got from $match by brand_id, and then limit each group of data using $slice.
Look at this Stack Overflow post:
db.collection.aggregate([
  {
    $sort: {
      created: -1
    }
  },
  {
    $group: {
      _id: '$city',
      title: {
        $push: '$title'
      }
    }
  },
  {
    $project: {
      _id: 0,
      city: '$_id',
      mostRecentTitle: {
        $slice: ['$title', 0, 2]
      }
    }
  }
]);
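Adapted to the schema in the question, the same pattern would look roughly like this (a sketch; note that $group still reads every matching document before $slice trims each group to 4):
// One aggregation returning at most 4 documents per brand_id.
areamodel.aggregate([
  { $match: { brand_id: { $in: brand_ids } } },
  { $group: { _id: '$brand_id', docs: { $push: '$$ROOT' } } },
  { $project: { _id: 0, brand_id: '$_id', docs: { $slice: ['$docs', 4] } } }
], function (err, results) {
  if (err) return console.error(err);
  // results[i].docs holds up to 4 documents for that brand
});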
I propose using distinct, since that will return all the different brand names in your collection. (I assume this is what you are trying to achieve?)
db.runCommand ( { distinct: "areamodel", key: "name" } )
MongoDB docs
In mongoose I think it is: areamodel.db.db.command({ distinct: "areamodel", key: "name" }) (untested)
