MongoDB error: $merge cannot be used in a transaction - node.js

I have an operation running inside a transaction, and I want to $merge documents into a collection (one that doesn't have a schema).
This is my implementation, but it doesn't work inside a transaction; I get: $merge cannot be used in a transaction
await User.aggregate([
  {
    $match: {
      _id: new mongoose.mongo.ObjectID(id),
    },
  },
  {
    $merge: {
      into: 'deleted-users',
    },
  },
]).option({ session });
Is there an alternative way to achieve this scenario, i.e. to add a record to a newly created collection inside a transaction?

$merge is simply not allowed inside transactions. As we can read in the official docs:
The following read/write operations are allowed in transactions: (among others) ... the aggregate command, excluding the following stages: (among others) $merge.
The $merge reference page documents this restriction and more.
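One possible workaround, sketched under the assumption that all you need is to copy the matched user into deleted-users: run the aggregation without the $merge stage, then write the result with an ordinary insert that is passed the same session, so it still happens inside the transaction (note that implicitly creating a new collection inside a transaction requires MongoDB 4.4 or later):
// Sketch: read inside the transaction, then insert with a normal write,
// since the $merge stage itself is not allowed in transactions.
const docs = await User.aggregate([
  { $match: { _id: new mongoose.mongo.ObjectID(id) } },
]).option({ session });

if (docs.length > 0) {
  // Insert into the target collection using the same session.
  await mongoose.connection
    .collection('deleted-users')
    .insertMany(docs, { session });
}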

MongoDB: How to perform a second match using the results (an array of ObjectIds) of the previous match in aggregation pipeline

I have a MongoDB collection called users with documents that look like:
{
  _id: ObjectId('123'),
  username: "abc",
  avatar: "avatar/long-unique-random-string.jpg",
  connections: [ObjectId('abc'), ObjectId('xyz'), ObjectId('lmn'), ObjectId('efg')]
}
This document belongs to the users collection.
What I want to do:
1. Find one document from the users collection that matches _id '123'.
2. Project the connections field from the document found in step 1, which is an array of ObjectIds of other users within the same collection.
3. Find all the user documents whose _id is in the array projected in step 2.
4. Project and return an array containing only the username and avatar of each user from step 3.
I know that I can do this in two separate queries: first a findOne which returns the connections array, then a find with the results of the findOne to get all the corresponding usernames and avatars. But I would like to do this in one single query, using the aggregation pipeline.
What I want to know is: is it even possible to do this in one query using aggregation?
If so, what would the query look like?
What I currently have:
await usersCollection
  .aggregate([
    { $match: { _id: new ObjectId(userId) } },
    { $project: { ids: "$connections" } },
    { $match: { _id: { $in: "ids" } } },
    {
      $project: {
        username: "$username",
        avatar: { $ifNull: ["$avatar", "$$REMOVE"] },
      },
    },
  ])
  .toArray()
I know this is wrong, because each aggregation stage receives the results of the previous stage, so the second $match cannot query the entire users collection, as far as I know.
I'm using the MongoDB driver for Node.js, and I would like to avoid $lookup in possible solutions.
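For reference, a minimal sketch of the two-query fallback described above, using the Node.js driver (usersCollection, userId and the field names are reused from the question's snippet; this is not the single-pipeline solution being asked for):
// Sketch: first fetch the connections array, then fetch those users.
const me = await usersCollection.findOne(
  { _id: new ObjectId(userId) },
  { projection: { connections: 1 } }
);

const friends = await usersCollection
  .find(
    { _id: { $in: me.connections } },
    { projection: { _id: 0, username: 1, avatar: 1 } }
  )
  .toArray();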

How can I combine multiple MongoDB queries into one using node.js?

Context:
My database has two collections: "Users" and "Files".
Sample document from "Users" collection:
{
  "username": "Adam",
  "email": "adam@gmail.com",
  "IdFilesOwned": [1, 3]
}
As you can see, Adam currently owns two files on the server. Their ids are 1 and 3.
Sample documents from "Files" collection:
{
  "fileId": 1,
  "name": "randomPNG.png"
}
{
  "fileId": 2,
  "name": "somePDF.pdf"
}
{
  "fileId": 3,
  "name": "business.pdf"
}
As you can see, I have three documents on my server, each document having an id and some metadata.
Now, Adam wants to see all the files that he owns, and their metadata. The way I would implement this is:
1. Look up the array of file ids that Adam owns.
2. Have node.js run through each id (using a forEach loop) and query the metadata for that id.
The problem is that node.js will make multiple queries (one query per id). This seems very inefficient, so my question is: is there a better way to implement this?
You can use the $in operator (you can find more info in the docs). It will look something like this:
db.collection('files').find({ fileId: { $in: [<value1>, <value2>, ..., <valueN>] } })
This is more efficient than a lookup per id, for sure. Good luck.
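A minimal sketch of how that could look end to end in Node.js (the collection and field names are taken from the question; the two-step read and the db handle are assumptions):
// Sketch: fetch Adam's file ids, then fetch all of his files in a single query.
const user = await db.collection('users').findOne({ username: 'Adam' });

const files = await db
  .collection('files')
  .find({ fileId: { $in: user.IdFilesOwned } })
  .toArray();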
I don't have much experience using the driver directly, but you should be able to do the equivalent of the following
db.users.aggregate([
  {
    $lookup: {
      from: "files",
      localField: "IdFilesOwned",
      foreignField: "fileId",
      as: "files"
    }
  }
]);
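Issued through the Node.js driver, that same aggregation might look roughly like this (a sketch; db is assumed to be an already connected Db instance):
// Sketch: the same $lookup pipeline run from the driver.
const usersWithFiles = await db
  .collection('users')
  .aggregate([
    {
      $lookup: {
        from: 'files',
        localField: 'IdFilesOwned',
        foreignField: 'fileId',
        as: 'files'
      }
    }
  ])
  .toArray();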

Create View from multiple collections MongoDB

I have the following Mongoose schemas (truncated to hide project-sensitive information) from a healthcare project.
let PatientSchema = mongoose.Schema({ _id: String })
let PrescriptionSchema = mongoose.Schema({ _id: String, patient: { type: Number, ref: 'Patient' }, createdAt: Date })
let ReportSchema = mongoose.Schema({ _id: String, patient: { type: Number, ref: 'Patient' }, createdAt: Date })
let EventsSchema = mongoose.Schema({ _id: String, patient: { type: Number, ref: 'Patient' }, createdAt: Date })
There is a UI screen in the mobile and web apps called "Health history", where I need to paginate the entries from prescriptions, reports and events, sorted by createdAt. So I am building a REST endpoint to fetch this heterogeneous data. How do I achieve this? Is it possible to create a "view" over multiple schema models, so that I won't have to load the contents of all 3 schemas to fetch one page of entries? The schema of my "view" should look like below, so that I can run additional queries on it (e.g. find the last report):
{ recordType: String /* prescription/report/event */, createdDate: Date, data: Object /* content from any of the 3 collections */ }
I can think of three ways to do this.
Imho the easiest way to achieve this is by using an aggregation, something like this:
db.Patients.aggregate([
  { $match: { _id: <somePatientId> } },
  {
    $lookup: {
      from: "prescriptions",   // replicate this for reports and events
      localField: "_id",
      foreignField: "patient",
      as: "prescriptions"      // or "reports" or "events"
    }
  },
  { $unwind: "$prescriptions" },                // or "$reports" or "$events"
  { $sort: { "prescriptions.createdAt": -1 } },
  { $skip: <positive integer> },
  { $limit: <positive integer> },
])
You'll have to adapt it further, to also get the correct createdDate. For this, you might want to look at the $replaceRoot operator.
The second option is to create a new "meta"-collection, that holds your actual list of events, but only holds a reference to your patient as well as the actual event using a refPath to handle the three different event types. This solution is the most elegant, because it makes querying your data way easier, and probably also more performant. Still, it requires you to create and handle another collection, which is why I didn't want to recommend this as the main solution, since I don't know if you can create a new collection.
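A minimal sketch of what such a "meta"-collection schema might look like with refPath (the schema name and fields here are illustrative assumptions, not taken from the question):
// Sketch: one document per health-history entry, pointing at the concrete record.
let HealthHistoryEntrySchema = mongoose.Schema({
  patient: { type: String, ref: 'Patient' },
  recordType: { type: String, enum: ['Prescription', 'Report', 'Events'] },
  record: { type: String, refPath: 'recordType' }, // dynamic reference resolved per entry
  createdAt: Date
})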
As a last option, you could create virtual populate fields in Patient, that automatically fetch all prescriptions, reports and events. This has the disadvantage that you can not really sort and paginate properly...
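For completeness, a virtual populate for one of the three types might look roughly like this (a sketch; it still leaves the cross-collection sorting and pagination problem described above):
// Sketch: virtual field on Patient that resolves all of that patient's prescriptions.
PatientSchema.virtual('prescriptions', {
  ref: 'Prescription',
  localField: '_id',
  foreignField: 'patient'
})
// ...plus similar virtuals for reports and events, filled in via .populate('prescriptions')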

MongoDB update multiple items with multiple changes

Is there any recommended way to update multiple items in MongoDB with one query? I know that this is possible:
db.collection('mycollection').update({active: 1}, {$set: {active:0}}, {multi: true});
But in my case I want to update several documents with "unique" changes.
e.g. I want to combine these two queries into one:
db.collection('mycollection').update({
  id: 'my id'
}, {
  $set: {
    name: "new name"
  }
});
db.collection('mycollection').update({
  id: 'my second id'
}, {
  $set: {
    name: "new name two"
  }
});
Why? I have a system which gets daily updates imported. The updates are mostly large, around 200,000 updates a day, so currently I am executing the update query 200,000 times, which takes a long time.
If it's necessary to know: I am using Mongo 3 and Node.js.
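A minimal sketch of how these per-document updates could be batched with bulkWrite (assuming a driver and server version that support bulkWrite; the shape of the updates array is illustrative):
// Sketch: send many individual updates to the server in one round trip.
const updates = [
  { id: 'my id', name: 'new name' },
  { id: 'my second id', name: 'new name two' },
  // ...the rest of the day's updates, ideally processed in chunks
];

await db.collection('mycollection').bulkWrite(
  updates.map(u => ({
    updateOne: {
      filter: { id: u.id },
      update: { $set: { name: u.name } }
    }
  })),
  { ordered: false } // let the server apply them in any order
);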

Delay response until all queries finished

My db contains projects and phases. Projects can have multiple phases. The models are similar to these:
Phase:
var phaseSchema = new mongoose.Schema({
  project: { type: mongoose.Schema.Types.ObjectId, ref: 'Project' }
});
Project:
var projectSchema = new mongoose.Schema({
  name: { type: String }
});
Currently I'm using the following approach to retrieve the phases for each project:
var calls = [];
var projects = _.each(projects, function (p) {
  calls.push(function (callback) {
    req.app.db.models.Phase.find({ project: p._id }, function (err, doc) {
      if (err) {
        callback(err);
      } else {
        p.phases = doc;
        callback();
      }
    });
  });
});
async.parallel(calls, function (err) {
  workflow.outcome.projects = projects;
  return workflow.emit('response');
});
As you can see, I'm not passing anything to callback(), just (ab)using async's parallel to hold back the response until all the lookups finish.
Alternatively I could pass the phase documents to the callback, but then in parallel's final callback I would have to iterate over both the phases and the projects to find the appropriate project for each phase.
Am I falling into a common pitfall with this design (so that for some reason it would be better to iterate over the projects and the phases again), or should I take a completely different approach?
I actually think in this case you would be better off running one query to match all the potential results. For the Phase query you would issue all the project _id values in an $in clause, then just do some matching of the results against your source array to assign the matched document(s):
Matching all at once
// Make a hash from the source for ease of matching
var pHash = {};
_.each(projects, function (p) {
  pHash[p._id.toString()] = p;
});

// Run the find with $in
req.app.db.models.Phase.find({ "project": { "$in": _.keys(pHash) } }, function (err, response) {
  _.each(response, function (r) {
    // Assign a phases array if not already there
    if (!pHash[r.project.toString()].hasOwnProperty("phases"))
      pHash[r.project.toString()].phases = [];
    // Append to the array of phases
    pHash[r.project.toString()].phases.push(r);
  });
  // Now return the altered hash as the original array
  projects = _.map(pHash, function (val, key) {
    return val;
  });
});
Also, since as you say "projects can have multiple phases", the logic appends to an "array" rather than assigning a single value.
More efficient $lookup
On the other hand, if you have MongoDB 3.2 available, then the $lookup aggregation pipeline operator seems to be for you. In this case you would just be working with the Project model, but doing the $lookup on the "phases" collection. With "collection" being the operative term here, since it is a server-side operation that therefore only knows about collections and not the application "models":
// BTW all models are permanently registered with mongoose
mongoose.model("Project").aggregate(
  [
    // Whatever your match conditions were for getting the project list
    { "$match": { .. } },
    // This actually does the "join" (but really a "lookup")
    { "$lookup": {
      "from": "phases",
      "localField": "_id",
      "foreignField": "project",
      "as": "phases"
    }}
  ],
  function (err, projects) {
    // Now all projects have an array containing any matched phases,
    // or an empty array. Just like a "left join"
  }
);
That would be the most efficient way to handle this since all the work is done on the server.
So what you seem to be asking here is basically the "reverse case" of .populate(), where instead of holding the "phases" as references on the "project" object, the reference to the project is instead held on each "phase".
In that case, either form of "lookup" should be what you are looking for: either emulating the join via $in and the "mapping" stage, or directly using the aggregation framework's $lookup operator.
Either way, this reduces the server contact down to "one" operation, whereas your current approach is going to create a lot of connections and eat up a fair amount of resources. Also there is no need to "wait for all responses". I'd wager that both are much faster as well.
