Mongodb node.js $out with aggregation only working if calling toArray()

Mongodb node.js $out with aggregation only working if calling toArray() - node.js

Saving an aggregation query using "mongodb": "^3.0.6" as result with the $out operator is only working when calling .toArray().
The aggregation step(s):
let aggregationSteps = [{
$group: {
_id: '$created_at',
}
}, {'$out': 'ProjectsByCreated'}];
Executing the aggregation:
await collection.aggregate(aggregationSteps, {'allowDiskUse': true})
Expected result: New collection called ProjectsByCreated.
Result: No collection, query does not throw an exception but is not being executed? (takes only 1ms)
Appending toArray() results in the expected behaviour:
await collection.aggregate(aggregationSteps, {'allowDiskUse': true}).toArray();
Why does mongodb only create the result collection when calling .toArray() and where does the documentation tell so? How can I fix this?
The documentation doesn't seem to provide any information about this:
https://mongodb.github.io/node-mongodb-native/3.0/api/Collection.html#aggregate
https://docs.mongodb.com/manual/reference/operator/aggregation/out/index.html

MongoDB acknowledge this behaviour, but they also say this is working as designed.
It has been logged as a bug in the MongoDB JIRA, $out aggregation stage doesn't take effect, and the responses say it is not a fault:
This behavior is intentional and has not changed in some time with the node driver. When you "run" an aggregation by calling Collection.prototype.aggregate, we create an intermediary Cursor which is not executed until some sort of I/O is requested. This allows us to provide the chainable cursor API (e.g. cursor.limit(..).sort(..).project(..)), building up the find or aggregate options in a builder before executing the initial query.
... Chaining toArray in order to execute the out stage doesn't feel quite right. Is there something more natural that I haven't noticed?
Unfortunately not, the chained out method there simply continues to build your aggregation. Any of the following methods will cause the initial aggregation to be run: toArray, each, forEach, hasNext, next. We had considered adding something like exec/run for something like change streams, however it's still in our backlog. For now you could theoretically just call hasNext which should run the first aggregation and retrieve the first batch (this is likely what exec/run would do internally anyway).
So, it looks like you do have to call one of the methods to start iterating the cursor before $out will do anything. Adding .toArray(), as you're already doing, is probably safest. Note that to.Array() does not load the entire result into RAM as normal; because it includes a $out, the aggregation returns an empty cursor.

Because the Aggregation operation returns a cursor, not the results.
In order to return all the documents from the cursor, we need to use toArray method.

Related

Consecutive calls to updateOne of mongodb: 3rd one does not work

I receive 3 post calls from client, let say in a second, and with nodejs-mongodb immediately(without any pause, sleep, etc) I try to insert the data that is posted in database using updateOne. All data is new, so in every call, insert would happen.
Here is the code (js):
const myCollection = mydb.collection("mydata")
myCollection.updateOne({name:req.data.name},{$set:{name:req.data.name, data:req.data.data}}, {upsert:true}, function(err, result) {console.log("UPDATEONE err: "+err)})
When I call just 1 time this updateOne, it works; 2 times successively, it works. But if I call 2+ times in succession, only the first two ones correctly inserted into database, and the rest, no.
The error that I get after updateOne is, MongoWriteConcernError: No write concern mode named 'majority;' found in replica set configuration. However, I always get this error, also even when the insertion is done correctly. So I don't think this is related to my problem.
Probably you will suggest to me to use updateMany, bulkWrite, etc. and you will be right, but I want to know the reason why after 2+ the insertion is not done.

Have in mind .updateOne() returns a Promise so it should be handled properly in order to avoid concurrency issues. More info about it here.
The error MongoWriteConcernError might be related to the connection string you are using. Check if there is any &w=majority and remove it as recommended here.

Trouble fetching multiple documents using find() in MongoDB

I'm trying to fetch more than one document that satisfies a particular criteria from a collection in my MongoDB database. When I use findOne(), it works perfectly fine and returns the first document that follows the criteria, but find() isn't returning all of the documents. I've checked a lot of websites and the documentation but still haven't found good examples of how it is done. This is the current syntax that I'm trying :
db.collection('mycollection').find({someproperty :
somevalue}).then((docs)=>{
// trying to do something with the returned documents 'docs'
}
Also, I'd really prefer a non-mongoose solution, unless this is absolutely impossible using plain MongoDB. I know it's probably easier with mongoose but I want to know the MongoDB implementation.

In the Mongo Docs the find function returns a cursor.
A cursor to the documents that match the query criteria. When the find() method “returns documents,” the method is actually returning a cursor to the documents.
I'm guessing you expect an array? You need to use the toArray function, the docs for this are https://mongodb.github.io/node-mongodb-native/api-generated/cursor.html#toarray
Unfortunately it's a callback, no promise implementation so you will need to put the promise in there yourself.
return new Promise((resolve, reject) => db.collection('mycollection')
.find({someproperty : somevalue})
.toArray((err, documents) => resolve(documents)));

Mongoose population - callback vs exec

On Node.js/Mongoose/Mongo is
SomeModel.findOne({_id: id}, callback).populate('ref')
equivalent to
SomeModel.findOne({_id: id}).populate('ref').exec(callback)
"ref" is single doc (not an array).
The problem is that using the first syntax the "child" document is randomly not populated when callback is called.

I don't know the internals, but I'd say they are not the same.
The first probably does this:
find the document
call the callback with the document
populate the ref (this is done through a separate query)
The second probably does this:
find the document
populate the ref
call the callback when the ref has been resolved
The randomness that you're witnessing is because the "populate the ref" call, when fast enough, may populate the document before you use it in the callback. In other words: a race condition.

Sails 0.10 association fails to populate

I'm working on a custom adapter in sails#0.10.0-rc4 which will support associations but I am having trouble getting them working in conjunction with my adapter. My configuration is a one-to-many association between article and stats. My models and adapter are setup like this:
// api/models/article.js
module.exports = {
connection: ['myadapter'],
tableName: 'Knowledge_Base__kav',
attributes: {
KnowledgeArticleId: { type: 'string', primaryKey: true }
stats: {
collection: 'stats',
via: 'parentId'
}
}
// api/models/stats.js
module.exports = {
connection: ['myadapter'],
tableName: 'KnowledgeArticleViewStat',
attributes: {
count: 'integer',
ParentId: {
model: 'article'
}
}
}
// adapter.js
find: function(connectionName, collectionName, options, cb) {
console.dir(options)
// output
// {where: null}
db.query(options, function(err, res)) {
cb(err, res)
}
}
However, when I try to populate using Article.find().populate('stats').exec(console.log()), my adapter gets {where: null} as options when I would expect it to receive {where: {parentId: [<some-article-id>]}}. It will return a list of articles to me but the field which is supposed to be populated from another model (stats) is just an empty list.
I feel like this is related to the fact that my adapter is not getting the proper where param to search for the related model on the primary key. To test this further, I setup a test one-to-many relationship using the the sails-mongo adapter. In this case the adapter did receive params I expected and the association worked fine.
Does anyone have any idea on why .populate('stats') wouldn't be sending the proper "where" params to my adapter?
Update 3/7
So it seems like what happens in associations is that SomeModel.find() will hit the adapter once and then .populate('othermodel') hits the adapter again using the primary key of the first request. Then the results of both are joined together. In my case, the second hit against the adapter isn't happening for some unknown reason.
Update
The original issue was related to an attribute naming error that's mentioned in the comments below. However, there still appears to be some issue with the final population step mentioned by particlebanana:
Final step will: Take all of the query results from all the returned query operations
and combine them in-memory to build up a result set you can return in
the exec callback.
I'm seeing that all required queries are now firing but they are failing to actually populate the alias. Here's the call with some added debugging output in the form of a gist for easier consumption: https://gist.github.com/jasonsims/9423170

It looks like you are on the right track! The way the operation sets get built up, the .find() on the Article should run with the first log (empty where) and the second query should get run with the parentId criteria in the log. The second query isn't running because it can't build up that parentId array of primary keys when you don't return anything from the first query.
Short answer: you need to return something in the find callback to see the second log, which should match your expected criteria.
The query lifecycle looks something like this:
Check if all query pieces are on the same connection, if not break out which queries will run on which connections
For all queries on a single connection, check if the adapter supports native joins (has a .join() method, if so you can pass the criteria down and let the adapter handle the joins.
If no native join method is defined run the "parent" operation (in this case the Article.find())
Use the results of the parent operation to build up criteria for any populations that need to run. (The parentId array in your criteria) and run the child results.
Take all of the query results from all the returned query operations and combine them in-memory to build up a result set you can return in the exec callback.
I hope that helps some. Shoot me the url of your repo and I will look through it, if it's able to be open sourced, and can help some more if you come across any issues.

Just to summarize, there were multiple issues going on here which were causing associations not to populate:
Custom primary keys
There was a problem with waterline when joining data from models using custom primary keys. #particlebanana fixed this in 8eff54b and it should be included in the next rc of waterline (waterline#0.10.0-rc5).
Malformed SOQL query
When waterline queries the adapter for a second time in order to acquire the child rows, it does so using { foreignKey: [ value ] }. Since the value was a list, jsforce was incorrectly generating the SOQL query since it expected all list values to be accompanied by either $in or $nin operators. I addressed this issue in github/jsforce#9 and it's now included in jsforce#1.1.2.
Model attributes are case sensitive
The models in my project were defined in snakeCase but the json response from Salesforce was using EveryWordCapitalized. This causes 1-to-many joins in waterline to reduce the many child records to one when it runs _.uniq(childRows, pk). Since the model has defined pk == id but the actual value returned from Salesforce is pk == Id, this call to uniq blows away all child records but one. I'm not entirely sure if this should be a waterline bug or not but fixing the capitalization in the model attribute definitions resolved this.

List recent operations on mongodb

I'm using MongoDB with Node.js framework.
There is a weird behavior that some documents are not getting inserted into db, though from orm's point of view there are no errors: err = null in callback of Collection.create() and fresh document with _id is returned. When I try to search by that _id in db - no document is found.
I tried to manually insert new document to db and it was successfull.
Is there a way I can trace these operations from db's point of view? Some command to list recent requests and their results..?

You can enable profiling for all operations:
db.setProfilingLevel(2)
Then, look at system.profile collection to see what's happen. system.profile is a capped collection that can be searched as any other collection. Profiling can be noisy, and eventually you should have to change the size of the system.profile collection
db.setProfilingLevel(0)
db.system.profile.drop()
db.createCollection( "system.profile", { capped: true, size:4000000 } )
db.setProfilingLevel(2)

The most notable way of tracking errors within MongoDB is to use the --diaglog option: http://docs.mongodb.org/manual/reference/program/mongod/#cmdoption--diaglog with maybe a level of 3, however 1 might be enough for you.
As noted by #Neil this has unfortunately become deprecated as of 2.6.
The only way currently is to write out ALL operations MongoDB performs, via #Rauls answer, and then use a query like:
db.system.profile.find({op:{$in:['update', 'insert', 'remove']}});
and possibly resize the capped collection used for profiling: http://docs.mongodb.org/manual/tutorial/manage-the-database-profiler/#profiler-overhead to capture the amount you want.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string