Sails 0.10 association fails to populate - node.js

I'm working on a custom adapter in sails#0.10.0-rc4 which will support associations but I am having trouble getting them working in conjunction with my adapter. My configuration is a one-to-many association between article and stats. My models and adapter are setup like this:
// api/models/article.js
module.exports = {
  connection: ['myadapter'],
  tableName: 'Knowledge_Base__kav',
  attributes: {
    KnowledgeArticleId: { type: 'string', primaryKey: true },
    stats: {
      collection: 'stats',
      via: 'parentId'
    }
  }
};
// api/models/stats.js
module.exports = {
  connection: ['myadapter'],
  tableName: 'KnowledgeArticleViewStat',
  attributes: {
    count: 'integer',
    ParentId: {
      model: 'article'
    }
  }
};
// adapter.js
find: function (connectionName, collectionName, options, cb) {
  console.dir(options);
  // output
  // {where: null}
  db.query(options, function (err, res) {
    cb(err, res);
  });
}
However, when I try to populate using Article.find().populate('stats').exec(console.log), my adapter gets {where: null} as options when I would expect it to receive {where: {parentId: [<some-article-id>]}}. It returns a list of articles, but the field that is supposed to be populated from the other model (stats) is just an empty list.
I suspect this is because my adapter is not getting the proper where param to search for the related model on the primary key. To test this further, I set up a test one-to-many relationship using the sails-mongo adapter. In that case the adapter did receive the params I expected and the association worked fine.
Does anyone have any idea on why .populate('stats') wouldn't be sending the proper "where" params to my adapter?
Update 3/7
So it seems like what happens in associations is that SomeModel.find() will hit the adapter once and then .populate('othermodel') hits the adapter again using the primary key of the first request. Then the results of both are joined together. In my case, the second hit against the adapter isn't happening for some unknown reason.
Update
The original issue was related to an attribute naming error that's mentioned in the comments below. However, there still appears to be some issue with the final population step mentioned by particlebanana:
Final step will: Take all of the query results from all the returned query operations
and combine them in-memory to build up a result set you can return in
the exec callback.
I'm seeing that all required queries are now firing but they are failing to actually populate the alias. Here's the call with some added debugging output in the form of a gist for easier consumption: https://gist.github.com/jasonsims/9423170

It looks like you are on the right track! The way the operation sets get built up, the .find() on the Article should run with the first log (empty where) and the second query should get run with the parentId criteria in the log. The second query isn't running because it can't build up that parentId array of primary keys when you don't return anything from the first query.
Short answer: you need to return something in the find callback to see the second log, which should match your expected criteria.
The query lifecycle looks something like this:
Check if all query pieces are on the same connection, if not break out which queries will run on which connections
For all queries on a single connection, check if the adapter supports native joins (has a .join() method). If so, you can pass the criteria down and let the adapter handle the joins.
If no native join method is defined run the "parent" operation (in this case the Article.find())
Use the results of the parent operation to build up criteria for any populations that need to run (the parentId array in your criteria) and run the child queries.
Take all of the query results from all the returned query operations and combine them in-memory to build up a result set you can return in the exec callback.
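The non-native-join path in the lifecycle above can be sketched roughly as follows. This is not Waterline's actual implementation: populateInMemory, findChildren and the opts fields are hypothetical names used only for illustration, and the child lookup is synchronous to keep the sketch short. It does show why an empty parent result means the second query never runs.

```javascript
// Hypothetical helper, NOT Waterline's real code: parent query results in,
// populated records out.
function populateInMemory(parents, findChildren, opts) {
  // Collect the primary keys returned by the parent operation.
  var parentIds = parents.map(function (p) { return p[opts.pk]; });

  // Build the child criteria, e.g. { where: { ParentId: ['a1', 'a2'] } }.
  // With no parent rows there is nothing to look up, so the child query is skipped.
  var where = {};
  where[opts.fk] = parentIds;
  var children = parentIds.length ? findChildren({ where: where }) : [];

  // Combine both result sets in memory under the association alias.
  return parents.map(function (p) {
    var copy = Object.assign({}, p);
    copy[opts.alias] = children.filter(function (c) {
      return c[opts.fk] === p[opts.pk];
    });
    return copy;
  });
}

// Usage with in-memory stand-ins for the two adapter calls.
var statsTable = [{ count: 3, ParentId: 'a1' }, { count: 5, ParentId: 'a1' }];
var seenCriteria = [];
function findStats(criteria) {
  seenCriteria.push(criteria);
  return statsTable.filter(function (row) {
    return criteria.where.ParentId.indexOf(row.ParentId) !== -1;
  });
}
var populateOpts = { pk: 'KnowledgeArticleId', fk: 'ParentId', alias: 'stats' };
var populated = populateInMemory(
  [{ KnowledgeArticleId: 'a1' }, { KnowledgeArticleId: 'a2' }],
  findStats,
  populateOpts
);
```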
I hope that helps some. Shoot me the url of your repo and I will look through it, if it's able to be open sourced, and can help some more if you come across any issues.

Just to summarize, there were multiple issues going on here which were causing associations not to populate:
Custom primary keys
There was a problem with waterline when joining data from models using custom primary keys. @particlebanana fixed this in 8eff54b and it should be included in the next rc of waterline (waterline#0.10.0-rc5).
Malformed SOQL query
When waterline queries the adapter for a second time in order to acquire the child rows, it does so using { foreignKey: [ value ] }. Since the value was a list, jsforce was incorrectly generating the SOQL query since it expected all list values to be accompanied by either $in or $nin operators. I addressed this issue in github/jsforce#9 and it's now included in jsforce#1.1.2.
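For anyone hitting the same thing before upgrading, one workaround is to normalize the criteria inside the adapter before handing them to the query builder. The sketch below is not the jsforce patch itself; wrapArrayValues is a hypothetical helper, and it only assumes what is described above: list values need an explicit $in operator.

```javascript
// Hypothetical adapter-side helper: turn { key: [a, b] } into { key: { $in: [a, b] } }
// and leave every other value untouched.
function wrapArrayValues(where) {
  var out = {};
  Object.keys(where || {}).forEach(function (key) {
    var value = where[key];
    out[key] = Array.isArray(value) ? { $in: value } : value;
  });
  return out;
}

var normalized = wrapArrayValues({ ParentId: ['a1', 'a2'], count: 3 });
```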
Model attributes are case sensitive
The models in my project were defined in camelCase but the JSON response from Salesforce was using EveryWordCapitalized. This causes one-to-many joins in waterline to reduce the many child records to one when it runs _.uniq(childRows, pk). Since the model defines pk == id but the key actually returned from Salesforce is Id, every row yields undefined for the pk, so this call to uniq blows away all child records but one. I'm not entirely sure whether this should count as a waterline bug, but fixing the capitalization in the model attribute definitions resolved it.
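The collapse is easy to reproduce in isolation. The uniqBy function below is a hand-written stand-in for the _.uniq(childRows, pk) call, written out so the snippet has no dependencies; the rows are made-up sample data.

```javascript
// Stand-in for _.uniq(rows, key): keep the first row seen for each key value.
function uniqBy(rows, key) {
  var seen = [];
  return rows.filter(function (row) {
    var value = row[key];
    if (seen.indexOf(value) !== -1) return false;
    seen.push(value);
    return true;
  });
}

// Salesforce-style rows: the key is "Id", not "id".
var childRows = [{ Id: '1', count: 3 }, { Id: '2', count: 5 }, { Id: '3', count: 8 }];

// row.id is undefined for every row, so they all look like duplicates.
var collapsed = uniqBy(childRows, 'id');
// With the correctly capitalized key, every row survives.
var kept = uniqBy(childRows, 'Id');
```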

Related

Sequelize update with associations

I have two models, user and merchant. I send JSON data from the UI.
In Sequelize, I have used "include" option to insert the data like below.
models.user.create(req.body, { include: [models.merchant] });
It is working well as expected. So I have tried to update the data like below.
var filter = {
  where: { id: id },
  include: [
    models.merchant
  ]
};
models.user.update(req.body, filter);
The above code is updating user data only. Association is not working in the update. I don't know what is wrong with this.
Please anyone help to resolve this issue.
Thanks in advance.
The behaviour you're asking for simply can't be achieved with a single update call. If you check the docs for the update function, there isn't an include option, i.e. sequelize can only build an update query for the table of the model whose update function is called.
You will have to update the associations separately. I advise that you put those updates inside a transaction to avoid any issues with multiple updates to the same object happening at the same time.
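A minimal sketch of that approach, using a managed transaction. Several things here are assumptions rather than facts from the question: that the merchant data arrives nested under body.merchant, that the merchant table has a userId foreign key, and the helper name updateUserAndMerchant. The sequelize instance and models are passed in so the function does not depend on how your project wires them up.

```javascript
// Hypothetical helper: update the user row and its merchant row in one transaction.
async function updateUserAndMerchant(sequelize, models, id, body) {
  // Split the payload; the nested "merchant" key is an assumed shape.
  var userValues = Object.assign({}, body);
  var merchantValues = userValues.merchant;
  delete userValues.merchant;

  // If either update rejects, the managed transaction rolls back.
  return sequelize.transaction(async function (t) {
    await models.user.update(userValues, { where: { id: id }, transaction: t });
    if (merchantValues) {
      // "userId" is an assumed foreign key name; use whatever your association defines.
      await models.merchant.update(merchantValues, {
        where: { userId: id },
        transaction: t
      });
    }
  });
}
```

In a route handler this would be called as updateUserAndMerchant(sequelize, models, id, req.body), with error handling attached to the returned promise.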

mongoose query using sort and skip on populate is too slow

I'm using an ajax request from the front end to load more comments to a post from the back-end which uses NodeJS and mongoose. I won't bore you with the front-end code and the route code, but here's the query code:
Post.findById(req.params.postId).populate({
  path: type, // type will contain either "comments" or "answers"
  populate: {
    path: 'author',
    model: 'User'
  },
  options: {
    sort: sortBy, // sortBy contains either "-date" or "-votes"
    skip: parseInt(req.params.numberLoaded, 10), // how many are already shown
    limit: 25 // I only load this many new comments at a time
  }
}).exec(function (err, foundPost) {
  console.log("query executed"); // code takes too long to get to this line
  if (err) {
    res.send("database error, please try again later");
  } else {
    res.send(foundPost[type]);
  }
});
As mentioned in the title, everything works fine; my problem is just that this is too slow: the request takes about 1.5-2.5 seconds. Surely mongoose has a way of doing this that takes less time to load. I poked around the mongoose docs and Stack Overflow but didn't find anything useful.
The skip-and-limit approach is inherently slow in MongoDB because it normally needs to retrieve all the documents, sort them, and only then return the desired segment of the results.
What you need to do to make it faster is to define indexes on your collections.
According to MongoDB's official documents:
Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.
-- https://docs.mongodb.com/manual/indexes/
Using indexes may cause increased collection size but they improve the efficiency a lot.
Indexes are commonly defined on fields that are frequently used in queries. In this case, you may want to define indexes on the date and/or votes fields, since those are the fields you sort by.
Read mongoose documentation to find out how to define indexes in your schemas:
http://mongoosejs.com/docs/guide.html#indexes
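As a rough illustration, schema-level indexes for the two sort fields could be declared like this. The helper name addSortIndexes is hypothetical, and commentSchema is assumed to be the mongoose Schema behind the populated comments or answers; the question does not show the schemas, so adjust the field names to match yours.

```javascript
// Hypothetical helper: declare descending indexes matching the "-date" and
// "-votes" sorts. `commentSchema` is assumed to be a mongoose Schema instance.
function addSortIndexes(commentSchema) {
  commentSchema.index({ date: -1 });
  commentSchema.index({ votes: -1 });
  return commentSchema;
}
```

Call it once on the schema before the model is compiled with mongoose.model(...). Whether the index is actually used for a given query is worth checking with explain().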

Is it possible to have a collection attribute in SailsJs without using the 'via' field?

For example, if I had a 'Conversation' model in a simple chat messaging system, I might do the following:
module.exports = {
  attributes: {
    messages: {
      collection: 'Message'
    }
  }
}
Is this allowed in SailsJs? If not, is it recommended to mimic a "Has" relationship from Conversation to Message by using some form of custom array? Such as below:
module.exports = {
  attributes: {
    messages: {
      type: 'array'
    }
  }
}
In a more complex scenario, my goal is to have the 'Conversation' know all of its 'Message' objects, but it is unnecessary for those 'Message' objects to know of its associated 'Conversation'.
I'd been using that construct for quite a while but only now did I find that the official docs don't specify it.
They mention that in one-way associations a model is associated with another model and don't mention collections. (Though they should work in just the same manner.)
For one-to-many associations they specify that a model can be associated with many other models (a collection) but don't specify what happens if you ignore the via attribute. They simply mention it is needed.
However, if you simply leave out the via attribute, the id field is used as the key for the association. So the construct you specified is allowed.
On a different note, you might want to reconsider keeping messages as either an array or a collection. Since you might need to add/retrieve/update/remove messages in a random fashion and collections and arrays can only be accessed as a whole, it might make sense to specify a relevant index on the Message collection and forgo having an association. This would let you quickly run queries like "retrieve the last 10 messages of thread " and so on.

Sequelize.js - how to properly use get methods from associations (no sql query on each call)?

I'm using Sequelize.js as my ORM and have a few associations (the details don't matter here). My models gain get and set methods from those associations. Like this (from the docs):
var User = sequelize.define('User', {/* ... */})
var Project = sequelize.define('Project', {/* ... */})
// One-way associations
Project.hasOne(User)
/*
...
Furthermore, Project.prototype will gain the methods getUser and setUser
according to the first parameter passed to define.
*/
So now, I have Project.getUser(), which returns a Promise. But if I call this twice on the very same object, I get SQL query executed twice.
My question is: am I missing something, or is this expected behavior? I don't want to make an additional query each time I call the same method on this object.
If this is expected, should I use custom getters with member variables that I populate manually and return if present? Or is there something more clever? :)
Update
Following DeBuGGeR's answer, I understand I can use includes when making a query to eager load everything, but I simply don't need it and can't do it all the time. It's a waste of resources and a big overhead to load my entire DB up front, only to find out (by some criteria) that I won't need it. I want to make additional queries depending on the situation. But I also can't afford to destroy all the models (DAO objects) I have and create new ones with all the info inside them. I should be able to fill in the parts of them that are missing (from relations).
If you use getUser() it will make the query call; it doesn't give you access to the user on the instance. You can manually save the result to project.user or project.users, depending on the association.
But you can try Eager Loading
Project.find({
  include: [
    { model: User, as: 'user' } // here you HAVE to specify the same alias as you did in your association
  ]
}).success(function (project) {
  project.user // contains the user
});
Here is an example of getUser(). Don't expect it to cache the user automatically, and don't override it in some clever way, as that will create side effects. getUser is expected to fetch from the database, and it should!
project.getUser().then(function (user) {
  // user is available and is a sequelize object
  project.user = user; // save it on project.user and use it for as long as you want
});
The first part of things is clear - every call to get[Association] (for example Project.getUser()) WILL result in database query.
Sequelize does not maintain any kind of state nor cache for the results. You can get user in the Promisified result of the call, but if you want it again - you will have to make another query.
What @DeBuGGeR said about using accessors is also not true: accessors are present only immediately after a query and are not preserved.
As sometimes this is not ok, you have to implement some kind of caching system by yourself. Here comes the tricky part:
IF you want to use the same get method Project.getUser(), you won't be able to do it, as Sequelize overrides your instanceMethods. For example, if you have the association mentioned above, this won't work:
instanceMethods: {
  getUser: function () {
    // check if you have it, otherwise make a query
  }
}
There are few possible ways to fix it - either change Sequelize core a little (to first check if the method exists), or use some kind of wrapper to those functions.
More details about this can be found here: https://github.com/sequelize/sequelize/issues/3707
Thanks to mickhansen for the cooperation on how to understand what to do :)
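The wrapper option can be as small as a standalone function that memoizes the promise on the instance. This is only a sketch: getUserCached and the _userPromise property are hypothetical names, the cache is never invalidated, and it assumes getUser() returns a promise with a .catch method.

```javascript
// Hypothetical wrapper: call project.getUser() at most once per instance
// and hand back the same promise on later calls.
function getUserCached(project) {
  if (!project._userPromise) {
    project._userPromise = project.getUser().catch(function (err) {
      // Do not keep a failed lookup around; let the next call retry.
      project._userPromise = null;
      throw err;
    });
  }
  return project._userPromise;
}
```

Because it lives outside instanceMethods, Sequelize has nothing to override. If the association changes (for example after setUser), clear project._userPromise yourself.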

Node.js + Mongoose / Mongo & a shortened _id field

I'd like the unique _id field in one of my models to be relatively short: 8 letters/numbers, instead of the usual Mongo _id which is much longer. Having a short unique-index like this helps elsewhere in my code, for reasons I'll skip over here. I've successfully created a schema that does the trick (randomString is a function that generates a string of the given length):
new Schema('Activities', {
  '_id': { type: String, unique: true, 'default': function () { return randomString(8); } },
  // ... other definitions
});
This works well so far, but I am concerned about duplicate IDs generated from the randomString function. There are 36^8 possible IDs, so right now it is not a problem... but as the set of possible IDs fills up, I am worried about insert commands failing due to a duplicate ID.
Obviously, I could do an extra query to check if the ID was taken before doing an insert... but that makes me cry inside.
I'm sure there's a better way to be doing this, but I'm not seeing it in the documentation.
The shortid library (https://github.com/dylang/shortid) is used by Doodle or Die and seems to be battle-tested.
By creating a unique index on _id you'll get an error if you try to insert a document with a duplicate key. So wrap error handling around any inserts you do that looks for the error and then generates another ID and retries the insert in that case. You could add a method to your schema that implements this enhanced save to keep things clean and DRY.
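A sketch of that retry loop is below. insertWithRetry, insert and generateId are hypothetical names; insert is whatever function performs the save and returns a promise. The check relies on MongoDB reporting duplicate key violations with error code 11000; how that code surfaces on the error object can vary between driver and mongoose versions, so treat err.code as an assumption to verify.

```javascript
// Hypothetical helper: try an insert with a fresh ID, regenerating the ID
// whenever the database reports a duplicate key (MongoDB error code 11000).
async function insertWithRetry(insert, generateId, maxAttempts) {
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    var id = generateId();
    try {
      return await insert(id);
    } catch (err) {
      // Rethrow anything that is not a duplicate key error,
      // and give up once the attempts are used up.
      if (err.code !== 11000 || attempt === maxAttempts) throw err;
    }
  }
}
```

With the schema above, usage might look like insertWithRetry(function (id) { return new Activity({ _id: id }).save(); }, function () { return randomString(8); }, 5), where Activity is the model compiled from the schema and the limit of 5 is arbitrary.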
