How to obtain a MongoDb collection in NodeJS - node.js

There are two different methods to obtain a reference to a MongoDB collection - both of them are used throughout the official documentation.
There is
var mycollection = db.collection('mycollection)'
and there is
db.collection('mycollection', function(err, collection){
//use collection
}
I tend to use the second one because it is consistent with "db.createCollecion(collection, callback)"
What is the difference between these methods?
Is there any database interaction when using these methods?

If you look at the code for Database, currently around line 456, you'll see that the only difference between the two in the way you've used them is how the collection object is returned. If you specify a callback, then it's returned that way, otherwise, it's returned as the value to the function. If you set the options however and specifically the option strict to true, you need to use the callback. When strict is set to true, the collection is verified before continuing (asynchronously).
Given that collections can be created dynamically (and usually are upon first use), there often isn't need to use strict mode.
So, it's really matter of personal coding preference otherwise. There is normally no activity to the database when creating a Collection object via: db.collection('collectionname') with the exception I mentioned above.

Related

Express with pug, Postgres and proper MVC

I recently started using Node.js + Express.js (generated with pug) + pg-promise for handling db.
My first target is to obtain data from Postgres (already set up) and display it pretty using render and pug. Let's say it is user list from Users table.
On this restful tutorial I have learned how to get data and return it as JSON - it worked.
Based on Mozilla's tutorial I seperated my code:
routes/users.js: where for '/' I call user_controller.user_list method (using router.get)
controllers/userController.js I have exported user_list where I would like to ask model for data and call render if I have results
queries.js which is kinda my model? But I'm not sure. It has API: connection to db with promises and one function for every query I am going to use in Controllers. I believe I should have like one Model file per table (or any logical entity) but where to store pgp connections?
This file is based on first tutorial I mentioned
// queries.js (connectionString is set properly to my postgres)
var pgp = require('pg-promise')(options);
var db = pgp(connectionString);
function getUsers(req, res, next) {
db.any('SELECT (user_id, username) FROM public.users ORDER BY user_id ASC LIMIT 1000')
.then(function (data) {
res.json({ data: data });
})
.catch(function (err) {
return next(err);
});
}
module.exports = {
getUsers: getUsers
};
Here starts my problem as most tutorials uses mongoose which is very model-db-schema-friendly and what I have is simple 'SELECT ...' string I pass to pg-promise's any() function.
Therefore I have no model class like User.
In userControllers.js I don't know how to call getUsers() to handle its data. Returning JS object from getUsers() would be nice.
Also: where should I call render? In controller or only in
db.any(...).then(function (data) { <--here--> })
Before, I also tried to embed whole Postgres handling into Controller but from db.any() I got this array for handling:
[{ row: '(1,John)' },{ row: '(2,Amy)' },{ row: '(50,Peter)' } ]
Didn't know how go from there as I probably lost my API functionality as well ;-)
I am browsing through multiple tutorials how to handle MVC but usually they handle MongoDB and
satisfy readers with res.send() not render().
I am not sure that I understand what your question is exactly about, but since I do not have enough reputation to comment, I'll do my best to help you with your interrogations. :)
First, regarding the queries.js file, it is IMO not exactly a model, but rather a DAO (Data Access Object) file. DAO comes between you Model (which is actually you database) and your Controller layers. There usually is a DAO file per object (User, Pet, whatever you want) in your data model.
When the data model is rather complex, it can be useful to use an Object Relational Mapping (ORM) such as Mongoose to map your database and execute complexe processes on your objects. In such a case, you might need a specific file per object so as to describe your model and store your queries. But since you don't need an ORM, you DAO can directly interact with your database. That is why you do not have a User.js file.
Regarding the way the db object should be used, I think you should refer directly to pg-promise documentation on the matter.
IMPORTANT: For any given connection, you should only create a single
Database object in a separate module, to be shared in your application
(see the code example below). If instead you keep creating the
Database object dynamically, your application will suffer from loss in
performance, and will be getting a warning in a development
environment (when NODE_ENV = development)
As a matter of fact, a db object in pg-promise sort of represents the database itself and is actually designed for the simultaneous use of several databases, which does not seem to be your case for the moment.
Finally, when it comes to the render function, I believe it should be in the controller, as your DAO is not supposed to know how the data it has gathered is going to be used.
Modularity is always a time-saving choice on the long-term.
Furthermore, note that you might later need a Business Layer between your DAO and your controller, in order to preprocess and postprocess data you are going to persist or to display. In such a case, if you need for instance to ask for data from your database, you will need to render data after it is processed by the Business layer. If the render is made in the DAO layer, it will not be possible.
In the link I provided earlier to pg-promise's db object connection, you will also find documentation on the any() method. You might already have looked it up.
It specifically states that it returns
A promise object that represents the query result:
When no rows are returned, it resolves with an empty array.
When 1 or more rows are returned, it resolves with the array of rows.
so your returned data is a JS Array. If you want to make it a JS object, just use
JSON.stringify(yourArray) to process your data before rendering it in your controller.
But I wonder if Pug is not able to use your data directly.
Also, if you cannot get any data out of your DAO, maybe you should check that your data object is not empty, as such a case is tolerated by the any() method. If you expect your query to always return something, you might want to consider using the many() or the one() methods.
I hope this helps you.

A collection 'DBName.CollectionName' already exists MongoDB [duplicate]

I'm working on a node.js app that uses MongoDB and I read this from the docs:
db.collection
Fetch a specific collection (containing the actual collection information). If the application does not use strict mode you can can use it without a callback in the following way.
var collection = db.collection('mycollection');
First of all, what 'strict mode' is the doc referring to?
Also, is it a bad practice to grab the collection in this fashion? Without the callback, wouldn't I lose the ability to capture a potential connection error when trying to select the right collection?
db.collection('some_collection', function(err, collection) {
// query goes here
});
http://mongodb.github.io/node-mongodb-native/api-generated/db.html#collection
strict, (Boolean, default:false) returns an error if the collection
does not exist
Right there in the documentation.
That is there so your application may not create new collections itself and can only reference what has been created before. Hence the need for the callback, in order to trap the error.
It might be referring to Javascript's strict mode instead of a Mongo specific feature. strict mode enables some optional but backwards incompatible changes in the Javascript language that help catch some bugs:
What does "use strict" do in JavaScript, and what is the reasoning behind it?

Mongoose compound index creation fields order

I have a question regarding compound index creation in mongodb:
Say I want to create this compound index:
cSchema.index({account:1, authorization: 1, c_type: 1});
Problem is, javascript doesn't guarantee dictionary order, so I won't be sure the compound index is in the order I want.
How can I make sure it's indeed {account:1, authorization:1, c_type:1} in that order ?
Thanks !
The simple answer is that most abuse the behaviour that simple String properties that don't parse to an integer on an Object will enumerate in creation order. Although not guaranteed in ES2015 for some enumeration methods, it does work in a confined environment like Node/V8. This has worked in most JS engines for a while now but had never been part of the ES spec before ES2015.
MongoDB
The underlying MongoDB driver createIndex function supports String, Array and Object index definitions. The parsing code is in a util function called parseIndexOptions.
If you specify an array of Strings, Array pairs, or Objects then the order will be fixed:
['location','type']
[['location', '2d'],['type', 1]]
[{location: '2d'},{type: 1}]
Also note that the createIndex code does use Object.keys to enumerate when it gets an Object of index properties.
Mongoose
The Schema#index function is documented as requiring an Object to pass to "MongoDB driver's createIndex() function" so looks to support passing through whatever options you provide.
There are a couple of places where indexes are further processed though. For example when you create a sub document with an index, the index needs to have the parent schema prefixed onto field names. On a quick glance I think this code still works with an Array but I can't see any tests for that in the Mongoose code so you might want to confirm it yourself.
Mongoose does have a compound index test that relies on Object property order.

Sequelize.js - how to properly use get methods from associations (no sql query on each call)?

I'm using Sequelize.js for ORM and have a few associations (which actually doesn't matter now). My models get get and set methods from those associations. Like this (from docs):
var User = sequelize.define('User', {/* ... */})
var Project = sequelize.define('Project', {/* ... */})
// One-way associations
Project.hasOne(User)
/*
...
Furthermore, Project.prototype will gain the methods getUser and setUser
according to the first parameter passed to define.
*/
So now, I have Project.getUser(), which returns a Promise. But if I call this twice on the very same object, I get SQL query executed twice.
My question is - am I missing something out, or this is an expected behavior? I actually don't want to make additional queries each time I call the same method on this object.
If this is expected - should I use custom getters with member variables which I manually populate and return if present? Or there is something more clever? :)
Update
As from DeBuGGeR's answer - I understand I can use includes when making a query in order to eager load everything, but I simply don't need it, and I can't do it all the time. It's waste of resources and a big overhead if I load my entire DB at the beginning, just to understand (by some criteria) that I won't need it. I want to make additional queries depending on situation. But I also can't afford to destroy all models (DAO objects) that I have and create new ones, with all the info inside them. I should be able to update parts of them, which are missing (from relations).
If you use getUser() it will make the query call, it dosent give you access to the user. You can manually save it to project.user or project.users depending on the association.
But you can try Eager Loading
Project.find({
include: [
{ model: User, as: 'user' } // here you HAVE to specify the same alias as you did in your association
]
}).success(function(project){
project.user // contains the user
});
Also e.g of getUser(). Dont expect it to automatically cache user and dont override this cleverly as it will create side effects. getUser is expected to get from database and it should!
Project.getUser().then(function(user){
// user is available and is a sequelize object
project.user = user; // save project.user and use it till u want to
})
The first part of things is clear - every call to get[Association] (for example Project.getUser()) WILL result in database query.
Sequelize does not maintain any kind of state nor cache for the results. You can get user in the Promisified result of the call, but if you want it again - you will have to make another query.
What #DeBuGGeR said - about using accessors is also not true - accessors are present only immediately after a query, and are not preserved.
As sometimes this is not ok, you have to implement some kind of caching system by yourself. Here comes the tricky part:
IF you want to use the same get method Project.getUser(), you won't be able to do it, as Sequelize overrides your instanceMethods. For example, if you have the association mentioned above, this won't work:
instanceMethods: {
getUser: function() {
// check if you have it, otherwise make a query
}
}
There are few possible ways to fix it - either change Sequelize core a little (to first check if the method exists), or use some kind of wrapper to those functions.
More details about this can be found here: https://github.com/sequelize/sequelize/issues/3707
Thanks to mickhansen for the cooperation on how to understand what to do :)

Referencing external doc in CouchDB view

I am scraping an 90K record database using JSON-RPC and I am trying to put in some basic error checking. I want to start by scraping the database twice using two different settings and adding a prefix to the second scrape. This way I can check to ensure that the two settings are not producing different records (due to dropped updates, etc). I wanted to implement the comparison using a view which compares each document from the first scrape with it's twin produced by the second scrape and then emit the names of records with a difference between them.
However, I cannot quite figure out how to pull in another doc in the view, everything I have read only discusses external docs using the emit() function, which is too late to permit me to compare it. In the example below, the lookup() function would grab the referenced document.
Is this just not possible?
function(doc) {
if(doc._id.slice(0,1)!=='$' && doc._id.slice(0,1)!== "_"){
var otherDoc = lookup('$test" + doc._id);
if(otherDoc){
var keys = doc.value.keys();
var same = true;
keys.forEach(function(key) {
if ((key.slice(0,1) !== '_') && (key.slice(0,1) !=='$') && (key!=='expires')) {
if (!Object.equal(otherDoc[key], doc[key])) {
same = false;
}
}
});
if(!same){
emit(doc._id, 1);
}
}
}
}
Context
You are correct that this is not possible in CouchDB. The whole point of the map function is that it must be idempotent, otherwise you lose all the other nice benefits of a pre-calculated index.
This is why you cannot access external resources in the map function, whether they be other records or the clock. Any time you run a map you must always get the same result if you put the same record into it. Since there are no relationships between records in CouchDB, you cannot promise that this is possible.
Solution
However, you can still achieve your end goal, just be different means. Some possibilities...
Assuming there is some meaningful numeric value in each doc, you could use a view to take the sum of all those values and group them by which import you did ({key: <batch id>, value: <meaningful number>}). Then compare the two numbers in your client or the browser to see if they match.
A brute force approach would be to use a view to pair the docs that should match. Each doc is on a different row, but they're grouped by a common field. Then iterate through the entire index comparing the pairs. This would certainly be the quickest to code and doesn't depend on your application or data.
Implement a validation function to enforce a schema on your data. Just be warned that this will reduce your write throughput since each written record will be piped out of Erlang and into the JS engine. Also, this is only applicable if you're worried about properly formed records instead of their precise content, which might not be the case.
Instead of your different batch jobs creating different docs, have them place them into the same doc. The structure might look like this: { "_id": "something meaningful", "batch_one": { ..data.. }, "batch_two": { ..data.. } } Then your validation function could compare them or you could create a view that indexes all the docs that don't match. All depends on where in your pipeline you want to do the error checking and correction.
Personally I like the last option better, but only if you don't plan to use the database as is in production. Ie., you wouldn't want to carry around all that extra data in each record.
Hope that helps.
Cheers.

Resources