Mongoose: Read on ReplicaSet

I have a MongoDB replica set and I want to read data from both the primary and the secondary members.
I have used this command to connect to the db:
mongoose.connect('mongodb://user:password@54.230.1.1,user:password@54.230.1.2,user:password@54.230.1.3/PanPanDB?replicaSet=rs0&readPreference=nearest');
It doesn't work: my application continues to read from the primary. Any suggestions, please?

If you want to read from a secondary, you should set your read preference to either of:
secondaryPreferred - In most situations, operations read from secondary members but if no secondary members are available, operations read from the primary.
secondary - All operations read from the secondary members of the replica set.
Reading from nearest as per your example will select the nearest member by ping time (which could be either the primary or a secondary).
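As a rough sketch of the URI side of this, the helper below assembles a replica set connection string with a validated read preference. The function name buildReplicaSetUri and its parameter names are made up for illustration; the hosts and database name are taken from the question. It assumes the standard connection string layout, in which the credentials appear once before the host list rather than once per host.

```javascript
// Read preference modes accepted in a MongoDB connection string.
const READ_PREFERENCES = [
  'primary',
  'primaryPreferred',
  'secondary',
  'secondaryPreferred',
  'nearest',
];

// Hypothetical helper (not part of Mongoose): builds a replica set URI
// with the credentials given once, before the comma-separated host list.
function buildReplicaSetUri({ user, password, hosts, db, replicaSet, readPreference }) {
  if (!READ_PREFERENCES.includes(readPreference)) {
    throw new Error(`Unknown read preference: ${readPreference}`);
  }
  // Percent-encode credentials so characters such as '@' or ':' stay valid.
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  const query = new URLSearchParams({ replicaSet, readPreference }).toString();
  return `mongodb://${credentials}@${hosts.join(',')}/${db}?${query}`;
}

const uri = buildReplicaSetUri({
  user: 'user',
  password: 'password',
  hosts: ['54.230.1.1', '54.230.1.2', '54.230.1.3'],
  db: 'PanPanDB',
  replicaSet: 'rs0',
  readPreference: 'secondaryPreferred',
});

console.log(uri);
```

The resulting string can then be passed to mongoose.connect as in the question.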
Caveats
When using any read preference other than primary, you need to be aware of potential issues with eventual consistency that may affect your application logic. For example, if you are reading from a secondary there may be changes on the primary that have not replicated to that secondary yet.
If you are concerned about stronger consistency when reading from secondaries you should review the Write Concern for Replica Sets documentation.
Since secondaries have to write the same data as the primary, reading from secondaries may not improve performance unless your application is very read heavy or is fine with eventual consistency.
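To make the eventual consistency caveat concrete, here is a toy in-memory model of a primary and a single secondary. It is plain JavaScript, not MongoDB or Mongoose code, and the class and method names are invented for illustration. It only shows why a read from a secondary can miss a write the primary has already accepted.

```javascript
// Toy model of one primary and one secondary; not real MongoDB behaviour.
class ToyReplicaSet {
  constructor() {
    this.primary = new Map();
    this.secondary = new Map();
    this.pending = []; // writes accepted by the primary but not yet replicated
  }

  // Writes always go to the primary and are queued for replication.
  write(key, value) {
    this.primary.set(key, value);
    this.pending.push([key, value]);
  }

  // Simulates the secondary catching up with the primary.
  replicate() {
    for (const [key, value] of this.pending) {
      this.secondary.set(key, value);
    }
    this.pending = [];
  }

  // Reads go to the member selected by the (simplified) read preference.
  read(key, readPreference = 'primary') {
    const member = readPreference === 'secondary' ? this.secondary : this.primary;
    return member.get(key);
  }
}

const rs = new ToyReplicaSet();
rs.write('balance', 100);

const fromPrimary = rs.read('balance', 'primary');  // sees the write
const staleRead = rs.read('balance', 'secondary');  // write not replicated yet
rs.replicate();
const freshRead = rs.read('balance', 'secondary');  // secondary has caught up

console.log({ fromPrimary, staleRead, freshRead });
```

If application logic reads its own writes back immediately, the staleRead case is the one to watch for.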

Following the documentation found on MongoDB website and on Mongoose web site, you can add this instruction for configuring the ReadPreference on Mongoose:
// Pass the mode string itself ('nearest'), not the text 'ReadPreference.NEAREST'
var opts = { replSet: { readPreference: 'nearest' } };
mongoose.connect('mongodb://###:#######:###/###', opts);
This has been tested using Mongoose version 3.8.9

As well as setting the connection URI (as you did) and the connection options (as Emas did), I also had to explicitly choose the server for each query, e.g.
var query = User.find({}).read("nearest");
query.exec(function(err, users) {
// ...
});

Mongoose uses the Node package "mongodb", and the connection URI and options are parsed by "mongodb". See the mongodb connect options and the mongodb readPreference source code.
So, we can use mongoose like this:
var opts = { db: { readPreference: 'nearest' } };
mongoose.connect(uri, opts);
Alternatively, set the read preference in the URI itself:
var uri = 'mongodb://###?readPreference=nearest';
mongoose.connect(uri, opts);
The above takes effect in Mongoose 4.3.4.

This is the proper instantiation in Mongoose v5.9.5:
const opts = {
readPreference: 'nearest',
}
mongoose.connect(MONGODB_CONNECTION, opts)
These are the different string values depending on the preference type you're looking for:
ReadPreference.PRIMARY = 'primary';
ReadPreference.PRIMARY_PREFERRED = 'primaryPreferred';
ReadPreference.SECONDARY = 'secondary';
ReadPreference.SECONDARY_PREFERRED = 'secondaryPreferred';
ReadPreference.NEAREST = 'nearest'

You can do that with the code below:
var collection = db.collection(collectionName,{readPreference:'secondaryPreferred'});
http://p1bugs.blogspot.in/2016/06/scaling-read-query-load-on-mongodb.html

Related

how to set mongoose indexes correctly and test them

I want to set 2 indexes for now, perhaps a 3rd but wanted to know how I can test if they are actually working? Do I need to use with mongo shell or is there a way to check using Node.js during development? I also saw an example of the indexes being created in mongoDb Compass. I am using mongoDb Atlas so wondered if I must just set the index in Compass or do I still need to do it in my mongoose schema?
Also, the mongoose docs say you should set autoIndex to false. Is the below then correct?
const mongoose = require("mongoose");
const Schema = mongoose.Schema;

const userSchema = new Schema({
  firstName: {
    type: String,
  },
  lastName: {
    type: String,
  },
});

userSchema.set("autoIndex", false);
userSchema.index({ firstName: 1, lastName: 1 });

module.exports = mongoose.model("User", userSchema);

There are a bunch of different questions here, let's see if we can tackle them in order.
I want to set 2 indexes for now, perhaps a 3rd
This isn't a question from your side, but rather from mine. What are the indexes that you are considering and what queries will you be running?
The reason I ask is because I only see a single index definition provided in the question ({ firstName: 1, lastName: 1 }) and no query. Normally indexes are designed specifically to support the queries, so the first step towards ensuring a successful indexing strategy is to make sure they align appropriately with the anticipated workload.
how I can test if they are actually working? Do I need to use with mongo shell or is there a way to check using Node.js during development?
There are a few ways to approach this, which include:
Using the explain() method to confirm that the winningPlan is using the index as expected. This is often done via the MongoDB Shell or via Compass.
Using the $indexStats aggregation stage to confirm that usage counters of the index are incrementing as expected when the application runs.
Taking a look at some of the tabs in the Atlas UI such as Performance Advisor or the Profiler which may help alert you to unoptimized operations and missing indexes.
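For the explain() route, the check can also be done from Node.js during development. The sketch below is only the inspection step: it walks an explain-style document and reports whether the winning plan contains an IXSCAN stage on a given index. The function name planUsesIndex is made up, and the two sample documents are hand-written to resemble the classic queryPlanner.winningPlan shape. Real explain output varies by server version, so treat the field layout as an assumption to verify against your own output.

```javascript
// Hypothetical helper: returns true if the plan stage, or any stage
// nested beneath it, is an index scan (IXSCAN) on the named index.
function planUsesIndex(stage, indexName) {
  if (!stage) return false;
  if (stage.stage === 'IXSCAN' && stage.indexName === indexName) return true;
  // A stage may have a single child (inputStage) or several (inputStages).
  const children = [stage.inputStage, ...(stage.inputStages || [])];
  return children.some((child) => planUsesIndex(child, indexName));
}

// Hand-written sample resembling explain output for a query served by
// the { firstName: 1, lastName: 1 } index from the question.
const indexedExplain = {
  queryPlanner: {
    winningPlan: {
      stage: 'FETCH',
      inputStage: {
        stage: 'IXSCAN',
        keyPattern: { firstName: 1, lastName: 1 },
        indexName: 'firstName_1_lastName_1',
      },
    },
  },
};

// Hand-written sample resembling a full collection scan (no index used).
const collScanExplain = {
  queryPlanner: { winningPlan: { stage: 'COLLSCAN' } },
};

const usesIndex = planUsesIndex(indexedExplain.queryPlanner.winningPlan, 'firstName_1_lastName_1');
const scanUsesIndex = planUsesIndex(collScanExplain.queryPlanner.winningPlan, 'firstName_1_lastName_1');

console.log(usesIndex, scanUsesIndex);
```

In a real project the input would be the document returned by running explain on the actual query, rather than these samples.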
I am using mongoDb Atlas so wondered if I must just set the index in Compass or do I still need to do it in my mongoose schema?
You can use Compass (or the Atlas UI, or the MongoDB Shell) to create your indexes. I would recommend against doing this in the application directly.
Also, the mongoose docs say you should set autoIndex to false. Is the below then correct?
As noted above, I would go further and remove index creation from the application code altogether. There can be some unintended side effects of making the application directly responsible for index management, which is one of the reasons that Mongoose no longer recommends using the autoIndex functionality.

node-mysql2: resultset not reflecting the latest results

I'm using node-mysql2 with a connection pool and a connection limit of 10. When I restart the application, the results are good - they match what I have on the db. But when I start inserting new records and redo the same select queries, then I get intermittent results missing the latest record I just added.
If I do check the database directly, I can see the records I just added through my application. It's only the application that cannot see it somehow.
I think this is a bug, but here's how I have my code setup:
module.exports.getDB = function (dbName) {
  if (!(dbName in dbs)) {
    console.log(`Initiating ${dbName}`);
    let config = dbConfigs[dbName];
    dbs[dbName] = mysql.createPool({
      host: config.host,
      port: config.port || 3306,
      user: config.user,
      password: config.password,
      connectionLimit: 10,
      database: config.database,
      debug: config.debug
    });
  }
  return dbs[dbName]; // I just initialize each database once
};
This is my select query:
let db = dbs.getDB('myDb');
const [rows] = await db.query(`my query`);
console.log(rows[0]); // this one starts to show my results inconsistently once I insert records
And this is my insert query:
module.exports = {
  addNote: async function(action, note, userID, expID) {
    let db = dbs.getDB('myDb');
    await db.query(`INSERT INTO experiment_notes (experiment_id, action, created_by, note)
      VALUES (?, ?, ?, ?)`, [expID, action, userID, note]);
  }
};
If I set the connectionLimit to 1, I cannot reproduce the problem... at least not yet
Any idea what I'm doing wrong?

Setting your connectionLimit to 1 has an interesting side-effect: it serializes all access from your node program to your database. Each operation, be it INSERT or SELECT, must run to completion before the next one starts because it has to wait for the one connection in the pool to free up.
It's likely that your intermittently missing rows are due to concurrent access to your DBMS from different connections in your pool. If you do a SELECT from one connection while MySQL is handling the INSERT from another connection, the SELECT won't always find the row being inserted. This is a feature. It's part of ACID (atomicity, consistency, isolation, durability). ACID is vital to making DBMSs scale up.
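The serialization effect can be seen with a toy pool, sketched below. This is not node-mysql2 code: TinyPool, sleep and demo are invented names, and the "INSERT" and "SELECT" are just timers. The only point is the ordering of events with a limit of 1 versus a limit of 10.

```javascript
// Toy stand-in for a connection pool: at most `limit` tasks run at once.
class TinyPool {
  constructor(limit) {
    this.limit = limit;
    this.active = 0;
    this.waiting = []; // resolvers for tasks waiting on a free "connection"
  }

  async run(task) {
    // Wait until a "connection" is free.
    while (this.active >= this.limit) {
      await new Promise((resolve) => this.waiting.push(resolve));
    }
    this.active++;
    try {
      return await task();
    } finally {
      // Release the "connection" and wake one waiting task, if any.
      this.active--;
      const next = this.waiting.shift();
      if (next) next();
    }
  }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Starts a slow "INSERT" and a fast "SELECT" together and records the
// order in which they start and finish.
async function demo(limit) {
  const pool = new TinyPool(limit);
  const events = [];
  await Promise.all([
    pool.run(async () => {
      events.push('insert start');
      await sleep(20);
      events.push('insert end');
    }),
    pool.run(async () => {
      events.push('select start');
      await sleep(1);
      events.push('select end');
    }),
  ]);
  return events;
}

demo(1).then((events) => console.log('limit 1: ', events));
demo(10).then((events) => console.log('limit 10:', events));
```

With a limit of 1 the "SELECT" cannot start until the "INSERT" has finished; with a limit of 10 the two overlap, which is the situation in which a SELECT can miss a row that is still being inserted.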
In more complex applications than the one you showed us, the same thing can happen when you use DBMS transactions and forget to COMMIT them.
Edit: Multiple database connections, even connections from the same pool in the same program, work independently of each other. So, if you're performing a not-yet-committed transaction on one connection and a query on another connection, the query will (usually) reflect the database's state before the transaction started. The query cannot force the transaction to roll back unless it somehow causes a deadlock. But deadlocks generate error messages; you probably are not seeing any.
You can sometimes control what a query sees by preceding it, on the same connection, with SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; . That can, on a busy DBMS, improve query performance a little bit and prevent some deadlocks, as long as you're willing to have your query see only part of a transaction. I use it for historical queries (what happened yesterday). It's documented in the MySQL manual under SET TRANSACTION. The default, the one that explains what you see, is SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
But, avoid that kind of isolation-level stuff until you need it. (That advice comes under the general heading of "too smart is dumb.")

Mongoose - how to find discriminators already in use

I'm using MongoDB and Mongoose in a REST API. Some deployments require a replica set, and thus separate read/write databases, so I have separate read/write connections in the API. However, simpler deployments don't need a replica set, and in those cases I point my read/write connections to the same MongoDB instance and database.
My general approach is to create all models for both connections at API start up. Even when read/write conns are connecting to same database, I am able to create the same models on both connections without error.
let ReadUser = dbRead.model('User', userSchema);
let WriteUser = dbWrite.model('User', userSchema);
// no error even when dbRead and dbWrite point to same DB
Trouble comes when I start using Mongoose discriminators.
let ReadSpecialUser = ReadUser.discriminator('SpecialUser', specialUserSchema);
let WriteSpecialUser = WriteUser.discriminator('SpecialUser', specialUserSchema);
// Results in this Error when read and write point to same DB:
// Error: Discriminator with name "SpecialUser" already exists
I'm looking for an elegant way to deal with this. Is there a way to query the db for discriminators that are already in use?

According to the Mongoose API docs the way to do this is to use Model.discriminators. So in the case above it would be
ReadUser.discriminators
or
WriteUser.discriminators
However this doesn't return anything for me. What does work is using
Object.keys(Model.discriminators)
As expected this gets you an array of strings of the discriminator names you've set previously.
If you want to use the existing discriminator model and know its name what you can do is use Model.discriminators.discriminatorName. In your example it would be:
let ReadSpecialUserDocument = new ReadUser.discriminators.SpecialUser({
key: value,
key: value,
});
ReadSpecialUserDocument.save()
This can be useful when you need to reuse the discriminator at different times, and its name is tied to your data in some way.
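Building on the Model.discriminators lookup, one possible guard is to reuse a discriminator when it is already registered and only create it otherwise. The sketch below is an assumption-laden illustration: getOrCreateDiscriminator is a made-up helper, and makeFakeModel is a stand-in that imitates only the discriminators property and the discriminator() method, so that the snippet runs without Mongoose or a database. Whether this resolves the two-connection case from the question depends on where your Mongoose version tracks the name, so test it against your setup.

```javascript
// Hypothetical helper: return the discriminator already registered on the
// model under `name`, or register it if it does not exist yet.
function getOrCreateDiscriminator(Model, name, schema) {
  const existing = Model.discriminators && Model.discriminators[name];
  return existing || Model.discriminator(name, schema);
}

// Stand-in for a Mongoose model (illustration only). It mimics the
// "already exists" error from the question when a name is registered twice.
function makeFakeModel() {
  return {
    discriminators: undefined,
    discriminator(name, schema) {
      this.discriminators = this.discriminators || {};
      if (this.discriminators[name]) {
        throw new Error(`Discriminator with name "${name}" already exists`);
      }
      this.discriminators[name] = { modelName: name, schema };
      return this.discriminators[name];
    },
  };
}

const FakeUser = makeFakeModel();
const firstCall = getOrCreateDiscriminator(FakeUser, 'SpecialUser', {});
const secondCall = getOrCreateDiscriminator(FakeUser, 'SpecialUser', {});

console.log(firstCall === secondCall);             // same object, no error thrown
console.log(Object.keys(FakeUser.discriminators)); // names registered so far
```

With real models, the same helper would be called with ReadUser or WriteUser in place of FakeUser and specialUserSchema as the schema.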

Can you use CouchDB 'document update handlers' with replication?

I am replicating docs from DB A to DB B, every time a Doc from DB A arrives in DB B I want to run a 'stored procedure' to remove most of the fields from DB A (DB A is private, but has attachments that I want to be publicly available)
So far I've seen that this might be achieved using the _changes feed (continuous) and then running an 'update' handler on each document.
The document update handlers doc: https://wiki.apache.org/couchdb/Document_Update_Handlers
This seems like something that CouchDB would implement for me... (and I'm not really sure yet how to do the above).
Is there something like a 'hook' that can be run on every document that enters the database?
== EDIT ==
It seems that I would want to somehow include the update handler command in the replication trigger?

It sounds like, with some changes to how you're storing documents, you may be able to benefit from CouchDB's filtered replication. You'd need to store the attachments in documents that could be copied as-is (without modification) between the two databases.
If that's not an option, then you could potentially use transform-pouch plus PouchDB's .replicate.from() method to manage the replication.
Some quick pseudo-code for this idea looks a bit like this:
var PouchDB = require('pouchdb');
PouchDB.plugin(require('transform-pouch'));
var dbA = new PouchDB('a'); // "a" could be a URL to CouchDB or Cloudant
var dbB = new PouchDB('b');
dbB.transform({
  incoming: function (doc) {
    // do something to the document before storage
    return doc;
  }
});
dbB.replicate.from(dbA);
In theory, that (or something like it) should do what you're wanting...or at least give you the framework in which to do what you're wanting. ^_^
Hope that helps!
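For the "do something to the document" step, a field-stripping function could look roughly like the sketch below. It is a plain function with no PouchDB dependency. The name toPublicDoc, the allow-list and the sample document are all made up for illustration. It keeps every underscore-prefixed key, on the assumption that those are CouchDB metadata (such as _id, _rev and _attachments) that replication needs intact.

```javascript
// Assumed allow-list of ordinary fields that may appear in the public copy.
const PUBLIC_FIELDS = ['title'];

// Hypothetical transform: keeps CouchDB metadata (keys starting with '_')
// and allow-listed fields, and drops everything else. Returns a new object.
function toPublicDoc(doc, allowed = PUBLIC_FIELDS) {
  const result = {};
  for (const key of Object.keys(doc)) {
    if (key.startsWith('_') || allowed.includes(key)) {
      result[key] = doc[key];
    }
  }
  return result;
}

// Made-up sample document standing in for one arriving from DB A.
const privateDoc = {
  _id: 'report-1',
  _rev: '1-abc',
  _attachments: { 'photo.png': { content_type: 'image/png' } },
  title: 'Quarterly report',
  internalNotes: 'not for publication',
  authorEmail: 'someone@example.com',
};

const publicDoc = toPublicDoc(privateDoc);
console.log(Object.keys(publicDoc));
```

A function like this could be returned from the incoming hook in the pseudo-code above, for example with return toPublicDoc(doc);.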

node.js + mongoose connection and creation issue

I just want to know: when I set up a Mongoose connection and define some models (having added the appropriate requires in app.js, or wherever), will a model that does not exist yet be created automatically the first time I run node app.js?
Is this kind of logic correct?
If not, do I have to create my MongoDB collections, models and so on beforehand?
I was thinking of automatic creation of the MongoDB collection when I first run app.js.
Thanks!
Michele Prandina

Schemas (and models) are a client-side (node.js) manifestation of your data model. A few things, like the indexes you've defined, are created upon first use (like saving a document, for example). Nearly everything else is created lazily, including collections.
If you want consistent behavior regarding your models (and their associated schemas), you'll need to make sure they're loaded prior to any access of the associated database. It doesn't really matter where you put them, as long as they are created/executed prior to usage. You might for example:
app.js
models\Cheese.js
\Cracker.js
Then, in app.js:
var Cheese = require('./models/Cheese.js');
var Cracker = require('./models/Cracker.js');
Assuming, of course, you've exported the models:
var mongoose = require('mongoose');

// Export the compiled model so that app.js can require it
module.exports = mongoose.model('Cheese',
  new mongoose.Schema({
    name: String,
    color: String
  })
);
