I'm using node-mysql2 with a connection pool and a connection limit of 10. When I restart the application, the results are correct; they match what's in the database. But once I start inserting new records and rerun the same SELECT queries, I get intermittent results that are missing the latest record I just added.
If I check the database directly, I can see the records my application just added. It's only the application that somehow cannot see them.
I think this is a bug, but here's how my code is set up:
const mysql = require('mysql2/promise'); // assumed import; not shown in the original
const dbs = {};       // pool cache, one entry per database name
const dbConfigs = {}; // per-database connection settings, loaded elsewhere

module.exports.getDB = function (dbName) {
  if (!(dbName in dbs)) {
    console.log(`Initiating ${dbName}`);
    let config = dbConfigs[dbName];
    dbs[dbName] = mysql.createPool({
      host: config.host,
      port: config.port || 3306,
      user: config.user,
      password: config.password,
      connectionLimit: 10,
      database: config.database,
      debug: config.debug
    });
  }
  return dbs[dbName]; // I just initialize each database once
};
This is my select query:
let db = dbs.getDB('myDb');
const [rows] = await db.query(`my query`);
console.log(rows[0]); // this one starts to show my results inconsistently once I insert records
And this is my insert query:
module.exports = {
  addNote: async function (action, note, userID, expID) {
    let db = dbs.getDB('myDb');
    await db.query(
      `INSERT INTO experiment_notes (experiment_id, action, created_by, note)
       VALUES (?, ?, ?, ?)`,
      [expID, action, userID, note]
    );
  }
};
If I set the connectionLimit to 1, I cannot reproduce the problem... at least not yet
Any idea what I'm doing wrong?
Setting your connectionLimit to 1 has an interesting side effect: it serializes all access from your Node program to your database. Each operation, be it an INSERT or a SELECT, must run to completion before the next one starts, because it has to wait for the single connection in the pool to free up.
It's likely that your intermittently missing rows are due to concurrent access to your DBMS from different connections in your pool. If you do a SELECT from one connection while MySQL is handling the INSERT from another connection, the SELECT won't always find the row being inserted. This is a feature. It's part of ACID (atomicity, consistency, isolation, durability). ACID is vital to making DBMSs scale up.
In more complex applications than the one you showed us, the same thing can happen when you use DBMS transactions and forget to COMMIT them.
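To make the COMMIT explicit, an insert can be run inside a transaction on one connection taken from the pool. This is only a sketch with mysql2/promise; `addNoteTx` is an illustrative name, not part of the question's code:

```javascript
// Hedged sketch: an explicit transaction with mysql2/promise, making the
// COMMIT visible in the code. The pool is assumed to come from
// mysql.createPool as in the question above.
async function addNoteTx(pool, expID, action, userID, note) {
  const conn = await pool.getConnection(); // one connection for the whole transaction
  try {
    await conn.beginTransaction();
    await conn.query(
      'INSERT INTO experiment_notes (experiment_id, action, created_by, note) VALUES (?, ?, ?, ?)',
      [expID, action, userID, note]
    );
    await conn.commit(); // without this, other connections may never see the row
  } catch (err) {
    await conn.rollback(); // undo the partial work on failure
    throw err;
  } finally {
    conn.release(); // always return the connection to the pool
  }
}
```

With `pool.query()` and autocommit (the question's setup), each statement commits on its own, so a forgotten COMMIT only bites once you start explicit transactions.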
Edit: Multiple database connections, even connections from the same pool in the same program, work independently of each other. So, if you're performing a not-yet-committed transaction on one connection and a query on another connection, the query will (usually) reflect the database's state before the transaction started. The query cannot force the transaction to roll back unless it somehow causes a deadlock. But deadlocks generate error messages; you probably are not seeing any.
You can sometimes control what a query sees by preceding it, on the same connection, with SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;. On a busy DBMS, that can improve query performance a little and prevent some deadlocks, as long as you're willing to have your query see only part of a transaction. I use it for historical queries (what happened yesterday). It's documented in the MySQL manual. The default, the one that explains what you see, is SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
But, avoid that kind of isolation-level stuff until you need it. (That advice comes under the general heading of "too smart is dumb.")
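If you ever do need it, the key point is that SET TRANSACTION must go to the same connection as the query it should affect, which rules out `pool.query()`. A sketch, where `queryReadUncommitted` is an illustrative helper name:

```javascript
// Hedged sketch: run one SELECT at READ UNCOMMITTED. SET TRANSACTION only
// affects the next transaction on this connection, so both statements must
// go through the same pooled connection, not through pool.query().
async function queryReadUncommitted(pool, sql, params) {
  const conn = await pool.getConnection();
  try {
    await conn.query('SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED');
    const [rows] = await conn.query(sql, params); // this SELECT is the "next transaction"
    return rows;
  } finally {
    conn.release(); // return the connection; its isolation level resets per-transaction
  }
}
```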
Postgres sequence name: post_seq
SELECT query to get the next sequence value: SELECT nextval('post_seq')
Using Sequelize v5.x
Pool configuration:
{
  max: 10,
  min: 1,
  acquire: 30000,
  idle: 10000,
  validate: async pgClient => {
    const result = await pgClient.query('SELECT pg_is_in_recovery()');
    const isReadOnly = result.rows[0].pg_is_in_recovery;
    console.log(isReadOnly, 'isReadOnly:src/utils/db.js');
    return !isReadOnly;
  }
}
Expectation -
options.pool.validate is called for all the queries running in the application including the above SELECT query to get the next sequence id
What's happening -
options.pool.validate is called only for non-SELECT queries
I am assuming this is the default behavior of Sequelize. If that's the case, what would be another way to force SELECT queries to use only a writable connection? The reason for this expectation is that during an AWS RDS failover, the reader connection can't be used to run the above SELECT query, since nextval() isn't just a read. If there were a way to call options.pool.validate for this SELECT query, Sequelize would discard that connection before making the nextval() call, because of the pool configuration used. As of now, the error I am getting in the server logs is as follows:
SequelizeDatabaseError: cannot execute nextval() in a read-only transaction\n
A couple of other points to note:
I am connecting to cluster writer endpoint in the nodejs application
I am using 'SELECT pg_is_in_recovery()' query to check whether the connection being used is read-only. If it's read-only, the connection is discarded by sequelize.
I have tried using useMaster: true in the pool config, and it doesn't seem to help during the failover scenario. Probably this is useful mainly for replication rather than a DR setup.
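Until validation covers SELECTs, one workaround is to retry the nextval() call when it lands on a stale read-only connection, letting the pool hand out a different one. This is only a sketch, not Sequelize API: `nextvalWithRetry` is a hypothetical name, `runQuery` stands for any function that executes SQL (e.g. wrapping `sequelize.query`), and the error-message test is an assumption based on the log above:

```javascript
// Hedged sketch: retry nextval() when the "read-only transaction" error from
// the logs above is hit, on the assumption that a later attempt draws a
// different (writable) pooled connection.
async function nextvalWithRetry(runQuery, seqName, attempts = 3) {
  for (let i = 0; i < attempts; i += 1) {
    try {
      return await runQuery(`SELECT nextval('${seqName}')`);
    } catch (err) {
      const readOnly = /read-only transaction/.test(err.message);
      if (!readOnly || i === attempts - 1) throw err; // give up on other errors or last try
      // brief pause before retrying on (hopefully) a fresh connection
      await new Promise(resolve => setTimeout(resolve, 100));
    }
  }
}
```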
I'm using MongoDB and Mongoose in a REST API. Some deployments require a replica set, thus separate read/write databases, so as a result I have separate read/write connections in the API. However, more simple deployments don't need a replica-set, and in those cases I point my read/write connections to the same MongoDB instance and database.
My general approach is to create all models for both connections at API start up. Even when read/write conns are connecting to same database, I am able to create the same models on both connections without error.
let ReadUser = dbRead.model('User', userSchema);
let WriteUser = dbWrite.model('User', userSchema);
// no error even when dbRead and dbWrite point to same DB
Trouble comes when I start using Mongoose discriminators.
let ReadSpecialUser = ReadUser.discriminator('SpecialUser', specialUserSchema);
let WriteSpecialUser = WriteUser.discriminator('SpecialUser', specialUserSchema);
// Results in this Error when read and write point to same DB:
// Error: Discriminator with name "SpecialUser" already exists
I'm looking for an elegant way to deal with this. Is there a way to query the db for discriminators that are already in use?
According to the Mongoose API docs the way to do this is to use Model.discriminators. So in the case above it would be
ReadUser.discriminators
or
WriteUser.discriminators
However this doesn't return anything for me. What does work is using
Object.keys(Model.discriminators)
As expected this gets you an array of strings of the discriminator names you've set previously.
If you want to use the existing discriminator model and know its name what you can do is use Model.discriminators.discriminatorName. In your example it would be:
let ReadSpecialUserDocument = new ReadUser.discriminators.SpecialUser({
  someField: someValue,       // your document's fields
  anotherField: anotherValue,
});
ReadSpecialUserDocument.save();
This can be useful when you need to reuse the discriminator at different times, and its name is tied to your data in some way.
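Putting the two ideas together, a small guard avoids the "already exists" error when read and write connections share a database. This is only a sketch; `getOrCreateDiscriminator` is a hypothetical helper name, not Mongoose API:

```javascript
// Hedged sketch: reuse an existing discriminator model if one with this name
// was already registered on the base model, otherwise create it.
function getOrCreateDiscriminator(BaseModel, name, schema) {
  if (BaseModel.discriminators && BaseModel.discriminators[name]) {
    return BaseModel.discriminators[name]; // already registered on this connection
  }
  return BaseModel.discriminator(name, schema); // first registration
}

// Usage with the models from the question:
// let ReadSpecialUser = getOrCreateDiscriminator(ReadUser, 'SpecialUser', specialUserSchema);
// let WriteSpecialUser = getOrCreateDiscriminator(WriteUser, 'SpecialUser', specialUserSchema);
```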
So I am writing a Node app that reads from Redis, and I would like to run some sort of query that returns the number of databases. Does anyone know how to do that?
Right now I have a way to get all keys in a database, but I want to go one level higher: iterate over all the databases and then get all the keys in each. This is the code for getting all the keys in the current DB:
const client = redis.createClient({ host: "127.0.0.1", port: 6379 });
client.multi()
  .keys('*', function (err, replies) {
    console.log("MULTI got " + replies.length + " replies");
    replies.forEach(function (reply, index) {
      client.get(reply, function (err, data) {
        console.log(reply + " " + data);
      });
    });
  })
  .exec(function (err, replies) { });
Solution 1
As @carebdayrvis mentioned, you can use the INFO command (its Keyspace section) to get the database info, and parse that output to get the number of databases.
There are two problems with this solution:
It only returns the info of databases that are NOT empty. It doesn't show you the total number of databases.
If the format of the info text changes, you have to rewrite the parsing code.
Solution 2
Call CONFIG GET databases to get the total number of databases. This result includes both empty and non-empty databases. You can then use the SELECT db-index and DBSIZE commands to figure out which databases are NOT empty.
The advantage of this solution is that it's more programmable.
Other Stuff
By the way, KEYS should NOT be used in production environment, it might block Redis for a long time. You should consider using SCAN command instead.
This redis command's output includes that information. You should be able to call that from a node client.
Redis security guidelines recommend disabling the CONFIG command so that remote users cannot reconfigure an instance. Spring's RedisHttpSessionConfiguration requires access to this command during its initialization. Hosted Redis services, like AWS ElastiCache, disable this command by default, with no option to re-enable it.
Ref: https://github.com/spring-projects/spring-session/issues/124
A more reliable alternative is to use the SELECT command and loop until you get an error.
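That loop can be sketched generically; `countDatabasesBySelect` is an illustrative name, and `select` stands for any function (e.g. a promisified `client.select`) that resolves on success and rejects once the index is out of range:

```javascript
// Hedged sketch: probe SELECT 0, 1, 2, ... until the server rejects the index.
// The first invalid index equals the number of configured databases.
async function countDatabasesBySelect(select) {
  let i = 0;
  for (;;) {
    try {
      await select(i); // e.g. promisify(client.select).bind(client)
    } catch (err) {
      return i; // server answered "DB index is out of range" or similar
    }
    i += 1;
  }
}
```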
Aim: sync Elasticsearch with a Postgres database
Why: sometimes the network or the cluster/server breaks, so future updates should be recorded
This article https://qafoo.com/blog/086_how_to_synchronize_a_database_with_elastic_search.html suggests that I should create a separate updates table that tracks what has been synced to Elasticsearch, allowing me to select new data (from the database) since the last record (in Elasticsearch). So I thought: what if I recorded Elasticsearch's failed and successful connections? If the client pongs back successfully (the returned promise resolves), I could launch a function to sync records with my database.
Here's my elasticConnect.js
import elasticsearch from 'elasticsearch'
import syncProcess from './sync'

const client = new elasticsearch.Client({
  host: 'localhost:9200',
  log: 'trace'
});

client.ping({
  requestTimeout: Infinity,
  hello: "elasticsearch!"
})
  .then(() => syncProcess) // successful connection
  .catch(err => console.error(err))

export default client
This way, I don't even need to worry about running cron job (if question 1 is correct), since I know that cluster is running.
Questions
1. Will syncProcess run before export default client? I don't want any requests coming in while syncing...
2. syncProcess should run only once (since it's cached/not exported), no matter how many times I import elasticConnect.js. Correct?
3. Are there any advantages to using the updates-table method, instead of just selecting data from the parent/source table?
4. The article's comments say "don't use timestamp to compare new data!" Ehhh... why? It should be OK since the database is blocking, right?
For 1: As it is, you have no guarantee that syncProcess will have run by the time the client is exported. (Note also that .then(() => syncProcess) only references the function without calling it; you would want .then(() => syncProcess()).) Instead, you should do something like in this answer and export a promise.
For 2: With the solution I linked to in the above question, this would be taken care of.
For 3: An updates table would also catch record deletions, while simply selecting from the DB would not, since you don't know which records have disappeared.
For 4: The second comment after the article you linked to provides the answer (hint: timestamps are not strictly monotonic).
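The "export a promise" pattern from point 1 might look like this. A sketch only; `makeReadyClient` is an illustrative name, not elasticsearch-js API:

```javascript
// Hedged sketch: resolve to the client only after the ping succeeds and the
// sync has completed, so importers can await readiness before querying.
// Because ES modules are cached, the sync runs at most once per process.
async function makeReadyClient(client, syncProcess) {
  await client.ping({ requestTimeout: 30000 });
  await syncProcess(); // note: actually invoked, unlike .then(() => syncProcess)
  return client;
}

// In elasticConnect.js one would then write, e.g.:
//   export default makeReadyClient(client, syncProcess);
// and consumers:
//   const client = await readyClientPromise;
```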
I have a mongodb replica set from which I want to read data from primary and secondary db.
I have used this command to connect to the db:
mongoose.connect('mongodb://user:password@54.230.1.1,54.230.1.2,54.230.1.3/PanPanDB?replicaSet=rs0&readPreference=nearest');
It doesn't work; my application continues to read from the primary. Any suggestions, please?
If you want to read from a secondary, you should set your read preference to either of:
secondaryPreferred - In most situations, operations read from secondary members but if no secondary members are available, operations read from the primary.
secondary - All operations read from the secondary members of the replica set.
Reading from nearest as per your example will select the nearest member by ping time (which could be either the primary or a secondary).
Caveats
When using any read preference other than primary, you need to be aware of potential issues with eventual consistency that may affect your application logic. For example, if you are reading from a secondary there may be changes on the primary that have not replicated to that secondary yet.
If you are concerned about stronger consistency when reading from secondaries you should review the Write Concern for Replica Sets documentation.
Since secondaries have to write the same data as the primary, reading from secondaries may not improve performance unless your application is very read heavy or is fine with eventual consistency.
Following the documentation found on the MongoDB website and on the Mongoose website, you can add this instruction to configure the read preference in Mongoose:
var opts = { replSet: {readPreference: 'ReadPreference.NEAREST'} };
mongoose.connect('mongodb://###:#######:###/###', opts);
This has been tested using Mongoose version 3.8.9
As well as setting the connection URI (as you did) and the connection options (as Emas did), I also had to explicitly choose the server for each query, e.g.
var query = User.find({}).read("nearest");
query.exec(function(err, users) {
// ...
});
Mongoose uses the node package "mongodb"; the connection URI and opts are parsed by "mongodb". See the mongodb connect opts and the mongodb readPreference source code.
So, we can use mongoose like this:
var opts = { db: { readPreference: 'nearest' } };
mongoose.connect(uri, opts);
Also, you can just use a URI like this:
var uri = 'mongodb://###?readPreference=nearest';
mongoose.connect(uri);
This takes effect in Mongoose 4.3.4 and above.
This is the proper instantiation in Mongoose v5.9.5:
const opts = {
readPreference: 'nearest',
}
mongoose.connect(MONGODB_CONNECTION, opts)
These are the different string values depending on the preference type you're looking for:
ReadPreference.PRIMARY = 'primary';
ReadPreference.PRIMARY_PREFERRED = 'primaryPreferred';
ReadPreference.SECONDARY = 'secondary';
ReadPreference.SECONDARY_PREFERRED = 'secondaryPreferred';
ReadPreference.NEAREST = 'nearest'
You can do that simply by using the code below:
var collection = db.collection(collectionName,{readPreference:'secondaryPreferred'});
http://p1bugs.blogspot.in/2016/06/scaling-read-query-load-on-mongodb.html