I need some clarification about what the pool is and what it does. The docs say Sequelize will set up a connection pool on initialization, so you should ideally only ever create one instance per database.
var sequelize = new Sequelize('database', 'username', 'password', {
  host: 'localhost',
  // one of: 'mysql', 'mariadb', 'sqlite', 'postgres', 'mssql'
  dialect: 'mysql',
  pool: {
    max: 5,
    min: 0,
    idle: 10000
  },
  // SQLite only
  storage: 'path/to/database.sqlite'
});
When your application needs to retrieve data from the database, it creates a database connection. Creating this connection involves some overhead of time and machine resources for both your application and the database. Many database libraries and ORMs will try to reuse connections when possible, so that they do not incur the overhead of establishing that DB connection over and over again. The pool is the collection of these saved, reusable connections that, in your case, Sequelize pulls from. Your configuration of
pool: {
  max: 5,
  min: 0,
  idle: 10000
}
reflects that your pool should:
Never have more than five open connections (max: 5)
At a minimum, have zero open connections/maintain no minimum number of connections (min: 0)
Remove a connection from the pool after the connection has been idle (not been used) for 10 seconds (idle: 10000)
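To make those three settings concrete, here is a minimal sketch of a toy pool in plain JavaScript. It is not Sequelize's implementation: the ToyPool class, its method names, and the injectable clock are all hypothetical, and a real pool would normally queue a waiting caller instead of throwing when max is reached.

```javascript
// Toy in-memory pool: illustrates max / min / idle only.
// Hypothetical sketch, not Sequelize's real pool.
class ToyPool {
  constructor({ max, min, idle }, now = () => Date.now()) {
    this.max = max;
    this.min = min;
    this.idle = idle;
    this.now = now;         // injectable clock, so idle eviction is easy to demonstrate
    this.free = [];         // idle connections: { id, lastUsed }
    this.inUse = new Set(); // connections currently handed out
    this.nextId = 1;
  }

  get size() {
    return this.free.length + this.inUse.size;
  }

  // Reuse an idle connection if there is one, otherwise open a new one up to `max`.
  acquire() {
    this.evictIdle();
    let conn = this.free.pop();
    if (!conn) {
      if (this.size >= this.max) {
        throw new Error('pool exhausted: all connections are in use');
      }
      conn = { id: this.nextId++, lastUsed: this.now() };
    }
    this.inUse.add(conn);
    return conn;
  }

  // Hand the connection back so a later acquire() can reuse it.
  release(conn) {
    this.inUse.delete(conn);
    conn.lastUsed = this.now();
    this.free.push(conn);
  }

  // Drop connections idle for at least `idle` ms, but never go below `min`.
  evictIdle() {
    const cutoff = this.now() - this.idle;
    const kept = [];
    let total = this.size;
    for (const conn of this.free) {
      if (conn.lastUsed <= cutoff && total > this.min) {
        total -= 1;
      } else {
        kept.push(conn);
      }
    }
    this.free = kept;
  }
}

let clock = 0;
const pool = new ToyPool({ max: 5, min: 0, idle: 10000 }, () => clock);
const first = pool.acquire();
pool.release(first);
const second = pool.acquire(); // the same connection object is reused
pool.release(second);
clock = 10001;                 // more than 10 seconds later
pool.evictIdle();              // the idle connection is dropped
console.log(pool.size);
```

With min: 0 the pool is allowed to shrink to nothing when the application is quiet; a higher min would keep that many connections open regardless of idle time.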
tl;dr: Pools are a good thing that help with database and overall application performance, but if you are too aggressive with your pool configuration you may impact that overall performance negatively.
pool is draining error
I found this thread while searching for a Sequelize error my Node.js app was giving: pool is draining. I could not for the life of me figure it out. So for those who follow in my footsteps:
The issue was that I was closing the database earlier than I thought I was, with the command sequelize.closeConnections(). For some reason, instead of an error like 'the database has been closed', it was giving the obscure error 'pool is draining'.
It seems you can set pool to false to avoid having the pool created. Here is the API details table: http://sequelize.readthedocs.org/en/latest/api/sequelize/
[options.pool={}] (Object): Should sequelize use a connection pool. Default is true.
I am running a Node.js (Node v14.15.4) application with Sequelize (v6.6.2) as an ORM connecting to a PostgreSQL database. After several operations, I find that there are about 35 idle processes on my pgAdmin dashboard (see the image below for reference):
In my index file, I have set up Sequelize like this:
sequelize = new Sequelize(process.env[config.use_env_variable], {
  logging: false,
  pool: {
    max: 15,
    min: 0,
    acquire: 30000,
    idle: 10000,
    evict: 10000
  }
});
Is there something that I am missing here? I understand that evict instructs Sequelize to remove any idle processes after the specified amount of time.
The connections whose application name is "pgAdmin 4 - CONN-*" are used by pgAdmin's query tool. Check whether you have any open instances of the query tool.
To improve the pool design of an application, I would like to be notified (ideally with an event) when the application reaches its pool size. That way I can add a log entry, and if that entry occurs too often I will increase the pool size.
With a mongo client initialized this way:
const client = new MongoClient(url, {
  poolSize: 10,
});
Is there a way to be notified when the 10 connections are reached within my application?
Use connection pool events. These should be implemented by all recent MongoDB drivers.
Node documentation
Documentation/example in Ruby
For your question you would track the pool size using ConnectionCheckOut*/ConnectionCheckedIn events if your driver does not expose the pool size or the pool at all directly.
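As a rough sketch of that approach in Node: the event names connectionCheckedOut and connectionCheckedIn are the ones the Node driver uses for connection pool monitoring, but whether your client emits them depends on the driver version and options, so treat that as an assumption to verify. The trackPoolUsage helper is hypothetical, and a plain EventEmitter stands in for the MongoClient so the example runs without a database.

```javascript
const { EventEmitter } = require('events');

// Counts checked-out connections on any emitter that fires pool monitoring
// events, and calls `onSaturated` whenever the count reaches `maxPoolSize`.
// Hypothetical helper, not part of the MongoDB driver.
function trackPoolUsage(client, maxPoolSize, onSaturated) {
  let checkedOut = 0;
  client.on('connectionCheckedOut', () => {
    checkedOut += 1;
    if (checkedOut >= maxPoolSize) onSaturated(checkedOut);
  });
  client.on('connectionCheckedIn', () => {
    checkedOut = Math.max(0, checkedOut - 1);
  });
  return {
    get checkedOut() {
      return checkedOut;
    }
  };
}

// Stand-in for a MongoClient (assumption: the real client emits the same events).
const fakeClient = new EventEmitter();
const usage = trackPoolUsage(fakeClient, 2, (n) =>
  console.warn(`pool saturated: ${n} connections checked out`)
);

fakeClient.emit('connectionCheckedOut');
fakeClient.emit('connectionCheckedOut'); // reaches the limit, so the warning fires
fakeClient.emit('connectionCheckedIn');
console.log(usage.checkedOut);
```

With a real client you would pass the MongoClient instance in place of fakeClient and the configured pool size in place of 2.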
I'm using this code to run the tests outlined in this blog post.
(For posterity, relevant code pasted at the bottom).
What I've found is that if I run these experiments with a local instance of Mongo (in my case, using docker)
docker run -d -p 27017:27017 -v ~/data:/data/db mongo
Then I get pretty good performance, similar results as outlined in the blog post:
finished populating the database with 10000 users
default_query: 277.986ms
query_with_index: 262.886ms
query_with_select: 157.327ms
query_with_select_index: 136.965ms
lean_query: 58.678ms
lean_with_index: 65.777ms
lean_with_select: 23.039ms
lean_select_index: 21.902ms
[nodemon] clean exit - waiting
However, when I switch to using a cloud instance of Mongo, in my case an Atlas sandbox instance, with the following configuration:
CLUSTER TIER: M0 Sandbox (General)
REGION: GCP / Iowa (us-central1)
TYPE: Replica Set - 3 nodes
LINKED STITCH APP: None Linked
(Note that I'm based in Melbourne, Australia).
Then I get much worse performance.
adding 10000 users to the database
finished populating the database with 10000 users
default_query: 8279.730ms
query_with_index: 8791.286ms
query_with_select: 5234.338ms
query_with_select_index: 4933.209ms
lean_query: 13489.728ms
lean_with_index: 10854.134ms
lean_with_select: 4906.428ms
lean_select_index: 4710.345ms
I get that obviously there's going to be some round trip overhead between my computer and the mongo instance, but I would expect that to add 200ms max.
It seems that the round trip time must be getting added multiple times, or there is something else entirely that I'm not aware of. Can someone explain just what it is that would cause this to blow out?
A good answer might involve doing an explain plan, and explaining that in terms of network latency.
Tests against different Atlas instances - For those suggesting the issue is that I'm using a Sandbox instance of Atlas - here are the results for M20 and M30 instances:
BACKUPS: Active
CLUSTER TIER: M20 (General)
REGION: GCP / Iowa (us-central1)
TYPE: Replica Set - 3 nodes
LINKED STITCH APP: None Linked
BI CONNECTOR: Disabled
adding 10000 users to the database
finished populating the database with 10000 users
default_query: 9015.309ms
query_with_index: 8779.388ms
query_with_select: 4568.794ms
query_with_select_index: 4696.811ms
lean_query: 7694.718ms
lean_with_index: 7886.828ms
lean_with_select: 3654.518ms
lean_select_index: 5014.867ms
BACKUPS: Active
CLUSTER TIER: M30 (General)
REGION: GCP / Iowa (us-central1)
TYPE: Replica Set - 3 nodes
LINKED STITCH APP: None Linked
BI CONNECTOR: Disabled
adding 10000 users to the database
finished populating the database with 10000 users
default_query: 8268.799ms
query_with_index: 8933.502ms
query_with_select: 4740.234ms
query_with_select_index: 5457.168ms
lean_query: 9296.202ms
lean_with_index: 9111.568ms
lean_with_select: 4385.125ms
lean_select_index: 4812.982ms
These really don't show any significant difference (be aware that any difference may just be network noise).
Tests colocating the Mongo client and the mongo database instance
I created a docker container and ran it on Google's Cloud Run, in the same region (US Central1), the results are:
2019-12-30 11:46:06.814 AEDT finished populating the database with 10000 users
2019-12-30 11:46:07.885 AEDT default_query: 1071.233ms
2019-12-30 11:46:08.917 AEDT query_with_index: 1031.952ms
2019-12-30 11:46:09.375 AEDT query_with_select: 457.659ms
2019-12-30 11:46:09.657 AEDT query_with_select_index: 281.678ms
2019-12-30 11:46:10.281 AEDT lean_query: 623.417ms
2019-12-30 11:46:10.961 AEDT lean_with_index: 680.622ms
2019-12-30 11:46:11.056 AEDT lean_with_select: 94.722ms
2019-12-30 11:46:11.148 AEDT lean_select_index: 91.984ms
So while this doesn't give results as fast as running on my own machine - it does show that colocating the client and the database gives a very large performance improvement.
So the question again is - why is the improvement ~7000ms?
The test code:
(async () => {
  try {
    await mongoose.connect('mongodb://localhost:27017/perftest', {
      useNewUrlParser: true,
      useCreateIndex: true
    })
    await init()
    // const query = { age: { $gt: 22 } }
    const query = { favoriteFruit: 'potato' }
    console.time('default_query')
    await User.find(query)
    console.timeEnd('default_query')
    console.time('query_with_index')
    await UserWithIndex.find(query)
    console.timeEnd('query_with_index')
    console.time('query_with_select')
    await User.find(query)
      .select({ name: 1, _id: 1, age: 1, email: 1 })
    console.timeEnd('query_with_select')
    console.time('query_with_select_index')
    await UserWithIndex.find(query)
      .select({ name: 1, _id: 1, age: 1, email: 1 })
    console.timeEnd('query_with_select_index')
    console.time('lean_query')
    await User.find(query).lean()
    console.timeEnd('lean_query')
    console.time('lean_with_index')
    await UserWithIndex.find(query).lean()
    console.timeEnd('lean_with_index')
    console.time('lean_with_select')
    await User.find(query)
      .select({ name: 1, _id: 1, age: 1, email: 1 })
      .lean()
    console.timeEnd('lean_with_select')
    console.time('lean_select_index')
    await UserWithIndex.find(query)
      .select({ name: 1, _id: 1, age: 1, email: 1 })
      .lean()
    console.timeEnd('lean_select_index')
    process.exit(0)
  } catch (err) {
    console.error(err)
  }
})()
My best guess is that you're dealing with slow network throughput between your local machine and Atlas (something I've experienced myself this week - hence how I found this post!)
Looking at your local query performance:
default_query: 277.986ms
query_with_index: 262.886ms
The query with index isn't noticeably any faster than the one without. For an indexed query to take 262ms in a Node app with a local DB probably means that either:
The index isn't being used properly OR more likely...
You're returning quite a few results in the query. If the query returns say 3,000 results and each result is 1KB, that's 3MB of JSON data that your app needs to handle.
I've got a 150Mbit/s internet connection and yet my throughput to Atlas (M2 shared tier, if that makes a difference) fluctuates between around 1Mbit/s to 6Mbit/s.
On localhost I have a Mongo query that returns 2,400 results for a total of 1.7MB of JSON data. The roundtrip time for that query in my Node app (using console.time() like you did) connected to Mongo on the same local dev machine is ~150ms. But when connecting that local app to Atlas the query takes 2,400ms to 3,400ms to return. When I profiled the query on Atlas it only took 2ms to execute, so the query itself is really fast, it's apparently the data transfer that's slow.
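A quick back-of-the-envelope check of those numbers, as a sketch: the transferMs helper is hypothetical, 1MB is taken as 10^6 bytes, and protocol overhead and latency are ignored.

```javascript
// Estimated time in ms to move `bytes` over a link of `mbitPerSec` megabits per second.
// Ignores latency and protocol overhead; 1 Mbit is taken as 10^6 bits.
function transferMs(bytes, mbitPerSec) {
  const bits = bytes * 8;
  return (bits / (mbitPerSec * 1e6)) * 1000;
}

const payloadBytes = 1.7e6; // the 1.7MB result set, taking 1MB as 10^6 bytes
for (const mbit of [1, 4, 6, 150]) {
  console.log(`${mbit} Mbit/s: ~${Math.round(transferMs(payloadBytes, mbit))}ms`);
}
```

At 4 to 6 Mbit/s the transfer alone comes to roughly 2.3 to 3.4 seconds, which lines up with the 2,400ms to 3,400ms observed against Atlas, while at the full 150Mbit/s it would be under 100ms.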
Based on these results, I have a feeling that Atlas perhaps throttles throughput over the public internet (or just doesn't bother optimizing for it in their network) because 99% of apps are colocated in the same network region as their Atlas DB. That's the reason why they ask you to pick not just AWS, Azure, etc but your specific network region when creating a cluster.
UPDATE: I just ran a few Amazon EC2 speed tests for my network region (us-east-1) using a 3rd-party service and the average download speed was 4.5Mbit/s for smaller files (1KB to 128KB) and 41Mbit/s for larger files (256KB to 10MB). So the primary issue may be generally slow throughput on the EC2 instances that Atlas clusters run on rather than any throttling by Atlas, or perhaps a combination of both.
It usually takes some time for a request to propagate over the network. This depends on connection speed, latency, distance to the server, and many other factors. A server on your local computer doesn't face these issues the way a cloud environment does.
But since you are confident that the maximum delay due to network propagation is ~200ms, there are several other possible reasons to consider:
Sandbox plans are usually meant for testing and have limited resources allocated to them.
They don't use SSD drives to store data and use cheap storage solutions instead.
They assume that sandbox plans are usually just for exploring features.
Most of the time those instances run on shared virtual machines.
Make sure there are no other services running on your computer that consume a lot of bandwidth (e.g. torrent applications).
Cloud services depend on a variety of metrics such as system availability, response time, throughput, latency and many more.
If the user base and the data center are located in the same region, the average overall response time is about 50ms. If they are located in different regions, the response time increases significantly, to 200ms - 400ms, and can also depend on the type of instance you're using and the region you choose.
Since you're using the Atlas Sandbox cluster, you should first select the nearest region to avoid poor performance, as Atlas Sandbox clusters have their own limitations. If you're looking for quicker response times and faster performance, try upgrading your instance.
If you are sure that it's not about network issues like latency and bandwidth vs response size, then it's either a low-end host (non-SSD, low RAM), a misconfigured web server/proxy, or throttling/filtering of your traffic.
To narrow it down further, use an encrypted (https) connection (it's easy, just install letsencrypt on your server) and try using a VPN to change your network route.
Also you can try running the script directly on the server to measure actual executing performance.
Of course, you have to consider that the network delay applies to each request to the cloud instance, so if you have a ping time of 30ms or more, each query will take approximately 30ms longer. Moreover, if your instance is a sandbox (free account, https://docs.atlas.mongodb.com/tutorial/deploy-free-tier-cluster/), you will have limited, shared CPU/RAM.
This is why your mongo db queries are slow.
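To put the per-request delay in numbers, here is a small sketch. It assumes exactly one network round trip per query (a simplification), uses the ~200ms ceiling from the question and the eight timed queries in the test code, and the latencyOverheadMs helper is hypothetical.

```javascript
// Extra time spent on the network, assuming exactly one round trip per query.
// Hypothetical helper for illustration only.
function latencyOverheadMs(queryCount, pingMs) {
  return queryCount * pingMs;
}

const timedQueries = 8; // number of console.time() blocks in the test code
const pingMs = 200;     // the round trip ceiling the question assumes

console.log(`per query: ${latencyOverheadMs(1, pingMs)}ms`);
console.log(`all ${timedQueries} queries: ${latencyOverheadMs(timedQueries, pingMs)}ms`);
```

Under that assumption, latency adds 200ms per query and 1,600ms across the whole run, well short of the several seconds of slowdown each query showed, which suggests that a single round trip per query is not the whole story.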
Making a system faster in production is one of the design goals.
We need to take many variables into account:
Networking, for example, VPC/subnetting
MongoDB Storage (SSD)
MongoDB Indexes
MongoDB RAM, CPU
Node Web Servers or Cluster
Cloud Tenants
TLS encryption
You may need to rule out every possible bottleneck one by one.
We use Knex with generic-pool as our query builder and pool manager for our Oracle 11.2 database.
The problem we are facing is that sometimes Knex / generic-pool starts to accumulate connections and can't recycle them.
I tried passing some parameters to Knex / generic-pool to make them kill connections after some point, but it does not look like that worked.
Package versions:
Knex: v0.13.0
Oracledb: v1.13.1
Generic Pool: v2.5.4
Knex configuration:
{
  client: 'oracledb',
  connection: {
    user: DB_USER,
    password: DB_PASSWORD,
    host: `${DB_HOST}:${DB_PORT}`,
    database: DB_NAME
  },
  debug: true,
  fetchAsString: ['number', 'clob'],
  acquireConnectionTimeout: 843600000,
  pool: {
    min: 2,
    max: 150,
    acquireTimeoutMillis: 100000,
    evictionRunIntervalMillis: 120000,
    maxWaitingClients: 100,
    idleTimeoutMillis: 100000
  }
}
Openshift print with environment variable DEBUG="Knex:*" showing a lot of clients waiting for connection
Try knex 0.14.2; some pool-related problems were fixed in that release. Also try adding some debug information when transactions are created, committed, or rolled back. An open transaction takes a connection from the pool and does not release it until the transaction has ended. You can get information about the pool and transactions by running the app with the DEBUG=knex:* environment variable set.
A common connection string for mongoose connecting to a replica set looks something like the following:
var connection = mongoose.createConnection("mongodb://db_1:27017/client_test,mongodb://db_2:27017/client_test", {
  replSet : { rs_name : "rs0", poolSize : 5, socketOptions : { keepAlive : 1 } }
}, function(err) {
  if (err) { throw err; }
});
The problem with that is if one of the two hosts is down, then it will fail to connect. If you only specify one host, then no requests end up getting sent to secondaries.
Here's my proof for that claim. If you specify one host, and setup your replica set so that there is one primary and an arbiter and then perform a query such as
myApi.find({}).slaveOk().read("s").exec(function(err, docs) {
console.log(docs)
})
It will return results. But since I am specifying "s" (secondary), this query should throw an error, because there are no running secondaries. In addition, if you bring the secondary online and then do db.currentOp(true), you will never see any actual queries sent its way.
The moment you alter the connection string to specify every host, you will see connections go to the secondary. The dilemma is that now, because you had to specify the additional host in the connection string, the connection would fail if a secondary were offline, and we've lost failover (the entire point of replica sets).
I can't determine if this is a configuration mistake on my part, a bug in Mongoose, or a conceptual flaw in my understanding of the way replica sets function. Some of the docs seem to state that reading from secondaries is basically a bad idea, but the reason given is usually stale data. My issue doesn't have anything to do with stale data; I can't figure out a way to set up the system so that I can send queries to secondaries without losing failover capacity.
1. The connection string just defines seed servers. The MongoDB driver tries to connect to these servers and get information about the other servers in the replica set (by calling rs.status()). You could have a replica set with 5 nodes but specify only one in the connection string; the driver would still be able to find the four others, as long as the server from the connection string is available.
2. My proposal is to use secondaryPreferred instead of just secondary, so that if no secondary is available, the request goes to the primary.
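The difference between the two modes can be sketched with a toy selection function. This is only an illustration of the documented behaviour (secondary fails when no secondary is available, secondaryPreferred falls back to the primary), not the driver's actual server selection code; pickServer and the member objects are hypothetical.

```javascript
// Toy model of read-preference server selection; not the driver's real algorithm.
function pickServer(mode, servers) {
  const primary = servers.find((s) => s.type === 'primary' && s.up);
  const secondary = servers.find((s) => s.type === 'secondary' && s.up);
  switch (mode) {
    case 'primary':
      if (!primary) throw new Error('no primary available');
      return primary;
    case 'secondary':
      if (!secondary) throw new Error('no secondary available');
      return secondary;
    case 'secondaryPreferred':
      if (secondary) return secondary;
      if (primary) return primary;
      throw new Error('no data-bearing member available');
    default:
      throw new Error(`unsupported mode: ${mode}`);
  }
}

// The setup from the question: one primary and an arbiter, no secondary running.
const members = [
  { host: 'db_1:27017', type: 'primary', up: true },
  { host: 'db_2:27017', type: 'arbiter', up: true }
];

console.log(pickServer('secondaryPreferred', members).host); // falls back to the primary
try {
  pickServer('secondary', members);
} catch (err) {
  console.log(err.message); // strict secondary mode has nowhere to read from
}
```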
Ok, I believe I have solved all of my problems. Here is what I learned.
Specify all possible replica nodes in your connection string, otherwise Mongoose will never send requests there. Mongoose has a specific format for this which is different from the node-mongodb-native driver's. Example below.
To prevent it from hanging forever if one of the nodes is down on bootup, you need to specify connectTimeoutMS in the 'replset' options. It will then only wait that long for a response from each node on the initial connection. If the node comes online at a later date, it will still be available.
The host name entries in your mongodb replica setup need to match the hostname entries in the connection string from your application, and all hostnames need to be accessible from all parties (mongo to mongo and application to mongo). In my case I had aliased the hostnames from mongo to mongo as mongo1:27017, mongo2:27017, and mongo3:27017. My application server used a connection string with IPs. Mongoose was attempting to re-initiate the connection using the mongo1:27017 hostname (which my application server could not reach) rather than the IP address I specified in the connection string. This resulted in it never reconnecting to a node it lost contact with. It is possible that it would still have worked had I used hostnames the application could reach, but I think it's a best practice to make the connection string and the replica setup identical to remove possible places for error.
On the mongodb node where you run rs.initiate(), you might need to update the hostname to a value that all boxes (the other mongodbs and the application server) can reach. By default it will likely end up with a hostname like localhost, which means something different on each machine. This can be done from that box's mongo shell like so.
Example:
// from mongo shell
conf = rs.conf()
conf.members[0].host = "mongo1:27017"
rs.reconfig(conf)
Final functioning connection string which successfully fails over between nodes, including throwing errors if a query is destined for a secondary but there aren't secondaries.
var connection = mongoose.createConnection("mongodb://mongo1:27017/client_test,mongo2:27017/client_test,mongo3:27017/client_test", {
  replset : { rs_name : "rs0", poolSize : 5, socketOptions : { keepAlive : 1, connectTimeoutMS : 1000 } },
}, function(err) {
  if (err) { throw err; }
});
Working replica setup
{
  "_id" : "rs0",
  "version" : 4,
  "members" : [
    {
      "_id" : 0,
      "host" : "mongo1:27017"
    },
    {
      "_id" : 1,
      "host" : "mongo2:27017"
    },
    {
      "_id" : 2,
      "host" : "mongo3:27017",
      "arbiterOnly" : true
    }
  ]
}
I had an issue similar to yours while dealing with replica sets. In my case I had 1 primary node with a priority of 10, 1 secondary with a priority of 0 (for analytics), and an arbiter.
My writes would fail after reconnecting the primary instance, and I went through a lot trying to fix it. Here's the most important thing I learnt:
When my primary is down or unreachable, there has to be another member eligible to become primary. (At least 2 members in my set have to have a priority >= 1.)
If I have only arbiters, hidden members, or members with a priority of 0, queries will get stuck even after I reconnect my primary, and my client is unable to complete write queries. Read queries would still work, but writes wouldn't.
This is what I faced with mongoose, even with keepalive, autoreconnect and all the socket and connection timeout MS set.
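A small sketch of a check for that condition against a replica set config of the shape rs.conf() returns. The electableMembers helper and the host names are hypothetical; a missing priority is treated as 1, MongoDB's default, and hidden members are covered because they must have priority 0.

```javascript
// Members that could be elected primary: not arbiters, and priority above 0.
// Hypothetical helper; a missing priority is treated as the default of 1.
function electableMembers(conf) {
  return conf.members.filter((m) => {
    const priority = m.priority === undefined ? 1 : m.priority;
    return !m.arbiterOnly && priority > 0;
  });
}

// The setup described above: primary (priority 10), analytics secondary (priority 0), arbiter.
// Host names are made up for the example.
const conf = {
  _id: 'rs0',
  members: [
    { _id: 0, host: 'node-a:27017', priority: 10 },
    { _id: 1, host: 'node-b:27017', priority: 0 },
    { _id: 2, host: 'node-c:27017', arbiterOnly: true }
  ]
};

const electable = electableMembers(conf);
if (electable.length < 2) {
  console.warn(`only ${electable.length} electable member(s): no other member can take over as primary`);
}
```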
Hopefully this helps.