When using the native MongoDB driver for Node.js, should I open one connection per application, one per page "serve", or open and close a connection whenever I need one?
I've seen a few older answers, but I know the project is always developing, so I want to know what the status is today.
This isn't a situation that will change; opening a new connection to a server will be less performant than using an established connection.
Note: this is the general case for server applications, and not specific to MongoDB.
Typical overhead includes:
resolving server names to IPs
establishing network connection to server
per connection memory allocated on the server
For MongoDB in particular:
opening a new connection means a new socket connection and thread on the server
each connection (as of MongoDB 2.0) allocates 1 MB of RAM on the server (see also: Checking Memory Usage)
there is a per process limit on open files/connections (see also: Too Many Open Files)
For the MongoDB Node.js driver you can take advantage of connection pooling by setting the poolSize in the constructor. A blog post with an example of using this: Node.js: Connection Pools and MongoDB.
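For illustration, a minimal sketch (assuming the 3.x Node.js driver, where the option is called poolSize; newer 4.x+ drivers call it maxPoolSize, and the URI and database name here are placeholders):

const { MongoClient } = require('mongodb');

// Open the client once at startup with a pool of 10 connections.
MongoClient.connect('mongodb://localhost:27017', { poolSize: 10 }, (err, client) => {
  if (err) throw err;
  const db = client.db('mydb'); // placeholder database name
  // Keep `client`/`db` around and reuse them for every request; the driver
  // hands out pooled connections per operation instead of reconnecting.
});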
Related
I'm new to the database world (learning MongoDB for Node).
I'm a little bit confused about the client connection: I saw the MongoDB connection doc example create a connection and then close it, but I'm not sure why they close it.
The way I think of it, my end user will send a request to my server (Next.js), and the server will open a connection to MongoDB (at least the first time the server is up), fetch the data, and send it back to the user.
In such a case, should I close the connection (the next users could use the same one)?
If not, in which cases do I have to close it?
Another part of connections that confuses me is caching:
I saw some code examples that cache the connection, while I've read that Mongo has some sort of built-in caching in RAM. Is there a difference?
I've gone through plenty of articles and the official TypeORM documentation on setting up connection pooling with TypeORM and PostgreSQL, but couldn't find a solution.
All the articles I've seen so far explain adding the max/poolSize attribute to the ORM configuration or connection pooling options, but this does not set up a pool of idle connections in the database.
When I check the pg_stat_activity table after the application bootstraps, I cannot see any idle connections in the DB, but when a request is sent to the application I can see an active connection to the DB.
The max/poolSize attribute defined under extra in the ORM configuration merely acts as the maximum number of connections that can be opened from the application to the DB concurrently.
What I'm expecting is that during bootstrap the application opens a predefined number of connections to the database and keeps them idle. When a request comes into the application, one of the idle connections is picked up and the request is served.
Can anyone provide insights on how to achieve this configuration with TypeORM and PostgreSQL?
TypeORM uses node-postgres, which has a built-in pg-pool, and as far as I can tell it doesn't have that kind of option. It supports a max, and as your app needs more connections it will create them, so if you want to pre-warm it, or maybe load/stress test it and see those additional connections, you'll need to write some code that kicks off a bunch of async queries/inserts.
I think I understand what you're looking for, as I used to do enterprise Java, and connection pools in things like GlassFish and JBoss have more options that let you keep hot, unused connections in the pool. There are no such options in TypeORM/node-postgres, though.
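If you do want to pre-warm the pool yourself, here is a rough sketch of the idea (dataSource is assumed to be an already-initialized TypeORM DataSource; the pg_sleep query is only there to keep each connection busy long enough that pg-pool has to open new ones):

// Force pg-pool to open connections up to `size` by running that many
// queries concurrently; each one checks out its own connection.
async function prewarmPool(dataSource, size = 10) {
  const holders = [];
  for (let i = 0; i < size; i++) {
    holders.push(dataSource.query('SELECT pg_sleep(0.5)'));
  }
  await Promise.all(holders);
  // The connections then sit idle in the pool until pg-pool's
  // idleTimeoutMillis (configurable via the `extra` block) evicts them.
}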
I've been working with MongoDB for a while now and I've been liking it a lot. One thing I do not understand, however, is "connections". I've searched online and everything just has very vague and basic answers. I'm using MongoDB's cloud service called "Atlas", and it describes the connection count as:
The number of currently active connections to this server. A stack is allocated per connection; thus very many connections can result in significant RAM usage.
However I have a few questions.
What is a connection, exactly? As I understand it, a connection is made between the server and the database service. Essentially, when I use mongoose.connect(...);, a connection is made, so at most there should only be one connection. However, when I was testing my program I noticed my connection count was at 2, and at some moments it spiked all the way up to 7, dropped to 5, and fluctuated. Does a "connection" have anything to do with the client?

On the Atlas dashboard it says I have a max connection amount of 500. What does this value represent? Does this mean only 500 users can use my website at once? If that's the case, how can I increase that number, or how can I make sure the 500-connection limit is never exceeded? Or is a connection something that gets opened and that I have to manually close myself? I've been learning from tutorials and I've never seen/heard anything like that.
Thanks!
mongoose.connect doesn't limit itself to 1 connection to the Mongo Server.
By default, mongoose creates a pool of 5 connections to Mongo.
You can change this default if necessary.
mongoose.connect(mongoURI, { poolSize: 200 });
See https://mongoosejs.com/docs/connections.html
The additional connections you see in Atlas are there because some internal connections are also made to keep the cluster running. These may include:
Connections made from a client.
Internal connections between the primary and the secondaries.
Connections from the monitoring agent, since Atlas is a hosted service and everything is monitored.
Connections from the automation agent, since automation runs against the cluster as well.
Hence, whenever a new cluster is created in Atlas, you will always see some connections on the Metrics page even though no client is connected.
I have the following code in my app.js which runs on server start (npm start)
mongo.mongoConnect('connection_string', 'users')
  .then((x) => {
    console.log('Database connection successful');
    app.listen(5000, () => console.log('Server started on port 5000'));
  })
  .catch(err => {
    console.error(err.stack);
    process.exit(1);
  });
process.on('SIGINT', mongo.mongoDisconnect).on('SIGTERM', mongo.mongoDisconnect);
As you can see, I hook into SIGINT and SIGTERM to close my connections on process exit.
I've been reading a lot about how to deal with database connections in Mongo and know that I should connect just once and reuse that connection across my application.
Does that mean that even after a save() call, when saving data to Mongo following a POST request, I should not be closing my connection? And if I did close it, how would I open it again, since the connection happens on app start?
I'm asking because in PHP I was in the habit of always opening and closing my connection after querying the MySQL database.
Likewise, does it mean that the connection will close only on server shutdown? In other words, it will always be present, since I do not want to shut down my Node.js backend instance?
It is formally correct to open a connection, run a query, and then close the connection, but it is not a good practice, because opening a connection is an "expensive" operation and connections can be reused, which is much more efficient. The main restriction on an open connection is that it can only be used by 1 thread at a time. (More accurately, once a request is sent on a connection, no other requests can be sent on that connection until the response to that request is received.)
If your application is short lived or inherently single threaded, as may be the case when running as a "serverless" function, it may be acceptable to open and close a connection on each request.
While in theory it might be acceptable to open a single connection at the start of the program, keep a global reference to that connection, and reuse it, in practice there are common ways in which a connection becomes unusable that you would have to account for, and handling all the possibilities requires complex code. It gets even more complicated when, as is possible with MongoDB replica sets, you are actually connecting to more than one server and want to retry a command on a second server if the first one fails to respond.
That is why the standard and "best" practice is to use a "connection pool" to manage your database connections. A pool opens a set of network connections to the database, verifies and maintains their health, and dynamically assigns virtual database connections to actual network connections as needed. The pool is implemented in a library that will have received a lot of real world testing and is extremely likely to be better than anything you would write yourself. Connection pools have configuration options that would let you set any behavior you want, including opening a new connection for each request and closing it when done, but offer a wide range of performance enhancing capabilities, such as reusing connections and avoiding the overhead of creating them for each request.
This is why for MongoDB, the standard Node.js client already implements a connection pool. I do not know what mongo.mongoConnect in your code refers to; you said in the title that you are using mongoose but it uses connect, not mongoConnect to connect to the database. In general you should either be using the standard client or a JavaScript ORM library like mongoose. Either of them will take care of the connection management issues for you.
Refer to the documentation for the client/library you use for exactly the right way to use it. In general, you would initialize some kind of client object and store it globally before entering your main application handler. Then you would use this object to handle your database operations, and the object will transparently manage the underlying connections via the pool implementation. In this kind of setup, you would only close the connection when exiting the program, and usually the library takes care of that for you automatically, so you really never need to close the connection.
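For example, with the official MongoDB Node.js driver (4.x+) the setup typically looks something like this; the URI, database name, and Express route are assumptions made for the sketch:

const express = require('express');
const { MongoClient } = require('mongodb');

const app = express();
const client = new MongoClient('mongodb://localhost:27017'); // pool managed internally
let db;

app.get('/users', async (req, res) => {
  // Reuse the same client/db object; the pool hands out connections as needed.
  const users = await db.collection('users').find().toArray();
  res.json(users);
});

async function start() {
  await client.connect(); // open the pool once at startup
  db = client.db('mydb'); // placeholder database name
  app.listen(5000, () => console.log('Server started on port 5000'));
}

start().catch(err => {
  console.error(err.stack);
  process.exit(1);
});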
Thus, when using a MongoDB connection pool in NodeJS, you write your program basically the same way you would as if you just opened a connection at startup and then kept reusing it. The libraries take care of isolating you from all the problems that can arise from actually doing this. You do not need to, and in fact should not, close the connection after a database operation when using standard MongoDB NodeJS libraries.
Note that other connection pool implementations exist that do require you to close the connection. What you do with those pools is reserve (or "check out" or "open") a connection, use it, perhaps for multiple operations, and then release (or "check in" or "close") the connection when you are done. This is probably what you were doing in PHP. It is important to read and follow the documentation for the connection pool library you are using to make sure you are using it correctly.
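For illustration, a check-out/check-in style pool looks roughly like this with node-postgres (pg); the table name and pool size are assumptions:

const { Pool } = require('pg');
const pool = new Pool({ max: 10 }); // other settings come from PG* environment variables

async function getUserCount() {
  const client = await pool.connect(); // check out ("open") a connection
  try {
    const res = await client.query('SELECT COUNT(*) FROM users'); // placeholder table
    return res.rows[0].count;
  } finally {
    client.release(); // check in ("close") the connection back to the pool
  }
}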
This may not be the exact answer you are looking for, but it is not a good idea to open a new connection for every request and then close it. It adds overhead because it takes some time (even if only milliseconds) to create a new connection.
Instead, you should create a pool of connections and use it in your app.
It's a good idea to close your mongo connection when your process dies or is stopped, but you should not need to close your mongoose connection after every successful query.
If you are instantiating a new Mongo connection before each query, you shouldn't need to do that either; you should only need to do it once, when booting up your server.
You have two approaches:
1) reopen a connection on every call, using middleware, or
2) save your queries in Node and execute them all at once later.
We have a project on Node.js that is based on restify and we are using RethinkDB as a database. The problem is that RethinkDB should be accessed from different parts of code (from route handlers, middlewares), but not for all requests. I am wondering what is the best way to connect to RethinkDB in this case?
I see the following options:
have one long-lived connection that is stored somewhere (the approach we use now),
connect to RethinkDB on each HTTP request, with potentially some of the connections never being used,
connect in each part individually, with potentially several connections per HTTP request, but without useless connections.
I ask this question because I am not sure how well RethinkDB handles short/long connections and how expensive they are. For instance, MongoDB prefers long connections, but all the examples in the RethinkDB docs use one connection per HTTP request.
I recommend a connection pool or one connection per query, especially if you use features like changefeeds, which are recommended to run on their own connection.
When you use a single connection for everything, you also have to handle re-connection when the connection times out or breaks. I think it's easier to just use a connection per query, or share a connection per request/response.
Just ensure you close your connection after using it, otherwise you will leak connections and new connections cannot be created.
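For example, a minimal connection-per-query sketch with the official rethinkdb driver (the host, database, and table names are assumptions):

const r = require('rethinkdb');

async function getUser(id) {
  const conn = await r.connect({ host: 'localhost', port: 28015, db: 'app' });
  try {
    return await r.table('users').get(id).run(conn);
  } finally {
    await conn.close(); // always close, even on error, so connections don't leak
  }
}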
Some drivers go further and don't require you to think about connections at all, such as rethinkdbdash: https://github.com/neumino/rethinkdbdash
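With rethinkdbdash the driver maintains the pool for you, so queries don't take an explicit connection (a sketch; the options and table name are assumptions):

const r = require('rethinkdbdash')({ host: 'localhost', port: 28015, db: 'app' });

async function getUser(id) {
  return r.table('users').get(id).run(); // no connection argument; the pool handles it
}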
The Elixir RethinkDB driver also has an open issue to create a connection pool: https://github.com/hamiltop/rethinkdb-elixir/issues/32
RethinkDB itself has an issue related to connection pooling: https://github.com/rethinkdb/rethinkdb/issues/281
That's probably where the community is heading, too.