I have a severe problem with my database connection in my web application. Since I use a single database connection for the whole application from singleton Database class, if i try concurrent db operations (two users) the database rollsback the transactions.
This is my static method used:
All threads/servlets call static Database.doSomething(...) methods, which in turn call the the below method.
private static /* synchronized*/ Connection getConnection(final boolean autoCommit) throws SQLException {
if (con == null) {
con = new MyRegistrationBean().getConnection();
}
con.setAutoCommit(true); //TODO
return con;
}
What's the recommended way to manage this db connection/s I have, so that I don't incurr in the same problem.
Keeping a Connection open forever is a very bad idea. It doesn't have an endless lifetime, your application may crash whenever the DB times out the connection and closes it. Best practice is to acquire and close Connection, Statement and ResultSet in the shortest possible scope to avoid resource leaks and potential application crashes caused by the leaks and timeouts.
Since connecting the DB is an expensive task, you should consider using a connection pool to improve connecting performance. A decent applicationserver/servletcontainer usually already provides a connection pool feature in flavor of a JNDI DataSource. Consult its documentation for details how to create it. In case of for example Tomcat you can find it here.
Even when using a connection pool, you still have to write proper JDBC code: acquire and close all the resources in the shortest possible scope. The connection pool will on its turn worry about actually closing the connection or just releasing it back to pool for further reuse.
You may get some more insights out of this article how to do the JDBC basics the proper way. As a completely different alternative, learn EJB and JPA. It will abstract away all the JDBC boilerplate for you into oneliners.
Hope this helps.
See also:
Is it safe to use a static java.sql.Connection instance in a multithreaded system?
Am I Using JDBC Connection Pooling?
How should I connect to JDBC database / datasource in a servlet based application?
When is it necessary or convenient to use Spring or EJB3 or all of them together?
I've not much experience with PostgreSql, but all the web applications I've worked on have used a single connection per set of actions on a page, closing it and disposing it when finished.
This allows the server to pool connections and stops problems such as the one that you are experiencing.
Singleton should be the JNDI pool connection itself; Database class with getConnection(), query methods et al should NOT be singleton, but can be static if you prefer.
In this way the pool exists indefinitely, available to all users, while query blocks use dataSource.getConnection() to draw a connection from the pool; exec the query, and then close statement, result set, and connection (to return it to the pool).
Also, JNDI lookup is quite expensive, so it makes sense to use a singleton in this case.
Related
I am evaluating Jooq and I would like to confirm my understanding about the JDBC connection lifecycle.
I am using a connection pool (Hikari) and I configure a DSLContext using the datasource.
As I understand it:
I am safe to create one DSLContext that I can reuse across my application in many threads? Is there any negative to doing this provided I don't modify the config? Contention etc?
Each time I do a database access, does Jooq obtain a connection from the pool, do the DB access and then commit and release it? Eg:
Result<Record> result = context.select().from(AUTHOR).fetch();
Result<Record> otherResult = context.select().from(BOOKS).fetch();
Each of these queries would obtain and release a new connection from the pool?
When using the transaction API like this:
context.transaction(ctx -> {
DSLContext trans = DSL.using(ctx);
...
});
A connection would be obtained from the pool and reused for all accesses using the trans context?
Thanks!
I am safe to create one DSLContext that I can reuse across my application in many threads?
Yes, you can, provided that after initialisation, you don't modify your Configuration and its components anymore, e.g. Settings.
Is there any negative to doing this provided I don't modify the config? Contention etc?
Au contraire, that's the recommended way of using a DSLContext. By reusing it, you will benefit from caching, e.g. for reflection if you're using Result.into(MyDto.class) methods, etc.
Each time I do a database access, does Jooq obtain a connection from the pool, do the DB access and then commit and release it?
Yes, that's how it works, if your DataSource is backed by a pool. Of course, the pool is free to provide jOOQ with the same connection every time, depending on transactional contexts that are governed outside of jOOQ. jOOQ doesn't care.
When using the transaction API like this: [...] A connection would be obtained from the pool and reused for all accesses using the trans context?
The transaction() method obtains a connection and keeps it for the duration of the TransactionalRunnable that you pass to it (i.e. the lambda). Your nested DSLContext trans instance will now work with the "cached" Connection, and not obtain new connections from your DataSource.
I am dealing with an Azure function that connects to a DB (Java), i suppose something quite common.
The functions may have cold or warm starts, mine should be warm for most of the time (it is called every 5 minutes). The connection is stored in a pool (JDBCPooledConnectionSource) in a static variable, so theoreticaly the connection should be reused for every warm start, gaining in efficiency.
Is it a good strategy or could cause problems? For example, a fisical connection gets broken, but its reference is still in the heap: when the reference will be used to make a query, and an exception may occur.
To avoid calls to broken connection, I could use a non static variable to store the connection. This should be safer, but less efficient because the connection should be re-created at every call.
Which strategy is the best? I suppose there are many functions that do the same (connecting to DB) so for sure somebody more experienced than me in Azure knows the best strategy or common errors.
I write the answer because I found an error about how I was using the connectionSource(). I was executing the query without releasing the connection:
ConnectionSource cs = getConnectionSource();
DatabaseConnection dbc = cs.getReadOnlyConnection("my_table");
dbc.executeStatement("Select count(*) from my_table;", JdbcDatabaseConnection.DEFAULT_RESULT_FLAGS);
and the connection was never released, so it also was not removed/reused. Now I added the following, and it works as expected:
cs.releaseConnection(dbc);
I'm using Azure Functions with queue triggers for part of our workload. The specific function queries the database and this creates problems with scaling since the large concurrent number of function instances pinging the db results in maximum allowed number of Azrue DB connections being hit constantly.
This article https://learn.microsoft.com/en-us/azure/azure-functions/manage-connections lists HttpClient as one of those resources that should be made static.
Should database access also be made static with static SqlConnection to resolve this issue, or would that cause some other problems by keeping the constant connection object?
Should database access also be made static with static SqlConnection
Definitely not. Each function invocation should open a new SqlConnection, with the same connection string, in a using block. It's not really clear how many concurrent Function Invocations the runtime will make to a single instance of your application. But if it's more than 1, then a singleton SqlConnection is a bad thing.
I wonder exactly which limit you're hitting in SQL Database, the connection limit or the concurrent request limit? In either case I'm a bit surprised (not a Functions expert) that you get that many concurrent function invocations, so there might be something else going on. Like you're leaking SqlConnections.
But reading the Functions docs, my guess is that the functions runtime is scaling by launching multiple instances of your function app. Your .NET app could scale in a single process, but that's apparently not the way Functions works. Each instance of your Functions app has it's own ConnectionPool for SQL Server, and by default each ConnectionPool can have 100 connections.
Perhaps if you sharply limit the Max Pool Size in your connection string, won't have so many connections open. When you hit the Max Pool Size, new calls to SqlConnection.Open() will block for up to 30 seconds waiting for a pooled SqlConnection to become available. So this not only limits the connection use for each instance of your application, it throttles your throughput under load.
You can use the configuration settings in host.json to control the level of concurrency your functions execute at per instance and the max scaleout setting to control how many instances you scale out to. This will let you control the total amount of load put on your database.
For future readers, the documentation has been updated with some information about the SQL connection stating:
Your function code may use the .NET Framework Data Provider for SQL Server (SqlClient) to make connections to a SQL relational database. This is also the underlying provider for data frameworks that rely on ADO.NET, such as Entity Framework. Unlike HttpClient and DocumentClient connections, ADO.NET implements connection pooling by default. However, because you can still run out of connections, you should optimize connections to the database. For more information, see SQL Server Connection Pooling (ADO.NET).
So, as David Browne already mentioned, you shouldn't make your SqlConnection static.
I have a Node.js script and a PostgreSQL database, and I'll be using a library that maintains a pool of connections to the database.
Say I have a script that queries the database multiple times (not a transaction) at different parts of the script, how do I tell if I should acquire a single connection/client and reuse it throughout*, or acquire a new client from the pool for each query? (Both works but which has better performance?)
*task in the pg-promise library, connect in the node-postgres library.
...
// Acquire connection from pool.
(Database query)
(Non-database-related code)
(Database query)
// Release connection to pool.
...
or
...
// Acquire connection from pool.
(Database query)
// Release connection to pool.
(Non-database-related code)
// Acquire connection from pool.
(Database query)
// Release connection to pool.
...
I am not sure, how the pool you are using works, but normally they should reuse the connections (don't disconnect after use), so you do not need to be concerned with caching connections.
You can use node-postgres module that will make you task easier.
And about your question when to use pool here is the brief answer.
PostgreSQL server can only handle 1 query at a time per connection.
That means if you have 1 global new pg.Client() connected to your
backend your entire app is bottleknecked based on how fast postgres
can respond to queries. It literally will line everything up, queuing
each query. Yeah, it's async and so that's alright...but wouldn't you
rather multiply your throughput by 10x? Use pg.connect set the
pg.defaults.poolSize to something sane (we do 25-100, not sure the
right number yet).
new pg.Client is for when you know what you're doing. When you need a
single long lived client for some reason or need to very carefully
control the life-cycle. A good example of this is when using
LISTEN/NOTIFY. The listening client needs to be around and connected
and not shared so it can properly handle NOTIFY messages. Other
example would be when opening up a 1-off client to kill some hung
stuff or in command line scripts.
here is the link of that module.
Hopefully this will help.
https://github.com/brianc/node-postgres
You can see the documentation over there and about the pooling. Thanks :)
And about closing the pool it provides the callback done which can be called when you want to close that pool.
I'm using the Node native client 1.4 in my application and I found something in the document a little bit confusing:
A Connection Pool is a cache of database connections maintained by the driver so that connections can be re-used when new connections to the database are required. To reduce the number of connection pools created by your application, we recommend calling MongoClient.connect once and reusing the database variable returned by the callback:
Several questions come in mind when reading this:
Does it mean the db object also maintains the fail over feature provided by replica set? Which I thought should be the work of MongoClient (not sure about this but the C# driver document does say MongoClient maintains replica set stuff)
If I'm reusing the db object, when should I invoke the db.close() function? I saw the db.close() in every example. But shouldn't we keep it open if we want to reuse it?
EDIT:
As it's a topic about reusing, I'd also want to know how we can share the db in different functions/objects?
As the project grows bigger, I don't want to nest all the functions/objects in one big closure, but I also don't want to pass it to all the functions/objects.
What's a more elegant way to share it among the application?
The concept of "connection pooling" for database connections has been around for some time. It really is a common sense approach as when you consider it, establishing a connection to a database every time you wish to issue a query is very costly and you don't want to be doing that with the additional overhead involved.
So the general principle is there that you have an object handle ( db reference in this case ) that essentially goes and checks for which "pooled" connection it can use, and possibly if the current "pool" is fully utilized then and create another ( or a few others ) connection up to the pool limit in order to service the request.
The MongoClient class itself is just a constructor or "factory" type class whose purpose is to establish the connections and indeed the connection pool and return a handle to the database for later usage. So it is actually the connections created here that are managed for things such as replica set fail-over or possibly choosing another router instance from the available instances and generally handling the connections.
As such, the general practice in "long lived" applications is that "handle" is either globally available or able to be retrieved from an instance manager to give access to the available connections. This avoids the need to "establish" a new connection elsewhere in your code, which has already been stated as a costly operation.
You mention the "example" code which is often present through many such driver implementation manuals often or always calling db.close. But these are just examples and not intended as long running applications, and as such those examples tend to be "cycle complete" in that they show all of the "initialization", the "usage" of various methods, and finally the "cleanup" as the application exits.
Good application or ODM type implementations will typically have a way to setup connections, share the pool and then gracefully cleanup when the application finally exits. You might write your code just like "manual page" examples for small scripts, but for a larger long running application you are probably going to implement code to "clean up" your connections as your actual application exits.