Apache Derby Embedded Mode and Multi-Threaded Connection Management - multithreading

I am currently working on an application (whose logic and code I cannot put out here) that creates an embedded derby database and interacts with it using multiple threads to perform CRUD and SELECT operations.
Let's say the name of the embedded database is bar and the path to this database is c:\foo\bar
Multiple threads each open their own connection to c:\foo\bar and perform their respective operations against their own tables in the database.
The connection to the database is abstracted out by a decorator that also maintains the last time the connection was accessed.
If the time since the connection was last accessed exceeds a particular threshold, the database connection is shut down and reaped.
There is a reaper thread that runs at a pre-defined scheduled interval and performs the reaping logic. As a part of the reaping logic it uses the following shutdown URL:
jdbc:derby:c:\foo\bar;shutdown=true
Any thread that attempts to perform a query against this database after the reaper thread has run fails with Derby error 08003, which indicates that there is no current connection.
So is it the case that in Derby embedded mode, even though each thread has opened its own connection, the reaper thread shuts down the entire database when it runs, and any connections previously opened against this database across all threads are left in an invalid or closed state?
What are the best practices for using embedded derby within such applications?

The shutdown=true attribute on the Derby JDBC connection URL doesn't shut down the connection; it shuts down the database, which is why the connections held by all the other threads become invalid.
See: http://db.apache.org/derby/docs/10.13/ref/rrefattrib16471.html
If you just want to close a single connection, call Connection.close() on it.

Related

How many sessions will be created using a single pool?

I am using Knex version 0.21.15 from npm. My pooling parameter is pool: {min: 3, max: 300}.
Oracle is my database server.
Is this pool setting a pool count or a session count?
If it is a pool, how many sessions can be created using a single pool?
If I run one non-transaction query 10 times using a knex connection, how many sessions will be created?
And when will the created sessions be cleared from Oracle?
Is there any parameter available to remove idle sessions from Oracle?
Please suggest one if there is.
WARNING: a pool.max value of 300 is far too large. You really don't want the database administrator running your Oracle server to distrust you: that can make your work life much more difficult. And such a large max pool size can bring the Oracle server to its knees.
It's a paradox: often you can get better throughput from a database application by reducing the pool size. That's because many concurrent queries can clog the database system.
The pool object here governs how many connections may be in the pool at once. Each connection is a so-called serially reusable resource. That is, when some part of your nodejs program needs to run a query or series of queries, it grabs a connection from the pool. If no connection is already available in the pool, the pooling stuff in knex opens a new one.
If the number of open connections is already at the pool.max value, the pooling stuff makes that part of your nodejs program wait until some other part of the program finishes using a connection in the pool.
When your part of the nodejs program finishes its queries, it releases the connection back to the pool to be reused when some other part of the program needs it.
This is almost absurdly complex. Why bother? Because it's expensive to open connections and much cheaper to re-use them.
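The acquire, wait, and release cycle described above can be sketched as a toy pool. This is only an illustration of the idea, not knex's actual implementation; TinyPool, the create factory, and the connection objects are hypothetical names made up for the example.

```javascript
// Toy connection pool: shows the max limit and waiting, nothing more.
class TinyPool {
  constructor({ max, create }) {
    this.max = max;        // upper bound on open connections
    this.create = create;  // hypothetical factory that "opens" a connection
    this.idle = [];        // open connections not currently in use
    this.open = 0;         // number of connections opened so far
    this.waiters = [];     // callbacks waiting for a free connection
  }

  acquire(callback) {
    if (this.idle.length > 0) {
      callback(this.idle.pop());         // reuse an idle connection
    } else if (this.open < this.max) {
      this.open += 1;
      callback(this.create(this.open));  // below max: open a new one
    } else {
      this.waiters.push(callback);       // at max: wait for a release
    }
  }

  release(conn) {
    const next = this.waiters.shift();
    if (next) next(conn);                // hand it straight to a waiter
    else this.idle.push(conn);           // otherwise keep it for reuse
  }
}

const pool = new TinyPool({ max: 2, create: (id) => ({ id }) });
const log = [];
let first;
pool.acquire((c) => { first = c; log.push(`A got ${c.id}`); });
pool.acquire((c) => log.push(`B got ${c.id}`));
pool.acquire((c) => log.push(`C got ${c.id}`)); // pool is at max, so C waits
pool.release(first);                            // C now reuses connection 1
console.log(log);
```

With max set to 2, the third caller does not get a new connection; it waits and then reuses the one that was released.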
Now to your questions:
Is this pool setting a pool count or a session count?
It is a pair of limits (min / max) on the count of connections (sessions) open within the pool at one time.
If it is a pool, how many sessions can be created using a single pool?
Up to the pool.max value.
If I run one non-transaction query 10 times using a knex connection, how many sessions will be created?
It depends on concurrency. If your tenth query starts before the first one completes, you may use ten connections from the pool. But you will most likely use fewer than that.
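To make the dependence on concurrency concrete, here is a small sketch that counts the peak number of connections a set of queries would need, given hypothetical start and end times. The peakConnections function and the timings are invented for illustration and assume each running query holds exactly one connection.

```javascript
// Peak number of connections in use at once, assuming one connection per
// running query. Each query is { start, end } in arbitrary time units.
function peakConnections(queries) {
  const events = [];
  for (const { start, end } of queries) {
    events.push([start, +1]); // query takes a connection
    events.push([end, -1]);   // query gives it back
  }
  // Sort by time; at the same instant, process releases before acquisitions.
  events.sort((a, b) => a[0] - b[0] || a[1] - b[1]);

  let inUse = 0;
  let peak = 0;
  for (const [, delta] of events) {
    inUse += delta;
    peak = Math.max(peak, inUse);
  }
  return peak;
}

// Ten queries run one after another: each finishes before the next starts.
const sequential = Array.from({ length: 10 }, (_, i) => ({ start: i * 10, end: i * 10 + 5 }));
// Ten queries all started before the first one completes.
const overlapping = Array.from({ length: 10 }, (_, i) => ({ start: i, end: 100 }));

console.log(peakConnections(sequential));  // 1
console.log(peakConnections(overlapping)); // 10
```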
And when will the created sessions be cleared from Oracle?
As mentioned, the pool keeps sessions open for reuse instead of closing them after each query, and it may hold up to pool.max of them at once. That's why 300 is too many.
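Is there any parameter available to remove idle sessions from Oracle?
This operation is called "evicting" connections from the pool. The pool that knex uses (tarn.js) closes connections that have been idle longer than its idleTimeoutMillis setting (30 seconds by default), but only while more than pool.min connections are open. Oracle itself may also drop idle sessions after a timeout. Ask your DBA about that.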
In the meantime, use the knex defaults of pool: {min: 2, max: 10} unless and until you really understand pooling and the required concurrency of your application. max:300 would only be justified under very special circumstances.

How to increase concurrent threading allow count in node-oracledb while connecting to oracle DB?

I am using the node-oracledb module to connect to and operate on an Oracle database.
There are two approaches to connecting to Oracle:
connection pool
concurrent threads (connect to Oracle whenever needed)
I am using the second approach, where I create a standalone connection to Oracle on demand.
The problem I am facing comes after 4 concurrent connections have been made successfully: Oracle does not allow a 5th connection until all the created connections become free.
Is there any way to increase this thread count?
Here is how you can increase the thread pool size.
Start your Node.js app with:
UV_THREADPOOL_SIZE=64 node myapp.js
OR
add the line below at the top of myapp.js (the entry file of your Node.js app), before any database calls:
process.env.UV_THREADPOOL_SIZE=64
Note: 64 is the size of the thread pool.
More information about Thread Pool
Node worker threads executing database statements on a connection will
commonly wait until round-trips between node-oracledb and the database
are complete.
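As a minimal sketch of the in-script variant, the assignment should come before anything that uses the libuv thread pool, so it belongs at the very top of the entry file. The commented-out require line is only a placeholder for where the driver would be loaded, and 64 is the example value from the answer, not a recommendation. Setting the variable in the environment before Node starts is the safer of the two options.

```javascript
// Top of the entry file (e.g. myapp.js): set the size before any code
// touches the libuv thread pool. Environment values are strings.
process.env.UV_THREADPOOL_SIZE = '64';

// Load the database driver only after the size has been set.
// const oracledb = require('oracledb');

console.log(`Requested libuv thread pool size: ${process.env.UV_THREADPOOL_SIZE}`);
```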

PostgreSQL: use same connection or get another from pool?

I have a Node.js script and a PostgreSQL database, and I'll be using a library that maintains a pool of connections to the database.
Say I have a script that queries the database multiple times (not in a transaction) at different parts of the script. How do I tell whether I should acquire a single connection/client and reuse it throughout*, or acquire a new client from the pool for each query? (Both work, but which has better performance?)
*task in the pg-promise library, connect in the node-postgres library.
...
// Acquire connection from pool.
(Database query)
(Non-database-related code)
(Database query)
// Release connection to pool.
...
or
...
// Acquire connection from pool.
(Database query)
// Release connection to pool.
(Non-database-related code)
// Acquire connection from pool.
(Database query)
// Release connection to pool.
...
I am not sure how the pool you are using works, but normally pools reuse connections (they don't disconnect after use), so you do not need to be concerned with caching connections.
You can use the node-postgres module, which will make your task easier.
As for your question about when to use a pool, here is the brief answer:
PostgreSQL server can only handle 1 query at a time per connection.
That means if you have 1 global new pg.Client() connected to your
backend your entire app is bottlenecked based on how fast postgres
can respond to queries. It literally will line everything up, queuing
each query. Yeah, it's async and so that's alright...but wouldn't you
rather multiply your throughput by 10x? Use pg.connect set the
pg.defaults.poolSize to something sane (we do 25-100, not sure the
right number yet).
new pg.Client is for when you know what you're doing. When you need a
single long lived client for some reason or need to very carefully
control the life-cycle. A good example of this is when using
LISTEN/NOTIFY. The listening client needs to be around and connected
and not shared so it can properly handle NOTIFY messages. Other
example would be when opening up a 1-off client to kill some hung
stuff or in command line scripts.
Here is the link to that module, where you can see the documentation, including the part about pooling:
https://github.com/brianc/node-postgres
As for giving connections back, pg.connect provides the callback done, which you call when you have finished with a client so that it is released back to the pool.
Hopefully this will help. Thanks :)

MongoDB Performance when connecting to multiple databases via parent-child connections

When connecting to a Mongo server containing multiple dbs, what is the more performant approach using the node-mongodb-native driver?
Let's say I have 8 dbs (db1...db8) on the same Mongo server. My node app needs to connect to all 8, depending on the queries it receives. What is the better option for me here:
1) Create 8 separate connections (1 with each db)
OR
2) Create one parent connection to the server on test db and then call db.db 8 times to create 8 child connections under that parent. As I read in the doc(http://mongodb.github.io/node-mongodb-native/2.0/api/Db.html#db), all 8 child connections will be running on the same socket
Has anyone researched into this or has some background or thoughts that can help me determine the right course of action?
How granular is MongoDB concurrency? This depends on the version. Since MongoDB 3, many operations lock on the document. Earlier versions locked at a coarser level (an entire collection or database, depending on the version and storage engine). Some operations still lock the entire instance (aka server). This means that sometimes an operation (likely one involving multiple databases) can block an entire instance, affecting all databases within it. https://docs.mongodb.com/manual/faq/concurrency/#how-granular-are-locks-in-mongodb
Threading model: Node.js is asynchronous while MongoDB is not. MongoDB will use one thread per socket. If you perceive that operations are blocking each other, you should keep separate connection pools. http://mongodb.github.io/node-mongodb-native/2.2/reference/faq/

JDBC: Can I share a connection in a multithreading app, and enjoy nice transactions?

It seems like the classical way to handle transactions with JDBC is to set auto-commit to false. This creates a new transaction, and each call to commit marks the beginning of the next transaction.
In a multithreaded app, I understand that it is common practice to open a new connection for each thread.
I am writing an RMI-based multi-client server application, so my server is basically spawning one thread for each new connection, seamlessly.
To handle transactions correctly, should I create a new connection for each of those threads?
Isn't the cost of such an architecture prohibitive?
Yes, in general you need to create a new connection for each thread. You don't have control over how the operating system timeslices execution of threads (notwithstanding defining your own critical sections), so you could inadvertently have multiple threads trying to send data down that one pipe.
Note the same applies to any network communications. If you had two threads trying to share one socket with an HTTP connection, for instance.
Thread 1 makes a request
Thread 2 makes a request
Thread 1 reads bytes from the socket, unwittingly reading the response from thread 2's request
If you wrapped all your transactions in critical sections, and therefore lock out any other threads for an entire begin/commit cycle, then you might be able to share a database connection between threads. But I wouldn't do that even then, unless you really have innate knowledge of the JDBC protocol.
If most of your threads have infrequent need for database connections (or no need at all), you might be able to designate one thread to do your database work, and have other threads queue their requests to that one thread. That would reduce the overhead of so many connections. But you'll have to figure out how to manage connections per thread in your environment (or ask another specific question about that on StackOverflow).
update: To answer your question in the comment, most database brands don't support multiple concurrent transactions on a single connection (InterBase/Firebird is the only exception I know of).
It'd be nice to have a separate transaction object, and to be able to start and commit multiple transactions per connection. But vendors simply don't support it.
Likewise, standard vendor-independent APIs like JDBC and ODBC make the same assumption, that transaction state is merely a property of the connection object.
It's uncommon practice to open a new connection for each thread.
Usually you use a connection pool, such as the c3p0 library.
If you are in an application server, or using Hibernate for example, look at the documentation and you will find how to configure the connection pool.
The same connection object can be used to create multiple statement objects, and these statement objects can then be used by different threads concurrently. Most modern DBs accessed through JDBC can do that, so JDBC is able to make use of concurrent cursors. PostgreSQL is no exception here; see for example:
http://doc.postgresintl.com/jdbc/ch10.html
This allows connection pooling where a connection is only used for a short time, namely to create the statement object, and is then returned to the pool. This short-time pooling is only recommended when the JDBC connection also parallelizes statement operations; otherwise normal connection pooling might show better results. In any case, the thread can continue to work with the statement object and close it later, but it must not close the connection.
1. Thread 1 opens statement
2.                               Thread 2 opens statement
3. Thread 1 does something       Thread 2 does something
4. ...                           ...
5. Thread 1 closes statement     ...
6.                               Thread 2 closes statement
The above only works in auto-commit mode. If transactions are needed, there is still no need to tie a transaction to a thread: you can partition the pool along transactions and use the same approach as above. This is needed not because of some socket connection limitation, but because JDBC equates the session ID with the transaction ID.
If I remember well, there should be APIs and products around with a less simplistic design, where the session ID and the transaction ID are not equated. With these APIs you could write your server with a single database connection object, even when it does transactions. I will need to check and tell you later what these APIs and products are.

Resources