Connection Pooling in VoltDB

If I use the JDBC approach, I am able to achieve connection pooling using a third-party library (Apache DBCP).
I am using the Client-based approach, but VoltDB does not expose a connection object. How can I implement connection pooling?
Is there any mechanism for the Client-based approach?

The Client-based approach is a lighter-weight yet more powerful API than JDBC.
The Client object should be connected to each of the servers in the cluster. Alternatively, you can set the "TopologyChangeAware" property to true on the ClientConfig object prior to creating the Client object; then you can connect the client to any server in the cluster and it will create connections to all the others automatically.
The application will then interact with the database using this Client object, which has connections, rather than using a JDBC Connection object. Since the Client object is thread-safe and can support multiple simultaneous invocations of callProcedure() on multiple threads, there is no need to create a pool of Clients.
For more details on the Client interface, see Using VoltDB, Chapter 6: Designing VoltDB Client Applications.
Disclaimer: I work for VoltDB.

Related

Connection Pooling in TypeORM with PostgreSQL

I've gone through plenty of articles and the official TypeORM documentation on setting up connection pooling with TypeORM and PostgreSQL, but couldn't find a solution.
All the articles I've seen so far explain adding the max/poolSize attribute to the ORM configuration for connection pooling, but this does not set up a pool of idle connections in the database.
When I check the pg_stat_activity table after the application bootstraps, I cannot see any idle connections in the DB, but when a request is sent to the application I can see an active connection.
The max/poolSize attribute defined under extra in the ORM configuration merely acts as the maximum number of connections that can be opened from the application to the DB concurrently.
What I'm expecting is that during bootstrap the application opens a predefined number of connections to the database and keeps them idle. When a request comes into the application, one of the idle connections is picked up and the request is served.
Can anyone provide insights on how to define this configuration with TypeORM and PostgreSQL?
TypeORM uses node-postgres, which has pg-pool built in, and it doesn't have that kind of option as far as I can tell. It supports a max, and as your app needs more connections it will create them; so if you want to pre-warm it, or maybe load/stress test it, and see those additional connections, you'll need to write some code that kicks off a bunch of async queries/inserts (see the sketch below).
I think I understand what you're looking for, as I used to do enterprise Java, and connection pools in things like GlassFish and JBoss have more options where you can keep hot, unused connections in the pool. There are no such options in TypeORM/node-postgres, though.
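A minimal TypeScript sketch of that kind of warm-up, assuming TypeORM 0.3+ (the DataSource API) and made-up connection settings, could look like this. pg-pool's idleTimeoutMillis is also worth a look, since by default idle clients are closed after roughly ten seconds, which is why they vanish from pg_stat_activity:

    import { DataSource } from "typeorm";

    // All connection settings here are placeholders -- adjust for your environment.
    const dataSource = new DataSource({
      type: "postgres",
      host: "localhost",
      port: 5432,
      username: "app",
      password: "secret",
      database: "appdb",
      extra: {
        max: 10,              // passed through to pg-pool: max concurrent connections
        idleTimeoutMillis: 0, // 0 = don't reap idle clients (pg-pool default is ~10s)
      },
    });

    async function warmUpPool(parallelism = 10): Promise<void> {
      await dataSource.initialize();
      // Fire several queries at once; the short pg_sleep keeps each one checked
      // out long enough that pg-pool really opens `parallelism` physical
      // connections, which then sit idle in the pool (and in pg_stat_activity).
      await Promise.all(
        Array.from({ length: parallelism }, () =>
          dataSource.query("SELECT pg_sleep(0.2)")
        )
      );
    }

    warmUpPool().catch(console.error);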

Working with WebSockets and Node.js clusters

I currently have a Node server running that works with MongoDB. It handles some HTTP requests, but it largely uses WebSockets. Basically, the server connects multiple users to rooms with WebSockets.
My server currently has around 12k WebSockets open, and that is almost crippling my single-threaded server, so now I'm not sure how to convert it over.
The server holds HashMap variables for the connected users and rooms. When a user does an action, the server often references those HashMap variables. So I'm not sure how to use clusters here. I thought maybe creating a thread for every WebSocket message, but I'm not sure if this is the right approach, and it would not be able to access the HashMaps for the other users.
Does anyone have any ideas on what to do?
Thank you.
You can look at the socket.io-redis adapter for architectural ideas, or you can just decide to use socket.io and its Redis adapter.
It moves the equivalent of your HashMaps into a separate process (the Redis in-memory database) so that all clustered processes can access it.
The socket.io-redis adapter also supports higher-level functions, so that you can emit to every socket in a room with one call; the adapter finds where everyone in the room is connected, contacts that specific cluster server, and has it send the message to them.
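For example, a rough TypeScript sketch of attaching the socket.io-redis adapter; the Redis host/port, event names, and port number are just placeholders:

    import { createServer } from "http";
    import { Server } from "socket.io";
    import redisAdapter from "socket.io-redis";

    const httpServer = createServer();
    const io = new Server(httpServer);

    // The adapter relays emits and room membership through Redis, so every
    // clustered socket.io process sees the same picture.
    io.adapter(redisAdapter({ host: "localhost", port: 6379 }));

    io.on("connection", (socket) => {
      socket.on("join", (room: string) => socket.join(room));
      socket.on("chat", (room: string, msg: string) => {
        // Reaches every socket in the room, regardless of which worker it is on.
        io.to(room).emit("chat", msg);
      });
    });

    httpServer.listen(3000);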
I thought maybe creating a thread for every WebSocket message, but I'm not sure if this is the right approach, and it would not be able to access the HashMaps for the other users
Threads in node.js are not lightweight things (each has its own V8 instance), so you will not want a node.js thread for every WebSocket connection. You could group a certain number of WebSocket connections on a worker thread, but at that point it is likely easier to use clustering, because node.js will handle the distribution across the cluster processes for you automatically, whereas you would have to do that yourself for your own worker pool.
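A very rough sketch of the clustering route follows; worker count, port, and the Redis address are placeholders, and note that if clients can fall back to HTTP long-polling you will also need sticky sessions:

    import cluster from "cluster";
    import { cpus } from "os";
    import { createServer } from "http";
    import { Server } from "socket.io";
    import redisAdapter from "socket.io-redis";

    if (cluster.isPrimary) { // cluster.isMaster on Node < 16
      // Fork one worker per CPU core; node distributes incoming
      // connections across the workers automatically.
      for (let i = 0; i < cpus().length; i++) {
        cluster.fork();
      }
      cluster.on("exit", (worker) => {
        console.log(`worker ${worker.process.pid} died, forking a replacement`);
        cluster.fork();
      });
    } else {
      // Each worker runs its own socket.io server wired to the Redis adapter
      // (see the previous sketch), so shared room/user state lives in Redis
      // rather than in per-process HashMaps.
      const httpServer = createServer();
      const io = new Server(httpServer);
      io.adapter(redisAdapter({ host: "localhost", port: 6379 }));
      httpServer.listen(3000);
    }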

Proper way of using db connections? (Python + pyodbc)

I am currently building a REST API using Python and Flask. Most endpoints require some type of interaction with a SQL database (SQL Server, using pyodbc). What is the best way of going about this?
Should I open a single connection (e.g. in a singleton class) and try and reuse it as much as possible?
Should I use different connections? If so, at what granularity? One connection per endpoint? One connection per query?
What role does connection pooling play in all of this and how do I make use of it?
Do I have to worry about concurrent requests?

Short or long lived connections for RethinkDB?

We have a project on Node.js that is based on restify and we are using RethinkDB as a database. The problem is that RethinkDB should be accessed from different parts of code (from route handlers, middlewares), but not for all requests. I am wondering what is the best way to connect to RethinkDB in this case?
I see the following options:
have one long-lived connection that is stored somewhere (the approach we use now),
connect to RethinkDB on each HTTP request, with potentially some of the connections never being used,
connect in each part individually, with potentially several connections per HTTP request, but without useless connections.
I ask this question because I am not sure how well RethinkDB handles short/long connections and how expensive they are. For instance, MongoDB prefers long connections, but all the examples in the RethinkDB docs use one connection per HTTP request.
I recommend a connection pool or one connection per query, especially if you use features like changefeeds, which are recommended to run on their own connection.
When you use a single connection for everything, you also have to handle re-connection when the connection times out or breaks. I think it's easier to just use a connection per query, or share a connection per request/response.
Just be sure to close your connection after using it; otherwise you will leak connections and new connections cannot be created.
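A minimal sketch of the connection-per-query pattern with the official JavaScript driver; the host, database, and table names are made up:

    import * as r from "rethinkdb";

    async function getUser(id: string) {
      // One connection per query: open, run, and always close in finally
      // so a failed query can't leak the connection.
      const conn = await r.connect({ host: "localhost", port: 28015, db: "app" });
      try {
        return await r.table("users").get(id).run(conn);
      } finally {
        await conn.close();
      }
    }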
Some drivers go further and don't require you to think about connections at all, such as rethinkdbdash: https://github.com/neumino/rethinkdbdash
The Elixir RethinkDB driver also has an open issue to create a connection pool: https://github.com/hamiltop/rethinkdb-elixir/issues/32
RethinkDB itself has an issue related to connection pooling: https://github.com/rethinkdb/rethinkdb/issues/281
That's probably where the community is heading, too.
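For comparison, a sketch of the rethinkdbdash style mentioned above, where the driver keeps its own connection pool and run() takes no connection; the server address and table name are placeholders:

    // rethinkdbdash maintains an internal connection pool, so queries don't
    // take an explicit connection and there is nothing to close per query.
    const r = require("rethinkdbdash")({
      servers: [{ host: "localhost", port: 28015 }],
      db: "app",
    });

    async function getUser(id: string) {
      // run() borrows a connection from the pool and returns it automatically.
      return r.table("users").get(id).run();
    }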

Hazelcast: difference between Java native client and embedded version

We use Hazelcast (2.3) in a web backend running in a Java servlet container to distribute data in a cluster. Hazelcast maps are persisted in a MySQL database using a MapStore interface. Right now we are using the Java native client interface, and I wonder what the difference is between a "native" client and the embedded version when it comes to performance.
1. Is it correct that a "native" client might connect to any of the cluster nodes and that this decision is made again for every single request?
2. Is it correct that the overhead of sending all requests and responses through a TCP socket in a native client is avoided when the embedded version is used?
3. Is it fair to conclude that the embedded version is in general faster than the "native" client?
4. In the case of a "native" client: is it correct that the MapStore implementation is part of the Hazelcast server (as a class at runtime)? Or is it part of the "native" client, so that all data that has to be persisted is sent through the TCP socket first?
1. You give the native client a set of nodes to connect to. Once it connects to one, it will use that node for communication with the cluster until the node dies. When it dies, the client will connect to another node to continue communication.
2. With the native client there are two hops: one from the client to a node, and one from that node to the target node (the target node is the node where the target data is located). With the embedded version there is a single hop, as the member already knows where the wanted data is located (the target node).
3. Yes, generally, but see the following (from the Hazelcast documentation):
"LiteMember is a member of the cluster, it has socket connection to every member in the cluster and it knows where the data is so it will get to the data much faster. But LiteMember has the clustering overhead and it must be on the same data center even on the same RAC. However Native client is not member and relies on one of the cluster members. Native Clients can be anywhere in the LAN or WAN. It scales much better and overhead is quite less. So if your clients are less than Hazelcast nodes then LiteMember can be an option; otherwise definitely try Native Client. As a rule of thumb: Try Native client first, if it doesn't perform well enough for you, then consider LiteMember."
4. Store operations are executed on the Hazelcast server. The object sent from the client is persisted to the centralized datastore by the target node, which also stores the object in its memory.
