Increase MongoDB maximum number of connections - Linux

I am getting errors on the MongoDB side; the error complains that it has reached the maximum allowed connections.
I am wondering if there is any way to increase the maximum number of allowed connections.

Check the MongoDB documentation:
http://www.mongodb.org/
Use this command-line argument:
--maxConns arg    max number of simultaneous connections
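For example, assuming mongod is started from the shell (in newer releases the config-file equivalent is net.maxIncomingConnections), something along these lines should work; the number itself is just an example:
mongod --dbpath /data/db --maxConns 20000
Note that on Linux the effective limit is also bounded by the open-file limit of the mongod process (ulimit -n), so you may need to raise that as well.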
You might want to check this: http://blog.boxedice.com/2011/06/08/mongodb-connection-overhead/

Related

What is recommended for Cassandra max connections per host? How is it calculated?

Is there a way to limit the number of connections per host in a Cassandra cluster, and based on what parameters is this calculated?
On some Cassandra nodes I can see the established connection count on port 9042 go up to 1400+; is this something I need to worry about?
Thanks
Yes, you can limit the number of connections per host in a Cassandra cluster.
If you are using the C++ driver, check this out.
I visualize any query as following the path below:
Client --> Session --> IO threads --> Connections --> Nodes
You can configure the number of IO threads (the threads that handle query requests) associated with the session. Within each IO thread, you can then configure the number of connections per host. If needed, the number of connections per host will grow based on certain parameters (the maximum count up to which it can grow is also configurable).
So, at most, there can be x connections per host, where
x = number_of_sessions * number_of_IO_threads * max_number_of_connections_per_host
All three variables on the right-hand side are configurable.
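The above is phrased in terms of the C++ driver; if you are on the DataStax Java driver (3.x) instead, the analogous knobs live on PoolingOptions. A minimal sketch, where the specific numbers are only placeholders:
import com.datastax.driver.core.{Cluster, HostDistance, PoolingOptions}

// Cap the pool at 1 core / 4 max connections per local host, and allow
// up to 1024 in-flight requests on each connection.
val poolingOptions = new PoolingOptions()
  .setCoreConnectionsPerHost(HostDistance.LOCAL, 1)
  .setMaxConnectionsPerHost(HostDistance.LOCAL, 4)
  .setMaxRequestsPerConnection(HostDistance.LOCAL, 1024)

val cluster = Cluster.builder()
  .addContactPoint("127.0.0.1")
  .withPoolingOptions(poolingOptions)
  .build()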
Also check out:
https://www.datastax.com/dev/blog/4-simple-rules-when-using-the-datastax-drivers-for-cassandra
https://stackoverflow.com/a/28219086/5701173

com.datastax.driver.core.exceptions.BusyPoolException

Whenever I insert more than 1000 rows into a table in Cassandra and then fetch the data by id, it throws the following exception:
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1:9042 (com.datastax.driver.core.exceptions.BusyPoolException: [localhost/127.0.0.1] Pool is busy (no available connection and the queue has reached its max size 256)))
at com.datastax.driver.core.RequestHandler.reportNoMoreHosts(RequestHandler.java:213)
at com.datastax.driver.core.RequestHandler.access$1000(RequestHandler.java:49)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.findNextHostAndQuery(RequestHandler.java:277)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution$1.onFailure(RequestHandler.java:340)
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1764)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:456)
at com.google.common.util.concurrent.Futures$ImmediateFuture.addListener(Futures.java:153)
at com.google.common.util.concurrent.Futures.addCallback(Futures.java:1776)
at com.google.common.util.concurrent.Futures.addCallback(Futures.java:1713)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.query(RequestHandler.java:299)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.findNextHostAndQuery(RequestHandler.java:274)
at com.datastax.driver.core.RequestHandler.startNewExecution(RequestHandler.java:117)
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:97)
at com.datastax.driver.core.SessionManager.executeAsync(SessionManager.java:132)
at com.outworkers.phantom.builder.query.CassandraOperations$class.scalaQueryStringToPromise(CassandraOperations.scala:67)
at com.outworkers.phantom.builder.query.InsertQuery.scalaQueryStringToPromise(InsertQuery.scala:31)
at com.outworkers.phantom.builder.query.CassandraOperations$class.scalaQueryStringExecuteToFuture(CassandraOperations.scala:31)
at com.outworkers.phantom.builder.query.InsertQuery.scalaQueryStringExecuteToFuture(InsertQuery.scala:31)
at com.outworkers.phantom.builder.query.ExecutableStatement$class.future(ExecutableQuery.scala:80)
at com.outworkers.phantom.builder.query.InsertQuery.future(InsertQuery.scala:31)
at nd.cluster.data.store.Points.upsert(Models.scala:114)
I have solved the above issue using PoolingOptions:
val poolingOptions = new PoolingOptions()
  .setConnectionsPerHost(HostDistance.LOCAL, 1, 200)
  .setMaxRequestsPerConnection(HostDistance.LOCAL, 256)
  .setNewConnectionThreshold(HostDistance.LOCAL, 100)
  .setCoreConnectionsPerHost(HostDistance.LOCAL, 200)

val builder1 = ContactPoint.local
  .noHeartbeat()
  .withClusterBuilder(
    _.withoutJMXReporting()
      .withoutMetrics()
      .withPoolingOptions(poolingOptions))
  .keySpace("nd")
Now it is working even with 1 lakh (100,000) records, but I am not sure about its efficiency.
Could anyone please help me?
This means that you are submitting too many requests, and not waiting for the futures to complete before submitting more.
The default maximum number of requests per connection is 1024. If this number is exceeded for all connections, the connection pool will enqueue some requests, up to 256. If the queue gets full, a BusyPoolException is thrown. Of course you can increase the maximum number of requests per connection and the maximum number of connections per host. But the real solution is of course to throttle your application. You could e.g. submit your requests in batches of 1,000 and then wait on the futures to complete before submitting more, or use a semaphore to regulate the total number of pending requests and make sure it doesn't exceed a certain number (in theory, this number must stay below num_hosts * max_connections_per_host * max_requests_per_connection, but in practice I don't suggest going above 1,000 as it probably won't bring you more throughput).
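Below is a minimal sketch of the semaphore approach against the Java driver's asynchronous API; the session/statement arguments and the MAX_IN_FLIGHT value are placeholders, not something taken from the question:
import java.util.concurrent.Semaphore
import com.datastax.driver.core.{ResultSet, Session, Statement}
import com.google.common.util.concurrent.{FutureCallback, Futures, MoreExecutors}

// Allow at most MAX_IN_FLIGHT uncompleted requests at any time.
val MAX_IN_FLIGHT = 1000
val permits = new Semaphore(MAX_IN_FLIGHT)

def throttledExecute(session: Session, stmt: Statement): Unit = {
  permits.acquire() // blocks once the limit is reached
  val future = session.executeAsync(stmt)
  Futures.addCallback(future, new FutureCallback[ResultSet] {
    override def onSuccess(rs: ResultSet): Unit = permits.release()
    override def onFailure(t: Throwable): Unit = permits.release()
  }, MoreExecutors.directExecutor())
}
The same idea carries over to phantom's Scala futures: track the number of in-flight inserts and only submit more once earlier ones have completed.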
You may find these links useful.
https://github.com/redisson/redisson/issues/438
https://groups.google.com/a/lists.datastax.com/forum/#!topic/java-driver-user/p3CwOL0kNrs
http://docs.datastax.com/en/developer/java-driver/3.1/manual/pooling

How to check how many pooled connections we have at a specific time in MemSQL

How can I tell that I have reached the maximum number of queries that my cluster can handle at a time?
If the connection pool size is too small for the workload, then we have to adjust the max_pooled_connections configuration variable, which controls the number of pooled connections between each pair of nodes.
However, how can I tell how many pooled connections we have at a specific time?
In the MemSQL aggregator status I can see the following entries: Aborted_connects is 11 - why do we abort those connections? Also, Max_used_connections is 41, while Connections is a number that increases constantly.
How can I tell that I have reached the maximum number of queries that my cluster can handle at a time?
There isn't a hard limit on the number of queries you can send other than max_connections (100k), but at some point the cluster will not execute them all at once and will schedule/queue them up. Is your question about the former or the latter?
If the connection pool size is too small for the workload, then we have to adjust the max_pooled_connections configuration variable, which controls the number of pooled connections between each pair of nodes.
However, how can I tell how many pooled connections we have at a specific time?
show leaves will show how many connections are currently open from the current node to each leaf. So the current connection pool size is min(current open connections, max_pooled_connections). Note that this is per (node, node) pair.
In the MemSQL aggregator status I can see the following entries: Aborted_connects is 11 - why do we abort those connections? Also, Max_used_connections is 41, while Connections is a number that increases constantly.
Aborted_connects includes e.g. failed login authentications.
Max_used_connections is the peak number of connections in use at once, while Connections is a cumulative total.
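For reference, something along these lines from a MemSQL prompt on an aggregator should let you eyeball these numbers (the LIKE patterns are only examples):
SHOW LEAVES;                                    -- connections currently open from this node to each leaf
SHOW VARIABLES LIKE 'max_pooled_connections';   -- configured pool ceiling per (node, node) pair
SHOW STATUS EXTENDED LIKE '%onnect%';           -- Aborted_connects, Max_used_connections, Connections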

MongoDB count performance issues with Node.js

I am having issues doing counts on a single table with up to 1 million records. I have a 32-core, 244 GB RAM box that I am running my test on, so hardware should not be an issue.
I have indexes set up for all of the queries that I am using to perform counts. I have set Node's max_old_space_size to 15 GB.
The process I am following is basically: loop through a huge array, create 1,000 promises (within each promise I perform 12 counts), wait for all the promises to resolve, and then continue with the next batch of one thousand.
As part of my test, I am doing inserts, updates, and reads as well. All of those show great performance, up to 20,000/sec each. However, when I get to the portion of my code doing the count() calls, I can see via mongostat that only 20-30 commands are being executed per second. I have not determined at this point if my Node code is only sending that many, or if Mongo is queuing them up.
Meanwhile, in my Node.js code, all 1,000 promises are started and waiting to evaluate. I know this is a lot of info, so please let me know what more granular details I should provide to get some more insight into why the count performance is so slow.
So basically, for a batch of 1,000 records, doing let's say 12 counts each, for a total of 12,000 counts, it takes close to 10 minutes on a table of 1 million records.
MongoDB Native Client v2.2.1
Node v4.2.1
What I'd like to add is that I have tried changing the maxPoolSize on the driver from 100 to 1000 with no change in performance. I've also tried changing the queries I perform from yield/generator/promise to callbacks wrapped in a promise, which has helped somewhat.
The strange thing is, when my program starts, even if I use just the default number of connections (which I see as 7 when running mongostat), I can get around 2,500 count() queries per second of throughput. However, after a few seconds this drops back down to about 300-400. This leads me to believe that Mongo can handle that many all the time, but my code is not able to send that many requests, even though I set maxPoolSize to 300 and start 10,000 simultaneous promises resolving in parallel. So what gives, any ideas from anyone?

RPC timeout in cqlsh - Cassandra

I have 5 nodes in my ring with SimpleStrategy and replication_factor=3. I inserted 1M rows using the stress tool. When I try to read the row count in cqlsh using
SELECT count(*) FROM Keyspace1.Standard1 limit 1000000;
It fails with error:
Request did not complete within rpc_timeout.
It works with limit 100000 but fails even for 500000.
All my nodes are up. Do I need to increase the rpc_timeout?
Please help.
You get this error because the request is timing out on the server side. One should know that this is a very expensive operation in Cassandra as others have pointed out.
Still, if you really want to do this you should update your /etc/cassandra/cassandra.yaml file and change the range_request_timeout_in_ms parameter. This will be valid for all your range queries.
Example to set a 40 second timeout:
range_request_timeout_in_ms: 40000
You will probably have to adjust the timeout on the client side as well. When using cqlsh as a client, this is accomplished by creating/updating the cqlsh configuration file under ~/.cassandra/cqlshrc and adding the client_timeout parameter to the [connection] section.
Example to set a 40 second timeout:
[connection]
client_timeout=40
It takes a long time to read 1M rows, so that is probably why it is timing out. You shouldn't use count like this; it is very expensive since it has to read all the data. Use Cassandra counters if you need to count lots of items.
You should also check your Cassandra logs to confirm there aren't any other issues - sometimes exceptions in Cassandra lead to timeouts on the client.
If you can live with an approximate row count, take a look at this answer to Row count of a column family in Cassandra.
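If you do go the counter route, a rough sketch of what that looks like in CQL (the table and column names here are made up for illustration):
CREATE TABLE keyspace1.row_counts (
    table_name text PRIMARY KEY,
    row_count  counter
);

-- bump the counter alongside every insert into the data table
UPDATE keyspace1.row_counts SET row_count = row_count + 1 WHERE table_name = 'standard1';

-- reading the total is then a cheap single-partition read
SELECT row_count FROM keyspace1.row_counts WHERE table_name = 'standard1';
Keep in mind the counter lives in its own table, so your application has to update it on every insert or delete of the data it is counting.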

Resources