com.datastax.driver.core.exceptions.BusyPoolException

com.datastax.driver.core.exceptions.BusyPoolException - cassandra

Whenever I insert data in table in Cassandra, more than 1000 and fetching the data by id, it throws the following exception:
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1:9042 (com.datastax.driver.core.exceptions.BusyPoolException: [localhost/127.0.0.1] Pool is busy (no available connection and the queue has reached its max size 256)))
at com.datastax.driver.core.RequestHandler.reportNoMoreHosts(RequestHandler.java:213)
at com.datastax.driver.core.RequestHandler.access$1000(RequestHandler.java:49)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.findNextHostAndQuery(RequestHandler.java:277)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution$1.onFailure(RequestHandler.java:340)
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1764)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:456)
at com.google.common.util.concurrent.Futures$ImmediateFuture.addListener(Futures.java:153)
at com.google.common.util.concurrent.Futures.addCallback(Futures.java:1776)
at com.google.common.util.concurrent.Futures.addCallback(Futures.java:1713)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.query(RequestHandler.java:299)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.findNextHostAndQuery(RequestHandler.java:274)
at com.datastax.driver.core.RequestHandler.startNewExecution(RequestHandler.java:117)
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:97)
at com.datastax.driver.core.SessionManager.executeAsync(SessionManager.java:132)
at com.outworkers.phantom.builder.query.CassandraOperations$class.scalaQueryStringToPromise(CassandraOperations.scala:67)
at com.outworkers.phantom.builder.query.InsertQuery.scalaQueryStringToPromise(InsertQuery.scala:31)
at com.outworkers.phantom.builder.query.CassandraOperations$class.scalaQueryStringExecuteToFuture(CassandraOperations.scala:31)
at com.outworkers.phantom.builder.query.InsertQuery.scalaQueryStringExecuteToFuture(InsertQuery.scala:31)
at com.outworkers.phantom.builder.query.ExecutableStatement$class.future(ExecutableQuery.scala:80)
at com.outworkers.phantom.builder.query.InsertQuery.future(InsertQuery.scala:31)
at nd.cluster.data.store.Points.upsert(Models.scala:114)
I have solved above issue using PoolingOptions.
val poolingOptions = new PoolingOptions()
.setConnectionsPerHost(HostDistance.LOCAL, 1, 200)
.setMaxRequestsPerConnection(HostDistance.LOCAL, 256)
.setNewConnectionThreshold(HostDistance.LOCAL, 100).setCoreConnectionsPerHost(HostDistance.LOCAL, 200)
val builder1 = ContactPoint.local
.noHeartbeat()
.withClusterBuilder(_.withoutJMXReporting()
.withoutMetrics().withPoolingOptions(poolingOptions)).keySpace("nd")
Now it is working even with 1l data. But i am not sure about its efficiency.
Could anyone please help me ?

This means that you are submitting too many requests, and not waiting for the futures to complete before submitting more.
The default maximum number of requests per connection is 1024. If this number is exceeded for all connections, the connection pool will enqueue some requests, up to 256. If the queue gets full, a BusyPoolException is thrown. Of course you can increase the max number of requests per connection, and the number of max connections per host. But the real solution is of course to throttle your thread. You could e.g. submit your requests by batches of 1,000 and then wait on the futures to complete before submitting more, or use a semaphore to regulate the total number of pending requests and make sure they don't exceed a certain number (in theory, this number must stay below num_hosts * max_connections_per_host * max_requests_per_connection – in practice, I don't suggest going above 1,000 as it probably won't bring you more throughput).
You may find this links useful.
https://github.com/redisson/redisson/issues/438
https://groups.google.com/a/lists.datastax.com/forum/#!topic/java-driver-user/p3CwOL0kNrs http://docs.datastax.com/en/developer/java-driver/3.1/manual/pooling

Related

Issue with dsbulk unload

I am getting the below messages while unloading using dsbulk. I am not able to figure out what this means
[s0|347101951|0] Error sending cancel request. This is not critical (the request will eventually time out server-side). (HeartbeatException: null)
Not sending heartbeat because a previous one is still in progress. Check that advanced.heartbeat.interval is not lower than advanced.heartbeat.timeout.
Thanks

"Error sending cancel request" is typical of continuous paging queries. It seems the coordinator is in trouble for some reason, which is why you are also seeing heartbeat failures. Dsbulk may be putting too much load on the cluster.
You didn't mention which version of dsbulk exactly, but assuming 1.4+ I would recommend trying the following actions (individually or combined):
Disable continuous paging with dsbulk.executor.continuousPaging.enabled = false (this is likely to slow down dsbulk).
Use smaller page sizes, e.g. 1000 rows:
If not using continuous paging: datastax-java-driver.basic.request.page-size = 1000 .
If using continuous paging: datastax-java-driver.advanced.continuous-paging.page-size = 1000.
Throttle dsbulk to reduce the load on the cluster
Either "soft" throttle by limiting the number of concurrent requests, e.g. 128:
DSBulk < 1.6: dsbulk.executor.maxInFlight = 128.
DSBulk >= 1.6: dsbulk.engine.maxConcurrentQueries = 128.
Or "hard" throttle by limiting the number of requests per second, e.g. 500: dsbulk.executor.maxPerSecond = 500.

Elastic search could not write all entries: May be es was overloaded

I have an application where I read csv files and do some transformations and then push them to elastic search from spark itself. Like this
input.write.format("org.elasticsearch.spark.sql")
.mode(SaveMode.Append)
.option("es.resource", "{date}/" + type).save()
I have several nodes and in each node, I run 5-6 spark-submit commands that push to elasticsearch
I am frequently getting Errors
Could not write all entries [13/128] (Maybe ES was overloaded?). Error sample (first [5] error messages):
rejected execution of org.elasticsearch.transport.TransportService$7#32e6f8f8 on EsThreadPoolExecutor[bulk, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor#4448a084[Running, pool size = 4, active threads = 4, queued tasks = 200, completed tasks = 451515]]
My Elasticsearch cluster has following stats -
Nodes - 9 (1TB space,
Ram >= 15GB ) More than 8 cores per node
I have modified following parameters for elasticseach
spark.es.batch.size.bytes=5000000
spark.es.batch.size.entries=5000
spark.es.batch.write.refresh=false
Could anyone suggest, What can I fix to get rid of these errors?

This occurs because the bulk requests are incoming at a rate greater than elasticsearch cluster could process and the bulk request queue is full.
The default bulk queue size is 200.
You should handle ideally this on the client side :
1) by reducing the number the spark-submit commands running concurrently
2) Retry in case of rejections by tweaking the es.batch.write.retry.count and
es.batch.write.retry.wait
Example:
es.batch.write.retry.wait = "60s"
es.batch.write.retry.count = 6
On elasticsearch cluster side :
1) check if there are too many shards per index and try reducing it.
This blog has a good discussion on criteria for tuning the number of shards.
2) as a last resort increase the thread_pool.index.bulk.queue_size
Check this blog with an extensive discussion on bulk rejections.

The bulk queue in your ES cluster is hitting its capacity (200) . Try increasing it. See this page for how to change the bulk queue capacity.
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html
Also check this other SO answer where OP had a very similar issue and was fixed by increasing the bulk pool size.
Rejected Execution of org.elasticsearch.transport.TransportService Error

Cassandra : Pool is busy (no available connection and timed out after 10000 MILLISECONDS))

I've been trying to add rows to a cassandra table for a while but it keeps sending me the same error message after a few seconds : Pool is busy (no available connection and timed out after 10000 MILLISECONDS))
I still have a few rows inserted (a hundred) but it stops quickly
I tried to change a few poolingoptions but it didn't change much:
PoolingOptions poolingOptions = new PoolingOptions();//
poolingOptions.setMaxRequestsPerConnection(HostDistance.LOCAL, 32768);
poolingOptions.setPoolTimeoutMillis(10000);
poolingOptions.setMaxConnectionsPerHost(HostDistance.LOCAL, 8);
poolingOptions.setMaxQueueSize(20);
poolingOptions.setIdleTimeoutSeconds(20);
Any idea of what I can do ?
Thanks for your help !

You need to set core connections to max.
val poolingOptions = new PoolingOptions()
.setConnectionsPerHost(HostDistance.LOCAL, 1, 2)
.setMaxRequestsPerConnection(HostDistance.LOCAL, 32768)
.setCoreConnectionsPerHost(HostDistance.LOCAL, 2)
You can set configurations according to your requirement. Like, how much maximum request can come. Current configuration can manage around 50K requests.
For more information refer

how to check how many pooled connection we have at a specific time in MemSQL

How can i tell that i have reached the maximum number of queries that my cluster can handle at a time?
If the connection pool size is too small for the workload then we have to adjust the max_pooled_connections configuration variable, which controls the number of pooled connections between each pair of nodes.
However how can I tell how many pooled connections we have at a specific time ?
In memsql agregator status i can see following entries Aborted_connects is 11 - why do we abort those connection ? Also Max_used_connections is 41, while Connections is a number that increases constantly.

How can i tell that i have reached the maximum number of queries that
my cluster can handle at a time?
There isn't a hard limit on the number of queries you can send other than max_connections (100k), but at some point the cluster will not execute them all at once and will schedule/queue them up. Is your question about the former or the latter?
If the connection pool size is too small for the workload then we have
to adjust the max_pooled_connections configuration variable, which
controls the number of pooled connections between each pair of nodes.
However how can I tell how many pooled connections we have at a
specific time ?
show leaves will show how many connections are currently open from the current node to each leaf. So the current connection pool size is min(current open connections, max_pooled_connections). Note that this is per (node, node) pair.
In memsql agregator status i can see following entries
Aborted_connects is 11 - why do we abort those connection ? Also
Max_used_connections is 41, while Connections is a number that
increases constantly.
Aborted connects includes e.g. failed login authentications.
max_used_connections is max peak, connections is cumulative total.

Cassandra throwing NoHostAvailableException after 5 minutes of high IOPS run

I'm using datastax cassandra 2.1 driver and performing read/write operations at the rate of ~8000 IOPS. I've used pooling options to configure my session and am using separate session for read and write each of which connect to a different node in the cluster as contact point.
This works fine for say 5 mins but after that I get a lot of exceptions like :
Failed with: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.0.1.123:9042 (com.datastax.driver.core.TransportException: [/10.0.1.123:9042] Connection has been closed), /10.0.1.56:9042 (com.datastax.driver.core.exceptions.DriverException: Timeout while trying to acquire available connection (you may want to increase the driver number of per-host connections)))
Can anyone help me out here on what could be the problem?
The exception asks me to increase number of connections per host but how high a value can I set for this parameter ?
Also I'm not able to set CoreConnectionsPerHost beyond 2 as it throws me exception saying 2 is the max.
This is how I'm creating each read / write session.
PoolingOptions poolingOpts = new PoolingOptions();
poolingOpts.setCoreConnectionsPerHost(HostDistance.REMOTE, 2);
poolingOpts.setMaxConnectionsPerHost(HostDistance.REMOTE, 200);
poolingOpts.setMaxSimultaneousRequestsPerConnectionThreshold(HostDistance.REMOTE, 128);
poolingOpts.setMinSimultaneousRequestsPerConnectionThreshold(HostDistance.REMOTE, 2);
cluster = Cluster
.builder()
.withPoolingOptions( poolingOpts )
.addContactPoint(ip)
.withRetryPolicy( DowngradingConsistencyRetryPolicy.INSTANCE )
.withReconnectionPolicy( new ConstantReconnectionPolicy( 100L ) ).build();
Session s = cluster.connect(keySpace);

Your problem might not actually be in your code or the way you are connecting. If you say the problem is happening after a few minutes then it could simply be that your cluster is becoming overloaded trying to process the ingestion of data and cannot keep up. The typical sign of this is when you start seeing JVM garbage collection "GC" messages in the cassandra system.log file, too many small ones batched together of large ones on their own can mean that incoming clients are not responded to causing this kind of scenario. Verify that you do not have too many of these event showing up in your logs first before you start to look at your code. Here's a good example of a large GC event:
INFO [ScheduledTasks:1] 2014-05-15 23:19:49,678 GCInspector.java (line 116) GC for ConcurrentMarkSweep: 2896 ms for 2 collections, 310563800 used; max is 8375238656
When connecting to a cluster there are some recommendations, one of which is only have one Cluster object per real cluster. As per the article I've linked below (apologies if you already studied this):
Use one cluster instance per (physical) cluster (per application lifetime)
Use at most one session instance per keyspace, or use a single Session and explicitly specify the keyspace in your queries
If you execute a statement more than once, consider using a prepared statement
You can reduce the number of network roundtrips and also have atomic operations by using batches
http://www.datastax.com/documentation/developer/java-driver/2.1/java-driver/fourSimpleRules.html
As you are doing a high number of reads I'd most definitely recommend using setFetchSize also if its applicable to your code
http://www.datastax.com/documentation/developer/java-driver/2.1/common/drivers/reference/cqlStatements.html
http://www.datastax.com/documentation/developer/java-driver/2.1/java-driver/reference/queryBuilderOverview.html
For reference heres the connection options in case you find it useful
http://www.datastax.com/documentation/developer/java-driver/2.1/common/drivers/reference/connectionsOptions_c.html
Hope this helps.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

com.datastax.driver.core.exceptions.BusyPoolException - cassandra

Related

Issue with dsbulk unload

Elastic search could not write all entries: May be es was overloaded

Cassandra : Pool is busy (no available connection and timed out after 10000 MILLISECONDS))

how to check how many pooled connection we have at a specific time in MemSQL

Cassandra throwing NoHostAvailableException after 5 minutes of high IOPS run

Categories

Resources