We are running a Java microservice that uses a multi-node Cassandra cluster. While writing data, we see the following error randomly from different nodes:
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed
We have already verified that all nodes in the cluster are up and running and can reach each other.
Any pointers are highly appreciated.
Thanks.
The above error indicates that the driver was unable to connect to any of the hosts for the query. The following are possible reasons:
Cassandra node down - which you have verified is not the case for you.
Congestion due to high traffic, which causes nodes to appear down.
Intermittent network connectivity issues between the client and the nodes, which compel the driver to mark hosts down (see the sketch below for driver settings that can tolerate this).
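If transient congestion or flaky networking is the suspect, it can help to relax the driver's socket timeouts and use a more forgiving reconnection policy so hosts marked down are retried quickly. A minimal sketch against the 2.x/3.x DataStax Java driver (the contact points and timeout values are placeholders to tune for your environment):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SocketOptions;
import com.datastax.driver.core.policies.ConstantReconnectionPolicy;

public class CassandraConnector {
    public static Session connect() {
        Cluster cluster = Cluster.builder()
                .addContactPoints("10.0.0.1", "10.0.0.2", "10.0.0.3") // placeholder node addresses
                .withSocketOptions(new SocketOptions()
                        .setConnectTimeoutMillis(10000)  // default is 5000; raise for slow networks
                        .setReadTimeoutMillis(20000))    // default is 12000; raise under heavy load
                // retry a host marked down every 5 seconds instead of backing off exponentially
                .withReconnectionPolicy(new ConstantReconnectionPolicy(5000))
                .build();
        return cluster.connect();
    }
}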
I am getting the error below when trying to import data from Db2 to HDFS using Sqoop and Spark.
Caused by: com.ibm.db2.jcc.am.DisconnectNonTransientConnectionException: [jcc][t4][2043][11550][3.66.46] Exception java.net.ConnectException: Error opening socket to server ip-xx.xx.xx.ec2.interna on port 50,000 with message:
I am able to get the data when running Spark in local mode, but I get the above error in YARN mode.
You are using an old JDBC 3.0 driver (from Db2 V10.5 fixpack 0, GA).
Upgrade to the latest JDBC 4.0 driver (db2jcc4.jar), version 4.26.14 or higher.
In some cases this upgrade will resolve DisconnectNonTransientConnectionException issues.
Download it from here.
If the symptom repeats, do more problem determination.
Connection timeouts (sqlcode=-4499) are almost always configuration errors.
Specifically, verify that the cluster name/address and port number are correct and that there are no firewall issues, and take a JDBC trace to see what is happening under the covers (the trace takes some study to interpret).
The Db2 Knowledge Centre online details how to collect a jdbc trace at the client side here.
The Db2 Knowledge Centre also gives details of how to increase the TCPIP connection timeout if the route between client and server is complicated or has latency issues.
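As a hedged sketch of what a client-side JCC trace setup can look like, the trace properties can be appended to the JDBC URL (the hostname, database name, credentials, and trace path below are placeholders; traceLevel=-1 corresponds to TRACE_ALL):

import java.sql.Connection;
import java.sql.DriverManager;

public class Db2TraceCheck {
    public static void main(String[] args) throws Exception {
        // JCC properties are appended after the database name, separated by a colon
        String url = "jdbc:db2://db2host.example.com:50000/MYDB"
                + ":traceFile=/tmp/jcc.trace;traceLevel=-1;";
        try (Connection con = DriverManager.getConnection(url, "user", "password")) {
            System.out.println("Connected to: " + con.getMetaData().getDatabaseProductVersion());
        }
    }
}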
When I try to test a new connection, it returns an error:
The specified host(s) could not be reached.
All host(s) tried for query failed (tried: /host_ip:9042 (com.datastax.driver.core.TransportException: [/host_ip:9042] Cannot connect))
[/host_ip:9042] Cannot connect
In my Windows firewall I have already created a rule for DevCenter, which allows DevCenter to communicate with the remote Cassandra server. I have no access to the Cassandra server, but it is configured correctly, which means the problem is somewhere on my local computer.
This type of thing typically happens when the host crashes unexpectedly, resulting in corruption of the sstables or commitlog files.
This is why it is really important to use replication: when you get into this situation, you can run nodetool repair to repair the corrupted tables with data from the other nodes.
If you are not fortunate enough to have replication configured, then you are in for some data loss. Clear the suspect file from \data\commitlogs, cry a little, and restart the node.
We recently (this week) upgraded our Java driver from 2.0.4 to 2.1.4 (we are running Cassandra version 2.0.12). We used 2.0.4 in production for a long time (more than a year) and did not have this issue.
But after upgrading we started seeing intermittent NoHostAvailableExceptions (which fail all the requests on the client side):
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried)
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:108) ~[stormjar.jar:na]
at com.datastax.driver.core.SessionManager.execute(SessionManager.java:530) ~[stormjar.jar:na]
at com.datastax.driver.core.SessionManager.executeQuery(SessionManager.java:566) ~[stormjar.jar:na]
at com.datastax.driver.core.SessionManager.executeAsync(SessionManager.java:119) ~[stormjar.jar:na]
We are having this issue in production, so would love to get some advice on how to triage this. What we know for sure:
* When the errors start happening, the Cassandra cluster isn't under unusual load.
* We are able to reach the Cassandra hosts (cluster) from the client side - it is not a connection issue.
Cheers,
-Ankush
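One triage step that has helped in similar situations is logging both the per-host errors carried by the exception and the driver's own view of each node's state at the moment of failure. A minimal sketch against the 2.1 Java driver (the surrounding cluster/session wiring is assumed to exist in your application):

import java.net.InetSocketAddress;
import java.util.Map;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Host;
import com.datastax.driver.core.exceptions.NoHostAvailableException;

public class HostStateLogger {
    // Dump what the driver knows when a NoHostAvailableException is caught.
    public static void logHostState(Cluster cluster, NoHostAvailableException e) {
        // Per-host errors from the failed attempt; empty when "no host was tried"
        for (Map.Entry<InetSocketAddress, Throwable> err : e.getErrors().entrySet()) {
            System.err.println(err.getKey() + " failed: " + err.getValue());
        }
        // The driver's current up/down view of every node in the cluster
        for (Host host : cluster.getMetadata().getAllHosts()) {
            System.err.println(host.getAddress() + " up=" + host.isUp());
        }
    }
}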
We are trying to save an RDD with close to 4 billion rows to Cassandra. Some of the data gets persisted, but for some partitions we see these errors in the Spark logs.
We have already set the following two properties for the Cassandra connector. Is there some other optimization we need to do? Also, what are the recommended settings for the reader? We have left them at their defaults.
spark.cassandra.output.batch.size.rows=1
spark.cassandra.output.concurrent.writes=1
We are running spark-1.1.0 and spark-cassandra-connector-java_2.10 v 2.1.0
15/01/08 05:32:44 ERROR QueryExecutor: Failed to execute: com.datastax.driver.core.BoundStatement#3f480b4e
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.87.33.133:9042 (com.datastax.driver.core.exceptions.DriverException: Timed out waiting for server response))
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:108)
at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:179)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Thanks
Ankur
I've seen something similar in my four-node cluster. It seemed that if I specified EVERY Cassandra node name in the Spark settings, it worked; however, if I only specified the seeds (of the four, two were seeds), I got the exact same issue. I haven't followed up on it, since specifying all four gets the job done (but I intend to at some point). I'm using hostnames rather than IPs for the seed values, and hostnames in the Spark Cassandra settings. I did hear it could be due to some Akka DNS issues. Maybe try using IP addresses through and through, or specifying all hosts. The latter has been working flawlessly for me.
I realized I was running the application with spark.cassandra.output.concurrent.writes=2. I changed it to 1 and there were no more exceptions. The exceptions occurred because Spark was producing data at a much higher rate than our Cassandra cluster could write it, so changing the setting to 1 worked for us.
Thanks!!
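For reference, a minimal sketch of where these connector properties are set, assuming a plain Java SparkConf (the app name is a placeholder, and the contact point is taken from the logs above):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class CassandraWriterConf {
    public static JavaSparkContext createContext() {
        SparkConf conf = new SparkConf()
                .setAppName("cassandra-writer") // placeholder app name
                .set("spark.cassandra.connection.host", "10.87.33.133") // contact point from the logs
                .set("spark.cassandra.output.batch.size.rows", "1")
                // throttle concurrent writers so Spark doesn't outpace the cluster
                .set("spark.cassandra.output.concurrent.writes", "1");
        return new JavaSparkContext(conf);
    }
}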
If I create a new project like this:
cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
this code works.
But if I take all the jars from this project and move them into my own project, the code above doesn't work and it says:
13/07/01 16:27:16 ERROR core.Connection: [/127.0.0.1-1] No handler set for stream 1 (this is a bug, either of this driver or of Cassandra, you should report it)
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: [/127.0.0.1])
What version of Cassandra are you running? Have you enabled the native protocol in your cassandra.yaml?
In Cassandra 1.2.0-1.2.4 the native protocol was disabled by default, but in 1.2.5+ it's on by default.
See https://github.com/apache/cassandra/blob/cassandra-1.2.5/conf/cassandra.yaml#L335
That's the most common reason I've seen for not being able to connect with the driver.
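For reference, the relevant settings in cassandra.yaml look like this (values shown are the 1.2.5 defaults from the file linked above):

start_native_transport: true
native_transport_port: 9042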