I am trying to connect to a local Cassandra instance from a Java client that uses Hector. After connecting, I attempt to read rows. The code snippet is as follows:
Cluster myCluster = HFactory.getOrCreateCluster("test" , "localhost:9160");
KeyspaceDefinition keySpaceDef = myCluster.describeKeyspace("testkeyspace");
.....
However, the connection fails with this error:
Exception in thread "main" java.lang.NoSuchFieldError: DEFAULT_MEMTABLE_OPERATIONS_IN_MILLIONS
at me.prettyprint.cassandra.service.ThriftCfDef.<init>(ThriftCfDef.java:65)
at me.prettyprint.cassandra.service.ThriftCfDef.fromThriftList(ThriftCfDef.java:144)
at me.prettyprint.cassandra.service.ThriftKsDef.<init>(ThriftKsDef.java:34)
at me.prettyprint.cassandra.service.AbstractCluster$4.execute(AbstractCluster.java:192)
at me.prettyprint.cassandra.service.AbstractCluster$4.execute(AbstractCluster.java:187)
at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:232)
at me.prettyprint.cassandra.service.AbstractCluster.describeKeyspace(AbstractCluster.java:201)
I have Cassandra and Thrift as dependencies in my pom.xml. Any clues as to what could be wrong?
I'm trying to connect to Elasticsearch from my Spark program.
My Elasticsearch host uses HTTPS, and I found no connection property for that.
We are using the Spark Structured Streaming Java API, and the connection details are as follows:
SparkSession spark = SparkSession.builder()
.config(ConfigurationOptions.ES_NET_HTTP_AUTH_USER, "username")
.config(ConfigurationOptions.ES_NET_HTTP_AUTH_PASS, "password")
.config(ConfigurationOptions.ES_NODES, "my_host_url")
.config(ConfigurationOptions.ES_PORT, "9200")
.config(ConfigurationOptions.ES_NET_SSL_TRUST_STORE_LOCATION,"C:\\certs\\elastic\\truststore.jks")
.config(ConfigurationOptions.ES_NET_SSL_TRUST_STORE_PASS,"my_password")
.config(ConfigurationOptions.ES_NET_SSL_KEYSTORE_TYPE,"jks")
.master("local[2]")
.appName("spark_elastic").getOrCreate();
spark.conf().set("spark.sql.shuffle.partitions",2);
spark.conf().set("spark.default.parallelism",2);
And I'm getting the following error
19/07/01 12:26:00 INFO HttpMethodDirector: I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server 10.xx.xxx.xxx failed to respond
19/07/01 12:26:00 INFO HttpMethodDirector: Retrying request
19/07/01 12:26:00 ERROR NetworkClient: Node [10.xx.xxx.xxx:9200] failed (The server 10.xx.xxx.xxx failed to respond); no other nodes left - aborting...
19/07/01 12:26:00 ERROR StpMain: Error
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:344)
This is probably because it tries to connect over HTTP, but in my case I need an HTTPS connection and I'm not sure how to configure that.
The error occurred because Spark was not able to locate the truststore file. It seems the path needs a "file:\\" prefix to be accepted.
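As a rough illustration of that fix, here is a minimal sketch that builds the connector settings as plain key/value pairs and turns a bare filesystem path into a file: URL. The keys are the elasticsearch-hadoop setting names behind the ConfigurationOptions constants used in the question. The class and method names (EsSslSettings, toFileUrl, sslSettings) are hypothetical, and setting es.net.ssl to true is an assumption about what an HTTPS endpoint needs, not something the original answer states.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of elasticsearch-hadoop.
final class EsSslSettings {

    // Leave anything that already has a URL scheme (two or more characters
    // before the colon, so a Windows drive letter does not count) untouched;
    // otherwise turn the filesystem path into a file: URL.
    static String toFileUrl(String location) {
        if (location.matches("^[A-Za-z][A-Za-z0-9+.-]+:.*")) {
            return location;
        }
        String normalized = location.replace('\\', '/');
        return normalized.startsWith("/")
                ? "file://" + normalized
                : "file:///" + normalized;
    }

    // Collect the settings that would be passed to SparkSession.builder().config(...).
    static Map<String, String> sslSettings(String host, String trustStorePath, String trustStorePass) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("es.nodes", host);
        settings.put("es.port", "9200");
        settings.put("es.net.ssl", "true"); // assumption: the endpoint is HTTPS
        settings.put("es.net.ssl.truststore.location", toFileUrl(trustStorePath));
        settings.put("es.net.ssl.truststore.pass", trustStorePass);
        return settings;
    }

    public static void main(String[] args) {
        sslSettings("my_host_url", "C:\\certs\\elastic\\truststore.jks", "my_password")
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Each entry could then be applied with .config(key, value) on the builder shown in the question.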
I configured Hive with MySQL as my metastore. I can enter the Hive shell and create tables successfully.
Spark version: 2.4.0
Hive version: 3.1.1
When I try to run a Spark SQL program using spark-submit, I get the error below.
2019-03-02 15:43:41 WARN HiveMetaStore:622 - Retrying creating default database after error: Error creating transactional connection factory
javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
......
......
Exception in thread "main" org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;
......
......
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;
Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "HikariCP" plugin to create a ConnectionPool gave an error : The connection pool plugin of type "HikariCP" was not found in the CLASSPATH!
Please let me know if anyone can help me in this regard.
I don't know if you have already solved this problem, but here is my advice.
The default connection pool in hive-site.xml is HikariCP. Search hive-site.xml for datanucleus.connectionPoolingType; its value is HikariCP. Change it to dbcp, since you use MySQL as your metastore.
Finally, don't forget to add mysql-connector-java-5.x.x.jar to Spark's jars directory, for example:
/home/hadoop/spark-2.3.0-bin-hadoop2.7/jars
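To double-check the edit, a small sketch like the following can read the property back. It assumes the usual Hadoop-style layout of hive-site.xml (property elements with name and value children); the class name HiveSiteCheck and the inline SAMPLE document are made up for illustration, and in practice you would feed it the contents of your own hive-site.xml.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for inspecting a hive-site.xml style document.
final class HiveSiteCheck {

    // Illustrative fragment showing the property after the change to dbcp.
    static final String SAMPLE =
            "<configuration>"
          + "<property>"
          + "<name>datanucleus.connectionPoolingType</name>"
          + "<value>dbcp</value>"
          + "</property>"
          + "</configuration>";

    // Return the value of the named property, or null if it is not present.
    static String property(String xml, String wanted) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                NodeList names = property.getElementsByTagName("name");
                NodeList values = property.getElementsByTagName("value");
                if (names.getLength() == 0 || values.getLength() == 0) {
                    continue; // skip malformed entries
                }
                if (wanted.equals(names.item(0).getTextContent().trim())) {
                    return values.item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(property(SAMPLE, "datanucleus.connectionPoolingType"));
    }
}
```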
Thanks for your time. I am trying to access a remote Cassandra DB in order to complete my assertions. I see that the server is running:
Cassandra V 3.0.8.1293
Driver Type: Cassandra CQL
Datastax Java Driver for Apache Cassandra - Core [3.0.5]
So, I am trying with the following simple code to access the DB
import com.datastax.driver.core.*
Cluster cluster = null;
try {
cluster = Cluster.builder().addContactPoint("x.x.x.x").withCredentials("xxxxxxx", "xxxxxx").withPort(9042).build()
Session session = cluster.connect();
ResultSet rs = session.execute("select * from TABLE");
Row row = rs.one();
} finally {
if (cluster != null) cluster.close();
}
When I use cassandra-driver-core-2.0.1.jar, I get this error:
ERROR:com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /x.x.x.x(null))
I read the documentation and many posts here and on other blogs, and saw that there may be an incompatibility with the driver version, so I tried upgrading the driver to several versions (cassandra-driver-core-2.5, cassandra-driver-core-3, cassandra-driver-core-3.2), but with those I get the following:
ERROR:java.lang.ExceptionInInitializerError
I have also tried to connect using JDBC, with the configuration proposed in this thread, but to no avail:
SoapUI JDBC connection with Apache Cassandra
I am running out of ideas. Can anyone point me in some direction on how to achieve this, whether a tutorial or any other idea?
Thank you very much
I think you haven't enabled remote access to Cassandra.
Try enabling remote access using the configuration below.
File path: /etc/cassandra/default.conf/cassandra.yaml
rpc_address: 0.0.0.0
broadcast_rpc_address: <serverIPAddress>
After that, restart the Cassandra service.
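Once the service is back up, it may be worth confirming from the client machine that the native transport port actually accepts TCP connections before going back to the driver. The following is only a sketch using plain JDK sockets; the class name PortProbe is hypothetical, and port 9042 is taken from the question.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical reachability check; it only tests the TCP handshake,
// not authentication or the CQL protocol itself.
final class PortProbe {

    // True if a TCP connection to host:port succeeds within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host all count as unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        System.out.println(host + ":9042 reachable = " + canConnect(host, 9042, 2000));
    }
}
```

If this prints false for the server's address, the problem is more likely the rpc_address setting or a firewall than the driver version.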
First, I bought the new O'Reilly Spark book and tried its Cassandra setup instructions. I've also found other Stack Overflow posts and various posts and guides around the web. None of them work as-is. Below is as far as I could get.
This is a test with only a handful of records of dummy test data. I am running the most recent Cassandra 2.0.7 VirtualBox VM provided by planetcassandra.org, linked from the main Cassandra project page.
I downloaded Spark 1.2.1 source and got the latest Cassandra Connector code from github and built both against Scala 2.11. I have JDK 1.8.0_40 and Scala 2.11.6 setup on Mac OS 10.10.2.
I run the spark shell with the cassandra connector loaded:
bin/spark-shell --driver-class-path ../spark-cassandra-connector/spark-cassandra-connector/target/scala-2.11/spark-cassandra-connector-assembly-1.2.0-SNAPSHOT.jar
Then I do what should be a simple row count type test on a test table of four records:
import com.datastax.spark.connector._
sc.stop
val conf = new org.apache.spark.SparkConf(true).set("spark.cassandra.connection.host", "192.168.56.101")
val sc = new org.apache.spark.SparkContext(conf)
val table = sc.cassandraTable("mykeyspace", "playlists")
table.count
I get the following error. What is confusing is that it is getting errors trying to find Cassandra at 127.0.0.1, but it also recognizes the host name that I configured which is 192.168.56.101.
15/03/16 15:56:54 INFO Cluster: New Cassandra host /192.168.56.101:9042 added
15/03/16 15:56:54 INFO CassandraConnector: Connected to Cassandra cluster: Cluster on a Stick
15/03/16 15:56:54 ERROR ServerSideTokenRangeSplitter: Failure while fetching splits from Cassandra
java.io.IOException: Failed to open thrift connection to Cassandra at 127.0.0.1:9160
<snip>
java.io.IOException: Failed to fetch splits of TokenRange(0,0,Set(CassandraNode(/127.0.0.1,/127.0.0.1)),None) from all endpoints: CassandraNode(/127.0.0.1,/127.0.0.1)
BTW, I can also use a configuration file at conf/spark-defaults.conf to do the above without having to close and recreate the Spark context or pass in the --driver-class-path argument. I ultimately hit the same error though, and the above steps seem easier to communicate in this post.
Any ideas?
Check the rpc_address config in your cassandra.yaml file on your cassandra node. It's likely that the spark connector is using that value from the system.local/system.peers tables and it may be set to 127.0.0.1 in your cassandra.yaml.
The spark connector uses thrift to get token range splits from cassandra. Eventually I'm betting this will be replaced as C* 2.1.4 has a new table called system.size_estimates (CASSANDRA-7688). It looks like it's getting the host metadata to find the nearest host and then making the query using thrift on port 9160.
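A quick way to check the first point is to look at what cassandra.yaml actually sets. The sketch below is a deliberately naive line scan, not a real YAML parser: it assumes rpc_address appears as a simple top-level "key: value" line without a trailing comment. The class and method names are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Optional;

// Hypothetical helper for inspecting the text of a cassandra.yaml file.
final class RpcAddressCheck {

    // Find the value of the first uncommented "rpc_address:" line, if any.
    static Optional<String> rpcAddress(String yaml) {
        for (String line : yaml.split("\\R")) {
            String trimmed = line.trim();
            // Commented-out lines start with '#', and broadcast_rpc_address
            // does not start with "rpc_address:", so neither matches here.
            if (trimmed.startsWith("rpc_address:")) {
                return Optional.of(trimmed.substring("rpc_address:".length()).trim());
            }
        }
        return Optional.empty();
    }

    // True if the configured rpc_address is a loopback address such as 127.0.0.1.
    static boolean advertisesLoopback(String yaml) {
        Optional<String> address = rpcAddress(yaml);
        if (!address.isPresent()) {
            return false;
        }
        try {
            return InetAddress.getByName(address.get()).isLoopbackAddress();
        } catch (UnknownHostException e) {
            return false; // unresolvable name: cannot tell, so do not flag it
        }
    }

    public static void main(String[] args) {
        System.out.println(advertisesLoopback("rpc_address: 127.0.0.1\n"));
    }
}
```

If it reports a loopback address, remote clients that rely on the advertised address would be pointed back at themselves, which matches the 127.0.0.1:9160 failure in the log above.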
I'm trying to connect to Cassandra from Java code using JDBC connection. Here are the jars I'm using
This is the code I found on Stack Overflow to do this:
String serverIP = "localhost";
String keyspace = "mykeyspace";
Cluster cluster = Cluster.builder()
.addContactPoints(serverIP)
.build();
Session session = cluster.connect(keyspace);
String cqlStatement = "SELECT * FROM users";
for (Row row : session.execute(cqlStatement)) {
System.out.println(row.toString());
}
But unfortunately it throws the following exception:
log4j:WARN No appenders could be found for logger (com.datastax.driver.core.Cluster).
log4j:WARN Please initialize the log4j system properly.
Exception in thread "main" java.lang.NoSuchMethodError: org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder.<init>(IIIIIZ)V
at com.datastax.driver.core.Frame$Decoder.<init>(Frame.java:130)
at com.datastax.driver.core.Connection$PipelineFactory.getPipeline(Connection.java:795)
at org.jboss.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:212)
at org.jboss.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:188)
at com.datastax.driver.core.Connection.<init>(Connection.java:93)
at com.datastax.driver.core.Connection$Factory.open(Connection.java:432)
at com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:216)
at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:171)
at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:79)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1104)
at com.datastax.driver.core.Cluster.init(Cluster.java:121)
at com.datastax.driver.core.Cluster.connect(Cluster.java:198)
at com.datastax.driver.core.Cluster.connect(Cluster.java:226)
at com.mabsisa.resources.Demo.main(Demo.java:28)
I searched the internet for this exception but found little information. Please help me solve this issue, as I need to fix it as soon as possible.
I think the problem comes from the netty version you are using. You are using version 2.3.0 of netty, and in that version the class
org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder
does not have the constructor that the Cassandra driver needs. In the Maven repository, the Cassandra driver core declares a dependency on version 3.9.0.FINAL of netty:
http://mvnrepository.com/artifact/com.datastax.cassandra/cassandra-driver-core/2.0.2
So, try to update your version of netty.
Make sure you don't have two versions of netty in your final build.
I had the same problem, with two versions of netty (3.2.2 and 3.9.0) on the classpath; the latest DataStax driver needs 3.9.0.
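One way to see whether two copies are present at runtime is to ask the class loader for every location that provides the class named in the stack trace. This is a sketch using only standard JDK calls; the class name ClasspathDuplicates is hypothetical, and it inspects the system class loader, which may not be the relevant loader inside an application server.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic: lists every classpath entry that contains a given class.
final class ClasspathDuplicates {

    // Return all locations from which the named class could be loaded.
    static List<URL> locationsOf(String className) {
        String resource = className.replace('.', '/') + ".class";
        try {
            return Collections.list(ClassLoader.getSystemClassLoader().getResources(resource));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<URL> found =
                locationsOf("org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder");
        found.forEach(System.out::println);
        if (found.size() > 1) {
            System.out.println("More than one netty jar provides this class.");
        }
    }
}
```

Run it with the same classpath as the failing program. Each printed URL names a jar, so an old netty jar sitting next to 3.9.0 shows up directly.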