Cassandra NoClassDefFoundError: com/google/common/util/concurrent/AsyncFunction

cluster = Cluster.builder()
        .addContactPoint("localhost")
        .build();
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/util/concurrent/AsyncFunction
The only jars I have on my classpath are the two Cassandra Java driver jars: cassandra-driver-core-2.1.10.3.jar and cassandra-driver-mapping-2.1.10.3.jar.
Thanks

The issue is a missing guava.jar. Adding it to the classpath solved the issue.
The overall issue is the lack of suitable DataStax documentation.
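For example, if the project is built with Maven, declaring the driver as a dependency pulls in Guava and the driver's other runtime dependencies transitively; a minimal sketch, using the 2.1.10.3 version mentioned above:
<dependency>
  <groupId>com.datastax.cassandra</groupId>
  <artifactId>cassandra-driver-core</artifactId>
  <version>2.1.10.3</version>
</dependency>
If the jars are managed by hand instead, a Guava jar has to sit on the classpath next to the two driver jars (the 2.1.x driver declares Guava 14.0.1 in its POM, but verify against the exact driver version in use).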

Related

SparkSQL Hive Error: "HikariCP" not found in the CLASSPATH

I configured Hive with MySQL as my metastore. I can enter the Hive shell and create tables successfully.
Spark version: 2.4.0
Hive version: 3.1.1
When I try to run a SparkSQL program using spark-submit, I get the error below.
2019-03-02 15:43:41 WARN HiveMetaStore:622 - Retrying creating default database after error: Error creating transactional connection factory
javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
......
......
Exception in thread "main" org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;
......
......
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;
Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "HikariCP" plugin to create a ConnectionPool gave an error : The connection pool plugin of type "HikariCP" was not found in the CLASSPATH!
Please let me know if anyone can help me in this regard.
I don't know if you have already solved this problem, but here is my advice.
The default database connection pool in hive-site.xml is HikariCP. Search hive-site.xml for datanucleus.connectionPoolingType; its value is HikariCP. You need to change it to dbcp, since you use MySQL as your metastore.
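A sketch of the corresponding change in hive-site.xml (assuming the standard Hadoop-style property syntax):
<property>
  <name>datanucleus.connectionPoolingType</name>
  <value>dbcp</value>
</property>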
And finally, don't forget to add mysql-connector-java-5.x.x.jar to Spark's jars directory, for example:
/home/hadoop/spark-2.3.0-bin-hadoop2.7/jars

Apache Spark on k8s: securing RPC communication between driver and executors is not working

I have been trying a Spark 2.4 deployment on k8s and want to establish a secure RPC communication channel between the driver and executors. I was using the following configuration parameters as part of spark-submit:
spark.authenticate true
spark.authenticate.secret good
spark.network.crypto.enabled true
spark.network.crypto.keyFactoryAlgorithm PBKDF2WithHmacSHA1
spark.network.crypto.saslFallback false
The driver and executors were not able to communicate on a secured channel and were throwing the following errors.
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:281)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
... 4 more
Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Unknown challenge message.
at org.apache.spark.network.crypto.AuthRpcHandler.receive(AuthRpcHandler.java:109)
at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:181)
at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:103)
at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
Can someone guide me on this?
Disclaimer: I do not have a very deep understanding of Spark's implementation, so be careful when using the workaround described below.
AFAIK, Spark does not support authentication/encryption for k8s in version 2.4.0.
There is a ticket which is already fixed and will likely be released in the next Spark version: https://issues.apache.org/jira/browse/SPARK-26239
The problem is that Spark executors open a connection to the driver, and the configuration is only sent over that connection. However, an executor creates the connection using the default config plus any system properties starting with "spark.".
For reference, here is the place where the executor opens the connection: https://github.com/apache/spark/blob/5fa4384/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala#L201
In theory, setting spark.executor.extraJavaOptions=-Dspark.authenticate=true -Dspark.network.crypto.enabled=true ... should help, but the driver checks that no Spark parameters are set in extraJavaOptions.
There is, however, a slightly hacky workaround: you can set spark.executorEnv.JAVA_TOOL_OPTIONS=-Dspark.authenticate=true -Dspark.network.crypto.enabled=true .... Spark does not check this parameter, but the JVM reads this environment variable and adds the options to its system properties.
Also, instead of using JAVA_TOOL_OPTIONS to pass the secret, I would recommend using spark.executorEnv._SPARK_AUTH_SECRET=<secret>.
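Putting the workaround together, the extra spark-submit configuration would look roughly like this (a sketch based on the properties above; "good" is just the sample secret from the question):
spark.authenticate true
spark.authenticate.secret good
spark.network.crypto.enabled true
spark.executorEnv.JAVA_TOOL_OPTIONS -Dspark.authenticate=true -Dspark.network.crypto.enabled=true
spark.executorEnv._SPARK_AUTH_SECRET good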

Insert data into Microsoft SQL Server using Spark

I am trying to insert data into SQL Server from Spark using the JDBC methods below.
Option 1:
prop.put("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
dataf.write.mode(org.apache.spark.sql.SaveMode.Append).jdbc(url,table_name, prop)
The table is already created and I am appending new data. The job errored with the exception below:
Exception in thread "main" com.microsoft.sqlserver.jdbc.SQLServerException: CREATE TABLE permission denied in database
Question: why is CREATE TABLE permission required for appending data?
Option 2:
prop.put("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils.saveTable(dataf, url, table_name, prop)
The above command works from spark-shell. When the same is used in Scala code and packaged with dependencies, it gives the exception below:
Exception in thread "main" java.sql.SQLException: No suitable driver
at java.sql.DriverManager.getDriver(DriverManager.java:315)
I tried setting the driver classpath and executor classpath and also --jars, still no luck. I included sqljdbc4.jar in the driver classpath and in --jars, and copied sqljdbc4.jar to all worker nodes as well, still no luck.
Any ideas on this?
After a lot of searching and testing, I found the answer. It might be useful for someone.
Option 1: this is because of a bug in Spark 1.5.x; it was resolved in 1.6.x and later. Because of the bug, it always tries to create a new table.
Option 2: this happens because the driver name on the classpath is given priority over the properties we pass as an argument. The workaround is to create the connection first and then invoke saveTable.
Workaround if you are using Spark 1.5.x or lower:
JdbcUtils.createConnection(url, prop)
JdbcUtils.saveTable(dataf, url, table_name, prop)

How to connect to Apache Cassandra with JDBC?

I'm trying to connect to Cassandra from Java code using a JDBC connection. Here are the jars I'm using.
Now this is the code which I found on Stack Overflow to do this:
String serverIP = "localhost";
String keyspace = "mykeyspace";
Cluster cluster = Cluster.builder()
        .addContactPoints(serverIP)
        .build();
Session session = cluster.connect(keyspace);
String cqlStatement = "SELECT * FROM users";
for (Row row : session.execute(cqlStatement)) {
    System.out.println(row.toString());
}
But unfortunately it's throwing the following exception:
log4j:WARN No appenders could be found for logger (com.datastax.driver.core.Cluster).
log4j:WARN Please initialize the log4j system properly.
Exception in thread "main" java.lang.NoSuchMethodError: org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder.<init>(IIIIIZ)V
at com.datastax.driver.core.Frame$Decoder.<init>(Frame.java:130)
at com.datastax.driver.core.Connection$PipelineFactory.getPipeline(Connection.java:795)
at org.jboss.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:212)
at org.jboss.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:188)
at com.datastax.driver.core.Connection.<init>(Connection.java:93)
at com.datastax.driver.core.Connection$Factory.open(Connection.java:432)
at com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:216)
at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:171)
at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:79)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1104)
at com.datastax.driver.core.Cluster.init(Cluster.java:121)
at com.datastax.driver.core.Cluster.connect(Cluster.java:198)
at com.datastax.driver.core.Cluster.connect(Cluster.java:226)
at com.mabsisa.resources.Demo.main(Demo.java:28)
I searched the internet for this exception scenario but did not find much information. Please help me solve this issue, as I need to fix it as early as possible...
I think the problem comes from the netty version you are using. You are using version 2.3.0 of netty, and in that version the class
org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder
does not have the constructor which the Cassandra driver needs. In the Maven repository, the Cassandra driver core has a dependency on version 3.9.0.Final of netty:
http://mvnrepository.com/artifact/com.datastax.cassandra/cassandra-driver-core/2.0.2
So, try to update your version of netty.
Make sure you don't have two versions of netty lying in your final build.
I had the same problem where I had two versions of netty, 3.2.2 and 3.9.0; the latest DataStax driver needs 3.9.0.
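One quick way to confirm which netty jar actually wins on the runtime classpath is to print where the conflicting class was loaded from; a small sketch (not part of the original answers):
public class NettyVersionCheck {
    public static void main(String[] args) {
        // Print the jar that LengthFieldBasedFrameDecoder was loaded from;
        // if it points at an old netty jar, that jar is shadowing the 3.9.0 version the driver needs.
        Class<?> decoder = org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder.class;
        System.out.println(decoder.getProtectionDomain().getCodeSource().getLocation());
    }
}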

Hector Cassandra Connectivity Error

I am trying to connect to a local Cassandra instance through a Java client powered by Hector. I attempt to read rows after connecting. The code snippet is as follows:
Cluster myCluster = HFactory.getOrCreateCluster("test" , "localhost:9160");
KeyspaceDefinition keySpaceDef = myCluster.describeKeyspace("testkeyspace");
.....
However, the connection fails with this error:
Exception in thread "main" java.lang.NoSuchFieldError: DEFAULT_MEMTABLE_OPERATIONS_IN_MILLIONS
at me.prettyprint.cassandra.service.ThriftCfDef.<init>(ThriftCfDef.java:65)
at me.prettyprint.cassandra.service.ThriftCfDef.fromThriftList(ThriftCfDef.java:144)
at me.prettyprint.cassandra.service.ThriftKsDef.<init>(ThriftKsDef.java:34)
at me.prettyprint.cassandra.service.AbstractCluster$4.execute(AbstractCluster.java:192)
at me.prettyprint.cassandra.service.AbstractCluster$4.execute(AbstractCluster.java:187)
at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:232)
at me.prettyprint.cassandra.service.AbstractCluster.describeKeyspace(AbstractCluster.java:201)
I have cassandra and thrift as dependencies in my pom.xml. Any clues as to what could be wrong?
