All TCP connections between the DataStax driver and Cassandra remain in the active-close state, i.e. TIME_WAIT - cassandra

The setup:
Web server
- Apache Tomcat
- RESTful web services
- DataStax Java driver 2.0
Database
- 2-node Cassandra 2.0.7.31 cluster
- replicas=1
Problem
After sending a set of 1500 requests more than three times, I got the following error in the Tomcat log:
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.181.13.239 ([/10.181.13.239] Unexpected exception triggered))
at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:64)
at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:214)
at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:169)
at com.jpmc.es.rtm.storage.impl.EventExtract.main(EventExtract.java:36)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.181.13.239 ([/10.181.13.239] Unexpected exception triggered))
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:98)
at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:165)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
Observation
Once Tomcat reaches this state, every further request meets the same fate: the driver is no longer able to send my insert requests to Cassandra.
After running netstat, I found that all the TCP connections between the web server and Cassandra are in the TIME_WAIT state.
What could be the reason? Why is the DataStax driver not able to return the connections to the pool? Or why is Cassandra holding on to all the connections from its client?
Thanks in advance.

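The number of connections kept growing because a new session was being created for each request. Now it is working fine with the following setup:
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.HostDistance;
import com.datastax.driver.core.PoolingOptions;
import com.datastax.driver.core.policies.ConstantReconnectionPolicy;
import com.datastax.driver.core.policies.DowngradingConsistencyRetryPolicy;
// Set the core connections per local host equal to the default maximum.
builder = new Cluster.Builder()
    .addContactPoints("192.168.114.42");
builder.withPoolingOptions(new PoolingOptions().setCoreConnectionsPerHost(
    HostDistance.LOCAL, new PoolingOptions().getMaxConnectionsPerHost(HostDistance.LOCAL)));
cluster = builder
    .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
    .withReconnectionPolicy(new ConstantReconnectionPolicy(100L)) // retry every 100 ms
    .build();
session = cluster.connect("demodb");
The driver now maintains 17-26 connections regardless of the number of transactions.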
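To illustrate the fix described above (one session shared by all requests instead of one per request), here is a minimal sketch. It deliberately does not use the driver: FakeSession, SessionHolder and RequestHandlerDemo are hypothetical names, and FakeSession is only a stand-in for the driver's Session so that the example runs on its own. With the real driver, the marked line is where cluster.connect("demodb") would be called.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the driver's Session, so the sketch
// compiles without the DataStax driver on the classpath.
final class FakeSession {
    static final AtomicInteger CREATED = new AtomicInteger();

    FakeSession() {
        CREATED.incrementAndGet(); // count how many sessions get built
    }

    String execute(String cql) {
        return "executed: " + cql;
    }
}

// Initialization-on-demand holder: the JVM initializes Lazy.INSTANCE once,
// on the first call to get(), and class initialization is thread-safe.
final class SessionHolder {
    private SessionHolder() {}

    private static final class Lazy {
        // With the real driver, cluster.connect("demodb") would go here.
        static final FakeSession INSTANCE = new FakeSession();
    }

    static FakeSession get() {
        return Lazy.INSTANCE;
    }
}

final class RequestHandlerDemo {
    // Simulates one web-service request: reuse the shared session,
    // never build a new one per request.
    static String handleRequest(int id) {
        return SessionHolder.get().execute("request " + id);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1500; i++) {
            handleRequest(i);
        }
        System.out.println("sessions created: " + FakeSession.CREATED.get());
    }
}
```

The holder idiom is just one way to get a single shared instance; in a Tomcat application, a ServletContextListener that builds the session at startup and closes it at shutdown would serve the same purpose.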

Related

How to define local connection to Spark Thrift in Power BI

I am trying to configure a local connection to Spark Thrift in Power BI. I am able to connect using Spark ODBC (localhost:10000 with mechanism User Name and Thrift transport SASL), but I would like to use the Spark connector because it supports Direct Query.
I couldn't find out how to define the connection string. I tried several things such as localhost:10000/default/;transportMode=http;ssl=true;user=... but I always get this error:
ERROR TThreadPoolServer:297 - Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Invalid status 80
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.thrift.transport.TTransportException: Invalid status 80
at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:184)
at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
... 4 more
Any hints would be appreciated!
Solved. As written here: https://community.powerbi.com/t5/Desktop/Connect-Power-BI-to-Hadoop-Direct-query-HDFS-vs-Spark-vs-custom/td-p/374625
It just doesn't work in the Power BI app installed from the Microsoft Store. It works in the app downloaded from the website.

Apache Spark on k8s: securing RPC communication between driver and executors is not working

I have been trying a Spark 2.4 deployment on k8s and want to establish a secured RPC communication channel between the driver and the executors. I was using the following configuration parameters as part of spark-submit:
spark.authenticate true
spark.authenticate.secret good
spark.network.crypto.enabled true
spark.network.crypto.keyFactoryAlgorithm PBKDF2WithHmacSHA1
spark.network.crypto.saslFallback false
The driver and executors were not able to communicate over a secured channel and threw the following errors:
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:281)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
... 4 more
Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Unknown challenge message.
at org.apache.spark.network.crypto.AuthRpcHandler.receive(AuthRpcHandler.java:109)
at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:181)
at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:103)
at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
Can someone guide me on this?
Disclaimer: I do not have a very deep understanding of the Spark implementation, so be careful when using the workaround described below.
AFAIK, Spark does not support auth/encryption on k8s in version 2.4.0.
There is a ticket for this, which is already fixed and will likely be released in the next Spark version: https://issues.apache.org/jira/browse/SPARK-26239
The problem is that Spark executors open a connection to the driver, and the configuration is sent only over this connection. However, an executor creates that connection using the default config plus any system properties whose names start with "spark.".
For reference, here is the place where the executor opens the connection: https://github.com/apache/spark/blob/5fa4384/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala#L201
Theoretically, setting spark.executor.extraJavaOptions=-Dspark.authenticate=true -Dspark.network.crypto.enabled=true ... should help, but the driver checks that no Spark parameters are set in extraJavaOptions.
There is, however, a slightly hacky workaround: you can set spark.executorEnv.JAVA_TOOL_OPTIONS=-Dspark.authenticate=true -Dspark.network.crypto.enabled=true .... Spark does not check this parameter, but the JVM reads this environment variable and adds the options to its system properties.
Also, instead of using JAVA_TOOL_OPTIONS to pass the secret, I would recommend using spark.executorEnv._SPARK_AUTH_SECRET=<secret>.
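To make the mechanism behind the workaround more concrete, here is a small sketch of how -D flags picked up from JAVA_TOOL_OPTIONS end up visible as "spark."-prefixed system properties. It is not Spark's code: SparkPropsSketch and sparkProperties are hypothetical names, and the filtering only approximates what a default Spark configuration does with JVM system properties.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

final class SparkPropsSketch {
    // Collects every property whose key starts with "spark.".
    // This only approximates how a default Spark config picks up
    // JVM system properties; it is not taken from Spark itself.
    static Map<String, String> sparkProperties(Properties props) {
        Map<String, String> result = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith("spark.")) {
                result.put(name, props.getProperty(name));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Any -Dspark.* flag given to this JVM, whether on the command line
        // or through the JAVA_TOOL_OPTIONS environment variable, shows up here.
        sparkProperties(System.getProperties())
            .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Running it with JAVA_TOOL_OPTIONS="-Dspark.authenticate=true" set in the environment should list that property, which is the effect the workaround relies on inside the executor JVM.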

Channel is closed while reading from Alluxio using Presto

I encountered this stack trace while running a Presto query on top of Alluxio. Sometimes my query is able to succeed, but sometimes it fails with this error. What does it mean, and how can I fix it?
com.facebook.presto.spi.PrestoException: Error opening Hive split alluxio://xxxxx:19998/s3/data/m-00025 (offset=100663296, length=53990296) using org.apache.hadoop.mapred.TextInputFormat: Channel [id: 0xfa748b02, L:/xxxxx:34874 ! R:xxxxx/xxxxx:29999] is closed.
at com.facebook.presto.hive.HiveUtil.createRecordReader(HiveUtil.java:219)
at com.facebook.presto.hive.GenericHiveRecordCursorProvider.lambda$createRecordCursor$0(GenericHiveRecordCursorProvider.java:71)
at com.facebook.presto.hive.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:23)
at com.facebook.presto.hive.HdfsEnvironment.doAs(HdfsEnvironment.java:80)
at com.facebook.presto.hive.GenericHiveRecordCursorProvider.createRecordCursor(GenericHiveRecordCursorProvider.java:70)
at com.facebook.presto.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:183)
at com.facebook.presto.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:93)
at com.facebook.presto.spi.connector.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:44)
at com.facebook.presto.split.PageSourceManager.createPageSource(PageSourceManager.java:56)
at com.facebook.presto.operator.ScanFilterAndProjectOperator.getOutput(ScanFilterAndProjectOperator.java:216)
at com.facebook.presto.operator.Driver.processInternal(Driver.java:379)
at com.facebook.presto.operator.Driver.lambda$processFor$8(Driver.java:283)
at com.facebook.presto.operator.Driver.tryWithLock(Driver.java:675)
at com.facebook.presto.operator.Driver.processFor(Driver.java:276)
at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1053)
at com.facebook.presto.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:162)
at com.facebook.presto.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:456)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Channel [id: 0xfa748b02, L:/xxxxx:34874 ! R:xxxxx/xxxxx:29999] is closed.
at alluxio.client.block.stream.NettyPacketReader$PacketReadHandler.channelUnregistered(NettyPacketReader.java:314)
at alluxio.core.client.runtime.io.netty.channel.AbstractChannelHandlerContext.invokeChannelUnregistered(AbstractChannelHandlerContext.java:176)
This means the connection between the Alluxio client (Presto) and Alluxio worker was closed unexpectedly.
Usually this is caused by a long GC pause on the client. The Alluxio client periodically sends a keep-alive on the connection, but this can be delayed (to the point of the worker closing the connection) by full GCs.
You can verify if there is GC pressure by adding the Java options -XX:+PrintGCDetails and -Xloggc:<file name here> to the Presto daemons.
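Besides the GC log flags, the JVM also exposes cumulative collection counts and times through the standard java.lang.management API. The sketch below reads them for the JVM it runs in; GcSnapshot is a hypothetical class name, and for a Presto daemon the same beans would have to be read remotely (for example over JMX), which is not shown here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcSnapshot {
    // Sums the accumulated collection time (in milliseconds) over all
    // garbage collectors of the current JVM.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if undefined for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                + ": count=" + gc.getCollectionCount()
                + ", timeMs=" + gc.getCollectionTime());
        }
        System.out.println("total GC time (ms): " + totalGcMillis());
    }
}
```

Sampling the total twice and comparing the difference against the elapsed wall-clock time gives a rough idea of how much of an interval was spent in GC.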

Datastax driver connection exception: DSE 5.0, Cassandra 3.0.7, Spark

I am trying to understand a warning. Every time I run my Spark job I see the exception below, on 2 nodes of my 3-node cluster. As I said, it is only a warning, and the job still succeeds.
com.datastax.driver.core.exceptions.ConnectionException: [x.x.x.x/x.x.x.x:9042] Pool was closed during initialization
CASSANDRA LOG
INFO [SharedPool-Worker-1] 2017-07-17 22:25:48,716 Message.java:605 - Unexpected exception during request; channel = [id: 0xf0ee1096, /x.x.x.x:54863 => /x.x.x.x:9042]
io.netty.channel.unix.Errors$NativeIoException: readAddress() failed: Connection timed out
at io.netty.channel.unix.Errors.newIOException(Errors.java:105) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
at io.netty.channel.unix.Errors.ioResult(Errors.java:121) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
at io.netty.channel.unix.FileDescriptor.readAddress(FileDescriptor.java:134) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
at io.netty.channel.epoll.AbstractEpollChannel.doReadBytes(AbstractEpollChannel.java:239) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:822) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:348) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
The core of the error is "Connection timed out". I recommend troubleshooting network connectivity to the Cassandra cluster, starting with simpler tools such as ping, telnet and nc. Some potential causes:
The Cassandra client's connection configuration included an address that is not valid (not a node in the Cassandra cluster).
A network misconfiguration or firewall rule is preventing connections from the client to the Cassandra server.
The destination Cassandra server is overloaded, such that it cannot respond to new connection requests.
You mentioned that the problem is intermittent ("seeing this in 2 nodes of my 3 node cluster") and does not cause job failure. This could be an indicator that any of the problems listed above is happening for just a subset of nodes in the cluster. (If connectivity to all nodes was broken, then the job likely would have failed.)
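If ping, telnet or nc are not available on the machine running the job, the same basic reachability check can be sketched in plain Java. PortCheck and canConnect are hypothetical names; 9042 is the native-protocol port that appears in the log above, and the 2-second timeout is an arbitrary choice.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortCheck {
    // Returns true if a TCP connection to host:port can be opened within
    // the timeout; any I/O failure (refused, timed out, unknown host)
    // is reported as false.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass each Cassandra node address as an argument.
        for (String host : args) {
            System.out.println(host + ":9042 reachable = " + canConnect(host, 9042, 2000));
        }
    }
}
```

Running it from each Spark worker against every Cassandra node would show whether the timeouts are limited to particular client/server pairs. A successful connect only proves the port is reachable at that moment, so an intermittent problem may need repeated checks.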

KairosDB failed to discover other Cassandra nodes in ring with Hector client

I have a multi-node Cassandra cluster (2.2.6) and a separate KairosDB server (1.1.1-1). In KairosDB, I configured two Cassandra seed nodes and set it to auto-discover the other Cassandra nodes in the ring.
After setting the KairosDB log level to DEBUG, I see that only those two seed nodes are in the host pool (and working well). The Hector discovery process failed with an NPE, so in the end only these two seed nodes are used by KairosDB.
There might be a few solutions:
Add all nodes to the KairosDB properties, but that is harder to maintain.
Custom-build a new KairosDB binary with a later version of Hector (2.0.0), but I prefer to stay with official releases if possible.
Do you know a way to get around this? Thanks.
08-04|18:54:57.755 [Hector.me.prettyprint.cassandra.connection.NodeAutoDiscoverService-1] DEBUG [NodeDiscovery.java:50] - Node discovery running...
08-04|18:54:57.756 [Hector.me.prettyprint.cassandra.connection.NodeAutoDiscoverService-1] DEBUG [NodeDiscovery.java:74] - using existing hosts [cassandra-seed1(172.16.109.43):9160, cassandra-seed2(172.16.108.51):9160]
08-04|18:54:57.756 [Hector.me.prettyprint.cassandra.connection.NodeAutoDiscoverService-1] ERROR [NodeDiscovery.java:105] - Discovery Service failed attempt to connect CassandraHost
java.lang.NullPointerException: null
at me.prettyprint.cassandra.connection.NodeDiscovery.discoverNodes(NodeDiscovery.java:79) [hector-core-1.1-4.jar:na]
at me.prettyprint.cassandra.connection.NodeDiscovery.doAddNodes(NodeDiscovery.java:52) [hector-core-1.1-4.jar:na]
at me.prettyprint.cassandra.connection.NodeAutoDiscoverService.doAddNodes(NodeAutoDiscoverService.java:45) [hector-core-1.1-4.jar:na]
at me.prettyprint.cassandra.connection.NodeAutoDiscoverService$QueryRing.run(NodeAutoDiscoverService.java:51) [hector-core-1.1-4.jar:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_101]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_101]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_101]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_101]
08-04|18:54:57.756 [Hector.me.prettyprint.cassandra.connection.NodeAutoDiscoverService-1] DEBUG [NodeDiscovery.java:62] - Node discovery run complete.
