Hbase distributed scan issue using Spark2 - apache-spark

When i am trying to execute hbase operation programmatically from spark/scala file
Observation
We just migrated from spark 1.6 version to 2.3 version and hbase 1.2 version same in both the cases. but now we are facing the issue while doing hbase distributed scan operation
Getting Error
Exception in thread "main" org.apache.hadoop.hbase.DoNotRetryIOException: /0.0.0.0:60020 is unable to read call parameter from client ; java.lang.UnsupportedOperationException: GetRegionLoad
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:359)
at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:336)
at org.apache.hadoop.hbase.client.HBaseAdmin.getRegionMetrics(HBaseAdmin.java:2129)
at org.apache.hadoop.hbase.mapreduce.RegionSizeCalculator.init(RegionSizeCalculator.java:82)
at org.apache.hadoop.hbase.mapreduce.RegionSizeCalculator.<init>(RegionSizeCalculator.java:61)
at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.oneInputSplitPerRegion(TableInputFormatBase.java:294)
at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:257)
at org.apache.hadoop.hbase.mapreduce.TableInputFormat.getSplits(TableInputFormat.java:254)
at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:127)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2092)
at org.apache.spark.rdd.RDD.count(RDD.scala:1162)
at com.amobee.spark.dps.cdp.util.CopyHbaseTableUtil$.main(CopyHbaseTableUtil.scala:53)
at com.amobee.spark.dps.cdp.util.CopyHbaseTableUtil.main(CopyHbaseTableUtil.scala)
Any help on it would be much appreciated .

In my experience this operation is incompatible with HBase below 2.x

Related

spark version 3.1.1 job stuck

I am running spark 3.1.1 using spark-submit
The app opens HTTP interface and I am sending it tasks.
the first run works great but the second one I send just stuck without any error in the log.
the UI is responsive and from the executor tab I can see:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.FutureTask.awaitDone(FutureTask.java:426)
java.util.concurrent.FutureTask.get(FutureTask.java:204)
org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:194)
org.apache.spark.sql.execution.SparkPlan.$anonfun$executeBroadcast$1(SparkPlan.scala:193)
org.apache.spark.sql.execution.SparkPlan$$Lambda$5710/2015474824.apply(Unknown Source)
org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
org.apache.spark.sql.execution.SparkPlan$$Lambda$4150/1031027984.apply(Unknown Source)
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:189)
org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doExecute(BroadcastHashJoinExec.scala:142)
org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:180)
org.apache.spark.sql.execution.SparkPlan$$Lambda$4148/1862342205.apply(Unknown Source)
org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
org.apache.spark.sql.execution.SparkPlan$$Lambda$4150/1031027984.apply(Unknown Source)
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:176)
Doing the same with spark 2.4.7 all works great.
What could be the problem?

Cassandra throwing NoHostAvailableException

I am having 3 node cluster with replication factor 2 and one of my Cassandra node was down and I started receiving following exception in my java service in
org.springframework.cassandra.support.exception.CassandraConnectionFailureException: All host(s) tried for query failed (no host was tried); nested exception is com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried)
at org.springframework.cassandra.support.CassandraExceptionTranslator.translateExceptionIfPossible(CassandraExceptionTranslator.java:145)
at org.springframework.cassandra.core.CqlTemplate.potentiallyConvertRuntimeException(CqlTemplate.java:946)
at org.springframework.cassandra.core.CqlTemplate.translateExceptionIfPossible(CqlTemplate.java:930)
at org.springframework.cassandra.core.CqlTemplate.translateExceptionIfPossible(CqlTemplate.java:912)
at org.springframework.cassandra.core.CqlTemplate.doExecute(CqlTemplate.java:278)
at org.springframework.cassandra.core.CqlTemplate.doExecute(CqlTemplate.java:559)
at org.springframework.cassandra.core.CqlTemplate.doExecute(CqlTemplate.java:549)
at org.springframework.cassandra.core.CqlTemplate.query(CqlTemplate.java:485)
at org.springframework.cassandra.core.CqlTemplate.query(CqlTemplate.java:510)
at org.springframework.cassandra.core.CqlTemplate.query(CqlTemplate.java:505)
at org.springframework.data.cassandra.core.CassandraTemplate.selectOne(CassandraTemplate.java:638)
at org.springframework.data.cassandra.core.CassandraTemplate.selectOne(CassandraTemplate.java:509)
at org.springframework.data.cassandra.repository.query.CassandraQueryExecution$SingleEntityExecution.execute(CassandraQueryExecution.java:104)
at org.springframework.data.cassandra.repository.query.CassandraQueryExecution$ResultProcessingExecution.execute(CassandraQueryExecution.java:143)
at org.springframework.data.cassandra.repository.query.AbstractCassandraQuery.execute(AbstractCassandraQuery.java:113)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.doInvoke(RepositoryFactorySupport.java:499)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:477)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.data.projection.DefaultMethodInvokingMethodInterceptor.invoke(DefaultMethodInvokingMethodInterceptor.java:56)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.data.repository.core.support.SurroundingTransactionDetectorMethodInterceptor.invoke(SurroundingTransactionDetectorMethodInterceptor.java:57)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
at com.sun.proxy.$Proxy125.getSearchAppearanceCount(Unknown Source)
at sun.reflect.GeneratedMethodAccessor168.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:333)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:136)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)
Cassandra Version: 3.0.0
Java Driver version: 3.3.0
After removing node(which was down) from DNS, I tried restarting my java service and I started getting exceptions
com.datastax.driver.core.exceptions.AuthenticationException: Authentication error on host cassandra-activitydb.resdex.com/172.10.212.52:9042: Error during authentication of user naukCassandra : org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level LOCAL_ONE
at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.inject(AutowiredAnnotationBeanPostProcessor.java:588)
at org.springframework.beans.factory.annotation.InjectionMetadata.inject(InjectionMetadata.java:88)
at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessPropertyValues(AutowiredAnnotationBeanPostProcessor.java:366)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.populateBean(AbstractAutowireCapableBeanFactory.java:1225)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:552)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:483)
at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:306)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:230)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:302)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:197)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:759)
at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:866)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:542)
at org.springframework.boot.context.embedded.EmbeddedWebApplicationContext.refresh(EmbeddedWebApplicationContext.java:122)
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:761)
at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:371)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:315)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1186)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1175)
at com.ie.naukri.activity_tracker.main.ApplicationConfig.main(ApplicationConfig.java:25)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:48)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:87)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:50)
at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:51)
So my guess is that your SYSTEM_AUTH keyspace has a RF=1 and the node that went down had your authentication record (user row). You could try a different user who's data resided on an available host to get connected, and drop/recreate the affected users, etc. In general, ensure that the SYSTEM_AUTH table is replicated to something higher than 1 (with 3 nodes, I would simply change it to 3, but you could do 2 and hopefully be safe for a single node failure).

How can see the SQL statements that SPARK sends to my database?

I have a spark cluster and a vertica database. I use
spark.read.jdbc( # etc
to load Spark dataframes into the cluster. When I do a certain groupby function
df2 = df.groupby('factor').agg(F.stddev('sum(PnL)'))
df2.show()
I then get a vertica syntax exception
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1454)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1442)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1441)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1441)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1667)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1622)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1611)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1890)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1903)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1916)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:347)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:39)
at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2193)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2546)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2192)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2199)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1935)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1934)
at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2576)
at org.apache.spark.sql.Dataset.head(Dataset.scala:1934)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2149)
at org.apache.spark.sql.Dataset.showString(Dataset.scala:239)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLSyntaxErrorException: [Vertica][VJDBC](4856) ERROR: Syntax error at or near "Window"
at com.vertica.util.ServerErrorData.buildException(Unknown Source)
at com.vertica.io.ProtocolStream.readExpectedMessage(Unknown Source)
at com.vertica.dataengine.VDataEngine.prepareImpl(Unknown Source)
at com.vertica.dataengine.VDataEngine.prepare(Unknown Source)
at com.vertica.dataengine.VDataEngine.prepare(Unknown Source)
at com.vertica.jdbc.common.SPreparedStatement.<init>(Unknown Source)
at com.vertica.jdbc.jdbc4.S4PreparedStatement.<init>(Unknown Source)
at com.vertica.jdbc.VerticaJdbc4PreparedStatementImpl.<init>(Unknown Source)
at com.vertica.jdbc.VJDBCObjectFactory.createPreparedStatement(Unknown Source)
at com.vertica.jdbc.common.SConnection.prepareStatement(Unknown Source)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.<init>(JDBCRDD.scala:400)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD.compute(JDBCRDD.scala:379)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
at org.apache.spark.scheduler.Task.run(Task.scala:86)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
... 1 more
Caused by: com.vertica.support.exceptions.SyntaxErrorException: [Vertica][VJDBC](4856) ERROR: Syntax error at or near "Window"
... 27 more
What I want to know is, what exactly did spark try to execute against the vertica database? Is there a trace config I can set somewhere?
Thanks!
You can look at the query_requests system table to see what SQL has been run against your database. You can filter on user_name and start_timestamp to try and help find the query.
Typically when you control the SQL, you would want to add in a label. But in this case you'll have to search for it.
Also be aware that the retention period of this is governed by the data collector settings.
Using the Spark web UI you can check the behavior and performance of your Spark application. It can also show you the SQL's in SQL tab for web UI. You can also explore the Resource Manager log for more details.
Spark web UI at http://<host ip>:4040.
You can access the SQL tab under /SQL URL, e.g. http://:4040/SQL/.

Pentaho Cassandra input : Column family does not exist in keyspace

I am trying to set up an input step from Cassandra in Pentaho Spoon UI. But when I try to show schema or run the task I get the following exception:
java.lang.Exception: The column family local does not seem to exist in the keyspace system
at org.pentaho.di.trans.steps.cassandrainput.CassandraInputDialog$10.widgetSelected(CassandraInputDialog.java:486)
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at org.pentaho.di.trans.steps.cassandrainput.CassandraInputDialog.open(CassandraInputDialog.java:632)
at org.pentaho.di.ui.spoon.delegates.SpoonStepsDelegate.editStep(SpoonStepsDelegate.java:136)
at org.pentaho.di.ui.spoon.Spoon.editStep(Spoon.java:7835)
at org.pentaho.di.ui.spoon.trans.TransGraph.editStep(TransGraph.java:2749)
at org.pentaho.di.ui.spoon.trans.TransGraph.editStep(TransGraph.java:1998)
at org.pentaho.di.ui.spoon.trans.TransGraph.mouseDown(TransGraph.java:809)
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at org.pentaho.di.ui.spoon.Spoon.readAndDispatch(Spoon.java:1221)
at org.pentaho.di.ui.spoon.Spoon.waitForDispose(Spoon.java:7044)
at org.pentaho.di.ui.spoon.Spoon.start(Spoon.java:8304)
at org.pentaho.di.ui.spoon.Spoon.main(Spoon.java:580)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.pentaho.commons.launcher.Launcher.main(Launcher.java:134)
I can access the table in the keyspace with CQLSH and DevCenter. I have also tried other keyspaces and tables only to get the same error.
Any ideas would be greatly appreciated as I can't seem to find any previous answers to this issue here or in other forums.
So to answer my own question. Having spoken to a few people in Pentaho. It is simply support for the version of cassandra that is lacking. I am using PDI spoon version 4.4.0 which apparently will only provide support for cassandra up to 2.0. In Cassandra 2.1 or higher there was a change to meta structure that hasn't yet been added to PDI.

Decommissioning of nodes in cassandra

I got the following exception when i was trying to decommission a live node. There is no reference in the manual saying that i need to stop the data coming into this node. Even then, decommissioning is specified for live nodes.
Can anyone let me know if am doing something wrong or if this is a bug on cassandra part?
Cassandra Version Used : 2.0.3
Exception in thread "main" java.lang.UnsupportedOperationException:
data is currently moving to this node; unable to leave the ring at
org.apache.cassandra.service.StorageService.decommission(StorageService.java:2629)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601) at
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:111)
at
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:45)
at
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:235)
at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:250)
at
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
at
com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:791)
at
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1486)
at
javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:96)
at
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1327)
at
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1419)
at
javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:847)
at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source) at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601) at
sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
at sun.rmi.transport.Transport$1.run(Transport.java:177) at
sun.rmi.transport.Transport$1.run(Transport.java:174) at
java.security.AccessController.doPrivileged(Native Method) at
sun.rmi.transport.Transport.serviceCall(Transport.java:173) at
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
at
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
at
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)

Resources