Spark2 Zeppelin interpreter dependency results in Null pointer NullPointerException - apache-spark

I am trying to add dependency to Spark2 interpreter in Zeppelin as follows:
org.bdgenomics.adam:adam-core-spark2_2.11:0.23.0
There are no jars at localRepo location. So after I add this
dependency and run a simple command like println("hello world") I
am getting this NullPointerException:
java.lang.NullPointerException
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:861)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:493)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Now there are a lot of packages with jars in local-repo folder and
also under local-repo/2C4U48MY3_spark2 which is the location for
parameter zeppelin.interpreter.localRepo.
Here is zeppelin spark2 interpreter log: http://pasted.co/06cbc12a
Without this dependency, everything works fine.
I'm unable to say why I am getting this NPE. Can you help me please?

Related

Error when running Zeppelin to connect Neo4j & Spark

I'm trying to run : https://github.com/conker84/zep-neo4j-gc2k18
The 1st paragraph Add Neo4j-Spark-Connector dependency:
%spark.dep
z.reset()
z.load("neo4j-contrib:neo4j-spark-connector:2.2.1-M5")
but an error occurred as follow:
java.lang.NullPointerException
at org.sonatype.aether.impl.internal.DefaultRepositorySystem.resolveDependencies(DefaultRepositorySystem.java:352)
at org.apache.zeppelin.spark.dep.SparkDependencyContext.fetchArtifactWithDep(SparkDependencyContext.java:171)
at org.apache.zeppelin.spark.dep.SparkDependencyContext.fetch(SparkDependencyContext.java:121)
at org.apache.zeppelin.spark.DepInterpreter.interpret(DepInterpreter.java:247)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:103)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:633)
at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
What should I do to correct the error?
You are looking at an outdated version of Zeppelin integration. Nowadays Zeppelin supports querying Neo4j out of the box with https://zeppelin.apache.org/docs/0.8.0/interpreter/neo4j.html.
You can check out how I have done it in my repo: https://github.com/tomasonjo/zeppelin-graph-algo

Getting "org.apache.zeppelin.interpreter.InterpreterException: java.io.IOException: Interpreter process is not running null"

Hi I am on Docker on mac[K8 enabled] and trying to deploy Zeppelin on K8 by following https://zeppelin.apache.org/docs/0.9.0-SNAPSHOT/quickstart/kubernetes.html.
After deploying the zeppelin server on K8, I am trying to run the Spark example but getting following exception:
org.apache.zeppelin.interpreter.InterpreterException: java.io.IOException: Interpreter process is not running
null
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:134)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:281)
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:412)
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:72)
at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:130)
at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:180)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Interpreter process is not running
null
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(RemoteInterpreter.java:163)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:131)
... 13 more
Spark interpreter status is green
Interpreter is binded
What should I check to know why it is failing?

Not able to use Cassandra Interpreter in Apache Zeppelin

On executing cassandra code in Zeppelin notebook I get the following error on first run-
java.io.IOException: Invalid argument
at java.io.WinNTFileSystem.canonicalize0(Native Method)
at java.io.WinNTFileSystem.canonicalize(WinNTFileSystem.java:428)
at java.io.File.getCanonicalPath(File.java:618)
at org.fusesource.scalate.util.ClassPathBuilder$$anonfun$getClassPathFrom$3.apply(ClassPathBuilder.scala:147)
at org.fusesource.scalate.util.ClassPathBuilder$$anonfun$getClassPathFrom$3.apply(ClassPathBuilder.scala:142)
at scala.collection.TraversableLike$WithFilter$$anonfun$map$2.apply(TraversableLike.scala:683)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$WithFilter.map(TraversableLike.scala:682)
at org.fusesource.scalate.util.ClassPathBuilder$.getClassPathFrom(ClassPathBuilder.scala:142)
at org.fusesource.scalate.util.ClassPathBuilder.addPathFrom(ClassPathBuilder.scala:68)
at org.fusesource.scalate.util.ClassPathBuilder.addPathFromContextClassLoader(ClassPathBuilder.scala:73)
at org.fusesource.scalate.support.ScalaCompiler.generateSettings(ScalaCompiler.scala:121)
at org.fusesource.scalate.support.ScalaCompiler.<init>(ScalaCompiler.scala:59)
at org.fusesource.scalate.support.ScalaCompiler$.create(ScalaCompiler.scala:42)
at org.fusesource.scalate.TemplateEngine.createCompiler(TemplateEngine.scala:231)
at org.fusesource.scalate.TemplateEngine.compiler$lzycompute(TemplateEngine.scala:221)
at org.fusesource.scalate.TemplateEngine.compiler(TemplateEngine.scala:221)
at org.fusesource.scalate.TemplateEngine.compileAndLoad(TemplateEngine.scala:757)
at org.fusesource.scalate.TemplateEngine.compileAndLoadEntry(TemplateEngine.scala:699)
at org.fusesource.scalate.TemplateEngine.liftedTree1$1(TemplateEngine.scala:419)
at org.fusesource.scalate.TemplateEngine.load(TemplateEngine.scala:413)
at org.fusesource.scalate.TemplateEngine.load(TemplateEngine.scala:471)
at org.fusesource.scalate.TemplateEngine.layout(TemplateEngine.scala:573)
at org.apache.zeppelin.cassandra.DisplaySystem$NoResultDisplay$.<init>(DisplaySystem.scala:369)
at org.apache.zeppelin.cassandra.DisplaySystem$NoResultDisplay$.<clinit>(DisplaySystem.scala)
at org.apache.zeppelin.cassandra.EnhancedSession.<init>(EnhancedSession.scala:40)
at org.apache.zeppelin.cassandra.InterpreterLogic.<init>(InterpreterLogic.scala:98)
at org.apache.zeppelin.cassandra.CassandraInterpreter.open(CassandraInterpreter.java:231)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:616)
at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
And on the second run I get-
java.lang.NoClassDefFoundError: Could not initialize class org.apache.zeppelin.cassandra.DisplaySystem$NoResultDisplay$
at org.apache.zeppelin.cassandra.EnhancedSession.<init>(EnhancedSession.scala:40)
at org.apache.zeppelin.cassandra.InterpreterLogic.<init>(InterpreterLogic.scala:98)
at org.apache.zeppelin.cassandra.CassandraInterpreter.open(CassandraInterpreter.java:231)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:616)
at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Apache Zeppelin version-0.8.2
Could it be because of JAR file versions inside \interpreter\cassandra ?
Appreciate any help.
i am also facing similar issue and found a detailed analysis on why it is occuring . please follow the link to know root cause of the issue
And It was reported already at issues.apache.org click here for report

org.apache.thrift.protocol.TProtocolException: Bad version in readMessageBegin

I'm trying to connect cassandra with zeppelin. Every time I try to execute a query I get the following error. What should I do?
%cassandra
select * from rawtweet1
org.apache.thrift.protocol.TProtocolException: Bad version in readMessageBegin
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:223)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_createInterpreter(RemoteInterpreterService.java:184)
at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.createInterpreter(RemoteInterpreterService.java:168)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:172)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:328)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:105)
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:260)
at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:328)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
The problem was in the configuration of the Cassandra interpreter. I had to change it back to the default settings and than everything worked.

Apache Zeppelin configuration with Spark

I have been trying to configure Apache Zeppeling with Spark 2.0. I managed to install them both on a linux os and I set the the spark on the 8080 port while zeppelin server on the 8082 port number.
In the zeppelin-env.sh file from zeppelin I set the SPARK_HOME variable to the location of the Spark folder.
However when I try to create a new node nothing compiles properly. From what it seems I didn't configure the interpreters as the interpreter tab is missing from in the home tab.
Any help would be much appreciated.
EDIT: E.I. when I am trying to run the zeppelin tutorial, the 'Load data into table' process I receive the following error:
java.lang.ClassNotFoundException:
org.apache.spark.repl.SparkCommandLine at
java.net.URLClassLoader.findClass(URLClassLoader.java:381) at
java.lang.ClassLoader.loadClass(ClassLoader.java:424) at
sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) at
java.lang.ClassLoader.loadClass(ClassLoader.java:357) at
org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:400)
at
org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at
org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
at org.apache.zeppelin.scheduler.Job.run(Job.java:176) at
org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I don't think it's possible to use spark 2.0 without building from
source, since some relatively big changes have happened with this release.
You can clone the zeppelin git repo and build using the spark 2.0 profile as mentioned in the readme on github https://github.com/apache/zeppelin.
I've tried it and it works.

Resources