Unable to connect to Neo4j Community Server via Spark Shell: The client is unauthorized due to authentication failure

Versions
Spark: 2.4.0
Neo4j: Community Edition 3.5.6
Problem Statement:
I am trying to connect the Spark shell to a Neo4j Community server; everything runs locally. My end goal is to query Neo4j, load the results as RDDs, and later convert those RDDs to a JSON structure. I am using this connector: https://github.com/neo4j-contrib/neo4j-spark-connector
At the moment I am facing authentication problems with the Neo4j server. The basic commands that create the connection and set up the Neo4j context appear to work, but as soon as I run rdd.count or rdd.first.schema.fieldName the job fails with an error saying the client is not authenticated.
Spark Shell Commands:
spark-shell --master spark://10.62.10.71:7077 --conf spark.neo4j.bolt.username=neo4j spark.neo4j.bolt.password=<password> --jars C:/Users/khalid-admin/Desktop/jar_files/neo4j-spark-connector-full-2.4.0-M6
import org.neo4j.spark._
val neo = Neo4j(sc)
val rdd = neo.cypher("MATCH (n:Person) RETURN id(n) as id ").loadRowRdd
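Note that spark-shell accepts one key=value pair per --conf flag, so before anything else it is worth confirming that these settings actually reached the shell. A minimal check (the exact key names the connector reads, user vs. username, vary between connector versions, so both spellings are printed):

// Print the Neo4j settings the shell actually received; a missing value here
// likely means the connector falls back to unauthenticated Bolt connections.
println(sc.getConf.getOption("spark.neo4j.bolt.url"))
println(sc.getConf.getOption("spark.neo4j.bolt.user"))
println(sc.getConf.getOption("spark.neo4j.bolt.username"))
println(sc.getConf.getOption("spark.neo4j.bolt.password").map(_ => "<set>"))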
Error:
[Stage 0:> (0 + 1) / 1]2019-08-22 00:25:17 WARN TaskSetManager:66 - Lost task 0.0 in stage 0.0 (TID 0, 10.62.10.71, executor 0): org.neo4j.driver.v1.exceptions.AuthenticationException: The client is unauthorized due to authentication failure.
at org.neo4j.driver.internal.util.Futures.blockingGet(Futures.java:122)
at org.neo4j.driver.internal.DriverFactory.verifyConnectivity(DriverFactory.java:346)
at org.neo4j.driver.internal.DriverFactory.newInstance(DriverFactory.java:93)
at org.neo4j.driver.v1.GraphDatabase.driver(GraphDatabase.java:136)
at org.neo4j.driver.v1.GraphDatabase.driver(GraphDatabase.java:119)
at org.neo4j.spark.Neo4jConfig.driver(Neo4jConfig.scala:15)
at org.neo4j.spark.Neo4jConfig.driver(Neo4jConfig.scala:19)
at org.neo4j.spark.Executor$.execute(Neo4j.scala:394)
at org.neo4j.spark.Neo4jRDD.compute(Neo4j.scala:458)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Suppressed: ...
Steps tried so far:
So far I have tried the following:
I have made sure that I am not using the default Neo4j credentials. When I first logged in to the Neo4j Browser, it prompted me to change the password, and I did. Everything else seems to be working fine.
Can anybody suggest further steps I can take to narrow down the problem?
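If the settings look right, one way to narrow things down further is to test the same credentials directly against Bolt with the Neo4j Java driver that the connector uses underneath (the v1 driver API is visible in the stack trace above). A minimal sketch, assuming the default bolt://localhost:7687 endpoint:

import org.neo4j.driver.v1.{AuthTokens, GraphDatabase}

// Endpoint and password are placeholders; use the same values passed to spark-shell.
val driver = GraphDatabase.driver("bolt://localhost:7687",
  AuthTokens.basic("neo4j", "<password>"))
val session = driver.session()
try {
  // Prints 1 if authentication succeeds; throws AuthenticationException otherwise.
  println(session.run("RETURN 1 AS ok").single().get("ok").asInt())
} finally {
  session.close()
  driver.close()
}

If this succeeds on the machine running the executors, the credentials themselves are fine and the problem is in how they are passed to the connector.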

Related

Is it possible to run Spark Cluster and a client as different users?

I am running a standalone Spark cluster (v3.3) with one master node and 3 workers on a set of several VMs, as the "sparkcluster" user. I also have my client application on my local Windows box; it connects to the Spark cluster to load and persist the data and eventually perform some data transformations. I can clearly see via the Spark UI that my application connects to the Spark cluster, however when I try to persist the data I get an exception:
java.io.IOException: Mkdirs failed to create file:/share/data/spark/load/TESTCACHE/f1ab4f4b-0b95-490b-b05c-5154cec2ae6b/fb51846b-c20d-42b3-bb3c-74abf71d10a2.part/_temporary/0/_temporary/attempt_202211301636564407278350514085268_0003_m_000000_34 (exists=false, cwd=file:/app/spark/spark-3.3.0-bin-hadoop3/work/app-20221129201401-0055/0)
From what I found, the reason for the exception is that the Spark cluster runs as user A while my client application runs as user B.
One possible solution is to deploy my application to the same location where the Spark cluster runs and run it as the same user, however that defeats the purpose of a distributed system in my mind. Is there some sort of "white list" of users/IDs/clients I can define in the Spark cluster that will allow a particular client to perform all kinds of operations on the cluster?
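With a standalone cluster, a file: output path is created on each worker's local filesystem by the OS user running the workers, so rather than whitelisting clients the usual approach is to write to a shared or distributed filesystem that both users can reach. A minimal sketch (the HDFS URI and paths are hypothetical, and spark is the usual SparkSession as in spark-shell):

// Hypothetical paths: read and write through a filesystem both users can access,
// instead of a worker-local file: path owned by the "sparkcluster" user.
val df = spark.read.parquet("hdfs://namenode:8020/data/input")
df.write
  .mode("overwrite")
  .parquet("hdfs://namenode:8020/share/data/spark/load/output")

Alternatively, making the shared directory group-writable by a group that contains both users avoids the Mkdirs failure without moving off the local filesystem.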
Complete stack trace:
17:33:21.372 [task-result-getter-2] WARN org.apache.spark.scheduler.TaskSetManager - Lost task 0.0 in stage 2.0 (TID 2) (30.103.216.22 executor 0): java.io.IOException: Mkdirs failed to create file:/share/data/spark/tmp_load/b0ef0b46-da4c-474a-b381-72d79ba38161.part/_temporary/0/_temporary/attempt_202211292233192374251041408678060_0002_m_000000_2 (exists=false, cwd=file:/app/spark/spark-3.3.0-bin-hadoop3/work/app-20221129201401-0055/0)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:515)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:500)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1195)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1175)
at org.apache.parquet.hadoop.util.HadoopOutputFile.create(HadoopOutputFile.java:74)
at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:329)
at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:482)
at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:420)
at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:409)
at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetOutputWriter.scala:36)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anon$1.newInstance(ParquetFileFormat.scala:155)
at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.newOutputWriter(FileFormatDataWriter.scala:161)
at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.<init>(FileFormatDataWriter.scala:146)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:317)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$21(FileFormatWriter.scala:256)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:136)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)

AWS Graviton based EC2 Instance Upgrade in EMR causing task failures

I have a Spark Scala job running in EMR that I am trying to improve. Right now it runs on m5.8xlarge with no issues. I recently tried upgrading to the Graviton-based EC2 instance type m6g.8xlarge, and while the job does succeed, I am seeing some weird issues: tasks failing due to a timeout, stages running in a strange order, and what looks like memory strain. The stage that runs out of order is the one with failed tasks: stage 6 runs and fails, then stages 4 and 5 complete, and then the stage 6 retry succeeds. In the m5.8xlarge run that currently works, stages 4 and 5 are skipped. I'm not sure why this is happening, since the only change I made was going from an m5 instance type to an m6g, so I wanted to see if anyone has experienced something similar or has solutions. I will also post some of the errors from the failed tasks, but I think they are related to the OOM.
Here is the main error I am seeing:
ERROR TransportClientFactory:261 - Exception while bootstrapping client after 60041 ms
java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timeout waiting for task.
at org.spark_project.guava.base.Throwables.propagate(Throwables.java:160)
at org.apache.spark.network.client.TransportClient.sendRpcSync(TransportClient.java:263)
at org.apache.spark.network.sasl.SaslClientBootstrap.doBootstrap(SaslClientBootstrap.java:70)
at org.apache.spark.network.crypto.AuthClientBootstrap.doSaslAuth(AuthClientBootstrap.java:116)
at org.apache.spark.network.crypto.AuthClientBootstrap.doBootstrap(AuthClientBootstrap.java:89)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:257)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
at org.apache.spark.network.shuffle.ExternalShuffleClient.lambda$fetchBlocks$0(ExternalShuffleClient.java:100)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:141)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:121)
at org.apache.spark.network.shuffle.ExternalShuffleClient.fetchBlocks(ExternalShuffleClient.java:109)
at org.apache.spark.storage.ShuffleBlockFetcherIterator.sendRequest(ShuffleBlockFetcherIterator.scala:264)
at org.apache.spark.storage.ShuffleBlockFetcherIterator.org$apache$spark$storage$ShuffleBlockFetcherIterator$$send$1(ShuffleBlockFetcherIterator.scala:614)
at org.apache.spark.storage.ShuffleBlockFetcherIterator.fetchUpToMaxBytes(ShuffleBlockFetcherIterator.scala:609)
at org.apache.spark.storage.ShuffleBlockFetcherIterator.initialize(ShuffleBlockFetcherIterator.scala:442)
at org.apache.spark.storage.ShuffleBlockFetcherIterator.<init>(ShuffleBlockFetcherIterator.scala:160)
at org.apache.spark.shuffle.BlockStoreShuffleReader.read(BlockStoreShuffleReader.scala:66)
at org.apache.spark.sql.execution.ShuffledRowRDD.compute(ShuffledRowRDD.scala:173)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:105)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1405)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.util.concurrent.TimeoutException: Timeout waiting for task.
at org.spark_project.guava.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:276)
at org.spark_project.guava.util.concurrent.AbstractFuture.get(AbstractFuture.java:96)
at org.apache.spark.network.client.TransportClient.sendRpcSync(TransportClient.java:259)
... 39 more
I don't think this is an out-of-memory problem.
Both m6g.8xlarge and m5.8xlarge have 128 GiB of memory according to their specs: https://aws.amazon.com/ec2/instance-types/m6g and https://aws.amazon.com/ec2/instance-types/m5
I see in the backtrace that the timeout occurs during the authentication process.
First it fails to authenticate with Spark's auth protocol in doBootstrap (AuthClientBootstrap.java:89):
https://github.com/apache/spark/blob/master/common/network-common/src/main/java/org/apache/spark/network/crypto/AuthClientBootstrap.java#L99
"Bootstraps a {@link TransportClient} by performing authentication using Spark's auth protocol. This bootstrap falls back to using the SASL bootstrap if the server throws an error during authentication, and the configuration allows it. This is used for backwards compatibility with external shuffle services that do not support the new protocol."
Then it also fails to authenticate with SASL in doBootstrap (SaslClientBootstrap.java:70):
https://github.com/apache/spark/blob/master/common/network-common/src/main/java/org/apache/spark/network/sasl/SaslClientBootstrap.java#L54
"Bootstraps a {@link TransportClient} by performing SASL authentication on the connection. The server should be setup with a {@link SaslRpcHandler} with matching keys for the given appId. Performs SASL authentication by sending a token, and then proceeding with the SASL challenge-response tokens until we either successfully authenticate or throw an exception due to mismatch."
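If the authentication round trip to the external shuffle service is simply slower on the new instance type, one low-risk experiment is to relax the network and shuffle retry settings. A sketch with illustrative values (the exact property governing the 60 s bootstrap wait may differ):

import org.apache.spark.SparkConf

// Illustrative values; the defaults are 120s, 3 retries and 5s respectively.
val conf = new SparkConf()
  .set("spark.network.timeout", "300s")
  .set("spark.shuffle.io.maxRetries", "10")
  .set("spark.shuffle.io.retryWait", "30s")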

Apache Spark on k8s: securing RPC communication between driver and executors is not working

I have been trying a Spark 2.4 deployment on k8s and want to establish a secured RPC communication channel between the driver and executors. I was using the following configuration parameters as part of spark-submit:
spark.authenticate true
spark.authenticate.secret good
spark.network.crypto.enabled true
spark.network.crypto.keyFactoryAlgorithm PBKDF2WithHmacSHA1
spark.network.crypto.saslFallback false
The driver and executors were not able to communicate on a secured channel and were throwing the following errors.
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:281)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
... 4 more
Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Unknown challenge message.
at org.apache.spark.network.crypto.AuthRpcHandler.receive(AuthRpcHandler.java:109)
at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:181)
at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:103)
at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
Can someone guide me on this?
Disclaimer: I do not have a very deep understanding of Spark's internals, so be careful when using the workaround described below.
AFAIK, Spark does not support auth/encryption for k8s in version 2.4.0.
There is a ticket which is already fixed and will likely be released in an upcoming Spark version: https://issues.apache.org/jira/browse/SPARK-26239
The problem is that Spark executors open a connection to the driver, and the configuration is sent only over this connection. However, an executor creates that connection with the default config plus any system properties starting with "spark.".
For reference, here is the place where executor opens the connection: https://github.com/apache/spark/blob/5fa4384/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala#L201
In theory, setting spark.executor.extraJavaOptions=-Dspark.authenticate=true -Dspark.network.crypto.enabled=true ... should help, but the driver checks that no spark.* parameters are set in extraJavaOptions.
There is, however, a workaround (a little bit hacky): you can set spark.executorEnv.JAVA_TOOL_OPTIONS=-Dspark.authenticate=true -Dspark.network.crypto.enabled=true .... Spark does not check this parameter, but the JVM uses this environment variable to add these flags to its system properties.
Also, instead of using JAVA_TOOL_OPTIONS to pass the secret, I would recommend using spark.executorEnv._SPARK_AUTH_SECRET=<secret>.
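Putting the above together, the driver-side configuration for this workaround might look roughly as follows. This is a sketch of the hack described in this answer, not an official API, and reading the secret from an environment variable is only for illustration:

import org.apache.spark.SparkConf

// "good" is the secret used in the question; in practice read it from a safer place.
val secret = sys.env.getOrElse("SPARK_AUTH_SECRET", "good")
val conf = new SparkConf()
  .set("spark.authenticate", "true")
  .set("spark.authenticate.secret", secret)
  .set("spark.network.crypto.enabled", "true")
  // Hack: forward the same flags to executor JVMs via JAVA_TOOL_OPTIONS, since
  // 2.4.0 executors connect back to the driver before receiving its config.
  .set("spark.executorEnv.JAVA_TOOL_OPTIONS",
    "-Dspark.authenticate=true -Dspark.network.crypto.enabled=true")
  // Hand the secret to executors as recommended above.
  .set("spark.executorEnv._SPARK_AUTH_SECRET", secret)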

Spark Cluster mode issue to read Hive-Hbase table on Kerberized Environment

Error description
We are not able to execute our Spark job in yarn-cluster or yarn-client mode, though it works fine in local mode.
The issue occurs when we try to read Hive-HBase tables in a Kerberized cluster.
What we have tried so far
Passing all the HBase jars via the --jars parameter in spark-submit:
--jars /usr/hdp/current/hive-client/lib/hive-hbase-handler-1.2.1000.2.5.3.16-1.jar,/usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar,/usr/hdp/current/hbase-client/lib/hbase-client.jar,/usr/hdp/current/hbase-client/lib/hbase-common.jar,/usr/hdp/current/hbase-client/lib/hbase-protocol.jar,/usr/hdp/current/hbase-client/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/current/hbase-client/lib/protobuf-java-2.5.0.jar,/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar,/usr/hdp/current/hbase-client/lib/hbase-server.jar
Passing hbase-site.xml and hive-site.xml via the --files parameter in spark-submit:
--files /usr/hdp/2.5.3.16-1/hbase/conf/hbase-site.xml,/usr/hdp/current/spark-client/conf/hive-site.xml,/home/pasusr/pasusr.keytab
Doing Kerberos authentication inside the application; in the code we explicitly pass the keytab:
import java.io.IOException
import java.security.PrivilegedExceptionAction
import org.apache.hadoop.hbase.client.{Connection, ConnectionFactory}
import org.apache.hadoop.security.UserGroupInformation

UserGroupInformation.setConfiguration(configuration)
val ugi: UserGroupInformation =
  UserGroupInformation.loginUserFromKeytabAndReturnUGI(principle, keyTab)
UserGroupInformation.setLoginUser(ugi)
// Open the HBase connection inside the authenticated UGI context
ugi.doAs(new PrivilegedExceptionAction[Connection] {
  @throws[IOException]
  def run: Connection = {
    ConnectionFactory.createConnection(configuration)
  }
})
Passing the keytab information in the spark-submit command.
Passing the HBase jars via spark.driver.extraClassPath and spark.executor.extraClassPath.
Error Log
18/03/20 15:33:24 WARN TableInputFormatBase: You are using an HTable instance that relies on an HBase-managed Connection. This is usually due to directly creating an HTable, which is deprecated. Instead, you should create a Connection object and then request a Table instance from it. If you don't need the Table instance for your own use, you should instead use the TableInputFormatBase.initalizeTable method directly.
18/03/20 15:47:38 WARN TaskSetManager: Lost task 0.0 in stage 7.0 (TID 406, hadoopnode.server.name): java.lang.IllegalStateException: Error while configuring input job properties
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.configureTableJobProperties(HBaseStorageHandler.java:444)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.configureInputJobProperties(HBaseStorageHandler.java:342)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=50, exceptions:
Caused by: java.lang.RuntimeException: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$1.run(RpcClientImpl.java:679)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
I was able to resolve this by adding the following configuration to spark-env.sh:
export SPARK_CLASSPATH=/usr/hdp/current/hbase-client/lib/hbase-common.jar:/usr/hdp/current/hbase-client/lib/hbase-client.jar:/usr/hdp/current/hbase-client/lib/hbase-server.jar:/usr/hdp/current/hbase-client/lib/hbase-protocol.jar:/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar
And removing the spark.driver.extraClassPath and spark.executor.extraClassPath options (through which I had been passing the above jars) from the spark-submit command.
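With the classpath sorted out, a minimal read of the Hive-HBase table is a quick way to confirm the Kerberized setup end to end. A sketch assuming Spark 2.x (on Spark 1.6 the equivalent goes through HiveContext); the table name is hypothetical:

import org.apache.spark.sql.SparkSession

// "db.hive_hbase_table" is a hypothetical Hive table backed by HBaseStorageHandler.
val spark = SparkSession.builder()
  .appName("hive-hbase-smoke-test")
  .enableHiveSupport()
  .getOrCreate()

spark.sql("SELECT * FROM db.hive_hbase_table LIMIT 10").show()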

Tinkerpop Gremlin server MissingPropertyException for SparkGraphComputer in remote mode

I am new to TinkerPop, Gremlin, and Groovy.
I have configured a TinkerPop Gremlin Server and Console [v3.2.3] with verified integration with HDFS and Spark.
When I execute the code below using the Gremlin Console in local mode, everything works fine: a Spark job is submitted and successfully processed.
:load data/grateful-dead-janusgraph-schema.groovy
graph = JanusGraphFactory.open('conf/connection.properties')
defineGratefulDeadSchema(graph)
graph.close()
hdfs.copyFromLocal('data/grateful-dead.kryo','data/grateful-dead.kryo')
graph = GraphFactory.open('conf/hadoop-graph/hadoop-load.properties')
blvp = BulkLoaderVertexProgram.build().writeGraph('conf/connection.properties').create(graph)
graph.compute(SparkGraphComputer).program(blvp).submit().get()
Next, I connect the Gremlin Console to the Gremlin Server as a remote using the command below:
:remote connect tinkerpop.server conf/remote.yaml
After this I execute the above code, prefixing each statement with ":> ". As soon as I submit the last line, which hands processing off to SparkGraphComputer, I get the exception below on the server:
[WARN] AbstractEvalOpProcessor - Exception processing a script on request [RequestMessage{, requestId=097785d6-7114-44fb-acbc-1b116dfdaac2, op='eval', processor='', args={gremlin=graph.compute(SparkGraphComputer).program(blvp).submit().get(), bindings={}, batchSize=64}}].
groovy.lang.MissingPropertyException: No such property: SparkGraphComputer for class: Script4
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:53)
at org.codehaus.groovy.runtime.callsite.PogoGetPropertySite.getProperty(PogoGetPropertySite.java:52)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callGroovyObjectGetProperty(AbstractCallSite.java:307)
at Script4.run(Script4.groovy:1)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:619)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:448)
at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:233)
at org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines.eval(ScriptEngines.java:119)
at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$2(GremlinExecutor.java:287)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I am unable to understand what MissingPropertyException means in Groovy; is it similar to NoClassDefFoundError in Java?
I believe some configuration is missing at the server end. Can someone help me out?
Well, there are two ways to go about this. You can simply import SparkGraphComputer (org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer) in the script that you're sending, or you can add it to the scriptEngines configuration for your Gremlin Server. Something like:
scriptEngines: {
  gremlin-groovy: {
    imports: [your.full.path.to.TheClass],
    staticImports: [your.full.path.to.TheClass.StaticVar]
  }
}
