Error while connecting to JanusGraph - tinkerpop3

I have the following code:
import java.util
import org.apache.tinkerpop.gremlin.driver.{Client, Cluster, MessageSerializer}
import org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
import org.apache.tinkerpop.gremlin.structure.util.empty.EmptyGraph
import org.janusgraph.core.{JanusGraph, JanusGraphFactory}

trait InMemoryConnectScala {

  def messageSerializer(): MessageSerializer = {
    import java.util.Collections
    import org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
    import org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry

    val config = new util.HashMap[String, Object]()
    config.put("ioRegistries", Collections.singletonList(classOf[JanusGraphIoRegistry].getName))
    val serializer = new GryoMessageSerializerV1d0()
    serializer.configure(config, null)
    serializer
  }

  def connect(): JanusGraph = {
    import org.apache.commons.configuration.BaseConfiguration
    val conf = new BaseConfiguration()
    conf.setProperty("storage.backend", "inmemory")
    conf.setProperty("type", "remote")
    val jg = JanusGraphFactory.open(conf)
    jg
  }
}

// Calling code (in AcmTestSpec, which appears to mix in InMemoryConnectScala):
val clusterBuilder = Cluster.build.port(8182).serializer(messageSerializer()).addContactPoint("localhost")
val cl = clusterBuilder.create()
val client: Client = cl.connect()
val jg = EmptyGraph.instance.traversal.withRemote(DriverRemoteConnection.using(cl))
val res = client.submit("g.V().count()")
I get the following error when it hits the submit method:
12:25:34.979 [pool-1-thread-1] INFO o.a.t.gremlin.driver.ConnectionPool - Opening connection pool on Host{address=localhost/127.0.0.1:8182, hostUri=ws://localhost:8182/gremlin} with core size of 2
[info] AcmTestSpec *** ABORTED ***
[info] java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timed out while waiting for an available host - check the client configuration and connectivity to the server if this message persists
[info] at org.apache.tinkerpop.gremlin.driver.Client.submit(Client.java:214)
[info] at org.apache.tinkerpop.gremlin.driver.Client.submit(Client.java:198)
[info] at AcmTestSpec.beforeAll(AcmTestSpec.scala:407)
[info] at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:212)
[info] at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
[info] at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
[info] at AcmTestSpec.run(AcmTestSpec.scala:60)
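For what it's worth, that "Timed out while waiting for an available host" usually means nothing is actually listening on ws://localhost:8182: JanusGraphFactory.open with the inmemory backend only creates an embedded graph inside the current JVM and does not start a Gremlin Server for the driver's Cluster/Client to reach. A minimal sketch of querying the embedded graph directly, with no Cluster, Client or serializer involved (assuming an in-process in-memory graph is enough for the test):

import org.apache.commons.configuration.BaseConfiguration
import org.janusgraph.core.JanusGraphFactory

val conf = new BaseConfiguration()
conf.setProperty("storage.backend", "inmemory")
val graph = JanusGraphFactory.open(conf)   // embedded graph, same JVM

val g = graph.traversal()                  // local traversal source, no WebSocket hop
val count = g.V().count().next()           // same query as client.submit("g.V().count()")
println(s"vertex count: $count")
graph.close()

If a genuine remote connection is the goal, a Gremlin Server configured with the JanusGraph IoRegistry has to be running on port 8182 before the Cluster is built.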

Related

SBT shell project in terminal gives error

I am a beginner in the Scala programming language. I read the sbt documentation and implemented it in the sbt shell, but it gives an error. How do I resolve it?
ThisBuild / scalaVersion := "2.13.0"
ThisBuild / organization := "com.example"

val scalaTest = "org.scalatest" %% "scalatest" % "3.2.7"
val gigahorse = "com.eed3si9n" %% "gigahorse-okhttp" % "0.5.0"
val playJson = "com.typesafe.play" %% "play-json" % "2.6.9"

lazy val hello = (project in file("."))
  .aggregate(helloCore)
  .dependsOn(helloCore)
  .settings(
    name := "Hello",
    libraryDependencies += scalaTest % Test,
  )

lazy val helloCore = (project in file("core"))
  .settings(
    name := "Hello Core",
    libraryDependencies ++= Seq(gigahorse, playJson),
    libraryDependencies += scalaTest % Test,
  )
The above is my build.sbt file.
package example.core

import gigahorse._, support.okhttp.Gigahorse
import scala.concurrent._, duration._
import play.api.libs.json._

object Weather {
  lazy val http = Gigahorse.http(Gigahorse.config)

  def weather: Future[String] = {
    val baseUrl = "https://www.metaweather.com/api/location"
    val locUrl = baseUrl + "/search/"
    val weatherUrl = baseUrl + "/%s/"
    val rLoc = Gigahorse.url(locUrl).get.
      addQueryString("query" -> "New York")
    import ExecutionContext.Implicits.global
    for {
      loc <- http.run(rLoc, parse)
      woeid = (loc \ 0 \ "woeid").get
      rWeather = Gigahorse.url(weatherUrl format woeid).get
      weather <- http.run(rWeather, parse)
    } yield (weather \\ "weather_state_name")(0).as[String].toLowerCase
  }

  private def parse = Gigahorse.asString andThen Json.parse
}
hello.scala:
package example

import scala.concurrent._, duration._
import core.Weather

object Hello extends App {
  val w = Await.result(Weather.weather, 10.seconds)
  println(s"Hello! The weather in New York is $w.")
  Weather.http.close()
}
error:
mehveen@mehveen-Y11C:~$ cd foo-build
mehveen@mehveen-Y11C:~/foo-build$ touch build.sbt
mehveen@mehveen-Y11C:~/foo-build$ sbt
[info] Loading global plugins from /home/mehveen/.sbt/1.0/plugins
[info] Loading settings for project foo-build-build from plugins.sbt ...
[info] Loading project definition from /home/mehveen/foo-build/project
[info] Loading settings for project hello from build.sbt ...
[info] Set current project to Hello (in build file:/home/mehveen/foo-build/)
[info] sbt server started at local:///home/mehveen/.sbt/1.0/server/704a39d101f4d89588ee/sock
sbt:Hello> run
[info] Updating helloCore...
[warn] module not found: com.typesafe.play#play-json_2.13;2.6.9
[warn] ==== local: tried
[warn] /home/mehveen/.ivy2/local/com.typesafe.play/play-json_2.13/2.6.9/ivys/ivy.xml
[warn] ==== public: tried
[warn] https://repo1.maven.org/maven2/com/typesafe/play/play-json_2.13/2.6.9/play-json_2.13-2.6.9.pom
[warn] ==== local-preloaded-ivy: tried
[warn] /home/mehveen/.sbt/preloaded/com.typesafe.play/play-json_2.13/2.6.9/ivys/ivy.xml
[warn] ==== local-preloaded: tried
[warn] file:////home/mehveen/.sbt/preloaded/com/typesafe/play/play-json_2.13/2.6.9/play-json_2.13-2.6.9.pom
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: UNRESOLVED DEPENDENCIES ::
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: com.typesafe.play#play-json_2.13;2.6.9: not found
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn]
[warn] Note: Unresolved dependencies path:
[warn] com.typesafe.play:play-json_2.13:2.6.9 (/home/mehveen/foo-build/build.sbt#L19)
[warn] +- com.example:hello-core_2.13:0.1.0-SNAPSHOT
[error] sbt.librarymanagement.ResolveException: unresolved dependency: com.typesafe.play#play-json_2.13;2.6.9: not found
[error] at sbt.internal.librarymanagement.IvyActions$.resolveAndRetrieve(IvyActions.scala:332)
[error] at sbt.internal.librarymanagement.IvyActions$.$anonfun$updateEither$1(IvyActions.scala:208)
[error] at sbt.internal.librarymanagement.IvySbt$Module.$anonfun$withModule$1(Ivy.scala:239)
[error] at sbt.internal.librarymanagement.IvySbt.$anonfun$withIvy$1(Ivy.scala:204)
[error] at sbt.internal.librarymanagement.IvySbt.sbt$internal$librarymanagement$IvySbt$$action$1(Ivy.scala:70)
[error] at sbt.internal.librarymanagement.IvySbt$$anon$3.call(Ivy.scala:77)
[error] at xsbt.boot.Locks$GlobalLock.withChannel$1(Locks.scala:95)
[error] at xsbt.boot.Locks$GlobalLock.xsbt$boot$Locks$GlobalLock$$withChannelRetries$1(Locks.scala:80)
[error] at xsbt.boot.Locks$GlobalLock$$anonfun$withFileLock$1.apply(Locks.scala:99)
[error] at xsbt.boot.Using$.withResource(Using.scala:10)
[error] at xsbt.boot.Using$.apply(Using.scala:9)
[error] at xsbt.boot.Locks$GlobalLock.ignoringDeadlockAvoided(Locks.scala:60)
[error] at xsbt.boot.Locks$GlobalLock.withLock(Locks.scala:50)
[error] at xsbt.boot.Locks$.apply0(Locks.scala:31)
[error] at xsbt.boot.Locks$.apply(Locks.scala:28)
[error] at sbt.internal.librarymanagement.IvySbt.withDefaultLogger(Ivy.scala:77)
[error] at sbt.internal.librarymanagement.IvySbt.withIvy(Ivy.scala:199)
[error] at sbt.internal.librarymanagement.IvySbt.withIvy(Ivy.scala:196)
[error] at sbt.internal.librarymanagement.IvySbt$Module.withModule(Ivy.scala:238)
[error] at sbt.internal.librarymanagement.IvyActions$.updateEither(IvyActions.scala:193)
[error] at sbt.librarymanagement.ivy.IvyDependencyResolution.update(IvyDependencyResolution.scala:20)
[error] at sbt.librarymanagement.DependencyResolution.update(DependencyResolution.scala:56)
[error] at sbt.internal.LibraryManagement$.resolve$1(LibraryManagement.scala:45)
[error] at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$12(LibraryManagement.scala:93)
[error] at sbt.util.Tracked$.$anonfun$lastOutput$1(Tracked.scala:68)
[error] at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$19(LibraryManagement.scala:106)
[error] at scala.util.control.Exception$Catch.apply(Exception.scala:224)
[error] at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$11(LibraryManagement.scala:106)
[error] at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$11$adapted(LibraryManagement.scala:89)
[error] at sbt.util.Tracked$.$anonfun$inputChanged$1(Tracked.scala:149)
[error] at sbt.internal.LibraryManagement$.cachedUpdate(LibraryManagement.scala:120)
[error] at sbt.Classpaths$.$anonfun$updateTask$5(Defaults.scala:2561)
[error] at scala.Function1.$anonfun$compose$1(Function1.scala:44)
[error] at sbt.internal.util.$tilde$greater.$anonfun$$u2219$1(TypeFunctions.scala:40)
[error] at sbt.std.Transform$$anon$4.work(System.scala:67)
[error] at sbt.Execute.$anonfun$submit$2(Execute.scala:269)
[error] at sbt.internal.util.ErrorHandling$.wideConvert(ErrorHandling.scala:16)
[error] at sbt.Execute.work(Execute.scala:278)
[error] at sbt.Execute.$anonfun$submit$1(Execute.scala:269)
[error] at sbt.ConcurrentRestrictions$$anon$4.$anonfun$submitValid$1(ConcurrentRestrictions.scala:178)
[error] at sbt.CompletionService$$anon$2.call(CompletionService.scala:37)
[error] at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[error] at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
[error] at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[error] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[error] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[error] at java.base/java.lang.Thread.run(Thread.java:834)
[error] (helloCore / update) sbt.librarymanagement.ResolveException: unresolved dependency: com.typesafe.play#play-json_2.13;2.6.9: not found
[error] Total time: 7 s, completed 16 Apr. 2021, 5:00:08 pm
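As the resolver output shows, com.typesafe.play#play-json_2.13;2.6.9 simply does not exist: play-json 2.6.9 was never published for Scala 2.13, so with ThisBuild / scalaVersion := "2.13.0" the %% operator asks for an artifact that was never released. A hedged sketch of the usual fix, using 2.7.4 below purely as an example of a release that does publish play-json_2.13 artifacts:

// build.sbt -- only the play-json line changes; 2.7.4 is an example version,
// any release published for Scala 2.13 will do.
val playJson = "com.typesafe.play" %% "play-json" % "2.7.4"

// Alternative: keep play-json 2.6.9 and build with a Scala version it was
// actually published for, e.g.
// ThisBuild / scalaVersion := "2.12.12"

After editing build.sbt, run reload in the sbt shell so the new setting is picked up.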

failed for get of /hbase/hbaseid, code = CONNECTIONLOSS, retries = 6

I am trying to connect a Spark application to HBase. Below is the configuration I am using:
val conf = HBaseConfiguration.create()
conf.set("hbase.master", "localhost:16010")
conf.setInt("timeout", 120000)
conf.set("hbase.zookeeper.quorum", "2181")
val connection = ConnectionFactory.createConnection(conf)
and below are the 'jps' details:
5808 ResourceManager
8150 HMaster
8280 HRegionServer
5131 NameNode
8076 HQuorumPeer
5582 SecondaryNameNode
2798 org.eclipse.equinox.launcher_1.4.0.v20161219-1356.jar
8623 Jps
5951 NodeManager
5279 DataNode
I have also tried with hbase.master on 16010.
I am getting the error below:
19/09/12 21:49:00 WARN ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.SocketException: Invalid argument
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:454)
at sun.nio.ch.Net.connect(Net.java:446)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:277)
at org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:287)
at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1024)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1060)
19/09/12 21:49:00 WARN ReadOnlyZKClient: 0x1e3ff233 to 2181:2181 failed for get of /hbase/hbaseid, code = CONNECTIONLOSS, retries = 4
19/09/12 21:49:01 INFO ClientCnxn: Opening socket connection to server 2181/0.0.8.133:2181. Will not attempt to authenticate using SASL (unknown error)
19/09/12 21:49:01 ERROR ClientCnxnSocketNIO: Unable to open socket to 2181/0.0.8.133:2181
It looks like there is a problem connecting to ZooKeeper.
First, check that ZooKeeper is started on your local host on port 2181:
netstat -tunelp | grep 2181 | grep -i LISTEN
tcp6 0 0 :::2181 :::* LISTEN
In your conf, the hbase.zookeeper.quorum property has to be the IP (or hostname) of your ZooKeeper, not the port; the port belongs in hbase.zookeeper.property.clientPort.
My HBase connection is built with:
val conf = HBaseConfiguration.create()
conf.set("hbase.zookeeper.quorum", "10.80.188.65")
conf.set("hbase.master", "10.80.188.64:60000")
conf.set("hbase.zookeeper.property.clientPort", "2181")
conf.set("zookeeper.znode.parent", "/hbase-unsecure")
val connection = ConnectionFactory.createConnection(conf)
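Applied to the configuration in the question, a minimal sketch would look like the block below; localhost is an assumption based on the jps output showing HQuorumPeer and HMaster on the same machine, so adjust the hostname (and znode parent, if any) to your setup. Listing the tables is a cheap way to confirm the client can actually reach the cluster:

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.ConnectionFactory

val conf = HBaseConfiguration.create()
conf.set("hbase.zookeeper.quorum", "localhost")          // host(s) go here, not a port
conf.set("hbase.zookeeper.property.clientPort", "2181")  // the port goes here

val connection = ConnectionFactory.createConnection(conf)
val admin = connection.getAdmin
admin.listTableNames().foreach(t => println(t.getNameAsString))
connection.close()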

Spark Streaming works in Local mode but "stages fail" with "could not initialize class" in Client/Cluster mode

I have a Spark + Kafka streaming app that runs fine in local mode; however, when I try to launch it in YARN client/cluster mode I get several errors like the ones below.
The first error I always see is
WARN TaskSetManager: Lost task 1.1 in stage 3.0 (TID 9, ip-xxx-24-129-36.ec2.internal, executor 2): java.lang.NoClassDefFoundError: Could not initialize class TestStreaming$
at TestStreaming$$anonfun$main$1$$anonfun$apply$1.apply(TestStreaming.scala:60)
at TestStreaming$$anonfun$main$1$$anonfun$apply$1.apply(TestStreaming.scala:59)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:917)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:917)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Next error I get is
ERROR JobScheduler: Error running job streaming job 1541786030000 ms.0
followed by
java.lang.NoClassDefFoundError: Could not initialize class
Spark version 2.1.0
Scala 2.11
Kafka version 0.10
Part of my code loads the config in main when I launch the app. I pass this config file at runtime with -conf after the jar (see below). I'm not quite sure, but must I pass this config to the executors as well?
I launch my streaming app with the commands below; one shows local mode, the other client mode.
runJar = myProgram.jar
loggerPath=/path/to/log4j.properties
mainClass=TestStreaming
logger=-DPHDTKafkaConsumer.app.log4j=$loggerPath
confFile=application.conf
-----------Local Mode----------
SPARK_KAFKA_VERSION=0.10 nohup spark2-submit --driver-java-options
"$logger" --conf "spark.executor.extraJavaOptions=$logger" --class
$mainClass --master local[4] $runJar -conf $confFile &
-----------Client Mode----------
SPARK_KAFKA_VERSION=0.10 nohup spark2-submit --master yarn --conf "spark.executor.extraJavaOptions=$logger" --conf "spark.driver.extraJavaOptions=$logger" --class $mainClass $runJar -conf $confFile &
Here is my code. I've been battling this for over a week now.
import Util.UtilFunctions
import UtilFunctions.config
import org.apache.spark.sql.SparkSession
import org.apache.spark.SparkConf
import org.apache.spark.streaming.Seconds
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.log4j.Logger

object TestStreaming extends Serializable {

  @transient lazy val logger: Logger = Logger.getLogger(getClass.getName)

  def main(args: Array[String]) {
    logger.info("Starting app")
    UtilFunctions.loadConfig(args)
    UtilFunctions.loadLogger()

    val props: Map[String, String] = setKafkaProperties()
    val topic = Set(config.getString("config.TOPIC_NAME"))

    val conf = new SparkConf()
      .setAppName(config.getString("config.SPARK_APP_NAME"))
      .set("spark.streaming.backpressure.enabled", "true")

    val spark = SparkSession.builder()
      .config(conf)
      .getOrCreate()

    val ssc = new StreamingContext(spark.sparkContext, Seconds(10))
    ssc.sparkContext.setLogLevel("INFO")
    ssc.checkpoint(config.getString("config.SPARK_CHECKPOINT_NAME"))

    val kafkaStream = KafkaUtils.createDirectStream[String, String](ssc, PreferConsistent, Subscribe[String, String](topic, props))
    val distRecordsStream = kafkaStream.map(record => (record.key(), record.value()))
    distRecordsStream.window(Seconds(10), Seconds(10))

    distRecordsStream.foreachRDD(rdd => {
      if (!rdd.isEmpty()) {
        rdd.foreach(record => {
          println(record._2) // value from kafka
        })
      }
    })

    ssc.start()
    ssc.awaitTermination()
    ssc.stop()
  }

  def setKafkaProperties(): Map[String, String] = {
    val deserializer = "org.apache.kafka.common.serialization.StringDeserializer"
    val zookeeper = config.getString("config.ZOOKEEPER")
    val offsetReset = config.getString("config.OFFSET_RESET")
    val brokers = config.getString("config.BROKERS")
    val groupID = config.getString("config.GROUP_ID")
    val autoCommit = config.getString("config.AUTO_COMMIT")
    val maxPollRecords = config.getString("config.MAX_POLL_RECORDS")
    val maxPollIntervalms = config.getString("config.MAX_POLL_INTERVAL_MS")

    val props = Map(
      "bootstrap.servers" -> brokers,
      "zookeeper.connect" -> zookeeper,
      "group.id" -> groupID,
      "key.deserializer" -> deserializer,
      "value.deserializer" -> deserializer,
      "enable.auto.commit" -> autoCommit,
      "auto.offset.reset" -> offsetReset,
      "max.poll.records" -> maxPollRecords,
      "max.poll.interval.ms" -> maxPollIntervalms)
    props
  }
}
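java.lang.NoClassDefFoundError: Could not initialize class TestStreaming$ on an executor generally means the object's static initialization failed there, and configuration that was only loaded on the driver (via -conf after the jar) is a common culprit; spark-submit's --files option can ship application.conf into each container's working directory. A hedged, illustrative sketch of the other usual suggestion (read config on the driver and let closures capture only plain local values), assuming config is the Typesafe-style Config loaded by UtilFunctions; it replaces the foreachRDD block above and is not claimed to be the poster's exact fix:

// Illustrative only: `appLabel` is a plain String captured by the closure,
// so the task does not need the enclosing object's config on the executor.
val appLabel: String = config.getString("config.SPARK_APP_NAME") // driver-side read

distRecordsStream.foreachRDD { rdd =>
  if (!rdd.isEmpty()) {
    rdd.foreach { record =>
      println(s"[$appLabel] ${record._2}") // only local captures used here
    }
  }
}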

Spark: Frequent Pattern Mining: issues in saving the results

I am using Spark's FP-growth algorithm. I was getting OOM errors when I did a collect, so I changed the code to save the results to a text file on HDFS rather than collecting them on the driver node. Here is the related code:
// Model building:
val fpg = new FPGrowth()
  .setMinSupport(0.01)
  .setNumPartitions(10)
val model = fpg.run(transaction_distinct)
Here is a transformation that should give me an RDD[String]:
val mymodel = model.freqItemsets.map { itemset =>
  val model_res = itemset.items.mkString("[", ",", "]") + ", " + itemset.freq
  model_res
}
I then save the model results as follows. Unfortunately, this is really slow!
mymodel.saveAsTextFile("fpm_model")
I get these errors:
16/02/04 14:47:28 ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@ipaddress:46811] -> [akka.tcp://sparkExecutor@hostname:39720]: Error [Association failed with [akka.tcp://sparkExecutor@hostname:39720]][akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@hostname:39720]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: hostname/ipaddress:39720] akka.event.Logging$Error$NoCause$
16/02/04 14:47:28 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(3, hostname, 58683)
16/02/04 14:47:28 INFO BlockManagerMaster: Removed 3 successfully in removeExecutor
16/02/04 14:47:28 ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@ipaddress:46811] -> [akka.tcp://sparkExecutor@hostname:39720]: Error [Association failed with [akka.tcp://sparkExecutor@hostname:39720]][akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@hostname:39720]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: hostname/ipaddress:39720
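The AssociationError / Connection refused messages mean the executor at hostname:39720 has already died, which together with the earlier OOMs suggests the job is producing far more frequent itemsets than the cluster can comfortably hold: with minSupport = 0.01 the result set can be huge, and the slow saveAsTextFile is largely just the volume of output. A hedged sketch of the two knobs usually tried first; the numbers are illustrative, not recommendations:

// Illustrative values only -- tune for your data set.
val fpg = new FPGrowth()
  .setMinSupport(0.05)     // a higher support threshold cuts the itemset count sharply
  .setNumPartitions(100)   // spread the FP-growth work over more partitions
val model = fpg.run(transaction_distinct)

// Format and write straight from the executors; repartition only controls the
// number of output files and write parallelism, nothing is collected on the driver.
model.freqItemsets
  .map(itemset => itemset.items.mkString("[", ",", "]") + ", " + itemset.freq)
  .repartition(100)
  .saveAsTextFile("fpm_model")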

How to use a specific directory for metastore with HiveContext?

So this is what I tried in Spark Shell.
scala> import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.HiveContext
scala> import java.nio.file.Files
import java.nio.file.Files
scala> val hiveDir = Files.createTempDirectory("hive")
hiveDir: java.nio.file.Path = /var/folders/gg/g3hk6fcj4rxc6lb1qsvxc_vdxxwf28/T/hive5050481206678469338
scala> val hiveContext = new HiveContext(sc)
15/12/31 12:05:27 INFO HiveContext: Initializing execution hive, version 0.13.1
hiveContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@6f959640
scala> hiveContext.sql(s"SET hive.metastore.warehouse.dir=${hiveDir.toUri}")
15/12/31 12:05:34 INFO HiveContext: Initializing HiveMetastoreConnection version 0.13.1 using Spark classes.
...
res0: org.apache.spark.sql.DataFrame = [: string]
scala> Seq("create database foo").foreach(hiveContext.sql)
15/12/31 12:05:42 INFO ParseDriver: Parsing command: create database foo
15/12/31 12:05:42 INFO ParseDriver: Parse Completed
...
15/12/31 12:05:43 INFO HiveMetaStore: 0: create_database: Database(name:foo, description:null, locationUri:null, parameters:null, ownerName:aa8y, ownerType:USER)
15/12/31 12:05:43 INFO audit: ugi=aa8y ip=unknown-ip-addr cmd=create_database: Database(name:foo, description:null, locationUri:null, parameters:null, ownerName:aa8y, ownerType:USER)
15/12/31 12:05:43 INFO HiveMetaStore: 0: get_database: foo
15/12/31 12:05:43 INFO audit: ugi=aa8y ip=unknown-ip-addr cmd=get_database: foo
15/12/31 12:05:43 ERROR RetryingHMSHandler: MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database_core(HiveMetaStore.java:734)
...
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15/12/31 12:05:43 ERROR DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
at org.apache.hadoop.hive.ql.metadata.Hive.createDatabase(Hive.java:248)
...
at org.apache.hadoop.hive.ql.metadata.Hive.createDatabase(Hive.java:242)
... 78 more
15/12/31 12:05:43 ERROR Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
15/12/31 12:05:43 ERROR ClientWrapper:
======================
HIVE FAILURE OUTPUT
======================
SET hive.metastore.warehouse.dir=file:///var/folders/gg/g3hk6fcj4rxc6lb1qsvxc_vdxxwf28/T/hive5050481206678469338/
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
======================
END HIVE FAILURE OUTPUT
======================
org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:349)
...
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
scala>
But it doesn't seem to recognize the directory I am setting. I've trimmed the stack trace since it was very verbose; the entire stack trace is here.
I am not sure what I am doing wrong and would appreciate any help.
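The failure still points at the default warehouse path (file:/user/hive/warehouse/foo.db), which suggests the SET issued through sql() arrives too late to affect where the database gets created. Two things commonly suggested (hedged; not verified against this exact Spark/Hive 0.13.1 combination): set the property on the context itself before running any DDL, or give the database an explicit LOCATION so the warehouse default never matters. A sketch for the same spark-shell session, where hiveDir and hiveContext are the values already defined above:

// Option 1: set the warehouse dir on the context before any DDL runs.
hiveContext.setConf("hive.metastore.warehouse.dir", hiveDir.toUri.toString)

// Option 2: sidestep the warehouse default entirely with an explicit LOCATION.
hiveContext.sql(s"CREATE DATABASE foo LOCATION '${hiveDir.resolve("foo.db").toUri}'")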
