Spark RDD saveAsTextFile gives java.io.IOException: Mkdirs failed to create - apache-spark

I am using Spark 1.6.3 and trying to save an RDD as a text file, but I am getting the following error.
val pRdd = opRdd.coalesce(1) // coalesce returns a new RDD; opRdd itself is unchanged
pRdd.saveAsTextFile("file:///home/user1/Tarun/voucher")
java.io.IOException: Mkdirs failed to create file:/home/user1/Tarun/voucher/_temporary/0/_temporary/attempt_201910261108_0002_m_000000_25 (exists=false, cwd=file:/opt/spark-1.6.3-3/work/app-20191026110834-0031/0)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:442)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:428)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:801)
at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:123)
at org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:91)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1191)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1183)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
What is the issue?
I have given 777 permissions to the Tarun folder.
I am submitting the code using spark-submit on a Unix machine.

I found the explanation myself.
The directory inside which the files are stored was created successfully, but the files were not. That is because the driver (which creates the directory) and the executors (which create the files) run as different users.
The executors run as the user that started the Spark Master.

I found the solution and would like to share it: I changed the output folder to /tmp/Tarun/ and the file was saved.
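For reference, a minimal sketch of that workaround (assuming opRdd is the RDD from the question; the exact path under /tmp is my own placeholder):
val pRdd = opRdd.coalesce(1)
// /tmp is world-writable, so the executor user can create the task files
// even though the driver user created the output directory.
pRdd.saveAsTextFile("file:///tmp/Tarun/voucher")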

Related

Zeppelin+Spark+Cassandra: Spark doesn't work

I watched a nice YouTube video about Zeppelin+Spark+Cassandra and am trying to reproduce it. OS: Windows 10.
Ran Zeppelin as a Docker image;
Set up the options for the Cassandra interpreter, and it works.
Now I am trying to set up Spark, and I can't. I installed spark-3.0.1-bin-hadoop2.7 (the folder is named spark-3.0.1-bin-hadoop2.7, which is fine), and spark-shell works from cmd. What do I have to do with spark-cassandra-connector, and what options do I have to set for the Spark interpreter? Thanks.
org.apache.zeppelin.interpreter.InterpreterException: java.io.IOException: Fail to detect scala version, the reason is:Cannot run program "C:/bin/spark-3.3.1-bin-hadoop3/bin/spark-submit": error=2, No such file or directory
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:129)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:271)
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:438)
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:69)
at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:132)
at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:182)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Fail to detect scala version, the reason is:Cannot run program "C:/bin/spark-3.3.1-bin-hadoop3/bin/spark-submit": error=2, No such file or directory
at org.apache.zeppelin.interpreter.launcher.SparkInterpreterLauncher.buildEnvFromProperties(SparkInterpreterLauncher.java:127)
at org.apache.zeppelin.interpreter.launcher.StandardInterpreterLauncher.launchDirectly(StandardInterpreterLauncher.java:77)
at org.apache.zeppelin.interpreter.launcher.InterpreterLauncher.launch(InterpreterLauncher.java:110)
at org.apache.zeppelin.interpreter.InterpreterSetting.createInterpreterProcess(InterpreterSetting.java:856)
at org.apache.zeppelin.interpreter.ManagedInterpreterGroup.getOrCreateInterpreterProcess(ManagedInterpreterGroup.java:66)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getOrCreateInterpreterProcess(RemoteInterpreter.java:104)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(RemoteInterpreter.java:154)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:126)
... 13 more
OK, here we go:
Install Spark on Windows 10; there are many tutorials on the internet. My version is 3.0.1.
Download the Docker image with Zeppelin.
In the image settings, set up the path to the folder with Spark and port 8080, then launch it and open http://localhost:8080/.
Spark interpreter settings: set SPARK_HOME as in the previous step, and spark.jars.packages = com.datastax.spark:spark-cassandra-connector_2.12:3.0.1. Add the settings for Cassandra: spark.cassandra.connection.host, spark.cassandra.auth.username, spark.cassandra.auth.password.
Welcome
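If the interpreter is configured correctly, a quick smoke test of the connector in a %spark paragraph might look like the sketch below (the keyspace and table names are placeholders of mine, not from the original question):
// Read a Cassandra table through the spark-cassandra-connector.
// "my_keyspace" and "my_table" are hypothetical names; substitute your own.
val df = spark.read
  .format("org.apache.spark.sql.cassandra")
  .option("keyspace", "my_keyspace")
  .option("table", "my_table")
  .load()
df.show(5)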

pySpark job failing on yarn

I am trying to submit a PySpark job from the YARN client and am getting the error below from the ResourceManager, without any further logs.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:231)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:773)
at org.apache.hadoop.fs.DelegateToFileSystem.setPermission(DelegateToFileSystem.java:218)
at org.apache.hadoop.fs.FilterFs.setPermission(FilterFs.java:266)
at org.apache.hadoop.fs.FileContext$11.next(FileContext.java:1008)
at org.apache.hadoop.fs.FileContext$11.next(FileContext.java:1004)
at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
at org.apache.hadoop.fs.FileContext.setPermission(FileContext.java:1011)
at org.apache.hadoop.yarn.util.FSDownload$3.run(FSDownload.java:483)
at org.apache.hadoop.yarn.util.FSDownload$3.run(FSDownload.java:481)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.yarn.util.FSDownload.changePermissions(FSDownload.java:481)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:419)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:242)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:235)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:223)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
For more detailed output, check the application tracking page:
https://.com:8090/cluster/app/application_1638972290118_64750
Then click on links to logs of each attempt. Failing the application.
The cluster is fine and other PySpark jobs run fine.
Please help.
Thanks in advance.
What do you mean by "cluster is fine and other pyspark jobs running fine"?
Did you run them on YARN or just in standalone mode?
Either way, I think it's better to check your YARN cluster first to see whether it works (without Spark).
You can do that using the Hadoop MapReduce examples:
yarn jar $HadoopDir/share/hadoop/mapreduce/hadoop-mapreduce-examples-$version.jar wordcount inputFilePath OutputDir
Check link 1 and link 2 too. They may help.

Spark on Yarn Container Failure

For reference: I solved this issue by adding Netty 4.1.17 in hadoop/share/hadoop/common
No matter what jar I try to run (including the example from https://spark.apache.org/docs/latest/running-on-yarn.html), I keep getting an error about container failure when running Spark on YARN. I get this error in the command prompt:
Diagnostics: Exception from container-launch.
Container id: container_1530118456145_0001_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:585)
at org.apache.hadoop.util.Shell.run(Shell.java:482)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:776)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
When I look at the logs, I then find this error:
Exception in thread "main" java.lang.NoSuchMethodError: io.netty.buffer.PooledByteBufAllocator.metric()Lio/netty/buffer/PooledByteBufAllocatorMetric;
at org.apache.spark.network.util.NettyMemoryMetrics.registerMetrics(NettyMemoryMetrics.java:80)
at org.apache.spark.network.util.NettyMemoryMetrics.<init>(NettyMemoryMetrics.java:76)
at org.apache.spark.network.client.TransportClientFactory.<init>(TransportClientFactory.java:109)
at org.apache.spark.network.TransportContext.createClientFactory(TransportContext.java:99)
at org.apache.spark.rpc.netty.NettyRpcEnv.<init>(NettyRpcEnv.scala:71)
at org.apache.spark.rpc.netty.NettyRpcEnvFactory.create(NettyRpcEnv.scala:461)
at org.apache.spark.rpc.RpcEnv$.create(RpcEnv.scala:57)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:530)
at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:347)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply$mcV$sp(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$5.run(ApplicationMaster.scala:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1758)
at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:814)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:259)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:839)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:869)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Any idea why this is happening? This is running on a pseudo-distributed cluster set up according to this tutorial: https://wiki.apache.org/hadoop/Hadoop2OnWindows. Spark runs fine locally, and seeing as this jar was provided with Spark, I doubt it's a problem within the jar. (Regardless, I added a Netty dependency inside another jar and I'm still getting the same error).
The only thing set in my spark-defaults.conf is spark.yarn.jars, which points to a hdfs directory where I uploaded all of Spark's jars. io.netty.buffer.PooledByteBufAllocator is contained within these jars.
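For context, the sort of spark-defaults.conf entry being described might look like this (the HDFS path is an assumption, not taken from the question):
spark.yarn.jars hdfs://namenode:8020/spark-jars/*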
Spark 2.3.1, Hadoop 2.7.6
I had exactly the same issue. Previously I used Hadoop 2.6.5 with a compatible Spark version and things worked out fine. When I switched to Hadoop 2.7.6, the problem occurred. I'm not sure of the cause, but I copied the netty-4.1.17.Final jar file to the Hadoop library folder and the problem went away.
It seems you have multiple Netty versions on your classpath. Remove all but the latest one, then rebuild:
mvn clean compile
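If you want to confirm which Netty jar actually wins on the classpath at runtime, a generic JVM check like the following can help (my own sketch, not part of the original answer; it can be run e.g. in spark-shell on the cluster):
// Print the location of the jar that the conflicting class was loaded from.
val location = classOf[io.netty.buffer.PooledByteBufAllocator]
  .getProtectionDomain
  .getCodeSource
  .getLocation
println(location)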
This may be a version mismatch between your YARN and Spark installations. Check that the versions you have installed are compatible.
I strongly suggest reading more about NoSuchMethodError and other similar exceptions such as NoClassDefFoundError and ClassNotFoundException. The reason for this suggestion is that when you start using Spark in different situations, these are the most confusing errors and exceptions for people who are not so experienced.
Of course, being careful is the best strategy for a programmer, especially one working on distributed systems like Spark. Well done. ;)

Spark proxy-user not being used when creating hdfs folders on cluster secured w/ Kerberos

I have a Spark application that is launched via spark-submit. The proxy-user is set and YARN mode is being used.
When the application is launched, the Hue UI correctly shows that the job is running as the proxy-user.
On HDFS, the .sparkStaging folder under /user/proxy-user/.sparkStaging is created (with all its contents) with the proxy-user as owner -- all good so far.
However, once the application code executes and tries to create resources (e.g. a folder named result) under /user/proxy-user/result, they are created with the superuser as owner, not the proxy-user. This causes subsequent failures, since the proxy-user now lacks the access rights to the result folder and its children.
This is being run on Cloudera 5.7 with Kerberos. I can provide more details if need be -- I just didn't want to spin a long yarn (no pun intended) of irrelevant details.
So it looks like the proxy-user cannot write to a folder created by a different owner. From the logs:
WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 2, clustername.com, executor 2): org.apache.hadoop.security.AccessControlException: Permission denied: user=anonymous, access=WRITE, inode="/user/anonymous/result345/_temporary/0":superuser:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:281)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:262)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:242)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:169)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6590)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6572)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6524)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2758)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2676)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2561)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:593)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:111)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:393)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1653)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1689)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1624)
at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:804)
at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:123)
at org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:90)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1197)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Accumulo not getting initialised

I am trying to initialise Accumulo, which I am configuring on hadoop2.0.0-cdh4.4.0.
I am using tarballs on a MacBook.
I get an error when initialising Accumulo with bin/accumulo init:
java.io.IOException: Mkdirs failed to create /accumulo/instance_id
The log says:
2014-05-24 01:24:33,935 [util.Initialize] FATAL: Failed to initialize filesystem
java.io.IOException: Mkdirs failed to create /accumulo/instance_id
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:447)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:433)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:867)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:829)
at org.apache.hadoop.fs.FileSystem.createNewFile(FileSystem.java:1129)
at org.apache.accumulo.server.util.Initialize.initFileSystem(Initialize.java:269)
at org.apache.accumulo.server.util.Initialize.initialize(Initialize.java:213)
at org.apache.accumulo.server.util.Initialize.doInit(Initialize.java:199)
at org.apache.accumulo.server.util.Initialize.main(Initialize.java:545)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.accumulo.start.Main$1.run(Main.java:103)
at java.lang.Thread.run(Thread.java:744)
2014-05-24 01:24:33,937 [conf.Configuration] WARN : fs.default.name is deprecated. Instead, use fs.defaultFS
2014-05-24 01:24:33,937 [util.Initialize] FATAL: Default filesystem value ('fs.defaultFS' or 'fs.default.name') was found in the Hadoop configuration
2014-05-24 01:24:33,938 [util.Initialize] FATAL: Please ensure that the Hadoop core-site.xml is on the classpath using 'general.classpaths' in accumulo-site.xml
Please advise. I already tried to fix this by creating /accumulo and /user/accumulo on HDFS and giving them 777 permissions.
The root cause is that the Hadoop jars and configuration are not being placed on Accumulo's classpath. I'm not familiar with how Cloudera packages their Hadoop artifacts.
If you notice in your stack trace, it lists out the ChecksumFileSystem class instead of the DistributedFileSystem. This means that Accumulo doesn't know about the HDFS instance you're trying to write to and is falling back to using the local file system (that's what the ChecksumFileSystem is doing).
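A quick way to verify which FileSystem implementation your configuration resolves to is a check like the one below (a sketch of mine using the standard Hadoop API, assuming the Hadoop jars and configuration directory are on the classpath):
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem

// If core-site.xml is on the classpath, fs.defaultFS should be an hdfs:// URI and
// the resolved class should be DistributedFileSystem rather than a local/checksum one.
val conf = new Configuration()
println(conf.get("fs.defaultFS"))
println(FileSystem.get(conf).getClass.getName)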
To fix this, check a couple of things in your Accumulo configuration files. First, make sure that you have correctly defined HADOOP_PREFIX and HADOOP_CONF_DIR in accumulo-env.sh. Second, make sure that the value you have configured for general.classpaths in accumulo-site.xml all exist, specifically the ones that reference HADOOP_PREFIX and HADOOP_CONF_DIR.
