Cannot connect to standalone spark cluster via sparklyr. How to debug?


I can confirm that I can connect to the cluster using spark-shell, e.g. the following works:
spark-shell --master spark://myurl:7077
But
library(sparklyr)
sc <- spark_connect(
master="spark://myurl:7077",
spark_home = "d:/spark/spark-2.4.4-bin-hadoop2.7/"
)
does not, and gives this error:
Error in force(code) :
Failed while connecting to sparklyr to port (8880) for sessionid (59811): Gateway in localhost:8880 did not respond.
Path: d:\spark\spark-2.4.4-bin-hadoop2.7\bin\spark-submit2.cmd
Parameters: --class, sparklyr.Shell, "C:\Users\user1\Documents\R\win-library\3.6\sparklyr\java\sparklyr-2.3-2.11.jar", 8880, 59811
Log: C:\Users\user1\AppData\Local\Temp\RtmpottVxI\file66ec13ea6ef0_spark.log
---- Output Log ----
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
Invalid maximum heap size: -Xmx10g
The specified size exceeds the maximum representable size.

It turned out that I needed to install the Java 8 JDK instead of the JRE.
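
One way to debug this kind of failure is to check which JVM is actually being launched. The "exceeds the maximum representable size" message is commonly produced by a 32-bit JVM that cannot address a 10g heap, but that is an assumption here; the log above does not confirm it. A minimal Python sketch (the helper names are hypothetical) that inspects the output of `java -version`:

```python
import shutil
import subprocess
from typing import Optional


def is_64bit_jvm(version_output: str) -> bool:
    """Return True if `java -version` output advertises a 64-bit VM."""
    return "64-Bit" in version_output


def java_version_output(java: str = "java") -> Optional[str]:
    """Run `java -version` and return its text, or None if java is not on PATH."""
    path = shutil.which(java)
    if path is None:
        return None
    result = subprocess.run([path, "-version"], capture_output=True, text=True)
    # `java -version` normally writes to stderr, so combine both streams.
    return result.stderr + result.stdout


if __name__ == "__main__":
    output = java_version_output()
    if output is None:
        print("java not found on PATH")
    elif is_64bit_jvm(output):
        print("64-bit JVM found")
    else:
        print("JVM does not report 64-Bit; a large -Xmx value may be rejected")
```

If the JVM found first on PATH is not the one you expect, that is worth checking before changing any Spark settings.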

Related

Spark 2.4 Got an error when resolving hostNames Falling back to /default-rack

When running an application in client mode, the driver logs contain the info message below. Any idea how to resolve this? Are there Spark configs that need to be updated or that are missing?
[INFO ][dispatcher-event-loop-29][SparkRackResolver:54] Got an error when resolving hostNames. Falling back to /default-rack for all
The job runs fine, and this message does not appear in the executor logs.

Check this bug:
https://issues.apache.org/jira/browse/SPARK-28005
If you want to suppress this in the logs, you can try adding this to your log4j.properties:
log4j.logger.org.apache.spark.deploy.yarn.SparkRackResolver=ERROR

This can happen when using spark-submit with master yarn and the driver running locally (that is, without --deploy-mode cluster) while the path to the topology.py script in your core-site.xml is incorrect.
The path to core-site.xml can be set via the environment variable HADOOP_CONF_DIR (or YARN_CONF_DIR).
Check the path given as the value of the net.topology.script.file.name parameter in core-site.xml.
If the path is incorrect, running the driver locally produces a warning like the following:
23/01/15 18:39:43 WARN ScriptBasedMapping: Exception running /home/alexander/xxx/.conf/topology.py 10.15.21.199
java.io.IOException: Cannot run program "/etc/hadoop/conf.cloudera.yarn/topology.py" (in directory "/home/john"): error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
...
23/01/15 18:39:43 INFO SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
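
The check described above can be sketched in Python. This is a minimal sketch, not part of Spark or Hadoop: the function names are hypothetical, and it assumes core-site.xml sits directly in the directory named by HADOOP_CONF_DIR or YARN_CONF_DIR.

```python
import os
import xml.etree.ElementTree as ET
from typing import Optional

TOPOLOGY_KEY = "net.topology.script.file.name"


def topology_script_path(core_site_xml: str) -> Optional[str]:
    """Return the configured topology script path, or None if it is unset."""
    root = ET.parse(core_site_xml).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == TOPOLOGY_KEY:
            value = prop.findtext("value", "").strip()
            return value or None
    return None


def check_topology_script(conf_dir: str) -> str:
    """Describe whether the topology script named in core-site.xml is usable."""
    path = topology_script_path(os.path.join(conf_dir, "core-site.xml"))
    if path is None:
        return "not configured"
    if not os.path.isfile(path):
        return "missing: " + path
    if not os.access(path, os.X_OK):
        return "not executable: " + path
    return "ok: " + path


if __name__ == "__main__":
    conf_dir = os.environ.get("HADOOP_CONF_DIR") or os.environ.get("YARN_CONF_DIR")
    if conf_dir and os.path.isfile(os.path.join(conf_dir, "core-site.xml")):
        print(check_topology_script(conf_dir))
    else:
        print("HADOOP_CONF_DIR / YARN_CONF_DIR not set, or core-site.xml not found")
```

Run it on the machine where the driver starts, since that is where the script path has to resolve.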

Neo4j refused to connect

Characteristics:
Linux
Neo4j version 3.2.1
Remote access
Installation:
I installed Neo4j and gave the folder chmod 777.
I'm running it remotely on my machine and have already enabled non-local access.
When I run neo4j start, I get this message:
Active database: graph.db
Directories in use:
home: /home/cloudera/Muna/apps/neo4j
config: /home/cloudera/Muna/apps/neo4j/conf
logs: /home/cloudera/Muna/apps/neo4j/logs
plugins: /home/cloudera/Muna/apps/neo4j/plugins
import: /home/cloudera/Muna/apps/neo4j/import
data: /home/cloudera/Muna/apps/neo4j/data
certificates: /home/cloudera/Muna/apps/neo4j/certificates
run: /home/cloudera/Muna/apps/neo4j/run
Starting Neo4j.
WARNING: Max 1024 open files allowed, minimum of 40000 recommended. See the Neo4j manual.
Started neo4j (pid 9469). It is available at http://0.0.0.0:7474/
There may be a short delay until the server is ready.
See /home/cloudera/Muna/apps/neo4j/logs/neo4j.log for current status.
but the browser cannot connect to it.
Running neo4j console gives:
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 409600000 bytes for AllocateHeap
# An error report file with more information is saved as:
# /home/cloudera/hs_err_pid18598.log
Where could the problem be coming from?

Firstly, you should set the maximum number of open files to 40000, which is the recommended value. Then the WARNING goes away. See: http://neo4j.com/docs/1.6.2/configuration-linux-notes.html
Secondly, 'failed to allocate memory' means that the Java virtual machine cannot allocate the memory it was started with.
This can be caused by a misconfiguration, or the machine may simply not have enough physical memory.
Please read the memory sizing guidelines here:
https://neo4j.com/docs/operations-manual/current/performance/
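
To see the open-files limit that a process on the Neo4j host actually gets, a small Python sketch using the standard `resource` module (Unix only) can help. The 40000 threshold comes from the startup warning above; the function name is hypothetical.

```python
import resource

RECOMMENDED_NOFILE = 40000  # minimum recommended by the Neo4j startup warning


def open_file_limit_ok(soft_limit: int, recommended: int = RECOMMENDED_NOFILE) -> bool:
    """Return True if the soft open-files limit meets the recommendation."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= recommended


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft limit:", soft, "hard limit:", hard)
    if not open_file_limit_ok(soft):
        print("soft limit is below", RECOMMENDED_NOFILE)
```

The limit is per user and per session, so run this as the same user that starts Neo4j.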

Spark program on Windows cluster fails with error CreateProcess error=5, Access is denied

I am trying to execute a program on a Spark v2.0.0 cluster on my Windows 10 laptop. There is a master node on port 31080 and a slave node on port 32080. The cluster uses the Standalone manager, and I am using JDK 1.8 with a custom work directory for the slave.
When the program is submitted via spark-submit or through Eclipse > Run program, I get the error below, and the executor goes into a loop (a new executor is created and fails, over and over). Please guide.
Executor updated: app-20160906203653-0001/0 is now RUNNING
Executor updated: app-20160906203653-0001/0 is now FAILED (java.io.IOException: Cannot run program ""D:\jdk1.8.0_101"\bin\java"
(in directory "D:\spark-work\app-20160906203653-0001\0"):
CreateProcess error=5, Access is denied)
Executor app-20160906203653-0001/0 removed: java.io.IOException: Cannot run program ""D:\jdk1.8.0_101"\bin\java" (in directory
"D:\spark-work\app-20160906203653-0001\0"): CreateProcess error=5,
Access is denied
Removal of executor 0 requested

Got the answer. I was starting my master and slaves through Windows batch scripts. These were invoking an env script which set JAVA_HOME, SCALA_HOME and SPARK_HOME, and the paths were enclosed in double quotes. That was the issue: removing the double quotes fixed it. No admin privileges or other changes were needed.
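
The effect of the quotes can be illustrated with a short Python sketch. It assumes the executor command is built by appending \bin\java to the JAVA_HOME value, which matches the path shown in the error but is not confirmed from the Spark source; both function names are hypothetical.

```python
def strip_outer_quotes(value: str) -> str:
    """Remove one pair of surrounding double quotes, if present."""
    value = value.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


def java_executable(java_home: str) -> str:
    """Build the java path by plain concatenation (assumed behaviour)."""
    return java_home + "\\bin\\java"


if __name__ == "__main__":
    quoted = '"D:\\jdk1.8.0_101"'
    # With the quotes kept, the path contains a stray quote in the middle.
    print(java_executable(quoted))
    # With the quotes stripped, the path is a normal Windows path.
    print(java_executable(strip_outer_quotes(quoted)))
```

The first printed path has the same shape as the one in the CreateProcess error, which is why the quoted value cannot be executed.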

Why does Spark job fail on Mesos with "hadoop: not found"?

I use Spark 1.6.1, Hadoop 2.6.4 and Mesos 0.28 on Debian 8.
When I submit a job via spark-submit to a Mesos cluster, a slave fails with the following in its stderr log:
I0427 22:35:39.626055 48258 fetcher.cpp:424] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/ad642fcf-9951-42ad-8f86-cc4f5a5cb408-S0\/hduser","items":[{"action":"BYP$
I0427 22:35:39.628031 48258 fetcher.cpp:379] Fetching URI 'hdfs://xxxxxxxxx:54310/sources/spark/SimpleEventCounter.jar'
I0427 22:35:39.628057 48258 fetcher.cpp:250] Fetching directly into the sandbox directory
I0427 22:35:39.628078 48258 fetcher.cpp:187] Fetching URI 'hdfs://xxxxxxx:54310/sources/spark/SimpleEventCounter.jar'
E0427 22:35:39.629243 48258 shell.hpp:93] Command 'hadoop version 2>&1' failed; this is the output:
sh: 1: hadoop: not found
Failed to fetch 'hdfs://xxxxxxx:54310/sources/spark/SimpleEventCounter.jar': Failed to create HDFS client: Failed to execute 'hadoop version 2>&1'; the command was e$
Failed to synchronize with slave (it's probably exited)
My Jar file contains hadoop 2.6 binaries
The path to spark executor/binary is via an hdfs:// link
My jobs don't appear in the framework tab, but they do appear in the driver with the status 'queued' and they just sit there till I shut down the spark-mesos-dispatcher.sh service.

I was seeing a very similar error, and I figured out that my problem was that hadoop_home wasn't set in the Mesos agent.
On each mesos-slave, I added the following line to /etc/default/mesos-slave (the path may be different on your install): MESOS_hadoop_home="/path/to/my/hadoop/install/folder/"
EDIT: Hadoop has to be installed on each slave; /path/to/my/hadoop/install/folder/ is a local path.
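
To confirm on each slave that a hadoop executable can actually be found, a small Python sketch like the following may help. The function name is hypothetical, and it assumes the executable lives under bin/ inside the Hadoop install folder.

```python
import os
import shutil
from typing import Optional


def find_hadoop(hadoop_home: Optional[str] = None) -> Optional[str]:
    """Return the path of the hadoop executable, or None if it cannot be found.

    With hadoop_home given, only <hadoop_home>/bin is searched;
    otherwise the current PATH is searched.
    """
    if hadoop_home:
        return shutil.which("hadoop", path=os.path.join(hadoop_home, "bin"))
    return shutil.which("hadoop")


if __name__ == "__main__":
    home = os.environ.get("HADOOP_HOME")
    found = find_hadoop(home)
    print(found if found else "hadoop: not found")
```

Run it as the user the Mesos agent runs as, because that user's environment is what the fetcher sees.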

Spark - UbuntuVM - insufficient memory for the Java Runtime Environment

I'm trying to install Spark 1.5.1 on an Ubuntu 14.04 VM. After un-tarring the file, I changed directory to the extracted folder and executed the command "./bin/pyspark", which should start the pyspark shell. Instead I got the following error message:
[ OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000c5550000, 715849728, 0) failed;
error='Cannot allocate memory' (errno=12) There is insufficient
memory for the Java Runtime Environment to continue.
Native memory allocation (malloc) failed to allocate 715849728 bytes
for committing reserved memory.
An error report file with more information is saved as:
/home/datascience/spark-1.5.1-bin-hadoop2.6/hs_err_pid2750.log ]
Could anyone please give me some directions to sort out the problem?

You need to set the Spark memory properties (for example spark.driver.memory or spark.executor.memory) in the conf/spark-defaults.conf file to values that fit your machine. For example, from the Spark directory:
cp conf/spark-defaults.conf.template conf/spark-defaults.conf
nano conf/spark-defaults.conf
Then add a line such as:
spark.driver.memory 512m
For more information, refer to the official documentation: http://spark.apache.org/docs/latest/configuration.html

Pretty much what it says. The JVM failed to commit 715849728 bytes (about 683 MiB), so the VM does not have enough free memory. Give the VM more RAM.
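
The size in the error message is given in bytes, which is easy to misread. A minimal Python sketch (hypothetical helper names) that extracts the number from a "failed to allocate" line and converts it to MiB:

```python
import re
from typing import Optional


def failed_allocation_bytes(message: str) -> Optional[int]:
    """Extract the byte count from a 'failed to allocate N bytes' message."""
    match = re.search(r"failed to allocate (\d+) bytes", message)
    return int(match.group(1)) if match else None


def to_mib(num_bytes: int) -> float:
    """Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


if __name__ == "__main__":
    line = "Native memory allocation (malloc) failed to allocate 715849728 bytes"
    size = failed_allocation_bytes(line)
    if size is not None:
        print(f"{size} bytes = {to_mib(size):.1f} MiB")
```

For the message in the question, this works out to roughly 683 MiB.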
