I'm using Spark 1.3.1 in standalone mode on my cluster, which has 7 machines. Two of the machines are powerful, with 64 cores and 1024 GB of memory, while the others have 40 cores and 256 GB of memory. One of the powerful machines is set to be the master, and the others are slaves. Each slave machine runs 4 workers.
When I run my driver program on one of the powerful machines, I see that it takes cores only from the two powerful machines. Below is part of the web UI of my Spark master.
My configuration of this Spark driver program is as follows:
spark.scheduling.mode=FAIR
spark.default.parallelism=32
spark.cores.max=512
spark.executor.memory=256g
spark.logConf=true
Why does Spark do this? Is this a good thing or a bad thing? Thanks!
Consider lowering your executor memory from the 256 GB you have defined.
Going forward, consider assigning around 75% of the available memory.
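For instance, each 256 GB slave here runs 4 workers, so a single worker only has roughly 64 GB to offer; an executor request that every worker can actually satisfy might look something like this sketch (the value is illustrative, roughly 75% of one worker's share):
# spark-defaults.conf sketch: keep the request below what a single worker can offer
spark.executor.memory=48g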
I have a cluster of machines that I have to share with other processes. Let's just say I am not a nice person and want my Spark executor processes to have a higher priority than other people's processes. How can I set that?
I am using standalone mode, Spark 2.0.1, running on RHEL 7.
Spark does not currently (2.4.0) support nice process priorities. Grepping through the codebase, there is no usage of nice, and hence no easy way to set process priority on executors using out-of-the-box Spark. It would also be a little odd for Spark to do this, since it only assumes it can start a JVM, not that the underlying operating system is UNIX.
There are hacky ways to get around this that I do NOT recommend. For instance, if you are using Mesos as a resource manager, you could set spark.mesos.executor.docker.image to an image where java actually calls nice -1 old-java "$@".
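If you did go down that road, the image could shadow the JVM with a wrapper along these lines (a minimal sketch, purely illustrative; note that raising priority with a negative nice value normally requires root or CAP_SYS_NICE inside the container):
# Inside the custom executor image: rename the real JVM and shadow it with a wrapper
mv /usr/bin/java /usr/bin/old-java
cat > /usr/bin/java <<'EOF'
#!/bin/sh
# forward all arguments to the real JVM at a slightly higher priority
exec nice -n -1 old-java "$@"
EOF
chmod +x /usr/bin/java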
Allocate all the resources to the Spark application, leaving only the minimal resources needed for the OS to run.
A simple scenario:
Imagine a cluster with six nodes running NodeManagers (YARN mode), each equipped with 16 cores and 64 GB of memory. The NodeManager capacities, yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores, should probably be set to 63 * 1024 = 64512 (megabytes) and 15, respectively. We avoid allocating 100% of the resources to YARN containers because the node needs some resources to run the OS and Hadoop daemons. In this case, we leave a gigabyte and a core for these system processes.
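Carrying that through to executor sizing, one submission that fits those capacities might look like the following sketch (illustrative; my_app.jar is a placeholder, and it assumes the default YARN memory overhead of roughly 7-10% fits in the spare ~2 GB per executor):
# 3 executors per node: 3 x 5 cores = 15 vcores, 3 x (19 GB heap + ~2 GB overhead) ≈ 63 GB
# 6 nodes x 3 executors - 1 container reserved for the application master = 17 executors
spark-submit --master yarn \
  --num-executors 17 \
  --executor-cores 5 \
  --executor-memory 19G \
  my_app.jar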
I have been prototyping Spark Streaming 1.6.1 using the Kafka receiver on a Mesos 0.28 cluster running in coarse-grained mode.
I have 6 Mesos slaves, each with 64 GB RAM and 16 cores.
My Kafka topic has 3 partitions.
My goal is to launch 3 executors in all (each on a different Mesos slave), with each executor having one Kafka receiver reading from one Kafka partition.
When I launch my Spark application with spark.cores.max set to 24 and spark.executor.memory set to 8 GB, I get two executors: one with 16 cores on one slave and one with 8 cores on another slave.
I am looking to get 3 executors with 8 cores each on three different slaves. Is that possible with Mesos through resource reservation/isolation, constraints, etc.?
The only workaround that works for me now is to scale down each Mesos slave node to have at most 8 cores. I don't want to use Mesos in fine-grained mode for performance reasons, and its support is going away soon anyway.
Mesosphere has contributed the following patch to Spark: https://github.com/apache/spark/commit/80cb963ad963e26c3a7f8388bdd4ffd5e99aad1a. This improvement will land in Spark 2.0. Mesosphere has backported this and other improvements to Spark 1.6.1 and made it available in DC/OS (http://dcos.io).
This patch introduces a new "spark.executor.cores" config variable in coarse-grained mode. When the "spark.executor.cores" config variable is set, executors will be sized with the specified number of cores.
If an offer arrives with a multiple of (spark.executor.memory, spark.executor.cores), multiple executors will be launched on that offer. This means there could be multiple, but separate, Spark executors running on the same Mesos agent node.
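For the scenario above, a submission along these lines (a sketch; the Mesos master URL and jar name are placeholders) should produce three 8-core, 8 GB executors, although, as noted, two of them may still end up on the same 16-core agent:
spark-submit --master mesos://<mesos-master>:5050 \
  --conf spark.cores.max=24 \
  --conf spark.executor.cores=8 \
  --conf spark.executor.memory=8g \
  my_streaming_app.jar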
There is no way (currently) to spread the executors across N Mesos agents. We briefly discussed adding the ability to spread Spark executors across N Mesos agents but concluded it doesn't buy much in terms of improved availability.
Can you help us understand your motivations for spreading Spark executors across 3 Mesos agents? It's likely we haven't considered all possible use cases and advantages.
Keith
Is there any advantage to starting more than one Spark instance (master or worker) on a particular machine/node?
The Spark standalone documentation doesn't explicitly say anything about starting a cluster or multiple workers on the same node. It does seem to implicitly assume that one worker equals one node.
Their hardware provisioning page says:
Finally, note that the Java VM does not always behave well with more than 200 GB of RAM. If you purchase machines with more RAM than this, you can run multiple worker JVMs per node. In Spark’s standalone mode, you can set the number of workers per node with the SPARK_WORKER_INSTANCES variable in conf/spark-env.sh, and the number of cores per worker with SPARK_WORKER_CORES.
So aside from working with large amounts of memory or testing cluster configuration, is there any benefit to running more than one worker per node?
I think the obvious benefit is to improve the resource utilization of the hardware per box without losing performance. In terms of parallelism, one big executor with multiple cores seems to be the same as multiple executors with fewer cores.
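As a sketch, a box with, say, 256 GB of RAM and 32 cores could be split into two worker JVMs in conf/spark-env.sh to keep each JVM well under the ~200 GB mark (values illustrative):
SPARK_WORKER_INSTANCES=2
SPARK_WORKER_CORES=16
SPARK_WORKER_MEMORY=128g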
Is it possible to have executors with different amounts of memory on a Mesos cluster? Or am I bound by the machine with the least memory? (Assuming I want to use all available CPUs.)
Short answer: No.
Unfortunately, Spark on Mesos and YARN only allows giving as many resources (cores, memory, etc.) per machine as your worst machine has (discussion). Ideally, the cluster should be homogeneous in order to take full advantage of its resources.
However, there might be a workaround for your problem. According to the linked source above, Spark standalone allows creating multiple workers on some machines. You could configure each worker to suit the worst machine and start multiple such workers on the stronger machines.
For example, given two computers with 4 GB and 20 GB of memory respectively, you could create 5 workers on the latter, each configured to use just 4 GB of memory, matching the limit of the first machine.
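In conf/spark-env.sh on the 20 GB machine, that could look roughly like this (a sketch; in practice you would likely leave some headroom for the OS):
SPARK_WORKER_INSTANCES=5
SPARK_WORKER_MEMORY=4g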
I have created a Spark cluster of 8 machines. Each machine has 104 GB of RAM and 16 virtual cores.
It seems that Spark only sees 42 GB of RAM per machine, which is not correct. Do you know why Spark does not see all the RAM of the machines?
PS: I am using Apache Spark 1.2.
Seems like a common misconception. What is displayed is the spark.storage.memoryFraction:
https://stackoverflow.com/a/28363743/4278362
Spark makes no attempt at guessing the available memory. Executors use as much memory as you specify with the spark.executor.memory setting. Looks like it's set to 42 GB.
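If you want the executors to use more of the 104 GB per machine, raise that setting, for example in conf/spark-defaults.conf (the exact value is illustrative and leaves some headroom for the OS):
spark.executor.memory=90g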