Restricting memory usage by Spark application on worker nodes - apache-spark

I have a computer cluster consisting of a master node and a bunch of worker nodes, each having 8GB of physical RAM. I would like to run some Spark jobs on this cluster, but limit the memory that is used by each node for performing this job. Is this possible within Spark?
For example, can I run a job so that each worker will only be allowed to use at most 1GB of physical RAM to do its job?
If not, is there some other way to artificially "limit" the RAM that each worker has?
If it is possible to limit RAM in one way or another, what actually happens when Spark "runs out" of physical RAM? Does it page to disk, or does the Spark job just kill itself?
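For illustration only, a minimal sketch of the most direct knob, assuming a standalone or YARN deployment where the per-executor heap is what you want to cap; my_app.py and the 1g figure are placeholders taken from the question, not a recommendation.

# Sketch: cap each executor's heap (and the driver's) at 1GB at submit time.
spark-submit \
  --executor-memory 1g \
  --driver-memory 1g \
  my_app.py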

Related

How to start Cassandra and Spark services, assigning half of the RAM to each program?

I have a 4-node cluster with Spark and Cassandra installed on each node.
Spark version 3.1.2 and Cassandra v3.11.
Say each node has 4GB of RAM and I want to run my "spark+cassandra" program across the whole cluster.
How can I assign 2GB of RAM for Cassandra execution and 2GB for Spark execution?
I have also noticed the following: if my Cassandra cluster is up and I run the start-worker.sh command on a worker node to bring my Spark cluster up, the Cassandra service suddenly stops while Spark keeps working. Basically, Spark steals RAM from Cassandra. How can I avoid this as well?
In the Cassandra logs of the crashed node I see the message:
There is insufficient memory for the Java Runtime Environment to continue.
In fact, running top -c and then pressing Shift+M, I can see the Spark service at the top of the memory column.
Thanks for any suggestions.
By default, Spark workers take up the total RAM less 1GB. On a 4GB machine, the worker JVM consumes 3GB of memory. This is the reason the machine runs out of memory.
You'll need to set SPARK_WORKER_MEMORY to 1GB to leave enough memory for the operating system. For details, see Starting a Spark cluster manually.
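A minimal sketch of that setting, assuming the standalone worker reads conf/spark-env.sh on each node; the 1g value comes from the answer above.

# conf/spark-env.sh on each worker node
# Cap the memory this worker can hand out to Spark applications,
# leaving the rest of the 4GB for Cassandra and the operating system.
SPARK_WORKER_MEMORY=1g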
It's also important to note, as Alex Ott already pointed out, that a machine with only 4GB of RAM is not going to be able to do much, so expect to run into performance issues. Cheers!

Apache Spark executors and data locality

The Spark literature says:
Each application gets its own executor processes, which stay up for the duration of the whole application and run tasks in multiple threads.
If I understand this right, in static allocation the executors are acquired by the Spark application on all nodes in the cluster (in cluster mode) when the SparkContext is created. I have a couple of questions:
If executors are acquired on all nodes and will stay allocated to this application during the duration of the whole application, isn't there a chance a lot of nodes remain idle?
What is the advantage of acquiring resources when Spark context is created and not in the DAGScheduler? I mean the application could be arbitrarily long and it is just holding the resources.
So when the DAGScheduler tries to get the preferred locations and the executors in those nodes are running the tasks, would it relinquish the executors on other nodes?
I have checked a related question ("Does Spark on yarn deal with Data locality while launching executors"), but I'm not sure there is a conclusive answer.
If executors are acquired on all nodes and will stay allocated to this application during the duration of the whole application, isn't there a chance a lot of nodes remain idle?
Yes, there is a chance. If you have data skew, this will happen. The challenge is to tune the number of executors and executor cores so that you get maximum utilization. Spark also provides dynamic resource allocation, which ensures that idle executors are removed.
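As a hedged sketch of that tuning, assuming a YARN deployment; the numbers and my_app.py are placeholders you would adjust for your own cluster and data.

# Fix the executor count, cores, and memory explicitly at submit time.
spark-submit \
  --master yarn \
  --num-executors 10 \
  --executor-cores 4 \
  --executor-memory 4g \
  my_app.py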
What is the advantage of acquiring resources when Spark context is created and not in the DAGScheduler? I mean the application could be arbitrarily long and it is just holding the resources.
Spark tries to keep data in memory while doing transformations, contrary to the MapReduce model, which writes to disk after every map operation. Spark can keep the data in memory only if it can ensure the code is executed on the same machine. This is the reason for allocating resources beforehand.
So when the DAGScheduler tries to get the preferred locations and the executors in those nodes are running the tasks, would it relinquish the executors on other nodes?
Spark can't start a task on an executor unless the executor is free. The Spark application master negotiates with YARN to get the preferred locations. It may or may not get them. If it doesn't, it will start the task on a different executor.
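That fallback can be tuned through the locality-wait settings, which control how long the scheduler waits for a data-local slot before accepting a less local one. A sketch, assuming the documented default of 3s as the starting point; the values and my_app.py are illustrative only.

# Locality-wait knobs passed as --conf flags at submit time.
spark-submit \
  --master yarn \
  --conf spark.locality.wait=3s \
  --conf spark.locality.wait.node=3s \
  --conf spark.locality.wait.rack=3s \
  my_app.py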

Hadoop YARN Cluster / Spark and RAM Disks

Because my computational tasks require fast disk I/O, I am interested in mounting large RAM disks on each worker node in a YARN cluster that runs Spark, and am thus wondering how the YARN cluster manager handles the memory occupied by such a RAM disk.
If I were to allocate 32GB to a RAM disk on each 128GB RAM machine, for example, would the YARN cluster manager know how to allocate RAM so as to avoid over-allocating memory when performing tasks (in this case, does YARN allocate all 128GB of RAM to the requisitioned tasks, or at most only 96GB)?
If so, is there any way to indicate to the YARN cluster manager that a RAM disk is present and that a specific portion of the RAM is therefore off limits to YARN? Will Spark know about these constraints, too?
In the Spark configuration you can set driver and executor configs such as cores and the amount of memory to allocate. Moreover, when you use YARN as the resource manager, there are some extra configs supported by it that can help you manage the cluster resources better, for example "spark.driver.memoryOverhead" or "spark.yarn.am.memoryOverhead", which is the amount of off-heap space with a default value of
AM memory * 0.10, with minimum of 384
For further information, here is the link.
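As a hedged sketch of setting those overheads explicitly, assuming a YARN deployment; the values and my_app.py are illustrative placeholders, not recommendations.

# Raise the off-heap overhead for executors and the YARN application master.
spark-submit \
  --master yarn \
  --executor-memory 8g \
  --conf spark.executor.memoryOverhead=2g \
  --conf spark.yarn.am.memoryOverhead=512m \
  my_app.py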

Can we assign more memory to the Spark app than the cluster itself has?

Say the Spark cluster is a standalone cluster.
The master has 1GB of memory and the slave has 1GB of memory.
When you submit an application to the cluster, you can specify how much memory the driver program and the worker program can have. So is it possible to specify some higher value, like 10GB for the driver and 10GB for the worker?
I mean, what will happen if the program you submitted requires more memory than the cluster itself has? (Let us assume that the physical computer has enough memory.)
Spark has a feature called "Dynamic Allocation". It can be turned on using
spark.dynamicAllocation.enabled = true
More details here
http://www.slideshare.net/databricks/dynamic-allocation-in-spark
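A minimal sketch of turning it on at submit time, assuming a cluster manager that supports dynamic allocation (older Spark versions also require the external shuffle service); the bounds and my_app.py are placeholders.

spark-submit \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=1 \
  --conf spark.dynamicAllocation.maxExecutors=10 \
  my_app.py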
If you request more memory than the resource manager has access to, you will not be allocated all of your workers. The resource manager will allocate as many as it can, and if at least 1 worker is allocated, your program will be able to run (at least on YARN). Your resource manager will not allocate a worker (or the driver) with less memory than is requested. There is no such thing as a partial worker.

What are the benefits of running multiple Spark instances per node (physical machine)?

Is there any advantage to starting more than one Spark instance (master or worker) on a particular machine/node?
The Spark standalone documentation doesn't explicitly say anything about starting a cluster or multiple workers on the same node. It does seem to implicitly assume that one worker equals one node.
Their hardware provisioning page says:
Finally, note that the Java VM does not always behave well with more than 200 GB of RAM. If you purchase machines with more RAM than this, you can run multiple worker JVMs per node. In Spark’s standalone mode, you can set the number of workers per node with the SPARK_WORKER_INSTANCES variable in conf/spark-env.sh, and the number of cores per worker with SPARK_WORKER_CORES.
So aside from working with large amounts of memory or testing cluster configuration, is there any benefit to running more than one worker per node?
I think the obvious benefit is improving the resource utilization of the hardware per box without losing performance. In terms of parallelism, one big executor with many cores seems to be about the same as multiple executors with fewer cores.
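A minimal sketch of the variables mentioned in the quoted documentation, assuming a standalone cluster where conf/spark-env.sh is edited on each node; the numbers are illustrative for a large-memory box.

# conf/spark-env.sh: run several worker JVMs on one node
SPARK_WORKER_INSTANCES=2    # worker JVMs per node
SPARK_WORKER_CORES=16       # cores each worker may hand out
SPARK_WORKER_MEMORY=100g    # memory each worker may hand out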
