What is driver memory and executor memory in Spark? [duplicate]

This question already has answers here:
How to set Apache Spark Executor memory
I am new to the Spark framework and I would like to know: what are driver memory and executor memory? What is the most effective way to get maximum performance from both of them?

Spark needs a driver to coordinate the executors. So the best way to understand them is:
Driver
The driver is responsible for running the main logic of your code, requesting resources from YARN, handling the allocation, and processing small amounts of data for certain kinds of logic. Driver memory is all about how much data you bring back to the driver to process. If you retrieve too much data with an rdd.collect(), your driver will run out of memory. Driver memory is usually small: 2 GB to 4 GB is more than enough if you don't send too much data to it.
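For example, here is a minimal PySpark sketch of that difference (the path and app name are placeholders): collect() materializes everything on the driver, while take() or count() keep the heavy lifting on the executors and return only a small result.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("driver-memory-demo").getOrCreate()
rdd = spark.sparkContext.textFile("hdfs:///data/big_dataset")  # hypothetical path

# Risky: materializes the whole dataset in driver memory.
# all_rows = rdd.collect()

# Safer: only a small sample or a single aggregate comes back to the driver.
sample = rdd.take(10)   # 10 records, negligible driver memory
total = rdd.count()     # a single number comes back

print(sample, total)
spark.stop()
```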
Worker
This is where the magic happens: the worker is the one responsible for executing your job. The amount of memory it needs depends on what you are going to do. If you are just running a map function that transforms the data with no aggregation, you usually don't need much memory. But if you are going to run big aggregations, many stages, and so on, you will usually need a good amount of memory. It is also related to the size of the files you read.
Telling you the proper amount of memory for each case is hard, because it all depends on how your job works. You need to understand the impact of each function and monitor your jobs to tune memory usage. Maybe 2 GB per worker is all you need, but sometimes 8 GB per worker is what it takes.
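As a rough sketch (the values are examples, not universal recommendations), these settings are usually passed on the command line, e.g. `spark-submit --driver-memory 4g --executor-memory 8g app.py`, or in spark-defaults.conf. Programmatically it looks like this, with the caveat that spark.driver.memory normally has to be set before the driver JVM starts, so setting it inside the application typically has no effect:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("memory-config-demo")
    .config("spark.executor.memory", "8g")  # worker/executor heap for heavier jobs
    .config("spark.driver.memory", "4g")    # usually ignored here; set via spark-submit instead
    .getOrCreate()
)
```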

Related

setting tuning parameters of a spark job

I'm relatively new to Spark and I have a few questions related to tuning optimizations with respect to the spark-submit command.
I have followed : How to tune spark executor number, cores and executor memory?
and I understand how to utilise maximum resources out of my spark cluster.
However, I was recently asked how to define the number of executors, memory and cores when I have a relatively small operation to do, since if I give it maximum resources, they are going to be underutilised.
For instance,
If I just have to do a merge job (read files from HDFS and write one single huge file back to HDFS using coalesce) on about 60-70 GB of data (in Avro format, without compression; assume each file is 128 MB in size, which is the HDFS block size), what would be the ideal memory, number of executors and cores required for this?
Assume I have the configurations of my nodes same as the one mentioned in the link above.
I can't get a sense of how much memory will be used up by the entire job, given that there are no joins, aggregations etc.
The amount of memory you will need depends on what you run before the write operation. If all you're doing is reading data, combining it, and writing it out, then you will need very little memory per CPU because the dataset is never fully materialized before being written out. If you're doing joins/group-by/other aggregate operations, all of those will require much more memory. The exception to this rule is that Spark isn't really tuned for large files and is generally much more performant when dealing with sets of reasonably sized files. Ultimately the best way to get your answers is to run your job with the default parameters and see what blows up.
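For reference, a sketch of the kind of merge job the question describes (paths are placeholders, and it assumes the spark-avro package is available). Because there is no join or aggregation, each task just streams records through, so per-executor memory can stay modest; note that coalesce(1) funnels everything through a single task, which becomes the bottleneck but does not send data to the driver.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("avro-merge").getOrCreate()

# ~60-70 GB of uncompressed Avro files on HDFS (placeholder path)
df = spark.read.format("avro").load("hdfs:///input/events/")

# Merge into a single output file; one task does all the writing.
df.coalesce(1).write.format("avro").save("hdfs:///output/events_merged/")

spark.stop()
```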

Does Apache Spark cache RDD in node-level or cluster-level?

I know that Apache Spark persist method saves RDDs in memory and that if there is not enough memory space, it stores the remaining partitions of the RDD in the filesystem (disk). What I can't seem to understand is the following:
Imagine we have a cluster and we want to persist an RDD. Suppose node A does not have a lot of memory space and that node B does. Let's suppose now that after running the persist command, node A runs out of memory. The question now is:
Does Apache Spark search for more memory space in node B and try to store everything in memory?
Or, given that there is not enough space in node A, does Spark store the remaining partitions of the RDD on the disk of node A even if there is some memory space available in node B?
Thanks for your answers.
Normally Spark doesn't search for free space. Data is cached locally on the executor responsible for a particular partition.
The only exception is when you use a replicated persistence mode - in that case an additional copy will be placed on another node.
The closest thing I could find is this: To cache or not to cache. I had plenty of situations where data was mildly skewed and I was getting memory-related exceptions/failures when trying to cache/persist into RAM; one way around it was to use StorageLevels like MEMORY_AND_DISK, but obviously it took longer to cache and then read those partitions.
Also, in the Spark UI you can find information about the executors and how much of their memory is used for caching; you can experiment and monitor how it behaves.
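A small sketch of the storage levels mentioned above (the path is a placeholder): MEMORY_AND_DISK lets partitions that don't fit in an executor's memory spill to that executor's local disk, and the *_2 levels are the replicated mode that places a second copy on another node.

```python
from pyspark import StorageLevel
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("persist-demo").getOrCreate()
rdd = spark.sparkContext.textFile("hdfs:///data/some_dataset")  # placeholder path

rdd.persist(StorageLevel.MEMORY_AND_DISK)      # cache locally, spill overflow to local disk
# rdd.persist(StorageLevel.MEMORY_AND_DISK_2)  # replicated: an extra copy on another node

rdd.count()  # an action triggers the caching; check the Storage/Executors tabs in the UI
```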

Is there such a thing as too many executors in Spark?

I'm working with a Spark/YARN cluster that limits the resources I can allocate to 8GB memory and 1 core per container, but I can allocate hundreds, perhaps even thousands of executors to run my application on.
However since the driver has similar resource limitations (8GB memory, 4 cores), I'm concerned that too many executors may overwhelm the driver and cause timeouts.
Is there a rule of thumb for sizing the driver memory and cores to handle large numbers of executors?
There are rules on how to size your "executors".
A driver with 8 GB and 4 cores should be able to handle thousands of executors easily, as it only maintains bookkeeping metadata about the executors.
This assumes you do not have functions like collect() in your Spark code.
Spark code analysis will help you understand which actions in Spark are performed where: http://bytepadding.com/big-data/spark/spark-code-analysis/
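To make the "which code runs where" point concrete, a small illustrative sketch: the lambda passed to map executes on the executors, the aggregation also happens on the executors, and only the final result travels back to the driver, which is why collect() is the call that can overwhelm driver memory at scale.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("where-does-it-run").getOrCreate()
rdd = spark.sparkContext.parallelize(range(1_000_000))

squared = rdd.map(lambda x: x * x)   # runs on executors, distributed
partial = squared.sum()              # aggregation on executors; one number returns to the driver
# everything = squared.collect()     # would ship a million values to the driver

print(partial)
spark.stop()
```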

spark spilling independent of executor memory assigned

I've noticed strange behavior when running a PySpark application with Spark 2.0. In the first step of my script, which involves a reduceByKey (and thus shuffle) operation, I observe that the amount of shuffle write is roughly in line with my expectations, but that many more spills occur than I had expected. I tried to avoid these spills by increasing the amount of memory assigned per executor up to 8x the original amount, but I see basically no difference in the amount spilled. Strangely, I also see that while this stage is running, hardly any of the assigned storage memory is used (as reported in the Executors tab in the Spark web UI).
I saw this earlier question, which led me to believe that increasing executor memory might help avoid the spills: How to optimize shuffle spill in Apache Spark application
. This leads me to believe that some hard limit is leading to the spills, and not the spark.shuffle.memoryFraction parameter. Does such a hard limit exist, possibly among HDFS parameters? Otherwise, what could be done to avoid spills besides increasing executor memory?
Many thanks, R
Spilling behavior in PySpark is controlled using spark.python.worker.memory:
Amount of memory to use per python worker process during aggregation, in the same format as JVM memory strings (e.g. 512m, 2g). If the memory used during aggregation goes above this amount, it will spill the data into disks.
which is by default set to 512 MB. Moreover, PySpark uses its own reducing mechanism with External(GroupBy|Sorter|Merger) and exhibits slightly different behavior than its native counterpart.
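A sketch of raising that Python-side threshold (the value is illustrative): this only affects when the Python worker starts spilling during aggregation, not the JVM executor heap controlled by spark.executor.memory.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("python-worker-memory-demo")
    .config("spark.python.worker.memory", "2g")  # default is 512m
    .getOrCreate()
)
```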

Spark partitionBy on write.save brings all data to driver?

So basically I have a Python Spark job that reads some simple JSON files and then tries to write them as ORC files partitioned by one field. The partitioning is not very balanced, as some keys are really big and others really small.
I had memory issues when doing something like this:
events.write.mode('append').partitionBy("type").save("s3n://mybucket/tofolder", format="orc")
Adding memory to the executors didn't seem to have any effect, but I solved it by increasing the driver memory. Does this mean that all the data is being sent to the driver for it to write? Can't each executor write its own partition? I'm using Spark 2.0.1.
Even if you partition the dataset and then write it to storage, there is no possibility that records are sent to the driver. You should look at the logs of the memory issues (whether they occur on the driver or on the executors) to figure out the exact reason for the failure.
Probably your driver has too little memory to handle this write because of previous computations. Try decreasing spark.ui.retainedJobs and spark.ui.retainedStages to save the memory spent on old job and stage metadata. If this doesn't help, connect to the driver with jvisualvm to find the job/stage that consumes large heap fragments and try to optimize it.
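A hedged sketch of that suggestion (the values are examples; the defaults are considerably higher), which only helps when UI bookkeeping is what is filling the driver heap:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("trim-ui-metadata")
    .config("spark.ui.retainedJobs", "100")    # keep metadata for fewer finished jobs
    .config("spark.ui.retainedStages", "100")  # keep metadata for fewer finished stages
    .getOrCreate()
)
```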
