In Spark, is it better to have many small workers or a few bigger workers?

A Spark cluster consists of a driver that distributes tasks to multiple worker nodes. Each worker can run a number of concurrent tasks equal to the number of cores available, so I'd think the speed at which a job finishes depends on the total number of available cores.
Consider the following cluster configurations, using AWS EC2 as an example:
2 m5.4xlarge (16 vCPU/cores, 64GB RAM) workers for a total of 32 cores / 128GB RAM
OR
8 m5.xlarge (4 vCPU/cores, 16GB RAM) workers for a total of 32 cores / 128GB RAM
I'm using those instances as an example; it's not about those instances specifically but about the general idea that you can have the same total amount of cores and RAM with different configurations. Would there be any difference in performance between those two cluster configurations? Both have the same total number of cores and amount of RAM, and the same RAM/core ratio. For what kind of job would you choose one, and for what kind the other? Some thoughts I have on this myself:
The configuration with 8 smaller instances might have higher total network bandwidth, since each worker has its own connection.
The configuration with 2 bigger instances might be more efficient when shuffling, since more cores share memory on the same worker instead of having to shuffle across the network, giving lower network overhead.
The configuration with 8 smaller instances has better resiliency: if one worker fails, it's only one out of eight rather than one out of two.
Do you agree with the statements above? What other considerations would you make when choosing between different configurations with equal amount of total RAM / cores?

Related

Spark: How to tune memory / cores given my cluster?

There are several threads with significant votes that I am having difficulty interpreting, perhaps because the jargon of 2016 differs from today's (or maybe I am just not getting it):
Apache Spark: The number of cores vs. the number of executors
How to tune spark executor number, cores and executor memory?
Azure/Databricks offers some best practices on cluster sizing: https://learn.microsoft.com/en-us/azure/databricks/clusters/cluster-config-best-practices
So for my workload, let's say I am interested in (using Databricks' current jargon):
1 driver: 64 GB of memory and 8 cores
1 worker: 256 GB of memory and 64 cores
Drawing on the Microsoft link above, fewer workers should in turn lead to less shuffle, which is among the most costly Spark operations.
So, I have 1 driver and 1 worker. How, then, do I translate these terms into what is discussed here on SO in terms of "nodes" and "executors"?
Ultimately, I would like to set my Spark config "correctly" so that cores and memory are as optimized as possible.

How does Spark manage I/O performance if we reduce the number of cores per executor and increase the number of executors?

From my research, whenever we run a Spark job we should not run executors with more than 5 cores; if we increase the cores beyond that limit, the job will suffer from bad I/O throughput.
My doubt is: if we increase the number of executors and reduce the cores, those executors will still end up on the same physical machine, reading from and writing to the same disks. Why will this not cause the same I/O throughput issue?
You can consider
Apache Spark: The number of cores vs. the number of executors
as a reference use case.
The cores within an executor are like threads. Just as more work gets done when we increase parallelism, we should keep in mind that there is a limit to it, because we still have to gather the results from those parallel tasks.
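That limit can be illustrated with Amdahl's law (an illustration added here, not part of the answer above): if a fraction p of the job is parallelizable and the rest, such as gathering results, stays serial, the speedup from n parallel tasks is bounded no matter how large n gets.

```python
def speedup(p: float, n: int) -> float:
    """Amdahl's law: speedup from n parallel workers when a fraction p
    of the work is parallelizable (the remaining 1-p stays serial)."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with 90% parallel work, the speedup flattens out as n grows,
# approaching a 10x ceiling (1 / (1 - 0.9)):
print(round(speedup(0.9, 5), 2))
print(round(speedup(0.9, 50), 2))
print(round(speedup(0.9, 5000), 2))
```

This is why throwing ever more cores or executors at a job stops paying off once the serial parts (scheduling, result collection, shuffles) dominate.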

What happens if I allocate all the available cores on the server for spark cluster

As is well known, it is possible to increase the number of cores when submitting an application. Actually, I'm trying to allocate all available cores on the server to the Spark application. I'm wondering what will happen to performance: will it get worse, or better than usual?
The first thought about allocating cores (--executor-cores) that might come to mind is that more cores per executor means more parallelism, more tasks executed concurrently, and better performance. But that's not true in the Spark ecosystem. After leaving 1 core for the OS and other applications running on the worker, commonly cited tuning benchmarks suggest about 5 cores per executor is optimal, largely to preserve HDFS I/O throughput.
For example, if you have a worker node with 16 cores, the optimal allocation would be --num-executors 3 and --executor-cores 5 (using 5*3 = 15 cores, with one left for the OS).
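The arithmetic above can be written as a small helper (a hedged sketch: the 1 reserved core and the 5-core cap come from this answer's rule of thumb, not from any Spark API):

```python
def executor_layout(node_cores: int, cores_per_executor: int = 5,
                    reserved_cores: int = 1) -> tuple[int, int]:
    """Return (num_executors, executor_cores) for one worker node,
    reserving a core for the OS and capping cores per executor at 5."""
    usable = node_cores - reserved_cores
    num_executors = usable // cores_per_executor
    return num_executors, cores_per_executor

print(executor_layout(16))  # (3, 5): --num-executors 3 --executor-cores 5
```

Any leftover cores (here, 15 used out of 16) are simply left to the OS rather than forced into an undersized extra executor.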
Optimal resource allocation is not the only thing that brings better performance; it also depends on how the transformations and actions on the dataframes are done. More shuffling of data between executors hampers performance.
Your operating system always needs resources for its own bare needs.
It is good to keep 1 core and 1 GB of memory for the operating system and other applications.
If you allocate all resources to Spark, it is not going to improve your performance; your other applications will starve for resources.
I think it is not a good idea to allocate all resources to Spark alone.
Follow the post below if you want to tune your Spark cluster:
How to tune spark executor number, cores and executor memory?

Max possible number of executors in cluster

Let's say I have 5 worker nodes in a cluster and each node has 48 cores and 256 GB RAM.
Then what is the maximum number of executors possible in the cluster?
Will the cluster have 5*48 = 240 executors, or only 5 executors?
Or are there other factors that decide the number of executors in a cluster? If so, what are they?
Thanks.
The number of executors is related to the amount of parallelism your application needs. You could create 5*48 = 240 executors with 1 core each, but there are other processes that should be considered, such as memory overhead, cluster-management processes, and the scheduler, so you may need to reserve 2-5 cores per node for management processes.
I don't know what architecture your cluster uses, but this article is a good start if you are using Hadoop: https://spoddutur.github.io/spark-notes/distribution_of_executors_cores_and_memory_for_spark_application.html
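To make the bounds concrete, here is a hedged sketch of the arithmetic for the 5-node, 48-core example (the 2-cores-per-node reservation follows the answer's suggestion; the numbers are illustrative, not a Spark API):

```python
def max_executors(nodes: int, cores_per_node: int,
                  reserved_per_node: int, cores_per_executor: int) -> int:
    """Upper bound on executors across the cluster, after reserving
    cores on each node for the OS and cluster-management daemons."""
    usable = cores_per_node - reserved_per_node
    return nodes * (usable // cores_per_executor)

# 5 nodes x 48 cores, reserving 2 cores per node:
print(max_executors(5, 48, 2, 1))  # at most 230 one-core executors
print(max_executors(5, 48, 2, 5))  # or 45 five-core executors
```

So the answer sits between the two extremes in the question: far more than 5, but a bit less than 240 once reservations are accounted for.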

Partitioning the RDD for Spark Jobs

When I submit a Spark job to a YARN cluster and look at the Spark UI, I see 4 stages, but the memory used is very low on all nodes: it says 0 out of 4 GB used. I guess that might be because I left the partitioning at its default.
File sizes range between 1 MB and 100 MB in S3. There are around 2700 files totaling 26 GB, and exactly 2700 tasks were running in stage 2.
Is it worth repartitioning to around 640 partitions; would that improve the performance? Or
does it not matter if the partitioning is more granular than actually required? Or
do my submit parameters need to be addressed?
Cluster details,
Cluster with 10 nodes
Overall memory 500 GB
Overall vCores 64
--executor-memory 16g
--num-executors 16
--executor-cores 1
Actually it runs on 17 cores out of 64. I don't want to increase the number of cores since others might use the cluster.
You partition, and repartition, for the following reasons:
To make sure we have enough work to distribute to the distinct cores in our cluster (nodes * cores_per_node). Obviously we need to tune the number of executors, cores per executor, and memory per executor to make that happen as intended.
To make sure we distribute work evenly: the smaller the partitions, the lower the chance that one core has much more work to do than all the other cores. A skewed distribution can have a huge effect on total elapsed time if the partitions are too big.
To keep partitions at manageable sizes: not too big, so we don't overtax the GC (and bigger partitions can also cause trouble where processing cost grows non-linearly with partition size), and not too small, since too-small partitions create too much per-task overhead.
As you might have noticed, there is a goldilocks zone. Testing will help you determine the ideal partition size.
Note that it is OK to have many more partitions than cores; partitions queuing to be assigned a task is something I design for.
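One common way to find that goldilocks zone is to size partitions toward a target byte count (128 MB is a frequently cited starting point, matching the default of spark.sql.files.maxPartitionBytes) and round up to a multiple of the total core count so every wave of tasks fills the cluster. A hedged sketch of that heuristic, not a Spark API:

```python
import math

def estimate_partitions(total_bytes: int, total_cores: int,
                        target_partition_bytes: int = 128 * 1024**2) -> int:
    """Rough partition count: enough partitions to keep each near the
    target size, rounded up to a multiple of the available cores."""
    by_size = math.ceil(total_bytes / target_partition_bytes)
    return max(total_cores, math.ceil(by_size / total_cores) * total_cores)

# ~26 GB of input on a job using 17 cores:
print(estimate_partitions(26 * 1024**3, 17))  # 221, far fewer than 2700 splits
```

For the question above, that lands in the low hundreds rather than 2700, which is consistent with the intuition that 2700 small tasks carry a lot of scheduling overhead.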
Also make sure you configure your spark job properly otherwise:
Make sure you do not have too many executors. One or very few executors per node is more than enough. Fewer executors have less overhead, as their tasks work in a shared memory space and are handled by threads instead of processes. There is a huge amount of overhead to starting up a process, while threads are pretty lightweight.
Tasks need to talk to each other. If they are in the same executor, they can do that in memory. If they are in different executors (processes), it happens over a socket (overhead). If that is across nodes, it happens over a traditional network connection (even more overhead).
Assign enough memory to your executors. When using YARN as the scheduler, it will by default place executors according to the memory they declare, not the CPU you declare to use.
I do not know what your situation is (you made the node names invisible), but if you only have a single node with 15 cores, then 16 single-core executors do not make sense. Instead, set it up with one executor and 16 cores per executor.
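Since YARN places executors by memory, remember that each container requests the executor memory plus spark.executor.memoryOverhead, which by default is max(384 MB, 10% of executor memory). A hedged sketch of that arithmetic (the default factor is from Spark's configuration docs; the function itself is illustrative):

```python
def yarn_container_mb(executor_memory_mb: int,
                      overhead_factor: float = 0.10,
                      min_overhead_mb: int = 384) -> int:
    """Memory YARN actually reserves for one executor container:
    executor memory plus the default memoryOverhead."""
    overhead = max(min_overhead_mb, int(executor_memory_mb * overhead_factor))
    return executor_memory_mb + overhead

print(yarn_container_mb(16 * 1024))  # a 16g executor asks YARN for ~17.6 GB
```

This overhead is a common reason executors fail to schedule even when the raw executor memory appears to fit on a node.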