How the SPARK_WORKER_CORES setting impacts concurrency in Spark Standalone

I am using a Spark 2.2.0 cluster configured in Standalone mode. The cluster has 2 octa-core machines, is used exclusively for Spark jobs, and no other process uses them. I run around 8 Spark Streaming apps on this cluster. I explicitly set SPARK_WORKER_CORES (in spark-env.sh) to 8 and allocate one core to each app using the total-executor-cores setting. This config reduces the capability to work on multiple tasks in parallel: if a stage works on a partitioned RDD with 200 partitions, only one task executes at a time.

What I wanted Spark to do was to start a separate thread for each job and process them in parallel, but I couldn't find a separate Spark setting to control the number of threads. So I decided to play around and bloated the number of cores (i.e. SPARK_WORKER_CORES in spark-env.sh) to 1000 on each machine, then gave 100 cores to each Spark application. I found that Spark started processing 100 partitions in parallel this time, indicating that 100 threads were being used.

I am not sure if this is the correct method of impacting the number of threads used by a Spark job.

You mixed up two things:
Cluster manager properties: SPARK_WORKER_CORES - the total number of cores that a worker can offer. Use it to control the fraction of resources that should be used by Spark in total.
Application properties: --total-executor-cores / spark.cores.max - the number of cores that the application requests from the cluster manager. Use it to control in-app parallelism.
Only the second one is directly responsible for app parallelism, as long as the first one is not the limiting factor.
Also, a core in Spark is a synonym for a thread. If you:
allocate one core to each app using total-executor-cores setting.
then you specifically assign a single data processing thread.
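For concreteness, here is a minimal sketch of how the two settings interact (the master URL and jar name below are placeholders):

    # spark-env.sh on each worker machine (8 matches the octa-core boxes in the question)
    SPARK_WORKER_CORES=8   # total cores this worker offers to applications

    # Submitting one of the streaming apps. --total-executor-cores is what caps
    # in-app parallelism: with 4 cores the app can run up to 4 tasks (threads) at once.
    spark-submit \
      --master spark://master-host:7077 \
      --total-executor-cores 4 \
      my-streaming-app.jar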

Related

Spark on Mesos - running multiple Streaming jobs

I have 2 Spark Streaming jobs that I want to run, while keeping some resources available for batch jobs and other operations.
I evaluated Spark Standalone cluster manager, but I realized that I would have to fix the resources for two jobs, which would leave almost no computing power to batch jobs.
I started evaluating Mesos, because it has a "fine grained" execution model, where resources are shifted between Spark applications.
1) Does it mean that a single core can be shifted between 2 streaming applications?
2) Although I have Spark & Cassandra, in order to exploit data locality, do I need to have a dedicated core on each of the slave machines to avoid shuffling?
3) Would you recommend running streaming jobs in "fine grained" or "coarse grained" mode? I know the logical answer is coarse grained (in order to minimize the latency of the streaming apps), but what about when resources in the total cluster are limited (a cluster of 3 nodes, 4 cores each, with 2 streaming applications to run and occasional batch jobs)?
4) In Mesos, when I run a Spark Streaming job in cluster mode, will it occupy 1 core permanently (like the Standalone cluster manager does), or will that core execute the driver process and sometimes act as an executor?
Thank you
Fine grained mode is actually deprecated now. Even with it, each core is allocated to a task until completion, but in Spark Streaming each processing interval is a new job, so tasks only last as long as the time it takes to process each interval's data. Hopefully that time is less than the interval time, or your processing will back up, eventually running out of memory to store all those RDDs waiting for processing.
Note also that you'll need to have one core dedicated to each stream's Reader. Each will be pinned for the life of the stream! You'll need extra cores in case the stream ingestion needs to be restarted; Spark will try to use a different core. Plus you'll have a core tied up by your driver, if it's also running on the cluster (as opposed to on your laptop or something).
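As a rough sketch of the core budget this implies for the cluster in question 3 (3 nodes x 4 cores = 12 cores; the Mesos URL and jar names are placeholders), capping each app with spark.cores.max:

    # Per receiver-based streaming app: 1 core pinned by the stream Reader
    # + at least 1 core for processing = 2 cores. Two capped apps use ~4 cores,
    # leaving ~8 for batch jobs (minus driver cores if the drivers also run on the cluster).
    spark-submit --master mesos://mesos-master:5050 --conf spark.cores.max=2 streaming-app-1.jar
    spark-submit --master mesos://mesos-master:5050 --conf spark.cores.max=2 streaming-app-2.jar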
Still, Mesos is a good choice, because it will allocate the tasks to nodes that have capacity to run them. Your cluster sounds pretty small for what you're trying to do, unless the data streams are small themselves.
If you use the Datastax connector for Spark, it will try to keep input partitions local to the Spark tasks. However, I believe that connector assumes it will manage Spark itself, using Standalone mode. So, before you adopt Mesos, check to see if that's really all you need.

Multiple spark streaming contexts on one worker

I have a single-node cluster with 2 CPUs, where I want to run 2 Spark Streaming jobs.
I also want to use the "cluster" submit mode. I am using the Standalone cluster manager.
When I submit one application, I see that the driver consumes 1 core, and the worker 1 core.
Does it mean that there are no cores available for other streaming job? Can 2 streaming jobs reuse executors?
This is totally confusing me, and I don't find it clearly explained in the documentation.
Srdjan
Does it mean that there are no cores available for other streaming job?
If you have a single worker with 2 CPUs and you're deploying in cluster mode, then you'll have no available cores, as the worker has to dedicate a core to the driver process running on your worker machine.
Can 2 streaming jobs reuse executors?
No, each job needs to allocate dedicated resources given by the cluster manager. If one job is running with all available resources, the next scheduled job will be in WAITING state until the first completes. You can see it in the Spark UI.
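To make the core arithmetic explicit, here is a sketch under the same assumptions (single 2-CPU worker, Standalone manager; the master URL and jar name are placeholders):

    # cluster deploy mode: 1 core for the driver + 1 core for an executor -> nothing left
    # client  deploy mode: the driver runs on the submitting machine -> both cores stay
    #                      available for executors
    spark-submit \
      --master spark://master-host:7077 \
      --deploy-mode client \
      streaming-app.jar
    # Even in client mode, one receiver-based streaming app typically needs 2 cores
    # (1 pinned by the receiver + 1 to process), so a second streaming job on this
    # box would still sit in the WAITING state.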

Spark streaming on Mesos - coarse grained

I have 2 cores on my vagrant development machine, and want to run 2 streaming applications.
If:
both of them take both available cores (I didn't specify "spark.cores.max")
they have streaming interval of 15 seconds
5 seconds is enough to perform computation
Is it the expected behaviour of Mesos to shift these 2 available cores between the 2 applications? I would expect that behaviour, because "Mesos locks the resources until the job is executed", and in Spark Streaming one job is what is executed within a batch interval.
Otherwise, if resources are locked for the life of the application (in Spark Streaming that is forever), what is the benefit of using Mesos instead of the Standalone cluster manager?
Spark Streaming locks each stream Reader to a core, plus you'll need at least one other core for the rest of the processing. So you can't run two jobs simultaneously on a 2-core machine.
Mesos gives you much better resource utilization in a cluster. Standalone is more static. It might be fine, though, for a fixed number of long-running streams, as long as you have enough resources and you use the recommendations for capping the allowed resources each job can grab (the default is to grab everything).
If you're really just running on a single machine, use local[*] to avoid the overhead of master and slave daemons, etc.
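A minimal sketch of that local-mode suggestion (the jar name is a placeholder):

    # One worker thread per local CPU core, with no master/worker daemons to run or feed.
    spark-submit --master 'local[*]' streaming-app.jar
    # local[N], e.g. local[2], pins the thread count explicitly if you want to leave
    # cores free for other processes.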

What are the benefits of running multiple Spark instances per node (physical machine)?

Is there any advantage to starting more than one spark instance (master or worker) on a particular machine/node?
The Spark standalone documentation doesn't explicitly say anything about starting a cluster or multiple workers on the same node, but it does seem to implicitly assume that one worker equals one node.
Their hardware provisioning page says:
Finally, note that the Java VM does not always behave well with more than 200 GB of RAM. If you purchase machines with more RAM than this, you can run multiple worker JVMs per node. In Spark’s standalone mode, you can set the number of workers per node with the SPARK_WORKER_INSTANCES variable in conf/spark-env.sh, and the number of cores per worker with SPARK_WORKER_CORES.
So aside from working with large amounts of memory or testing cluster configuration, is there any benefit to running more than one worker per node?
I think the obvious benefit is to improve the resource utilization of the hardware per box without losing performance. In terms of parallelism, one big executor with multiple cores seems to be the same as multiple executors with fewer cores.
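As a sketch of the spark-env.sh settings the documentation refers to (the 2 x 8-core / 64g split here is only an illustration; size it to your machines):

    # Run two worker JVMs on one large node instead of a single oversized one.
    SPARK_WORKER_INSTANCES=2
    SPARK_WORKER_CORES=8      # cores offered by each worker instance
    SPARK_WORKER_MEMORY=64g   # memory offered by each worker instance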

start multiple processor threads on Spark worker within one core

Our situation: we are using Spark Streaming with AWS Kinesis.
If we specify the Spark master to be in-memory mode, as "local[32]", then Spark can consume data from Kinesis fairly quickly.
But if we switch to a cluster with 1 master and 3 workers (on 4 separate machines), and set the master to be "spark://[IP]:[port]", then the Spark cluster consumes data at a very slow rate. This cluster has 3 worker machines, and each worker machine has 1 core.
I'm trying to speed up consumption, so I added more executors on each worker machine, but it does not help much since each executor needs at least 1 core (and my worker machines have only 1 core each). I also read that adding more Kinesis shards will help scale up, but I just want to maximize my read capacity.
Since the "in memory" mode can consume fast enough, is it possible to also start multiple "Kinesis record processor threads" on each worker machine, as shown in the picture below? Or start many threads to consume from Kinesis within 1 core?
Thank you very much.
(Picture from https://spark.apache.org/docs/1.2.0/streaming-kinesis-integration.html)
It turned out to be related to the resources of the cluster.
For AWS Kinesis, one Kinesis stream requires one receiver from the Spark cluster, and one receiver will acquire one core from the Spark workers.
I increased the cores of each worker to 4, and then the executors had extra cores to run jobs.
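A sketch of the spark-env.sh change described above, with the receiver accounting spelled out:

    # Each worker now offers 4 cores; the Kinesis receiver pins 1 of them,
    # leaving 3 cores on that worker for processing tasks.
    SPARK_WORKER_CORES=4
    # Rule of thumb for receiver-based streams: the cores available to the app must be
    # strictly greater than the number of receivers, otherwise no cores are left to
    # process the received data.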
