spark submit executor memory/failed batch

spark submit executor memory/failed batch - apache-spark

I have 2 questions on spark streaming :
I have a spark streaming application running and collection data in 20 seconds batch intervals, out of 4000 batches there are 18 batches which failed because of exception :
Could not compute split, block input-0-1464774108087 not found
I assumed the data size is bigger than spark available memory at that point, also the app StorageLevel is MEMORY_ONLY.
Please advice how to fix this.
Also in the command I use below, I use executor memory 20G(total RAM on the data nodes is 140G), does that mean all that memory is reserved in full for this app, and what happens if I have multiple spark streaming applications ?
would I not run out of memory after a few applications ? do I need that much memory at all ?
/usr/iop/4.1.0.0/spark/bin/spark-submit --master yarn --deploy-mode
client --jars /home/blah.jar --num-executors 8 --executor-cores
5 --executor-memory 20G --driver-memory 12G --driver-cores 8
--class com.ccc.nifi.MyProcessor Nifi-Spark-Streaming-20160524.jar

It seems might be your executor memory will be getting full,try these few optimization techniques like :
Instead of using StorageLevel is MEMORY_AND_DISK.
Use Kyro serialization which is fast and better than normal java serialization.f yougo for caching with memory and serialization.
Check if there are gc,you can find in the tasks being executed.

Related

spark - application returns different results based on different executor memory?

I am noticing some peculiar behaviour, i have spark job which reads the data and does some grouping ordering and join and creates an output file.
The issue is when I run the same job on yarn with memory more than what the environment has eg the cluster has 50 GB and i submit spark-submit with close to 60 GB executor and 4gb driver memory.
My results gets decreased seems like one of the data partitions or tasks are lost while processing.
driver-memory 4g --executor-memory 4g --num-executors 12
I also notice the warning message on driver -
WARN util.Utils: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf.
but when i run with limited executors and memory example 15GB, it works and i get exact rows/data. no warning message.
driver-memory 2g --executor-memory 2g --num-executors 4
any suggestions are we missing some settings on cluster or anything?
Please note my job completes successfully in both the cases.
I am using spark version 2.2.

This is meaningless (except maybe for debugging) - the plan is larger when there are more executors involved and the warning is that it is too big to be converted into a string. if you need it you can set spark.debug.maxToStringFields to a larger number (as suggested in the warning message)

Spark job fails when cluster size is large, succeeds when small

I have a spark job which takes in three inputs and does two outer joins. The data is in key-value format (String, Array[String]). Most important part of the code is:
val partitioner = new HashPartitioner(8000)
val joined = inputRdd1.fullOuterJoin(inputRdd2.fullOuterJoin(inputRdd3, partitioner), partitioner).cache
saveAsSequenceFile(joined, filter="X")
saveAsSequenceFile(joined, filter="Y")
I'm running the job on EMR with r3.4xlarge driver node and 500 m3.xlarge worker nodes. The spark-submit parameters are:
spark-submit --deploy-mode client --master yarn-client --executor-memory 3g --driver-memory 100g --executor-cores 3 --num-executors 4000 --conf spark.default.parallelism=8000 --conf spark.storage.memoryFraction=0.1 --conf spark.shuffle.memoryFraction=0.2 --conf spark.yarn.executor.memoryOverhead=4000 --conf spark.network.timeout=600s
UPDATE: with this setting, number of executors seen in spark jobs UI were 500 (one per node)
The exception I see in the driver log is the following:
17/10/13 21:37:57 WARN HeartbeatReceiver: Removing executor 470 with no recent heartbeats: 616136 ms exceeds timeout 600000 ms
17/10/13 21:39:04 ERROR ContextCleaner: Error cleaning broadcast 5
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [600 seconds]. This timeout is controlled by spark.network.timeout at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcEnv.scala:214)
...
Some of the things I tried that failed:
I thought the problem would be because of there are too many executors being spawned and driver has an overhead of tracking these executors. I tried reducing the number of executors by increasing the executor-memory to 4g. This did not help.
I tried changing the instance type of driver to r3.8xlarge, this did not help either.
Surprisingly, when I reduce the number of worker nodes to 300, the job runs file. Does any one have any other hypothesis on why this would happen?

Well this is a little bit a problem to understand how is the allocation of Spark works.
According to your information, you have 500 nodes with 4 cores each. So, you have 4000 cores. What you are doing with your request is creating 4000 executors with 3 cores each. It means that you are requesting 12000 cores for your cluster and there is no thing like that.
This error of RPC timeout is regularly associated with how many jvms you started in the same machine, and that machine is not able to respond in proper time due to much thing happens at the same time.
You need to know that, --num-executors is better been associated to you nodes, and the number of cores should be associated to the cores you have in each node.
For example, the configuration of m3.xLarge is 4 cores with 15 Gb of RAM. What is the best configuration to run a job there? That depends what you are planning to do. See if you are going to run just one job I suggest you to set up like this:
spark-submit --deploy-mode client --master yarn-client --executor-memory 10g --executor-cores 4 --num-executors 500 --conf spark.default.parallelism=2000 --conf spark.yarn.executor.memoryOverhead=4000
This will allow you job to run fine, if you don't have problem to fit your data to your worker is better change the default.parallelism to 2000 or you are going to lost lot of time with shuffle.
But, the best approach I think that you can do is keeping the dynamic allocation that EMR enables it by default, just set the number of cores and the parallelism and the memory and you job will run like a charm.

I experimented with lot of configurations modifying one parameter at a time with 500 nodes. I finally got the job to work by lowering the number of partitions in the HashPartitioner from 8000 to 3000.
val partitioner = new HashPartitioner(3000)
So probably the driver is overwhelmed with a the large number of shuffles that has to be done when there are more partitions and hence the lower partition helps.

Why Spark utilizing only one core per executor? How it decides to utilize cores other than number of partitions?

I am running spark in HPC environment on slurm using Spark standalone mode spark version 1.6.1. The problem is my slurm node is not fully used in the spark standalone mode. I am using spark-submit in my slurm script. There are 16 cores available on a node and I get all 16 cores per executor as I see on SPARK UI. But only one core per executor is actually utilized. top + 1 command on the worker node, where executor process is running, shows that only one cpu is being used out of 16 cpus. I have 255 partitions, so partitions does not seems a problem here.
$SPARK_HOME/bin/spark-submit \
--class se.uu.farmbio.vs.examples.DockerWithML \
--master spark://$MASTER:7077 \
--executor-memory 120G \
--driver-memory 10G \
When I change script to
$SPARK_HOME/bin/spark-submit \
--class se.uu.farmbio.vs.examples.DockerWithML \
--master local[*] \
--executor-memory 120G \
--driver-memory 10G \
I see 0 cores allocated to executor on Spark UI which is understandable because we are no more using spark standalone cluster mode. But now all the cores are utilized when I check top + 1 command on worker node which hints that problem is not with the application code but with the utilization of resources by spark standalone mode.
So how spark decides to use one core per executor when it has 16 cores and also have enough partitions? What can I change so it can utilize all cores?
I am using spark-on-slurm for launching the jobs.
Spark configurations in both cases are as fallows:
--master spark://MASTER:7077
(spark.app.name,DockerWithML)
(spark.jars,file:/proj/b2015245/bin/spark-vs/vs.examples/target/vs.examples-0.0.1-jar-with-dependencies.jar)
(spark.app.id,app-20170427153813-0000)
(spark.executor.memory,120G)
(spark.executor.id,driver)
(spark.driver.memory,10G)
(spark.history.fs.logDirectory,/proj/b2015245/nobackup/eventLogging/)
(spark.externalBlockStore.folderName,spark-75831ca4-1a8b-4364-839e-b035dcf1428d)
(spark.driver.maxResultSize,2g)
(spark.executorEnv.OE_LICENSE,/scratch/10230979/SureChEMBL/oe_license.txt)
(spark.driver.port,34379)
(spark.submit.deployMode,client)
(spark.driver.host,x.x.x.124)
(spark.master,spark://m124.uppmax.uu.se:7077)
--master local[*]
(spark.app.name,DockerWithML)
(spark.app.id,local-1493296508581)
(spark.externalBlockStore.folderName,spark-4098cf14-abad-4453-89cd-3ce3603872f8)
(spark.jars,file:/proj/b2015245/bin/spark-vs/vs.examples/target/vs.examples-0.0.1-jar-with-dependencies.jar)
(spark.driver.maxResultSize,2g)
(spark.master,local[*])
(spark.executor.id,driver)
(spark.submit.deployMode,client)
(spark.driver.memory,10G)
(spark.driver.host,x.x.x.124)
(spark.history.fs.logDirectory,/proj/b2015245/nobackup/eventLogging/)
(spark.executorEnv.OE_LICENSE,/scratch/10230648/SureChEMBL/oe_license.txt)
(spark.driver.port,36008)
Thanks,

The problem is that you only have one worker node. In spark standalone mode, one executor is being launched per worker instances. To launch multiple logical worker instances in order to launch multiple executors within a physical worker, you need to configure this property:
SPARK_WORKER_INSTANCES
By default, it is set to 1. You can increase it accordingly based on the computation you are doing in your code to utilize the amount of resources you have.
You want your job to be distributed among executors to utilize the resources properly,but what's happening is only one executor is getting launched which can't utilize the number of core and the amount of memory you have. So, you are not getting the flavor of spark distributed computation.
You can set SPARK_WORKER_INSTANCES = 5
And allocate 2 cores per executor; so, 10 cores would be utilized properly.
Like this, you tune the configuration to get optimum performance.

Try setting spark.executor.cores (default value is 1)
According to the Spark documentation :
the number of cores to use on each executor. For YARN and standalone mode only. In standalone mode, setting this parameter allows an application to run multiple executors on the same worker, provided that there are enough cores on that worker. Otherwise, only one executor per application will run on each worker.
See https://spark.apache.org/docs/latest/configuration.html

In spark cluster mode you should use command --num-executor "numb_tot_cores*num_of_nodes". For example if you have 3 nodes with 8 cores per node you should write --num-executors 24

Spark tuning job

I have a problem with tuning Spark jobs executing on Yarn cluster. I'm having a feeling that I'm not getting most of my cluster and additionally, my jobs fail (executors get removed all the time).
I have the following setup:
4 machines
each machine has 10GB of RAM
each machine has 8 cores
8GBs of RAM are allocated for yarn jobs
14 (of 16) virtual cores are allocated for yarn jobs
I have run my spark job (actually connected to a jupyter notebook) using different setups, e.g.
pyspark --master yarn --num-executors 7 --executor-cores 4 --executor-memory 3G
pyspark --master yarn --num-executors 7 --executor-cores 7 --executor-memory 2G
pyspark --master yarn --num-executors 11 --executor-cores 4 --executor-memory 1G
I've tried different combinations and none of them seems to be working as my executors get destroyed. Additionally, I've read somewhere that it is a good way to increase spark.yarn.executor.memoryOverhead to 600MB as a way not to loose executors (and I did that), but seems that doesn't help. How should I setup my job?
Additionally, it confuses me that when I look at the ResourceManager UI it says for my job vcores used 8 vcores total 56. It seems that I'm using a single core per executor, but I don't understand why?
One more thing, when I setup my job, how many partitions should I specify when I'm reading data from HDFS to get maximal performance?

Donald Knuth said premature optimisation is the root of all evil. I am sure faster running program which fails is on no use. Start by giving all the memory to one executor. Say 7GB/8GB and just 1 core. This is a complete wastage of cores, but if it works, it proves your application can possibly run on this hardware. If even this doesn't work, you should try getting bigger machines. Assuming it works, try increasing the number of cores, until it still works.
The gist of the argument is: your application requires certain memory per task. But the number of tasks running per executor is dependent on number of cores. First find the worst case memory per cores for you application and then you can set executor memory and cores to some multiple of this number.

Using all resources in Apache Spark with Yarn

I am using Apache Spark with Yarn client.
I have 4 worker PCs with 8 vcpus each and 30 GB of ram in my spark cluster.
Im set my executor memory to 2G and number of instances to 33.
My job is taking 10 hours to run and all machines are about 80% idle.
I dont understand the correlation between executor memory and executor instances. Should I have an instance per Vcpu? Should I set the executor memory to be memory of machine/#executors per machine?

I believe that you have to use the following command:
spark-submit --num-executors 4 --executor-memory 7G --driver-memory 2G --executor-cores 8 --class \"YourClassName\" --master yarn-client
Number of executors should be 4, since you have 4 workers. The executor memory should be close to the maximum memory that each yarn node has allocated, roughly ~5-6GB (I assume you have 30GB total RAM).
You should take a look on the spark-submit parameters and fully understand them.

We were using cassandra as our data source for spark. The problem was there were not enough partitions. We needed to split up the data more. Our mapping for # of cassandra partitions to spark partitions was not small enough and we would only generate 10 or 20 tasks instead of 100s of tasks.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string