Spark Streaming receiver is processing only one record - apache-spark

I have 16 receivers in a Spark Streaming 2.2.1 job. After a while, some of the receivers process fewer and fewer records, eventually only one record per second. The behaviour can be seen in the screenshot.
While I understand the root cause can be difficult to find and not obvious, is there a way I could debug this problem further? Currently I have no idea where to start digging. Could it be related to back-pressure?
Spark Streaming properties:
spark.app.id application_1599135282140_1222
spark.cores.max 64
spark.driver.cores 4
spark.driver.extraJavaOptions -XX:+PrintFlagsFinal -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/dump/ -Dlog4j.configuration=file:///tmp/4f892127ad794245aef295c97ccbc5c9/driver_log4j.properties
spark.driver.maxResultSize 3840m
spark.driver.memory 4g
spark.driver.port 36201
spark.dynamicAllocation.enabled false
spark.dynamicAllocation.maxExecutors 10000
spark.dynamicAllocation.minExecutors 1
spark.eventLog.enabled false
spark.executor.cores 4
spark.executor.extraJavaOptions -XX:+PrintFlagsFinal -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/dump/
spark.executor.id driver
spark.executor.instances 16
spark.executor.memory 4g
spark.jars file:/tmp/4f892127ad794245aef295c97ccbc5c9/main-e41d1cc.jar
spark.master yarn
spark.rpc.message.maxSize 512
spark.scheduler.maxRegisteredResourcesWaitingTime 300s
spark.scheduler.minRegisteredResourcesRatio 1.0
spark.scheduler.mode FAIR
spark.shuffle.service.enabled true
spark.sql.cbo.enabled true
spark.streaming.backpressure.enabled true
spark.streaming.backpressure.initialRate 25
spark.streaming.backpressure.pid.minRate 1
spark.streaming.concurrentJobs 1
spark.streaming.receiver.maxRate 100
spark.submit.deployMode client

It seems the problem starts manifesting after the job has been running for about 30 minutes, and I think back-pressure could be the reason: note that spark.streaming.backpressure.pid.minRate is set to 1, which would match the one-record-per-second floor you are seeing. According to this article:
With activated backpressure, the driver monitors the current batch scheduling delays and processing times and dynamically adjusts the maximum rate of the receivers. The communication of new rate limits can be verified in the receiver log:
2016-12-06 08:27:02,572 INFO org.apache.spark.streaming.receiver.ReceiverSupervisorImpl Received a new rate limit: 51.
Here is what I would recommend trying:
Check the receiver log to see if backpressure is triggered.
Check your stream sink to see if there is any error.
Check the YARN ResourceManager for resource utilization.
Tune the Spark back-pressure parameters to see if that makes a difference (see the sketch below).
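If back-pressure does turn out to be the culprit, one thing worth trying is raising the PID floor and the initial rate so the controller cannot throttle a receiver all the way down to one record per second. A minimal Scala sketch, assuming the job builds its own SparkConf (the property names come from your listing above; the new values 10 and 50 are only illustrative):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("receiver-backpressure-tuning")
  // keep back-pressure on, but do not let the PID controller drop below 10 records/s per receiver
  .set("spark.streaming.backpressure.enabled", "true")
  .set("spark.streaming.backpressure.pid.minRate", "10")   // was 1
  .set("spark.streaming.backpressure.initialRate", "50")   // was 25
  .set("spark.streaming.receiver.maxRate", "100")          // unchanged upper bound

// the 10-second batch interval is an assumption; use whatever your job already uses
val ssc = new StreamingContext(conf, Seconds(10))

You can confirm the new limits are actually applied by grepping the executor logs for the "Received a new rate limit" line quoted above.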

Related

How to properly set spark cluster properties in Databricks

I have a cluster in Databricks for my Spark workflow and I wanted some help in setting it up right for optimal use. Following are the details of my cluster.
RUNTIME: 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12)
DRIVER TYPE: c5a.8xlarge (64GB Memory, 32 Cores)
WORKER TYPE: c5a.4xlarge (32GB Memory, 16 Cores)
(Min worker 1, Max Workers 5)
This is going to process a large amount of data (I am not sure about the exact numbers). Here are the current properties that I am using; I think these are not optimal.
spark.driver.extraJavaOptions -Xss64M
spark.executor.cores 7
spark.executor.memory 20G
spark.driver.maxResultSize 20G
spark.sql.shuffle.partitions 35
spark.driver.memory 48G
spark.sql.execution.arrow.pyspark.enabled true
spark.sql.execution.arrow.pyspark.fallback.enabled true
spark.executor.memoryOverhead 1G
Is there a rule of thumb or a guide for how to set proper values to get the maximum performance?
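As a starting point (not an answer to the tuning question itself), here is a minimal Scala sketch, assuming a Databricks notebook with a live spark session, for printing the effective values of the properties listed above; this is useful for checking whether the cluster actually picked up the configuration you set:

// Keys mirror the properties quoted in the question.
val keys = Seq(
  "spark.driver.memory",
  "spark.driver.maxResultSize",
  "spark.executor.cores",
  "spark.executor.memory",
  "spark.executor.memoryOverhead",
  "spark.sql.shuffle.partitions",
  "spark.sql.execution.arrow.pyspark.enabled"
)

keys.foreach { k =>
  val value = spark.conf.getOption(k).getOrElse("<not set>") // getOption avoids an exception for unset keys
  println(f"$k%-45s $value")
}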

spark.executor.instances over spark.dynamicAllocation.enabled = True

I'm working on a Spark project using the MapR distribution, where dynamic allocation is enabled. Please refer to the parameters below:
spark.dynamicAllocation.enabled true
spark.shuffle.service.enabled true
spark.dynamicAllocation.minExecutors 0
spark.dynamicAllocation.maxExecutors 20
spark.executor.instances 2
As per my understanding, spark.executor.instances is what we define as --num-executors when submitting our PySpark job.
I have the following two questions:
If I use --num-executors 5 during job submission, will it override the spark.executor.instances 2 config setting?
What is the purpose of having spark.executor.instances defined when the dynamic allocation min and max executors are already defined?
There is one more parameter, spark.dynamicAllocation.initialExecutors, which by default takes the value of spark.dynamicAllocation.minExecutors. If spark.executor.instances is defined and is larger than minExecutors, then the initial number of executors will be that larger value.
spark.executor.instances is basically the property for static allocation. However, if dynamic allocation is enabled, the initial set of executors will be at least as large as spark.executor.instances.
The value in your config file won't get overwritten when you set --num-executors; the value passed at submission is the one that applies to that run.
Extra read: official doc
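A minimal sketch of how these settings interact, assuming the same values as in the question (the executor counts observed at runtime also depend on the task backlog, since dynamic allocation scales with pending tasks):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true")
  .set("spark.dynamicAllocation.minExecutors", "0")
  .set("spark.dynamicAllocation.maxExecutors", "20")
  .set("spark.executor.instances", "2") // equivalent to passing --num-executors 2

// With this configuration the application starts with roughly 2 executors
// (the larger of initialExecutors/minExecutors and executor.instances) and
// then scales between 0 and 20 executors depending on the number of pending tasks.

Passing --num-executors 5 at submission has the same effect as setting spark.executor.instances to 5 for that run, so the job would then start with about 5 executors instead of 2.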

spark - application returns different results based on different executor memory?

I am noticing some peculiar behaviour. I have a Spark job which reads the data, does some grouping, ordering and joins, and creates an output file.
The issue is when I run the same job on YARN with more memory than the environment has, e.g. the cluster has 50 GB and I run spark-submit with close to 60 GB of executor memory and 4 GB of driver memory.
My result set shrinks; it seems like one of the data partitions or tasks is lost during processing.
--driver-memory 4g --executor-memory 4g --num-executors 12
I also notice this warning message on the driver:
WARN util.Utils: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf.
But when I run with fewer executors and less memory, for example 15 GB, it works and I get the exact rows/data, with no warning message.
--driver-memory 2g --executor-memory 2g --num-executors 4
Any suggestions? Are we missing some settings on the cluster or anything?
Please note my job completes successfully in both cases.
I am using Spark version 2.2.
That warning is meaningless here (except maybe for debugging): the plan is larger when more executors are involved, and the warning just says it is too big to be converted into a string in full. If you need it, you can set spark.debug.maxToStringFields to a larger number (as suggested in the warning message).
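A minimal sketch of raising that limit at application start; the key name is taken from the warning itself, and the value 150 is only an illustrative choice:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val conf = new SparkConf()
  // allow more fields in the plan's string representation before it gets truncated
  .set("spark.debug.maxToStringFields", "150")

val spark = SparkSession.builder()
  .config(conf)
  .getOrCreate()

Note that this only affects the warning; it will not change the number of rows in the output, so the missing data needs to be investigated separately.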

Making Yarn dynamically allocate resources for Spark

I have a cluster managed with YARN that runs Spark jobs; the components were installed using Ambari (2.6.3.0-235). I have 6 hosts, each with 6 cores, and I use the Fair scheduler.
I want YARN to automatically add/remove executor cores, but no matter what I do it doesn't work.
Relevant Spark configuration (configured in Ambari):
spark.dynamicAllocation.schedulerBacklogTimeout 10s
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout 5s
spark.driver.memory 4G
spark.dynamicAllocation.enabled true
spark.dynamicAllocation.initialExecutors 6 (has no effect - starts with 2)
spark.dynamicAllocation.maxExecutors 10
spark.dynamicAllocation.minExecutors 1
spark.scheduler.mode FAIR
spark.shuffle.service.enabled true
SPARK_EXECUTOR_MEMORY="36G"
Relevant Yarn configuration (configured in Ambari):
yarn.nodemanager.aux-services mapreduce_shuffle,spark_shuffle,spark2_shuffle
YARN Java heap size 4096
yarn.resourcemanager.scheduler.class org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
yarn.scheduler.fair.preemption true
yarn.nodemanager.aux-services.spark2_shuffle.class org.apache.spark.network.yarn.YarnShuffleService
yarn.nodemanager.aux-services.spark2_shuffle.classpath {{stack_root}}/${hdp.version}/spark2/aux/*
yarn.nodemanager.aux-services.spark_shuffle.class org.apache.spark.network.yarn.YarnShuffleService
yarn.nodemanager.aux-services.spark_shuffle.classpath {{stack_root}}/${hdp.version}/spark/aux/*
Minimum Container Size (VCores) 0
Maximum Container Size (VCores) 12
Number of virtual cores 12
Also, I followed the Dynamic Resource Allocation guide and completed all the steps to configure the external shuffle service; I copied the yarn-shuffle jar:
cp /usr/hdp/2.6.3.0-235/spark/aux/spark-2.2.0.2.6.3.0-235-yarn-shuffle.jar /usr/hdp/2.6.3.0-235/hadoop-yarn/lib/
In the queue I see that only 3 cores are allocated to the application (the default number of executors is 2, so I guess it's the driver + 2), even though many tasks are pending.
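One way to narrow down whether the problem is on the Spark side or the YARN side is to watch the executor count from inside the running application while the task backlog builds up. A minimal sketch, assuming it is run on the driver (for example in spark-shell, where sc is the live SparkContext):

// Poll the number of registered executor entries a few times.
for (i <- 1 to 6) {
  val n = sc.statusTracker.getExecutorInfos.length
  // note: the returned list may also contain an entry for the driver
  println(s"check $i: $n executor entries registered")
  Thread.sleep(10000)
}

If the count never rises above the initial value while tasks stay pending, the next place to look is the ResourceManager and NodeManager logs for errors registering the spark2_shuffle auxiliary service.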

Spark on Kubernetes, pods crashing abruptly

Below is the scenario being tested,
Job :
A Spark SQL job written in Scala, run against 1 TB of TPC-DS benchmark data stored in Parquet/Snappy format, with Hive tables created on top of it.
Cluster manager :
Kubernetes
Spark sql configuration :
Set 1 :
spark.executor.heartbeatInterval 20s
spark.executor.cores 4
spark.driver.cores 4
spark.driver.memory 15g
spark.executor.memory 15g
spark.cores.max 220
spark.rpc.numRetries 5
spark.rpc.retry.wait 5
spark.network.timeout 1800
spark.sql.broadcastTimeout 1200
spark.sql.crossJoin.enabled true
spark.sql.starJoinOptimization true
spark.eventLog.enabled true
spark.eventLog.dir hdfs://namenodeHA/tmp/spark-history
spark.sql.codegen true
spark.kubernetes.allocation.batch.size 30
Set 2 :
spark.executor.heartbeatInterval 20s
spark.executor.cores 4
spark.driver.cores 4
spark.driver.memory 11g
spark.driver.memoryOverhead 4g
spark.executor.memory 11g
spark.executor.memoryOverhead 4g
spark.cores.max 220
spark.rpc.numRetries 5
spark.rpc.retry.wait 5
spark.network.timeout 1800
spark.sql.broadcastTimeout 1200
spark.sql.crossJoin.enabled true
spark.sql.starJoinOptimization true
spark.eventLog.enabled true
spark.eventLog.dir hdfs://namenodeHA/tmp/spark-history
spark.sql.codegen true
spark.kubernetes.allocation.batch.size 30
The Kryo serializer is being used, with spark.kryoserializer.buffer.mb set to 64 MB.
50 executors are spawned using the spark.executor.instances=50 submit argument.
Issues Observed:
The Spark SQL job terminates abruptly and the driver and executors are killed randomly.
The driver and executor pods get killed suddenly and the job fails.
A few different stack traces are seen across different runs:
Stack Trace 1:
"2018-05-10 06:31:28 ERROR ContextCleaner:91 - Error cleaning broadcast 136
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)"
File attached : StackTrace1.txt
Stack Trace 2:
"org.apache.spark.shuffle.FetchFailedException: Failed to connect to /192.178.1.105:38039^M
at org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:442)^M
at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:418)"
File attached : StackTrace2.txt
Stack Trace 3:
"18/05/10 11:21:17 WARN KubernetesTaskSetManager: Lost task 3.0 in stage 48.0 (TID 16486, 192.178.1.35, executor 41): FetchFailed(null, shuffleId=29, mapId=-1, reduceId=3, message=^M
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 29^M
at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$2.apply(MapOutputTracker.scala:697)^M
at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$2.apply(MapOutputTracker.scala:693)"
File attached : StackTrace3.txt
Stack Trace 4:
"ERROR KubernetesTaskSchedulerImpl: Lost executor 11 on 192.178.1.123: Executor lost for unknown reasons."
This repeats constantly until the executors are completely dead, without any stack traces.
Also, we see: 18/05/11 07:23:23 INFO DAGScheduler: failed: Set()
What does this mean? Is anything wrong, or does it just say that the failed set is empty, meaning there were no failures?
Observations or changes tried out :
- Monitored memory and CPU utilization across executors; none of them are hitting the limits.
As per a few readings and suggestions:
spark.network.timeout was increased from 600 to 1800, but it did not help.
Also, the driver and executor memory overhead was left at its default in Set 1 of the config, which is 0.1 * 15 g = 1.5 GB.
This value was then explicitly increased to 4 GB, and the driver and executor memory values reduced from 15 GB to 11 GB as per Set 2.
This did not yield any better results; the same failures are observed.
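For reference, a small sketch of the arithmetic behind the two configurations, assuming the usual Spark 2.x default overhead of max(0.10 * executor memory, 384 MB); the per-pod request is roughly executor memory plus overhead:

// Rough per-executor container request for the two configuration sets above.
val set1Memory   = 15.0                               // spark.executor.memory in GB (Set 1)
val set1Overhead = math.max(0.10 * set1Memory, 0.384) // default overhead, i.e. 1.5 GB here
val set1Request  = set1Memory + set1Overhead          // ~16.5 GB per executor pod

val set2Memory   = 11.0                               // spark.executor.memory in GB (Set 2)
val set2Overhead = 4.0                                // spark.executor.memoryOverhead (Set 2)
val set2Request  = set2Memory + set2Overhead          // ~15.0 GB per executor pod

println(f"Set 1: $set1Request%.1f GB per pod, Set 2: $set2Request%.1f GB per pod")

So Set 2 gives each pod a slightly smaller total request but a much larger overhead slice, which is a common first lever when executor pods are killed without an obvious heap OOM.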
Spark SQL is being used to run the queries; sample code lines:
val qresult = spark.sql(q)
qresult.show()
No manual repartitioning is being done in the code.
