Can the driver program and the cluster manager (resource manager) be on the same machine in Spark standalone? - apache-spark

I'm running spark-submit from the same machine as the Spark master, using the following command:
./bin/spark-submit --master spark://ip:port --deploy-mode "client" test.py
My application runs forever with the following kind of output:
22/11/18 13:17:37 INFO BlockManagerMaster: Removal of executor 8 requested
22/11/18 13:17:37 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 8
22/11/18 13:17:37 INFO StandaloneSchedulerBackend: Granted executor ID app-20221118131723-0008/10 on hostPort 192.168.210.94:37443 with 2 core(s), 1024.0 MiB RAM
22/11/18 13:17:37 INFO BlockManagerMasterEndpoint: Trying to remove executor 8 from BlockManagerMaster.
22/11/18 13:17:37 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20221118131723-0008/10 is now RUNNING
22/11/18 13:17:38 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20221118131723-0008/9 is now EXITED (Command exited with code 1)
22/11/18 13:17:38 INFO StandaloneSchedulerBackend: Executor app-20221118131723-0008/9 removed: Command exited with code 1
22/11/18 13:17:38 INFO BlockManagerMaster: Removal of executor 9 requested
22/11/18 13:17:38 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 9
22/11/18 13:17:38 INFO BlockManagerMasterEndpoint: Trying to remove executor 9 from BlockManagerMaster.
22/11/18 13:17:38 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20221118131723-0008/11 on worker-20221118111836-192.168.210.82-46395 (192.168.210.82:4639
But when I submit from other nodes, my application runs successfully. What could be the reason?
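In client mode the driver and the standalone master can share a machine; the log pattern (executors repeatedly exiting with code 1 and being replaced) typically appears when the driver advertises an address the workers cannot route back to. A minimal sketch of pinning the driver address explicitly, assuming that is the cause here; the placeholder must be an IP of the submitting machine that is reachable from the workers:
./bin/spark-submit \
  --master spark://ip:port \
  --deploy-mode client \
  --conf spark.driver.host=<ip-reachable-from-workers> \
  test.py
The executor stderr files under each worker's work/ directory usually show the exact reason behind "Command exited with code 1".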

Related

Spark: Jobs are not assigned

I've deployed a Spark cluster on my Kubernetes. Here is the web UI:
I'm trying to submit the SparkPi example using:
$ ./spark-submit \
--class org.apache.spark.examples.SparkPi \
--master spark://spark-cluster-ra-iot-dev.si-origin-cluster.t-systems.es:32316 \
--num-executors 1 \
--driver-memory 512m \
--executor-memory 512m \
--executor-cores 1 \
../examples/jars/spark-examples_2.11-2.4.5.jar 10
The job reaches the Spark cluster:
Nevertheless, I'm getting messages like:
WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
It seems like the SparkPi application is scheduled but never executed...
Here is the complete log:
./spark-submit --class org.apache.spark.examples.SparkPi --master spark://spark-cluster-ra-iot-dev.si-origin-cluster.t-systems.es:32316 --num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 1 ../examples/jars/spark-examples_2.11-2.4.5.jar 10
20/06/09 10:52:57 WARN Utils: Your hostname, psgd resolves to a loopback address: 127.0.1.1; using 10.0.2.15 instead (on interface enp0s3)
20/06/09 10:52:57 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
20/06/09 10:52:57 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/06/09 10:52:58 INFO SparkContext: Running Spark version 2.4.5
20/06/09 10:52:58 INFO SparkContext: Submitted application: Spark Pi
20/06/09 10:52:58 INFO SecurityManager: Changing view acls to: jeusdi
20/06/09 10:52:58 INFO SecurityManager: Changing modify acls to: jeusdi
20/06/09 10:52:58 INFO SecurityManager: Changing view acls groups to:
20/06/09 10:52:58 INFO SecurityManager: Changing modify acls groups to:
20/06/09 10:52:58 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jeusdi); groups with view permissions: Set(); users with modify permissions: Set(jeusdi); groups with modify permissions: Set()
20/06/09 10:52:59 INFO Utils: Successfully started service 'sparkDriver' on port 42943.
20/06/09 10:52:59 INFO SparkEnv: Registering MapOutputTracker
20/06/09 10:52:59 INFO SparkEnv: Registering BlockManagerMaster
20/06/09 10:52:59 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
20/06/09 10:52:59 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
20/06/09 10:52:59 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-b6c54054-c94b-42c7-b85f-a4e30be4b659
20/06/09 10:52:59 INFO MemoryStore: MemoryStore started with capacity 117.0 MB
20/06/09 10:52:59 INFO SparkEnv: Registering OutputCommitCoordinator
20/06/09 10:52:59 INFO Utils: Successfully started service 'SparkUI' on port 4040.
20/06/09 10:53:00 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.2.15:4040
20/06/09 10:53:00 INFO SparkContext: Added JAR file:/home/jeusdi/projects/workarea/valladolid/spark-2.4.5-bin-hadoop2.7/bin/../examples/jars/spark-examples_2.11-2.4.5.jar at spark://10.0.2.15:42943/jars/spark-examples_2.11-2.4.5.jar with timestamp 1591692780146
20/06/09 10:53:00 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://spark-cluster-ra-iot-dev.si-origin-cluster.t-systems.es:32316...
20/06/09 10:53:00 INFO TransportClientFactory: Successfully created connection to spark-cluster-ra-iot-dev.si-origin-cluster.t-systems.es/10.49.160.69:32316 after 152 ms (0 ms spent in bootstraps)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20200609085300-0002
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/0 on worker-20200609084543-10.129.3.127-45867 (10.129.3.127:45867) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/0 on hostPort 10.129.3.127:45867 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/1 on worker-20200609084543-10.129.3.127-45867 (10.129.3.127:45867) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/1 on hostPort 10.129.3.127:45867 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/2 on worker-20200609084543-10.129.3.127-45867 (10.129.3.127:45867) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/2 on hostPort 10.129.3.127:45867 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/3 on worker-20200609084543-10.129.3.127-45867 (10.129.3.127:45867) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/3 on hostPort 10.129.3.127:45867 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/4 on worker-20200609084543-10.129.3.127-45867 (10.129.3.127:45867) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/4 on hostPort 10.129.3.127:45867 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 33755.
20/06/09 10:53:01 INFO NettyBlockTransferService: Server created on 10.0.2.15:33755
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/5 on worker-20200609084509-10.128.3.197-41600 (10.128.3.197:41600) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/5 on hostPort 10.128.3.197:41600 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/6 on worker-20200609084509-10.128.3.197-41600 (10.128.3.197:41600) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/6 on hostPort 10.128.3.197:41600 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/7 on worker-20200609084509-10.128.3.197-41600 (10.128.3.197:41600) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/7 on hostPort 10.128.3.197:41600 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/8 on worker-20200609084509-10.128.3.197-41600 (10.128.3.197:41600) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/8 on hostPort 10.128.3.197:41600 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/9 on worker-20200609084509-10.128.3.197-41600 (10.128.3.197:41600) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/9 on hostPort 10.128.3.197:41600 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/10 on worker-20200609084426-10.131.1.27-46041 (10.131.1.27:46041) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/10 on hostPort 10.131.1.27:46041 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/11 on worker-20200609084426-10.131.1.27-46041 (10.131.1.27:46041) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/11 on hostPort 10.131.1.27:46041 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/12 on worker-20200609084426-10.131.1.27-46041 (10.131.1.27:46041) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/12 on hostPort 10.131.1.27:46041 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/13 on worker-20200609084426-10.131.1.27-46041 (10.131.1.27:46041) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/13 on hostPort 10.131.1.27:46041 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/14 on worker-20200609084426-10.131.1.27-46041 (10.131.1.27:46041) with 1 core(s)
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/14 on hostPort 10.131.1.27:46041 with 1 core(s), 512.0 MB RAM
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/5 is now RUNNING
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/6 is now RUNNING
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/7 is now RUNNING
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/8 is now RUNNING
20/06/09 10:53:01 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.0.2.15, 33755, None)
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/9 is now RUNNING
20/06/09 10:53:01 INFO BlockManagerMasterEndpoint: Registering block manager 10.0.2.15:33755 with 117.0 MB RAM, BlockManagerId(driver, 10.0.2.15, 33755, None)
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/10 is now RUNNING
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/11 is now RUNNING
20/06/09 10:53:01 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.0.2.15, 33755, None)
20/06/09 10:53:01 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.0.2.15, 33755, None)
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/12 is now RUNNING
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/13 is now RUNNING
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/14 is now RUNNING
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/0 is now RUNNING
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/1 is now RUNNING
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/2 is now RUNNING
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/3 is now RUNNING
20/06/09 10:53:01 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/4 is now RUNNING
20/06/09 10:53:01 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
20/06/09 10:53:02 INFO SparkContext: Starting job: reduce at SparkPi.scala:38
20/06/09 10:53:02 INFO DAGScheduler: Got job 0 (reduce at SparkPi.scala:38) with 10 output partitions
20/06/09 10:53:02 INFO DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
20/06/09 10:53:02 INFO DAGScheduler: Parents of final stage: List()
20/06/09 10:53:02 INFO DAGScheduler: Missing parents: List()
20/06/09 10:53:02 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
20/06/09 10:53:03 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 2.0 KB, free 117.0 MB)
20/06/09 10:53:03 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1381.0 B, free 117.0 MB)
20/06/09 10:53:03 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.0.2.15:33755 (size: 1381.0 B, free: 117.0 MB)
20/06/09 10:53:03 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1163
20/06/09 10:53:03 INFO DAGScheduler: Submitting 10 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9))
20/06/09 10:53:03 INFO TaskSchedulerImpl: Adding task set 0.0 with 10 tasks
20/06/09 10:53:18 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
20/06/09 10:53:33 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
20/06/09 10:53:48 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
20/06/09 10:54:03 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
20/06/09 10:54:18 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
20/06/09 10:54:33 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
20/06/09 10:54:48 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/13 is now EXITED (Command exited with code 1)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Executor app-20200609085300-0002/13 removed: Command exited with code 1
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/15 on worker-20200609084426-10.131.1.27-46041 (10.131.1.27:46041) with 1 core(s)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/15 on hostPort 10.131.1.27:46041 with 1 core(s), 512.0 MB RAM
20/06/09 10:55:03 INFO BlockManagerMasterEndpoint: Trying to remove executor 13 from BlockManagerMaster.
20/06/09 10:55:03 INFO BlockManagerMaster: Removal of executor 13 requested
20/06/09 10:55:03 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 13
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/15 is now RUNNING
20/06/09 10:55:03 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/12 is now EXITED (Command exited with code 1)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Executor app-20200609085300-0002/12 removed: Command exited with code 1
20/06/09 10:55:03 INFO BlockManagerMaster: Removal of executor 12 requested
20/06/09 10:55:03 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 12
20/06/09 10:55:03 INFO BlockManagerMasterEndpoint: Trying to remove executor 12 from BlockManagerMaster.
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/16 on worker-20200609084426-10.131.1.27-46041 (10.131.1.27:46041) with 1 core(s)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/16 on hostPort 10.131.1.27:46041 with 1 core(s), 512.0 MB RAM
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/16 is now RUNNING
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/14 is now EXITED (Command exited with code 1)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Executor app-20200609085300-0002/14 removed: Command exited with code 1
20/06/09 10:55:03 INFO BlockManagerMaster: Removal of executor 14 requested
20/06/09 10:55:03 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 14
20/06/09 10:55:03 INFO BlockManagerMasterEndpoint: Trying to remove executor 14 from BlockManagerMaster.
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/17 on worker-20200609084426-10.131.1.27-46041 (10.131.1.27:46041) with 1 core(s)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/17 on hostPort 10.131.1.27:46041 with 1 core(s), 512.0 MB RAM
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/17 is now RUNNING
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/10 is now EXITED (Command exited with code 1)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Executor app-20200609085300-0002/10 removed: Command exited with code 1
20/06/09 10:55:03 INFO BlockManagerMaster: Removal of executor 10 requested
20/06/09 10:55:03 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 10
20/06/09 10:55:03 INFO BlockManagerMasterEndpoint: Trying to remove executor 10 from BlockManagerMaster.
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/18 on worker-20200609084426-10.131.1.27-46041 (10.131.1.27:46041) with 1 core(s)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/18 on hostPort 10.131.1.27:46041 with 1 core(s), 512.0 MB RAM
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/18 is now RUNNING
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/8 is now EXITED (Command exited with code 1)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Executor app-20200609085300-0002/8 removed: Command exited with code 1
20/06/09 10:55:03 INFO BlockManagerMaster: Removal of executor 8 requested
20/06/09 10:55:03 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 8
20/06/09 10:55:03 INFO BlockManagerMasterEndpoint: Trying to remove executor 8 from BlockManagerMaster.
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/19 on worker-20200609084509-10.128.3.197-41600 (10.128.3.197:41600) with 1 core(s)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/19 on hostPort 10.128.3.197:41600 with 1 core(s), 512.0 MB RAM
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/11 is now EXITED (Command exited with code 1)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Executor app-20200609085300-0002/11 removed: Command exited with code 1
20/06/09 10:55:03 INFO BlockManagerMasterEndpoint: Trying to remove executor 11 from BlockManagerMaster.
20/06/09 10:55:03 INFO BlockManagerMaster: Removal of executor 11 requested
20/06/09 10:55:03 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 11
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/20 on worker-20200609084426-10.131.1.27-46041 (10.131.1.27:46041) with 1 core(s)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/20 on hostPort 10.131.1.27:46041 with 1 core(s), 512.0 MB RAM
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/19 is now RUNNING
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/20 is now RUNNING
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/7 is now EXITED (Command exited with code 1)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Executor app-20200609085300-0002/7 removed: Command exited with code 1
20/06/09 10:55:03 INFO BlockManagerMaster: Removal of executor 7 requested
20/06/09 10:55:03 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 7
20/06/09 10:55:03 INFO BlockManagerMasterEndpoint: Trying to remove executor 7 from BlockManagerMaster.
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/21 on worker-20200609084509-10.128.3.197-41600 (10.128.3.197:41600) with 1 core(s)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Granted executor ID app-20200609085300-0002/21 on hostPort 10.128.3.197:41600 with 1 core(s), 512.0 MB RAM
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/21 is now RUNNING
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200609085300-0002/9 is now EXITED (Command exited with code 1)
20/06/09 10:55:03 INFO StandaloneSchedulerBackend: Executor app-20200609085300-0002/9 removed: Command exited with code 1
20/06/09 10:55:03 INFO BlockManagerMaster: Removal of executor 9 requested
20/06/09 10:55:03 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 9
20/06/09 10:55:03 INFO BlockManagerMasterEndpoint: Trying to remove executor 9 from BlockManagerMaster.
20/06/09 10:55:03 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20200609085300-0002/22 on worker-20200609084509-10.128.3.197-41600 (10.128.3.197:41600) with 1 core(s)
...
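The log above points at a likely culprit: the driver bound to 10.0.2.15 (note the "Your hostname ... resolves to a loopback address" warning), while the workers sit at 10.129.x.x, 10.128.x.x and 10.131.x.x inside the Kubernetes cluster. If those workers cannot reach 10.0.2.15, every executor dies with "Command exited with code 1" while trying to register with the driver, and no resources are ever accepted. A hedged sketch of forcing a reachable driver address before submitting; the placeholder must be an address routable from the worker pods:
export SPARK_LOCAL_IP=<ip-reachable-from-the-cluster>
./spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://spark-cluster-ra-iot-dev.si-origin-cluster.t-systems.es:32316 \
  --conf spark.driver.host=<ip-reachable-from-the-cluster> \
  --driver-memory 512m \
  --executor-memory 512m \
  --executor-cores 1 \
  ../examples/jars/spark-examples_2.11-2.4.5.jar 10
Submitting in cluster deploy mode, so the driver runs next to the workers, may also sidestep the reachability problem.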

How to deploy Spark on KubeEdge?

I tried to use the k8s deployment mode to deploy Spark 2.4.3 on KubeEdge 1.1.0 but failed (Docker version 19.03.4, k8s version 1.16.1).
SPARK_DRIVER_BIND_ADDRESS=10.4.20.34
SPARK_IMAGE=spark:2.4.3
SPARK_MASTER="k8s://http://127.0.0.1:8080"
CMD=(
"$SPARK_HOME/bin/spark-submit"
--conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
--conf "spark.kubernetes.container.image=${SPARK_IMAGE}"
--conf "spark.executor.instances=1"
--conf "spark.kubernetes.executor.limit.cores=1"
--deploy-mode client
--master ${SPARK_MASTER}
--name spark-pi
--class org.apache.spark.examples.SparkPi
--driver-memory 1G
--executor-memory 1G
--num-executors 1
--executor-cores 1
file://${PWD}/spark-examples_2.11-2.4.3.jar
)
"${CMD[@]}"
Node status is normal.
kubectl get nodes
NAME STATUS ROLES AGE VERSION
edge-node-001 Ready edge 6d1h v1.15.3-kubeedge-v1.1.0-beta.0.178+c6a5aa738261e7-dirty
ubuntu-ms-7b89 Ready master 6d4h v1.16.1
But I got some errors:
19/11/17 21:45:12 INFO k8s.ExecutorPodsAllocator: Going to request 1 executors from Kubernetes.
19/11/17 21:45:12 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 46571.
19/11/17 21:45:12 INFO netty.NettyBlockTransferService: Server created on 10.4.20.34:46571
19/11/17 21:45:12 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
19/11/17 21:45:12 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.4.20.34, 46571, None)
19/11/17 21:45:12 INFO storage.BlockManagerMasterEndpoint: Registering block manager 10.4.20.34:46571 with 366.3 MB RAM, BlockManagerId(driver, 10.4.20.34, 46571, None)
19/11/17 21:45:12 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.4.20.34, 46571, None)
19/11/17 21:45:12 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.4.20.34, 46571, None)
19/11/17 21:45:12 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler#451882b2{/metrics/json,null,AVAILABLE,#Spark}
19/11/17 21:45:42 INFO k8s.KubernetesClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
19/11/17 21:45:42 INFO spark.SparkContext: Starting job: reduce at SparkPi.scala:38
19/11/17 21:45:42 INFO scheduler.DAGScheduler: Got job 0 (reduce at SparkPi.scala:38) with 2 output partitions
19/11/17 21:45:42 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
19/11/17 21:45:42 INFO scheduler.DAGScheduler: Parents of final stage: List()
19/11/17 21:45:42 INFO scheduler.DAGScheduler: Missing parents: List()
19/11/17 21:45:42 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
19/11/17 21:45:42 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1936.0 B, free 366.3 MB)
19/11/17 21:45:42 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1256.0 B, free 366.3 MB)
19/11/17 21:45:42 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.4.20.34:46571 (size: 1256.0 B, free: 366.3 MB)
19/11/17 21:45:42 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1161
19/11/17 21:45:42 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1))
19/11/17 21:45:42 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
19/11/17 21:45:57 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
19/11/17 21:46:12 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
19/11/17 21:46:27 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
19/11/17 21:46:42 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
19/11/17 21:46:57 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
19/11/17 21:47:12 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Is it possible to deploy Spark on KubeEdge in the Kubernetes deployment mode, or should I try the standalone deployment mode?
I'm so confused.
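As a hedged suggestion rather than a definitive fix: the two things worth verifying are whether the executor pods are actually created and started on the edge node, and whether those pods can reach the driver, since in client mode the executors must connect back to the address in spark.driver.bindAddress. The pod name below is an illustrative placeholder:
kubectl get pods -o wide
kubectl describe pod <spark-pi-exec-pod>
kubectl logs <spark-pi-exec-pod>
If the pods never leave Pending, or cannot open a connection to 10.4.20.34, the problem is the KubeEdge networking between the edge node and the machine running the driver rather than the Spark deployment mode itself; in client mode, advertising an explicitly reachable address with --conf "spark.driver.host=<reachable-address>" (and a fixed --conf "spark.driver.port=<port>") is the usual workaround.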

Spark executor lost when increasing the number of executor instances

My Hadoop cluster currently has 4 nodes and 45 cores running PySpark 2.4 through YARN. When I run spark-submit with one executor everything works fine, but if I change the number of executor instances to 3 or 4, the executors are killed by the driver and only one task keeps working.
I have changed the settings below in Cloudera Manager:
yarn.nodemanager.resource.memory-mb: 64 GB
yarn.nodemanager.resource.cpu-vcores: 45
And below is the log that I get:
19/03/21 11:28:48 INFO cluster.YarnScheduler: Adding task set 0.0 with 1 tasks
19/03/21 11:28:48 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, datanode1, executor 2, partition 0, PROCESS_LOCAL, 7701 bytes)
19/03/21 11:28:48 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on datanode1:42432 (size: 71.0 KB, free: 366.2 MB)
19/03/21 11:29:43 INFO spark.ExecutorAllocationManager: Request to remove executorIds: 1, 3
19/03/21 11:29:43 INFO cluster.YarnClientSchedulerBackend: Requesting to kill executor(s) 1, 3
19/03/21 11:29:43 INFO cluster.YarnClientSchedulerBackend: Actual list of executor(s) to be killed is 1, 3
19/03/21 11:29:43 INFO spark.ExecutorAllocationManager: Removing executor 1 because it has been idle for 60 seconds (new desired total will be 2)
19/03/21 11:29:43 INFO spark.ExecutorAllocationManager: Removing executor 3 because it has been idle for 60 seconds (new desired total will be 1)
19/03/21 11:29:45 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 3.
19/03/21 11:29:45 INFO scheduler.DAGScheduler: Executor lost: 3 (epoch 0)
19/03/21 11:29:45 INFO storage.BlockManagerMasterEndpoint: Trying to remove executor 3 from BlockManagerMaster.
19/03/21 11:29:45 INFO storage.BlockManagerMasterEndpoint: Removing block manager BlockManagerId(3, datanode2, 32853, None)
19/03/21 11:29:45 INFO storage.BlockManagerMaster: Removed 3 successfully in removeExecutor
19/03/21 11:29:45 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 1.
19/03/21 11:29:45 INFO scheduler.DAGScheduler: Executor lost: 1 (epoch 0)
19/03/21 11:29:45 INFO storage.BlockManagerMasterEndpoint: Trying to remove executor 1 from BlockManagerMaster.
19/03/21 11:29:45 INFO storage.BlockManagerMasterEndpoint: Removing block manager BlockManagerId(1, datanode3, 39466, None)
19/03/21 11:29:45 INFO storage.BlockManagerMaster: Removed 1 successfully in removeExecutor
19/03/21 11:29:45 INFO cluster.YarnScheduler: Executor 3 on datanode2 killed by driver.
19/03/21 11:29:45 INFO cluster.YarnScheduler: Executor 1 on datanode3 killed by driver.
19/03/21 11:29:45 INFO spark.ExecutorAllocationManager: Existing executor 3 has been removed (new total is 2)
19/03/21 11:29:45 INFO spark.ExecutorAllocationManager: Existing executor 1 has been removed (new total is 1)
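The removals in this log come from dynamic allocation, not from a crash: executors that stay idle for spark.dynamicAllocation.executorIdleTimeout (60 seconds by default) are released, which is exactly what the "Removing executor ... because it has been idle for 60 seconds" lines report. Since the task set has only 1 task, the extra executors never receive work and are reclaimed. A hedged sketch of the relevant knobs; values are illustrative:
spark-submit --conf spark.dynamicAllocation.enabled=false --num-executors 3 ...
or, keeping dynamic allocation,
spark-submit --conf spark.dynamicAllocation.minExecutors=3 \
  --conf spark.dynamicAllocation.executorIdleTimeout=300s ...
For several executors to stay busy, the job also needs more than one partition, otherwise there is only ever one task to run.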

Spark removing executors in DataProc clusters

During a Spark job execution I'm constantly facing the following issue: even though the DataProc cluster is idle, the Spark driver kills all the executors and I end up with only one executor.
18/02/26 08:47:05 INFO spark.ExecutorAllocationManager: Existing executor 35 has been removed (new total is 2)
18/02/26 08:50:40 INFO scheduler.TaskSetManager: Finished task 189.0 in stage 5.0 (TID 6002) in 569184 ms on dse-dev-dataproc-w-53.c.k-ddh-lle.internal (executor 57) (499/500)
18/02/26 08:51:40 INFO spark.ExecutorAllocationManager: Request to remove executorIds: 57
18/02/26 08:51:40 INFO cluster.YarnClientSchedulerBackend: Requesting to kill executor(s) 57
18/02/26 08:51:40 INFO cluster.YarnClientSchedulerBackend: Actual list of executor(s) to be killed is 57
18/02/26 08:51:40 INFO spark.ExecutorAllocationManager: Removing executor 57 because it has been idle for 60 seconds (new desired total will be 1)
18/02/26 08:51:42 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 57.
18/02/26 08:51:42 INFO scheduler.DAGScheduler: Executor lost: 57 (epoch 7)
18/02/26 08:51:42 INFO storage.BlockManagerMasterEndpoint: Trying to remove executor 57 from BlockManagerMaster.
18/02/26 08:51:42 INFO storage.BlockManagerMasterEndpoint: Removing block manager BlockManagerId(57, dse-dev-dataproc-w-53.c.k-ddh-lle.internal, 53072, None)
18/02/26 08:51:42 INFO storage.BlockManagerMaster: Removed 57 successfully in removeExecutor
18/02/26 08:51:42 INFO cluster.YarnScheduler: Executor 57 on dse-dev-dataproc-w-53.c.k-ddh-lle.internal killed by driver.
18/02/26 08:51:42 INFO spark.ExecutorAllocationManager: Existing executor 57 has been removed (new total is 1)
Once the above task is completed and the next stage starts, it looks for the shuffled data.
18/02/26 08:52:33 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 5 to 10.206.52.190:42676
18/02/26 08:52:33 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 7 to 10.206.52.157:45812
18/02/26 08:52:33 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 7 to 10.206.52.177:53612
18/02/26 08:52:33 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 5 to 10.206.52.166:41901
This will hang for some time and then the tasks fail with the following exception.
18/02/26 09:12:33 INFO BlockManagerMaster: Removal of executor 21 requested
18/02/26 09:12:33 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asked to remove non-existent executor 21
18/02/26 00:12:33 INFO BlockManagerMasterEndpoint: Trying to remove executor 21 from BlockManagerMaster.
18/02/26 09:12:33 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_1480517110174_0001_01_000049 on host: ip-10-138-114-125.ec2.internal. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_1480517110174_0001_01_000049
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
My Job executor settings:
spark.driver.maxResultSize 3840m
spark.driver.memory 16G
spark.dynamicAllocation.enabled true
spark.dynamicAllocation.maxExecutors 100
spark.dynamicAllocation.minExecutors 1
spark.executor.cores 6
spark.executor.id driver
spark.executor.instances 20
spark.executor.memory 30G
spark.hadoop.yarn.timeline-service.enabled False
spark.shuffle.service.enabled true
spark.sql.catalogImplementation hive
spark.sql.parquet.cacheMetadata false
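Given the settings above (dynamic allocation enabled, minExecutors 1), shrinking down to a single executor while nothing is scheduled is the configured behaviour rather than a fault. A hedged sketch of settings that keep executors alive between stages, in the same key/value style as above; values are illustrative:
spark.dynamicAllocation.minExecutors 10
spark.dynamicAllocation.executorIdleTimeout 600s
spark.dynamicAllocation.cachedExecutorIdleTimeout 3600s
Because spark.shuffle.service.enabled is already true, shuffle output survives executor removal, so the later "Container marked as failed ... Exit status: 1" errors point to a separate problem in those containers rather than to the removals themselves.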

Why would Spark executors be removed (with "ExecutorAllocationManager: Request to remove executorIds" in the logs)?

I'm trying to execute a Spark job on an AWS cluster of 6 c4.2xlarge nodes and I don't know why Spark is killing the executors...
Any help will be appreciated.
Here is the spark-submit command:
. /usr/bin/spark-submit --packages="com.databricks:spark-avro_2.11:3.2.0" --jars RedshiftJDBC42-1.2.1.1001.jar --deploy-mode client --master yarn --num-executors 12 --executor-cores 3 --executor-memory 7G --driver-memory 7g --py-files dependencies.zip iface_extractions.py 2016-10-01 > output.log
At this line it starts to remove executors:
17/05/25 14:42:50 INFO ExecutorAllocationManager: Request to remove executorIds: 5, 3
spark-submit output log:
Ivy Default Cache set to: /home/hadoop/.ivy2/cache
The jars for the packages stored in: /home/hadoop/.ivy2/jars
:: loading settings :: url = jar:file:/usr/lib/spark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.databricks#spark-avro_2.11 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found com.databricks#spark-avro_2.11;3.2.0 in central
found org.slf4j#slf4j-api;1.7.5 in central
found org.apache.avro#avro;1.7.6 in central
found org.codehaus.jackson#jackson-core-asl;1.9.13 in central
found org.codehaus.jackson#jackson-mapper-asl;1.9.13 in central
found com.thoughtworks.paranamer#paranamer;2.3 in central
found org.xerial.snappy#snappy-java;1.0.5 in central
found org.apache.commons#commons-compress;1.4.1 in central
found org.tukaani#xz;1.0 in central
:: resolution report :: resolve 284ms :: artifacts dl 8ms
:: modules in use:
com.databricks#spark-avro_2.11;3.2.0 from central in [default]
com.thoughtworks.paranamer#paranamer;2.3 from central in [default]
org.apache.avro#avro;1.7.6 from central in [default]
org.apache.commons#commons-compress;1.4.1 from central in [default]
org.codehaus.jackson#jackson-core-asl;1.9.13 from central in [default]
org.codehaus.jackson#jackson-mapper-asl;1.9.13 from central in [default]
org.slf4j#slf4j-api;1.7.5 from central in [default]
org.tukaani#xz;1.0 from central in [default]
org.xerial.snappy#snappy-java;1.0.5 from central in [default]
:: evicted modules:
org.slf4j#slf4j-api;1.6.4 by [org.slf4j#slf4j-api;1.7.5] in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 10 | 0 | 0 | 1 || 9 | 0 |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
confs: [default]
0 artifacts copied, 9 already retrieved (0kB/8ms)
17/05/25 14:41:37 INFO SparkContext: Running Spark version 2.1.0
17/05/25 14:41:38 INFO SecurityManager: Changing view acls to: hadoop
17/05/25 14:41:38 INFO SecurityManager: Changing modify acls to: hadoop
17/05/25 14:41:38 INFO SecurityManager: Changing view acls groups to:
17/05/25 14:41:38 INFO SecurityManager: Changing modify acls groups to:
17/05/25 14:41:38 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set()
17/05/25 14:41:38 INFO Utils: Successfully started service 'sparkDriver' on port 37132.
17/05/25 14:41:38 INFO SparkEnv: Registering MapOutputTracker
17/05/25 14:41:38 INFO SparkEnv: Registering BlockManagerMaster
17/05/25 14:41:38 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/05/25 14:41:38 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/05/25 14:41:38 INFO DiskBlockManager: Created local directory at /mnt/tmp/blockmgr-e368a261-c1a1-49e7-8533-8081896a45e4
17/05/25 14:41:38 INFO MemoryStore: MemoryStore started with capacity 4.0 GB
17/05/25 14:41:38 INFO SparkEnv: Registering OutputCommitCoordinator
17/05/25 14:41:39 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/05/25 14:41:39 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.185.53.161:4040
17/05/25 14:41:39 INFO Utils: Using initial executors = 12, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
17/05/25 14:41:39 INFO RMProxy: Connecting to ResourceManager at ip-10-185-53-161.eu-west-1.compute.internal/10.185.53.161:8032
17/05/25 14:41:39 INFO Client: Requesting a new application from cluster with 5 NodeManagers
17/05/25 14:41:40 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (11520 MB per container)
17/05/25 14:41:40 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
17/05/25 14:41:40 INFO Client: Setting up container launch context for our AM
17/05/25 14:41:40 INFO Client: Setting up the launch environment for our AM container
17/05/25 14:41:40 INFO Client: Preparing resources for our AM container
17/05/25 14:41:40 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
17/05/25 14:41:42 INFO Client: Uploading resource file:/mnt/tmp/spark-4f534fa1-c377-4113-9c86-96d5cdab4cb5/__spark_libs__6500399427935716229.zip -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/__spark_libs__6500399427935716229.zip
17/05/25 14:41:43 INFO Client: Uploading resource file:/home/hadoop/RedshiftJDBC42-1.2.1.1001.jar -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/RedshiftJDBC42-1.2.1.1001.jar
17/05/25 14:41:43 INFO Client: Uploading resource file:/home/hadoop/.ivy2/jars/com.databricks_spark-avro_2.11-3.2.0.jar -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/com.databricks_spark-avro_2.11-3.2.0.jar
17/05/25 14:41:43 INFO Client: Uploading resource file:/home/hadoop/.ivy2/jars/org.slf4j_slf4j-api-1.7.5.jar -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/org.slf4j_slf4j-api-1.7.5.jar
17/05/25 14:41:43 INFO Client: Uploading resource file:/home/hadoop/.ivy2/jars/org.apache.avro_avro-1.7.6.jar -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/org.apache.avro_avro-1.7.6.jar
17/05/25 14:41:43 INFO Client: Uploading resource file:/home/hadoop/.ivy2/jars/org.codehaus.jackson_jackson-core-asl-1.9.13.jar -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/org.codehaus.jackson_jackson-core-asl-1.9.13.jar
17/05/25 14:41:43 INFO Client: Uploading resource file:/home/hadoop/.ivy2/jars/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar
17/05/25 14:41:43 INFO Client: Uploading resource file:/home/hadoop/.ivy2/jars/com.thoughtworks.paranamer_paranamer-2.3.jar -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/com.thoughtworks.paranamer_paranamer-2.3.jar
17/05/25 14:41:43 INFO Client: Uploading resource file:/home/hadoop/.ivy2/jars/org.xerial.snappy_snappy-java-1.0.5.jar -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/org.xerial.snappy_snappy-java-1.0.5.jar
17/05/25 14:41:43 INFO Client: Uploading resource file:/home/hadoop/.ivy2/jars/org.apache.commons_commons-compress-1.4.1.jar -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/org.apache.commons_commons-compress-1.4.1.jar
17/05/25 14:41:43 INFO Client: Uploading resource file:/home/hadoop/.ivy2/jars/org.tukaani_xz-1.0.jar -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/org.tukaani_xz-1.0.jar
17/05/25 14:41:43 INFO Client: Uploading resource file:/etc/spark/conf/hive-site.xml -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/hive-site.xml
17/05/25 14:41:43 INFO Client: Uploading resource file:/usr/lib/spark/python/lib/pyspark.zip -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/pyspark.zip
17/05/25 14:41:43 INFO Client: Uploading resource file:/usr/lib/spark/python/lib/py4j-0.10.4-src.zip -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/py4j-0.10.4-src.zip
17/05/25 14:41:43 INFO Client: Uploading resource file:/home/hadoop/dependencies.zip -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/dependencies.zip
17/05/25 14:41:43 WARN Client: Same path resource file:/home/hadoop/.ivy2/jars/com.databricks_spark-avro_2.11-3.2.0.jar added multiple times to distributed cache.
17/05/25 14:41:43 WARN Client: Same path resource file:/home/hadoop/.ivy2/jars/org.slf4j_slf4j-api-1.7.5.jar added multiple times to distributed cache.
17/05/25 14:41:43 WARN Client: Same path resource file:/home/hadoop/.ivy2/jars/org.apache.avro_avro-1.7.6.jar added multiple times to distributed cache.
17/05/25 14:41:43 WARN Client: Same path resource file:/home/hadoop/.ivy2/jars/org.codehaus.jackson_jackson-core-asl-1.9.13.jar added multiple times to distributed cache.
17/05/25 14:41:43 WARN Client: Same path resource file:/home/hadoop/.ivy2/jars/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar added multiple times to distributed cache.
17/05/25 14:41:43 WARN Client: Same path resource file:/home/hadoop/.ivy2/jars/com.thoughtworks.paranamer_paranamer-2.3.jar added multiple times to distributed cache.
17/05/25 14:41:43 WARN Client: Same path resource file:/home/hadoop/.ivy2/jars/org.xerial.snappy_snappy-java-1.0.5.jar added multiple times to distributed cache.
17/05/25 14:41:43 WARN Client: Same path resource file:/home/hadoop/.ivy2/jars/org.apache.commons_commons-compress-1.4.1.jar added multiple times to distributed cache.
17/05/25 14:41:43 WARN Client: Same path resource file:/home/hadoop/.ivy2/jars/org.tukaani_xz-1.0.jar added multiple times to distributed cache.
17/05/25 14:41:43 INFO Client: Uploading resource file:/mnt/tmp/spark-4f534fa1-c377-4113-9c86-96d5cdab4cb5/__spark_conf__1516567354161750682.zip -> hdfs://ip-10-185-53-161.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1495720658394_0004/__spark_conf__.zip
17/05/25 14:41:43 INFO SecurityManager: Changing view acls to: hadoop
17/05/25 14:41:43 INFO SecurityManager: Changing modify acls to: hadoop
17/05/25 14:41:43 INFO SecurityManager: Changing view acls groups to:
17/05/25 14:41:43 INFO SecurityManager: Changing modify acls groups to:
17/05/25 14:41:43 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set()
17/05/25 14:41:43 INFO Client: Submitting application application_1495720658394_0004 to ResourceManager
17/05/25 14:41:43 INFO YarnClientImpl: Submitted application application_1495720658394_0004
17/05/25 14:41:43 INFO SchedulerExtensionServices: Starting Yarn extension services with app application_1495720658394_0004 and attemptId None
17/05/25 14:41:44 INFO Client: Application report for application_1495720658394_0004 (state: ACCEPTED)
17/05/25 14:41:44 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1495723303463
final status: UNDEFINED
tracking URL: http://ip-10-185-53-161.eu-west-1.compute.internal:20888/proxy/application_1495720658394_0004/
user: hadoop
17/05/25 14:41:45 INFO Client: Application report for application_1495720658394_0004 (state: ACCEPTED)
17/05/25 14:41:46 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
17/05/25 14:41:46 INFO Client: Application report for application_1495720658394_0004 (state: ACCEPTED)
17/05/25 14:41:46 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> ip-10-185-53-161.eu-west-1.compute.internal, PROXY_URI_BASES -> http://ip-10-185-53-161.eu-west-1.compute.internal:20888/proxy/application_1495720658394_0004), /proxy/application_1495720658394_0004
17/05/25 14:41:46 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
17/05/25 14:41:47 INFO Client: Application report for application_1495720658394_0004 (state: RUNNING)
17/05/25 14:41:47 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.185.52.31
ApplicationMaster RPC port: 0
queue: default
start time: 1495723303463
final status: UNDEFINED
tracking URL: http://ip-10-185-53-161.eu-west-1.compute.internal:20888/proxy/application_1495720658394_0004/
user: hadoop
17/05/25 14:41:47 INFO YarnClientSchedulerBackend: Application application_1495720658394_0004 has started running.
17/05/25 14:41:47 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 37860.
17/05/25 14:41:47 INFO NettyBlockTransferService: Server created on 10.185.53.161:37860
17/05/25 14:41:47 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/05/25 14:41:47 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.185.53.161, 37860, None)
17/05/25 14:41:47 INFO BlockManagerMasterEndpoint: Registering block manager 10.185.53.161:37860 with 4.0 GB RAM, BlockManagerId(driver, 10.185.53.161, 37860, None)
17/05/25 14:41:47 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.185.53.161, 37860, None)
17/05/25 14:41:47 INFO BlockManager: external shuffle service port = 7337
17/05/25 14:41:47 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.185.53.161, 37860, None)
17/05/25 14:41:47 INFO EventLoggingListener: Logging events to hdfs:///var/log/spark/apps/application_1495720658394_0004
17/05/25 14:41:47 INFO Utils: Using initial executors = 12, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
17/05/25 14:41:50 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (10.185.52.31:57406) with ID 5
17/05/25 14:41:50 INFO ExecutorAllocationManager: New executor 5 has registered (new total is 1)
17/05/25 14:41:50 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-185-52-31.eu-west-1.compute.internal:38781 with 4.0 GB RAM, BlockManagerId(5, ip-10-185-52-31.eu-west-1.compute.internal, 38781, None)
17/05/25 14:41:50 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (10.185.53.45:40096) with ID 3
17/05/25 14:41:50 INFO ExecutorAllocationManager: New executor 3 has registered (new total is 2)
17/05/25 14:41:50 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-185-53-45.eu-west-1.compute.internal:43702 with 4.0 GB RAM, BlockManagerId(3, ip-10-185-53-45.eu-west-1.compute.internal, 43702, None)
17/05/25 14:41:50 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (10.185.53.135:42390) with ID 2
17/05/25 14:41:50 INFO ExecutorAllocationManager: New executor 2 has registered (new total is 3)
17/05/25 14:41:50 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-185-53-135.eu-west-1.compute.internal:41552 with 4.0 GB RAM, BlockManagerId(2, ip-10-185-53-135.eu-west-1.compute.internal, 41552, None)
17/05/25 14:41:50 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (10.185.53.10:60612) with ID 1
17/05/25 14:41:50 INFO ExecutorAllocationManager: New executor 1 has registered (new total is 4)
17/05/25 14:41:50 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-185-53-10.eu-west-1.compute.internal:33391 with 4.0 GB RAM, BlockManagerId(1, ip-10-185-53-10.eu-west-1.compute.internal, 33391, None)
17/05/25 14:41:50 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (10.185.53.68:57424) with ID 4
17/05/25 14:41:50 INFO ExecutorAllocationManager: New executor 4 has registered (new total is 5)
17/05/25 14:41:50 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-185-53-68.eu-west-1.compute.internal:34222 with 4.0 GB RAM, BlockManagerId(4, ip-10-185-53-68.eu-west-1.compute.internal, 34222, None)
17/05/25 14:42:09 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
17/05/25 14:42:09 INFO SharedState: Warehouse path is 'hdfs:///user/spark/warehouse'.
17/05/25 14:42:10 WARN Utils: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf.
17/05/25 14:42:11 INFO CodeGenerator: Code generated in 170.416763 ms
17/05/25 14:42:11 INFO SparkContext: Starting job: collect at /home/hadoop/iface_extractions/select_fields.py:90
17/05/25 14:42:11 INFO DAGScheduler: Got job 0 (collect at /home/hadoop/iface_extractions/select_fields.py:90) with 1 output partitions
17/05/25 14:42:11 INFO DAGScheduler: Final stage: ResultStage 0 (collect at /home/hadoop/iface_extractions/select_fields.py:90)
17/05/25 14:42:11 INFO DAGScheduler: Parents of final stage: List()
17/05/25 14:42:11 INFO DAGScheduler: Missing parents: List()
17/05/25 14:42:11 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[2] at collect at /home/hadoop/iface_extractions/select_fields.py:90), which has no missing parents
17/05/25 14:42:11 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 7.5 KB, free 4.0 GB)
17/05/25 14:42:11 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 4.1 KB, free 4.0 GB)
17/05/25 14:42:11 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.185.53.161:37860 (size: 4.1 KB, free: 4.0 GB)
17/05/25 14:42:11 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:996
17/05/25 14:42:11 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[2] at collect at /home/hadoop/iface_extractions/select_fields.py:90)
17/05/25 14:42:11 INFO YarnScheduler: Adding task set 0.0 with 1 tasks
17/05/25 14:42:11 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, ip-10-185-53-135.eu-west-1.compute.internal, executor 2, partition 0, PROCESS_LOCAL, 5899 bytes)
17/05/25 14:42:11 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-10-185-53-135.eu-west-1.compute.internal:41552 (size: 4.1 KB, free: 4.0 GB)
17/05/25 14:42:12 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1101 ms on ip-10-185-53-135.eu-west-1.compute.internal (executor 2) (1/1)
17/05/25 14:42:12 INFO YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/05/25 14:42:12 INFO DAGScheduler: ResultStage 0 (collect at /home/hadoop/iface_extractions/select_fields.py:90) finished in 1.109 s
17/05/25 14:42:12 INFO DAGScheduler: Job 0 finished: collect at /home/hadoop/iface_extractions/select_fields.py:90, took 1.290037 s
17/05/25 14:42:12 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 10.185.53.161:37860 in memory (size: 4.1 KB, free: 4.0 GB)
17/05/25 14:42:12 INFO SparkContext: Starting job: collect at /home/hadoop/iface_extractions/select_fields.py:91
17/05/25 14:42:12 INFO BlockManagerInfo: Removed broadcast_0_piece0 on ip-10-185-53-135.eu-west-1.compute.internal:41552 in memory (size: 4.1 KB, free: 4.0 GB)
17/05/25 14:42:12 INFO DAGScheduler: Got job 1 (collect at /home/hadoop/iface_extractions/select_fields.py:91) with 1 output partitions
17/05/25 14:42:12 INFO DAGScheduler: Final stage: ResultStage 1 (collect at /home/hadoop/iface_extractions/select_fields.py:91)
17/05/25 14:42:12 INFO DAGScheduler: Parents of final stage: List()
17/05/25 14:42:12 INFO DAGScheduler: Missing parents: List()
17/05/25 14:42:12 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[5] at collect at /home/hadoop/iface_extractions/select_fields.py:91), which has no missing parents
17/05/25 14:42:12 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 7.5 KB, free 4.0 GB)
17/05/25 14:42:12 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 4.1 KB, free 4.0 GB)
17/05/25 14:42:12 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 10.185.53.161:37860 (size: 4.1 KB, free: 4.0 GB)
17/05/25 14:42:12 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:996
17/05/25 14:42:12 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[5] at collect at /home/hadoop/iface_extractions/select_fields.py:91)
17/05/25 14:42:12 INFO YarnScheduler: Adding task set 1.0 with 1 tasks
17/05/25 14:42:12 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, ip-10-185-53-68.eu-west-1.compute.internal, executor 4, partition 0, PROCESS_LOCAL, 5900 bytes)
17/05/25 14:42:13 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on ip-10-185-53-68.eu-west-1.compute.internal:34222 (size: 4.1 KB, free: 4.0 GB)
17/05/25 14:42:14 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 1047 ms on ip-10-185-53-68.eu-west-1.compute.internal (executor 4) (1/1)
17/05/25 14:42:14 INFO YarnScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
17/05/25 14:42:14 INFO DAGScheduler: ResultStage 1 (collect at /home/hadoop/iface_extractions/select_fields.py:91) finished in 1.047 s
17/05/25 14:42:14 INFO DAGScheduler: Job 1 finished: collect at /home/hadoop/iface_extractions/select_fields.py:91, took 1.054768 s
17/05/25 14:42:14 INFO CodeGenerator: Code generated in 13.109425 ms
17/05/25 14:42:14 INFO CodeGenerator: Code generated in 12.568665 ms
17/05/25 14:42:14 INFO CodeGenerator: Code generated in 11.257538 ms
17/05/25 14:42:14 INFO BlockManagerInfo: Removed broadcast_1_piece0 on 10.185.53.161:37860 in memory (size: 4.1 KB, free: 4.0 GB)
17/05/25 14:42:14 INFO BlockManagerInfo: Removed broadcast_1_piece0 on ip-10-185-53-68.eu-west-1.compute.internal:34222 in memory (size: 4.1 KB, free: 4.0 GB)
17/05/25 14:42:14 INFO CodeGenerator: Code generated in 11.563958 ms
17/05/25 14:42:14 INFO CodeGenerator: Code generated in 18.189301 ms
17/05/25 14:42:14 INFO CodeGenerator: Code generated in 13.490762 ms
17/05/25 14:42:14 INFO CodeGenerator: Code generated in 15.156166 ms
17/05/25 14:42:50 INFO ExecutorAllocationManager: Request to remove executorIds: 5, 3
17/05/25 14:42:50 INFO YarnClientSchedulerBackend: Requesting to kill executor(s) 5, 3
17/05/25 14:42:50 INFO YarnClientSchedulerBackend: Actual list of executor(s) to be killed is 5, 3
17/05/25 14:42:50 INFO ExecutorAllocationManager: Removing executor 5 because it has been idle for 60 seconds (new desired total will be 4)
17/05/25 14:42:50 INFO ExecutorAllocationManager: Removing executor 3 because it has been idle for 60 seconds (new desired total will be 3)
17/05/25 14:42:50 INFO ExecutorAllocationManager: Request to remove executorIds: 1
17/05/25 14:42:50 INFO YarnClientSchedulerBackend: Requesting to kill executor(s) 1
17/05/25 14:42:50 INFO YarnClientSchedulerBackend: Actual list of executor(s) to be killed is 1
17/05/25 14:42:50 INFO ExecutorAllocationManager: Removing executor 1 because it has been idle for 60 seconds (new desired total will be 2)
17/05/25 14:42:50 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 5.
17/05/25 14:42:50 INFO DAGScheduler: Executor lost: 5 (epoch 0)
17/05/25 14:42:50 INFO BlockManagerMasterEndpoint: Trying to remove executor 5 from BlockManagerMaster.
17/05/25 14:42:50 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(5, ip-10-185-52-31.eu-west-1.compute.internal, 38781, None)
17/05/25 14:42:50 INFO BlockManagerMaster: Removed 5 successfully in removeExecutor
17/05/25 14:42:50 INFO YarnScheduler: Executor 5 on ip-10-185-52-31.eu-west-1.compute.internal killed by driver.
17/05/25 14:42:50 INFO ExecutorAllocationManager: Existing executor 5 has been removed (new total is 4)
17/05/25 14:42:51 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 1.
17/05/25 14:42:51 INFO DAGScheduler: Executor lost: 1 (epoch 0)
17/05/25 14:42:51 INFO BlockManagerMasterEndpoint: Trying to remove executor 1 from BlockManagerMaster.
17/05/25 14:42:51 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(1, ip-10-185-53-10.eu-west-1.compute.internal, 33391, None)
17/05/25 14:42:51 INFO BlockManagerMaster: Removed 1 successfully in removeExecutor
17/05/25 14:42:51 INFO YarnScheduler: Executor 1 on ip-10-185-53-10.eu-west-1.compute.internal killed by driver.
17/05/25 14:42:51 INFO ExecutorAllocationManager: Existing executor 1 has been removed (new total is 3)
17/05/25 14:42:51 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 3.
17/05/25 14:42:51 INFO DAGScheduler: Executor lost: 3 (epoch 0)
17/05/25 14:42:51 INFO BlockManagerMasterEndpoint: Trying to remove executor 3 from BlockManagerMaster.
17/05/25 14:42:51 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(3, ip-10-185-53-45.eu-west-1.compute.internal, 43702, None)
17/05/25 14:42:51 INFO BlockManagerMaster: Removed 3 successfully in removeExecutor
17/05/25 14:42:51 INFO YarnScheduler: Executor 3 on ip-10-185-53-45.eu-west-1.compute.internal killed by driver.
17/05/25 14:42:51 INFO ExecutorAllocationManager: Existing executor 3 has been removed (new total is 2)
17/05/25 14:43:12 INFO ExecutorAllocationManager: Request to remove executorIds: 2
17/05/25 14:43:12 INFO YarnClientSchedulerBackend: Requesting to kill executor(s) 2
17/05/25 14:43:12 INFO YarnClientSchedulerBackend: Actual list of executor(s) to be killed is 2
17/05/25 14:43:12 INFO ExecutorAllocationManager: Removing executor 2 because it has been idle for 60 seconds (new desired total will be 1)
17/05/25 14:43:13 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 2.
17/05/25 14:43:13 INFO DAGScheduler: Executor lost: 2 (epoch 0)
17/05/25 14:43:13 INFO BlockManagerMasterEndpoint: Trying to remove executor 2 from BlockManagerMaster.
17/05/25 14:43:13 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(2, ip-10-185-53-135.eu-west-1.compute.internal, 41552, None)
17/05/25 14:43:13 INFO BlockManagerMaster: Removed 2 successfully in removeExecutor
17/05/25 14:43:13 INFO YarnScheduler: Executor 2 on ip-10-185-53-135.eu-west-1.compute.internal killed by driver.
17/05/25 14:43:13 INFO ExecutorAllocationManager: Existing executor 2 has been removed (new total is 1)
17/05/25 14:43:14 INFO ExecutorAllocationManager: Request to remove executorIds: 4
17/05/25 14:43:14 INFO YarnClientSchedulerBackend: Requesting to kill executor(s) 4
17/05/25 14:43:14 INFO YarnClientSchedulerBackend: Actual list of executor(s) to be killed is 4
17/05/25 14:43:14 INFO ExecutorAllocationManager: Removing executor 4 because it has been idle for 60 seconds (new desired total will be 0)
17/05/25 14:43:17 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 4.
17/05/25 14:43:17 INFO DAGScheduler: Executor lost: 4 (epoch 0)
17/05/25 14:43:17 INFO BlockManagerMasterEndpoint: Trying to remove executor 4 from BlockManagerMaster.
17/05/25 14:43:17 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(4, ip-10-185-53-68.eu-west-1.compute.internal, 34222, None)
17/05/25 14:43:17 INFO BlockManagerMaster: Removed 4 successfully in removeExecutor
17/05/25 14:43:17 INFO YarnScheduler: Executor 4 on ip-10-185-53-68.eu-west-1.compute.internal killed by driver.
17/05/25 14:43:17 INFO ExecutorAllocationManager: Existing executor 4 has been removed (new total is 0)
My guess is that you've got Dynamic Resource Allocation enabled in your Spark configuration.
Spark provides a mechanism to dynamically adjust the resources your application occupies based on the workload. This means that your application may give resources back to the cluster if they are no longer used and request them again later when there is demand. This feature is particularly useful if multiple applications share resources in your Spark cluster.
This feature is disabled by default and available on all coarse-grained cluster managers, i.e. standalone mode, YARN mode, and Mesos coarse-grained mode.
The relevant part is that the feature is disabled by default, which is why my guess is that it was explicitly enabled in your configuration.
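A supporting hint (my reading of the log, not something stated in the question): every removal above happens after exactly 60 seconds of idleness, which matches the documented default of spark.dynamicAllocation.executorIdleTimeout. A minimal Scala sketch of where that knob lives, with illustrative values:

import org.apache.spark.SparkConf

// Illustrative only: "60s" is the documented default of the idle timeout,
// i.e. exactly the window after which the executors in the log were removed.
val conf = new SparkConf()
  .setAppName("dynamic-allocation-idle-demo") // hypothetical app name
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.dynamicAllocation.executorIdleTimeout", "60s") // raise this to keep idle executors around longer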
From ExecutorAllocationManager:
An agent that dynamically allocates and removes executors based on the workload.
With that said, I'd open the web UI (the Environment tab lists all Spark properties) and check whether the spark.dynamicAllocation.enabled property is set to true, or read it programmatically as in the sketch below.
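A minimal sketch of that programmatic check, assuming a SparkSession-based driver (the session variable name is mine):

import org.apache.spark.sql.SparkSession

// Read the effective setting straight from the driver's SparkConf;
// it defaults to false when the property was never set.
val spark = SparkSession.builder().getOrCreate()
val dynamicAllocationOn =
  spark.sparkContext.getConf.getBoolean("spark.dynamicAllocation.enabled", defaultValue = false)
println(s"spark.dynamicAllocation.enabled = $dynamicAllocationOn")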
There are two requirements for using this feature (Dynamic Resource Allocation). First, your application must set spark.dynamicAllocation.enabled to true. Second, you must set up an external shuffle service on each worker node in the same cluster and set spark.shuffle.service.enabled to true in your application.
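As a hedged sketch of wiring up those two settings (app name and values are mine, and the external shuffle service itself still has to be running on every worker node); conversely, leaving spark.dynamicAllocation.enabled at false keeps the initial executors for the whole application:

import org.apache.spark.sql.SparkSession

// Both properties must be in place before the SparkContext starts,
// e.g. at session build time or via --conf on spark-submit.
val spark = SparkSession.builder()
  .appName("dynamic-allocation-enabled")           // hypothetical app name
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.shuffle.service.enabled", "true") // needs the external shuffle service on each worker
  .getOrCreate()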
This is the line that prints out the INFO message:
logInfo("Request to remove executorIds: " + executors.mkString(", "))
You can also kill executors yourself with SparkContext.killExecutors, which gives a Spark developer an explicit way to remove executors on demand.
killExecutors(executorIds: Seq[String]): Boolean
Request that the cluster manager kill the specified executors.
There are actually two such methods, killExecutors for a batch and killExecutor for a single executor, and they are very handy for demos because you can easily show how executors come and go; see the sketch below.
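A demo sketch under my own assumptions (the executor IDs are made up; take real ones from the Executors tab of the web UI):

import org.apache.spark.sql.SparkSession

// Ask the cluster manager to kill specific executors and watch the driver log
// the same kind of removal messages as in the output above.
val sc = SparkSession.builder().getOrCreate().sparkContext
val bulkAccepted: Boolean   = sc.killExecutors(Seq("2", "4")) // batch variant
val singleAccepted: Boolean = sc.killExecutor("5")            // single-executor variant
println(s"bulk kill accepted: $bulkAccepted, single kill accepted: $singleAccepted")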
