I wrote a shell script to invoke spark-submit in yarn-client mode. When I run the shell script using nohup, my Spark job is submitted to YARN. After some time, I use yarn application -kill <application_id> to kill the Spark application.
The YARN UI shows that the application has been killed. But when I execute pstree -p | grep spark (on the Linux machine), it still shows
-sh(23177)---spark-submit(23178)---java(23179)-+-{java}+
i.e. the java process is not killed. How do I make sure that the java process also gets killed when I kill the Spark application?
Thanks
Related
I want to programmatically kill an EMR streaming task. If I kill it from the EMR UI or the boto client, it disappears in EMR, but it is still active in the Hadoop cluster (see this article). Only if I go through the Hadoop resource manager and kill it from there is the job terminated.
How can I do the same programmatically?
You can SSH to the cluster and use yarn application -kill application_id, or use the YARN REST API to kill the application.
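For the REST route, the ResourceManager exposes a Cluster Application State API; a minimal sketch, assuming the ResourceManager is reachable on the default port 8088 and that resourcemanager-host and application_1234567890123_0001 are placeholders for your own values:

# Ask the ResourceManager to transition the application to the KILLED state.
curl -X PUT \
  -H "Content-Type: application/json" \
  -d '{"state": "KILLED"}' \
  http://resourcemanager-host:8088/ws/v1/cluster/apps/application_1234567890123_0001/state

On a Kerberized cluster the call also needs SPNEGO authentication (curl --negotiate -u :).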
Cannot kill a YARN application through the REST API
As @maxime-g said, the only way to kill a YARN application is to run the following command: yarn application -kill application_id.
But it is possible to run an EMR step which runs a script on the master node; that script should include this command and possibly take the application id as an argument.
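A minimal sketch of such a script; the name kill-yarn-app.sh and the argument handling are illustrative, not from the original post:

#!/bin/bash
# kill-yarn-app.sh <application_id>
# Intended to run on the EMR master node (e.g. as an EMR step) and kill
# the given YARN application.
set -euo pipefail

APP_ID="${1:?usage: kill-yarn-app.sh <application_id>}"

yarn application -kill "$APP_ID"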
I have a shell script which initializes a Spark streaming job in YARN cluster mode. I have scheduled the shell script through Autosys. Now when I kill the Autosys job, I would like to kill this Spark job running in cluster mode as well.
I have tried using yarn application -kill in the shell script on an error return code, but it does not get executed. However, I am able to kill this job from another shell window; the yarn application -kill command works perfectly there and kills the application.
Is there any workaround to kill the cluster-mode job on interruption (automatically) from the same shell?
In the error return code logic, run yarn application -kill <$appid> as an orphan process, so it survives the parent shell being terminated, as sketched below.
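A sketch of that idea, assuming the wrapper script knows the YARN application id (how you capture it, e.g. by parsing the submit log, depends on your setup):

#!/bin/bash
# Wrapper around the spark-submit call; if the wrapper is interrupted or
# killed (e.g. when the Autosys job is killed), it kills the YARN application.

APP_ID=""   # set this once the application id is known, e.g. parsed from the submit log

cleanup() {
  if [ -n "$APP_ID" ]; then
    # Detach the kill from this shell so it keeps running as an orphan
    # even while the shell itself is being torn down.
    nohup yarn application -kill "$APP_ID" >/dev/null 2>&1 &
    disown
  fi
}
trap cleanup INT TERM

# ... existing logic that runs spark-submit --master yarn --deploy-mode cluster ...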
Stopping a standalone Spark master fails with the following message:
$ ./sbin/stop-master.sh
no org.apache.spark.deploy.master.Master to stop
Why? There is one Spark Standalone master up and running.
The Spark master was started under a different user, so the PID file
/tmp/Spark-ec2-user-org.apache.spark.deploy.master.Master-1.pid
was not accessible. I had to log in as the user who actually started the standalone cluster manager master.
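stop-master.sh resolves the running master through that PID file (written under SPARK_PID_DIR, /tmp by default), so one workaround is to run the stop script as the owning user; a sketch, assuming the master was started by ec2-user and Spark lives at /path/to/spark (both placeholders):

# Stop the master as the user who started it, so the PID file in /tmp is readable.
sudo -u ec2-user /path/to/spark/sbin/stop-master.sh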
In my case, I was able to open the Master web UI page in the browser, where it clearly said that the Spark master was running on port 7077.
However, while trying to stop it using stop-all.sh, I was facing no org.apache.spark.deploy.master.Master to stop. So I tried a different method: finding which process was listening on port 7077 using the command below:
lsof -i :7077
The result was a java process with a PID of 112099.
I used the command below to kill that process:
kill 112099
After this, when I checked the web UI, it had stopped working. The Spark master was successfully killed.
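The two steps can also be collapsed into one line; a sketch, assuming lsof is installed and nothing other than the master is listening on port 7077:

# -t prints only the PID(s) listening on port 7077, which are passed straight to kill.
kill $(lsof -ti :7077)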
We are working on a Spark cluster where Spark jobs are getting submitted successfully even after the Spark "Master" process is killed.
Here are the complete details of what we are doing.
Process details:
jps
19560 NameNode
18369 QuorumPeerMain
22414 Jps
20168 ResourceManager
22235 Master
and we submitted one Spark job to this Master using a command like
spark-1.6.1-bin-without-hadoop/bin/spark-submit --class com.test.test --master yarn-client --deploy-mode client test.jar -incomingHost hostIP
where hostIP is the correct IP address of the machine running the "Master" process.
After this we are able to see the job in the RM web UI as well.
Now we killed the "Master" process. The already-submitted job keeps running fine, which is expected here because we are using YARN mode and that job will run without any issue.
But when we submit the same spark-submit command once again, pointing to the same Master IP which is currently down, we see one more job in the RM web UI (host:8088). We are not able to understand this, since the Spark "Master" is killed and the Spark UI at host:8080 no longer comes up either.
Please note that we are using "yarn-client" mode, as in the code below:
sparkProcess = new SparkLauncher()
.......
.setSparkHome(System.getenv("SPARK_HOME"))
.setMaster("yarn-client")
.setDeployMode("client")
Can someone please explain this behaviour? I did not find an answer after reading many blogs (http://spark.apache.org/docs/latest/running-on-yarn.html) and the official docs.
Thanks
Please check the cluster overview. As per your description, you are running your Spark application on YARN in client mode, with the driver placed on the instance where you launch the command. The Spark Master is only relevant to Spark standalone cluster mode, in which case your launch command would be similar to
spark-submit --master spark://your-spark-master-address:port
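In other words, with a yarn-client submission the YARN ResourceManager schedules the application and the standalone Master on port 7077 is never involved, which is why killing that process changes nothing. A rough side-by-side, with placeholder host names:

# Goes to YARN; the standalone Master process plays no role here.
spark-submit --class com.test.test --master yarn-client test.jar

# Goes to the standalone Master; only this form needs the Master process
# (and its UI on host:8080) to be up.
spark-submit --class com.test.test --master spark://master-host:7077 test.jar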
I tried three ways to kill it, but none of them worked.
I clicked the kill link on the Master web UI; sometimes the master and worker processes went down along with it.
spark-submit --master spark://xx:7077 --kill app-20160920095657-0000. The master URL is correct, but it throws Exception in thread "main" org.apache.spark.deploy.rest.SubmitRestConnectionException: Unable to connect to server
at org.apache.spark.deploy.rest.RestSubmissionClient$$anonfun$killSubmission$3.apply(RestSubmissionClient.scala:130)
spark-class org.apache.spark.deploy.Client kill spark://xx:7077 20160920095657-0000 only prints Use ./bin/spark-submit with "--master spark://host:port" and exits after a short time.
Get the running driverId from the Spark UI, and hit the kill endpoint on the Spark master's REST port (6066 by default) with a POST call to kill the pipeline.
curl -X POST http://localhost:6066/v1/submissions/kill/driverId
Hope it helps
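For reference, a filled-in sketch; driver-20160920095657-0000 below is a placeholder in the usual standalone driver-id format, and the port assumes the default spark.master.rest.port of 6066:

# Kill the driver through the master's REST submission server.
curl -X POST http://localhost:6066/v1/submissions/kill/driver-20160920095657-0000

# Confirm it is gone.
curl http://localhost:6066/v1/submissions/status/driver-20160920095657-0000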