I want to programmatically kill an EMR streaming task. If I kill it from the EMR UI or the boto client, it disappears from EMR, but it is still active in the Hadoop cluster (see this article). Only if I go through the Hadoop Resource Manager and kill it from there is the job actually terminated.
How can I do the same programmatically?
You can SSH to the cluster and run yarn application -kill <application_id>, or use the YARN API to kill the application.
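This is straightforward to script from outside the cluster as well. A minimal sketch, assuming SSH access to the EMR master node (the key path, user, hostname, and application id below are all placeholders):

```shell
# Build the kill command once so it can be run locally on the master
# node or pushed over SSH; the application id is a placeholder.
build_kill_cmd() {
  printf 'yarn application -kill %s' "$1"
}

APP_ID="application_1234567890123_0001"
CMD=$(build_kill_cmd "$APP_ID")
echo "$CMD"

# From outside the cluster (commented out; fill in your own host/key):
# ssh -i ~/my-key.pem hadoop@<master-public-dns> "$CMD"
```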
See also: Can not kill a YARN application through REST api
As @maxime-g said, the only way to kill a YARN application is to run the following command: yarn application -kill <application_id>.
However, it is possible to run an EMR step that runs a script on the master node; that script should include this command and could take the application id as an argument.
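A sketch of that approach with the AWS CLI, assuming a kill script uploaded to S3 and executed through EMR's script-runner. The cluster id, bucket, script name, and the script-runner jar path (which varies by region and EMR release) are all placeholders:

```shell
# Sketch: submit an EMR step that runs kill-app.sh on the master node,
# passing the YARN application id as the script's argument. The script
# itself would contain little more than: yarn application -kill "$1"
CLUSTER_ID="j-XXXXXXXXXXXXX"
APP_ID="application_1234567890123_0001"
STEP="Type=CUSTOM_JAR,Name=KillYarnApp,ActionOnFailure=CONTINUE,\
Jar=s3://elasticmapreduce/libs/script-runner/script-runner.jar,\
Args=[s3://my-bucket/kill-app.sh,${APP_ID}]"
echo "$STEP"

# The actual call (commented out so the sketch stays self-contained):
# aws emr add-steps --cluster-id "$CLUSTER_ID" --steps "$STEP"
```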
Related
I have a shell script that initializes a Spark streaming job in YARN cluster mode. I have scheduled the shell script through Autosys. Now, when I kill the Autosys job, I would like to kill the Spark job running in cluster mode as well.
I have tried running yarn application -kill in the shell script on an error return code, but it does not get executed. However, I am able to kill the job from another shell window: there, the yarn application -kill command works perfectly and kills the application.
Is there any workaround to kill the cluster-mode job automatically on interruption, from the same shell?
In the error-return-code logic, run yarn application -kill <$appid> as an orphan process, so the kill survives even though the parent shell is going away.
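A sketch of that idea, assuming the launcher script captures the YARN application id at submit time (the parsing of the spark-submit output is omitted, so APP_ID is left as an empty stub here):

```shell
#!/bin/bash
# Sketch: kill the YARN application when this launcher script is
# interrupted. APP_ID must be filled in by parsing the spark-submit
# output (omitted); the empty stub keeps the sketch runnable.
APP_ID=""

cleanup() {
  if [ -n "$APP_ID" ]; then
    # Detach the kill from this shell (nohup + background) so it
    # completes even while the parent process is being torn down.
    nohup yarn application -kill "$APP_ID" >/dev/null 2>&1 &
  fi
}
trap cleanup EXIT INT TERM

# spark-submit --master yarn --deploy-mode cluster ... &
# wait
```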
I have a Mesos DC/OS cluster running on AWS with Spark installed via the dcos package install spark command. I am able to successfully execute Spark jobs using the DCOS CLI: dcos spark run ...
Now I would like to execute Spark jobs from a Docker container running inside the Mesos cluster, but I'm not quite sure how to reach the running instance of Spark. The idea is to have a Docker container execute the spark-submit command to submit a job to the Spark deployment, instead of executing the same job from outside the cluster with the DCOS CLI.
The current documentation seems to focus only on running Spark via the DCOS CLI. Is there any way to reach the Spark deployment from another application running inside the cluster?
The DCOS IoT demo tries something similar: https://github.com/amollenkopf/dcos-iot-demo
These guys run a Spark Docker image and spark-submit in a Marathon app. Check this Marathon descriptor: https://github.com/amollenkopf/dcos-iot-demo/blob/master/spatiotemporal-esri-analytics/rat01.json
Suppose there are 10 containers running on a machine (5 are MapReduce tasks, and 5 are Spark-on-YARN executors).
If I kill the NodeManager, what happens to these 10 container processes?
Before I restart the NodeManager, what should I do first?
Killing the NodeManager only affects the containers on that particular node. All running containers on it are lost on a restart/kill; they will be relaunched once the node comes back up or the NodeManager process is started again (provided the application/job is still running).
NOTE: this assumes the job's ApplicationMaster is not running on that slave node.
What happens when the node with the ApplicationMaster dies?
In that case YARN launches a new ApplicationMaster on some other node, and all the containers are relaunched as well.
Answering according to the Hadoop 2.7.x distribution; check this article: http://hortonworks.com/blog/resilience-of-yarn-applications-across-nodemanager-restarts/
If you don't have yarn.nodemanager.recovery.enabled set to true, your containers will be KILLED (Spark, MapReduce, or anything else). However, your job will most likely continue to run.
You need to check this property in your environment, for example by grepping for yarn.nodemanager.recovery.enabled in yarn-site.xml. If it is false (which it was by default for me), then as far as I know there is nothing you can do to prevent those containers from being killed on restart. However, you can set the flag, together with the other required properties such as yarn.nodemanager.recovery.dir, for future cases if you want containers to be recovered.
Look at this one too: http://www.cloudera.com/documentation/enterprise/5-4-x/topics/admin_ha_yarn_work_preserving_recovery.html
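A self-contained sketch of checking the property. The sample yarn-site.xml is inlined here so the snippet runs anywhere; in practice you would point YARN_SITE at your real config file (commonly /etc/hadoop/conf/yarn-site.xml):

```shell
# Illustrative check: extract a property value from yarn-site.xml.
# The sample config is inlined so the snippet is self-contained.
YARN_SITE=$(mktemp)
cat > "$YARN_SITE" <<'EOF'
<configuration>
  <property>
    <name>yarn.nodemanager.recovery.enabled</name>
    <value>true</value>
  </property>
</configuration>
EOF

get_prop() {
  # Naive XML scrape - adequate for flat Hadoop config files.
  grep -A1 "<name>$1</name>" "$2" | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

get_prop yarn.nodemanager.recovery.enabled "$YARN_SITE"
```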
I wrote a shell script to invoke spark-submit in yarn-client mode. When I run the shell script using nohup, my Spark job is submitted to YARN. After some time, I use yarn application -kill <application_id> to kill the Spark application.
The YARN UI shows the application has been killed, but when I execute pstree -p | grep spark (on the Linux machine) it still shows:
-sh(23177)---spark-submit(23178)---java(23179)-+-{java}+
i.e. the java process is not killed. How do I make sure that the java process also gets killed when I kill the Spark application?
Thanks
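For context: in yarn-client mode the driver JVM runs on the machine where spark-submit was invoked, so killing the YARN application does not necessarily tear down that local spark-submit/java process tree; it has to be killed on the client machine itself. A hedged sketch (the match pattern is illustrative and should be narrowed to your job's actual command line):

```shell
# Sketch: kill local processes whose command line matches a pattern,
# e.g. the spark-submit invocation that started the job. The pattern
# is illustrative - make it specific enough to avoid collateral kills.
kill_matching() {
  pattern="$1"
  for pid in $(pgrep -f "$pattern"); do
    kill "$pid" 2>/dev/null || true
  done
}

# Example: kill_matching 'spark-submit.*my_streaming_app'
```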
I have a job running on Amazon EC2, and I use PuTTY to connect to the EC2 cluster, but the PuTTY connection was just lost. After reconnecting to the EC2 cluster I have no output from the job, so I don't know whether my job is still running. Does anybody know how to check the state of a Spark job?
Thanks
Assuming you are on a YARN cluster, you can run
yarn application -list
to get a list of applications, and then run
yarn application -status <applicationId>
to see its status.
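The status report is plain text, so if a script needs just the state it can scrape it. The sample output below mimics a typical yarn application -status report so the snippet is self-contained:

```shell
# Sample text standing in for real `yarn application -status` output.
status_output='Application Report :
  Application-Id : application_1234567890123_0001
  State : RUNNING
  Final-State : UNDEFINED'

get_state() {
  # Print the value of the "State :" line from a status report.
  printf '%s\n' "$1" | sed -n 's/^[[:space:]]*State : //p'
}

get_state "$status_output"
```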
It is good practice to use GNU Screen (or a similar tool) on remote machines to keep your session alive, but detached, in case the connection to the machine is lost.
The status of a Spark application can be ascertained from the Spark UI (or the YARN UI).
If you are looking for a CLI command:
For a standalone cluster, use:
spark-submit --status <app-driver-id>
For YARN:
yarn application -status <app-id>