Where can I see how many Spark jobs are running on a server? - apache-spark

I submitted a Spark job on a Linux server and can tell from the console whether it is running.
But in production, multiple Spark jobs are submitted and run on the server at the same time.
In that case, where can I see how many Spark jobs are running?

You can get a list of running applications from the command line (assuming that you are using YARN):
yarn application -list
See the YARN command-line documentation for more operations.

Every SparkContext launches a web UI, by default on port 4040 of the host where the driver runs (in client mode, the host you submit your application from). For more details on application monitoring, see the Spark monitoring documentation.
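If you would rather count applications from a script than read the CLI output, the YARN ResourceManager also serves the application list over REST. The sketch below is only an illustration: the host name `resourcemanager` is a placeholder, 8088 is assumed to be the ResourceManager web port (the usual default), and the function names are made up for this example.

```python
import json
from urllib.request import urlopen

# Placeholder address (assumption): replace with your ResourceManager host and port.
RM_APPS_URL = "http://resourcemanager:8088/ws/v1/cluster/apps?states=RUNNING"


def running_spark_apps(payload):
    """Return the Spark applications found in a ResourceManager apps response."""
    # "apps" is null (None) when no application matches the query.
    apps = (payload.get("apps") or {}).get("app") or []
    return [app for app in apps if app.get("applicationType") == "SPARK"]


def fetch_running_spark_apps(url=RM_APPS_URL):
    """Query the ResourceManager and return the running Spark applications."""
    with urlopen(url) as response:
        return running_spark_apps(json.load(response))
```

Calling `len(fetch_running_spark_apps())` on a machine that can reach the ResourceManager would then give the number of running Spark applications. Keeping the parsing separate from the HTTP call makes it easy to check against a saved response.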

Related

Spark jobs not showing up in Hadoop UI in Google Cloud

I created a cluster in Google Cloud and submitted a spark job. Then I connected to the UI following these instructions: I created an ssh tunnel and used it to open the Hadoop web interface. But the job is not showing up.
Some extra information:
If I connect to the master node of the cluster via ssh and run spark-shell, this "job" does show up in the hadoop web interface.
I'm pretty sure I did this before and I could see my jobs (both running and already finished). I don't know what happened in between for them to stop appearing.
The problem was that I was running my jobs in local mode. My code had a .master("local[*]") that was causing this. After removing it, the jobs showed up in the Hadoop UI as before.
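One way to catch this kind of mistake early is to check the master URL before a job is submitted to the cluster. The helper below is a minimal sketch; `is_local_master` is a hypothetical name, and it only recognizes the `local` and `local[...]` forms of the master URL.

```python
import re

# Matches "local", "local[4]", "local[*]" and "local[2,3]" style master URLs.
_LOCAL_MASTER = re.compile(r"^local(\[[^\]]*\])?$")


def is_local_master(master):
    """Return True when a Spark master URL selects local (non-cluster) mode."""
    return bool(_LOCAL_MASTER.match(master.strip()))
```

A deployment script could refuse to continue when `is_local_master(configured_master)` is true, since a job running in local mode never registers with the cluster manager and so never appears in its UI.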

How can I see the aggregated logs for a Spark standalone cluster

With Spark running on YARN, I could simply use yarn logs -applicationId <appId> to see the aggregated log after a Spark job has finished. What is the equivalent method for a Spark standalone cluster?
Via the Web Interface:
Spark’s standalone mode offers a web-based user interface to monitor
the cluster. The master and each worker has its own web UI that shows
cluster and job statistics. By default you can access the web UI for
the master at port 8080. The port can be changed either in the
configuration file or via command-line options.
In addition, detailed log output for each job is also written to the
work directory of each slave node (SPARK_HOME/work by default). You
will see two files for each job, stdout and stderr, with all output it
wrote to its console.
Please find more information in Monitoring and Instrumentation.
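The quoted documentation does not name a single command that aggregates these files, so one option is to gather them yourself on each worker. The sketch below assumes the stdout and stderr files sit somewhere below the work directory (typically one subdirectory per application and executor); `find_executor_logs` is a hypothetical helper name.

```python
from pathlib import Path


def find_executor_logs(work_dir):
    """Return every stdout/stderr file below a standalone worker's work directory."""
    root = Path(work_dir)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.name in ("stdout", "stderr")
    )
```

Running this against SPARK_HOME/work on every worker node and copying the returned files to one place gives a rough equivalent of an aggregated log.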

Spark History Server .... list of running jobs

I am using Cloudera 5.4.1 with Spark 1.3.0. When I go to the Spark History Server, I can see a list of completed jobs and a list of incomplete jobs.
However, many of the jobs listed as incomplete are ones that were killed.
So how does one see a list of the jobs that are actually running, not the ones that were killed?
Also, how does one kill a running Spark job using the application ID from the History Server?
Following is from Cloudera documentation:
To access the web application UI of a running Spark application, open http://spark_driver_host:4040 in a web browser. If multiple applications are running on the same host, the web application binds to successive ports beginning with 4040 (4041, 4042, and so on). The web application is available only for the duration of the application.
For 5.4x
For 5.9x
Answer for your second question:
You can use the YARN CLI to kill the Spark application.
Ex: yarn application -kill <application ID>
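If the kill step is scripted, it helps to validate the ID before handing it to the CLI. The following is a small sketch under the assumption that the `yarn` command is on the PATH; the function names are illustrative only.

```python
import re
import subprocess

# YARN application IDs look like application_<cluster timestamp>_<sequence number>.
_APP_ID = re.compile(r"^application_\d+_\d+$")


def build_kill_command(app_id):
    """Return the argv list for `yarn application -kill`, validating the ID first."""
    if not _APP_ID.match(app_id):
        raise ValueError(f"not a YARN application ID: {app_id!r}")
    return ["yarn", "application", "-kill", app_id]


def kill_application(app_id):
    """Run the kill command; requires the `yarn` CLI on PATH."""
    return subprocess.run(build_kill_command(app_id), check=True)
```

Passing the command as a list rather than a shell string avoids quoting problems when the ID comes from user input.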

What is the difference between web UIs on 4040 and 8080?

There are two different web UIs (one is for standalone mode only). Can I use the web UI on port 4040 when I am launching Spark in standalone mode? (Example: after running spark-class.cmd org.apache.spark.deploy.master.Master, the web UI on 8080 works, but the one on 4040 does not.) What is the main difference between these UIs?
Is it possible for me to launch Spark (without hadoop, hdfs, yarn etc), to keep it up and to submit my jars(classes) into it? I want to watch job statistics after it finishes. I am trying something like this:
Server: Spark\bin>spark-class.cmd org.apache.spark.deploy.master.Master
Worker: Spark\bin>spark-class.cmd org.apache.spark.deploy.worker.Worker spark://169.254.8.45:7077 --cores 4 --memory 512M
Submit: Spark\bin>spark-submit.cmd --class demo.TreesSample --master spark://169.254.8.45:7077 file:///E:/spark-demo/target/demo.jar
It runs, and a new web UI comes up on port 4040 for this task, but I don't see anything in the Master's UI on 8080.
Currently I'm using win7 x64, spark-1.5.2-bin-hadoop2.6. I can switch into linux if it matters.
You should be able to change the web UI port for standalone Master using spark.master.ui.port or SPARK_MASTER_WEBUI_PORT as described in Configuring Ports for Network Security / Standalone mode only.
The standalone Master's web UI is the management console of a cluster manager (one that happens to be part of Apache Spark, but could have been a separate product like Hadoop YARN or Apache Mesos). Having said that, it can often be confusing what the two web UIs have in common, and the answer is nothing.
The Spark driver's web UI is to show the progress of your computations (jobs, stages, storage for RDD persistence, broadcasts, accumulators) while standalone Master's web UI is to let you know the current state of your "operating environment" (aka the Spark Standalone cluster).
I leave the other part of your question, about the History Server, to @Sumit's answer.
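The job progress shown in the driver's web UI can also be read from a script, because recent Spark versions (1.4 and later) serve a REST API on the same port as the UI. The sketch below assumes the driver is reachable on port 4040 and that you already know the application ID; the function names are hypothetical.

```python
import json
from collections import Counter
from urllib.request import urlopen


def jobs_url(driver_host, app_id, port=4040):
    """Build the REST URL listing the jobs of one application on the driver UI."""
    return f"http://{driver_host}:{port}/api/v1/applications/{app_id}/jobs"


def count_jobs_by_status(jobs):
    """Count jobs per status (e.g. RUNNING, SUCCEEDED, FAILED)."""
    return Counter(job["status"] for job in jobs)


def fetch_job_status_counts(driver_host, app_id, port=4040):
    """Fetch the job list from a running driver and summarize it by status."""
    with urlopen(jobs_url(driver_host, app_id, port)) as response:
        return count_jobs_by_status(json.load(response))
```

Like the UI itself, this endpoint only answers while the application is running, which is one reason to enable the History Server for anything you want to inspect afterwards.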
Yes, you can launch Spark as a standalone server, without any Hadoop or HDFS. As soon as you submit your job to the master, it shows up in either the "Running Applications" or the "Completed Applications" section of the Master's web UI.
You can also enable the History Server to preserve the job statistics and analyze them later:
./sbin/start-history-server.sh
See the linked documentation for more details on enabling the History Server.

How can I verify that DSE Spark Shell is distributing across the cluster

Is it possible to verify from within the Spark shell whether the shell is connected to the cluster or is running just in local mode? I'm hoping to use that to investigate the following problem:
I've used DSE to setup a small 3 node Cassandra Analytics cluster. I can log onto any of the 3 servers and run dse spark and bring up the Spark shell. I have also verified that all 3 servers have the Spark master configured by running dsetool sparkmaster.
However, when I run any task using the Spark shell, it appears that it is only running locally. I ran a small test command:
val rdd = sc.cassandraTable("test", "test_table")
rdd.count
When I check the Spark Master webpage, I see that only one server is running the job.
I suspect that when I run dse spark it's running the shell in local mode. I looked up how to specify a master for the Spark 0.9.1 shell, and even when I use MASTER=<sparkmaster> dse spark (from the Programming Guide) it still runs only in local mode.
Here's a walkthrough once you've started a DSE 4.5.1 cluster with 3 nodes, all set for Analytics Spark mode.
Once the cluster is up and running, you can determine which node is the Spark Master with command dsetool sparkmaster. This command just prints the current master; it does not affect which node is the master and does not start/stop it.
Point a web browser to the Spark Master web UI at the given IP address and port 7080. You should see 3 workers in the ALIVE state, and no Running Applications. (You may have some DEAD workers or Completed Applications if previous Spark jobs had happened on this cluster.)
Now on one node bring up the Spark shell with dse spark. If you check the Spark Master web UI, you should see one Running Application named "Spark shell". It will probably show 1 core allocated (the default).
If you click on the application ID link ("app-2014...") you'll see the details for that app, including one executor (worker). Any commands you give the Spark shell will run on this worker.
The default configuration is limiting the Spark master to only allowing each application to use 1 core, therefore the work will only be given to a single node.
To change this, login to the Spark master node and sudo edit the file /etc/dse/spark/spark-env.sh. Find the line that sets SPARK_MASTER_OPTS and remove the portion -Dspark.deploy.defaultCores=1. Then restart DSE on this node (sudo service dse restart).
Once it comes up, check the Spark master web UI and repeat the test with the Spark shell. You should see that it's been allocated more cores, and any jobs it performs will happen on multiple nodes.
In a production environment you'd want to set the number of cores more carefully so that a single job doesn't take all the resources.
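If the same edit has to be made on several clusters, the removal of the option can be scripted. This is a minimal sketch that operates on a single line of text; `strip_default_cores` is a hypothetical helper, and the example line in the usage note is invented for illustration rather than copied from a real spark-env.sh.

```python
import re

# Matches the option together with any whitespace in front of it.
_DEFAULT_CORES = re.compile(r"\s*-Dspark\.deploy\.defaultCores=\d+")


def strip_default_cores(line):
    """Remove -Dspark.deploy.defaultCores=<n> from a SPARK_MASTER_OPTS line."""
    return _DEFAULT_CORES.sub("", line)
```

For example, given the hypothetical line `export SPARK_MASTER_OPTS="$SPARK_MASTER_OPTS -Dspark.deploy.defaultCores=1"`, the function returns `export SPARK_MASTER_OPTS="$SPARK_MASTER_OPTS"`. Lines without the option come back unchanged, so it is safe to apply to every line of the file before restarting DSE.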
