Spark 2.1.0 Web UI not showing "Application Detail UI" - apache-spark

I recently upgraded my Spark 1.4.1 setup to the latest 2.1.0. I'm running Spark in standalone mode. Everything seems to work fine except the web UI.
The post here shows that this can happen for large application logs. But I'm pretty sure my application log is not large (~8.5MB).
I have copied the same setup that I was using in 1.4.1, with the following parameters:
spark.eventLog.enabled true
spark.eventLog.dir /LOCALPATH/spark/event-log
spark.history.fs.logDirectory /LOCALPATH/spark/event-log
The web UI shows current and previous applications. However, the "Application Detail UI" link is only available for running/active applications and does not show up for completed applications. I've checked the eventLog directory and it does have non-empty log files for completed applications.
Attached are images for reference as well. Am I missing some new property introduced in 2.1.0? I've gone through the documentation multiple times and couldn't find anything.
EDIT: Got it working (for future reference)
I got it working through the Spark History Server after following the steps explained in the Spark History Server documentation. In particular, I added the log4j property
log4j.logger.org.apache.spark.deploy.history=INFO
in
SPARK_HOME/conf/log4j.properties
and then started the history server and accessed its interface at
<history-server-host>:18080
where history-server-host is usually the same as your master node.
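For completeness, here is roughly what the working setup looks like; the event-log paths are the same placeholders as above, and the start script is the standard one shipped in Spark's sbin directory:

# SPARK_HOME/conf/spark-defaults.conf
spark.eventLog.enabled true
spark.eventLog.dir /LOCALPATH/spark/event-log
spark.history.fs.logDirectory /LOCALPATH/spark/event-log

# start the history server, then browse to http://<history-server-host>:18080
$SPARK_HOME/sbin/start-history-server.sh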

Related

Azure Live Metrics doesn't show Incoming Requests, Outgoing Requests, Server Health

I have followed the Microsoft documents below to set up Application Insights:
https://learn.microsoft.com/en-us/azure/azure-monitor/app/java-get-started?tabs=maven
https://learn.microsoft.com/en-us/azure/azure-monitor/app/java-agent
https://learn.microsoft.com/en-us/azure/azure-monitor/app/java-standalone-arguments
Just to give some background: my WAR is deployed on a JBoss EAP 7 standalone web server hosted on an Azure virtual machine.
But for some reason I don't see Live Metrics at all. Could you please guide me on how to enable it?
FYI, I can see all other metrics in Application Insights.
I created a Spring Boot web application with Application Insights following this doc.
When I finished the code and configuration, I ran the program in a local environment and could reach the 'Live Metrics' page. When I stopped the program, the web page showed 'Not available: your app is offline or using an older SDK'.
So according to your description, first make sure you have started the web container. You can also debug your program in a local environment to check your code and configuration. If the program runs well locally and you still can't reach Live Metrics, try modifying the pom.xml to use a newer SDK version.
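Since your deployment is JBoss EAP rather than a locally run Spring Boot app, one more thing worth double-checking is that the Application Insights Java agent is actually attached to the JBoss JVM, as the java-standalone-arguments document above describes. A minimal sketch, assuming a 3.x agent jar; the jar path and file name are placeholders you would replace with your actual install location:

# JBOSS_HOME/bin/standalone.conf (Linux) - append the agent to the JVM options
JAVA_OPTS="$JAVA_OPTS -javaagent:/path/to/applicationinsights-agent-3.x.x.jar"

With the 3.x agent, the connection string also needs to be supplied, for example through the APPLICATIONINSIGHTS_CONNECTION_STRING environment variable or an applicationinsights.json placed next to the agent jar.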

jhipster warn org.apache.kafka.clients.NetworkClient Broker may not be available

So I made a Hello World JHipster application and everything seems to be working fine, but I'm getting this warning about every 3 seconds and it's cluttering up my output:
WARN 542 --- [ad | producer-1] org.apache.kafka.clients.NetworkClient : [Producer clientId=producer-1] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Broker may not be available.
Anybody know what's causing this? It's apparently not critical because everything seems to be working but it is quite aggravating, since I have to scroll way up in the terminal to find any actually relevant output, like for example what port the application is running on.
It looks like you selected the option to use Apache Kafka when you generated your project. Using Kafka is completely optional and requires a few extra steps as described in the official documentation (Using Kafka).
If this is just a "Hello World" app your best option is probably to just regenerate the project without Kafka. Otherwise, you will have to follow the steps in the documentation I linked above.
Basic instructions to use Kafka
Install Docker Desktop if you don't have it already.
Restart your computer as requested, and remember to enable hardware virtualization in your BIOS if you have it disabled.
Navigate to the root folder of your project (where your /src/ folder is) and execute docker-compose -f src/main/docker/kafka.yml up -d
Wait for the process to complete.
Add .antMatchers("/api/<appName>-kafka/publish").permitAll() to your SecurityConfiguration.java, where <appName> is the name you gave when generating your project. Note that you must add this line before .antMatchers("/api/**").authenticated() (see the sketch after these steps).
Now you can launch your application.
At this point everything is configured, so the 'Broker may not be available' messages should be gone.
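For reference, a rough sketch of where that line goes; the surrounding matchers are illustrative and your generated SecurityConfiguration.java will look somewhat different, and myapp stands in for your <appName>:

// SecurityConfiguration.java (illustrative excerpt)
@Override
public void configure(HttpSecurity http) throws Exception {
    http
        .authorizeRequests()
        // the Kafka publish endpoint must come BEFORE the catch-all /api/** rule,
        // because Spring Security applies the first matcher that matches the request
        .antMatchers("/api/myapp-kafka/publish").permitAll()
        .antMatchers("/api/authenticate").permitAll()
        .antMatchers("/api/**").authenticated();
}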

How to fix 'You are using an incompatible schema.xml configuration file' error in drupal for solCloud configuration

I am setting up a SolrCloud configuration for an existing Solr configuration with Drupal 7. I have configured ZooKeeper on 3 different machines and SolrCloud on 2 other machines. All the conf files are present in the configs directory in ZooKeeper.
Everything is fine up to here, but communication between Drupal and Solr is not happening due to the following error.
Error: "You are using an incompatible schema.xml configuration file. Please follow the instructions in the handbook for setting up Solr."
Currently, the application is running on Drupal 7 and the Solr module version 7.x-1.13 is installed.
So far, I haven't touched any Solr configuration files on the Drupal server.
What other configuration do I have to modify here to resolve the schema.xml incompatibility error?
I tried configuring SolrCloud with versions 5.4.1 and 6.4.1, but I get the same error.
In my case, what fixed this issue was killing the solr process and then starting solr again.
First, find the relevant solr process by trying to start solr...
cd /base/path/for/your/solr
bin/solr start
You will see something like...
Port 8983 is already being used by another process (pid: 12345)
Kill whatever process ID is mentioned in the "already being used" message...
kill 12345
Now you should be able to start solr...
bin/solr start
After this restart of solr, I refreshed the page in Drupal and the "incompatible schema.xml" message was gone.
You will have to look at the solr error logs to see what part of your schema.xml is not right.
You'd really have to do this on each one of your SolrCloud nodes, since there isn't any guarantee that ZooKeeper uploaded the correct schema.xml to all shards, and that's why you could be getting that error.
You could use zkcli to upload your configs (https://lucene.apache.org/solr/guide/6_6/command-line-utilities.html), and then reload your collection on all nodes to apply the changes, but even then there's no guarantee it'll work.
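A rough sketch of that two-step process; the ZooKeeper hosts, config name, paths and collection name below are placeholders, not values from your setup:

# upload the config set to ZooKeeper
server/scripts/cloud-scripts/zkcli.sh -zkhost zk1:2181,zk2:2181,zk3:2181 \
  -cmd upconfig -confname drupal_conf -confdir /path/to/your/solr/conf

# reload the collection so all nodes pick up the new schema.xml
curl "http://solr-host:8983/solr/admin/collections?action=RELOAD&name=your_collection"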
To save time and stress, you could just use a SaaS service such as https://opensolr.com
You can get it set up for free, and you get a UI to edit your config files, upload your config files to your server, and a lot of other nice UI features to manage your Solr index.

How to implement spark.ui.filter

I have a spark cluster set up on 2 CentOS machines. I want to secure the web UI of my cluster (master node). I have made a BasicAuthenticationFilter servlet. I am unable to understand:
How should I use spark.ui.filters to secure my web UI?
Where should I place the servlet/jar file?
Kindly help.
I also needed to handle this security problem to prevent unauthorized access to the Spark standalone UI. I eventually fixed it after searching the web; the procedure is:
Code and compile a Java filter that implements standard HTTP basic authentication. I referred to this blog: http://lambda.fortytools.com/post/26977061125/servlet-filter-for-http-basic-auth (a rough sketch of such a filter follows these steps).
Package the above filter class as a jar file and put it in $SPARK_HOME/jars/.
Add config lines to $SPARK_HOME/conf/spark-defaults.conf:
spark.ui.filters test.BasicAuthFilter # the fully qualified filter class name
spark.test.BasicAuthFilter.params user=foo,password=cool,realm=some
The username and password are what you must provide to access the Spark UI; the “realm” value is insignificant, whatever you type.
Restart all master and worker processes and test to confirm it works.
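A rough sketch of such a filter, modeled on the approach in that blog post; the package/class name test.BasicAuthFilter is just an example and must match what you put in spark.ui.filters, and the user/password/realm values from the .params line above are handed to the filter as init parameters:

package test;

import java.io.IOException;
import java.util.Base64;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Minimal HTTP basic-auth servlet filter for the Spark standalone UI.
public class BasicAuthFilter implements Filter {

    private String user;
    private String password;
    private String realm;

    @Override
    public void init(FilterConfig config) {
        // values from spark.test.BasicAuthFilter.params arrive as init parameters
        user = config.getInitParameter("user");
        password = config.getInitParameter("password");
        realm = config.getInitParameter("realm");
    }

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) req;
        HttpServletResponse response = (HttpServletResponse) res;

        String header = request.getHeader("Authorization");
        if (header != null && header.startsWith("Basic ")) {
            // header format: "Basic base64(user:password)"
            String decoded = new String(Base64.getDecoder().decode(header.substring(6)));
            if (decoded.equals(user + ":" + password)) {
                chain.doFilter(req, res);   // credentials match, let the request through
                return;
            }
        }
        // missing or wrong credentials: ask the browser for basic auth
        response.setHeader("WWW-Authenticate", "Basic realm=\"" + realm + "\"");
        response.sendError(HttpServletResponse.SC_UNAUTHORIZED);
    }

    @Override
    public void destroy() {
    }
}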
Hi, place the jar file on all the nodes in the folder /opt/spark/conf/. In a terminal, do the following:
Navigate to the directory /usr/local/share/jupyter/kernels/pyspark/
Edit the file kernel.json
Add the following arguments to PYSPARK_SUBMIT_ARGS: --jars /opt/spark/conf/filterauth.jar --conf spark.ui.filters=authenticate.MyFilter
Here, filterauth.jar is the jar file you created and authenticate.MyFilter stands for <package name>.<class name>
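A rough sketch of what the edited kernel.json might end up looking like; everything except the PYSPARK_SUBMIT_ARGS entry depends on your existing kernel definition, and the trailing pyspark-shell token is the usual convention when submitting through this variable:

{
  "display_name": "PySpark",
  "language": "python",
  "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "env": {
    "PYSPARK_SUBMIT_ARGS": "--jars /opt/spark/conf/filterauth.jar --conf spark.ui.filters=authenticate.MyFilter pyspark-shell"
  }
}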
Hope this answers your query. :)

How can I see the log of Spark job server task?

I deployed Spark Job Server according to https://github.com/spark-jobserver/spark-jobserver. Then I created a job server project and uploaded it to the job server. While I run the project, how can I see the logs?
It looks like it's not possible to see the logs while running a project. I browsed through the source code and couldn't find any references to a feature like this, and it's clearly not a feature of the UI. It seems like your only option is to view the logs after running a job; they are stored by default in /var/log/job-server, which you probably already know.
