what is "spark.history.retainedApplications" points to - apache-spark

As per the Apache doc http://spark.apache.org/docs/latest/monitoring.html,
spark.history.retainedApplications is described as "The number of application UIs to retain. If this cap is exceeded, then the oldest applications will be removed".
But I see more applications in the UI than the configured number. Is that correct, or does it only keep that many applications in memory and load the others again when needed? Please clarify. Thanks.

That setting specifically applies to the history server. If you don't have one started (it's typically used with YARN and Mesos I believe), then the setting you're after is spark.ui.retainedJobs. Check the Spark UI configuration parameters for more details.
These settings only apply to jobs, so in order to pass them to the master itself, check the spark.deploy options in the stand-alone deployment section. You can set them via the SPARK_MASTER_OPTS environment variable.
If you want to clean the data files produced by workers, check the spark.worker.cleanup options in the same section. You can set them via the SPARK_WORKER_OPTS environment variable on your workers.
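For example, a rough sketch of where those knobs live (the setting names are taken from the Spark docs of roughly that era, so double-check them against your version; the values here are arbitrary):
# spark-defaults.conf (per application) - how much the application UI keeps in memory
spark.ui.retainedJobs 500
spark.ui.retainedStages 500
# spark-env.sh on the master - finished applications/drivers kept in the standalone master UI
SPARK_MASTER_OPTS="-Dspark.deploy.retainedApplications=50 -Dspark.deploy.retainedDrivers=50"
# spark-env.sh on each worker - periodic cleanup of old application work directories
SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.appDataTtl=604800"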

Related

How to set the default Puppet environment to something other than production?

I did search, but didn't find exactly what I'm looking for. In the past, I've had multiple Puppet masters, one for each environment, then made scripts to promote code from one environment to the next on a monthly basis, but that process is a pain. I'm currently working on a single Puppet Master and am trying to take advantage of the environments feature but don't want the default environment set to production. This should be simple, but I've not yet worked out the solution. Without needing to set environments in the agents, how do I set the default environment on the master to "quarantine"? I'm running puppetserver 7.9.2.1 on Oracle Linux 8.
Without needing to set environments in the agents, how do I set the default environment on the master to "quarantine"?
This is the environment setting in the main puppet.conf configuration file. For [agent] configuration it specifies which environment the agent should request (which is not necessarily the one it gets), but for [server] configuration it specifies the default environment assigned when the agent does not request one and none is provided by the node terminus.
Example:
# ...
[server]
environment = quarantine
# ...
Because it has different significance to different Puppet components, you should avoid specifying this setting in the [main] section. You can meaningfully specify different values for this setting in different sections.
Although that answers what you actually asked, it might not be what you really want. Note well that the default environment set at the server will be overridden if a client requests a specific environment. If you want to prevent your Puppet clients from (successfully) specifying their own environments then you will want to set up an external node classifier that specifies at least the environment for each node. And if you have that, then whatever default you set in puppet.conf will be moot.
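For illustration only (the script path and name below are made up; the ENC interface itself is documented by Puppet), an ENC is just an executable that receives the node's certname and prints YAML, so pinning every node to quarantine could look roughly like this:
#!/bin/sh
# /usr/local/bin/quarantine_enc (hypothetical path) - called by the server as: quarantine_enc <certname>
cat <<EOF
environment: quarantine
classes: {}
EOF
It is then wired up on the server side with the node_terminus = exec and external_nodes settings in puppet.conf.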

elasticsearch applying security on a running cluster

I have an ELK stack 7.6.2 with Logstash, an Elasticsearch cluster with 3 nodes, and Kibana. I would like to add security, but the only docs I can find always start from scratch. I would like an example for an already running cluster so that I don't mess it up. Thanks for your help.
Guillaume
You cannot enable security features on an already running cluster. Security settings are classified as static, meaning that they cannot be dynamically updated on the fly:
static:
These settings must be set at the node level, either in the elasticsearch.yml file, or as an environment variable or on the command line when starting a node. They must be set on every relevant node in the cluster.
dynamic:
These settings can be dynamically updated on a live cluster with the cluster-update-settings API.
See https://www.elastic.co/guide/en/elasticsearch/reference/7.6/modules.html for reference and for all settings that can be dynamically updated (you won't find security settings there).
Also, from this guide (https://www.elastic.co/guide/en/elasticsearch/reference/current/get-started-enable-security.html) one can tell that you need to stop your running elasticsearch and kibana instances in order to enable security.
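As a rough sketch of what the static part looks like (minimal example only; a real three-node cluster also needs certificates and passwords set up as described in the linked guide, and each node must be restarted after the change), the settings go into elasticsearch.yml on every node:
# elasticsearch.yml on every node
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
The certificate file name here is just the one used in Elastic's tutorial; substitute whatever elasticsearch-certutil produced for you.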
I hope I could help you.

Submit & monitor spark jobs via java in cluster mode

I have a Java class which manages jobs and executes them via Spark (using 1.6).
I am using the API sparkLauncher.startApplication(SparkAppHandle.Listener... listeners) in order to monitor the state of the job.
The problem is that I have moved to a real cluster environment, and this approach can't work when the master and workers are not on the same machine, as the internal implementation only uses localhost (loopback) to open a port for the workers to bind to.
The API sparkLauncher.launch() works, but doesn't let me monitor the status.
What is the best practice for a cluster environment using Java code?
I also saw the option of the hidden REST API; is it mature enough? Should I enable it in Spark somehow (I am getting access denied, even though the port is open from outside)?
REST API
In addition to viewing the metrics in the UI, they are also available as JSON. This gives developers an easy way to create new visualizations and monitoring tools for Spark. The JSON is available for both running applications and in the history server. The endpoints are mounted at /api/v1. E.g., for the history server, they would typically be accessible at http://<server-url>:18080/api/v1, and for a running application, at http://localhost:4040/api/v1.
You can find more details here.
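For example (assuming your driver host is reachable as driver-node and uses the default UI port), you can poll a running application from the command line or any HTTP client:
curl http://driver-node:4040/api/v1/applications
curl http://driver-node:4040/api/v1/applications/<app-id>/jobs
The same paths work against the history server on port 18080 once the application has finished.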
Every SparkContext launches a web UI, by default on port 4040, that displays useful information about the application. This includes:
A list of scheduler stages and tasks
A summary of RDD sizes and memory usage
Environmental information.
Information about the running executors
You can access this interface by simply opening http://<driver-node>:4040 in a web browser. If multiple SparkContexts are running on the same host, they will bind to successive ports beginning with 4040 (4041, 4042, etc.).
You can find more details here.

How does pbs/torque/maui choose node?

We know that all the node features are stored in the server_priv/nodes file. We may have hundreds of machines with the linux feature, so every time we submit a job using:
qsub -l nodes=1:linux
or
#PBS -l nodes=1:linux
I wonder how Torque selects the right node.
Does it search the server_priv/nodes file from top to bottom?
Alphabetically?
Does it depend on the machine's workload?
Any help is greatly appreciated!
In this case, Maui is choosing the nodes to allocate to the job. Maui is the scheduler and therefore the decision-maker. I believe the default policy is firstavailable, which I think will be the first available node in the order that the nodes are specified in the nodes file (located in PBS_HOME/server_priv/nodes).
However, I don't know which node allocation policy your site is using. If you have access, check Maui's config file for NODEALLOCATIONPOLICY to see which one you are using; if you don't have access, you'll need to contact an administrator. To better understand the different options for node allocation, you can check out the Maui docs.
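For example, the policy is a single line in Maui's config file (maui.cfg; the exact path depends on your installation), and FIRSTAVAILABLE below is just one of several possible values such as MINRESOURCE, CPULOAD or PRIORITY:
NODEALLOCATIONPOLICY FIRSTAVAILABLE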

How to raise or lower the log level in puppet master?

I am using Puppet 3.2.3, Passenger and Apache on CentOS 6. I have 680 compute nodes in a cluster along with 8 gateways that users log in to in order to submit jobs. All the nodes and gateways are under Puppet control. I recently upgraded from 2.6. The master logs to syslog as desired, but how to change the log level for the master escapes me. I appear to have the choice of --debug or nothing. Debug logs far too much detail, while not using that switch simply logs each time Passenger/Apache launches a new worker to handle incoming connections.
I find nothing in the online docs about doing this. What I want is to log each time a node hits the server, but I do not need to see the compiled catalogue or resources in /var/log/messages.
How is this accomplished?
This is a hack, but here is how I solved the problem. In the file (config.ru) that Passenger uses to launch Puppet via Rack middleware, which on my system lives in /usr/share/puppet/rack/puppetmasterd, I noticed these lines:
require 'puppet/util/command_line'
run Puppet::Util::CommandLine.new.execute
So I edited this to become:
require 'puppet/util/command_line'
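# Puppet's standard log levels (most to least verbose) are, as far as I recall:
# :debug, :info, :notice, :warning, :err, :alert, :emerg, :crit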
Puppet::Util::Log.level = :info
run Puppet::Util::CommandLine.new.execute
I suppose other choices for Log.level could be :warning and so on.
