How to get Mesos Agents Framework Executor Memory - apache-spark

Inside the Mesos Web UI I can see the memory usage of my Spark executors in a table:
Agents -> Framework -> Executors
There is a table listing all executors for my Spark driver and their memory usage is indicated in column Mem (Used / Allocated).
Is there a way to obtain this number directly via a link, and if yes, how?
For example, I can obtain a bunch of Mesos metrics via http://IP/mesos/metrics/snapshot, but the memory usage of executors is not one of them.

The memory usage of executors is in fact tracked per Mesos task, i.e. it reflects how much memory the executor consumes for each task.
If that is what you need, you can use the following REST API to get a JSON response and then parse the memory usage from it:
http://mesos_ip:5050/master/tasks
FYI.
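For completeness, here is a minimal sketch (Python 3 with the requests library) of pulling per-task memory from that endpoint. The JSON field names ("tasks", "framework_id", "resources", "mem") follow the usual layout of the Mesos /master/tasks response but may differ between Mesos versions, and the master address is a placeholder, so adapt both to your cluster.

import requests

MESOS_MASTER = "http://mesos_ip:5050"  # placeholder: your Mesos master address

def task_memory(framework_id=None):
    """Return {task_id: allocated_mem_MB}, optionally restricted to one framework."""
    tasks = requests.get(f"{MESOS_MASTER}/master/tasks").json().get("tasks", [])
    return {
        t["id"]: t.get("resources", {}).get("mem", 0)
        for t in tasks
        if framework_id is None or t.get("framework_id") == framework_id
    }

if __name__ == "__main__":
    for task_id, mem_mb in task_memory().items():
        print(f"{task_id}: {mem_mb} MB allocated")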

Found the answer myself. For each worker/agent on which executors may run, the memory info can be accessed directly here:
http://IP_of_worker1:5051/slave(1)/monitor/statistics
http://IP_of_worker2:5051/slave(1)/monitor/statistics
etc
The content is a JSON document, and framework_id allows you to find the related executors together with their memory consumption, CPU usage, etc., i.e. what is shown in the table.
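If you want to script this, below is a hedged sketch (Python 3 + requests) that filters the agent's statistics by framework_id and reports used vs. allocated memory per executor. The statistics keys (mem_rss_bytes, mem_limit_bytes) are what Mesos agents commonly expose, but verify them against your Mesos version; the agent URL and framework id are placeholders.

import requests

AGENT_URL = "http://IP_of_worker1:5051/slave(1)/monitor/statistics"  # placeholder

def executor_memory(framework_id):
    """Yield (executor_id, used_MB, allocated_MB) for one framework on this agent."""
    for entry in requests.get(AGENT_URL).json():
        if entry.get("framework_id") != framework_id:
            continue
        stats = entry.get("statistics", {})
        used_mb = stats.get("mem_rss_bytes", 0) / (1024 ** 2)
        limit_mb = stats.get("mem_limit_bytes", 0) / (1024 ** 2)
        yield entry.get("executor_id"), used_mb, limit_mb

if __name__ == "__main__":
    for exec_id, used, limit in executor_memory("my-spark-framework-id"):
        print(f"{exec_id}: {used:.0f} / {limit:.0f} MB")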

Related

apache spark executors and data locality

The Spark literature says:
Each application gets its own executor processes, which stay up for
the duration of the whole application and run tasks in multiple
threads.
If I understand this right, with static allocation the executors are acquired by the Spark application on all nodes in the cluster when the SparkContext is created (in cluster mode). I have a couple of questions:
If executors are acquired on all nodes and stay allocated to this application for the whole duration of the application, isn't there a chance that a lot of nodes remain idle?
What is the advantage of acquiring resources when the SparkContext is created and not in the DAGScheduler? I mean, the application could be arbitrarily long and it is just holding the resources.
So when the DAGScheduler tries to get the preferred locations and
the executors in those nodes are running the tasks, would it
relinquish the executors on other nodes?
I have checked a related question
Does Spark on yarn deal with Data locality while launching executors
But I'm not sure there is a conclusive answer
If executors are acquired on all nodes and stay allocated to this application for the whole duration of the application, isn't there a chance that a lot of nodes remain idle?
Yes, there is a chance. If you have data skew, this will happen. The challenge is to tune the number of executors and executor cores so that you get maximum utilization. Spark also provides dynamic resource allocation, which removes idle executors.
What is the advantage of acquiring resources when the SparkContext is created and not in the DAGScheduler? I mean, the application could be arbitrarily long and it is just holding the resources.
Spark tries to keep data in memory while doing transformations, contrary to the MapReduce model, which writes to disk after every map operation. Spark can keep the data in memory only if it can ensure the code is executed on the same machine. This is the reason for allocating resources beforehand.
So when the DAGScheduler tries to get the preferred locations and the executors in those nodes are running the tasks, would it relinquish the executors on other nodes?
Spark can't start a task on an executor unless the executor is free. The Spark application master negotiates with YARN to get the preferred locations. It may or may not get them. If it doesn't, it will start the task on a different executor.

Spark execution memory monitoring [closed]

What I want is to be able to monitor Spark execution memory as opposed to storage memory available in SparkUI. I mean, execution memory NOT executor memory.
By execution memory I mean:
This region is used for buffering intermediate data when performing shuffles, joins, sorts and aggregations. The size of this region is configured through spark.shuffle.memoryFraction (default 0.2).
According to: Unified Memory Management in Spark 1.6
After an intense search for answers I found nothing but unanswered StackOverflow questions, answers that relate only to storage memory, or vague answers of the type "use Ganglia", "use the Cloudera console", etc.
There seems to be a demand for this information on Stack Overflow, and yet not a single satisfactory answer is available. Here are some of the top StackOverflow posts when searching for "monitoring spark memory":
Monitor Spark execution and storage memory utilisation
Monitoring the Memory Usage of Spark Jobs
SPARK: How to monitor the memory consumption on Spark cluster?
Spark - monitor actual used executor memory
How can I monitor memory and CPU usage by spark application?
How to get memory and cpu usage by a Spark application?
Questions
Spark version > 2.0
Is it possible to monitor the execution memory of a Spark job? By monitoring I mean at minimum seeing used/available, just like storage memory per executor in the Executors tab of the Spark UI. Yes or no?
Could I do it with SparkListeners (@JacekLaskowski?)? How about the history server? Or is the only way through external tools? Grafana, Ganglia, others? If external tools, could you please point to a tutorial or provide some more detailed guidelines?
I saw SPARK-9103 Tracking spark's memory usage; it seems it is not yet possible to monitor execution memory. SPARK-23206 Additional Memory Tuning Metrics also seems relevant.
Is Peak Execution Memory a reliable estimate of the usage/occupation of execution memory in a task? If, for example, the stage UI says that a task uses 1 GB at peak, and I have 5 CPUs per executor, does that mean I need at least 5 GB of execution memory available on each executor to finish a stage?
Are there some other proxies we could use to get a glimpse of execution memory?
Is there a way to know when execution memory starts to eat into storage memory? When my cached table disappears from the Storage tab in the Spark UI, or only part of it remains, does that mean it was evicted by execution memory?
Answering my own question for future reference:
We are using Mesos as the cluster manager. In the Mesos UI I found a page that lists all executors on a given worker, and there one can find the memory usage of each executor. It seems to be the total memory usage, storage + execution. I can clearly see that when the memory fills up, the executor dies.
To access:
Go to Agents tab which lists all cluster workers
Choose worker
Choose Framework - the one with the name of your script
Inside you will have a list of executors for your job running on this particular worker.
For memory usage see: Mem (Used / Allocated)
The same can be done for the driver. As the framework, you choose the one named Spark Cluster.
If you want to know how to extract this number programmatically, see my response to this question: How to get Mesos Agents Framework Executor Memory
I enabled Spark's internal metrics for executors and could get information about JVMHeapMemory, jvm.heap.usage, OnHeapExecutionMemory, OnHeapStorageMemory and OnHeapUnifiedMemory for my research. Please refer to the docs (https://spark.apache.org/docs/3.0.0-preview/monitoring.html) for more information.
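For reference, those executor metrics can also be read over Spark's application REST API. The sketch below assumes a Spark 3.0+ application whose UI is reachable on port 4040 and whose executor entries expose a peakMemoryMetrics map, as described in the monitoring docs linked above; the host, port and exact availability of the fields are assumptions to adapt to your deployment.

import requests

UI = "http://driver-host:4040"  # placeholder: the running application's UI

def peak_executor_memory():
    """Yield (executor_id, peak memory metrics) from the application REST API."""
    app_id = requests.get(f"{UI}/api/v1/applications").json()[0]["id"]
    executors = requests.get(f"{UI}/api/v1/applications/{app_id}/executors").json()
    for e in executors:
        peaks = e.get("peakMemoryMetrics", {})  # may be absent until metrics are reported
        yield e["id"], {k: peaks.get(k) for k in (
            "JVMHeapMemory", "OnHeapExecutionMemory",
            "OnHeapStorageMemory", "OnHeapUnifiedMemory")}

if __name__ == "__main__":
    for exec_id, peaks in peak_executor_memory():
        print(exec_id, peaks)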

What is and how to control Memory Storage in Executors tab in web UI?

I use Spark 1.5.2 for a Spark Streaming application.
What is this Storage Memory in the Executors tab in the web UI? How did it reach 530 MB? How do I change that value?
CAUTION: You are using the very, very old and currently unsupported Spark 1.5.2 (which I noticed after I had posted the answer), while my answer is about Spark 1.6+.
The tooltip of Storage Memory may say it all:
Memory used / total available memory for storage of data like RDD partitions cached in memory.
It is part of the Unified Memory Management feature that was introduced in SPARK-10000: Consolidate storage and execution memory management, which says (quoting verbatim):
Memory management in Spark is currently broken down into two disjoint regions: one for execution and one for storage. The sizes of these regions are statically configured and fixed for the duration of the application.
There are several limitations to this approach. It requires user expertise to avoid unnecessary spilling, and there are no sensible defaults that will work for all workloads. As a Spark user, I want Spark to manage the memory more intelligently so I do not need to worry about how to statically partition the execution (shuffle) memory fraction and cache memory fraction. More importantly, applications that do not use caching use only a small fraction of the heap space, resulting in suboptimal performance.
Instead, we should unify these two regions and let one borrow from another if possible.
Spark Properties
You can control the storage memory using the spark.driver.memory and spark.executor.memory Spark properties, which set up the entire memory space for a Spark application (the driver and executors), with the split between regions controlled by spark.memory.fraction and spark.memory.storageFraction.
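As an illustration only (the values are arbitrary, not recommendations), these properties could be set like this in PySpark; note that spark.driver.memory normally has to be supplied at submit time (e.g. via spark-submit --conf), since the driver JVM is already running by the time the session builder executes.

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("memory-settings-demo")
    .config("spark.executor.memory", "4g")          # heap per executor
    .config("spark.driver.memory", "2g")            # driver heap; usually set via spark-submit
    .config("spark.memory.fraction", "0.6")         # share of heap for execution + storage
    .config("spark.memory.storageFraction", "0.5")  # part of that share protected for storage
    .getOrCreate()
)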
You should consider watching the slides Memory Management in Apache Spark by the author Andrew Or and the video Deep Dive: Apache Spark Memory Management by the author himself (again).
You may want to read how the Storage Memory values (in web UI and internally) are calculated in How does web UI calculate Storage Memory (in Executors tab)?

Limit Spark application from grabbing all the resources in a YARN cluster

We (an engineering team) are running an EMR cluster with YARN and Spark. What typically happens is that when one user submits a memory-intensive job, it grabs all the available YARN memory, and then all subsequently submitted jobs have to wait for that memory to clear (I know that autoscaling will solve this problem to a certain extent and we are looking into it, but we would like to avoid a single user occupying all the memory even when the cluster is autoscaled to its full limits).
Is there a way to configure YARN such that no application (Spark or otherwise) may occupy more than, say, 75% of the available memory?
Thanks
According to the documentation, you can manage the amount of memory allocated to an executor using the parameter: spark.executor.memory
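Building on that, here is a sketch of bounding a single application's footprint from the Spark side: spark.executor.memory limits each executor and spark.dynamicAllocation.maxExecutors caps how many executors the job can hold at once. The numbers are placeholders (and dynamic allocation on YARN also needs the external shuffle service enabled); cluster-wide, per-queue limits would instead be configured in the YARN scheduler.

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("bounded-job")
    .config("spark.executor.memory", "4g")                 # memory per executor
    .config("spark.dynamicAllocation.enabled", "true")
    .config("spark.dynamicAllocation.maxExecutors", "10")  # cap on concurrent executors
    .getOrCreate()
)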

Spark executors with different amounts of memory on Mesos

Is it possible to have executors with different amounts of memory on a Mesos cluster? Or am I bounded by the machine with the least memory? (Assuming I want to use all available cpus).
Short answer: No.
Unfortunately, Spark on Mesos and YARN only allows giving as many resources (cores, memory, etc.) per machine as your worst machine has (discussion). Ideally, the cluster should be homogeneous in order to take full advantage of its resources.
However, there might be a workaround for your problem. According to the linked source above, Spark standalone allows creating multiple workers on some machines. You could set your worker configuration to match the worst machine and start multiple such workers on the bigger ones.
For example, given two machines with 4G and 20G of memory respectively, you could create 5 workers on the latter, each configured to use just 4G of memory, as limited by the first machine.
