I am using Ambari to monitor my Spark cluster, and I'm a little confused by all the memory categories. Can somebody with expertise explain what these terms mean? Thanks in advance!
Here is a screenshot of the Ambari Memory Usage chart, zoomed out:
Basically, what do Swap, Share, Cache and Buffer memory usage stand for? (I think I understand Total well.)
There is nothing specific to Spark or Ambari here. These are basic Linux / Unix memory management terms:
In short:
Swap is the part of memory that has been written out to disk. See Wikipedia and What is swap memory?.
Buffer and cache are used for caching filesystem metadata and file data. See What is the difference between buffer vs cache memory in Linux? and Overview of memory management.
Shared memory is memory that can be mapped by more than one process at once, for example shared memory segments and tmpfs.
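To see where such numbers come from on a Linux host, here is a minimal sketch that parses /proc/meminfo-style text. The mapping of the chart's categories onto the `Buffers`, `Cached`, `Shmem` and `SwapTotal`/`SwapFree` fields is my assumption, not something taken from the Ambari documentation, and the `SAMPLE` values are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_categories(fields):
    """Map meminfo fields onto the chart's categories (assumed mapping, in kB)."""
    return {
        "total": fields.get("MemTotal", 0),
        "swap_used": fields.get("SwapTotal", 0) - fields.get("SwapFree", 0),
        "buffer": fields.get("Buffers", 0),
        "cache": fields.get("Cached", 0),
        "share": fields.get("Shmem", 0),
    }


# Made-up sample so the sketch also runs on machines without /proc/meminfo.
SAMPLE = """\
MemTotal:       16384000 kB
MemFree:         1024000 kB
Buffers:          204800 kB
Cached:          8192000 kB
SwapTotal:       2048000 kB
SwapFree:        2000000 kB
Shmem:             51200 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            text = fh.read()
    except OSError:
        text = SAMPLE
    for name, kb in memory_categories(parse_meminfo(text)).items():
        print(f"{name:>10}: {kb / 1024:.0f} MiB")
```

Buffer and cache memory is reclaimable, so a large cache value by itself does not mean the machine is short of memory.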
I've set up a 3-machine VoltDB cluster with more or less default settings. However, there seems to be a constant problem with VoltDB eating up all of the heap and not freeing it. The heap size is set to the recommended 2 GB.
Things that I think might be bad in my setup:
I've set 1 min async snapshots
Most of my queries are AdHoc
Even though this might not be ideal, I don't think it should lead to a problem where memory doesn't get freed.
I've set up my machines according to 2.3. Configure Memory Management.
On this image you can see sudden drops in memory usage. These are server shutdowns.
Heap filling warnings
DB Monitor, current state of leader server
I would also like to note that this server is not heavily loaded.
Sadly, I couldn't find anyone with a similar problem. Most of the advice was aimed at optimizing memory use or decreasing the amount of memory allocated to VoltDB. No one seems to have reported anything that looks like this memory leak.
What I want is to be able to monitor Spark execution memory as opposed to storage memory available in SparkUI. I mean, execution memory NOT executor memory.
By execution memory I mean:
This region is used for buffering intermediate data when performing shuffles, joins, sorts and aggregations. The size of this region is configured through spark.shuffle.memoryFraction (default 0.2).
According to: Unified Memory Management in Spark 1.6
After an intense search for answers I found nothing but unanswered Stack Overflow questions, answers that relate only to storage memory, or vague answers of the type "use Ganglia", "use the Cloudera console", etc.
There seems to be a demand for this information on Stack Overflow, and yet not a single satisfactory answer is available. Here are some of the top Stack Overflow posts when searching for "monitoring spark memory":
Monitor Spark execution and storage memory utilisation
Monitoring the Memory Usage of Spark Jobs
SPARK: How to monitor the memory consumption on Spark cluster?
Spark - monitor actual used executor memory
How can I monitor memory and CPU usage by spark application?
How to get memory and cpu usage by a Spark application?
Questions
Spark version > 2.0
Is it possible to monitor the execution memory of a Spark job? By monitoring I mean, at minimum, seeing used/available just like for storage memory per executor in the Executors tab of the Spark UI. Yes or no?
Could I do it with SparkListeners (#JacekLaskowski ?) How about the history server? Or is the only way through external tools such as Grafana, Ganglia or others? If external tools, could you please point to a tutorial or provide some more detailed guidelines?
I saw SPARK-9103 Tracking spark's memory usage, and it seems that it is not yet possible to monitor execution memory. SPARK-23206 Additional Memory Tuning Metrics also seems relevant.
Is Peak Execution Memory a reliable estimate of the usage/occupation of execution memory in a task? If, for example, the Stage UI says that a task uses 1 GB at peak and I have 5 CPUs per executor, does it mean I need at least 5 GB of execution memory available on each executor to finish a stage?
Are there some other proxies we could use to get a glimpse of execution memory?
Is there a way to know when the execution memory starts to eat into storage memory? When my cached table disappears from Storage tab in SparkUI or only part of it remains, does it mean it was evicted by the execution memory?
Answering my own question for future reference:
We are using Mesos as cluster manager. In the Mesos UI I found a page that lists all executors on a given worker and there one can find a Memory usage of the executor. It seems to be a total memory usage storage+execution. I can clearly see that when the memory fills up the executor dies.
To access:
Go to Agents tab which lists all cluster workers
Choose worker
Choose Framework - the one with the name of your script
Inside you will have a list of executors for your job running on this particular worker.
For memory usage see: Mem (Used / Allocated)
The same can be done for the driver. For the framework, choose the one named Spark Cluster.
If you want to know how to extract this number programmatically, see my response to this question: How to get Mesos Agents Framework Executor Memory
I enabled Spark's internal metrics for executors, and I can get information about JVMHeapMemory, jvm.heap.usage, OnHeapExecutionMemory, OnHeapStorageMemory and OnHeapUnifiedMemory for my research. Please refer to the docs (https://spark.apache.org/docs/3.0.0-preview/monitoring.html) for more information.
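As a rough illustration of reading those values, here is a sketch that summarizes the JSON returned by the REST endpoint `/api/v1/applications/[app-id]/executors` described on the monitoring page. I am assuming the response carries a `peakMemoryMetrics` object per executor (Spark 3.0+) keyed by the metric names above; the helper name and the sample payload are made up for illustration, so check the field names against your Spark version.

```python
import json


def execution_memory_by_executor(executors):
    """Collect peak memory metrics per executor from a parsed REST response.

    `executors` is the parsed JSON list from the executors endpoint.
    Executors without `peakMemoryMetrics` yield None values.
    """
    rows = {}
    for ex in executors:
        peaks = ex.get("peakMemoryMetrics") or {}
        rows[ex.get("id", "?")] = {
            "peak_on_heap_execution": peaks.get("OnHeapExecutionMemory"),
            "peak_on_heap_storage": peaks.get("OnHeapStorageMemory"),
            "peak_jvm_heap": peaks.get("JVMHeapMemory"),
        }
    return rows


# Made-up payload in the assumed shape; real responses contain many more fields.
SAMPLE_RESPONSE = json.loads("""
[
  {"id": "driver"},
  {"id": "1",
   "peakMemoryMetrics": {"JVMHeapMemory": 900000000,
                         "OnHeapExecutionMemory": 250000000,
                         "OnHeapStorageMemory": 120000000}}
]
""")

if __name__ == "__main__":
    for executor_id, row in execution_memory_by_executor(SAMPLE_RESPONSE).items():
        print(executor_id, row)
```

In a live application the list would come from an HTTP GET against the driver UI (port 4040 by default) or the history server rather than from `SAMPLE_RESPONSE`.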
I use Spark 1.5.2 for a Spark Streaming application.
What is this Storage Memory in Executors tab in web UI? How was this to reach 530 MB? How to change that value?
CAUTION: You use the very, very old and currently unsupported Spark 1.5.2 (which I noticed after I had posted the answer) and my answer is about Spark 1.6+.
The tooltip of Storage Memory may say it all:
Memory used / total available memory for storage of data like RDD partitions cached in memory.
It is part of Unified Memory Management feature that was introduced in SPARK-10000: Consolidate storage and execution memory management that (quoting verbatim):
Memory management in Spark is currently broken down into two disjoint regions: one for execution and one for storage. The sizes of these regions are statically configured and fixed for the duration of the application.
There are several limitations to this approach. It requires user expertise to avoid unnecessary spilling, and there are no sensible defaults that will work for all workloads. As a Spark user, I want Spark to manage the memory more intelligently so I do not need to worry about how to statically partition the execution (shuffle) memory fraction and cache memory fraction. More importantly, applications that do not use caching use only a small fraction of the heap space, resulting in suboptimal performance.
Instead, we should unify these two regions and let one borrow from another if possible.
Spark Properties
You can control the storage memory using spark.driver.memory or spark.executor.memory Spark properties that set up the entire memory space for a Spark application (the driver and executors) with the split between regions controlled by spark.memory.fraction and spark.memory.storageFraction.
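To make the split concrete, here is a minimal sketch of the arithmetic as I understand it: a fixed 300 MB is reserved, the remainder is multiplied by spark.memory.fraction to get the unified region, and spark.memory.storageFraction marks the part of that region protected for storage. The defaults used below (0.6 and 0.5) are the Spark 2.x values as far as I recall (1.6 used 0.75 for spark.memory.fraction), and the function name is hypothetical.

```python
MIB = 1024 * 1024
RESERVED_BYTES = 300 * MIB  # fixed reservation before the regions are split


def unified_memory_regions(heap_bytes, memory_fraction=0.6, storage_fraction=0.5):
    """Estimate the unified, storage and execution region sizes in bytes.

    heap_bytes is the heap the JVM reports, which is usually a little
    below the configured spark.executor.memory.
    """
    usable = max(heap_bytes - RESERVED_BYTES, 0)
    unified = usable * memory_fraction
    storage = unified * storage_fraction
    return {"unified": unified, "storage": storage, "execution": unified - storage}


if __name__ == "__main__":
    for name, size in unified_memory_regions(4096 * MIB).items():
        print(f"{name:>9}: {size / MIB:.1f} MiB")
```

The boundary between storage and execution is soft: either side can borrow unused memory from the other, so these figures are starting points rather than hard limits.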
You should consider watching the slides Memory Management in Apache Spark by the author Andrew Or and the video Deep Dive: Apache Spark Memory Management by the author himself (again).
You may want to read how the Storage Memory values (in web UI and internally) are calculated in How does web UI calculate Storage Memory (in Executors tab)?
The following is the screenshot of htop on my dev server [arranged by MEM% used]:
I have only one Cassandra instance running, but there are many Cassandra processes in htop, which are taking up 16 GB of RAM.
The server is not being used in production, hence there are no queries being run on it at the moment.
I don't understand the reason why so many cassandra processes are running on my system, and how can I control this. Any suggestions will be highly appreciated.
Cassandra is a greedy process; it won't release RAM unless asked to.
You do not need to worry about the used RAM. If another process requests RAM, the Cassandra process will release it.
Cassandra can typically take up to 16 GB of RAM, which is the minimum production recommendation from a performance point of view. Besides Cassandra itself, a number of other consumers, such as the JVM heap here, also get memory allocated. And as mentioned above, it is a memory-intensive technology.
I deployed a multi-node cluster with Apache Cassandra 2.0.13 on CentOS 7.0. I am using a heap size of 8G and a new-generation heap size of 2048M. The system shows 17 GB of memory used as cache.
How can I limit Cassandra's usage of virtual memory?
Virtual memory use is generally not a problem. It is not to be confused with actual RAM usage. You can find a good description about virtual memory here. Please further elaborate if you still think the shown virtual memory value could be a problem.
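To compare the two figures for a given process on Linux, you can read the `VmSize` (virtual) and `VmRSS` (resident) fields of /proc/[pid]/status. The sketch below parses that format; the helper name and the sample numbers are made up for illustration.

```python
def vm_sizes_kb(status_text):
    """Extract virtual and resident sizes (kB) from /proc/[pid]/status text."""
    wanted = {"VmSize": "virtual_kb", "VmRSS": "resident_kb"}
    out = {}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            out[wanted[key]] = int(rest.split()[0])
    return out


# Made-up sample: a large virtual size next to a much smaller resident size.
SAMPLE_STATUS = """\
Name:\tjava
VmSize:\t45000000 kB
VmRSS:\t 9500000 kB
Threads:\t120
"""

if __name__ == "__main__":
    try:
        with open("/proc/self/status") as fh:
            text = fh.read()
    except OSError:
        text = SAMPLE_STATUS
    print(vm_sizes_kb(text))
```

A virtual size far above the resident size is common for JVM processes that memory-map large files, and it is the resident figure that corresponds to RAM actually in use.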