VoltDB cluster eating all RAM

I've set up a 3-machine VoltDB cluster with more or less default settings. However, there seems to be a constant problem with VoltDB eating up all of the RAM heap and not freeing it. The heap size is the recommended 2 GB.
Things that I think might be bad in my setup:
I've set 1-minute async snapshots
Most of my queries are AdHoc
Even though that might not be ideal, I don't think it should lead to a problem where memory never gets freed.
I've set up my machines according to section 2.3, Configure Memory Management.
In the graph below you can see sudden drops in memory usage; these are server shutdowns.
[Screenshot: heap filling warnings]
[Screenshot: DB Monitor, current state of the leader server]
I would also like to note that this server is not heavily loaded.
Sadly, I couldn't find anyone with a similar problem. Most of the advice was aimed at optimizing memory use or decreasing the amount of memory allocated to VoltDB; no one seems to have this memory-leak lookalike.
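In case it helps anyone else debugging this, a first step I would try (my own sketch, not something from the post) is asking the cluster where the memory actually sits, via the @Statistics system procedure from VoltDB's sqlcmd tool:

# run on any node that has the VoltDB tools on the PATH
echo "exec @Statistics MEMORY 0;" | sqlcmd

If the Java-heap figures keep growing while the tuple/index/string storage stays flat, the pressure is coming from heap-side work such as AdHoc query planning and snapshot buffers rather than from the stored data itself.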

Related

Getting Redis memory peak issue

I've deployed the project on AWS and we are using local Redis. We have around 150+ Redis keys holding a lot of data. We have 16 GB RAM on the EC2 instance, and in the Redis config we defined maxmemory 10 GB.
But we are getting the below error.
--> redis-cli
--> memory doctor
Sam, I detected a few issues in this Redis instance memory implants:
Peak memory: In the past this instance used more than 150% the memory that is currently using. The allocator is normally not able to release memory after a peak, so you can expect to see a big fragmentation ratio, however this is actually harmless and is only due to the memory peak, and if the Redis instance Resident Set Size (RSS) is currently bigger than expected, the memory will be used as soon as you fill the Redis instance with more data. If the memory peak was only occasional and you want to try to reclaim memory, please try the MEMORY PURGE command, otherwise the only other option is to shutdown and restart the instance.
I'm here to keep you safe, Sam. I want to help you.
It would be great if anyone could help us resolve this as soon as possible.
Please let us know.
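Not from the original post, but a sketch of the two things the doctor output points at: compare the logical usage against the RSS and past peak, then ask the allocator to hand unused pages back:

redis-cli info memory | grep -E 'used_memory_human|used_memory_rss_human|used_memory_peak_human|mem_fragmentation_ratio'
redis-cli memory purge    # asks jemalloc to release unused pages back to the OS

If the fragmentation ratio stays high after MEMORY PURGE, the remaining option is the restart that the doctor output mentions.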

Is infinispan cache usable as a cluster cache of small resource nodes?

Suppose I have a lot of nodes, maybe 5 or maybe 20, with small memory and CPU resources.
These nodes are not really reliable; they may be switched off by the user.
They all use a database for read-only master data, which is delivered by a Kafka topic that each node connects to.
What I want to achieve is to use Infinispan as a distributed [replicated] cache on top of the database used by the nodes, so that any node at any point in time has the same "view" of the read-only database.
Can I get this working, especially with low resources, and if yes, is there any link to an example to gain experience from?
Thanks
I don't think you can get a definite answer here; you need to try it out. I wouldn't call 5 - 20 CPUs small resources; there's not much going on in the background when you're not actively reading/writing the cache, so there shouldn't be any 'constant' overhead - just JGroups' heartbeat messages and such.
When using off-heap memory, Infinispan can be started with pretty small JVM heaps (24 MB IIRC, just for the POC), so you might be fine. However, if you replicate the database on every node, it's going to occupy some memory on each of them.
If the nodes often come and go, it could cause some churn on CPU. In replicated mode node departures won't matter too much, but when a node joins it will receive all the data (from different nodes).

Ambari dashboard memory usage explanation for spark cluster

I am using Ambari to monitor my Spark cluster, and I'm a little confused by all the memory categories. Can somebody with expertise explain what these terms mean? Thanks in advance!
Here is a screenshot of the Ambari Memory Usage zoomed out:
Basically, what do Swap, Share, Cache, and Buffer memory usage stand for? (I think I understand Total well.)
There is nothing specific to Spark or Ambari here. These are basic Linux / Unix memory management terms:
In short:
Swap is a part of memory written to disk. See Wikipedia and "What is swap memory?".
Buffer and cache are used for caching filesystem metadata and file data. See "What is the difference between buffer vs cache memory in Linux?" and "Overview of memory management".
Shared memory is a part of virtual memory used for shared libraries and memory shared between processes.
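As a quick illustration (my own addition, not part of the original answer), the same categories appear in the output of free on any of the hosts:

# buff/cache = page cache plus buffers, reclaimed automatically when applications need memory
# shared     = tmpfs and other memory shared between processes
# available  = estimate of how much memory new work can use without swapping
free -h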

GPDB: Out of memory at segment

We are facing an OOM error when trying to execute multiple SQL query sessions via a scheduled job.
Detailed error:
The error message is: org.postgresql.util.PSQLException:ERROR: Out of memory (seg6 slice5 sungpmsh0:40002 pid=13610)
Detail: VM protect failed to allocate 65584 bytes from system, VM Protect 5835 MB available
What we tried:
After reading the Pivotal support docs, we did basic troubleshooting here and validated two memory parameters.
Current settings in GPDB:
GPDB vm protect limit: 8 GB
GPDB statement_mem: based on the vm protect limit; as per the reading, it governs the memory for running a query on a segment.
Test 2: tuned the SQL queries. Also, what should I tune here? Please guide.
Based on these sources:
https://discuss.pivotal.io/hc/en-us/articles/201947018-Pivotal-Greenplum-GPDB-Memory-Configuration
https://discuss.pivotal.io/hc/en-us/articles/204268778-What-are-VM-Protect-failed-to-allocate-d-bytes-d-MB-available-error-
But still getting the same OOM error.
Do we need to increase the vm protect limit? If yes, by how much should we increase it?
How do we handle concurrency in GPDB?
How much swap do we need to add when we are already running with 30 GB RAM?
Currently we have added 15 GB of swap; is that OK?
What is the query to identify host connections to the Greenplum database?
Thanks in advance
Do we need to increase the vm protect limit? If yes, by how much should we increase it?
There is a nice calculator on setting gp_vmem_protect_limit on Greenplum.org. The setting depends on how much memory, swap, and segments per host you have.
http://greenplum.org/calc/
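For reference, the formula behind that calculator (worth double-checking against the current docs; the segment count below is my assumption, not from the post) plugged with the 30 GB RAM / 15 GB swap from the question:

# gp_vmem = ((SWAP + RAM) - (7.5 GB + 0.05 * RAM)) / 1.7           (hosts with <= 256 GB RAM)
# gp_vmem_protect_limit = gp_vmem / max_acting_primary_segments    (value is set in MB)
#
# Example: 30 GB RAM, 15 GB swap, assumed 8 primary segments per host:
#   gp_vmem = ((15 + 30) - (7.5 + 0.05 * 30)) / 1.7  ~= 21 GB
#   per segment: 21 GB / 8                           ~= 2700 MB
gpconfig -s gp_vmem_protect_limit             # show the current value on all segments
gpconfig -c gp_vmem_protect_limit -v 2700     # set it in MB, then restart the cluster (gpstop -r)

With several primaries per host, an 8 GB per-segment limit on a 30 GB box can easily over-commit memory, which lines up with the "Not enough resources (RAM)" reason below.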
You can get OOM errors for several reasons:
Bad query
Bad table distribution (skew)
Bad settings (like gp_vmem_protect_limit)
Not enough resources (RAM)
How do we handle concurrency in GPDB?
More RAM, fewer segments per host, and workload management to limit the number of concurrent queries running.
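As a concrete illustration of the workload-management point (a sketch only; the queue name, limits, and role are made up), Greenplum resource queues can cap how many statements run at once and how much memory a queue may use:

psql -c "CREATE RESOURCE QUEUE batch_q WITH (ACTIVE_STATEMENTS=5, MEMORY_LIMIT='4000MB');"
psql -c "ALTER ROLE etl_user RESOURCE QUEUE batch_q;"

Queries from roles assigned to the queue then wait their turn instead of piling up and pushing a segment past its vmem limit.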
How much swap do we need to add when we are already running with 30 GB RAM? Currently we have added 15 GB of swap; is that OK?
Only 30GB of RAM? That is pretty small. You can add more swap but it will slow down the queries compared to real RAM. I wouldn't use much more than 8GB of swap.
I recommend using 256GB of RAM or more especially if you are worried about concurrency.
What is the query to identify host connections to the Greenplum database?
select * from pg_stat_activity;
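If you want that broken down by client host (my own variation on the same view, with a placeholder database name), something like:

psql -d yourdb -c "select client_addr, usename, count(*) from pg_stat_activity group by 1, 2 order by 3 desc;"

shows how many sessions each host and user currently holds.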

Cassandra Performance Tuning

I have successfully installed a multi-node Cassandra cluster with 10 nodes.
The nodetool status command shows every node is UP and NORMAL,
but the performance I am getting is very bad.
Here are my results:
Operations/second = 4000
Read latency = 13 ms
Write latency = 10 ms
I am using YCSB to measure performance
Tuning that I have done till now:
Consistency level = 1
Replication Factor = 3
Heap size = 4GB
My hardware:
Each node is a VM with CentOS
2 GHz CPU with 8 cores
8 GB RAM
1 Gbps network
Please let me know what more settings I can tweak to get maximum performance out of my cluster.
If you have 1 system with 10 VMs running on it and 1 disk, the performance of any (not in-memory) database will be bad. Spinning disks especially (no matter how expensive they are) are going to be a major contention point. With a really good SSD you may be able to pull off a few instances, but performance stress testing will likely always hit either that or a CPU bottleneck (if things are configured correctly for the system).
There's a pretty good chance that with 4 GB heaps and a stress workload you are going to hit GC and memory issues; do you have any monitoring around that? You can use VisualVM and connect to ip:7199 (the IP set in cassandra-env.sh).
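If attaching VisualVM remotely is awkward, nodetool (which talks to the same JMX port) gives a quick first look; a sketch, run on any node:

nodetool info      # per-node heap and off-heap memory usage
nodetool gcstats   # GC counts and pause times since the last time gcstats was run
nodetool tpstats   # blocked or dropped tasks often point at memory or disk pressure

Long or frequent GC pauses with a 4 GB heap under the YCSB load would support the GC theory above.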
8 GB of RAM per VM is on the minimum spec end. You want at least 8 GB of JVM heap, with space for the off-heap structures and the OS; a 16 GB system is likely sufficient. Once again, the shared disk will kill performance, so it will only go so far, but you should be able to do far better than 4k ops/sec.
