JVM memory settings in docker container in AWS beanstalk - linux

I run my Java application in a Docker container. I use AWS Elastic Beanstalk. The Docker base image is CentOS. I run the container on an EC2 instance with 4 GB of RAM, on the Amazon Linux AMI for Beanstalk.
How should I configure the container and JVM memory settings?
Right now I have:
4 GB on the Amazon Linux Beanstalk AMI EC2 instance
3 GB of the 4 dedicated to the Docker container
{
  "AWSEBDockerrunVersion": 2,
  "Authentication": {
    "Bucket": "elasticbeanstalk-us-east-1-XXXXXX",
    "Key": "dockercfg"
  },
  "containerDefinitions": [
    {
      "name": "my-service",
      "image": "docker-registry:/myapp1.0.2",
      "essential": true,
      "memory": 3184,
      "portMappings": [
        {
          "hostPort": 80,
          "containerPort": 8080
        }
      ]
    }
  ]
}
JVM settings are
-Xms2560m
-Xmx2560m
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=70
-XX:+ScavengeBeforeFullGC
-XX:+CMSScavengeBeforeRemark
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=./heapdump.hprof
Can I dedicate the whole EC2 instance memory (4 GB) to the Docker container and set the JVM to 4 GB as well?
I think the JVM could crash when it is not able to allocate whatever the -Xmx parameter specifies. If so, what are the optimal values I could use for the container and the JVM?
Right now I leave a 1 GB margin for Amazon Linux itself and 512 MB for CentOS running in the container, so 1.5 GB is wasted. Can it be done better?

The real answer is very dependent on your Java app and usage.
Start with a JVM heap of 3.5 GB.
Set the container max to 3.8 GB to cover Java's overhead.
Load test, repeatedly.
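As a rough sketch of that starting point, in the style of the settings above (the numbers are only a baseline to validate with load tests, and the metaspace cap is an illustrative addition, not something the question used):
"memory": 3891 (container definition in Dockerrun.aws.json, roughly 3.8 GB)
-Xms3584m
-Xmx3584m (initial and max heap both at 3.5 GB, so the heap is sized up front)
-XX:MaxMetaspaceSize=256m (optional cap so metaspace growth cannot silently eat the remaining headroom)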
Java
-Xmx and -Xms only control Java's heap size, so you can't allocate all your available memory to Java's heap and expect Java and the rest of the system to run well (or at all).
You will also need to cater for:
Stack: -Xss * number of threads (-Xss defaults to 1 MB on 64-bit)
PermGen/Metaspace: -XX:MaxPermSize defaults to 64 MB; Metaspace starts at about 21 MB but can grow.
Your app could also use a lot of shared memory, do JNI work outside the heap, rely on large mmapped files for performance, or exec external binaries, among other oddities outside the heap. Do some load testing at your chosen memory levels, then above and below those levels, to make sure you're not hitting (or getting close to) any memory issues that affect performance.
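To see where that non-heap memory actually goes, HotSpot's Native Memory Tracking can be enabled and inspected with jcmd (a sketch; the jar name is a placeholder and NMT adds a small overhead):
java -Xms2560m -Xmx2560m -XX:NativeMemoryTracking=summary -jar myapp.jar
jcmd <pid> VM.native_memory summary (breaks usage down into heap, thread stacks, metaspace, code cache, etc.)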
Container
CentOS doesn't "run" in the container as such; Docker just provides a filesystem image for your JVM process to reference. Normally you wouldn't run anything more than the Java process in that container.
There's not a huge benefit to limiting a container's max memory when you are on a dedicated host with only one container, but it doesn't hurt either, in case something in native Java runs away with the available memory.
The memory overhead here needs to cater for the possible native Java usage mentioned above and the extra system files for the JRE that will be loaded from the container image (rather than the host) and cached. You also won't get a nice stack trace or heap dump when you hit a container memory limit.
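Outside Beanstalk you can reproduce the same limit for local load testing with a plain docker run (a sketch reusing the image and ports from the question; 3800m matches the suggested ~3.8 GB):
docker run -m 3800m -p 80:8080 --name my-service docker-registry:/myapp1.0.2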
OS
A plain Amazon AMI only needs about 50-60 MB of memory to run, plus some spare for the file cache, so say 256 MB to cover the OS with some leeway.
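You can sanity-check that figure on the instance itself (standard commands; actual numbers vary per AMI and the agents running on it):
free -m (look at the used column, excluding buff/cache)
ps aux --sort=-rss | head (largest resident processes on the host)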

Related

Node process gets killed when the memory cgroup reports OOM on instances with high RAM and many CPU cores, but works on small instances

When running a job as a pipeline in GitLab Runner's K8s pod, the job completes successfully only on a small instance like m5*.large, which offers 2 vCPUs and 8 GB of RAM. We set limits for the build, helper, and services containers as shown below. Still, the job fails with an Out Of Memory (OOM) error, with the node process killed by the cgroup, when running on a much more powerful instance, for example m5d*.2xlarge, which offers 8 vCPUs and 32 GB of RAM.
Note that we tried to dedicate high resources to the containers, especially the build container in which the node process runs as a child process, and nothing changed on the powerful instances; the node process still got killed because of OOM. Each time we gave it more memory, the node process consumed more memory, and so on.
Also, regarding CPU usage: on the powerful instances, the more vCPUs we gave it, the more it consumed, and we noticed CPU throttling at ~100% almost all the time; on the small instances like m5*.large, however, CPU throttling didn't pass 3%.
Note that we specified a maximum amount of memory to be used by the node process, but it does not seem to take any effect. We tried setting it to 1 GB, 1.5 GB, and 3 GB.
NODE_OPTIONS: "--max-old-space-size=1536"
Node Version
v16.19.0
Platform
amzn2.x86_64
Logs of the host where the job runs
"message": "oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=....
....
"message": "Memory cgroup out of memory: Killed process 16828 (node) total-vm:1667604kB
resources request/limits configuration
memory_request = "1Gi"
memory_limit = "4Gi"
service_cpu_request = "100m"
service_cpu_limit = "500m"
service_memory_request = "250Mi"
service_memory_limit = "2Gi"
helper_cpu_request = "100m"
helper_cpu_limit = "250m"
helper_memory_request = "250Mi"
helper_memory_limit = "1Gi"
(Screenshot: resource consumption of a successful job running on m5d.large)
(Screenshot: resource consumption of a failing job running on m5d.2xlarge)
When a process in the container tries to consume more than the allowed amount of memory, the system kernel terminates the process that attempted the allocation, with an out of memory (OOM) error.
Check whether you have enabled persistent journaling in your container(s).
One way: mkdir /var/log/journal && systemctl restart systemd-journald
Another way: see systemd/man/journald.conf.html
If not, and your container uses systemd, it will log to memory with limits derived from the host RAM, which can lead to unexpected OOM situations.
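A sketch of the journald.conf route (assuming the container image runs systemd; the size cap is an illustrative value):
In /etc/systemd/journald.conf:
[Journal]
Storage=persistent
SystemMaxUse=200M
Then restart: systemctl restart systemd-journald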
Also, if possible, you can increase the amount of RAM (clamav does use quite a bit).
If the node experiences an out of memory (OOM) event prior to the kubelet being able to reclaim memory, the node depends on the oom_killer to respond.
Node out of memory behavior is well described in Kubernetes best practices: Resource requests and limits. Adjust memory requests (minimal threshold) and memory limits (maximal threshold) in your containers.
Pods crash and the OS syslog shows the OOM killer killing the container process; see Pod memory limit and cgroup memory settings. Kubernetes manages the Pod memory limit with cgroups and the OOM killer. We need to be careful to separate OS-level OOM from pod-level OOM.
Try to use the --oom-score-adj option to docker run or even --oom-kill-disable. Refer to Runtime constraints on resources for more info.
Also refer to the similar SO question for more related information.
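A sketch of those runtime flags on a plain docker run (the image name and values are placeholders; --oom-kill-disable should only be used together with a -m limit):
docker run -m 2g --oom-score-adj=-500 my-node-image (make this container a less likely OOM-kill target)
docker run -m 2g --oom-kill-disable my-node-image (never OOM-kill this container; processes block instead when the limit is hit)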

How to make wine use more memory in docker on k8s cluster?

I'm using k8s v1.16, which is ICP (IBM container platform).
I want to run some .exe files in the container.
So I use wineserver to run Windows-based .exe files.
But there is a problem with memory usage.
Though I allocated 32 GB of memory to the pod where the wineserver container runs, the container does not use more than 3 GB of memory.
What should I do to make the wine container uses memory more than 3GB?

Is there a way to set the available resources of a docker container system using the docker container limit?

I am currently working on a Kubernetes cluster, which uses docker.
This cluster allows me to launch jobs. For each job, I specify a memory request and a memory limit.
The memory limit is used by Kubernetes to fill the --memory option of the docker run command when creating the container. If the container exceeds this limit, it is killed for OOM reasons.
Now, if I go inside a container, I am a little bit surprised to see that the available system memory reported is not the one from the --memory option, but the one of the Docker machine (the Kubernetes node).
I am surprised because a system with wrong information about its available resources will not behave correctly.
Take for example the memory cache used by IO operations. If you write to disk, pages are cached in RAM before being written. To do this, the system evaluates how many pages can be cached using the sysctl vm.dirty_ratio (20% by default) and the memory size of the system. But how can this work if the container's view of the system memory size is wrong?
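You can see the mismatch from inside the container with standard tools (a sketch; the path shown is the cgroup v1 one, cgroup v2 exposes /sys/fs/cgroup/memory.max instead):
sysctl vm.dirty_ratio (host-wide ratio, not container-aware)
free -m (reports the node's memory, not the container limit)
cat /sys/fs/cgroup/memory/memory.limit_in_bytes (the limit the OOM killer actually enforces)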
I verified it:
I ran a program with a lot of IO operations (os.write, decompression, ...) in a container limited to 10Gi of RAM, on a 180Gi node. The container was killed because it reached the 10Gi memory limit. This OOM was caused by the wrong evaluation of dirty_ratio * the system memory.
This is terrible.
So, my question is the following:
Is there a way to set the available resources of a docker container system using the docker container limit?

How to configure Docker resources

I am running Docker on a Linux server. By default only 2GB of memory and 0GB of Swap space are allocated. How can I change the memory and swap space in Docker?
From official documentation:
By default, a container has no resource constraints and can use as much of a given resource as the host’s kernel scheduler allows. Docker provides ways to control how much memory, CPU, or block IO a container can use, setting runtime configuration flags of the docker run command
You can use the -m or --memory option and set it to a desired value depending on your host's available memory.
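For example (a sketch; the values and image name are placeholders):
docker run -m 4g --memory-swap 6g my-image (4 GB of RAM; --memory-swap is the combined total, so this allows 2 GB of swap on top)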

Potential Memory leak in Suse

I've a SUSE server running Tomcat with my web application (which has threads running in the background to update a database).
The server has 4 GB of RAM and Tomcat is configured to use a maximum of 1 GB.
After running for a few days, the free command shows that the system has only 300 MB of free memory. Tomcat uses only 400 MB and no other process seems to use an unreasonable amount of memory.
Adding up the memory usage of all processes (returned from the ps aux command) shows only 2 GB in use.
Is there any way to identify if there is a leak at the system level?
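A few standard checks help separate reclaimable page cache from a real system-level leak (a sketch; the grep pattern is illustrative):
free -m (the buff/cache column is reclaimable file cache, not leaked memory)
grep -E 'Cached|Buffers|Slab' /proc/meminfo (Slab is kernel-side memory; unusually large values here are worth investigating)
ps aux --sort=-rss | head (largest resident processes, to compare against the free output)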
