Kubernetes OOM kills can be very frustrating. Why is my application struggling if I have plenty of CPU in the node?
Managing Kubernetes pod resources can be a challenge. Many issues can arise, possibly due to an incorrect configuration of Kubernetes limits and requests.
In this article, we will try to help you detect the most common issues related to the usage of resources.
Kubernetes OOM problems
When any Unix-based system runs out of memory, the OOM killer kicks in and terminates processes based on obscure rules only accessible to level 12 dark sysadmins (chaotic neutral). Kubernetes OOM management tries to step in before the system OOM killer has to. When a node runs low on memory, the Kubernetes eviction policy kicks in and stops pods, marking them as failed. If they are managed by a ReplicaSet, these pods are scheduled on a different node. This frees memory and relieves the memory pressure.
OOM kill due to container limit reached
This is by far the simplest memory error you can have in a pod. You set a memory limit, one container tries to allocate more memory than it is allowed, and it gets an error. This usually ends with the container dying, the pod becoming unhealthy and Kubernetes restarting that pod.
```
test          frontend     0/1     Terminating     0          9m21s
```
The output of kubectl describe pod would show something like this:
```
State:          Running
  Started:      Thu, 10 Oct 2019 11:14:13 +0200
Last State:     Terminated
  Reason:       OOMKilled
  Exit Code:    137
  Started:      Thu, 10 Oct 2019 11:04:03 +0200
  Finished:     Thu, 10 Oct 2019 11:14:11 +0200
…
Events:
  Type    Reason          Age    From                                                  Message
  ----    ------          ----   ----                                                  -------
  Normal  Scheduled       6m39s  default-scheduler                                     Successfully assigned test/frontend to gke-lab-kube-gke-default-pool-02126501-7nqc
  Normal  SandboxChanged  2m57s  kubelet, gke-lab-kube-gke-default-pool-02126501-7nqc  Pod sandbox changed, it will be killed and re-created.
  Normal  Killing         2m56s  kubelet, gke-lab-kube-gke-default-pool-02126501-7nqc  Killing container with id docker://db:Need to kill Pod
```
Exit code 137 is important: it means the system terminated the container because it tried to use more memory than its limit.
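For reference, here is a minimal sketch of a pod spec that can end up in this state; the names and values are illustrative, not taken from the output above.

```yaml
# Minimal sketch (illustrative names and values): a container with a memory
# limit. If the process tries to allocate more than 128Mi, the kernel kills it
# and the container exits with code 137 (OOMKilled).
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          memory: "64Mi"
        limits:
          memory: "128Mi"
```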
In order to monitor this, you always have to look at the memory used compared to the limit. The percentage of node memory used by a pod is usually a bad indicator, as it says nothing about how close to the limit the memory usage is. In Kubernetes, limits are applied to containers, not pods, so monitor the memory usage of a container vs. the limit of that container.
Find these metrics in Sysdig Monitor in the dashboard: Hosts & containers → Container limits
Kubernetes OOM kill due to limit overcommit
Memory requested is granted to the containers so they can always use that memory, right? Well, it’s complicated. Kubernetes will not schedule pods on a node if the sum of their memory requests exceeds the memory available on that node. But limits can be higher than requests, so the sum of all limits can be higher than the node capacity. This is called overcommit and it is very common. In practice, if all containers use more memory than they requested, the memory in the node can be exhausted. This usually causes the death of some pods in order to free some memory.
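As an illustration (the values are made up), the scheduler only looks at the request of a pod like the one below, even though several of them together could burst well past the node’s capacity:

```yaml
# Illustrative overcommit: the scheduler only checks the 1Gi request. If a
# node has 4Gi allocatable, four of these pods fit, but their limits add up
# to 12Gi. If they all burst towards their limits, the node runs out of memory.
apiVersion: v1
kind: Pod
metadata:
  name: bursty-app        # hypothetical name
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          memory: "1Gi"
        limits:
          memory: "3Gi"
```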
Memory management in Kubernetes is complex, as it has many facets. Many parameters enter the equation at the same time:
- Memory request of the container.
- Memory limit of the container.
- Lack of those settings.
- Free memory in the system.
- Memory used by the different containers.
With these parameters, a blender and some maths, Kubernetes elaborates a score and ranks the pods: the last one in the table is killed or evicted. Depending on the restart policy the pod can be restarted, so being killed doesn’t mean the pod will be removed entirely.
Despite this mechanism, we can still end up with system OOM kills, as Kubernetes memory management only runs every several seconds. If system memory fills up too quickly, the system can kill Kubernetes control processes, making the node unstable.
This scenario should be avoided, as it will probably require complicated troubleshooting, ending with a root cause analysis based on hypotheses and a node restart.
In day-to-day operation, this means that when resources are overcommitted, pods without requests or limits will likely be killed first, containers using more memory than they requested have some chance of dying, and guaranteed containers (requests equal to limits) will most likely be fine.
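As a hedged sketch, this is what a guaranteed pod looks like: requests equal to limits for every container (names and values are illustrative):

```yaml
# Guaranteed QoS sketch: requests equal limits for every resource of every
# container, so this pod is among the last candidates for an OOM kill.
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-demo   # hypothetical name
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          cpu: "250m"
          memory: "256Mi"
        limits:
          cpu: "250m"
          memory: "256Mi"
```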
CPU throttling due to CPU limit
There are many differences in how CPU and memory requests and limits are treated in Kubernetes. A container using more memory than its limit will most likely die, but using too much CPU is never a reason for Kubernetes to kill a container. CPU management is delegated to the system scheduler, which uses two different mechanisms to enforce requests and limits.
CPU requests are managed using the shares system. This means that CPU resources are prioritized depending on the value of shares. Each CPU core is divided into 1,024 shares, and containers with more shares have more CPU time reserved. Be careful: in moments of CPU starvation, shares won’t ensure your app has enough resources, as it can still be affected by bottlenecks and general collapse.
Tip: If a container requests 100m, the container will have 102 shares. These values are only used for pod allocation. Monitoring the shares in a pod does not give any idea of a problem related to CPU throttling.
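To make the shares arithmetic concrete, here is a sketch (illustrative values): with 1,024 shares per core, a 100m request maps to roughly 1024 × 0.1 ≈ 102 shares.

```yaml
# CPU request sketch: 100m of a core -> about 102 of the core's 1,024 shares.
# Shares only set relative priority under contention; they are not a cap.
apiVersion: v1
kind: Pod
metadata:
  name: shares-demo       # hypothetical name
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          cpu: "100m"
```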
On the other hand, limits are treated differently. Limits are managed with the CPU quota system. This works by dividing CPU time into 100ms periods and allowing each container to use, within every period, only the fraction of CPU time that its limit represents.
Tip: If you set a limit of 100m, the process can use 10ms of each 100ms period. The system will throttle the process if it tries to use more time than its quota, causing possible performance issues. A pod will never be terminated or evicted for trying to use more CPU than its quota; the system will just throttle the CPU.
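A minimal sketch of such a CPU limit (illustrative names and values): 100m translates into a quota of 10ms of CPU time per 100ms period, after which the container is throttled.

```yaml
# CPU limit sketch: 100m -> the container may run for 10ms in every 100ms
# CFS period. Going over the quota causes throttling, never an OOM kill.
apiVersion: v1
kind: Pod
metadata:
  name: quota-demo        # hypothetical name
spec:
  containers:
    - name: app
      image: nginx
      resources:
        limits:
          cpu: "100m"
```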
If you want to know whether your pod is suffering from CPU throttling, you have to look at the percentage of the assigned quota that is being used. Absolute CPU use can be treacherous, as you can see in the following graphs: CPU use of the pod is around 25%, but since that is exactly the quota assigned, it is using 100% of its quota and consequently suffering CPU throttling.
Find these metrics in Sysdig Monitor in the dashboard: Hosts & containers → Container limits
There is a big difference between CPU and memory quota management. Regarding memory, a pod without requests and limits is considered best effort and is the first in the list to be OOM killed. With CPU, this is not the case: a pod without CPU limits is free to use all the CPU resources in the node. Well, truth is, the CPU is there to be used, but if you can’t control which process is using your resources, you can end up with a lot of problems due to CPU starvation of key processes.
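One way to keep pods from running without any CPU settings at all, assuming you want namespace-wide defaults rather than per-pod discipline, is a LimitRange; the values below are just an example:

```yaml
# Hedged sketch: a LimitRange that injects default CPU requests and limits
# into containers created in this namespace without their own settings.
apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-defaults      # hypothetical name
spec:
  limits:
    - type: Container
      defaultRequest:
        cpu: "100m"
      default:
        cpu: "500m"
```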
Lessons learned
Knowing how to monitor resource usage in your workloads is of vital importance. This will allow you to discover different issues that can affect the health of the applications running in the cluster.
Understanding that your resource usage can compromise your application and affect other applications in the cluster is the crucial first step. You have to properly configure your quotas. Monitoring the resources and how they are related to the limits and requests will help you set reasonable values and avoid Kubernetes OOM kills. This will result in a better performance of all the applications in the cluster, as well as a fair sharing of resources.
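If you also want a guardrail at the namespace level, a ResourceQuota is one option; the values here are an assumption, pick ones that match your own capacity planning:

```yaml
# Hedged sketch: a namespace ResourceQuota capping the total requests and
# limits that all pods in the namespace can claim.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota        # hypothetical name
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"
```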
A good monitoring system like Sysdig Monitor will help you ensure you avoid pod evictions and pending pods. Request a demo today!
How to do Kubernetes capacity planning with Sysdig
We, at Sysdig, use Kubernetes ourselves, and we also help hundreds of customers deal with their clusters every day. We are happy to share all of that expertise with you in our out-of-the-box Kubernetes dashboards. With the right dashboards, you won’t need to be an expert to troubleshoot or do Kubernetes capacity planning in your cluster.
With our out-of-the-box Kubernetes Dashboards, you can discover underutilized resources in a couple of clicks.
The Underutilization of Allocated Resources dashboards help you find out whether there is unused CPU or memory.
Also, you can sign up for a free trial of Sysdig Monitor and try the out-of-the-box Kubernetes dashboards.