Understanding Kubernetes limits and requests by example


How we set Kubernetes limits and requests is essential to optimizing application and cluster performance.

One of the challenges of every distributed system designed to share resources between applications, like Kubernetes, is, paradoxically, how to share those resources properly. Applications were typically designed to run standalone on a machine and use all of the resources at hand. It is said that good fences make good neighbors. The new landscape requires sharing the same space with others, and that makes resource quotas a hard requirement.


Namespace quotas

Kubernetes allows administrators to set quotas in namespaces, as hard limits for resource usage. This has a side effect: if you set a CPU request quota in a namespace, then every pod needs to set a CPU request in its definition, otherwise its creation will be rejected.

Let’s look at an example:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: mem-cpu-example
spec:
  hard:
    requests.cpu: 2
    requests.memory: 2Gi
    limits.cpu: 3
    limits.memory: 4Gi

If we apply this file to a namespace, we will set the following requirements:

  • All pod containers have to declare requests and limits for CPU and memory.
  • The sum of all the CPU requests can’t be higher than 2 cores.
  • The sum of all the CPU limits can’t be higher than 3 cores.
  • The sum of all the memory requests can’t be higher than 2 GiB.
  • The sum of all the memory limits can’t be higher than 4 GiB.

If we already have 1.9 cores allocated to pods and try to create a new pod with a 200m CPU request, the quota system will reject it, since the total CPU requests would exceed the 2-core quota.
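For example, with 1.9 cores already requested in the namespace, a pod like the following (the name and image are only illustrative) would be rejected at creation time, because 1.9 + 0.2 cores exceeds the 2-core request quota:

apiVersion: v1
kind: Pod
metadata:
  name: quota-example-pod
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:
          cpu: 200m      # pushes the namespace total over the 2-core requests.cpu quota
          memory: 100Mi
        limits:          # the quota also forces us to declare limits
          cpu: 300m
          memory: 200Mi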

Explaining pod requests and limits

Let’s consider this example of a deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  labels:
    name: redis-deployment
    app: example-voting-app
spec:
  replicas: 1
  selector:
    matchLabels:
      name: redis
      role: redisdb
      app: example-voting-app
  template:
    metadata:
      labels:
        name: redis
        role: redisdb
        app: example-voting-app
    spec:
      containers:
        - name: redis
          image: redis:5.0.3-alpine
          resources:
            limits:
              memory: 600Mi
              cpu: 1
            requests:
              memory: 300Mi
              cpu: 500m
        - name: busybox
          image: busybox:1.28
          # busybox exits immediately without a command; keep it running for the example
          command: ["sh", "-c", "sleep 3600"]
          resources:
            limits:
              memory: 200Mi
              cpu: 300m
            requests:
              memory: 100Mi
              cpu: 100m

Let’s say we are running this on a cluster whose nodes have, for example, 4 cores and 16GB of RAM each. We can extract a lot of information:

  1. The pod's effective request is 400 MiB of memory and 600 millicores of CPU. You need a node with enough free allocatable capacity to schedule the pod.

  2. CPU shares for the redis container will be 512, and 102 for the busybox container. Kubernetes always assigns 1024 shares per core, so:
    • redis: 1024 * 0.5 cores ≅ 512
    • busybox: 1024 * 0.1 cores ≅ 102
    (The sketch after this list shows how these values map to cgroup settings.)

  3. The redis container will be OOM killed if it tries to allocate more than 600MiB of RAM, most likely making the pod fail.

  4. Redis will suffer CPU throttling if it tries to use more than 100ms of CPU time every 100ms (since the node has 4 cores, the total available CPU time is 400ms every 100ms period), causing performance degradation.

  5. The busybox container will be OOM killed if it tries to allocate more than 200MiB of RAM, resulting in a failed pod.

  6. Busybox will suffer CPU throttling if it tries to use more than 30ms of CPU every 100ms, causing performance degradation.
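
To make items 2, 4, and 6 more concrete, here is a rough sketch of how the redis container's resources translate into Linux cgroup settings. This assumes cgroup v1 (cgroup v2 uses different file names), and the exact values are runtime and kernel details that may vary; they are shown only for illustration:

resources:
  requests:
    cpu: 500m      # cpu.shares ≈ 1024 * 0.5 = 512 (relative weight under CPU contention)
    memory: 300Mi  # only used by the scheduler; not enforced at runtime
  limits:
    cpu: 1         # cpu.cfs_quota_us = 100000 with cfs_period_us = 100000 (100ms every 100ms)
    memory: 600Mi  # memory.limit_in_bytes set to 600Mi; allocating beyond it triggers the OOM killer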

In order to detect problems, we should be monitoring:

  • CPU and memory usage in the node. Memory pressure can trigger OOM kills if the node memory is full, even if all of the containers are under their limits. CPU pressure will throttle processes and affect performance.
    You can find these metrics (CPU load and memory usage by node) in Sysdig Monitor, in the dashboard: Kubernetes → Resource usage → Kubernetes node health.

  • Disk space in the node. If the node runs out of disk space, it will try to reclaim it, with a fair chance of evicting pods.
    You can find this metric (disk space by node) in Sysdig Monitor, in the dashboard: Kubernetes → Resource usage → Kubernetes node health.

  • Percentage of CPU quota used by every container. Monitoring pod CPU usage can be misleading: remember, Kubernetes limits are per container, not per pod. Other CPU metrics, like CPU shares used, are only relevant for allocation, so don't waste time on them if you have performance issues.
    You can find this metric (CPU quota by container) in Sysdig Monitor, in the dashboard: Hosts & containers → Container limits.

  • Memory usage per container. You can relate this value to the limit in the same graph, or analyze the percentage of the memory limit used. Don't use pod memory usage. The pod in the example could be using 300MiB of RAM in total, well under its effective limit of 800MiB, but if the redis container is using 100MiB and the busybox container is using 200MiB, busybox is at its own limit and will be OOM killed, making the pod fail.
    You can find this metric (percentage of memory limit used) in Sysdig Monitor, in the dashboard: Hosts & containers → Container limits.

  • Percentage of resource allocation in the cluster and the nodes. You can represent this as the percentage of the total available resources that has been allocated. A good warning threshold is (n-1)/n * 100, where n is the number of nodes: for example, 75% in a 4-node cluster. Over this threshold, in case of a node failure, you wouldn't be able to reschedule your workloads on the remaining nodes.
    You can find these metrics (overview of cluster resources) in Sysdig Monitor, in the Overview feature → clusters.

  • Limit overcommit (for memory and CPU). The clearest way to see this is the percentage that the sum of limits represents of the total allocatable resources, which can go over 100% in normal operation.
    A custom graph showing CPU usage vs. capacity vs. limits vs. requests works well here.

Choosing pragmatic requests and limits

When you have some experience with Kubernetes, you usually understand (the hard way) that properly setting requests and limits is of utmost importance for the performance of the applications and cluster.

In an ideal world, your pods would continuously use exactly the amount of resources you requested. But the real world is a cold and fickle place, and resource usage is never regular or predictable. If actual usage stays within roughly 25% above or below the request value, you are in good shape. If your usage is much lower than your request, you are wasting money. If it is much higher, you are risking performance issues in the node.
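
As an illustration (the numbers are invented), if a container's steady-state usage hovers around 250m of CPU and 400Mi of memory, requests in that ballpark keep you within the 25% margin, while the limits leave some headroom for spikes:

resources:
  requests:
    cpu: 250m      # close to the observed steady-state usage
    memory: 400Mi
  limits:
    cpu: 500m      # headroom for bursts; tune according to your tolerance to throttling
    memory: 600Mi  # exceeding this means an OOM kill, so leave a safety margin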


Regarding limits, achieving a good setting is a matter of trial and error. There is no optimal value for everyone, as it heavily depends on the nature of the application, the demand model, the tolerance to errors, and many other factors.


Another thing to consider is the limit overcommit you allow on your nodes. For example, if the sum of memory limits on a node with 16GB of allocatable memory adds up to 24GB, you are allowing a 150% memory limit overcommit.


Enforcing this is up to you, as there is no automatic mechanism to tell Kubernetes how much overcommit to allow.

Conclusion

Some lessons you should learn from this are:

  • Dear developer, set requests and limits in your workloads.
  • Beloved cluster admin, setting a namespace quota will require every workload in the namespace to declare requests and limits for each of its containers.

Quotas are a necessity to properly share resources. If someone tells you that you can use any shared service without limits, they are either lying or the system will eventually collapse, through no fault of your own.

A good monitoring system like Sysdig Monitor will help you ensure your quotas are properly configured.
