In this post we are going to demonstrate how to deploy a Kubernetes autoscaler using a third-party metrics provider. You will learn how to expose any custom metric directly through the Kubernetes API by implementing an extension service. Dynamic scaling is not a new concept by any means, but implementing your own scaler is a rather complex and delicate task. That’s why the Kubernetes Horizontal Pod Autoscaler (HPA) is a really powerful Kubernetes mechanism: it can help you dynamically adapt your service in a way that is reliable, predictable and easy to configure.
Why might you need custom metrics for your Kubernetes autoscaler?
If the metrics-server plugin is installed in your cluster, you will be able to see the CPU and memory values for your cluster nodes or any of the pods. These metrics are useful for internal cluster sizing, but you probably want to configure your Kubernetes autoscaler using a wider set of metrics:
- Service latency -> net.http.request.time
- I/O load -> file.iops.total
- Memory usage -> memory.used.percent
You need a metrics provider that can supply detailed performance information, aggregated using Kubernetes metadata (deployments, services, pods). The extension API server implementation in this post uses Sysdig Monitor.
Before we start deploying our custom Kubernetes autoscaler, let’s go over the HPA basics.
Kubernetes autoscaler (HPA) basic concepts
We can start with a simple diagram:
As you can see above, the HPA object will interact with a pod controller like a Deployment or ReplicaSet. It will update this object to configure the “desired” number of pods given the current metric readings and thresholds.
The pod controller, a Deployment for instance, will then terminate or create new pods as part of its reconciliation loop to reach the desired state.
The basic parameters that you will need for any HPA are:
- Scale target: the controller that this HPA will interact with
- minReplicas: minimum number of pods, the HPA cannot go below this value
- maxReplicas: maximum number of pods, the HPA cannot go above this value
- Target metric(s): metric (or metrics) used to evaluate current load and take scaling decisions
- targetValue: threshold value for the metric. If the metric readings are above this value, and (currentReplicas < maxReplicas), HPA will scale up.
You can create a Kubernetes HPA in just one line:
```
$ kubectl autoscale deployment shell --min=2 --max=10 --cpu-percent=10
horizontalpodautoscaler.autoscaling/shell autoscaled
```
If you generate high CPU loads in these pods, the HPA will scale up the desired number of replicas:
```
23s  Normal  SuccessfulRescale  HorizontalPodAutoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
```
It will also scale down again once the CPU burst is over. Pretty neat, right? The Kubernetes official documentation covers the HPA algorithm and many advanced configuration details.
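To build some intuition for the scaling decision, the algorithm described in the documentation boils down to a single ratio. The toy Go snippet below (not part of the project code, just an illustration) computes the desired replica count the way the docs describe it; the real controller additionally applies tolerances, stabilization windows and the min/max bounds:

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas implements the formula from the Kubernetes docs:
//   desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue)
func desiredReplicas(current int32, currentValue, targetValue float64) int32 {
	return int32(math.Ceil(float64(current) * currentValue / targetValue))
}

func main() {
	// With 2 replicas at 40% CPU and a 10% target, the HPA asks for 8 pods
	// (which would then be clamped to the configured min/max bounds).
	fmt.Println(desiredReplicas(2, 40, 10)) // prints 8
}
```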
Like many other things in Kubernetes, the set of metrics available to the HPA can be expanded by implementing an API extension. Let’s see how this is done.
Kubernetes custom metrics API
The Kubernetes HPA is able to retrieve metrics from several APIs out of the box: metrics.k8s.io, custom.metrics.k8s.io (the one that we will use in this post), and external.metrics.k8s.io.
To register custom metrics and update their values, you need to:
- Enable the Kubernetes aggregation layer
- Register a new APIService object that will bind the new API path to the Kubernetes service implementing it
- Implement the actual service (a pod living inside a Kubernetes namespace for this example) that responds to the HPA requests and retrieves the metric values from the external provider
If you are using a recent Kubernetes version (1.11+), the API aggregation layer is probably enabled and configured out of the box, so you can skip this step. You can check the relevant API server parameters by describing the kube-apiserver pod living in your kube-system namespace:
```
$ kubectl describe pod kube-apiserver -n kube-system
...
Command:
  kube-apiserver
  ...
  --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
  --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
  --requestheader-allowed-names=front-proxy-client
  --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
  --requestheader-extra-headers-prefix=X-Remote-Extra-
  --requestheader-group-headers=X-Remote-Group
  --requestheader-username-headers=X-Remote-User
  ...
```
In case your API server doesn’t have these flags, the Kubernetes documentation has an article explaining how to configure them.
These parameters enable the kube-aggregator, a controller in charge of two tasks:
- Discovering and registering APIService objects, creating a link between the newly registered API path and the Kubernetes service implementing it.
- Acting as the front-proxy / forwarder for these requests.
This is what a basic APIService object looks like:
```yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1alpha1.wardle.k8s.io
spec:
  insecureSkipTLSVerify: true
  group: wardle.k8s.io
  groupPriorityMinimum: 1000
  versionPriority: 15
  service:
    name: api
    namespace: wardle
  version: v1alpha1
```
Leaving aside the more advanced configuration details, this object will instruct the kube-aggregator to forward v1alpha1.wardle.k8s.io requests to the API extension service, implemented by a pod in the wardle namespace.
Now that we have covered the basic concepts, we can deploy a Horizontal Pod Autoscaler using Kubernetes custom metrics.
Kubernetes autoscaler using Sysdig’s custom metrics
Prerequisites
Sysdig Monitor will be the external metrics provider for the implementation used in this scenario. The prerequisites to deploy this stack are:
- Kubernetes version 1.11+, API aggregation layer must be enabled and configured (see previous section)
- The Sysdig agent must be running in your Kubernetes nodes. If you need further instructions to install the agent, you can check the following documentation:
- Deploy the Sysdig agent using a DaemonSet
- Deploy the Sysdig agent using a Helm Chart
- Deploy the Sysdig agent using a Kubernetes operator
- The Sysdig API access token. You can find it in the web interface under Settings -> User profile
Overview diagram
Let’s start with a diagram covering the entire workflow for this integration:
1: The Horizontal Pod Autoscaler will be configured to manage a Kubernetes Deployment, pivoting on the value of a custom metric and its target threshold. As we explained earlier, the HPA itself is abstracted from the implementation details; it just needs to request the metrics from the Kubernetes API.
2: The number of desired pods in this Deployment will be dynamically updated by the HPA; the reconciliation loop will kill or create pods to reach the configured state.
3: The Sysdig agent will collect metrics and metric metadata (labels) from two sources. Container metrics are collected directly from the Linux kernel – additional metadata on these metrics (for example the namespace, service or deployment associated with a container metric) is pulled from the Kubernetes API. These two metrics streams are processed, aggregated and sent to the Sysdig backend.
4: The extension API server is a pod living in the same Kubernetes cluster as the HPA. It implements the custom metrics apiserver interface (see the sketch after this list) and is able to dynamically pull metric values from the Sysdig backend using the API access token. The current version publishes separate metric values for every Namespace, Pod and Service in your cluster, but the code can be easily modified if you need a different aggregation.
5: The kube-aggregator in the Kubernetes API server has been configured (through an APIService object) to forward custom.metrics.k8s.io requests to the extension API server, which will then return the requested value to the HPA.
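For reference, this is roughly the contract the extension API server has to fulfil. The sketch below is a simplified stand-in, loosely modeled on the kubernetes-sigs custom-metrics-apiserver boilerplate that this kind of server builds on; the types and signatures here are illustrative and vary between library versions, so treat it as a mental model rather than the project’s actual code:

```go
// Package provider: a simplified sketch of the contract a custom metrics
// extension server fulfils. Types and signatures here are illustrative,
// not the exact ones used by the Sysdig metrics server.
package provider

// MetricValue is a trimmed-down stand-in for the real
// custom_metrics.MetricValue API type.
type MetricValue struct {
	Namespace  string // Kubernetes namespace of the described object
	ObjectKind string // "Pod", "Service" or "Namespace"
	ObjectName string
	MetricName string // e.g. "net.http.request.count"
	Value      string // current reading, pulled from the metrics backend
}

// CustomMetricsProvider answers the requests that the kube-aggregator
// forwards to the extension server.
type CustomMetricsProvider interface {
	// GetMetricByName returns the current value of one metric for one
	// Kubernetes object, e.g. net.http.request.count for service kuard.
	GetMetricByName(namespace, kind, name, metric string) (*MetricValue, error)

	// ListAllMetrics advertises which metrics this server can serve;
	// this is what makes them discoverable under custom.metrics.k8s.io.
	ListAllMetrics() []string
}
```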
Installation
Let’s get down to work. The first step is cloning the Sysdig metrics apiserver repository:
$ git clone https://github.com/draios/kubernetes-sysdig-metrics-apiserver.git
You have the complete Golang code, Makefiles and Dockerfiles for the project in this repo, but in this article we are going to focus on the actual deployment and operation. Everything you need is inside the deploy directory.
```
$ cd deploy
$ ls
00-kuard.yml  01-sysdig-metrics-rbac.yml  02-sysdig-metrics-server.yml  03-kuard-hpa.yml
```
Target deployment
The first YAML to apply is probably the simplest one: a Kubernetes Deployment and Service to deploy kuard (a demo application found in the “Kubernetes: Up and Running” book).
```
$ kubectl apply -f 00-kuard.yml
deployment.extensions/kuard created
service/kuard created

$ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
kuard-6b6995ff77-6gxbm   1/1     Running   0          60s
kuard-6b6995ff77-cvd72   1/1     Running   0          60s
kuard-6b6995ff77-rvznt   1/1     Running   0          60s
```
Now you have a target Deployment to scale.
APIService definition and RBAC permissions
The second YAML will create the APIService definition:
```yaml
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  insecureSkipTLSVerify: true
  group: custom.metrics.k8s.io
  groupPriorityMinimum: 1000
  versionPriority: 5
  service:
    name: api
    namespace: custom-metrics
  version: v1beta1
```
This object will instruct the kube-aggregator to forward custom metrics requests (v1beta1 version) to the api service in the custom-metrics namespace.
This YAML will also create the custom-metrics namespace itself, the ServiceAccount to be used by the custom metrics pod, and several RBAC Roles and RoleBindings. We need these RBAC bindings to allow the custom metrics account to register itself as an API extension and also to read the list of namespaces, pods and services. The custom metrics server needs this metadata to aggregate the requests to the Sysdig backend using the same labels, for example namespace=default, service=kuard.
Applying this second YAML file, you should see the following output:
```
$ kubectl apply -f 01-sysdig-metrics-rbac.yml
namespace/custom-metrics created
serviceaccount/custom-metrics-apiserver created
clusterrolebinding.rbac.authorization.k8s.io/custom-metrics:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/custom-metrics-auth-reader created
clusterrole.rbac.authorization.k8s.io/custom-metrics-resource-reader created
clusterrolebinding.rbac.authorization.k8s.io/custom-metrics-apiserver-resource-reader created
clusterrole.rbac.authorization.k8s.io/custom-metrics-getter created
clusterrolebinding.rbac.authorization.k8s.io/hpa-custom-metrics-getter created
service/api created
apiservice.apiregistration.k8s.io/v1beta1.custom.metrics.k8s.io created
```
Check that the API extension has been registered:
```
$ kubectl api-versions | grep "custom.metrics"
custom.metrics.k8s.io/v1beta1
```
Take into account that the API extension has been declared but not yet implemented; any request to this API will fail:
```
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/services/kuard/net.http.request.count" | jq .
Error from server (ServiceUnavailable): the server is currently unable to handle the request
```
Sysdig metric server
Ok, so let’s deploy the custom metrics pod. There are two parameters that you will need to configure: the Sysdig API endpoint and the access token.
Edit the file 02-sysdig-metrics-server.yml. You can find the API endpoint configured as an environment variable in the deployment definition:
```yaml
- name: SDC_ENDPOINT
  value: "https://app.sysdigcloud.com/api/"
- name: CLUSTER_NAME
  value: "YourClusterName"
```
You also have to add a new environment variable named CLUSTER_NAME; its value must match the name of your cluster in Sysdig Monitor.
If you are using the SaaS version, the default value should work and you can skip this step. If you want to connect to an on-prem backend, adjust this parameter accordingly.
The access token is mounted using a Kubernetes secret. Retrieve your token from the Sysdig interface (Settings -> User profile) and execute:
```
$ kubectl create secret generic --from-literal access-key=<YOUR_SYSDIG_API_TOKEN_HERE> -n custom-metrics sysdig-api
```
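For context, once the secret exists, the server pod can consume it either as a file from a volume mount or as an environment variable. The Go sketch below is purely hypothetical: both the file path and the variable name are illustrative, not the ones the Sysdig metrics server actually uses:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// sysdigToken loads the API token from the places a Kubernetes secret is
// commonly exposed to a pod. Both the path and the env var name below are
// hypothetical examples, not the real server's configuration.
func sysdigToken() (string, error) {
	// Case 1: the secret is mounted as a volume (illustrative path).
	if b, err := os.ReadFile("/etc/sysdig/access-key"); err == nil {
		return strings.TrimSpace(string(b)), nil
	}
	// Case 2: the secret is injected as an environment variable (illustrative name).
	if t := os.Getenv("SDC_ACCESS_KEY"); t != "" {
		return t, nil
	}
	return "", fmt.Errorf("sysdig API token not found")
}

func main() {
	token, err := sysdigToken()
	if err != nil {
		panic(err)
	}
	fmt.Printf("token loaded (%d characters)\n", len(token))
}
```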
Once you have configured these two parameters, you can deploy the custom metrics server:
```
$ kubectl create -f 02-sysdig-metrics-server.yml
deployment.apps/custom-metrics-apiserver created

$ kubectl get pods -n custom-metrics
NAME                                       READY   STATUS    RESTARTS   AGE
custom-metrics-apiserver-96d86694b-7shmx   1/1     Running   0          16s
```
You can check that the new metrics are available in the Kubernetes API using a raw request (jq is optional, used to format the output):
```
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/services/kuard/net.http.request.count" | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/services/kuard/net.http.request.count"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Service",
        "namespace": "default",
        "name": "kuard",
        "apiVersion": "/__internal"
      },
      "metricName": "net.http.request.count",
      "timestamp": "2019-07-07T15:55:27Z",
      "value": "0"
    }
  ]
}
```
Note that the describedObject element contains the Kubernetes labels used to aggregate this metric. We also have the timestamp for the request and the metric value, which in this case is 0 because we are not sending any HTTP traffic to the kuard service (yet).
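If you prefer to query the endpoint programmatically rather than through kubectl, a minimal client-go sketch performing the same raw GET could look like this (it assumes a recent client-go and a kubeconfig at the default path):

```go
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the local kubeconfig, as kubectl would.
	home, _ := os.UserHomeDir()
	config, err := clientcmd.BuildConfigFromFlags("", filepath.Join(home, ".kube", "config"))
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Same raw GET as the kubectl command above.
	raw, err := clientset.CoreV1().RESTClient().Get().
		AbsPath("/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/services/kuard/net.http.request.count").
		DoRaw(context.TODO())
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
}
```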
Kubernetes autoscaler using custom metrics
Now that you have the Kubernetes custom metrics, you just need an HPA to act on them.
You can configure it to target any Sysdig metric. By default, we are going to use net.http.request.count because it is a good indicator of service load and is also easy to test using any HTTP client.
The HorizontalPodAutoscaler object is quite self-explanatory:
```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: kuard-autoscaler
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kuard
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      target:
        kind: Service
        name: deployment;kuard
      metricName: net.http.request.count
      targetValue: 100
```
It will target the kuard deployment, setting a minimum of 3 replicas and a maximum of 10. The target metric is net.http.request.count, aggregated over the pods belonging to the kuard service.
The target value for this metric is 100 req/s. For testing purposes, you can change it to a much lower value, for example 4.
Apply the last yaml file:
```
$ kubectl apply -f 03-kuard-hpa.yml
horizontalpodautoscaler.autoscaling/kuard-autoscaler created
```
And check that it is working as you expect:
```
$ kubectl get hpa kuard-autoscaler
NAME               REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
kuard-autoscaler   Deployment/kuard   0/4       3         10        3          66s
```
The current reading of the metric is 0, the target value is 4, the minimum number of kuard pods is 3, and the maximum is 10. There are currently 3 pods running. The HPA is operative and ready to go; now we can test it live.
Testing the Kubernetes custom metrics HPA
Testing this deployment is fairly simple: we just need to generate HTTP requests.
First, you need to be able to access the kuard service. In a production scenario this would mean configuring an ingress controller; for this example we can just forward the HTTP port to the local host:
```
$ kubectl port-forward service/kuard 8080:80
Forwarding from 127.0.0.1:8080 -> 8080
```
Leave that command running and open a different console. Now we need to generate HTTP load. There are multiple ways to do this, for example using the ab tool from Apache:
```
$ ab -c 10 -t 120 http://localhost:8080/
```
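If you don’t have ab installed, any HTTP client that keeps a few concurrent connections busy will do. Here is a minimal Go sketch with roughly the same behavior (10 concurrent workers for two minutes), assuming the port-forward from the previous step is still running:

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
	"time"
)

func main() {
	const (
		concurrency = 10              // matches ab's -c 10
		duration    = 2 * time.Minute // matches ab's -t 120
	)
	deadline := time.Now().Add(duration)

	var wg sync.WaitGroup
	for i := 0; i < concurrency; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Hammer the forwarded kuard endpoint until the deadline.
			for time.Now().Before(deadline) {
				resp, err := http.Get("http://localhost:8080/")
				if err != nil {
					continue
				}
				resp.Body.Close()
			}
		}()
	}
	wg.Wait()
	fmt.Println("load generation finished")
}
```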
Leave the command running for a minute. Then, describe your HPA controller and you should see something similar to this output:
```
$ kubectl get hpa kuard-autoscaler
NAME               REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
kuard-autoscaler   Deployment/kuard   9/4       3         10        10         37m

$ kubectl describe hpa kuard-autoscaler
  Normal  SuccessfulRescale  21s  horizontal-pod-autoscaler  New size: 6; reason: Service metric net.http.request.count above target
  Normal  SuccessfulRescale  6s   horizontal-pod-autoscaler  New size: 10; reason: Service metric net.http.request.count above target
```
Metric readings were above the target value, so the HPA scaled your pods, first from 3 to 6 and then from 6 to 10, which is the maximum.
Once the ab tool finishes sending requests, the custom metric reading will go back to 0. If you wait a few minutes, you should be able to check that the HPA has downsized the deployment replica count back to 3:
```
$ kubectl get hpa kuard-autoscaler
NAME               REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
kuard-autoscaler   Deployment/kuard   0/4       3         10        3          99m

$ kubectl describe hpa kuard-autoscaler
  Normal  SuccessfulRescale  45s (x3 over 66m)  horizontal-pod-autoscaler  New size: 3; reason: All metrics below target
```
You have deployed a Kubernetes custom metrics provider and a Horizontal Pod Autoscaler that is able to resize your deployments based on those metrics!
Conclusion
One of the strong points of Kubernetes has always been its extensibility. Thanks to the aggregation layer, you can extend the API without adding extra complexity or configuration for the resource consumers (the Horizontal Pod Autoscaler in our example).
If you plan to use this integration in your organization, or just in a lab environment, we definitely want to hear from you! You can reach us on Slack or Twitter and, of course, PRs to the project are welcome.
If you would like to run this example but don’t have a Sysdig Monitor account, we invite you to sign up for a free trial.