Working with Horizontal Pod Autoscalers (HPA)

Objectives

After completing this lesson, you will be able to:
  • Identify the features of Horizontal Pod Autoscalers (HPA)
  • Use HPAs in Kubernetes to automatically scale your workloads

Horizontal Pod Autoscaling

Usage scenario

You have deployed workloads to your Kubernetes cluster. So far, you have only configured a fixed number of simultaneous replicas for each workload. However, you want to be able to scale your workloads automatically based on the current load. For this, you want to explore the capabilities of Horizontal Pod Autoscalers (HPA) in Kubernetes.

What are HorizontalPodAutoscalers?

With Deployments, you can define a fixed replica count for the managed ReplicaSet. This number might be too small or too large for the current load. For example, if you have a Deployment with three replicas and the current load is very high, you should scale up the Deployment to five or more replicas. If the load is low, you should scale down the Deployment to two replicas. This is where HPAs come into play. With HPAs, you can define the minimum and the maximum number of replicas for a Deployment. The HPA then automatically scales the Deployment up or down, within these limits, based on the current load.

You can configure this automatic scaling based on a metric, such as CPU or memory usage. The HPA checks the metric's current value, averaged across the Pods of the Deployment, and compares it to a target value. If the current value is higher than the target value, the HPA scales up the Deployment. For example, you can set a target CPU utilization of 50%. If the average CPU utilization of the Pods is higher than 50%, the HPA scales up the Deployment. If it is below 50%, the HPA scales down the Deployment. Read more about metrics in Container resource metrics.
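The comparison between current and target value can be made concrete with a little arithmetic. The Kubernetes documentation describes the calculation as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The following shell sketch applies that formula and clamps the result to the configured minimum and maximum. The function name desired_replicas and the sample numbers are made up for illustration, and the sketch ignores details of the real controller such as its tolerance band, stabilization windows, and Pods that are not yet ready.

```shell
# Hypothetical helper: estimate the replica count an HPA would aim for.
# Usage: desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
desired_replicas() {
  current=$1; util=$2; target=$3; min=$4; max=$5
  # Integer ceiling of current * util / target
  desired=$(( (current * util + target - 1) / target ))
  # Clamp the result to the range [min, max]
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 3 80 50 3 5   # 3 Pods at 80% average CPU, 50% target: prints 5
desired_replicas 5 20 50 3 5   # 5 Pods at 20% average CPU, 50% target: prints 3
```

In the first call, the formula yields ceil(4.8) = 5 replicas. In the second call, it yields 2, but the minimum of 3 replicas takes precedence.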

An HPA is explicitly called horizontal because it scales the number of Pods horizontally. This means that the HPA adds or removes Pods to increase or decrease their number. This is in contrast to vertical scaling, where the resources of a Pod are increased or decreased. For example, you can increase the CPU and memory of a Pod by raising the values in the resources section of the Pod definition.

Defining a HorizontalPodAutoscaler

To define an HPA, you create a YAML file. The following YAML file defines an HPA that scales the Deployment named hello-kyma:

YAML
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello-kyma-hpa
spec:
  minReplicas: 3
  maxReplicas: 5
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hello-kyma
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50

The minReplicas and maxReplicas fields define the minimum and the maximum number of replicas for the Deployment. The scaleTargetRef field identifies the Deployment that is scaled. The metrics field defines the metric used for scaling, which in this case is CPU usage. The target field defines the target value for the metric. In this case, the target value is an average CPU utilization of 50% across the Pods.
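A Utilization target is measured as a percentage of the CPU that the containers request, so the HPA can only evaluate it if the Pods of the target Deployment define CPU requests. The following sketch shows one way to check this with kubectl. The function name show_cpu_requests is hypothetical, and the value 100m in the commented example is an assumed placeholder, not a recommendation.

```shell
# Hypothetical helper: print the CPU requests of all containers in a Deployment.
# An empty result means that no CPU request is set.
show_cpu_requests() {
  kubectl get deployment "$1" \
    -o jsonpath='{.spec.template.spec.containers[*].resources.requests.cpu}'
}

# Example calls (they need access to a cluster, so they are commented out):
# show_cpu_requests hello-kyma
# kubectl set resources deployment hello-kyma --requests=cpu=100m   # assumed value
```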

Deploying a HorizontalPodAutoscaler

To deploy a HorizontalPodAutoscaler, you can use the kubectl apply command:

Code Snippet
kubectl apply -f hpa.yaml

Alternatively, to create a HorizontalPodAutoscaler for an existing Deployment or ReplicaSet without writing a YAML file, you can use the kubectl autoscale command:

Code Snippet
kubectl autoscale deployment hello-kyma --cpu-percent=50 --min=3 --max=5
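
After creating an HPA with either method, it can help to check what the autoscaler currently sees. The sketch below wraps two standard kubectl commands in a small helper function. The name show_hpa is hypothetical and not part of kubectl. Note that kubectl autoscale names the HPA after its target by default, so the command above creates an HPA called hello-kyma, while the YAML file creates one called hello-kyma-hpa.

```shell
# Hypothetical helper: show the state of one HPA in the current namespace.
show_hpa() {
  # One-line summary: targets, minimum and maximum Pods, current replicas
  kubectl get hpa "$1"
  # Detailed view, including conditions and recent scaling events
  kubectl describe hpa "$1"
}

# Example calls (they need access to a cluster, so they are commented out):
# show_hpa hello-kyma-hpa   # HPA created from hpa.yaml
# show_hpa hello-kyma       # HPA created by kubectl autoscale
```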

You can also create HorizontalPodAutoscalers using the Kyma dashboard. To do so, go to the Discovery and Network menu item and select Horizontal Pod Autoscalers. Choose the Create button and fill out the form.

Screenshot: The Horizontal Pod Autoscalers section of the Kyma dashboard, which lists the autoscalers with details such as name, creation time, labels, metrics, minimum and maximum Pods, replicas, and status.

Summary

Horizontal Pod Autoscaling dynamically adjusts the number of Pods in a Deployment based on metrics like CPU or memory usage. With an HPA, you set minimum and maximum replica counts, and Kubernetes automatically scales the Deployment up or down within these limits according to the current load.

Further Reading about Horizontal Pod Autoscaling