Discovering StatefulSets

Objective

After completing this lesson, you will be able to compare Deployments and StatefulSets.

What Are StatefulSets in Kubernetes?

You have already deployed applications to your cluster using Deployments and ReplicaSets. However, these resources are unsuitable for stateful applications such as databases. In this lesson, you learn more about stateful applications and how to deploy them in Kubernetes.

In Kubernetes, deploying stateless applications is easy. You can use a Deployment, which creates a ReplicaSet, which, in turn, creates Pods. If one Pod fails, the ReplicaSet creates a new one. These Pods are identified based on their specification and applied labels. However, Pods created by a ReplicaSet get a generated suffix at the end of their name, such as hello-kyma-59fd556764-fkr6d. This works because the Pods are stateless and interchangeable: any of them can be replaced by another Pod with the same specification and labels.
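For comparison with the StatefulSet discussed later, the following is a minimal sketch of such a Deployment. The name hello-kyma is taken from the example Pod name above; the labels and the replica count are assumptions chosen for illustration.

```yaml
# Illustrative Deployment (labels and replica count are assumptions).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-kyma
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-kyma
  template:
    metadata:
      labels:
        app: hello-kyma
    spec:
      containers:
        - name: hello-kyma
          image: ghcr.io/sap-samples/kyma-runtime-learning-journey/hello-kyma:1.0.0
          ports:
            - containerPort: 8080
```

The resulting Pods would get names of the form hello-kyma-<hash>-<suffix>, and a replacement Pod receives a different suffix than the Pod it replaces.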

In contrast, stateful applications often need a unique and consistent identity, such as a stable hostname. This conflicts with how ReplicaSets work: a replacement Pod gets a new, randomly generated name, which makes it hard to identify a specific Pod and reconnect to it. For this reason, Kubernetes provides StatefulSets.

Based on the official Kubernetes documentation, you can use StatefulSets when your workload requires one or more of the following features:

  • Stable, unique network identifiers
  • Stable, persistent storage
  • Ordered, graceful deployment and scaling
  • Ordered, automated rolling updates

So, a StatefulSet manages a group of Pods with unique, persistent identities and stable hostnames that Kubernetes maintains across rescheduling.

How Does a StatefulSet Work?

When you create a StatefulSet, its Pods are created in sequential order. The first Pod is named my-stateful-app-0, the second my-stateful-app-1, and so on. Indexes start at 0, so the highest index is one less than the number of replicas specified in the StatefulSet. This behavior differs from a ReplicaSet, which creates its Pods in parallel and in no particular order.

The StatefulSet starts with the Pod that has the lowest index and waits for it to become ready. Only then does it create the next Pod, so each Pod is ready before the next one is started. Additionally, you can attach persistent storage to a stateful Pod to prevent data loss if the Pod is deleted. This is particularly useful for applications like databases that need to persist data. If the database Pod crashes or is deleted, the data remains available on the persistent storage.

See an example manifest for a StatefulSet:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-stateful-app
spec:
  selector:
    matchLabels:
      app: my-stateful-app
  serviceName: my-stateful-app
  replicas: 3
  template:
    metadata:
      labels:
        app: my-stateful-app
    spec:
      containers:
        - name: my-stateful-app
          image: ghcr.io/sap-samples/kyma-runtime-learning-journey/hello-kyma:1.0.0
          ports:
            - containerPort: 8080
              name: web
```
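The manifest above does not request any storage. As a sketch of how the persistent storage mentioned earlier could be attached, the same StatefulSet can be extended with a volumeClaimTemplates section. The volume name data, the mount path /data, and the size 1Gi are assumptions for illustration, and the example relies on the cluster having a default storage class.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-stateful-app
spec:
  selector:
    matchLabels:
      app: my-stateful-app
  serviceName: my-stateful-app
  replicas: 3
  template:
    metadata:
      labels:
        app: my-stateful-app
    spec:
      containers:
        - name: my-stateful-app
          image: ghcr.io/sap-samples/kyma-runtime-learning-journey/hello-kyma:1.0.0
          ports:
            - containerPort: 8080
              name: web
          volumeMounts:
            - name: data          # must match the claim template name below
              mountPath: /data    # assumed path; depends on the application
  volumeClaimTemplates:
    - metadata:
        name: data                # assumed name
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi          # assumed size
```

With this template, each Pod gets its own PersistentVolumeClaim, named after the template and the Pod (for example, data-my-stateful-app-0). When a Pod is recreated under the same name, it is bound to the same claim again, which is how the data outlives the Pod.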

An important part of a StatefulSet is its Service, which is defined as a so-called headless Service and referenced in the serviceName field of the StatefulSet. A regular Service that load balances traffic to Pods managed by a ReplicaSet selects the Pods by their labels, because the Pod names are generated and the Pods are interchangeable. The Pods of a StatefulSet, on the other hand, can be addressed individually by their stable names, such as my-stateful-app-1. In this case, you don't need a Service with a single IP address that load balances traffic across randomly named Pods. Instead, the headless Service gives each Pod its own DNS entry, such as my-stateful-app-1.my-stateful-app within the same namespace.

See an example of the headless Service manifest:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-stateful-app
spec:
  clusterIP: None
  selector:
    app: my-stateful-app
  ports:
    - name: web
      port: 80
```

Note

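In Kubernetes, a Service is considered headless if its clusterIP is set to None. Such a Service has no cluster IP, so Kubernetes doesn't load balance or proxy its traffic. Instead, DNS lookups for the Service return the IP addresses of the Pods directly.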
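Together with the serviceName field of the StatefulSet, a headless Service lets each Pod be resolved under a DNS name of the form <pod-name>.<service-name>. One way to check this is a short-lived Pod that performs a DNS lookup, sketched below. The Pod name dns-check and the busybox image are assumptions for illustration, and the Pod must run in the same namespace as the StatefulSet for the short name to resolve.

```yaml
# Hypothetical helper Pod that resolves the first Pod of the StatefulSet.
apiVersion: v1
kind: Pod
metadata:
  name: dns-check
spec:
  restartPolicy: Never
  containers:
    - name: dns-check
      image: busybox:1.28
      # <pod-name>.<service-name>, resolved within the same namespace
      command: ["nslookup", "my-stateful-app-0.my-stateful-app"]
```

After the Pod completes, its logs (for example, through kubectl logs dns-check) should show the IP address of my-stateful-app-0. If that Pod is deleted and recreated, the name stays the same even though the IP address may change.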

Summary

In this lesson, you learned that StatefulSets are used to deploy stateful applications, such as databases or message brokers. You can now also differentiate StatefulSets from ReplicaSets: a StatefulSet creates its Pods in sequential order, each with a unique index and a stable name. Additionally, you learned that you can use a headless Service to connect to the Pods of a StatefulSet.

Further Reading