Discovering StatefulSets

Objective

After completing this lesson, you will be able to compare Deployments and StatefulSets.

Business Scenario for Deployments and StatefulSets

You have already deployed your applications to your cluster using Deployments and ReplicaSets. A colleague has told you that these are unsuitable for stateful applications such as databases. Now you want to learn more about stateful applications and how to deploy them in Kubernetes.

What are StatefulSets in Kubernetes?

Deploying stateless applications is easy in Kubernetes. You can use a Deployment, which creates a ReplicaSet, which in turn creates Pods from its Pod template. If one Pod fails, the ReplicaSet creates a new one. These Pods are identified by their specification and their labels, not by their names. Each Pod name ends in a generated suffix, such as hello-kyma-59fd556764-fkr6d. This works because the Pods are stateless: any Pod can be replaced by another Pod with the same specification and labels.
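For comparison with the StatefulSet discussed in this lesson, a Deployment for a stateless application might be sketched as follows. The name hello-kyma is taken from the example Pod name above; the label, the replica count, and the container port are assumptions made for illustration, and the image is the sample image used elsewhere in this lesson.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-kyma
spec:
  replicas: 2                 # assumed replica count
  selector:
    matchLabels:
      app: hello-kyma         # assumed label; must match the template labels
  template:
    metadata:
      labels:
        app: hello-kyma
    spec:
      containers:
        - name: hello-kyma
          image: ghcr.io/sap-samples/kyma-runtime-learning-journey/hello-kyma:1.0.0
          ports:
            - containerPort: 8080   # assumed port
```

Pods created from such a Deployment get names of the form hello-kyma-<pod-template-hash>-<random-suffix>, so a replacement Pod never keeps the name of the Pod it replaces.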

In contrast, stateful applications often need a unique and consistent identity, such as a stable hostname. ReplicaSets cannot provide this: a replacement Pod gets a new random name, which makes it hard to identify a particular Pod and connect to it. For this reason, Kubernetes provides StatefulSets.

According to the official Kubernetes documentation, you can use StatefulSets when your workload requires one or more of the following:

  • Stable, unique network identifiers
  • Stable, persistent storage
  • Ordered, graceful deployment and scaling
  • Ordered automated rolling updates

In short, a StatefulSet manages a group of Pods with unique, persistent identities and stable hostnames that Kubernetes maintains across rescheduling.

How Does a StatefulSet Work?

When you create a StatefulSet, its Pods are created in sequential order. The first Pod is named my-stateful-app-0, the second my-stateful-app-1, and so on. Because the index starts at 0, the highest index is one less than the number of replicas specified in the StatefulSet. This differs from a ReplicaSet, which creates its Pods in parallel and in no particular order.

The StatefulSet starts with the lowest index and waits for that Pod to be ready. Only then does it create the next Pod. This ensures that the Pods are created in sequential order and that each Pod is ready before the next one is created.

You can also attach persistent storage to each Pod of a StatefulSet. The data of the Pod is then stored on persistent storage and is not lost when the Pod is deleted. This is useful, for example, when you have a database that has to persist data. If the database Pod crashes or is deleted, the data is still available on the persistent storage.

A manifest for a StatefulSet might look like this:

YAML
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-stateful-app
spec:
  selector:
    matchLabels:
      app: my-stateful-app
  serviceName: my-stateful-app
  replicas: 3
  template:
    metadata:
      labels:
        app: my-stateful-app
    spec:
      containers:
        - name: my-stateful-app
          image: ghcr.io/sap-samples/kyma-runtime-learning-journey/hello-kyma:1.0.0
          ports:
            - containerPort: 8080
              name: web
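The manifest above does not yet attach storage or spell out the ordering behavior. As a minimal sketch, the same StatefulSet could be extended with a volumeClaimTemplates section, which requests one PersistentVolumeClaim per Pod, and with the podManagementPolicy and updateStrategy fields set explicitly to their default values. The volume name data, the mount path /data, and the size of 1Gi are assumptions for illustration; the sample image may not actually write to that path, and the cluster is assumed to have a default storage class.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-stateful-app
spec:
  selector:
    matchLabels:
      app: my-stateful-app
  serviceName: my-stateful-app
  replicas: 3
  # Default value: Pods are started one at a time, lowest index first
  podManagementPolicy: OrderedReady
  # Default value: Pods are updated one at a time
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: my-stateful-app
    spec:
      containers:
        - name: my-stateful-app
          image: ghcr.io/sap-samples/kyma-runtime-learning-journey/hello-kyma:1.0.0
          ports:
            - containerPort: 8080
              name: web
          volumeMounts:
            - name: data            # must match the claim template name below
              mountPath: /data      # assumed path, for illustration only
  volumeClaimTemplates:
    - metadata:
        name: data                  # assumed name
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi            # assumed size
```

With this sketch, each Pod gets its own claim named after the template and the Pod, for example data-my-stateful-app-0. When a Pod is recreated under the same name, it is bound to the same claim again, which is what keeps the data available across restarts.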

An important part to consider is the service for a StatefulSet, which is defined as a so-called headless service. A regular service that load balances traffic to Pods managed by a ReplicaSet selects the Pods by their labels, because the Pod names are random and any Pod can handle a request. With a StatefulSet, on the other hand, clients often need to reach a specific Pod by its stable name, such as my-stateful-app-1. In this case, a single load-balanced IP address in front of the Pods is not required. Instead, the headless service gives each Pod its own DNS name, such as my-stateful-app-1.my-stateful-app within the same namespace.

A manifest for a Headless Service might look like this:

YAML
apiVersion: v1
kind: Service
metadata:
  name: my-stateful-app
spec:
  clusterIP: None # this makes the service headless
  selector:
    app: my-stateful-app
  ports:
    - name: web
      port: 80

Note

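The clusterIP is set to None, which makes the service headless in Kubernetes. This means that the service does not get a cluster IP and does not load balance traffic. A DNS lookup of the service name returns the IP addresses of the Pods directly.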
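To illustrate how a client could address one specific Pod through the headless service, the following sketch defines a one-off Pod that sends a request to the first replica by its stable DNS name. The Pod name stateful-client and the busybox image are assumptions, as is the idea that the sample application answers HTTP requests on the root path. The client is assumed to run in the same namespace as the StatefulSet.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: stateful-client       # hypothetical name
spec:
  restartPolicy: Never        # run the request once, then stop
  containers:
    - name: client
      image: busybox:1.36     # assumed image; any image with wget or curl would do
      # <pod-name>.<service-name> resolves to the IP of that one Pod
      command: ["wget", "-qO-", "http://my-stateful-app-0.my-stateful-app:8080"]
```

Because a headless service does not proxy traffic, the request goes straight to the Pod. The client therefore uses the container port 8080 rather than the service port 80.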

Summary

In this lesson, you learned about StatefulSets in Kubernetes and that StatefulSets are used to deploy stateful applications, such as databases or message brokers. You can now also differentiate StatefulSets from ReplicaSets: StatefulSets create Pods in sequential order, each with a unique index and a stable name. Additionally, you learned that you can use a headless service to connect to the Pods of a StatefulSet.
