Deployments, StatefulSets, and DaemonSets: A Field Guide

At the heart of any Kubernetes deployment strategy lies the Pod. The workhorse of distributed container solutions, the Kubernetes Pod binds a group of containers to a single network stack and, optionally, a shared process namespace. Pod processes can communicate with one another over loopback (127.0.0.1) and, when process namespace sharing is enabled, signal each other using POSIX signal(7) mechanics.

Strange, then, that almost nobody spins up bare Pods.

Instead, virtually every Helm chart you run across will give you either a Deployment, a StatefulSet, or a DaemonSet. But what do these constructs actually do?

Each of these higher-level constructs eventually culminates in the creation of a set of Pods. The core controllers in a Kubernetes cluster implement this assembly line of container construction, and help ensure that if any Pod belonging to one of these constructs goes missing, it gets recreated exactly as it was originally specified.

Deployments, StatefulSets, and DaemonSets allow you to scale up your Pods, roll out new images and configurations, and more. But, how do you pick which one to use for a given situation?

It all depends on what you want out of your pods.

When Your Pods Are Interchangeable

If scaling up means just run more of these, then you want a Deployment. These are by far the most common in the wild, and they fit nicely with application workloads and other 12-factor applications that share no state and have no need for identity.

For example, let’s say you have a deployment of your Rust-based web application:
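A minimal sketch of what that Deployment might look like (the name, image, port, and ConfigMap here are placeholders, not taken from any real chart):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: webapp                    # hypothetical name for the Rust web application
    spec:
      replicas: 1                     # a single interchangeable Pod to start with
      selector:
        matchLabels:
          app: webapp
      template:
        metadata:
          labels:
            app: webapp
        spec:
          containers:
            - name: webapp
              image: registry.example.com/webapp:1.0   # placeholder image
              ports:
                - containerPort: 8080
              envFrom:
                - configMapRef:
                    name: webapp-config                # placeholder ConfigMap with database settings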

The load balancer out front (also provided by Kubernetes, via Services) routes traffic to the single application instance, as in Figure 1. The application itself keeps all state in the relational database behind it, so if we want to scale, we just ask for more copies (Figure 2). Kubernetes’ Deployment Controller immediately springs into action, creating all of the new (now missing) Pods at once, by way of its underlying ReplicaSet.

These new Pods boot with the same environment variables set and the same ConfigMaps mounted in, presumably allowing them to talk to the database just like the first Pod.
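In practice, scaling is usually just a matter of raising the replica count and re-applying the manifest (kubectl scale deployment webapp --replicas=3 does the same thing imperatively, using the hypothetical name from the sketch above):

    # The only field that changes in the hypothetical Deployment above; re-applying
    # it asks the Deployment controller to create the two missing Pods.
    spec:
      replicas: 3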

Best Use Cases for Deployments:

  • Stateless application workloads
  • When you just want a single Pod (replicas: 1)

When Your Pods Are Unique

The 12-factor, never-be-responsible-for-the-data approach is a great one for commodity computation, like the kind of workloads that make up API services, backend service brokers, and the like. But eventually, somebody has to actually handle the data in order to make the whole thing worthwhile.

To scale stateful systems, you need a StatefulSet. Let’s consider the relational database system behind the applications from our previous deployment (Figure 4, below). Ideally, we’d like to be able to scale it up and build some sort of consensus-based leader/follower topology.

The exact details of how this would work are beyond the scope of this blog post, but suffice it to say that one cannot hold elections without identity.

When we scale the StatefulSet from 1 replica to 3, the StatefulSet controller starts to incrementally deploy the new (missing) Pods, one at a time. First, the (1) pod comes up, initializes, and then settles into a “ready” state. Then, the second new pod (2) does likewise.

One of the more powerful features of a StatefulSet is the use of persistent volume claim templates, a means to provide each Pod in a StatefulSet with its own persistent volume. If you’re trying to replicate database contents (as we are in this example), this is a critical capability.
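As a minimal sketch (all names, the image, and the storage size are placeholders), the serviceName field points at a headless Service that gives each replica a stable DNS identity (db-0, db-1, ...), while volumeClaimTemplates gives each of those replicas its own PersistentVolumeClaim:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: db                          # hypothetical database cluster
    spec:
      serviceName: db                   # headless Service that provides stable identities: db-0, db-1, ...
      replicas: 3
      selector:
        matchLabels:
          app: db
      template:
        metadata:
          labels:
            app: db
        spec:
          containers:
            - name: db
              image: registry.example.com/db:13        # placeholder image
              volumeMounts:
                - name: data
                  mountPath: /var/lib/db               # placeholder data directory
      volumeClaimTemplates:
        - metadata:
            name: data                  # each replica gets its own claim: data-db-0, data-db-1, ...
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi

If db-1 is deleted or rescheduled, a Pod with the same name comes back and reattaches to the same volume, which is exactly the kind of stable identity that leader election and replication depend on.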

Best Use Cases for StatefulSets:

  • Data Services (databases, key-value stores, etc.)
  • Identity-sensitive systems (consensus systems, leader-election clustering, etc.)
  • Anything that needs slow roll-out and cluster coalescence.

When Your Pods Must Be Everywhere

Occasionally, cluster operators would like to leverage Kubernetes constructs like Pods, ConfigMaps, and Secrets to provide cluster-level services. kube-proxy is a prime example of this. It handles the binding of NodePort Services to actual ports on each cluster node, to facilitate load balancing in a variety of configurations.

To do this properly, each node in the cluster needs to run exactly one instance of the kube-proxy Pod. This is a nightmare to do with a Deployment or a StatefulSet, but it’s what DaemonSets were made for. Literally.

A DaemonSet will, by default, run exactly one Pod on each node in the cluster (whether those nodes are usable by other workloads is irrelevant). The neat thing about this is that when a new node is added to the cluster, either manually or automatically, it implicitly picks up a new Pod from the DaemonSet.

This is illustrated in the following set of figures:

We start off in Figure 7 with two Kubernetes nodes (node/0 and node/1). The Pod in the lower-left corner of each node is under the control of a DaemonSet. When we add node/2 in Figure 8, the DaemonSet Controller notices the addition and configures a new Pod to execute on that node, and only that node. The existing Pods on the first two nodes are not affected.
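For completeness, a minimal DaemonSet sketch (the name and image are placeholders standing in for something like kube-proxy or a node-level security agent); the blanket toleration is what allows the Pod onto tainted nodes (control-plane nodes, for example):

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: node-agent                  # hypothetical per-node agent
    spec:
      selector:
        matchLabels:
          app: node-agent
      template:
        metadata:
          labels:
            app: node-agent
        spec:
          tolerations:
            - operator: Exists          # tolerate every taint, so the Pod runs on all nodes
          containers:
            - name: agent
              image: registry.example.com/node-agent:1.0   # placeholder image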

Best Use Cases for DaemonSets:

  • Cluster services like kube-proxy.
  • Security solutions like antivirus, intrusion detection, and image scanning.
