Kubernetes Pods Explained

Kubernetes Pods can be an interesting concept to get your head around. Although the idea is pretty simple, many people have trouble figuring out the relationships between pods and the other Kubernetes components. In this brief article I will describe pods and how they are used.

Beware that Kubernetes is very new and still evolving — so some things may change between the time at which I write this and you read it. With this in mind though, I don’t think that too much will change with the concept of pods.

First, let’s start by talking about Kubernetes itself. Kubernetes is a cluster-level management tool for Docker instances, so a single instance of Kubernetes is in charge of controlling many instances of Docker. Each Docker instance is then responsible for managing all of the containers it owns. Although Kubernetes does track all of the running containers (through the Kubelet), it manages them through the Docker API of the relevant Docker instance, leaving the low-level management to Docker itself. This is how Kubernetes ensures that all of the jobs it’s responsible for are actually running.

This means that Kubernetes needs a way for the user to specify which containers they need running. Kubernetes could simply allow users to specify the individual containers they want and then take responsibility for scheduling those containers across the cluster. However, this would artificially limit use cases. What if two containers need to run “close” to each other so that communication between them is fast? What if two containers need to be able to easily discover each other? What if you want multiple services running at the same virtual endpoint or IP?

All those things are what Kubernetes Pods allow users to specify.

The Kubernetes Pod allows the user to specify a group of containers (so the Pod:Container relationship is one:many). This group of containers should reflect a group of related processes, such as a web server and its cache. You wouldn’t place unrelated processes together inside a single Pod, though, as the Pod is also used as an organizational tool.

Through the JSON Pod template you are able to assign an id and a set of labels to a Pod. Labels allow you to create arbitrary key-value pairs that describe the purpose of the pod. An example labels object is shown below.

{
    "name": "WebServer",
    "environment": "Testing",
    "developer": "MikeB"
}

These labels specify that the pod that contains these labels is a Testing Web Server, owned by the developer “MikeB”. You could then query Kubernetes for all pods where name=WebServer and receive all of the WebServer instances. You could also use this to track who owns that pod, making it much easier to manage resources.
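To make the idea of querying by label a little more concrete, here is a minimal sketch in plain Python. It is not the Kubernetes API: the pods are ordinary dictionaries, and the ids, the second developer name and the `select_pods` helper are all made up for illustration. It only shows the matching rule, namely that a pod is selected when its labels contain every key-value pair in the query.

```python
# Illustrative model only: these dicts stand in for pods and are not real
# Kubernetes objects. The ids and the developer "AnnaK" are hypothetical.
pods = [
    {"id": "web-1", "labels": {"name": "WebServer", "environment": "Testing", "developer": "MikeB"}},
    {"id": "web-2", "labels": {"name": "WebServer", "environment": "Production", "developer": "AnnaK"}},
    {"id": "cache-1", "labels": {"name": "Cache", "environment": "Testing", "developer": "MikeB"}},
]


def select_pods(pods, selector):
    """Return the pods whose labels contain every key-value pair in selector."""
    return [
        pod
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


# All web servers, regardless of environment or owner.
print([pod["id"] for pod in select_pods(pods, {"name": "WebServer"})])
# Everything owned by one developer.
print([pod["id"] for pod in select_pods(pods, {"developer": "MikeB"})])
```

Adding more pairs to the query narrows the result, so asking for name=WebServer and environment=Testing together would return only the first pod in this toy data set.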

So Pods allow you to organize groups of related containers, but they aren’t just an organizational mechanism. Pods also affect the networking of the containers they own, because each Pod creates a network namespace that its containers share. This means that no two containers within a Pod may use the same port. It also means that the processes running inside a Pod’s containers appear to be running on the same machine/IP, although they aren’t able to interact with each other’s processes directly.

The ultimate impact of this is that if you had a web server, it would always know that its cache slave was running at the same IP, and always on the same port, which makes discovery very easy.
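You can get a feel for what sharing a network namespace means without Kubernetes at all. The sketch below is only an analogy: it uses two ordinary sockets on your own machine to stand in for two containers in one Pod, and the function name `demo_shared_namespace` is my own. It shows the two consequences described above: a second process cannot bind a port that is already taken, and a neighbour can reach the first process simply through the local address.

```python
import socket


def demo_shared_namespace():
    """Return (port_conflict, reachable) for two sockets sharing one network stack."""
    # Stand-in for the first container's process (say, the cache).
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    first.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    first.listen()
    port = first.getsockname()[1]

    # Stand-in for a second container trying to claim the same port.
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        second.bind(("127.0.0.1", port))
        port_conflict = False
    except OSError:
        # The address is already in use within this network namespace.
        port_conflict = True
    finally:
        second.close()

    # Stand-in for the web server discovering the cache at the local address.
    try:
        client = socket.create_connection(("127.0.0.1", port), timeout=2)
        client.close()
        reachable = True
    except OSError:
        reachable = False

    first.close()
    return port_conflict, reachable


if __name__ == "__main__":
    print(demo_shared_namespace())
```

In a real Pod the processes live in separate containers rather than one Python script, but the port behaviour they see is the same because the network namespace is shared.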

Pods in (Replication) Controllers

Pods are the smallest thing you can create in Kubernetes; they are its atomic unit. Therefore, if you’re looking to scale a service, you would create multiple copies of a given pod. You would not scale the number of containers within a pod.

This concept is utilized by Kubernetes controllers. At the time of writing there is only one controller: the Replication Controller. However, there are plans to design Kubernetes so that arbitrary controllers can be written. The Replication Controller allows you to specify how many replicas of a Pod you would always like to be running, and it ensures that exactly that number is running. Presently you cannot use a Replication Controller to automatically scale a Pod in response to application load or anything like that, so replica counts are relatively static. However, you can resize a Replication Controller.
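The "always exactly that number running" behaviour boils down to comparing a desired count with what is actually running and correcting the difference. Here is a minimal sketch of that comparison in Python. It is my own simplification and not how Kubernetes implements it: the `reconcile` function, the pod names and the choice of which surplus pods to stop are all assumptions made for the example.

```python
def reconcile(desired_replicas, running_pods):
    """Return (number_to_start, pods_to_stop) needed to reach the desired count.

    running_pods is a list of pod names; which surplus pods get stopped is an
    arbitrary choice here (the ones at the end of the list).
    """
    missing = desired_replicas - len(running_pods)
    if missing > 0:
        # Too few replicas: start enough new pods to make up the difference.
        return missing, []
    # Enough or too many replicas: stop whatever exceeds the desired count.
    return 0, running_pods[desired_replicas:]


# One pod died, so two more are needed to get back to three.
print(reconcile(3, ["web-1"]))
# Resizing down from three replicas to two leaves one pod to stop.
print(reconcile(2, ["web-1", "web-2", "web-3"]))
```

Resizing a Replication Controller then amounts to changing the desired count and letting the same comparison run again.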

Summary

In summary, Pods are logical one-to-many mappings of containers that provide organizational and networking benefits:

  • Pods are the atomic unit of Kubernetes — you can create a 1:1 pod:container mapping, but you cannot create a container without placing it in a Pod.
  • Pods allow you to assign arbitrary key-value pair “labels” designed to help you organize.
  • Pods create the networking namespace for their containers, so the containers in a Pod share the same IP and port space no matter where the Pod is being run.
  • Pods guarantee that their containers are placed on the same machine or virtual machine.

There are many more things that you can do with Kubernetes related to pods (services, health checking, etc.), but the goal of this article is to explain Pods themselves. These other features build on the basic logical axioms that Pods provide in order to create modified or added behaviour. For more information on the things you can do with Kubernetes and Pods, please see the Kubernetes GitHub page.
