With the current boom in containers, you may have started to consider running your application in containers, or you may already be doing so. Either way, you are probably using (or considering) Kubernetes. In this article, we will see what Kubernetes is and how it works. The purpose of this post is not to show how to develop for Kubernetes, but to explain which components make up Kubernetes and what role each of them plays.
What does Kubernetes do?
That is the first question we must answer: to understand how Kubernetes works, we need to see what role it plays. As you probably know, it is classified as a "container orchestrator", which places it in the same category as Swarm, Marathon or Amazon ECS.
An orchestrator is in charge of managing the life cycle of an application's containers (the concept of "orchestrator" is much broader in general, but in this article we focus only on container orchestrators). The services that an orchestrator usually offers are the following:
- Cluster management (adding nodes to the cluster or removing them from it)
- Container life cycle management (e.g. restarting containers that fail)
- Service discovery (so that a container can find the IP address or DNS name of another container)
- Network services and load balancing (distribute the load between the different machines in the cluster)
- Monitoring services
- Health status check services (of the cluster and of each of the containers)
Kubernetes offers all of those services (and some more, such as autoscaling and state management) and has become the de facto standard for deploying container-based applications.
The architecture of Kubernetes
You should see Kubernetes more as a "project" than as a "product": Kubernetes relies on several previously existing products (such as etcd) and combines them with components developed specifically for it. Installing and configuring an on-premises Kubernetes cluster is not easy, and that is precisely why "managed" Kubernetes services are so popular. These services give you a correctly installed, running Kubernetes cluster in a matter of minutes.
The "managed" services are those provided by a cloud provider and managed by it, such as AKS, EKS or GKE.
A Kubernetes cluster consists of several machines. Each of these machines (nodes) must play one of the following roles:
- Master node
- Minion node
Master nodes
The master nodes are responsible for coordinating the cluster. In general, master nodes do not execute containers. Although they can, doing so is not recommended in production. Every Kubernetes cluster must have at least one master node.
The master nodes are responsible for deciding on which (minion) node each container runs, for maintaining the state of the cluster, for ensuring that the desired number of containers is running at all times, and for updating applications in a coordinated manner when new versions are deployed.
A master node executes the following processes:
- etcd: a key-value database used to store the global configuration of the cluster. The information contained in etcd is critical and you should always have a backup plan for it.
- kube-apiserver: the master nodes expose an API that the minion nodes and the clients of the cluster use to communicate with the cluster.
- kube-scheduler: the Kubernetes component responsible for deciding on which node a specific container runs.
- kube-controller-manager: responsible for running the different controllers. A "controller" ensures that the desired state of the application and the cluster is met at all times (for example, that five instances of a given container are always running).
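The controller idea from the last item can be illustrated with a toy reconciliation loop. This is only a sketch under stated assumptions: `ToyCluster`, `start_instance`, `stop_instance` and `reconcile` are hypothetical names invented for this example and are not part of any Kubernetes API. A real controller observes state through kube-apiserver rather than through an in-memory list.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyCluster:
    """In-memory stand-in for cluster state (hypothetical, not a Kubernetes API)."""
    running: list = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def start_instance(self) -> str:
        # Give each new instance a unique, increasing name.
        name = f"instance-{next(self._ids)}"
        self.running.append(name)
        return name

    def stop_instance(self) -> str:
        # Stop the most recently started instance.
        return self.running.pop()


def reconcile(cluster: ToyCluster, desired: int) -> None:
    """Drive the observed number of instances toward the desired number."""
    while len(cluster.running) < desired:
        cluster.start_instance()
    while len(cluster.running) > desired:
        cluster.stop_instance()


cluster = ToyCluster()
reconcile(cluster, desired=5)          # nothing is running yet, so five instances start
cluster.running.remove("instance-2")   # simulate a container that crashed
reconcile(cluster, desired=5)          # the gap is detected and a replacement starts
print(cluster.running)
```

The point of the sketch is that the controller never receives a "container crashed" command: it only compares the observed state with the desired state and corrects the difference each time it runs.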
Since a master node does not normally execute containers, it does not need Docker or any of the other minion-node components that we will see next.
Why should a production cluster have a minimum of three master nodes?
This is due to etcd. Remember that etcd is used to store the global state of the cluster and that its information is critical. etcd needs a majority of its nodes to agree before it accepts a change. If you have three etcd nodes and you lose one, the system can continue to work, since the two remaining nodes still form a majority. However, you cannot lose another one. For this reason, etcd nodes are always added in pairs so that the total stays odd: with three you can afford to lose one, with five you can lose two, and so on. A fourth node would not help, because a majority of four is three, so you could still lose only one. Of course, it is assumed that these losses are temporary (i.e. one node fails and the cluster continues to work while we add a replacement).
Precisely because Kubernetes is made up of several products, creating a high-availability cluster is complicated: the previous paragraph mentioned etcd, but the availability of the other components must be taken into account as well. In fact, there are Kubernetes installations in which etcd runs on separate nodes: that is, we have a minimum of three master nodes (without etcd), three more nodes with etcd, and two minion nodes.
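The arithmetic behind these numbers can be written down in a few lines. This is a minimal sketch that assumes a simple majority quorum. The function names `quorum` and `fault_tolerance` are chosen for this example and do not come from etcd.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` nodes."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """Number of members that can fail while a majority remains."""
    return members - quorum(members)


for n in range(1, 8):
    print(f"{n} members: quorum {quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")
```

Running the loop shows why odd sizes are preferred: going from three to four members raises the quorum from two to three but leaves the number of tolerated failures at one.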
Minion nodes
A minion node is one that executes the containers deployed in the cluster. It consists of three basic elements:
- Container engine: obviously, a container engine must be installed on the node. Many people assume that Kubernetes can only run Docker containers, but that is not strictly true. Out of the box, Kubernetes can run Docker and rkt, and other existing engines can be integrated through CRI.
- Kube-proxy: in charge of managing the virtual network and the virtual IPs assigned to each container.
- Kubelet: the most important component of a minion node, whose main function is to make sure that all the containers that must run on this node are running. That is, when a kubelet is ordered to start a container, it starts it and then monitors that it keeps running (restarting it if the container ends due to an unhandled exception) until the order to stop the container arrives.
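The kubelet behavior described in the last item (start a container, watch it, restart it on failure until a stop order arrives) can be sketched as a small supervision loop. This is an illustration only: `FakeContainer` and `supervise` are hypothetical names invented for this example, the container is simulated with a scripted list of exit codes, and the sketch leaves out the restart policies and back-off delays that a real kubelet applies.

```python
class FakeContainer:
    """Simulated container: each run returns the next scripted exit code."""

    def __init__(self, exit_codes):
        self.exit_codes = list(exit_codes)
        self.stop_requested = False
        self.starts = 0

    def run(self) -> int:
        """Pretend to run until the container exits, then return its exit code."""
        self.starts += 1
        code = self.exit_codes.pop(0)
        if not self.exit_codes:
            # Assumption for the demo: the stop order arrives during the last run.
            self.stop_requested = True
        return code


def supervise(container: FakeContainer) -> int:
    """Start the container and restart it after every exit until a stop is requested.

    Returns the number of restarts that were needed.
    """
    restarts = 0
    container.run()  # initial start
    while not container.stop_requested:
        restarts += 1
        container.run()
    return restarts


container = FakeContainer([1, 137, 0])  # two crashes, then a run ended by the stop order
restarts = supervise(container)
print(f"starts: {container.starts}, restarts: {restarts}")
```

With the scripted exit codes above, the container is started three times in total, two of which are restarts after a crash.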
Other elements of Kubernetes
Apart from the previous elements, there is a set of add-ons that run in the cluster. These add-ons run as containers (pods, in Kubernetes terminology) and are optional: they may or may not be present. They usually run in the kube-system namespace and offer cross-cutting services. The best known are:
- Web Dashboard: offers a graphical web interface to monitor and manage the cluster.
- Cluster monitoring, based on cAdvisor, whose metrics the kubelet exposes so that they can be queried.
- Centralized logging systems.
- Internal DNS: this add-on is special because, although it is optional by definition, in practice it can be considered mandatory, since it is usually assumed that there is a DNS system within the cluster. Most clusters use kube-dns for this.