Kubernetes Capacity 101
not just an afterthought
Capacity vs. Allocatable
On a Kubernetes worker node, some resources are reserved for system components such as the kubelet, kube-proxy, and the container runtime. The Allocatable resources are therefore always less than the Capacity.

Allocatable on a Kubernetes node is defined as the amount of compute resources that are available for pods. As shown in the image below, a node with a capacity of 8 vCPUs (8000 millicores) and approximately 32GB of memory has slightly lower allocatable resources.
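To make the subtraction concrete, here is a minimal Python sketch. It assumes the formula described in the kubelet documentation (Allocatable = Capacity minus kube-reserved, system-reserved, and the hard eviction threshold). The `Resources` class and every reservation figure below are hypothetical values picked for an 8 vCPU / 32Gi node, not numbers read from a real cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu_millicores: int
    memory_mib: int

    def minus(self, other: "Resources") -> "Resources":
        return Resources(
            self.cpu_millicores - other.cpu_millicores,
            self.memory_mib - other.memory_mib,
        )


def allocatable(capacity, kube_reserved, system_reserved, eviction_threshold):
    """Allocatable = Capacity - kube-reserved - system-reserved - eviction threshold."""
    return capacity.minus(kube_reserved).minus(system_reserved).minus(eviction_threshold)


# Hypothetical numbers, for illustration only.
capacity = Resources(cpu_millicores=8000, memory_mib=32768)
kube_reserved = Resources(cpu_millicores=90, memory_mib=1800)
system_reserved = Resources(cpu_millicores=0, memory_mib=0)
eviction_threshold = Resources(cpu_millicores=0, memory_mib=100)

node_allocatable = allocatable(capacity, kube_reserved, system_reserved, eviction_threshold)
print(node_allocatable)
```

On a real node, the capacity and allocatable values reported by the node itself are the ones to trust; the sketch only shows where the gap comes from.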

Requests and Limits
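For a Pod, you can optionally specify how much of each resource a container needs. By resources, we mean CPU and memory (RAM). A request is the amount the scheduler sets aside for the container when placing the pod; a limit is the maximum the container is allowed to use.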

Thanks to efficient resource bin packing (scheduling), a node will host multiple pods, and the total requested resources of all running pods (scheduled on a particular node) represent the allocated resources for that node.

As shown in the image below, the sum of all requested resources of the pods running on that node amounts to allocated resources:

AVAILABLE = (CAPACITY - RESERVED) - REQUESTED
Among the nodes that can fit its requests, a pod is typically assigned to the one with more unrequested (available) resources, and goes on with its happy and wonderful life full of SLO-compliant replies to requests.
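The formula and the placement rule above can be sketched in a few lines of Python. This is a deliberately simplified model, not the real scheduler: it looks at CPU millicores only, and the node names, numbers, and the `pick_node` helper are all hypothetical.

```python
def available(capacity, reserved, requested):
    """AVAILABLE = (CAPACITY - RESERVED) - REQUESTED, all in the same unit."""
    return (capacity - reserved) - requested


def pick_node(nodes, pod_request):
    """Return the name of the feasible node with the most available CPU, or None.

    Simplified: considers CPU millicores only, unlike the real scheduler.
    """
    feasible = {}
    for name, node in nodes.items():
        free = available(node["capacity"], node["reserved"], sum(node["requests"]))
        if free >= pod_request:
            feasible[name] = free
    if not feasible:
        return None
    return max(feasible, key=feasible.get)


# Hypothetical nodes; all values are CPU millicores.
nodes = {
    "node-a": {"capacity": 8000, "reserved": 90, "requests": [2000, 1500, 3000]},
    "node-b": {"capacity": 8000, "reserved": 90, "requests": [500, 1000]},
}

chosen = pick_node(nodes, pod_request=2000)
print(chosen)
```

Here node-a has too little unrequested CPU left for a 2000m pod, so only node-b is feasible. A pod whose request fits nowhere gets `None`, which corresponds to a pod that stays Pending.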

Overcommitting
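Overcommitting means exceeding the allocatable resources of a node. The scheduler places a pod only where its requests fit, so the sum of requests on a node stays within the allocatable resources; the sum of limits is not capped in the same way, which is why most often you’re going to overcommit with limits.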
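A quick way to see overcommitting with limits is to express the summed requests and summed limits as a percentage of allocatable. The sketch below does that for one node; the `commitment` helper, the pod values, and the 7910m allocatable figure are hypothetical.

```python
def commitment(allocatable, pods):
    """Return (requests %, limits %) of allocatable for a list of pods."""
    total_requests = sum(p["request"] for p in pods)
    total_limits = sum(p["limit"] for p in pods)
    return (
        round(100 * total_requests / allocatable, 1),
        round(100 * total_limits / allocatable, 1),
    )


# Hypothetical pods on a node with 7910 CPU millicores allocatable.
pods = [
    {"request": 2000, "limit": 4000},
    {"request": 2500, "limit": 4000},
    {"request": 3000, "limit": 3000},
]

requests_pct, limits_pct = commitment(7910, pods)
print(f"requests: {requests_pct}%  limits: {limits_pct}%")
```

With these numbers the requests stay below 100% of allocatable while the limits go above it, which is the overcommitted situation described above.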
It’s usually OK to overcommit on CPU because it is a compressible resource: a container that wants more CPU than it may have is throttled, not killed. Memory, on the contrary, is not compressible: if you overcommit on memory and the node runs short, containers start getting killed (OOMKilled).
[Exit code 137 is 128 + 9 (SIGKILL): the process was killed by an external signal. That is not necessarily an OOM kill; it can also happen because a health check has failed.]
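The 128 + signal arithmetic can be checked with Python's standard `signal` module. This is a small sketch that assumes a POSIX system (where signal 9 is SIGKILL); `describe_exit_code` is a hypothetical helper name.

```python
import signal


def describe_exit_code(code):
    """Explain a container exit code using the 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    if code == 0:
        return "exited normally"
    return f"exited with error code {code}"


print(describe_exit_code(137))
print(describe_exit_code(143))
```

The exit code alone only says which signal ended the process. To tell an OOM kill apart from other kills, check the termination reason recorded in the container's status, which reads OOMKilled in the memory case.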
M vs Mi: 1 Mi is larger than 1 M by about 4.86%
Mi is mebibytes and M is megabytes. The difference boils down to decimal (SI) units vs. binary (IEC) units.
- 400M is decimal, meaning 1M=10⁶ bytes (1 000 000 bytes)
- 400Mi is binary, meaning 1Mi=2²⁰ bytes (1 048 576 bytes)
For small sizes the difference between the units hardly matters, but for larger sizes it becomes significant: 400Mi is already about 19.4 MB more than 400M. And don’t forget that computers use the binary system, so Mi is the preferred unit.
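To check the numbers, the sketch below converts both notations to bytes. It is a simplified parser that assumes integer values and only the suffixes listed in the table; `to_bytes` is a hypothetical helper, not part of any Kubernetes client library.

```python
# Decimal (SI) and binary (IEC) suffixes used in memory quantities.
SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def to_bytes(quantity):
    """Convert a memory quantity such as '400M' or '400Mi' to bytes.

    Simplified: handles only integer values with the suffixes above.
    """
    for suffix, factor in SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix means plain bytes


decimal_bytes = to_bytes("400M")
binary_bytes = to_bytes("400Mi")
print(decimal_bytes, binary_bytes, binary_bytes - decimal_bytes)
```

The ratio between the two results is 1.048576, which is where the "about 4.86%" figure comes from.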
QoS: Quality of Service Classes: BestEffort, Burstable, Guaranteed
QoS Guaranteed: every container in the pod must have CPU and memory requests equal to its limits. In return, this class offers predictable resource usage, prevents overcommitting resources, and improves stability, since Guaranteed pods are the least likely to be killed when a node runs short of resources.
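The rules for the three classes can be written down as a small classifier. This is a sketch of the documented rules, not the kubelet's actual code: `qos_class` is a hypothetical function, it compares quantities as plain strings (so "500m" and "0.5" would not match), and it treats an unset request as equal to the limit, mirroring how Kubernetes defaults a missing request to the limit.

```python
def qos_class(containers):
    """Return the QoS class for a pod, given its containers' resources.

    Each container is a dict with optional "requests" and "limits" dicts,
    keyed by "cpu" and "memory".
    """
    any_set = False
    guaranteed = True
    for c in containers:
        requests = c.get("requests", {})
        limits = c.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # A missing request falls back to the limit.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


equal = {"cpu": "500m", "memory": "256Mi"}
print(qos_class([{"requests": equal, "limits": equal}]))
print(qos_class([{"requests": {"cpu": "250m"}, "limits": equal}]))
print(qos_class([{}]))
```

The three calls cover one pod of each class: requests equal to limits, requests below limits, and nothing set at all.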

Closing words
If your solution experiences instability, the cluster is likely under-provisioned; if you experience cost overruns, the cluster is probably over-provisioned.


For more insights, small wrapper scripts around kubectl
can be found at the following repo: