Most content in this document is based on "Kubernetes In Action, Second Edition" by Marko Lukša and Kevin Conner.
Chapter 2 provides a comprehensive, low-level exploration of container technology. Rather than treating containers as magic black boxes, it strips away the abstraction to reveal that containers are simply regular Linux processes running directly on the host operating system. The chapter meticulously details how Linux kernel features—specifically Namespaces, Control Groups (cgroups), and advanced security profiles—are combined to create the illusion of a fully isolated environment. It also covers the practical workflow of the Docker platform, including image layering, building with Dockerfiles, and distributing applications via registries. This foundational OS-level knowledge is critical for effectively debugging, securing, and managing containerized applications in Kubernetes.
Understanding the difference between VMs and containers requires looking at how applications interact with the underlying hardware and kernel.
```mermaid
graph TD
    subgraph VM_Architecture [Virtual Machine]
        App1[App A]
        Lib1[Lib/Bins]
        GOS1[Guest OS Kernel]
        App2[App B]
        Lib2[Lib/Bins]
        GOS2[Guest OS Kernel]
        HYP[Hypervisor]
        HOST[Host Physical Hardware]
        App1 --> Lib1
        Lib1 --> GOS1
        App2 --> Lib2
        Lib2 --> GOS2
        GOS1 -->|Virtual Syscalls| HYP
        GOS2 -->|Virtual Syscalls| HYP
        HYP --> HOST
    end
    subgraph Container_Architecture [Linux Container]
        CApp1[App A]
        CLib1[Lib/Bins]
        CApp2[App B]
        CLib2[Lib/Bins]
        CHOST[Single Shared Host OS Kernel]
        PHOST[Host Physical Hardware]
        CApp1 --> CLib1
        CApp2 --> CLib2
        CLib1 -->|Direct Syscalls| CHOST
        CLib2 -->|Direct Syscalls| CHOST
        CHOST --> PHOST
    end
```
A container is not a tangible “box.” It is an illusion constructed using specific Linux kernel features that restrict what a process can see and what it can use.
By default, all processes in a Linux system share the same global resources (filesystems, process IDs, network interfaces). Linux Namespaces allow the kernel to partition these resources into separate, isolated buckets. When you create a container, you assign it to a specific set of namespaces. It can only see the resources in its namespace.
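As an unprivileged illustration (not from the book), the kernel exposes each process's namespace membership as symbolic links under `/proc/<pid>/ns`; two processes are in the same namespace exactly when those link targets (e.g. `pid:[4026531836]`) match. A minimal Python sketch, assuming a Linux `/proc` filesystem:

```python
import os

def namespace_ids(pid="self"):
    """Return {namespace_type: namespace_id} for a process by reading
    the symlinks in /proc/<pid>/ns (e.g. {'pid': 'pid:[4026531836]', ...})."""
    ns_dir = f"/proc/{pid}/ns"
    if not os.path.isdir(ns_dir):  # non-Linux fallback: no /proc namespaces
        return {}
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in os.listdir(ns_dir)}

# Two processes share a namespace iff the corresponding IDs are equal;
# a containerized process shows different mnt/pid/net IDs than the host.
for ns_type, ns_id in sorted(namespace_ids().items()):
    print(ns_type, ns_id)
```

Comparing this output for a process on the host and a process inside a container shows exactly which resources the container runtime has partitioned.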
- **Mount namespace:** Each container gets its own root filesystem (`/`). When the process lists files, it only sees the files provided by the container image. It cannot see the host's `/etc`, `/var`, or other containers' files. This prevents a compromised app from reading host secrets.
- **PID namespace:** A containerized process cannot see or kill (`SIGKILL`) processes outside its PID namespace.
- **Network namespace:** Each container gets its own network interface (`eth0`) and its own IP address. Two different containers can both bind to port 80 simultaneously without collision because they exist in different network namespaces.
- **User namespace:** A process can run as `root` (UID 0) inside the container, but be mapped to an unprivileged user (e.g., UID 1000) on the host OS, drastically reducing security risks.

While Namespaces limit what a process can *see*, cgroups limit what a process can *use*: they meter and cap each container's consumption of resources such as CPU time and memory.
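On a cgroup v2 host, for example, a container's CPU cap is written to its group's `cpu.max` file as `"<quota> <period>"` (or `"max <period>"` for unlimited). A small helper that converts this into a CPU count; the sample values are illustrative, not taken from the book:

```python
def cpu_limit(cpu_max):
    """Parse a cgroup v2 cpu.max value ('<quota> <period>' in microseconds,
    or 'max <period>') into the number of CPUs the group may consume."""
    quota, period = cpu_max.split()
    if quota == "max":
        return None  # no CPU limit configured
    return int(quota) / int(period)

# A quota of 50ms of CPU time per 100ms period equals half a CPU:
print(cpu_limit("50000 100000"))  # 0.5
print(cpu_limit("max 100000"))    # None
```

This is the same arithmetic Kubernetes performs when it translates a CPU limit like `500m` into cgroup settings.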
Because containers share the host kernel, preventing malicious system calls is critical.
- **Avoiding privileged mode:** Containers should not run in `--privileged` mode, which grants full access to all host devices and kernel features.
- **Capabilities:** Instead of all-or-nothing `root` access, Linux breaks root privileges down into granular "capabilities." For example, a container might be given `CAP_NET_BIND_SERVICE` (allowed to bind to ports < 1024) but denied `CAP_SYS_TIME` (not allowed to change the system clock).

Docker popularized containers by providing user-friendly tooling to manage these complex kernel features and standardizing how applications are packaged.
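The capability split described above can be checked mechanically: `/proc/<pid>/status` reports a process's effective capabilities as a hex bitmask (`CapEff`), where `CAP_NET_BIND_SERVICE` is bit 10 and `CAP_SYS_TIME` is bit 25. A sketch of that bit test; the sample bitmask is Docker's commonly cited default capability set, used here as an illustrative assumption:

```python
CAP_NET_BIND_SERVICE = 10  # may bind to ports < 1024
CAP_SYS_TIME = 25          # may change the system clock

def has_cap(cap_eff_hex, cap_number):
    """Test one capability bit in a CapEff hex bitmask from /proc/<pid>/status."""
    return bool(int(cap_eff_hex, 16) >> cap_number & 1)

cap_eff = "00000000a80425fb"  # assumed: Docker's default capability set
print(has_cap(cap_eff, CAP_NET_BIND_SERVICE))  # True  (granted)
print(has_cap(cap_eff, CAP_SYS_TIME))          # False (denied)
```

Running the same check inside a `--privileged` container would return `True` for nearly every bit, which is exactly why that mode is dangerous.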
Unlike VM images, which are massive monolithic files, container images are composed of multiple thin, read-only layers.
If you run 10 containers from the same `node:alpine` image, Docker does not store 10 copies of the Node.js binaries. It stores the `node:alpine` layers exactly once on the host's disk, and all 10 containers share those read-only layers. Because each Dockerfile instruction produces a new immutable layer, adding a large file in one `RUN` step and deleting it in the next `RUN` step does not reduce the final image size.

While containers solve the "it works on my machine" problem by bundling dependencies, they are not universally portable: an image built for x86 cannot run on an ARM host, and an application that depends on a specific kernel version or kernel module may not run on every Linux machine, because the container uses the host's kernel rather than shipping its own.
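The layer-sharing arithmetic can be made concrete with a toy model (the digests and sizes below are invented for illustration): actual disk usage is the sum over *unique* layers, not over every image's full layer list.

```python
# Each image is an ordered list of layer digests; the storage driver keeps
# exactly one copy of each layer no matter how many images reference it.
images = {
    "node:alpine": ["sha256:aaa", "sha256:bbb"],
    "my-app:v1":   ["sha256:aaa", "sha256:bbb", "sha256:ccc"],
    "my-app:v2":   ["sha256:aaa", "sha256:bbb", "sha256:ddd"],
}
layer_size_mb = {"sha256:aaa": 8, "sha256:bbb": 40,
                 "sha256:ccc": 5, "sha256:ddd": 6}

# Naive total: count every layer of every image.
naive = sum(layer_size_mb[l] for layers in images.values() for l in layers)
# Actual total: count each distinct layer once.
unique_layers = {l for layers in images.values() for l in layers}
actual = sum(layer_size_mb[l] for l in unique_layers)

print(f"naive: {naive} MB, actual on disk: {actual} MB")  # naive: 155 MB, actual on disk: 59 MB
```

The same sharing applies at pull time: pulling `my-app:v2` when `my-app:v1` is already present downloads only `sha256:ddd`.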
Chapter 2 fundamentally redefines a container from a “lightweight VM” to its true reality: a highly configured Linux process. By combining Namespaces for environmental isolation, cgroups for resource metering, UnionFS for efficient storage, and advanced security profiles, the Linux kernel and container runtimes (like Docker or containerd) provide a fast, efficient, and reproducible execution environment. This deep OS-level understanding is a prerequisite for diagnosing complex networking, storage, and permission issues when deploying applications at scale with Kubernetes.