Containers, Very Briefly
This topic is pretty short. This just came out. The main takeaway is that a container is just a filesystem plus a process running inside it. That process is labelled differently depending on where you look at it from (for example, it has one PID inside the container and another on the host).
-
Why containers?
- Easy building (package dependencies, etc)
- Easy deployment (versioning, small, etc)
-
What is a container?
- A tarball of a filesystem (an archive, usually compressed)
- Docker build = Tarball: Base OS + program/dependencies + configuration
- Docker run = Download, unpack into directory, run a program and pretend that directory is the whole filesystem
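
To make the "build = tarball, run = unpack into a directory" idea concrete, here is a minimal sketch using only the Python standard library. It is not how Docker is implemented; the function names (`build_image`, `unpack_image`, `demo`) and the file layout are made up for illustration, and the final "pretend that directory is the whole filesystem" step is only described in a comment because it needs root privileges.

```python
import tarfile
import tempfile
from pathlib import Path


def build_image(rootfs: Path, image_path: Path) -> None:
    """Pack a directory tree into a gzip-compressed tarball (the 'image')."""
    with tarfile.open(image_path, "w:gz") as tar:
        for child in sorted(rootfs.iterdir()):
            # Store paths relative to the root of the filesystem.
            tar.add(child, arcname=child.name)


def unpack_image(image_path: Path, target: Path) -> None:
    """Unpack the tarball into a fresh directory (the 'container root')."""
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(image_path, "r:gz") as tar:
        tar.extractall(target)


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)

        # Hypothetical "base OS + program + configuration", reduced to one file.
        rootfs = tmp_path / "rootfs"
        (rootfs / "app").mkdir(parents=True)
        (rootfs / "app" / "hello.txt").write_text("hello from the image\n")

        image = tmp_path / "image.tar.gz"
        build_image(rootfs, image)

        container_root = tmp_path / "container"
        unpack_image(image, container_root)

        # A real runtime would now make container_root the process's root
        # directory (pivot_root or chroot, both need privileges) and exec
        # the program. Here we only read the file back.
        return (container_root / "app" / "hello.txt").read_text()


if __name__ == "__main__":
    print(demo(), end="")
```

The only point of the sketch is that an image is ordinary files in an archive; everything container-specific happens after the unpack step.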
-
Containers are processes
- On macOS, all containers run inside a Linux virtual machine. A container process can do anything a normal process on your computer can, but with restrictions enforced by the Linux kernel: some system calls are blocked, it lives in a different PID namespace, it has a memory limit, etc.
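
Since a container process is an ordinary Linux process, the kernel exposes its namespaces and cgroup membership under `/proc` like for any other process. The sketch below reads them; `describe_process` is a hypothetical helper name, and on systems without a Linux-style `/proc` (macOS outside the VM, for instance) it simply returns empty results.

```python
import os
from pathlib import Path


def describe_process(proc_root: str = "/proc/self") -> dict:
    """Collect the namespace links and cgroup lines reported for a process."""
    root = Path(proc_root)
    info = {"pid": os.getpid(), "namespaces": {}, "cgroups": []}

    # Each entry in ns/ is a symlink such as "pid:[4026531836]".
    # Two processes in the same namespace show the same identifier.
    ns_dir = root / "ns"
    if ns_dir.is_dir():
        for entry in sorted(ns_dir.iterdir()):
            try:
                info["namespaces"][entry.name] = os.readlink(entry)
            except OSError:
                # Not a readable symlink; skip it.
                pass

    # The cgroup file lists which cgroup(s) the process belongs to.
    cgroup_file = root / "cgroup"
    if cgroup_file.is_file():
        info["cgroups"] = cgroup_file.read_text().splitlines()

    return info


if __name__ == "__main__":
    details = describe_process()
    print("pid:", details["pid"])
    for name, link in details["namespaces"].items():
        print("namespace", name, "->", link)
    for line in details["cgroups"]:
        print("cgroup", line)
```

Running this once on the host and once inside a container should show different namespace identifiers, which is the "labelled differently depending on how you look at it" idea.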
-
Container kernel (Docker) features
- pivot_root: changes the root filesystem a process sees, so the unpacked tarball becomes its /
- layers: a layer is a directory; reusing layers saves space, and you can combine a bunch of layers into one filesystem
- container registries: tag images to store them publicly or privately
- cgroups: a group of processes with a memory/CPU limit. Use too much memory and the process gets killed by the OOM killer; use too much CPU and the process gets slowed down (throttled)
- I could be wrong, but I think everything on my Docker engine is in the same cgroup
- namespaces:
  - PID: the same process has different PIDs in different namespaces
  - user: security feature to run stuff with unprivileged access
  - network: each network namespace has its own interfaces; 127.0.0.1 is for connections inside the namespace, 0.0.0.0 means every network interface in my namespace
- container IP address: containers use private IP addresses, and packets get routed on your computer. Distributed systems do other things (AWS uses elastic network interfaces)
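
The "layers" bullet above (a layer is a directory, and several layers combine into one filesystem) can be sketched as a lookup that searches the topmost layer first. This is only an illustration: real runtimes typically hand this job to a union filesystem in the kernel, and the sketch ignores deletions between layers. The names `resolve` and `demo_layers` and the file contents are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List, Optional, Tuple


def resolve(layers: List[Path], relative_path: str) -> Optional[Path]:
    """Find a file in a stack of layer directories, ordered bottom to top.

    The highest layer that contains the path wins, so upper layers
    shadow files from the layers below them.
    """
    for layer in reversed(layers):
        candidate = layer / relative_path
        if candidate.exists():
            return candidate
    return None


def demo_layers() -> Tuple[str, str, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / "base"   # shared layer, e.g. a base OS
        app = Path(tmp) / "app"     # layer added on top for one image

        (base / "etc").mkdir(parents=True)
        (base / "etc" / "release").write_text("base os")
        (base / "etc" / "config").write_text("default config")

        (app / "etc").mkdir(parents=True)
        (app / "etc" / "config").write_text("app config")

        layers = [base, app]
        release = resolve(layers, "etc/release").read_text()  # only in base
        config = resolve(layers, "etc/config").read_text()    # shadowed by app
        missing = resolve(layers, "etc/nope") is None
        return release, config, missing


if __name__ == "__main__":
    print(demo_layers())
```

Because the base directory is never modified, any number of images can stack their own layers on top of the same copy, which is where the space saving comes from.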
K8s runs on Linux machines (often VMs), and pods run on top of them. There's a nice Python library here to read/write stuff in Kubernetes. Pods are wrappers around containers, etc. These notes should be good enough for now.