Kernel Control Groups (commonly abbreviated as “cgroups”) are a kernel feature that allows aggregating or partitioning tasks (processes) and all their children into hierarchically organized groups. These hierarchical groups can be configured to show a specialized behavior that helps with tuning the system to make the best use of available hardware and network resources.
The following terms are used in this chapter:
“cgroup” is another name for Control Groups.
In a cgroup there is a set of tasks (processes) associated with a set of subsystems that act as parameters constituting an environment for the tasks.
Subsystems provide the parameters that can be assigned and define CPU sets, the freezer, or, more generally, “resource controllers” for memory, disk I/O, etc.
cgroups are organized in a tree-structured hierarchy. There can be more than one hierarchy in the system. You use a different or alternate hierarchy to cope with specific situations.
Every task running in the system is in exactly one of the cgroups in the hierarchy.
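Because a task is in exactly one cgroup per hierarchy, its membership can be listed with one line per hierarchy. The following is a minimal sketch, assuming the kernel exposes /proc/<pid>/cgroup in the layout "hierarchy-ID:controller-list:cgroup-path". The helper name task_cgroups is hypothetical, and the sample data only stands in for a live system.

```shell
# Print "controllers path" for each hierarchy a task belongs to.
# Assumption: the input uses the /proc/<pid>/cgroup layout
#   hierarchy-ID:controller-list:cgroup-path
# task_cgroups is a hypothetical helper name, not a standard tool.
task_cgroups() {
    cgroup_file=${1:-/proc/self/cgroup}
    awk -F: '{ print $2 " " $3 }' "$cgroup_file"
}

# Demonstration with sample data instead of a live system:
sample=$(mktemp)
printf '3:cpuset:/Charlie\n2:cpu,cpuacct:/priority\n' > "$sample"
task_cgroups "$sample"
rm -f "$sample"
```

Called without an argument, the helper reads /proc/self/cgroup and reports the cgroups of the current shell.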
See the following resource planning scenario for a better understanding (source: /usr/src/linux/Documentation/cgroups/cgroups.txt):
Web browsers such as Firefox will be part of the Web network class, while NFS daemons such as (k)nfsd will be part of the NFS network class. On the other hand, Firefox will share the appropriate CPU and memory classes depending on whether a professor or a student started it.
The following subsystems are available and can be classified into two types:
cpuset, namespace, freezer, device, checkpoint/restart
cpu(scheduler), memory, disk I/O, network
Either mount each subsystem separately:
mount -t cgroup -o cpu none /cpu
mount -t cgroup -o cpuset none /cpuset
or all subsystems in one go:
mount -t cgroup none /cgroups
Some additional information on available subsystems:
Use cpuset to tie processes to system subsets of CPUs and memory (“memory nodes”). For an example, see Section 10.4.3, “Example: Cpusets”.
The namespace subsystem shows a private view of the system to the processes in a cgroup. It is mainly used for OS-level virtualization. The subsystem itself has no special functions and only tracks changes in namespaces.
The Freezer subsystem is useful for high-performance computing clusters (HPC clusters). Use it to freeze (stop) all tasks in a group or to stop tasks, if they reach a defined checkpoint. For more information, see /usr/src/linux/Documentation/cgroups/freezer-subsystem.txt.
The following basic commands show how to use the freezer subsystem:
mount -t cgroup freezer /freezer -o freezer
# Create a child cgroup:
mkdir /freezer/0
# Put a task into this cgroup:
echo $task_pid > /freezer/0/tasks
# Freeze it:
echo FROZEN > /freezer/0/freezer.state
# Unfreeze (thaw) it:
echo THAWED > /freezer/0/freezer.state
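Freezing is not necessarily instantaneous: reading freezer.state can report FREEZING for a while before it reports FROZEN. The following is a minimal sketch of a wrapper that requests the freeze and polls until it completes. The helper name freeze_and_wait and the retry count are assumptions for illustration. The demonstration runs against a scratch directory that merely stands in for a real group such as /freezer/0.

```shell
# Request a freeze and wait until the group reports FROZEN.
# freeze_and_wait is a hypothetical helper name; the default of
# five one-second retries is an arbitrary assumption.
freeze_and_wait() {
    cg_dir=$1
    tries=${2:-5}
    echo FROZEN > "$cg_dir/freezer.state" || return 1
    while [ "$tries" -gt 0 ]; do
        if [ "$(cat "$cg_dir/freezer.state")" = "FROZEN" ]; then
            return 0
        fi
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Dry run: a scratch directory stands in for /freezer/0.
demo=$(mktemp -d)
freeze_and_wait "$demo" 1 && echo "group frozen"
rm -rf "$demo"
```

On a real system, run it as root with the path of the child cgroup, for example freeze_and_wait /freezer/0.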
A system administrator can provide a list of devices that can be accessed by processes under cgroups.
It limits access to a device or a file system on a device to only tasks that belong to the specified cgroup. For more information, see /usr/src/linux/Documentation/cgroups/devices.txt.
Save the state of all processes in a cgroup to a dump file. Restart it later (or just save the state and continue).
Allows moving a “saved container” between physical machines (as a VM can do).
Dumps the images of all processes to a file.
The CPU accounting controller groups tasks using cgroups and accounts the CPU usage of these groups. For more information, see /usr/src/linux/Documentation/cgroups/cpuacct.txt.
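The accounted CPU time of a group can be read from its cpuacct.usage file, which reports nanoseconds. The following is a minimal sketch that converts the value to seconds. The helper name cpuacct_seconds is hypothetical, and the demonstration uses a scratch directory with a made-up value instead of a mounted cpuacct hierarchy.

```shell
# Print the CPU time consumed by a group in seconds.
# Assumption: "$cg_dir/cpuacct.usage" holds one number in nanoseconds.
# cpuacct_seconds is a hypothetical helper name.
cpuacct_seconds() {
    cg_dir=$1
    awk '{ printf "%.2f\n", $1 / 1000000000 }' "$cg_dir/cpuacct.usage"
}

# Dry run with a made-up value of 2.5 seconds:
demo=$(mktemp -d)
echo 2500000000 > "$demo/cpuacct.usage"
cpuacct_seconds "$demo"    # prints 2.50
rm -rf "$demo"
```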
Share CPU bandwidth between groups with the group scheduling function of CFS (the scheduler). Mechanically complicated.
Limits memory usage of user space processes.
Limit LRU (Least Recently Used) pages.
Anonymous and file cache.
No limits for kernel memory.
Maybe in another subsystem if needed.
For more information, see /usr/src/linux/Documentation/cgroups/memory.txt.
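A memory limit is set by writing a byte count to the group's memory.limit_in_bytes file. The following is a minimal sketch that takes the limit in MiB and writes the converted value. The helper name set_memory_limit_mib is hypothetical, and the demonstration writes to a scratch directory rather than to a mounted memory hierarchy.

```shell
# Set the memory limit of a group, given in MiB.
# Assumption: "$cg_dir" is a cgroup directory of a hierarchy that has
# the memory subsystem attached. set_memory_limit_mib is a
# hypothetical helper name.
set_memory_limit_mib() {
    cg_dir=$1
    mib=$2
    echo $((mib * 1024 * 1024)) > "$cg_dir/memory.limit_in_bytes"
}

# Dry run: a scratch directory stands in for the cgroup directory.
demo=$(mktemp -d)
set_memory_limit_mib "$demo" 64
cat "$demo/memory.limit_in_bytes"    # prints 67108864
rm -rf "$demo"
```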
Three proposals are currently being discussed: dm-ioband, io-throttle, and io-controller.
Still under discussion.
To use cgroups, install the following additional packages:
libcgroup1 (contains basic user space tools to simplify resource management)
cpuset
libcpuset1
kernel-source (for documentation purposes only)
lxc
The kernel shipped with openSUSE supports cgroups. There is no need to apply additional patches. Execute lxc-checkconfig to see a cgroups environment similar to the following output:
--- Namespaces ---
Namespaces: enabled
Utsname namespace: enabled
Ipc namespace: enabled
Pid namespace: enabled
User namespace: enabled
Network namespace: enabled
Multiple /dev/pts instances: enabled
--- Control groups ---
Cgroup: enabled
Cgroup namespace: enabled
Cgroup device: enabled
Cgroup sched: enabled
Cgroup cpu account: enabled
Cgroup memory controller: enabled
Cgroup cpuset: enabled
--- Misc ---
Veth pair device: enabled
Macvlan: enabled
Vlan: enabled
File capabilities: enabled
To find out which subsystems are available, proceed as follows:
mkdir /cgroups
mount -t cgroup none /cgroups
grep cgroup /proc/mounts
The mount options in the output name the attached subsystems: freezer, devices, cpuacct, cpu, ns, cpuset, memory. The leading rw in the option list is the read-write mount flag, not a subsystem. Disk and network subsystem controllers may become available during SUSE Linux Enterprise Server 11 lifetime.
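As an alternative that does not require mounting anything, the kernel lists its compiled-in subsystems in /proc/cgroups. The following is a minimal sketch that prints the enabled ones, assuming the usual four-column layout (subsystem name, hierarchy ID, number of cgroups, enabled flag) with a header line starting with #. The helper name enabled_subsystems is hypothetical, and the sample data only illustrates the format.

```shell
# Print the names of all enabled cgroup subsystems.
# Assumption: the input uses the /proc/cgroups layout
#   #subsys_name  hierarchy  num_cgroups  enabled
# enabled_subsystems is a hypothetical helper name.
enabled_subsystems() {
    cgroups_file=${1:-/proc/cgroups}
    awk '!/^#/ && $4 == 1 { print $1 }' "$cgroups_file"
}

# Demonstration with sample data; "memory" is disabled here.
sample=$(mktemp)
printf '#subsys_name\thierarchy\tnum_cgroups\tenabled\n' > "$sample"
printf 'cpuset\t0\t1\t1\ncpu\t0\t1\t1\nmemory\t0\t1\t0\n' >> "$sample"
enabled_subsystems "$sample"
rm -f "$sample"
```

Called without an argument, the helper reads /proc/cgroups of the running kernel.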
With the command line proceed as follows:
To determine the number of CPUs and memory nodes, see /proc/cpuinfo and /proc/zoneinfo.
Create the cpuset hierarchy as a virtual file system (source: /usr/src/linux/Documentation/cgroups/cgroups.txt):
mkdir /dev/cpuset
mount -t cpuset cpuset /dev/cpuset
cd /dev/cpuset
mkdir Charlie
cd Charlie
# List of CPUs in this cpuset:
/bin/echo 2-3 > cpus
# List of memory nodes in this cpuset:
/bin/echo 1 > mems
/bin/echo $$ > tasks
# The current shell is now running in the Charlie cpuset
# The next line should display '/Charlie'
cat /proc/self/cpuset
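The cpus and mems files use a list format in which ranges such as 2-3 and single numbers can be combined with commas. The following is a minimal sketch that expands such a list into individual numbers, which can help when checking how many CPUs a cpuset really covers. The helper name expand_cpu_list is hypothetical, and the sketch assumes well-formed input consisting only of numbers, hyphens, and commas.

```shell
# Expand a cpuset list such as "0-1,4" into one number per line.
# expand_cpu_list is a hypothetical helper name. The input is
# assumed to be well-formed (numbers, "-" ranges, "," separators).
expand_cpu_list() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r first last; do
        # A single number has no range end; treat it as "n-n".
        last=${last:-$first}
        i=$first
        while [ "$i" -le "$last" ]; do
            echo "$i"
            i=$((i + 1))
        done
    done
}

expand_cpu_list "2-3"      # prints 2 and 3, one per line
expand_cpu_list "0-1,4"    # prints 0, 1 and 4, one per line
```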
Remove the cpuset using shell commands:
rmdir /dev/cpuset/Charlie
This fails as long as the cpuset is in use. First, you need to remove any child cpusets and move away or terminate the tasks (processes) that belong to it. Check for remaining tasks with:
cat /dev/cpuset/Charlie/tasks
For background information and additional configuration flags, see /usr/src/linux/Documentation/cgroups/cpusets.txt.
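One way to empty a cpuset before removing it is to move its tasks to another cpuset, typically the parent. The following is a minimal sketch, assuming both directories contain a tasks file and that the kernel accepts one PID per write. The helper name move_tasks is hypothetical. The demonstration uses scratch directories with made-up PIDs, so nothing is actually moved. On a real system, individual writes can fail (for example, if a task exits in the meantime), which is why the helper reports how many writes succeeded.

```shell
# Move all tasks listed in one cgroup directory to another one and
# print the number of successful moves.
# move_tasks is a hypothetical helper name.
move_tasks() {
    from=$1
    to=$2
    moved=0
    for pid in $(cat "$from/tasks"); do
        # The tasks file accepts only one PID per write.
        if echo "$pid" > "$to/tasks"; then
            moved=$((moved + 1))
        fi
    done
    echo "$moved"
}

# Dry run: scratch directories stand in for /dev/cpuset/Charlie
# and its parent /dev/cpuset. The PIDs are made up.
demo=$(mktemp -d)
mkdir "$demo/Charlie"
printf '10\n20\n30\n' > "$demo/Charlie/tasks"
: > "$demo/tasks"
move_tasks "$demo/Charlie" "$demo"    # prints 3
rm -rf "$demo"
```

On a real system, move_tasks /dev/cpuset/Charlie /dev/cpuset followed by rmdir /dev/cpuset/Charlie would correspond to the steps described above.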
With the cset tool, proceed as follows:
# Determine the number of CPUs and memory nodes
cset set --list
# Creating the cpuset hierarchy
cset set --cpu=2-3 --mem=1 --set=Charlie
# Starting processes in a cpuset
cset proc --set Charlie --exec -- stress -c 1 &
# Moving existing processes to a cpuset
cset proc --move --pid PID --toset=Charlie
# List task in a cpuset
cset proc --list --set Charlie
# Removing a cpuset
cset set --destroy Charlie
Using shell commands, proceed as follows:
Create the cgroups hierarchy:
mkdir /dev/cgroup
mount -t cgroup cgroup /dev/cgroup
cd /dev/cgroup
mkdir priority
cd priority
cat cpu.shares
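Understanding cpu.shares (shares are relative weights; the percentages below assume one competing group that keeps the default value of 1024):
1024 is the default (for more information, see sched-design-CFS.txt) = 50% utilization
1524 = 60% utilization
2048 = 67% utilization
512 = 33% utilization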
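The arithmetic behind such percentages can be sketched as follows, assuming that exactly two groups compete for the CPU and that both are fully busy. A group with s shares then receives s / (s + other) of the CPU time, where other is the share value of the competing group. The helper name share_percent is hypothetical. The result is rounded to the nearest whole percent.

```shell
# Print the CPU percentage a busy group receives when it competes
# with exactly one other busy group (default: 1024 shares).
# share_percent is a hypothetical helper name.
share_percent() {
    mine=$1
    other=${2:-1024}
    total=$((mine + other))
    # Integer division with rounding to the nearest percent.
    echo $(( (mine * 100 + total / 2) / total ))
}

for shares in 1024 1524 2048 512; do
    echo "$shares = $(share_percent "$shares")%"
done
# Output:
# 1024 = 50%
# 1524 = 60%
# 2048 = 67%
# 512 = 33%
```

With more than two busy groups, the denominator is the sum of the shares of all competing groups, so the same value of cpu.shares yields a smaller percentage.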
Changing cpu.shares:
/bin/echo 1024 > cpu.shares
Kernel documentation (package kernel-source): files in /usr/src/linux/Documentation/cgroups:
/usr/src/linux/Documentation/cgroups/cgroups.txt
/usr/src/linux/Documentation/cgroups/cpuacct.txt
/usr/src/linux/Documentation/cgroups/cpusets.txt
/usr/src/linux/Documentation/cgroups/devices.txt
/usr/src/linux/Documentation/cgroups/freezer-subsystem.txt
/usr/src/linux/Documentation/cgroups/memcg_test.txt
/usr/src/linux/Documentation/cgroups/memory.txt
/usr/src/linux/Documentation/cgroups/resource_counter.txt
Corbet, Jonathan: Controlling memory use in containers (2007). http://lwn.net/Articles/243795/
Corbet, Jonathan: Process containers (2007). http://lwn.net/Articles/236038/