LSA2 - 02 Control Groups

Control Groups

What do we have?

cpuset - whole cores and CPU mapping

cpuacct - CPU cycle accounting

cpu - less than core granularity

memory - limits and accounting

blkio - limits and accounting

net_cls - network classification

net_prio - network priority

Freezer + checkpoint/restore - migration

General structure

tasks - attach a task (thread) and show the list of threads

cgroup.procs - show the list of processes

cgroup.event_control - an interface for eventfd()

# mount -t cgroup none /cgroups
# mount -t cgroup -o cpuset cpuset /cg/cpuset
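A minimal sketch of how the tasks file is typically used once a hierarchy is mounted. The helper name cg_attach and the group name demo are hypothetical; the cgroup directory is passed as a parameter rather than hard-coded.

```shell
# cg_attach DIR PID: move one task into the cgroup at DIR by writing
# its ID to DIR/tasks (one ID per write).
cg_attach() {
    dir=$1
    pid=$2
    [ -d "$dir" ] || { echo "no such cgroup: $dir" >&2; return 1; }
    echo "$pid" > "$dir/tasks"
}

# Example (as root, assuming the cpuset mount shown above):
#   mkdir /cg/cpuset/demo
#   echo 0 > /cg/cpuset/demo/cpuset.cpus   # a cpuset needs CPUs and
#   echo 0 > /cg/cpuset/demo/cpuset.mems   # memory nodes before tasks
#   cg_attach /cg/cpuset/demo $$
#   cat /cg/cpuset/demo/tasks
```

Reading the same file back lists the threads currently in the group.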

cpuset

Physical CPU & Memory limits

cpuset.cpus - a list of allowed CPUs

cpuset.mems - a list of allowed memory nodes

cpuset.cpu_exclusive - 0/1, whether the CPUs are exclusive to this group (no other group can use them)

cpuset.mem_exclusive or cpuset.mem_hardwall - 0/1, whether the memory nodes are exclusive to this group (no other group can use them)

cpuset.sched_load_balance - 0/1, whether the kernel balances tasks across the CPUs in the current cpuset

cpuset.sched_relax_domain_level

Documentation/cgroups/cpusets.txt
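cpuset.cpus and cpuset.mems use a list format of comma-separated numbers and ranges, for example 0-3,8. As a small illustration, the sketch below counts how many CPUs such a list names. The function name cpulist_count is hypothetical.

```shell
# cpulist_count LIST: print how many CPUs a list such as "0-3,8" names.
cpulist_count() {
    echo "$1" | awk -F, '{
        n = 0
        for (i = 1; i <= NF; i++) {
            # "a-b" is an inclusive range; anything else is a single CPU
            if (split($i, r, "-") == 2) n += r[2] - r[1] + 1
            else if ($i != "") n += 1
        }
        print n
    }'
}

# Example (assuming the cpuset mount shown earlier):
#   cpulist_count "$(cat /cg/cpuset/cpuset.cpus)"
```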

cpuset

Physical CPU & Memory limits

cpuset.sched_relax_domain_level

-1 : no request; use the system default or follow the request of others
 0 : no search
 1 : search siblings (hyperthreads in a core)
 2 : search cores in a package
 3 : search CPUs in a node (= system wide on a non-NUMA system)
On NUMA systems only:
 4 : search nodes in a chunk of node
 5 : search system wide

Documentation/cgroups/cpusets.txt

CPU accounting

CPU usage combined for all CPUs (in nanoseconds)

CPU usage per CPU (in nanoseconds)

per CPU and user/system (in USER_HZ)

Documentation/cgroups/cpuacct.txt

CPU

CPU scheduler limits (CONFIG_CGROUP_SCHED)

cpu.shares: the amount of CPU shares available to the group

cpu.cfs_quota_us: the total available run-time within a period (in microseconds) (-1 no limit)

cpu.cfs_period_us: the length of a period (in microseconds) (default 100ms)

cpu.stat: exports throttling statistics

nr_periods: Number of enforcement intervals that have elapsed.

nr_throttled: Number of times the group has been throttled/limited.

throttled_time: The total time duration (in nanoseconds) for which entities of the group have been throttled.

Documentation/scheduler/sched-bwc.txt
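The cpu.stat counters are easiest to interpret as a ratio. The sketch below, with the hypothetical name throttle_pct, reads a cpu.stat-style file and prints the percentage of enforcement periods in which the group was throttled, rounded down.

```shell
# throttle_pct FILE: percentage of elapsed periods that were throttled.
throttle_pct() {
    awk '
        $1 == "nr_periods"   { periods = $2 }
        $1 == "nr_throttled" { throttled = $2 }
        END {
            if (periods > 0) printf "%d\n", throttled * 100 / periods
            else print 0
        }
    ' "$1"
}

# Example (the group path is an assumption):
#   throttle_pct /cgroups/demo/cpu.stat
```

A value that stays high suggests the quota is too small for the workload.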

CPU examples

1. Limit a group to 1 CPU worth of runtime.

If period is 250ms and quota is also 250ms, the group will get 1 CPU worth of runtime every 250ms.

# echo 250000 > cpu.cfs_quota_us /* quota = 250ms */
# echo 250000 > cpu.cfs_period_us /* period = 250ms */

2. Limit a group to 2 CPUs worth of runtime on a multi-CPU machine.

With 500ms period and 1000ms quota, the group can get 2 CPUs worth of runtime every 500ms.

# echo 1000000 > cpu.cfs_quota_us /* quota = 1000ms */
# echo 500000 > cpu.cfs_period_us /* period = 500ms */

The larger period here allows for increased burst capacity.

3. Limit a group to 20% of 1 CPU.

With 50ms period, 10ms quota will be equivalent to 20% of 1 CPU.

# echo 10000 > cpu.cfs_quota_us /* quota = 10ms */
# echo 50000 > cpu.cfs_period_us /* period = 50ms */

By using a small period here we are ensuring a consistent latency response at the expense of burst capacity.
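The arithmetic behind these examples is quota divided by period. A small sketch, with the hypothetical helper name cfs_cpu_pct, expresses a quota/period pair as a percentage of one CPU (integer division, so fractions are dropped):

```shell
# cfs_cpu_pct QUOTA_US PERIOD_US: print the limit as a percentage of one
# CPU, or "unlimited" when the quota is -1 (no limit).
cfs_cpu_pct() {
    quota=$1
    period=$2
    if [ "$quota" -lt 0 ]; then
        echo unlimited
    else
        echo $((quota * 100 / period))
    fi
}

cfs_cpu_pct 250000 250000    # example 1: 100
cfs_cpu_pct 1000000 500000   # example 2: 200
cfs_cpu_pct 10000 50000      # example 3: 20
```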

memory

Only Memory

memory.usage_in_bytes - show current res_counter usage for memory

memory.limit_in_bytes - set/show limit of memory usage

memory.failcnt - show the number of times memory usage hit the limit

memory.max_usage_in_bytes - show max memory usage recorded

Memory + Swap

memory.memsw.usage_in_bytes - show current res_counter usage

memory.memsw.limit_in_bytes - set/show limit

memory.memsw.failcnt - show the number of times memory+swap usage hit the limit

memory.memsw.max_usage_in_bytes - show max memory+swap usage recorded

memory.soft_limit_in_bytes - set/show soft limit of memory usage

memory.stat - show various statistics

memory.use_hierarchy - set/show whether hierarchical accounting is enabled

memory.force_empty - trigger a forced move of charges to the parent

memory.pressure_level - set memory pressure notifications

memory.swappiness - set/show swappiness parameter of vmscan

memory

memory.move_charge_at_immigrate - set/show controls of moving charges

memory.oom_control - set/show OOM controls

memory.numa_stat - show memory usage per NUMA node
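As an illustration of how the usage and limit files combine, the sketch below prints the remaining headroom of a memory cgroup in bytes. The function name mem_headroom and the group path in the example are hypothetical.

```shell
# mem_headroom DIR: print memory.limit_in_bytes minus
# memory.usage_in_bytes for the memory cgroup at DIR.
mem_headroom() {
    usage=$(cat "$1/memory.usage_in_bytes") || return 1
    limit=$(cat "$1/memory.limit_in_bytes") || return 1
    echo $((limit - usage))
}

# Example (as root; the group path is an assumption):
#   echo 512M > /cgroups/demo/memory.limit_in_bytes
#   mem_headroom /cgroups/demo
```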

Kernel Memory limits

memory.kmem.limit_in_bytes - set/show hard limit for kernel memory

memory.kmem.usage_in_bytes - show current kernel memory allocation

memory.kmem.failcnt - show the number of times kernel memory usage hit the limit

memory.kmem.max_usage_in_bytes - show max kernel memory usage recorded

memory.kmem.tcp.limit_in_bytes - set/show hard limit for tcp buf memory

memory.kmem.tcp.usage_in_bytes - show current tcp buf memory allocation

memory.kmem.tcp.failcnt - show the number of times tcp buf memory usage hit the limit

memory.kmem.tcp.max_usage_in_bytes - show max tcp buf memory usage recorded

blkio statistics

blkio.io_wait_time

blkio.io_merged

blkio.io_queued

blkio.avg_queue_size

blkio.group_wait_time

blkio.throttle.io_serviced

blkio.throttle.io_service_bytes

blkio.sectors

blkio.io_service_bytes

blkio.io_serviced

blkio.io_service_time

blkio.*_recursive

blkio.reset_stats - write an int to it to reset the statistics

blkio limiting

blkio.weight - allowed range 10 - 1000

blkio.weight_device - weight per device

blkio.leaf_weight[_device] - when competing with child cgroups

blkio.time - disk time allocated in milliseconds

blkio.throttle.read_bps_device

blkio.throttle.write_bps_device

blkio.throttle.read_iops_device
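The throttle files take one rule per write in the form major:minor value, where the value is bytes per second for the bps files. A minimal sketch that builds such a rule from a rate in MiB/s follows. The helper name blkio_bps_rule, the device numbers and the group path are assumptions for illustration.

```shell
# blkio_bps_rule MAJOR MINOR MIB_PER_SEC: print a throttle rule
# "major:minor bytes_per_second" for the given block device.
blkio_bps_rule() {
    echo "$1:$2 $(($3 * 1024 * 1024))"
}

# Example (as root): limit reads on device 8:16 to 1 MiB/s.
#   blkio_bps_rule 8 16 1 > /cgroups/demo/blkio.throttle.read_bps_device
```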

Network

net_cls - adds a network class to each cgroup so you can later limit it with tc

Documentation/cgroups/net_cls.txt

net_prio - prioritizes network traffic per interface

Documentation/cgroups/net_prio.txt
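The class is set by writing a value of the form 0xAAAABBBB to net_cls.classid, where AAAA is the major and BBBB the minor number of the tc handle, both in hex. The sketch below builds that value from a handle. The function name classid_for and the group path are hypothetical.

```shell
# classid_for MAJOR MINOR: print the classid for tc handle MAJOR:MINOR.
# Both arguments are hex digits without a 0x prefix, as tc writes them.
classid_for() {
    printf '0x%04x%04x\n' "0x$1" "0x$2"
}

# Example (as root): tag the group's traffic with handle 10:1.
#   classid_for 10 1 > /cgroups/demo/net_cls.classid
```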

Freezer + CRIU

freezer.state

THAWED

FREEZING

FROZEN

freezer.self_freezing - 0 (thawed) / 1 (frozen)

freezer.parent_freezing - 0 if no ancestor is frozen, 1 otherwise
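Freezing is requested by writing FROZEN to freezer.state. The state can read FREEZING for a while before it settles, so a caller usually polls. A minimal sketch follows, with the hypothetical helper names cg_freeze and cg_thaw and the cgroup directory passed as a parameter.

```shell
# cg_freeze DIR [TRIES]: request a freeze, then poll once per second
# until freezer.state reads FROZEN or TRIES (default 10) run out.
cg_freeze() {
    dir=$1
    tries=${2:-10}
    echo FROZEN > "$dir/freezer.state" || return 1
    while [ "$tries" -gt 0 ]; do
        [ "$(cat "$dir/freezer.state")" = "FROZEN" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# cg_thaw DIR: resume all tasks in the cgroup at DIR.
cg_thaw() {
    echo THAWED > "$1/freezer.state"
}

# Example (as root; the group path is an assumption):
#   cg_freeze /cgroups/demo && echo "frozen, safe to checkpoint"
#   cg_thaw /cgroups/demo
```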

CRIU - Checkpoint and Restore In Userspace