Red Hat Linux Performance Marc Skinner Twin Cities Users Group :: Q4/2013

Slides: people.redhat.com/mskinner/rhug/q4.2013/rhel-performance.pdf


Performance in the past

● Recompile the kernel
● Have a master's or PhD in performance theory
● Be a kernel developer

RHEL 6 now includes tuning resources

● Tuned – kernel and block device optimizations
● Transparent Huge Pages – memory optimizations
● Numad – multi-socket process and memory optimizations
● CPU power scaling

High End HP DL 980 AIM7 results w/ “ktune” (r5) “tuned-adm” (r6)

[Chart: AIM7 file server performance in jobs per minute (axis 0 to 100,000) for RHEL 5.5, RHEL 6, and RHEL 6 "tuned" on an HP DL980, 64-core/256GB/30 FC/480 LUN. Bar values are not recoverable from the transcript.]

Tuned


The “tuned” package

# yum -y install tuned

# tuned-adm list
Available profiles:
- virtual-host
- laptop-ac-powersave
- virtual-guest
- server-powersave
- desktop-powersave
- sap
- default
- throughput-performance
- latency-performance
- laptop-battery-powersave
- spindown-disk
- enterprise-storage
Current active profile: virtual-host

# tuned-adm list
# tuned-adm active
# tuned-adm profile <profile-name>
# tuned-adm off

Change profiles with cron

Create a new cronjob

0 8 * * 1-5 /usr/bin/tuned-adm profile throughput-performance
0 18 * * 1-5 /usr/bin/tuned-adm profile server-powersave
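Cron only fires at the scheduled minute, so a host that reboots mid-day keeps whatever profile it booted with. A minimal sketch of a helper that reports which profile the two cron entries would have selected at a given time; the function name `profile_for` is hypothetical and not part of tuned:

```shell
#!/bin/sh
# Hypothetical helper (not part of tuned): mirror the cron schedule.
# $1: day of week, 1-7 with Monday = 1 (as printed by `date +%u`)
# $2: hour of day, 0-23 (as printed by `date +%H`)
profile_for() {
    dow=$1
    hour=${2#0}        # normalize "08" to "8"
    hour=${hour:-0}    # a bare "0" becomes empty after stripping; restore it
    if [ "$dow" -ge 1 ] && [ "$dow" -le 5 ] &&
       [ "$hour" -ge 8 ] && [ "$hour" -lt 18 ]; then
        echo throughput-performance   # weekdays, 08:00 up to 18:00
    else
        echo server-powersave         # evenings, nights, weekends
    fi
}
```

One possible use, run as root from a boot-time script: `tuned-adm profile "$(profile_for "$(date +%u)" "$(date +%H)")"`.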


Create your own profiles


/etc/tune-profiles/<profile-name>

tuned.conf
ktune.sysconfig
sysctl.ktune
ktune.sh
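One common way to start a custom profile is to copy a shipped one and edit its four files. A minimal sketch, assuming the RHEL 6 layout named above; the function name `clone_profile` and the profile name `my-profile` are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: copy an existing profile directory as a starting point.
# Usage: clone_profile <base-profile> <new-profile> [profiles-dir]
clone_profile() {
    base=$1
    new=$2
    dir=${3:-/etc/tune-profiles}   # default location from the slide above
    if [ ! -d "$dir/$base" ]; then
        echo "no such profile: $base" >&2
        return 1
    fi
    if [ -e "$dir/$new" ]; then
        echo "already exists: $new" >&2
        return 1
    fi
    cp -Rp "$dir/$base" "$dir/$new"   # keep modes so ktune.sh stays executable
}
```

After `clone_profile throughput-performance my-profile` (as root), the copy can be edited and then selected with `tuned-adm profile my-profile`.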


tuned.conf

Enable/Disable Power Management Plugins
● CPU
● Disk
● Network

Example: Aggressive Link Power Management

ktune.sysconfig

Enable/Disable ktune daemon
Configure disk elevators

Disk Elevators

● deadline
● cfq [DEFAULT]
● noop

I/O Tuning – Understanding I/O Elevators

● Deadline
  ● Two queues per device, one for reads and one for writes
  ● I/Os dispatched based on time spent in queue
  ● Used for multi-process applications and systems running enterprise storage
● CFQ
  ● Per-process queue
  ● Each process queue gets a fixed time slice (based on process priority)
  ● Default setting - Slow storage (SATA)
● Noop
  ● FIFO
  ● Simple I/O merging
  ● Lowest CPU cost
  ● Low-latency storage and applications (Solid State Devices)

I/O Tuning – Configuring I/O Elevators

● Boot-time
  ● Grub command line – elevator=deadline/cfq/noop
● Dynamically, per device
  ● echo "deadline" > /sys/class/block/sda/queue/scheduler
● tuned (RHEL6 utility)
  ● tuned-adm profile throughput-performance
  ● tuned-adm profile enterprise-storage
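Reading the same sysfs file back shows every available elevator, with the active one in square brackets (for example `noop deadline [cfq]`). A minimal sketch for confirming that a change took effect; the function name `active_scheduler` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the bracketed (active) entry from the contents
# of a queue/scheduler file, e.g. "noop deadline [cfq]" -> "cfq".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}
```

Typical use on a live system: `active_scheduler "$(cat /sys/class/block/sda/queue/scheduler)"`.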


Impact of I/O Elevators – OLTP Workload

[Chart: transactions per minute for CFQ, Deadline, and No-op at 10U, 20U, and 40U. Annotation: 34.26% better than CFQ with higher process count.]

Impact of I/O Elevators – DSS Workload

[Chart: Comparison CFQ vs Deadline, Oracle DSS Workload (with different thread count). Time elapsed (axis 00:00 to 11:31) for CFQ and Deadline, with % diff (axis 0 to 70), at parallel degree 8, 16, and 32. Data labels shown: 47.69, 54.01, 58.4.]

sysctl.ktune

sysctl settings


ktune.sh

start() and stop()
/etc/tune-profiles/functions

/etc/tune-profiles/functions

● {set,restore}_transparent_hugepages
● {set,restore}_cpu_governor
● {enable,disable}_wifi
● {enable,disable}_cd_polling
● And many more

“tuned” Profile Summary

Profile columns, in slide order: default, enterprise-storage, virtual-host, virtual-guest, latency-performance, throughput-performance. Blank cells were lost in extraction, so a row with fewer than six values lists them in slide order without a column assignment.

Tunable                              Values
kernel.sched_min_granularity_ns      4ms, 10ms, 10ms, 10ms, 10ms
kernel.sched_wakeup_granularity_ns   4ms, 15ms, 15ms, 15ms, 15ms
vm.dirty_ratio                       20% RAM, 40%, 10%, 40%, 40%
vm.dirty_background_ratio            10% RAM, 5%
vm.swappiness                        60, 10, 30
I/O Scheduler (Elevator)             CFQ, deadline, deadline, deadline, deadline, deadline
Filesystem Barriers                  On, Off, Off, Off
CPU Governor                         ondemand, performance, performance, performance
Disk Read-ahead                      4x
Disable THP                          Yes
CPU C-States                         Locked @ 1

Tuned – BenchmarkSQL 2.3.3 on PostgreSQL 9.2

I/O based workloads benefit from profiles built for storage / throughput

15% increase

Iozone Performance Effect of TUNED EXT4/XFS/GFS

[Chart: RHEL6.4 File System In Cache Performance, Intel Large File I/O (iozone). Throughput in MB/sec (axis 0 to 4500), not tuned vs tuned, for ext3, ext4, xfs, and gfs2.]

[Chart: RHEL6.4 File System Out of Cache Performance, Intel Large File I/O (iozone). Throughput in MB/sec (axis 0 to 800), not tuned vs tuned, for ext3, ext4, xfs, and gfs2.]

Approx 15% increase

Transparent Huge Pages


What is a huge page?

● Memory is managed in pages, traditionally 4K in size
● A huge page can be 2MB or now 1GB with Intel Sandy Bridge and newer
● Fewer pages means faster lookups and fewer TLB misses = better performance

Transparent Hugepages

echo never > /sys/kernel/mm/transparent_hugepages=never

[root@dhcp-100-19-50 code]# time ./memory 15 0
real    0m12.434s
user    0m0.936s
sys     0m11.416s

# cat /proc/meminfo
MemTotal:       16331124 kB
AnonHugePages:  0 kB

● Boot argument: transparent_hugepage=always (enabled by default)

● # echo always > /sys/kernel/mm/redhat_transparent_hugepage/enabled

# time ./memory 15GB
real    0m7.024s
user    0m0.073s
sys     0m6.847s

# cat /proc/meminfo
MemTotal:       16331124 kB
AnonHugePages:  15590528 kB

SPEEDUP 12.4/7.0 = 1.77x (run time drops to 56% of the original)
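The speedup figure is the ratio of the two `real` times, and the percentage is the new run time as a share of the old one. A small sketch of that arithmetic, using the timings from the two runs above; the function name `speedup` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: compare two wall-clock times given in seconds.
# $1: baseline (THP off), $2: new run (THP on)
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        printf "%.2fx, %.0f%% of original run time\n",
               before / after, 100 * after / before
    }'
}

speedup 12.434 7.024   # prints: 1.77x, 56% of original run time
```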


Red Hat Enterprise Linux 6 Performance / Sandy Bridge SPECjbb Java w/ 1GB huge pages

● Sandy Bridge has 1GB hugepages
  ● Support in RHEL5.8 and 6.2
● < RHEL6, static use of hugepages
  ● Static pages wired-down
  ● Need application support (DB/Java etc)
● RHEL6. Transparent Huge pages
  ● Use 2M x86_64 page vs 4k page
  ● Automatically use huge pages
  ● For all anonymous memory
  ● Daemon to gather free dynamically

9.1% and 12.6% gain!

[Chart: RHEL6.4 SPECjbb w/ 2M/1G hugepages, Intel Sandy Bridge 16core/32GB, sun_hotspot. bops and % gain (axis 0.0% to 14.0%): RHEL6.2 (disable THP) 0.0%, RHEL6.2 2M HugePage 9.1%, RHEL6.2 1GB HugePage 12.6%.]

Huge Pages - FYI

● Most databases support Huge pages
● Transparent Huge Pages in RHEL6 (cannot be used for Database shared memory – only for process private memory)
● How to configure Huge Pages (16G)
  ● echo 8192 > /proc/sys/vm/nr_hugepages
  ● vi /etc/sysctl.conf (vm.nr_hugepages=8192)
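The value 8192 follows from the pool size divided by the huge page size: 16 GB in 2 MB pages. A minimal sketch of that calculation, assuming sizes in binary units; the function name `nr_hugepages_for` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: number of huge pages needed for a pool.
# $1: pool size in GiB
# $2: huge page size in kB (the Hugepagesize value in /proc/meminfo)
nr_hugepages_for() {
    echo $(( $1 * 1024 * 1024 / $2 ))
}

nr_hugepages_for 16 2048   # prints 8192
```

On a live system the page size can be read with `awk '/^Hugepagesize:/ {print $2}' /proc/meminfo` instead of hard-coding 2048.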


Numad


Understanding NUMA (Non Uniform Memory Access)

● Multi Socket – Multi core architecture
  ● NUMA required for scaling
  ● RHEL 5 / 6 completely NUMA aware
  ● Additional performance gains by enforcing NUMA placement

[Diagram: four sockets (S1 to S4), each with four cores (c1 to c4) and its own memory (M1 to M4). Two panels, "No NUMA optimization" and "NUMA optimization", each showing D1 to D4 above S1 to S4 and M1 to M4.]

Finding NUMA layout – 4 socket by 16 cores

[root@perf30 ~]# numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 4 8 12 16 20 24 28 32 36 40 44 48 52 56 60
node 0 size: 32649 MB
node 0 free: 30868 MB
node 1 cpus: 1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61
node 1 size: 32768 MB
node 1 free: 29483 MB
node 2 cpus: 2 6 10 14 18 22 26 30 34 38 42 46 50 54 58 62
node 2 size: 32768 MB
node 2 free: 31082 MB
node 3 cpus: 3 7 11 15 19 23 27 31 35 39 43 47 51 55 59 63
node 3 size: 32768 MB
node 3 free: 31255 MB
node distances:
node   0   1   2   3
  0:  10  21  21  21
  1:  21  10  21  21
  2:  21  21  10  21
  3:  21  21  21  10
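The "4 socket by 16 cores" summary can be read off the `cpus:` lines of this output. A minimal sketch that counts CPUs per node from `numactl --hardware` text on stdin; the function name `cpus_per_node` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: count the CPUs listed on each "node <n> cpus:" line.
# Reads `numactl --hardware` output on stdin.
cpus_per_node() {
    awk '$1 == "node" && $3 == "cpus:" {
        printf "node %s: %d cpus\n", $2, NF - 3   # fields after "cpus:"
    }'
}
```

Typical use: `numactl --hardware | cpus_per_node`. For the output above it would report 16 CPUs on each of the four nodes.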


NUMA Affinity old way

# numactl -N1 -m1 ./command

● Sets CPU affinity for 'command' to CPU node 1
● Allocates memory out of Memory node 1

Memory Tuning – Effect of NUMA Tuning

[Chart: OLTP workload - Multi Instance, Trans/Min, Non NUMA vs NUMA. NUMA pinning improves performance by 8.23%.]

Memory Tuning – NUMA - “numad”

● What is numad?
  ● User-level daemon to automatically improve out of the box NUMA system performance
  ● Added to Fedora 17
  ● Added to RHEL 6.3 as tech preview
  ● Not enabled by default
● What does numad do?
  ● Monitors available system resources on a per-node basis and assigns significant consumer processes to aligned resources for optimum NUMA performance.
  ● Rebalances when necessary
  ● Provides pre-placement advice for the best initial process placement and resource affinity.

numad matches resource consumers with available resources

[Diagram: the Process Scanner (consumed resources per process: required CPUs, required memory) and the Node Scanner (available resources per node: available CPUs, available memory) feed the Numad Picker, which produces the node list for each process.]

numad aligns process memory and CPU threads within nodes

[Diagram: two panels, "Before numad" and "After numad", showing Process 19, 29, 37, and 61 placed across Node 0 to Node 3.]

So, what's the NUMA problem?

● The Linux system scheduler is very good at maintaining responsiveness and optimizing for CPU utilization

● Tries to use idle CPUs, regardless of where process memory is located.... Using remote memory degrades performance!

● Red Hat is working with the upstream community to increase NUMA awareness of the scheduler and to implement automatic NUMA balancing.

● Remote memory latency matters most for long-running, significant processes, e.g., HPTC, VMs, etc.


Use numastat to see memory layout

● Rewritten for RHEL6.4 to show per-node system and process memory information
● 100% compatible with prior version by default, displaying /sys...node<n>/numastat memory allocation statistics
● Any command options invoke new functionality
  ● -m for per-node system memory info
  ● <pattern> for per-node process memory info

numastat shows unaligned guests

# numastat -c qemu

Per-node process memory usage (in Mbs)

PID              Node 0 Node 1 Node 2 Node 3 Total
---------------  ------ ------ ------ ------ -----
10587 (qemu-kvm)   1216   4022   4028   1456 10722
10629 (qemu-kvm)   2108     56    473   8077 10714
10671 (qemu-kvm)   4096   3470   3036    110 10712
10713 (qemu-kvm)   4043   3498   2135   1055 10730
---------------  ------ ------ ------ ------ -----
Total             11462  11045   9672  10698 42877

numastat shows aligned guests

# numastat -c qemu

Per-node process memory usage (in Mbs)

PID              Node 0 Node 1 Node 2 Node 3 Total
---------------  ------ ------ ------ ------ -----
10587 (qemu-kvm)      0  10723      5      0 10728
10629 (qemu-kvm)      0      0      5  10717 10722
10671 (qemu-kvm)      0      0  10726      0 10726
10713 (qemu-kvm)  10733      0      5      0 10738
---------------  ------ ------ ------ ------ -----
Total             10733  10723  10740  10717 42913
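One way to put a number on "aligned" is the share of each guest's memory that sits on its largest node. A minimal sketch over `numastat -c` text on stdin, assuming rows shaped like the ones above (PID, a name in parentheses without spaces, per-node columns, then Total); the function name `node_alignment` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: for each process row of `numastat -c <pattern>` output,
# print "<pid> <percent>", where percent is the largest single-node share of
# that process's total memory.
node_alignment() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^\(/ {
        max = 0
        for (i = 3; i < NF; i++)          # per-node columns; $NF is Total
            if ($i + 0 > max) max = $i + 0
        if ($NF > 0) printf "%s %.0f\n", $1, 100 * max / $NF
    }'
}
```

Typical use: `numastat -c qemu | node_alignment`. With the figures on these two slides, PID 10587 goes from about 38% on its largest node before numad to about 100% after.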


Bare Metal - Java Workload

[Chart: Automatic Numad Improvement, Multi-instance Java Workload on 4 Socket, 8 Node system. BOPs (axis 0 to 1,800,000) and % gain (axis -10 to 90) by WHs for Default, Numad95, and Numactl.]

CPU Tuning


CPU Tuning – Power Savings / cpuspeed

● Power savings mode
  ● cpuspeed or cpupower
    – performance
    – ondemand
    – powersave

How To

● echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
● best of both worlds – cron jobs to configure the governor mode
● tuned-adm profile server-powersave (RHEL6)
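The echo above changes cpu0 only; each CPU has its own scaling_governor file. A minimal sketch that applies one governor to every CPU that exposes cpufreq; the function name `set_governor` and its optional second argument (an alternate sysfs root, handy for dry runs) are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: write one governor name to every CPU's
# scaling_governor file.
# $1: governor (performance, ondemand, powersave)
# $2: cpu sysfs root; defaults to the real one
set_governor() {
    root=${2:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] || continue   # no match, or cpufreq not available
        echo "$1" > "$f"
    done
}
```

Run as root, for example `set_governor performance`; the same call can sit in the cron jobs mentioned above.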


CPU Tuning – C-states

● Various states of the CPUs for power savings
  ● C0 through C6
  ● C0 – full frequency (no power savings)
  ● C6 (deep power down mode – maximum power savings)
  ● OS can tell the processors to transition between these states

Linux Tool used for monitoring c-states (only for Intel)
● turbostat -i <interval>

Turbostat shows P/C-states on Intel CPUs

turbostat begins shipping in RHEL6.4, cpupowerutils package

Default

pk cor CPU   %c0  GHz  TSC    %c1  %c3  %c6   %c7
 0   0   0  0.24 2.93 2.88   5.72 1.32 0.00 92.72
 0   1   1  2.54 3.03 2.88   3.13 0.15 0.00 94.18
 0   2   2  2.29 3.08 2.88   1.47 0.00 0.00 96.25
 0   3   3  1.75 1.75 2.88   1.21 0.47 0.12 96.44

latency-performance

pk cor CPU   %c0  GHz  TSC    %c1  %c3  %c6   %c7
 0   0   0  0.00 3.30 2.90 100.00 0.00 0.00  0.00
 0   1   1  0.00 3.30 2.90 100.00 0.00 0.00  0.00
 0   2   2  0.00 3.30 2.90 100.00 0.00 0.00  0.00
 0   3   3  0.00 3.30 2.90 100.00 0.00 0.00  0.00

CPU Tuning – Effect of Power Savings - OLTP

[Chart: Scaling governors testing with OLTP workload using Violin Memory Storage, ramping up user count. Trans/Min for powersave, ondemand, and performance at scaling user sets / avg CPU utilization during the runs of 10U / 15%, 20U / 23%, and 40U / 40%.]

- Powersave has lowest performance
- Performance difference between ondemand and performance goes down as system gets busier

Gotchas

● Test your changes
● If an application is not compatible with transparent huge pages, you will find out quickly!

Recommendations

● Use Tuned!
● 10%-30%

Thank you!