EuroBSDcon 2017 System Performance Analysis Methodologies
EuroBSDcon 2017
System Performance Analysis Methodologies
Brendan Gregg, Senior Performance Architect
[Figure: Apollo Lunar Module Guidance Computer performance analysis; labels: erasable memory, core set area, VAC sets, fixed memory]
Background
History
• System performance analysis up to the '90s:
  – Closed source UNIXes and applications
  – Vendor-created metrics and performance tools
  – Users interpret given metrics
• Problems
  – Vendors may not provide the best metrics
  – Often had to infer, rather than measure
  – Given metrics, what do we do with them?

$ ps -auxw
USER  PID %CPU %MEM  VSZ  RSS TT STAT STARTED     TIME COMMAND
root   11 99.9  0.0    0   16  - RL   22:10   22:27.05 [idle]
root    0  0.0  0.0    0  176  - DLs  22:10    0:00.47 [kernel]
root    1  0.0  0.2 5408 1040  - ILs  22:10    0:00.01 /sbin/init --
[…]
Today
1. Open source
  – Operating systems: Linux, BSD, etc.
  – Applications: source online (GitHub)
2. Custom metrics
  – Can patch the open source, or,
  – Use dynamic tracing (open source helps)
3. Methodologies
  – Start with the questions, then make metrics to answer them
  – Methodologies can pose the questions

Biggest problem with dynamic tracing has been what to do with it. Methodologies guide your usage.
Crystal Ball Thinking
Anti-Methodologies
Street Light Anti-Method
1. Pick observability tools that are
  – Familiar
  – Found on the Internet
  – Found at random
2. Run tools
3. Look for obvious issues

Drunk Man Anti-Method
• Drink Tune things at random until the problem goes away
Blame Someone Else Anti-Method
1. Find a system or environment component you are not responsible for
2. Hypothesize that the issue is with that component
3. Redirect the issue to the responsible team
4. When proven wrong, go to 1
Traffic Light Anti-Method
1. Turn all metrics into traffic lights
2. Open dashboard
3. Everything green? No worries, mate.

• Type I errors: red instead of green
  – team wastes time
• Type II errors: green instead of red
  – performance issues undiagnosed
  – team wastes more time looking elsewhere

Traffic lights are suitable for objective metrics (eg, errors), not subjective metrics (eg, IOPS, latency).
Methodologies
Performance Methodologies
System Methodologies:
  – Problem statement method
  – Functional diagram method
  – Workload analysis
  – Workload characterization
  – Resource analysis
  – USE method
  – Thread State Analysis
  – On-CPU analysis
  – CPU flame graph analysis
  – Off-CPU analysis
  – Latency correlations
  – Checklists
  – Static performance tuning
  – Tools-based methods
  …

• For system engineers:
  – ways to analyze unfamiliar systems and applications
• For app developers:
  – guidance for metric and dashboard design

Collect your own toolbox of methodologies
Problem Statement Method
1. What makes you think there is a performance problem?
2. Has this system ever performed well?
3. What has changed recently?
  – software? hardware? load?
4. Can the problem be described in terms of latency?
  – or run time. not IOPS or throughput.
5. Does the problem affect other people or apps?
6. What is the environment?
  – software, hardware, instance types? versions? config?
Functional Diagram Method
1. Draw the functional diagram
2. Trace all components in the data path
3. For each component, check performance

Breaks up a bigger problem into smaller, relevant parts
Eg, imagine throughput between the UCSB 360 and the UTAH PDP 10 was slow…
[Figure: ARPA Network, 1969]
Workload Analysis
• Begin with application metrics & context
• A drill-down methodology
• Pros:
  – Proportional, accurate metrics
  – App context
• Cons:
  – Difficult to dig from app to resource
  – App specific

[Figure: software stack: Application, System Libraries, System Calls, Kernel, Hardware; arrow labeled Workload Analysis]
Workload Characterization
• Check the workload, not resulting performance
• Eg, for CPUs:
  1. Who: which PIDs, programs, users
  2. Why: code paths, context
  3. What: CPU instructions, cycles
  4. How: changing over time

[Figure: Workload applied to Target]

Workload Characterization: CPUs
  Who: top
  Why: CPU profile, CPU flame graphs
  What: PMCs, CPI flame graph
  How: monitoring

Most companies and monitoring products today cover only Who (top) and How (monitoring).
We can do better
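The "Who" question above can be sketched as a simple aggregation. This is an illustration only: the sample rows are hypothetical, shaped like a few columns of ps -auxw output, and the function name cpu_by is made up for this example.

```python
from collections import defaultdict


def cpu_by(samples, key):
    """Sum %CPU per value of `key` and return (value, total) pairs, largest first."""
    totals = defaultdict(float)
    for sample in samples:
        totals[sample[key]] += sample["cpu"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rows, shaped like a few columns of `ps -auxw` output.
samples = [
    {"user": "root", "pid": 11, "command": "idle", "cpu": 99.9},
    {"user": "www", "pid": 1076, "command": "cksum", "cpu": 40.0},
    {"user": "www", "pid": 1170, "command": "cksum", "cpu": 35.5},
]

# "Who" by program name; pass "user" or "pid" to slice the same data differently.
for command, cpu in cpu_by(samples, "command"):
    print(f"{command:10s} {cpu:6.1f}")
```

This answers only "who"; the "why" and "what" questions need stack profiles and PMCs, which a process listing cannot provide.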
Resource Analysis
• Typical approach for system performance analysis: begin with system tools & metrics
• Pros:
  – Generic
  – Aids resource perf tuning
• Cons:
  – Uneven coverage
  – False positives

[Figure: software stack: Application, System Libraries, System Calls, Kernel, Hardware; arrow labeled Workload Analysis]
The USE Method
• For every resource, check:
  1. Utilization: busy time
  2. Saturation: queue length or time
  3. Errors: easy to interpret (objective)

Starts with the questions, then finds the tools
Eg, for hardware, check every resource incl. busses:
http://www.brendangregg.com/USEmethod/use-rosetta.html
http://www.brendangregg.com/USEmethod/use-freebsd.html
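A minimal sketch of the USE loop, assuming the three metrics have already been collected per resource. The resource names, the numbers, and the 70% utilization limit are all hypothetical and chosen only for illustration; the method itself does not prescribe a threshold.

```python
def use_findings(resources, utilization_limit=0.7):
    """Return (resource, metric) pairs that deserve a closer look.

    Errors and saturation are flagged whenever they are non-zero;
    utilization is flagged at or above utilization_limit (an assumed cutoff).
    """
    findings = []
    for name, metrics in resources.items():
        if metrics["errors"] > 0:
            findings.append((name, "errors"))
        if metrics["saturation"] > 0:
            findings.append((name, "saturation"))
        if metrics["utilization"] >= utilization_limit:
            findings.append((name, "utilization"))
    return findings


# Hypothetical measurements: utilization as a fraction, saturation as queue length.
resources = {
    "cpu": {"utilization": 0.95, "saturation": 4, "errors": 0},
    "disk": {"utilization": 0.20, "saturation": 0, "errors": 0},
    "network": {"utilization": 0.10, "saturation": 0, "errors": 2},
}

for resource, metric in use_findings(resources):
    print(resource, metric)
```

The value of the method is in the outer loop: iterating over every resource forces the question to be asked even where no convenient tool exists.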
USE Method: Software
• USE method can also work for software resources
  – kernel or app internals, cloud environments
  – small scale (eg, locks) to large scale (apps). Eg:
• Mutex locks:
  – utilization → lock hold time
  – saturation → lock contention
  – errors → any errors
• Entire application:
  – utilization → percentage of worker threads busy
  – saturation → length of queued work
  – errors → request errors

[Figure: plot with x-axis Resource Utilization (%)]
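The "entire application" mapping above can be written down directly. This is a sketch with hypothetical inputs; how busy workers, queued requests, and failed requests are counted depends on the application.

```python
def pool_use(busy_workers, total_workers, queued_requests, failed_requests):
    """Express a worker pool's state in USE terms."""
    return {
        # utilization: percentage of worker threads busy
        "utilization_pct": 100.0 * busy_workers / total_workers,
        # saturation: length of queued work
        "saturation": queued_requests,
        # errors: request errors
        "errors": failed_requests,
    }


# Hypothetical snapshot: 6 of 8 workers busy, 3 requests waiting, no failures.
print(pool_use(6, 8, 3, 0))
```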
RED Method
• For every service, check these are within SLO/A:
  1. Request rate
  2. Error rate
  3. Duration (distribution)

Another exercise in posing questions from functional diagrams
By Tom Wilkie: http://www.slideshare.net/weaveworks/monitoring-microservices

[Figure: functional diagram of services: Load Balancer, Web Proxy, Web Server, User Database, Payments Server, Asset Server, Metrics Database]
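A minimal sketch of the three RED numbers for one service, assuming a list of (duration, success) records for a known interval. The record format and the choice of the 50th and 99th percentiles to summarize the distribution are assumptions for this example.

```python
import math


def red_summary(requests, interval_s):
    """requests: list of (duration_ms, ok) pairs observed over interval_s seconds."""
    if not requests:
        return {"rate_per_s": 0.0, "error_ratio": 0.0, "p50_ms": None, "p99_ms": None}
    durations = sorted(duration for duration, _ in requests)
    errors = sum(1 for _, ok in requests if not ok)

    def percentile(p):
        # Nearest-rank percentile: the value at rank ceil(p% of N), 1-based.
        rank = max(1, math.ceil(p / 100 * len(durations)))
        return durations[rank - 1]

    return {
        "rate_per_s": len(requests) / interval_s,
        "error_ratio": errors / len(requests),
        "p50_ms": percentile(50),
        "p99_ms": percentile(99),
    }


# Hypothetical two-second window with one failed request and one slow outlier.
window = [(10, True), (20, True), (30, False), (400, True)]
print(red_summary(window, 2))
```

Two percentiles are a thin summary of a distribution; a histogram or heat map per service keeps more of the shape.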
Thread State Analysis
Identify & quantify time in states
Narrows further analysis to state
Thread states are applicable to all apps
[Figure: state transition diagram]

TSA: eg, OS X Instruments: Thread States

TSA: eg, RSTS/E
RSTS: DEC OS from the 1970's
TENEX (1969-72) also had Control-T for job states
TSA: Finding FreeBSD Thread States

probes:

# dtrace -ln sched:::
   ID   PROVIDER   MODULE   FUNCTION NAME
56622      sched   kernel       none preempt
56627      sched   kernel       none dequeue
56628      sched   kernel       none enqueue
56631      sched   kernel       none off-cpu
56632      sched   kernel       none on-cpu
56633      sched   kernel       none remain-cpu
56634      sched   kernel       none surrender
56640      sched   kernel       none sleep
56641      sched   kernel       none wakeup
[…]

thread flags:

struct thread {
[…]
        enum {
                TDS_INACTIVE = 0x0,
                TDS_INHIBITED,
                TDS_CAN_RUN,
                TDS_RUNQ,
                TDS_RUNNING
        } td_state;
[…]
#define KTDSTATE(td) \
        (((td)->td_inhibitors & TDI_SLEEPING) != 0 ? "sleep" : \
        ((td)->td_inhibitors & TDI_SUSPENDED) != 0 ? "suspended" : \
        ((td)->td_inhibitors & TDI_SWAPPED) != 0 ? "swapped" : \
        ((td)->td_inhibitors & TDI_LOCK) != 0 ? "blocked" : \
        ((td)->td_inhibitors & TDI_IWAIT) != 0 ? "iwait" : "yielding")
TSA: FreeBSD

# ./tstates.d
Tracing scheduler events... Ctrl-C to end.
^C
Time (ms) per state:
COMM               PID   CPU  RUNQ    SLP   SUS   SWP   LCK    IWT   YLD
irq14: ata0         12     0     0      0     0     0     0      0     0
irq15: ata1         12     0     0      0     0     0     0   9009     0
swi4: clock (0)     12     0     0      0     0     0     0   9761     0
usbus0              14     0     0   8005     0     0     0      0     0
[...]
sshd               807     0     0  10011     0     0     0      0     0
devd               474     0     0   9009     0     0     0      0     0
dtrace            1166     1     4  10006     0     0     0      0     0
sh                 936     2    22   5648     0     0     0      0     0
rand_harvestq        6     5    38   9889     0     0     0      0     0
sh                1170     9     0      0     0     0     0      0     0
kernel               0    10    13      0     0     0     0      0     0
sshd               935    14    22   5644     0     0     0      0     0
intr                12    46   276      0     0     0     0      0     0
cksum             1076   929    28      0   480     0     0      0     0
cksum             1170  1499  1029      0     0     0     0      0     0
cksum             1169  1590  1144      0     0     0     0      0     0
idle                11  5856   999      0     0     0     0      0     0

DTrace proof of concept
https://github.com/brendangregg/DTrace-tools/blob/master/sched/tstates.d
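The accounting behind a per-state table like the one above can be sketched as follows. This is not how tstates.d is implemented; it is a plain Python illustration that assumes a time-ordered list of (timestamp, thread, new state) transition events, with made-up data.

```python
from collections import defaultdict


def time_in_states(events, end_ts):
    """events: time-ordered (timestamp_ms, thread, new_state) transitions.

    Returns {thread: {state: total_ms}}, closing each thread's final state at end_ts.
    """
    totals = defaultdict(lambda: defaultdict(int))
    current = {}  # thread -> (state, timestamp the state was entered)
    for ts, thread, state in events:
        if thread in current:
            previous_state, since = current[thread]
            totals[thread][previous_state] += ts - since
        current[thread] = (state, ts)
    for thread, (state, since) in current.items():
        totals[thread][state] += end_ts - since
    return {thread: dict(states) for thread, states in totals.items()}


# Hypothetical trace: a thread runs, waits on the run queue, runs again, then sleeps.
events = [
    (0, "cksum", "CPU"),
    (30, "cksum", "RUNQ"),
    (40, "cksum", "CPU"),
    (100, "cksum", "SLP"),
]
print(time_in_states(events, end_ts=150))
```

Whichever state dominates the totals is where further analysis should go, which is the narrowing step the method describes.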
On-CPU Analysis
1. Split into user/kernel states
  – /proc, vmstat(1)
2. Check CPU balance
  – mpstat(1), CPU utilization heat map
3. Profile software
  – User & kernel stack sampling (as a CPU flame graph)
4. Profile cycles, caches, busses
  – PMCs, CPI flame graph

CPU Utilization Heat Map
CPU Flame Graph Analysis
1. Take a CPU profile
2. Render it as a flame graph
3. Study largest "towers" first

Discovers issues by their CPU usage
- Directly: CPU consumers
- Indirectly: initialization of I/O, locks, times, ...
Narrows target of study

Flame Graph
CPU Flame Graphs: FreeBSD
• Use either DTrace or pmcstat. Eg, kernel CPU with DTrace:

git clone https://github.com/brendangregg/FlameGraph; cd FlameGraph
dtrace -n 'profile-99 /arg0/ { @[stack()] = count(); } tick-30s { exit(0); }' > stacks01
./stackcollapse.pl < stacks01 | sed 's/kernel`//g' | ./flamegraph.pl > stacks01.svg

http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html#DTrace

• Both user & kernel CPU:

dtrace -x ustackframes=100 -x stackframes=100 -n '
    profile-99 { @[stack(), ustack(), execname] = sum(1); }
    tick-30s,END { printa("%k-%k%s\n%@d\n", @); trunc(@); exit(0); }' > stacks02
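The collapse step in the pipeline above reduces many sampled stacks to one line per unique stack plus a count. A minimal Python sketch of that idea follows; it is not the stackcollapse.pl script, and it assumes the samples are already parsed into lists of frames ordered root first. The function names and sample frames are hypothetical.

```python
from collections import Counter


def fold_stacks(samples):
    """samples: list of stacks, each a list of frames ordered root first.

    Returns a Counter keyed by the semicolon-joined stack.
    """
    return Counter(";".join(stack) for stack in samples)


def samples_per_function(folded):
    """Count the samples in which each function appears anywhere on the stack."""
    totals = Counter()
    for stack, count in folded.items():
        for frame in set(stack.split(";")):
            totals[frame] += count
    return totals


# Hypothetical samples from a 3-sample profile.
samples = [
    ["main", "read", "vfs_read"],
    ["main", "read", "vfs_read"],
    ["main", "hash"],
]
folded = fold_stacks(samples)
for stack, count in folded.items():
    print(stack, count)
```

The per-function totals give a rough ordering for "study the largest towers first", though a flame graph draws width per position in the merged stack tree, not per function name.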
Java Mixed-Mode CPU Flame Graph
By sampling stack traces with:
• -XX:+PreserveFramePointer
• Java perf-map-agent

[Figure: flame graph with regions labeled Java, JVM (C++), User (C), Kernel (C)]
CPI Flame Graph: BSD
A CPU flame graph (cycles) colored using instructions/stall profile data
eg, using FreeBSD pmcstat:

red == instructions
blue == stalls

http://www.brendangregg.com/blog/2014-10-31/cpi-flame-graphs.html
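To make the coloring idea concrete, here is a simplified sketch that computes cycles per instruction (CPI) for each function and picks one of the two colors. The real flame graph shades continuously from profile data; the single cutoff of 1.0, the function names, and the counter values here are assumptions for illustration only.

```python
def cpi_colors(counters, cutoff=1.0):
    """counters: {function: (cycles, instructions)}.

    Returns {function: (cpi, color)}, where "blue" marks stall-heavy code
    (CPI above the assumed cutoff) and "red" marks instruction-heavy code.
    """
    colored = {}
    for function, (cycles, instructions) in counters.items():
        if instructions == 0:
            # No instructions retired while cycles were spent: treat as all stalls.
            colored[function] = (float("inf"), "blue")
            continue
        cpi = cycles / instructions
        colored[function] = (cpi, "blue" if cpi > cutoff else "red")
    return colored


# Hypothetical PMC totals per function.
print(cpi_colors({"memcpy": (900, 300), "hash": (400, 800)}))
```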
Off-CPU Analysis
Analyze off-CPU time via blocking code path: Off-CPU flame graph
Often need wakeup code paths as well…

Off-CPU Time Flame Graph: FreeBSD
tar … > /dev/null
[Figure: off-CPU time flame graph; x-axis: off-CPU time, y-axis: stack depth; towers labeled file read, directory read, seek, readahead, missing symbols (stripped)]
Off-CPU Profiling: FreeBSD
offcpu.d (uses DTrace):

#!/usr/sbin/dtrace -s
#pragma D option ustackframes=100
#pragma D option dynvarsize=32m

sched:::off-cpu /execname == "bsdtar"/ { self->ts = timestamp; }

sched:::on-cpu
/self->ts/
{
        @[stack(), ustack(), execname] = sum(timestamp - self->ts);
        self->ts = 0;
}

dtrace:::END
{
        normalize(@, 1000000);
        printa("%k-%k%s\n%@d\n", @);
}

Warning: can have significant overhead (scheduler events can be frequent)
Change/remove the execname predicate as desired
eg, add /curthread->td_state <= 1/ to exclude preempt, otherwise sees iCsw

# ./offcpu.d > out.stacks
# git clone https://github.com/brendangregg/FlameGraph; cd FlameGraph
# ./stackcollapse.pl < ../out.stacks | sed 's/kernel`//g' | \
    ./flamegraph.pl --color=io --title="Off-CPU Flame Graph" --countname=ms > out.svg
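The pairing logic in offcpu.d (remember when a thread left the CPU, then charge the elapsed time to its stack when it returns) can be sketched outside the kernel. This Python version is an illustration over a hypothetical event list, not a replacement for the D script; unlike the script, it records the stack at the off-cpu event.

```python
from collections import defaultdict


def off_cpu_ms(events):
    """events: time-ordered (timestamp_ns, thread_id, kind, stack) tuples,
    where kind is "off-cpu" or "on-cpu".

    Returns {stack: total off-CPU milliseconds}.
    """
    started = {}  # thread_id -> (timestamp_ns, stack) saved at off-cpu
    totals_ns = defaultdict(int)
    for ts, thread_id, kind, stack in events:
        if kind == "off-cpu":
            started[thread_id] = (ts, stack)
        elif kind == "on-cpu" and thread_id in started:
            t0, blocked_stack = started.pop(thread_id)
            totals_ns[blocked_stack] += ts - t0
    # Same scaling as normalize(@, 1000000): nanoseconds to milliseconds.
    return {stack: ns // 1_000_000 for stack, ns in totals_ns.items()}


# Hypothetical events: thread 1 blocks twice in the same read path;
# thread 2's on-cpu event has no matching off-cpu and is ignored.
events = [
    (0, 1, "off-cpu", "read;vfs_read"),
    (5_000_000, 1, "on-cpu", None),
    (6_000_000, 1, "off-cpu", "read;vfs_read"),
    (9_000_000, 1, "on-cpu", None),
    (9_500_000, 2, "on-cpu", None),
]
print(off_cpu_ms(events))
```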
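Off-CPU Time Flame Graph: FreeBSD
tar … | gzip
[Figure: off-CPU time flame graph; towers labeled pipe write, file read, readahead]

Wakeup Time Flame Graph: FreeBSD
Who did the wakeup:
[Figure: wakeup time flame graph; labels: waker, wakee, user-stack, kernel-stack]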
Wakeup Profiling: FreeBSD
wakeup.d (uses DTrace):

#!/usr/sbin/dtrace -s
#pragma D option quiet
#pragma D option ustackframes=100
#pragma D option dynvarsize=32m

sched:::sleep /execname == "bsdtar"/ { ts[curlwpsinfo->pr_addr] = timestamp; }

sched:::wakeup
/ts[arg0]/
{
        this->delta = timestamp - ts[arg0];
        @[args[1]->p_comm, stack(), ustack(), execname] = sum(this->delta);
        ts[arg0] = 0;
}

dtrace:::END
{
        normalize(@, 1000000);
        printa("\n%s%k-%k%s\n%@d\n", @);
}

Warning: can have significant overhead (scheduler events can be frequent)
Change/remove the execname predicate as desired
Merging Stacks with eBPF: Linux
[Figure: merged flame graph; labels: waker task, waker stack, blocked stack, blocked task, stack direction, woke up]

• Using enhanced Berkeley Packet Filter (eBPF) to merge stacks in kernel context
• Not available on BSD (yet)
Ye Olde BPF
Berkeley Packet Filter

# tcpdump host 127.0.0.1 and port 22 -d
(000) ldh      [12]
(001) jeq      #0x800           jt 2    jf 18
(002) ld       [26]
(003) jeq      #0x7f000001      jt 6    jf 4
(004) ld       [30]
(005) jeq      #0x7f000001      jt 6    jf 18
(006) ldb      [23]
(007) jeq      #0x84            jt 10   jf 8
(008) jeq      #0x6             jt 10   jf 9
(009) jeq      #0x11            jt 10   jf 18
(010) ldh      [20]
(011) jset     #0x1fff          jt 18   jf 12
(012) ldxb     4*([14]&0xf)
(013) ldh      [x + 14]
[...]

User-defined bytecode executed by an in-kernel sandboxed virtual machine
Steven McCanne and Van Jacobson, 1993
2 x 32-bit registers & scratch memory
Optimizes packet filter performance
Enhanced BPF
aka eBPF or just "BPF"
Alexei Starovoitov, 2014+
10 x 64-bit registers
maps (hashes)
stack traces
actions

bcc/BPF front-end (C & Python)
bcc examples/tracing/bitehist.py
Latency Correlations
1. Measure latency histograms at different stack layers
2. Compare histograms to find latency origin

Even better, use latency heat maps
• Match outliers based on both latency and time
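A minimal sketch of the two steps above, assuming latencies in microseconds have already been measured at two layers. The power-of-two bucketing, the layer names, the sample values, and the 4096 us floor used to define an outlier are all choices made for this example.

```python
from collections import Counter


def log2_hist(latencies_us):
    """Bucket latencies into power-of-two ranges, keyed by each bucket's lower bound."""
    hist = Counter()
    for value in latencies_us:
        bucket = 1 << (int(value).bit_length() - 1) if value >= 1 else 0
        hist[bucket] += 1
    return hist


def outlier_buckets(hist, floor_us):
    """Keep only the buckets at or above floor_us."""
    return {bucket: count for bucket, count in hist.items() if bucket >= floor_us}


# Hypothetical measurements of the same four requests at two layers.
app_us = [100, 120, 9000, 110]
disk_us = [80, 90, 8500, 85]

app_slow = outlier_buckets(log2_hist(app_us), 4096)
disk_slow = outlier_buckets(log2_hist(disk_us), 4096)
# If the slow bucket shows up at both layers, the latency originates at or below the lower one.
print(app_slow, disk_slow)
```

Histograms discard time, so two unrelated outliers can look like a match; heat maps add the time axis needed to confirm it.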
Checklists: eg, BSD Perf Analysis in 60s
 1. uptime              load averages
 2. dmesg -a | tail     kernel errors
 3. vmstat 1            overall stats by time
 4. vmstat -P           CPU balance
 5. ps -auxw            process usage
 6. iostat -xz 1        disk I/O
 7. systat -ifstat      network I/O
 8. systat -netstat     TCP stats
 9. top                 process overview
10. systat -vmstat      system overview

adapted from http://techblog.netflix.com/2015/11/linux-performance-analysis-in-60s.html
Checklists: eg, Netflix perf vitals Dashboard
[Figure: dashboard panels: 1. RPS, CPU; 2. Volume; 3. Instances; 4. Scaling; 5. CPU/RPS; 6. Load Avg; 7. Java Heap; 8. ParNew; 9. Latency; 10. 99th tile]
Static Performance Tuning: FreeBSD

Tools-Based Method: FreeBSD
Try all the tools! May be an anti-pattern

Tools-Based Method: DTrace FreeBSD
Just my new BSD tools

Other Methodologies
• Scientific method
• 5 Why's
• Process of elimination
• Intel's Top-Down Methodology
• Method R
What You Can Do
1. Know what's now possible on modern systems
  – Dynamic tracing: efficiently instrument any software
  – CPU facilities: PMCs, MSRs (model specific registers)
  – Visualizations: flame graphs, latency heat maps, …
2. Ask questions first: use methodologies to ask them
3. Then find/build the metrics
4. Build or buy dashboards to support methodologies
Dynamic Tracing: Efficient Metrics
Eg, tracing TCP retransmits
[Figure: Old way: packet capture; labels: send, receive, kernel buffer, tcpdump (1. read, 2. dump), file system, disks, analyzer (1. read, 2. process, 3. print)]
[Figure: New way: dynamic tracing; labels: tracer (1. configure, 2. read), tcp_retransmit_skb()]
Dynamic Tracing: Instrument Most Software
My Solaris/DTrace tools (many already work on BSD/DTrace):

Performance Monitoring Counters
Eg, BSD PMC groups for Intel Sandy Bridge:

Visualizations
Eg, Disk I/O latency as a heat map, quantized in kernel:
Post processing the output of my iosnoop tool: www.brendangregg.com/HeatMaps/latency.html
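The quantization behind a latency heat map can be sketched as counting events per (time column, latency row) cell. This is a user-space illustration with hypothetical events and bucket sizes; it does not reproduce the in-kernel quantization or the iosnoop output format.

```python
from collections import Counter


def heat_map(events, time_step_s, latency_step_us):
    """events: (completion_time_s, latency_us) pairs.

    Returns a Counter keyed by (column, row): the column is the time bucket,
    the row is the latency bucket, and the count drives the cell's color.
    """
    cells = Counter()
    for completion_time, latency in events:
        column = int(completion_time // time_step_s)
        row = int(latency // latency_step_us)
        cells[(column, row)] += 1
    return cells


# Hypothetical disk I/O completions: three fast I/Os and one 5.2 ms outlier.
events = [(0.2, 150), (0.7, 180), (1.1, 150), (1.9, 5200)]
for cell, count in sorted(heat_map(events, 1.0, 100).items()):
    print(cell, count)
```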
Summary
• It is the crystal ball age of performance observability
• What matters is the questions you want answered
• Methodologies are a great way to pose questions

[Figure: Who / Why / What / How quadrants]
References & Resources
• FreeBSD @ Netflix:
  – https://openconnect.itp.netflix.com/
  – http://people.freebsd.org/~scottl/Netflix-BSDCan-20130515.pdf
  – http://www.youtube.com/watch?v=FL5U4wr86L4
• USE Method
  – http://queue.acm.org/detail.cfm?id=2413037
  – http://www.brendangregg.com/usemethod.html
• TSA Method
  – http://www.brendangregg.com/tsamethod.html
• Off-CPU Analysis
  – http://www.brendangregg.com/offcpuanalysis.html
  – http://www.brendangregg.com/blog/2016-01-20/ebpf-offcpu-flame-graph.html
  – http://www.brendangregg.com/blog/2016-02-05/ebpf-chaingraph-prototype.html
• Static Performance Tuning, Richard Elling, Sun blueprint, May 2000
• RED Method: http://www.slideshare.net/weaveworks/monitoring-microservices
• Other system methodologies
  – Systems Performance: Enterprise and the Cloud, Prentice Hall 2013
  – http://www.brendangregg.com/methodology.html
  – The Art of Computer Systems Performance Analysis, Jain, R., 1991
• Flame Graphs
  – http://queue.acm.org/detail.cfm?id=2927301
  – http://www.brendangregg.com/flamegraphs.html
  – http://techblog.netflix.com/2015/07/java-in-flames.html
• Latency Heat Maps
  – http://queue.acm.org/detail.cfm?id=1809426
  – http://www.brendangregg.com/HeatMaps/latency.html
• ARPA Network: http://www.computerhistory.org/internethistory/1960s
• RSTS/E System User's Guide, 1985, page 4-5
• DTrace: Dynamic Tracing in Oracle Solaris, Mac OS X, and FreeBSD, Prentice Hall 2011
• Apollo: http://www.hq.nasa.gov/office/pao/History/alsj/a11 http://www.hq.nasa.gov/alsj/alsj-LMdocs.html
Thank You
• http://slideshare.net/brendangregg
• http://www.brendangregg.com
• @brendangregg