Real Life RAC Performance Tuning - · PDF fileReal Life RAC Performance Tuning Arup Nanda Lead...
Transcript of Real Life RAC Performance Tuning - · PDF fileReal Life RAC Performance Tuning Arup Nanda Lead...
Real Life RAC Performance Real Life RAC Performance TuningTuning
Arup NandaArup NandaLead DBALead DBA
Starwood HotelsStarwood Hotels
©© Arup NandaArup Nanda
Who am IWho am IOracle DBA for 13 years and countingOracle DBA for 13 years and countingWorking on OPS from 1999Working on OPS from 1999Implemented and supported 10g RAC in Implemented and supported 10g RAC in 83 sites since 200483 sites since 2004Speak at conferences, write papers, booksSpeak at conferences, write papers, books
©© Arup NandaArup Nanda
Why This SessionWhy This SessionI get emails like this:I get emails like this:
We are facing performance issues in RAC. We are facing performance issues in RAC. What should I do next?What should I do next?
Real Life AdviceReal Life AdviceCommon Issues (with Wait Events)Common Issues (with Wait Events)Dispelling MythsDispelling MythsFormulate a Plan of AttackFormulate a Plan of AttackReal Life Case StudyReal Life Case Study
proligence.com/downloads.htmlproligence.com/downloads.html
©© Arup NandaArup Nanda
Our RAC ImplementationOur RAC ImplementationOracle 10g RAC in March 2004Oracle 10g RAC in March 2004
Itanium Platform running HP/UXItanium Platform running HP/UXOracle 10.1.0.2Oracle 10.1.0.2
Result: Failed Result: Failed Second Attempt: Dec 2004 Second Attempt: Dec 2004 –– 10.1.0.310.1.0.3Result: Failed Again Result: Failed Again Third Attempt: March 2005 Third Attempt: March 2005 –– 10.1.0.410.1.0.4Result: Success! Result: Success! ☺☺
©© Arup NandaArup Nanda
ChallengesChallengesTechnologyTechnology
Lone rangerLone rangerA lot of A lot of ““mysterymystery”” and disconnected and disconnected ““factsfacts””!!
PeoplePeopleBuilding a team that could not only deliver; but Building a team that could not only deliver; but also sustain the delivered partsalso sustain the delivered partsEach day we learned something newEach day we learned something new
In todayIn today’’s session: real performance s session: real performance issues we faced and how we resolved issues we faced and how we resolved them, along with wait events.them, along with wait events.
©© Arup NandaArup Nanda
Why Why ““RACRAC”” Performance?Performance?All tuning concepts in single instance All tuning concepts in single instance applied to RAC as wellapplied to RAC as wellRAC has other complexitiesRAC has other complexities
More than 1 buffer cacheMore than 1 buffer cacheMultiple caches Multiple caches –– library cache, row cachelibrary cache, row cacheInterconnectInterconnectPingingPingingGlobal LockingGlobal Locking
©© Arup NandaArup Nanda
Why RAC Why RAC PerfPerf TuningTuningWe want to make sure we identify the right We want to make sure we identify the right problem and go after it problem and go after it ……. not just . not just aa problemproblem
©© Arup NandaArup Nanda
Switch
VIPServiceListenerInstance
ASMClusterware
Op Sys
VIPServiceListenerInstance
ASMClusterware
Op Sys
Public Interface
Cache
Cache Fusion
OCR Voting
SwitchInterconnect
Storage
Node1 Node2
Lock Manager
©© Arup NandaArup Nanda
Areas of Concern in RACAreas of Concern in RACMore than 1 buffer cacheMore than 1 buffer cacheMultiple caches Multiple caches –– library cache, row cachelibrary cache, row cacheInterconnectInterconnectGlobal LockingGlobal Locking
©© Arup NandaArup Nanda
Cache IssuesCache IssuesTwo Caches, requires synchronizationTwo Caches, requires synchronizationWhat that means:What that means:
A changed block in one instance, when A changed block in one instance, when requested by another, should be sent across requested by another, should be sent across via a via a ““bridgebridge””This bridge is the InterconnectThis bridge is the Interconnect
©© Arup NandaArup Nanda
Interconnect PerformanceInterconnect PerformanceInterconnect must be on a private LANInterconnect must be on a private LANPort aggregation to increase throughputPort aggregation to increase throughput
APA on HPUXAPA on HPUXIf using Gigabit over Ethernet, use Jumbo If using Gigabit over Ethernet, use Jumbo FramesFrames
©© Arup NandaArup Nanda
Checking Interconnect UsedChecking Interconnect UsedIdentify the interconnect usedIdentify the interconnect used
$ $ oifcfgoifcfg getifgetiflan902 172.17.1.0 global lan902 172.17.1.0 global cluster_interconnectcluster_interconnect
lan901 10.28.188.0 global publiclan901 10.28.188.0 global public
Is lan902 the bonded interface? If not, Is lan902 the bonded interface? If not, then set itthen set it
$ $ oifcfgoifcfg setifsetif ……
©© Arup NandaArup Nanda
Pop QuizPop QuizIf I have a very fast interconnect, I can If I have a very fast interconnect, I can perform the same work in multiple node perform the same work in multiple node RAC as a single server with faster CPUs. RAC as a single server with faster CPUs. True/False?True/False?Since cache fusion is now writeSince cache fusion is now write--write, a write, a fast interconnect will compensate for a fast interconnect will compensate for a slower IO subsystem. True/False?slower IO subsystem. True/False?
©© Arup NandaArup Nanda
Cache Coherence TimesCache Coherence TimesThe time is a sum of time for:The time is a sum of time for:
Finding the block in the cacheFinding the block in the cacheIdentifying the masterIdentifying the masterGet the block in the interconnectGet the block in the interconnectTransfer speed of the interconnectTransfer speed of the interconnectLatency of the interconnectLatency of the interconnectReceive the block by the remote instanceReceive the block by the remote instanceCreate the consistent image for the userCreate the consistent image for the user
CPU
CPU
Interconnect
©© Arup NandaArup Nanda
So it all boils down to:So it all boils down to:Block Access CostBlock Access Cost
more blocks more blocks --> more the time> more the timeParallel QueryParallel Query
Lock Management CostLock Management CostMore coordination More coordination --> more time> more timeImplicit Cache Checks Implicit Cache Checks –– Sequence NumbersSequence Numbers
Interconnect CostInterconnect CostLatencyLatencySpeedSpeedmore data to transfer more data to transfer --> more the time> more the time
©© Arup NandaArup Nanda
Hard LessonsHard LessonsIn RAC, problem symptoms may not In RAC, problem symptoms may not indicate the correct problem!indicate the correct problem!Example:Example:
When the CPU is too busy to receive or send When the CPU is too busy to receive or send packets via UDP, the packets fails and the packets via UDP, the packets fails and the ClusterwareClusterware thinks the node is down and thinks the node is down and evicts it.evicts it.
©© Arup NandaArup Nanda
OS TroubleshootingOS TroubleshootingOS utilities to troubleshoot CPU issuesOS utilities to troubleshoot CPU issues
toptopglanceglance
OS Utilities to troubleshoot process OS Utilities to troubleshoot process issues:issues:
trusstrussstracestracedbxdbxpstackpstack
©© Arup NandaArup Nanda
Reducing LatencyReducing LatencyA factor of technologyA factor of technologyTCP is the most latentTCP is the most latentUDP is better (over Ethernet)UDP is better (over Ethernet)Proprietary protocols are usually betterProprietary protocols are usually better
HyperFabricHyperFabric by HPby HPReliable Datagram (RDP) Reliable Datagram (RDP) Direct Memory ChannelDirect Memory Channel
InfinibandInfinibandUDP over UDP over InfinibandInfinibandRDP over RDP over InfinibandInfiniband
©© Arup NandaArup Nanda
Start with AWRStart with AWR
©© Arup NandaArup Nanda
gcgc current|crcurrent|cr grant 2grant 2--wayway
Instance 1
Instance 2
Session
Database
LMS LGWR
Log BufferLog Buffer
LMS
gc current block requestgc current grant 2-way
©© Arup NandaArup Nanda
gcgc current|crcurrent|cr block 2block 2--wayway
Instance 1
Instance 2
Session
Database
LMS LGWR
Log BufferLog Buffer
LMS
gc current block request log file syncgc current block 2-way
©© Arup NandaArup Nanda
gcgc current|crcurrent|cr block 3block 3--wayway
Instance 1
Instance 2
Instance 3
Session
Requestor
Master
Holder
gc current block 3-way
©© Arup NandaArup Nanda
RAC related StatsRAC related Stats
©© Arup NandaArup Nanda
RAC Stats contd.RAC Stats contd.
©© Arup NandaArup Nanda
©© Arup NandaArup Nanda
Other GC Block WaitsOther GC Block Waitsgcgc current/current/crcr block lostblock lost
Lost blocks due to Interconnect or CPULost blocks due to Interconnect or CPUgcgc curent/crcurent/cr block busyblock busy
The consistent read request was delayed, The consistent read request was delayed, most likely an I/O bottleneckmost likely an I/O bottleneck
gcgc current/current/crcr block congestedblock congestedLong run queues and/or paging due to Long run queues and/or paging due to memory deficiency.memory deficiency.
©© Arup NandaArup Nanda
Hung or Slow?Hung or Slow?Check Check V$SESSIONV$SESSION for for WAIT_TIMEWAIT_TIME
If 0, then itIf 0, then it’’s not waiting; its not waiting; it’’s hungs hungWhen hung:When hung:
Take a Take a systemstatesystemstate dump from all nodesdump from all nodesWait some timeWait some timeTake another Take another systemstatesystemstate dumpdumpCheck change in values. If unchanged, then Check change in values. If unchanged, then system is hungsystem is hung
©© Arup NandaArup Nanda
Chart a PlanChart a PlanRule out the obviousRule out the obviousStart with AWR ReportStart with AWR ReportStart with TopStart with Top--5 Waits5 WaitsSee if they have any significant waitsSee if they have any significant waits…… especially RAC relatedespecially RAC relatedGo on to RAC StatisticsGo on to RAC StatisticsBase your solution based on the wait Base your solution based on the wait eventevent
©© Arup NandaArup Nanda
Rule out the obviousRule out the obviousIs interconnect private?Is interconnect private?Is interconnect on UDP?Is interconnect on UDP?Do you see high CPU?Do you see high CPU?Do you see a lot of IO bottleneck?Do you see a lot of IO bottleneck?How about memory?How about memory?Are the apps spread over evenly?Are the apps spread over evenly?Do you see lost blocks?Do you see lost blocks?
©© Arup NandaArup Nanda
Make Simple FixesMake Simple FixesStrongly consider RAID 0+1Strongly consider RAID 0+1Highest possible number of I/O pathsHighest possible number of I/O pathsUse fastest interconnect possibleUse fastest interconnect possibleUse private collision free domain for I/CUse private collision free domain for I/CCache and NOORDER sequencesCache and NOORDER sequences
©© Arup NandaArup Nanda
Enterprise ManagerEnterprise Manager
©© Arup NandaArup Nanda
Buffer BusyBuffer BusyCauseCause
Instance wants to bring something from disk Instance wants to bring something from disk to the buffer cacheto the buffer cacheDelay, due to space not availableDelay, due to space not availableDelay, Delay, bb’’cozcoz the source buffer is not readythe source buffer is not readyDelay, I/O is slowDelay, I/O is slowDelay, Delay, bb’’cozcoz redo log is being flushedredo log is being flushed
In summaryIn summaryLog buffer flush Log buffer flush --> > gcgc buffer busybuffer busy
©© Arup NandaArup Nanda
Parallel QueryParallel QueryOne major issue in RAC is parallel query One major issue in RAC is parallel query that goes across many nodesthat goes across many nodes
Instance 1 Instance 2QC
Slave Slave Slave Slave
ViaInterconnect
©© Arup NandaArup Nanda
Restricting PQRestricting PQDefine Instance GroupsDefine Instance GroupsSpecify in Specify in init.orainit.oraprodb1.instance_groups='pqgroup1'prodb1.instance_groups='pqgroup1'
prodb2.instance_groups='pqgroup2' prodb2.instance_groups='pqgroup2'
Specify Instance Groups in SessionSpecify Instance Groups in SessionSQL> alter session set SQL> alter session set parallel_instance_groupparallel_instance_group = = 'pqgroup1'; 'pqgroup1';
©© Arup NandaArup Nanda
Forcing PQ on both NodesForcing PQ on both NodesDefine a common Instance GroupDefine a common Instance Groupprodb1.instance_groups='pqgroup1prodb1.instance_groups='pqgroup1‘‘
prodb1.instance_groups=prodb1.instance_groups=‘‘pq2nodes'pq2nodes'
prodb2.instance_groups='pqgroup2'prodb2.instance_groups='pqgroup2'
prodb2.instance_groups='pq2nodes' prodb2.instance_groups='pq2nodes'
Specify Instance Groups in SessionSpecify Instance Groups in SessionSQL> alter session set SQL> alter session set parallel_instance_groupparallel_instance_group = = 'pq2nodes'; 'pq2nodes';
©© Arup NandaArup Nanda
Vital Cache Fusion ViewsVital Cache Fusion Viewsgv$cache_transfergv$cache_transfer:: Monitor blocks Monitor blocks transferred by objecttransferred by objectgv$class_cache_transfergv$class_cache_transfer:: Monitor Monitor block transfer by classblock transfer by classgv$file_cache_transfergv$file_cache_transfer:: Monitor Monitor the blocks transferred per file the blocks transferred per file gv$temp_cache_transfergv$temp_cache_transfer:: Monitor Monitor the transfer of temporary tablespace the transfer of temporary tablespace blocksblocks
©© Arup NandaArup Nanda
““HotHot”” TablesTablesTables, e.g. Rate PlansTables, e.g. Rate Plans
SmallSmallCompact blocksCompact blocksHigh updatesHigh updatesHigh readsHigh reads
SymptomsSymptomsgcgc buffer busy waitsbuffer busy waits
SolutionSolutionLess rows per blockLess rows per blockHigh PCTFREE, INITRANS, High PCTFREE, INITRANS, ALTER TABLE ALTER TABLE …… MINIMIZE MINIMIZE RECORDS_PER_BLOCKRECORDS_PER_BLOCK
©© Arup NandaArup Nanda
Hot SequencesHot SequencesSymptoms:Symptoms:
High waits on Sequence Number latchHigh waits on Sequence Number latchHigh waits on SEQ$ tableHigh waits on SEQ$ table
Solution:Solution:Increase the cacheIncrease the cacheMake it NOORDERMake it NOORDER
Especially AUDSESS$ sequence in SYS, Especially AUDSESS$ sequence in SYS, used in Auditingused in Auditing
©© Arup NandaArup Nanda
Read Only? Say So.Read Only? Say So.Reading table data from other instances Reading table data from other instances create create ““gcgc **”” contentionscontentionsSuggestion:Suggestion:
Move Read Only tables to a single tablespaceMove Read Only tables to a single tablespaceMake this tablespace Read OnlyMake this tablespace Read OnlySQL> alter tablespace ROD read SQL> alter tablespace ROD read only;only;
©© Arup NandaArup Nanda
PartitioningPartitioningPartitioning creates several segments for Partitioning creates several segments for the same table (or index) the same table (or index) => more resources => more resources => less contention=> less contention
©© Arup NandaArup Nanda
Monotonically Increasing IndexMonotonically Increasing IndexProblem:Problem:
““Reservation IDReservation ID””, a sequence generated key, a sequence generated keyIndex is heavy on one sideIndex is heavy on one side
SymptomsSymptomsBuffer busy waitsBuffer busy waitsIndex block Index block spiltsspilts
Solutions:Solutions:Reverse key indexesReverse key indexesHash partitioned index (even if the table is not Hash partitioned index (even if the table is not partitioned) 10gR2partitioned) 10gR2
©© Arup NandaArup Nanda
Library CacheLibrary CacheIn RAC, Library Cache is globalIn RAC, Library Cache is globalSo, parsing cost is worse than nonSo, parsing cost is worse than non--RACRACSolutions:Solutions:
Minimize table alters, drops, creates, Minimize table alters, drops, creates, truncatestruncatesUse PL/SQL stored programs, not unnamed Use PL/SQL stored programs, not unnamed blocksblocks
©© Arup NandaArup Nanda
Log FilesLog FilesIn 10g R2, the log files are in a single location:In 10g R2, the log files are in a single location:$CRS_HOME/log/<Host>/$CRS_HOME/log/<Host>/……
racgracg
crsdcrsd
cssdcssd
evmdevmd
clientclient
cssd/oclsmoncssd/oclsmon
$ORACLE_HOME/$ORACLE_HOME/racgracg/dump/dump
©© Arup NandaArup Nanda
Case StudyCase Study
©© Arup NandaArup Nanda
DiagnosisDiagnosisifconfigifconfig --aa shows no congestion or shows no congestion or dropped packetsdropped packetsTop shows 1% idle time on node 2Top shows 1% idle time on node 2Top processesTop processes
LMS and LMDLMS and LMDAnd, several And, several NetbackupNetbackup processesprocesses
©© Arup NandaArup Nanda
Further DiagnosisFurther DiagnosisSQL:SQL:select * from select * from v$instance_cache_transferv$instance_cache_transferwhere class = 'data block'where class = 'data block'and instance = 1;and instance = 1;
Output:Output:INSTANCE CLASS CR_BLOCK CR_BUSYINSTANCE CLASS CR_BLOCK CR_BUSY
-------------------- ------------------------------------ -------------------- --------------------CR_CONGESTED CURRENT_BLOCK CURRENT_BUSY CURRENT_CONGESTEDCR_CONGESTED CURRENT_BLOCK CURRENT_BUSY CURRENT_CONGESTED------------------------ -------------------------- ------------------------ ----------------------------------
1 data block 162478682 50971491 data block 162478682 5097149477721 347917908 2950144 16320267477721 347917908 2950144 16320267
After sometime:After sometime:INSTANCE CLASS CR_BLOCK CR_BUSYINSTANCE CLASS CR_BLOCK CR_BUSY-------------------- ------------------------------------ -------------------- --------------------CR_CONGESTED CURRENT_BLOCK CURRENT_BUSY CURRENT_CONGESTEDCR_CONGESTED CURRENT_BLOCK CURRENT_BUSY CURRENT_CONGESTED------------------------ -------------------------- ------------------------ ----------------------------------
1 data block 162480580 50971851 data block 162480580 5097185477722 347923719 2950376 16320269477722 347923719 2950376 16320269
See increases
©© Arup NandaArup Nanda
Diagnosis:Diagnosis:CPU starvation by LMS/D processes caused CPU starvation by LMS/D processes caused GC waits.GC waits.
Solution:Solution:Killed the Killed the NetbackupNetbackup processesprocessesLMD and LMS got the CPULMD and LMS got the CPU
©© Arup NandaArup Nanda
Increasing Interconnect SpeedIncreasing Interconnect SpeedFaster HardwareFaster Hardware
Gigabit Ethernet; not FastGigabit Ethernet; not FastInfinibandInfiniband, even if IP over IB, even if IP over IB
NIC settingsNIC settingsDuplex ModeDuplex ModeHighest Top Bit RateHighest Top Bit Rate
TCP SettingsTCP SettingsFlow Control SettingsFlow Control SettingsNetwork Interrupts for CPUNetwork Interrupts for CPUSocket Receive BufferSocket Receive Buffer
LAN PlanningLAN PlanningPrivate LANsPrivate LANsCollision DomainsCollision Domains
©© Arup NandaArup Nanda
High Speed InterconnectsHigh Speed InterconnectsOracle will support RDS over Oracle will support RDS over InfinibandInfinibandhttp://oss.oracle.com/projects/rds/http://oss.oracle.com/projects/rds/On 10 Gig Ethernet as wellOn 10 Gig Ethernet as well
©© Arup NandaArup Nanda
In summary: PlanningIn summary: PlanningAdequate CPU, Network, MemoryAdequate CPU, Network, MemorySequences Sequences –– cache, cache, noordernoorderTablespaces read onlyTablespaces read onlyUnUn--compact small hot tablescompact small hot tablesKeep undo and redo on fastest disksKeep undo and redo on fastest disksAvoid full table scans of large tablesAvoid full table scans of large tablesAvoid Avoid DDLsDDLs and unnamed PL/SQL blocksand unnamed PL/SQL blocks
©© Arup NandaArup Nanda
In summary: DiagnosisIn summary: DiagnosisStart with AWRStart with AWRIdentify symptoms and assign causesIdentify symptoms and assign causesDonDon’’t get fooled by t get fooled by ““gcgc”” waits as waits as interconnect issuesinterconnect issuesFind the correlation between Find the correlation between ““droppeddropped””packets in network, CPU issues from packets in network, CPU issues from sarsarand and ““gcgc buffer lostbuffer lost”” in in sysstatsysstat reports.reports.
©© Arup NandaArup Nanda
Thank You!Thank You!Download from:Download from:
proligence.com/downloads.htmlproligence.com/downloads.html