Looking Under the Hood at the Oracle ClusterWare
OOW 2009
Murali Vallath
About me
• Independent Oracle Consultant - Summersky Enterprises
e-mail: [email protected]
Agenda
• Architecture
• ClusterWare Components
• CSS startup process
• Oracle ClusterWare
• Debug / Troubleshooting
• OCR
• Q&A
Architecture
© Summersky Enterprises LLC | Murali Vallath | Slide: 4
[Diagram: four-node cluster for database SSKYDB. Nodes ORADB1 to ORADB4 host instances SSKY1 to SSKY4. Each node stacks Listeners | Monitors, Clusterware, IPC, Comm. Layer and Operating System, and has a VIP. The nodes connect to the Public Network and the Cluster Interconnect through network switches, and to Shared Storage through a SAN switch.]
Cluster Manager
• Is a distributed kernel component that monitors whether cluster members can communicate with each other
• Enforces rules of cluster membership
• Forms a cluster, adds members to a cluster and removes members from a cluster
• Tracks which members in a cluster are active
• Maintains a cluster membership list that is consistent on all cluster members
• Provides timely notification of membership changes
• Detects and handles possible cluster partitions
Oracle Clusterware Components
• Cluster Synchronization Services (CSS)
• Cluster Ready Services (CRS)
• Event Manager (EVM)
• Oracle Cluster Registry (OCR)
• Voting Disk
• Virtual IP (VIP)
• Cluster Interconnect
[Diagram: nodes ORADB1 to ORADB4 (instances SSKY1 to SSKY4), each running CRS, CSS and EVM and owning a VIP; RACGIMON and the SGA appear on the instance side. CSS comprises NM and GM. The nodes connect through a network switch (Public Network), an interconnect switch (Cluster Interconnect) and a SAN switch to the shared OCR (registry) and the CSS Voting Disk.]
Oracle Clusterware
CSSD
• Node Membership (NM)
– Checks the heartbeat across the various nodes in the cluster every second
– Checks the voting disk to determine if there is a failure on any other nodes in the cluster
• Group Membership (GM)
– Provides group membership services
– All clients that perform I/O operations register with the GM; for example, LMON, DBWR, etc.
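The heartbeat check described for Node Membership can be pictured with a small sketch. This is only an illustrative model, not how CSSD is implemented: the function name `find_silent_nodes`, the timestamps and the timeout value are all hypothetical.

```python
from typing import Dict, List


def find_silent_nodes(last_heartbeat: Dict[str, float], now: float,
                      timeout_s: float) -> List[str]:
    """Return the nodes whose last heartbeat is older than timeout_s seconds."""
    return sorted(node for node, seen in last_heartbeat.items()
                  if now - seen > timeout_s)


# Hypothetical timestamps in seconds; oradb3 last reported 30 seconds ago.
last_heartbeat = {"oradb1": 99.0, "oradb2": 99.5, "oradb3": 70.0, "oradb4": 99.2}

# The 10-second timeout is an assumption chosen for the example only.
print(find_silent_nodes(last_heartbeat, now=100.0, timeout_s=10.0))
```

A real membership service would also consult the voting disk before declaring a node failed, as the second NM bullet notes.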
EVMD
• Event forwarding daemon process
• Propagates using Oracle notification service (ONS)
• Scans node callout directory and invokes callouts
• Started after CSSD is started.
• Communication bridge between CSS and CRS
CRSD
• Defines and manages resources
• Resource profile is stored in OCR
• CRS reads OCR to manage resources
• Manages application resources
– START
– STOP
– Manages failover
– Generates events during cluster state change
• Information from OCR is cached by CRS
• Communicates with RACGIMON
Logging
• $ORA_CRS_HOME/log/<node name> directory contains
– Clusterware alert log, e.g.: <nodename.log>
– crsd – log files for CRS daemons
– cssd – log files for CSS daemons
– evmd – log files for EVM daemons
– racg – log files for node applications including VIP and ONS
Clusterware log directory structure
crs
  log
    <node>
      admin
      evmd
      client
      cssd
      racg
      crsd
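As a quick aid for locating these logs, the layout above can be expressed as a small helper. This is a sketch only: the function name `clusterware_log_dirs` and the home directory `/u01/crs` are hypothetical, and the subdirectory names are the ones listed on the slide.

```python
import posixpath
from typing import Dict

# Subdirectory names as listed on the slide.
LOG_SUBDIRS = ("admin", "evmd", "client", "cssd", "racg", "crsd")


def clusterware_log_dirs(crs_home: str, node: str) -> Dict[str, str]:
    """Map each log subdirectory name to its path under <crs_home>/log/<node>."""
    base = posixpath.join(crs_home, "log", node)
    return {name: posixpath.join(base, name) for name in LOG_SUBDIRS}


# "/u01/crs" stands in for $ORA_CRS_HOME and is an assumption.
for name, path in clusterware_log_dirs("/u01/crs", "oradb4").items():
    print(f"{name:7} {path}")
```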
DEBUG
• crsctl debug statedump crs
– Output gets appended to $ORA_CRS_HOME/log/oradb4/crsd/crsd.log
• crsctl debug statedump evm
– Output gets appended to $ORA_CRS_HOME/log/oradb4/evmd/evmd.log
DEBUG CRS Modules
Module     Function / description
CRSUI      User interface module
CRSCOMM    Communication module
CRSRTI     Resource management module
CRSMAIN    Main module/driver
CRSPLACE   CRS placement module
CRSAPP     CRS application
CRSRES     CRS Resources
CRSOCR     OCR interface/engine
CRSTIMER   Various CRS related timers
CRSEVT     CRS - EVM/event interface module
DEBUG CRS Modules
crsctl debug log crs "CRSTIMER:2"
crsctl debug log crs "CRSEVT:1"
crsctl debug log crs "CRSAPP:2"
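When several modules need tracing, the command lines above can be generated rather than typed. The sketch below only builds strings in the `"MODULE:level"` form shown on the slide and does not run anything; the helper name `debug_log_commands` is hypothetical, and the module list is copied from the CRS module table.

```python
from typing import Dict, List

# CRS module names copied from the slide's module table.
CRS_MODULES = {
    "CRSUI", "CRSCOMM", "CRSRTI", "CRSMAIN", "CRSPLACE",
    "CRSAPP", "CRSRES", "CRSOCR", "CRSTIMER", "CRSEVT",
}


def debug_log_commands(levels: Dict[str, int]) -> List[str]:
    """Build one 'crsctl debug log crs' command line per module."""
    commands = []
    for module, level in levels.items():
        if module not in CRS_MODULES:
            raise ValueError(f"unknown CRS module: {module}")
        commands.append(f'crsctl debug log crs "{module}:{level}"')
    return commands


for command in debug_log_commands({"CRSTIMER": 2, "CRSEVT": 1, "CRSAPP": 2}):
    print(command)
```

Rejecting unknown module names early avoids sending a mistyped module to the cluster.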
DEBUG EVM Modules
Module Name  Function
EVMD         EVM daemon
EVMDMAIN     EVM main module
EVMCOMM      EVM communication module
EVMEVT       EVM event module
EVMAPP       EVM application module
EVMAGENT     EVM agent module
CRSOCR       OCR interface/engine
CLUCLS       EVM cluster/CSS information
EVMD Check
D:\oracle\product\10.2.0\crs\BIN>evmwatch -A -t "@timestamp @priority @name"
05-Dec-2007 22:06:48 200 sys.ora.evm.msg.user
05-Dec-2007 22:06:50 200 ora.ha.oradb5.ASM2.asm.imcheck
05-Dec-2007 22:06:50 200 ora.ha.oradb5.ASM2.asm.imup
05-Dec-2007 22:07:14 200 sys.ora.evm.msg.user
05-Dec-2007 22:07:21 200 sys.ora.evm.msg.user
05-Dec-2007 22:07:21 200 sys.ora.evm.msg.user
05-Dec-2007 22:08:15 200 ora.ha.SSKY2.SSKY2.inst.imcheck
05-Dec-2007 22:08:15 200 ora.ha.SSKY2.SSKY2.inst.imup
05-Dec-2007 22:09:26 200 ora.ha.oradb4.ASM1.asm.imcheck
05-Dec-2007 22:09:26 200 ora.ha.oradb4.ASM1.asm.imup
05-Dec-2007 22:10:17 200 ora.ha.SSKY.SSKY1.inst.imcheck
05-Dec-2007 22:10:17 200 ora.ha.SSKY.SSKY1.inst.imup
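Output in the "@timestamp @priority @name" template used above is easy to post-process. The following is a minimal sketch, assuming one event per line in exactly that layout; the `EvmEvent` type and `parse_evm_line` function are hypothetical names, and the sample lines are copied from the slide.

```python
from datetime import datetime
from typing import NamedTuple


class EvmEvent(NamedTuple):
    timestamp: datetime
    priority: int
    name: str


def parse_evm_line(line: str) -> EvmEvent:
    """Split one '@timestamp @priority @name' line into typed fields."""
    date_part, time_part, priority, name = line.split(maxsplit=3)
    # %b assumes English month abbreviations (C/English locale).
    stamp = datetime.strptime(f"{date_part} {time_part}", "%d-%b-%Y %H:%M:%S")
    return EvmEvent(stamp, int(priority), name)


sample = [
    "05-Dec-2007 22:08:15 200 ora.ha.SSKY2.SSKY2.inst.imcheck",
    "05-Dec-2007 22:08:15 200 ora.ha.SSKY2.SSKY2.inst.imup",
]
events = [parse_evm_line(line) for line in sample]
# Keep only the events whose name ends in the 'imup' action.
print([event.name for event in events if event.name.endswith(".imup")])
```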
EVMD Actions
Action          Priority  Function
Error           500       No response is received for the action sent
transition      300       The event is in a state change process. Normally the action is received when a resource or service is initially started, stopped or failing over.
Down            200       Indicates that the resource or service is currently down
running         300       Indicates that the service or resource is currently in execution state. This state is normally seen in cluster services or applications managed by the Oracle Clusterware for example ‘crs’
Up              200       Indicates that the service or resource specified is up.
Imstop          200       Indicates an HA service stop action
relocatefailed  300       Indicates an attempt to relocate a service or resource from one node to another, however such relocation attempt failed. This action normally follows other actions such as ‘imstop’ or ‘stopped’
stopped         300       Indicates that the application has completely stopped execution.
Cluster Verification
D:\oracle\product\10.2.0\crs\BIN>olsnodes -n -p -v -g -i
prlslms: Initializing LXL global
prlsndmain: Initializing CLSS context
prlsmemberlist: No of cluster members configured = 256
prlsmemberlist: Getting information for nodenum = 1
prlsmemberlist: node_name = oradb4
prlsmemberlist: ctx->lsdata->node_num = 1
prls_getnodeprivname: Retrieving the node private name for node = oradb4
prls_getnodeprivname: Private node name = oradb4-priv
prls_getnodevip: Retrieving the virtual IP for node = oradb4
prls_getnodevip: prsr_vpip_key_len = 281
prls_getnodevip: Opening the OCR key DATABASE.NODEAPPS.oradb4.VIP
prls_getnodevip: OCR key value length = 29
prls_getnodevip: Virtual IP = oradb4-vip.sumsky.net
prls_printdata: Printing the node data
oradb4 1 oradb4-priv oradb4-vip.sumsky.net
prlsmemberlist: Getting information for nodenum = 2
prlsmemberlist: node_name = oradb5
prlsmemberlist: ctx->lsdata->node_num = 2
prls_getnodeprivname: Retrieving the node private name for node = oradb5
prls_getnodeprivname: Private node name = oradb5-priv
prls_getnodevip: Retrieving the virtual IP for node = oradb5
prls_getnodevip: prsr_vpip_key_len = 281
prls_getnodevip: Opening the OCR key DATABASE.NODEAPPS.oradb5.VIP
prls_getnodevip: OCR key value length = 29
prls_getnodevip: Virtual IP = oradb5-vip.sumsky.net
prls_printdata: Printing the node data
oradb5 2 oradb5-priv oradb5-vip.sumsky.net
prlsndmain: olsnodes executed successfully
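The verbose olsnodes output mixes trace lines with the per-node summary lines. A minimal sketch for pulling out just the summaries follows; it assumes the four-column layout shown above (node name, node number, private name, virtual IP), and `ClusterNode` and `parse_olsnodes` are hypothetical names.

```python
from typing import List, NamedTuple


class ClusterNode(NamedTuple):
    name: str
    number: int
    private_name: str
    vip: str


def parse_olsnodes(output: str) -> List[ClusterNode]:
    """Keep only the summary lines: four fields with a numeric second field."""
    nodes = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[1].isdigit():
            nodes.append(ClusterNode(fields[0], int(fields[1]), fields[2], fields[3]))
    return nodes


# A few lines copied from the output on the slide.
sample = """prlsmemberlist: node_name = oradb4
prls_printdata: Printing the node data
oradb4 1 oradb4-priv oradb4-vip.sumsky.net
prls_printdata: Printing the node data
oradb5 2 oradb5-priv oradb5-vip.sumsky.net
prlsndmain: olsnodes executed successfully"""

for node in parse_olsnodes(sample):
    print(node.number, node.name, node.private_name, node.vip)
```

The numeric test on the second field is what separates a summary line from a four-word trace line such as `prlsmemberlist: node_name = oradb4`.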
OCSSD
• Spawned in init.cssd
• Exists in both vendor ClusterWare and non-vendor ClusterWare environments
• Performs inter node health monitoring
• Performs RDBMS instance endpoint discovery
OCSSD – Reboot Causes
• Network failure or latency between nodes
• Problems writing or reading from the CSS voting disk
• Lack of CPU resources
• Problem with the executables
• Mis-configuration of CRS
– Wrong network selected as private for CRS
– Placing the CSS vote file on a Netapp that’s shared over unreliable or excessively loaded network
• Killing the ‘init.cssd fatal’ process or “ocssd” process
• Unexpected failure of the OCSSD process
• Oracle bug
OPROCD
• Is a process monitor daemon that provides cluster level I/O fencing
• This process is spawned in any non-vendor ClusterWare environment (Exception: Windows)
• Replaces hangcheck timer module for Linux (post 10.2.0.4)
• Runs as root
• Locked in memory
• Failure causes reboot of system
OPROCD
• Accepts two parameters
– -t – timeout value
• OPROCD_DEFAULT_TIMEOUT
• Specifies time between executions (milliseconds)
• Defaults to 10000
– -m – margin
• OPROCD_DEFAULT_MARGIN
• Acceptable margin before reboot
• Defaults to 500
/etc/init.d/init.cssd
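One way to picture how the timeout and margin interact is a simplified timer check: the monitor expects to wake up after the timeout, and a wake-up that arrives later than timeout plus margin counts as a stall. This is an illustrative model only, not OPROCD's actual implementation; the function name `missed_deadline` is hypothetical, and the default values are the ones stated on the slide.

```python
def missed_deadline(elapsed_ms: float, timeout_ms: float = 10000,
                    margin_ms: float = 500) -> bool:
    """Return True when a wake-up arrived later than timeout plus margin.

    Defaults mirror the -t and -m values given on the slide; the check itself
    is a simplified model of a fencing timer.
    """
    return elapsed_ms > timeout_ms + margin_ms


# A wake-up 400 ms late is inside the margin; 600 ms late is not.
print(missed_deadline(10400))
print(missed_deadline(10600))
```

In this model a scheduler stall or an overloaded system shows up as a large elapsed time, which matches the reboot causes listed for OPROCD.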
OPROCD
• Current values can be obtained using crsctl
– crsctl get css reboottime
– crsctl get css diagwait
OPROCD – Reboot Causes
• OS scheduler issues
• OS locked by another process
• Excessive loads
• Oracle bug
OCLSOMON
• Used in environments with CRS and vendor clusterware
• Helps in providing more diagnostic information to vendors during node evictions by flushing more information to the log files.
• This process monitors the CSS daemon for hangs or scheduling issues and can reboot a node if there is a hang.
• Registers with the SKGXN (ClusterWare layer) and CSS.
• Lightweight process runs every second and ensures CSS is healthy
• During CSS hang, it calls local fence in init.cssd
OCLSOMON – Reboot Causes
• Reboots because CSS is hung
• When CSS is hung, or fails, clsomon will fail and call LocalFence in init.cssd
• OS scheduler issues
• Excessive amounts of load
• Oracle bug
Level  Resource  Name
1      SYSTEM    CSS
                 EVM
                 CRS
                 LANGUAGE
                 VERSION
                 ORA_CRS_HOME
                 OCR
2      DATABASE  DATABASES
                 ASM
                 NODEAPPS
                 VIP_RANGE
                 LOG
                 ONS_HOSTS
3      CRS       CUR (current)
                 HIS (history)
                 SEC (security)
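The table above reads naturally as a key hierarchy; the olsnodes trace earlier in the deck shows a dotted key of this shape, DATABASE.NODEAPPS.oradb4.VIP. The sketch below composes such keys from the table. The helper name `ocr_key` is hypothetical, and joining with dots is inferred from that one trace line rather than from a documented format.

```python
# Top-level resources and their names, copied from the table.
OCR_TREE = {
    "SYSTEM": ["CSS", "EVM", "CRS", "LANGUAGE", "VERSION", "ORA_CRS_HOME", "OCR"],
    "DATABASE": ["DATABASES", "ASM", "NODEAPPS", "VIP_RANGE", "LOG", "ONS_HOSTS"],
    "CRS": ["CUR", "HIS", "SEC"],
}


def ocr_key(resource: str, name: str, *rest: str) -> str:
    """Compose a dotted key, checking the first two levels against the table."""
    if name not in OCR_TREE.get(resource, []):
        raise KeyError(f"{resource}.{name} is not in the table")
    return ".".join((resource, name) + rest)


print(ocr_key("DATABASE", "NODEAPPS", "oradb4", "VIP"))
```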
OCR
[Diagram: each node (ORADB1 to ORADB4) runs an OCR Process with its own OCR Cache; the nodes are linked by the Cluster Interconnect and the Public Interface and share one OCR (repository). Clients shown: Oracle Database, OEM Agent, srvctl, OUI.]
Clusterware Not Starting
• Bad voting disk
• Corrupted OCR file
• Log directories full
• Oracle Bug
OCR Corruption
• Check CSSD log files
– Repeated attempts to CSS
– Not able to read OCR file
– OCR file locked by other nodes
$ORA_CRS_HOME/log/oradb3/cssd/cssd.log
[CSSD]2009-06-04 19:30:36.042 [1274124608] >TRACE: clssnmRcfgMgrThread: Local Join
[CSSD]2009-06-04 19:30:36.042 [1274124608] >WARNING:clssnmLocalJoinEvent:takeover aborted due to ALIVE node on Disk
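Scanning cssd.log for warnings like the one above can be scripted. The sketch below assumes every line follows the layout of the two sample lines; the regular expression, the `parse_cssd_line` helper and the field name `thread_id` (the bracketed number is assumed to be a thread identifier) are illustrative, not a documented log format.

```python
import re
from datetime import datetime
from typing import Dict, Optional

LINE_RE = re.compile(
    r"^\[(?P<component>\w+)\]"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<thread_id>\d+)\] "
    r">(?P<level>\w+):\s*(?P<message>.*)$"
)


def parse_cssd_line(line: str) -> Optional[Dict[str, object]]:
    """Return the fields of one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record: Dict[str, object] = match.groupdict()
    record["timestamp"] = datetime.strptime(match["timestamp"], "%Y-%m-%d %H:%M:%S.%f")
    return record


# Sample lines copied from the slide.
sample = [
    "[CSSD]2009-06-04 19:30:36.042 [1274124608] >TRACE: clssnmRcfgMgrThread: Local Join",
    "[CSSD]2009-06-04 19:30:36.042 [1274124608] >WARNING:clssnmLocalJoinEvent:takeover aborted due to ALIVE node on Disk",
]
for record in filter(None, map(parse_cssd_line, sample)):
    if record["level"] == "WARNING":
        print(record["timestamp"], record["message"])
```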
• Stop CRS
• Repair OCR file
– ocrconfig -repair ocr /dev/raw/ocr1
• Repair mirrored copy of OCR
– ocrconfig -repair ocrmirror /dev/raw/ocr2

• Stop CRS
• Restore from OCR backup
• Repair mirrored copy of OCR
[root@oradb3 bin]# ocrcheck
Status of Oracle Cluster Registry is as follows :
    Version : 2
    Total space (kbytes) : 306968
    Used space (kbytes) : 12852
    Available space (kbytes) : 294116
    ID : 658275539
    Device/File Name : /dev/raw/ocr1
        Device/File integrity check succeeded
    Device/File Name : /dev/raw/ocr2
        Device/File integrity check succeeded
    Cluster registry integrity check succeeded
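For routine monitoring, the ocrcheck report above can be reduced to a few values. The following is a minimal sketch, assuming the "label : value" layout shown in the output; `parse_ocrcheck` is a hypothetical helper and the sample text is copied from the slide.

```python
from typing import Dict


def parse_ocrcheck(output: str) -> Dict[str, object]:
    """Collect numeric fields, device names and the overall integrity result."""
    report: Dict[str, object] = {"devices": [], "integrity_ok": False}
    for raw in output.splitlines():
        line = raw.strip()
        if line == "Cluster registry integrity check succeeded":
            report["integrity_ok"] = True
        elif ":" in line:
            key, _, value = line.partition(":")
            key, value = key.strip(), value.strip()
            if key == "Device/File Name":
                report["devices"].append(value)
            elif value.isdigit():
                report[key] = int(value)
    return report


sample = """Status of Oracle Cluster Registry is as follows :
    Version : 2
    Total space (kbytes) : 306968
    Used space (kbytes) : 12852
    Available space (kbytes) : 294116
    ID : 658275539
    Device/File Name : /dev/raw/ocr1
        Device/File integrity check succeeded
    Device/File Name : /dev/raw/ocr2
        Device/File integrity check succeeded
    Cluster registry integrity check succeeded"""

report = parse_ocrcheck(sample)
used_pct = 100 * report["Used space (kbytes)"] / report["Total space (kbytes)"]
print(report["devices"], report["integrity_ok"], f"{used_pct:.1f}% used")
```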
References
• Oracle 10g RAC - Grid Services and Clustering – Murali Vallath
• Metalink Note #’s 26579.1
Q&A
Questions and Answers
Join the RAC-SIG at www.oracleracsig.org
My Other Presentations
• Session S307890
– 12-OCT-2009 17:30 Room: 236
– Looking Under the Hood of Oracle ClusterWare
• Session S309238
– 13-OCT-2009 14:30 @ Hilton / Franciscan A/B
– Understanding Oracle 11g RAC for Developers
• Session S299961 (Power Session)
– 14-OCT-2009 13:45 Room: 308
– Exploiting Oracle Tools and Utilities to Monitor and Test Oracle RAC