Oracle 11G SCAN: Concepts and Implementation Experience Sharing
Yury Velikanov
Senior Oracle DBA
Todd Carlson
Sr Manager, DBA Team
© 2009/2010 Pythian
SCAN Agenda
• Background
• Introduction
• SCAN Infrastructure Main Components
• SCAN troubleshooting
• Advanced points
• Q & A
Few words about Yury
• Oracle ACE and RAC SIG regional leader - @yvelikanov - http://www.pythian.com/news/author/velikanov/
• Started as an Oracle DBA - with 7.2 (in 1997, 14+ years ago)
• First international appearance - 2005 - Hotsos Symposium 2005
• First RAC experience - 2000 FIFA - Oracle Parallel Server
• Education (Master's degree in Computer Science) - OCP 7/8/8i/9/10 + OCM 9i/10g/11g
• Several 11gR2 RAC projects in production - including GNS implementation
Google: Oracle Yury
Blog, Twitter, LinkedIn, ACE … email, phone number
Few words about Todd
• Have been a DBA for 12 years - since 7.3.4
• Currently manage a team of
  - 7 DBAs covering Oracle, SQL Server and E-Business Suite (11.5.10.2)
  - 3 Data Warehouse Developers
• First RAC experience - 2006 – 10g R2
• Education (Master's degree in Business) - OCP 8/8i/9/10g/11g
• 11gR2 RAC in production - 2/18/2011
  - Migrated a 1.3 TB EBS database from a 10g R2 single instance on Solaris 9 to an 11g R2 3-node cluster on RHEL 5.5 via Cross-Platform Transportable Tablespaces
World Wide Technology
Industry Leading Systems Integrator Providing Technology
& Supply Chain Services to Customers Around the World
➭ Privately Held with Revenue Over $4.5 Billion
➭ Over 1,600 Employees Across the Country and Around The Globe
➭ Global Strategy in Three Key Markets:
Value Added Reseller
Data Center
Supply Chain
WWT’s SCAN Business Drivers
• Cluster-managed server-side load balancing
• Simple connection strings with failover
• Prepare for the future
• Adding nodes with SCAN is masked from the clients
• SCAN with Services completely abstracts the underlying complexities from the applications
[Diagram labels: SCAN; PRODRACDB1, PRODRACDB2, PRODRACDB3; Cluster Ready Services; Shared Storage via Automatic Storage Management; Datafiles, Redo Logs, Archive Logs, Voting Disk, Control Files; All System Log Files for Operations (alert, trace, core)]
WWT’s Services Business Drivers
• Need application activity visibility at the database level beyond what Application & Module offered
• Use Resource Manager to control workloads by Service
• Pin Services to specific nodes
• Provide specific failover options
• Tuning via Services
WWT Environment - Physical
[Diagram labels: Oracle Fire X4600 and X4200 servers; Database; F5]
• Storage Tier: EMC DMX 3500, RAID 1+0, 73 GB fiber drives
• DB Tier: Oracle X4270 - 96 GB RAM, 2 x Intel Xeon X5570 Quad-Core 2.93GHz
• EBS Tier: Oracle X4170s - 24 GB RAM, 2 x Intel Xeon X5570 Quad-Core 2.93GHz
• Interconnect: Oracle Datacenter InfiniBand Switch
WWT Environment - Database
• 4 production databases run on the same cluster
• Each runs as a separate OS user with non-shared homes
• EBS database supports 11.5.10.2 & over 25 custom apps
  • Avg ~2000 concurrent users
  • 1.4 TB
  • Avg 220+ DML transactions per second (tps), peaking at 2500 tps
• WWT applications running from
  • WebLogic
  • WebMethods
  • APEX
  • Shell
  • Misc. platforms
• Each database has up to 7 services
Service            Application
PRODERP_IBI        Reporting via IBI
PRODERP_BI         BI Processing
PRODERP_WEBM       Middleware
PRODERP_WWT_B2B    OS File Processing
PRODERP_10g        Reporting via 10g
PRODERP_GENERAL    Developers & DBAs
PRODERP_APEX       APEX
WWT SCAN Results
• With SCAN, the users don’t even know what database, much less what server, their Service is running on.
• 7/23/11 – Node 3 crashed due to a PCI driver issue
  • All applications reconnected to Nodes 1 & 2 automatically
  • Load was evenly distributed
  • Only got 1 ticket, because the offending system had a hardcoded connection
  • Priceless
• Mission-critical batch load failed over from Node 2 to Node 1
  • Failed the batch load over between nodes seamlessly for troubleshooting
  • The batch load completed successfully
Single [Client Access] Name
scan.clustgrid-prod.yourdomain.com
+ service
There are two pieces of SCAN-related news
• Good
  • SCAN is based on known components you have worked with for years
• Other news
  • SCAN uses those components in a different way
RAC: Frequently Asked Questions [ID 220970.1]
How to Troubleshoot Connectivity Issue with 11gR2 SCAN Name [ID 975457.1]
11gR2 Grid Infrastructure Single Client Access Name (SCAN) Explained [ID 887522.1]
SCAN & EBS 11i [ID 823581.1 ] R12 [823587.1]
SCAN Introduction
• Single Client Access Name
• Addresses the TNSNAMES multi-address issue
• Old - 10g FAILOVER
  • Complex TNS entries
  • Complex to manage (add a node)
  • Previous Oracle Clients support
• New - 11gR2
  • One simple TNS entry on the client side
  • Easy to add nodes
  • Transparent to Oracle Client versions (!DNS issue!)
  • No static listener.ora file
SCAN and PREV tnsnames.ora
PROD_HR.yourdomain.com =
(DESCRIPTION =
(ADDRESS_LIST =
(FAILOVER=on)
(LOAD_BALANCE=TRUE)
(ADDRESS = (PROTOCOL = TCP)(HOST = vip.node1)(PORT = 1523))
(ADDRESS = (PROTOCOL = TCP)(HOST = vip.node2)(PORT = 1523))
(ADDRESS= (PROTOCOL = TCP)(HOST = vip.node3)(PORT = 1523))
)
(CONNECT_DATA = (SERVICE_NAME = HR) )
)
scan.clustgrid-prod.yourdomain.com:1523/HR
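The contrast between the two forms can be sketched in a few lines of Python. This is only an illustration: the helper names `multi_address_descriptor` and `scan_easy_connect` are made up for this example, and the host names, port and service are the ones from the slide. The point is that the old entry grows with every node, while the SCAN string does not change.

```python
def multi_address_descriptor(alias, vip_hosts, port, service):
    """Build a pre-SCAN tnsnames.ora entry that lists every node VIP."""
    addresses = "\n".join(
        f"      (ADDRESS = (PROTOCOL = TCP)(HOST = {host})(PORT = {port}))"
        for host in vip_hosts
    )
    return (
        f"{alias} =\n"
        "  (DESCRIPTION =\n"
        "    (ADDRESS_LIST =\n"
        "      (FAILOVER = on)\n"
        "      (LOAD_BALANCE = TRUE)\n"
        f"{addresses}\n"
        "    )\n"
        f"    (CONNECT_DATA = (SERVICE_NAME = {service}))\n"
        "  )"
    )


def scan_easy_connect(scan_name, port, service):
    """Build the SCAN form: one name, independent of the node count."""
    return f"{scan_name}:{port}/{service}"


if __name__ == "__main__":
    vips = ["vip.node1", "vip.node2", "vip.node3"]
    # Adding a fourth node means editing this entry on every client ...
    print(multi_address_descriptor("PROD_HR.yourdomain.com", vips, 1523, "HR"))
    # ... while the SCAN connect string stays the same.
    print(scan_easy_connect("scan.clustgrid-prod.yourdomain.com", 1523, "HR"))
```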
Ora*Net: Easy Connect
PROD_HR.yourdomain.com =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = scan.clustgrid-prod)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = HR)
)
)
scan.clustgrid-prod.yourdomain.com
scan.clustgrid-prod.yourdomain.com:1521
scan.clustgrid-prod.yourdomain.com:1521/HR
scan.clustgrid-prod.yourdomain.com:1521/HR:dedicated/ERP1
Oracle® Database Net Services Administrator's Guide
11g Release 2 (11.2)
Part Number E10836-06
http://download.oracle.com/docs/cd/E11882_01/network.112/e10836/naming.htm#BABJBFHJ
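The four Easy Connect variants above follow the pattern host[:port][/service_name][:server][/instance_name]. A minimal Python sketch of splitting such a string into its parts is shown here, for checking connect strings in scripts. The function name `parse_easy_connect` is hypothetical, the regular expression covers only the forms shown on the slide (no IPv6 literals), and the fallback to port 1521 assumes the Oracle Net default.

```python
import re

# host[:port][/service_name][:server][/instance_name], with an optional leading //
EZCONNECT = re.compile(
    r"^(?://)?"
    r"(?P<host>[^:/]+)"
    r"(?::(?P<port>\d+))?"
    r"(?:/(?P<service>[^:/]*))?"
    r"(?::(?P<server>[^/]+))?"
    r"(?:/(?P<instance>.+))?$"
)


def parse_easy_connect(text):
    """Split an Easy Connect string into host, port, service, server, instance."""
    match = EZCONNECT.match(text.strip())
    if match is None:
        raise ValueError(f"not an Easy Connect string: {text!r}")
    parts = match.groupdict()
    # Assumption: fall back to the default listener port when none is given.
    parts["port"] = int(parts["port"]) if parts["port"] else 1521
    return parts


if __name__ == "__main__":
    for example in (
        "scan.clustgrid-prod.yourdomain.com",
        "scan.clustgrid-prod.yourdomain.com:1521",
        "scan.clustgrid-prod.yourdomain.com:1521/HR",
        "scan.clustgrid-prod.yourdomain.com:1521/HR:dedicated/ERP1",
    ):
        print(parse_easy_connect(example))
```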
SCAN Infrastructure Main Components
• Single Client Access Name + Oracle Services (Definitions)
• DNS – resolving SCAN to 3 IP addresses (Round Robin)
  • Primary / Secondary
  • NameServer configuration (client side)
• SCAN Listeners
  • Keep records of available Local Listeners and the Services they serve
  • Forward connections to the least loaded Local Listener
• Local (VIP) Listeners
  • Create foreground processes
  • Manage sockets
• RAC (SCAN / VIP / Interconnect) IP addresses
• Grid Naming Service
  • Registers and resolves RAC IP addresses
• DHCP
  • Dynamically assigns IP addresses
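The division of labour between SCAN listeners and local listeners can be pictured with a toy model. The Python sketch below is not Oracle's algorithm: the classes `LocalListener` and `ScanListener` and the "load" counter are invented for illustration, and the real load metrics and redirect protocol are internal to Oracle Net. It only shows the idea that the SCAN listener keeps a registry of local listeners and the services they offer, and hands each request to the least loaded one that serves the requested service.

```python
from dataclasses import dataclass


@dataclass
class LocalListener:
    name: str
    services: set
    load: int = 0  # toy metric: connections handed to this listener so far


class ScanListener:
    """Toy registry: knows local listeners and the services they offer."""

    def __init__(self):
        self.registry = {}

    def register(self, listener):
        # In a real cluster the instances register with the listeners;
        # here registration is an explicit call.
        self.registry[listener.name] = listener

    def route(self, service):
        """Pick the least loaded local listener that serves the service."""
        candidates = [l for l in self.registry.values() if service in l.services]
        if not candidates:
            raise LookupError(f"no local listener offers service {service!r}")
        target = min(candidates, key=lambda l: (l.load, l.name))
        target.load += 1  # the local listener would now create the session
        return target.name


if __name__ == "__main__":
    scan = ScanListener()
    scan.register(LocalListener("node1-vip", {"HR"}))
    scan.register(LocalListener("node2-vip", {"HR", "BI"}))
    for _ in range(4):
        print("HR ->", scan.route("HR"))
    print("BI ->", scan.route("BI"))
```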
SCAN troubleshooting
• Service Names
  • DO NOT MODIFY init.ora:service_names
  • USE srvctl to configure and manage services
srvctl config service -d <DB Name>
… Service name: DEVERP_APEX.GGT.COM
Service is enabled
Failover type: NONE
Preferred instances: DEVERP1
Available instances: DEVERP1,DEVERP2,DEVERP3,DEVERP4,DEVERP5,DEVERP6
…
SQL> show parameter service_name
NAME TYPE VALUE
-------------------- ----------- --------------------------------------------------
service_names string DEVERP_CDC.GGT.COM, SYS$APPLSYS.WF_CONTROL.DEVERP.
WORLD, SYS$STREAMS_ADMIN.CDC$Q_ERP.DEVERP.WORLD, D
EVERP_WEBM.GGT.COM, DEVERP_WWT_B2B.GGT.COM, DEVERP
_RFUI.GGT.COM, DEVERP_IBI.GGT.COM, DEVERP_GENERAL.
WWT.COM, DEVERP_BI.GGT.COM, DEVERP_APEX.GGT.COM, D
EVERP_10g, DEVERP1, DEVERP
SQL>
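SQL*Plus wraps the VALUE column at a fixed width, so service names in the output above are split in the middle ("D" / "EVERP_WEBM.GGT.COM"). A small Python sketch for putting them back together is given here, for comparing the list with what srvctl reports. The function name `unwrap_service_names` is hypothetical, and the fragments are copied from the slide.

```python
def unwrap_service_names(value_fragments):
    """Join wrapped VALUE fragments and split the result into service names."""
    # Fragments are glued back without a separator because SQL*Plus
    # breaks the value at a fixed column, even inside a name.
    joined = "".join(fragment.strip() for fragment in value_fragments)
    return [name.strip() for name in joined.split(",") if name.strip()]


if __name__ == "__main__":
    fragments = [
        "DEVERP_CDC.GGT.COM, SYS$APPLSYS.WF_CONTROL.DEVERP.",
        "WORLD, SYS$STREAMS_ADMIN.CDC$Q_ERP.DEVERP.WORLD, D",
        "EVERP_WEBM.GGT.COM, DEVERP_WWT_B2B.GGT.COM, DEVERP",
        "_RFUI.GGT.COM, DEVERP_IBI.GGT.COM, DEVERP_GENERAL.",
        "WWT.COM, DEVERP_BI.GGT.COM, DEVERP_APEX.GGT.COM, D",
        "EVERP_10g, DEVERP1, DEVERP",
    ]
    for name in unwrap_service_names(fragments):
        print(name)
```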
SCAN troubleshooting
• Oracle Listeners
  • Run under the grid OS user
    • Don’t start them as the ORACLE user (DB OH)
    • If you do, you end up with a mess
  • Manage (start/stop) with srvctl
    • Be careful with manual start/stop (TNS_ADMIN)
  • listener.ora is a dynamic configuration file by default
    • [All] parameters are managed by the cluster
    • Use srvctl to configure
  • Make sure listeners listen on the corresponding IPs
>lsnrctl status LISTENER_SCAN2
…
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN2)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.2.9.122)(PORT=1523)))
…
• LISTENER_SCAN1/2/3 on SCAN IPs
• LISTENER on VIP and Public IPs
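The "listening on the corresponding IPs" check can be scripted by pulling the TCP endpoints out of saved lsnrctl status output. The Python sketch below assumes the output has been captured as text. The names `tcp_endpoints` and `unexpected_endpoints` are hypothetical, and the set of expected addresses in the demo is an assumption (the slide shows only the one listening address, not the cluster's SCAN IP list).

```python
import re

# Matches address fragments such as (PROTOCOL=tcp)(HOST=10.2.9.122)(PORT=1523)
ENDPOINT = re.compile(
    r"\(PROTOCOL=tcp\)\(HOST=([^)]+)\)\(PORT=(\d+)\)", re.IGNORECASE
)


def tcp_endpoints(lsnrctl_output):
    """Return (host, port) pairs for every TCP endpoint in the output."""
    return [(host, int(port)) for host, port in ENDPOINT.findall(lsnrctl_output)]


def unexpected_endpoints(lsnrctl_output, expected_hosts):
    """Return endpoints whose host is not in the expected set."""
    return [
        (host, port)
        for host, port in tcp_endpoints(lsnrctl_output)
        if host not in expected_hosts
    ]


if __name__ == "__main__":
    sample = """
    (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN2)))
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.2.9.122)(PORT=1523)))
    """
    print(tcp_endpoints(sample))
    # Assumption: 10.2.9.122 is one of this cluster's SCAN IPs.
    print(unexpected_endpoints(sample, {"10.2.9.122"}))
```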
SCAN troubleshooting
• init.ora:local_listener
  • It is a good OLD parameter
  • The same rules apply
  • Specify the LOCAL listener only!
    • Can't stress this enough !!! NO SCAN !!!
  • You can use a TNS address directly or a TNS alias
    • !!! If it can't be resolved, the instance won't start !!!
SQL> show parameter local_listener
NAME TYPE VALUE
-------------------- ----------- --------------------------------------------------
local_listener string (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)
(HOST=devracdb1-vip)(PORT=1534))(ADDRESS=(PROTOCOL=TCP)
(HOST=devracdb1-vip)(PORT=1521))))
SQL> show parameter local_listener
NAME TYPE VALUE
-------------------- ----------- --------------------------------------------------
local_listener string devracdb1-vip
tnsping devracdb1-vip
SCAN troubleshooting
• init.ora:remote_listener
SQL> show parameter remote_listener
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
remote_listener string scan.clustgrid-prod.yourdomain.com
SQL>
• The same management principles apply
• Make SURE it points to SCAN IP addresses only
  • Can't stress this enough !!! NO VIP !!!
• Any valid TNS config is acceptable
  • tnsnames alias
  • sqlnet.ora
    • NAMES.DIRECTORY_PATH=(TNSNAMES, EZCONNECT)
  • Use SCAN or IPs (for static SCAN conf only)
!!! If it can't be resolved, the instance won't start !!!
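The two rules (local_listener: no SCAN; remote_listener: no VIP) lend themselves to a quick sanity check on the parameter values. The Python sketch below compares host names only. It does not resolve TNS aliases or DNS, so an alias or a raw IP address would need to be translated first. The function names `hosts_in` and `check_listener_params` are hypothetical, and the sample values are the ones shown on the slides.

```python
import re

# HOST=... entries inside a TNS address descriptor
HOST = re.compile(r"\(HOST\s*=\s*([^)\s]+)\s*\)", re.IGNORECASE)


def hosts_in(value):
    """Hosts named in a listener parameter: HOST= entries, or bare host[:port]."""
    found = HOST.findall(value)
    if found:
        return [host.lower() for host in found]
    return [
        part.split(":")[0].strip().lower()
        for part in value.split(",")
        if part.strip()
    ]


def check_listener_params(local_listener, remote_listener, scan_name, vip_hosts):
    """Return a list of rule violations (an empty list means both rules hold)."""
    scan = scan_name.lower()
    vips = {vip.lower() for vip in vip_hosts}
    problems = []
    for host in hosts_in(local_listener):
        if host == scan:
            problems.append(f"local_listener points at the SCAN name: {host}")
    for host in hosts_in(remote_listener):
        if host in vips:
            problems.append(f"remote_listener points at a VIP: {host}")
    return problems


if __name__ == "__main__":
    local = (
        "(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)"
        "(HOST=devracdb1-vip)(PORT=1534))(ADDRESS=(PROTOCOL=TCP)"
        "(HOST=devracdb1-vip)(PORT=1521))))"
    )
    remote = "scan.clustgrid-prod.yourdomain.com"
    print(check_listener_params(local, remote, remote, ["devracdb1-vip"]))
```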
SCAN troubleshooting
• DNS
  • dig (Linux OS command)
  • nslookup <scan> (run several times)
  • Check primary and secondary name servers
• Make 200% sure
  • SCAN doesn’t contain VIPs
  • VIPs don’t contain SCAN IPs
[oracle@host01 admin]$ dig scan.clustgrid-prod.yourdomain.com
; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5 <<>> scan.clustgrid-prod.yourdomain.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15137
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 1, ADDITIONAL: 0
;; QUESTION SECTION:
;scan.clustgrid-prod.yourdomain.com. IN A
;; ANSWER SECTION:
scan.clustgrid-prod.yourdomain.com. 58 IN A 172.30.193.218
scan.clustgrid-prod.yourdomain.com. 58 IN A 172.30.193.216
scan.clustgrid-prod.yourdomain.com. 58 IN A 172.30.193.217
;; AUTHORITY SECTION:
clustgrid-prod.yourdomain.com. 86400 IN NS gns.clustgrid-prod.yourdomain.com.
;; Query time: 0 msec
;; SERVER: 172.30.192.82#53(172.30.192.82)
;; WHEN: Wed Jun 15 18:21:03 2011
;; MSG SIZE rcvd: 114
[oracle@host01 admin]$
[oracle@host01 admin]$ nslookup scan.clustgrid-prod.yourdomain.com | grep Address | tail -1
Address: 172.30.193.218
[oracle@host01 admin]$ nslookup scan.clustgrid-prod.yourdomain.com | grep Address | tail -1
Address: 172.30.193.216
[oracle@host01 admin]$ nslookup scan.clustgrid-prod.yourdomain.com | grep Address | tail -1
Address: 172.30.193.218
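The same checks (SCAN resolves to three addresses, and SCAN and VIP addresses never overlap) can be scripted. The Python sketch below is one way to do it with the standard library. The function names `resolve_all` and `check_scan_dns` are hypothetical, the VIP names in the demo are placeholders, and `socket.gethostbyname_ex` asks the system resolver, so it does not test primary and secondary name servers separately the way dig against each server would.

```python
import socket


def resolve_all(name, resolver=socket.gethostbyname_ex):
    """Return the set of IPv4 addresses the resolver reports for a name."""
    _hostname, _aliases, addresses = resolver(name)
    return set(addresses)


def check_scan_dns(scan_name, vip_names, resolver=socket.gethostbyname_ex, expected=3):
    """Return a list of problems found in the SCAN / VIP DNS records."""
    problems = []
    scan_ips = resolve_all(scan_name, resolver)
    if len(scan_ips) != expected:
        problems.append(
            f"{scan_name} resolves to {len(scan_ips)} address(es), expected {expected}"
        )
    for vip in vip_names:
        overlap = scan_ips & resolve_all(vip, resolver)
        if overlap:
            problems.append(f"{vip} shares address(es) with SCAN: {sorted(overlap)}")
    return problems


if __name__ == "__main__":
    # Placeholder names: replace with the SCAN and VIP names of your cluster.
    try:
        print(check_scan_dns("scan.clustgrid-prod.yourdomain.com",
                             ["node1-vip.yourdomain.com"]))
    except socket.gaierror as error:
        print("lookup failed:", error)
```

Passing the resolver as a parameter keeps the check testable without a live DNS server: a function returning canned records can stand in for `socket.gethostbyname_ex`.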
GNS Advanced points
• SCAN + GNS implementation
  • Most probably you do not need it
  • Makes the configuration 100% dynamic
  • Unlimited number of nodes with a simple Oracle Client configuration
  • Oracle retrieves new IPs from DHCP for SCAN / VIP / [ Interconnect ] components at startup time
  • The only static RAC IP is the GNS IP
GNS Advanced points
• Additional components
  • Grid Naming Service
  • DNS and GNS integration (SCAN/VIP)
  • Dedicated DHCP service
    • Separate network segment
    • DHCP redundancy could be an issue
• RAC and DHCP integration
  • Make DHCP assign the same IPs (or range) each time per RAC process (Joseph Griffiths)
    • http://blog.jgriffiths.org/?p=24
    • DHCPDISCOVER from 00:00:00:00:00:00 via eth0
• Many things could go wrong !!!
  • GNS troubleshooting – see my blog
Why Companies Trust Pythian
• Recognized Leader:
  • Global industry leader in remote database administration services and consulting for Oracle, Oracle Applications, MySQL and SQL Server
  • Work with over 150 multinational companies such as Forbes.com, Fox Sports, Nordion and Western Union to help manage their complex IT deployments
• Expertise:
  • One of the world’s largest concentrations of dedicated, full-time DBA expertise. Employ 6 Oracle ACEs/ACE Directors.
  • Hold 7 Specializations under the Oracle Platinum Partner program, including Oracle Exadata, Oracle GoldenGate & Oracle RAC.
• Global Reach & Scalability:
  • 24/7/365 global remote support for DBA and consulting, systems administration, special projects or emergency response
Additional Resources
• www.oracle.com/scan
• www.pythian.com/exadata
• www.pythian.com/news/tag/exadata - Exadata Blog
• www.pythian.com/news_and_events/in_the_news
  Article: “Making the Most of Oracle Exadata”
• My Oracle Support notes 888828.1 and 757552.1
Thank you!