KSCOPE 2013: Exadata Consolidation Success Story

Post on 24-Apr-2015


Exadata Consolidation Success Story

Getting the kids to play nice with each other…


Presented by: Karl Arao

whoami

Karl Arao
• Senior Technical Consultant @ Enkitec
• Performance and Capacity Planning Enthusiast

7 years DBA experience
Oracle ACE, OCP-DBA, RHCE, OakTable
Blog: karlarao.wordpress.com
Wiki: karlarao.tiddlyspot.com
Twitter: @karlarao

www.enkitec.com 2


100+


Agenda

• Architecture
• Tools and Methodology
  – Simple consolidation scenario
  – Provisioning workflow and the worksheet
• War Stories


General Architecture

[Diagram labels: Primary Site, Standby Site; Production, Test & Dev, Disaster Recovery, Future Growth]


The Stats

Three Half Rack Exadata clusters with High Cap. drives

Cluster #1
• 36 Dev/Test Databases

Cluster #2
• 11 Production Databases

Cluster #3
• 13 Dev/Test Databases
• 6 Standby Databases

Still more databases to come…


Why Consolidate?

Primary drivers for consolidation center around cost savings
• Reduces Oracle software licensing
• 3rd party products such as backup agents, ETL tools, etc…
• More efficient use of system resources
• Soft Costs
  – Floor space
  – Power & Cooling
  – Administration, Staffing Costs (training, etc.)


7 Databases

A Simple Consolidation Example

Let’s say we have the following databases to migrate on Exadata:

For example, the first row should read… Database ‘A’ requires 4 CPUs and will run on nodes 1 and 2 (2 CPUs each)

Cluster Level Utilization


Per compute node Utilization


Cluster Level Utilization = 29.2%

Per compute node Utilization: 25% 42% 33% 17%
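The arithmetic behind these numbers can be sketched as follows. This is a minimal illustration, not the worksheet itself: only database 'A' (4 CPUs over nodes 1 and 2) comes from the slides. The other six databases, the even split of CPUs across a database's nodes, and the figure of 24 CPUs per compute node are assumptions chosen so that the result reproduces the percentages shown.

```python
from collections import defaultdict

CPUS_PER_NODE = 24    # assumption: picked so the slide's percentages work out
NODES = [1, 2, 3, 4]  # a half rack has 4 compute nodes

# (database, CPUs required, nodes it runs on).
# Only row 'A' is from the slide; rows B..G are hypothetical.
DATABASES = [
    ("A", 4, [1, 2]),
    ("B", 8, [1, 2]),
    ("C", 4, [2]),
    ("D", 4, [3]),
    ("E", 4, [3, 4]),
    ("F", 2, [3]),
    ("G", 2, [4]),
]


def node_requirements(databases):
    """Spread each database's CPU requirement evenly over its nodes."""
    required = defaultdict(float)
    for _name, cpus, nodes in databases:
        for node in nodes:
            required[node] += cpus / len(nodes)
    return required


def utilization(databases, nodes=NODES, cpus_per_node=CPUS_PER_NODE):
    """Return (cluster utilization, {node: utilization}) as fractions."""
    required = node_requirements(databases)
    per_node = {n: required[n] / cpus_per_node for n in nodes}
    cluster = sum(required[n] for n in nodes) / (cpus_per_node * len(nodes))
    return cluster, per_node


if __name__ == "__main__":
    cluster, per_node = utilization(DATABASES)
    print(f"Cluster level utilization: {cluster:.1%}")
    for node, util in per_node.items():
        print(f"  node {node}: {util:.0%}")
```

Changing only the node lists moves load between nodes while the cluster-level figure stays the same, which is why the per-node view matters.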


Cluster Level Utilization = 29.2%

Per compute node Utilization: 8% 83% 17% 8%


Tools And Methodology

• Gather Utilization Metrics (usage history)
• Create Provisioning Plan
• Implement Plan
• Audit Your Implementation

Provisioning Worksheet

• Capacity Planning

• Communication Tool

• Hand off


**Supplement to existing Exadata installation tools:
• Site planning checklist
• Configuration Worksheet
• Exadata Configurator sheet
• CheckIP
• OneCommand

Utilization = Requirements / Capacity

Capacity

2 = quarter rack
4 = half rack
8 = full rack

SPECint_rate2006
http://goo.gl/doBI5

CPU_COUNT, threads, & cores
http://goo.gl/CunHN

96 to 144GB (frequency of the memory DIMMs drops to 800 MHz from 1333 MHz)

Space will also depend on:
• ASM redundancy
• DATA/RECO allocation
http://goo.gl/I3fjn

Query Low (4x)
Query High (6x)
Archive Low (7x)
Archive High (12x)

Smart Scans!
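As a rough sketch of how ASM redundancy, the DATA/RECO split, and the compression ratios above feed a space estimate: the function names, the raw capacity, and the DATA/RECO fraction below are hypothetical. The ratios are the nominal ones listed on the slide, and the mirror counts (2 copies for normal redundancy, 3 for high) are standard ASM behavior. Space reserved for rebalancing after a failure is ignored here.

```python
# Nominal compression ratios as listed on the slide.
HCC_RATIOS = {
    "query_low": 4,
    "query_high": 6,
    "archive_low": 7,
    "archive_high": 12,
}

# Number of copies ASM keeps of each extent.
ASM_MIRROR_COPIES = {"normal": 2, "high": 3}


def usable_space_tb(raw_tb, redundancy, data_fraction):
    """Split usable space into (DATA, RECO) after ASM mirroring.

    raw_tb and data_fraction are inputs you supply; nothing here is
    taken from an actual Exadata datasheet.
    """
    usable = raw_tb / ASM_MIRROR_COPIES[redundancy]
    return usable * data_fraction, usable * (1 - data_fraction)


def compressed_size_tb(uncompressed_tb, hcc_level):
    """Estimate the size of a segment after compression at the given level."""
    return uncompressed_tb / HCC_RATIOS[hcc_level]


if __name__ == "__main__":
    # Hypothetical example: 120 TB raw, normal redundancy, 75% to DATA.
    data_tb, reco_tb = usable_space_tb(120, "normal", 0.75)
    print(f"DATA: {data_tb} TB, RECO: {reco_tb} TB")
    print(f"12 TB at Query High: {compressed_size_tb(12, 'query_high')} TB")
```

Actual compression ratios vary with the data, so the slide's multipliers are best treated as planning placeholders until measured.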

CPU Core Comparison

Source: [host details not captured in transcript]
Destination: Sun Fire X4170 M2 X5670@2.93GHz

chip efficiency factor = source SPEC rating / Exadata SPEC rating
                       = 16 / 26
                       = .6154

EXA cores requirement  = source host cores * utilization * chip efficiency factor
                       = 32 * .7 * .6154
                       = 13.78

* offload factor       = 13.78 * .5
                       = 6.89

Where:
• utilization: how much of the source CPU cores are being used
• chip efficiency factor: multiplier for equivalent database machine cores
• offload factor: amount of CPU resources that will be offloaded to the storage cells
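The same calculation as a small function, using the numbers from this slide. The function and parameter names are illustrative only. The SPEC ratings are taken as given (the slide does not say whether they are per-core figures), and the offload factor of .5 is the slide's example value rather than a general constant.

```python
def exadata_cores_required(source_cores, utilization, source_spec,
                           exadata_spec, offload_factor=1.0):
    """Return (chip efficiency factor, cores before offload, cores after offload).

    utilization    -- fraction of the source CPU cores actually in use
    source_spec    -- SPEC rating of the source host
    exadata_spec   -- SPEC rating of the Exadata compute node
    offload_factor -- multiplier for the CPU work that stays on the
                      database nodes after offloading to the storage cells
    """
    chip_efficiency = source_spec / exadata_spec
    before_offload = source_cores * utilization * chip_efficiency
    after_offload = before_offload * offload_factor
    return chip_efficiency, before_offload, after_offload


if __name__ == "__main__":
    factor, before, after = exadata_cores_required(
        source_cores=32, utilization=0.7,
        source_spec=16, exadata_spec=26,
        offload_factor=0.5,
    )
    print(f"chip efficiency factor: {factor:.4f}")
    print(f"EXA cores requirement:  {before:.2f}")
    print(f"after offload:          {after:.2f}")
```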

The Perfect Storm (PeopleSoft HR)

  Month-end Processing
+ Weekly Time Entry
+ SQL Plan Change
------------------------------------
  Uh-oh!

CPU Allocation

DB Uniq Name  DB Name   node1         node2         node3         node4
                        4 instances   5 instances   4 instances   3 instances
                        47% cpu used  75% cpu used  47% cpu used  18% cpu used
                        49% mem used  66% mem used  71% mem used  54% mem used
BIPRDDAL      biprd     -             P             P             -
DBFSPRD       DBFSPRD   P             P             P             P
HCMPRDDAL     hcmprd    P             P             -             -
MTAPRD11DAL   mtaprd11  -             -             P             P
PAPRDDAL      paprd     P             P             -             -
RMPRDDAL      rmprd     P             P             -             -
dbm           dbm       F             F             F             F
Fsprddal      fsprd     -             -             P             P

P = Preferred, F = Failover
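A quick way to audit a layout like this is to count preferred instances per node from the placement map. The sketch below is illustrative: the data structure and function names are made up, and the BIPRD placement on nodes 2 and 3 is inferred from the per-node instance counts on the slide rather than read directly from it.

```python
# Placement per node (node1..node4): "P" = preferred, "F" = failover, "" = none.
# BIPRDDAL on nodes 2 and 3 is inferred from the slide's instance counts.
PLACEMENT = {
    "BIPRDDAL":    ["",  "P", "P", ""],
    "DBFSPRD":     ["P", "P", "P", "P"],
    "HCMPRDDAL":   ["P", "P", "",  ""],
    "MTAPRD11DAL": ["",  "",  "P", "P"],
    "PAPRDDAL":    ["P", "P", "",  ""],
    "RMPRDDAL":    ["P", "P", "",  ""],
    "dbm":         ["F", "F", "F", "F"],
    "Fsprddal":    ["",  "",  "P", "P"],
}

# CPU used per node, as shown on the slide.
CPU_USED_PCT = [47, 75, 47, 18]


def instances_per_node(placement, role="P"):
    """Count databases that have the given role on each node."""
    node_count = len(next(iter(placement.values())))
    return [
        sum(1 for nodes in placement.values() if nodes[i] == role)
        for i in range(node_count)
    ]


def busiest_node(cpu_used_pct):
    """Return the 1-based number of the node with the highest CPU usage."""
    return cpu_used_pct.index(max(cpu_used_pct)) + 1


if __name__ == "__main__":
    print("preferred instances per node:", instances_per_node(PLACEMENT))
    print("busiest node:", busiest_node(CPU_USED_PCT))
```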


Load Map (our first stop…)

User complaint: HR time entry and OBIEE reports painfully slow…

Top Activity - HCMPRD


Instance Activity – HCMPRD2 (Node 2)

Problem: A single SQL statement overwhelming CPU resources.
• HCMPRD caged at 12 CPUs
• SQL Profile installed to lock in good plan

Memory Exhaustion (OBIEE)

“1 Report = 1 SQL query, right?”

WRONG!


Overlapping workloads of three databases across 3 nodes: BIPRD, HCMPRD, and MTAPRD

[Figure panels: Node 1, Node 2, Node 3, Node 4]

Node Layout Revisited…


Notice what happens to CPU waits and the system load average when this report is run.


PGA Memory Spikes

Storage Cell Saturation (OBIEE)

I/O Intensive Workload


Smart Scans as seen in Grid Control


25 Sessions Doing Smart Scans

…as seen in gv$sql


Smart Scan in Action. The cells are scanning 1T but only returning 144G…

***That’s on each of the highlighted row sources below…

The databases on other nodes see the contention as “System I/O”.

Without I/O resource management, even critical processes are affected (CKPT, LGWR, …)

Inter-database IORM Plan (only kicks in when needed)

*Side Benefit (automatic when IORM is enabled)

I/O requests from critical processes like CKPT, LGWR, LMON get priority automatically. Without IORM, I/O requests from these important processes receive the same priority as any other process.

IORM Plan Definition (on each storage cell)
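The plan shown on the slide is not captured in this transcript. As a hedged sketch of what an inter-database plan looks like, the helper below builds a CellCLI `ALTER IORMPLAN dbplan=(...)` command string from a list of directives. The helper name, the database choices, and every level and allocation value are hypothetical and not taken from the presentation. The one rule it enforces is that allocations at a given level cannot exceed 100%.

```python
def build_iorm_dbplan(directives):
    """Build a CellCLI ALTER IORMPLAN command for an inter-database plan.

    directives -- list of (db_name, level, allocation_percent) tuples.
    Raises ValueError if any level is allocated more than 100%.
    """
    totals = {}
    for _name, level, allocation in directives:
        totals[level] = totals.get(level, 0) + allocation
    for level, total in sorted(totals.items()):
        if total > 100:
            raise ValueError(f"level {level} allocates {total}% (max 100%)")

    parts = ", ".join(
        f"(name={name}, level={level}, allocation={allocation})"
        for name, level, allocation in directives
    )
    return f"ALTER IORMPLAN dbplan=({parts})"


if __name__ == "__main__":
    # Hypothetical values: favor HR time entry over reporting.
    plan = build_iorm_dbplan([
        ("hcmprd", 1, 60),
        ("biprd", 2, 80),
        ("other", 3, 100),  # catch-all for databases not named above
    ])
    print(plan)
```

The resulting command would need to be run on every storage cell, since each cell holds its own copy of the plan.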

Wrap up!

Provisioning Methodology & Tools
– Workflow
– Provisioning Spreadsheet

Success Stories
– CPU resource management
– Tuning and provisioning adjustments
– I/O resource management

Fastest Growing Companies in Dallas

Contact Info…