All Change

All Change: The Story of a Migration Dr. Jason Arneil

description

A talk about migrating an Oracle database from 10.1 on Solaris to 10.2 on Linux

Transcript of All Change

Page 1: All Change

All Change: The Story of a Migration

Dr. Jason Arneil

Page 2: All Change

AGENDA

• Introduction

• The Installation

• The Migration

• Conclusion

Page 3: All Change

Introduction

About Me

• Jason Arneil

• Oracle DBA/System & Storage Administrator

• Using Oracle since 1998

• At Nominet since 2001

Page 4: All Change

Introduction

About Nominet

• Nominet is the internet registry for .uk domain names

• Nominet has just celebrated its 11th anniversary

• Nominet is run as a not-for-profit company

• Nominet is owned by its members

• There are almost 6 million .uk domain names

Page 5: All Change

Introduction

Hardware/Software Stack

Oracle 10.1

Veritas

Powerpath

Solaris 8

EMC

Page 6: All Change

Introduction

Motivation

• Replace ageing SPARC hardware

• New 4-way Sun SPARC servers ~ £50K EACH

• Equivalent 4-way AMD Opteron cost £20K in TOTAL

• Reduce complexity in the software stack

• Ability to store accented characters

Page 7: All Change

Introduction

Changes

10.1.0.3 -> 10.2.0.2

Solaris 8 -> RedHat AS 4

VxFS -> ASM

US7ASCII -> UTF8

SPARC -> AMD

Page 8: All Change

Introduction

Issues

• This was not just a case of running an upgrade script

• Database was some 100GB in size

• The change in endianness was an additional complication

• Downtime allowable - but needed to be the minimum possible

Page 9: All Change

The Installation

Initial Server Install

• Server installation kept to as few rpms as possible

• Recommended install rpms have many package dependencies

• Both 32-bit and 64-bit versions of certain packages are required

– rpm --query --queryformat "%{NAME}-%{VERSION}.%{RELEASE} (%{ARCH})\n" rpm_name

• Binutils rpm incorrect with RedHat Update 2

– rpm -Uvh --force binutils-2.15.92.0.213.0.0.0.2.x86_64.rpm

Page 10: All Change

The Installation

ASMLib

• Eliminates need to bind raw devices

• Need to install 3 rpms:

– oracleasm-support

– oracleasm-your-kernel

– oracleasmlib

• Must get ASMLib from

http://www.oracle.com/technology/tech/linux/asmlib/index.html

Page 11: All Change

The Installation

Other tips

• Follow recommendations in the Oracle Validated Linux configurations:

http://www.oracle.com/technology/tech/linux/validated-configurations/index.html

• Ensure DNS resolving works for hostnames, VIPs

• Use Cluster Verification Utility

• Bond the Interconnect

Page 12: All Change

The Installation

Hugepages

• Translation Lookaside Buffer maps virtual -> physical memory

• Normal memory page size is 4K

• Hugepage size is 2MB resulting in more efficient memory access
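To make the page-size difference concrete, here is a minimal Python sketch of the arithmetic. The 8 GiB memory region is a made-up figure for illustration, not a number from this migration, and the helper name `pages_needed` is hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(region_bytes: int, page_bytes: int) -> int:
    """Number of pages required to map a memory region, rounding up."""
    return -(-region_bytes // page_bytes)


# Hypothetical shared memory region of 8 GiB (assumption, not from the talk).
region = 8 * GIB

normal_pages = pages_needed(region, 4 * KIB)  # normal 4K pages
huge_pages = pages_needed(region, 2 * MIB)    # 2MB hugepages

print(normal_pages, huge_pages, normal_pages // huge_pages)
# 2097152 4096 512
```

Each 2MB hugepage covers the same memory as 512 normal pages, so far fewer virtual -> physical mappings compete for the limited entries in the Translation Lookaside Buffer.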

Page 13: All Change

The Migration

Transportable Tablespaces

• Method of moving tablespaces between databases

• Export metadata then copy the actual datafiles

• Plug the metadata into the target database

• Transporting across platforms new in 10g

• A change of endianness means an rman convert is needed
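Why a change of endianness forces a conversion can be shown with a small Python sketch. This only illustrates byte order on a single 32-bit integer; it is not how rman processes Oracle datafile blocks, and the value used is arbitrary.

```python
import struct

value = 0x0A0B0C0D

# The same 32-bit integer laid out in big-endian and little-endian order.
big = struct.pack(">I", value)
little = struct.pack("<I", value)

print(big.hex(), little.hex())
# 0a0b0c0d 0d0c0b0a

# Reading big-endian bytes as if they were little-endian gives a wrong value,
# which is why the bytes must be converted rather than just copied.
misread = struct.unpack("<I", big)[0]
print(f"{misread:#010x}")
# 0x0d0c0b0a
```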

Page 14: All Change

The Migration

Transportable Tablespaces

• Check your platform is supported for transporting:

– sql> select * from V$TRANSPORTABLE_PLATFORM;

PLATFORM_ID PLATFORM_NAME                    ENDIAN_FORMAT
----------- -------------------------------- --------------
          1 Solaris[tm] OE (32-bit)          Big
          2 Solaris[tm] OE (64-bit)          Big
          3 HP-UX (64-bit)                   Big
          4 HP-UX IA (64-bit)                Big
          5 HP Tru64 UNIX                    Little
          6 AIX-Based Systems (64-bit)       Big
          7 Microsoft Windows IA (32-bit)    Little
...
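The decision that follows from this view can be sketched in Python: a conversion step is only needed when the source and target platforms report different endian formats. The lookup table below copies the rows shown on the slide; the 'Linux 64-bit for AMD' entry is not in the truncated listing and is added as an assumption, and the function name `needs_rman_convert` is hypothetical.

```python
# Endian formats as listed in the V$TRANSPORTABLE_PLATFORM output on the slide.
ENDIAN_FORMAT = {
    "Solaris[tm] OE (32-bit)": "Big",
    "Solaris[tm] OE (64-bit)": "Big",
    "HP-UX (64-bit)": "Big",
    "HP-UX IA (64-bit)": "Big",
    "HP Tru64 UNIX": "Little",
    "AIX-Based Systems (64-bit)": "Big",
    "Microsoft Windows IA (32-bit)": "Little",
    # Assumption: not in the truncated listing, added for this example.
    "Linux 64-bit for AMD": "Little",
}


def needs_rman_convert(source: str, target: str) -> bool:
    """True when the two platforms differ in endian format."""
    return ENDIAN_FORMAT[source] != ENDIAN_FORMAT[target]


print(needs_rman_convert("Solaris[tm] OE (64-bit)", "Linux 64-bit for AMD"))
# True
print(needs_rman_convert("Solaris[tm] OE (32-bit)", "Solaris[tm] OE (64-bit)"))
# False
```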

Page 15: All Change

The Migration

Transport

• Must have a self-contained set of tablespaces to move

– sql> EXECUTE DBMS_TTS.TRANSPORT_SET_CHECK('TBNAME', TRUE);

– sql> select * from TRANSPORT_SET_VIOLATIONS;

• You must make the tablespaces to be moved read only

• Convert the endianness of the datafiles

– rman> convert tablespace TBNAME to platform 'Linux 64-bit for AMD' format='/home/oracle/%U';

Page 16: All Change

The Migration

Plugging In

• Now you need the metadata:

– exp 'as sys' transport_tablespace=y file=tbname.dmp tablespaces=tbname

• Plug tablespace in as a “normal” filesystem tablespace:

– imp 'as sys' transport_tablespace=y file='path to export' datafiles='path to filesystem datafile'

• Convert from filesystem datafile to ASM datafile:

– rman> convert datafile 'path to filesystem datafile' format 'ASM PATH';

Page 17: All Change

The Migration

The Plan

[Diagram: Solaris -> Linux. Labels: rman convert (shown twice), copy datafiles, metadata]

Page 18: All Change

The Migration

A problem

• Started to notice that space calculations were incorrect

• Particular problem with 1 critical tablespace

• Assumed we had problems with our created types & custom indexing

• Clearing the database down hit a catastrophic problem:

– sql> drop user TRANSPORTED cascade;

ERROR at line 1:

ORA-00600: internal error code, arguments: [ktecgetsh-inc], [2], [], [], [], [], [], []

Page 19: All Change

The Migration

Oracle Response

Page 20: All Change

The Migration

The cause of the problem

• The cause of the instance crashing was the following:

– ORA-00600: internal error code, arguments: [ktecgetsh-inc], [2], [], [], [], [], [], []

– Non-fatal internal error happened while SMON was doing temporary segment drop.

– Instance terminated by PMON

• Internal table SEG$ info wrong after TTS

• Fixed by patch 5186784

Page 21: All Change

The Migration

Data Pump to the Rescue!

• Could not wait for a fix to the Transportable Tablespace problems - did not know how long it would take.

• Data Pump new in 10g - export/import on steroids

• Oracle claims Data Pump export is 2x faster than the old export, and Data Pump import 15-45 times faster than the old import

• Restartable with detailed job monitoring

• Automatic tuning and parallel server side execution

• Direct import across the network

Page 22: All Change

The Migration

Data Pump Architecture

[Diagram: two client processes, a Status Queue and a Command and Control Queue, the Master Control Process with its master table and log, and two workers writing the dump file set (data, metadata & master table)]

Page 23: All Change

The Migration

Data Pump Basics

• Create dumpdir:

– sql> create directory dumpdir as '/home/oracle/dumpdir';

– sql> grant read, write on directory dumpdir to USER;

• Actually do a Data Pump export:

– expdp user/password schemas=user directory=dumpdir dumpfile=output.dmp

• Do the Data Pump import:

– impdp user/password directory=dumpdir dumpfile=output.dmp

Page 24: All Change

The Migration

Data Pump Issues

• Data Pump expdp did not run trouble-free:

– ORA-31693: Table data object "USER"."DOMAINS" failed to load/unload and is being skipped due to error:
ORA-06502: PL/SQL: numeric or value error
LPX-00210: expected '<' instead of '?'

• Apparently the XML/XSL stylesheets were not loaded into our 10.1 database

• The fix was running $ORACLE_HOME/rdbms/admin/initmeta.sql

Page 25: All Change

The Migration

Data Pump Issues

• Data Pump impdp was not trouble-free either:

– ORA-31693: Table data object "USER"."DOMAINS" failed to load/unload and is being skipped due to error:
ORA-44002: invalid object name
ORA-06512: at "SYS.DBMS_ASSERT", line 283
ORA-06512: at "SYS.KUPD$DATA", line 1336
ORA-12801: error signaled in parallel query server P004, instance lr2:NOM2 (2)
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-31626: job does not exist
ORA-06512: at "SYS.ORACLE_DATAPUMP", line 19

• Seems like Data Pump import running in parallel across RAC nodes does not work. Shutting down the 2nd node cleared this problem

Page 26: All Change

The Migration

Data Pump Issues

• We encountered even more problems during an import:

– ORA-02374: conversion error loading table "USER"."INVOICES"
ORA-12899: value too large for column CONTACT (actual: 162, maximum: 100)
ORA-02372: data for row: CONTACT : 0X'C3A1C2B3CA98C3C5C28DC2A3C283C2A1C3A3C283C2B3C3'

• This time it was due to changing character set: US7ASCII -> UTF8

• Fortunately not too many rows affected so data munging was possible
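A rough Python sketch of the kind of check that finds such rows before an import: it flags values whose UTF-8 encoding is longer than the column allows. It assumes the column limit is measured in bytes, as the ORA-12899 message suggests. The sample values and the helper names `utf8_length` and `oversized` are hypothetical.

```python
def utf8_length(value: str) -> int:
    """Length of the value in bytes once encoded as UTF-8."""
    return len(value.encode("utf-8"))


def oversized(values, max_bytes):
    """Return the values that would not fit in a column of max_bytes bytes."""
    return [v for v in values if utf8_length(v) > max_bytes]


# Hypothetical sample data: 60 accented characters need 120 bytes in UTF-8.
samples = ["plain ascii", "\u00e9" * 60]

for v in samples:
    print(len(v), utf8_length(v))
# 11 11
# 60 120

print(len(oversized(samples, 100)))
# 1
```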

Page 27: All Change

The Migration

Data Pump Issues

• Tried playing with importing directly across network:

– ORA-31693: Table data object "SNOWDROP"."EMPLOYEE_OLE" failed to load/unload and is being skipped due to error:
ORA-06502: PL/SQL: numeric or value error

• Seems like this is either a Bug:

– 3369197 - Dpimp : Network Mode Hr.Per_Addresses Ora-00997: Illegal Use Of Long Datatype Errors

– Metalink Note 313496.1

• Or maybe this is a feature - the above bug & Metalink Note are NOT publicly visible

Page 28: All Change

The Migration

Final Data Pump Issues

• On the day of the migration we hit this problem when exporting:

– expdp system parfile=param.exp

Export: Release 10.1.0.3.0 - 64bit Production on Sunday, 16 July, 2006 10:10
Copyright (c) 2003, Oracle.
ORA-31626: job does not exist
ORA-31637: cannot create job SYS_EXPORT_SCHEMA_01 for user SYSTEM
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 95
ORA-06512: at "SYS.KUPV$FT_INT", line 601
ORA-12850: Could not allocate slaves on all specified instances: 2 needed, 1 allocated

• We managed to export 1 schema without issue, but the export of the second schema was starting parallel slaves on both nodes

Page 29: All Change

The Migration

Data Pump Timings

• Use a parameter file:

– dumpfile=user.dmp
directory=dumpdir
parallel=4
logfile=user.log
schemas=user

• Some timings:

Size (GB)   Expdp (mins)   Impdp (mins)   Network (mins)
35          11             55             75
49          13             95             240
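The timings above can be turned into throughput figures with a short Python sketch. The combined rate assumes the export and import run one after the other, which is an assumption of this example, not something the slide states. The helper name `gb_per_hour` is hypothetical.

```python
# (size in GB, expdp mins, impdp mins, network import mins) from the timings.
timings = [
    (35, 11, 55, 75),
    (49, 13, 95, 240),
]


def gb_per_hour(size_gb: float, minutes: float) -> float:
    """Convert a size and an elapsed time in minutes to GB per hour."""
    return size_gb / minutes * 60


for size_gb, expdp, impdp, network in timings:
    print(
        f"{size_gb} GB:",
        f"expdp {gb_per_hour(size_gb, expdp):.0f} GB/h,",
        f"impdp {gb_per_hour(size_gb, impdp):.0f} GB/h,",
        f"expdp+impdp {gb_per_hour(size_gb, expdp + impdp):.0f} GB/h,",
        f"network {gb_per_hour(size_gb, network):.0f} GB/h",
    )
```

On these numbers the import dominates the elapsed time, and the dump-file route comes out faster end to end than the network import for both schemas.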

Page 30: All Change

Tips & Conclusions

ODBC Issue

• Rule-based optimizer meant to be de-supported in 10g - see Metalink Note: 189702.1

• ODBC driver was adding /*+ RULE */ hint to certain queries - still performing ok in 10.1.0.3

• /*+ RULE */ hinted queries in 10.2.0.2 were performing 100x slower

• This is covered in Metalink Note: 373129.1

• Added a logon trigger for the application to ignore optimizer hints, using _optimizer_ignore_hints

Page 31: All Change

Tips & Conclusions

Post-Migration Issues

• There were quite a few niggling issues post migration

• perl-dbi applications silently dying

– update DBD driver to 1.17

• Pro*C applications silently dying

– update Oracle Client to 10.2

Page 32: All Change

Tips & Conclusions

Post-Migration Issues

• Java JDBC problem:

– java.sql.SQLException: OALL8 is in an inconsistent state.

• Character set problem with domain registration system

– ORA-01756: quoted string not properly terminated

• Developers finding counting characters “harder”

– Can no longer guarantee 1 character == 1 byte
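One practical consequence is that cutting a string to a byte limit can no longer be done by slicing characters. A minimal Python sketch, assuming UTF-8 and a limit expressed in bytes; the helper name `truncate_utf8` is hypothetical and not taken from the applications discussed here.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Shorten text so its UTF-8 encoding fits in max_bytes,
    without leaving a partial multi-byte character at the end."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # Dropping undecodable bytes discards a character cut in half by the slice.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


word = "caf\u00e9"  # 4 characters, 5 bytes in UTF-8
print(len(word), len(word.encode("utf-8")))
# 4 5
print(truncate_utf8(word, 4))
# caf
```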

Page 33: All Change

Tips & Conclusions

Migration Benefits

• Load on servers is greatly reduced

• Most frequently run query has gone from 1.6ms to 0.8ms

• RAC related waits decreased

• Stable for 6+ months

Page 34: All Change

Tips & Conclusions

Conclusion

• TEST, TEST, TEST

• Transportable tablespaces are the quickest way of moving data

– changing platforms really complicates it & slows it down

• Data Pump is much, much faster than old export/import

– Roughly 25GB per hour

Page 35: All Change

Questions?

All Change: The Story of a Migration

Contact:

[email protected]

• http://blog.nominet.org.uk