Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Transcript of Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Page 1: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

2012 © Trivadis

BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN

WELCOME

Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode and Everything Else

Markus Flechtner

24.12.16

Page 2: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Our company.

Trivadis is a market leader in IT consulting, system integration and the provision of IT services focusing on [..] and [..] technologies in Switzerland, Germany and Austria.

We offer our services in the following strategic business fields: [..]

Trivadis Services takes over the interacting operation of your IT systems.

Page 3: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

With over 600 specialists and IT experts in your region.

11 Trivadis branches and more than 600 employees
200 Service Level Agreements
Over 4,000 training participants
Research and development budget: CHF 5.0 million / EUR 4 million
Financially self-supporting and sustainably profitable
Experience from more than 1,900 projects per year at over 800 customers

Locations (map): Hamburg, Düsseldorf, Frankfurt, Stuttgart, Freiburg, Munich, Vienna, Basle, Zurich, Bern, Lausanne

Page 4: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

About me ..

§ Senior Consultant, Trivadis, Duesseldorf
§ Working with Oracle since 1993
  § Development (Forms, Reports, PL/SQL)
  § Support
  § Database Administration
§ Started at Trivadis in 2008
§ Focus
  § Oracle Real Application Clusters
  § Database Migration
§ Teacher
  § O-AI – Oracle Architecture and Internals
  § O-RAC – Oracle Real Application Clusters
  § O-NF12c DBA – Oracle 12c New Features for DBAs

Page 5: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

AGENDA

1. Character sets & Character Conversion
2. Unicode
3. Oracle Parameters
4. Oracle-Tools: Csscan & csalter
5. Oracle-Tools: Database Migration Assistant for Unicode (DMU)
6. Changing the Character set of an existing Database
7. Change Database and Character set
8. NLS_LENGTH_SEMANTICS & SQL + PL/SQL
9. Recommendations
10. Links & further information

Page 6: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Character sets (1)

§ “A character set is just an agreement on what numeric value a symbol has”, i.e. a mapping table (code points)
  § How to display a numeric value (in the database) as the corresponding character in the output
  § How to store an input value (e.g. a numeric value) in the database (or file etc.)
§ Some well-known character sets
  § WE8ISO8859P15 (“= WE8ISO8859P1 + € character + some more”)
  § WE8MSWIN1252
  § AL32UTF8
  § UTF8
  § US7ASCII – not recommended
  § WE8ISO8859P1 (ISO 8859, Part 1, for Western Europe) – not recommended

Page 7: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Character sets (2)

§ Oracle 11.2 supports 247 different character sets
§ Oracle Database supports different types of character sets
  § SB: single-byte encoding
  § MB: multi-byte encoding
  § FIXED: fixed-width multi-byte encoding
  § ASCII: strict superset of ASCII
  § EURO: Euro symbol supported
  § UDC: user-defined character set (using “Oracle Locale Builder”)
§ See Oracle 11.2 Globalization Support Guide, Appendix A (“Locale Data”) for more information

Page 8: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Character sets (3) – well-known character sets

§ The WE8ISO8859 family
  § WE8ISO8859P1
  § WE8ISO8859P15
  § ..
§ ISO/IEC 8859
§ Differences between ISO8859-1 and -15
§ See http://en.wikipedia.org/wiki/ISO8859 for more information

Page 9: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Character Conversion: from client to server (and vice versa)

Clients: WE8MSWIN1252 and WE8ISO8859P1
Database Character set = WE8ISO8859P15

§ Client with WE8MSWIN1252: INSERT (‘€’) → converted to WE8ISO8859P15 → stored: “€”
§ Client with WE8ISO8859P1: SELECT → but the € cannot be displayed in WE8ISO8859P1!

It’s all about the correct and compatible setting of database character set, client OS character set and client Oracle character set!
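The round trip on this slide can be reproduced outside the database. The following is a minimal Python sketch, not Oracle code: it assumes that Python's codecs `cp1252`, `iso8859_15` and `latin_1` are close enough stand-ins for WE8MSWIN1252, WE8ISO8859P15 and WE8ISO8859P1, and the helper name `convert` is hypothetical.

```python
# Sketch only: Python codecs stand in for the Oracle character sets
# (cp1252 ~ WE8MSWIN1252, iso8859_15 ~ WE8ISO8859P15, latin_1 ~ WE8ISO8859P1).

def convert(raw: bytes, source: str, target: str) -> bytes:
    """Re-encode raw bytes from the source character set to the target one."""
    return raw.decode(source).encode(target)

# The Windows client sends the euro sign in its own encoding.
client_bytes = "€".encode("cp1252")

# Client -> server: converted to the database character set before storing.
stored = convert(client_bytes, "cp1252", "iso8859_15")

# Server -> second client: ISO 8859-1 has no euro sign, so conversion fails.
try:
    convert(stored, "iso8859_15", "latin_1")
    displayable = True
except UnicodeEncodeError:
    displayable = False

print(client_bytes.hex(), stored.hex(), displayable)
```

Python raises an error where the target character set has no matching character. Oracle typically substitutes a replacement character instead, so in a real system the loss tends to be silent.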

Page 10: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Character set and NLS information in the database

SQL> SELECT PROPERTY_NAME, PROPERTY_VALUE FROM DATABASE_PROPERTIES WHERE PROPERTY_NAME LIKE 'NLS%';

PROPERTY_NAME                  PROPERTY_VALUE
------------------------------ -----------------------------------
NLS_LANGUAGE                   AMERICAN
NLS_TERRITORY                  AMERICA
NLS_CURRENCY                   $
NLS_ISO_CURRENCY               AMERICA
NLS_NUMERIC_CHARACTERS         .,
NLS_CHARACTERSET               WE8MSWIN1252
NLS_CALENDAR                   GREGORIAN
NLS_DATE_FORMAT                DD-MON-RR
NLS_DATE_LANGUAGE              AMERICAN
NLS_SORT                       BINARY
NLS_TIME_FORMAT                HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT           DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT             HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT        DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY              $
NLS_COMP                       BINARY
NLS_LENGTH_SEMANTICS           BYTE
NLS_NCHAR_CONV_EXCP            FALSE
NLS_NCHAR_CHARACTERSET         AL16UTF16
NLS_RDBMS_VERSION              11.2.0.3.0

Page 12: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

What's Unicode? - Introduction

§ “Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems.” (http://en.wikipedia.org/wiki/Unicode)
§ Development started in 1987; the current version is Unicode 6.3 (released in September 2013), representing more than 110,000 characters
§ Universal encoded character set
§ Store information in any language → supports globally used databases
§ AL32UTF8 = variable width (1 – 4 bytes per character)
§ UTF8 (the older Oracle character set) = variable width (1 – 3 bytes per character)
§ UTF-16 = 2 or 4 bytes per character (2 bytes for the Basic Multilingual Plane, 4 bytes for supplementary characters)
§ Oracle has supported Unicode since Oracle 8.1
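The byte widths quoted on the Unicode introduction slide can be checked with a few lines of Python. This is a sketch under the assumption that Python's `utf-8` and `utf-16-be` codecs behave like AL32UTF8 and AL16UTF16 for these sample characters; `byte_width` is a hypothetical helper name.

```python
# Sketch only: "utf-8" and "utf-16-be" stand in for AL32UTF8 and AL16UTF16.

def byte_width(ch: str, codec: str) -> int:
    """Number of bytes needed to store one character in the given encoding."""
    return len(ch.encode(codec))

# ASCII letter, Latin-1 umlaut, euro sign (BMP), emoji (supplementary plane)
samples = ["a", "\u00e4", "\u20ac", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X}  utf-8: {byte_width(ch, 'utf-8')} bytes  "
          f"utf-16: {byte_width(ch, 'utf-16-be')} bytes")
```

The width grows with the code point in UTF-8, while UTF-16 needs 4 bytes only for characters outside the Basic Multilingual Plane.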

Page 13: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Character set AL32UTF8 - Introduction

§ Reminder: “A character set is just an agreement on what numeric value a symbol has”
§ Supported as database character set since Oracle 9i
§ Recommended character set
§ Check your clients: old Oracle 8i (or earlier) clients have problems with AL32UTF8 → use UTF8
§ Characters have variable length (1 – 4 bytes)
  → any character other than US7ASCII (A-Z, a-z, 0-9 and ./?,*# etc.) needs more than one byte
  → the database may need some more space (csscan will report an estimate)

Page 14: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Character set AL32UTF8 – bytes of storage for different chars

[Table not reproduced in this transcript] (from Oracle 11.2 Globalization Support Guide, © Oracle)

Page 15: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Unicode vs. Classical 8-bit-character set

§ Database 1 – “old fashioned” 8-bit character set

SQL> SELECT PROPERTY_VALUE FROM DATABASE_PROPERTIES
     WHERE PROPERTY_NAME='NLS_CHARACTERSET';

PROPERTY_VALUE
----------------------------------------------------
WE8ISO8859P15

SQL> INSERT INTO CSDEMO.UMLAUTE VALUES ('äöüÄÖÜß');
SQL> SELECT ZEICHEN,DUMP(ZEICHEN) FROM CSDEMO.UMLAUTE;

ZEICHEN    DUMP(ZEICHEN)
---------- ----------------------------------------
äöüÄÖÜß    Typ=1 Len=7: 228,246,252,196,214,220,223

§ Database 2 – “modern” Unicode character set

SQL> SELECT PROPERTY_VALUE FROM DATABASE_PROPERTIES
     WHERE PROPERTY_NAME='NLS_CHARACTERSET';

PROPERTY_VALUE
----------------------------------------------------
AL32UTF8

SQL> INSERT INTO CSDEMO.UMLAUTE VALUES ('äöüÄÖÜß');
SQL> SELECT ZEICHEN,DUMP(ZEICHEN) FROM CSDEMO.UMLAUTE;

ZEICHEN    DUMP(ZEICHEN)
---------- ----------------------------------------------------------------------
äöüÄÖÜß    Typ=1 Len=14: 195,164,195,182,195,188,195,132,195,150,195,156,195,159
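The two DUMP() results can be cross-checked without a database. The sketch below assumes that Python's `iso8859_15` and `utf-8` codecs match WE8ISO8859P15 and AL32UTF8 for these characters; `dump` is a hypothetical helper that mimics only the length and byte list of Oracle's output.

```python
def dump(value: str, codec: str) -> str:
    """Mimic the 'Len=n: b1,b2,...' part of Oracle's DUMP() for a string."""
    raw = value.encode(codec)
    return f"Len={len(raw)}: {','.join(str(b) for b in raw)}"

umlauts = "äöüÄÖÜß"

print(dump(umlauts, "iso8859_15"))  # one byte per character
print(dump(umlauts, "utf-8"))       # two bytes per character
```

Seven characters take 7 bytes in the 8-bit character set and 14 bytes in the Unicode database, which is the space overhead the slide illustrates.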

Page 16: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Character set AL32UTF8 – Technical Limits within Oracle

           UTF8             | AL32UTF8         | AL16UTF16
           (1 to 3 bytes)   | (1 to 4 bytes)   | (2 bytes)
           MIN    MAX       | MIN    MAX       | MIN    MAX
-----------------------------------------------------------------
CHAR       2000    666      | 2000    500      | N/A    N/A
VARCHAR2   4000   1333      | 4000   1000      | N/A    N/A
-----------------------------------------------------------------
NCHAR      2000    666      | N/A    N/A       | 1000   1000
NVARCHAR2  4000   1333      | N/A    N/A       | 2000   2000

(MIN / MAX: number of characters that fit when every character needs the minimum / maximum number of bytes)

§ For the national character set, only UTF8 and AL16UTF16 are allowed
§ There is still an underlying limit of 4000 bytes per VARCHAR2 column → you can store 4000 “a” (single-byte characters) in such a column
§ Every multi-byte character reduces the number of available characters (e.g. you can store 3997 “a” + 1 “€”, because the euro sign is a 3-byte character in the AL32UTF8 character set, or 3994 “a” + 2 “€”, and so on)
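The arithmetic behind the 4000-byte limit can be sketched as follows. The code assumes that Python's `utf-8` codec yields the same byte counts as AL32UTF8; the function names are hypothetical, and the constant is simply the limit quoted on the slide.

```python
VARCHAR2_MAX_BYTES = 4000  # limit quoted on the slide

def stored_bytes(value: str) -> int:
    """Bytes the value would occupy in a UTF-8 encoded column."""
    return len(value.encode("utf-8"))

def fits(value: str, limit: int = VARCHAR2_MAX_BYTES) -> bool:
    """True if the value stays within the byte limit."""
    return stored_bytes(value) <= limit

def remaining_single_byte_chars(multibyte: str,
                                limit: int = VARCHAR2_MAX_BYTES) -> int:
    """How many one-byte characters still fit next to the given text."""
    return max(0, limit - stored_bytes(multibyte))

print(remaining_single_byte_chars("€"))    # room left after one euro sign
print(remaining_single_byte_chars("€€"))   # room left after two euro signs
print(fits("a" * 3997 + "€"), fits("a" * 3998 + "€"))
```

Each euro sign costs three bytes, so every additional one removes three single-byte characters from the available space.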

Page 18: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle Parameters: NLS_LANG (1)

§ “NLS_LANG is used to let Oracle know what character set your client's OS is USING so that Oracle can do (if needed) conversion from the client's character set to the database character set.”
§ The character set in NLS_LANG is not necessarily equal to the DB character set!
  § If both are equal this may be beneficial in terms of performance (OCI)
§ On typical Western European Windows clients
  § NLS_LANG=<LANGUAGE>_<TERRITORY>.WE8MSWIN1252
§ On Unix systems the character set depends on the locale
  - Check LC_ALL
  - See http://docs.oracle.com/cd/E11882_01/server.112/e10729/applocaledata.htm

-- Format <Language>_<Territory>.<ClientCharacterSet>
NLS_LANG = GERMAN_SWITZERLAND.AL32UTF8
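To make the three-part NLS_LANG format concrete, here is a small Python sketch that splits a value into its components. `NlsLang` and `parse_nls_lang` are hypothetical names, and the sketch deliberately accepts only the complete `<Language>_<Territory>.<ClientCharacterSet>` form shown above.

```python
from typing import NamedTuple

class NlsLang(NamedTuple):
    language: str
    territory: str
    charset: str

def parse_nls_lang(value: str) -> NlsLang:
    """Split '<Language>_<Territory>.<ClientCharacterSet>' into its parts."""
    locale_part, dot, charset = value.partition(".")
    language, underscore, territory = locale_part.partition("_")
    if not (dot and underscore and language and territory and charset):
        raise ValueError(f"not a complete NLS_LANG value: {value!r}")
    return NlsLang(language, territory, charset)

setting = parse_nls_lang("GERMAN_SWITZERLAND.AL32UTF8")
print(setting.language, setting.territory, setting.charset)
```

Only the third component describes the client character set; language and territory control messages and formats rather than character conversion.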

Page 19: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle Parameters: NLS_LANG (2)

§ Recommendation on NLS_LANG
  § Set the character set part of NLS_LANG to the character set of the OS, e.g. WE8MSWIN1252 on a Windows client
§ Recommendation on the database character set
  § Choose a character set which is a superset of the character sets of all clients

Page 20: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle Parameters: NLS_LENGTH_SEMANTICS (1)

§ Specifies the default length semantics to use for VARCHAR2 and CHAR table columns
§ Possible values are
  § BYTE (default)
  § CHAR
§ New hint in the 11.2 documentation:
§ → change the parameter to CHAR on session level only!

Page 21: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle Parameters: NLS_LENGTH_SEMANTICS (2)

§ Oracle has changed its mind:
§ See the 11.1 documentation http://docs.oracle.com/cd/B28359_01/server.111/b28298/ch2charset.htm
§ (it’s a “documentation bug”) :-)
§ But there are MOS notes which state “do not set the parameter to CHAR”, even for 11.1 databases

Page 22: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle Parameters: NLS_LENGTH_SEMANTICS (3)

§ Oracle recommends specifying the length semantics on column level:

SQL> CREATE TABLE unicode_demo(
       byte_string VARCHAR2(10 BYTE),
       char_string VARCHAR2(10 CHAR));

Table created.

SQL> DESCRIBE unicode_demo
Name        Null Type
----------- ---- -----------------
BYTE_STRING      VARCHAR2(10 BYTE)
CHAR_STRING      VARCHAR2(10)

SQL> INSERT INTO unicode_demo (byte_string) VALUES ('äöüÄÖÜ');
ORA-12899: Value too large for column [..]
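Why the insert fails for the BYTE column but would succeed for the CHAR column can be sketched in Python. The function `fits_column` is a hypothetical helper, and the sketch assumes an AL32UTF8 database (modelled with Python's `utf-8` codec).

```python
def fits_column(value: str, size: int, semantics: str,
                codec: str = "utf-8") -> bool:
    """Check a value against a column declared as VARCHAR2(size BYTE|CHAR)."""
    if semantics == "BYTE":
        return len(value.encode(codec)) <= size   # count encoded bytes
    if semantics == "CHAR":
        return len(value) <= size                 # count characters
    raise ValueError(f"unknown length semantics: {semantics!r}")

value = "äöüÄÖÜ"  # 6 characters, 2 bytes each in UTF-8

print(fits_column(value, 10, "BYTE"))  # 12 bytes do not fit into 10 bytes
print(fits_column(value, 10, "CHAR"))  # 6 characters fit into 10 characters
```

The sketch ignores the absolute byte limit of the data type, which still applies to columns declared with CHAR semantics.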

Page 24: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: Csscan & csalter (1): Introduction

§ CSSCAN
  § Character set Scanner
§ CSALTER
  § Script to change the character set of a database ($ORACLE_HOME/rdbms/admin/csalter.plb)
  § Changes the data dictionary only, additional actions for user data may be required
§ Csscan/csalter to be desupported after 11.2
  § Replaced by DMU (Database Migration Assistant for Unicode)
§ Installation of csscan
  § User & tables created using $ORACLE_HOME/rdbms/admin/csminst.sql
  § Default user CSMIG (after installation the user is locked & the password expired)

Page 25: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: Csscan & csalter (2): csscan: Parameter

Character set Scanner v2.2 : Release 11.2.0.3.0 - ..
[..]

Keyword    Default Prompt Description
---------- ------- ------ -------------------------------------------------
USERID             yes    username/password
FULL       N       yes    scan entire database
USER               yes    owner of tables to be scanned
TABLE              yes    list of tables to scan
COLUMN             yes    list of columns to scan
EXCLUDE                   list of tables to exclude from scan
TOCHAR             yes    new database Character set name
FROMCHAR                  current database Character set name
TONCHAR                   new national Character set name
FROMNCHAR                 current national Character set name
PROCESS    1       yes    number of concurrent scan process
LOG        scan           base file name of report files
[..]
---------- ------- ------ -------------------------------------------------
Scanner terminated successfully.

Page 26: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: Csscan & csalter (3)

§ How to find out if the database contains characters which do not belong to the database character set (and therefore cannot be converted to AL32UTF8 or any other character set)?
§ → set FROMCHAR=TOCHAR=current character set (sic!)
§ (in theory) running csscan with TOCHAR=AL32UTF8 makes no sense, because AL32UTF8 contains all characters :-)

oracle> csscan userid='"/ as sysdba"' full=y fromchar=WE8ISO8859P15 tochar=WE8ISO8859P15 ARRAY=1024000 LOG=/tmp/csscan_source.log PROCESS=8

Page 27: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: Csscan & csalter (4): Summary Report

[Data Dictionary Conversion Summary]
Datatype                    Changeless      Convertible       Truncation            Lossy
--------------------- ---------------- ---------------- ---------------- ----------------
VARCHAR2                     1.778.335                0                0                0
CHAR                                94                0                0                0
LONG                           128.565                0                0                0
CLOB                            22.072                0                0                0
VARRAY                          19.812                0                0                0
--------------------- ---------------- ---------------- ---------------- ----------------
Total                        1.948.878                0                0                0
Total in percentage           100,000%           0,000%           0,000%           0,000%
The data dictionary can be safely migrated using the CSALTER script

[Application Data Conversion Summary]
Datatype                    Changeless      Convertible       Truncation            Lossy
--------------------- ---------------- ---------------- ---------------- ----------------
VARCHAR2                   682.846.037                0                0          872.321
CHAR                       134.119.259                0                0                0
LONG                            36.528                0                0                0
CLOB                                 0                0                0                0
VARRAY                               0                0                0                0
--------------------- ---------------- ---------------- ---------------- ----------------
Total                      817.001.824                0                0          872.321
Total in percentage            99,893%           0,000%           0,000%           0,107%

Page 28: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: Csscan & csalter (5): csscan - Results

§ Changeless
  § OK, no more steps required
  § Simply run “csalter” to change the data dictionary
§ Convertible
  § Data has to be converted between the character sets (due to a different character mapping) → export/import
§ Lossy data
  § Existing database contains data which cannot be converted to the new character set
  § → correct the data so that it can be converted, otherwise csalter will fail!
§ Truncation
  § Modify column length (if possible)
  § Shorten data if necessary or switch to CLOB
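The four csscan result categories can be illustrated with a simplified classifier. This is only a sketch of the idea, not csscan's actual algorithm: `classify` is a hypothetical function, Python codecs stand in for the Oracle character sets, and the column limit is passed in as a plain byte count.

```python
def classify(raw: bytes, source: str, target: str, max_bytes: int) -> str:
    """Assign one stored value to a csscan-style result category."""
    try:
        converted = raw.decode(source).encode(target)
    except UnicodeError:
        return "lossy"        # not valid in the source or target character set
    if len(converted) > max_bytes:
        return "truncation"   # converted value no longer fits the column
    if converted == raw:
        return "changeless"   # byte-identical in both character sets
    return "convertible"      # valid, but the stored bytes must be rewritten

examples = {
    "plain ASCII": (b"abc", "iso8859_15", "utf-8", 10),
    "one umlaut": ("ä".encode("iso8859_15"), "iso8859_15", "utf-8", 10),
    "six umlauts": (("ä" * 6).encode("iso8859_15"), "iso8859_15", "utf-8", 10),
    "undefined byte": (b"\x81", "cp1252", "utf-8", 10),
}
for label, args in examples.items():
    print(f"{label}: {classify(*args)}")
```

Plain ASCII data is the typical "changeless" case when moving from an 8-bit character set to Unicode, because both encodings store it with identical bytes.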

Page 29: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: Csscan & csalter (6): csscan - Truncation

§ Csscan reports the “Max Post Conversion Data Size” (in bytes!) → there is no problem with a size of up to 4000 bytes (for VARCHAR2) or 2000 bytes (for CHAR)
§ Otherwise csscan will report these columns
§ → you have to take care of these columns (shorten the data in a sensible way or migrate to CLOB)
§ → otherwise you will lose some information :-(

Page 30: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: Csscan & csalter (7): csscan – Lossy Data

§ “Lossy” data is data which cannot be converted to the target character set
  → in most cases the result of an invalid NLS_LANG setting on the client
  → data cleanup is required

User  : DEMO
Table : COST_ID
Column: DESCRIPTION
Type  : VARCHAR2(80)
Number of Exceptions         : 13
Max Post Conversion Data Size: 81

ROWID              Exception Type     Data(first 30 bytes)
------------------ ------------------ ------------------------------
AAAPYvAANAAADOfAAr lossy conversion   GEBŽUDE W1 (BšROGEBŽUDE)
AAAPYvAANAAADOPAAs lossy conversion   FIRMENANGEH(tm)RIGEN-NEUFAHRZE
AAAPYvAANAAADOPABB lossy conversion   GEBŽUDE FMI
AAAPYvAANAAADOvAAP lossy conversion   EINFšHRUNG EURO
AAAPYvAANAAADOvABN lossy conversion   OBERFLŽCHENVEREDELUNG (KOPF-KS
AAAPYvAANAAADQGABJ lossy conversion   BENUTZERSERVICE, CAD-SERVICE,
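The garbled values in this report look like text that was written in a DOS code page and later read as Windows-1252. That is an assumption based on the visible characters, not something the slide states. Under that assumption, the following Python sketch reproduces the effect with the codecs `cp850` and `cp1252`; the helper `misread` is hypothetical.

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode text in one character set and decode the bytes in another."""
    return text.encode(written_as).decode(read_as)

# Assumed original values, written by a client that used code page 850 ...
originals = ["GEBÄUDE W1 (BÜROGEBÄUDE)", "EINFÜHRUNG EURO"]

# ... and interpreted by a reader that assumes Windows-1252.
garbled = [misread(t, "cp850", "cp1252") for t in originals]
print(garbled)

# Reversing the two steps repairs the text, as long as no byte was lost.
repaired = [misread(t, "cp1252", "cp850") for t in garbled]
print(repaired)
```

Cleanup of this kind only works when the wrong client setting is known and applies consistently to the affected rows; mixed data has to be corrected row by row.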

Page 31: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: Csscan & csalter (8): Data Cleanup Script

SQL> set serveroutput on
SQL> DECLARE
  sql_statement VARCHAR2(4000);
  old_value     VARCHAR2(4000);
BEGIN
  FOR c_rec IN (SELECT owner_name, table_name, column_name, data_rowid
                  FROM csmig.csmv$errors
                 WHERE error_type = 'DATA_LOSS')
  LOOP
    -- read the current (lossy) value of the affected row
    sql_statement := 'SELECT '||c_rec.column_name||' FROM '||c_rec.owner_name||'.'||c_rec.table_name||' WHERE rowid='||chr(39)||c_rec.data_rowid||chr(39);
    EXECUTE IMMEDIATE sql_statement INTO old_value;
    -- print an UPDATE statement that can be corrected by hand
    dbms_output.put_line('UPDATE '||c_rec.owner_name||'.'||c_rec.table_name||' SET '||c_rec.column_name||'='||chr(39)||old_value||chr(39)||' WHERE ROWID='||chr(39)||c_rec.data_rowid||chr(39)||';');
  END LOOP;
END;
/

→ Results look like
[..]
UPDATE OWNER.TABELLE SET SPALTE='Durchf?hrung' WHERE ROWID=' … ' ;

Page 32: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: Csscan & csalter (9): csalter

§ Changes the data dictionary only, no user data
§ Use it only if all data in the database is “changeless” (otherwise csalter will fail)
§ Csalter runs in restricted mode only
§ If you have “convertible” columns in the database, you can run csalter + export/import (on table or user level) to convert the data
  § → if you have “convertible” data in your database, switching to a new database may be the better option
  § → old database remains as a fallback solution
  § → reorganization etc. possible
§ In most cases csalter will fail, so the recommendation is: do not use it!

Page 33: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: Csscan & csalter (10): csalter

SQL> @@csalter.plb

0 rows created.
Function created.
Function created.
Procedure created.

This script will update the content of the Oracle Data Dictionary.
Please ensure you have a full backup before initiating this procedure.
Would you like to proceed (Y/N)?Y
old 6: if (UPPER('&conf') <> 'Y') then
new 6: if (UPPER('Y') <> 'Y') then
Checking data validility...
begin converting system objects

PL/SQL procedure successfully completed.

Alter the database character set...
CSALTER operation completed, please restart database

PL/SQL procedure successfully completed.

0 rows deleted.
Function dropped.
Function dropped.
Procedure dropped.

Page 35: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: DMU (1) - Overview

§ DMU = Database Migration Assistant for Unicode, current version: 1.2
§ Java-based GUI tool, requires J2SE SDK
§ DMU can convert/migrate/change the NLS_CHARACTERSET of a database to AL32UTF8/UTF8
§ Successor of csscan/csalter
§ New features (compared to csscan/csalter)
  § Aggregates all tools in one UI
  § Extended analysis features
  § Automation of modification and conversion tasks
§ Download via OTN or MOS:
  § http://www.oracle.com/technetwork/products/globalization/dmu/overview/index-330958.html
  § MOS: Patch 13262762

Page 36: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: DMU (2) - Installation

§ Database Server
  § A server-side patch (9825461) is required
  § The patch is included in 11.2.0.3 (and up)
  § The patch is not available for all versions (e.g. use of DMU with 11.2.0.2.4 or 10.2.0.5.4 is not supported) → csscan/csalter as “fallback”
§ Database
  § Run $ORACLE_HOME/rdbms/admin/prvtdumi.plb to install the package SYS.DBMS_DUMA_INTERNAL
§ Client
  § J2SE SDK 6 is required
§ See MOS-Note 1272374.1 for more information

Page 37: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: DMU (3) - Workflow

Page 38: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: DMU (4) – Create Repository

Page 39: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: DMU (5) – Scan the database

Page 40: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: DMU (6) – Solve issues

Page 41: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: DMU (7) – “Bulk Cleansing”

Page 42: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: DMU (8) – Convert Database

Page 43: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Oracle-Tools: DMU (9) – How does DMU do the conversion?

1. Put the database into restricted mode.
2. Disable various job queue processes.
3. Drop or disable selected indexes.
4. Disable selected triggers and constraints.
5. Convert the data in user tables and in selected data dictionary tables to Unicode.
6. Convert CLOB columns in the data dictionary.
7. Issue the ALTER DATABASE CHARACTER SET statement.
8. Enable triggers and constraints, and re-create indexes and constraints.
9. Restore the database instance parameters.

The conversion of a table is performed either by updating its columns with an UPDATE statement or by converting the columns while re-creating the table using the CREATE TABLE AS SELECT statement. The re-creation of a table is faster than an update if most of the table rows must be converted.

SQL> UPDATE <table> SET <col>=SYS_OP_CSCONV(<col>, 'AL32UTF8');

Page 45: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Changing the Character set of an existing Database (1)

§ Don’t forget to back up your database before changing the character set!
§ Prepare for csscan
§ Run csscan
§ Evaluate output
§ Data cleanup
§ Run csalter (remember the backup beforehand)
§ If you have convertible columns or CLOBs: run (partial) export/import to migrate user data
§ Update statistics

Page 47: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Change Database & Character set (1) - Preparation

§ Prepare for csscan
§ Run csscan
§ Evaluate output
§ Data cleanup
§ Create empty tables in the new database
  § With correct CREATE scripts (<column> VARCHAR2 (xxx CHAR))
  § (or) Adjust column length / semantics after creation

SQL> ALTER TABLE demo_table2 MODIFY (demo_column VARCHAR2(20 CHAR));

§ You can modify the length semantics in the source database, too. In this case you don’t have to create the tables before importing them
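When many columns need the same ALTER TABLE ... MODIFY treatment, the statements can be generated instead of typed. The sketch below is an illustration only: `alter_to_char_semantics` is a hypothetical helper, the column list is hard-coded sample data, and identifiers are emitted unquoted. In practice the list would probably come from the data dictionary.

```python
from typing import Iterable, Iterator, Tuple

Column = Tuple[str, str, int]  # (table name, column name, declared length)

def alter_to_char_semantics(columns: Iterable[Column]) -> Iterator[str]:
    """Yield ALTER TABLE statements that switch VARCHAR2 columns to CHAR."""
    for table, column, length in columns:
        yield f"ALTER TABLE {table} MODIFY ({column} VARCHAR2({length} CHAR));"

# Hypothetical input: columns currently declared with BYTE semantics.
byte_columns = [
    ("demo_table2", "demo_column", 20),
    ("demo_table2", "description", 80),
]

for statement in alter_to_char_semantics(byte_columns):
    print(statement)
```

Generating the statements as text keeps them reviewable before anything is executed against the database.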

Page 48: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

Change Database & Character set (2) – Data Transfer

§ Option 1:
  1. Datapump Export
  2. Datapump Import (with TABLE_EXISTS_ACTION=TRUNCATE|APPEND!)
§ Option 2:
  § Streams (→ “near-zero-downtime” possible)
§ Option 3:
  § Own methods (e.g. INSERT INTO tab@newdb SELECT * FROM tab@olddb;)
§ Or
  § Golden Gate, Quest Shareplex etc.

Page 50: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

NLS_LENGTH_SEMANTICS & SQL + PL/SQL

§ Take care with functions

Refers to characters of    Refers to bytes    Refers to Unicode
the input character set                       characters
LENGTH                     LENGTHB            LENGTHC
INSTR                      INSTRB             INSTRC
SUBSTR                     SUBSTRB            SUBSTRC
...

§ Don’t forget the NLS_LENGTH_SEMANTICS in PL/SQL code
§ Additional changes to application code may be required (see Globalization Support Guide for more information)
§ Don’t forget to test the application with the Unicode database!
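The difference between the character-based and the byte-based function variants can be sketched in Python. The four functions below are hypothetical analogues, not Oracle's implementations; they assume a UTF-8 encoded database and do not model the Unicode-specific LENGTHC/INSTRC/SUBSTRC variants.

```python
def length(value: str) -> int:
    """Analogue of LENGTH: number of characters."""
    return len(value)

def lengthb(value: str, codec: str = "utf-8") -> int:
    """Analogue of LENGTHB: number of bytes in the given encoding."""
    return len(value.encode(codec))

def instr(value: str, sub: str) -> int:
    """Analogue of INSTR: 1-based character position, 0 if not found."""
    return value.find(sub) + 1

def instrb(value: str, sub: str, codec: str = "utf-8") -> int:
    """Analogue of INSTRB: 1-based byte position, 0 if not found."""
    return value.encode(codec).find(sub.encode(codec)) + 1

city = "Zürich"
print(length(city), lengthb(city))
print(instr(city, "r"), instrb(city, "r"))
```

Character and byte results agree for pure ASCII text and drift apart as soon as a multi-byte character precedes the position of interest, which is why application code that mixes the two families needs review before a Unicode migration.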

Page 51: Oracle Globalization Support, NLS_LENGTH_SEMANTICS, Unicode

2012 © Trivadis



Recommendations – summarized (1) – Parameter

§ Parameter NLS_LANG
  § Set the character set part of NLS_LANG to the character set of the OS
    - On Windows: check your environment for the ANSI code page and choose the appropriate character set, e.g. WE8MSWIN1252 (see MOS-Note 179133.1 for details)
    - On Unix: check your locale setting (“locale -a”), see MOS-Note 264157.1 for details

§ Parameter NLS_LENGTH_SEMANTICS
  § On instance level: BYTE
  § On column level (in DDL statements): CHAR
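
The NLS_LENGTH_SEMANTICS recommendation might look like this in practice. This is a sketch only: the table demo_table3 is a hypothetical example, and querying v$parameter requires the corresponding privileges.

```sql
-- Instance-level setting (recommended above to stay at BYTE).
SELECT value
FROM   v$parameter
WHERE  name = 'nls_length_semantics';

-- Hypothetical table: state the semantics explicitly for each column.
CREATE TABLE demo_table3 (
  id          NUMBER,
  demo_column VARCHAR2(20 CHAR)
);

-- CHAR_USED shows 'C' for CHAR semantics and 'B' for BYTE semantics.
SELECT column_name, data_length, char_length, char_used
FROM   user_tab_columns
WHERE  table_name = 'DEMO_TABLE3';
```

Writing CHAR explicitly in the DDL keeps the column definition independent of whatever session or instance setting is active when a script is run.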


Recommendations – summarized (2) – DB Character set

§ Database character set
  § Choose a character set which is a superset of the character sets of all clients (see Oracle Globalization Support Guide 11.2, Appendix A, for details)
  § Think about future requirements
  § Oracle recommends using Unicode for all new system deployments (Oracle 11.2 Globalization Support Guide, Chapter 2 “Choosing a character set”)
  § If you have to store Unicode data, choose a Unicode database character set

§ National character set (for NCHAR, NVARCHAR2, NCLOB)
  § As Oracle supports Unicode as a database character set, the “N-datatypes” should not be used (this simplifies application development)
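
To see where an existing database stands with respect to these recommendations, queries along the following lines could be used. This is a sketch: the filter on SYS and SYSTEM is an assumption to hide the most common dictionary owners, and DBA_TAB_COLUMNS requires the corresponding privileges.

```sql
-- Current database and national character set.
SELECT parameter, value
FROM   nls_database_parameters
WHERE  parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');

-- Application columns that use the "N-datatypes".
SELECT owner, table_name, column_name, data_type
FROM   dba_tab_columns
WHERE  data_type IN ('NCHAR', 'NVARCHAR2', 'NCLOB')
AND    owner NOT IN ('SYS', 'SYSTEM')
ORDER  BY owner, table_name, column_name;
```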


Recommendations – summarized (3) – Character set Migration

§ Migrating the character set of a database to Unicode
  § Check your existing data carefully and change data which cannot be converted
  § Recommended: create a new database with character set AL32UTF8 and transfer the data to the new database using e.g. Data Pump
    - → old database remains as a fallback solution
    - → reorganization possible
    - → near-zero-downtime migration possible with Oracle Streams
  § Change the character set of an existing database if you have "changeless" columns only or if disk space is an issue
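
As a rough illustration of "check your existing data carefully", the following query counts values that would no longer fit into their column once converted to AL32UTF8. It is a sketch under assumptions: it is run in the old (non-Unicode) database, and demo_table2.demo_column is assumed to be defined as VARCHAR2(20) with BYTE semantics. It does not replace a full analysis with csscan or the DMU.

```sql
-- Assumption: demo_column is VARCHAR2(20 BYTE) in the old database.
-- CONVERT re-encodes the value from the database character set to
-- AL32UTF8; LENGTHB then counts the bytes of the converted value.
SELECT COUNT(*) AS rows_too_long
FROM   demo_table2
WHERE  LENGTHB(CONVERT(demo_column, 'AL32UTF8')) > 20;
```

Rows reported here would need either a wider column or CHAR semantics in the new database.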


Recommendations – summarized (4) - Tools

§ Csscan/csalter
  § To be desupported after 11.2 (MOS-Note 1418321.1)
  § Fallback solution if DMU cannot be used
  § In most cases only csscan is used (in order to analyze the existing data)

§ Database Migration Assistant for Unicode (DMU)
  § Successor of csscan/csalter
  § Default tool for 12c
  § Patch 9825461 required (included in 11.2.0.3 and higher, but not available for all database versions below 11.2.0.3)




Links & Further Information (1)

§ Oracle Documentation
  § Oracle® Database Globalization Support Guide 11.2

§ Unicode in general
  § http://www.unicode.org/versions/Unicode6.1.0/
  § http://www.fileformat.info/info/unicode/utf8.htm

§ Database Migration Assistant for Unicode
  § http://www.oracle.com/technetwork/products/globalization/dmu/overview/index.html
  § Near-Zero Downtime Migration Model with DMU & Oracle Streams
    - http://www.oracle.com/technetwork/products/globalization/dmu/learnmore/nzd-migration-524223.html

§ Misc
  § Oracle Whitepaper "Character Set Migration – Best Practices"
  § http://www.oracle-base.com/articles/9i/character-semantics-and-globalization-9i.php
  § http://www.oracle-base.com/articles/10g/character-set-migration.php


Links & Further Information (2)

§ My Oracle Support Notes
  § 144808.1 - Examples and limits of BYTE and CHAR semantics usage (NLS_LENGTH_SEMANTICS)
  § 1272374.1 - The Database Migration Assistant for Unicode (DMU) Tool
  § 276914.1 - The National Character Set (NLS_NCHAR_CHARACTERSET) in Oracle 9i, 10g and 11g
  § 225912.1 - Changing the Database Character Set (NLS_CHARACTERSET)
  § 69518.1 - Storing and Checking Character Codepoints in an UTF8/AL32UTF8 (Unicode) database
  § 788156.1 - AL32UTF8 / UTF8 (Unicode) Database Character Set Implications
  § 745809.1 - Installing and configuring Csscan in 10g and 11g (Database Character Set Scanner)
  § 444701.1 - Csscan output explained
  § 158577.1 - NLS_LANG Explained (How does Client-Server Character Conversion Work?)
  § 264157.1 - The correct NLS_LANG setting in Unix Environments
  § 179133.1 - The correct NLS_LANG in a Windows Environment
  § 1522722.1 - Known Issues With The DMU Tool version 1.2


ANY QUESTIONS?


Thank you for your attention

Trivadis GmbH
Markus Flechtner
Werdener Straße 4
40227 Düsseldorf

Tel. +49-211-5866 6470
Fax +49-211-5866 6471

[email protected]
