Oracle Character Sets
Aino Andriessen
Demo1
nls_length_semantics
• Initialization parameter
• CHAR or BYTE (default)
• Applies to multibyte character sets
• Defines the unit (characters or bytes) for the length of character columns and variables
• alter session set nls_length_semantics=CHAR;
• Not retroactive: existing columns keep their length semantics
• PL/SQL may need to be recompiled
• alter system
nls_length_semantics 2
• Specify the length of character columns and variables explicitly
  • create table demo (naam varchar2(4 char));
  • create table demo (naam varchar2(4 byte));
  • t_naam varchar2(4 char);
  • t_naam demo2.naam%TYPE;
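The difference between CHAR and BYTE length semantics can be sketched outside the database. The Python snippet below is only an illustration, not Oracle behaviour itself: the helper `fits` is hypothetical, and UTF-8 is assumed as the database character set.

```python
def fits(value: str, limit: int, semantics: str, charset: str = "utf-8") -> bool:
    """Hypothetical helper: does value fit a column of `limit` units?"""
    if semantics == "CHAR":
        # CHAR semantics: the limit counts characters.
        return len(value) <= limit
    if semantics == "BYTE":
        # BYTE semantics: the limit counts bytes in the given character set.
        return len(value.encode(charset)) <= limit
    raise ValueError("semantics must be 'CHAR' or 'BYTE'")


name = "Ren\u00e9"  # four characters, the last one is e with acute

print(len(name))                   # characters
print(len(name.encode("utf-8")))   # bytes in UTF-8
print(fits(name, 4, "CHAR"))       # fits a 4-character column
print(fits(name, 4, "BYTE"))       # does not fit a 4-byte column in UTF-8
```

In a single-byte character set such as CP-1252 the same value needs only four bytes, which is why the setting mainly matters for multibyte character sets.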
Demo2
Character encoding
Character set
• A character set defines the mapping between a binary/hexadecimal code and a character
  • UTF8
  • WE8MSWIN1252
  • WE8ISO8859P1
  • JA16EUC
  • US7ASCII
  • WE8DEC
  • ...
• Code pages
  • IBM / Windows terminology
  • roughly analogous to a character set
  • one code page per language
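A small Python sketch of what "mapping" means here: the same byte value is a different character depending on the character set used to interpret it. The Python codec names are assumed to correspond to WE8ISO8859P1 and WE8ISO8859P15.

```python
import unicodedata

code = bytes([0xA4])  # one byte, hexadecimal A4


def describe(raw: bytes, charset: str) -> str:
    """Decode raw with charset and return the code point plus its Unicode name."""
    char = raw.decode(charset)
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


for charset in ("iso8859-1", "iso8859-15"):
    print(f"{charset:11} 0x{code.hex().upper()} -> {describe(code, charset)}")
```

The byte itself never changes; only the interpretation does.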
Character sets 2
• ASCII
  • 1 byte (7 bits used)
  • 128 characters
  • standard English letters, without accents
• ISO 8859 and Latin-1
  • 1 byte (8 bits)
  • 256 characters
• CP-1252
  • Windows variant of Latin-1
• UTF8
  • variable width, multibyte, at most 4 bytes
  • ~100,000 characters
    • ~1 million available
  • multilingual
  • ASCII codes are identical
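These properties can be checked with a short Python sketch. It uses Python's codecs as stand-ins for the character sets named above; `can_encode` is a hypothetical helper.

```python
ascii_text = "Oracle"
encodings = ("ascii", "latin-1", "cp1252", "utf-8")

# Plain ASCII text yields the same bytes in all four encodings.
encoded = {name: ascii_text.encode(name) for name in encodings}
same_everywhere = len(set(encoded.values())) == 1
print(same_everywhere)


def can_encode(char: str, charset: str) -> bool:
    """Hypothetical helper: True if charset has a code for char."""
    try:
        char.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


e_acute = "\u00e9"
print(can_encode(e_acute, "ascii"))    # no accented letters in ASCII
print(len(e_acute.encode("latin-1")))  # one byte in a single-byte set
print(len(e_acute.encode("utf-8")))    # two bytes in UTF-8

# The highest Unicode code point still fits in four UTF-8 bytes.
widest = chr(0x10FFFF).encode("utf-8")
print(len(widest))
```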
Examples
Character set     Hexadecimal code: euro sign (€)
AL32UTF8          E282AC
WE8MSWIN1252      80
ASCII             -
WE8ISO8859P1      -
WE8ISO8859P15     A4 (164)

Character set     Hexadecimal code: é
AL32UTF8          C3A9 (50089)
WE8MSWIN1252      E9 (233)
ASCII             -
WE8ISO8859P1      E9 (233)
WE8ISO8859P15     E9 (233)
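The values in these tables can be reproduced with a few lines of Python. The mapping from Oracle character set names to Python codec names is an assumption made for this sketch, and `hex_code` is a hypothetical helper. Note that the ISO 8859-15 code for the euro sign is A4 in hexadecimal, which is 164 in decimal.

```python
# Oracle name -> Python codec (assumed equivalents, for illustration only)
CHARSETS = {
    "AL32UTF8": "utf-8",
    "WE8MSWIN1252": "cp1252",
    "ASCII": "ascii",
    "WE8ISO8859P1": "iso8859-1",
    "WE8ISO8859P15": "iso8859-15",
}


def hex_code(char: str, codec: str) -> str:
    """Return the hexadecimal code of char in codec, or '-' if it has none."""
    try:
        return char.encode(codec).hex().upper()
    except UnicodeEncodeError:
        return "-"


for char in ("\u20ac", "\u00e9"):  # euro sign, e with acute
    for oracle_name, codec in CHARSETS.items():
        print(f"U+{ord(char):04X}  {oracle_name:14} {hex_code(char, codec)}")

# The decimal values shown in parentheses are the same codes read as numbers.
print(int("C3A9", 16), int("E9", 16), int("A4", 16))
```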
Unicode / UTF-8 example
The image shows the number of bytes needed to store different kinds of characters in the UTF-8 character set. The ASCII characters (C, t, and d) require one byte. The Latin and Greek characters (á, ö, and Ø) require 2 bytes. The Asian character requires 3 bytes. The supplementary character (treble clef sign) requires 4 bytes of storage.
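The byte counts described for the image can be reproduced in Python. The image itself is not available here, so the Asian character used below (U+65E5) is an assumed stand-in, not necessarily the one shown.

```python
samples = [
    ("C", "ASCII"),
    ("\u00e1", "Latin, a with acute"),
    ("\u00f6", "Latin, o with diaeresis"),
    ("\u65e5", "Asian character (assumed stand-in)"),
    ("\U0001d11e", "supplementary character, treble clef"),
]

# Number of bytes each character occupies when encoded as UTF-8.
byte_counts = {char: len(char.encode("utf-8")) for char, _ in samples}

for char, label in samples:
    print(f"U+{ord(char):04X}  {byte_counts[char]} byte(s)  {label}")
```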
Diacritics and special characters
• Diacritics are marks placed on (above, below, or even through) a letter to change its pronunciation, giving the modified letter the sounds specific to a language.
  • àÿęňĜş etc.
• Special characters
  • ßæ¿
Diacritics and special characters
• Single-byte character sets
  • 1 byte for the composed character
  • not all combinations are possible
  • code pages
• UTF-8
  • the diacritic has its own encoding
  • the composed character has its own encoding
    • usually equivalent to the original character followed by the diacritic
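Unicode normalization makes the relation between a composed character and "base character + diacritic" visible. This is a Python sketch using the standard `unicodedata` module; it illustrates the Unicode side only and says nothing about how Oracle stores or compares such values.

```python
import unicodedata

composed = "\u00e9"  # e with acute as a single code point
decomposed = unicodedata.normalize("NFD", composed)

# Decomposed form: the base letter followed by a combining acute accent.
print([f"U+{ord(c):04X}" for c in decomposed])

# The two forms look the same on screen but are different byte sequences.
print(composed.encode("utf-8").hex(), decomposed.encode("utf-8").hex())

# Recomposing gives the single code point back.
print(unicodedata.normalize("NFC", decomposed) == composed)

# Not every composed-looking letter decomposes: o with stroke stays one code point.
o_stroke = unicodedata.normalize("NFD", "\u00f8")
print(len(o_stroke))
```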
Database functions
• Character functions
  • substr - substrb - substrc - substr2
  • instr - ...
  • length - lengthb
• chr(n)
  • Returns the character corresponding to the number passed as the argument, in the database character set
  • select chr(50089) from dual;
• dump
  • Returns a VARCHAR2 value containing the datatype code, length in bytes, and internal representation of expr. The returned result is always in the database character set.
  • select dump(naam, 1017) from demo2;
• convert
  • Converts a character string from one character set to another
• utl_raw
  • select utl_raw.cast_to_raw(naam) from demo2;
• unistr()
  • Converts the characters in its argument, including \xxxx Unicode escapes, to the national character set
  • select unistr('Ren\00e9') from dual;
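To get a feel for what these functions report, the Python sketch below imitates a few of them. It assumes an AL32UTF8 (UTF-8) database character set; `chr_db`, `dump_hex` and `unistr` are hypothetical helpers that only approximate the Oracle functions, and their output format is not Oracle's.

```python
import re

DB_CHARSET = "utf-8"  # assumption: the database character set is AL32UTF8
name = "Ren\u00e9"

# length counts characters, lengthb counts bytes.
length = len(name)
lengthb = len(name.encode(DB_CHARSET))


def chr_db(n: int, charset: str = DB_CHARSET) -> str:
    """Rough analogue of chr(n): treat n as the byte value(s) of one character."""
    size = max(1, (n.bit_length() + 7) // 8)
    return n.to_bytes(size, "big").decode(charset)


def dump_hex(value: str, charset: str = DB_CHARSET) -> str:
    """Loosely modeled on dump(): byte length plus the bytes in hex."""
    raw = value.encode(charset)
    return f"Len={len(raw)}: " + ",".join(f"{b:02x}" for b in raw)


def unistr(value: str) -> str:
    """Rough analogue of unistr(): expand \\xxxx escapes into characters."""
    return re.sub(r"\\([0-9A-Fa-f]{4})", lambda m: chr(int(m.group(1), 16)), value)


print(length, lengthb)
print(ascii(chr_db(50089)))          # 50089 is hexadecimal C3A9
print(dump_hex(name))
print(unistr(r"Ren\00e9") == name)
```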
Demo3
nls_lang
• Client character set
• When the client NLS_LANG character set is set to the same value as the database character set, Oracle assumes that the data being sent or received is already in the same (correct) encoding, so for performance reasons no conversion or validation takes place. The data is stored exactly as delivered by the client, bit by bit.
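The consequence of this pass-through can be simulated in Python. The scenario is an assumption made for illustration: the database is UTF-8, and a client that really works in CP-1252 has its NLS_LANG character set set to the database value, so nothing is converted.

```python
DB_CHARSET = "utf-8"            # assumed database character set
CLIENT_REALLY_USES = "cp1252"   # what the client terminal actually produces

typed = "Ren\u00e9"

# NLS_LANG claims the database character set: bytes are stored as delivered.
stored = typed.encode(CLIENT_REALLY_USES)
print(stored.hex())

# The same misconfigured client reads the bytes back and sees nothing wrong.
print(ascii(stored.decode(CLIENT_REALLY_USES)))

# A correctly configured client interprets the bytes as UTF-8: invalid data.
try:
    stored.decode(DB_CHARSET)
    valid_in_db_charset = True
except UnicodeDecodeError:
    valid_in_db_charset = False
print(valid_in_db_charset)

# With a truthful client setting, conversion would have produced valid UTF-8.
converted = typed.encode(CLIENT_REALLY_USES).decode(CLIENT_REALLY_USES).encode(DB_CHARSET)
print(converted.hex())
```

The data only looks correct as long as every client shares the same wrong setting.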
nls_lang 2
• Format: language_territory.characterset
  • american_america.UTF8
  • dutch_the netherlands.WE8MSWIN1252
  • american_THE NETHERLANDS.WE8MSWIN1252
• Environment variable, nls_lang
• The Windows GUI (WE8MSWIN1252) and the command line (WE8PC850) use different character sets
• Not used by Java clients
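Why the GUI/command-line difference matters can be shown with Python's `cp1252` and `cp850` codecs, assumed here to correspond to WE8MSWIN1252 and WE8PC850: the same letter has a different byte value in each, so bytes typed in one environment are misread when labelled as the other.

```python
import unicodedata

e_acute = "\u00e9"

gui_bytes = e_acute.encode("cp1252")     # Windows GUI code page
console_bytes = e_acute.encode("cp850")  # Windows command-line code page
print(gui_bytes.hex().upper(), console_bytes.hex().upper())

# Bytes typed at the command line but declared as WE8MSWIN1252 are misread.
misread = console_bytes.decode("cp1252")
print(f"U+{ord(misread):04X} {unicodedata.name(misread)}")
```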
Demo4
National character set
• Support for a second character set alongside the database character set
  • e.g. to allow Japanese in a WE8MSWIN1252 or ISO 8859 database
• Less necessary in a UTF8 database
• Multibyte
• nvarchar2, nclob, etc.
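A Python sketch of the motivation: Japanese text has no codes in the Western European single-byte sets, while a Unicode encoding can hold it. UTF-16 (big-endian) is used below on the assumption that the national character set is AL16UTF16; `encodable` is a hypothetical helper.

```python
japanese = "\u65e5\u672c\u8a9e"  # three Japanese characters


def encodable(text: str, codec: str) -> bool:
    """Hypothetical helper: True if every character of text has a code in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


print(encodable(japanese, "cp1252"))     # Western European Windows code page
print(encodable(japanese, "iso8859-1"))  # Latin-1

# In UTF-16 each of these characters takes two bytes, in UTF-8 three.
nchar_bytes = japanese.encode("utf-16-be")
print(len(nchar_bytes), len(japanese.encode("utf-8")))
```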
Case
• The TELETEX character set no longer exists in Oracle
• select convert(naam,'TELETEX','UTF8') from tabel;
• Locale Builder
sql> select name from emp;
sql> select utl_raw.cast_to_varchar2(utl_raw.cast_to_raw(name)) from emp@db;
sql> select utl_raw.cast_to_varchar2(utl_raw.cast_to_raw@db(name)) from emp@db;
sql> select name from emp@db;
Question
• Diacritic-insensitive search
• Case-insensitive search
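One possible approach to both questions, sketched in Python rather than in the database: decompose the text, drop the combining marks, and fold case before comparing. The helper `fold` and the sample names are hypothetical. Inside Oracle, linguistic settings such as NLS_COMP=LINGUISTIC combined with NLS_SORT=BINARY_AI or BINARY_CI address the same question.

```python
import unicodedata


def fold(text: str) -> str:
    """Hypothetical helper: strip diacritics and ignore case."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


# Sample data: composed accent, upper case, decomposed accent, unrelated name.
names = ["Ren\u00e9", "RENE", "rene\u0301", "Renate"]

matches = [n for n in names if fold(n) == fold("rene")]
print([ascii(n) for n in matches])
```

Letters that carry no decomposition, such as ø, are not reduced to a base letter by this approach.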
Summary
• nls_length_semantics
• Always explicitly define a character column with its length type (CHAR or BYTE)
• Oracle performs automatic character set conversion
• WYSINAWYG (what you see is not always what you get)
• Use a Java client
• Working with character sets can be confusing
• UTF8 is often the preferred character set