Location Terminologies

Taxonomy Strategies LLC. ASIS&T Annual Meeting, Austin, TX, November 7, 2006. Copyright 2006 Taxonomy Strategies LLC. All rights reserved.


Page 1: Location Terminologies

Taxonomy Strategies LLC

November 7, 2006 Copyright 2006 Taxonomy Strategies LLC. All rights reserved.

Location Terminologies

ASIS&T Annual Meeting

Austin, TX

November 7, 2006

Page 2: Location Terminologies

Taxonomy Strategies LLC: The business of organized information

Agenda

Who we are

Overview

Using ISO 3166

Accommodating special needs

Page 3: Location Terminologies


Who we are: Ron Daniel, Jr.

Over 15 years in the business of metadata & automatic classification:

Principal, Taxonomy Strategies

Standards Architect, Interwoven

Senior Information Scientist, Metacode Technologies (acquired by Interwoven, November 2000)

Technical Staff Member, Los Alamos National Laboratory

Doctoral and post-doctoral research in pattern recognition

Metadata and taxonomies community leadership:

Chair, PRISM (Publishers Requirements for Industry Standard Metadata) working group

Acting chair, XML Linking working group

Member, RDF working groups

Co-editor, PRISM, XPointer, 3 IETF RFCs, and Dublin Core 1 & 2 reports.

Page 4: Location Terminologies


Recent & current projects

Page 5: Location Terminologies


Agenda

Who we are

Overview

Using ISO 3166

Accommodating special needs

Page 6: Location Terminologies


8 Common Taxonomy Facets

Facet | Definition | Potential Sources

Organization | Organizational structure. | FIPS 95-2, U.S. Government Manual, Your organizational structure, etc.

Content Type | Structured list of the various types of content being managed or used. | DC Types, AGLS Document Type, AAT Information Forms, Your records management policy, etc.

Industry | Broad market categories such as lines of business, life events, or industry codes. | FIPS 66, SIC, NAICS, Your market segments, etc.

Location | Place of operations or constituencies. | FIPS 5-2, FIPS 55-3, ISO 3166, UN Statistics Div, US Postal Service, Your sales regions, etc.

Function | Functions and processes performed to accomplish mission and goals. | FEA Business Reference Model, Enterprise Ontology, AAT Functions, Your business functions, etc.

Topic | Business topics relevant to your mission & goals. | Federal Register Thesaurus, NAL Agricultural Thesaurus, LCSH, Your research areas, etc.

Audience | Subset of constituents to whom a piece of content is directed or intended to be used. | GEM, ERIC Thesaurus, IEEE LOM, Your psycho-graphics or personas, etc.

Products & Services | Names of products/programs & services. | ERP system, UNSPSC, Your products and services, etc.

Page 7: Location Terminologies


Potential facets in the petroleum industry

[Diagram: facets grouped as "Community Standard", "Should be part of community standard", or "Company Facets", and marked as strongly or moderately related to location. Facets shown: E&P Lifecycle; Hydrocarbon System; Geologic Age; Process Mgmt; Lease Mgmt; Orgs.; Basins, Reservoirs & Fields; Facilities; Wells; Disciplines; Maint.; Reserves; Human Resources; Content Types; Production; Locations; Company Org.]

Page 8: Location Terminologies


Location names serve as surrogates for other things

Company divisions

Company facilities

Regulatory regimes

Currency regions

Product marketing areas

Sales territories

Customer locations

Page 9: Location Terminologies


What is a good taxonomy?

A means to an end, and not the end in itself.

Not perfect, but it does the job it is supposed to do—such as improving search and navigation.

Improved over time, and maintained.

Incremental, extensible process that identifies and enables owners, and engages stakeholders.

Quick implementation that provides measurable results as quickly as possible.

Not monolithic—has separately maintainable facets.

Re-uses existing IP as much as possible.

Page 10: Location Terminologies


Location names are used for different purposes

Typical correspondence and shipping: “Libya”, “South Korea”

Official correspondence with government ministers: “Great Socialist People's Libyan Arab Jamahiriya”, “Republic of Korea”

Corporate division of responsibility: “Western Region” (does that include Montana?)

Page 11: Location Terminologies


Location terminologies may be used to organize different collections of information

[Diagram: facets for an example site, ABC Computers.com, with the values of each facet.]

Audience: All, Business, Employee, Education, Gaming Enthusiast, Home, Investor, Job Seeker, Media, Partner, Shopper, First Time, Experienced, Advanced, Supplier

Line of Business: All, Home & Home Office, Gaming, Government, Education & Healthcare, Medium & Large Business, Small Business

Region-Country: All, Asia-Pacific, Canada, EMEA, Japan, Latin America & Caribbean, United States

Product Family: Desktops, MP3 Players, Monitors, Networking, Notebooks, Printers, Projectors, Servers, Services, Storage, Televisions, Other Brands

Content Type: Award, Case Study, Contract & Warranty, Demo, Magazine, News & Event, Product Information, Services, Solution, Specification, Technical Note, Tool, Training, White Paper, Other Content Type

Competency: Business & Finance, Interpersonal Development, IT Professionals Technical Training, IT Professionals Training & Certification, PC Productivity, Personal Computing Proficiency

Industry: Banking & Finance, Communications, E-Business, Education, Government, Healthcare, Hospitality, Manufacturing, Petrochemicals, Retail / Wholesale, Technology, Transportation, Other Industries

Service: Assessment, Design & Implementation; Deployment; Enterprise Support; Client Support; Managed Lifecycle; Asset Recovery & Recycling; Training

Page 12: Location Terminologies


Location terminologies may be used to limit search results

[Screenshot: search results limited by Category, Company, City, State, and Salary.]

Page 13: Location Terminologies


Problems with location vocabularies

Placenames change over time

Codes may be reused over time

Familiarity leads to proliferation: many versions of pseudo-standard lists; guessing what the standard will become (e.g. KOS as a code for Kosovo)

Approximate alignment between placenames and business functions leads to errors when mapping data from one purpose to another: geopolitical names get applied to sales territories with different company history and importance (e.g. Japan vs. Asia-Pac)

Natural messiness of human affairs: States vs. Provinces vs. Protectorates, Territories, Possessions, Tribal territories,…; disputed territories (Palestine, Kashmir, Taiwan, Kurdistan); proto-states (Kosovo, Somaliland)

Complexity tradeoff in software: very few invariant properties of countries and their groupings

Passions: boycotts and death threats have been received by people who do or do not list particular places in their lists of ‘countries’

Page 14: Location Terminologies


Agenda

Who we are

Overview

Using ISO 3166

Accommodating special needs

Page 15: Location Terminologies


ISO 3166 is a fundamental vocabulary for dealing with locations

UPS maintains a central World Wide Code Repository (WWCR) to store the metadata used throughout the corporation. It is based on the data identified in the enterprise data models.

They also have a Corporate Code Table Database, populated via extract files from the WWCR.

These tables contain the complete list of standardized corporate code values for each code type.

Country codes are ISO 3166-1, with local extensions obeying ISO restrictions.

The data modeler for the Corporate Code Table Database is the primary contact from UPS to ISO and the UN with respect to codes for countries.

Source: Barbara LaRobardier, “Taxonomy and Metadata at United Parcel Service (UPS): World Wide Code Repository and Corporate Code Tables”; Semantic Technologies Conference, San Francisco, 2005.

Page 16: Location Terminologies


ISO 3166 is the world’s most widely-used list of country names

Country or area name | numeric-3 | alpha-2 | alpha-3

Afghanistan | 004 | AF | AFG

Åland Islands | 248 | AX | ALA

Albania | 008 | AL | ALB

Algeria | 012 | DZ | DZA

American Samoa | 016 | AS | ASM

Andorra | 020 | AD | AND

…

Zimbabwe | 716 | ZW | ZWE

3166 is divided into 3 lists:

3166-1: Countries

3166-2: Sub-regions

3166-3: Changes

The lists contain three different codes for the same places: alpha-2, alpha-3, and numeric-3

The source for the list is the UN Statistics Division
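Because each place carries three interchangeable codes, a lookup keyed on any of them is a common first building block. The following is a minimal sketch, assuming only the sample rows shown on this slide; the names `Country`, `ROWS`, and `find_country` are illustrative and not part of ISO 3166 or any library.

```python
from typing import NamedTuple, Optional


class Country(NamedTuple):
    name: str
    numeric3: str  # stored as text so leading zeros such as "004" survive
    alpha2: str
    alpha3: str


# Sample rows copied from the slide; a real system would load the full list.
ROWS = [
    Country("Afghanistan", "004", "AF", "AFG"),
    Country("Åland Islands", "248", "AX", "ALA"),
    Country("Albania", "008", "AL", "ALB"),
    Country("Algeria", "012", "DZ", "DZA"),
    Country("American Samoa", "016", "AS", "ASM"),
    Country("Andorra", "020", "AD", "AND"),
    Country("Zimbabwe", "716", "ZW", "ZWE"),
]

# One index covers all three code types: two letters, three letters, and
# three digits can never collide with each other.
INDEX = {}
for row in ROWS:
    for code in (row.numeric3, row.alpha2, row.alpha3):
        INDEX[code] = row


def find_country(code: str) -> Optional[Country]:
    """Return the row for an alpha-2, alpha-3, or numeric-3 code, or None."""
    return INDEX.get(code.strip().upper())
```

Keeping the numeric code as a string is a deliberate choice here: converting "004" to an integer would lose the leading zeros that are part of the code.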

Page 17: Location Terminologies


ISO 3166 codes change, and are even re-assigned!

Country | alpha-2 | Assigned | Removed

CZECHOSLOVAKIA | CS | 1974* | 1993

SERBIA AND MONTENEGRO | CS | 2003-07-23 | 2006

SERBIA | RS | 2006-09-26 | current

MONTENEGRO | ME | 2006-09-26 | current

* ISO 3166 first published in 1974. Czechoslovakia dates from 1918.
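Since a code such as CS can mean different countries at different times, a code alone is not a stable key; it has to be read together with a date. The sketch below is one possible way to model that, using the four rows in the table. Where the table gives only a year, the date is padded to January 1 of that year, which is an assumption made for illustration. The names `Assignment` and `country_for` are hypothetical.

```python
from datetime import date
from typing import NamedTuple, Optional


class Assignment(NamedTuple):
    alpha2: str
    country: str
    assigned: date
    removed: Optional[date]  # None means the code is still current


# Rows from the table. Year-only values (1974, 1993, 2006) are padded to
# January 1 as a placeholder; the slide does not give the exact day.
ASSIGNMENTS = [
    Assignment("CS", "CZECHOSLOVAKIA", date(1974, 1, 1), date(1993, 1, 1)),
    Assignment("CS", "SERBIA AND MONTENEGRO", date(2003, 7, 23), date(2006, 1, 1)),
    Assignment("RS", "SERBIA", date(2006, 9, 26), None),
    Assignment("ME", "MONTENEGRO", date(2006, 9, 26), None),
]


def country_for(alpha2: str, on: date) -> Optional[str]:
    """Return the country a code referred to on a given date, if any."""
    for a in ASSIGNMENTS:
        if a.alpha2 != alpha2.upper():
            continue
        if a.assigned <= on and (a.removed is None or on < a.removed):
            return a.country
    return None
```

With this shape, a record stamped 1980 and coded CS resolves to Czechoslovakia, while the same code on a 2004 record resolves to Serbia and Montenegro, and a 1999 record coded CS resolves to nothing at all.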

Page 18: Location Terminologies


What is the code for Kosovo?

No code currently exists for Kosovo, but “KS” is unassigned. Should we use it in the expectation that eventually it will be assigned?

No.

To quote from ISO 3166-1:1997, clause 8.1.3 User-assigned code elements:

"If users need code elements to represent country names not included in this part of ISO 3166, the series of letters AA, QM to QZ, XA to XZ, and ZZ, and the series AAA to AAZ, QMA to QZZ, XAA to XZZ, and ZZA to ZZZ respectively and the series of numbers 900 to 999 are available."

Page 19: Location Terminologies


There are many categories of ISO 3166-1 alpha-2 codes

[Decoding table: a 26 × 26 grid of every two-letter combination from AA to ZZ, with each cell color-coded by the categories listed below.]

Source: http://www.iso.org/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/iso_3166-1_decoding_table.html#AW

Category | Meaning

Officially assigned code element | Code element may be used without restriction

User-assigned code element | Code element may be used without restriction

Exceptionally reserved code element | Code element may be used but restrictions may apply

Transitionally reserved code element | Code element deleted from ISO 3166-1; stop using ASAP

Indeterminately reserved code element | Code element must not be used in ISO 3166-1

Code elements not used at present stage | Code element must not be used in ISO 3166-1

Un-assigned code elements | Code element free for assignment (by ISO 3166/MA only!)

The user-assigned code elements are reserved for local extensions. Use them when you need a new code!
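The user-assigned ranges quoted from clause 8.1.3 are simple enough to check mechanically before a local extension is accepted into a code table. The following is a rough sketch of such a check; the function name `is_user_assigned` is illustrative, and the ranges are taken from the quotation earlier in this deck.

```python
def is_user_assigned(code: str) -> bool:
    """True if the code falls in a user-assigned range of ISO 3166-1.

    Ranges, per the clause quoted in this deck:
      alpha-2:   AA, QM to QZ, XA to XZ, ZZ
      alpha-3:   AAA to AAZ, QMA to QZZ, XAA to XZZ, ZZA to ZZZ
      numeric-3: 900 to 999
    """
    c = code.strip().upper()
    if not c.isascii():
        return False
    if c.isdigit():
        return len(c) == 3 and 900 <= int(c) <= 999
    if not c.isalpha() or len(c) not in (2, 3):
        return False
    # For both alpha-2 and alpha-3 the first two letters decide the answer;
    # alpha-2 "AA" and "ZZ" are the whole code, alpha-3 allows any third letter.
    first, second = c[0], c[1]
    return (
        c[:2] in ("AA", "ZZ")
        or (first == "Q" and "M" <= second <= "Z")
        or first == "X"
    )
```

Under this check the tempting but unassigned "KS" is rejected, while a code drawn from the X range (a hypothetical "XK", say) is accepted as a legitimate local extension.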

Page 20: Location Terminologies


Agenda

Who we are

Overview

Using ISO 3166

Accommodating special needs

Page 21: Location Terminologies


Usual and unusual requirements for handling country names

One client needed to maintain multiple country lists:

ISO 3166 was used in most systems

A separate editorial style list was maintained for correspondence and reports

Still other lists were used for statistical information on country subdivisions and multi-country regions

The organization maintained a variety of historical information on countries and regions:

Effective dates for codes were needed (note: dates were for codes within a system, not for the countries)

Mappings from old countries to successors were also needed

Country | Alpha-2 | Start Date | End Date

Bosnia and Herzegovina | | 1992 |

Czech Republic | CZ | 1993-06-15 |

Czechoslovakia | CS | 1974 | 1993-06-15

Yugoslavia | YU | 1974 | 2003

USSR | | 1974 | 1992-08-30

Zaire | ZR | 1974 | 1997-07-14

Congo, Dem. Rep. of | CD | 1997-07-14 |

3166 Short Name | Redbook Country Name | Redbook Full Form | Redbook Short Form | STA Code

Afghanistan | Afghanistan, Islamic State of | | Afghanistan, I.S. of | 512

Åland Islands | not in Redbook | | |

Albania | Albania | | | 914

Aruba | Aruba | Kingdom of the Netherlands-Aruba | | 314

… | … | … | … | …
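A crosswalk like the one above has to distinguish two kinds of "no answer": a name that is known but has no counterpart in the other list (Åland Islands), and a name that is not in the crosswalk at all. The sketch below illustrates one way to keep those apart, using only the four rows shown; `STA_BY_ISO_NAME` and `sta_code` are hypothetical names, and `None` is used here to mean "listed, but no STA code given".

```python
from typing import Optional

# ISO 3166 short name -> STA code, copied from the sample rows.
# None records a name that exists in ISO 3166 but has no Redbook/STA entry.
STA_BY_ISO_NAME = {
    "Afghanistan": "512",
    "Åland Islands": None,
    "Albania": "914",
    "Aruba": "314",
}


def sta_code(iso_short_name: str) -> Optional[str]:
    """Return the STA code, or None when the place has no STA entry.

    Raises KeyError for names missing from the crosswalk, so that gaps in
    the mapping are not silently confused with places that have no code.
    """
    if iso_short_name not in STA_BY_ISO_NAME:
        raise KeyError(f"{iso_short_name!r} is not in the crosswalk")
    return STA_BY_ISO_NAME[iso_short_name]
```

Raising on unknown names is a design choice for this sketch: it forces a custodian to decide how a new entity should be mapped instead of letting it pass through as blank.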

Page 22: Location Terminologies


Problems when mapping between location terminologies

ISO Code | ISO Official Short Name | ISO Full Name | Redbook Country Name | Redbook Full Form | STA Name (60 chars) | Issues

(none) | | | | | | Missing entities not listed in any of the recommended country lists (e.g. The Azores, Kosovo).

CIV | CÔTE D'IVOIRE | Republic of Côte d'Ivoire | Côte d’Ivoire | - | Côte d'Ivoire | Use of accents in country names.

BIH | BOSNIA AND HERZEGOVINA | - | Bosnia and Herzegovina | - | Bosnia & Herzegovina | Inconsistent use of conjunctions and special characters ('and' or ampersand ‘&’).

TLS | TIMOR-LESTE | Democratic Republic of Timor-Leste | Timor-Leste | Democratic Republic of Timor-Leste | Timor-Leste | Direct order of official country name does not alphabetize where users expect to find it.

HKG | HONG KONG | Hong Kong Special Administrative Region of China | China,P.R.: Hong Kong | China,P.R.: Hong Kong SRA | China,P.R.: Hong Kong | Variation between ISO and company practices.

MKD | MACEDONIA, THE FORMER YUGOSLAV REPUBLIC OF | The former Yugoslav Republic of Macedonia | Macedonia, former Yugoslav Republic of | - | Macedonia, FYR | Long names are more frequently abbreviated.

PSE | PALESTINIAN TERRITORY, OCCUPIED | Occupied Palestinian Territory | West Bank and Gaza | - | West Bank and Gaza | Unclear what the correct form of name is. Note: Redbook name is from front matter, not table.

KNA | SAINT KITTS AND NEVIS | - | St. Kitts and Nevis | - | St. Kitts and Nevis | ISO spells out “Saint” but company uses abbreviation.

VNM | VIET NAM | Socialist Republic of Viet Nam | Vietnam | - | Vietnam | Spelling and name order variations between ISO and company.
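Several of the issues in this table (accents, 'and' versus '&', "Saint" versus "St.", upper versus mixed case) are mechanical and can be smoothed over with a comparison key before names are matched. The following is a minimal sketch of that idea, not a complete solution; `name_key` is a hypothetical helper and the rules it applies are assumptions chosen to fit the rows above.

```python
import re
import unicodedata


def name_key(name: str) -> str:
    """Build a comparison key that ignores accents, case, '&' vs 'and',
    'St.' vs 'Saint', and punctuation. Use it for matching only; keep the
    original spelling for display."""
    # Decompose accented letters and drop the combining marks (Ô -> O).
    decomposed = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    text = text.casefold()
    text = text.replace("&", " and ")
    # Expand a leading-word "st" or "st." to "saint" when followed by a space.
    text = re.sub(r"\bst\.?(?=\s)", "saint", text)
    # Treat apostrophes, commas, hyphens, and similar marks as spaces.
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return " ".join(text.split())
```

A key like this matches "CÔTE D'IVOIRE" with "Côte d’Ivoire" and "Bosnia & Herzegovina" with "BOSNIA AND HERZEGOVINA", but it does not reconcile "VIET NAM" with "Vietnam" or inverted forms such as "Macedonia, FYR". Those still need an explicit synonym table maintained by a person.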

Page 23: Location Terminologies


Enterprise taxonomy governance environment

[Diagram: Taxonomy Governance Environment. Components shown: source vocabularies (ISO 3166-1, Other External, ERP, Other Internal, Other Controlled Items); a Vocabulary Management System holding the CVs; Custodians, with Change Requests & Responses and Notifications; Published Facets; and Consuming Applications (Intranet Search, Intranet Nav., Web CMS, DAM, Archives, ERMS).]

1: External vocabularies change on their own schedule, with some advance notice.

2: Team decides when to update facets within Taxonomy

3: Team adds value via mappings, translations, synonyms, training materials, etc.

4: Updated versions of facets published to consuming applications

CV (Controlled Vocabulary) – The list of values for one facet in the Taxonomy.

Page 24: Location Terminologies


The client defined a process for country vocabulary changes

The different vocabularies had different processes.

Custodians of the different vocabularies communicate so that if one changes, the others know about it.

[Flowchart: change request process. Steps and decisions shown: Submit Change Request; Marked as REQUESTED; Review Request by Rejection Criteria (Wrong CV? Violates Criteria?); Delegate to Other Custodian; Inform Requester; Mark as IN-PROCESS; Fast-track from ED?; SEC drafts circular, sends to ED; ED approval?; Send to Board; Notify Board; Update CV and Mapping; Mark as PROVISIONAL; Mark as APPROVED; Marked as DENIED; Updates Published; Exit.]

Legend (roles and tools): R – Requester; C – Custodian; ED – Exec. Dir.; V – VM System; E – Email from VM System; F – Forms Interface; O – Other (Phone, Fax, etc.)
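The status labels in this process (REQUESTED, IN-PROCESS, PROVISIONAL, APPROVED, DENIED) lend themselves to a small state machine inside a vocabulary management system. The sketch below shows the general shape only. The status names come from the slide, but the allowed transitions are an assumption made for illustration, since the exact arrows of the flowchart are not recoverable from this transcript. `Status`, `ALLOWED`, and `advance` are hypothetical names.

```python
from enum import Enum


class Status(Enum):
    REQUESTED = "REQUESTED"
    IN_PROCESS = "IN-PROCESS"
    PROVISIONAL = "PROVISIONAL"
    APPROVED = "APPROVED"
    DENIED = "DENIED"


# Assumed transitions (not taken from the flowchart): a request moves
# forward one stage at a time and can be denied at any stage before approval.
ALLOWED = {
    Status.REQUESTED: {Status.IN_PROCESS, Status.DENIED},
    Status.IN_PROCESS: {Status.PROVISIONAL, Status.DENIED},
    Status.PROVISIONAL: {Status.APPROVED, Status.DENIED},
    Status.APPROVED: set(),  # terminal
    Status.DENIED: set(),    # terminal
}


def advance(current: Status, new: Status) -> Status:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"{current.value} -> {new.value} is not allowed")
    return new
```

Encoding the transitions as data keeps the process auditable: when the custodians change the workflow, only the `ALLOWED` table needs to be updated.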

Page 25: Location Terminologies


Conclusion

Location terminologies are commonly used. They fulfill many different purposes.

Keeping up-to-date is an ongoing effort. The rate of change is low but steady. Codes will be reassigned at times, so plan for it.

The issues can be complex. Anything out of the ordinary will not be well-served by off-the-shelf software.

Most organizations have a proliferation of pseudo-3166 vocabularies. Start there to get things under control.

Page 26: Location Terminologies


Questions?

Ron Daniel

925-368-8371

[email protected]