The 2010 Secretary’s Annual Report on ISO/TC37/SC4 “Language resource management”
2010-08-15
http://www.tc37sc4.org/
Contents
Membership
Working Groups
Thematic Domain Groups
Task Forces
Project Items
On-going Ballots
Meetings
Issues and Proposals
Membership
Organization
P-members (23 -> 24)
O-members (9 -> 8)
Liaisons (12 -> 11)
Key contact persons
Organization
Chairperson Body: AFNOR (France) Name: Romary, Laurent
Secretary Body: KATS (Korea) Name: Choi, Key-Sun
P-members (24)
1. AENOR (Spain)
2. AFNOR (France)
3. ANSI (USA)
4. ASI (Austria)
5. BSI (United Kingdom)
6. CSK (Korea, DPR)
7. DIN (Germany)
8. DS (Denmark)
9. GOST R (Russian Federation)
10. JISC (Japan)
11. KATS (Korea, Rep.)
12. MSA (Malta)
13. NEN (Netherlands)
14. NSAI (Ireland)
15. PKN (Poland)
16. SABS (South Africa)
17. SAC (China)
18. SCC (Canada)
19. SIS (Sweden)
20. SN (Norway)
21. TISI (Thailand)
22. TSE (Turkey)
23. UNI (Italy)
24. UNMZ (Czech Rep.)
O-members (8)
1. ASRO (Romania)
2. BDS (Bulgaria)
3. DSSU (Ukraine)
4. ISS (Serbia)
5. NBN (Belgium)
6. SFS (Finland)
7. STAMEQ (Vietnam)
8. SUTN (Slovakia)
Liaisons (11)
1. ISO/IEC JTC 001/SC 29
2. ISO/TC 46/SC 09
3. ISO/TC 184/SC 04
4. ELRA
5. Infoterm
6. LISA
7. OMG
8. TEI
9. TERMNET
10. UIC
11. UNESCO
Key persons
(Miss) Hyojin Won International Standards Support, KSA [email protected]
Jenny Pellaux ISOCS-TPM [email protected]
Working Groups
WG 01 “Basic descriptors and mechanisms for language resources” Convenor: Nancy Ide
WG 02 “Representation schemes” Convenor: Kiyong Lee
WG 03 “Multilingual information representation” Convenor: Nasredine Semmar
WG 04 “Lexical resources” Convenor: Nicoletta Calzolari
WG 05 “Workflow of Language resource management” Convenor:
Thematic Domain Groups
Status: ad hoc
Established in May 2004, Lisbon
Triple Function:
(1) Liaison to ISOCat
(2) Incubator for new work item proposals
(3) Working with international groups: e.g. ISA with IWCS, LREC, FLaReNet
TDG 01 Metadata: Peter Wittenburg
TDG 02 Morphosyntactic data categories: Gil Francopoulo
TDG 03 Semantic content representation: Harry Bunt
• Activity 01 Discourse relations: Koiti Hasida
• Activity 02 Dialogue acts: Harry Bunt
• Activity 03 Referential structures and Links: Laurent Romary
• Activity 04 Logico-semantic relations: Scott Farrar
• Activity 05 Temporal entities and relations: Kiyong Lee
• Activity 06 Semantic roles and argument structures: Thierry Declerck
TDG 04 Syntactic data categories: Thierry Declerck
TDG 05 Machine readable dictionary: Monte George
TDG 06 Multilingual Ontology: Koiti Hasida
TDG 07 Lexical semantics: Monica Monachini
Task Forces
Task Force for the Harmonization of Principles (TFH)
Convenor: Nancy Ide
Task Force for Terminology Coordination (TFTC)
Convenor and liaison to TC 37/TCG: Alex C. Fang
Project Items
14 Active project items: WG 01 (4), WG 02 (9), WG 03 (1)
3 Unregistered project items
2 ISO Published Standards
◦ ISO 24610-1:2006 “Language resource management - Feature Structures - Part 1: Feature Structure Representation (FSR)”
◦ ISO 24613:2008 “Language resource management - Lexical Markup Framework (LMF)”
WG 01-01: WD 24610-1 “Language resource management - Feature structures – Part 1: Feature structure representation (FSR)”
• Project leaders: Kiyong Lee, Gerald Penn
• Revision of ISO 24610-1:2006 Feature Structures Part 1: Feature structure representation (FSR:2006)
• Joint work with TEI: Lou Burnard
WG 01-02: FDIS 24610-2 “Language resource management - Feature Structures - Part 2: Feature Systems Declaration (FSD)”
• Project leaders: Kiyong Lee, Gerald Penn
WG 01-03: DIS 24612 “Language resource management - Linguistic Annotation Framework (LAF)”
• Project leader: Nancy Ide
WG 01-04: DIS 24619 “Language resource management - Persistent identification and access in language technology applications (PID)”
• Project leader: Daan Broeder
WG 02-01: DIS 24611 “Language resource management - Morphosyntactic annotation framework (MAF)”
• Project leader: Eric de la Clergerie
WG 02-02: DIS 24614-1 “Language resource management - Word segmentation of Text – Part 1: Basic concepts and general principles (WordSeg-1)”
• Project leader: SUN Maosong
WG 02-03: WD 24614-2 “Language resource management - Word Segmentation of Text – Part 2: Word Segmentation for Chinese, Japanese and Korean (WordSeg-2)”
• Project leaders: SUN Maosong, Key-Sun Choi, Hitoshi Isahara
WG 02-04: FDIS 24615 “Language resource management - Syntactic annotation framework (SynAF)”
• Project leader: Thierry Declerck
WG 02-05: DIS 24617-1 “Language resource management - Semantic Annotation Framework – Part 1: Time and events (SemAF/Time, ISO-TimeML)”
• Project leader: Kiyong Lee
• Editors: James Pustejovsky (chair), Branimir Boguraev, Harry Bunt, Nancy Ide, Kiyong Lee
• (Cancellation date: 2010-10-13)
WG 02-06: DIS 24617-2 “Language resource management - Semantic Annotation Framework – Part 2: Dialogue acts (SemAF/Dacts)”
• Project leader: Harry Bunt
• Editors: Harry Bunt (chair), Jan Alexandersson, Jean Carletta, Jae-woong Choe, Volha Petukhova, Alex C. Fang, Koiti Hasida, Andrei Popescu-Belis, Claudia Soria, David Traum
WG 02-06: NP 24617-3 “Language resource management - Semantic Annotation Framework – Part 3: Named entities (SemAF/NE)”
Project leader: Gil Francopoulo
WG 02-07: NP 24617-4 “Language resource management - Semantic Annotation Framework – Part 4: Semantic roles (SemAF/SRL)”
Project leader: Martha Palmer
WG 02-08: NP 24617-5 “Language resource management - Semantic Annotation Framework – Part 5: Discourse Structures entities (SemAF/DS)”
Project leader: Gil Francopoulo
WG 02-09: PWI 24617-6 “Language resource management - Semantic Annotation Framework – Part 6: Space (SemAF/ISO-Space)”
Project leader: James Pustejovsky
WG 03-01: DIS 24616 “Language resource management - Multilingual information framework (MLIF)”
◦ Project leader: Samuel Cruz-Lara
◦ Limit date: 2010-10-15
WG 04-01: ISO 24613 Lexical Markup Framework (LMF)
◦ Project leaders: Monte George, Gil Francopoulo
◦ Status: ISO International Standard 2008
Unregistered PWI
ISO NP 24620 (OMG) “Language resource management – Simplified natural languages – Part 1: Basic concepts and general principles (simpL-1)”
Project leaders: Thierry Declerck, Sung-Kwon Choi
Editor: Doug Lawrence
ISO NP 2462x “Language resource management – Segmentation rules eXchange (SRX)”
Proposed project leader: Arle Lommel
ISO PWI 2462x (OMG) “Language resource management – Temporal Vocabulary”
Proposed project leader: Mark Linehan
On-going Ballots
Document        Short name     End date
NP 2462x        SRX            2010-08-08
FDIS 24614-1    WordSeg-1      2010-09-05
NP 24617-4      SemAF-SRL      2010-09-14
FDIS 24615      SynAF          2010-10-02
NP 24617-5      SemAF-DS       2010-10-17
DIS 24614-2     WordSeg-2      2010-10-26
DIS 24617-2     SemAF-Dacts    2010-12-30
Meetings 2009
2009-09-14/16: Tilburg, The Netherlands
• WG 2: MAF, SynAF, SemAF-Dacts
2009-09-24/26: Fragrant Hill Hotel, Beijing, China
• WG 2 WordSeg-1/2 editorial meeting
2009-11-01/05: Brandeis, Waltham, MA, USA
• WG 1-2, FLaReNet, SILT
Meetings 2010
2010-01-15/20: City University of Hong Kong
• WG 1, WG 2, WG 3, WG 4, ISOCat ISA-5, ICGL 2010
2010-03-20/22: Beijing Xijiao Hotel, Beijing, China
• WG 2 WordSeg-2 editorial meeting
2010-05-17/21: Valletta, Malta
• Tutorial + LRT workshop + WG 2 + TDG 3, LREC 2010
2010-08-15/20: Dublin, Ireland
• TC 37 and SCs Annual Meetings
2010-10-13/15: DIN, Berlin, Germany
• TDG 1 + WG 2 + WG 4
Meetings 2011
2011-01-10/11: Oxford, United Kingdom
• WG 2 + ISA-6, IWCS 2011 (2011-01-12/14)
2011-05: to be discussed
2011-08-14/19: Seoul Palace Hotel, Seoul, South Korea
• TC 37 + SCs meetings
2011-10: to be discussed
Cross-institutional collaborations
ISO/TC 37/SC 4
• generic models for LR management
• target expert groups with wide international coverage
• stability - consensus
TEI – Text Encoding Initiative
• reference XML vocabularies
• specification infrastructure (ODD)
• back office format for ISO documents
• reactivity - larger community
W3C
• dedicated application profile for web-based applications
• articulation with other web-based standards (e.g. web service)
• industry based requirements
• bridge to various industries, e.g. localization
Consequences for SC 4
Work on a wide coverage of language resource levels
◦ Ex.: Systematicity of SemAF components (Time, Space, Dialogue Acts, Named entities, discourse structures, semantic roles)
Articulate SC 4 standards with industry standards
◦ Ex.: MLIF with XLIFF, TMX, SMIL
Avoid maintaining XML formats as ISO standards
◦ Ex.: SynAF. Tiger or TEI can be good serialisations
Proposal for wiki
• Website of TC37/SC4
  – Purpose:
    • To give information to the experts
    • To communicate with standard users
    • To show feasible solutions based on standards
  – Maintenance
    • Convenors and project leaders will post the information
  – Idea collection stage
    • Organization of wiki
  – Please access: http://swrc.kaist.ac.kr/isotc37wiki/
    • id: WikiSysop (case-sensitive) pw: isowiki$&14
Practical problems
PWI -> NP stage
(1) Working draft
(2) Editorial or consulting group
Management: co-PLs
Editorial: DIS -> FDIS stage
(1) Producing documents in MS Word format (ODD)
(2) Figures
Volume control on each document