Furthering Natural Language Processing in Bulgaria · 2011-06-28 · Furthering NLP in Bulgaria...
Co-funded by the 7th Framework Programme of the European Commission through the contract T4ME, grant agreement no.: 249119.
Co-funded by the ICT PSP Programme of the European Commission through the contract CESAR, grant agreement no.: 271022.
Furthering Natural Language Processing in Bulgaria
Svetla Koeva
Institute for Bulgarian, Bulgaria
META-FORUM
Budapest, Hungary, 2011-06-27-28
Tuesday, June 28, 2011
Furthering NLP in Bulgaria
General facts
Republic of Bulgaria
Area: 110,993.6 km²
Population: 7 351 633
Bulgarian: 9 million native speakers
The official alphabet is Cyrillic.
Официалната азбука е кирилица.
Cyrillic became the third official alphabet of the European Union, following the Latin and Greek alphabets.
Research
BLARK
Over the past decade a number of important language resources and tools have been developed.
Bulgarian National Corpus (approx. 500M words)
Bulgarian POS-annotated Corpus
Bulgarian Sense-annotated Corpus
Dependency part of BulTreeBank
SEE-ERA.net Administrative and Literary Corpus
Bilingual collection of cultural texts in Greek and Bulgarian
Bulgarian-Polish-Lithuanian Corpus
Bulgarian-English-X language parallel corpus (approx. 100M words for Bulgarian)
...
Several large inflectional dictionaries
Bulgarian WordNet
Bulgarian FrameNet
Companies
Development of tools and solutions based on semantic technologies
Ontology design
Data integration, management and publishing
Web applications (dynamic web content)
Content Management Systems (CMS)
Tools for web site content management
Multilingual tools and services for natural language processing
WebTrance: a software package for translation between Bulgarian and English, French, German, Spanish, Italian and Turkish, in both directions.
SkyCode is one of the partners of iTranslate4.
http://www.meta-net.eu
Approximate status
| Technology | Median |
|---|---|
| Tokenization, Morphology | 4 |
| Parsing | 3 |
| Information Retrieval | 2 |
| Speech Synthesis | 2 |
| Text semantics | 2 |
| Information extraction | 2 |
| Summarization, QA | 2 |
| Machine translation | 2 |
| Language generation | 1 |

| Resources | Median |
|---|---|
| Reference Corpora | 4 |
| Thesauri, WordNets | 4 |
| Lexicons, Terminologies | 3 |
| Semantic corpora | 3 |
| Parallel Corpora, TM | 2 |
| Syntax-Corpora | 2 |
| Discourse-Corpora | 1 |
| Multimedia/multimodal data | 1 |
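| | Quantity | Availability | Quality | Coverage | Maturity | Sustainability | Adaptability |
|---|---|---|---|---|---|---|---|
| Technology | 2 | 2 | 2.5 | 2.5 | 2 | 2 | 2.5 |
| Resources | 2 | 2.5 | 3 | 3.5 | 2.5 | 2.5 | 2.5 |
| Total | 2 | 2 | 2.5 | 3 | 2 | 2 | 2.5 |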
State contribution to R&D
[Chart: state contribution to R&D per year, in %, for Japan, Korea, USA, Singapore, China, EC 27 and Bulgaria; axis labels 2000, 2008]
Strategic research agenda:
Cultural-historical heritage, with language as a central part of it
ICT as a horizontal instrument
European dimensions
META, Multilingual Europe Technology Alliance
Institute for Bulgarian, BAS, a member of
Institute for Literature, BAS
Institute of Information and Communication Technologies, BAS
Sofia University St. Kliment Ohridski
University of Plovdiv
Ontotext, Bulgaria
Musala Soft, Bulgaria
Tetracom Interactive Solutions, Bulgaria
TransGlobe International Ltd., Bulgaria
Conclusions
Several mutually related factors determine success:
clear formulation of target goals and strategies for their accomplishment
stable financing
effective management of the resources
beneficial relations between education - research - business - end users
networking
META-NET as a concerted, substantial, continent-wide effort in language technology research and engineering is relevant for all of these factors.
Thank you very much for your attention.