CyberBrain: Towards the Next Generation Social Intelligence

IAALD AFITA WCCA2008 WORLD CONFERENCE ON AGRICULTURAL INFORMATION AND IT

Asanee Kawtrakul 1, 2, Watchara Sriswasdi 1, Suparat Wuttilerdcharoenwong 1, Vasuthep Khunthong 1, Frederic Andres 3, Saovakon Laovayanon 4, Decha Jenkollop 4, Werachai Narkwiboonwong 4 and Anan Pusittigul 4

1 Kasetsart University, Bangkok, Thailand. Email: {asanee.kawtrakul, watchara.sriswasdi, suparat.wuttilerdcharoenwong, vasuthep.khunthong}@uknowcenter.org
2 National Electronics and Computer Technology Center, Pathumthani, Thailand. Email: [email protected]
3 National Institute of Informatics, Tokyo, Japan. Email: [email protected]
4 Agricultural Land Reform Office, Bangkok, Thailand. Email: {saovakon, urai, ananp}@alro.go.th, [email protected]

Abstract

With the development of the Internet and the World Wide Web, the enormous volume of knowledge resources has become an obstacle that keeps knowledge consumers from accessing the information they need effectively and efficiently. Knowledge fusion is one solution to this problem. This paper introduces CyberBrain, a framework that combines approaches from Knowledge Engineering and Language Engineering to provide effective knowledge services. CyberBrain is a dynamic structure that interconnects organizations and communities. It behaves like a natural ecosystem: self-organizing, emergent and adaptive, it acquires, collects, extracts and aggregates related knowledge. With CyberBrain, appropriate and personalized knowledge services can be provided to support problem solving, decision making and early warning. In its current state, the framework is demonstrated with the Rice Knowledge Portal using PMM (Problem-Methods-Man) map generation. In addition, the AGROVOC Concept Server [1] has been used, focusing primarily on the process of knowledge integration.

Keywords: CyberBrain, Knowledge Fusion, Knowledge Services, Rice Knowledge Portal, Problem-Method-Man map, AGROVOC concept server

Introduction

In the cyberspace era, the Internet has become a primary part of our daily life. We, the information and knowledge consumers, spend much time searching and often obtain information that is not very useful. To tackle this issue, we designed CyberBrain as a knowledge aggregation framework. It gathers knowledge from the Internet in order to provide solutions for various problems. CyberBrain is intended to be applied in various fields such as agriculture, health, traditions and culture. It can accumulate knowledge from the Internet, textbooks, experts and other resources into a knowledge ecosystem. A dynamically adaptive structural framework was designed to handle different kinds of knowledge and to work properly with the acquisition, collection, extraction and aggregation processes.

[1] http://naist.cpe.ku.ac.th/agrovoc

The architecture of CyberBrain is composed of three parts: knowledge acquisition tools, a dynamically adaptive structural framework, and knowledge services. The knowledge acquisition tools include knowledge template tools, a folksonomy annotation tool, document classification, knowledge extraction, and a PMM (Problem-Methods-Man) browser. End users access the knowledge inside CyberBrain through the knowledge services, which consist of a visualized browser, question answering, a web service knowledge portal, and an intelligent search engine. With these services and tools, CyberBrain can provide personalized knowledge, so that users spend less time and gain more appropriate knowledge, to support decisions when solving problems in their own domain.

The current implementation of CyberBrain has been applied to Thai agricultural knowledge, especially rice knowledge, in order to support precision rice farming through services such as the Personalized Fertilizer Expert System (Sriswasdi W., et al., 2008) and pest and disaster early warning.

Problem in Knowledge Management and Services

Knowledge sources are divided into two categories:

Tacit Knowledge [2] is defined as knowledge that people carry in their minds and that is difficult to access, such as experience, mental maps and know-how.

Explicit Knowledge [3] corresponds to any knowledge that can be articulated, codified and stored. It can be readily transmitted to others, for example as documents, policies, processes, strategies or software.

Knowledge sources are scattered across cyberspace as explicit but unstructured knowledge, which makes them difficult to access. The same holds for personal, tacit knowledge, which is hard to use and transfer because no agreed knowledge template exists. The CyberBrain framework addresses these problems by providing various tools through a dynamic structure that interconnects organizations and communities.

Knowledge services often return more results than needed, and consuming them takes much time. Some services, for example current search engines, return a large number of results, but some of these do not meet the user's needs, and the services lack semantic keyword search, so knowledge consumers spend much time retrieving the knowledge they need. CyberBrain can provide a one-stop service that supplies the needed knowledge through a visualized browser, a web service knowledge portal, and an advanced search engine that extracts specifically needed information such as causality information (Pechsiri C. and Kawtrakul A., 2007).

[2] http://en.wikipedia.org/wiki/Tacit_knowledge
[3] http://en.wikipedia.org/wiki/Explicit_knowledge

Related Technologies

The CyberBrain framework aggregates various knowledge domains from several resources. In order to extract them, Knowledge Engineering and Language Engineering are applied:

• Language Engineering is an important process that extracts target knowledge from unstructured texts into a knowledge representation. It can be divided into four steps:


o Morphology processing covers word segmentation (Sudprasert S. and Kawtrakul A., 2003) and Named Entity Recognition (Chanlekha H. and Kawtrakul A., 2004). The latter is a subtask of information extraction (Kawtrakul, A. and Yingsaeree, C., 2005) that locates each element in a text and classifies it into predefined categories, such as animal, plant or symptom.

o Syntax processing, or parsing, analyzes the components of a sentence in order to identify either the units to be extracted and filled into the slots of predefined IE frames, or the answer units of a Q&A system.

o Semantic processing deals with the complexity of understanding and processing information, for example word sense disambiguation.

o Discourse processing (Chareonsuk J., et al., 2005) recognizes discourse relations: it classifies the relation between contiguous Elementary Discourse Units (EDUs) in order to support Q&A services such as “Know-Why” and “Know-How”.

• Knowledge Engineering is the process of choosing an appropriate representation (e.g. rule-based or frame/script) for the knowledge extracted by language processing, so that it can be applied sensibly in services. Knowledge Engineering is mainly defined by five areas of dealing with knowledge:

o Knowledge Acquisition: identifying knowledge and transforming tacit and explicit knowledge into the forms used by the knowledge base (KB).

o Knowledge Storage: the activities needed to systematically represent, store and retrieve knowledge.

o Knowledge Processing: computing and reasoning over existing knowledge to derive new knowledge.

o Knowledge Transfer: the process and platform that support the transfer of knowledge.

o Knowledge Utilization: the processes necessary to support the use of knowledge.
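The named entity recognition and knowledge acquisition steps above can be illustrated with a minimal sketch. It assumes a tiny hand-made gazetteer (`GAZETTEER`) and plain substring matching, which are stand-ins chosen for illustration and not the statistical models cited in the paper. The recognized entities are then stored in a simple frame whose slots are the predefined categories.

```python
from dataclasses import dataclass, field

# Hypothetical gazetteer: surface phrase -> predefined category (illustrative only).
GAZETTEER = {
    "brown planthopper": "animal",
    "rice": "plant",
    "yellow leaves": "symptom",
}


@dataclass
class Frame:
    """A frame whose slots are the predefined entity categories."""
    slots: dict = field(default_factory=dict)


def tag_entities(text):
    """Return (phrase, category, start) for every gazetteer phrase found in text.

    Longer phrases are matched first so they are not split by shorter ones.
    Matching is by substring, so there is no word-boundary check.
    """
    lowered = text.lower()
    taken = [False] * len(lowered)
    found = []
    for phrase in sorted(GAZETTEER, key=len, reverse=True):
        start = lowered.find(phrase)
        while start != -1:
            end = start + len(phrase)
            if not any(taken[start:end]):
                found.append((phrase, GAZETTEER[phrase], start))
                for i in range(start, end):
                    taken[i] = True
            start = lowered.find(phrase, end)
    return sorted(found, key=lambda item: item[2])


def fill_frame(text):
    """Knowledge acquisition step: put each recognized entity into its slot."""
    frame = Frame()
    for phrase, category, _ in tag_entities(text):
        frame.slots.setdefault(category, []).append(phrase)
    return frame


sentence = "Brown planthopper causes rice to show yellow leaves in patches."
frame = fill_frame(sentence)
print(frame.slots)
```

A real system would replace the gazetteer lookup with trained word segmentation and named entity models; only the tag-then-fill structure is the point of the sketch.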

Examples of the four Language Engineering steps, with Thai text followed by an English gloss:

Morphology processing (Named Entity Recognition): <animal>เพลี้ยกระโดดสีน้ำตาล / brown planthopper</animal> ทำให้ / causes <plant>ต้นข้าว / rice plant</plant> มีอาการ / to show the symptom <symptom>ใบเหลืองแห้งตาย / leaves turn yellow, dry out and die</symptom> เป็นหย่อมๆ / in patches

Syntax processing: <sentence> เมื่อ / when <np> ใบ / leaf </np> <vp> เริ่มเหลือง / starts to turn yellow </vp> </sentence>

Semantic processing (word sense disambiguation): The Bank of Thailand is located on the bank of the Chao Phraya River.

Discourse processing: <EDU1> แตงกวาสามารถขึ้นได้ในดินแทบทุกชนิด / cucumber can grow in almost every type of soil </EDU1> <relation cue> แต่ / but </relation cue> <EDU2> ดินที่ชอบคือ ดินร่วนทราย / the soil it prefers is sandy loam </EDU2>

A Basic Idea behind CyberBrain

As mentioned above, CyberBrain is a framework that integrates technologies and tools, categorized by the person who interacts with them: knowledge owners, knowledge brokers and knowledge consumers.

Knowledge Brokers: They are trained to use appropriate tools and technologies for constructing CyberBrain. These tools are used for acquiring and managing knowledge, and they can be classified according to the type of knowledge the owner holds:


• Tacit knowledge: Knowledge Template Tools are used to extract specific knowledge from tacit knowledge owners. The template is filled in by the expert directly or via a knowledge broker. The template must be designed and agreed upon as a standard format for the specific kind of knowledge.

• Explicit knowledge: Many tools exist for managing explicit knowledge, such as the Folksonomy Annotation Tool and the Knowledge Map Browser.

o Document Classification tools: Document clustering and supervised document classification are used to enhance information retrieval performance. These approaches are based on the cluster hypothesis, which states that documents with similar contents tend to be relevant to the same query. A fixed collection of texts is clustered into groups that have similar contents. Text categorization and document clustering consist of two parts: a learning process that provides a prototype for each cluster of documents, and a clustering process that computes the similarity between an input document and each prototype.

o Knowledge Extraction tools: They are used to extract specific knowledge from domain resources, such as rice knowledge from several websites, news items or documents.

o PMM Concept: The PMM map is built by bringing together information from different sources and structures scattered across the web. An ontology-based integration approach is used to separate the integration layer from the data storage, which provides a higher degree of abstraction, extensibility and reusability. The ontology in PMM is provided by an automatic ontology construction tool (Imsombut A. and Kawtrakul A., 2007).
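The two-part scheme described for the document classification tools (learn one prototype per group, then compare an input document with each prototype) can be sketched as follows. This is a minimal illustration that assumes bag-of-words vectors, summed term counts as prototypes and cosine similarity; the paper does not state which representation or similarity measure its tools use, and the training sentences are invented.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a document as lower-cased term counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def learn_prototypes(labelled_docs):
    """Learning process: sum the term counts of each group into one prototype.

    Cosine similarity ignores vector length, so the sum points in the same
    direction as the mean of the group's documents.
    """
    prototypes = {}
    for label, text in labelled_docs:
        prototypes.setdefault(label, Counter()).update(bag_of_words(text))
    return prototypes


def classify(text, prototypes):
    """Assign the document to the group with the most similar prototype."""
    vector = bag_of_words(text)
    return max(prototypes, key=lambda label: cosine(vector, prototypes[label]))


# Invented training snippets, for illustration only.
training = [
    ("disease", "rice blast fungus causes leaf lesions"),
    ("disease", "leaf blight symptoms and remedy"),
    ("market", "rice export price per ton"),
    ("market", "market price of paddy rice today"),
]
prototypes = learn_prototypes(training)
print(classify("current price of rice in the market", prototypes))
```

In practice the prototypes would be built from term-weighted vectors (for example TF-IDF) over a segmented Thai corpus; the sketch keeps raw counts to stay self-contained.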

Knowledge Consumer: Many services are available to end users for solving problems and supporting decisions.

• The Web Service Knowledge Portal aggregates related knowledge so that it can be accessed easily, offering a one-stop service on a single website.

• The Intelligent Search Engine is an information retrieval system designed to help find precise information. With a knowledge base (e.g. an ontology or soundex), the query can be expanded. With knowledge mining, specific information such as Know-Why can be extracted.

• The Question Answering System is a service that attempts to handle a wide range of question types, including Know-How, Know-Who (Thamvijit D., et al., 2005), Know-What and Know-Why.

• Collaborative/Social Annotation Tools, such as folksonomy annotation tools, enable collaboration not only between experts but also between creators and consumers of content, in order to create and manage content categories (see Figure 1).

Figure 1. Folksonomy Annotation Tool for Rice Blast Knowledge.
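The query expansion mentioned for the Intelligent Search Engine can be sketched in a few lines. The sketch assumes the classic American Soundex code for English words, a hypothetical synonym table (`SYNONYMS`) standing in for the ontology, and a small invented index vocabulary (`VOCABULARY`). The paper does not describe its own soundex or ontology lookup, and a Thai system would need a Thai-specific phonetic code.

```python
# Letter-to-digit table of the classic American Soundex code.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code of an English word."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    result = word[0].upper()
    prev = CODES.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (result + "000")[:4]


# Hypothetical stand-ins for the ontology and the index vocabulary.
SYNONYMS = {"rice": ["paddy"]}
VOCABULARY = ["planthopper", "blast", "paddy", "rice", "fertilizer"]


def expand_query(query):
    """Expand each query term with synonyms and similar-sounding index terms."""
    expanded = []
    for term in query.lower().split():
        candidates = [term] + SYNONYMS.get(term, [])
        candidates += [w for w in VOCABULARY if soundex(w) == soundex(term)]
        for candidate in candidates:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


print(expand_query("rice planthoper"))
```

Here the misspelled term "planthoper" is matched to the index term "planthopper" through its phonetic code, and "rice" is expanded with its synonym from the table.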


A Design of CyberBrain Framework

CyberBrain integrates three types of interacting users (Figure 2): content owners, knowledge brokers and knowledge consumers. Their interaction keeps CyberBrain alive. Through numerous knowledge brokers, CyberBrain knowledge is acquired from many content owners and maintained as digitized tacit and explicit knowledge. The Knowledge Portal and the Intelligent Search Engine are examples of what CyberBrain yields from this large-scale knowledge acquisition.

Figure 2. CyberBrain: A Knowledge Framework for Social Intelligence Construction.

CyberBrain knowledge derives from various knowledge holders, as either tacit or explicit knowledge. Using numerous technologies and tools (see Figure 3), CyberBrain is maintained and provides various services: Visualized Browser, Intelligent Search Engine, Q&A Service, and Web Service Knowledge Portal.

Figure 3. CyberBrain technologies and tools for various services.

Case Study

The current implementation of CyberBrain focuses on rice knowledge, which can be classified into many knowledge sub-domains (see Figure 4). Each specific domain is extracted from various sources such as rice experts, rice databases, and unstructured texts from several websites. To manage this knowledge, language processing is needed to acquire precise knowledge from each rice knowledge sub-domain. For explicit knowledge, we use document classification tools: as shown in Figure 4, the knowledge surveyed from numerous websites can be classified into different groups. For disease knowledge, knowledge extraction targets attributes such as name, characteristics, color, size, remedy approaches, protection and prevention. The Folksonomy Annotation Tool can be used to tag content and to add further semantic concepts and subjective information from viewers.

Tacit knowledge is managed with Knowledge Template Tools, because this knowledge mostly consists of personal insight and does not exist in a digital format. The tools are used to build a template for each kind of knowledge. A knowledge broker uses the template to extract and store the tacit knowledge of a knowledge owner. Rice tacit knowledge can be acquired in the form of experience, skills, folk wisdom, etc.

Figure 4. Explicit Knowledge in Rice sub-domain.

Examples of knowledge resources from surveying and classification. Available at: (A, B, E) [4]; (C) [5]; (D) newspaper web sites; (F) [6]; (I) [7]; (J) [8]; (K) [9]; (L) [10].
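A knowledge template of the kind described above can be sketched as a small record type. The slot names follow the disease attributes listed in the case study (name, characteristics, color, size, remedy approaches, protection and prevention); the class name `DiseaseTemplate`, the `source` slot and the sample values are assumptions made for illustration, since the paper does not show the actual template format.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class DiseaseTemplate:
    """Hypothetical template for one rice disease record."""
    name: str
    characteristic: str = ""
    color: str = ""
    size: str = ""
    remedy_approaches: list = field(default_factory=list)
    protection_and_prevention: list = field(default_factory=list)
    source: str = ""  # assumed extra slot: who supplied the knowledge

    def missing_slots(self):
        """List the slots the knowledge broker still has to fill in."""
        return [slot for slot, value in asdict(self).items() if not value]


# Illustrative values only; they are not taken from the paper.
record = DiseaseTemplate(
    name="rice blast",
    characteristic="spindle-shaped lesions on leaves",
    color="gray center with brown border",
    source="expert interview",
)
print(record.missing_slots())
```

A fixed set of slots gives every broker the same standard format, and the list of missing slots tells the broker what to ask the knowledge owner next.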

After the knowledge acquisition and storage processes are complete, the knowledge is fed into the PMM concept, which aggregates it with closely related rice domain knowledge such as disease characteristics, experts, and disease remedy approaches. Once knowledge has been acquired into the CyberBrain ecosystem, knowledge services draw on that ecosystem through Visualized Browsers.

[4] www.doa.go.th/plp/homepage/homepage.htm
[5] www.dit.go.th, www.oae.go.th, www.tmd.go.th, www.oae.go.th/mis/predict
[6] www.dit.go.th, www.riceexporters.or.th/price.htm
[7] www.doa.go.th/apsrdo/index.html, www.soilwafer.com, www.doa.go.th/AedWeb/main.htm
[8] www.doa.go.th, www.doae.go.th
[9] www.doa.go.th/pprdo/index.htm, www.phtnet.org, www.doa.go.th/pprdo/index.htm
[10] www.doa.go.th/rri/, www.doae.go.th
[11] www.moac.go.th, www.dit.go.th


For example, the Disease Visualized Browser [11] (see Figure 5) shows the map of related rice disease knowledge and can help farmers, via a knowledge broker, with disease remedy approaches. Additionally, the Visualized Browser provides a search engine for the rice disease domain.
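The Problem-Methods-Man map behind such a browser can be sketched as a small typed graph. This is a minimal illustration, assuming an undirected graph with three node kinds; the class `PMMMap`, its methods and the sample nodes are hypothetical and do not reflect the actual CyberBrain data model or its ontology-based integration.

```python
from collections import defaultdict

NODE_KINDS = ("problem", "method", "man")


class PMMMap:
    """Undirected graph whose nodes are typed as problem, method or man."""

    def __init__(self):
        self.kind_of = {}
        self.neighbours = defaultdict(set)

    def add_node(self, name, kind):
        if kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.kind_of[name] = kind

    def link(self, a, b):
        """Connect two existing nodes in both directions."""
        for name in (a, b):
            if name not in self.kind_of:
                raise KeyError(f"unknown node: {name}")
        self.neighbours[a].add(b)
        self.neighbours[b].add(a)

    def related(self, name, kind):
        """Return the sorted neighbours of `name` that have the given kind."""
        return sorted(n for n in self.neighbours.get(name, ())
                      if self.kind_of[n] == kind)


# Invented sample nodes, for illustration only.
pmm = PMMMap()
pmm.add_node("rice blast", "problem")
pmm.add_node("fungicide application", "method")
pmm.add_node("resistant variety", "method")
pmm.add_node("Expert A", "man")
pmm.link("rice blast", "fungicide application")
pmm.link("rice blast", "resistant variety")
pmm.link("rice blast", "Expert A")
pmm.link("fungicide application", "Expert A")

print(pmm.related("rice blast", "method"))
print(pmm.related("rice blast", "man"))
```

Starting from a problem node, a broker can follow the links to the remedy methods and to the persons who know about them, which is the kind of navigation the visualized browser offers.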

Figure 5. PMM browser visualization.

(A) refers to a methodology node (disease remedy); (B) nodes are instances of the methodology (remedy approaches); (C) refers to the person (M) related to the linked concept domain; (D) refers to a problem node (rice disease); and (E) refers to the characteristics of problem nodes.

Conclusion

CyberBrain is a framework that helps (1) maintain domain-specific knowledge from every knowledge owner and (2) disseminate it to answer the questions of knowledge consumers through various services. CyberBrain has been applied first to the agriculture domain and assessed on the rice subdomain. Rice knowledge includes breeds, planting methods, disease characteristics, pest prevention, current market prices and tailor-made fertilizer. Knowledge brokers extract knowledge from experts, books and Internet content, and cooperate with government organizations, using knowledge acquisition tools and technologies to construct CyberBrain. The rice services implemented on CyberBrain, called ALRO-CYBERBRAIN, include the Rice Knowledge Web Portal and the Precision Fertilizing Expert System, among others. Additionally, CyberBrain has been developed to support tailor-made decisions, tailor-made answering, and a precision event tracking system. The current design of CyberBrain focuses on the final target of the platform: a center of knowledge services, including the technologies and tools that let the dynamic structural framework function efficiently. To achieve these goals, effective knowledge integration from cooperating organizations and experts is needed.

Acknowledgment

CyberBrain has been supported by NECTEC and Kasetsart University. We would like to give special thanks to the members of the Agricultural Land Reform Office of the Ministry of Agriculture and Cooperatives for their feedback on user-friendliness and their support in system evaluation.

References

Chanlekha H. and Kawtrakul A. (2004) Thai Named Entity Extraction by incorporating Maximum Entropy Model with Simple Heuristic Information, IJCNLP 2004, Hainan Island, China.

Chareonsuk J., Sukvaree T. and Kawtrakul A. (2005) Element Discourse Unit Segmentation for Thai Discourse Cues and Syntactic Information, NCSEC 2005, Bangkok, Thailand.

Imsombut A. and Kawtrakul A. (2007) Automatic building of an ontology on the basis of text corpora in Thai, Language Resources and Evaluation Journal, special issue on Asian Language Technology, December 2007.

Kawtrakul A. and Yingsaeree C. (2005) A Unified Framework for Automatic Metadata Extraction from Electronic Document. In Proceedings of The International Advanced Digital Library Conference, Nagoya, Japan.

Pechsiri C. and Kawtrakul A. (2007) Mining Causality from Texts for Question Answering System, IEICE Transactions on Information and Systems, Vol. 90-D, No. 10, October 2007.

Sriswasdi W., Luengsrisagoon S., Lorsuwansiri N., Sriswasdi W., Wuttilerdcharoenwong S. and Kawtrakul A. (2008) A Smart Mobilized Fertilizing Expert System: 123 Personalized Fertilizer. In Proceedings of AFITA 2008, Tokyo, Japan.

Sudprasert S. and Kawtrakul A. (2003) Thai Word Segmentation based on Global and Local Unsupervised Learning, NCSEC 2003, Chonburi, Thailand.

Thamvijit D., Chanlekha H., Sirigayon C., Permpool P. and Kawtrakul A. (2005) Person Information Extraction from the Web, SNLP 2005, Chiang Rai, Thailand.
