A PEER-TO-PEER (P2P) BASED SEMANTIC ... openstorage.gunadarma.ac.id/research/WorkGroup... · ...

A PEER-TO-PEER (P2P) BASED SEMANTIC AGREEMENT APPROACH FOR SPATIAL INFORMATION INTEROPERABILITY

DISSERTATION

Submitted to the Doctoral Program of Information Technology
Gunadarma University

as a partial fulfillment of the requirements for the degree of
Doctor of Information Technology

I WAYAN SIMRI WICAKSANA
99203105

DOCTORAL PROGRAM OF INFORMATION TECHNOLOGY
GUNADARMA UNIVERSITY

January, 2006

ABSTRACT

Recently, research on information systems has increasingly focused on how to effectively manage and share data in heterogeneous distributed environments. Various data sources can be accessed online on the web, including web pages, semi-structured documents (XML, RDF, etc.) and spatially referenced data.

The need for sharing data stems from (i) the explosive growth of the web and the ability to interconnect a growing number of information sources, (ii) the increasing availability of autonomous data sets, and (iii) rising acquisition costs of complex non-traditional data. For example, geographic information systems are increasingly used in location-based applications and decision support systems that rely on combining data from multiple data sources. Spatial data are very costly to collect and are often independently created by different institutions.

This thesis focuses on the interoperability of spatial information based on P2P data management and semantic mediation. It proposes a P2P-based approach to the semantic interoperability of information sources that aims to combine the advantages of semantic mediation and peer-to-peer systems. The approach is based on a hybrid P2P architecture consisting of three types of peers. The super peer contains a reference ontology which provides a taxonomic description of the common domain. The provider peer contains an export schema which represents local data. The request peer submits queries to provider peers.

Chapter 4 presents the details of matching peer contents, describes the phases of query processing based on the peer agreement approach, and presents the underlying architecture. Chapter 5 describes the development of a prototype to evaluate the proposed method. The prototype uses OWL to describe the reference and local ontologies.

A PEER-TO-PEER (P2P) BASED SEMANTIC AGREEMENT APPROACH FOR SPATIAL INFORMATION INTEROPERABILITY

DISSERTATION

I WAYAN SIMRI WICAKSANA

Approved by:

Prof. Suryo Guritno
Supervisor

Prof. Kokou Yétongnon
Co-Supervisor

Prof. Sebti Foufou
Co-Supervisor

Dr. Djati Kerami
Co-Supervisor

Jakarta, January 23rd 2006

Title of dissertation: A Peer-to-Peer (P2P) Based Semantic Agreement Approach for Spatial Information Interoperability

Name of student: I Wayan Simri Wicaksana, SSi, MEng
Student register number: 99203105

Committee of supervisor
Supervisor: Prof. Suryo Guritno
Co-Supervisor: Prof. Kokou Yétongnon
Co-Supervisor: Prof. Sebti Foufou
Co-Supervisor: Dr. Djati Kerami

Committee of examination
President: Prof. Dr. Yuhara Sukra, M.Sc
Members: Prof. Dr. E.S. Margianti, SE, MM
         Prof. Suryadi H.S., SSi., MMSI
         Prof. Dr. Belawati H. Widjaja
         Prof. Vincent Vajnovzki
         Prof. Dr. Dali Santun Naga
         Prof. Yannuarsyah Haroen
         Dr. Ing. Farid Wazdi

CURRICULUM VITAE

Name: I Wayan Simri Wicaksana
Date of birth: Surabaya, June 11th 1964
Bachelor: SSi, Physics Department, University of Indonesia, Jakarta, 1988
Master: M.Eng, CIM, Swinburne University of Technology, Melbourne, 1992

ACKNOWLEDGMENTS

In preparing this thesis, I owe heartfelt thanks to Hyang Widhi and to the many people who helped me in one way or another.

First, and above all, I would like to express my sincerest and deepest appreciation and thanks to my advisor and associate advisors: Prof. Suryo Guritno, Prof. Kokou Yétongnon, and Prof. Sebti Foufou. Without their insights and guidance this thesis would not have been possible. During the last two years, my associate advisor, Kokou Yétongnon, played a crucial role in my life. His insights, support and patience allowed me to grow and become the person that I am today.

I would like to express my appreciation for the support of the President of Gunadarma University, Prof. Dr. E.S. Margianti, SE; the Vice President of Gunadarma University, Prof. Suryadi H.S., SSi; and the Director of the Computer Science Doctoral Program of Gunadarma University, Dr. Djati Kerami.

I would also like to thank the Laboratoire Electronique, Informatique et Image (LE2I), Université de Bourgogne, for its valuable facilities and support during my research over the past two years.

I am also blessed with wonderful friends in the Database Workgroup at LE2I, Université de Bourgogne, and among the doctoral students of Gunadarma University; in many ways, my successes are theirs, too. Deep thanks go to my little brother Made Wiryana, who discussed my work with me from the preparation of my research proposal to the finishing touches of my dissertation.

When I look back to where and how it all started, the answer lies with those who brought me into the world. My parents, Made Sutrisna, Siti Kadarwati, Nyoman Rana, Ketut Latri, and all the people revolving around them provided a unique environment that prepared me for this journey. I was unbelievably lucky to belong to that world, and all I wish is to create a similar experience for my own kids.

Finally, the three people who give sense to my life and work are my wife N. Christiany Y. Wicaksana and my two boys Pandu Wicaksana and Mahesa Wicaksana. Without their love, support, and sacrifice, this dissertation would never have happened. I love you.

Contents

Abstract
Approval
Curriculum Vitae
Acknowledgment
Table of Contents
List of Figures
List of Tables

1 Introduction
  1.1 Background
  1.2 Scope and Objective
  1.3 Research overview: Peer Agreement Approach and Architecture
    1.3.1 P2P architecture
    1.3.2 Prototype
  1.4 Contribution
  1.5 Thesis Organization

2 Literature Review
  2.1 Introduction
    2.1.1 Semantic Interoperability
    2.1.2 GIS Interoperability
    2.1.3 Peer-to-Peer (P2P)
  2.2 Semantic Interoperability
    2.2.1 The main problems
    2.2.2 Semantic Interoperability Approaches
      2.2.2.1 Tightly Coupled Approach
      2.2.2.2 Loosely Coupled Approach
    2.2.3 Comparison
      2.2.3.1 Comparison Criteria
      2.2.3.2 Comparison Table
  2.3 GIS Interoperability
    2.3.1 GIS Interoperability
      2.3.1.1 GIS and Web Technology
      2.3.1.2 Standardization
      2.3.1.3 Mediation Approach
    2.3.2 Emerging Solution
      2.3.2.1 GIS Interoperability and P2P
      2.3.2.2 Semantic Web and Peer-to-Peer
      2.3.2.3 Semantic Mediation, GIS Interoperability, and Peer-to-Peer
  2.4 Summary
    2.4.1 Summary of Semantic Interoperability
    2.4.2 Summary of GIS Interoperability

3 Methodology
  3.1 Motivation
  3.2 Research Framework
    3.2.1 Semantic Interoperability, characteristics and issues
    3.2.2 GIS Interoperability, characteristics and issues
    3.2.3 P2P Architecture, characteristics and issues
    3.2.4 Query Processing, characteristics and issues
  3.3 Overview of Approach
    3.3.1 The Local Context
    3.3.2 The Community Context
    3.3.3 The Agreement Context
    3.3.4 Peer Component
    3.3.5 Agreement Process
  3.4 Contribution and Results

4 Discussion of Approach
  4.1 Representation of Peer's Context
    4.1.1 XML
    4.1.2 RDF and OWL
    4.1.3 Peer's Representation
  4.2 The Agreement
    4.2.1 Similarity Measurement
      4.2.1.1 Label Matching
      4.2.1.2 Internal Structure
      4.2.1.3 External Structure
    4.2.2 Agreement Unit
  4.3 Example
  4.4 Query Issues
  4.5 Architecture
  4.6 Summary

5 Result of Prototype
  5.1 Prototype
  5.2 Experimental Setup
    5.2.1 Infrastructure
    5.2.2 Common Ontology and Export Schema
  5.3 Experiments
  5.4 Result and Discussion
  5.5 Summary

6 Conclusion and Future Work
  6.1 Research Objective
  6.2 Achievements
  6.3 Contribution
  6.4 Future Work

Bibliography

List of Figures

1.1 Peer contents (fragments of ontology and export schema)
1.2 Components of the peer agreement approach

2.1 Classification of Semantic Interoperability Approaches
2.2 Architecture Global Integrated Schema
2.3 Garlic Architecture, from [19]
2.4 InfoMaster Architecture, from [43]
2.5 MOMIS Architecture, from [10]
2.6 Global Integrated Ontology
2.7 COG Architecture, from [17]
2.8 SIMS Architecture, from [2]
2.9 Mediation Semantic Integration Approach
2.10 BUSTER Architecture, from [104]
2.11 COIN Architecture, from [46]
2.12 OBSERVER Architecture, from [68]
2.13 TSIMMIS Architecture, from [22]
2.14 ISIS Architecture [61]
2.15 SemWeb Architecture, from [16]
2.16 APPOINT Architecture, from [99]
2.17 PSW InfoQuilt Architecture, from [3]

3.1 Four Pillars Research Framework
3.2 Peer Agreement Framework

4.1 Type of Peers & Basic Task
4.2 Example Common Ontology and Export Schema of Provider Peers
4.3 Different Point of View on the same Road Network
4.4 Schema and Agreement between S1 and S2
4.5 Example of Agreement Unit
4.6 Example of Peer Agreement
4.7 Example of Multi Peer Agreement
4.8 General Architecture Peer Agreement
4.9 Registration & Agreement Phase at Peer Agreement
4.10 Query Rewriting Phase at Peer Agreement
4.11 Query Respond Phase at Peer Agreement

5.1 Virtual Machine Prototype
5.2 Virtual Web Prototype

List of Tables

2.1 Heterogeneity in Information System, from [89]
2.2 Different Layers of Semantic Interoperability, from [102]
2.3 Comparison Semantic Integration/Interoperability
2.4 Comparison Semantic Mediation Approach for GIS Interoperability

4.1 Label Matching of Concepts/Classes
4.2 Agreement Unit for Simple Mapping
4.3 Comparison of Web Query Language [30]
4.4 Comparing RDF Query Language [58]

5.1 Result of Concept of Agreement Unit

Chapter 1

Introduction

More and more institutions and government agencies make policy planning and decisions based on information that is gathered from various sources. These sources often represent information at different levels, including national, province, district and sub-district. Each level has different needs in the types, the models and the semantic requirements for collecting and representing information. To provide support for these information-centric activities, more and more applications and decision support systems are based on integrated and interoperable systems that combine data from both traditional and geo-referenced systems. The integrated data can be accessed, manipulated and shared by a diversity of users in dynamic environments using various interfaces and tools.

Information systems interoperability has been extensively studied in the traditional data integration research community, and many issues related to the resolution of data conflicts have been addressed using different methodologies and approaches. These approaches range from database translation and database federation to semantic data mediation based on wrapper and mediator architectures. There have been several attempts at extending interoperability approaches to deal with conflict resolution in complex data and applications such as geographic information systems, resulting in the definition of several standards and recommendations for spatial information integration. For example, the OpenGIS™ Consortium, an association of software developers and government agencies, has defined standards and requirements to aid geographic information and spatial interoperability.

Several studies point out the need for resolving data conflicts at a higher level of abstraction, where the meaning associated with data can be precisely represented and used to create correspondences among heterogeneous systems. At a high abstraction level, the main concern is to specify what is represented instead of how it is represented. As an example of a decision support system based on the integration of multiple information sources, consider an application for an emergency and rescue team. A user of the system may be interested in traffic instructions to the location of a car accident. He or she is not interested in how the information is stored or which data model is being used, but in the value and meaning of the information itself. The user wants to know "which is the best road to the location of the accident?" Moreover, the user would like to access information recorded in related systems, such as nearby hospitals or police departments, in order to provide good care for the accident victims.

1.1 Background

Recently, the focus of research on integrated information systems has shifted to the definition of methodologies, architectures and tools to allow effective management and sharing of data in heterogeneous distributed environments. Large volumes of data in a diversity of formats can be accessed on the web, including web pages, semi-structured documents (XML, RDF, etc.) and spatially referenced data. The need for sharing data stems from (1) the explosive growth of the web and the ability to interconnect a growing number of information sources, (2) the increasing availability of autonomous data sets, and (3) rising acquisition costs of complex non-traditional data. For example, geographic information, which is increasingly used in location-based applications and decision support systems, is very costly to collect and needs to be reused by different organizations.

Spurred by the recent success and popularity of file and multimedia data sharing systems such as Napster and Gnutella, P2P architectures are increasingly used for sharing computer resources, including computing and data storage capabilities. However, effective information sharing in distributed P2P-based environments must address several important challenges, including resource discovery and localization, mediation over heterogeneous data formats, semantics modeling, and peer-based query processing. Traditionally, data sharing from disparate sources is carried out by creating unified integrated schemas to combine the contents of the heterogeneous data sources. This approach does not scale well when applied in dynamic distributed environments and has many drawbacks related to the large numbers of peers. An alternative solution, more appropriate for server-oriented distributed environments, is the semantic mediation approach, which uses (1) wrappers to map local schemas to a common data model and to provide low-level data access functions and (2) a mediator to create integrated views over the local sources. The mediator supports the discovery of relevant data sources and provides various query processing services.
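As a rough illustration of the wrapper and mediator roles described above, the following Python sketch assumes two hypothetical local sources with different field names. The class names (`Wrapper`, `Mediator`), the field names and the sample records are illustrative assumptions and do not come from any system discussed in this thesis.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


class Wrapper:
    """Maps records of one local source to a common data model (hypothetical)."""

    def __init__(self, rows: Iterable[Record], field_map: Dict[str, str]):
        self.rows = list(rows)
        self.field_map = field_map  # local field name -> common field name

    def fetch(self) -> List[Record]:
        # Low-level data access: rename local fields to the common model.
        return [
            {self.field_map[k]: v for k, v in row.items() if k in self.field_map}
            for row in self.rows
        ]


class Mediator:
    """Builds an integrated view over all wrapped sources (hypothetical)."""

    def __init__(self, wrappers: List[Wrapper]):
        self.wrappers = wrappers

    def query(self, predicate: Callable[[Record], bool]) -> List[Record]:
        # Integrated view: union of the wrapped sources, filtered by the query.
        return [r for w in self.wrappers for r in w.fetch() if predicate(r)]


# Two assumed sources that describe roads with different local field names.
source_a = Wrapper([{"nm": "Toll Road 1", "len_km": 12.5}],
                   {"nm": "name", "len_km": "length_km"})
source_b = Wrapper([{"road": "Ring Road", "length": 40.0}],
                   {"road": "name", "length": "length_km"})
mediator = Mediator([source_a, source_b])
long_roads = mediator.query(lambda r: r["length_km"] > 20)
print(long_roads)
```

The caller only ever sees the common field names; each source keeps its own schema, which is the property that the mediation approach relies on.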

P2P systems consist of peers organized in dynamic architectures. Peers can join or leave autonomously and frequently. Several applications of P2P networks can be distinguished, ranging from content sharing applications (e.g. Napster [72], Gnutella [45]) to distributed computing applications (e.g. SETI@home [88], Avaki [4], Entropia [36]) and development support platforms (JXTA [57]). Generally, two main categories of P2P systems can be distinguished. Unstructured P2P systems organize peers in networked spaces. Each peer controls and maintains its shared data. User queries are based on (1) centralized directory models, where one or more servers are used to record and locate data and services, or (2) a query routing model that essentially floods the network to determine relevant peers that are likely to contain the requested data. By contrast, structured P2P architectures organize data in a key space divided into segments. A number of segments is assigned to each participant peer. User queries are based on a Distributed Hash Table (DHT) built on top of the overlay structure of peers.
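To make the structured case concrete, here is a minimal sketch of a key space divided into segments, with each segment assigned to a peer. It assumes a SHA-1 based hash ring; the peer names and the `responsible_peer` helper are hypothetical, and real DHTs (routing tables, replication, peer churn) are considerably more involved.

```python
import bisect
import hashlib
from typing import List, Tuple

KEY_SPACE = 2 ** 32  # size of the (assumed) circular key space


def key_of(name: str) -> int:
    """Hash a peer name or data key to a position in the key space."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % KEY_SPACE


def build_ring(peers: List[str]) -> List[Tuple[int, str]]:
    """Each peer owns the segment that ends at its own position."""
    return sorted((key_of(p), p) for p in peers)


def responsible_peer(ring: List[Tuple[int, str]], data_key: str) -> str:
    """Return the peer whose segment contains the hashed data key."""
    positions = [pos for pos, _ in ring]
    i = bisect.bisect_left(positions, key_of(data_key))
    return ring[i % len(ring)][1]  # wrap around past the last peer


ring = build_ring(["peer-A", "peer-B", "peer-C"])
owner = responsible_peer(ring, "road-network.gml")
print(owner)
```

Any peer that knows the ring can compute the owner of a key locally, so no flooding of the network is needed to locate data.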

Extensions of the semantic mediation approach have been proposed for spatial information systems and more recently for P2P-based environments. The GeoNis Project [93, 91] is based on the ORCHIDEA [92] mediator platform. It is a middleware that incorporates GinisWeb [95], GinisNT [94], etc. The ORCHIDEA ontology is the basis for defining top-level and hybrid model ontologies and for processing queries based on SQL using XML-DTD. The ISIS (Interoperable Spatial Information System) project [61] is a semantic mediation model that uses an OO data model and a mediation architecture based on the multi-agent paradigm to support GIS interoperability. The Semantic Web (SemWeb) research project [15] defines and uses contexts to represent the semantics of information. A context comprises a set of rules and constraints associated with the definition of an object.

In P2P environments, semantic mediation is used for both simple file sharing and more complex ontology-based information discovery. APPOINT [99] is an enhanced P2P system which aims to reduce server load in the exchange of large map files. DBGlobe [81] is used to support mobile computing for GIS data with P2P, while P2PGIS [5] is a product that supports rapid online mapping with OPUS technology. Another semantic-based P2P system is the P2P Semantic Web (PSW) [3], which provides a set of tools for the discovery of relevant ontologies, facilitating the reuse of existing ontologies and the advertising of the resulting ontologies. BUSTER [107] is a P2P data broker which offers intelligent data retrieval, coupled with data integration and semantic translation services [103].

1.2 Scope and Objective

The focus of the thesis is on P2P-based semantic interoperability of spatial information systems. We aim to define a decision support framework that can provide users with the possibility of linking distributed information systems, including spatial and geo-referenced systems. Information sharing is a key feature of the proposed system and is realized via the interactions among the autonomous peers. Early systems that enabled information sharing primarily focused on low-level communication protocols for transferring files between systems. To share information in these systems, the provider first converts entire data sets or databases to the receiver's format or a common data model and then transfers the result at the file level. Semantic interoperability deals with information sharing at the abstraction level. To share information, the different information sources or peers must share a generic common semantics. Each peer interprets this semantics in terms of local constraints and concepts.

To illustrate some important aspects of information sharing at the abstraction level, consider the transportation network of a city. This network is modeled and used in many city management activities (e.g. planning, public transportation, water mains, electric network, etc.). Consider two companies that use the transportation network. We will refer to them as provider peers in the example. One is a toll collection company (provider peer PP1) and the second is a design and development company (provider peer PP2). Figure 1.1 shows fragments of the export schemas of the provider peers PP1 and PP2 and part of a common ontology shared by the peers. Provider peer PP1, as shown in the schema fragment of figure 1.1.b, views the shared transportation network as a point-to-point network. This provider peer is interested in land transportation network characteristics related to the nature and types of the road. On the other hand, provider peer PP2, as shown by the schema fragment in figure 1.1.c, requires detailed road information such as size, width, length and traffic capacity. To exchange data, the two provider peers must view their respective schemas as interpretations of the common ontology. An interpretation must define the correspondences between the ontology concepts and the local data of the peer. It must take into account the structures, semantics and dynamic properties of the concepts.

Figure 1.1: Peer contents (fragments of ontology and export schema)
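The idea of an export schema as an interpretation of the common ontology can be sketched as a set of correspondences from local attributes to ontology concepts. The concept and attribute names below are hypothetical stand-ins for the contents of figure 1.1, which is not reproduced here; only the two peers and their general interests come from the text.

```python
from typing import Dict, Set

# Hypothetical fragment of the common ontology (concept names are assumed).
COMMON_ONTOLOGY: Set[str] = {"Road", "RoadType", "Width", "Length", "TrafficCapacity"}

# Each interpretation maps a local attribute to a concept of the common ontology.
PP1_INTERPRETATION: Dict[str, str] = {  # toll collection company
    "segment": "Road",
    "kind": "RoadType",
}
PP2_INTERPRETATION: Dict[str, str] = {  # design and development company
    "road": "Road",
    "width_m": "Width",
    "length_km": "Length",
    "capacity": "TrafficCapacity",
}


def is_valid(interpretation: Dict[str, str]) -> bool:
    """An interpretation may only refer to concepts of the common ontology."""
    return set(interpretation.values()) <= COMMON_ONTOLOGY


def shared_concepts(a: Dict[str, str], b: Dict[str, str]) -> Set[str]:
    """Concepts that both peers interpret, i.e. candidates for data exchange."""
    return set(a.values()) & set(b.values())


print(is_valid(PP1_INTERPRETATION), is_valid(PP2_INTERPRETATION))
print(shared_concepts(PP1_INTERPRETATION, PP2_INTERPRETATION))
```

This sketch covers only name-level correspondences; the structural, semantic and dynamic properties mentioned in the text would need a richer representation than a flat dictionary.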

The proposed research on an information sharing framework is based on four major pillars: semantic mediation, GIS interoperability, P2P architectures, and distributed query processing. Two main categories of problems related to information sharing and interoperable systems are particularly addressed by this research.

• One problem addressed by the thesis is the resolution of semantic differences among peers. This problem refers to data modeling, semantic representation and mapping issues. Information sharing among different sources is often hindered by heterogeneity problems caused by the different design choices of the participant systems. Information heterogeneity can occur at different levels: syntactic, structural and semantic. Syntactic and structural heterogeneity issues can be solved at the modeling level by translating the information sources to a common data model and standard. In centralized environments, heterogeneity issues can be easily resolved by the creation of integrated global schemas and the definition of mappings that link the global schema to the local schemas of the sources. The schema mappings are used to translate the global semantics to the local ones. The semantic interoperability problem is more acute in dynamic and autonomous environments, due to the lack of relationships among sources. When two or more peers cooperate, there is a need to determine whether their contents overlap or are related. This requires the specification of (1) the semantics associated with each peer and (2) their semantic mappings. Semantic mapping of peer schemas in dynamic environments is a difficult problem that has recently received a great deal of attention. Semantic interoperability is essentially based on having a common understanding of the meaning of the information exchanged by different sources. It is a multi-level problem that can occur in data sources, formats or models.

• The other main problem is the implementation of a distributed architecture that guarantees the autonomy of the participant data sources. This addresses the technical issues of realizing a distributed architecture for information sharing. P2P can provide the infrastructure for a dynamic environment in which autonomous and independent peers can join or leave the network easily and frequently. However, in a diverse and large community, it is hard to discover relevant data for decision making. P2P architectures allow the development of communities of systems which share common interests. The peers can be 'clustered' based on shared interests to facilitate the search for relevant data. The proposed research will focus on developing a prototype based on the P2P paradigm to validate our mediation approach. Query processing issues must be addressed to allow the peers to formulate global queries in terms of the concepts of the common ontology, to broadcast the queries to relevant peers, and to combine the best responses into a final result.
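The query processing steps named above (formulate a global query over ontology concepts, send it to relevant peers, combine the best responses) can be sketched as follows. This is a minimal in-memory sketch under assumed names: `ProviderPeer`, the per-answer score and the `top_k` parameter are hypothetical and are not the query mechanism of the thesis.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class ProviderPeer:
    name: str
    concepts: Set[str]  # ontology concepts the peer can answer
    data: Dict[str, List[Tuple[str, float]]]  # concept -> (answer, score) pairs

    def answer(self, concept: str) -> List[Tuple[str, float]]:
        return self.data.get(concept, [])


def global_query(concept: str, peers: List[ProviderPeer],
                 top_k: int = 2) -> List[Tuple[float, str, str]]:
    # 1. Select relevant peers: those that interpret the queried concept.
    relevant = [p for p in peers if concept in p.concepts]
    # 2. Send the query to each relevant peer and collect the responses.
    responses = [(score, p.name, ans)
                 for p in relevant
                 for ans, score in p.answer(concept)]
    # 3. Combine: keep the best-scoring responses as the final result.
    return sorted(responses, reverse=True)[:top_k]


peers = [
    ProviderPeer("PP1", {"Road"}, {"Road": [("Toll Road 1", 0.9)]}),
    ProviderPeer("PP2", {"Road", "Width"},
                 {"Road": [("Ring Road", 0.7)], "Width": [("7 m", 0.8)]}),
]
result = global_query("Road", peers)
print(result)
```

Restricting step 1 to peers that interpret the concept is what keeps the broadcast from reaching the whole network.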

From the above discussion, two important facts can be stated . First,information sharing among di�erent sources requires a common set of globalconcepts with well de�ned and agreed upon understanding. Second, as shownby the example in �gure 1.1 each data source de�nes a local interpretationto link local concepts to the corresponding global concepts. Semantic me-diation between two peers is achieved through the local interpretation ofthe involved peers. To represent the global set of concepts we rely ontologymodeling and related tools. Ontologies are increasingly used to capture themeaning and relationships among concepts used to describe the contents ofdatabase systems. Gruber [47] de�nes an ontology as an explicit speci�cationof a conceptualization of real world entities of an application domain. Anontology is a vocabulary composed of terms and relationships among them.Several representation languages are proposed for modeling and reasoning onthe contents of data sources. They range from informal natural languages toformal languages based on predicate logic or graph concepts. Among the for-mal languages, OWL which has recently gained in popularity, is a descriptionlanguage aimed at incorporating a theory based semantics and an ontologicalinference and reasoning mechanism into RDF.

The main objective of this research is to (1) develop a comprehensive semantic mediation approach to capture the semantics of the data contained in provider peers, (2) provide a mechanism and modeling concepts for defining and representing semantic mappings or agreements among the different semantics of the peers, and (3) implement the comprehensive mediation approach in a P2P based architecture. In summary, the research focuses on the following questions:

• How to represent the sources?


Figure 1.2: Components of the peer agreement approach

• How to resolve syntactic, structural, and semantic diversity among data sources?

• What data models allow efficient sharing of both traditional and spatial information?

• How to represent a peer's content?

• How to map or establish correspondences among peers?

• How to write a query that refers to a peer agreement?

1.3 Research overview: Peer Agreement Approach and Architecture

This thesis is mainly concerned with the development of a semantic mediation approach between provider peers. It aims to combine the advantages of semantic mediation and peer-to-peer systems. The key idea of this research is the peer agreement. The semantic mediation between peers involves the creation of agreement units to bridge the semantic gap between their contents. This requires the definition and computation of semantic similarity measures among the concepts used by the different peers. This can be done by first matching the labels associated with the concepts to determine which concepts are related. Next, agreement units are created to express semantic relations among the concepts. In the following we present an overview of the agreement based semantic mediation approach. Then we present the different types of peers and describe their contents. Finally, we discuss similarity issues in semantic matching and present the agreement units.

Figure 1.2 depicts the components of the peer agreement approach. A P2P system $\Pi = \langle P, A \rangle$ consists of a set $P = \{P_1, \ldots, P_n\}$ of peers and a set $A$ of agreements. Three types of peers can be distinguished in the approach: super peers, provider peers and request peers. The main task of the super peers (SP) is to register active provider and request peers and to define the metadata used to describe their contexts. They are used to maintain global ontologies and to provide concept taxonomies for common domains of applications. The ontologies are described by OWL specifications of global concepts, classes and their relationships. The ontologies provide the basis for semantic matching or reconciliation between local systems and for the conversion and exchange of information between peers. The provider peers (PP) are used to export and share information with other peers. They contain several elements: 1) export schemas representing views over shared data accessible by other peers, 2) locally accessible databases, and 3) agreement information for creating links between the export schema and the global ontologies. Finally, request peers (RP) are used to submit queries to one or more provider peers. The agreement information is composed of agreement units, which encapsulate one or more global ontology concepts, export schema concepts and the transformation rules used to map the export schema into the shared ontology.
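To make the structure of a P2P system as a pair of peers and agreements more concrete, the following is a minimal Python sketch. All class and field names (`PeerType`, `Peer`, `AgreementUnit`, `P2PSystem`, `rule`) are hypothetical and chosen only for illustration; the thesis prototype itself is built with OWL and Protégé, not with this code.

```python
from dataclasses import dataclass, field
from enum import Enum


class PeerType(Enum):
    SUPER = "SP"      # registers peers and maintains the global ontology
    PROVIDER = "PP"   # exports and shares information
    REQUEST = "RP"    # submits queries


@dataclass(frozen=True)
class Peer:
    name: str
    kind: PeerType


@dataclass(frozen=True)
class AgreementUnit:
    ontology_concepts: tuple   # one or more global ontology concepts
    export_concepts: tuple     # export schema concepts of one provider peer
    rule: str = "equivalent"   # hypothetical label for a transformation rule


@dataclass
class P2PSystem:
    peers: list = field(default_factory=list)        # the set P
    agreements: dict = field(default_factory=dict)   # the set A, keyed by provider name

    def add_agreement(self, provider, unit):
        # In this sketch only provider peers hold agreement information.
        if provider.kind is not PeerType.PROVIDER:
            raise ValueError("agreement units belong to provider peers")
        self.agreements.setdefault(provider.name, []).append(unit)
```

Keeping the agreements keyed by provider peer mirrors the statement that agreement information is one of the elements stored at each provider peer.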

The calculation of semantic similarity between concepts is the first step for creating agreements between the ontology and the contents of the other peers (provider or request). Each concept can be represented as a hierarchy of terminological labels which contains some structural or semantic information.

The semantic similarity can be computed by first matching the labels associated with the concepts to determine which concepts are related. There are two common techniques in label matching [44]. First, a string analysis preprocessing step is used to transform the labels into separate words prior to linguistic analysis. For example, this step can be used to expand abbreviations and replace acronyms. Next, the labels are matched by determining relations between them. This can be done based on a thesaurus or on WordNet relations.
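The two label matching steps can be illustrated with a small Python sketch. The abbreviation table, the synonym sets and the scoring rule (the fraction of words in the shorter label that have a related word in the other label) are assumptions made for this example; a real implementation would consult a thesaurus or WordNet as described above.

```python
import re

# Hypothetical abbreviation table and thesaurus, used only for illustration.
ABBREVIATIONS = {"rd": "road", "st": "street", "hwy": "highway", "no": "number"}
SYNONYMS = [{"road", "street", "way"}, {"highway", "motorway", "freeway"}]


def preprocess(label):
    """Split a label into lowercase words and expand known abbreviations."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", label)  # camelCase -> camel Case
    words = re.split(r"[^A-Za-z0-9]+", spaced)
    return [ABBREVIATIONS.get(w.lower(), w.lower()) for w in words if w]


def related(word_a, word_b):
    """Two words are related if they are equal or share a synonym set."""
    if word_a == word_b:
        return True
    return any(word_a in group and word_b in group for group in SYNONYMS)


def label_similarity(label_a, label_b):
    """Fraction of words in the shorter label with a related word in the other."""
    a, b = preprocess(label_a), preprocess(label_b)
    if not a or not b:
        return 0.0
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    hits = sum(1 for w in short if any(related(w, v) for v in long_))
    return hits / len(short)
```

For instance, `preprocess("MainRd_No")` yields `["main", "road", "number"]`, and the labels `NationalHwy` and `national_motorway` obtain a similarity of 1.0 under these assumed tables.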

The label matching is applied at the top concept level to compute similarity measures. The result of this matching process can be fine tuned by considering both the internal structure and the external structure of the concept represented by a label. The internal structure of a concept consists of its directly linked attributes, while the external structure takes into account the position of a concept in a hierarchy. The internal structure comparison is based on the internal structures of entities (object properties and data properties). The properties can be language related (see the OWL file) or real world entity based attributes. The external structure comparison is based on comparing the position of the entities within a taxonomy. Two methods can be used to calculate the value of the external structure similarity. The first method is based on Cupid structure matching [64], which considers the similarity of the leaves or atomic data described by a schema. The second method refers to the upward cotopic distance [42], which compares the similarity of the sets of super classes.
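As an illustration of comparing sets of super classes, the sketch below computes a Jaccard overlap between the upward cotopies of two concepts (each concept together with all of its ancestors). This is one common formulation and is shown here as an assumption; the exact formula used in [42] and in the thesis may differ. The small road taxonomy is hypothetical.

```python
def upward_cotopy(concept, parents):
    """Return the concept together with all direct and indirect super classes."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def upward_cotopic_similarity(c1, c2, parents):
    """Jaccard overlap of the two upward cotopies (1.0 means identical ancestry)."""
    uc1, uc2 = upward_cotopy(c1, parents), upward_cotopy(c2, parents)
    return len(uc1 & uc2) / len(uc1 | uc2)


# Hypothetical taxonomy: concept -> list of direct super classes.
TAXONOMY = {
    "Highway": ["Road"],
    "LocalRoad": ["Road"],
    "Road": ["TransportLink"],
    "Railway": ["TransportLink"],
    "TransportLink": [],
}
```

In this taxonomy `Highway` and `LocalRoad` share two of four distinct ancestors (similarity 0.5), whereas `Highway` and `Railway` share only the root (similarity 0.25). A distance can be obtained as one minus the similarity.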

Once the semantic similarities are computed, they are used to define the agreement units. Each agreement unit defines a semantic interpretation or view of one or more ontology concepts, thereby adapting the semantics of the global shared concepts to the local view and constraints of a provider peer. In its simplest form, an agreement unit defines one-to-one or one-to-many mappings $(C_i^{CO}, \{C_j^{ES}\})$, where $C_i^{CO}$ is an ontology concept and $C_j^{ES}$ is an export schema concept. An agreement unit can consist of logical level and instance level mappings.
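The step from similarity values to one-to-one or one-to-many mappings can be sketched as follows. The function name, the input format and the threshold of 0.7 are assumptions for illustration only; the source does not state how similarity values are turned into agreement units.

```python
def build_agreement_units(similarity, threshold=0.7):
    """Group export schema concepts under each ontology concept they match.

    `similarity` maps (ontology_concept, export_concept) pairs to a score
    in [0, 1]. The result maps each ontology concept to a sorted list of
    export schema concepts, i.e. one one-to-one or one-to-many mapping.
    """
    units = {}
    for (onto_concept, export_concept), score in similarity.items():
        if score >= threshold:  # assumed cut-off, not taken from the thesis
            units.setdefault(onto_concept, []).append(export_concept)
    return {concept: sorted(matches) for concept, matches in units.items()}


# Hypothetical similarity scores between ontology and export schema concepts.
scores = {
    ("Road", "Street"): 0.9,
    ("Road", "Lane"): 0.75,
    ("Road", "River"): 0.1,
    ("Bridge", "Overpass"): 0.8,
}
units = build_agreement_units(scores)
```

With these scores, `Road` is mapped one-to-many to `Lane` and `Street`, and `Bridge` is mapped one-to-one to `Overpass`.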

1.3.1 P2P architecture

The agreement approach architecture is based on a hybrid P2P architecture consisting of three types of peers. Information exchange is carried out directly among peers. The super peer is used to register active provider and request peers and the metadata used to describe their contexts. To avoid the super peer becoming a single point of failure, the number of tasks performed by the super peer must be reduced to a minimum. The architecture is based on the following assumptions: 1) a common ontology is available and shared by the participant peers, 2) each provider peer contains an export schema used to describe the data it wants to share with others, 3) request peers have a schema which describes their contexts, 4) a common model is used to represent the ontology and the exported schema. Information system interoperability in the peer agreement architecture is carried out in two phases. First, the provider peers register and advertise their contents by connecting to the super peer and establishing semantic links between their contents and the ontology concept hierarchy. Next, the peers interact and exchange data and metadata to rewrite queries and transfer the real data corresponding to the queries.
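The two phases can be sketched in Python as below. This is an in-memory illustration under simplifying assumptions: peers are plain objects rather than networked processes, an agreement is a dictionary from ontology concepts to export schema concepts, and all class and method names are hypothetical.

```python
class SuperPeer:
    """Keeps only a registry of provider peers and the concepts they cover."""

    def __init__(self):
        self.registry = {}  # ontology concept -> set of provider peer names

    def register(self, provider_name, ontology_concepts):
        for concept in ontology_concepts:
            self.registry.setdefault(concept, set()).add(provider_name)

    def providers_for(self, concept):
        return sorted(self.registry.get(concept, ()))


class ProviderPeer:
    def __init__(self, name, agreement, data):
        self.name = name
        self.agreement = agreement  # ontology concept -> export schema concepts
        self.data = data            # export schema concept -> list of records

    def advertise(self, super_peer):
        # Phase 1: register and link local content to ontology concepts.
        super_peer.register(self.name, self.agreement.keys())

    def answer(self, ontology_concept):
        # Phase 2: rewrite the global concept into local concepts, fetch data.
        local_concepts = self.agreement.get(ontology_concept, [])
        return [row for c in local_concepts for row in self.data.get(c, [])]


def ask(super_peer, providers, ontology_concept):
    """Request peer side: look up providers once, then contact them directly."""
    results = {}
    for name in super_peer.providers_for(ontology_concept):
        results[name] = providers[name].answer(ontology_concept)
    return results
```

The super peer is consulted only for the lookup, while the query rewriting and the data transfer happen at the provider peers, which matches the goal of keeping the super peer's tasks to a minimum.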

1.3.2 Prototype

Experiment Setup. A prototype of the architecture is under development to validate the peer based semantic processing approach. In the initial setup, two Sun Workstations W1100 running SuSE Linux 9.3 and the QEMU emulation software are used to create a virtual network for peer simulation purposes. The main objective of the initial setup is to test the communication layer between the peers of the P2P architecture. The Protégé tool suite has been installed and tested on some virtual machines. The goal is to test the access to some peers through HTTP. Protégé is a tool for developing the common ontology and the local schemas/ontologies using OWL and several plugins. Query execution is done through the Query tab and SPARQL in Protégé. Some initial evaluations have shown that the virtual machines and the network simulation are not very appropriate because responsiveness and performance are very low. An alternative is to implement a virtual web server for our experiment scenario.

Experiment Results. In our evaluation scenario there are three provider peers which have different concepts and classifications of the road concept, as discussed in the example above, and two other provider peers with different interests (business and education). The common ontology is based on the Ontology of Transportation Networks (OTN), ISO 19107, Geographic Data Files (GDF), and National Road Network Canada. There are 103 classes, 37 object properties, and 89 data properties. The local ontologies are developed from a combination of several sources. In the initial evaluation, the contents of the five provider peers are as follows. PP1 has 22 classes and around 50 individuals. PP2 has 17 classes and around 50 individuals. PP3 has 10 classes and around 50 individuals. PP4 has 23 classes and around 50 individuals. PP5 has 27 classes and around 50 individuals.

Developing agreement units within one community gives better results than developing them across different communities. The agreement units developed in the experiment were simple agreements. Sending a query with the two-step method provides a better query response than the one-step method. The initial results suffer from poor performance, which will require an improvement of the simulation of the underlying communication system among peers.

1.4 Contribution

The major result of the thesis is the specification of a framework based on a semantic agreement approach for P2P environments. The framework allows interoperability of information at different levels. It is based on a hybrid P2P architecture consisting of three types of peers: the super peers (SP) used to register and manage other peers, the provider peers (PP) used to export and share information with other peers, and the request peers (RP) used to submit queries to one or more provider peers. A corresponding query processing method is proposed. A P2P based architecture has been proposed and a prototype is being implemented to validate the approach. In summary, the following are the main contributions of the thesis.

• Development of the semantic agreement mapping method using the semantic similarity of concepts based on label matching and on internal and external structure comparison.

• Representation of the agreement between the common ontology and the export schema of a provider peer.

• Development of the semantic agreement architecture and prototype with three types of peers in three contexts.


1.5 Thesis Organization

The remainder of this thesis is organized as follows.

Chapter 2 reviews related work and the state of the art on semantic and GIS interoperability. This chapter describes many heterogeneity concepts and provides a foundation for the research.

Chapter 3 is an overview of the thesis. It provides the motivation of the research, describes the research objectives and presents the important issues that are addressed by the thesis.

Chapter 4 explains the Peer Agreement Approach to spatial information interoperability. Agreements are defined among peers in communities that have a common interest. The P2P architecture paradigm is used to model the community of data sources. The agreement mechanism is developed by considering similarity measurement in label matching and in the internal and external structure of shared concepts. Finally, a query processing method based on the agreements at provider peers is presented.

Chapter 5 discusses the implementation issues of a prototype used to validate the agreement based mediation approach. The various components of the architecture are described.

Chapter 6 presents the conclusions of the thesis and some future work. A summary of the research objectives and thesis contributions is presented. Future research issues regarding further development of the framework are discussed.


Chapter 2

Literature Review

2.1 Introduction

2.1.1 Semantic Interoperability

The goal of data or information interoperability is to provide a way to access data or information in multiple or heterogeneous sources. Much research has been done and many prototypes have been built in the field of information interoperability.

Traditional approaches were contributed by researchers from the database area, and include the ANSI/SPARC architecture, the federated architecture and the data warehouse architecture. However, they are based on point to point translation/conversion or on developing a global-as-view (GAV) schema. These approaches have problems with the heterogeneity of sources, the dynamics of the environment, and a huge number of sources.

Integration and interoperability are important keys for modern information systems. However, the information sources are increasingly heterogeneous (in syntax, structure, and semantics), large in number, dynamic in content and location, autonomous and open. Therefore, information interoperability requires a different way of modeling, representing, and accessing sources. Researchers and practitioners in the fields of database and information integration have put a large effort into facilitating interoperability [96].

The sub-chapter on semantic interoperability consists of three sections. First, we introduce the main problems in information integration or interoperability, including approaches to schema and ontology mapping. Next, we present a classification of the approaches to semantic integration or interoperability. Generic solutions for semantic integration and interoperability can be categorized into two models: tightly coupled solutions and loosely coupled solutions. The main part of the sub-chapter is a review of implementations of the Semantic Web for the integration or interoperability of heterogeneous information sources. Semantic integration or interoperability is an active area of research in several disciplines. The outcome of this state of the art review is a comparison of related approaches, which serves as a foundation for the research.

2.1.2 GIS Interoperability

Computer networks and distributed computing technology have shifted many computer-based applications from the traditional stand-alone mode to the contemporary network mode. Through networking and the Internet, computing resources distributed over the network, including GIS, are connected, shared and integrated for better use.

Geographic information is highly desirable, because it is public domain information which is very widely used and very costly to collect. There are some issues in geographic information sharing. Initial efforts toward interoperability were at a low level (file formats) and in standardization. A technical trend in GIS is to use a DBMS to manage and facilitate the integration of geodata and non-geospatial data.

Information integration and interoperability refers to the process of integrating fragments of data about the real world so that the resulting set gives a more comprehensive description of the real world. Information integration and interoperability is highly related to semantic data sharing, since the major impediment to information interoperability is currently semantic heterogeneity.

Much research has been done on GIS interoperability. It is motivated by:

• Sharing GIS data or information. Much GIS data or information is public information, and the cost of data collection and maintenance is very high.

• More and more institutions have developed their information based on GIS. Often the data content of several institutions is similar. With reference to the point above, it is important to promote data reuse to overcome this problem.

• Many activities are based on geo-referenced information, for example business support, decision making, etc.

The problems of GIS interoperability share their main points with those of the sub-chapter on semantic interoperability. There are two kinds of heterogeneity: 1. system heterogeneity, for example the size of GIS data is very large and the GIS data formats come from many vendors; 2. information heterogeneity, which covers syntactic, structural and semantic heterogeneity.

The problem of syntactic heterogeneity emerged as a result of mostly native data formats and the development of monolithic or proprietary systems. The World Wide Web (WWW) supplies the basic infrastructure for the distributed use and multiple exploitation of data and systems (system interoperability), while approved geoinformation technology standards developed by the OpenGIS Consortium (OGC) and the International Organization for Standardization (ISO) provide the essential basis for syntactic interoperability and for cataloging of geoservices and data [12].

According to Xu [112], semantic heterogeneity of geodata can be divided as follows:

• Formalization heterogeneity (language heterogeneity and representation difference)

• Conceptualization heterogeneity

• Context heterogeneity

Section 2.3 will look at the state of the art in GIS interoperability and at some emerging solutions in this area.

2.1.3 Peer-to-Peer (P2P)

The Internet and the web have brought a step forward for information interchange. However, the web approach is more centralized, which leads to heavily loaded servers. The increasing decentralization and autonomy of today's organizations has led to the emergence of peer-to-peer (P2P) solutions for information interoperability. At the same time, information retrieval, searching or processing rely on keyword search and rather simple knowledge representation. It is difficult to determine when two terms appear equivalent but have different meanings. Semantic Web technologies have been shown to address this diversity of meaning.

P2P is not a new technology; it was introduced two decades ago. As a result, there are several different definitions of P2P. Milojicic [70] has collected definitions from several experts, as follows:

• The Intel P2P working group gives the definition: "the sharing of computer resources and services by direct exchange between systems".

• Ross Lee Graham defines P2P through three key requirements: a) they have an operational computer of server quality; b) they have an addressing system independent of DNS; c) they are able to cope with variable connectivity.

P2P is not a physical network; it is a logical network. P2P can be deployed on the Internet, a LAN or another physical network model. To decide whether or not a system is a P2P system, the characteristics of P2P should be recognized. The main characteristics of a P2P system are:

• Self-organizing: a node is free to organize itself into the network.

• Symmetric communication: nodes are equal in both requesting and offering services. Nodes therefore need an operational computer of server quality, since each node can act as a client and/or a server.

• Decentralized: there is no global directory or central control over every node.

In addition to the above main characteristics, there are other derived characteristics such as autonomy, cost of ownership, anonymity/privacy, scalability, ad-hoc connectivity, addressing system, and dynamic peers.

Currently, three main models of P2P architecture have been implemented.

• Pure P2P / Decentralized: a pure P2P system has no dedicated server to handle requests. Each peer has the same position and can be a server or a client (called a servent). The common approach to finding information sources uses the flooding technique. An example is Gnutella (first model).

• Partially centralized: the model is based on pure P2P, but some nodes are designated as super-peers or super-nodes. Some nodes act as super-peers because of their capacity, connectivity or reliability. A super-peer can keep a list of connected nodes and speed up the join process. Kazaa is an example.

• Hybrid P2P / Mediation: the basic mechanism of hybrid P2P is the same as that of pure P2P, but hybrid P2P needs one or more servers to improve the quality of answering requests. Currently, hybrid P2P performs better than pure P2P, and pure P2P is appropriate for hundreds of peers. The Napster project uses the hybrid P2P model.
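The flooding technique mentioned for pure P2P systems can be sketched in a few lines of Python. This is a simplified, in-memory illustration and not the actual Gnutella protocol: the network is a hypothetical dictionary, and the time-to-live (`ttl`) hop budget is an assumed parameter used to bound how far a query spreads.

```python
from collections import deque


def flood_search(network, start, wanted, ttl=3):
    """Breadth-first flooding: each peer forwards the query to its neighbours
    until the time-to-live (hop budget) is used up.

    `network` maps a peer name to (set of resources, list of neighbour names).
    Returns the sorted names of the peers that hold `wanted`.
    """
    visited = {start}
    queue = deque([(start, ttl)])
    holders = []
    while queue:
        peer, hops_left = queue.popleft()
        resources, neighbours = network[peer]
        if wanted in resources:
            holders.append(peer)
        if hops_left == 0:
            continue
        for neighbour in neighbours:
            if neighbour not in visited:  # do not forward a query twice
                visited.add(neighbour)
                queue.append((neighbour, hops_left - 1))
    return sorted(holders)


# Hypothetical overlay network: A - B - D - E, and A - C.
NETWORK = {
    "A": (set(), ["B", "C"]),
    "B": (set(), ["A", "D"]),
    "C": (set(), ["A"]),
    "D": ({"roads.shp"}, ["B", "E"]),
    "E": ({"roads.shp"}, ["D"]),
}
```

With a hop budget of 2, a query from peer A reaches D but not E; raising the budget to 3 also finds E. The number of contacted peers grows with the hop budget, which is one reason why flooding suits small networks better than large ones.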

2.2 Semantic Interoperability

2.2.1 The main problems

Berners-Lee et al. write [13]: "Some low-level service-discovery schemes are currently available such as Microsoft's Universal Plug and Play, which focuses on connecting different types of devices, and Sun Microsystems's Jini, which aims to connect services. These initiatives, however, attack the problem at a structural or syntactic level and rely heavily on standardization of a predetermined set of functionality descriptions. Standardization can only go so far, because it can not anticipate all possible future needs. The Semantic Web, in contrast, is more flexible."

The Internet and the World Wide Web are increasing the number of information sources, the need for data exchange, and the need for integration/interoperability, and are introducing new problems. The problems can be categorized into those of information sources and those of information exchange.

Information sources on the Internet introduce the following problems [52]:

• The Web is massive. The number of web pages in 2003 was more than 2 billion, and this number is predicted to double within a year. Finding the appropriate sources while considering the content, availability, access speed, and trustworthiness of the sources is a difficult task.

• The web is distributed. WWW technology has given more authority to, and decentralized, the production of information sources. The web is the product of many individuals who have different domains and interests. Different domains can use different 'vocabularies'; the main cause of the differences in vocabulary is the lack of control over content and catalogs. Furthermore, conflicts can occur among web information sources. Increasing autonomy creates another problem for integration. According to Ozsu [80], there are several kinds of autonomy: design autonomy (own data representation, own domain), communication autonomy (communication with others), execution autonomy (local operation without being influenced by external operations), and association autonomy (degree of sharing of sources).

• The Web is dynamic. The web changes at an incredible rate. Sources can join and leave at any time. Some pages are fairly static, while others change on a regular basis or at unpredictable intervals. These changes may vary in significance. This dynamic condition needs a special approach to information integration.

• The web is an open world. Pages are available around the world, and every individual can produce or use them. One method to search for a source is to use a search engine, even though the largest search engines have only crawled about 25% of the available pages.

According to Sheth [89] there are two groups of heterogeneity: information heterogeneity and system heterogeneity. Many types of heterogeneity are due to technological differences. Researchers and developers have been working on resolving heterogeneity for many years. Information heterogeneity has led to different levels of interoperability: system, syntactic, structural, and semantic. Table 2.1 presents the diversity of information systems.

Syntactic heterogeneity is concerned with differences in the representation and encoding of data. It is also referred to as format heterogeneity, such as data format heterogeneity. Schemas are definitions that specify the structure of data and are the result of a database design phase; differences between schemas include, for example, homonyms, synonyms or different attributes in database tables.

Hakimpour [49] distinguishes two types of heterogeneity: data heterogeneity and semantic heterogeneity. Data heterogeneity refers to differences among local definitions, such as attribute types, format or precision. Semantic heterogeneity refers to differences or similarities in the meaning of local data. Semantics is defined as the meanings of terms and expressions according to an understanding of the world [102]. Hence semantic interoperability is "the ability of information systems to exchange information on the basis of shared, pre-established and negotiated meaning of terms and expression," and is needed in order to make other types of interoperability work (syntactic, cross cultural, international, etc.).

The Internet is to become global in the sense of reflecting all the diversity of the world. It must reflect knowledge at all levels: locally, regionally, nationally and internationally. According to Veltman's survey [102], the dominance of English on the Internet has declined: its share was 95% in 1995, 50% in 2000 and 43% in 2001, and his projection for 2005 is 25%. Table 2.2 presents the different layers of semantic interoperability from Veltman.

Traditional approaches to information integration or interoperability have been introduced. Efforts have focused on two areas: standardization and architecture. Examples of standardization are as follows [104]:

• Syntactic standardization, such as: Dublin Core, DIGEST.

• Structural standardization, such as: GEMET, UDK.

• Data interchange protocols and media, such as: Z39.50, XML, RDF.

The architecture of the traditional approaches introduced a converter program approach. Traditional architectures are not appropriate solutions for current challenges. The Semantic Web was introduced to solve these problems by using a semantic approach. Ontology becomes the backbone tool of the semantic approach. Semantic interoperability is achieved by using terminological relationships between terms across ontologies, such as synonyms, hyponyms and hypernyms.

Early approaches to semantic integration were mainly based on the use of thesauri to translate between specific vocabularies. This approach depends on the domain of the thesauri; the solution is used in the global integrated ontology. A crucial aspect of creating the Semantic Web is to enable users who are not logic experts to create machine readable content. Some approaches to solving the semantic heterogeneity problem for semantic interoperability will be discussed in section 2.2.2.


Table 2.1: Heterogeneity in Information System, from [89]

| Type of Heterogeneity | Type of Interoperability |
|---|---|
| Information Heterogeneity | |
| Semantic Heterogeneity | Semantic Interoperability |
| Structural, Representational/Schematic Heterogeneity | Schematic Interoperability |
| Syntactic, Format Heterogeneity | Syntactic Interoperability |
| System Heterogeneity | System Interoperability |
| Information System Heterogeneity | Operational Level |
| Digital Media Repository Management Systems | |
| Database Management Systems (heterogeneity of DBMS, data models, system capabilities such as concurrency control, and recovery) | |
| Platform Heterogeneity | Low Level |
| Operating System (heterogeneity of file system, naming, file type, operation, transaction support, IPC, protocol, services) | |
| Hardware System (heterogeneity of instruction set, data representation/coding) | |


Table 2.2: Different Layers of Semantic Interoperability, from [102]

1. Containers
   Fields/Elements: Meaning of Definition of Fields

2. Contents Within the Containers
   Fields/Elements

   International
      Terminological Meaning: Domain specific semantics where the meaning of contents is negotiated worldwide (ex. ISO)
      Dictionary Meaning: Formal Meaning

   National
      Terminological Meaning: Domain specific semantics where the meaning of contents is negotiated countrywide (ex. NIST)
      Dictionary Meaning: Formal Meaning
      Corpus: Written, not formalized

   Regional
      Terminological Meaning: Domain specific semantics where the meaning of contents is negotiated regionally
      Dictionary Meaning: Formal Meaning
      Corpus: Written, not formalized
      Oral Recorded: Archived, not written

   Local
      Terminological Meaning: Domain specific semantics where the meaning of contents is negotiated locally
      Dictionary Meaning: Formal Meaning
      Corpus: Written, not formalized
      Oral Recorded: Archived, not written
      Oral: Not systematically archived


2.2.2 Semantic Interoperability Approaches

There are several ways to classify semantic interoperability approaches. The BUSTER team [104] (University of Bremen) makes a classification based on the ontology view (single/merged ontology, multiple/mapped ontologies, and shared/translation ontology). Uschold [101] also makes a classification based on ontology function. To define the architecture, Uschold refers to the following questions: (i) Who generates the agent to agent semantic mapping? a. The agent designer; b. the ontology designer; c. the agents. (ii) When is the mapping between two agents' ontologies created? a. Mappings are pre-defined before the agents interact; b. mappings are dynamically generated at agent-interaction time. (iii) What is the topology of the architecture? a. Mapping is done point-to-point between agents; b. mapping is mediated. (iv) What is the nature of the agreements among the agents? a. Agreement is on a single global ontology for all interacting agents; b. agreement is on an interlingua ontology (using a reference ontology); c. agreement is on alignments/mappings between ontologies.

In this sub-chapter, the classification is based on the links between information sources. There are two main approaches: the tightly coupled approach and the loosely coupled approach. The tightly coupled approach is based on the idea of the materialized model in traditional databases. The loosely coupled approach refers to the virtual model. The classification is shown in figure 2.1. In the tightly coupled approach, some 'components' from all sources are merged into one 'interface' for interoperability purposes. In a Global Integrated Schema, the 'interface' merges all exported schema views from all sources. In a Global Integrated Ontology, the 'interface' merges all exported schema and content (ontology) views from all sources. In the loosely coupled approach, no 'components' from all or part of the sources are put or merged into one 'interface'. Instead, the approach uses 'common references' to achieve interoperability and integration. A common ontology or vocabulary is used as the reference.

2.2.2.1 Tightly Coupled Approach

There are two models of the tightly coupled approach:

• Global Integrated Schema

• Global Integrated Ontology

Figure 2.1: Classification of Semantic Interoperability Approaches

Figure 2.2: Architecture of the Global Integrated Schema

Global Integrated Schema. A common approach to capturing information semantics is in terms of its structure. The use of conceptual models of stored information has a long tradition in database research. Conceptual models normally have a tight connection to the way the actual information is stored. This connection has significant advantages for information sharing, because the conceptual model helps to access and validate information. The Global Integrated Schema approach is based on the assumption that the structure of the information already carries some semantics in terms of the domain knowledge of the database designer. The common architecture of this approach is shown in figure 2.2.

The Global Integrated Schema approach is suitable whenever the schemas are not subject to frequent changes. In this approach, database schemas commit to the ontology of a community. The advantages of the approach are ease of querying and reduced information loss. Its drawbacks are that it supports only a small number of sources and is difficult to apply in a dynamic environment.

Figure 2.3: Garlic Architecture, from [19]

Some approaches and projects based on the Global Integrated Schema are the following. Garlic [19] is a project of the IBM Almaden Research Center (http://www.almaden.ibm.com/cs/garlic). Its purpose is to enable large-scale multimedia information systems. Figure 2.3 depicts the overall architecture of the Garlic system, which is typical of many database middleware systems. Clio has been implemented to help map between data in order to build the integrated schema. InfoMaster, by Stanford University [43], is an information integration system that provides integrated access to multiple distributed heterogeneous information sources on the Internet, presented as a virtual data warehouse. The InfoMaster architecture contains wrappers, the InfoMaster facilitator and a knowledge base (figure 2.4). MOMIS (Mediator envirOnment for Multiple Information Sources) is an approach from the University of Modena [9, 10, 11]. Its objective is to provide a global virtual view of the information coming from heterogeneous information sources. A global mediation schema (ontology) is used for the structured and semi-structured heterogeneous data sources in order to provide the user with a uniform query interface. The components of the MOMIS architecture (figure 2.5) include the wrapper, which translates between the data source and ODLI3 in both directions.

Figure 2.4: InfoMaster Architecture, from [43]

Figure 2.5: MOMIS Architecture, from [10]

Schema Matching

Schema matching is an approach to developing the schema integration of the Global Integrated Schema. Several schema matching techniques exist [87]; they are described as follows:

• Individual vs. combination: a mapping approach can use a single algorithm (individual), or combine multiple algorithms.

• Schema vs. instance based: mapping approaches can consider instance data (i.e., data contents) or only schema-level information.

• Element vs. structure granularity: matching can be performed for individual schema elements, such as attributes, or for combinations of elements, such as complex schema structures.

• Language vs. constraint based: a matcher can use a linguistic-based approach (e.g., based on names and textual descriptions of schema elements) or a constraint-based approach (e.g., based on keys and relationships).

• Matching cardinality: the overall match result may relate one or more elements of one schema to one or more elements of the other, yielding four cases: 1:1, 1:n, n:1, n:m. In addition, each mapping element may interrelate one or more elements of the two schemas. Furthermore, there may be different match cardinalities at the instance level.

• Auxiliary information: most matchers rely not only on the input schemas S1, S2 and Sn but also on auxiliary information, such as dictionaries, global schemas, previous matching decisions, and user input.
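The classification above can be made concrete with a small sketch of a combined, schema-level, element-granularity matcher that produces 1:1 correspondences. It is only an illustration under stated assumptions, not the algorithm of any tool cited in this chapter: the two schemas, the weights and the threshold are hypothetical, and the linguistic part simply uses the string similarity from Python's standard `difflib` module.

```python
from difflib import SequenceMatcher

# Hypothetical input schemas: attribute name -> declared data type.
S1 = {"cust_name": "string", "cust_id": "int", "birth_date": "date"}
S2 = {"customer_name": "string", "customer_id": "int", "dob": "date"}


def name_similarity(a: str, b: str) -> float:
    """Linguistic matcher: string similarity of attribute names, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def type_similarity(t1: str, t2: str) -> float:
    """Constraint-based matcher: 1.0 if the data types agree, else 0.0."""
    return 1.0 if t1 == t2 else 0.0


def match(s1, s2, w_name=0.7, w_type=0.3, threshold=0.6):
    """Combine both matchers (weights are assumptions) into 1:1 matches."""
    result = {}
    for a, type_a in s1.items():
        # Score every candidate in s2 with the weighted sum of both matchers.
        scored = [
            (w_name * name_similarity(a, b) + w_type * type_similarity(type_a, type_b), b)
            for b, type_b in s2.items()
        ]
        score, best = max(scored)
        if score >= threshold:
            result[a] = best
    return result
```

Calling `match(S1, S2)` pairs attributes whose names and types are close enough. Attributes that have no lexically similar counterpart stay unmatched, which is where auxiliary information such as a dictionary or user input would come in.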

Schema matching can be divided into two approaches:

• Schema mapping: a method for developing relations/mappings from a source schema to a target schema. Example tools and approaches are as follows. CLIO [53, 82, 69] is based on schemas and instances. COMA (COmbining MAtch algorithms) [31] combines algorithms in a flexible way. CUPID [64] is a schema-only based algorithm. DELTA [23] is a combined approach to attribute matching across heterogeneous databases. iMAP [28] uses schemas and instances for schema mapping/matching. LSD (Learning Source Descriptions) [32] is a schema mapping approach based on schema- and instance-level matching, supported by machine learning.

• Schema merging: a method for combining schemas to create a new schema. Tools for schema merging include ARTEMIS [21, 20, 24], whose algorithm operates on a hybrid relational/object-oriented model that includes the names, data types and cardinalities of attributes, and the target object types of attributes that refer to other objects. It computes matches by a weighted sum of name, data type and structural affinities. SemInt [63] addresses the problem of identifying relationships between attributes or classes in different database schemas.

Figure 2.6: Global Integrated Ontology
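To illustrate the difference between mapping and merging, the following minimal sketch builds a new global schema from two source schemas once correspondences between their attributes are known. It is a simplified illustration and not the ARTEMIS or SemInt algorithm; the function name, the source labels "S1" and "S2", and the convention of keeping the S1 name for matched attributes are all assumptions.

```python
def merge_schemas(s1, s2, correspondences):
    """Merge two attribute lists into one global schema.

    s1, s2: lists of attribute names of the two source schemas.
    correspondences: dict mapping an S1 attribute to its matching S2 attribute.
    Returns: dict mapping each global attribute to its (source, local name) pairs.
    """
    merged = {}
    for attr in s1:
        sources = [("S1", attr)]
        if attr in correspondences:
            # Matched attributes collapse into one global attribute.
            sources.append(("S2", correspondences[attr]))
        merged[attr] = sources

    matched_in_s2 = set(correspondences.values())
    for attr in s2:
        if attr in matched_in_s2:
            continue
        # Unmatched S2 attributes are added; prefix on a name clash.
        key = attr if attr not in merged else f"S2.{attr}"
        merged[key] = [("S2", attr)]
    return merged
```

The provenance pairs kept for each global attribute are what a query processor would need in order to rewrite a query on the merged schema into queries on the original sources.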

Global Integrated Ontology. The Global Integrated Ontology approach is a 'single ontology approach' that merges local ontologies and/or several specialized ontologies from several sources. A global ontology can provide a shared vocabulary for the specification of the semantics (see figure 2.6). This approach is a highly centralized model.

Single ontology approaches can be applied to integration problems where all information sources to be integrated provide nearly the same view of a domain. With a global integrated ontology it is easy to write queries, but problems arise when information sources change or when they cover different domains.

Projects and prototypes based on the Global Integrated Ontology approach include the following. The COG project [17, 25] is provided by CRF (Centro Ricerche Fiat). The goal of the project is to implement a single integrated (semantic) information architecture for the various information sources provided by CRF, in order to show the applicability of ontologies to information integration in industry. The COG architecture is shown in figure 2.7. The SIMS (Services and Information Management for decision Systems) project [2] at the Information Sciences Institute (ISI), University of Southern California, uses a knowledge base representation schema to provide a global knowledge model that integrates the underlying information sources (www.isi.edu/sims/). SIMS is an intermediate layer (a mediator) between information sources and human users or application programs. Although this method is relatively old, many newer approaches take ideas from it. The SIMS architecture (figure 2.8) has an intermediate layer (a mediator) that uses a single merged ontology based on a hierarchical terminological knowledge base.

Figure 2.7: COG Architecture, from [17]

Ontology Matching

Ontology matching is an effort to enhance semantic mapping. According to Noy [75], there are two major architectures for mapping discovery between ontologies: using a shared ontology, and using heuristic-based or machine learning techniques. A shared ontology is a general upper ontology that is agreed upon by the developers of different applications. Heuristic-based or machine learning techniques use various characteristics of ontologies, such as their structure, definitions of concepts, and instances of classes, to find mappings. These approaches are similar to approaches for mapping XML schemas or other structured data, but tend to rely more heavily on features of concept definitions or on the explicit semantics of these definitions.

Figure 2.8: SIMS Architecture, from [2]

Examples of shared ontologies used for mapping are SUMO (Suggested Upper Merged Ontology) and DOLCE. SUMO is developed by the IEEE Standard Upper Ontology Working Group. Its goal is to produce "a standard upper ontology that will promote data interoperability, information search and retrieval, automated inferencing, and natural language processing". The DOLCE ontology was created by the WonderWeb Project, which comprises a large number of European research groups. Its objective is to provide a common reference framework that facilitates the sharing of information by capturing the "ontological categories underlying natural language and human commonsense".

Heuristic-based or machine learning techniques are implemented in PROMPT, FCA-Merge, GLUE, etc. These techniques can be used for mapping and integrating ontologies. MAFRA, OntoMapper and SKAT are examples of ontology mapping tools. CHIMAERA, FCA-Merge, GLUE, ONION, PROMPT, SMART and OntoMerge are used for ontology merging.

There are two types of tools for ontology matching:

• Ontology mapping: a method for mapping components (a class or attribute) of one ontology to the components of another ontology. Tools based on ontology mapping include the following. MAFRA (MApping FRAmework) [90] defines the structure of specific mappings and transformation functions for transferring from one ontology to another. The MAFRA architecture is based on the KAON infrastructure. OntoMapper [83] maps between two ontologies based on a combination of IR (information retrieval) based text classification and Bayesian inference. SKAT (Semantic Knowledge Articulation Tool) [71] focuses on identifying the articulation over two ontologies, i.e., the terms where linkage occurs among the sources. SKAT is based on simple lexical and structural matching for the interaction of two web sources.

• Ontology merging: a method for combining ontologies to build a new ontology. The tools are as follows. CHIMAERA [66] is a browser-based editing, merging and diagnosis tool for ontologies. FCA-Merge [97] is a method for comparing ontologies that have a set of shared instances or a shared set of documents annotated with concepts from the source ontologies. GLUE [33] employs machine learning techniques to find mappings. ONION [84] uses rules that cross the semantic gap by creating an articulation or linkage between the systems. In OntoMerge [35], the correspondence between two ontologies is expressed as a set of bridging axioms relating classes and properties of the two source ontologies; its authors use a general-purpose inference engine to enable translation between the mapped ontologies. PROMPT [77] was developed to support ontology merging. Merging suggestions are presented to guide the user during the process. The suggestions are based on lexical and structural features, and on input from the user during an interactive merging process, to find the mapping. SMART [76] is an algorithm that provides a semi-automatic approach to ontology merging and alignment. SMART is based on an extremely general knowledge model and can therefore be applied across various platforms.
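Several of the tools above exploit instances shared between ontologies. The sketch below shows the general idea with a plain Jaccard overlap between the instance sets of two concepts. It is an assumption-laden illustration of instance-based matching in general, not the actual procedure of FCA-Merge, GLUE or any other cited tool; the concept names, instance identifiers and threshold are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Share of instances common to both concepts, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_merges(onto_a, onto_b, threshold=0.5):
    """Suggest concept pairs whose instance sets overlap strongly.

    onto_a, onto_b: dicts mapping a concept name to its set of instance ids.
    Returns a list of (concept_a, concept_b, score), best candidates first.
    """
    suggestions = []
    for concept_a, instances_a in onto_a.items():
        for concept_b, instances_b in onto_b.items():
            score = jaccard(instances_a, instances_b)
            if score >= threshold:
                suggestions.append((concept_a, concept_b, score))
    return sorted(suggestions, key=lambda s: s[2], reverse=True)
```

In a semi-automatic setting such as the one described for PROMPT and SMART, a list like this would be presented to the user as suggestions rather than applied directly.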

2.2.2.2 Loosely Coupled Approach

Semantic Mediation. The mediation approach is based on two main components. The mediator component is used to enable the creation of integrated views on the local schemas; it supports the discovery of relevant data sources and provides various query services. The wrapper component is used to map the local database to a common data model and to provide low-level data access functions.
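The division of labour between the two components can be sketched as follows. This is a minimal illustration under stated assumptions and not the design of any system reviewed below: the class names, the use of lists of dictionaries as the "common data model", and the field names are hypothetical.

```python
class Wrapper:
    """Maps one local source to the common data model (a list of dicts)."""

    def __init__(self, rows, field_map):
        self.rows = rows            # local records, as stored by the source
        self.field_map = field_map  # local field name -> common field name

    def fetch(self):
        # Rename local fields to common names; drop fields with no mapping.
        return [
            {self.field_map[k]: v for k, v in row.items() if k in self.field_map}
            for row in self.rows
        ]


class Mediator:
    """Offers one query interface over all wrapped sources."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def query(self, predicate):
        results = []
        for wrapper in self.wrappers:
            # Each wrapper hides its source's own structure from the mediator.
            results.extend(r for r in wrapper.fetch() if predicate(r))
        return results
```

Because the sources are only consulted at query time and nothing is merged in advance, adding a source only requires a new wrapper, which reflects the loosely coupled character of this approach.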


Figure 2.9: Mediation Semantic Integration Approach

Mediation, or the hybrid approach, overcomes the drawbacks of the global integrated schema/ontology approach. The shared vocabulary (common ontology) contains the basic terms (the primitives) of a domain, which are combined in the local ontologies in order to describe more complex semantics (see figure 2.9).

This approach is appropriate for a large number of dynamic information sources. However, there are some difficulties to consider, such as high processing cost, the need for more reliable validation of similarity, and the difficulty of reusing existing ontologies.

Projects and prototypes of mediation are:

AIDA [34]: the basic idea is to treat a data integration system as a finite set of parameters whose values must be set. The system administrators construct and deploy a system "shell", then ask the users (through feedback) to help the system "converge" to the correct parameter values. The approach, called MOBS (Mass Collaboration to Build Systems), is used in developing the mediated schema. In the next step, the semantic mappings for the mediated schema elements are developed as system parameters. The success of AIDA depends on user feedback.

ARIADNE [1] is a system for building mediators that can gather and integrate information from multiple Internet sources. ARIADNE is based on the SIMS project, which integrates information from heterogeneous databases on a local network. Semantic conflicts are handled using mapping tables and mapping functions. The system uses the LOOM knowledge representation for modeling data.


Figure 2.10: BUSTER Architecture, from [104]

BUSTER (Bremen University Semantic Translator for Enhanced Retrieval) [104, 109] is an approach that aims to provide a comprehensive solution for reconciling all heterogeneity problems. BUSTER considers problems at the syntactic, schema and semantic levels, and draws on several technologies, including languages, mediators, ontologies and knowledge-based classifiers. The BUSTER approach is based on MECOTA, a rule-based mediator.

The BUSTER architecture comprises an acquisition phase and a query phase. During the acquisition phase, all information needed for integration is collected. There are two components: the Comprehensive Source Description (CSD) and the Integration Knowledge (IK). The CSD holds information about each source. The IK holds information about transforming from one source to another. During the query phase, the CSD and IK are used to select appropriate data from the sources and transform it to the structure and context of the query. The BUSTER architecture is shown in figure 2.10.

The COIN (COntext INterchange) project at the Sloan School of Management at MIT [46] concentrates on customizing data retrieval and querying to the needs of heterogeneous users. The COIN data model induces a novel dichotomy between schemas and contexts, thus permitting knowledge of data semantics to be shared and reused across different systems in similar environments. The distinguishing feature of this project is that the data mappings are carried out according to the specific preferences of users.

The COIN system architecture is shown in figure 2.11. The COIN integration strategy is called the COIN model, which integrates disparate data sources by providing semantic interoperability among them.


Figure 2.11: COIN Architecture, from [46]

The COIN framework is composed of both a data model and a logical language (COINL) derived from F-Logic. The data model and language are used to define the domain model of the receiver and the data source, and the contexts associated with them. The data model contains the definitions of the types of information units (called semantic types) that constitute a common vocabulary for capturing the semantics of data in disparate systems.

The context mediator is the heart of the COIN project. This part provides mediation for user queries. Mediation is the process of rewriting queries posed in the receiver's context into a set of mediated queries in which all potential conflicts are explicitly resolved. This process is based on an abduction procedure, which determines what information is needed to answer the query and how conflicts should be resolved by using the axioms in the different contexts involved. Answers generated by the mediation unit can be both extensional and intensional. Extensional answers correspond to the actual data retrieved from the various sources involved. Intensional answers, on the other hand, provide only a characterization of the extensional answer without actually retrieving data from the data sources.
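The kind of conflict that context mediation resolves can be illustrated with a value that is stored under one context and requested under another. The sketch below is not COINL and does not implement the abduction procedure; it only shows, under assumed contexts (a currency and a scale factor) and a hypothetical exchange-rate table, how a value is converted from the source context to the receiver context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Assumed context description: how a monetary value is to be read."""
    currency: str
    scale: int  # e.g. 1000 means the stored number is in thousands


# Hypothetical conversion rates to a pivot currency (USD).
RATES_TO_USD = {"USD": 1.0, "JPY": 0.01}


def mediate(value: float, source: Context, receiver: Context) -> float:
    """Convert a value reported in the source context to the receiver context."""
    # Step 1: undo the source scale and convert to the pivot currency.
    in_usd = value * source.scale * RATES_TO_USD[source.currency]
    # Step 2: convert to the receiver currency and apply the receiver scale.
    return in_usd / RATES_TO_USD[receiver.currency] / receiver.scale
```

For instance, `mediate(5, Context("JPY", 1_000_000), Context("USD", 1_000))` re-expresses 5 million yen in thousands of dollars at the assumed rate. A mediated query would embed such conversions in the rewritten query instead of applying them to each retrieved value afterwards.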

DILEMMA, Université de Bourgogne [56]: DILEMMA is based on a mediation architecture for interoperability. The approach uses a context matching mechanism, without requiring an ontology, to discover related information. The semantic object model associates the semantics of information with its structure. The contexts are represented by terminological graphs as abstractions of the semantics of information.

HERMES (HEterogeneous Reasoning and MEdiator System), a project at the University of Maryland [98], is similar to TSIMMIS (www.cs.umd.edu/projects/hermes/). A mediator can be developed with a declarative mediator specification language and is divided into domain integration (transforming the results of information sources into a form that can be used by different sources) and semantic integration (for conflicts arising from the sources). A rule-based language is used for semantic integration. A special focus of HERMES is instance conflicts during integration.

KRAFT uses a shared ontology [85] as a basis for mapping between ontology definitions and for communication between agents. By detecting a set of ontology mismatches, KRAFT can establish mappings between a shared ontology and local ontologies. To map ontologies, the KRAFT architecture uses the following mapping techniques: class mapping, attribute mapping, relation mapping, and compound mapping. There are three kinds of agents in the KRAFT architecture: wrapper, facilitator, and mediator.

MECOTA [109] is a rule-based mediator which maps the sources to an integrated view by means of transformation rules that reconcile structural heterogeneity problems. The rules for the integration (integration rules) are similar to those of rule-based mediators. Context transformation for semantic heterogeneity is handled by context rules. Context rules and integration rules together compose the set of transformation rules which configure the rule-based mediator. During the inference process, when a query is split into sub-queries for each source, the context transformation and the integration are interleaved.

OBSERVER (Ontology Based System Enhanced with Relationships for Vocabulary hEterogeneity Resolution) [68, 67] uses a component-based approach to ontology mapping. To avoid a global schema, "brokering" capabilities are implemented to support distributed querying across domain ontologies. The ontology repository is expressed in Description Logic (DL).

OBSERVER has several component nodes, one of which is the user node. Every node has an ontology server that provides definitions for the terms in the ontology and retrieves the data underlying the ontology in the component node. Translation is not always completely correct, since there is loss of information. To handle translation between different component ontologies, an inter-ontology relationship manager (IRM) is introduced. The IRM effectively contains a one-to-one mapping. The OBSERVER architecture is shown in figure 2.12.

Figure 2.12: OBSERVER Architecture, from [68]

The TSIMMIS (The Stanford IBM Manager of Multiple Information Sources) project at Stanford University [22] develops a set of tools for accessing information in an integrated form from multiple heterogeneous sources by using mediators and translators/wrappers (www-db.stanford.edu/tsimmis/). TSIMMIS concentrates on optimizing user queries according to the capabilities of different data sources. TSIMMIS has successors such as LORE and MIX, which move to the XML model. The objective of TSIMMIS is to generate translators/wrappers and mediators. The TSIMMIS architecture contains translator, translator generator, mediator and mediator generator components (see figure 2.13).

2.2.3 Comparison

The comparison is based on criteria that are explained in sub-section 2.2.3.1. The result of the comparison is provided in sub-section 2.2.3.2.

2.2.3.1 Comparison Criteria

The comparison criteria focus on the data model, use of ontology, level of mapping, degree of automation, query language, availability of an architecture, and prototype/project implementation.


Figure 2.13: TSIMMIS Architecture, from [22]

• Data model: the data model used to achieve semantic integration/interoperability.

• Use of ontology: ontology is the "backbone" tool for semantic integration/interoperability. There are three types of ontology use: the single, multiple and hybrid ontology approaches.

• Level of mapping: mapping is an important part of the integration/interoperability process. As shown in table 2.1, information interoperability involves three kinds of heterogeneity: syntactic, structural and semantic. The comparison distinguishes the level of mapping between schema and ontology mapping/merging.

• Degree of automation: the degree of automation can be manual, semi-automatic or fully automatic.

• Query language: there are several types of query language; the type used by each approach is presented in the comparison.

• Architecture: whether an architecture is available for the approach.

• Implementation: some projects or approaches have developed a prototype or an implementation in a specific area, while others remain at the concept level.

2.2.3.2 Comparison Table

With reference to the comparison criteria above, a careful evaluation has been conducted for several approaches; the results are presented in table 2.3.

Table 2.3: Comparison of Semantic Integration/Interoperability

| System | Data Model | Use of Ontology | Level of Mapping | Degree of Automation | Query Language | Architecture | Implementation |
|---|---|---|---|---|---|---|---|
| GARLIC | OO | Single, schema level | Schema integration | semi, supported by CLIO | OSQL?, global | Yes | |
| InfoMaster | ACL, KQML, KIF | Single, schema / ontology level | Schema integration | semi | SQL and OQL, global | Yes | Auto Car Manufacturing |
| MOMIS | ODLI3 | Single, schema / ontology level | Schema integration | semi | global | Yes | A few test |
| COG | XML | Single, schema and ontology mapping | Ontology merging | manual | SQL like, global | Yes | Manufacturing |
| SIMS | Loom | Single, schema and ontology mapping | Ontologies sharing | manual | Loom / SQL | Yes | |
| AIDA | | Mediated schema | Schema / ontology mapping | semi | | | Simulation |
| ARIADNE | Loom | Mediated | Schema mapping | semi | SQL | Yes | Geographic Information Display |
| BUSTER | XML, RDF, OWL | Common & local ontology | Ontology mapping / semantic translation | semi | SQL like, QXML, QTDF, global | Yes | GIS, Biology |
| COIN | Relational + F-Logic | Mediated & context | Schema mapping | semi | SQL, global | Yes | |
| DILEMMA | Description Logic | Mediated & context | Context matching | | OQL | Yes | |
| HERMES | Logic based | Mediated schema, propose mediator | Schema mapping | semi | SQI, rule base (Prolog like) | | US Army Corps of Engineers |
| KRAFT | | | | semi | ?, local | | bibliography |
| MECOTA | | Rule-base and integration rule mediator | Ontology mapping | manual | | | |
| Observer | Description Logic | Mediated ontology | Ontology mapping and translation | Manual | CLASSIC, local | Yes | Bibliography |
| TSIMMIS | OEM | Local and Common ontology | Ontology mapping | semi | Lorel MSL | Yes | |
| InfoSleuth | KQML | Local and Common Ontology | Schema and Ontology mapping by Agents | semi | SQL & KIF, global | Yes | Environment EDEN |

2.3 GIS Interoperability

2.3.1 GIS Interoperability

Classifications of GIS interoperability have been proposed by several researchers. A classification can be based on content, process or network architecture. Some classifications from different points of view are as follows:

• Visser [105] distinguishes four levels of integration for GIS:

- Technical integration: the WWW has provided a good infrastructure for exchanging large amounts of data and information around the world.

- Syntactic integration: many standards have been introduced as gateways, such as ODBC, HTML and XML.

- Structural integration: mappings and mediators are used in the solution.

- Semantic integration: semantic translation or integration denotes the resolution of semantic conflicts.

• According to Kokou [113], the levels of interoperability can be divided into the platform, syntactic, and application levels. The platform level concerns system interconnection and data exchange formats. The syntactic level handles model and language conflicts. The application level resolves problems of schematic conflicts, semantic conflicts and data localization.

• Amarnath [48] gives five classes: cataloging, gateway, data warehousing, mediator-based system, and hybrid approaches. Cataloging of geo-sources uses locational identifiers. The gateway approach defines a universal schema and persistent views over a variety of data sources, and is similar to the Global Integrated Schema. Data warehousing is a materialized global view; this approach is appropriate for a small number of sources. The mediator-based system is similar to the concept of a federated database; it is based on a three-level architecture with wrapper, mediator and user interface layers. The hybrid approach combines features of the previous approaches.

• The architecture models for GIS interoperability [111] are: centralized, two-tier C/S, three-tier C/S, multi-tier C/S, P2P, and distributed. A centralized GIS is controlled by a single server or operating system. A two-tier client/server GIS is split into two functional layers (client and server). In a three-tier client/server GIS, the client is divided into a user interface and an application server. The multi-tier client/server model is a generalization of the three-tier model, allowing more than one application server. P2P is a decentralized system model. The distributed model is a combination of several types of architecture.


2.3.1.1 GIS and Web Technology

Internet-based GIS has become an infrastructure and tool for sharing and remotely processing geographical data. Developing web-based systems that integrate GIS and RDBMS serves two crucial purposes. Firstly, it allows the user to operate the system without having to understand GIS and RDBMS technology. Secondly, it allows the sharing of information and technical expertise among a wide range of users. Currently, the user needs only a Web browser and access to the Internet.

Several efforts to implement web-based GIS interoperability have been conducted. OGC (www.opengeospatial.org) has proposed the Web Feature Server (WFS), Web Map Server (WMS), Web Coverage Server (WCS), Web Registry Service (WRS) and OGC Web Services (OWS). ESRI [37] has developed document metadata and GIS to support information searching in the Internet environment. Some examples of web-based/Internet GIS interoperability are the following. Grid Technology and OGC Web Service in NWGISS for NASA EOS data [29] is a project that implements OGC Web Services (OWS) as a web-based service system. The GRASS GIS environment [86] is an on-line implementation of a spatial database. DISGIS [14] is an architecture for Intranet/Internet interoperability, based on the ISO RM-ODP model.
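As an illustration of how a client addresses one of these OGC services, the sketch below assembles a WMS GetMap request as a URL with key-value parameters. The parameter names follow the WMS 1.1.1 convention; the server address, layer name and bounding box are hypothetical placeholders, and the sketch only builds the URL without contacting any server.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width, height,
                     srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL for a single layer.

    bbox is (min_x, min_y, max_x, max_y) in the units of the given SRS.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical server and layer, for illustration only.
url = build_getmap_url("http://example.org/wms", "roads",
                       (95.0, -11.0, 141.0, 6.0), 800, 400)
```

Because the request is a plain URL, any Web browser or HTTP client can issue it, which matches the observation above that the user needs only a browser and Internet access.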

2.3.1.2 Standardization

GIS technology has been moving from a traditional community model to a worldwide community model. It is also being used to share crucial information across organizational boundaries via networks, so a common method of communication between organizations is essential. As a first step toward GIS interoperability, standardization has been introduced. During the past 20 years, the concepts, standards and technology for implementing GIS interoperability have evolved through six stages (OpenGIS): data converters, standard interchange formats, open file formats, direct-read application programming interfaces, common features in a database management system, and integration of standardized GIS Web services.

The OpenGIS Consortium (OGC) (www.opengis.org) [79, 78] is working on the standardization of components (OGIS - Open Geodata Interoperability Specification). However, the OGC topic Semantics and Information Communities (topic 14) has not been considered in depth [105]. Other organizations involved in standardization are: Digital Earth [26], ESRI (spatial data standard [38], interoperability between ESRI and the OGC specification [41], ArcGIS [39], ESRI Shapefile [40]), DGIWG (Digital Geographic Information Working Group), FGDC (Federal Geographic Data Committee), GSDI (Global Spatial Data Infrastructure), etc.

2.3.1.3 Mediation Approach

Semantic heterogeneity is an important problem to solve in GIS interoperability. To achieve interoperability [93], the first prerequisite is that individuals and organizations know each other and the data they process. Second, there must be a willingness to make data available to users outside the source organization. Leonid [93] makes the following presumptions for GIS interoperability: (i) simple - users should not have to understand all details about the data or their source system to import and use them; (ii) transparent - complexities associated with data transfer should be hidden from users; (iii) open - interoperability should apply to all systems, and data exchange should be independent of the technology used; (iv) equal - systems are equal and autonomous; (v) independence - systems have the exclusive right to control their information and information processing without centralized control; (vi) effective - data transfer should be reliable, and the resultant data should be useful for the intended purposes; (vii) universal - all geospatial databases should be accessible; and (viii) belonging - each system belongs to one GI community and has its own institution, policy, culture and value viewpoint.

Examples of mediation approaches for GIS follow. The MIX (Mediation of Information using XML) project [48] is a system based on a wrapper-mediator architecture for obtaining logical integration. Its basic architecture contains a wrapper, a spatial mediator, other mediators, and a main or application mediator.

The GeoNis project [93, 91] uses the ORCHIDEA [92] mediator platform. Orchidea is a middleware mediator system and consists of Ginis-Web [95], GinisNT [94], etc. The Orchidea ontology follows a top-level and hybrid ontology model, uses the ontology as a translator, and bases queries on SQL using XML-DTD.

ISIS (Interoperable Spatial Information System) is a semantic mediation model and an agent-based architecture for GIS interoperability [61]. An OO data model and a mediation architecture based on the multi-agent paradigm are implemented to support GIS interoperability. The AMUN data model [62] is used to represent information at both the wrapper and cooperation levels. The cooperation level is the key to expressing semantic heterogeneity among information sources. There are two types of context at the cooperation level in the ISIS architecture: the reference context, which holds the common semantics of an application domain, and the cooperation context, which interprets the common reference context in terms of local objects. The architecture of ISIS is shown in figure 2.14.

Figure 2.14: ISIS Architecture [61]

SemWeb [16] is a model for information sharing (SFDS, semantic formal data structure). It consists of three layers: syntactic, schematic and semantic. SemWeb has three components: the resource discovery server, the information resources server, and the client module (see figure 2.15).

BUSTER is a middleware for semantic integration that uses ontologies for data integration. BUSTER/Q is for information retrieval and BUSTER/IS is for semantic integration. Both tools can be integrated into an infrastructure of distributed geoservices. According to Eva [106] and [60], BUSTER supports spatial representation and reasoning, queries, and information brokering; the paper demonstrates how to implement BUSTER as a semantic translator for GIS interoperability.
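The wrapper and mediator roles shared by these systems can be sketched in a few lines. This is a generic illustration of the pattern, not the implementation of MIX, GeoNis, ISIS, SemWeb or BUSTER; the class names, field names and road records are all hypothetical.

```python
class Wrapper:
    """Exposes one source's local records in the mediator's common schema."""

    def __init__(self, name, records, field_map):
        self.name = name
        self.records = records
        self.field_map = field_map  # local field name -> common field name

    def fetch(self):
        for record in self.records:
            # Rename local fields to common ones; unmapped fields stay hidden.
            common = {self.field_map[k]: v
                      for k, v in record.items() if k in self.field_map}
            common["source"] = self.name
            yield common


class Mediator:
    """Sends one query to every wrapper and merges the answers."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def query(self, predicate):
        return [row for w in self.wrappers for row in w.fetch() if predicate(row)]


# Two hypothetical sources that describe roads with different field names.
district = Wrapper("district",
                   [{"nama": "Road A", "panjang_km": 12.5}],
                   {"nama": "name", "panjang_km": "length_km"})
province = Wrapper("province",
                   [{"road": "Road B", "len": 4.0}, {"road": "Road C", "len": 30.0}],
                   {"road": "name", "len": "length_km"})

mediator = Mediator([district, province])
long_roads = mediator.query(lambda row: row["length_km"] > 10)
print([row["name"] for row in long_roads])
```

The user formulates the query once against the common schema, which corresponds to the "simple" and "transparent" presumptions: the details of each source stay inside its wrapper. A fixed field map resolves only naming differences; the semantic conflicts discussed in this section need richer descriptions such as contexts or ontologies.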

Comparison

The approaches are evaluated against the following criteria:

• Purpose. Approaches may have different purposes even though they try to solve the same problem.


Figure 2.15: SemWeb Architecture, from [16]

• Model. Modeling is an important point for semantic heterogeneity. With appropriate modeling, an approach can provide a better solution.

• Architecture. Semantic mediation generally has a basic architecture of wrapper and mediator.

• Query. Many issues arise under the query topic; here, the query language is considered.

• Technology. An approach is not always directly tied to a technology; the technology used to implement the approach is evaluated.

• Prototype. A prototype is a way to validate an approach. The availability and domain of the prototype are considered.

Based on the above criteria, table 2.4 compares several approaches to semantic mediation for GIS interoperability. According to the table, GeoNis has a real implementation and was developed from the beginning specifically to solve semantic problems in GIS interoperability. BUSTER is a general solution that has been adapted to GIS interoperability. SemWeb is similar to BUSTER, although its technology and maturity are slightly behind BUSTER's because SemWeb was developed around 3 years earlier than BUSTER.

2.3.2 Emerging Solution

Given the presumptions for GIS interoperability (2.3.1.3), several issues arise: a large number of clients and providers, the dynamic environment of the Internet, and greater autonomy of sources. Some efforts have been made toward emerging solutions that combine semantic mediation, GIS interoperability and P2P. These emerging solutions are examined under three headings:

• GIS Interoperability and P2P

• Semantic Web and Peer-to-Peer

• Semantic Mediation, GIS Interoperability and P2P.

2.3.2.1 GIS Interoperability and P2P

The main purpose of this emerging solution is to handle a huge number of sources in a dynamic and autonomous environment. For example, technology allows every person and organization to act as a spatial information provider. At the individual level, however, resources (computing power, bandwidth, content, etc.) are limited. This creates a huge number of small information providers that can leave and join at any time.

Implementations of P2P environments for GIS interoperability can be categorized as follows:

• File sharing. The large size of GIS data calls for strategies for distributing and accessing the files.

• Process sharing. GIS processing is very processor-intensive; distributing the processing over the available machines gives better results.

• Information sharing. Information sharing matters most when information about a GIS object must be obtained from several views.

APPOINT [99] is an approach that enhances a client-server network to exchange very large geodata. Sources and clients are found using the current server and a simple algorithm (figure 2.16). This approach is called centralized P2P. DBGlobe [81] manages super-databases / peers. DBGlobe connects a number of peers which, in the DBGlobe context, are called mobile devices or PMOs (primary mobile objects) to emphasize the mobility aspect. P2PGIS (P2P geographic information system) is the sharing of computer-based geographic resources and services by direct exchange between systems. It takes advantage of existing computing power and networking connectivity. OPUS (OPen Use Server) [5] is the first implementation of a P2PGIS file sharing architecture, called RoMap.net (Rapid Online Mapping Network).

Figure 2.16: APPOINT Architecture, from [99]

Table 2.4: Comparison of Semantic Mediation Approaches for GIS Interoperability

Purpose
  MIX: attain 'logical integration'
  GeoNis: integration of heterogeneous GIS and non-spatial data over the Internet
  ISIS: a dynamic resolution of semantic conflicts
  SemWeb: resolve the three heterogeneities
  BUSTER: resolve the three heterogeneities, with emphasis on semantic translation

Model
  MIX: associate rules to integrate
  GeoNis: OGC simple feature
  ISIS: OO data model
  SemWeb: semantic formal data structure
  BUSTER: comprehensive source description and integration knowledge

Architecture
  MIX: wrapper, spatial mediator, other mediator, main / application mediator
  GeoNis: wrapper, mediator, top / sharing ontology
  ISIS: agent for wrapper, cooperation, interface, ontology, semantic router, cooperation bus
  SemWeb: client and provider modules, each containing a semantic translator and a knowledge base
  BUSTER: same as in the semantic interoperability chapter

Query
  MIX: XMAS (XML-Q), with different queries for spatial and non-spatial data
  GeoNis: XML & SQL
  ISIS, SemWeb, BUSTER (only two entries survive for these three columns, in source order): SQL; RDF-Q, XML-Q and Prolog-like

Technology
  MIX: XML, web-based
  GeoNis: XML, Internet, web-based
  ISIS: Java, CORBA, OO
  SemWeb: ODBC, Java, NetExpert, Web server
  BUSTER: XML, RDF, OWL

Prototype
  MIX: research
  GeoNis: telecom
  ISIS: research
  SemWeb: research
  BUSTER: research

2.3.2.2 Semantic Web and Peer-to-Peer

The centralized approach with the classic client-server model has difficulty coping with a dynamic environment, a large number of participants in information integration, asymmetric traffic between information providers and clients, and information providers that have limited resources and overly specific content.

Napster, Gnutella and SETI@home made the P2P paradigm popular as a form of distributed computing between traditional distributed systems and the web. In such a dynamic system, the existence of a global schema for all data or information in a P2P network cannot be assumed [55]. The emerging combination of the semantic web and P2P aims to solve problems in information integration, from information searching to processing and querying. Mapping issues [59] and view rules [65] are among the interesting topics for the semantic web and P2P; some approaches are introduced below.

The SWASIE project (www.swasie.org) [7] aims at implementing an advanced search engine that provides users with intelligent access to heterogeneous information on the Internet. The SWASIE architecture has the following components: information nodes (SINodes), brokering agents (BAs), query agents (QAs), and a user interface. The architecture is based on mediator systems (based on MOMIS) with a global ontology (WordNet / EuroWordNet).

Diego [18] studied the problem of data integration in P2P systems. He proposes a new approach based on epistemic logic and shows that it not only adequately models interactions among peers but also supports decidable query answering.

The Edutella project [74, 73] aims to provide an RDF-based metadata infrastructure for P2P applications. The system brings together RDF and P2P concepts and exploits their strengths in a common framework suitable for building general schema-based P2P networks for distributed and dynamic information providers. Its query language is RQL, an SQL-like RDF query language.

PDMS (Peer Data Management System) [51, 50, 100] is a data management architecture in which any user can contribute new data, schema information, or even mappings between other peers' schemas in a decentralized environment. PDMS contributes a flexible language for mediation between peer schemas.

2.3.2.3 Semantic Mediation, GIS Interoperability, and Peer-to-Peer

Semantic mediation has been implemented in many cases of interoperability, including GIS interoperability. The above discussion of GIS interoperability and P2P shows that the problem of semantic heterogeneity still remains. An emerging solution that combines the three can provide a better approach to current problems, even though merging and synchronizing them is more complicated. Two papers discuss this emerging solution; they are presented in the following paragraphs.

PSW (P2P Semantic Web) [3] provides capabilities for finding relevant sets of ontologies, facilitating reuse of existing ontologies, and advertising the resulting ontologies. This approach takes the objectives of the InfoQuilt project a step forward, into the P2P environment. A specification framework (DAML+OIL) is used for independently creating, maintaining, and interoperating ontologies while preserving their semantics, and P2P is used to provide a distributed architecture that can support sharing of independently created and maintained ontologies. This model uses a register / publish mode to a directory or directories.

Figure 2.17: PSW InfoQuilt Architecture, from [3]

InfoQuilt is an agent-based system: each user in the P2P network runs a multi-agent Information Brokering System. The information and services that a user is interested in are semantically identified using the Knowledge Space Navigation algorithm, which provides a means of locating relevant ontologies / content. Users can build IScapes using the IScape Builder, utilizing the most relevant knowledge about concepts, which is semantically identified and retrieved from the P2P Knowledge Sharing network. IScape Builder and IScape Execution, together with the supporting Knowledge Base, are the main components of the PSW & InfoQuilt architecture (figure 2.17).

BUSTER in P2P [107] is a concept for spatial metadata based on qualitative spatial reference models and their application in P2P networks. BUSTER is a data broker that offers intelligent data retrieval coupled with data integration and semantic translation services [103]. BUSTER can be applied within the framework of P2P data sharing networks as well. For intelligent thematic queries and semantic data integration, the system uses terminological reasoning based on formal ontologies. The illustration in [107] is the case of a real estate agency and a client looking for a new house. The system supports queries of the type concept@location. Because they lack metadata support, most state-of-the-art P2P systems are not well equipped to handle networks for the exchange of geo-referenced data.
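The concept@location query type can be illustrated with a small sketch. It is not BUSTER's implementation: a hand-written concept hierarchy stands in for terminological reasoning, a part-of table stands in for the qualitative spatial reference model, and all names, records and the function `search` are assumptions made for this example.

```python
# Hypothetical knowledge used only for illustration.
SUBCONCEPTS = {"house": {"villa", "townhouse"}}       # concept -> narrower concepts
PART_OF = {"Depok": "West Java", "Bogor": "West Java"}  # place -> containing place


def parse_query(text):
    """Split 'concept@location' into its two parts."""
    concept, sep, location = text.partition("@")
    if not sep or not concept.strip() or not location.strip():
        raise ValueError("expected a query of the form concept@location")
    return concept.strip(), location.strip()


def concept_matches(query_concept, record_concept):
    """A record matches if it has the queried concept or a narrower one."""
    return (record_concept == query_concept
            or record_concept in SUBCONCEPTS.get(query_concept, set()))


def location_matches(query_location, record_location):
    """A record matches if its place is the queried place or lies inside it."""
    current = record_location
    while current is not None:
        if current == query_location:
            return True
        current = PART_OF.get(current)
    return False


def search(text, records):
    concept, location = parse_query(text)
    return [r for r in records
            if concept_matches(concept, r["concept"])
            and location_matches(location, r["location"])]


records = [
    {"id": 1, "concept": "villa", "location": "Depok"},
    {"id": 2, "concept": "office", "location": "Bogor"},
    {"id": 3, "concept": "house", "location": "Jakarta"},
]
hits = search("house@West Java", records)
print([r["id"] for r in hits])
```

A plain keyword match on "house" and "West Java" would miss record 1, whose metadata mention neither term; the thematic and spatial reasoning steps are what make it retrievable.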


2.4 Summary

2.4.1 Summary of Semantic Interoperability

In this sub-chapter, some approaches to semantic interoperability have been introduced. The approaches can be divided into two generic groups: tightly coupled solutions (global integrated schema and global integrated ontology) and loosely coupled solutions (mediation). The mediation approach is more appropriate for the characteristics of the current environment (heterogeneity, dynamic environment, huge number of participants, etc.).

The interoperability paradigm is affected by mapping issues. This chapter has addressed some approaches to mapping and merging schemas and ontologies. Currently, making mapping and merging fully automatic remains a big challenge.

Semantic mediation is a promising way to address the current conditions for interoperability. Even so, semantic mediation faces several problems [8], which can be categorized as:

• Availability of content: the use of semantics to describe WWW content is still very limited.

• Scalability: the growth of pages drives the growth of the Semantic Web, so a method for handling this growth in a scalable manner is important.

• Multilinguality at machine and human level: there are many languages for creating the Semantic Web, and there are also many human languages.

• Stability of Semantic Web languages: standardization efforts remain important to support the Semantic Web.

• Ontology availability, development and evolution: ontologies are an important tool for the Semantic Web, but their development and maintenance need more mature methods.

• Visualization: as information overload increases, users will demand visualization approaches that differ from the usual hypertext.


2.4.2 Summary of GIS Interoperability

In this sub-chapter, the motivation and main problems of GIS interoperability have been addressed. The Internet has contributed to the move from traditional single GIS systems to distributed GIS systems. Interoperability in distributed GIS is needed to face current problems. Some approaches to GIS interoperability have been introduced, from the low level to the information level. This sub-chapter focuses on interoperability with semantic mediation. GeoNis, BUSTER and SemWeb offer good approaches to GIS interoperability using semantic mediation.

An emerging solution combining semantic mediation, GIS interoperability and P2P can deliver a better solution. PSW and BUSTER contribute ideas for applying these solutions to GIS interoperability. Semantic problems and ontology remain an important part of this emerging model.


Chapter 3

Methodology

3.1 Motivation

Government institutions carry out planning, policy making, and decision making as regular activities. Information technology contributes greatly to these activities, having introduced in the last decade new applications and paradigms such as e-Government, e-Commerce, and e-Education. Increasingly, e-government decision making requires gathering information from many sources or departments. Decisions can be made at many government levels, including national, province, district and sub-district. Each level has different needs in terms of sources, types, and detail of information. For example, a district government may need information about the length and condition of roads for budget planning and maintenance, while a province may use traffic density information to evaluate economic activity. The national government requires road classification types, such as pay road, toll road, highway, etc., for tax calculation. Moreover, decisions made at one level can be based on information defined at other levels. Interoperability and a mediation approach are needed to allow information sharing among the levels.

In many applications, data are highly correlated with geographic information. With the rapid development of GIS, more and more geographical databases have been developed by different programs and applications. Unfortunately, data sharing and acquisition are still big challenges for the development of GIS applications. A large amount of geographical data is stored in different places and in different formats. However, data reuse by new applications and data sharing are hindered by the heterogeneity among existing systems: heterogeneous data modeling concepts, data encoding techniques, storage structures, etc. [14]. This situation is even worse in large developing countries like Indonesia, where large amounts of spatial data are stored in various paper and digital formats. For example, to develop a decision making system for land transportation, the Indonesian government must consider information from departments such as the Ministry of Internal Affairs, the Police Department, and the Ministry of Public Works. Indonesia is composed of 13,000 islands, among which the population is not equally distributed. As a result, economic, social, and transport activities are very diverse. Other problems of land transportation relate to flooding during the rainy season and to very high movement of people within short periods such as New Year.

The Internet allows efficient communication among computers and people around the world. On the other hand, it raises new challenges related to the large number of information sources, the heterogeneity of sources, security, and trust in data sources. Peer-to-peer (P2P) based architectures can be used to tackle the problems related to searching and finding appropriate data sources because they allow the development of communities with common interests. They allow reliable search of information sources by limiting the number of sources that are searched and by creating peer clusters or groups to reduce source diversity.

To achieve this goal, methods, concepts and results from research areas based on semantic interoperability, GIS interoperability and P2P architectures are combined to provide a better result.

3.2 Research Framework

Decision support increasingly requires the ability to manage, combine and use information from heterogeneous sources. Figure 3.1 depicts a framework for decision support based on four pillars related to semantic interoperability, GIS interoperability, query processing and P2P architectures. In the following, the main characteristics and issues of the pillars are presented.


Figure 3.1: Four Pillars Research Framework

3.2.1 Semantic Interoperability, characteristics and issues

Research on interoperability is shifting from the communication level to a higher operational level where data semantics play an important role. Information heterogeneity can occur at the syntactic, structural and semantic levels. Semantic interoperability is more acute in dynamic and autonomous environments, due to the lack of relationships among sources. Semantic interoperability is essentially based on having a common understanding of the meaning of the information exchanged by different sources. It is a multi-level problem that can occur in data sources, formats or models.

The main characteristics of the semantic interoperability pillar are:

• Heterogeneous semantics arise from the diversity of information sources. This diversity can stem from anything between the model used to represent the information and its semantic representation. Traditional models deal with structured data, such as relational and object-oriented databases. Images, music, and multimedia form a newer paradigm in data modeling, called unstructured data. The Web is a popular medium for information sharing and is represented by the semi-structured data model. Semantic diversity is considered here at the level of the semi-structured model.


• Semantic descriptions of sources, contexts and metadata can be used to enable data sharing. A context represents the explicit constraints and assumptions related to the definition and usage of data. Different models have been used to define contexts, including logic rules, conceptual graphs, metadata and terms of an ontology.

- Semi-structured data models such as XML are used to model web-oriented data and to facilitate interoperability. XML is widely accepted as the de facto standard for data modeling and exchange in web environments. XML (eXtensible Markup Language) is a textual language that provides a structural description (and related semantics) of information. It allows schemas to be defined in XML Schema or grammars to represent the conceptual characteristics of information systems. XML is a meta-language from which several variants have been derived for different domains in the last few years. Examples of languages derived from XML include GML (Geography Markup Language) and GPSML (Global Positioning System Markup Language). XML is the foundation syntax for the upper layers of semantic and ontology models, such as RDF and OWL.

- Ontologies are increasingly used in interoperable systems to capture the meanings and relationships of concepts used in various domains. Gruber [47] defines an ontology as an explicit specification of a conceptualization of the real-world entities of an application domain. An ontology is a vocabulary composed of terms and the relationships among them.

- Several studies have been devoted to ontology representation languages: they range from informal natural languages to formal languages based on predicate logic or graph concepts. Among the formal languages, OWL, which has recently gained in popularity, is a description language aimed at incorporating theory-based semantics and an ontological inference and reasoning mechanism into RDF.
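As an illustration of the semi-structured model, the sketch below reads a small XML fragment that describes a road with a GML-style geometry. The `gml` namespace and the `LineString` / `coordinates` elements follow GML 2 conventions; the `app` namespace, its element names, the sample values and the function `read_road` are hypothetical.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
APP_NS = "http://example.org/roads"  # hypothetical application schema

DOC = f"""<app:Road xmlns:app="{APP_NS}" xmlns:gml="{GML_NS}">
  <app:name>Road A</app:name>
  <app:roadClass>toll road</app:roadClass>
  <gml:LineString srsName="EPSG:4326">
    <gml:coordinates>106.83,-6.37 106.84,-6.36</gml:coordinates>
  </gml:LineString>
</app:Road>"""


def read_road(xml_text):
    """Extract the name, class and geometry of one road feature."""
    ns = {"app": APP_NS, "gml": GML_NS}
    root = ET.fromstring(xml_text)
    coords_text = root.findtext("gml:LineString/gml:coordinates", namespaces=ns)
    # Coordinate tuples are separated by whitespace, values by commas.
    points = [tuple(float(v) for v in pair.split(","))
              for pair in coords_text.split()]
    return {
        "name": root.findtext("app:name", namespaces=ns),
        "road_class": root.findtext("app:roadClass", namespaces=ns),
        "points": points,
    }


road = read_road(DOC)
print(road["name"], road["road_class"], len(road["points"]))
```

The fragment is self-describing at the structural level, but nothing in it says what "toll road" means or how it relates to another source's classification; that is the gap the ontology layer (RDF, OWL) is meant to fill.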

Some of the issues involved in semantic heterogeneity are:


• How to represent the sources? An appropriate representation is needed to allow efficient interoperability. The focus should be put on the available theoretical background, standards, and technologies. Moreover, there is a need for tools and languages to capture the content, context and constraints in order to share information.

• How to resolve diversity among data sources (syntactic, structural, semantic)? In order to exchange information, a connection between information providers must be established and the differences in their syntactic and semantic representations of data resolved. Syntactic differences arise from the data model used to describe the data. Schemas, semantics and ontologies have been introduced to reconcile this issue. Some approaches and methodologies are available to deal with agreement / mapping. An approach that is flexible and efficient in developing agreements under current environmental conditions is important.

3.2.2 GIS Interoperability, characteristics and issues

In addition to the problems involved in semantic interoperability of traditional data, GIS interoperability must take into account spatial and temporal characteristics. Geographic information systems are commonly used in many spatial and environmental applications. There are well-known GIS software systems, such as ESRI ArcInfo, Smallworld GIS, Intergraph GeoMedia, MapInfo Professional and GRASS, offered by different vendors who have their own proprietary software designs, data models, and database storage structures. This results in a large number of ad hoc independent spatial data repositories created for specific purposes and described in various formats. To exchange information and share computational geo-data resources among heterogeneous systems, conversion tools are usually developed to transfer data from one format to another. GIS data are composed of images and traditional databases; geographic data exhibit complex structure, large size, and complex semantics. The motivation for the interoperability of geographical information systems is to provide users with the architectures and tools to:

• share existing collections of spatial data sets created by different institutions and government agencies. However, spatial data sharing can be hindered by data volume and conflicts arising from the diversity and complexity of spatial data structures.

• reuse spatial data to avoid duplicating the costs of data modeling, acquisition, storage and maintenance. Data reusability is based on the fact that many applications require data that are collected on common spatial entities. Examples of these shareable collections of spatial data are cadastral data, various city maps, earth sciences data, weather information, and satellite data or photos of different regions of the earth.

• allow decision support systems that are based on multiple sources of information. Hence the need for tools to access heterogeneous data (diversity of data formats) and to resolve semantic spatial ontology discrepancies using

To achieve the goal of semantic interoperability of geographic systems, several issues must be tackled:

• Heterogeneity of file formats. Many GIS programs are based on proprietary software designs, data models and database storage structures. This situation creates a syntactic heterogeneity problem. The main effort to address this problem is standardization. Unfortunately, standardization takes time before it can be implemented and always lags behind the real situation.

• What data model for GIS information sharing? Besides the heterogeneity of file formats, semantic heterogeneity in GIS interoperability involves more complex issues. Semantic heterogeneity of geodata can be classified as formalization heterogeneity, conceptual heterogeneity and context heterogeneity. Moreover, GIS information interoperability must handle images, databases and their correlation. Several heterogeneity elements constitute major hurdles to data access and sharing among GIS organizations. To handle semantic heterogeneity, there is a need for an appropriate model and representation of information, and for a language suited to the GIS environment.


3.2.3 P2P Architecture, characteristics and issues

Peer-to-Peer (P2P) is becoming popular as an overlay network on top of the Internet. P2P allows sharing of large volumes of data and other resources. Some popular implementations of P2P are Napster for sharing mp3 music files and eDonkey for sharing many types of files such as books, music and movies. P2P has also been applied to GIS file sharing; the purpose is to overcome the large size of GIS data and the resulting network traffic by 'automatically' distributing the data / files to several peers.

P2P can provide infrastructure for a dynamic environment in which autonomous and independent peers can join or leave the network easily and frequently. However, in a diverse and large community, it is hard to discover relevant data for decision making. P2P allows the development of communities that share a common interest, so that a large community is 'clustered' by interest. Searching for relevant data then becomes easier.

As shown in figure 3.1, P2P systems are characterized by the following components:

• Model architecture. The server model and the client-server model are centralized models, where autonomy is very limited. P2P architectures have recently introduced pure and hybrid models. The large number and high variety of sources on the Internet can be managed by P2P through the creation of communities based on common interest. Every peer has high autonomy and can freely join the community. P2P architectures allow symmetric sharing and direct contact between peers. Generally they can be classified as pure, partial, and hybrid P2P architectures.

• Characteristics of P2P are self-organization, symmetric communication, decentralization, autonomy, anonymity, scalability, ad-hoc community, and dynamic environment.

Generally P2P can be used to:

• aid in discovering relevant data sources. P2P allows the development of networks based on common interests, such as sport, music, land transport, etc. By grouping peers with a common interest, only a small number of sources with relevant data need to be searched. Another characteristic of P2P is the autonomy of data sources. Every source has high autonomy to decide which data will be made public, and which format or point of view it will implement. P2P systems have mechanisms to handle an autonomous environment, and semantics can tackle the diversity of information sources in a P2P environment.

• Finally, the management of a dynamic environment is a major P2P issue: peers may join or leave the network with or without a warning message. P2P handles this situation by using a decentralized model.

There are interesting issues that must be addressed to achieve semanticinteroperability of spatial information.

• How to represent a peer's content? Every peer is associated with one or more schemas that represent the peer's domain of interest. The representation model needs to take care of information sharing, autonomy and dynamic content. Many semi-structured models have been proposed for representing Web information. The most popular semi-structured models include XML and RDF. Currently many efforts are underway to develop geographic representation models based on semi-structured models; examples include GML, SVG and RDF.

• How to map or establish correspondence among peers? This issue is related to semantic interoperability and GIS information. A special approach is needed to map information between peers, taking into account their characteristics. An agreement approach can be implemented to map between the common ontology and a local ontology. This requires the specification of what an agreement is and how it can be developed.

3.2.4 Query Processing, characteristics and issues

A query is a facility for accessing a database. A query in a heterogeneous database system can be expressed in the query language of the component databases or in a system-independent query language. Processing a query requires a mechanism for decomposing the query and translating it into subqueries for the relevant sources.

Query processing can involve a query optimization stage that determines a query strategy and plan for accessing the component sources and combining the query results. The optimization techniques also depend on the type of query interface supported by the system. Two queries are semantically equivalent if they return the same answer for any database state satisfying a given set of integrity constraints. A semantic transformation transforms a given query into a semantically equivalent one. Semantic query optimization is the process of determining the set of semantic transformations that results in a semantically equivalent query with a lower execution cost. The main components of the query pillar are therefore:

• Query rewriting: currently, queries are sent based on keywords and on the assumption that the sources and the users share the same view of the information. Unfortunately, the diversity and the large number of sources on the Internet do not meet this condition.

• Query plan and optimization: this is an important performance issue in query processing. In a dynamic, heterogeneous and large environment, where and how to send a query is very important for finding an efficient solution. This research does not yet focus on this issue.

• Merge answer: this is needed because a query can be answered by many sources. Even when many sources have been consulted, incomplete answers can occur because of the agreements on concepts. This issue must be considered, but at this time it is not part of the research.

Regarding the query pillar, this thesis focuses on:

• Discovering appropriate sources to respond to the query. The method for finding related sources can depend on the join status, metadata, routing, or agreements of the sources.

• Writing a query to the sources based on the concepts shared by the parties. However, the query optimization stage is not taken into account in this research.


Figure 3.2: Peer Agreement Framework

3.3 Overview of Approach

Figure 3.2 shows the components of the ontology-based P2P mediation approach. It consists of three contexts: community, agreement and local. Contexts have been used in various fields to model semantics or knowledge. In our work, contexts represent the concepts that are used to capture and specify, at different levels, the important characteristics of data from different sources. When data are exchanged or used in contexts different from the ones in which they are defined, transformation functions are needed to convert the data and adapt their semantics. The following sections explain the different components of the approach, which is based on the interactions between independent peers. There are several types of peers:

• Super peer (SP) manages a common ontology, which provides a taxonomy describing the common domain. It also handles book keeping about the availability of provider peers.

• Provider peer (PP) provides information to the community. It allows access to part of its data through an export schema (local ontology) and defines agreements to relate its local data to the common ontology.

• Request peer (RP) submits queries to provider peers.

3.3.1 The Local Context

Local contexts represent the shared data available at the provider peers. Local data can be stored in a traditional (relational or object) data model or in native XML formats. However, the shared data, which may be only a subset of the local data, is described by an export schema. The functionalities and services used to bridge or convert schemas, queries and data are commonly handled by a tool called a wrapper.

In the approach, it is assumed that the RDBMS or native XML database can represent its schema in XML/S and RDF/S, handle queries in SQL or XML-QL format, and return query results in XML format. Therefore, the construction of the wrapper is not the focus of this thesis. The important part of the local context is how to represent the export schema for use at the higher agreement context level.

An export schema is represented by a set of Export Schema Units, which map local data to the export schema and include the transformation functions required to convert the corresponding data. The task of selecting which subset of local data to include in the export schema is handled by the local administrator; this guarantees the full autonomy of the provider peer.

3.3.2 The Community Context

The community context is a super peer which manages the common ontology. The common ontology represents the conceptualization of a GIS domain for a given community of peers. It provides the reference semantics on which the agreements between the peers and the community are defined.

The common ontology is a taxonomy of the concepts of the community's application domain. The concepts are defined by their structural, behavioral, and semantic properties. RDF/S and OWL can be used to represent the common ontology. In the approach, an available common ontology is reused with some modifications. A popular tool such as Protégé can be used to manipulate the common ontology.

The main focus is on how to use the common ontology efficiently for agreement among provider peers. The proposal is to use a super peer to store the common ontology. The super peer is selected based on its storage and computing resources and its available bandwidth. The main drawbacks of using a super peer are super peer failures and performance bottlenecks. These problems can be reduced by using back-up super peers and by choosing a powerful state-of-the-art machine as the super peer.

3.3.3 The Agreement Context

The agreement context of a provider peer consists of a subset of the common ontology, an export schema, an agreement component and a query response management component. The agreement component is used to map the export schema to the corresponding concepts of the common ontology. The key components of the agreement process are the agreement units, which establish mappings between the super peer's common ontology and the provider peer's export schema. Peer agreement relies on a similarity measurement which consists of three steps, as follows:

• Label matching is matching based on a linguistic approach. The assumption of this approach is that the label carries the semantics of the term. The goal of this first step is to create node match elements (NME) between a super peer (SP) and a provider peer (PP). One popular tool for linguistic analysis of semantic similarity is WordNet, a semantic network database for English developed by Princeton University. Before WordNet is used, the labels are tokenized based on linguistic analysis. An NME is a tuple $\langle mID, N_{PP}, N_{SP}, R_{SC} \rangle$, where $mID$ is a unique ID of the NME, $N_{PP}$ is a node element of the provider peer, $N_{SP}$ is a node element of the super peer, and $R_{SC}$ is the semantic similarity value in $[0, 1]$.

• Internal structure comparison is based on the internal structures ofentities (object properties and data properties).


• External structure comparison is based on comparing the positions of the entities within a taxonomy. Two approaches are used: considering the set of superconcepts and considering the set of leaves.
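
The first and third steps can be illustrated with a minimal sketch. It is not the thesis implementation: plain token overlap stands in for a WordNet-based score, the external comparison uses only the set of superconcepts, and the names `NodeMatchElement`, `label_similarity` and `external_similarity` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class NodeMatchElement:
    """NME tuple <mID, N_PP, N_SP, R_SC> produced by label matching."""
    m_id: int
    n_pp: str      # node label at the provider peer
    n_sp: str      # node label at the super peer
    r_sc: float    # semantic similarity in [0, 1]


def tokenize(label: str) -> Set[str]:
    """Split labels such as 'MainRoad' or 'main_road' into lowercase tokens."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", label)
    return {t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of common members in two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def label_similarity(pp_label: str, sp_label: str) -> float:
    """Assumed stand-in for a WordNet score: overlap of label tokens."""
    return jaccard(tokenize(pp_label), tokenize(sp_label))


def external_similarity(pp_supers: Set[str], sp_supers: Set[str]) -> float:
    """Compare taxonomy positions through the shared superconcepts."""
    return jaccard(pp_supers, sp_supers)


nme = NodeMatchElement(1, "main_road", "MainRoad",
                       label_similarity("main_road", "MainRoad"))
ext = external_similarity({"Road", "Transport"},
                          {"Road", "Transport", "Infrastructure"})
print(nme, round(ext, 3))
```

A real label matcher would also recognize synonyms such as "street" and "road", which token overlap cannot do; that is the role WordNet plays in the approach.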

Based on the similarity results, agreement units are created to establish the different matchings on the different types of elements of an ontology. An agreement unit is represented by a tuple $\langle SMCID, ES_{PP_i}, type_{PP_i}, CO_{SP_j}, type_{SP_j}, tansCP \rangle$, where $SMCID$ is a unique ID, $ES_{PP_i}$ is an element of the export schema of the PP, $type_{PP_i}$ is the type of that export schema element, $CO_{SP_j}$ is an element of the common ontology of the SP, $type_{SP_j}$ is the type of that common ontology element, and $tansCP$ is a transformation function between the export schema and the common ontology. The $tansCP$ can be a derivation mapping (class and property), or an operation or constraint mapping (instances). The agreement process is mainly concerned with the resolution of syntactic and semantic heterogeneity.
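
As an illustration, an agreement unit can be sketched as a small record whose last field is a callable transformation. The field names, the element names `lengthM` and `lengthKm`, and the unit conversion are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class AgreementUnit:
    """Agreement unit <SMCID, ES_PPi, type_PPi, CO_SPj, type_SPj, tansCP>."""
    smc_id: int
    es_pp: str                       # element of the provider peer's export schema
    type_pp: str                     # its type, e.g. "class" or "property"
    co_sp: str                       # element of the super peer's common ontology
    type_sp: str                     # its type
    tans_cp: Callable[[Any], Any]    # transformation between the two elements


# Assumed example: the export schema stores lengths in metres,
# while the common ontology expects kilometres.
unit = AgreementUnit(
    smc_id=1,
    es_pp="lengthM",
    type_pp="property",
    co_sp="lengthKm",
    type_sp="property",
    tans_cp=lambda metres: metres / 1000.0,
)


def to_common(au: AgreementUnit, value: Any) -> Dict[str, Any]:
    """Express an export-schema value in terms of the common ontology."""
    return {au.co_sp: au.tans_cp(value)}


print(to_common(unit, 2500))
```

Keeping the transformation inside the unit means that each mapping carries its own conversion rule, so a provider peer can change one unit without touching the others.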

3.3.4 Peer Component

In our architecture, the components of the different types of peers are as follows:

• Super Peer (SP) consists of:

  - an ontology engineering tool used to develop and maintain the common ontology.

  - a book keeping component for managing the member peers of the community.

• Provider Peer (PP) contains:

  - a collection of local data.

  - an XML wrapper to convert between local data and the higher level model.

  - an export schema tool to help the administrator in developing the subset of the local schema that is shared with other peers.

  - a query response function for handling queries and negotiation, and for constructing and sending responses.


• Request Peer (RP) contains:

  - a knowledge discovery/concept adjustment tool used to relate the peer's concepts to the common ontology.

  - a query management block used to write, re-write, send, re-send, and negotiate queries.

  - a response management block used to receive and integrate the answers to a query.

3.3.5 Agreement Process

For information sharing, the following phases are required:

• Discovery of relevant information providers. In large environments such as the Internet, a big issue is the search for appropriate sources of information. Many search engines have been introduced, but the results are not yet satisfactory. A search engine is an electronic catalog which stores the content and addresses of information sources on the Internet. Popular search engines such as Google, Yahoo or AltaVista rely on crawling to collect this information. A crawler records addresses by following links from other sites. With this pull method, some bias or error in the recorded content of information sources can occur. In addition, during collection the system cannot restrict itself to recording sites of a specific domain. For querying, most search engines are based on keyword input. This technology has a main problem. For example, if an ecology expert is searching for information about a bank of the Nile river, the expert will submit the request "bank of Nile". A search engine can interpret bank as part of a river, as a financial institution, as a building, etc., so the expert needs to filter the responses to the query. According to a survey (www.apjii.or.id), relevant responses from search engines make up less than 10% of all responses.

Referring to the above, there are two issues in discovery: how to find related sources and how to send a query that gets relevant answers. P2P can group information sources and clients based on their common interest. A specific model of grouping in P2P is called clustering or semantic overlay network (SON).

Searching based on semantics can enhance the results compared to keyword-based search, because semantics considers related concepts such as synonyms, hypernyms, homonyms, etc. If a peer submits a query with the word "bank", the semantic-based search will produce feedback asking to state the meaning of bank more precisely. According to WordNet, "bank" has 10 noun senses and 8 verb senses, so a query with the word "bank" has at least 18 possible meanings. Semantics promises a solution for handling different concepts, as in this example.

P2P, which reduces the number of sources to the related ones through grouping, and semantics, which enhances queries by considering concepts, can be combined to improve the discovery of relevant sources.

• In a highly autonomous and dynamic environment, it is difficult to develop a standard on which different parties agree. Semantic conflicts occur whenever two or more peers do not use the same interpretation of the information. The simplest forms of disagreement in the interpretation of information are homonyms (the use of the same word with different meanings) and synonyms (the use of different words with the same meaning). These problems can be solved by one-to-one structural mapping or a look-up table. More interesting are conflicts where one-to-one mappings do not suffice, i.e. 1:m or n:m. In this case, the semantics of the information has to be taken into account in order to decide how different information items relate to each other.

• Query processing is based on the agreement between the export schemas of the request peer and the provider peer. The query is routed directly to the appropriate provider peers without the help of the super peer. This method reduces the load on the super peer and the impact of super peer failures.
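
The look-up-table solution for synonyms and homonyms mentioned above can be sketched in a few lines. The tables, the domain labels and the `resolve` function are hypothetical; the last call shows a case the table cannot settle, which is where semantic matching becomes necessary.

```python
from typing import Dict, Optional, Tuple

# Synonyms: different local words, same common concept (one-to-one look-up).
SYNONYMS: Dict[str, str] = {
    "street": "Road",
    "road": "Road",
    "stream": "River",
}

# Homonyms: same word, different concepts; the peer's domain disambiguates.
HOMONYMS: Dict[Tuple[str, str], str] = {
    ("bank", "hydrology"): "RiverBank",
    ("bank", "finance"): "FinancialInstitution",
}


def resolve(term: str, domain: str) -> Optional[str]:
    """Return the common concept for a local term, or None if unmapped."""
    key = term.lower()
    if (key, domain) in HOMONYMS:
        return HOMONYMS[(key, domain)]
    return SYNONYMS.get(key)


print(resolve("Street", "transport"))   # synonym case
print(resolve("bank", "hydrology"))     # homonym case
print(resolve("bank", "transport"))     # no entry: the table is not enough
```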

The agreement process is conducted by the provider peer as follows:

1. Develop the export schema of the provider peer based on its local data. This process is carried out independently by the administration of the provider peer.


2. Register with a super peer to obtain its common ontology, see figure 3.2 point 1.

3. Develop the agreement, which contains a number of agreement units. To achieve the agreement, a similarity measurement in three steps is conducted: label matching, internal structure comparison and external structure comparison. The similarity measurement is the foundation of the semantic matching process. The result of semantic matching is stored in Agreement Unit tuples. See figure 3.2 point 2.

4. The request peer sends a query based on knowledge discovery against the common ontology at the super peer. The purpose is to establish shared knowledge between the request and provider peers. Before sending the query, the request peer consults the super peer's book keeping to learn which provider peers are active in the community. Therefore, in contrast to a broadcast method, the queries are sent only to appropriate provider peers. The query response is based on two evaluation steps. The first step is the evaluation of concepts at the agreement, followed by some negotiation. The second step is to transform the concept-level query into a data-level query via the bridge/wrapper. See figure 3.2 points 3a and 3b.

5. The responses to the query are received from the provider peers and integrated to achieve the final result, as in figure 3.2 point 3c.

The registration process is conducted by the super peer as follows:

1. The initial step before registration is the preparation of the common ontology.

2. Book keeping of community members is based on registration. There are two types of registration: registration of a provider peer, which consists of an ID, address, activation and metadata, and registration of a request peer, which consists of only an ID and an address. When tracking active provider peers there are two cases: a clean log-off, in which the provider peer sends a message to log off, and a dirty log-off, in which the provider peer does not send such a message. To detect dirty log-offs, each provider peer must re-register at a specific time interval by sending a small message to the super peer.
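
The book keeping with periodic re-registration can be sketched as a registry with a time-to-live. This is only an illustration: the class names, the 60-second period and the explicit `now` argument (used instead of a real clock so the behaviour is easy to follow) are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProviderEntry:
    peer_id: str
    address: str
    metadata: Dict[str, str]
    last_seen: float          # time of the last (re-)registration message


@dataclass
class SuperPeerRegistry:
    """Hypothetical book keeping of provider peers at the super peer."""
    ttl: float = 60.0         # assumed re-registration period, in seconds
    providers: Dict[str, ProviderEntry] = field(default_factory=dict)

    def register(self, peer_id: str, address: str,
                 metadata: Dict[str, str], now: float) -> None:
        """Registration and periodic re-registration use the same call."""
        self.providers[peer_id] = ProviderEntry(peer_id, address, metadata, now)

    def log_off(self, peer_id: str) -> None:
        """Clean log-off: the provider announces that it is leaving."""
        self.providers.pop(peer_id, None)

    def active(self, now: float) -> List[str]:
        """Dirty log-off: peers that missed re-registration count as inactive."""
        return sorted(p.peer_id for p in self.providers.values()
                      if now - p.last_seen <= self.ttl)


registry = SuperPeerRegistry(ttl=60.0)
registry.register("pp1", "10.0.0.1", {"theme": "roads"}, now=0.0)
registry.register("pp2", "10.0.0.2", {"theme": "rivers"}, now=0.0)
registry.register("pp1", "10.0.0.1", {"theme": "roads"}, now=50.0)  # re-registration
print(registry.active(now=100.0))
```

In this run only pp1 is reported as active at time 100, because pp2 never re-registered after time 0.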


Query and response processing takes place between the provider and request peers as follows:

1. The request peer adjusts its concepts (knowledge discovery) to the common ontology.

2. The request peer submits a query based on the adjustment only to related provider peers.

3. A negotiation process takes place between the request and provider peers to get a better result or to drop the query.

4. The responses to a query from the provider peers are integrated by the request peer to get the final result.
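
The four steps can be traced in a toy sketch. All data and names here are assumptions: `AGREEMENTS` stands for the concepts each active provider has agreed on, each provider is reduced to a function that either answers or returns `None` to drop the query, and negotiation is not modelled beyond that.

```python
from typing import Callable, Dict, List, Optional, Set

# Assumed toy data: common-ontology concepts covered by each provider's agreement.
AGREEMENTS: Dict[str, Set[str]] = {
    "pp1": {"Road", "Bridge"},
    "pp2": {"River"},
    "pp3": {"Road"},
}

# Each provider answers a concept-level query, or returns None to drop it.
Provider = Callable[[str], Optional[List[str]]]
PROVIDERS: Dict[str, Provider] = {
    "pp1": lambda concept: ["M H Thamrin road"] if concept == "Road" else None,
    "pp2": lambda concept: ["Example River"] if concept == "River" else None,
    "pp3": lambda concept: None,   # negotiation failed: the query is dropped
}


def adjust(term: str, local_to_common: Dict[str, str]) -> str:
    """Step 1: align the request peer's own term with the common ontology."""
    return local_to_common.get(term.lower(), term)


def related_providers(concept: str) -> List[str]:
    """Step 2: address only providers whose agreement covers the concept."""
    return sorted(pid for pid, concepts in AGREEMENTS.items() if concept in concepts)


def run_query(term: str, local_to_common: Dict[str, str]) -> List[str]:
    concept = adjust(term, local_to_common)
    merged: List[str] = []
    for pid in related_providers(concept):
        answer = PROVIDERS[pid](concept)      # step 3: answer or drop
        if answer is not None:
            merged.extend(answer)             # step 4: integrate the responses
    return merged


print(run_query("street", {"street": "Road"}))
```

Provider pp2 is never contacted for this query, which is the point of routing by agreement rather than broadcasting.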

3.4 Contribution and Results

The contributions of the research in this thesis are as follows:

• A loosely coupled semantic interoperability approach in a P2P architecture based on three contexts is proposed. The global context, represented by an ontology, defines a set of generic application domain level concepts, called peer agreement concepts. The local/schema contexts provide the ontological descriptions of the local data sources. Finally, the peer agreement contexts combine views from the common and local contexts to define a set of cooperation concepts.

• The peer agreement is developed to represent the agreement between the common ontology and the export schema of a provider peer. Creating the agreement uses label matching and internal and external structure comparison. The peer agreement consists of the common ontology, the export schema and the transformation function.

• Finally, an implementation framework to apply the approach to geographic information systems is proposed. The XML, RDF and OWL ontology description languages [108] are used to describe the different ontologies and implement the contexts. Furthermore, the proposed approach relies on the inference capabilities of OWL for semi-automatic processing. The prototype is based on a virtual machine, a virtual web server and the Protégé tools.


Chapter 4

Discussion of Approach

To take into account the characteristics of P2P, an approach is needed that handles the absence of a global schema, dynamic peers, and the different concepts of peers. Our approach to interoperability is based on a mediator model which considers semantic interoperability. The term semantic interoperability or semantic translation refers to resolving the semantic conflicts that occur between heterogeneous information sources. The focus of this work is on how to develop agreements between peers to handle interoperability, i.e. queries. The basic need of an agreement is a way to represent the context description of sources. Peer agreement is used to map between the local (export) schema and the common ontology. To have the agreement, the following components are needed:

• A common ontology is stored at the super peer (SP). It acts as a pivot or reference point.

• Agreement between the export schema at the provider peer (PP) and the common ontology at the SP. The agreement is the initial process in interoperability between peers.

• Export schema at PP as a representation of local data.

The mediator-based interoperability and its components can be seen in figure 4.1.


Figure 4.1: Type of Peers & Basic Task

4.1 Representation of Peer's Context

A P2P system $\Pi = \langle P, A \rangle$ consists of a set $P$ of peers $\{P_1, \ldots, P_n\}$ and a set $A$ of P2P agreements. The model of the agreement can be seen in figure 4.1. The purpose of the context description is to represent the textual and spatial information which defines the semantic context. The context description of a Peer Agreement consists of the export schema, the common/shared ontology, and the agreement between the provider peer and the common ontology at the super peer. Basically the context description consists of classes, relations, and attributes. A set of classes/concepts represents real world entities sharing structure and semantics. Each class/concept has a name and an optional set of attributes/properties. A set of relations is used to represent semantic relationships between entities. An instance is an individual entity of a class.

The context description is applied to the export schema, the common ontology and the agreement:

• Each provider peer (PP) $PP_i$ of $\Pi$ is a triple $(ES_i, L_i, M_i)$, where $ES_i$ is the export schema of the provider peer, $L_i$ is the local sources of the provider peer, and $M_i$ is a mapping, i.e. a set of transformation functions between the sources $L_i$ and the export schema $ES_i$ in either direction. This model can be seen in figure 4.1, where the second circle on the right side is called the Export Schema Unit. The detailed explanation of the Export Schema Unit is given together with the Agreement Unit, because they have the same model. The export schema uses a combination of XMLS and RDFS. RDFS, with its triples <S, P, O>, can enhance XML in semantic representation. GML can complement XML for a more complete representation of geodata.

• The common ontology is based on RDFS enhanced with OWL, to describe the classes, relations, and attributes of a domain.

• The agreement is discussed in section 4.2. The main element of the agreement is the Agreement Unit, which maps between the export schema and the common ontology and is represented by the top right circle in figure 4.1.
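
The formal model can be made concrete with a small data-structure sketch. It is an assumption-laden illustration: local sources are reduced to a dictionary, the mapping $M_i$ to one function per export element, and an agreement to a link from an export element to a common-ontology concept; the class names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class ProviderPeer:
    """PP_i = (ES_i, L_i, M_i)."""
    export_schema: List[str]                                 # ES_i: exported element names
    local_source: Dict[str, Any]                             # L_i: local data (toy stand-in)
    mapping: Dict[str, Callable[[Dict[str, Any]], Any]]      # M_i: one function per element

    def export(self) -> Dict[str, Any]:
        """Apply M_i to L_i to obtain data shaped like ES_i."""
        return {name: self.mapping[name](self.local_source)
                for name in self.export_schema}


@dataclass
class P2PSystem:
    """Pi = <P, A>: a set of peers and a set of agreements."""
    peers: Dict[str, ProviderPeer] = field(default_factory=dict)
    # Each agreement links (peer id, export element) to a common-ontology concept.
    agreements: Dict[Tuple[str, str], str] = field(default_factory=dict)


pp1 = ProviderPeer(
    export_schema=["roadName"],
    local_source={"road_nm": "M H Thamrin", "internal_code": 17},
    mapping={"roadName": lambda src: src["road_nm"] + " road"},
)
system = P2PSystem(peers={"pp1": pp1},
                   agreements={("pp1", "roadName"): "Road.name"})
print(pp1.export(), system.agreements[("pp1", "roadName")])
```

Only the fields named in the export schema leave the peer; `internal_code` stays local, which mirrors the autonomy of the provider peer.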

There are several formal representations of information sources. According to the W3C (www.w3.org), a number of representation languages are available, as follows:

4.1.1 XML

XML has no explicitly defined data model. XML (Extensible Mark-up Language) is called a mark-up language, but more precisely it is a meta-language for creating mark-up languages. While HTML focuses on how data look, XML describes data and focuses on what the data are. It provides a means for specifying the structure of the content of a document.

XML uses a Document Type Definition (DTD) or an XML Schema to describe data. With a DTD or XML Schema, it is designed to be self-descriptive. XML was created to structure, store and send information by encoding it in well-defined and validated tags. To list a few of the characteristics of XML: it is free and extensible, a complement to HTML, an open language, keeps data separate from and independent of visualization tools, and is meant for sharing and storing data.

XML is an answer for Internet-based GIS if it is adopted by many organizations and industries. It allows users to search and manipulate on-line data, regardless of the application from which the data derive. XML allows authors to define an unlimited set of tags. The example in this thesis shows how spatial data along with attributes can be transferred across any platform on the Internet. To avoid confusion, tags are also written in a human-understandable language. Much work is ongoing, such as the exchange of update information between GIS using XML. Specifically for the geospatial domain, many XML dialects such as GML and SVG are emerging as open specifications to share, exchange or display GIS information on the web.

Because XML is widely accepted as an open standard language, many domain-specific dialects of XML are currently being developed. There are many useful extensions to XML. For example, GML (Geography Mark-up Language) is one of the important extensions for geographic data. GML is designed to support interoperability and does so through the provision of basic geometry tags, a common data model (features/properties) and a mechanism for creating and sharing application schemas.

GML is an XML-based OGC specification for the transport and storage of geographic information, including the geometry as well as the properties of geographic features. According to www.wirelessdevnet.com/channels/lbs/features/top10gml, the benefits of GML include that it is well suited to the Web, serves more users with different interests, and offers better query capability.

GML provides three basic XML Schema documents. The feature.xsd schema defines the feature model, in which a feature is essentially a named list of properties. The geometry.xsd schema defines the geometry elements, while the actual definition of feature properties depends on the domain-specific application schema. The xlink.xsd schema allows linking elements to be inserted into XML documents and thus creates links between resources.

In general, features have an id, a name and a feature type, and comprise simple properties (booleans, integers, reals, strings) and 2D geometry with linear interpolation between coordinates. Geometric constructs can consist of coordinates, box, point, linestring, linearring, polygon, multipoint, multilinestring, multipolygon, and geometry collection.

GML can be extended as users require. One person may define a feature called ROAD where another might use STREET. GML does not constrain how such objects are named or define what properties they have. Users can, however, readily compare schemas on the Internet and provide an agreement/mapping of data from one schema to another.

One problem of GML is that it does not support the representation of topology.


4.1.2 RDF and OWL

RDF (Resource Description Framework) attempts to address XML's semantic limitations. XML is designed to provide an interchange format for weakly structured data by defining the underlying data model in a schema. XML is purely syntactic/structural: it describes data on the object level and often encodes an application-specific data model. RDF presents a simple model that can be used to represent any kind of data, in other words, a data model for meta-information. This data model consists of nodes connected by labeled arcs, where the nodes represent web resources and the arcs represent properties of these resources. It should be noted that the model is essentially a semantic network, although unlike many semantic networks, it does not provide inheritance. The nodes/arcs model also means that RDF is inherently binary; however, this does not restrict the expressivity of the language, because any n-ary relation can be represented as a sequence of binary relations.

RDF can be exchanged using an XML serialization syntax, as follows:

    <?xml version="1.0"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:g="http://geo.org/general#">
      <rdf:Description rdf:about="http://www.pu.go.id/thamrinroad">
        <rdf:type rdf:resource="http://geo.org/road#mainroad"/>
        <g:name>M H Thamrin road</g:name>
      </rdf:Description>
    </rdf:RDF>

The basic syntax consists of an rdf:Description element which contains a set of property elements. The rdf:about attribute identifies which resource is described. The property rdf:type is used to express that a resource is a member of a given class, and is equivalent to the instance-of link used in many semantic nets and frame systems. RDF also offers many abbreviated syntax variations.
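
A minimal sketch of how such a description can be read as subject/property/value statements, using Python's standard xml.etree.ElementTree parser. The document string restates the example with well-formed namespace declarations. The `triples` helper is a hypothetical name, and the sketch handles only this simple rdf:Description form, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:g="http://geo.org/general#">
  <rdf:Description rdf:about="http://www.pu.go.id/thamrinroad">
    <rdf:type rdf:resource="http://geo.org/road#mainroad"/>
    <g:name>M H Thamrin road</g:name>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text: str):
    """Yield (subject, predicate, object) for each property element."""
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree writes qualified names as "{namespace}local".
            predicate = prop.tag.replace("{", "").replace("}", "")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            obj = resource if resource is not None else (prop.text or "").strip()
            yield subject, predicate, obj


result = list(triples(DOC))
for t in result:
    print(t)
```

Each property element becomes one arc of the node/arc model: the described resource is the source node, the property name is the arc label, and the value is either another resource or a literal.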

To prevent accidental name clashes between concepts of different vocabularies, RDF uses an XML namespace for each vocabulary. This approach has two disadvantages. First, it is possible to write RDF statements such that it is ambiguous whether certain tags are RDF or intermeshed tags from another namespace. Second, namespaces are not transitive, which means that each RDF section must explicitly specify the namespace for every schema that is referenced in that section, even for schemas that were extended by a schema whose namespace has already been specified.

RDF Schema allows users to create schemas of standard classes and properties using RDF. For this purpose, the specification defines a number of classes and properties that have specific semantics. A significant weakness of RDF is that it does not specify a schema inclusion feature. OWL reduces the limitations of RDF for ontology representation. The three types of OWL are OWL Lite, OWL DL, and OWL Full. OWL Full is the most expressive of the three. At present, tool support for OWL Full is limited.

4.1.3 Peer's Representation

In the Peer Agreement approach, peer representation is based on XML and enhanced by adding an RDF level. Peer representation is important in a P2P environment for two reasons. The first reason is advertising the content of a peer. In the advertisement, the provider peer sends information including metadata written in RDF. The second reason is creating the agreement/mapping between a peer and the super peer. The resulting agreement is written in RDF(S) with some added OWL features. The mapping is discussed in more detail in section 4.2.

4.2 The Agreement

The term semantic interoperability or semantic translation refers to resolving the semantic conflicts that occur between heterogeneous information sources. A semantic conflict occurs whenever two systems do not use the same interpretation of information. The simplest forms of disagreement in the interpretation of information are homonyms (the use of the same word with different meanings) and synonyms (the use of different words with the same meaning). These problems can be solved by one-to-one structural mapping. More interesting are conflicts where one-to-one mapping does not apply. In this case, the semantics of information has to be taken into account in order to decide how different information items relate to each other. Yaser [16] has divided schematic heterogeneity into four groups:

• Difference in classes
  – synonyms and homonyms
  – difference in class attributes
  – difference in methods
  – difference in integrity constraint

• Difference in attributes
  – difference in domains
  – difference in unit
  – difference in default value
  – difference in data type

• Difference in hierarchies
  – difference in generalization
  – difference in aggregation levels
  – class, attribute and domain difference
  – difference in geometry

An agreement from a provider peer to the common ontology is the result of a matching or mapping process that is used to resolve their semantic discrepancies. The combined matching process consists of terminology-level (label) matching, internal structure comparison and external structure comparison. The result of the combined matching is used to develop agreement units, the components of an agreement. The agreement relies on several assumptions: the same language is used to represent the schema and the ontology, a label represents the meaning of its concept, and the common ontology contains no individuals.

The calculation of semantic similarity between concepts is the first step in creating agreements between the ontology and the contents of the other peers (provider or request). Each concept can be represented as a hierarchy of terminological labels which contains some structural or semantic information.


Figure 4.2: Example Common Ontology and Export Schema of Provider Peers

4.2.1 Similarity Measurement

Similarity measurement is the step that precedes the creation of an agreement. The following example is used to illustrate the agreement approach. Consider a transportation network used in various activities (e.g. city planning, routing of delivery services). Different organizations, such as a public works ministry, a police department or a toll collection department, can model and use different views of the network. Figure 4.2a presents a generic concept for the land transportation network. Consider the fragments of the export schemas of the two provider peers shown in figure 4.2. One of the provider peers is a toll collection company that views the shared transportation network as a point-to-point network. This provider peer is interested in the characteristics of the land transportation network that relate to the nature and type of the road (figure 4.2b). Another provider peer, PP2, is a design and development company that requires detailed road information such as size, width, length and traffic capacity. The two provider peers thus generate different representations of the same ontology concepts.

The following three similarity measurements are used:

4.2.1.1 Label Matching

The process is based on linguistic analysis. Label matching commonly involves two techniques [44]. First, a string analysis preprocessing step transforms the labels into words for linguistic analysis. For example, this step can perform tokenization, abbreviation expansion, acronym replacement and elimination. Next, the labels are matched by determining the relations between them, using a thesaurus or WordNet-like tools. Note that this approach essentially calculates the structural matching between labels by relying on the latent semantics that the labels convey. Several algorithms and methods proposed in the literature can be used in this step. Here, label matching is based on the Lin matching equation [54], which uses information content (IC).
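The preprocessing step can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function name `preprocess_label` and the camel-case splitting rule are assumptions, and the abbreviation table simply reuses the legend printed under Table 4.1.

```python
import re

# Abbreviations taken from the legend of Table 4.1; the table name is hypothetical.
ABBREVIATIONS = {
    "Trans": "Transportation",
    "WSt": "Wide Street",
    "NSt": "Narrow Street",
    "VNSt": "Very Narrow Street",
}


def preprocess_label(label: str) -> list[str]:
    """Expand a known abbreviation, then tokenize the label into lower-case words."""
    expanded = ABBREVIATIONS.get(label, label)
    # Insert a space at every lower-case/upper-case boundary (camel case).
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", expanded)
    # Split on whitespace, underscores and hyphens, dropping empty tokens.
    return [word.lower() for word in re.split(r"[\s_\-]+", spaced) if word]
```

The resulting word lists are what a thesaurus or WordNet-like tool would receive in the second step.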

The IC value is obtained from the negative log likelihood $ic_{res}(c) = -\log p(c)$, where $c$ is a concept in WordNet and $p(c)$ is the probability of encountering $c$ in a given corpus. Formally, similarity is defined as $sim_{res}(c_1, c_2) = \max_{c \in S(c_1, c_2)} ic_{res}(c)$, where $S(c_1, c_2)$ is the set of concepts that subsume both $c_1$ and $c_2$. Lin's definition of similarity states: "The similarity between A and B is measured by the ratio between the amount of information needed to state the commonality of A and B and the information needed to fully describe what A and B are." His definition is expressed by:

$$sim_{lin}(c_1, c_2) = \frac{2 \times sim_{res}(c_1, c_2)}{ic_{res}(c_1) + ic_{res}(c_2)} \tag{4.1}$$
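As a sketch of how equation 4.1 behaves, the following Python fragment evaluates it on a toy taxonomy. The taxonomy and the concept probabilities are invented for illustration only; in practice both would come from WordNet and a corpus.

```python
import math

# Hypothetical child -> parent taxonomy and corpus probabilities (not from WordNet).
PARENT = {"road": "way", "street": "way", "junction": "location",
          "way": "entity", "location": "entity"}
PROB = {"entity": 1.0, "way": 0.2, "location": 0.3,
        "road": 0.05, "street": 0.04, "junction": 0.02}


def ic(concept):
    """Information content: ic(c) = -log p(c), written as log(1 / p(c))."""
    return math.log(1.0 / PROB[concept])


def subsumers(concept):
    """The concept itself plus all of its ancestors."""
    found = {concept}
    while concept in PARENT:
        concept = PARENT[concept]
        found.add(concept)
    return found


def sim_res(c1, c2):
    """Highest information content among the common subsumers of c1 and c2."""
    return max(ic(c) for c in subsumers(c1) & subsumers(c2))


def sim_lin(c1, c2):
    """Equation 4.1; two root concepts (zero IC) are treated as identical."""
    denominator = ic(c1) + ic(c2)
    return 2 * sim_res(c1, c2) / denominator if denominator else 1.0
```

With these made-up numbers, `sim_lin("road", "road")` is 1, and concepts that only share the root have a similarity of 0.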

Label matching is done at the class and property levels. A threshold value is used to decide whether two labels are similar. Finally, some semantic relations (e.g. sub or super relations) between the concepts associated with the labels are considered. Equations 4.2 and 4.3 below define the results of label matching without and with a threshold value, respectively.

$$\langle N^{i}_{SP},\ N^{j}_{PP},\ Sim_{label} \rangle \tag{4.2}$$

where $N^{i}_{SP}$ is the label of the i-th concept at the Super Peer (SP), $i = 1 \ldots i_{max}$, and $i_{max}$ is the number of super peer labels; $N^{j}_{PP}$ is the label of the j-th concept at a Provider Peer (PP), $j = 1 \ldots j_{max}$, and $j_{max}$ is the number of labels at PP; $Sim_{label}$ is the similarity value of the given labels, in [0,1].

$$\langle N^{i}_{SP},\ Rel,\ N^{j}_{PP},\ \tau_{label},\ Sim_{label\tau} \rangle \tag{4.3}$$

where $N^{i}_{SP}$ is the label of the i-th concept at the Super Peer (SP), $i = 1 \ldots i_{max}$, and $i_{max}$ is the number of SP labels; $Rel$ can be $\subset$, $=$ or $\supset$; $N^{j}_{PP}$ is the label of the j-th concept at a Provider Peer (PP), $j = 1 \ldots j_{max}$, and $j_{max}$ is the number of PP labels; $\tau_{label}$ is the threshold value of label matching; $Sim_{label\tau}$ is the similarity value with threshold, which is either 0 or 1.

Example Label Matching: Consider the running example above and the peer contents given in figure 4.2. If provider peer PP1 wants to calculate similarity measures and then make ontological agreements on its data in the P2P environment, it compares and matches pairs of concepts such as (SP:Road, PP1:Street) and (SP:Junction, PP1:Street). The prefixes SP and PP1 denote a concept at the super peer SP and at the provider peer PP1, respectively. Using WordNet and the Lin equation, and assuming a threshold value of 0.50, the results of the matching process are given below. Table 4.1 presents the calculated similarity between the ontology concepts (LandRoad, Road, Junction) and the export schema concepts (Way, Road, Street) of the provider peer PP1 (see subtable COPP1ES). Similar results are shown in COPP2ES for the provider peer PP2.

• From equation 4.2: <Road, Street, 0.93>; <Junction, Street, 0.17>.

• From equation 4.3: <Road, ⊃, Street, 0.50, 1>; <Junction, NA, Street, 0.50, 0>.

COPP1ES    Way    Road   Street
LandRoad   0.92   1.00   0.93
Road       0.92   1.00   0.93
Junction   0.38   0.29   0.29

COPP2ES    Trans  WSt    NSt    VNSt
LandRoad   0.33   0.65   0.65   0.65
Road       0.33   0.93   0.93   0.93
Junction   0.25   0.29   0.29   0.29

Trans = Transportation; WSt = Wide Street; NSt = Narrow Street; VNSt = Very Narrow Street

Table 4.1: Label Matching of Concepts/Classes
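The thresholding step behind equation 4.3 can be sketched in Python using the COPP1ES values of Table 4.1. The sketch assumes that a pair counts as similar when its value is greater than or equal to the threshold (the text does not say whether the comparison is strict), and it leaves out the relation Rel, which needs a thesaurus lookup. The function names are hypothetical.

```python
# Similarity values copied from subtable COPP1ES of Table 4.1.
COPP1ES = {
    ("LandRoad", "Way"): 0.92, ("LandRoad", "Road"): 1.00, ("LandRoad", "Street"): 0.93,
    ("Road", "Way"): 0.92, ("Road", "Road"): 1.00, ("Road", "Street"): 0.93,
    ("Junction", "Way"): 0.38, ("Junction", "Road"): 0.29, ("Junction", "Street"): 0.29,
}


def label_tuples(similarities, tau):
    """Return (SP label, PP label, tau, thresholded similarity) for every pair."""
    return [(sp, pp, tau, 1 if value >= tau else 0)
            for (sp, pp), value in similarities.items()]


def matches(similarities, tau):
    """Pairs whose thresholded similarity is 1, in sorted order."""
    return sorted((sp, pp) for sp, pp, _, flag in label_tuples(similarities, tau) if flag)
```

With the threshold 0.50 used in the example, every LandRoad and Road pair is kept and every Junction pair is dropped.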


4.2.1.2 Internal Structure

Methods based on the internal structure of entities use criteria such as the range of their properties (attributes and relations), their cardinality, and the transitivity and/or symmetry of their properties to calculate the similarity between them.

Similarity calculation of internal structure takes into account the predefined meaning of OWL DL and RDF(S) primitives and attributes. The approach extends the similarity of entity graph structure [6]. Similarity between two concepts can be derived not only from similarities between their description components, but also from the similarity between the structures of the graphs representing them. In our system, the similarity of internal structure is formulated as the ratio of the number of similar attributes ($P_c$) to the maximal number of attributes of the two concepts:

$$Sim_{graph}(c_1, c_2) = \frac{|P_c|}{\max(|P(c_1)|, |P(c_2)|)} \tag{4.4}$$

The results of internal structure similarity are represented by equation 4.5 (without a threshold value) and equation 4.6 (with a threshold value).

$$\langle C^{k}_{SP},\ C^{l}_{PP},\ Sim_{IClang},\ Sim_{ICworld} \rangle \tag{4.5}$$

where $C^{k}_{SP}$ is the k-th class at the Super Peer, $k = 1 \ldots k_{max}$, and $k_{max}$ is the number of classes at the Super Peer; $C^{l}_{PP}$ is the l-th class at the Provider Peer, $l = 1 \ldots l_{max}$, and $l_{max}$ is the number of classes at the Provider Peer; $Sim_{IClang}$ is the similarity value of the internal structure for language attributes; $Sim_{ICworld}$ is the similarity value of the internal structure for world attributes.

$$\langle C^{k}_{SP},\ C^{l}_{PP},\ \tau_{IClang},\ Sim_{IClang\tau},\ \tau_{ICworld},\ Sim_{ICworld\tau} \rangle \tag{4.6}$$

where $C^{k}_{SP}$ and $C^{l}_{PP}$ are defined as in equation 4.5; $\tau_{IClang}$ is the threshold value for IC with language attributes; $Sim_{IClang\tau}$ is the similarity value of the internal structure for language attributes with threshold; $\tau_{ICworld}$ is the threshold value for IC with world attributes; $Sim_{ICworld\tau}$ is the similarity value of the internal structure for world attributes with threshold.

Example Internal Structure: Referring to figure 4.2, consider the properties SP:Road = {is-a; jurisdiction, name, size, traffic, regulation, speedlimit, describe}, SP:Junction = {over/under; jurisdiction, name, size, traffic, regulation, speedlimit, describe} and PP1:Street = {is-a; name, size}, where the entry before the semicolon is the language attribute and the remaining entries are world attributes. The results are:

• From equation 4.5: <Road, Street, 1/1, 2/7>; <Junction, Street, 0/1, 2/7>.

• From equation 4.6: <Road, Street, 0.40, 1, 0.20, 1>; <Junction, Street, 0.40, 0, 0.20, 1>.
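The values in this example can be reproduced with a short Python sketch of equation 4.4. It assumes that two attributes are "similar" exactly when their names are identical, which is a simplification; the function name `sim_graph` is hypothetical.

```python
from fractions import Fraction


def sim_graph(props1, props2):
    """Equation 4.4: number of shared attributes over the larger attribute count."""
    p1, p2 = set(props1), set(props2)
    largest = max(len(p1), len(p2))
    return Fraction(len(p1 & p2), largest) if largest else Fraction(0)


# World attributes from the example above.
ROAD_WORLD = {"jurisdiction", "name", "size", "traffic",
              "regulation", "speedlimit", "describe"}
JUNCTION_WORLD = set(ROAD_WORLD)
STREET_WORLD = {"name", "size"}

# Language attributes from the example above.
ROAD_LANG, JUNCTION_LANG, STREET_LANG = {"is-a"}, {"over/under"}, {"is-a"}
```

Here `sim_graph(ROAD_WORLD, STREET_WORLD)` gives 2/7 and `sim_graph(JUNCTION_LANG, STREET_LANG)` gives 0, matching the tuples listed for equation 4.5.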

4.2.1.3 External Structure

The similarity comparison between two concepts from two ontologies can be based on comparing the positions of the entities within a taxonomy. Criteria for deciding similarity can include direct super-entities, sibling entities, direct sub-entities, most of their descendant entities, most of their leaf entities, and most of the entities on the paths from the root.

In our approach, the calculation of similarity refers to two methods. The first method is based on Cupid structure matching [64]. The structural similarity is the fraction of leaves in the two subtrees that have at least one strong link (and are hence mappable) to some leaf in the other subtree, i.e.:

$$Sim_{struct}(c_1, c_2) = \frac{\left| \{x \mid x \in leaves(c_1) \wedge \exists y \in leaves(c_2),\ stronglink(x, y)\} \cup \{x \mid x \in leaves(c_2) \wedge \exists y \in leaves(c_1),\ stronglink(y, x)\} \right|}{\left| leaves(c_1) \cup leaves(c_2) \right|} \tag{4.7}$$

where $leaves(c_1)$ is the set of leaves in the subtree rooted at $c_1$. The method does not compute a 1-1 bipartite matching, as that is computationally expensive and would preclude m:n mappings.
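A minimal Python sketch of equation 4.7 follows. The leaf names and the table of strong links are invented for illustration; in Cupid a strong link is derived from leaf-level similarity, whereas here it is simply supplied as a predicate.

```python
def sim_struct(leaves1, leaves2, stronglink):
    """Equation 4.7: fraction of leaves that have a strong link into the other subtree."""
    l1, l2 = set(leaves1), set(leaves2)
    linked1 = {x for x in l1 if any(stronglink(x, y) for y in l2)}
    linked2 = {x for x in l2 if any(stronglink(y, x) for y in l1)}
    all_leaves = l1 | l2
    return len(linked1 | linked2) / len(all_leaves) if all_leaves else 0.0


# Hypothetical leaves of two subtrees and the strong links between them.
SP_LEAVES = {"sp:name", "sp:size", "sp:traffic"}
PP_LEAVES = {"pp:name", "pp:size"}
LINKS = {("sp:name", "pp:name"), ("sp:size", "pp:size")}


def is_strong(x, y):
    """Strong link predicate backed by the hypothetical LINKS table."""
    return (x, y) in LINKS
```

With these inputs four of the five leaves are linked, so `sim_struct(SP_LEAVES, PP_LEAVES, is_strong)` evaluates to 0.8. Because each leaf only needs some partner, m:n mappings remain possible.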


The second method refers to the upward cotopic distance [42], which divides the number of superclasses shared by both concepts by the number of superclasses of either concept. The calculation is represented as:

$$Sim_{structCotopic}(c_1, c_2) = \frac{|UC(c_1, H) \cap UC(c_2, H)|}{|UC(c_1, H) \cup UC(c_2, H)|} \tag{4.8}$$

where $UC(c_1, H)$ is the set of superclasses of $c_1$ in the hierarchy $H$, followed up to n levels towards the top.

The similarity results are shown by equation 4.9 (without a threshold value) and equation 4.10 (with a threshold value).

$$\langle C^{k}_{SP},\ C^{l}_{PP},\ Sim_{ECset\text{-}sp},\ Sim_{ECleave} \rangle \tag{4.9}$$

where $C^{k}_{SP}$ is the k-th class at the Super Peer, $k = 1 \ldots k_{max}$, and $k_{max}$ is the number of classes at the Super Peer; $C^{l}_{PP}$ is the l-th class at the Provider Peer, $l = 1 \ldots l_{max}$, and $l_{max}$ is the number of classes at the Provider Peer; $Sim_{ECset\text{-}sp}$ is the similarity value of the external structure for the set of superclasses; $Sim_{ECleave}$ is the similarity value of the external structure for the leaves.

$$\langle C^{k}_{SP},\ C^{l}_{PP},\ \tau_{ECset\text{-}sp},\ Sim_{ECset\text{-}sp\,\tau},\ \tau_{ECleave},\ Sim_{ECleave\,\tau} \rangle \tag{4.10}$$

where $C^{k}_{SP}$ and $C^{l}_{PP}$ are defined as in equation 4.9; $\tau_{ECset\text{-}sp}$ is the threshold value for EC with the set of superclasses; $Sim_{ECset\text{-}sp\,\tau}$ is the similarity value of EC for the set of superclasses with threshold; $\tau_{ECleave}$ is the threshold value for EC with leaves; $Sim_{ECleave\,\tau}$ is the similarity value of EC for the leaves with threshold.

Example External Structure: Referring to figure 4.2, similarity can only be calculated from the sets of superclasses. The leaves method cannot be applied because the classes are themselves leaves. The sets of superclasses are SP:Road = {LandRoad}, SP:Junction = {LandRoad} and PP1:Street = {WayInfrastructure}. Assume that the threshold value for the set of superclasses is 0.50 and for leaves is 0.30. If the similarity of LandRoad and WayInfrastructure is 1, the results are:

• From equation 4.9: <Road, Street, 1/1, NA>; <Junction, Street, 1/1, NA>.

• From equation 4.10: <Road, Street, 0.50, 1, 0.30, NA>; <Junction, Street, 0.50, 1, 0.30, NA>.

NA = not available.
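The superclass-based values of this example can be reproduced with a small Python sketch of equation 4.8. Following the text, the cotopy here contains only the superclasses of a concept, not the concept itself. The `equivalent` mapping, which encodes the assumption that WayInfrastructure and LandRoad are fully similar, and all function names are hypothetical.

```python
def upward_cotopy(concept, parent):
    """All superclasses of `concept` reachable through the child -> parent map."""
    found = set()
    while concept in parent:
        concept = parent[concept]
        found.add(concept)
    return found


def sim_cotopic(c1, c2, parent1, parent2, equivalent=None):
    """Equation 4.8; `equivalent` renames concepts of the second hierarchy."""
    rename = equivalent or {}
    uc1 = upward_cotopy(c1, parent1)
    uc2 = {rename.get(c, c) for c in upward_cotopy(c2, parent2)}
    union = uc1 | uc2
    return len(uc1 & uc2) / len(union) if union else 0.0


# Hierarchy fragments from the example: child -> direct superclass.
SP_HIERARCHY = {"Road": "LandRoad", "Junction": "LandRoad"}
PP1_HIERARCHY = {"Street": "WayInfrastructure"}
EQUIVALENT = {"WayInfrastructure": "LandRoad"}  # assumed similarity of 1
```

`sim_cotopic("Road", "Street", SP_HIERARCHY, PP1_HIERARCHY, EQUIVALENT)` gives 1.0, which corresponds to the value 1/1 above; without the equivalence the two superclass sets are disjoint and the result is 0.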

4.2.2 Agreement Unit

An agreement unit is developed from the results of label matching and of internal and external structure comparison. An agreement unit represents the semantic mapping of a concept between the common ontology and a local schema. In general, the steps for developing an agreement unit are as follows:

• Perform label matching of all concepts and properties of an export schema against the common ontology. The result is a similarity value and a correlation.

• Calculate the internal structure comparison for all concepts and properties. The process is more complex for concepts than for properties.

• Calculate the external structure comparison for all concepts. This method is not applicable to properties. Whether the set of leaves or the set of superclasses is used depends on the position of the concept in the hierarchy.

• The results of label matching and of internal and external structure comparison guide the user in developing an agreement unit. In theory, these results can be combined directly into a final result (see equation 4.11). However, it is very hard to determine the weight of each method. Therefore, user feedback is needed to make the final decision.

$$Final = w_{LM}\, sim_{lin} + w_{graph}\, Sim_{graph} + w_{struct}\, Sim_{struct} + w_{structCotopic}\, Sim_{structCotopic} \tag{4.11}$$

where $w_{LM}, w_{graph}, w_{struct}, w_{structCotopic} \geq 0$ and $w_{LM} + w_{graph} + w_{struct} + w_{structCotopic} = 1$.
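Equation 4.11 is a plain weighted sum, which the following Python sketch makes explicit. The weights below are arbitrary placeholders chosen for illustration, since the text notes that suitable weights are hard to obtain; the scores reuse values from the Road/Street examples, with the leaves-based score set to 0 only as a stand-in for "not available".

```python
import math


def final_similarity(scores, weights):
    """Weighted sum of equation 4.11 over the four similarity measures."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same measures")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in scores)


# Scores for (SP:Road, PP1:Street); "struct" is a placeholder for the NA value.
SCORES = {"lin": 0.93, "graph": 2 / 7, "struct": 0.0, "cotopic": 1.0}
# Hypothetical weights, not taken from the dissertation.
WEIGHTS = {"lin": 0.4, "graph": 0.2, "struct": 0.1, "cotopic": 0.3}
```

Calling `final_similarity(SCORES, WEIGHTS)` combines the four measures into one value that a user could then confirm or override.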

The result of an agreement on a concept generally refers to set-theoretic relations, e.g. =, ∩, ∪, ⊥, ⊆, ⊇. The set of agreement units forms the agreement of a provider peer.

COES       ES^1_PP1   ES^2_PP1   ...   ES^n_PP1
CO^1_SP    µ_1,1      µ_1,2      ...   µ_1,n
CO^2_SP    µ_2,1      µ_2,2      ...   µ_2,n
...
CO^m_SP    µ_m,1      µ_m,2      ...   µ_m,n

Table 4.2: Agreement Unit for Simple Mapping

An agreement unit can be presented as follows:

$$\langle SMC_{ID},\ \{CO^{m}_{SP},\ type^{m}_{SP}\},\ \{ES^{n}_{PP},\ type^{n}_{PP}\},\ \mu_{ID} \rangle \tag{4.12}$$

where $SMC_{ID}$ is a unique identifier of the agreement; $CO^{m}_{SP}$ is the m-th concept at SP, $m = 1 \ldots m_{max}$, and $m_{max}$ is the number of concepts at the Super Peer; $type^{m}_{SP}$ is the type of that concept at SP, either class or property; $ES^{n}_{PP}$ is the n-th concept in the export schema of PP, $n = 1 \ldots n_{max}$, and $n_{max}$ is the number of concepts at PP; $type^{n}_{PP}$ is the type of that concept at PP, either class or property; $\mu_{ID}$ is the transformation type needed to resolve the semantic heterogeneity between SP and PP.

Example Agreement Unit: a simple (1:1) agreement unit between SP and PP1 is shown in table 4.2.

The agreement unit is limited because the common ontology contains no instances. User feedback is needed to overcome this limitation, which raises difficulties at two levels:

• The logical level is very difficult to resolve, since it requires extra information and a human with good knowledge of the domain. This level tries to resolve conflicts between different representations (e.g. "sequence" and "unordered list All"), different types (e.g. simple vs complex, date vs string), and different constraints (e.g. default values, null values, cardinality).

• The data level is the 'traditional transformation', e.g. unit conversion. In our context, descriptions using RDFS and XMLS are meant to reduce problems at the logical and data levels. The logical and data levels are represented by means of equation 4.12, especially for the instance type.

A transformation (trans) type in a mapping [110] can be a simple model such as a look-up table (1-1), or a fancy function. In general, trans can be categorized as follows:

• Derivation mappings
  – SUPER: land transport SUPER road means that land transport is a more general concept than road.
  – SUB: road SUB land transport means that road is a more specialized concept than land transport.
  – EQU: road EQU main street means that road and main street are semantically equivalent.

• Operation mappings. An operation mapping is usually used together with a derivation mapping or a constraint mapping to define a complete mapping. For instance, name EQU (firstname AND lastname), where AND is the operator that concatenates the string instances of firstname and lastname. Another example is unit conversion.

• Constraint mappings. A constraint mapping specifies the constraints that the instances of the mapped concepts must conform to. Constraint mappings have three roles in the framework: (1) limitations for filtering instances in query answering, (2) requirements of data consistency for data coordination, and (3) enabling queries with constraints that involve attributes from multiple local sources. Typical constraint operators include comparison (e.g. <, ≤, >, ≥, =, and ≠) and disjointness of two concepts. For example, a constraint may specify speed_limit_street ≤ speed_limit_highway.
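The three categories above can be sketched in a few lines of Python. All function names are hypothetical, and the single space used as separator in the concatenation is an assumption, since the text only says that AND concatenates the two strings.

```python
# Derivation mappings: SUPER and SUB are inverses of each other, EQU is symmetric.
INVERSE = {"SUPER": "SUB", "SUB": "SUPER", "EQU": "EQU"}


def invert(left, relation, right):
    """Read a derivation mapping in the opposite direction."""
    return (right, INVERSE[relation], left)


def full_name(firstname, lastname, separator=" "):
    """Operation mapping: name EQU (firstname AND lastname)."""
    return separator.join((firstname, lastname))


def street_limit_is_consistent(speed_limit_street, speed_limit_highway):
    """Constraint mapping: speed_limit_street <= speed_limit_highway."""
    return speed_limit_street <= speed_limit_highway
```

For example, `invert("land transport", "SUPER", "road")` yields `("road", "SUB", "land transport")`, which is the pair of derivation mappings given in the list.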

4.3 Example

The first, simple example (idea from [27]) maps different representations of a road. Consider two sources S1 and S2. S1 contains the following object types:


• RoadSection is a piece of road that is homogeneous in value with respect to the set of attributes and relationships. A road section is described by the number of the road it belongs to and the number of lanes in each direction. Road sections have linear geometry.

• Node is the begin or end extremity of a road section. A node can be a crossroad, a traffic circle, a toll, a dead end, or a point where the value of some attribute of the road changes. This is described by the kind_o_node attribute. A node can delimit several road sections; each road section has one and only one begin node and one and only one end node. Nodes have point geometry.

• Overstepping may consist of one or several overlapping bridges. Each overstepping is linked to at least two road sections. Oversteppings have point geometry.

S2 contains the following object types:

• WaySection is composed of one or several contiguous lanes going in the same 'path' and direction. If a road with several lanes in the same direction is split by a separator, each group of lanes is represented as a separate way section. A way section is homogeneous in values and described by the number of the road it belongs to. A way section has linear geometry.

• Separator describes separators between two way sections of the same road. A separator has linear geometry and is characterized by its width. It is linked to the way sections it separates.

• Extremity is a point or an area where traffic conditions change. The geometry of extremity is geo.

• Bridge is a surface structure which allows several way sections to intersect with no connection. The geometry of bridge is surface.

An illustrative sketch of the geographic representation and the schemas of S1 and S2 can be seen in figures 4.3 and 4.4.

Based on the definitions above and the figures of the road network and schemas, some agreements between S1 and S2 can be created as follows:


Figure 4.3: Different Point of View on the same Road Network

Figure 4.4: Schema and Agreement between S1 and S2


• agreement=<bridge-overstepping, Overstepping, Bridge, Class, EQU>: there is an agreement named 'bridge-overstepping' between Overstepping (S1) and Bridge (S2) at the Class/Predicate type, and the agreement is that Overstepping is equivalent to Bridge.

• agreement=<road-way, RoadSection, WaySection MERGE Separator, Class, SUPER>: there is an agreement named 'road-way' between RoadSection (S1) and WaySection (S2) at the Class type, and the agreement is that RoadSection is more general than WaySection MERGE Separator.

• agreement=<node-extremitytoll, Node, Toll, Properties, SUPER>: there is an agreement named 'node-extremitytoll' between Node (S1) and Extremity at Toll (S2) at the property/predicate/attribute type, and the agreement is that Node is more general than Extremity at Toll.
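The three agreements above can be held in a simple data structure, as the following Python sketch shows. The field names and the `lookup` helper are hypothetical; the five fields follow the order used in the tuples above (name, S1 concept, S2 concept, level, transformation).

```python
from typing import NamedTuple, Optional


class Agreement(NamedTuple):
    name: str
    s1_concept: str
    s2_concept: str
    level: str
    trans: str


# The three agreements between S1 and S2 listed above.
AGREEMENTS = [
    Agreement("bridge-overstepping", "Overstepping", "Bridge", "Class", "EQU"),
    Agreement("road-way", "RoadSection", "WaySection MERGE Separator", "Class", "SUPER"),
    Agreement("node-extremitytoll", "Node", "Toll", "Properties", "SUPER"),
]


def lookup(s1_concept: str) -> Optional[Agreement]:
    """Return the agreement registered for an S1 concept, or None if there is none."""
    return next((a for a in AGREEMENTS if a.s1_concept == s1_concept), None)
```

A query that mentions an S1 concept can then be translated by looking up its agreement, for example `lookup("Node").s2_concept` gives "Toll".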

The resulting agreement can be represented in RDF(S) enhanced with OWL. The example is discussed in detail in the chapter on the prototype.

The second example is the transportation network, which is important in many areas, such as business planning, city planning and routing of delivery services. Unfortunately, organizations such as the police department, the public ministry and toll organizations have different views of the transportation network, especially of land transport. For illustration, there is a common ontology of land transport at the super peer, and one of the provider peers belongs to a company that operates toll roads. In this case a toll is defined as a transport network that must be paid for when passed. If a delivery service company needs to design its routes, it needs to know the class of each road, whether it is free or paid, which locations it connects, and so on. To solve this problem, the example demonstrates how to use P2P to build a semantic overlay network, in other words a cluster of provider peers that are semantically similar.

Consider figure 4.6, in which PP1 makes an agreement in the P2P environment. The export schema can be coded in RDFS and XMLS, and the common ontology in RDFS and OWL. This example assumes that the export schema and the common ontology already exist. Part of the common ontology at SP is as follows:


<!-- Top level class 'Transport' -->
<rdfs:Class rdf:ID="Transport" rdfs:comment="Type of transport network">
  <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>

<!-- Subclasses of Transport -->
<rdfs:Class rdf:ID="LandTransport" rdfs:comment="Network transport in land, including river and lake">
  <rdfs:subClassOf rdf:resource="#Transport"/>
</rdfs:Class>

<rdf:Property rdf:ID="jurisdiction">
  <rdfs:domain rdf:resource="#LandRoad"/>
  <rdfs:range rdf:resource="&xsd;string"/>
  <rdfs:range rdf:resource="&rdfs;Literal"/>
</rdf:Property>

The agreement units that form part of the agreement are developed as follows:

• Label matching. By linguistic analysis, the concept "PP:Toll" and the concepts "CO:Land Road", "CO:Road", "CO:Highway", etc. have a high semantic equivalence. A toll is a road or bridge for which a fee is charged, for maintenance or for use of the facility. The results can be represented as:
  < CO:LandRoad, PP:Toll, 0.0>; this is an interesting result, caused by the limitations of WordNet. With user feedback a better value of 0.9 is provided.
  < CO:LandRoad, PP:InterState, 0.8>
  < CO:Highway, PP:HighWay, 1.0>
  For simplicity, no threshold value is applied in this example.

• Internal structure can be calculated by looking at the language and world attributes. For this simple ontology, the attributes do not vary much. Suppose the calculated internal structure similarity is <CO:LandRoad, PP:Toll, NA, 0.4>.


Figure 4.5: Example of Agreement Unit

• External structure. The leaves and set-of-superclasses approaches cannot always be used to calculate similarity; this depends on the ontology and on the position of a concept in it. For example, the similarity of CO:LandRoad and PP:Toll can only be calculated with the leaves approach. The result is <CO:LandRoad, PP:Toll, NA, 0.6>.

• Based on the steps above and on figure 4.6, the following agreement units can be found:
  <A1, LandRoad, Class, Toll, Class, trans='jurisdiction= have to pay when pass the facility'>
  <A2, LandRoad, Class, Interstate, Class, trans='between= state'>
  <A3, LandRoad, Class, Intrastate, Class, trans='between= district'>

Figure 4.5 is an example of the agreement unit with id A1 between the common ontology at SP and the export schema at PP1, presented in RDFS and OWL.

Once the agreements at the semantic and structure levels have been generated, the system can deal with the logical and data levels by using user feedback. Some simple examples:

• The distance of a highway is expressed in km in CO but in miles at the provider peer. This situation requires a data-level unit conversion between km and miles, for which a simple function or look-up table can be used.


Figure 4.6: Example of Peer Agreement

• Overstepping in CO is represented by geo:point, and bridge at the provider peer is represented by geo:Area. In CO, a point is part of an area, therefore a transformation is needed to convert between point and area.

• The location of Overstepping in CO uses a latitude-longitude coordinate system, whereas the location of Bridge at the provider peer is based on administrative areas. A transformation process is needed to convert between administrative areas and the coordinate system.
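The simplest of these data-level transformations, the unit conversion, can be written as a pair of functions. This sketch assumes that the provider peer uses the international statute mile; the function names are hypothetical.

```python
# One international mile is exactly 1.609344 km by definition.
KM_PER_MILE = 1.609344


def miles_to_km(miles: float) -> float:
    """Convert a provider-peer distance in miles to the km used in CO."""
    return miles * KM_PER_MILE


def km_to_miles(km: float) -> float:
    """Convert a CO distance in km to the miles used at the provider peer."""
    return km / KM_PER_MILE
```

The point/area and coordinate/administrative-area transformations need spatial data and are not covered by this sketch.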

A more difficult task arises in the second example if communities differ in how they classify roads and streets. The classification can be based on size (width of the road), density (number of cars per hour), or the regions connected. Figure 4.7 shows three provider peers: classification at PP1 is based on region, at PP2 on density, and at PP3 on size. Suppose a request peer (RP) asks for 'the number of secondary streets in an area'. Answering the query requires provider peers whose view is based on size. If the query is based only on the view of RP, RP assumes that all peers agree with its concept of secondary street, so RP sends the query (based on RDF Query) below.

<owl:Class rdf:ID="Q_secondarystreet">
  <owl:equivalentClass>
    <owl:intersectionOf rdf:parseType="Collection">
      <owl:Class rdf:about="SecondarySt"/>
    </owl:intersectionOf>
  </owl:equivalentClass>
</owl:Class>

The query above will not obtain an appropriate answer, because RP and PPs whose semantics differ strongly have difficulty communicating. The CO of SP reduces the semantic differences, so the query is rewritten based on CO instead of the RP view. The query is as follows:

<owl:Class rdf:ID="Q_co_secondarystreet">
  <owl:equivalentClass>
    <owl:intersectionOf rdf:parseType="Collection">
      <owl:Class rdf:about="&sp;Street"/>
      <owl:Restriction>
        <owl:onProperty rdf:resource="&sp;size"/>
        <owl:hasValue rdf:datatype="&xsd;integer">
          <rdfq:greaterThan>3</rdfq:greaterThan>
          <owl:onProperty rdf:resource="&sp;unit"/>
          <owl:hasValue>meter</owl:hasValue>
          <rdfq:lessThan>5</rdfq:lessThan>
          <owl:onProperty rdf:resource="&sp;unit"/>
          <owl:hasValue>meter</owl:hasValue>
        </owl:hasValue>
      </owl:Restriction>
    </owl:intersectionOf>
  </owl:equivalentClass>
</owl:Class>

A problem remains at PP2, which has no size property, only a density property. PP2 therefore informs RP that it should either use the density property or drop the query. If RP wants to continue getting information from PP2, it needs to express secondary street in terms of the density property. For illustration, PP2 replies that its concept is based on density, not size. RP then changes its concept to density and redefines SecondarySt accordingly. For simplicity, RP changes its SecondarySt concept from a size of 3-5 meters to a density of 10-20 cars per hour, so the new query from RP to PP2 is as follows:

<owl:Class rdf:ID="Q_co_pp2_secondarystreet">
  <owl:equivalentClass>
    <owl:intersectionOf rdf:parseType="Collection">
      <owl:Class rdf:about="&sp;Street"/>
      <owl:Restriction>
        <owl:onProperty rdf:resource="&sp;density"/>
        <owl:hasValue rdf:datatype="&xsd;integer">
          <rdfq:greaterThan>10</rdfq:greaterThan>
          <owl:onProperty rdf:resource="&sp;unit"/>
          <owl:hasValue>car per hour</owl:hasValue>
          <rdfq:lessThan>20</rdfq:lessThan>
          <owl:onProperty rdf:resource="&sp;unit"/>
          <owl:hasValue>car per hour</owl:hasValue>
        </owl:hasValue>
      </owl:Restriction>
    </owl:intersectionOf>
  </owl:equivalentClass>
</owl:Class>

Figure 4.7: Example of Multi Peer Agreement
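The rewriting logic of this example can be sketched in Python, independently of the RDF query syntax. The class `RangeQuery` and both functions are hypothetical; the bounds are the ones used in the text (3-5 meters, 10-20 cars per hour), and the comparison is strict to mirror the greaterThan/lessThan elements of the queries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeQuery:
    concept: str
    prop: str
    low: float
    high: float
    unit: str


# RP's definitions of "secondary street", one per classification view.
SECONDARY_STREET = {
    "size": RangeQuery("Street", "size", 3, 5, "meter"),
    "density": RangeQuery("Street", "density", 10, 20, "car per hour"),
}


def rewrite_for_peer(peer_properties):
    """Pick the first definition whose property the peer exposes; None means drop the query."""
    for prop, query in SECONDARY_STREET.items():
        if prop in peer_properties:
            return query
    return None


def count_matches(query, streets):
    """Count streets whose value lies strictly between the two bounds."""
    return sum(1 for street in streets
               if query.prop in street and query.low < street[query.prop] < query.high)
```

A peer that exposes only density receives the density-based query, a peer that exposes size receives the size-based one, and for a peer that exposes neither the query is dropped.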

4.4 Query Issues

Querying is an important area in information interoperability. Usually, query rewriting uses the result of the agreement/mapping during processing. Mapping or agreement is done during the initialization/preparation phase and involves a human expert in the domain. Query rewriting is executed during the 'run time' of the interoperability process.

The query issue is how to rewrite a query so that it can be sent to the appropriate peers. In this research, query rewriting is based on the result of the agreement, and the query is delivered using RQL.

The query language should consider the characteristics of the semistructured data model. Semistructured data can be described by the following structural characteristics:

• Schema is not given in advance, but is implicit in the data.

• Schema is relatively large and changes frequently.

• Schema is descriptive of the data rather than prescriptive.

• Data is not strongly typed, which means that attributes with the same name may change type as they are used in different places.

Query methods in P2P can follow several models, as follows:

• Broadcast and routing at the P2P level. In this model the query is submitted to all members of a community via close neighbors, and each neighbor forwards it to its own neighbors. The selection of neighbors is based on the P2P network overlay. The advantage of this approach is that it operates at the logical network level, so it is easier to handle.

• Select peers and routing at the P2P level. In this model the query is sent based on criteria applied to a database of community members, with routing at the logical network level.
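The difference between the two models can be made concrete by counting messages. The sketch below is only an illustration under stated assumptions: the overlay, the registry contents and the functions flood and select are hypothetical, every forwarded copy of a query counts as one message, and the selected model is charged one extra message for the lookup in the community database.

```python
from collections import deque


def flood(overlay, start):
    """Count messages when every peer forwards the query once to all its neighbors."""
    seen = {start}
    queue = deque([start])
    messages = 0
    while queue:
        peer = queue.popleft()
        for neighbor in overlay[peer]:
            messages += 1  # a forward costs a message even if the neighbor already saw the query
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return messages


def select(registry, keyword):
    """Count messages when the query goes only to peers whose metadata match."""
    lookup = 1  # one request to the community database
    targets = [peer for peer, keywords in registry.items() if keyword in keywords]
    return lookup + len(targets)


# Hypothetical overlay (who is a neighbor of whom) and community database
overlay = {
    "RP": ["PP1", "PP2"],
    "PP1": ["RP", "PP2", "PP3"],
    "PP2": ["RP", "PP1", "PP3"],
    "PP3": ["PP1", "PP2"],
}
registry = {
    "PP1": {"road", "width"},
    "PP2": {"road", "density"},
    "PP3": {"business"},
}

print(flood(overlay, "RP"), select(registry, "road"))
```

In this small example flooding costs one message per directed overlay link, while selection costs one lookup plus one message per matching peer. The exact ratio depends entirely on the assumed topology and metadata.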

The approach to handling the query here is based on selected peers and routing at the network level. The query procedure is as follows:

• Knowledge discovery from the super peer's common ontology to the request peer. The purpose is to develop a semantic socket in the query.

• Query sending is done by the request peer in two steps. In the first step, the query is evaluated against the provider peer's agreement. The possible responses are: can answer, cannot answer, may answer. The status 'may answer' arises because there is similarity but it is not close enough. For example, the request peer asks for 'secondary road based on width', but a provider peer has 'secondary road based on traffic density'. In this situation, a pre-response/inform message is needed; after the pre-response, query rewriting may or may not be possible. Assuming that query rewriting can be done and that it fulfills the concept of the provider peer, the query is submitted to the local data via a wrapper/bridge to get the result. Negotiation is one key to achieving a better agreement between the request and provider peers.

• The response to the query can come from several provider peers. Answer integration is needed to get a better result. The integration is done by the request peer.

System    Data Model            Language Style         Path Expression   Graph Expression
WebSQL    relational            SQL                    Yes               No
W3QL      labeled multigraphs   SQL                    Yes               No
WebLog    relational            DataLog                No                No
Lorel     labeled graph         OQL                    Yes               No
WebOQL    hypertrees            OQL                    Yes               Yes
UnQL      labeled graph         structural recursion   Yes               Yes
STRUDEL   labeled graph         DataLog                Yes               Yes
ARANEUS   page schemas          SQL                    Yes               Yes
FLORID    F-Logic               DataLog                Yes               No

Table 4.3: Comparison of Web Query Language [30]
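The first step of the query procedure above, in which a provider peer reports whether it can answer, cannot answer or may answer, can be sketched as a small classification function. This is a simplified illustration, not the prototype's code: the Status enumeration, the evaluate function and the representation of an agreement as a mapping from concept names to their defining properties are all assumptions made for the example.

```python
from enum import Enum


class Status(Enum):
    CAN_ANSWER = "can answer"
    MAY_ANSWER = "may answer"
    CANNOT_ANSWER = "cannot answer"


def evaluate(query, agreement):
    """Compare a (concept, property) query with a provider peer's agreement."""
    concept, prop = query
    if concept not in agreement:
        return Status.CANNOT_ANSWER, None
    if prop in agreement[concept]:
        return Status.CAN_ANSWER, prop
    # Same concept but a different defining property: the pre-response
    # tells the request peer which properties are available instead
    return Status.MAY_ANSWER, sorted(agreement[concept])


# Hypothetical agreement at a provider peer that classifies roads by density
pp2_agreement = {"SecondaryRoad": {"density"}}

status, info = evaluate(("SecondaryRoad", "width"), pp2_agreement)
print(status.value, info)
```

With a 'may answer' status the request peer can decide whether to rewrite the query in terms of the reported property or to drop it.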

Currently, many query languages are available, from the database area to the semantic web level. A survey of web query languages by Dippel [30] is summarized in table 4.3. Query languages for RDF are also available. The popular languages are RQL, RDFQL, RDFPath and Triple. The comparison can be seen in table 4.4.

Language   Data Model   Language Origin   Closure of queries   Generality
RQL        Graph        OQL               Yes                  No
RDFQL      Triple       SQL               No                   No
RDFPath    Triple       XPath             No                   No
Triple     Triple       F-Logic           No                   No

Table 4.4: Comparing RDF Query Language [58]

4.5 Architecture

The architecture follows from the previous discussion of the approach. The characteristics of the architecture are as follows:

• Based on P2P with a super peer (SP). The main task of the super peer is to register active provider peers and the associated metadata. In this architecture, the tasks of the super peer are minimized to avoid a central point of failure.

• There are three types of peers: super peer, provider peer and request peer. The type refers to the function; physically, one machine can host all three types of peer.

• Registration or advertisement of a provider peer at the super peer is based on RDF(S).

• Agreement/mapping is conducted by the provider peer with respect to the common ontology at the super peer. The result of the agreement is represented in RDF(S) and saved at the local peer.

• Query and response are conducted directly between the request and provider peers, based on selection from the book keeping and network level routing. Negotiation can take place during the process of getting the response to the query. If the response comes from several provider peers, response integration is needed.

To reduce the complexity of the architecture, the following assumptions are made:

• The common ontology is available. The methodology for developing and maintaining the ontology is not part of the architecture.

• Clustering / the Semantic Overlay Network (SON) has been set up. Because every provider peer is assumed to have registered with the super peer and to have an agreement/mapping to the common ontology at the super peer, the provider peer belongs to some semantic cluster.

• All peers use the same P2P platform.

The general architecture can be seen in figure 4.8. Every peer has function blocks to handle information interoperability in the Peer Agreement Architecture. The architecture is based on a generic P2P platform.

Figure 4.8: General Architecture Peer Agreement

The main blocks of every type of peer are as follows:

• Super Peer (SP)

– Book Keeping, to save the information about provider peers in the community. The information consists of the ID, metadata/keywords, activity condition, and validation schema information of the provider peer. Request peer information is saved in the book keeping as well. There is a mechanism to detect active peers. The first case is a clean log-off, which means the peer sends a message to leave the P2P network. The other is a dirty log-off, which can happen because of, for example, a machine or network failure. To detect a dirty log-off, a time stamp method is implemented.

– Ontology Engineering tool, the function that puts the common ontology at the super peer. The architecture is not concerned with the details of ontology engineering.

• Provider Peer (PP)

– Registration, the mechanism of registration at the super peer. The mechanism includes getting an ID, looking at the common ontology, and sending metadata/keywords. There is also a mechanism for re-registration at a regular interval, whose purpose is to let the super peer know the status of the provider peer.

– Export Schema and Agreement Management. This block has the task of preparing the export schema and creating the agreement between the local schema and the common ontology at the SP.

– Query Respond Management. A query from a request peer is handled in three phases: (1) asking for the agreement saved at the provider peer, (2) sending the rewritten query based on the agreement, and (3) asking for the result of the query.

• Request Peer (RP)

– Knowledge Discovery or Concept Adjustment. This block takes the common ontology information from the super peer and adjusts the view as a semantic concept.

– Query Management Tool, the facility to make a query based on the agreements of provider peers and to send the query to peers.

– Respond Management. The result of a query is not sent automatically by the provider peer; it waits for an invitation from the request peer. The answers from several provider peers need to be combined by union and join to get a better result.
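The book keeping block and its time stamp method for detecting dirty log-offs can be sketched as follows. This is a minimal illustration under assumptions: the class BookKeeping, its method names and the timeout value are hypothetical, time is passed in explicitly as a number of seconds, and a peer is treated as gone when its last (re-)registration is older than the timeout.

```python
class BookKeeping:
    """Registry of peers kept at the super peer (hypothetical sketch)."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds without re-registration before a peer counts as gone
        self.peers = {}         # peer ID -> (keywords, time stamp of last registration)

    def register(self, peer_id, keywords, now):
        # Registration and periodic re-registration both refresh the time stamp
        self.peers[peer_id] = (set(keywords), now)

    def log_off(self, peer_id):
        # Clean log-off: the peer announces that it leaves
        self.peers.pop(peer_id, None)

    def active(self, now):
        # Dirty log-off: a peer whose time stamp is too old is not reported as active
        return sorted(p for p, (_, seen) in self.peers.items()
                      if now - seen <= self.timeout)


bk = BookKeeping(timeout=60)
for peer in ("PP1", "PP2", "PP3"):
    bk.register(peer, {"road"}, now=0)
bk.register("PP1", {"road"}, now=50)  # PP1 re-registers in time
bk.log_off("PP3")                     # PP3 leaves cleanly
print(bk.active(now=90))              # PP2 never re-registered
```

The super peer only needs the latest time stamp per peer, which keeps its task small, in line with the aim of minimizing the role of the super peer.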

There are three phases for information interoperability in the Peer Agreement Architecture. The phases are:

• Registration & Agreement Phase, see figure 4.9

1. The provider peer sends a message to the super peer for registration. During the registration, the provider peer gets an ID and puts some metadata at the super peer, and the super peer creates a time stamp for the provider peer.

2. The provider peer develops an agreement between its local schema and the common ontology at the super peer. The result of the agreement is saved at the provider peer.

3. Based on the saved result of the agreement, metadata is developed and sent to the super peer.

Figure 4.9: Registration & Agreement Phase at Peer Agreement

• Query Rewriting Phase, see figure 4.10

1. First, the RP adjusts (rewrites) the query based on the SP common ontology.

2. The request peer looks at the book keeping at the super peer to learn which provider peers are available and have a high correlation with the request.

3. The queries are sent directly to the appropriate PPs.

4. A PP may send a message if the query cannot be answered directly because of a difference in concept. Currently a simple two step method is suggested. In the first step, the request query is sent only to find an appropriate concept at the provider peers. In the second step, there are three scenarios: if an appropriate concept is found, the request peer can send a query to get individual information; if no appropriate concept is found, the request peer can either rewrite the query or drop it.

Figure 4.10: Query Rewriting Phase at Peer Agreement

• Query Respond Phase, see figure 4.11

1. After the query from the RP has been processed successfully, the response is sent directly from the PP to the RP.

2. Answers from PPs are integrated (union & join) to provide an answer to the initial query.
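The integration of answers by union and join in the last phase can be sketched in a few lines. The example assumes, purely for illustration, that every record carries a shared identifier under the key "id"; the function integrate and the sample records are hypothetical and not taken from the prototype.

```python
def integrate(answers, key="id"):
    """Union the records of all provider peers and join records that share a key."""
    merged = {}
    for records in answers:
        for record in records:
            # Join: attributes of the same object reported by different peers are combined
            merged.setdefault(record[key], {}).update(record)
    # Union: every object reported by at least one peer appears once
    return [merged[k] for k in sorted(merged)]


# Hypothetical answers: PP1 knows widths, PP2 knows traffic densities
from_pp1 = [{"id": "S1", "width": 4}, {"id": "S2", "width": 5}]
from_pp2 = [{"id": "S2", "density": 15}, {"id": "S3", "density": 12}]

result = integrate([from_pp1, from_pp2])
print(result)
```

In practice the peers may not share identifiers, in which case the join condition would itself have to come from the agreement. That case is outside this sketch.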

Figure 4.11: Query Respond Phase at Peer Agreement

4.6 Summary

The research framework is based on four pillars: semantics, GIS interoperability, query and P2P architecture. The framework is derived from the state of the art.

The approach considers the local, community and agreement contexts. There are three types of peers: provider peer, super peer and request peer. The presentation of a peer's information is based on XML and uses RDF(S) enhanced with OWL.

The agreement consists of agreement units, each carried out between a part of the common ontology and the export schema. An agreement unit utilizes label matching and internal and external structure comparison. The result of the agreement is saved at the provider peer.

Query processing is based on the agreement at the provider peer. The query is routed directly to the appropriate provider peers without the help of the super peer.

The architecture has functions to handle the common ontology, book keeping, development of agreements, knowledge discovery and query processing.


Chapter 5

Result of Prototype

5.1 Prototype

A prototype is developed to evaluate the proposed approach. The prototype set-up has the following objectives:

• Testing Communication/protocol for P2P infrastructure.

• Testing similarity measurement for Agreement Unit and sending queries.

Testing communication/protocol for P2P infrastructure: virtual machines are implemented. The virtual machines use QEMU technology, which makes it possible to run different operating systems in virtual machines on one physical machine. For communication between virtual machines, virtual distributed ethernet (VDE) is implemented. The VDE feature permits the concurrent execution and communication of several virtual or emulated machines through an emulated network. The virtual model is built on a SUN W1100 machine which runs SuSE 9.3. The virtual machines run several Linux distributions. The configuration of the virtual machines can be seen in figure 5.1.

Testing similarity measurement and sending queries: a virtual webserver is used, because a virtual machine is too slow to run a medium-sized program on top of it. The result of the communication testing is implemented in the virtual webserver. Figure 5.2 depicts the virtual webserver model.


Figure 5.1: Virtual Machine Prototype

Figure 5.2: Virtual Web Prototype


5.2 Experimental Setup

The set-up consists of the infrastructure, the common ontology and three export schemas of provider peers. The details of each component are as follows:

5.2.1 Infrastructure

Two types of infrastructure are prepared for the experiment. The first is the virtual machine infrastructure; the second is the virtual web server infrastructure.

The virtual machine consists of:

• host machine is SUN W1100

• host operating system is SuSE 9.3

• software for virtual machine is QEMU

• software for virtual network is VDE

• operating systems for the virtual machines are SuSE, RedHat, Knoppix, and Debian

The virtual webserver consists of:

• host machine is SUN W1100

• host operating system is SuSE 9.3

• software for virtual web is APACHE

• software for developing schemas and handling queries is Protégé and related TABs/add-ons

• three remote host machines are a Pentium III, a Pentium IV Celeron and a Pentium IV 2.4, which run Linux and MS Windows

• Internet communication


5.2.2 Common Ontology and Export Schema

The common ontology and the export schemas/ontologies are taken from the real world. Some adjustments of the ontologies were made because of limited information from the real world.

The common ontology refers to the Ontology of Traffic Networks (OTN). OTN is a direct encoding of Geographic Data Files (GDF) in OWL. OWL has some technical requirements which required some adjustments. There are five different basic classes:

• Feature contains all GDF features as OTN classes

• Geometric de�nes the geometric form of features

• Composite Attributes represents classes consisting of composed attributes

• Relationship describes the non-geometric relationship between features

• Transfer Point is a class which describes how to get from one object toanother object

In its latest version, the common ontology consists of 139 classes and 211 attributes. Additional information was added to provide more annotation of terminology and units.

The export schemas are developed based on the National Road Network of Canada. A GIS expert helped us to modify it into 3 export schemas related to the transport network. The other 2 export schemas are prepared in other domains (business and education). The details of the export schemas are as follows:

• PP1, classification of roads based on road width, 22 concepts, 3 levels of depth.

• PP2, classification of roads based on traffic density, 17 concepts, 4 levels of depth.

• PP3, classification of roads based on area, 10 concepts, 3 levels of depth.

• PP4, business, 23 concepts, 4 levels of depth.

• PP5, education, 27 concepts, 4 levels of depth.


5.3 Experiments

• Method for Communication P2P

– Check communication between virtual machines acting as peers by using the PING and SSH commands. The purpose is to check the ability of the virtual machines to communicate. Once this evaluation is passed, the experiment can continue to the next step.

– Develop a script to send PING randomly to peers as the flooding method.

– Develop a script to send PING to selected peers based on information from the super peer.

– Compare the network traffic of the two methods of sending PING. PING stands in for a query in the real world.

• Similarity Measurement as Agreement Unit

– Provide the common ontology and all export schemas in OWL at different virtual web addresses. The purpose is to make the common ontology and export schemas accessible to all users.

– Ask people who have basic knowledge of computers to calculate similarity as the basis for creating agreement units. The calculation of similarity is helped by tools; however, it is still manual and needs human action. The similarity calculation is executed for label matching and for internal and external comparison. For the experiment, label matching refers only to WordNet; external sources are not yet involved. The results of the similarity calculation are used to provide agreement units.

– Compare the similarity results and agreement units of the expert/human with those of the similarity calculation approach.

• Sending queries

– Provide the agreement at the appropriate provider peer.

– Send queries to a variety of communities of provider peers. The purpose is to compare the results between a single community and a mixed community.

– Send queries to a community of provider peers. The purpose is to compare the results of the one step and two step methods. In the one step method, the query is sent directly to get the individual/instance data. In the two step method, the query first checks the available concepts at the provider peer, and the next step sends the query to get the individual data.
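The similarity calculation in the experiment list above combines label matching with internal and external structure comparison. The sketch below shows one possible way to combine the three, under clearly stated assumptions: plain string similarity from Python's difflib stands in for the WordNet-based label matching used in the experiment, internal structure is compared as the overlap of property names, external structure as the overlap of parent concept names, and the weights and sample concepts are arbitrary illustrative values.

```python
from difflib import SequenceMatcher


def label_sim(a, b):
    # String similarity as a stand-in; the experiment itself refers to WordNet
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard(a, b):
    """Overlap of two sets, between 0 and 1."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def similarity(c1, c2, weights=(0.5, 0.25, 0.25)):
    """Weighted sum of label, internal (properties) and external (parents) similarity."""
    w_label, w_internal, w_external = weights
    return (w_label * label_sim(c1["label"], c2["label"])
            + w_internal * jaccard(c1["properties"], c2["properties"])
            + w_external * jaccard(c1["parents"], c2["parents"]))


# Hypothetical concepts from a common ontology and two export schemas
road = {"label": "Road", "properties": {"width", "name"}, "parents": {"TransportFeature"}}
street = {"label": "Street", "properties": {"width", "name", "density"}, "parents": {"TransportFeature"}}
school = {"label": "School", "properties": {"students"}, "parents": {"Institution"}}

print(similarity(road, street) > similarity(road, school))
```

A candidate agreement unit would then be proposed for the pair with the highest score, still subject to confirmation by a human, as in the experiment.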

5.4 Result and Discussion

The experiment results are presented and discussed as follows:

• Method for Communication P2P

Result: PING and SSH responded successfully in the virtual machines and network. Therefore, it is possible to evaluate the P2P protocol. After running the simulation with 5 provider peers 100 times, the flooding method produced traffic four times higher than the selected method.

Discussion: according to the result, the super peer gives better network traffic than flooding. There is one problem: if the super peer fails, the network traffic is zero. Even so, the prototype is based on the selected method using a super peer.

• Similarity Measurement as Agreement Unit

Result: for the agreement units developed based on the similarity calculation, precision is calculated following the information retrieval (IR) approach as follows:

\[ \mathrm{Precision}_{\mathrm{individual}} = \frac{\mathrm{Agreement}_{\mathrm{related}}}{\mathrm{Agreement}_{\mathrm{possibility}}} \tag{5.1} \]

$\mathrm{Agreement}_{\mathrm{related}}$ is the number of agreement units developed by users using the approaches and tools. $\mathrm{Agreement}_{\mathrm{possibility}}$ is the number of all possible agreement units between the common ontology and the export schema. For the experiment, the value of $\mathrm{Agreement}_{\mathrm{possibility}}$ is the number of concepts/properties at a provider peer. $\mathrm{Precision}_{\mathrm{individual}}$ (Pi) is the result of the agreement. A higher value of Pi indicates a closer relation between the knowledge of the provider peer and that of the community.

\[ \mathrm{Precision}_{\mathrm{expert}} = \frac{\mathrm{Agreement}_{\mathrm{correct}}}{\mathrm{Agreement}_{\mathrm{related}}} \tag{5.2} \]

$\mathrm{Agreement}_{\mathrm{correct}}$ is the number of correct agreement units developed by users compared to the expert view. $\mathrm{Precision}_{\mathrm{expert}}$ (Pe) is the result of correct agreement units.

The results for concepts from five provider peers, with agreement units developed by two users, can be seen in table 5.1.

       Pi              Pe
PP1    10/22 = 0.45    8/10 = 0.80
PP2    10/17 = 0.59    10/10 = 1.00
PP3    7/10 = 0.70     7/7 = 1.00
PP4    8/23 = 0.35     4/8 = 0.50
PP5    6/27 = 0.22     2/6 = 0.33

Table 5.1: Result of Concept of Agreement Unit
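As a check on the arithmetic, the values in table 5.1 can be recomputed from equations (5.1) and (5.2). The counts below are taken from the table; the variable and function names are only illustrative.

```python
# peer: (related units, possible units, correct units), as listed in table 5.1
counts = {
    "PP1": (10, 22, 8),
    "PP2": (10, 17, 10),
    "PP3": (7, 10, 7),
    "PP4": (8, 23, 4),
    "PP5": (6, 27, 2),
}


def precision_individual(related, possible):
    """Equation (5.1): share of possible agreement units that users developed."""
    return related / possible


def precision_expert(correct, related):
    """Equation (5.2): share of developed agreement units that the expert accepts."""
    return correct / related


table = {
    peer: (round(precision_individual(related, possible), 2),
           round(precision_expert(correct, related), 2))
    for peer, (related, possible, correct) in counts.items()
}

for peer, (pi, pe) in table.items():
    print(f"{peer}  Pi = {pi:.2f}  Pe = {pe:.2f}")
```

The transport-related peers PP1 to PP3 have higher Pi and Pe than the business and education peers PP4 and PP5.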

Discussion: according to the result, the P2P approach within a community provides better results for knowledge discovery than a mixed community, because the values of Pi and Pe are higher. Another problem during the experiment is the limitation of WordNet for domain terminology, which can be seen in the low value of Pi for PP1. It is important to develop an approach that uses an external ontology. Currently only simple agreement units (1:1) can be represented successfully with the available tools.

• Sending queries

Result: the purpose of this experiment is to evaluate the use of agreements for queries. The testing of the prototype uses only around 50 individuals for each provider peer. The results for responding to queries with the one step and two step methods are as follows: with the one step method the fraction of correct responses was 0.40-0.90, and with the two step method it was 0.90-1.00. Incorrect responses of the two step method were caused by different units of data.

Discussion: the agreement can be used for writing queries. The two step method gave better results than the one step method. However, the two step method is still manual. Effort is needed to automate the discovery of related provider peers and the two step method.


5.5 Summary

This chapter has described the prototype and the evaluation of the approach. The similarity calculation approach can help to develop agreement units, because its results agree well with the expert view. The agreement and query results have demonstrated that the approach is appropriate for simple agreements in a P2P environment within a community.


Chapter 6

Conclusion and Future Work

First, a conclusion is given with respect to the research objective of this study on semantic agreement for spatial information interoperability. Second, the achievements of the approach are summarized. Finally, the key contributions of the thesis are presented and directions for future research are discussed.

6.1 Research Objective

This thesis focuses on finding innovative ways to make spatial information interoperable. The approach differs from available approaches in that it starts by developing a community which shares a common interest regarding semantic agreement for information sharing. The objective of this research is to develop a method and architecture for GIS interoperability by considering semantic heterogeneity in a P2P environment, focusing on:

• How to represent the semantics of peers?

• How to allow No Global View semantic interoperability based on an appropriate method of agreement between peers?

• How to develop P2P based semantic interoperability for spatial information?

Ontology is implemented as the foundation of the interoperability. Ontologies can represent real world entities using a sophisticated structure with components such as definition, parts, function, attributes and rules of relationship. Furthermore, ontologies capture the semantics of information, can be represented in a formal language, and can be used to store related metadata. Ontologies can be used to establish a semantic agreement about diverse views of the world and consequently carry the meaning of the original ideas that are embedded in the representation of geographic phenomena in the human mind.

6.2 Achievements

Regarding the questions of the research objective, the following are the achievements of the research:

• Representation of peers uses an export schema or ontology which represents the local data. OWL DL is used to write the export schema of a peer.

• Similarity calculations based on linguistic analysis and internal and external structure give good results for developing agreement units between the export schema of a provider peer and the common ontology of a community. Simple agreement units can be developed successfully. However, complex agreement units are limited because the tools and language (OWL FULL) are not yet mature.

• An architecture of P2P for spatial interoperability based on semantic agreement has been developed and put into a prototype. The architecture consists of three types of peers, a common ontology, and export schemas. The architecture can provide better results for knowledge discovery in a community.

• Agreements can be utilized to write queries among peers in a community. The experiment results gave evidence that the implementation of the approach and the prototype can handle spatial information interoperability for simple agreements.

6.3 Contribution

The following are the contributions of the thesis.


1. To develop loosely coupled semantic interoperability in a P2P architecture based on three contexts. The global context, represented by an ontology, defines a set of generic application domain level data types, called peer agreement concepts. The local/schema contexts provide the ontological descriptions of the local data sources. Finally, the peer agreement contexts combine views from the common and local contexts to define a set of cooperation concepts.

2. To define peer agreements that represent ontological agreements on the mediation concepts provided by the generic domain ontologies. Peer agreements extend the traditional role concept of object oriented models to associate different semantic interpretations with the concepts of the common ontology. I define a cooperation concept to encapsulate three key components: 1) the common ontology, 2) virtual views defined on the local schema/ontology, and 3) context transformation mappings for the specification of inter ontology relationships.

3. To develop a P2P architecture which is based on the three layers/contexts. The architecture consists of three types of peers: super peer, provider peer and request peer. The super peer provides the common ontology as the reference of the community. The provider peer provides the export schema of local data for sharing. The request peer submits queries to provider peers.

4. Finally, an implementation framework or prototype that applies the approach to geographic information systems is developed. The XML, RDF and OWL ontology description languages [108] are used to describe the different ontologies and implement the contexts. Furthermore, the proposed approach relies on the inference capabilities of OWL for semi-automatic processing.

6.4 Future Work

Significant inroads into understanding and developing solutions for the representation and agreement of peers have been discussed. However, substantial work is still needed. The following are several directions for future work.

• Complex agreement. The approach has been developed to cover simple and complex agreements. However, the prototype has not yet succeeded in implementing complex agreements because of the limitations of OWL DL and the query tools. OWL FULL promises better features; however, it is not yet mature.

• Automation of agreement development. Similarity calculations have utilized semi-automatic tools; however, the creation of an agreement unit still needs a human operator. A big challenge is to develop an automation tool which takes the common ontology and export schema as input and directly provides the agreement. The main problem is how to use external information and reduce expert/user feedback during the process. This issue has a long history in artificial intelligence.

• Agreement maintenance. In a dynamic and autonomous environment, the administrator must constantly monitor the common ontology and his export schema. Manual monitoring is very expensive and not scalable. It is important to develop techniques to automate the monitoring and repairing of agreements.

• Agreement composition. An agreement is carried out between a provider peer and the super peer. The agreements of several provider peers can be combined to develop an agreement composition directly between provider peers. The advantages of agreement composition are reduced loss of information and avoidance of super peer failure. The problem is to produce an inverse agreement. For example, an agreement of PP1 to SP and an agreement of PP2 to SP are combined to develop a direct agreement of PP1 to PP2. The agreement composition of PP1 to PP2 is equivalent to the agreement of PP1 to SP AND the inverse agreement of PP2 to SP.

• Performance. The prototype has not yet considered the performance of the system, such as speed of processing, availability and quality of service. In the real world, the performance factor is very important. Research in this area needs to be taken into account.


• Interoperability of temporal spatial information. The approach needs to be developed to handle temporal spatial information. Temporal spatial information is more complex because it changes dynamically over time.
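The agreement composition item above, in which a direct PP1 to PP2 agreement is obtained from the PP1 to SP agreement and the inverse of the PP2 to SP agreement, can be sketched for the simple 1:1 case. This is an illustration only: agreements are assumed to be plain mappings from a peer's concept to a common ontology concept, and the concept names and the functions invert and compose are hypothetical.

```python
def invert(agreement):
    """Reverse a 1:1 agreement; fail if two concepts map to the same target."""
    inverse = {}
    for source, target in agreement.items():
        if target in inverse:
            raise ValueError(f"{target!r} has no unique inverse")
        inverse[target] = source
    return inverse


def compose(pp1_to_sp, pp2_to_sp):
    """PP1 -> PP2 = (PP1 -> SP) followed by the inverse of (PP2 -> SP)."""
    sp_to_pp2 = invert(pp2_to_sp)
    # Concepts of PP1 whose common-ontology concept is unknown to PP2 are dropped
    return {c1: sp_to_pp2[sp] for c1, sp in pp1_to_sp.items() if sp in sp_to_pp2}


# Hypothetical 1:1 agreements of two provider peers with the common ontology
pp1_to_sp = {"SmallStreet": "SecondarySt", "Highway": "Motorway"}
pp2_to_sp = {"QuietStreet": "SecondarySt", "BusyStreet": "MainSt"}

print(compose(pp1_to_sp, pp2_to_sp))
```

The ValueError in invert marks where the difficulty mentioned in the item appears: as soon as an agreement is not 1:1, the inverse is no longer a function and a more expressive representation is needed.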


Bibliography

[1] Jose Luis Ambite, Naveen Ashish, Greg Barish, Craig A Knoblock, Steven Minton, Pragnesh J Modi, Ion Muslea, Andrew Philpot, and Sheila Tejada. ARIADNE: a System for Constructing Mediators for Internet Sources. http://www.isi.edu/sims/naveen/sigmod98.ps, March 1998.

[2] Yigal Arens, Chun-Nan Hsu, and Craig A Knoblock. Query processing in the SIMS information mediator. In Austin Tate, editor, Proc. of Advanced Planning Technology, pages 61-69. AAAI Press, Menlo Park, California, 1996.

[3] Madhan Arumugam, Amit Sheth, and I Budak Arpinar. Towards Peer-to-Peer Semantic Web: A Distributed Environment for Sharing Semantic Knowledge on the Web. http://www.science.unitn.it/ tomasi/think/pdf/towards.pdf, 2002.

[4] Avaki. http://www.avaki.com/, 2001.

[5] Todd S. Bacastow and Ben Lewis. Peer-to-Peer GIS. http://www.personal.psu.edu/faculty/t/s/tsb4/docs/p2pgis.doc.

[6] Thanh-Le Bach and Rose Dieng-Kuntz. Measuring Similarity of Elements in OWL DL Ontologies. Technical Report ACACIA Project, INRIA Sophia Antipolis, 2002.

[7] D. Beneventano, S. Bergamaschi, A. Fergnani, F. Guerra, M. Vincini, and D. Montanari. A Peer-To-Peer Agent-Based Semantic Search Engine. In Proc. of 11th Italian Symposium on Advanced Database Systems, Cetraro (CS), 24 June 2003.

116

Page 130: APEER-TO-PEER(P2P)BASEDSEMANTIC ...openstorage.gunadarma.ac.id/research/WorkGroup... · APEER-TO-PEER(P2P)BASEDSEMANTIC AGREEMENTAPPROACHFORSPATIAL INFORMATIONINTEROPERABILITY DISSERTATION

BIBLIOGRAPHY 117

[8] Richard Benjamins, Jesus Contreras, Oscar Corcho, and AsuncionGomez-Perez. Six Challenges for The Semantic Web. KR2002 (ISOCOWhite Paper), 2002.

[9] S. Bergamaschi, S. Castano, and M. Vincini. A Semantic Approach toInformation Integration: the MOMIS Project. In Proc. of AI*IA 98,Padova Italy, September 1998.

[10] S. Bergamaschi, S. Castano, and M. Vincini. Semantic Integrationof Semistructured and Structured Data Sources. SIGMOD Record,28(1):54�59, 1999.

[11] Sonia Bergamaschi and Domenico Beneventano. Momis: an IntelligentSystem for The Integration of Semistructured and Structured Data.Technical report, University of di Modena e Reggio Emilia T3-R08,Pedoa IT, 10 November 1998.

[12] L. Bernard, S Haubrock, S. Hubner, R Lessing, M Lutz, and U Visser.Semantic Interoperability by Means of Geoservices-Semantic Problemin three Use Cass and Approaches for Potential Solutions. In Proc. ofGEOTECHNOLOGIENScience Reprot, pages 1�16. Seite(n), 2003.

[13] Tim Berners-Lee, James Hendler, and Ora Lasila. The semantic web.Scienti�c American, May 2001.

[14] Arne-Jorgen Berre, Roy Gronmo, Hjordis Ho�, and Kim Lantz. DIS-GIS: an Interoperability Framework for GIS - Using theISO/TC 211Model-based Approach. In Proc. of Global Spatial Data Infrastructure(GSDI) 4, Cape Town, South Africa, 2000.

[15] Y Bishr. Overcoming the semantic and other barriers to gis interoper-ability. Int. Journal of Geographical Information Science, 12:299�314,1998.

[16] Yaser Bishr. Semantic Aspects of Interoperable GIS. PhD thesis, Wa-geningen Agricultural University, Netherland, 1997.

[17] Jos de Bruijn, Ying Ding, Sinuhe Arroyo, and Dieter Fensel.Semantic Information Integration in the COG Project.http://www.cogproject.org/publications/sii-wp.pdf, 2002.

Page 131: APEER-TO-PEER(P2P)BASEDSEMANTIC ...openstorage.gunadarma.ac.id/research/WorkGroup... · APEER-TO-PEER(P2P)BASEDSEMANTIC AGREEMENTAPPROACHFORSPATIAL INFORMATIONINTEROPERABILITY DISSERTATION

BIBLIOGRAPHY 118

[18] Diego Calvanese, Elio Damaggio, Giusepe De Giacomo, Maurizio Lenz-erini, and Riccardo Rosati. Semantic Data Integration in P2P Systems.In Proc. of International Workshop on Databse, Information Systemsand Peer-to-peer computing, Berlin, Germany, September 2003.

[19] M. J. Carey, L. M. Haas, and E. L. Wimmers. Towards HeterogeneousMultimedia Information Systems: the GARLIC Approach. In Proc. ofRIDE-DOM 95, volume 00, page 124, 1995.

[20] Silvana Castano, Valeria De Antonellis, and Sabrina De Capitanidi Vimercati. Global Viewing of Heterogeneous Data Sources. IEEETransactioTKDE, 13(2):277�297, 2001.

[21] Silvana Castano, Valeria De Antonellis, Sabrina De Capitani Di Vimer-cati, and Michele Melchiori. A Uniform Framework for Web Data Mod-eling and Integration. In Proc. of SEBD 2000, pages 153�166, 2000.

[22] Sudarshan Chawathe, Hector Garcia-Molina, and Jennifer Widom. TheTSIMMIS Project: Integration of Heterogeneous Information Sources.http://www.db.ucsd.edu/publications/tsimmis-overview.pdf, 1994.

[23] Chris Clifton, Ed. Housman, and Arnon Rosenthal. Experience witha Combined Approach to Attribute-Matching Across HeterogeneousDatabases. IFIP, 1997.

[24] S Castano De Antonellis, S De Caapitani di Vimercati, and M Mel-chiori. An XML-Based Framework for Information Integration Overthe Web. In Proceeding of IIWAS2000, 2000.

[25] Jos de Bruijn. Semantic Integration of Disparate Data Sources in theCOG Project. ICEIS 2004, 2004.

[26] DEI. Digital Earth Initiative, 2002. http://www.digitalearth.gov, 2002.

[27] Thomas Devogele, Christine Parent, and Stefano Spaccapietra. OnSpatial Database Integration. International Journal of Geographic In-formation System, Special Issue on System Integration, 12(3), 1998.

Page 132: APEER-TO-PEER(P2P)BASEDSEMANTIC ...openstorage.gunadarma.ac.id/research/WorkGroup... · APEER-TO-PEER(P2P)BASEDSEMANTIC AGREEMENTAPPROACHFORSPATIAL INFORMATIONINTEROPERABILITY DISSERTATION

BIBLIOGRAPHY 119

[28] Robin Dhamankar, Yoonkyong Lee, AnHai Doan, Alon Helevy, andPedro Domingos. IMAP: Discovering Complex Semantic Matches Be-tween Database Schemas. In Proc. of SIGMOD 2004, Paris, France,13 June 2004.

[29] Liping Di, Aijun Chen, Wenli Yang, and Peisheng Zhao. The Integra-tion of Grid Technology with OGC Web Services (OWS) in NWGISSfor NASA EOS data.

[30] Alan K Dippel and Fred Sadri. Web Query Languages, Intel-ligent Information Integration. https://www.cs.indiana.edu/ adip-pel/csc671/web_query_lang.htm, 30 November 1998.

[31] Hong-Hai Do and Erhard Rahm. COMA - a System for Flexible Com-bination of Schema Matching Approaches. VLDB, 2002.

[32] AnHai Doan, Pedro Domingos, and Alon Y. Levy. Data Integration:a Killer Application for Multistrategy Learning. In Proceedings of theWorkshop on Multi-Strategy Learning (MSL-00), Guimaraes, Portugal,2000.

[33] AnHai Doan, Jayant Madhavan, Pedro Domingos, and Alon Halevy.Learning to Map Between Ontologies On The Semantic Web.WWW2002, 2002.

[34] AnHai Doan and Robert McCann. Building Data Integration Systems:a Mass Collaboration Approach. WebDB2003, 2003.

[35] Deijing Dou, Drew McDermott, and Peishen Qi. On-tology Translation on the Semantic Web. http://cs-www.cs.yale.edu/homes/dvm/daml/ontomerge_odbase.pdf, 2003.

[36] Entropia. http://www.entropia.com/, 2004.

[37] ESRI. Metadata and GIS. http://www.esri.com/library/whitepapers/pdfs/metadata-and-gis.pdf, October 2002.

[38] ESRI. Spatial Data Standards and GIS Interoperability.http://www.esri.com/library/whitepapers/pdfs/spatial-data-standards.pdf, 2003 2003.

Page 133: APEER-TO-PEER(P2P)BASEDSEMANTIC ...openstorage.gunadarma.ac.id/research/WorkGroup... · APEER-TO-PEER(P2P)BASEDSEMANTIC AGREEMENTAPPROACHFORSPATIAL INFORMATIONINTEROPERABILITY DISSERTATION

BIBLIOGRAPHY 120

[39] ESRI. ArcGIS: Engineered for Interoperability.http://www.esri.com/library/whitepapers/pdfs/arcgis-engineered-for-interoperability.pdf, 2004.

[40] ESRI. ESRI Shape�le Technical Description.http://www.esri.com/library/whitepapers/pdfs/shape�le.pdf, 8May 2005.

[41] ESRI. Interoperabiliyt: ESRI's Support of OGC Speci�cations.http://www.esri.com/software/standards/ogc-support.html, 26 Au-gust 2005.

[42] Jerome Euzenat, Thanh Le Bach, Jesus Barasa, and etc. D2.2.3: Stateof the art on ontology alignment. Technical Report IST-2004-507482,knowledgeweb, 2 August 2004.

[43] Michael R. Genesereth, Arthur M. Keller, and Oliver M Duschka. In-foMaster: an Information Integration System. In Proc. of The ACMSIGMOD Conference, May 1997.

[44] Fausto Giunchiglia and Pavel Shvaiko. Semantic Matching. TechnicalReport DIT-03-013, University of TN, April 2003.

[45] Gnutella. http://www.gnutella.com, 2003.

[46] Cheng Hian Goh, Stuart E. Madnick, and Michael D. Siegel. SemanticInteroperability Through Context Interchange: Representing and Rea-soning about Data Con�icts in Heterogeneous and Autonomous Sys-tems. http://context2.mit.edu/coin/publications/coin/coin.pdf, 1995.

[47] T Gruber. Towards Principles for the Design of Ontologies Used forKnowledge Sharing. Int. Journal of Human-Computer Studies, 43:907�928, 1995.

[48] Amarnath Gupta, Richard Marciano, Ilya Zaslavsky, and ChaitanyaBaru. Integrating GIS and Imagery Through XML-Based InformationMediation. In P. Agouris and A.Stefanidis, editors, Selected Papersof Int. Work. Integrated Spatial Databases, ISD � Digital Images and{GIS}, number 1737 in Lecture Notes in Computer Scie, LNCS, Port-land USA, June 1999. Sprier Verlag.

Page 134: APEER-TO-PEER(P2P)BASEDSEMANTIC ...openstorage.gunadarma.ac.id/research/WorkGroup... · APEER-TO-PEER(P2P)BASEDSEMANTIC AGREEMENTAPPROACHFORSPATIAL INFORMATIONINTEROPERABILITY DISSERTATION

BIBLIOGRAPHY 121

[49] Farshad Hakimpour and Andreas Geppert. Resolving Semantic Het-erogeneity in Schema Integration: an Ontology Based Approach. InProc. of FOIS'01, Ogunquit, Maine, USA, 17 October 2001. ACM.

[50] Alon Y. Halevy, Zachary G. Ives, Peter. Mork, and Igor Tatarinov.Piazza: Data Management Infrastructure for Semantic Web Applica-tions. In Proc. of WWW2003, Budapes, Hungary, 20 May 2003.

[51] Alon Y. Halevy, Zachary G. Ives, Dan Sucio, and Igor Tatarinov.Schema Mediation in Peer Data Management Systems. ICDE2003,2003.

[52] Je� He�in, James Hendler, and Sean Luke. SHOE: a Blueprint forthe Semantic Web. www.cse.lehigh.edu/ he�in/pubs/swbook03.pdf,21 February 2003.

[53] Mauricio A. Hernandez, J. Renee, and Xuqing Tian. CLIO: a Semi-automatic Tool for Schema Mapping. In Proc. of SIGMOD Conference2001, 2001.

[54] Varelas Ioannis. Semantic Similarity Methods in WOrdNet and TheirApplication to Information Retrieval on The Web. In Proc of the 7thACM International workshop on Web Information and Data Manage-ment, pages 1�10, Brement, Germany, 30 June 2005.

[55] J. Ernst. Peer-to-Peer: The Infrastructure for the Semantic Web. InProc. of Int. Semantic Web Symposium. Stanford University, 6 July2001.

[56] F. Jouanot, C. Nicole, N. Cullot, and Y. Kokou. Information sourceinteroperability using context matching methodology. In Proc. of 1thInternational Conferene on Parallel and Distributed Computing Sys-tems (PDCS00), 8 August 2000.

[57] JXTA. http://www.jxta.org/, 2004.

[58] G. Karvounarakis, S. Alexaki, V. Christophides, D. Plexousakis, andM. Scholl. RQL: A Declarative Query Language for RDF (Slide).http://wwwconf.ecs.soton.ac.uk/archive/00000334/01/index.htm, 1January 2002.

Page 135: APEER-TO-PEER(P2P)BASEDSEMANTIC ...openstorage.gunadarma.ac.id/research/WorkGroup... · APEER-TO-PEER(P2P)BASEDSEMANTIC AGREEMENTAPPROACHFORSPATIAL INFORMATIONINTEROPERABILITY DISSERTATION

BIBLIOGRAPHY 122

[59] Anastasios Kementsietsidis, Marcelo Arenas, and Renee J. Miller. Map-ping Data in Peer-to-Peer Systems: Semantics and Algorithmic Issues.In Proc. of SIGMOD, 2003.

[60] Eva Klien, Udo Einspanier, Michael Lutz, and Sebastian Hubner. AnArchitecture for Ontology-based Discovery and Retrieval of GeographicInformation. In Proc. of GgI Jahrestagung (2), pages 574�578, 2004.BUSTER, Breman TZI.

[61] Eric Leclercq, Djamal Benslimane, and Kokou Yetongnon. ISIS: ASemantic Mediation Model and an Agent Based Architecture for GISInteroperability. IDEAS, pages 87�91, 1999.

[62] Eric Leclercq, Djamal Benslimane, and Kokou Yetongnon. Seman-tic Mediation between cooperative spatial information systems: TheAMUN data model. In Proc. of 6th Advances in Digital Libarries(ADL'99), pages 16�27, Baltimore, USA, 1999. IEEE Computer So-ciety. ISBN 0-7695-0219-9.

[63] Wen-Syan Li and Chris Clifton. Semantic integration in heterogeneousdatabases using neural network. In Proceedings of the 20th vldb con-ference, 1994.

[64] Jayant Madhavan, Philip A. Bernstein, and Erhard Rahm. GenericSchema Matching with CUPID. In Proc. of 27th Intl. Conf. on VeryLarge Data Bases, Very Large Data Bases, pages 49�58, 2001.

[65] Peter McBrien and Alexander Poulovassilis. De�ning Peer-to-PeerData Integration using Both as View Rules. In Proc. of 1st DISP2PIntl. Workshop, number 2994 in LNCS, pages 91�107, Berlin, Germany,2003.

[66] Deborah L. McGuinness, Richard Fikes, James Rice, and Steve Wilder.An environment for merging and testing large ontologies. In Proceedingof the sevent internatinal conference on principles of knowledge repre-sentation and reasoning (KR2000), 2000.

[67] E. Mena, V. Kashyap, A. Illarramendi, and A. Sheth. Domain Speci�cOntologies for Semantic Information Brokering on the Global Infor-

Page 136: APEER-TO-PEER(P2P)BASEDSEMANTIC ...openstorage.gunadarma.ac.id/research/WorkGroup... · APEER-TO-PEER(P2P)BASEDSEMANTIC AGREEMENTAPPROACHFORSPATIAL INFORMATIONINTEROPERABILITY DISSERTATION

BIBLIOGRAPHY 123

mation Infrastructure. In Proc. of 1st Conf. on Formal Ontogies inInformation Systems, Trento, Italy, June 1998.

[68] E Mena, V Kashyap, A Sheth, and A Illarramendi. Observer: anApproach for Query Processing in Global Information Systems basedon Interoperation Across pre-existing Ontologies. In Proc. of 1stIFCIS International Conference on Cooperative Infroamtion Systems(CoopIS'96), pages 14�25, Brusseles, Belgium, 1996.

[69] Renee J. Miller, Mauricio A Hernandez, and Lucian Popa. The CLIOProject: Managing Heterogeneity. In Proc. of ACM SIGMOD Record,number 30(1) in Interoperability Systems, pages 78�83, March 2001.

[70] D.S. Milojicic, V. Kalogeraki, R. Lukose, K. Nagaraja, J. Pruyne,B. Richard, S. Rollinx, and Z. Xu. Peer-to-peer Computing.www.hpl.hp.com/techreports/2002/HPL-2002-57.pdf, 2002.

[71] Prasenjit Mitra, Gio Wiederhold, and Jan Jannink. Semi-AutomaticIntegration of Knowledge Sources. In Proc. of the 2nd Int.Conf. OInformation FUSION'99, 1999.

[72] Napster. http://www.napster.com/, 2003.

[73] W. Nejdl, B. Wolf, W. Siberski, C. Qu, S. Decker, M. Sintek, A. Naeve,M. Nilsson, M. Palmer, and T. Risch. Edutella: P2P Networking forthe Semantic Web, 2002.

[74] W. Nejdl, B. Wolf, S. Staab, and J. Tane. Edutella: Searching andAnnotating Resources Within an RDF-Based P2P Network, 2001.

[75] Natalya F. Noy. Semantic Integration: a Survey of Ontology-BasedApproaches. In Proc. of Workshop on Semantic Integration at SWC-2003, number 13 in Semantic Integration, Sanibel Island, FL, 2003.

[76] Natalya Fridman Noy and Mark A. Musen. Smart: Automated Supportfor Ontology Merging and Alignment. In Proc. of the 12th Workshop onKnowledge Acquisition, Modelling, and Management (KAW'99), Banf,Canada, October 1999.

Page 137: APEER-TO-PEER(P2P)BASEDSEMANTIC ...openstorage.gunadarma.ac.id/research/WorkGroup... · APEER-TO-PEER(P2P)BASEDSEMANTIC AGREEMENTAPPROACHFORSPATIAL INFORMATIONINTEROPERABILITY DISSERTATION

BIBLIOGRAPHY 124

[77] Natalya Fridman Noy and Mark A Musen. Prompt: Algorithm andTool for Automated Ontology Merging and Alignment. American As-sociaton for Arti�cial Intelligence, 2000.

[78] OGIS. Open GIS Consortioum: GML 2.0 Recommendation Paper.http://opengis.net/gml/01-029/GML2.html, 2001.

[79] Open GIS Consortium, http://www.opengis.org/wmt. Web MappingTestbed Public Page, 2000.

[80] M T Ozsu and P Valduriez. Principles of Distributed Database Systems.Prentice Hall International Edition, 1991.

[81] Evaggelia Pitoura, Serge Abiteboul, Dieter Pfoser, George Samaras,and Michalis Vazirgiannis. DBGlobe: A Service-Oriented P2P Systemfor Globa Computing. Sigmond Record, 32(3):77�82, 2003.

[82] Lucian Popa, Mauricio A Hernandez, and Howard Ho. Mapping XMLand Relational Schemas with CLIO, 2002.

[83] Sushama Prasad, Yun Peng, and Timothy Finin. Using Explicit Infor-mation to Map between Two Ontologies. In Proc. of the AAMAS 2002Workshop on Ontologies in Agent Systems (OAS'02), Bologna, Italy,July 2002.

[84] Mitra Prasenjit, Wiederhold Gio, and Martin Kersten. A Graph-Oriented Model for Articulation of Ontology Interdependencies. InProc. of 7th International Conference on Extending Database Technol-ogy: Advances in Datbase Technology, volume 1777 of Lecture Notes InComputer Science, pages 86�100, 2000.

[85] Alun Preece, Kit Hui, Alex Gray, Philippe Marti, Trevor Bench-Capon,Dean Jones, and Zhan Cui. The KRAFT Architecture for KnowledgeFusion and Transformation. Knowledge Based Systems, 13(2-3):113�120, 2000.

[86] Venkatesh Raghavan, Shinji Masumoto, Phisan Santitamont, andKiyoshi Honda. Implementing an Online Spatial Database Using theGRASS GIS Environment. In Proceedings of the Open source GIS-GRASS User Confernce 2002, Trento, Italy, 11 September 2002.

Page 138: APEER-TO-PEER(P2P)BASEDSEMANTIC ...openstorage.gunadarma.ac.id/research/WorkGroup... · APEER-TO-PEER(P2P)BASEDSEMANTIC AGREEMENTAPPROACHFORSPATIAL INFORMATIONINTEROPERABILITY DISSERTATION

BIBLIOGRAPHY 125

[87] Erhard Rahm and Philip A. Bernstein. A Survey of Approaches toAutomatic Schema Matching. The VLDB Journal, 10:334�250, 2001.

[88] SETI@Home. http://setiathome.ssl.berkeley.edu/, 2001.

[89] Amit . Sheth. Changing Focus on Interoperability in Information Sys-tems: from System, Syntax, Structure to Semantic. In M.F. Goodchild,M.J. Egenhofer, R. Fegeas, and C.A. Kotman, editors, Proc of. Inter-operating Geographic Information Systems, pages 5�29, Norwell, 1999.Kluwer Academic Publisher.

[90] Nuno Silva and Joao Rocha. Ontology Mapping for Interoperability inSemantic Web. ICWI 2003, 2003.

[91] Leonid Stoimenov and Slobodanka Djordjevic-Kajan. GeoNis- Frame-work for GIS Interoerability, Integration of data sources in TelecomSerbia. 5th Int Conf. on Enterpraise Inf.Systems, ICEIS'03, 23 April2003.

[92] Leonid Stoimenov, Slobodanka Djordjevic-Kajan, and Dragan Sto-janovic. Integration of GIS Data Sources over the Internet using Me-diator and Wrapper Technology. 10th Mediterranean ElectrotechnialConf., MELECON2000, 1:334�336, 29 May 2000.

[93] Leonid Stoimenov and Slobodanka Dodevic-Kajan. Framework for Se-mantic GIS Interoperability. Ser. Math. Inform., 17:107�125, 5 2002.

[94] Leonid Stoimenov, Milan Petkovic, Dragan Stojanovic, SlobodankaDjordjevic-Kajan, Antonija Mitrovic, and Dejan Mitrovic. Develop-ment of Telecom GIS Application Using an Active, Object-OrientedGIS Eenvironment. In Proc. of 1st Intl. Workshop On Telegeoprocess-ing GIS + Telecom, Lyon, France, 6 May 1999.

[95] Ana Stojanovic and Leonid Stoimenov. GinisWeb-the tool for GISApplication Development based on Web. 10th Mediterranean Elec-trotechnical Conf., MELECON2000, 1:331�333, 29 May 2000.

[96] Heiner Stuckenschmidt and Holger Wache. Context Modeling andTransformation for Semantic Interoperability, 2000.

Page 139: APEER-TO-PEER(P2P)BASEDSEMANTIC ...openstorage.gunadarma.ac.id/research/WorkGroup... · APEER-TO-PEER(P2P)BASEDSEMANTIC AGREEMENTAPPROACHFORSPATIAL INFORMATIONINTEROPERABILITY DISSERTATION

BIBLIOGRAPHY 126

[97] Gerd Stumme and Alexander Maedche. FCA-MERGE:bottom-up merging of ontologies. www.idi.ntnu.no/ xi-aomeng/paper/xiaomeng.bib, 2001.

[98] V. S. Subrahmanian, Sibel Adali, and Charles Ward. HER-MES: a Heterogeneous Reasoning and Mediator Systems.www.cs.umd.edu/projects/hermes/ overview/paper, 1996.

[99] Egemen Tanin, Frantisek Brabec, and Hanan Samet. Remote Accessto Large Spatial Databases. GIS'02, 8 2001.

[100] Igor Tatarinov, Zachary Ives, Jayant Madhavan, Alon Halevy, Dan Su-ciu, Nilesh Dalvi, Xin Luna Dong, Yana Kadiyska, Miklau Gerome, andPeter Mork. The Piazza Peer Data Management Project. Sigmodrec2003, 2003.

[101] Michael Uschold and Michael Gruninger. Creating Semantically Inte-grated Communities on the World Wide Web. In Proc. of SemanticWeb Workshop, Honolulu, May 2002.

[102] Kim H. Veltman. Syntactic and Semantic Interoperability: New Ap-proaches to Knowledge and the Semantic Web. The New Review ofInformation Networking, 7, 2001.

[103] U. Visser and H. Stuckenschmidt. Interoperability in GIS - EnablingTechnologies. In M. Ruiz, M. Gould, and J. Ramon, editors, Proceed-ings of the 5th AGILE Conference on Geographic Information Science,pages 291�297, Palma de Mallorca, Spain, 2002.

[104] U. Visser, H. Stuckenschmidt, H. Wache, and T. Vogele. Using Envi-ronmental Information E�ciently: Sharing Data and Knowledge fromHeterogeneous Sources. Environmental Information Systems in Indus-try and Public Administration, pages 41�73, 2001.

[105] U. Visser, S. Stuckenschmidt, G. Schuster, and T. Vogele. Ontologiesfor Geographic Information Processing. Computers & Geoscience, 28,issue 1:103�117, 2002.

Page 140: APEER-TO-PEER(P2P)BASEDSEMANTIC ...openstorage.gunadarma.ac.id/research/WorkGroup... · APEER-TO-PEER(P2P)BASEDSEMANTIC AGREEMENTAPPROACHFORSPATIAL INFORMATIONINTEROPERABILITY DISSERTATION

BIBLIOGRAPHY 127

[106] Ubbo Visser, Thomas Vogele, and Christoph Schlieder. Spatio-Terminological Information Retrieval using the BUSTER System. InProc. of Environmental Informatcs 2002, Vienna, Austria, 2002.

[107] Thomas Vogele and Christoph Schlieder. The Use of Spatial Metadatafor Information Retrieval in Peer-to-Peer Networks. In Proc. of 5thAGILE Conference Geographic Informatin Science, Palma, BalearicIsland, Spain, 25 April 2002.

[108] W3C. World Wide Web Consortiu. http://www.w3.org/, 2004.

[109] Holger Wache. Towards Rule-based Context Transformation in Me-diators. In S Conrad, W Hasel, and G Saabs, editors, Proc. of In-ternational Workshop on Engineering Federated Information System(EFIS'99), Kuhlungsborn, Germany, 1999. In�x-Verlag.

[110] Huiyong Xiao, Isabel F Cruz, and Feihong Hsu. Semantic Mappings forthe Integration of XML and RDF Sources. In Proc. of IIWEB-2004,30 August 2004.

[111] Zhu Xu and Y.C. Lee. Network-enabling GIS: Issues, Models and Re-view. In Proc. of Symposium on Geospatioal Theory, Processing andApplication, Ottawa, 2002.

[112] Zhu Xu and Y.C. Lee. Semantic Heterogeneity of Geodata. In Proc.of Symposium on Geospatioal Theory, Processing and Application, Ot-tawa, 2002.

[113] Kokou Yetongnon, Djamal Benslimane, Robert Laurini, Eric Leclercq,Fabrice Jouanot, and Nadine Cullot. GIS Interoperability: AnOverview. In Proc. of 2nd International Workshp on Telegeoprocessing,Nice-Sophia Antpolis, France, 10 May 2000.