Behshid Behkamal Ferdowsi University of Mashhad Web Technology Lab.

Date posted: 21-Dec-2015

Transcript of Behshid Behkamal Ferdowsi University of Mashhad Web Technology Lab.

Page 1: Behshid Behkamal Ferdowsi University of Mashhad Web Technology Lab.

Behshid Behkamal

Ferdowsi University of Mashhad, Web Technology Lab

Page 2:

Outline

1. Linked Data: What and Why

2. Linking Open Data Project

3. An Experience of Publishing Linked Data

4. Challenges of Linked Data

Page 3:

The problem: Doing a PhD at a good university

Page 4:

The problem: Planning a trip

Page 5:

The Web of Documents

The Web of Data

Page 6:

Linked Data is …

A way of publishing data on the web to create typed links between data from different sources

Wikipedia:

A term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF

Page 7:

Linked Data employs …

URIs to identify resources

The HTTP protocol to retrieve resources

The RDF data model to represent resources

Page 8:

URI: Uniform Resource Identifier

As names for any entity existing in the world

A simple and extensible means for identifying a resource

For documents and other entities that can be located on the Web

Page 9:

HTTP

A simple mechanism for retrieving:

Resources that can be serialized as a stream of bytes
- E.g. a picture of an animal

Descriptions of entities that cannot themselves be sent across the network
- E.g. a description of a rabbit

Page 10:

RDF Model

It is machine-understandable metadata

A model is a collection of statements

Statement := (subject, predicate, object)

Subject: URI

Predicate: URI

Object: URI / Literal
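
As a rough illustration of the statement model on this slide, the sketch below represents a statement as a three-part tuple and checks that the subject and predicate are URIs while the object is a URI or a literal. The class names `URI`, `Literal` and `Statement`, the helper `is_valid`, and the `example.org` identifiers are all assumptions made for this example; they do not come from the presentation or from any RDF library.

```python
from typing import NamedTuple, Union


class URI(str):
    """A resource identifier (a plain string subclass, for illustration only)."""


class Literal(str):
    """A plain data value such as a name or a number written as text."""


class Statement(NamedTuple):
    subject: URI
    predicate: URI
    object: Union[URI, Literal]


def is_valid(statement: Statement) -> bool:
    """Subject and predicate must be URIs; the object may be a URI or a literal."""
    return (
        isinstance(statement.subject, URI)
        and isinstance(statement.predicate, URI)
        and isinstance(statement.object, (URI, Literal))
    )


# Hypothetical example.org identifiers; only the shape of the triple matters here.
model = {
    Statement(
        URI("http://example.org/person/behkamal"),
        URI("http://example.org/vocab/name"),
        Literal("Behshid Behkamal"),
    ),
}
```

A model is then just a set of such statements, which matches the slide's "collection of statements" wording.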

Page 11:

Linking Statements

The subject of one statement can be the object of another

(Behkamal, Student Of, Computer Engineering)

(Computer Engineering, Department Of, Ferdowsi University of Mashhad)

(Ferdowsi University of Mashhad, Has HomePage, www.um.ac.ir)

Such collections of statements form the LOD cloud
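
The chaining idea on this slide can be sketched in a few lines: because the object of one statement is the subject of the next, a program can hop from statement to statement. The predicate spellings and the `follow` helper are assumptions for illustration, and the sketch assumes at most one statement per subject.

```python
# The three statements from the slide, written as (subject, predicate, object) tuples.
statements = [
    ("Behkamal", "studentOf", "Computer Engineering"),
    ("Computer Engineering", "departmentOf", "Ferdowsi University of Mashhad"),
    ("Ferdowsi University of Mashhad", "hasHomePage", "www.um.ac.ir"),
]


def follow(start, statements):
    """Walk from `start`, hopping to the object of the statement whose subject we are on."""
    by_subject = {s: (p, o) for s, p, o in statements}  # assumes one statement per subject
    path, current, seen = [start], start, {start}
    while current in by_subject:
        _, current = by_subject[current]
        if current in seen:  # guard against cycles
            break
        seen.add(current)
        path.append(current)
    return path
```

Calling `follow("Behkamal", statements)` walks from the student through the department and the university to the home page.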

Page 12:

LOD Cloud: May 2007

Page 13:

Linked Open Data (LOD) Cloud

Each circle represents a dataset

The size of a circle corresponds to the number of triples

Arrows represent the links between datasets

The thickness of an arrow indicates the number of links

Datasets are published based on Linked Data Principles

Each dataset is interlinked with at least one other dataset in the cloud

Some datasets act as hubs, e.g. DBpedia, Geonames, …

Page 14:

LOD Cloud: February 2008

Page 15:

LOD Cloud: July 2009

Page 16:

LOD Cloud: September 2010

Page 17:

LOD cloud: 2007 - 2010

Datasets: from only 11 to more than 200

Triples: from around 2 billion to over 25 billion

Links: from over 2 million to around 395 million

Page 18:

Page 19:

Process of publishing FUM-LD

1. Selecting Target Data

2. Assigning URIs

3. Publishing Data

4. Interlinking Data Resources

Page 20:

1. Selecting Target Data

- An important issue in publishing linked data is deciding which data to select

- After studying the FUM database, five important entities were selected:

Faculty, Department, Professor, Paper, Course

Page 21:

2. Assigning URIs

- We used a simple schema for naming URIs

- http://wtlab.um.ac.ir/LinkedOpenData/TYPE/ID

Faculty: http://wtlab.um.ac.ir/LinkedOpenData/faculties/engineering

Department: http://wtlab.um.ac.ir/LinkedOpenData/departments/3105

Professor: http://wtlab.um.ac.ir/LinkedOpenData/profs/kahani

Paper: http://wtlab.um.ac.ir/LinkedOpenData/papers/13

Course: http://wtlab.um.ac.ir/LinkedOpenData/courses/23152183
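
The TYPE/ID naming schema can be sketched as a small helper. The base URI and the path segments are taken from the example URIs on this slide; the function name `make_uri`, the `TYPE_SEGMENTS` table, and the choice to percent-encode the ID are assumptions for illustration, not a description of how FUM-LD was actually implemented.

```python
from urllib.parse import quote

BASE = "http://wtlab.um.ac.ir/LinkedOpenData"

# Path segment used for each entity type, as seen in the example URIs on the slide.
TYPE_SEGMENTS = {
    "Faculty": "faculties",
    "Department": "departments",
    "Professor": "profs",
    "Paper": "papers",
    "Course": "courses",
}


def make_uri(entity_type: str, entity_id) -> str:
    """Build BASE/TYPE/ID; raises KeyError for an unknown entity type."""
    segment = TYPE_SEGMENTS[entity_type]
    # quote() percent-encodes characters that are unsafe inside a path segment.
    return f"{BASE}/{segment}/{quote(str(entity_id), safe='')}"
```

For instance, `make_uri("Professor", "kahani")` reproduces the professor URI listed on the slide.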

Page 22:

3. Publishing Data

Methods for publishing
- Using existing tools
- Developing an application

Modes of publishing
- Online (on the fly)
- Offline

Page 23:

4. Linking Data Resources

Providing links to other resources inside and outside FUM-LD is a main design target of our web presentation

4.1. Interlinking to FUM-LD Resources

4.2. Linking to External Datasets: DBpedia, Geonames, YAGO, DBLP, …

Page 24:

4.1. Interlinking to FUM-LD Resources

Interlinked entity types (diagram): Faculties, Departments, Profs, Courses, Papers

Page 25:

4.2. Linking to Other Resources

Dataset     Description           Count
DBpedia     Links to resources    4570
DBpedia     owl:sameAs links      1311
DBLP        owl:sameAs links      475
ACM         owl:sameAs links      38
DBpedia     skos:subject links    3708
OpenCyc     skos:subject links    449
GeoNames    Links to resources    936
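
To make the owl:sameAs rows concrete, the sketch below writes one such link as an N-Triples line. The professor URI comes from the naming examples earlier in the talk; the external URI is a placeholder on `example.org`, not an actual DBLP or DBpedia identifier, and the helper name `same_as_ntriple` is likewise an assumption.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_ntriple(local_uri: str, external_uri: str) -> str:
    """Serialize one owl:sameAs link as a single N-Triples line."""
    return f"<{local_uri}> <{OWL_SAME_AS}> <{external_uri}> ."


line = same_as_ntriple(
    "http://wtlab.um.ac.ir/LinkedOpenData/profs/kahani",
    "http://example.org/external/kahani",  # placeholder, not a real external URI
)
```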

Page 26:

Page 27:

Classification of Challenges

Data Challenges

Multilingual Challenges

Linking Challenges

Data and Link Maintenance

Page 28:

1. Linking Challenges

Choosing appropriate ontologies and predicates
- Selecting based on popularity
- Not effective for all cases
- No automatic approach

Creating appropriate links between data
- Finding appropriate links between resources of different datasets
- E.g. ‘Hasan Amiri’, ‘S.H. Amiri’ and ‘Amiri H.’
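
One way to see why these three name forms are hard to link automatically is to try a naive heuristic. The sketch below reduces each name to a surname plus a set of initials and flags two names as candidates when the surname matches and they share an initial. This is an assumed heuristic for illustration only, not the matching approach used in FUM-LD. It would misread a full name written surname-first, and it would also match different people who share a surname and an initial.

```python
import re


def name_key(name: str):
    """Reduce a name to (surname, initials) using a simple heuristic.

    Single letters are treated as initials; the last longer token is taken as
    the surname, and any earlier longer token contributes its first letter.
    """
    tokens = [t.lower() for t in re.split(r"[\s.,]+", name) if t]
    words = [t for t in tokens if len(t) > 1]
    initials = {t for t in tokens if len(t) == 1}
    if not words:
        return None, initials
    surname = words[-1]
    initials |= {w[0] for w in words[:-1]}
    return surname, initials


def may_be_same_person(a: str, b: str) -> bool:
    """Candidate match: same surname and at least one shared initial."""
    surname_a, initials_a = name_key(a)
    surname_b, initials_b = name_key(b)
    return (
        surname_a is not None
        and surname_a == surname_b
        and bool(initials_a & initials_b)
    )
```

Under these assumptions all three spellings on the slide come out as candidates for the same person, while a name such as "Ali Amiri" does not.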

Page 29:

2. Data Challenges: lack of data or presence of low-quality data

Lack of required data in the original dataset
- No information on the publications of some professors

Incorrect or incomplete data
- Abstract, keywords, or list of coauthors of the papers

Inconsistency of data values
- Different formats (e.g. the date of a conference)
- Different languages

Page 30:

3. Multilingual Challenges

Linking a non-English dataset to the related English datasets of the LOD cloud.

Multi-lingual
- Some users choose their mother tongue while others use English for entering their data, because of:
  - Convenience of users
  - Users’ field of activity

Multi-spelling
- A Persian name like “سعید” is entered as “saeed”, “saeid” and “saied”.
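
A rough way to catch such spelling variants is to compare romanized names by string similarity. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the threshold of 0.75 and the helper names are assumptions chosen for this example, and a real deployment would need tuning against actual data.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] between two romanized spellings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def same_name_candidates(names, threshold=0.75):
    """Return the pairs of spellings whose similarity reaches the threshold."""
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if similarity(first, second) >= threshold:
                pairs.append((first, second))
    return pairs
```

With this threshold the three spellings from the slide pair up with each other, while an unrelated name such as "hasan" is not paired with "saeed".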

Page 31:

4. Data and Link Maintenance

An important issue in maintaining the quality of data published as linked data is:

Updating data as well as links between the data items

Two main situations require updating the dataset:

The original dataset is changed
- E.g. a new professor joins the Computer Engineering department

A related external dataset is changed
- E.g. a new resource describing ‘Computer Engineering Department of Ferdowsi University of Mashhad’ is added to DBpedia
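
Detecting the first situation can be sketched as a comparison of two snapshots of the dataset, each held as a set of triples. The snapshot contents, the `memberOf` predicate, and the identifiers below are hypothetical and only illustrate the idea; the talk does not describe how updates were detected in FUM-LD.

```python
def diff_snapshots(old, new):
    """Compare two snapshots (collections of triples) and report additions and removals."""
    old, new = set(old), set(new)
    return {"added": new - old, "removed": old - new}


# Hypothetical snapshots: a new professor joins a department.
before = {("profs/kahani", "memberOf", "departments/3105")}
after = before | {("profs/new_prof", "memberOf", "departments/3105")}

changes = diff_snapshots(before, after)
```

The triples in `changes["added"]` are the ones that would need to be published, along with any links to other datasets that depend on them.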

Page 32:

Summary

Linked Data and the LOD cloud were introduced

An experience of publishing linked data was presented

Some problems and challenges were discussed

Page 33:

Behshid Behkamal