1 ICS-FORTH Describing Resources on the Web: The Resource Description Framework Vassilis Christophides Dimitris Plexousakis Computer Science Department, University of Crete Institute for Computer Science - FORTH Heraklion, Crete http://www.ics.forth.gr/proj/isst/RDF


1

ICS-FORTH

Describing Resources on the Web: The Resource Description Framework

Vassilis Christophides, Dimitris Plexousakis

Computer Science Department, University of Crete
Institute for Computer Science - FORTH

Heraklion, Crete http://www.ics.forth.gr/proj/isst/RDF

2


Introduction to Metadata

meta

data

3

What is the Problem?

3.6 million Web sites
Five hundred million or more addressable pages on the Web
High consumer expectations conflicting with primitive tools and mechanisms
Uncertain quality, integrity, trust

4

The Information Landscape in the Web-era

The Web changes relationships among
  authors
  publishers
  information intermediaries and distributors
  users
Lower barriers to “publication”
  rapid dissemination of information and ideas
  less advantage to size or centralization
  greatly expanded access
Manageability is reduced
  resource discovery is chaotic
  organization is haphazard
  preservation is almost non-existent

5

The Web Information System vs. Traditional Libraries

Search systems are motivated by advertising
Index coverage is unpredictable and limited (1/3)
Too much recall, too little precision
Index spam abounds
Resources (and their names) are volatile
What about versions, editions, back issues?
Archiving is presently unsolved
Authority and quality of service are spotty
Managing Access Rights is hard

6

Metadata: Higher Quality Web Information Services

Traditionally, metadata has been understood as “Data about Data”
  it helps to impose order on chaos
Examples:
  a library catalogue contains information (metadata) about publications (data)
  a file system maintains permissions (metadata) about files (data)
Metadata describes other data
  One application’s metadata is another application’s data
  Metadata can itself be described by metadata (but that doesn’t make it meta-metadata)
Example:
  Price lists (metadata) have expiration dates: metadata about metadata (it is still just metadata!)

7

Metadata takes Many Forms

resource discovery
document administration
rights management
content rating
security and authentication
archival status
products and services
database schemas
process control or description

8


Metadata exists for Almost Anything

People

Places

Objects

Concepts

Documents

Archives

Databases

9

Application: Item and Collection Cataloguing

Describing individual resources
  documents, pages, images, audio files, etc.
Describing the content of collections
  Web sites, databases, directories, etc.
Relationships among Resources
  Tables of Content, chapters, images…
  Site Maps

10

Application: Resource Discovery

Search engines can better “understand” the contents of a particular page
More accurate searches
  Additional information aids precision
Makes it possible to automate searches because less manual “weeding” is needed to process the search results

11

Application: Electronic Commerce

Metadata can be used to encode information needed in all stages of electronic commerce
  locating seller/buyer & product
    searching “yellow pages”
  agreeing on terms of sale
    prices, terms of payment, contractual information
  transactions
    delivery mechanisms, dates, terms

[Figure: Broker, Market place, Providers/Clients]

12

Application: Intelligent Agents

Representation and sharing of knowledge
  knowledge exchange
  modeling
Communication
  user-to-agent, agent-to-agent, agent-to-service
Resource discovery
  gives web-roaming agents the ability to “understand” their environment

[Figure: places and a service]

13

Application: Content Rating

Empowering users to select which kinds of web content they wish to see
Child Protection
W3C PICS (Platform for Internet Content Selection) working group
  US Communications Decency Act of 1996
  simple metadata architecture
  precursor to RDF

14

Application: Digital Signatures

These are key to building the “Web of Trust”
Required by
  agents
  electronic commerce
  collaboration
RDF will become the preferred way to encode digital signatures on documents and on statements about documents

15

Other Applications

Privacy Preferences and Policies
  describing a user’s willingness/reluctance to disclose information about him/her-self
  describing a site administrator’s desire to gather information about visiting users
Intellectual Property Rights
  contractual terms related to usage and distribution rights to a document

16

(Meta)Data Transmission Methods

Embedded (e.g. META)
Associated with (in HTTP header)
Trusted third party (explicit HTTP GET)
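As an illustration of the embedded method, the following minimal sketch reads name/content pairs from HTML META tags using only the Python standard library. The sample page and the `DC.Title` / `DC.Creator` names are assumptions made for the example; they do not come from the slides.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name is not None and content is not None:
            self.metadata[name] = content


# Hypothetical page carrying embedded, Dublin Core style metadata.
PAGE = """<html><head>
<meta name="DC.Title" content="RDF Presentation">
<meta name="DC.Creator" content="Vassilis Christophides">
</head><body>...</body></html>"""


def extract_metadata(html_text):
    """Return the embedded metadata of a page as a dictionary."""
    parser = MetaExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.metadata
```

The other two methods differ only in where the description travels: in an HTTP header sent with the resource, or in a separate document fetched from a third party.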

17

Metadata Assertions

The Web is “machine-readable” but not “machine-understandable”
Metadata is useful
  A lot could be gained from structured description of pages, servers, search services, and other resources
Accommodate multiple varieties of metadata
Metadata requirements will evolve

18

A Plethora of Metadata Standards

Many metadata standards have evolved at different levels, and to meet different requirements...

19

Interoperability Issues

Semantic Interoperability: standardisation of content
  “Let’s talk English”
  “cat milk sat drank mat”
Structural Interoperability: standardisation of form
  “Here’s how to make a sentence”
  “Cat sat on mat. Drank milk.”
Syntactic Interoperability: standardisation of expression
  “These are the rules of grammar”
  “The cat sat on the mat. It drank some milk.”

20

Metadata Challenges

Many flavours of metadata
  which one do I use?
Managing change
  new varieties, and evolution of existing forms
Tension between functionality and simplicity, extensibility and interoperability
  Functions, features, and cool stuff
  Simplicity and interoperability

21

Towards Metadata for Community Webs

Group of people sharing a domain of discourse and a set of resources (e.g., data, documents, services) and having some common interests
  Commerce, Education, Health
Provide community-specific metadata functionality in order to create, administrate, and access resources
  common semantic, structural, and syntactic conventions for exchange of resource description information

[Figure: Community Webs: Education, Health, Commerce, Workplace]

22

Metadata Interoperability in Community Webs

Communities of expertise (not software vendors) are responsible for:
  Semantics
  Registration
  Administration
  Access management
  Authority of data
  Sharing and Distribution

[Figure: Community Webs: Scientific Data, Home Pages, Geo, Library, Museums, Commerce, Whatever...]

23

Metadata Implementation Approaches

Harvesting metadata into a repository (database)
Distributed Database Search

24

Harvesting Metadata into a Repository (database)

[Figure: HTML, XML and other types of resources; Harvester; Repository; Query; dynamic document creation from database; retrieve resource]

25

Distributed Database Search

[Figure: Query; Z39.50 Gateway; three Z39.50 Servers; retrieve resource]

26


Understanding RDF

RDF

27

RDF origins

W3C Metadata Activity 1997-2000
PICS (Internet content selection)
Warwick Framework / Dublin Core
XML (XML Data, Channels etc)
MCF (Apple, Netscape)
URI specification for Web identifiers

28

RDF Objectives

Enables resource description communities to define their own semantics
  We can disagree about semantics, but share infrastructure (syntax, query, editors)
Imposes structural constraints on the expression of various application metadata
  for consistent encoding, exchange and processing of metadata on the Web
Metadata vocabularies can be developed without central coordination
Fine-grained mixing of diverse metadata
Signed RDF is the basis for trust
XML used for ‘serialisation syntax’

29

Describing Community Resources using RDF

[Figure: Advanced Knowledge Schemas (ontologies, thesauri); heterogeneous resource descriptions (XML fragments); complexity and diversity of information resources]

30

The Basic RDF Data Model

RDF: Resource Descriptions
Data Model: Directed Labeled Graphs
  Nodes: Resources (URIs) or Literals
  Edges: Properties – Attributes or Relationships
  Statement: assertion of the form resource, property, value
  Description: set of statements concerning a resource
XML syntax

31

The Basic RDF Data Model: Primitives

[Figure: a Statement consisting of a Resource, a Property and a Value; the Value may be another Resource]

32

Simple Example

[Figure: graph with nodes URI:Tutorial, “Vassilis” and URI:Vassilis, and edge label Author]

33

The notion of Resource

A resource is identified by a URI: [absoluteURI | relativeURI] [“#” fragment-id]
The resource identified by a URI may be abstract, i.e. not network retrievable
Resource is distinct from the entity resolved at any particular time
  http://www.ics.forth.gr/RDF/

From RFC 2396:
Resource: A resource can be anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources. The resource is the conceptual mapping to an entity or set of entities, not necessarily the entity which corresponds to that mapping at any particular instance in time. Thus, a resource can remain constant even when its content---the entities to which it currently corresponds---changes over time, provided that the conceptual mapping is not changed in the process.
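The URI form given above (an absolute or relative reference plus an optional fragment identifier) can be explored with Python's standard `urllib.parse` module. This is only a sketch: the base URI is the tutorial address from the title slide, while the references `#Tutorial` and `schema.rdf#Painter` are made-up examples.

```python
from urllib.parse import urldefrag, urljoin

# Base URI taken from the title slide of the tutorial.
BASE = "http://www.ics.forth.gr/proj/isst/RDF"


def resolve(reference, base=BASE):
    """Resolve a possibly relative URI reference, then split off its fragment."""
    absolute = urljoin(base, reference)
    uri, fragment = urldefrag(absolute)
    return uri, fragment
```

For instance, `resolve("#Tutorial")` keeps the base document and reports `Tutorial` as the fragment, while `resolve("schema.rdf#Painter")` first replaces the last path segment of the base.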

34

RDF Syntax

RDF Model defines formal relationships among resources, properties and values
Syntax is required to...
  Store instances of the model into files
  Communicate files from one application to another
W3C XML: eXtensible Markup Language

35

RDF Model Example: Complex Values

[Figure: URI:Tutorial has dc:Title “RDF Presentation” and dc:Creator “Vassilis Christophides”. In the complex-value variant, dc:Creator points to a node with bib:Name “Vassilis Christophides”, bib:Email [email protected] and bib:Aff URI:FORTH (“ICS-FORTH”)]

<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:dc="http://purl.org/dc/elements/1.0/">
  <Description about="URI:Tutorial">
    <dc:Title>RDF Presentation</dc:Title>
    <dc:Creator>Vassilis Christophides</dc:Creator>
  </Description>
</RDF>

36

RDF Syntax Example: Complex Values

<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:dc="http://purl.org/dc/elements/1.0/"
     xmlns:bib="http://www.bib.org/persons#">
  <Description about="URI:Tutorial">
    <dc:Title>RDF Presentation</dc:Title>
    <dc:Creator>
      <Description>
        <bib:Name>Vassilis Christophides</bib:Name>
        <bib:Email>[email protected]</bib:Email>
        <bib:Aff resource="http://www.ics.forth.gr" />
      </Description>
    </dc:Creator>
  </Description>
</RDF>

Abbreviated syntax for the nested Description:

<Description bib:Name="Vassilis Christophides"
             bib:Email="[email protected]">
  <bib:Aff resource="http://www.ics.forth.gr" />
</Description>
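To make the link between this syntax and the graph model concrete, here is a rough sketch that walks a nested description with Python's `xml.etree.ElementTree` and lists the statements it contains. It is not a conforming RDF parser: it handles only `Description` elements with `about`, literal properties, `resource` attributes and one level of nesting per property. The generated labels such as `_:node1` for descriptions without an `about` attribute are an assumption of this example, and the e-mail property is left out.

```python
import itertools
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/TR/WD-rdf-syntax#"

DOC = """<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:dc="http://purl.org/dc/elements/1.0/"
     xmlns:bib="http://www.bib.org/persons#">
  <Description about="URI:Tutorial">
    <dc:Title>RDF Presentation</dc:Title>
    <dc:Creator>
      <Description>
        <bib:Name>Vassilis Christophides</bib:Name>
        <bib:Aff resource="http://www.ics.forth.gr"/>
      </Description>
    </dc:Creator>
  </Description>
</RDF>"""


def to_triples(xml_text):
    """Return (subject, predicate, object) tuples for nested Descriptions."""
    triples = []
    counter = itertools.count(1)

    def describe(description):
        # Descriptions without "about" get a generated label (an assumption).
        subject = description.get("about") or f"_:node{next(counter)}"
        for prop in description:
            # ElementTree writes qualified names as "{namespace}local".
            predicate = prop.tag.replace("{", "").replace("}", "")
            nested = prop.find(f"{{{RDF_NS}}}Description")
            if nested is not None:
                triples.append((subject, predicate, describe(nested)))
            elif prop.get("resource") is not None:
                triples.append((subject, predicate, prop.get("resource")))
            else:
                triples.append((subject, predicate, (prop.text or "").strip()))
        return subject

    root = ET.fromstring(xml_text)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        describe(description)
    return triples
```

Calling `to_triples(DOC)` yields four statements: two about URI:Tutorial and two about the anonymous creator node.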

37

RDF Model Example

[Figure: the URI:Tutorial graph (dc:Title “RDF Presentation”; dc:Creator with bib:Name “Vassilis Christophides”, bib:Email [email protected] and bib:Aff URI:FORTH, “ICS-FORTH”), extended with the administrative properties admin:By “STEP”, admin:On “01-01-01” and admin:For “...”]

38

Where do you stop?

The Basic RDF model & syntax provides enabling technology
Degree of metadata simplicity/complexity is a matter of:
  Resource description communities’ needs, best-practice and experience
  Organization/Institution’s Policy
  Economics
  Goals and requirements of implementation

39

The Basic RDF Data Model: In Brief

Nodes are resources connected by named properties
  R1 --P1--> R2
The degenerate case is an arc terminating in a fixed value
  R1 --P1--> “foo”
An RDF description consists of a directed graph of arbitrary complexity
  [Figure: resources R1 to R8 connected by properties P1 to P7]

40

One Additional Concept: Container Values

Containers are collections
  they allow grouping of resources (or literal values)
It is possible to make statements about the container (as a whole) or about its members individually
Different types of containers exist
  Bags -- groups of things
  Sequences -- ordered group of things
  Alternates -- alternate things/values
    First value is the default
    Must be at least one
Duplicate values are permitted
  there is no mechanism to enforce unique value constraints
Syntactic shorthand provided (much like HTML lists)

41

Containers (continued)

[Figure: URI:Tutorial has dc:Creator a container node of rdf:Type rdf:Seq, with members rdf:_1 “Vassilis Christophides” and rdf:_2 “Dimitris Plexousakis”]

42

Containers (continued)

[Figure: URI:Tutorial has two dc:Creator properties, with values “Vassilis Christophides” and “Dimitris Plexousakis”]

43

The Basic RDF Data Model: Formal Aspects

Formal model based on triples (Universal relation)
Statement := (predicate, subject, object)
  Predicate is a resource
  Subject is a resource
  Object is either a resource or a literal
  Object = Predicate(Subject)
A model is a set of statements

Example
  {author, “http://www.ics.forth.gr/proj/isst/RDF”, node}
  {name, node, “Vassilis Christophides”}
  {email, node, “[email protected]”}
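The triple view can be written down directly. The sketch below keeps the slide's field order (predicate, subject, object) and reads "Object = Predicate(Subject)" as a lookup over a set of statements. The names `Statement`, `MODEL` and `values` are illustrative only, and the e-mail statement is omitted.

```python
from collections import namedtuple

# Field order follows the slide: (predicate, subject, object).
Statement = namedtuple("Statement", ["predicate", "subject", "object"])

# A model is a set of statements.
MODEL = {
    Statement("author", "http://www.ics.forth.gr/proj/isst/RDF", "node"),
    Statement("name", "node", "Vassilis Christophides"),
}


def values(model, predicate, subject):
    """All objects o such that (predicate, subject, o) is in the model."""
    return {
        statement.object
        for statement in model
        if statement.predicate == predicate and statement.subject == subject
    }
```

The lookup returns a set because nothing in the model prevents a predicate from having several values for the same subject.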

44

Triples for Container Values: Example

Triples from the first example:
  {“http://www.ics.forth.gr/proj/isst/RDF”, dc:Creator, x}
  {x, rdf:_1, “Vassilis Christophides”}
  {x, rdf:_2, “Dimitris Plexousakis”}
  {x, rdf:type, rdf:Seq}

Triples from the second example:
  {“http://www.ics.forth.gr/proj/isst/RDF”, dc:Creator, “Vassilis Christophides”}
  {“http://www.w3.org/TR/REC-rdf-syntax”, dc:Creator, “Dimitris Plexousakis”}
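The difference between the two encodings can be sketched as two small helper functions, one producing the container form (a node typed rdf:Seq with numbered members) and one producing a repeated property. Both function names and the container label `x` are assumptions made for this illustration.

```python
def seq_triples(subject, predicate, members, container="x"):
    """Triples for an rdf:Seq of members attached to subject via predicate."""
    triples = [
        (subject, predicate, container),
        (container, "rdf:type", "rdf:Seq"),
    ]
    # Members are numbered rdf:_1, rdf:_2, ... in order.
    for position, member in enumerate(members, start=1):
        triples.append((container, f"rdf:_{position}", member))
    return triples


def repeated_property_triples(subject, predicate, members):
    """Triples for the same values given as a repeated property (no order)."""
    return [(subject, predicate, member) for member in members]
```

The container form needs n + 2 statements for n members and preserves their order; the repeated property needs n statements and says nothing about order.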

45

Edge Labeled Directed Graphs (RDF)

[Figure: RDFTutorial --creator--> Vassilis --affiliation--> ICS-FORTH; ICS-FORTH --activities--> ISL; ICS-FORTH --projects--> C-Web]

(creator, RDFTutorial, Vassilis)
(affiliation, Vassilis, ICS-FORTH)
(activities, ICS-FORTH, ISL)
(projects, ICS-FORTH, C-Web)

46

Node labeled Directed Graph (XML)

[Figure: tree rooted at element root, with child elements foo (attributes href, x = 1) and bar (attributes x = 2, y = 3); bar has a child element baz (attribute z = aaa)]

<root>
  <foo href="…" x="1" />
  <bar x="2" y="3">
    <baz z="aaa"/>
  </bar>
</root>
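For comparison with the edge-labeled RDF graph, a short sketch that enumerates the labeled nodes of this XML tree with `xml.etree.ElementTree`. The function name `labelled_nodes` is hypothetical, and the `href` value is a placeholder because the slide elides it.

```python
import xml.etree.ElementTree as ET

# The href value is elided on the slide; "..." is only a placeholder.
DOC = '<root><foo href="..." x="1"/><bar x="2" y="3"><baz z="aaa"/></bar></root>'


def labelled_nodes(element, path=""):
    """Yield (path, kind, value) for every element and attribute node."""
    here = f"{path}/{element.tag}"
    yield (here, "element", None)
    for name, value in element.attrib.items():
        yield (f"{here}/@{name}", "attribute", value)
    for child in element:
        yield from labelled_nodes(child, here)


nodes = list(labelled_nodes(ET.fromstring(DOC)))
```

Here the labels (element and attribute names) sit on the nodes and the edges carry only containment, whereas in RDF the property names label the edges.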

47

What can we Express in RDF?

RDF relies on an (edge labeled) directed graph model that can easily
  be extended by just adding more edges
  combine multiple vocabularies, distinguished by their URIs
RDF provides a standard syntax to represent these graphs in XML
  RDF Model can be thought of as a simplified XML Infoset
But RDF goes beyond XML syntactic issues
  It allows semantic networks to be defined on the Web

48

Semantic Networks

[Figure: semantic network over the classes Person, Artist, Painter, Sculptor, Artifact, Painting and Sculpture, and the literal type String; isa links between classes; properties name, lives in, creates, paints and sculpts]

“A Person has a name and lives_in somewhere. Artists are persons; painters and sculptors are artists. An artist creates artifacts (paintings or sculptures); a painter paints paintings and a sculptor sculpts sculptures.”

49

RDF Schema Definition: RDFS

Declaration of label vocabularies for description graph nodes & edges
  Enables communities to share machine readable tokens and define human readable labels
Node labels (types) are defined as classes
  Literal data types as defined by XML Schemas WG
  Resource may have a specific ‘type’ property
Edge labels (predicates) are defined as properties of these classes
  A resource of given type may have a given property (domain constraint)
  A resource of given type may be the value of a given predicate (range constraint)
RDFS vocabularies expressible in the basic RDF model and syntax
  RDFS vocabularies are also Web resources (and have URIs) and therefore can be described using RDF

50

Constructing and Using RDF schemas

RDFS Schema Vocabularies allow for
  Specialization of both classes & properties (simple & multiple)
  Multiple classification of resources under several classes
  Unordered, optional, and multi-valued properties
  Domain and range polymorphism of properties

51

A Cultural Community Resource Description Example

[Figure: three layers.
 Portal Schema: classes Artist, Painter, Sculptor, Artifact, Painting, Sculpture, Museum and ExtResource; properties creates, paints, sculpts, fname, lname, technique, exhibited, title and last_modified; literal types String and Date.
 Portal Resource Descriptions: resources &r1 to &r6, with fname “Pablo”, lname “Picasso”, lname “Rodin”, paints and creates edges, technique “oil on canvas”, exhibited, title “Reina Sofia Museum”, and last_modified values 2000/06/09 and 2000/01/02.
 Web Resources: r1: www.rodin.fr/thinker.gif; r2: museoreinasofia.mcu.es/guernica.jpg; r3: www.artchive.com/woman.jpg; r4: museoreinasofia.mcu.es]

52

RDF/XML Serialization: Data

<rdf:RDF xml:lang="en"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/TR/2000/PR-rdf-schema-20000327#"
    xmlns="">
  <Painter rdf:ID="picasso132">
    <fname>Pablo</fname>
    <lname>Picasso</lname>
    <paints>
      <Painting rdf:about="http://museoreinasofia.mcu.es/guernica.gif">
        <exhibited>
          <Museum rdf:about="http://museoreinasofia.mcu.es"/>
        </exhibited>
        <technique>oil on canvas</technique>
      </Painting>
    </paints>
    <paints>
      <Painting rdf:about="http://www.artchive.com/woman.jpg"/>
    </paints>
  </Painter>
  <ExtResource rdf:about="http://museoreinasofia.mcu.es">
    <title>Reina Sophia Museum</title>
    <last_modified>2000/06/09</last_modified>
  </ExtResource>
  <Sculptor rdf:ID="rodin424" lname="Rodin">
    <creates>
      <Sculpture rdf:about="http://www.rodin.fr/thinker.gif"/>
    </creates>
  </Sculptor>
</rdf:RDF>

53

RDF/XML Serialization: Schema

<rdf:RDF xml:lang="en"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/TR/2000/PR-rdf-schema-20000327#">
  <rdfs:Class rdf:ID="Artist"/>
  <rdfs:Class rdf:ID="Artifact"/>
  <rdfs:Class rdf:ID="Style"/>
  <rdfs:Class rdf:ID="Museum"/>
  <rdfs:Class rdf:ID="Sculptor">
    <rdfs:subClassOf rdf:resource="#Artist"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="Painter">
    <rdfs:subClassOf rdf:resource="#Artist"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="Sculpture">
    <rdfs:subClassOf rdf:resource="#Artifact"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="Painting">
    <rdfs:subClassOf rdf:resource="#Artifact"/>
  </rdfs:Class>
  <rdf:Property rdf:ID="creates">
    <rdfs:domain rdf:resource="#Artist"/>
    <rdfs:range rdf:resource="#Artifact"/>
  </rdf:Property>
  <rdf:Property rdf:ID="paints">
    <rdfs:domain rdf:resource="#Painter"/>
    <rdfs:range rdf:resource="#Painting"/>
    <rdfs:subPropertyOf rdf:resource="#creates"/>
  </rdf:Property>
  <rdf:Property rdf:ID="sculpts">
    <rdfs:domain rdf:resource="#Sculptor"/>
    <rdfs:range rdf:resource="#Sculpture"/>
    <rdfs:subPropertyOf rdf:resource="#creates"/>
  </rdf:Property>
  <rdf:Property rdf:ID="exhibited">
    <rdfs:domain rdf:resource="#Painting"/>
    <rdfs:range rdf:resource="#Museum"/>
  </rdf:Property>
  <rdf:Property rdf:ID="technique">
    <rdfs:domain rdf:resource="#Painting"/>
    <rdfs:range rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Literal"/>
  </rdf:Property>
  <rdf:Property rdf:ID="title">
    <rdfs:domain rdf:resource="#ExtResource"/>
    <rdfs:range rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Literal"/>
  </rdf:Property>
  ….
</rdf:RDF>
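The domain and range declarations of this schema suggest a simple consistency check on resource descriptions. The sketch below is an assumption-laden illustration, not the method of any particular validator: it copies three property declarations and the subclass links into plain dictionaries, assumes each class has at most one direct superclass, and assumes every resource has exactly one declared type.

```python
# Subclass links and (domain, range) pairs copied from the schema above.
SUBCLASS_OF = {
    "Sculptor": "Artist",
    "Painter": "Artist",
    "Sculpture": "Artifact",
    "Painting": "Artifact",
}
PROPERTIES = {
    "creates": ("Artist", "Artifact"),
    "paints": ("Painter", "Painting"),
    "sculpts": ("Sculptor", "Sculpture"),
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def violations(types, statements):
    """Return the statements whose subject or object breaks domain/range."""
    bad = []
    for subject, prop, obj in statements:
        domain, rng = PROPERTIES[prop]
        if not (is_a(types[subject], domain) and is_a(types[obj], rng)):
            bad.append((subject, prop, obj))
    return bad
```

With this reading, a Sculptor may use creates (inherited from Artist) but not paints, whose domain is Painter.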

54

RDF/S vs. Well-Known Formalisms

Relational or Object Database Models (ODMG, SQL)
  Classes don’t define table or object types
  Instances may have quite different associated properties
  Collections with heterogeneous members
Semistructured or XML Data Models (OEM, UnQL, YAT, XML Schema)
  Schema labels on both nodes and edges
  Class and property subsumption is not captured
  Heterogeneous structures reminiscent of SGML exceptions
Knowledge Representation Languages (Telos, DL, F-Logic)
  Absence of complex values and n-ary relationships (bags, sequences)

58

Some RDF Applications

Web Browsers:
  Netscape 6 from Netscape/AOL uses RDF to integrate various data-oriented applications such as bookmarks, mail/news, channels, etc. as well as for smart browsing and related links (RDF annotation services)
  Amaya Editor/Browser from W3C uses RDF to support user annotations on Web pages as metadata
Brokers/Portals:
  RSS (RDF Site Summary) XML/RDF Specification 1.0 2000
  Web Service Description Language (WSDL) XML/RDF Specification 2000
  PICS Rating Vocabularies in XML/RDF W3C NOTE 27 March 2000
  Platform for Privacy Preferences and RDF/RDF W3C Draft 10 May 2000
Content Management:
  OCLC Dublin Core Elements in RDF
  ICOM-CIDOC Conceptual Reference Model in RDF
  The Wordnet Lexical Ontology in RDF
  European Treasury Browser in RDF

59


Example: Annotation & Recommendation Services

60

Practical notes on RDF

Authoring/Visualization
  by hand (experts only, perhaps copy & paste)
  support by other tools (editors like Stanford Protégé)
  conversion from existing data stores (using XSLT)
  visualize RDF graphs (using Rudolf RDFViz)
Parsing/Validating
  ICS-FORTH Validating RDF Parser (VRP)
  Rapier RDF Parser
  W3C Simple RDF Parser & Compiler (SiRPAC)
Storing/Querying
  ICS-FORTH RSSDB/RQL
  Aidministrator Sesame
  Redland Squish
  R.V.Guha RDFdb
Harvesting/Crawling
  AIFB RDF Crawling

61

ICS-FORTH RDF R&D Activities (http://www.ics.forth.gr/proj/isst/RDF)

62

The ICS-FORTH RDFSuite

The Validating RDF Parser (VRP): Karsten Tolle Diploma Thesis
  The first RDF Parser supporting semantic validation of both resource descriptions and schemas
The RDF Schema Specific DataBase (RSSDB): Sophia Alexaki M.Sc. Thesis
  The first RDF Store using schema knowledge to automatically generate an Object-Relational (SQL3) representation of RDF metadata and load resource descriptions
The RDF Query Language (RQL): Greg Karvounarakis M.Sc. Thesis
  The first Declarative Language for uniformly querying RDF schemas and resource descriptions

63

The ICS-FORTH RDFSuite Architecture

[Figure: architecture diagram with three components on top of an ORDBMS.
 VRP: Parser, Validator, VRP Internal RDF Model; Java APIs.
 RDF Loader: Loading RDF through JDBC (SQL3).
 RQL Interpreter: Parser, Graph Constructor, Typing, Evaluation; DBMS RDF query APIs (SQL3 + SPI functions, LIBC++).
 ORDBMS tables: Class (c_name), Property (p_name, domain, range), SubClass (subcl, supcl), SubProperty (subpr, suppr), and instance tables such as Hotel (URI) and creates (source, target).]

64

The Validating RDF Parser (VRP)

The VRP parser checks only if an RDF file is well-formed according to the RDF M & S Spec
The VRP validator checks if the model (i.e. triples) generated by the parser satisfies the constraints imposed by the RDF Schema Spec

[Figure: RDF/XML descriptions; Parser (Lexical Analyzer, Syntax Analyzer, Namespace Manager); VRP Internal RDF Model; Validator; output as RDF graph model or RDF triple model (subject, predicate, object)]

67

The RDF to DBMS Loader

[Figure: the Extended VRP Validator (with Additional Constraints) holds the RDF Model as objects of the classes Resource (URI), RDF_Resource (rdf:type, ...), RDF_Class (rdfs:subClassOf), RDF_Property (rdfs:domain, rdfs:range, rdfs:subPropertyOf, link_list) and RDF_Statement (rdf:predicate, rdf:subject, rdf:object).
 Example objects: RDF_Class@2344 (URI ns#C1, rdf:type rdfs#Class); RDF_Property@5678 (URI ns#P1, rdf:type rdf#Property, rdfs:domain ns#C1, rdfs:range ns#C2, link_list (r1,r2)); RDF_Resource@7844 (URI r1, rdf:type ns#C1).
 Through the RDF Loading APIs, store() calls write them to the DBMS (Persistent Namespace): ns#C1 in table Class (c_name); (ns#P1, ns#C1, ns#C2) in table Property (p_name, domain, range); r1 in instance table ns#C1 (URI); (r1, r2) in instance table ns#P1 (source, target). The DBMS is read through the RDF Querying APIs.]

68

RSSDB Representation of RDF metadata

[Figure: relational tables used by RSSDB.
 Namespace (id, uri): 1 http://www.w3.org/2000/01/rdf-schema#; 2 http://www.w3.org/1999/02/22-rdf-syntax-ns#; 3 http://www.odp.org/schema.rdf#; 4 http://www.arts.org/schema.rdf#; 5 http://www.dc.org/schema.rdf#
 Type (id, nsid, lpart): (1, 1, Literal); (2, 1, Bag); (3, 1, Seq)
 Class (id, nsid, lpart): rows for DataResource (id 10), ExternalPage (id 11), Arts (id 12, nsid 3) and Art_History (id 13, nsid 3)
 Property (id, nsid, lpart, domainid, rangeid): two title properties (ids 14 and 15)
 SubClass (subid, superid): (11, 10); (13, 12)
 SubProperty (subid, superid): (15, 14)
 Instance tables: t10 to t13 (URI), linked as subtables; t14 and t15 (source, target)]

69

The RDF Query Language (RQL)

Declarative query language for RDF description bases
  relies on a typed data model (literal & container types + union types)
  follows a functional approach (basic queries and filters)
  adapts the functionality of semistructured or XML query languages to RDF, but also:
    treats properties as self-existent individuals
    exploits taxonomies of node and edge labels
    allows querying of schemas as semistructured data
Relational interpretation of schemas & resource descriptions
  Classes (unary relations)
  Properties (binary relations)
  Containers (n-ary relations)

70

Browsing Portal Catalogs with RQL

Simple set queries on class and property extents:

Find the resources in the extent of the property creates (includes paints & sculpts)
  creates
  {{ [www.portal.gr/rodin424, www.rodin.fr/thinker.gif],
     [www.portal.gr/picasso132, museoreinasofia.mcu.es/guernica.gif],
     [www.portal.gr/picasso132, www.artchive.com/woman.jpg] }}

Find the resources classified under both ExtResource and Sculpture (multiply classified resources)
  ExtResource intersect Sculpture
  {{ www.rodin.fr/thinker.gif }}

Schema constructs used as query terms & support for automatic query expansion (similar to thesauri-based IRS)
Useful to query resources with minimal schema knowledge
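The automatic query expansion mentioned here can be pictured as follows: asking for the extent of creates also returns the pairs asserted with its subproperties. This is a hedged sketch over hand-written dictionaries, using the statements of the data serialization shown earlier; it is not how RQL is implemented, and the names `DIRECT`, `specializes` and `extent` are made up.

```python
# Subproperty links from the schema; each property has at most one parent here.
SUBPROPERTY_OF = {"paints": "creates", "sculpts": "creates"}

# Pairs as directly asserted in the resource descriptions.
DIRECT = {
    "creates": [("www.portal.gr/rodin424", "www.rodin.fr/thinker.gif")],
    "paints": [
        ("www.portal.gr/picasso132", "museoreinasofia.mcu.es/guernica.gif"),
        ("www.portal.gr/picasso132", "www.artchive.com/woman.jpg"),
    ],
    "sculpts": [],
}


def specializes(prop, ancestor):
    """True if prop equals ancestor or is a (transitive) subproperty of it."""
    while prop is not None:
        if prop == ancestor:
            return True
        prop = SUBPROPERTY_OF.get(prop)
    return False


def extent(prop):
    """Pairs asserted for prop or for any of its subproperties."""
    return {
        pair
        for name, pairs in DIRECT.items()
        if specializes(name, prop)
        for pair in pairs
    }
```

`extent("creates")` returns the three pairs listed on the slide, two of which were asserted with paints rather than creates.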

71

Personalizing Portal Catalogs with RQL

Navigational queries on semistructured resource descriptions

Find the Museum resources that have been modified in year 2000 (data paths not foreseen in the schema)
  select x
  from Museum{x}.last_modified{y}
  where y >= 2000/01/01
  {{ museoreinasofia.mcu.es }}

Similar functionality to semistructured or XML query languages (Lorel, UnQL, XQL, XML-QL, XML-GL)
Useful in the absence of schema information or when multiple schemas are used to describe resources

72

Querying Portal Catalogs with Large Schemas

Filtering both resource descriptions and schemas

Find the paintings having as technique “oil on canvas” that have been created by a neo-impressionist painter
  select y
  from {:$X}creates{y:Painting}.technique{z}
  where $X <= neo-impressionist and z = “oil on canvas”

Callouts: data filtering with schema information; schema filtering on class hierarchies

73

Querying Portal Schemas with RQL

Pure schema queries

Find the properties which specialize the property creates and may have as domain the class Painter along with their corresponding range classes
  select @P, $Y
  from {:Painter}@P{:$Y}
  where @P <= creates
  {{ [creates, Artifact], [creates, Painting], [creates, Sculpture], [paints, Painting] }}

Callouts: all properties defined or inherited in class Painter; schema filtering on property hierarchies
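One way to read this query is: keep every property at or below creates whose domain is Painter or one of its superclasses, and pair it with its range class and all subclasses of that range. The sketch below encodes that reading over plain dictionaries. It is an interpretation chosen to match the result printed on the slide, not the RQL interpreter's algorithm, and the helper names are hypothetical.

```python
SUBCLASS_OF = {
    "Sculptor": "Artist",
    "Painter": "Artist",
    "Sculpture": "Artifact",
    "Painting": "Artifact",
}
SUBPROPERTY_OF = {"paints": "creates", "sculpts": "creates"}
PROPERTIES = {  # property -> (domain, range)
    "creates": ("Artist", "Artifact"),
    "paints": ("Painter", "Painting"),
    "sculpts": ("Sculptor", "Sculpture"),
}


def ancestors_or_self(name, parent_of):
    """The chain name, parent(name), parent(parent(name)), ..."""
    chain = []
    while name is not None:
        chain.append(name)
        name = parent_of.get(name)
    return chain


def schema_query(domain_class, top_property):
    """Pairs (property, range class) applicable to domain_class."""
    classes = set(SUBCLASS_OF) | set(SUBCLASS_OF.values())
    rows = set()
    for prop, (domain, rng) in PROPERTIES.items():
        # Keep only properties that specialize top_property.
        if top_property not in ancestors_or_self(prop, SUBPROPERTY_OF):
            continue
        # The property must be defined on domain_class or inherited by it.
        if domain not in ancestors_or_self(domain_class, SUBCLASS_OF):
            continue
        # Report the declared range and every subclass of it.
        for cls in classes:
            if rng in ancestors_or_self(cls, SUBCLASS_OF):
                rows.add((prop, cls))
    return rows
```

`schema_query("Painter", "creates")` yields the four pairs shown above; sculpts is excluded because its domain, Sculptor, is not a superclass of Painter.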

74

RQL: Examples

[Figure: schema graph with classes ns1#Artist, ns1#Painter, ns1#Sculptor, ns1#Artifact, ns1#Painting, ns1#Sculpture, ns1#Style, ns1#Impressionist, ns1#PostImpressionist and odp#ExtPage; properties ns1#creates, ns1#paints, ns1#sculpts, ns1#has_style, ns1#has_material, ns1#fname, ns1#lname and dc#last_modified; literal types String and Date. Callouts repeat ns1#creates and ns1#paints together with the classes ns1#Painter, ns1#Artifact, ns1#Painting and ns1#Sculpture]

Similar functionality to DBMS schema QLs (SchemaSQL, XSQL)
Useful for large schemas (integrating ontologies and thesauri)

75

Putting it all Together

Nested schema and data queries

Find the resources modified after 2000/01/01 which can be reached by a property applied to the class Painting and its subclasses
  select R, y
  from (select @P from {:$X}@P where $X <= Painting){R}.{y}last_modified{z}
  where z >= 2000/01/01
  {{ [exhibited, museoreinasofia.mcu.es] }}

R ranges over the labels of type property
Subcommunities may use different schemas while sharing the same description base

76

RQL: Examples

[Figure: Portal Schema: class Painting with properties exhibited (to Museum) and technique (to String). Portal Resource Descriptions: resources &r2, &r3 and &r4, with technique “oil on canvas”, an exhibited edge, and last_modified values 2000/06/09 and 2000/01/02]

77

Putting it all Together

Schema and data queries

Find all metadata about the resources of the site museoreinasofia.mcu.es (URLs’ pattern matching)
  select x, $$Y, $P, z, $$W
  from {x:$$Y}$P{z:$$W}
  where x like “*museoreinasofia.mcu.es*” or z like “*museoreinasofia.mcu.es*”
  {{ [www.portal.gr/picasso132, Painter, paints, museoreinasofia.mcu.es/guernica.gif, Painting],
     [museoreinasofia.mcu.es/guernica.gif, Painting, exhibited, museoreinasofia.mcu.es, Museum],
     [museoreinasofia.mcu.es/guernica.gif, Painting, technique, “oil on canvas”, string],
     [museoreinasofia.mcu.es, ExtResource, title, “Reina Sophia Museum”, string],
     [museoreinasofia.mcu.es, ExtResource, last_modified, 2000/06/09, date],
     …. }}

Subcommunities may use both different schemas and description bases

78


RQL Query Processing

select y

from {x}creates{y:Painting}.has_material{z}

where z = “oil on canvas”

select y

from creates A, has_material B, D $C

define x = A.source, y = A.target, w = B.source, z = B.target,

R = range(creates), D = subclassOf(R), E = ^($C)

where z = “oil on canvas” and y = w and $C = Painting and y in E

79

RQL Query Optimization

[Figure: three successive algebraic plans for the query.
 Plan 1: Project y; Select z = “oil on canvas”; Join y = w with has_material[w,z]; SemiJoin y in ^($C) between creates[x,y] and Select $C = Painting over subclassOf(range(creates))[$C]
 Plan 2: Project y; Select z = “oil on canvas”; Join y = w with has_material[w,z]; Select y in ^Painting over creates[x,y]
 Plan 3: Project y; Select z = “oil on canvas”; Join y = w with has_material[w,z]; SemiJoin y = p between creates[x,y] and Painting[p]]

select X.target
from creates* X, has_material* Y, Painting P
where X.target = Y.source and X.target = P.uri and Y.target = 'oil on canvas'
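The final SQL form can be tried on toy data with Python's built-in `sqlite3` module. Everything below is an assumption made for illustration: the table layouts, the sample rows, and in particular the reading of `creates*` as a view that unions creates with its subproperty table paints. The slides do not show the actual RSSDB tables for this query.

```python
import sqlite3


def run_example():
    """Run the translated query on a small in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE creates(source TEXT, target TEXT);
        CREATE TABLE paints(source TEXT, target TEXT);
        CREATE TABLE has_material(source TEXT, target TEXT);
        CREATE TABLE Painting(uri TEXT);
        -- creates* read as: creates together with its subproperty paints
        CREATE VIEW creates_star AS
            SELECT source, target FROM creates
            UNION SELECT source, target FROM paints;
        INSERT INTO creates VALUES ('rodin424', 'thinker.gif');
        INSERT INTO paints VALUES ('picasso132', 'guernica.gif');
        INSERT INTO paints VALUES ('picasso132', 'woman.jpg');
        INSERT INTO has_material VALUES ('guernica.gif', 'oil on canvas');
        INSERT INTO Painting VALUES ('guernica.gif');
        INSERT INTO Painting VALUES ('woman.jpg');
    """)
    rows = db.execute("""
        SELECT X.target
        FROM creates_star X, has_material Y, Painting P
        WHERE X.target = Y.source
          AND X.target = P.uri
          AND Y.target = 'oil on canvas'
    """).fetchall()
    db.close()
    return [row[0] for row in rows]
```

Only the painting that has a has_material row with the value 'oil on canvas' survives the two joins.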

80

The RQL Query Interpreter

[Figure: processing pipeline from query string to query result, coordinated by Main, with steps numbered (1) to (6).
 Syntax analysis: syntactical analysis (lex/yacc); CNF transformation; produces a syntax tree under CNF
 Graph construction: evaluation of dependencies; factorization functions; produces the query graph
 Typing (type inference): checks type compatibility; sets appropriate evaluation functions
 Evaluation (Evaluator): defines evaluation functions; query processing through the DBMS RDF Query APIs; produces the result]

81

RDFSuite Summary

RDFSuite addresses the needs of effective RDF metadata management by providing tools for validation, storage and querying
  validation follows a formal data model and constraints enforcing consistency of RDF schemas
  incremental loading of voluminous description bases in a persistent store
  declarative query language for schema and data querying
Ongoing efforts:
  RQL query optimization
  transactional aspects
  alternative encoding and representation schemes for access optimization

82

Acknowledgements

Funding was generously provided by the projects:
  C-WEB (IST-1999-13479): “A Generic Platform Supporting Community Webs”
  MESMUSES (IST-2000-26074): “Metaphor for Science Museums”
