Semantic Web Techniques for Personalization of eGovernment Services SemWAT 2006 1st International ER...


Page 1: Semantic Web Techniques for Personalization of eGovernment Services SemWAT 2006 1st International ER Workshop on Semantic Web Applications: Theory and.

Overview

Our research activities concern the implementation of Web information systems for eGovernment applications

Due to the development of eGovernment initiatives, more and more on-line resources and services are being made available by Public Administrations (PAs)

We make use of temporal database and semantic Web techniques to provide personalized access to such resources and services

In particular, we consider multi-version norm texts (stored in XML format) available in Web repositories

Page 2: Importance of versioning

Temporal concerns are ubiquitous in the law domain

Each normative text changes in time due to different modifications, but keeps its identity

The ability to model temporal dimensions is essential for the management of evolving norms
  it is crucial to reconstruct the consolidated version of a norm
  also past versions are still important

[Figure: timeline showing the original normative text (1) followed by new versions (2, 3)]

Page 3: Importance of versioning

Applicability (semantic) versioning also plays an important role
  some norms or some of their parts have or acquire a limited applicability

Personalized version of the norm
  A version only containing provisions which are applicable to a citizen’s personal case

[Figure: a citizen labelled "Self-employed" next to a norm made of Art. 1 (unemployed), Art. 2 (self-employed) and Art. 3 (retired), each followed by placeholder text]

Page 4: Motivation

Large XML collections of norms are made available by the PA on the Web, but personalization is:

Absent, e.g. http://www.normeinrete.it (temporal versioning partially supported)

Predefined in the Website structure and contents, e.g. http://www.italia.gov.it (hardwired by human experts following the life-events approach)

Lack of an effective, flexible, on-demand (“intelligent”, efficient) personalization facility

Page 5: Objectives

Development of an effective and efficient Web information system where:
  norms are represented as XML documents
  dynamics of norms in time is captured
  limited applicability of norms (and their parts) is captured
  selective access and reconstruction of versions is supported by a query engine

Aimed at:
  enabling citizens to access personalized versions of multiversion resources
  improving and optimizing the involvement of citizens in the eGovernance process

Page 6: The Technological Infrastructure

1 – identification phase: reconstruction on-the-fly of the digital identity of the authenticated user

2 – classification phase: use of the collected digital identity to classify the citizen with respect to the civic ontology O_C

3 – querying phase: access and reconstruction of all and only norms which are applicable to the class C_x

[Figure: architecture diagram with the components "Web services of Public Administration", "Public Administration DB", "Web services with ontology O_C", "Simple elaboration unit" and "XML repository of annotated norms"; labels 1, 2 and 3 mark the three phases, with further labels "class C_x" and "creation/update"]

Page 7: The Civic Ontology

Embodies a classification of citizens based on the distinctions introduced by successive norms that imply some limitations in their applicability (founding acts)

At this stage of the project, we manage “tree-like” ontologies (i.e. class taxonomies induced by the IS-A relationship)

Citizen
  Unemployed
  Employee
    Subordinate
      Public
      Private
    Self-employed
  Retired

Page 8: The modeling approach

Extension of a previous temporal XML model (D&KE 2005) including:
  a temporal multi-version XML schema
    is based on the hierarchical organization of normative texts: contents-section-article-paragraph
    at each level of the hierarchy, the history of changes is represented by the (time-stamped) versions produced
    it supports ancestor-descendant inheritance
  temporal manipulation operations

Addition of applicability annotations in order to support semantic versioning

Page 9: The temporal XML schema

4 Temporal Dimensions:
  Publication time: time of publication on the Official Journal
  Validity time: time the norm is in force
  Efficacy time: time the norm can be applied
  Transaction time: time the norm is stored in the system

[Figure: schema tree with the elements Law, Title, Contents, Section, Article, Paragraph and Heading; each level of the hierarchy has Ver (version) elements carrying a TA element. Attributes shown, each marked R or O: Num – R, Type – R, Publication – R, An_ref – O; temporal attributes: Vt_Start – R, Vt_End – O, Tt_Start – R, Tt_End – O, Et_Start – R, Et_End – O]
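The temporal attributes attached to each version can be pictured as a small record. The sketch below is a hypothetical Python model, not the authors' schema. It assumes that R and O in the figure mark required and optional attributes, that a missing end timestamp means "still open", and that intervals are closed. The class, method and function names, and all dates, are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


def _contains(start: date, end: Optional[date], day: date) -> bool:
    """Closed-interval test; a missing end is treated as open-ended (assumption)."""
    return start <= day and (end is None or day <= end)


@dataclass(frozen=True)
class TemporalAttributes:
    """Hypothetical in-memory counterpart of a TA element."""
    vt_start: date                   # validity time: the norm is in force
    tt_start: date                   # transaction time: the norm is stored in the system
    et_start: date                   # efficacy time: the norm can be applied
    vt_end: Optional[date] = None    # end timestamps are optional (O in the figure)
    tt_end: Optional[date] = None
    et_end: Optional[date] = None

    def in_force_on(self, day: date) -> bool:
        return _contains(self.vt_start, self.vt_end, day)

    def applicable_on(self, day: date) -> bool:
        return _contains(self.et_start, self.et_end, day)


def versions_in_force(versions: List[TemporalAttributes], day: date) -> List[TemporalAttributes]:
    """Select the versions in force on a given day, the basis of a consolidated text."""
    return [v for v in versions if v.in_force_on(day)]


# Two successive versions of the same paragraph (made-up dates).
v1 = TemporalAttributes(date(2001, 1, 1), date(2001, 1, 5), date(2001, 3, 1),
                        vt_end=date(2002, 12, 31), et_end=date(2002, 12, 31))
v2 = TemporalAttributes(date(2003, 1, 1), date(2003, 1, 3), date(2003, 1, 1))
```

In this example v1 is in force from January 2001 but can only be applied from March 2001, which is the kind of distinction that separate validity and efficacy times make possible.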

Page 10: Semantic versioning

A pre-order and post-order numbering scheme is introduced in the tree-like ontology
Classes are identified by means of their pre-order code
Encoding is exploited in query processing for quick ancestor-descendant checking
Applicability annotations (AA) are added to semantic versions of document parts as references to the ontology classes

Civic ontology with (pre-order, post-order) codes:

Citizen (1,8)
  Unemployed (2,1)
  Employee (3,6)
    Subordinate (4,4)
      Public (5,2)
      Private (6,3)
    Self-employed (7,5)
  Retired (8,7)
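The numbering scheme and the ancestor-descendant check can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation: the tree is copied from the slide, while the function names and the dictionary representation are assumptions. The check relies on the standard property that a node is a proper ancestor of another exactly when it has a smaller pre-order code and a larger post-order code.

```python
from typing import Dict, List, Tuple

# Civic ontology from the slide; children are listed left to right.
TREE: Dict[str, List[str]] = {
    "Citizen": ["Unemployed", "Employee", "Retired"],
    "Employee": ["Subordinate", "Self-employed"],
    "Subordinate": ["Public", "Private"],
}


def number_tree(tree: Dict[str, List[str]], root: str) -> Dict[str, Tuple[int, int]]:
    """Assign (pre-order, post-order) codes with one depth-first traversal."""
    codes: Dict[str, Tuple[int, int]] = {}
    pre = 0
    post = 0

    def visit(node: str) -> None:
        nonlocal pre, post
        pre += 1                      # numbered on the way down
        my_pre = pre
        for child in tree.get(node, []):
            visit(child)
        post += 1                     # numbered on the way back up
        codes[node] = (my_pre, post)

    visit(root)
    return codes


def is_ancestor(a: Tuple[int, int], d: Tuple[int, int]) -> bool:
    """True if the class coded `a` is a proper ancestor of the class coded `d`."""
    return a[0] < d[0] and a[1] > d[1]


CODES = number_tree(TREE, "Citizen")
```

With these codes, checking whether Employee (3,6) subsumes Self-employed (7,5) takes two integer comparisons and no tree traversal.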

Page 11: Semantic versioning

Applicability is inherited by descendant nodes unless locally redefined
By means of redefinitions we can also introduce, for each part of a document, complex applicability properties
  Restrictions with respect to ancestors
  Extensions with respect to ancestors

<article num="1">
  <ver num="1">
    <aa applies_to="3"/>
    [ … Temporal attributes … ]
    <paragraph num="1">
      <ver num="1">
        [ … Text … ]
        <aa applies_to="4"/>
        [ … Temporal attributes … ]
      </ver>
    </paragraph>
    <paragraph num="2">
      <ver num="1">
        [ … Text … ]
        <aa applies_also="8"/>
        [ … Temporal attributes … ]
      </ver>
    </paragraph>
  </ver>
</article>

[Figure: civic ontology with (pre-order, post-order) codes: Citizen (1,8), Unemployed (2,1), Employee (3,6), Subordinate (4,4), Public (5,2), Private (6,3), Self-employed (7,5), Retired (8,7)]
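One way to read the example is sketched below in Python: `applies_to` replaces the inherited applicability (a restriction), `applies_also` adds to it (an extension), and a version without an annotation inherits unchanged. This reading is an assumption drawn from the slide, not a statement of the authors' algorithm. The bracketed placeholders are dropped from the XML, paragraph 3 is not on the slide and is added only to show plain inheritance, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import FrozenSet

# pre-order code -> post-order code, copied from the civic ontology figure
POST = {1: 8, 2: 1, 3: 6, 4: 4, 5: 2, 6: 3, 7: 5, 8: 7}

XML = """
<article num="1"><ver num="1">
  <aa applies_to="3"/>
  <paragraph num="1"><ver num="1"><aa applies_to="4"/></ver></paragraph>
  <paragraph num="2"><ver num="1"><aa applies_also="8"/></ver></paragraph>
  <paragraph num="3"><ver num="1"/></paragraph>
</ver></article>
"""


def effective_aa(ver: ET.Element, inherited: FrozenSet[int]) -> FrozenSet[int]:
    """Resolve the applicability of one version against what it inherits."""
    aa = ver.find("aa")                       # direct child only
    if aa is None:
        return inherited                      # plain inheritance
    if "applies_to" in aa.attrib:
        return frozenset({int(aa.attrib["applies_to"])})             # redefinition
    if "applies_also" in aa.attrib:
        return inherited | {int(aa.attrib["applies_also"])}          # extension
    return inherited


def applies(aa: FrozenSet[int], cls: int) -> bool:
    """True if `cls` equals or descends from any annotated class."""
    return any(a == cls or (a < cls and POST[a] > POST[cls]) for a in aa)


article_ver = ET.fromstring(XML.strip()).find("ver")
article_aa = effective_aa(article_ver, frozenset())
paragraph_aa = {
    p.get("num"): effective_aa(p.find("ver"), article_aa)
    for p in article_ver.findall("paragraph")
}
```

Under this reading, paragraph 1 is restricted to class 4 (Subordinate) and so does not apply to a self-employed citizen (class 7), while paragraph 2 applies to classes 3 and 8 and therefore does.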

Page 12: Example of full search

John Smith is a self-employed citizen.

He is interested in the text of all the norms ... (structural constraint)

... which contain paragraphs dealing with health care, ... (textual constraint)

... which were valid and in effect between 2002 and 2004, ... (temporal constraint)

... and which are applicable to his case (civic class 7). (semantic constraint)

4 orthogonal constraints

Page 13: Example of full search

FOR $a IN norms
WHERE textConstr ($a//paragraph//text(), ’health AND care’)
AND tempConstr (’vTime OVERLAPS PERIOD(’2002-01-01’,’2004-12-31’)’)
AND tempConstr (’eTime OVERLAPS PERIOD(’2002-01-01’,’2004-12-31’)’)
AND applConstr (’class 7’)
RETURN $a

4 orthogonal constraints: structural, textual, temporal, semantic
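The way the four constraints combine can be mimicked over an in-memory list, as in the Python sketch below. It is not the query processor described in these slides. The structural constraint is represented by the fact that the records are paragraph versions, periods are assumed to be closed intervals, the sketch returns matching paragraph versions rather than whole norms, and the record layout, function names and sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Iterable, List, Tuple

Period = Tuple[date, date]  # closed interval [start, end] (assumption)

# pre-order code -> post-order code, copied from the civic ontology figure
POST = {1: 8, 2: 1, 3: 6, 4: 4, 5: 2, 6: 3, 7: 5, 8: 7}


@dataclass(frozen=True)
class ParagraphVersion:
    text: str
    v_time: Period          # validity time
    e_time: Period          # efficacy time
    aa: FrozenSet[int]      # effective applicability, as pre-order codes


def text_constr(text: str, keywords: Iterable[str]) -> bool:
    """All keywords must occur as words (the AND of the query)."""
    words = text.lower().split()
    return all(k.lower() in words for k in keywords)


def overlaps(a: Period, b: Period) -> bool:
    """Two closed intervals overlap when each starts before the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]


def appl_constr(aa: FrozenSet[int], cls: int) -> bool:
    """The class equals or descends from an annotated class (pre/post check)."""
    return any(a == cls or (a < cls and POST[a] > POST[cls]) for a in aa)


def full_search(db: List[ParagraphVersion], keywords: Iterable[str],
                window: Period, cls: int) -> List[ParagraphVersion]:
    keywords = list(keywords)
    return [v for v in db
            if text_constr(v.text, keywords)
            and overlaps(v.v_time, window)
            and overlaps(v.e_time, window)
            and appl_constr(v.aa, cls)]


# Made-up sample data.
IN_WINDOW = (date(2003, 1, 1), date(2005, 12, 31))
TOO_EARLY = (date(1999, 1, 1), date(2001, 12, 31))
WINDOW = (date(2002, 1, 1), date(2004, 12, 31))
DB = [
    ParagraphVersion("health care text X", IN_WINDOW, IN_WINDOW, frozenset({4})),
    ParagraphVersion("public health text Y", IN_WINDOW, IN_WINDOW, frozenset({3, 8})),
    ParagraphVersion("health care text Z", IN_WINDOW, IN_WINDOW, frozenset({3, 8})),
    ParagraphVersion("health care text W", TOO_EARLY, TOO_EARLY, frozenset({3})),
]
hits = full_search(DB, ["health", "care"], WINDOW, 7)
```

Each record in the sample fails at most one constraint: X on applicability, Y on the keywords, W on the time window. Only Z satisfies all of them.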

Page 14: Example of full search

[Figure: the query constraints (…norm//paragraph//text(), …‘class 7’) evaluated against the normative DB and the civic ontology. The normative DB shows a Norm with Article 1 and Article 2, paragraphs Par 1 and Par 2, and versions Ver 1 and Ver 2 carrying temporal attributes (TA) and applicability annotations (AA), with the values AA=3, AA=4 and AA=3,8; paragraph texts are "Health care… text X", "Public health… text Y" and "Health care… text Z". The civic ontology is shown with its (pre-order, post-order) codes: Citizen (1,8), Unemployed (2,1), Employee (3,6), Subordinate (4,4), Public (5,2), Private (6,3), Self-employed (7,5), Retired (8,7)]

Page 15: Our prototype system (“native” approach)

The query engine is able to access and retrieve only the strictly necessary data
  selection relies on ad-hoc data structures supporting multi-versioning
  storage granularity is finer than the entire documents used by standard XML engines (including our previous prototype – “stratum” approach)

Only the parts which satisfy the temporal and applicability constraints are used for the reconstruction of the retrieved documents

There is no need to retrieve whole XML documents and build space-consuming structures such as DOM trees

Enhanced query processing efficiency

Reduced memory requirements

Page 16: Evaluation benchmark

Three XML document sets
  5000 documents (120MB)
  10000 documents (240MB)
  20000 documents (480MB)

Variable document size
  min = 2KB
  avg = 24KB
  max = 125KB

Five different query types
  Queries on keywords (structural + textual constraints)
    Q1 – keywords in contents
    Q2 – keywords in type and contents
  Temporal queries (structural + temporal constraints)
    Q3 – conditions on publication, validity and transaction time
  Mixed queries (structural + textual + temporal constraints)
    Q4, Q5 – with keywords and temporal conditions

Five variants with semantic constraints (personalization of the queries)
  Qx-A – with additional applicability constraints

Page 17: Performance evaluation

The new system outperforms its predecessor (“stratum” approach) as far as temporal queries are concerned

The new system showed a very high efficiency in personalization query processing
  selection of qualifying versions is improved by a technique based on simple comparisons of pre-post encodings
  0.5-1% more time than for the original versions
  3-4% storage space overhead

The new system showed good scalability figures in every type of query context
  the computing time grows sublinearly with the number of documents (it depends mainly on the size of the results)

Page 18: Conclusions

We presented our research work concerning the design and implementation of efficient Web-based information systems for eGovernment applications

We introduced support for a personalized access to resources on the basis of the digital identity of citizens (relying on semantic versioning and ontology mapping)

We developed an efficient platform (“native” approach) for which a specialized Multi-version XML Query Processor has been designed and implemented

We showed our approach to be very efficient in a large set of experimental situations with good scale-up figures under growing load configurations

Page 19: Future Work

Extensions of the current framework
  more advanced application requirements may include a more sophisticated ontology definition (graph-like), possibly versioned, and more advanced reasoning services

Completion of the technological infrastructure
  usable in a large Web-based eGovernment scenario, including identification and classification services

Assessment of our prototype systems in a concrete working environment
  with real users and with a large repository of real norms

Extension to a more general application domain (Web personalization via ontology-based user profiling)