Information quality in the context of CRIS and CERIF Maximilian Stempfhuber GESIS-IZ Social Science Information Centre Bonn, Germany CRIS 2008, June 5-7, Maribor


Information quality in the context of CRIS and CERIF

Maximilian Stempfhuber
GESIS-IZ Social Science Information Centre
Bonn, Germany

CRIS 2008, June 5-7, Maribor


05.06.2008

Agenda

• Information quality (IQ) and CRIS: Why bother?
• IQ in the context of (euro)CRIS
  – Code of Good Practice (CGP)
  – IQ coverage at CRIS 2002 to CRIS 2008
• IQ research: An overview
• IQ and CRIS: Towards better integration
• Conclusions


Information quality (IQ) and CRIS: Why bother?

• CRIS vs. library catalogues, repositories, websites etc.: Any difference concerning authority, completeness, correctness?
• Are CRISs meant to be of a certain quality? Which one?
• Does the quality of a CRIS's contents influence its use?
• Are all CRISs the same (concerning quality)?
• Networked CRISs: Just add up individual quality?
• Would CRISs and the CRIS community benefit from a more explicit, comparable model of quality?


IQ in the context of (euro)CRIS – Code of Good Practice

The CGP view of (information) quality: “fit for purpose”

“To ensure the continued use of a CRIS, it is necessary to provide additional value or benefits to both users and contributors to the system. This may be achieved by adhering to a quality plan which defines the accuracy, timeliness, data completeness, presentation of data to the end user, and the functionality offered by the search software.”
CGP V3.0, page 14


IQ in the context of (euro)CRIS – Code of Good Practice

[Figure: CGP process diagram with three columns (Responsibility, Process, Controls), Accept/Reject decisions between stages, and an ongoing process review]

Process stages: Concept, RIS Proposal, RIS Design, Information Processing, Publishing/Distribution, Maintenance

Responsibilities named in the diagram: Anyone, Project Manager, RTD-Promoter, Steering Group, Development Function, Publishing Function, Collection Function, Production Function, Marketing Function

Controls:
• RIS Proposal: Definition of Purpose, Identification of Users, Definition of Content
• Marketing Plan: Market Analysis, Cost-benefit analysis / economic model
• RIS Design Plan: Database Specification, Structure and Presentation, Classification and Indexing, Search and Navigation
• RIS Information Processing Plan: Data Collection Plan, Collection Guidelines, Quality Control Plan, Acceptance Test Plan
• Distribution Plan, Marketing Plan (revisited), Economic model (implementation)
• Maintenance Plan / Acceptance

Acceptance points: Concept Acceptance, RIS Proposal Acceptance, RIS Development Acceptance, Information Processing Acceptance, Structured Output

Currently waterfall-like approach (one "big" cycle)


IQ in the context of (euro)CRIS – Code of Good Practice

Quality control (table columns: Essential, Important, Nice to have):

Ensure consistent structure of information

Ensure content is fit for purpose

Availability of usage statistics

Define quality tolerances with provider

Define traffic rules with provider (timescales for receipt and publication)

Ensure consistent presentation of information

Define consequences of not complying versus benefits of complying

Define standard terms, organisation names

Define mandatory fields/kernel information

Establish error correction procedures

Define sampling plan

Establish version control procedures
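Two of the checklist items, mandatory fields and a sampling plan, can be illustrated with a small sketch. This is not taken from the CGP: the field names (`title`, `organisation`, `start_date`), the 10% sampling fraction and both function names are hypothetical.

```python
import random

# Hypothetical kernel information; a real CRIS would define its own list.
MANDATORY_FIELDS = ("title", "organisation", "start_date")


def missing_mandatory(record, mandatory=MANDATORY_FIELDS):
    """Return the mandatory fields that are absent or empty in a record."""
    return [f for f in mandatory if not str(record.get(f) or "").strip()]


def draw_sample(records, fraction=0.1, seed=0):
    """Pick a reproducible random sample of records for manual checking."""
    if not records:
        return []
    size = max(1, round(len(records) * fraction))
    return random.Random(seed).sample(records, size)


records = [
    {"title": "Project A", "organisation": "GESIS", "start_date": "2008-01-01"},
    {"title": "Project B", "organisation": "", "start_date": "2007-05-01"},
    {"title": " ", "start_date": None},
]

for rec in records:
    print(rec.get("title"), "->", missing_mandatory(rec))
```

A fixed seed keeps the sample reproducible, so that a later review can refer to the same records.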


IQ in the context of (euro)CRIS – Code of Good Practice / CERIF

Discussion
• Waterfall model might not be adequate for the complex scenarios in or for which CRISs are designed
• “Fit for purpose” is only one (specific) view of quality
• General problem: Abstract models (CGP) are hard to translate into actual systems in a deterministic way
• General problem: Detailed specifications (CERIF) do not guarantee systems that meet users’ demands


Alternative models: Spiral model


Alternative models: Model Driven Architecture (MDA)

Basic idea: Separation of concerns (specification from implementation)

Provide a platform-independent model (PIM) for CRISs and transformations to generate platform-specific models (PSM)
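The PIM-to-PSM idea can be sketched in a few lines. The example below is an illustration only and is not drawn from CERIF or any MDA tool: the `Entity` and `Attribute` classes, the abstract type names and the mapping to generic SQL are all assumptions, with a `CREATE TABLE` statement standing in for one possible platform-specific model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    kind: str  # abstract type: "text", "date" or "integer"
    required: bool = False


@dataclass(frozen=True)
class Entity:
    name: str
    attributes: tuple


# Hypothetical mapping from abstract types to one platform (generic SQL).
SQL_TYPES = {"text": "VARCHAR(255)", "date": "DATE", "integer": "INTEGER"}


def to_sql(entity):
    """Transform a platform-independent entity into a CREATE TABLE statement."""
    columns = []
    for attr in entity.attributes:
        null = " NOT NULL" if attr.required else ""
        columns.append(f"  {attr.name} {SQL_TYPES[attr.kind]}{null}")
    return f"CREATE TABLE {entity.name} (\n" + ",\n".join(columns) + "\n);"


# Platform-independent description of a (hypothetical) project entity.
project = Entity(
    name="project",
    attributes=(
        Attribute("title", "text", required=True),
        Attribute("start_date", "date"),
    ),
)

print(to_sql(project))
```

A second transformation, for example to an XML schema, could reuse the same `Entity` description, which is the point of separating specification from implementation.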


IQ in the context of (euro)CRIS – CRIS 2002 to 2008 topics

Summing up 16 of 69 papers from CRIS proceedings:

• Information quality: To improve aspects such as correctness, CRISs use authoritative registers, controlled vocabularies, persistent identifiers, and automatic checking of values and structure. In addition, experts enrich the data through intellectual processes to make it more useful or trustworthy. Semantic Web technologies are suggested to improve the completeness of data (also across individual CRISs).

• Data integration: This becomes an issue as soon as data is exchanged or individual CRISs are networked. Methods employed are the certification of information systems, checking of data structures and values against formal requirements, mapping between vocabularies, and automatic and intellectual de-duplication.
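Vocabulary mapping and automatic de-duplication can be sketched as follows. This is a minimal illustration under stated assumptions, not a method from any of the surveyed papers: the vocabulary entries, the record layout and the title-based matching key are hypothetical, and the automatic step only proposes duplicate candidates for later intellectual review.

```python
import re
import unicodedata

# Hypothetical mapping from a local vocabulary to a shared one.
VOCABULARY_MAP = {"soc. sci.": "Social Sciences", "sociology": "Social Sciences"}


def map_term(term):
    """Map a local subject term to the shared vocabulary, if a mapping exists."""
    return VOCABULARY_MAP.get(term.strip().lower(), term)


def match_key(title):
    """Build a comparison key: strip accents, case, punctuation and extra spaces."""
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", " ", ascii_only.lower()).strip()


def deduplicate(records):
    """Keep the first record per key; return the rest as candidates for review."""
    seen, unique, candidates = {}, [], []
    for rec in records:
        key = match_key(rec["title"])
        if key in seen:
            candidates.append((seen[key], rec))
        else:
            seen[key] = rec
            unique.append(rec)
    return unique, candidates


records = [
    {"id": 1, "title": "Information Quality in CRIS"},
    {"id": 2, "title": "information quality in CRIS."},
    {"id": 3, "title": "Qualité de l'information"},
]
unique, candidates = deduplicate(records)
print([r["id"] for r in unique], [(a["id"], b["id"]) for a, b in candidates])
```

Exact matching on a normalised title is deliberately crude; it misses near-duplicates with different wording, which is where expert review remains necessary.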


IQ in the context of (euro)CRIS – CRIS 2002 to 2008 topics

Summing up 16 of 69 papers from CRIS proceedings (cont.):

• Quality as a process: Checking data against quality criteria as soon as it is created, using existing data to verify new data, and enabling feedback loops from users of the data to incrementally improve overall data quality.

• Personalization: Better matching of CRIS features (e.g. amount and level of detail of data, presentation of information, availability of features) to the specific demands of individual users or well-defined user groups.

• For a community of practice, there is not much concerning IQ that can be shared (practices, tools etc.)
• Individual results are not generalized and are hard to apply (how?)
• IQ does not have the same coverage as data structures, data exchange etc.


What is (information) quality?

• Degree to which a set of inherent characteristics fulfills requirements (ISO 9000)

• Conformance to requirements (Philip B. Crosby)

• "Fitness for use". Fitness is defined by the customer. (Joseph M. Juran)

• Quality has two dimensions: "must-be quality" and "attractive quality" (Noriaki Kano)


What is (information) quality? (cont.)

IQ or data quality denotes the degree of relevance of information in relation to a specific context and information need:

• Requirements may be user specific or very general

• Total of all requirements placed on information or information products ([information] process oriented view)

• Information that is fit for use by information consumers (user oriented view)


IQ research: An overview

Alternative views of IQ (Harvey 1995)
• fit for purpose
• exceptional view (quality as something special)
• perfection (quality as a consistent or flawless outcome)
• value for money (quality in terms of return on investment)
• transformation (quality in terms of change from one state to another)

Question: Which views could contribute to or support our approach to building and promoting (the quality of) CRISs?


IQ research: A framework for IQ assessment

Ge & Helfert 2007

[Figure: IQ assessment framework]

Example IQ problems named in the framework:
• Spelling errors, incorrect values, outdated data
• Violation of domain constraints, company or government regulations
• Inaccessible or insecure information, difficult to aggregate / transform
• Information not based on facts, impartial view, hard to understand

Example IQ dimensions: Accuracy, Timeliness, Completeness


IQ research: PSP/IQ as an example for an IQ model

Product and Service Performance model for Information Quality (PSP/IQ), Kahn et al. 2002

[Figure: PSP/IQ model with four quadrants]
• Information meets standards of accuracy, completeness, and free-from-error
• Indicates a process by which information consumers regularly receive information in a timely manner
• Information product must be useful and relevant to the user’s needs
• Information consumers can easily obtain and manipulate information that adds value to their task


IQ and CRIS: Towards better integration

IQ and CRIS – what is missing?
• A model (see PSP/IQ) expressing a common approach to IQ, shared by the (euro)CRIS community
• Sets of well-defined IQ dimensions, matched to the model and to users’ / CRIS providers’ needs (there are over 180 already defined)
• Common IQ metrics, connected to IQ dimensions and applicable to “real” CRISs
• A shared understanding of how IQ dimensions influence each other
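To make "IQ metrics connected to IQ dimensions" concrete, here is a minimal sketch of two metrics, one for completeness and one for timeliness. Both formulas are simple forms chosen for illustration and are not prescribed by the talk; the field names and the `shelf_life_days` parameter are hypothetical.

```python
from datetime import date


def completeness(record, fields):
    """Share of the given fields that carry a non-empty value (0.0 to 1.0)."""
    filled = sum(1 for f in fields if record.get(f) not in (None, ""))
    return filled / len(fields)


def timeliness(last_updated, today, shelf_life_days):
    """1.0 for data updated today, falling linearly to 0.0 at the shelf life."""
    age_days = (today - last_updated).days
    return max(0.0, min(1.0, 1.0 - age_days / shelf_life_days))


record = {"title": "Project A", "abstract": "", "funder": "DFG", "end_date": None}
fields = ("title", "abstract", "funder", "end_date")

print(completeness(record, fields))
print(timeliness(date(2008, 3, 7), date(2008, 6, 5), shelf_life_days=360))
```

Scaling every metric to the range 0.0 to 1.0 is one way to make values comparable across dimensions and across CRISs; how long data stays current would have to be agreed per type of information.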


IQ and CRIS: Towards better integration

IQ and CRIS – what is missing? (cont.)
• A set of transformations for transferring the CRIS IQ model to an individual CRIS (see MDA)
• Standardized ways of assessing IQ (measuring and creating IQ metrics)
• Tested methods for improving IQ
• Formal ways of expressing IQ dimensions at the record / attribute level and at the CRIS level
• Agreed ways of using information about IQ
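One possible way of expressing IQ at the attribute, record and CRIS level is sketched below. The structure is an assumption made for illustration: the `QualifiedRecord` class, the score range of 0.0 to 1.0 and the use of a plain mean for aggregation are hypothetical, and a real model would need an agreed aggregation rule.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class QualifiedRecord:
    """A record whose attributes each carry IQ scores per dimension."""
    values: dict
    # attribute name -> {dimension name -> score between 0.0 and 1.0}
    iq: dict = field(default_factory=dict)

    def record_score(self, dimension):
        """Record level: mean of the attribute scores for one dimension."""
        scores = [dims[dimension] for dims in self.iq.values() if dimension in dims]
        return mean(scores) if scores else None


def cris_score(records, dimension):
    """CRIS level: mean of the record scores that are available."""
    scores = [s for s in (r.record_score(dimension) for r in records) if s is not None]
    return mean(scores) if scores else None


a = QualifiedRecord(
    values={"title": "Project A", "funder": "DFG"},
    iq={"title": {"accuracy": 1.0}, "funder": {"accuracy": 0.5}},
)
b = QualifiedRecord(values={"title": "Project B"}, iq={"title": {"accuracy": 0.25}})

print(a.record_score("accuracy"), cris_score([a, b], "accuracy"))
```

Returning `None` where no score exists keeps "not assessed" distinct from "assessed as poor", which matters once scores from several networked CRISs are combined.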


IQ and CRIS: What can we gain?

• Better and more comprehensible service to users
• Procedures for improving a CRIS during its lifetime
• Promotion of CRISs to external users for critical tasks (research evaluation, strategic planning etc.)
• IQ as an incentive for researchers (providing information), sponsors and users of CRISs
• Higher IQ for networked CRISs


Conclusions

• IQ research offers formal models for assessing and improving IQ

• IQ in the context of (euro)CRIS does not currently play the same role as data structures and semantics (CERIF)

• Formalizing IQ in the context of CRIS is a precondition for making CRISs a reliable source of information

• The quality of networked CRISs at the ERA level depends on assessing, preserving and improving IQ


Thank You!

Dr. Maximilian Stempfhuber
GESIS-IZ Social Science Information Centre
Lennéstr. 30, 53113 Bonn, [email protected]