ETANA-DL NSF Digital Library Project

Page 1: ETANA-DL NSF Digital Library Project

ETANA-DL NSF Digital Library Project

Edward A. Fox, Virginia Tech

ASOR Annual Meeting, 2004

[email protected] http://fox.cs.vt.edu

http://fox.cs.vt.edu/talks/2004/

Page 2: ETANA-DL NSF Digital Library Project

Problems

• Delay in publication of primary archaeological data
• Lack of sustainable solutions to long-term preservation of valuable information
• Lack of services useful to the archaeology community, including “traditional DL services”
• Difficulty in understanding complex archaeological information systems
• Difficulty in requirements elicitation for archaeological systems
• Interoperability among heterogeneous archaeological systems

Page 3: ETANA-DL NSF Digital Library Project

Solution – our approach

Applying and extending Digital Library (DL) techniques to solve the following problems: making primary data available, data preservation, and interoperability

Modeling archaeological information systems using 5S theory to better understand the domain and design the system and the supported services

Rapidly prototyping DLs that handle heterogeneous archaeological data using componentized frameworks: eliciting requirements, providing useful services

Page 4: ETANA-DL NSF Digital Library Project

ETANA-DL

Archaeological Digital Library

Applies and extends the OAI-PMH
• Open Archives Initiative Protocol for Metadata Harvesting

Design considerations
• Componentized
• Distributed architecture
• Extensible
• Portable
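To make the OAI-PMH reference concrete, here is a minimal sketch of how a harvester might build a request URL. The six verbs and the `metadataPrefix` argument come from the OAI-PMH 2.0 protocol; the base URL and the helper name `build_oai_request` are assumptions for illustration, not part of ETANA-DL.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {"Identify", "ListMetadataFormats", "ListSets",
             "ListIdentifiers", "ListRecords", "GetRecord"}


def build_oai_request(base_url, verb, **params):
    """Return the GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = {"verb": verb}
    # Skip optional arguments that were not set.
    query.update({k: v for k, v in params.items() if v is not None})
    return base_url + "?" + urlencode(query)


# Hypothetical base URL; oai_dc is the Dublin Core format that every
# OAI-PMH repository is required to support.
url = build_oai_request("http://example.org/oai", "ListRecords",
                        metadataPrefix="oai_dc")
print(url)
```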

Page 5: ETANA-DL NSF Digital Library Project

Heterogeneous data handling

Site           Artifact type        Original data source      Number of records harvested
Bab edh-Dhra’  Pottery              cp6 database file         786
Lahav          Figurine             Tab-delimited text file   563
Madaba         Locus field record   Tables in Access DB       786
Mozan          Publication          PDF files                 19
Nimrin         Bone field record    Table in Oracle DB        7419
Nimrin         Seed field record    Table in Oracle DB        429
Nimrin         Locus field record   Table in Oracle DB        2101
Umayri         Bone field record    2 tables in Access DB     2122
Total                                                         18404
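As an illustration of handling one of these source formats, the sketch below converts tab-delimited rows into XML records that a harvester could expose. It is only a sketch: the column names, the sample values, and the `<record>` layout are made up, and ETANA-DL's actual converters may work differently.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_xml_records(tsv_text, site, artifact_type):
    """Convert tab-delimited rows (first row = column names) to XML strings."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    records = []
    for row in reader:
        record = ET.Element("record", {"site": site, "type": artifact_type})
        for column, value in row.items():
            # One child element per source column; the column name is the tag.
            ET.SubElement(record, column).text = value
        records.append(ET.tostring(record, encoding="unicode"))
    return records


# Made-up sample data; real field names depend on the site's own file.
sample = "id\tmaterial\n1\tclay\n2\tclay\n"
records = rows_to_xml_records(sample, "Lahav", "Figurine")
print(records[0])
```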

Page 6: ETANA-DL NSF Digital Library Project
Page 7: ETANA-DL NSF Digital Library Project

ETANA-DL Searching Service

Search

Page 8: ETANA-DL NSF Digital Library Project

ETANA-DL Multi-dimensional Browsing

3 new sites

2 new types of artifacts

Page 9: ETANA-DL NSF Digital Library Project

ETANA-DL Visual Browsing Service

Visual Browse

By site

Page 10: ETANA-DL NSF Digital Library Project

Visual Browsing Nimrin: Topographical Drawings

Full site

Northwest quadrant

Square: N40/W20

Page 11: ETANA-DL NSF Digital Library Project

Visual Browsing Nimrin: Square information

Square: N40/W20

Locus: 86

Loci layout

Page 12: ETANA-DL NSF Digital Library Project

Visual Browsing Nimrin: locus sheet

Page 13: ETANA-DL NSF Digital Library Project

Visual Browsing Bab edh-Dhra' Cemetery

Pottery # 25

Page 14: ETANA-DL NSF Digital Library Project

Visual Browsing Bab edh-Dhra' Cemetery

Pottery # 25

Page 15: ETANA-DL NSF Digital Library Project

5S Archaeological DL Modeling

Modeling archaeological information systems using the 5S theory to better understand the domain and design the system and the supported services

Page 16: ETANA-DL NSF Digital Library Project

A Minimal DL in the 5S Framework

[Diagram. 5S constructs: Streams, Structures, Spaces, Scenarios, Societies. Other labels: Structured Stream; hypertext; services; indexing; browsing; searching; Descriptive Metadata Specification; Structural Metadata Specification; Digital Object; Metadata Catalog; Collection; Repository; Minimal DL.]
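The building blocks named in this diagram can be sketched as plain data structures. The code below is an assumption-laden illustration, not the formal 5S definitions: the class and attribute names are hypothetical, and the three services are reduced to the simplest form that still indexes, searches, and browses.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    handle: str                      # unique identifier
    stream: bytes                    # raw content (a stream)
    structural_metadata: dict = field(default_factory=dict)


@dataclass
class MinimalDL:
    collection: dict = field(default_factory=dict)        # handle -> object
    metadata_catalog: dict = field(default_factory=dict)  # handle -> descriptive metadata
    index: dict = field(default_factory=dict)             # term -> set of handles

    def add(self, obj, descriptive_metadata):
        """Store an object and index the words of its descriptive metadata."""
        self.collection[obj.handle] = obj
        self.metadata_catalog[obj.handle] = descriptive_metadata
        for value in descriptive_metadata.values():
            for term in str(value).lower().split():
                self.index.setdefault(term, set()).add(obj.handle)

    def search(self, term):
        """Searching service: handles whose metadata contains the term."""
        return sorted(self.index.get(term.lower(), set()))

    def browse(self, field_name):
        """Browsing service: group handles by the value of one metadata field."""
        groups = {}
        for handle, metadata in self.metadata_catalog.items():
            groups.setdefault(metadata.get(field_name), []).append(handle)
        return groups


dl = MinimalDL()
# Hypothetical handle and metadata values.
dl.add(DigitalObject("nimrin:locus:86", b"scanned locus sheet"),
       {"site": "Nimrin", "type": "locus sheet"})
print(dl.search("nimrin"), dl.browse("site"))
```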

Page 17: ETANA-DL NSF Digital Library Project

A Minimal ArchDL in the 5S Framework

[Diagram. 5S constructs: Streams, Structures, Spaces, Scenarios, Societies. Other labels: Structured Stream; hypertext; services; indexing; browsing; searching; Descriptive Metadata specification; SpaTemOrg; StraDia; Arch Descriptive Metadata specification; ArchDO; ArchObj; ArchColl; Arch Metadata catalog; ArchDColl; ArchDR; Minimal ArchDL.]

Page 18: ETANA-DL NSF Digital Library Project

Modeling ETANA-DL – An Archaeological DL Meta-model

[Diagram. Sub-models: Stream model, Structure model, Space model, Scenario model, Society model. Other labels: Text, Video, Audio, Drawing, Photo, 3D; Metadata; Taxonomies; Temporal; Spatial; Artifact-specific; Region, *Site, *Partition, *Sub-partition, *Locus, *Container, *Artifact; Geographic space; Metric space; User interface; Archaeologist; General public; Service Manager; Services; Information Satisfaction; Value added; Repository building; Domain specific.]

Page 19: ETANA-DL NSF Digital Library Project

Modeling ETANA-DL – ETANA Model

[Diagram. Sub-models: Stream model, Structure model, Space model, Scenario model, Society model. Site labels: Jordan, Umayri, *Field, *Square, *Locus, *Pail, *Bone; Jordan Valley, Nimrin, *Quadrant, *Square, *Locus, *Bag; Southern Israel, Halif, *Field, *Area, *Locus, *Basket, *Seed; *Figurine. Other labels: Taxonomies; Archaeological periods; Bone type; Seed species; Spatial; Field record, locus sheet; Figurine image (photo); Site/field plan (drawing); Preliminary/Final Report (application/pdf); Site-specific coordinate system; Vector space; Web interface; Archaeologist; Generic public; ETANA-DL Service Manager; Services; Searching, Browsing; Annotation, binding; Harvesting, Converting; Object comparison, marking item for analysis.]

Page 20: ETANA-DL NSF Digital Library Project

5SGraph: A DL Modeling Tool

Overall objective of 5SGraph: Help users model their own instances of a digital library (DL) in the 5S language (5SL).

A simple modeling process that enables rapid generation of digital libraries is needed.

• Support non-expert users.
• Speed up the development process.
• Increase the quality of the final product.

Page 21: ETANA-DL NSF Digital Library Project

Goals of 5SGraph

To help digital library designers understand the 5S model quickly and easily

To help digital library designers build their own digital libraries without difficulty

To help digital library designers transform their models into 5SL files automatically

To help digital library designers understand, maintain, and upgrade existing digital library models conveniently

Page 22: ETANA-DL NSF Digital Library Project

5SGraph

How does 5SGraph work?

5SGraph loads and displays a metamodel in a structured toolbox.

The structured editor of 5SGraph provides a top-down visual environment for the DL designer.

5SGraph produces correct 5SL files according to the visual model built by the designer.

Page 23: ETANA-DL NSF Digital Library Project

Overview of 5SGraph

Workspace (instance model)

Structured toolbox (metamodel)

Page 24: ETANA-DL NSF Digital Library Project

Stream Model

Page 25: ETANA-DL NSF Digital Library Project

Structure Model

Page 26: ETANA-DL NSF Digital Library Project

Space Model

Page 27: ETANA-DL NSF Digital Library Project

Scenario Model

Page 28: ETANA-DL NSF Digital Library Project

Society Model

Page 29: ETANA-DL NSF Digital Library Project

Component Reuse

Components can be loaded/saved.
• Load and save sub-trees

Component reuse saves time and effort.
• Full reuse from component pool
• Partial reuse: adapting components

Page 30: ETANA-DL NSF Digital Library Project

Semantic Constraints

There are inherent semantic constraints in the hierarchical structure of the 5S model.

5SGraph maintains the constraints and enforces these constraints over the instance model to ensure correctness.
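One way to picture this kind of constraint enforcement is a metamodel that lists which node types may appear under which parent, checked whenever the designer adds a node. The sketch below makes that idea concrete under stated assumptions: the `ALLOWED_CHILDREN` table and the `Node` class are hypothetical and far smaller than a real 5S metamodel, and 5SGraph's own implementation is not shown in the slides.

```python
# Hypothetical metamodel fragment: allowed child types per parent type.
ALLOWED_CHILDREN = {
    "DigitalLibrary": {"StreamModel", "StructureModel", "SpaceModel",
                       "ScenarioModel", "SocietyModel"},
    "StreamModel": {"Text", "Image"},
    "SocietyModel": {"Actor", "ServiceManager"},
}


class Node:
    """One node of the instance model being built by the designer."""

    def __init__(self, node_type, name):
        self.node_type = node_type
        self.name = name
        self.children = []

    def add_child(self, child):
        """Attach a child only if the metamodel allows it under this node."""
        allowed = ALLOWED_CHILDREN.get(self.node_type, set())
        if child.node_type not in allowed:
            raise ValueError(
                f"{child.node_type} is not allowed under {self.node_type}")
        self.children.append(child)
        return child


dl = Node("DigitalLibrary", "MyArchDL")
streams = dl.add_child(Node("StreamModel", "streams"))
streams.add_child(Node("Image", "figurine photo"))
```

Rejecting an invalid node at insertion time means the instance model is valid at every step, so a file generated from it never needs a separate validation pass.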

Page 31: ETANA-DL NSF Digital Library Project

The World According to OAI

[Diagram. Labels: Discovery; Current Awareness; Preservation; Service Providers; Data Providers; Metadata harvesting.]

Page 32: ETANA-DL NSF Digital Library Project

Data and Service Providers

Data Providers possess metadata and share it (internally / externally) via well-defined OAI protocols (e.g., database servers)

Service Providers harvest metadata from Data Providers and provide higher-level services to users (e.g., search engines)

Who will fit where in ETANA-DL?
• Data Provider – YOUR PROJECT
• Service Provider – ETANA-DL
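The division of labor can be sketched from the service provider's side: parse a data provider's `ListRecords` response, then offer a simple search over what was harvested. The XML namespaces are the standard OAI-PMH and Dublin Core ones; the sample response, its identifier, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"oai": "http://www.openarchives.org/OAI/2.0/",
      "dc": "http://purl.org/dc/elements/1.1/"}

# Made-up, abbreviated ListRecords response from a hypothetical data provider.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:pottery-25</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Pottery 25</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def harvest(response_xml):
    """Return {identifier: title} for each record in a ListRecords response."""
    root = ET.fromstring(response_xml.strip())
    harvested = {}
    for record in root.findall("oai:ListRecords/oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        harvested[identifier] = title
    return harvested


def search(harvested, term):
    """A higher-level service: case-insensitive title search."""
    return sorted(i for i, t in harvested.items() if term.lower() in t.lower())


catalog = harvest(SAMPLE_RESPONSE)
print(search(catalog, "pottery"))
```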

Page 33: ETANA-DL NSF Digital Library Project

Why be an OAI Provider

Speed up publication

Long-term preservation

Do not need to worry about providing services

Page 34: ETANA-DL NSF Digital Library Project

How to be an OAI Provider

Requirements
• Perl
• Web server with ability to run CGI scripts

Download OAI-XMLFile-2.1.tar.gz from http://www.dlib.vt.edu/projects/OAI/software/xmlfile/xmlfile.html

Extract the files into a directory from which CGI scripts may be run
• gunzip OAI-XMLFile-2.1.tar.gz
• tar -xvf OAI-XMLFile-2.1.tar

Page 35: ETANA-DL NSF Digital Library Project

How to be an OAI Provider (Cont.)

Want your pottery collection to be an OAI data provider?

Create a directory “mySitePottery” under ‘OAI-XMLFile-2.1/XMLFile’
• Copy the contents of the test5 directory to the “mySitePottery” directory

Modify the config.xml under ‘OAI-XMLFile-2.1/XMLFile/mySitePottery’
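The directory setup above could also be scripted. The sketch below is one possible way to do it in Python; the helper name is hypothetical, and the demo runs in a throwaway directory that stands in for ‘OAI-XMLFile-2.1/XMLFile’ rather than touching a real installation.

```python
import shutil
import tempfile
from pathlib import Path


def create_collection_dir(xmlfile_root, name="mySitePottery", template="test5"):
    """Copy the template directory to a new collection directory.

    Returns the path of the config.xml that still has to be edited.
    """
    root = Path(xmlfile_root)
    target = root / name
    # copytree creates the target and raises if it already exists,
    # so an existing collection is never overwritten.
    shutil.copytree(root / template, target)
    return target / "config.xml"


# Demo in a temporary directory standing in for OAI-XMLFile-2.1/XMLFile.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "test5").mkdir()
(demo_root / "test5" / "config.xml").write_text("<xmlfile/>")  # stand-in file
config_path = create_collection_dir(demo_root)
print(config_path)
```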

Page 36: ETANA-DL NSF Digital Library Project

<?xml version="1.0" ?>
…
<repositoryName> pottery repository name </repositoryName>
<adminEmail> YourAdmin@yourServer </adminEmail>
<archiveId> pottery Archive ID </archiveId>
<recordlimit>500</recordlimit>
<datadir> directory of pottery XML collection </datadir>
…
<metadata>
  <prefix> prefix of pottery repository </prefix>
  <namespace> namespace of your schema </namespace>
  <schema> location of your XML file schema </schema>
</metadata>
</xmlfile>
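Filling in these placeholders could be automated along the following lines. This is a sketch under assumptions: the sample document contains only some of the elements shown on the slide (the elided parts of the real config.xml are unknown), the helper name is hypothetical, and all values are invented examples.

```python
import xml.etree.ElementTree as ET

# Stand-in for config.xml with only elements shown on the slide;
# the real file has further content that is elided there.
SAMPLE_CONFIG = """<xmlfile>
  <repositoryName>pottery repository name</repositoryName>
  <adminEmail>YourAdmin@yourServer</adminEmail>
  <recordlimit>500</recordlimit>
  <metadata>
    <prefix>prefix of pottery repository</prefix>
  </metadata>
</xmlfile>"""


def fill_config(config_xml, values):
    """Set the text of the first element with each given tag name."""
    root = ET.fromstring(config_xml)
    for tag, text in values.items():
        # Search at any depth, so nesting does not need to be known.
        element = next(root.iter(tag), None)
        if element is None:
            raise KeyError(f"no <{tag}> element in config")
        element.text = text
    return ET.tostring(root, encoding="unicode")


# Invented example values.
filled = fill_config(SAMPLE_CONFIG, {
    "repositoryName": "My Site Pottery",
    "adminEmail": "admin@example.org",
    "prefix": "oai_dc",
})
print(filled)
```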

Page 37: ETANA-DL NSF Digital Library Project

Apply the 5S Framework in Integrating Archaeological DLs

Architecture of a Union DL

Union Catalog Integration

Union Services Automation

Page 38: ETANA-DL NSF Digital Library Project

Architecture of a Union DL

[Diagram. Labels: DL1, DL2, Union DL; Repository1, Repository2, Union Repository; Catalog1, Catalog2, Union Catalog; Society (archaeologists), Society (General Public), Union Society (Archaeologists, General Public); Service (Searching), Service (Browsing), Union Service (Harvesting, Mapping, Searching, Browsing, Clustering, Visualization).]

Page 39: ETANA-DL NSF Digital Library Project

Union Catalog Integration

[Diagram. Labels: Virtual Nimrin (VN); VN Catalog; VN Metadata Format; Halif DigMaster (HD); HD Catalog; HD Metadata Format; Wrapper (two instances); Mapping Tool (two instances); Global Metadata Format; Union Catalog.]
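The integration step can be pictured as applying one field mapping per source catalog and collecting the results in a single union catalog. In the sketch below every field name and both mapping tables are hypothetical; the real VN, HD, and global metadata formats are not given in the slides.

```python
# Hypothetical local-to-global field mappings, one per source DL.
VN_TO_GLOBAL = {"LOCUS_NO": "locus", "SQ": "square"}
HD_TO_GLOBAL = {"FigurineID": "identifier", "Area": "area"}


def to_global(record, mapping, source):
    """Rename a local record's fields to the global format.

    Fields without a mapping are dropped; the source DL is recorded.
    """
    unified = {"source": source}
    for local_field, value in record.items():
        if local_field in mapping:
            unified[mapping[local_field]] = value
    return unified


# Made-up sample records, one from each source catalog.
union_catalog = [
    to_global({"LOCUS_NO": "86", "SQ": "N40/W20"}, VN_TO_GLOBAL, "VN"),
    to_global({"FigurineID": "F1", "Area": "A", "Notes": "n/a"},
              HD_TO_GLOBAL, "HD"),
]
print(union_catalog)
```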

Page 40: ETANA-DL NSF Digital Library Project

Visual Mapping Tool Architecture

[Diagram. Labels: Visualizing Components; Composite Mapper; Mapper1, Mapper2, Mapper3, Mapper4.]

Page 41: ETANA-DL NSF Digital Library Project
Page 42: ETANA-DL NSF Digital Library Project

local schema global schema

Page 43: ETANA-DL NSF Digital Library Project

Mapping recommendation

Page 44: ETANA-DL NSF Digital Library Project

Mapping confirmation

Mapping history

Page 45: ETANA-DL NSF Digital Library Project

No recommendation for “Tomb_Area”

Page 46: ETANA-DL NSF Digital Library Project

User-decided mapping
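The recommend, confirm, and user-decide workflow shown on these slides can be imitated with simple name similarity. Note that this swaps in a stand-in technique: the slides do not say how the mapping tool computes its recommendations, and the global field names, the local field "Locus_No", and the cutoff used here are assumptions.

```python
import difflib


def recommend(local_fields, global_fields, cutoff=0.6):
    """Suggest a global field for each local field by name similarity.

    Returns None where nothing is similar enough, leaving the decision
    to the user.
    """
    lowered = {g.lower(): g for g in global_fields}
    recommendations = {}
    for local in local_fields:
        matches = difflib.get_close_matches(
            local.lower(), list(lowered), n=1, cutoff=cutoff)
        recommendations[local] = lowered[matches[0]] if matches else None
    return recommendations


# Hypothetical global schema; "Tomb_Area" is the field named on the slide.
recs = recommend(["Locus_No", "Tomb_Area"], ["locus", "square", "site"])

# Fields without a recommendation are left for the user to map by hand.
needs_user_decision = [name for name, target in recs.items() if target is None]
print(recs, needs_user_decision)
```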

Page 47: ETANA-DL NSF Digital Library Project

[Diagram: 5SSuite. Labels: 5S MetaModel; DL Expert; DL Designer; 5SGraph; 5SL DL Model; 5SLGen; component pool (ODLSearch, ODLBrowse, ODLRate, ODLReview, …); Tailored DL Services; Practitioner; Researcher; Teacher; Mapping Tool; 5SGen. Phases: Requirements (1), Analysis (2), Design (3), Implementation (4).]

Page 48: ETANA-DL NSF Digital Library Project

[Diagram. Labels: 5SGraph; 5S Archaeology MetaModel; ArchDL Expert; ArchDL Designer; Structure Sub-model; Scenario Sub-model; Harvesting description; Mapping description; Browsing description; 5SGen; Component Pool; VN Metadata Format; HD Metadata Format; ETANA-DL Metadata Format; Mapping Tool; Wrapper4VN; Wrapper4HD; XOAI (two instances); VNCatalog (two instances); UnionCatalog; Index (two instances); Inverted Files; Services DB; Browse DB; Browse Service; Search Service; Other ETANA-DL Services; Web Interface; Browsing…]