eXtensible Catalog Software Portfolio


Page 1: eXtensible Catalog Software Portfolio

eXtensible Catalog

Software Portfolio

Ben Anderson, Software Engineer, XCO

Page 2: eXtensible Catalog Software Portfolio

2

eXtensible Catalog is…

eXtensible Catalog (XC) is open source, user-centered, next generation software for libraries.

XC provides a discovery interface and a set of tools for libraries to manage metadata and build applications.

Page 3: eXtensible Catalog Software Portfolio

3

Software Overview

• User Interface:
  – Faceted, FRBRized, customizable search interface
  – Web application framework for libraries

• Metadata Tools:
  – Automated metadata processing: enable libraries to aggregate metadata and run services on it

• XC Schema:
  – New XML schema with Dublin Core terms, RDA elements and roles, MARC vocabularies, and XC elements
  – FRBR levels +: Work, Expression, Manifestation, Holdings, Item

• Connectivity Tools:
  – Harvest and synchronize metadata with OAI-PMH
  – Circulation and authentication with NCIP

[Diagram: XC software components. Labels: User Interface (XC Drupal Toolkit, Drupal CMS); Metadata Tools (XC Metadata Services Toolkit); Connectivity Tools (XC OAI Toolkit, XC NCIP Toolkit, ILS); XC Schema (Metadata Application Profile)]

Page 4: eXtensible Catalog Software Portfolio

4

Partners and Contributors

• University of Rochester
• The Andrew W. Mellon Foundation
• Consortium of Academic and Research Libraries in Illinois (CARLI)
• University of Notre Dame
• Rochester Institute of Technology
• Kyushu University working with NTT-Data
• University of North Carolina at Charlotte
• Serials Solutions
• OCLC
• University at Buffalo
• Cornell University
• Yale University
• Ohio State University
• Nylink

Page 6: eXtensible Catalog Software Portfolio

6

Note from Takanori HAYASHI

Hi, folks,

As all know, great earthquake and tsunami attack East Japan Area (Also my library).
Many libraries has earthquake destruct.
Now, Code4Lib Japan members and other librarians manage and distribute of destruct information about any libraries.
http://www45.atwiki.jp/savelibrary/pages/1.html

NHK is broadcasting Japanese information by Ustream
http://www.ustream.tv/channel/nhk-world-tv
Google provide Crisis response service, if your friend in Japan status is unknown, I recommend use Person Finder and Disaster message boards provide by telephone careers.
http://www.google.co.jp/intl/en/crisisresponse/japanquake2011.html

And send response to Japanese librarian by twitter, using hachtag #jishinlib

Thank you for very kindness message from all member of c4l by twitter or other, and sending rescue team form USA and other countries.

Please don't worry, hacker is never die :-)

--
Takanori HAYASHI
Agriculture, Forestry and Fisheries Research Information Technology Center

Page 7: eXtensible Catalog Software Portfolio

eXtensible Catalog

User Interface

XC Drupal Toolkit

Page 8: eXtensible Catalog Software Portfolio

8

XC Drupal Toolkit

• Discovery interface and library web application platform in one
• Faceted, FRBRized search of XC Schema metadata
• Extensive, easy customization
• Established open source community
• Data-driven web applications with web forms

[Diagram: User Interface, consisting of the XC Drupal Toolkit and Drupal CMS]

Page 16: eXtensible Catalog Software Portfolio

16

The XC Schema combines metadata fields from multiple standard schemas (RDA and DC) and adds new XC schema elements.

Page 18: eXtensible Catalog Software Portfolio

18

XC Schema Fields

• New XML metadata schema
  – Dublin Core terms
  – RDA: 22 elements and 11 role designators
  – XC elements (contain MARC vocabularies, linking fields, etc.)

• Subset of RDA chosen to retain the granularity in current MARC data:
  – Frequency
  – Numbering of Serials
  – Coordinates of Cartographic Content
  – Plate number (music)

[Diagram: the XC Schema draws on DCMI, RDA, and XC elements]

Page 19: eXtensible Catalog Software Portfolio

19

XC Schema Record Types

• FRBR Group 1 Entities +
  – Work
  – Expression
  – Manifestation
  – Holdings
  – Item

Page 20: eXtensible Catalog Software Portfolio

20

Creating XC Schema data from MARC

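• Parse MARCXML records into linked FRBR-based records

• MARC Holdings records produce XC Holdings records (to preserve MARC granularity)

• All XC records have globally unique identifiers and a permanent host repository

• Uplinks are created between the records

[Diagram: a MARCXML Bibliographic record produces XC Work, XC Expression, and XC Manifestation records; a MARCXML Holdings record produces an XC Holdings record. Uplinks: Holdings to Manifestation ("Manifestation Held", from the MARC 004 "uplink" field), Manifestation to Expression ("Expression Manifested"), Expression to Work ("Work Expressed")]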
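
The linked records on this slide can be pictured with a small Python sketch. It is only an illustration: the class name `XCRecord`, its fields, the `frbrize` helper, and the identifier format are hypothetical and are not taken from the XC Schema or the XC software.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List

# Hypothetical ID source; real XC identifiers are globally unique.
_ids = count(1)


@dataclass
class XCRecord:
    level: str        # "work", "expression", "manifestation", or "holdings"
    source_id: str    # control number of the MARC record this came from
    uplinks: List[str] = field(default_factory=list)  # IDs of records one level up
    record_id: str = ""

    def __post_init__(self):
        # Assign an identifier if the caller did not supply one.
        if not self.record_id:
            self.record_id = f"xc:{self.level}:{next(_ids)}"


def frbrize(bib_id: str, holdings_ids: List[str]) -> List[XCRecord]:
    """Split one bibliographic record (plus its holdings) into linked records."""
    work = XCRecord("work", bib_id)
    expression = XCRecord("expression", bib_id, uplinks=[work.record_id])
    manifestation = XCRecord("manifestation", bib_id, uplinks=[expression.record_id])
    holdings = [
        XCRecord("holdings", h, uplinks=[manifestation.record_id])
        for h in holdings_ids
    ]
    return [work, expression, manifestation, *holdings]


for record in frbrize("bib-001", ["hold-001", "hold-002"]):
    print(record.record_id, "->", record.uplinks)
```

Each record points only upward (Holdings to Manifestation, Manifestation to Expression, Expression to Work), which mirrors the uplink arrows in the diagram.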

Page 98: eXtensible Catalog Software Portfolio

98

New Default Theme

Page 102: eXtensible Catalog Software Portfolio

102

Themes: Kyushu University Library

Page 103: eXtensible Catalog Software Portfolio

103

Kyushu - Search results in Japanese

Reasons why these items are shown

Query : America Japan

Translated : Faceted navigation

Page 104: eXtensible Catalog Software Portfolio

104

Anyone in WNYLRC using Drupal?

Page 105: eXtensible Catalog Software Portfolio

105

YES!

• Cattaraugus-Allegany-Erie-Wyoming BOCES, SLS
• Jamestown Community College (Jamestown)
• Jamestown Community College (Olean)
• Roswell Park Cancer Institute Corp.

+ others that are cloaking or using Drupal somewhere other than the main website

Page 106: eXtensible Catalog Software Portfolio

Metadata

Metadata Issues and XC Metadata Management Tools

Page 107: eXtensible Catalog Software Portfolio

Metadata Issues: Silos

• Users have many starting points for search because not all the data is available in a single system:
  – Integrated Library Systems
  – Institutional Repositories
  – Webpages
  – Subscription Databases

• Libraries don’t have good options for searching across all of these sources

[Graphic: the four metadata issues (Usability, Silo, Quality, Format), with Silo highlighted]

Page 108: eXtensible Catalog Software Portfolio

Metadata Issues: Quality

• ILS MARC export issues
• Cataloging errors and variant practices
• End-user generated metadata
• Lack of authority control
• Libraries don’t have good options for making use of data at a range of quality levels

[Graphic: the four metadata issues (Usability, Silo, Quality, Format), with Quality highlighted]

Page 109: eXtensible Catalog Software Portfolio

Metadata Issues: Format

• MARC format is everywhere but does not support current metadata needs

• Multiple formats are useful to describe a range of resources, but difficult to search across consistently

• Libraries don’t have good options to try out new standards like RDA

[Graphic: the four metadata issues (Usability, Silo, Quality, Format), with Format highlighted]

Page 110: eXtensible Catalog Software Portfolio

Metadata Issues: Usability

• ILS OPAC interfaces are deficient in:
  – Ease of learning / ease of use
  – Precision and recall
  – Finding similar and related resources

[Graphic: the four metadata issues (Usability, Silo, Quality, Format), with Usability highlighted]

Page 111: eXtensible Catalog Software Portfolio

111

Following one MARC record through XC

Steps:
1. Convert from raw MARC to MARCXML (minor cleanup)
2. Normalize MARCXML (major cleanup)
3. Transform from MARCXML to XC (FRBRize)
4. Aggregate at each FRBR level (match and merge)
5. Index records / create WEMs (one for each unique Manifestation)

[Diagram: MARC, then MARCXML (dirty), then MARCXML (clean), then XC Work (W), Expression (E), and Manifestation (M) records, which are matched and merged with other XC records (boxes labeled MMW, MME, MMM), then combined into WEMs and indexed. Data is then ready for search and faceted browse.]
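
The five steps can be read as a simple chain of functions, each taking the output of the one before. The Python sketch below shows only that ordering. The function bodies are placeholders, and the dictionary keys (`format`, `clean`, `levels`, `merged`, `indexed`, `history`) are hypothetical, not part of any XC toolkit.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def convert(rec: Record) -> Record:
    # 1. Raw MARC to MARCXML (minor cleanup).
    return {**rec, "format": "marcxml", "clean": False}


def normalize(rec: Record) -> Record:
    # 2. Major cleanup of the MARCXML.
    return {**rec, "clean": True}


def transform(rec: Record) -> Record:
    # 3. FRBRize: one input record yields Work, Expression, Manifestation.
    return {**rec, "format": "xc", "levels": ["work", "expression", "manifestation"]}


def aggregate(rec: Record) -> Record:
    # 4. Match and merge at each FRBR level.
    return {**rec, "merged": True}


def index(rec: Record) -> Record:
    # 5. Build the WEM and index it for search.
    return {**rec, "indexed": True}


PIPELINE: List[Callable[[Record], Record]] = [convert, normalize, transform, aggregate, index]


def run_pipeline(rec: Record) -> Record:
    """Apply every step in order, recording which steps have run."""
    for step in PIPELINE:
        rec = step(rec)
        rec["history"] = rec.get("history", []) + [step.__name__]
    return rec


result = run_pipeline({"id": "bib-001", "format": "marc"})
print(result["history"])
```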

Page 112: eXtensible Catalog Software Portfolio

112

XC Software Components

[Diagram: the five steps mapped to software components. 1. Convert: XC OAI Toolkit (connected to the ILS). 2. Normalize, 3. Transform, 4. Aggregate: XC Metadata Services Toolkit (MARCXML Normalization, MARCXML to XC Transformation, XC Aggregation). 5. Index: XC Drupal Toolkit (on Drupal CMS). A "Metadata Issue Handling" row marks which of the Usability, Silo, Quality, and Format issues each step addresses.]

Page 113: eXtensible Catalog Software Portfolio

113

XC OAI Toolkit

[Diagram: XC Software Components. The XC OAI Toolkit connects to the ILS and performs step 1. Convert; issue labels shown with this step: Format, Silo, Quality]

Expose ILS metadata to XC’s next generation catalog interface and metadata tools

Synchronize ongoing changes in ILS records with XC software automatically

Convert raw MARC into MARCXML

Address data and identifier issues

Compatible with most ILSs

Page 114: eXtensible Catalog Software Portfolio

114

XC Metadata Services Toolkit

New type of staff client for processing large batches of metadata through an orchestrated set of services.

Harvest from multiple sources (silos) to address format and quality issues.

Aggregate and de-dupe metadata.

Automatic synchronization propagates changes in source metadata through services and on to discovery interface.

[Diagram: XC Software Components. The XC Metadata Services Toolkit covers steps 2. Normalize, 3. Transform, and 4. Aggregate through the MARCXML Normalization, MARCXML to XC Transformation, and XC Aggregation services]

Page 115: eXtensible Catalog Software Portfolio

115

[Diagram: XC Software Components. XC Metadata Services Toolkit, step 2. Normalize (MARCXML Normalization); issue labels shown with this step: Usability, Quality]

MARCXML Normalization Service

• Transform language codes to spelled-out languages: e.g. fre becomes French

• Normalize forms of OCLC numbers so that they are all the same

• Substitute vocabulary terms for format/type of material codes in the MARC record (Leader, 006, 007, 008) to enable building facet values

• Substitute codes for audience level (juvenile, etc.) and type of material (fiction, non-fiction; identifies dissertation/thesis)

• “Deconstruct” LC subject headings so that parts of them can be mapped to different facets: geographic, genre, topic, etc.
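
A rough Python sketch of three of these normalization steps follows. It is not the service's actual code (the services themselves are written in Java). The language table is a small excerpt of MARC language codes, the `(OCoLC)` target form for OCLC numbers is an assumed convention, and the subfield-to-facet mapping and all function names are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Small excerpt of MARC language codes; a real table would be much longer.
LANGUAGE_NAMES = {
    "eng": "English",
    "fre": "French",
    "ger": "German",
    "spa": "Spanish",
    "jpn": "Japanese",
}


def spell_out_language(code: str) -> str:
    """Return the spelled-out language, or the code itself if it is unknown."""
    return LANGUAGE_NAMES.get(code.strip().lower(), code)


# Accepts forms such as "ocm00012345", "ocn123456789", "(OCoLC)12345".
_OCLC = re.compile(r"^\s*(?:\(OCoLC\))?\s*(?:ocm|ocn|on)?\s*0*(\d+)\s*$", re.IGNORECASE)


def normalize_oclc_number(raw: str) -> str:
    """Rewrite recognized OCLC number variants into one assumed form."""
    match = _OCLC.match(raw)
    if match is None:
        return raw  # leave unrecognized values untouched
    return f"(OCoLC){match.group(1)}"


# Assumed mapping from MARC 650 subfield codes to facet names.
SUBFIELD_FACETS = {"a": "topic", "x": "topic", "z": "geographic", "v": "genre"}


def deconstruct_subject(subfields: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Split one subject heading into per-facet value lists."""
    facets: Dict[str, List[str]] = {}
    for code, value in subfields:
        facet = SUBFIELD_FACETS.get(code)
        if facet:
            facets.setdefault(facet, []).append(value.strip())
    return facets


print(spell_out_language("fre"))
print(normalize_oclc_number("ocm00012345"))
print(deconstruct_subject([("a", "Libraries"), ("z", "Japan"), ("v", "Periodicals")]))
```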

Page 116: eXtensible Catalog Software Portfolio

116

[Diagram: XC Software Components. XC Metadata Services Toolkit, step 3. Transform (MARCXML to XC Transformation); issue labels shown with this step: Format, Usability]

MARCXML to XC Transformation Service

• Parse flat MARC records to create linked FRBR-based records (work, expression, etc.) in XC Schema

• One input record results in several output records

• Manage relationships between records, including one to many relationships

• Create multiple work and expression records for analytics

• Handle “bound-withs” (e.g. two books bound together)

Page 117: eXtensible Catalog Software Portfolio

117

[Diagram: XC Software Components. XC Metadata Services Toolkit, step 4. Aggregate (XC Aggregation); issue label shown with this step: Usability]

XC Aggregation Service

• Aggregate records that represent the same resource at:
  – Manifestation-level
  – Work-level (depends on Authority service)

• Manage relationships between records (FRBR entities, etc.)

• Enable automated synchronization of updates for records at each FRBR level

• Set the stage for future “non-MARC” RDA implementation
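
"Match and merge" at the Manifestation level could look roughly like the Python sketch below. The match key (a shared, already-normalized OCLC number), the dictionary fields, and the function name are assumptions made for illustration; the slides do not describe the Aggregation Service's actual matching rules.

```python
from collections import defaultdict
from typing import Dict, List


def aggregate_manifestations(records: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Group records sharing an OCLC number and merge each group into one record."""
    groups: Dict[str, List[Dict[str, str]]] = defaultdict(list)
    unmatched: List[List[Dict[str, str]]] = []
    for rec in records:
        key = rec.get("oclc")
        if key:
            groups[key].append(rec)   # match: same identifier, same resource
        else:
            unmatched.append([rec])   # nothing to match on; keep as its own group

    merged = []
    for group in list(groups.values()) + unmatched:
        merged.append({
            "oclc": group[0].get("oclc"),
            "title": group[0]["title"],
            "sources": sorted({r["source"] for r in group}),
            "member_ids": [r["id"] for r in group],  # kept so updates can be traced back
        })
    return merged


records = [
    {"id": "m1", "oclc": "(OCoLC)12345", "title": "Example title", "source": "ils"},
    {"id": "m2", "oclc": "(OCoLC)12345", "title": "Example title", "source": "repo"},
    {"id": "m3", "title": "Another title", "source": "repo"},
]
for rec in aggregate_manifestations(records):
    print(rec)
```

Keeping the member IDs on the merged record is one way to let later updates to a source record find the aggregate they belong to.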

Page 118: eXtensible Catalog Software Portfolio

118

eXtensible Metadata Services

[Diagram: the five-step flow (1. Convert, 2. Normalize, 3. Transform, 4. Aggregate, 5. Index) from the ILS through the XC OAI Toolkit and the XC Metadata Services Toolkit (MARCXML Normalization, MARCXML to XC Transformation, XC Aggregation) to the XC Drupal Toolkit on Drupal CMS, extended with further sources and services: DSpace, DC / Qualified DC Normalization, DC to XC Transformation, MARCXML / XC Authority, <other schema> Normalization, <other> to <other> Transformation, and XC to RDF (linked data out)]

Page 119: eXtensible Catalog Software Portfolio

119

XC Drupal Toolkit

• Adds support for library metadata into Drupal (DC and XC schemas)

• Apache Solr index of WEMs to enable faceted, FRBRized results navigation

• Single search interface across:
  – Library catalog
  – Digital repository
  – Website resources

• Extensive customization

• Integration with ILS circulation system (via XC NCIP Toolkit)

[Diagram: XC Software Components. XC Drupal Toolkit on Drupal CMS, step 5. Index; issue label shown with this step: Usability]

Page 120: eXtensible Catalog Software Portfolio

eXtensible Catalog

Software Portfolio

Metadata Services Toolkit

Page 121: eXtensible Catalog Software Portfolio

121

Metadata Services Toolkit (MST) Tasks

• Get metadata into the MST
  – Add Repositories
  – Schedule Harvests

• Tell MST what to do with metadata
  – Install metadata services
  – Add processing rules

• Verify results / troubleshoot processing
  – Browse records
  – View error logs

Page 122: eXtensible Catalog Software Portfolio

122

MST: Add Repository

Telling the MST about a repository is easy. Assign a name of your choice and enter the URL.

After adding a repository, the MST will automatically do a “handshake” with it and provide “Success” or “Error” messages for each step in the handshake.

When successful, the MST reports on what formats and sets are available in the remote database.

The MST supports all XML schemas, but individual services are schema-specific.
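
Since XC harvests with OAI-PMH, the "handshake" plausibly corresponds to the protocol's standard discovery requests. The Python sketch below only builds request URLs and sends nothing. The verbs and argument names come from OAI-PMH itself, but which requests the MST actually issues is an assumption, and the base URL, metadata prefix, and set name are hypothetical.

```python
from typing import List, Optional
from urllib.parse import urlencode

# Standard OAI-PMH verbs that describe a repository, its formats, and its sets.
HANDSHAKE_VERBS = ("Identify", "ListMetadataFormats", "ListSets")


def handshake_urls(base_url: str) -> List[str]:
    """Build one request URL per discovery verb."""
    return [f"{base_url}?{urlencode({'verb': verb})}" for verb in HANDSHAKE_VERBS]


def list_records_url(base_url: str, metadata_prefix: str, set_spec: Optional[str] = None) -> str:
    """Build a harvest request for one format and, optionally, one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


base = "http://example.org/oai"  # hypothetical repository URL
for url in handshake_urls(base):
    print(url)
print(list_records_url(base, "marc21", "books"))
```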

Page 123: eXtensible Catalog Software Portfolio

123

MST: Schedule Harvest

The next step is to schedule the harvesting of metadata from the remote repository.

Options
- Set the schedule
- Choose start and end dates
- Select sets and formats

Page 124: eXtensible Catalog Software Portfolio

124

MST: Install A Service

The next step is to install a metadata service.

A service is a separate program, written in Java, that is managed by the MST.

Services can be downloaded from the XC website or you can write your own by following the developer’s manual.

In order to use a service, you place the downloaded file in a directory by following the MST manual.

This screen can then be used to install the service in the MST.

Page 125: eXtensible Catalog Software Portfolio

125

MST: Add A Processing Rule

This example shows two services already installed in this Metadata Services Toolkit (MST): MARC Normalization and MARC-to-XC-Transformation.

Now we need to tell the MST which metadata records we want processed through which services, and in what order. This is called service orchestration.

We will now add a “Processing Rule”.
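
One way to picture service orchestration is as a set of rules that each say "output of X goes to service Y". The Python sketch below derives a processing order from such rules. The rule format, the function name, and the repository name are hypothetical; only the two service names come from this slide.

```python
from typing import Dict, List


def processing_order(rules: Dict[str, List[str]], source: str) -> List[str]:
    """Follow the rules from a source and list services in the order they run."""
    order: List[str] = []
    queue = list(rules.get(source, []))
    while queue:
        service = queue.pop(0)
        if service in order:
            continue  # skip duplicates so a circular rule cannot loop forever
        order.append(service)
        queue.extend(rules.get(service, []))
    return order


# Each key is where records come from; each value lists the services that receive them.
rules = {
    "ILS repository": ["MARC Normalization"],
    "MARC Normalization": ["MARC-to-XC-Transformation"],
}
print(processing_order(rules, "ILS repository"))
```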

Page 126: eXtensible Catalog Software Portfolio

126

MST: Browse Records

“Browse Records” is a feature of the MST that includes faceted browse and full-text search.

The MST has a local copy of all harvested metadata and all metadata produced by each installed service.

Page 127: eXtensible Catalog Software Portfolio

127

MST: Browse Records

Library staff use “Browse Records” to verify that services are functioning properly and to debug any issues.

Page 128: eXtensible Catalog Software Portfolio

128

MST: Browse Records

Navigation to full record display (MST handles display of any XML schema)

Whenever a record is processed by a service, the original record is preserved and one or more new records may be produced. These records are called successors.

Navigation links take you to predecessor and successor records. In this case, links connect MARC records to their normalized successor. In another case, links connect a normalized MARC record to its successor Work, Expression and Manifestation records.
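
The predecessor and successor links can be modeled as a pair of lookup tables. The Python sketch below is a hypothetical illustration, not the MST's data model; the class name, method names, and record identifiers are invented for the example.

```python
from collections import defaultdict
from typing import Dict, List


class RecordLinks:
    """Tracks which records were produced from which."""

    def __init__(self) -> None:
        self.successors: Dict[str, List[str]] = defaultdict(list)
        self.predecessors: Dict[str, List[str]] = defaultdict(list)

    def link(self, predecessor_id: str, successor_id: str) -> None:
        # The original record is preserved; only the link is recorded.
        self.successors[predecessor_id].append(successor_id)
        self.predecessors[successor_id].append(predecessor_id)

    def descendants(self, record_id: str) -> List[str]:
        """All records derived, directly or indirectly, from record_id."""
        found: List[str] = []
        queue = list(self.successors.get(record_id, []))
        while queue:
            current = queue.pop(0)
            if current not in found:
                found.append(current)
                queue.extend(self.successors.get(current, []))
        return found


links = RecordLinks()
links.link("marc:1", "normalized:1")           # MARC record to its normalized successor
for level in ("work", "expression", "manifestation"):
    links.link("normalized:1", f"{level}:1")   # normalized record to its W, E, M successors
print(links.descendants("marc:1"))
```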

Page 129: eXtensible Catalog Software Portfolio

129

MST: Browse Records

Each service can register error messages with the MST upon installation. In this example the MARC Normalization service has attached errors to specific records.

Errors are facets in the MST.

The “i” icon links to a customizable webpage with instructions for staff to address the error.

Page 130: eXtensible Catalog Software Portfolio

130

MST: View Full Record

Full Record Display: MARC Holdings Record

Administrative metadata managed by the MST

Predecessor and successor links

XML viewer (supports any XML schema)

Page 131: eXtensible Catalog Software Portfolio

131

MST: View Error Logs

Log file management system with navigation.

This page shows MST system log files. Logs are available for each installed service, as well as for harvest-in and harvest-out.

Page 132: eXtensible Catalog Software Portfolio

eXtensible Catalog

What’s Next

Vision for Linked Data

Page 133: eXtensible Catalog Software Portfolio

133

Semantic Web and Linked Data

• The Semantic Web refers to a set of technologies that allow computers to understand the meaning of information on the web

• Linked data is a mechanism for exposing, sharing and connecting data on the web

Page 134: eXtensible Catalog Software Portfolio

134

Semantic Web and Linked Data

• If everything has a unique identifier, then information from one website can be related to information from another via a computer program

• Everything includes people, places, things, vocabularies, metadata elements, web documents, …

Page 135: eXtensible Catalog Software Portfolio

135

Semantic Web and Linked Data

• A Uniform Resource Identifier (URI) is a string of characters used to identify a name (URN) or a resource on the internet (URL).

• Two kinds of resources:
  – Information resources: traditional web things like documents, images, etc.
  – Non-information resources: real-world objects like people, physical products, places, concepts, proteins, etc.

Page 136: eXtensible Catalog Software Portfolio

136

Turning information into Linked Data

RDF Triple defined: Subject (URI), Predicate (URI from a defined vocabulary), Object (URI or literal)

Example #1: Describe something…

Information that might be on a webpage, but cannot be readily understood by a computer: “David Lindahl is 40 years old.”

Step 1: Parse it into a Subject, Predicate, and Object:

Subject: David Lindahl
Predicate: foaf:age
Object: 40

Step 2: Convert to URIs:

Subject: http://xc.org/resource/dlindahl
Predicate: http://xmlns.com/foaf/spec/#term_age
Object: 40

Page 137: eXtensible Catalog Software Portfolio

137

Turning information into Linked Data

RDF Triple defined: Subject (URI), Predicate (URI from a defined vocabulary), Object (URI or literal)

Example #2: Define a relationship…

Information that might be on a webpage, but cannot be readily understood by a computer: “David Lindahl knows Jennifer Bowen.”

Step 1: Parse it into a Subject, Predicate, and Object:

Subject: David Lindahl
Predicate: foaf:knows
Object: Jennifer Bowen

Step 2: Convert to URIs:

Subject: http://xc.org/resource/dlindahl
Predicate: http://xmlns.com/foaf/spec/#term_knows
Object: http://xc.org/resource/jbowen
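
The two examples can be written out in a machine-readable form. The Python sketch below serializes them as N-Triples, a standard line-based RDF syntax. The URIs are the ones shown on the slides; the `Triple` type and the `to_ntriples` helper are hypothetical names, and typing the age as an XML Schema integer is an assumption.

```python
from typing import NamedTuple, Union

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


class Triple(NamedTuple):
    subject: str                 # always a URI
    predicate: str               # URI from a defined vocabulary
    obj: Union[str, int]         # URI or literal
    obj_is_uri: bool = False


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as a line of N-Triples."""
    if t.obj_is_uri:
        obj = f"<{t.obj}>"
    elif isinstance(t.obj, int):
        obj = f'"{t.obj}"^^<{XSD_INTEGER}>'
    else:
        escaped = str(t.obj).replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."


triples = [
    # Example #1: describe something (the object is a literal).
    Triple("http://xc.org/resource/dlindahl", "http://xmlns.com/foaf/spec/#term_age", 40),
    # Example #2: define a relationship (the object is another URI).
    Triple(
        "http://xc.org/resource/dlindahl",
        "http://xmlns.com/foaf/spec/#term_knows",
        "http://xc.org/resource/jbowen",
        obj_is_uri=True,
    ),
]
for t in triples:
    print(to_ntriples(t))
```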

Page 138: eXtensible Catalog Software Portfolio

138

Linking Open Data Cloud

Page 139: eXtensible Catalog Software Portfolio

139

Linked Data on XC

• XC Metadata Services Toolkit (MST):
  – Converts multiple formats into XC Schema
    • XC Schema is linked data ready
    • XC Schema uses defined vocabularies (rda, dcterms, xc)
  – Persistent OAI-PMH (web services) data repository
  – Plug-in service architecture can be extended to support RDF technologies

Page 140: eXtensible Catalog Software Portfolio

Download XC software at

eXtensibleCatalog.org