Page 1: ResourceSync in 24x7

“Synchronize your resources with ResourceSync”, July 10, 2013, Open Repositories 2013, PEI, Canada

Synchronize your resources with ResourceSync

Simeon Warner (Cornell University Library)

Page 2: ResourceSync in 24x7

Team sport

Page 3: ResourceSync in 24x7

more, still more missing

JISC

Richard Jones

Graham Klyne

Stuart Lewis

OCLC

Jeff Young

LOCKSS

David Rosenthal

RedHat

Christian Sadilek

Ex Libris Inc.

Shlomo Sanders

Library of Congress

Kevin Ford

Page 4: ResourceSync in 24x7

$ Alfred P. Sloan Foundation

Page 5: ResourceSync in 24x7

Synchronize
• Keep “in sync” (colloq.)
• Following changes over time, and
• Keeping copies on different systems the same
• Tackle only the unidirectional problem: from a Source to a Destination

Page 6: ResourceSync in 24x7

Resources
aka Web Resources: have a URI and HTTP GET representation(s)

Many / Few    Big / Small    Fast / Slow

Page 7: ResourceSync in 24x7

Why?

Page 8: ResourceSync in 24x7

Scholarly repositories
• Replicate data/articles for mirroring, reuse, indexing, ...
• OAI-PMH for metadata
• Many custom solutions for full content

Page 9: ResourceSync in 24x7

Linked data
Fundamentally distributed, but a local copy is often required. Either:
1. cache
2. sync local copy...
• Many custom solutions for local copy

Last.FM

MusicBrainz

GeoNames

DBpedia

others...

BBC

Page 10: ResourceSync in 24x7

Didn’t you sell us OAI-PMH?

Or... will ResourceSync replace OAI-PMH?

OAI-PMH:
Proven metadata transfer protocol
Widely adopted in our community
X Predates REST, not “of the web”
X Not adopted for content transfer

ResourceSync can replace OAI-PMH, but coexistence is likely

Page 11: ResourceSync in 24x7

What?

Page 12: ResourceSync in 24x7

1. Baseline sync
Initial load, copy, or catch-up from source
• need list of all resources
• optional packaged content

Want to
• avoid out-of-band setup & customization

Page 13: ResourceSync in 24x7

2. Incremental sync
Keep up-to-date with changes at a source
• need information about changes
• optional packaged content
• minimal primitives: create/update/delete

Want
• allow catch-up after destination offline
• lower latency and/or greater efficiency than repeated baseline sync
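The incremental sync primitives above can be illustrated with a minimal Python sketch. It is not taken from the slides: the function name `apply_changes`, the `(uri, lastmod, change)` tuple layout, the change labels `created`/`updated`/`deleted`, and the `fetch` callable are all assumptions made for illustration. The `since` parameter models catch-up after the destination has been offline, and it relies on all datestamps sharing the same UTC format so that string comparison matches time order.

```python
def apply_changes(copy, changes, fetch, since=None):
    """Apply a list of changes to a local copy (dict of uri -> bytes).

    changes: iterable of (uri, lastmod, change) tuples (assumed layout).
    fetch:   callable returning the current bytes for a URI (assumed helper).
    since:   datestamp of the last successful sync; older changes are skipped.
    Returns the datestamp of the last change applied (or `since` if none).
    """
    latest = since
    # Process changes in datestamp order so later changes win.
    for uri, lastmod, change in sorted(changes, key=lambda c: c[1]):
        if since is not None and lastmod <= since:
            continue  # already seen before the destination went offline
        if change in ("created", "updated"):
            copy[uri] = fetch(uri)
        elif change == "deleted":
            copy.pop(uri, None)
        else:
            raise ValueError(f"unknown change type: {change!r}")
        latest = lastmod
    return latest
```

A destination would store the returned datestamp and pass it as `since` on the next run, which is cheaper than repeating a baseline sync.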

Page 14: ResourceSync in 24x7

3. Audit
Destination should be able to verify whether it is synchronized with a source
• need list of all resources + fixity info

Want
• lower latency and/or greater efficiency than baseline sync
• note: subject to some latency
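As a rough illustration of an audit, the sketch below compares a local copy against a list of resources with fixity information. The slides do not say which digest is used, so SHA-256 is an assumption here, as are the function name `audit` and the dict-based data layout.

```python
import hashlib


def audit(source_fixity, copy):
    """Compare a local copy with the source's list of resources.

    source_fixity: dict of uri -> hex digest published by the source
                   (SHA-256 is an assumption for this sketch).
    copy:          dict of uri -> bytes held at the destination.
    Returns a dict of sorted URI lists: missing, extra, and changed.
    """
    missing = sorted(u for u in source_fixity if u not in copy)
    extra = sorted(u for u in copy if u not in source_fixity)
    changed = sorted(
        u
        for u, digest in source_fixity.items()
        if u in copy and hashlib.sha256(copy[u]).hexdigest() != digest
    )
    return {"missing": missing, "extra": extra, "changed": changed}


def in_sync(report):
    """True when the audit found no differences."""
    return not any(report.values())
```

Because the list is a snapshot, a report that shows differences may only reflect changes made at the source after the snapshot was taken.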

Page 15: ResourceSync in 24x7

How?

Page 16: ResourceSync in 24x7

All ResourceSync documents are Sitemaps with minor extensions

Page 17: ResourceSync in 24x7

Minor?

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:ln …/>
  <rs:md …/>
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
    <rs:ln …/>
    <rs:md …/>
  </url>
  <url> … </url>
</urlset>
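Because the documents are ordinary Sitemaps, a standard XML parser is enough to read them. The following Python sketch, using only the standard library, extracts the `<loc>` and `<lastmod>` values. The sample document is hypothetical: it leaves out the `<rs:ln>` and `<rs:md>` elements because the slide elides their attributes, and the second resource URI is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as shown on the slide.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

# Hypothetical sample document for this sketch.
EXAMPLE = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <url>
    <loc>http://example.com/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
  </url>
  <url>
    <loc>http://example.com/res2</loc>
    <lastmod>2013-01-02T14:00:00Z</lastmod>
  </url>
</urlset>"""


def parse_urlset(xml_text):
    """Return a list of (uri, lastmod) tuples from a Sitemap <urlset>."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)  # None if absent
        entries.append((loc, lastmod))
    return entries


if __name__ == "__main__":
    for uri, lastmod in parse_urlset(EXAMPLE):
        print(uri, lastmod)
```

A parser that ignores the `rs:` namespace still reads such a document as a plain Sitemap.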

Page 18: ResourceSync in 24x7

Baseline sync & Google

Most basic capability is Resource List:
• Snapshot of state of resources
• URI, datestamp + optional extra fixity info
• Destination does GET on each resource

ResourceSync Baseline sync & Audit

Google/Bing/Yahoo!/etc. harvest
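The "GET on each resource" step of a baseline sync might look like the Python sketch below. It assumes the Resource List has already been parsed into `(uri, datestamp)` pairs; the names `baseline_sync` and `http_get` are hypothetical, and the fetch function is injectable so the logic can be tried without network access.

```python
from urllib.request import urlopen


def http_get(uri):
    """Fetch one resource over HTTP and return its bytes."""
    with urlopen(uri) as response:
        return response.read()


def baseline_sync(resource_list, fetch=http_get):
    """Build a local copy by doing a GET on every listed resource.

    resource_list: iterable of (uri, datestamp) pairs (assumed layout).
    fetch:         callable returning bytes for a URI; defaults to HTTP GET.
    Returns a dict of uri -> {"lastmod": datestamp, "content": bytes}.
    """
    copy = {}
    for uri, lastmod in resource_list:
        copy[uri] = {"lastmod": lastmod, "content": fetch(uri)}
    return copy
```

This is the same pattern a search engine crawler follows when harvesting a Sitemap.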

Page 19: ResourceSync in 24x7

Modular

Discovery

Four Core Capabilities

1 2 3 4

Page 20: ResourceSync in 24x7

Extensible

Extensible use of Link Relations from Atom
• Spec describes use for mirrors, patches, historical, provenance, conneg...
• Use <rs:ln rel="your-relation-here" .../>

Extensible attributes for fixity etc.
• Includes lastmod, fixity, length, type...

Extensible framework -> new capabilities

Page 21: ResourceSync in 24x7

Push = Lower latency

Pull
• easy setup, no trust required

Push Changes
• lower latency, better scaling
• same descriptions as pull
• standard transports (XMPP, WebSockets...)
• can push discovery info to trigger pull

Page 22: ResourceSync in 24x7

Timeline

January 2013 / June 2013 / July 2013 / Fall 2013

First beta / Version 0.9 / Update and push spec / NISO standardization

• Tools and libraries being developed to ease implementation
• Tutorials at major conferences (OAI8, OR, JCDL,...)

Page 23: ResourceSync in 24x7

http://www.openarchives.org/rs/

• Framework
• Archives
• Push (to come)

• Links to Google group, associated articles, blogs, etc.

Page 24: ResourceSync in 24x7

That’s all folks