ResourceSync Tutorial from Open Repositories 2013

175
ResourceSync Tutorial July 8, 2013, Open Repositories 2013, PEI, Canada ResourceSync: A Web-Based Resource Synchronization Framework ResourceSync is funded by The Sloan Foundation & JISC #resourcesync 1

description

Slides form the tutorial on the ResourceSync Framework presented at Open Repositories 2013 in Charlottetown, PEI on 8 July 2013. The latest set of ResourceSync tutorial slides are available at http://www.slideshare.net/OpenArchivesInitiative/resourcesync-tutorial

Transcript of ResourceSync Tutorial from Open Repositories 2013

Page 1: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync:A Web-Based

Resource SynchronizationFramework

ResourceSync is funded by The Sloan Foundation & JISC

#resourcesync

1

Page 2: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync Tutorial History

Simeon WarnerCornell [email protected]@zimeon

• First outing: OAI8, Geneva, Switzerland, June 2013• Second run: Open Repositories, here and now

• Most recent version of these tutorial slides is available at: http://www.slideshare.net/OpenArchivesInitiative/resourcesync-tutorial

Presenter

Page 3: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Martin KleinLos Alamos National Laboratory<[email protected]>

@mart1nkle1n

ResourceSync Tutorial Contributors

3

Simeon WarnerCornell University

[email protected]@zimeon

Herbert Van de Sompel Los Alamos National Laboratory

<[email protected]>@hvdsomp

Robert SandersonLos Alamos National Laboratory

<[email protected]>@azaroth24

Richard JonesCottage Labs

<[email protected]>@cottagelabs

Page 4: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSyncCore Team

4

OAI

Herbert Van de SompelMartin KleinRobert Sanderson(Los Alamos National Laboratory)

Simeon Warner(Cornell University)

Berhard Haslhofer(University of Vienna)

Michael L. Nelson(Old Dominion University)

Carl Lagoze(University of Michigan)

NISO

Todd CarpenterNettie Lagace

Lyrasis

Peter Murray

Page 5: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync Technical Group

5

JISC

Richard JonesGraham Klyne

Stuart Lewis

OCLC

Jeff Young

LOCKSS

David Rosenthal

RedHat

Christian Sadilek

Ex Libris Inc.

Shlomo Sanders

Library of Congress

Kevin Ford

Page 6: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync - Agenda

1. ResourceSync: Problem Perspective & Conceptual Approach

2. Motivation & Use Cases

3. Framework Walkthrough

4. Framework (Technical) Details

5. Implementation

6. Q&A

6

Page 7: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync - Agenda

1. ResourceSync: Problem Perspective & Conceptual Approach

2. Motivation & Use Cases

3. Framework Walkthrough

4. Framework (Technical) Details

5. Implementation

6. Q&A

7

Page 8: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Synchronize What?

• Web resourceso things with a URI that can be dereferenced

• Focus on needs of research communication and cultural heritage organizationso but aim for generality

8

Page 9: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Synchronize What?

• Small websites/repositories (a few resources) to large repositories/datasets/linked data collections (many millions of resources)

9

sync

sync

Page 10: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Synchronize What?

10

• Low change frequency (weeks/months) to high change frequency (seconds)

sync

sync

sync

Page 11: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Synchronize What?

11

• Synchronization latency and accuracy needs may vary

sync

Sync ???

Page 12: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Why?

… because lots of projects and services are doing synchronization but have to resort to ad-hoc, case by case, approaches!

• Project team involved with projects that need this

• Experience with OAI-PMH: widely used in repos buto XML metadata onlyo Attempts at synchronizing actual content via OAI-PMH

(complex object formats, dc:identifier) not successful.o Web technology has moved on since 1999

• Devise a shared solution for data, metadata, linked data?

12

Page 13: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync Problem

13

• Consideration:• Source (server) A has resources that change over time: they

get created, modified, deleted• Destination (servers) X, Y, and Z leverage (some)

resources of Source A.• Problem:

• Destinations want to keep in step with the resource changes at Source A: resource synchronization.

• Goal:• Design an approach for resource synchronization aligned

with the Web Architecture that has a fair chance of adoption by different communities.• The approach must scale better than recurrent HTTP

HEAD/GET on resources.

Page 14: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Source: Four CoreSynchronization Capabilities

1. Describing content – publish a list of resources available for synchronization to enable Destinations to perform an initial load or catch-up with a Source

2. Packaging content – bundle resources to enable bulk download for destinations

3. Describing changes – publish a list of resource changes to enable destinations to stay synchronized and decrease latency

4. Packaging changes – bundle resource changes for bulk download for destinations

14

Page 15: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Source: Synchronization Features

5. Linking to related resources – provide links from to be synchronized resources to related resources

applicable to all core capabilities (1..4)

6. Access to historical data – provide archives of 1..4

7. Discovery of capabilities – support Destinations in discovering all offered capabilities 1..4

15

Page 16: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Destination: Synchronization Needs

1. Baseline synchronization – A destination must be able to perform an initial load or catch-up with a source

- avoid out-of-band setup

2. Incremental synchronization – A destination must have some way to keep up-to-date with changes at a source

- subject to some latency; minimal: create/update/delete- allow to catch-up after destination has been offline

3. Audit – A destination should be able to determine whether it is synchronized with a source

- subject to some latency

16

Page 17: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync - Agenda

1. ResourceSync: Problem Perspective & Conceptual Approach

2. Motivation & Use Cases

3. Framework Walkthrough

4. Framework (Technical) Details

5. Implementation

6. Q&A

17

Page 18: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Use Cases – The Basics

18

a)

b)

Page 19: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Use Cases – The Basics

19

c)

d)

Page 20: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Use Cases – The not-so-Basics

20

e)

f)

Page 21: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Use Cases – The not-so-Basics

21

g)

h)

Page 22: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Use Case 1: arXiv Mirroring and Data Sharing

• Repository of scholarly articles in physics, mathematics, computer science, etc.

• > 850k articles• approx. 1.5 revisions per article on

average• approx. 75k new articles per year• Each article has full-text and separate

metadata record• approx. 3.8M resources

22

Page 23: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Use Case 1: arXiv Mirroring and Data Sharing

• 2,700 updates dailyo at 8pm ESTo Currently using homebrew mirroring

solution (running with minor modifications since 1994!)

o occasional rsync (file system-specific, auth issues)

23

Page 24: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Mirroring arXiv: 1994 - 2013

• Operated since the very early days of the Web!

1. HTTP trigger from the main site

2. HTTP pull update specific to mirror site

3. HTTP download of the resources

4. HTTP trigger to main site when mirror process complete

5. HTTP verification (via HEAD) by the main site which updates the update list specific to mirror site

6. periodic repeat as long as there are updates in the inventory for that mirror

• Requires trusted set of servers operating with the same internal organization

• Does not support synchronization check (so rsync is used periodically)

24

Page 25: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Use Case 1: arXiv

Mirroring

• GOAL: Keep mirror sites synchronized with daily changes

• WANT:o high consistencyo moderate latencyo robustness to global network outages (low admin effort)o ability to verify sync status in case of questions

25

Page 26: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Use Case 1: arXiv

Data Sharing

• GOAL: Make resources and update information publicly available so that any other service may synchronize at the frequency it needs, e.g.o Math Front at UC Daviso EprintWeb from IOP in UKo Data for bibliometric and scientometric analysis

• WANT:o low admin effort (i.e. standard approach, standard tools)o reasonable consistency, latency, efficiency

26

Page 27: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Use Case 2: DBpedia Live Duplication

• Average of 2 updates per second• Low latency desirable => need for a push technology

27

Page 28: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Use Case 2: DBpedia Live Duplication

• Initial experiment with distributed infrastructure

28

Page 29: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Use Case 2: DBpedia Live Duplication

• Daily traffic:o 99% updateso 0.6% deletionso 0.03% creations

29

Page 30: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Use Case 2: DBpedia Live Duplication

• # of content transfer events in two 8 hour intervals

• Max, queue size of remote duplication process

30

Page 31: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync - Agenda

1. ResourceSync: Problem Perspective & Conceptual Approach

2. Motivation & Use Cases

3. Framework Walkthrough

4. Framework (Technical) Details

5. Implementation

6. Q&A

31

Page 32: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Source Capability 1: Describing Content

In order to advertise the resources that a source wants destinations to know about, it may describe them:

o Publish a Resource List, a list of resource URIs and possibly associated metadata- Destination GETs the Content Description- Destination GETs listed resources by their URI

o Describes state of set of resources at one point in time (snapshot)

32

Page 33: ResourceSync Tutorial from Open Repositories 2013

33

Page 34: ResourceSync Tutorial from Open Repositories 2013

34

Page 35: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Source Capability 2: Packaging Content

By default, content is transferred in response to a GET issued by a destination against a URI of a source’s resource. But a source may support additional mechanisms:

o Publish a Resource Dump, a document that points to packages of resource representations and necessary metadata- Destination GETs the package- Destination unpacks the package- ZIP format supported

o Packages set of resources at one point in time (snapshot)

35

Page 36: ResourceSync Tutorial from Open Repositories 2013

36

Page 37: ResourceSync Tutorial from Open Repositories 2013

37

Page 38: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Source: Modular Capabilities

38

Page 39: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Source Capability 3: Describing Changes

In order to achieve lower latency and/or greater efficiency, a source may communicate about changes to its resources:

o Publish a Change List, a list of recent change events (created, updated, deleted resource)- Destination acts upon change events, e.g. GETs

created/updated resources, removes deleted resources.o Describes changes to resources that occurred in a temporal

interval with a start- and an end-date

39

Page 40: ResourceSync Tutorial from Open Repositories 2013

40

Page 41: ResourceSync Tutorial from Open Repositories 2013

41

Page 42: ResourceSync Tutorial from Open Repositories 2013

42

Page 43: ResourceSync Tutorial from Open Repositories 2013

43

Page 44: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Source Capability 4: Packaging Changes

In order to reduce the number of requests to obtain resource changes, a source may provide packaged bitstreams for changed resources:

o Publish a Change Dump, a document that points to packages of recently changed resource representations and necessary metadata - Destination GETs the package- Destination unpacks the package- ZIP format supported

o Packages resources that changed in a temporal interval with a start- and an end-date

44

Page 45: ResourceSync Tutorial from Open Repositories 2013

45

Page 46: ResourceSync Tutorial from Open Repositories 2013

46

Page 47: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Source: Modular Capabilities

47

Page 48: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

FrameworkStructure

(light)

48

Page 49: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

FrameworkStructure

(complete)

49

Page 50: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Destination: Key Processes

50

Page 51: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync - Agenda

1. ResourceSync: Problem Perspective & Conceptual Approach

2. Motivation & Use Cases

3. Framework Walkthrough

4. Framework (Technical) Details

5. Implementation

6. Q&A

51

Page 52: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync - Agenda

4. Framework (Technical) Details

1. Sitemaps

2. Pull method

3. Linking between resources

4. Discovery

5. Push method

6. Archives

52

Page 53: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync - Agenda

4. Framework (Technical) Details

1. Sitemaps

2. Pull method

3. Linking between resources

4. Discovery

5. Push method

6. Archives

53

Page 54: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

So Many Choices

54

XMPP

AtomPub

SDShare

RSS

Atom

PubSubHubbub

Sitemap

XMPP

rsync

OAI-PMH

WebDAV Col. Syn.

OAI-ORE

DSNotify

RDFsync

Crawl

Push

Pull

SWORD

SPARQLpush

Page 55: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

So Many Choices

55

XMPP

AtomPub

SDShare

RSS

Atom

PubSubHubbub

Sitemap

XMPP

rsync

OAI-PMH

WebDAV Col. Syn.

OAI-ORE

DSNotify

RDFsync

Crawl

Push

Pull

SWORD

SPARQLpush

Page 56: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

56

Page 57: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

A Framework Based on Sitemaps

• Modular framework allowing selective deployment

• Sitemap is the core format throughout the framework

o Introduce extension elements and attributes: - In ResourceSync namespace (rs:) to

accommodate synchronization needso Reuse Sitemap format for all capability documents:

Resource List, Resource Dump, Change List, Change Dump, as well as for manifest in Dumps

o Utilize Sitemap index format where needed/allowed

57

Page 58: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Sitemap Format

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9”>

<url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> </url>

<url> <loc>http://example.com/res2</loc> <lastmod>2013-01-02T14:00:00Z</lastmod> </url> …</urlset>

58

Page 59: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Sitemap Index Format

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9”>

<sitemap> <loc>http://example.com/sitemap1.xml</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> </sitemap>

<sitemap> <loc>http://example.com/sitemap2.xml</loc> <lastmod>2013-01-02T14:00:00Z</lastmod> </sitemap> …</sitemapindex>

59

Page 60: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync Sitemap Extensions

<urlset xmlns=http://www.sitemaps.org/schemas/sitemap/0.9 xmlns:rs="http://www.openarchives.org/rs/terms/”> <rs:ln …/> <rs:md …/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:ln …/> <rs:md …/> </url> <url> … </url></urlset>

60

Page 61: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync Sitemap Extensions

<sitemapindex xmlns=http://www.sitemaps.org/schemas/sitemap/0.9 xmlns:rs="http://www.openarchives.org/rs/terms/”> <rs:ln …/> <rs:md …/><sitemap> <loc>http://example.com/sitemap1.xml</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:ln …/> <rs:md …/> </sitemap>…</sitemapindex>

61

Page 62: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync - Agenda

4. Framework (Technical) Details

1. Sitemaps

2. Pull method

3. Linking between resources

4. Discovery

5. Push method

6. Archives

62

Page 63: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Capability 1: Resource List

63

Page 64: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Capability 1: Resource List

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="resourcelist" from="2013-01-03T09:00:00Z"/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6" length="8876" type="text/html"/> </url> <url> … </url></urlset>

64

Page 65: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Resource List

• Describe Source’s resources that are subject to synchronization• At one point in time (snapshot)

• Typical Destination use: Baseline Synchronization, Audit

• Each URI typically listed only once• Might be expensive to generate• Destinations use @from to determine freshness• Issue GETs against URIs to obtain resources• Very similar to current Sitemaps

65

Page 66: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

What if I have a million resources?

• Current sitemap limit is 50k resources (or maximum document size of 50MB)

• Break complete list of resources into 50k-resource chunks, each on a Resource List document

• Create a Resource List Index document to group them:o Based on <sitemapindex>o May have up to 50k component Resource Listso Extends capacity to 2,500,000,000 resources within current

community practices

Page 67: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Resource List Index <resourcelist_index.xml>

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”resourcelist" from="2013-01-02T09:00:02Z”/> <sitemap> <loc>http://example.com/resourcelist1.xml</loc> <lastmod>2013-01-02T11:00:00Z</lastmod> <rs:md type="application/xml"/> </sitemap> <sitemap> <loc>http://example.com/resourcelist2.xml</loc> <lastmod>2013-01-02T11:00:01Z</lastmod> <rs:md type="application/xml"/> </sitemap></urlset>

67

Page 68: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Resource List <resourcelist1.xml>

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs=http://www.openarchives.org/rs/terms/> <rs:ln rel=”up” href=”http://example.com/resourcelist_index.xml”/> <rs:md capability=”resourcelist" from="2013-01-02T09:00:00Z”/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T08:07:06Z</lastmod> <rs:md hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6" length="8876" type="text/html"/> </url> ...</urlset>

68

Page 69: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Capability 2: Resource Dump

69

Page 70: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Capability 2: Resource Dump

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”resourcedump" from="2013-01-02T09:00:00Z”/> <url> <loc>http://example.com/resourcedump_part1.zip</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md length=”97553" type=”application/zip"/> </url> <url> <loc>http://example.com/resourcedump_part2.zip</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md length=”21294" type=”application/zip"/> </url></urlset>

70

Page 71: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Resource Dump Manifest

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”resourcedump-manifest" from="2013-01-02T09:00:00Z”/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md type="text/html" path=”/resources/res1"/> </url> <url> <loc>http://example.com/res2</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md type=”application/pdf” path=”/resources/res2"/> </url></urlset>

71

Page 72: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Resource Dump

• Package Source’s resourcesthat are subject to synchronization• At one point in time (snapshot)

• Points to ZIP packages• Mandatory, even for only one ZIP• ZIP package contains manifest, listing contained bitstreams• Typical Destination use: Baseline Synchronization, bulk

download

• Each URI typically listed only once• Might be expensive to generate• Destinations use @from to determine freshness• GETs against individual URIs from Resource List achieves the

same result (ignoring varying freshness)

72

Page 73: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Capability 3: Change List

73

Page 74: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Capability 3: Change List

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change=”updated" hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6" length="8876" type="text/html"/> </url> <url> … </url></urlset>

74

Page 75: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Change List

• Describe Source’s resource changes• Occurring during temporal interval with start- and end-date

• Typical Destination use: Incremental Synchronization, Audit

• Changes are listed in chronological order• Multiple changes to one URI may result in multiple listing of

same URI• Source determines duration of temporal interval• Destinations use @from and @until to determine freshness• Issue GETs against URIs to obtain changed resources

75

Page 76: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Capability 4: Change Dump

76

Page 77: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Capability 4: Change Dump

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changedump" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://example.com/change_dump_part1.zip</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md length="887" type=”application/zip"/> </url> <url> <loc>http://example.com/change_dump_part2.zip</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md length=”9767" type=”application/zip"/> </url></urlset>

77

Page 78: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Change Dump Manifest

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changedump-manifest" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change=”updated" length=”2887” type=”text/html” path=”changes/res1”/> </url> <url> … </url></urlset>

78

Page 79: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Change Dump

• Package Source’s resources that have changed• during temporal interval with start- and end-date

• Points to ZIP packages• Mandatory, even for only one ZIP• ZIP package contains manifest, listing contained bitstreams• Typical Destination use: Incremental Synchronization, bulk

download of changes

• Changes in Change Dump Manifest listed in chronological order• Same URI can be listed multiple times• Might be expensive to generate• Destinations use @from and @until to determine freshness

79

Page 80: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Recall… *Index

<changelist_index.xml>

<changelist1.xml>

80

Page 81: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Change List Index <changelist_index.xml>

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <sitemap> <loc>http://example.com/changelist1.xml</loc> <lastmod>2013-01-02T11:00:00Z</lastmod> <rs:md type="application/xml"/> </sitemap> <sitemap> <loc>http://example.com/changelist2.xml</loc> <lastmod>2013-01-02T23:00:00Z</lastmod> <rs:md type="application/xml"/> </sitemap></urlset>

81

Page 82: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Change List <changelist1.xml>

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs=http://www.openarchives.org/rs/terms/> <rs:ln rel=”up” href=”http://example.com/changelist_index.xml”/> <rs:md capability="changelist" from="2013-01-02T09:00:00Z” until="2013-01-02T21:00:00Z”/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change=”updated" hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6" length="8876" type="text/html"/> </url></urlset>

82

Page 83: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Resource Metadata SummaryElement/Attribute Description Defined by

<loc> Resource URI (identity) sitemaps

<lastmod> Timestamp of last change sitemaps

<changefreq> Expected update frequency sitemaps

<rs:md>   ResourceSync

change Change type (Change List & Change Dump Manifest only) ResourceSync

encodingHTTP Content-Encoding header value RFC2616

hashOne or more content digests (md5, sha-1, sha-256)

Atom Link Ext.

lengthHTTP Content-Length header value RFC4287

pathPath in ZIP package (Dump Manifests only)

ResourceSync

typeHTTP Content-Type header value RFC4287

Page 84: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync - Agenda

4. Framework (Technical) Details

1. Sitemaps

2. Pull method

3. Linking between resources

4. Discovery

5. Push method

6. Archives

84

Page 85: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Supported Linking Use Cases

The web is based on links between resources, many of which are important to understand for synchronization.

1. Mirrored content with multiple download locations

2. Alternate representations of the same content

3. Patching content rather than replacing

4. Resources and their metadata

5. Prior versions of resources

6. Collection membership of resources

7. Republishing synchronized resources

All cases are handled with a <rs:ln> element referring to the remote resource

85

Page 86: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Notes about Linked Resources

Some important things to keep in mind about linked resources:

• They may also be subject to synchronization• They may be updated in a very different schedule to the

resource it is linked from• Therefore, it is recommended to convey metadata about the

linked resource too• Links can be bi-directional – the linked resource can link back to

the linking resource

86

Page 87: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #1 - Mirror

1. Mirrored content with multiple download locations

This might occur due to:• Content distribution networks• Mirror sites• Backup locations• Load balancing

87

Page 88: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #1 - Mirror

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change=”updated”/> <rs:ln rel=”duplicate” pri=”1” href=”http://mirror1.example.com/res1"/> <rs:ln rel=”duplicate” pri=”2” href=”http://mirror2.example.com/res1"/> </url></urlset>

88

Page 89: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #2 – Alternate Representations

2. Alternate representations of the same content

This might occur due to:• Server supports HTTP content negotiation• Multiple copies of the same resource• Format migration for preservation reasons • Different clients wanting different formats• Multiple languages of the content

89

Page 90: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #2 – Alternate Representations

90

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change=”updated”/> <rs:ln rel="alternate" type="text/html" href="http://example.com/res1.html"/> <rs:ln rel="alternate" type=“application/pdf" href=”http://example.com/res1.pdf"/> </url></urlset>

Page 91: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #2 – Alternate Representations

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://example.com/res1.html</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change=”updated”/> <rs:ln rel=”canonical” href="http://example.com/res1"/> </url></urlset>

91

Page 92: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #3 – Patching Content

3. Patching content rather than replacing

This might occur due to:• Resources are very large and server wishes to conserve

bandwidth where possible• Changes are frequent and small• Changes are managed in a CMS that tracks differences• Format exists or can be described that is machine

processable to replicate the change

92

Page 93: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #3 – Patching Content

93

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://example.com/res1.json</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change=”updated” length=“398723”/> <rs:ln rel=”http://www.openarchives.org/rs/terms/patch” type=”application/json-patch” modified=“2013-01-02T17:00:00Z” length=“58” href=”http://example.com/res1-patch.json"/> </url></urlset>

Page 94: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #4 – Metadata about Resources

4. Resources and their metadata

This might occur due to:• Resources have additional metadata records, which are

useful for understanding the resource• Such as cultural heritage images, audio, video• Collections with descriptive metadata• Resources with technical metadata• Administrative or Rights metadata

94

Page 95: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #4 – Metadata about Resources

95

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change=”updated”/> <rs:ln rel=”describedby” type=”application/xml” href=”http://example.com/metadata/res1.xml"/> </url></urlset>

Page 96: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #4 – Metadata about Resources

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://example.com/metadata/res1.xml</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change=”updated”/> <rs:ln rel=”describes” type=”text/html” href=”http://example.com/res1"/> </url></urlset>

96

Page 97: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #5 – Prior Versions of Resources

But first…

97

Page 98: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Memento Intermezzo

http://www.mementoweb.org/

Page 99: ResourceSync Tutorial from Open Repositories 2013

URI for Original, URI for Version

URI-M - http://web.archive.org/web/20010911203610/http://www.cnn.com/

Web Archive

URI-R - http://www.cnn.com/

Page 100: ResourceSync Tutorial from Open Repositories 2013

URI for Original, URI for Version

URI-M - http://en.wikipedia.org/w/index.php?title=September_11_attacks&oldid=282333

CMS

URI-R - http://en.wikipedia.org/wiki/September_11_attacks

Page 101: ResourceSync Tutorial from Open Repositories 2013
Page 102: ResourceSync Tutorial from Open Repositories 2013
Page 103: ResourceSync Tutorial from Open Repositories 2013
Page 104: ResourceSync Tutorial from Open Repositories 2013
Page 105: ResourceSync Tutorial from Open Repositories 2013
Page 106: ResourceSync Tutorial from Open Repositories 2013
Page 107: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #5 – Prior Versions of Resources

107

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change=”updated”/> <rs:ln rel=”memento” href=”http://example.com/past/20130102130000/res1"/> <rs:ln rel=”timegate” href=”http://example.com/timegate/res1"/> <rs:ln rel=”timemap” href=“http://example.com/timemap/res1” type=“application/link-format”/> </url></urlset>

Page 108: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #6 – Collection Membership

6. Collection membership of resources

A source might want to express this because:• Resources are part of OAI-ORE aggregations• Resources are part of OAI-PMH sets• Or to indicated any other type of collections of resources

Collections are named with URIs and can then be linked to with rel=“collection”

• Nice if the collection URI resolves to a useful description

108

Page 109: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #6 – Collection Membership

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change=”updated”/> <rs:ln rel=”collection” href=”http://example.com/aggregation/allres"/> </url></urlset>

109

Page 110: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #7 – Republishing Resources

7. Republishing synchronized resources

This might occur due to:• Aggregator systems that harvest resources from remote

sites and then republish them at new URIs• Examples include Blog republishing, content distribution

networks, mirrored or combined collections• Hypothetical scenario: Lots of little museums with small

collections, and a large European/American aggregating digital library system that wants to provide fast, combined access to the content (with permission)

110

Page 111: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #7 – Republishing Resources

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change=”updated”/> <rs:ln rel=”via” modified=“2013-01-02T10:00:00Z” href=”http://original.example.org/res1"/> </url></urlset>

111

Page 112: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Linking #7 – Republishing Resources

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://aggregator.example.com/res1</loc> <lastmod>2013-01-02T18:00:00Z</lastmod> <rs:md change=”updated”/> <rs:ln rel=”via” modified=“2013-01-02T13:00:00Z” href=”http://example.org/res1"/> <rs:ln rel=”via” modified=“2013-01-02T10:00:00Z” href=”http://original.example.org/res1"/> </url></urlset>

112

Page 113: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Link Relation Summary

Relation Use in ResourceSync Defined in

rel="alternate"Link from generic to specific URI HTML 5

rel="canonical"Link from specific to generic URI RFC6596

rel="collection"Resource is member of collection RFC6573

rel="describedby" Has metadata Protocol for Web Description Resources (POWDER): Description Resources

rel="describes" Is metadata for The 'describes' Link Relation Type

rel="duplicate" Mirror or alternative copy RFC6249

rel=".../rs/terms/patch"A patch -- efficient change information This specification

rel="memento" Link to time-specific URI Memento Internet Draft

rel="timegate" Link to timegate Memento Internet Draft

rel="via" Provenance chain, came from RFC4287

Page 114: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Related Resource Metadata Summary

• Attributes of the <rs:ln> element; c.f. resource metadata + pri

Element/Attribute Description Defined by

<rs:ln>   ResourceSync

encoding HTTP Content-Encoding header value RFC2616

hash One or more content digests (md5, sha-1, sha-256) Atom Link Ext.

href Related resource URI (identity) RFC4287

length HTTP Content-Length header value RFC4287

modified Timestamp of last change (c.f. <lastmod>) Atom Link Ext.

path Path in ZIP package (Dump Manifests only) ResourceSync

pri Priority of link RFC6249

rel Relation - IANA registered or URI RFC4287

type HTTP Content-Type header value RFC4287

Page 115: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync - Agenda

4. Framework (Technical) Details

1. Sitemaps

2. Pull method

3. Linking between resources

4. Discovery

5. Push method

6. Archives

115

Page 116: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Discovery ofCapabilities

116

Page 117: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Discovery of Capability Documents

Requirements:• Need to discover capability documents, i.e. Resource List,

Resource Dump, Change List, Change Dump, Archives• Need to know the type of capability each document

represents.

Approach:• The Capability List provides links to these capability documents,

if the Source supports them.• These links have appropriate relation types, e.g.

“resourcelist”, “changelist”, etc.

117

Page 118: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Capability List

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”capabilitylist”/> <rs:ln rel=“resourcesync” href=“http://example.com/.well-known/resourcesync”/> <url> <loc>http://aggregator.example.com/dataset1/resourcelist.xml</loc> <rs:md capability=”resourcelist”/> </url> <url> <loc>http://aggregator.example.com/dataset1/changelist.xml</loc> <rs:md capability=”changelist”/> </url></urlset>

118

Page 119: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

119

Requirements:• Need to discover a Capability List

Approach:• HTTP Link header from resources subject to synchronization,

relation type “resourcesync”• Links from HTML document <head>, relation type

“resourcesync”• Links from Capability documents, relation type “up”

Link header on example.com/res1.pdf

Link: <example.com/dataset1/capabilitylist.xml>;rel=“resourcesync”

Discovery of Capability Lists

Page 120: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Discovery ofCapabilities

120

Page 121: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Discovery: ResourceSync Description

Requirements:• Support for multiple Capability Lists, one per “set of

resources”• Need to discover these Capability Lists• Need descriptive information about each set of resources

that a Capability List pertains to• Useful to have descriptive information about the Source itself

Approach:• The ResourceSync Description document meets these

requirements. • It should be at a particular location to avoid having registries:

http://(hostname)/.well-known/resourcesync• It can be linked to from the Capability Lists as well.

121

Page 122: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Discovery ofCapabilities

122

Page 123: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync Description

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”resourcesync”/> <rs:ln rel=“describedby” href=“http://example.com/info_about_source.xml”/> <url> <loc>http://aggregator.example.com/dataset1/capabilitylist.xml</loc> <rs:md capability=”capabilitylist”/> <rs:ln rel=“describedby” href=“http://example.com/dataset1/info_about_dataset1.xml”/> </url> <url> <loc>http://aggregator.example.com/dataset2/capabilitylist.xml</loc> <rs:md capability=”capabilitylist”/> <rs:ln rel=“describedby” href=“http://example.com/dataset2/info_about_dataset2.xml”/> </url></urlset>

123

Page 124: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Discovery ofCapabilities

124

Page 125: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync - Agenda

4. Framework (Technical) Details

1. Sitemaps

2. Pull method

3. Linking between resources

4. Discovery

5. Push method

6. Archives

125

Page 126: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Motivation for a Push Component in ResourceSync

126

• Reduce synchronization latency by having the Source push out resource change information• To avoid continuous pull of Change Lists by Destinations

• Share information about changes to the Source’s ResourceSync implementation, e.g. announcement of new Resource List, new Capability List, etc.• To avoid continuous polling of e.g. Resource Lists,

ResourceSync Description

Page 127: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Notification Types

127

• Events pertaining to a resource• updated | created | deleted for a resource• 3rd party defined events

• Events pertaining to a set of resources• updated | created | deleted for a Resource List, Resource

Dump, Change List, Change Dump, Archives• 3rd party defined events

• Events pertaining to the overall ResourceSync implementation• updated | created | deleted for a Capability List,

ResourceSync Description• 3rd party defined events

Page 128: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Possible Push Technology: XMPP PubSub

128

Other technologies: WebSockets, HTTP callback

Page 129: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Notification Payload

129

• Payload the same irrespective of transport protocol• Use <urlset> as encapsulating element• One <url> element per notification

Page 130: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Notification Payload – Resource Update (XMPP)

130

<xmpp:iq from=“[email protected]” to=“[email protected]” type=“set” id=“liAJUz3S”> <xmpp:pubsub> <xmpp:publish node=“resource_notification_channel”> <xmpp:item id=“1234577”> <sm:urlset xmlns:sm=“http://www.sitemaps.org/schemas/sitemap/0.9” xmlns:rs=“http://www.openarchives.org/rs/terms/”> <sm:url> <sm:loc>http://example.com/res1</sm:loc> <sm:lastmod>2013-01-02T14:00:00Z</sm:lastmod> <rs:md change=“updated” hash=“md5:12324324jhhjl234234” length=“987665” type=“application/pdf”/> </sm:url> </sm:urlset> </xmpp:item> </xmpp:publish> </xmpp:pubsub></xmpp:iq>

Page 131: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Notification Payload – Capability Update (XMPP)

131

<xmpp:iq from=“[email protected]” to=“[email protected]” type=“set” id=“liAJUz3S”> <xmpp:pubsub> <xmpp:publish node=“changelist_notification_channel”> <xmpp:item id=“1234577”> <sm:urlset xmlns:sm=“http://www.sitemaps.org/schemas/sitemap/0.9” xmlns:rs=“http://www.openarchives.org/rs/terms/”> <sm:url> <sm:loc>http://example.com/dataset1/changelist.xml</sm:loc> <sm:lastmod>2013-01-02T14:00:00Z</sm:lastmod> <rs:md capability=“changelist” change=“updated”/> </sm:url> </sm:urlset> </xmpp:item> </xmpp:publish> </xmpp:pubsub></xmpp:iq>

Page 132: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Push Technology Considerations

132

• Notification channels• Multiple channels per Source to divide up notifications, e.g.

• a channel for changes pertaining to all resources that belong to a set of resources

• a channel for changes to capabilities for a set of resources

• Server-side filtering preferred over client-side

• Authentication/Authorization• To subscribe/create channels

Page 133: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Push Technology Considerations

133

• Delayed notification• Insurance that Destination does not miss anything

• Discovery• Links to channels e.g. from a Capability List• Links from channels to other channels• Provide channel metadata (transport protocol info etc.)

Page 134: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

<urlset xmlns=“http://www.sitemaps.org/schemas/sitemap/0.9” xmlns:rs=“http://www.openarchives.org/rs/terms/”>… <url> <loc>xmpp:pubsub.example.com/dataset1?;node=resource_notification_channel</loc> <rs:md capability=“resource-notification”/> <rs:ln rel=“alternate” href=“ws://example.com/dataset1/meta_notification_channel”/> </url> <url> <loc>xmpp:pubsub.example.com/dataset1?;node=capability_notification_channel</loc> <rs:md capability=“capability-notification”/> </url> <url> <loc>xmpp:pubsub.example.com/dataset1?;node=resourcesync_notification_channel</loc> <rs:md capability=“resourcesync-notification”/> </url></urlset>

Push Channel Discovery

134

Page 135: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync - Agenda

4. Framework (Technical) Details

1. Sitemaps

2. Pull method

3. Linking between resources

4. Discovery

5. Push method

6. Archives

135

Page 136: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync Framework Component: Archives

In order to allow a Source to hold on to historical data and Destinations to catch up with events it has missed:

o Publish a - Resource List Archive, - Resource Dump Archive,- Change List Archive, and/or a - Change Dump Archive

o Documents, listing historical capability documents

136

Page 137: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Resource List Archive

137

Page 138: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="resourcelist-archive" from="2013-01-09T13:00:00Z"/> <url> <loc>http://example.com/resourcelist1.xml</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> </url> <url> <loc>http://example.com/resourcelist2.xml</loc> <lastmod>2013-01-09T13:00:00Z</lastmod> </url> <url> … </url></urlset>

Resource List Archive

138

Page 139: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Resource Dump Archive

139

Page 140: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="resourcedump-archive" from="2013-02-10T03:00:00Z"/> <url> <loc>http://example.com/resourcedump1.xml</loc> <lastmod>2013-01-10T03:00:00Z</lastmod> </url> <url> <loc>http://example.com/resourcedump2.xml</loc> <lastmod>2013-02-10T03:00:00Z</lastmod> </url> <url> … </url></urlset>

Resource Dump Archive

140

Page 141: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Change List Archive

141

Page 142: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist-archive" from="2013-02-01T23:00:00Z until="2013-02-03T23:00:00Z"/> <url> <loc>http://example.com/changelist1.xml</loc> <lastmod>2013-02-01T23:00:00Z</lastmod> </url> <url> <loc>http://example.com/changelist2.xml</loc> <lastmod>2013-02-02T23:00:00Z</lastmod> </url> <url> <loc>http://example.com/changelist3.xml</loc> <lastmod>2013-02-03T23:00:00Z</lastmod> </url></urlset>

Change List Archive

142

Page 143: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Change Dump Archive

143

Page 144: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changedump-archive" from="2013-02-10T03:00:00Z until="2013-02-17T03:00:00Z"/> <url> <loc>http://example.com/changedump1.xml</loc> <lastmod>2013-02-10T03:00:00Z</lastmod> </url> <url> <loc>http://example.com/changedump2.xml</loc> <lastmod>2013-02-17T03:00:00Z</lastmod> </url> <url> … </url></urlset>

Change Dump Archive

144

Page 145: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync - Agenda

1. ResourceSync: Problem Perspective & Conceptual Approach

2. Motivation & Use Cases

3. Framework Walkthrough

4. Framework (Technical) Details

5. Implementation

6. Q&A

145

Page 146: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Implementation #1:The Metadata Harvesting Use Case

146

Page 147: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

The Metadata Harvesting Use Case

1. Identification of metadata records within a service

2. Use of standards in metadata formats

3. Incremental updates

4. Create, Update, Delete

5. Sets

147

Page 148: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

The Metadata Harvesting Use Case

1. Identification of metadata records within a service

2. Use of standards in metadata formats

148

ResourceSync does not specifically care about metadata records, only resources. It is up to the server to identify which of those resources are metadata.

We are free to annotate a resource's entry with appropriate metadata to indicate the format.

Page 149: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

The Metadata Harvesting Use Case

3. Incremental updates

4. Create, Update, Delete

5. Sets

149

All resources that can be obtained from a change list will be annotated with the kind of change that happened to them.

ResourceSync allows the server to publish lists of resources and changes and indexes of those lists all annotated with metadata.

ResourceSync publishes changes as static documents. The client is then free to walk up and down the change lists provided by the server.

Page 150: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

(Required) Documents formetadata harvesting use case

150

Page 151: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Describing Metadata Resources

151

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="resourcelist" from="2013-05-05T13:00:00Z"/> <url> <loc>http://mydspace.edu/dspace-rs/resource/123456789/7/qdc</loc> <lastmod>2013-05-01T19:09:35Z</lastmod> <changefreq>never</changefreq> <rs:md type=”application/xml”/> <rs:ln href="http://mydspace.edu/bitstream/123456789/7/1/bitstream.pdf" rel="describes"/> <rs:ln href="http://mydspace.edu/bitstream/123456789/7/2/image.jpg" rel="describes"/> <rs:ln href="http://mydspace.edu/123456789/3" rel=”collection"/> </url></urlset>

Page 152: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Describing Bitstream Resources

152

<urlset … <url> <loc>http://mydspace.edu/bitstream/123456789/7/1/bitstream.pdf</loc> <lastmod>2013-05-01T19:09:35Z</lastmod> <changefreq>never</changefreq> <rs:md hash="md5:75d0ea94097a05fce9aca5b079e2f209" length="419805" type="application/pdf"/> <rs:ln href="http://mydspace.edu/dspace-rs/resource/123456789/7/qdc" rel="describedby"/> <rs:ln href="http://mydspace.edu/dspace-rs/resource/123456789/7/mets" rel="describedby"/> <rs:ln href="http://mydspace.edu/dspace-rs/resource/123456789/12/qdc" rel="describedby"/> <rs:ln href="http://mydspace.edu/123456789/2" rel=”collection"/> </url></urlset>

Page 153: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Serving Metadata Resources

153

http://mydspace.edu/dspace-rs/resource/123456789/7/qdc

ResourceSync webapp Item handle Metadata Format

metadata.formats = \ qdc = http://purl.org/dc/terms/, \ mets = http://www.loc.gov/METS/

metadata.types = \ qdc = application/xml, \ mets = application/xml

<loc>http://mydspace.edu/dspace-rs/resource/123456789/7/qdc<loc> <rs:md type="application/xml”/> <rs:ln href="http://purl.org/dc/terms/" rel="describedby"/>

<loc>http://mydspace.edu/dspace-rs/resource/123456789/7/mets</loc> <rs:md type="application/xml”/> <rs:ln href="http://www.loc.gov/METS/" rel="describedby"/>

Page 154: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Generating Documents1. Initialise

Creates initial Capability List and Resource List documents

[dspace]/bin/dspace dsrun org.dspace.resourcesync.ResourceSyncGenerator -i

2. Update

Creates a new Change List which covers the period since the last Change List was created

[dspace]/bin/dspace dsrun org.dspace.resourcesync.ResourceSyncGenerator -u

3. Rebase

A combination of both Initialise and Update.

[dspace]/bin/dspace dsrun org.dspace.resourcesync.ResourceSyncGenerator -r

154

Page 155: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Usage of Resources by clients

155

Page 156: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Impact on DSpace

156

Page 157: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

URLs• Stable identifiers for archived items• Stable identifiers for unarchived items• Stable identifiers for metadata resources (in their various formats)• Stable identifiers for previous versions

Provenance• History of changes to an item/bitstream• Item/bitstream deletions (vs withdraw)• Bitstream create/update dates• Item create/update dates

157

?

Page 158: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Versioning• Access of previous versions of both metadata and bitstreams• Stable identifiers for previous versions of both metadata and

bitstreams

Metadata Resources• Metadata in a variety of formats• Metadata as file/bitstream

158

?

?

Page 159: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Admin Files• ResourceSync documents (Resource Lists, Change Lists, etc)• ResourceSync exports - Resource Dumps, Change Dumps• Metadata exports in a number of formats

Scheduled Tasks• Regular generation of RS documents

Complex Objects• Item/bitstream relationships• Collections of content

159

Page 160: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Dspace Module:https://github.com/CottageLabs/DSpaceResourceSync

depends on the common java library:https://github.com/CottageLabs/ResourceSyncJava

PHP client:https://github.com/stuartlewis/resync-php

depends on the SWORDv2 clienbt library:https://github.com/swordapp/swordappv2-php-library/

Get the software!

160

Page 161: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Implementation #2:ResourceSync at arXiv.org

161

Page 162: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync @ arXiv

• Use ResourceSync for both mirroring and public data accesso efficient updateso ability to do periodic auditso public synchronization capabilityo reduce admin burden

• Likely start with metadata + source for mirroring use case (doing experiments now)

• Open access use cases requires processed PDF also• Some concerns about likely use/load…

162

Page 163: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

163

Page 164: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Alternate download location

• Likely want to separate machine accesses from human accesses to preserve response time on main server

=> Use Mirrored Content part of spec

o <loc> specifies canonical URI - e.g. http://arxiv.org/pdf/1306.1073v1.pdf

o <rs:ln rel=“duplicate”> specifies preferred download location- e.g. http://export.arxiv.org/pdf/1306.1073v1.pdf

164

Page 165: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

<url> <loc>http://arxiv.org/pdf/1306.1073v1.pdf</loc> <lastmod>2013-06-06T00:57:12Z</lastmod> <rs:md hash="md5:e08e0c4e4d7b0895120014f0aa09e7c4" length="287714” type=”application/pdf"/> <rs:ln rel="duplicate” pri="1" href="http://export.arxiv.org/pdf/1306.1073v1.pdf" modified="2013-06-06T02:00:59Z"/></url>

Alternate download location

165

Page 166: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Getting a copy of arXiv

It might be as easy as:

166

(of course, you probably have to wait a while but it is nice to know ResourceSync is stateless so one can efficiently restart)

Page 167: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Python Library and Client

• Aim to provide library code implementing all ResourceSync facilities for use in both source and destination implementationso Designed for python 2.6 (RHEL6) and 2.7o Will not work with python <= 2.5

• Client (resync) supports many destination operations, inspired by the common Unix rsync program

• Client also supports some operations that might be useful in a source, such as generation of static Resource Lists, or periodic Change Lists (used in arXiv experiments)

• Explorer (resync-explorer) intended to allow easy inspection of a source’s resource sets and capabilities

• Developed since ResourceSync v0.5, updated for v0.9

http://github.org/resync/resync

Page 168: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync Source Simulator

• Python code using Tornado server• Provides random set of resources of different sizes updated at a

particular rate• Very useful for testing Destination code

http://github.com/resync/simulator

Page 169: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync - Agenda

1. ResourceSync: Problem Perspective & Conceptual Approach

2. Motivation & Use Cases

3. Framework Walkthrough

4. Framework (Technical) Details

5. Implementation

6. Q&A

169

Page 170: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

170

Page 171: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

171

Page 172: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

172

Page 173: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Timeline

• June 2013o Version 0.9 of ResourceSync framework specification releasedo Soliciting broad feedback

• July 2013o Version 0.x of Push-based methods for ResourceSync

• Fall 2013o Specification becomes NISO standard

173

Page 174: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

Pointers

• Specification

http://www.openarchives.org/rs/http://www.openarchives.org/rs/0.9/resourcesynchttp://www.openarchives.org/rs/0.9/archives

• List for public comment

https://groups.google.com/d/forum/resourcesync

• Client and simulator code

http://github.org/resync/resynchttp://github.org/resync/simulator

174

Page 175: ResourceSync Tutorial from Open Repositories 2013

ResourceSync TutorialJuly 8, 2013, Open Repositories 2013, PEI, Canada

ResourceSync:A Web-Based

Resource SynchronizationFramework

ResourceSync is funded by The Sloan Foundation & JISC

#resourcesync

175