Publishing and Cataloguing Datasets It’s time everyone got involved UKSG Conference 2009.

Page 1:

Publishing and Cataloguing Datasets

It’s time everyone got involved

UKSG Conference 2009

Page 2:

BUT FIRST, LET’S GET SERIOUS

Part 1

OECD likes being cool with data.

Page 3:

Some data to start with

TIB in Hannover say they have archived and added DOIs to 500,000 datasets. Yet their most-cited dataset has been cited just 3 times.

Page 4:

So …

Either no-one wants to cite data

OR, having a DOI by itself isn’t enough

Page 5:

Let us imagine for a moment . . .

Page 6:

If an article is . . .

“A piece of data that is presented in a static, two-dimensional form.”

Geoffrey Bilder, CrossRef, 2007

Page 7:

Static, two-dimensional objects

• Print

• HTML

• PDF

http://dx.doi.org/10.1787/280675838368

Page 8:

http://dx.doi.org/10.1787/280675838368

Active, two-dimensional object

Page 9:

Another active, two-dimensional object?

Page 10:

It’s a view on a datacube

Page 11:

In fact, it’s a view on a collection of datacubes

Active, multi-dimensional object!
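
The idea that a table is only a view on a datacube can be sketched in a few lines of Python. This is a minimal illustration, not OECD's system: the cube layout, the region names, the indicator names and every figure below are made-up assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical datacube: (region, year, indicator) -> value.
# All names and numbers are invented for illustration only.
Cube = Dict[Tuple[str, int, str], float]

cube: Cube = {
    ("Region A", 2006, "population"): 1.2,
    ("Region A", 2007, "population"): 1.3,
    ("Region B", 2006, "population"): 0.8,
    ("Region B", 2007, "population"): 0.9,
    ("Region A", 2006, "unemployment"): 6.1,
    ("Region A", 2007, "unemployment"): 5.8,
    ("Region B", 2006, "unemployment"): 7.4,
    ("Region B", 2007, "unemployment"): 7.0,
}


def table_view(cube: Cube, indicator: str) -> List[list]:
    """Fix one dimension (the indicator) and return a two-dimensional
    region-by-year table, header row first."""
    regions = sorted({r for r, _, i in cube if i == indicator})
    years = sorted({y for _, y, i in cube if i == indicator})
    header = ["region"] + [str(y) for y in years]
    # cube.get returns None for cells the cube does not contain.
    rows = [[r] + [cube.get((r, y, indicator)) for y in years] for r in regions]
    return [header] + rows


if __name__ == "__main__":
    for row in table_view(cube, "population"):
        print(row)
```

Each call to `table_view` yields one flat table of the kind that ends up in a spreadsheet or a PDF; the cube itself holds every such table at once, which is what makes it the multi-dimensional object.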

Page 12:

OECD Article: static, two-dimensional object

OECD Excel Table: active, two-dimensional object

OECD Database: active, multi-dimensional object

http://dx.doi.org/10.1787/280675838368

Page 13:

So, instead of imagining, let’s say we built this.

We’d get something like . . .

Page 14:

. . . this.

Dataset: OECD Regional Database

• Excel – Active two-dimensional object

• Dataset – Active multi-dimensional object

• PDF – Passive two-dimensional object

Page 15:

OK – that’s cool and OECD can do this because we have all the objects in our publishing system.

But how are other publishers, authors and librarians coping with data?

Page 16:

Source: OECD

Chart from The Economist

Page 17:

Source: Acemoglu et al (2001), based on Curtin, 1989, Philip D. Curtin, Death by migration: Europe’s encounter with the tropical world in the nineteenth century, Cambridge University Press, New York (1989). Curtin 1989 and other sources.

Tertiary school enrollment: School enrollment, tertiary (% of gross).

Source: Barro and Lee (2000) and their databases.

Taken from an appendix to an article published in Elsevier’s World Development

You can’t fault the author for trying . . . but it’s not a lot of help for a reader

Page 18:

And Librarians,

How many are cataloguing datasets in their OPACs in ways which are compatible with search systems for books and journals?

Page 19:

Conclusion: Datasets: Scholarly Publishing’s Black Sheep?

Page 20:

• A&I & subject portals: EconLit, RePEc

• Publishers: ScienceDirect

• Library portals: OPACs

• Content Aggregators: Ingenta

Scholarly Publishing Sites for Journals and Books

Network

Page 23:

Using metadata for: Datasets

In the same industry standard formats as . . .

Book chapters and Journal articles

Page 24:

Authors will be able to cite . . .

Publishers will be able to link . . .

Discovery systems will be able to find . . .

Librarians will be able to catalogue . . .

Datasets alongside published outputs . . .

. . . to the benefit of Everyone

Page 25:

A proposed example of a dataset using standard bibliographic and citation metadata.

• Bibliography of books that cite this database

• Citation tool compatible with EndNote et al.

• Dataset title with ISSN, DOI (& MARC) record
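
As a rough idea of what a citation record "compatible with EndNote et al." could contain, here is a minimal Python sketch that writes a dataset description in the RIS tagged format, which reference managers commonly import. It is an assumption about one possible export, not the tool shown on the slide. The record fields, the ISSN, the DOI and the year are hypothetical placeholders.

```python
from typing import Dict

# Mapping from our (hypothetical) record keys to RIS tags.
RIS_TAGS = [
    ("title", "TI"),
    ("author", "AU"),
    ("year", "PY"),
    ("publisher", "PB"),
    ("issn", "SN"),
    ("doi", "DO"),
    ("url", "UR"),
]


def to_ris(record: Dict[str, str]) -> str:
    """Render a dataset record as RIS text. Every RIS line is a
    two-letter tag, two spaces, a hyphen, a space, then the value."""
    lines = ["TY  - DATA"]  # RIS reference type for a data file
    for key, tag in RIS_TAGS:
        value = record.get(key)
        if value:  # skip fields the record does not have
            lines.append(f"{tag}  - {value}")
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines)


# Placeholder values: the ISSN, DOI, URL and year are NOT real identifiers.
example = {
    "title": "OECD Regional Database",
    "author": "OECD",
    "year": "2009",
    "publisher": "OECD",
    "issn": "0000-0000",
    "doi": "10.9999/example-dataset",
    "url": "http://dx.doi.org/10.9999/example-dataset",
}

ris_text = to_ris(example)

if __name__ == "__main__":
    print(ris_text)
```

The point of the sketch is that a dataset needs nothing exotic here: the same handful of fields used for a book chapter or a journal article is enough to make it citable.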

Page 26:

There are still challenges:

• Dynamic data

• Versioning

• Preservation

But, let’s round the sheep up first.

Page 27:

OECD is:

• Issuing a white paper on Publishing Standards for Datasets

• Speaking with CrossRef about citation standards for dynamic objects

• Publishing OECD datasets with ‘sheepdogs’ from mid-2009: MARC records, ONIX records, Citation records
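
To make the MARC "sheepdog" a little more concrete, the sketch below lists a few MARC 21 fields a catalogue record for a dataset might carry: 022 for the ISSN, 024 for the DOI, 245 for the title and 856 for the link. It is only an illustration under stated assumptions, not OECD's actual record layout. The output is a human-readable field listing rather than a binary MARC file, and the ISSN and DOI are hypothetical placeholders.

```python
from typing import Dict, List


def marc_fields(record: Dict[str, str]) -> List[str]:
    """Return a readable listing of MARC 21 fields for a dataset.
    Format per line: tag, two indicators ('_' means blank), subfields."""
    doi = record["doi"]
    return [
        f"022 __ $a {record['issn']}",            # ISSN
        f"024 7_ $a {doi} $2 doi",                # other standard identifier, source given in $2
        f"245 00 $a {record['title']}",           # title statement
        f"856 40 $u http://dx.doi.org/{doi}",     # electronic location (HTTP, the resource itself)
    ]


# Placeholder values: the ISSN and DOI are NOT real identifiers.
dataset = {
    "title": "OECD Regional Database",
    "issn": "0000-0000",
    "doi": "10.9999/example-dataset",
}

fields = marc_fields(dataset)

if __name__ == "__main__":
    for line in fields:
        print(line)
```

Because these are the same fields a librarian already uses for journals and books, a record like this could sit in an OPAC next to them and be found by the same searches.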

Page 28:

End of Part One

QUESTIONS?

DISCUSSION?

Page 29:

BEING COOL WITH DATA: OECD’S FIRST STEPS

Part 2

Page 30:

March 2007

Page 31:

April 2009

• Print edition

• Web-book on SourceOECD

• USB Key Edition

• OECD Factbook on eXplorer (new for 2009)

• OECD Factbook on iPhone (new for 2009)

Page 32:

OECD Regional Statistics data using the eXplorer tool

OECD Regional Statistics using NCVA’s eXplorer tool

October 2008

March 2009

http://stats.oecd.org/OECDregionalstatistics/

Page 33:

Other cool visualisation stuff

• IMF Datamapper on www.imf.org. See also www.mappingworlds.com who provided the technology.

• See Gapcasts and Trendalyzer on www.gapminder.org

• The New York Times uses a lot of dynamic graphics

• USA Today built their reputation on graphics – now they’re doing it online. We like “How much is $700bn?”

• The Economist’s Chart Gallery generates a lot of comment.

• Data sharing sites include www.swivel.com, www.many-eyes.com and newcomers www.icharts.net and www.widgenie.com.

• There are many blogs on charts or visualization such as www.flowingdata.com or www.eagereyes.org