Experts Workshop on the IPT, v. 2, Copenhagen, Denmark The Pathway to the Integrated Publishing Toolkit version 2 Tim Robertson Systems Architect Global Biodiversity Information Facility (GBIF) [email protected] 20 June 2011

Experts Workshop on the IPT, v. 2, Copenhagen, Denmark

The Pathway to the Integrated Publishing Toolkit version 2

Tim Robertson, Systems Architect, Global Biodiversity Information Facility (GBIF)

[email protected]

20 June 2011

Agenda

‣Why an IPT?

‣The project history

‣IPT version 1.0

‣The rationale for version 2.0

‣Key functionality of the IPT v2.0

Who has used an IPT?

Who has installed an IPT?

The IPT Vision

‣A single platform allowing the sharing of:

  ‣Primary biodiversity data

  ‣Species name information

  ‣Dataset descriptions (metadata)

The IPT Vision

‣The ability to register with GBIF

‣Technical contact information

  ‣E.g. Internet URLs

‣Physical contact information

  ‣E.g. telephone details

‣Institutional affiliations

‣Accurate attribution

The IPT Vision

‣Connect databases

‣Upload text files

‣Lower the technical threshold for participation

The IPT Vision

‣Flexibility to accommodate data extensions

‣Support efficient and simple transfer of content

‣An open source project

Why an IPT?

‣Biodiversity provider tools existed

  ‣DiGIR

    ‣PHP implementation

  ‣BioCASe

    ‣Python implementation

  ‣TAPIR

    ‣PHP / .NET implementation

Why an IPT?

‣Limitations in existing tools

  ‣Checklist content lacking

  ‣No formally recognized metadata standards

  ‣No automatic registration with GBIF

  ‣Schemas either simple or very complex

  ‣Data transfer sub-optimal (e.g. speed)

  ‣No ability to upload data

Why an IPT?

Why an IPT?

Who has used the IPT v1.0?

Who had trouble using the IPT v1.0?

IPT v1.0

‣First released 2009

‣Java-based web application

IPT v1.0: Feature rich

‣Administration

  ‣Users, organisations, extensions, vocabularies

‣Datasets

  ‣Text files, connect a database

‣Discovery of content

  ‣Graphs, metrics, maps, search, browse

‣Interfaces

  ‣DwC Archive, TAPIR, OGC WMS

Consequences of features

‣Required an embedded database

  ‣Limited performance

‣Required a mapping server

  ‣Significant resources (memory)

Community Feedback

‣Server requirements too high for many

‣Performance unsatisfactory

‣Dataset size limitations a barrier

‣Stability unacceptable

  ‣Data loss in 2 instances

‣Complexity too high for some

The concept was sound!

…rationale for

Who has used the IPT v2.0?

Who has installed the IPT v2.0?

v2.0: Key functionality

‣User management

‣Extension management

‣Institution management

‣Configuring datasets

‣Managing dataset state

‣Interfaces

User management

‣Administrator

‣Manager (different trust levels)

  ‣With registration permissions

  ‣Without registration permissions

‣General user

Extension management

‣By communicating with the GBIF registry, automatically discover:

  ‣Data extensions

  ‣Vocabularies

Institution management

‣No ability to create institutions

‣By communicating with the GBIF registry, select:

  ‣Institution hosting the IPT

  ‣Institutions that will share datasets in the IPT

Configure Datasets

‣Author metadata

  ‣GBIF Metadata profile

‣Upload text files

  ‣CSV, tab delimited etc.

‣Connect a database

  ‣MySQL, Oracle, SQL Server, PostgreSQL etc.

Configure Datasets

‣Map content to extensions

‣Manage user permissions

  ‣Shared dataset management
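
As an illustration of what "map content to extensions" means in practice, the following minimal sketch renames the columns of a tab-delimited text file to Darwin Core terms. It is not the IPT's own code: the source column names (`species`, `lat`, `lon`), the `COLUMN_TO_TERM` table and the `map_rows` helper are all hypothetical, chosen only for this example. The target names are standard Darwin Core terms.

```python
import csv
import io

# Hypothetical source columns (left) mapped to Darwin Core terms (right).
COLUMN_TO_TERM = {
    "species": "scientificName",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
}


def map_rows(text, delimiter="\t"):
    """Read delimited text and rename each mapped column to its term.

    Columns that have no entry in COLUMN_TO_TERM are dropped.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [
        {term: row[col] for col, term in COLUMN_TO_TERM.items() if col in row}
        for row in reader
    ]


# A one-record sample file, invented for the example.
sample = "species\tlat\tlon\nPuma concolor\t-12.5\t-70.1\n"
mapped = map_rows(sample)
print(mapped)
```

Values stay as strings here; any type conversion or validation would be a separate step.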

Configure Datasets

‣Manage dataset state

  ‣Private: visible only to the managers

  ‣Public: visible to anybody

  ‣Registered: on the GBIF network
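
The three dataset states can be sketched as a small visibility rule. This is only an illustration under stated assumptions, not the IPT's implementation: the `DatasetState` enum and the `can_view` function are hypothetical names, and the sketch assumes that a registered dataset is at least as visible as a public one.

```python
from enum import Enum


class DatasetState(Enum):
    PRIVATE = "private"        # only the dataset's managers
    PUBLIC = "public"          # anybody
    REGISTERED = "registered"  # on the GBIF network


def can_view(state, is_manager):
    """Return True if a user may see a dataset in the given state."""
    if state is DatasetState.PRIVATE:
        return is_manager
    # Assumption: PUBLIC and REGISTERED datasets are visible to everyone.
    return True


print(can_view(DatasetState.PRIVATE, is_manager=False))
print(can_view(DatasetState.REGISTERED, is_manager=False))
```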

Interfaces

‣Darwin Core Archive

‣Ecological Metadata Language

  ‣Now as a manuscript also in 2.0.2+

‣Reduced functionality

  ‣TAPIR

  ‣Geoserver

  ‣Visualisations

  ‣Search and browse

‣Reduced server requirements

  ‣Memory: 1-2 GB (v1.0), now 256 MB (v2.0)

‣Increased performance

  ‣24m records

  ‣50 minutes

  ‣MySQL

  ‣256MB memory
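
To put the performance figures in perspective, the short calculation below converts them into an average throughput. It assumes that "24m" means 24 million records and that all four figures describe a single run. The slide does not say what operation was timed, so the result is only a rough average rate.

```python
records = 24_000_000  # "24m records", read as 24 million (assumption)
minutes = 50

seconds = minutes * 60
records_per_second = records / seconds

print(f"{records_per_second:.0f} records per second on average")
```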

‣No internal database

  ‣Increase robustness with simple files