
Experts Workshop on the IPT, v. 2, Copenhagen, Denmark

The Pathway to the Integrated Publishing Toolkit version 2

Tim Robertson

Systems Architect

Global Biodiversity Information Facility (GBIF)

[email protected]

20 June 2011


Agenda

‣Why an IPT?

‣The project history

‣IPT version 1.0

‣The rationale for version 2.0

‣Key functionality of the IPT v2.0

Who has used an IPT?

Who has installed an IPT?

The IPT Vision

‣A single platform allowing the sharing of

‣Primary biodiversity data

‣Species name information

‣Dataset descriptions (metadata)

The IPT Vision

‣The ability to register with GBIF

‣Technical contact information

‣E.g. Internet URLs

‣Physical contact information

‣E.g. telephone details

‣Institutional affiliations

‣Accurate attribution

The IPT Vision

‣Connect databases

‣Upload text files

‣Lower the technical threshold for participation

The IPT Vision

‣Flexibility to accommodate data extensions

‣Support efficient and simple transfer of content

‣An open source project

Why an IPT?

‣Biodiversity provider tools existed

‣DiGIR: PHP implementation

‣BioCASe: Python implementation

‣TAPIR: PHP / .NET implementation

Why an IPT?

‣Limitations in existing tools

‣Checklist content lacking

‣No formally recognized metadata standards

‣No automatic registration with GBIF

‣Schemas either simple or very complex

‣Data transfer sub-optimal (e.g. speed)

‣No ability to upload data

Why an IPT?

Who has used the IPT v1.0?

Who had trouble using the IPT v1.0?

IPT v1.0

‣First released 2009

‣Java based web application

IPT v1.0: Feature rich

‣Administration: users, organisations, extensions, vocabularies

‣Datasets: text files, connect a database

‣Discovery of content: graphs, metrics, maps, search, browse

‣Interfaces: DwC Archive, TAPIR, OGC WMS

Consequences of features

‣Required an embedded database: limited performance

‣Required a mapping server: significant resources (memory)

Community Feedback

‣Server requirements too high for many

‣Performance unsatisfactory

‣Dataset size limitations a barrier

‣Stability unacceptable

‣Data loss in 2 instances

‣Complexity too high for some

The concept was sound!

…rationale for

Who has used the IPT v2.0?

Who has installed the IPT v2.0?

v2.0: Key functionality

‣User management

‣Extension management

‣Institution management

‣Configuring datasets

‣Managing dataset state

‣Interfaces

User management

‣Administrator

‣Manager (different trust levels)

‣With registration permissions

‣Without registration permissions

‣General user

Extension management

‣By communicating with the GBIF registry, automatically discover

‣Data extensions

‣Vocabularies

Institution management

‣No ability to create institutions

‣By communicating with the GBIF registry, select

‣Institution hosting the IPT

‣Institutions that will share datasets in the IPT

Configure Datasets

‣Author metadata: GBIF Metadata profile

‣Upload text files: CSV, tab delimited etc.

‣Connect a database: MySQL, Oracle, SQL Server, PostgreSQL etc.

Configure Datasets

‣Map content to extensions

‣Manage user permissions

‣Shared dataset management
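As an illustration of what "map content to extensions" means in practice, the sketch below renames the columns of an uploaded text file to Darwin Core terms. It is not the IPT's own code: the source column names, the `COLUMN_TO_TERM` table and the `map_rows` helper are hypothetical, and only the term names on the right-hand side are standard Darwin Core terms.

```python
import csv
import io

# Hypothetical mapping from source column names to Darwin Core terms.
# The source column names are invented for illustration; the values are
# standard Darwin Core term names.
COLUMN_TO_TERM = {
    "id": "occurrenceID",
    "species": "scientificName",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "collected": "eventDate",
}


def map_rows(text, delimiter=","):
    """Yield one dict per source row, keyed by Darwin Core term.

    Columns that have no entry in COLUMN_TO_TERM are dropped.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        yield {
            COLUMN_TO_TERM[column]: value
            for column, value in row.items()
            if column in COLUMN_TO_TERM
        }


# Invented sample data: one occurrence record plus an unmapped "notes" column.
SAMPLE = (
    "id,species,lat,lon,collected,notes\n"
    "1,Puma concolor,-12.5,-70.1,2010-05-04,seen at dusk\n"
)

records = list(map_rows(SAMPLE))
```

The same function handles tab-delimited files by passing `delimiter="\t"`.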

Configure Datasets

‣Manage dataset state

‣Private: visible only to the managers

‣Public: visible to anybody

‣Registered: on the GBIF network
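The three dataset states can be sketched as a small enumeration. This is a minimal illustration, not the IPT's implementation: the names `DatasetState`, `can_view` and `is_on_gbif_network` are hypothetical, and the sketch assumes that a registered dataset is also visible to anybody, which the slide does not state explicitly.

```python
from enum import Enum


class DatasetState(Enum):
    """Hypothetical model of the three dataset states listed above."""
    PRIVATE = "private"        # only to the managers
    PUBLIC = "public"          # anybody
    REGISTERED = "registered"  # on the GBIF network


def can_view(state, is_manager):
    """Return True if a user may see a dataset in the given state.

    Assumption: registered datasets are visible to anybody, like public ones.
    """
    if state is DatasetState.PRIVATE:
        return is_manager
    return True


def is_on_gbif_network(state):
    """Only registered datasets are advertised on the GBIF network."""
    return state is DatasetState.REGISTERED
```

For example, `can_view(DatasetState.PRIVATE, is_manager=False)` is `False`, while the same call for a public dataset is `True`.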

Interfaces

‣Darwin Core Archive

‣Ecological Metadata Language

‣Also available as a manuscript from version 2.0.2 onwards

‣Reduced functionality

‣TAPIR

‣Geoserver

‣Visualisations

‣Search and browse

‣Reduced server requirements

‣Memory: 1-2 GB (v1.0), now 256 MB (v2.0)

‣Increased performance

‣24m records

‣50 minutes

‣MySQL

‣256 MB memory

‣No internal database‣Increase robustness with simple files
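To put the performance figures above in perspective, the arithmetic below converts them into an average throughput. It assumes that "24m" means 24 million records and that the 50 minutes cover the whole run; the function name `records_per_second` is illustrative only.

```python
def records_per_second(records, minutes):
    """Average throughput for a run of `records` rows taking `minutes`."""
    return records / (minutes * 60)


# Figures quoted on the slide, reading "24m" as 24 million.
throughput = records_per_second(24_000_000, 50)
```

Under those assumptions the average works out to 8,000 records per second.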