Experts Workshop on the IPT, v. 2, Copenhagen, Denmark
The Pathway to the Integrated Publishing Toolkit version 2
Tim Robertson, Systems Architect, Global Biodiversity Information Facility (GBIF)
20 June 2011
Agenda
‣Why an IPT?
‣The project history
‣IPT version 1.0
‣The rationale for version 2.0
‣Key functionality of the IPT v2.0
Who has used an IPT?
Who has installed an IPT?
The IPT Vision
‣A single platform allowing the sharing of
‣Primary biodiversity data
‣Species name information
‣Dataset descriptions (metadata)
The IPT Vision
‣The ability to register with GBIF
‣Technical contact information
‣E.g. Internet URLs
‣Physical contact information
‣E.g. telephone details
‣Institutional affiliations
‣Accurate attribution
The IPT Vision
‣Connect databases
‣Upload text files
‣Lower the technical threshold for participation
The IPT Vision
‣Flexibility to accommodate data extensions
‣Support efficient and simple transfer of content
‣An open source project
Why an IPT?
‣Biodiversity provider tools existed
‣DiGIR
‣PHP implementation
‣BioCASe
‣Python implementation
‣TAPIR
‣PHP / .NET implementation
Why an IPT?
‣Limitations in existing tools
‣Checklist content lacking
‣No formally recognized metadata standards
‣No automatic registration with GBIF
‣Schemas either simple or very complex
‣Data transfer sub-optimal (e.g. speed)
‣No ability to upload data
Who has used the IPT v1.0?
Who had trouble using the IPT v1.0?
IPT v1.0
‣First released in 2009
‣Java-based web application
IPT v1.0: Feature rich
‣Administration
‣Users, organisations, extensions, vocabularies
‣Datasets
‣Text files, connect a database
‣Discovery of content
‣Graphs, metrics, maps, search, browse
‣Interfaces
‣DwC Archive, TAPIR, OGC WMS
Consequences of features
‣Required an embedded database
‣Limited performance
‣Required a mapping server
‣Significant resources (memory)
Community Feedback
‣Server requirements too high for many
‣Performance unsatisfactory
‣Dataset size limitations a barrier
‣Stability unacceptable
‣Data loss in 2 instances
‣Complexity too high for some
The concept was sound!
…rationale for
Who has used the IPT v2.0?
Who has installed the IPT v2.0?
v2.0: Key functionality
‣User management
‣Extension management
‣Institution management
‣Configuring datasets
‣Managing dataset state
‣Interfaces
User management
‣Administrator
‣Manager (different trust levels)
‣With registration permissions
‣Without registration permissions
‣General user
Extension management
‣By communicating with the GBIF registry, automatically discover:
‣Data extensions
‣Vocabularies
Institution management
‣No ability to create institutions
‣By communicating with the GBIF registry, select:
‣Institution hosting the IPT
‣Institutions that will share datasets in the IPT
Configure Datasets
‣Author metadata
‣GBIF Metadata profile
‣Upload text files
‣CSV, tab delimited etc.
‣Connect a database
‣MySQL, Oracle, SQL Server, PostgreSQL etc.
Configure Datasets
‣Map content to extensions
‣Manage user permissions
‣Shared dataset management
Configure Datasets
‣Manage dataset state
‣Private: visible only to the managers
‣Public: visible to anybody
‣Registered: on the GBIF network
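The three dataset states above can be read as a simple visibility model. The following is a minimal sketch in Python, not the IPT's own code: the names `DatasetState` and `can_view` are hypothetical, and it assumes that a registered dataset is at least as visible as a public one.

```python
from enum import Enum


class DatasetState(Enum):
    """Hypothetical model of the three dataset states named on the slide."""
    PRIVATE = "private"        # visible only to the dataset's managers
    PUBLIC = "public"          # visible to anybody
    REGISTERED = "registered"  # on the GBIF network


def can_view(state: DatasetState, is_manager: bool) -> bool:
    """Return True if a user may view a dataset in the given state."""
    if state is DatasetState.PRIVATE:
        return is_manager
    # Assumption: registered datasets are also publicly viewable.
    return True
```

For example, `can_view(DatasetState.PRIVATE, is_manager=False)` is `False`, while any user can view a public or registered dataset.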
Interfaces
‣Darwin Core Archive
‣Ecological Metadata Language
‣Also available as a manuscript from v2.0.2 onwards
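To illustrate the Darwin Core Archive interface listed above: an archive is a zip file holding delimited text data plus a `meta.xml` descriptor that maps column positions to Darwin Core terms. The sketch below, using only the Python standard library, builds a tiny archive in memory and reads its core file back. The sample records, the file name `occurrence.txt`, and the helper names `build_sample_archive` and `read_core` are illustrative assumptions, not output or code from the IPT.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

DWC_TEXT_NS = "{http://rs.tdwg.org/dwc/text/}"

# Illustrative descriptor: column 0 is the record id, columns 1-2 map to terms.
META_XML = """<?xml version="1.0" encoding="UTF-8"?>
<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Occurrence"
        fieldsTerminatedBy="\\t" linesTerminatedBy="\\n" ignoreHeaderLines="1">
    <files><location>occurrence.txt</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
    <field index="2" term="http://rs.tdwg.org/dwc/terms/country"/>
  </core>
</archive>
"""

# Made-up sample records, tab delimited with one header line.
DATA = "id\tscientificName\tcountry\n1\tVulpes vulpes\tDenmark\n2\tParus major\tDenmark\n"


def build_sample_archive() -> bytes:
    """Zip the descriptor and the data file into an in-memory archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("meta.xml", META_XML)
        zf.writestr("occurrence.txt", DATA)
    return buffer.getvalue()


def read_core(archive_bytes: bytes) -> list:
    """Read the core data file, naming each column by its mapped term."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        core = ET.fromstring(zf.read("meta.xml")).find(DWC_TEXT_NS + "core")
        location = core.find(f"{DWC_TEXT_NS}files/{DWC_TEXT_NS}location").text
        # meta.xml writes the tab delimiter as the two characters "\t".
        delimiter = core.get("fieldsTerminatedBy").replace("\\t", "\t")
        skip = int(core.get("ignoreHeaderLines", "0"))
        columns = {
            int(field.get("index")): field.get("term").rsplit("/", 1)[-1]
            for field in core.findall(DWC_TEXT_NS + "field")
        }
        text = zf.read(location).decode("utf-8")
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))[skip:]
    return [{name: row[i] for i, name in columns.items()} for row in rows]


records = read_core(build_sample_archive())
```

Because the descriptor carries the column-to-term mapping, a consumer needs no knowledge of the publisher's database schema, only the zip file itself.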
‣Reduced functionality
‣TAPIR
‣Geoserver
‣Visualisations
‣Search and browse
‣Reduced server requirements
‣Memory: 1-2 GB (v1.0), now 256 MB (v2.0)
‣Increased performance
‣24m records
‣50 minutes
‣MySQL
‣256 MB memory
‣No internal database
‣Increased robustness with simple files
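The performance figures above can be turned into an approximate throughput. This is a back-of-the-envelope sketch, assuming "24m" means 24 million records and that the 50 minutes covers the whole run; the function name `records_per_second` is hypothetical.

```python
def records_per_second(records: int, minutes: float) -> float:
    """Average throughput for a run of the given size and duration."""
    return records / (minutes * 60)


# Assumption: "24m records" on the slide means 24 million.
rate = records_per_second(24_000_000, 50)
print(f"{rate:,.0f} records per second")  # prints "8,000 records per second"
```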