Enabling access to the David Scott Mitchell digital collection
for digital humanities research
21 Oct 2019
Euwe Ermita
State Library of New South Wales
Project overview
The David Scott Mitchell (DSM) collection is the Library’s most renowned collection. Formats include books, maps, photos, coins and sound recordings.
The objective is to make this collection more accessible to researchers by understanding their needs and their use of eResearch tools and platforms.
Pilot the transformation and delivery of DSM digital datasets (metadata, full-text and digitised pages of books only) onto select researcher platforms.
Key issues
1. Three months to deliver outcomes
2. Quickly understanding context of cultural collections within eResearch
3. Restrictions on access, including Indigenous Cultural and Intellectual Property (ICIP)
4. Data is currently Findable, but lacks easy Accessibility by machines
5. Data is currently Interoperable and is well structured.
6. Datasets are not currently licensed, making Reusability difficult
Approaches
1. Establishment of multi-disciplinary project team – completed
2. Workshops with partners and stakeholders – completed
3. Pilot datasets on select research tools/platforms – completed
4. Post-pilot review – underway
1. Jupyter Notebooks for interacting with Library APIs
Retrieving book metadata via ALMA (catalogue) API
1. Access ALMA records with an authorised API key
2. Load the metadata for multiple books in a dataframe
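The two steps above can be sketched in Python. This is a minimal illustration, not the Library's notebook code: it assumes the public Ex Libris "Retrieve Bibs" endpoint (`/almaws/v1/bibs` with `mms_id`, `format` and `apikey` query parameters), the Asia Pacific gateway host, and a handful of top-level JSON fields (`mms_id`, `title`, `author`, `date_of_publication`). The function names and the choice of fields are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Asia Pacific Alma gateway; the slides do not name the host used.
ALMA_BASE = "https://api-ap.hosted.exlibrisgroup.com/almaws/v1"


def build_bibs_url(mms_ids, api_key, base=ALMA_BASE):
    """Build a 'Retrieve Bibs' URL for several MMS IDs at once."""
    query = urlencode({
        "mms_id": ",".join(mms_ids),  # comma-separated list of record IDs
        "format": "json",             # ask for JSON instead of the default XML
        "apikey": api_key,            # the authorised API key (step 1)
    })
    return f"{base}/bibs?{query}"


def fetch_bibs(mms_ids, api_key):
    """Call the API and return the decoded JSON payload (needs network access)."""
    with urlopen(build_bibs_url(mms_ids, api_key)) as resp:
        return json.load(resp)


def bibs_to_rows(payload,
                 fields=("mms_id", "title", "author", "date_of_publication")):
    """Flatten the payload into one dict per book, ready for a dataframe (step 2).

    Assumption: records sit under the top-level "bib" key and expose the
    listed fields; missing fields become None.
    """
    return [{f: bib.get(f) for f in fields} for bib in payload.get("bib", [])]
```

With pandas installed, `pandas.DataFrame(bibs_to_rows(fetch_bibs(ids, key)))` would give one row per book; `ids` and `key` stand for the caller's own MMS IDs and API key.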
Retrieve specific book metadata and cover page for context and research
Retrieving book metadata through ALMA API
Log in to the Rosetta API with user credentials through an in-house Python API to access METS metadata
Retrieving Book Data Through Rosetta (DAM) API
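Once a METS document has been retrieved, its file section can be read with the Python standard library. The sketch below does not reproduce the in-house Python API or the Rosetta login, neither of which is described in the slides; it only assumes the standard METS and XLink namespaces. The function name `list_mets_files` and the sample document are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Standard METS and XLink namespace URIs.
NS = {
    "mets": "http://www.loc.gov/METS/",
    "xlink": "http://www.w3.org/1999/xlink",
}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def list_mets_files(mets_xml):
    """Return one dict per <mets:file>, with its group, ID, MIME type and location."""
    root = ET.fromstring(mets_xml)
    files = []
    for grp in root.findall(".//mets:fileGrp", NS):
        for f in grp.findall("mets:file", NS):
            loc = f.find("mets:FLocat", NS)
            files.append({
                "group": grp.get("USE"),
                "id": f.get("ID"),
                "mimetype": f.get("MIMETYPE"),
                "href": loc.get(XLINK_HREF) if loc is not None else None,
            })
    return files


# Invented sample, not a real Rosetta record.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                            xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="master">
      <mets:file ID="FL1" MIMETYPE="image/tiff">
        <mets:FLocat LOCTYPE="URL" xlink:href="page_0001.tif"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="ocr">
      <mets:file ID="FL2" MIMETYPE="text/xml">
        <mets:FLocat LOCTYPE="URL" xlink:href="page_0001.xml"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""
```

Calling `list_mets_files(SAMPLE_METS)` returns two entries, one for the scanned page image and one for its OCR file.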
2. Jupyter Notebook integration with Voyant-Tools
Visualising and analysing book contents with Voyant-Tools
Whether in Voyant-Tools or in Jupyter, named entity recognition provides more insightful visualisations of the corpus.
Named Entity Recognition Within Book Text
3. UTS Collaboration: RO-Crates for Researchers
RO-Crates for researchers
ALTOs and scanned images
● Research Object Crate (RO-Crate) is a community effort to establish a lightweight approach to packaging research data with their metadata.
● Based on schema.org annotations in JSON-LD.
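To show what "schema.org annotations in JSON-LD" look like in practice, here is a minimal sketch that builds an `ro-crate-metadata.json` structure for a book's ALTO files and scanned images. It assumes the RO-Crate 1.1 layout (a metadata descriptor plus a root `Dataset`), which may differ from the draft used in the pilot; the helper name `build_ro_crate` and the file entries are hypothetical.

```python
import json


def build_ro_crate(name, description, files, license_id=None, date_published=None):
    """Return an RO-Crate metadata document as a plain dict.

    `files` is a list of dicts with "path", "name" and "mimetype" keys.
    """
    # Root dataset: describes the package as a whole and lists its parts.
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "hasPart": [{"@id": f["path"]} for f in files],
    }
    if date_published:
        root["datePublished"] = date_published
    if license_id:
        root["license"] = {"@id": license_id}

    # Metadata descriptor: points from the JSON file to the root dataset.
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }

    graph = [descriptor, root]
    for f in files:
        graph.append({
            "@id": f["path"],
            "@type": "File",
            "name": f["name"],
            "encodingFormat": f["mimetype"],
        })
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


def write_ro_crate(crate, path="ro-crate-metadata.json"):
    """Serialise the crate next to the data files it describes."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(crate, fh, indent=2)
```

RO-Crate 1.1 expects the root dataset to carry a `license` and `datePublished`; they are optional arguments here only to keep the sketch short.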
Lessons
1. Timeframe and scope – more time is needed to research and determine the information and data needs of researchers
2. Organisational change management – how well the organisation appreciates the benefits and value of undertaking and investing in similar initiatives is yet to be determined
3. Technology and capabilities – developers with capabilities across data, information management and ETL processes are still emerging
Acknowledgements
• UTS eResearch Team
• Peter S., Moises S., Michael L.
• Macquarie Uni – Steve Cassidy
• ARDC – Rowan Brownlee
• State Library of New South Wales
• Salek A., Peter B., Robin P., Richard N., Maggie P., Brendan S.