Towards Computational Research Objects
David De Roure
Indianapolis Edition
1. A Brief History of Research Objects
2. The motivation for Computational Research Objects
3. (A small illustration)
http://www.myexperiment.org/
Packs
In contrast to photo-sharing on Flickr or videos on YouTube, the basic unit of sharing in myExperiment is not a single file but rather a package of components that make up an experiment - what we call an Encapsulated myExperiment Object (EMO), and others have called Reproducible Research Objects. Notionally an EMO is a folder containing the various assets associated with an experiment. In the scientific context there are stringent requirements with respect to versioning, ownership, intellectual property and the maintenance of provenance information. We have looked at emerging practice in sharing “pieces of science” in the scientific and scholarly lifecycle, from social sites to digital repositories. myExperiment provides simple and extensible support to better understand requirements as new collaborative practice emerges. In this presentation, we will describe the characteristics of EMOs and present our initial design solution which supports the requirements of encapsulation and preserves our principles of simplicity and interoperability.
Sharing Digital Science
David De Roure, University of Southampton; Carole Goble, University of Manchester
EMOs
Iain Buchan
[Figure: Research Objects. Paul's Pack / Paul's Research Object brings together Workflow 16, Workflow 13, Common pathways, QTL, Results, Logs, Metadata, Paper and Slides, linked by the relations "feeds into", "produces", "included in" and "published in".]
http://www.openarchives.org/ore/terms/aggregates
http://eprints.ecs.soton.ac.uk/id/eprint/20817
OAI-ORE
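As a rough illustration of the aggregation idea, a pack can be written down as a set of triples in which one resource aggregates its components. This is a minimal sketch in plain Python, not the myExperiment implementation. Only the `aggregates` predicate URI comes from the slide; the pack and item URIs under `example.org` and both function names are hypothetical.

```python
# Predicate URI as given on the slide.
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"


def make_pack(pack_uri, item_uris, predicate=ORE_AGGREGATES):
    """Return (subject, predicate, object) triples: the pack aggregates each item."""
    return [(pack_uri, predicate, item) for item in item_uris]


def aggregated_items(triples, pack_uri, predicate=ORE_AGGREGATES):
    """List the resources that a given pack aggregates."""
    return [o for s, p, o in triples if s == pack_uri and p == predicate]


# Hypothetical identifiers, for illustration only.
pack = "http://example.org/packs/pauls-pack"
items = [
    "http://example.org/workflows/16",
    "http://example.org/workflows/13",
    "http://example.org/files/results.csv",
]

triples = make_pack(pack, items)
for triple in triples:
    print(triple)
```

Keeping the pack as a list of statements rather than a folder copy means the components can live anywhere on the web and still be shared as one unit.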
• Workflow – pack contains a number of workflows
• Presentation - encapsulation of a single presentation
• Collection - a number of things (workflows/presentations/papers)
• Heterogeneous - where the workflows do not appear to have a clear common purpose
• Homogeneous - workflows appear to be designed to work together
• Paper - source for a paper
• Tutorial - tutorial material
• Data - collection of data files
• Derived data - results of workflow
• Benchmark - benchmarking data
• Supplementary - stuff associated with a paper
• Noise - tests, tryouts, rubbish
• Oddity - none of the above
Analysis by Sean Bechhofer
Pack analysis
Workflow Centric ROs
[Figure: provenance of a workflow run, drawn with W3C PROV terms. Nodes: Sample, Sequencing, Alice (role: Lab technician), Metagenome, Workflow run, Workflow server, Workflow definition, Results. Relations shown: used, wasGeneratedBy, wasInformedBy, wasAssociatedWith, wasStartedBy, wasStartedAt "2012-06-21", hadPlan, hadRole.]
https://w3id.org/bundle
Stian Soiland-Reyes
Research Object Bundle
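The provenance labels on this slide can be read as a small graph. This is a minimal sketch of how such a graph might be queried for the lineage of a result. The edge list is an assumed reading of the diagram, not a transcription of it, and the `lineage` helper is hypothetical; a real bundle would carry this information as PROV statements rather than Python tuples.

```python
# Assumed reading of the diagram: (subject, PROV relation, object).
PROV_EDGES = [
    ("Sequencing", "used", "Sample"),
    ("Sequencing", "wasAssociatedWith", "Alice"),
    ("Metagenome", "wasGeneratedBy", "Sequencing"),
    ("Workflow run", "used", "Metagenome"),
    ("Workflow run", "wasInformedBy", "Sequencing"),
    ("Workflow run", "wasAssociatedWith", "Workflow server"),
    ("Results", "wasGeneratedBy", "Workflow run"),
]


def lineage(node, edges, follow=("wasGeneratedBy", "used")):
    """Return everything `node` was derived from, nearest ancestor first."""
    found, stack = [], [node]
    while stack:
        current = stack.pop()
        for s, p, o in edges:
            if s == current and p in follow and o not in found:
                found.append(o)
                stack.append(o)
    return found


print(lineage("Results", PROV_EDGES))
```

Following only `wasGeneratedBy` and `used` traces data and activities; agents such as Alice or the workflow server would be reached by also following `wasAssociatedWith`.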
Join the W3C Community Group www.w3.org/community/rosc
www.researchobject.org
Notifications and automatic re-runs
Machines are users too
Autonomic
Curation
Self-repair
New research?
The Executable Thesis
new data
new results
executable thesis
PhD Student
A new role for the scientific publisher?
Digital library?
The Executable Journal
A thought experiment…
[Figure: Knowledge Infrastructure. Labels: Knowledge Objects, Descriptive layer, Observatories, Annotation, Research Objects, Computational Research Objects, Workflows, Packs, OAI-ORE, W3C PROV.]
• Social Objects, designed to facilitate human interpretation (e.g. containing narratives) and shared as part of a (hybrid) sensemaking network
• Machine Objects, semantically described and programmatically accessible, designed for automation, scale and heterogeneity
• Composable with a distributed computational model, such that a Computational Research Object can itself assemble systems of objects, and these systems may consume and produce Computational Research Objects. We can reason about them.
Computational Research Objects
1. I take a digital audio recording and perform a series of analysis tasks leading to a result dataset
2. The environment captures the history of my analysis in a CRO, with descriptions of input data, analysis history (workflow) inc software, output data, narrative.
3. Another researcher finds CRO (cited in social media), tests it, runs it with different audio data (capturing as a CRO)
4. A data scientist registers the CRO to be run automatically when new data arrives, and configures a post-process so that they are notified if new results meet criteria
5. This common pattern of installing multiple CROs with a post-processor is captured for reuse
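The five steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the environment described in the talk: the `CRO` and `Registry` classes, `install_with_postprocess`, and the `peak_level` stand-in for an audio analysis are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class CRO:
    """Hypothetical Computational Research Object: an analysis plus its history."""
    name: str
    analysis: Callable[[Any], Any]
    narrative: str = ""
    history: List[Dict[str, Any]] = field(default_factory=list)

    def run(self, data):
        result = self.analysis(data)
        # Step 2: capture input data, the analysis used, and the output data.
        step = getattr(self.analysis, "__name__", repr(self.analysis))
        self.history.append({"input": data, "analysis": step, "output": result})
        return result


class Registry:
    """Step 4: re-run registered CROs on new data and notify when a criterion holds."""

    def __init__(self):
        self.entries = []

    def register(self, cro, criterion, notify):
        self.entries.append((cro, criterion, notify))

    def new_data(self, data):
        for cro, criterion, notify in self.entries:
            result = cro.run(data)
            if criterion(result):
                notify(cro.name, result)


def install_with_postprocess(registry, cros, criterion, notify):
    """Step 5: the reusable pattern, several CROs sharing one post-processor."""
    for cro in cros:
        registry.register(cro, criterion, notify)


def peak_level(samples):
    """Stand-in for an audio analysis task: the largest absolute sample value."""
    return max(abs(s) for s in samples)


cro = CRO("peak-analysis", peak_level, narrative="Peak level of a recording")
cro.run([0.1, -0.4, 0.2])  # steps 1 and 2: analyse once, history is captured

alerts = []
registry = Registry()
install_with_postprocess(
    registry, [cro],
    criterion=lambda result: result > 0.9,
    notify=lambda name, result: alerts.append((name, result)),
)
registry.new_data([0.2, 0.3])    # below the threshold, no notification
registry.new_data([0.5, -0.95])  # meets the criterion, notification recorded
print(alerts)
print(len(cro.history))
```

Step 3 (another researcher running the CRO on different audio) is just another call to `cro.run`, which again appends to the history.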
Simplest Scenario
• The simple example takes us quickly to the stage of writing programs which act on CROs
• Isn’t this all a bit Computer Sciencey?
• Yes! But it’s not CS for the sake of CS
• It’s CS for “rigour and openness”
• The idea is to establish Computer Science techniques to be able to help design and validate our future research systems
Towards a Science of Reproducibility?
Several Scheme concepts map directly into the CRO model:
1. Closures (as mutable objects and first class functions)
2. Environments
3. Continuations
A prototype RO interpreter has been implemented – here is a simple example based on memoization (or should I say roification…)
(For Lisp hackers)
> (define (f x) (analyse x))
> (f 10)
;Value: 100
> (define ro1 (roify f))
> ((ro1 'x) 2)
;Value: 4
> ((ro1 'x) 3)
;Value: 9
> ((ro1 'x) 2)
; precomputed
;Value: 4
> (define foo (ro1 'v))
> (foo)
; confirmed(3) = 9
; confirmed(2) = 4
;Value: #t
> (define (analyse x) (+ x x))
> (foo)
; changed(3) = 6 <> 9
;Value: #f
> (define a (delay ((ro1 'x) 5)))
> (a)
;Value: 10
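For readers who do not use Scheme, the session above can be approximated in Python. This is a sketch of what `roify` might do, inferred only from the transcript: it memoizes results, and its verifier re-runs recorded inputs and reports the first change. The internals are assumptions and not the prototype interpreter itself.

```python
def roify(f):
    """Wrap f so each call is recorded; return a dispatcher like ro1 in the session."""
    record = {}  # input -> recorded output, in order of first computation

    def execute(x):
        if x in record:
            print("precomputed")
            return record[x]
        record[x] = f(x)
        return record[x]

    def verify():
        # Re-run recorded inputs, most recent first, stopping at the first change.
        for x in reversed(list(record)):
            now = f(x)
            if now != record[x]:
                print(f"changed({x}) = {now} <> {record[x]}")
                return False
            print(f"confirmed({x}) = {now}")
        return True

    def dispatch(message):
        return {"x": execute, "v": verify}[message]

    return dispatch


def analyse(x):
    return x * x


def f(x):
    return analyse(x)  # looked up at call time, so redefining analyse changes f


ro1 = roify(f)
print(ro1("x")(2))
print(ro1("x")(3))
print(ro1("x")(2))  # served from the record
foo = ro1("v")
print(foo())        # both recorded results still reproduce


def analyse(x):     # the analysis changes underneath the research object
    return x + x


print(foo())        # the recorded result for 3 no longer reproduces
a = lambda: ro1("x")(5)  # stands in for the delayed computation
print(a())
```

The record plays the role of the research object: it holds enough of the run to answer repeated requests without recomputation and to detect when the underlying analysis no longer yields the same results.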
1. Next steps? Develop more scenarios – including scale, validation, design
2. Higher order functions, e.g. capturing common patterns, seem to be expressive compared to normal workflow mechanics
3. The RO interpreter in Scheme is proof of concept… but actually it could be made operational
4. If nothing else this is a simulation of the/a future and may provide insights
5. Social machines and human computation research involves computational-style descriptions of processes involving humans – exploring in SOCIAM and Smart Society projects
Closing thoughts
[email protected]/people/dder
www.scilogs.com/eresearch
@dder
Thanks to Iain Buchan, Sean Bechhofer, Carole Goble and all my colleagues in myExperiment, Wf4Ever, myGrid and FORCE11. Research supported in part by Wf4Ever (FP7-ICT ICT-2009.4 project 270192).
Some of these ideas were first presented at the Microsoft e-Science Workshop, Stockholm, December 2011.