Open Science - 2011.rmll.info2011.rmll.info/IMG/pdf/OpenScience.pdf · Open Science Data & Source...


Page 1: Open Science - 2011.rmll.info2011.rmll.info/IMG/pdf/OpenScience.pdf · Open Science Data & Source Code dissemination for scientific research Charles Marion (charles.marion@kitware.com)

Open Science

Data & Source Code dissemination for scientific research

Charles Marion (charles.marion@kitware.com)

Julien Jomier

Introduction

• Publications do not cure cancer!

• Doctors do not prescribe “reading papers” as a treatment.

• So... why do scientists care so much about publishing?

Introduction

• Scientific/Technical papers disseminate knowledge

• Academic/Scientific achievements are assessed via publishing: “publish or perish”

Why publish research?

“. . . whenever I found out anything remarkable, I have thought it my duty to put down my discovery on paper, so that all ingenious people might be informed thereof.”

Antony van Leeuwenhoek, letter of June 12, 1716

Introduction

• How long does it take to publish a paper in a journal?

– Typically 2 years

• How much do you have to pay to publish a paper in a journal?

– About 500€ / paper

• How much do you have to pay to read the same paper?

– About 30€ / paper

Introduction

• How much does it cost to post a PDF on the Web?

New ways of collaboration

• Creating public repositories for source code

• Creating public image databases

• Creating forums for hosting positive discussions online

• Validating others’ methods and suggesting improvements.

Open Science

[Diagram labels: Open Science, Open Source, Open Access, Open Computing]

The Open Access Revolution

• Few journals enforce REPRODUCIBILITY

• Few journals publish CODE, DATA and PARAMETERS

• No journal publishes NEGATIVE results

The Insight Journal

[Diagram labels: Open Source, Open Science, Agile Programming, Agile Publishing, Insight Journal]

The Insight Journal

• Started in 2000 with the Insight Toolkit (ITK)

• Web-based open-access journal

• Technical work must be reproducible

• Papers should be publicly accessible

• It should take less than 2 years to publish

• The Peer-Review process must be open

The Insight Journal: submission

[Diagram labels: Author, PDF doc, Code, Input Data, Results Data, Web Site, Journal CVS Repository, Build Machines]
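The submission slide survives only as diagram labels, so the flow sketched here is an assumption: an author submits a PDF, code, input data and results data, and a build machine re-runs the code to check that the submitted results can be reproduced. The `Submission` class, `is_reproducible` and the toy `invert` algorithm are hypothetical names for illustration, not part of the Insight Journal.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


@dataclass
class Submission:
    """Hypothetical bundle mirroring the labels on the slide."""
    pdf_doc: str                    # name of the article PDF
    code: Callable[[bytes], bytes]  # stand-in for the submitted source code
    input_data: bytes               # input data shipped with the paper
    results_data: bytes             # results the author reports


def digest(data: bytes) -> str:
    """SHA-256 hex digest used to compare outputs byte for byte."""
    return hashlib.sha256(data).hexdigest()


def is_reproducible(sub: Submission) -> bool:
    """Re-run the code on the input and compare with the submitted results."""
    produced = sub.code(sub.input_data)
    return digest(produced) == digest(sub.results_data)


def invert(pixels: bytes) -> bytes:
    """Toy 'algorithm': invert 8-bit pixel values."""
    return bytes(255 - p for p in pixels)


good = Submission("paper.pdf", invert, bytes([0, 10, 255]), bytes([255, 245, 0]))
bad = Submission("paper.pdf", invert, bytes([0, 10, 255]), bytes([0, 0, 0]))
print(is_reproducible(good), is_reproducible(bad))
```

A byte-for-byte comparison is the strictest possible check; a real system would probably need a tolerance for floating-point results.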

The Insight Journal: review

[Diagram labels: Web Site, Checked Papers, Reviewers, Selected Papers]

The Insight Journal: Demo

Open Access: Dataset

• Scientific datasets are becoming larger and larger

• Storing datasets is the first step, but querying and retrieving them is even more important

• Data without metadata are useless

• Distributed and remote computing is becoming a necessity
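As a minimal sketch of why metadata matters for retrieval, the example below stores a few dataset records and searches them by metadata. The `Dataset` class, the `search` function and the sample records are hypothetical and are not the MIDAS API.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical archive record: a name, a size, and free-form metadata."""
    name: str
    size_gb: float
    metadata: dict[str, str] = field(default_factory=dict)


def search(datasets: list[Dataset], **criteria: str) -> list[Dataset]:
    """Return datasets whose metadata matches every key/value criterion."""
    return [
        d for d in datasets
        if all(d.metadata.get(key) == value for key, value in criteria.items())
    ]


archive = [
    Dataset("scan_001", 1.2, {"modality": "MRI", "organ": "brain"}),
    Dataset("scan_002", 0.8, {"modality": "CT", "organ": "lung"}),
    Dataset("scan_003", 2.5),  # no metadata: stored, but no query can find it
]

hits = search(archive, modality="MRI")
print([d.name for d in hits])
```

The record without metadata is stored like the others, yet no metadata query returns it.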

What is MIDAS?

A Web-based Multimedia Digital Archiving System to store, search, share and manage (any) digital media.

• Open source (BSD license) and cross-platform

• Modular, extensible and highly customizable framework

• Remote and local solutions

MIDAS

Don’t think

Think

MIDAS Interface

Server Side Processing

• Reliable, scalable, distributed computing using Hadoop

Online Visualization

• Distributed visualization: ParaViewWeb

• Client-side visualization: WebGL

MIDAS Instances

• Open science journals

– Insight Journal: http://www.insight-journal.org (1,500 users, 380+ open-access publications, 765 reviews)

– MIDAS Journal: http://www.midas-journal.org

• Publication database

– Harvard: http://www.slicer.org/publications (1,500+ publications)

– Kitware: http://www.kitware.com/publications

• Data server

– Kitware Public: http://www.insight-journal.org/midas (30+ GB of open-access data)

– NCI Small Animal Imaging (multiple TB of data)

– NLM Visible Human

– Optical Society of America

What’s next?

• Unification of technologies for complete reproducibility

– CTest, CDash, MIDAS, ParaViewWeb

• Integration with Git / GitHub

• Algorithm Validation (COVALIC)

– Source code

– Testing data

– Validation metrics

– Online reporting, comparison, rating
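The validation items above can be illustrated with a small sketch. The slide does not name a metric, so the Dice overlap used here is an assumption chosen for illustration, and the method names and masks are made up. It scores each submitted segmentation against shared testing data and ranks the methods for comparison.

```python
def dice(segmentation: list[int], ground_truth: list[int]) -> float:
    """Dice overlap between two binary masks given as flat lists of 0/1."""
    if len(segmentation) != len(ground_truth):
        raise ValueError("masks must have the same length")
    overlap = sum(s * g for s, g in zip(segmentation, ground_truth))
    total = sum(segmentation) + sum(ground_truth)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Shared testing data: the reference mask every method is scored against.
truth = [0, 1, 1, 1, 0, 0]

# Hypothetical outputs of two submitted methods on the same input.
submissions = {
    "method_a": [0, 1, 1, 0, 0, 0],
    "method_b": [1, 1, 1, 1, 1, 0],
}

scores = {name: dice(mask, truth) for name, mask in submissions.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.2f}")
```

Scoring every method on the same testing data with the same metric is what makes the comparison and rating steps meaningful.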
