Nanoinformatics 2010 SMIRP-ONS Talk

The implications of Open Notebook Science and other new forms of scientific communication for Nanoinformatics. Jean-Claude Bradley, Associate Professor of Chemistry, Drexel University. Nanoinformatics 2010, November 3, 2010.

description

Jean-Claude Bradley presents "The implications of Open Notebook Science and other new forms of scientific communication for Nanoinformatics" at the Nanoinformatics 2010 conference on November 3, 2010. The presentation first covers the use of the laboratory knowledge management system SMIRP for nanotechnology applications at Drexel University from 1999 to 2001. It then describes the export of single experiments from SMIRP and their publication to the Chemistry Preprint Archive, followed by the evolution to Open Notebook Science (ONS) in 2005. Finally, it details how abstracting semantic structure from ONS projects in drug discovery and solubility provides an efficient mechanism for offering web services and machine-readable data feeds.

Transcript of Nanoinformatics 2010 SMIRP-ONS Talk

Page 1: Nanoinformatics 2010 SMIRP-ONS Talk

The implications of Open Notebook Science and other new forms of scientific communication for Nanoinformatics

Jean-Claude Bradley

November 3, 2010

Nanoinformatics 2010

Associate Professor of Chemistry, Drexel University

Page 2: Nanoinformatics 2010 SMIRP-ONS Talk

LIMS CENS

Single Instrument Automation

Laboratory Information Management Systems

Collaborative Electronic Notebook Systems

Human/Autonomous Agent Hybrid Systems

Human Managed

Fully Autonomous Scientific Research Systems

TODAY

SMIRP bridge

The Evolution of Automation in Scientific Research

Page 3: Nanoinformatics 2010 SMIRP-ONS Talk

Standard Modular Integrated Research Protocols

Capturing semantic structure in research at the point of data entry

Page 4: Nanoinformatics 2010 SMIRP-ONS Talk
Page 5: Nanoinformatics 2010 SMIRP-ONS Talk

Human Agent

Autonomous Agent

SMIRP

(Bot)

Browser

Excel

The SMIRP model for a hybrid Human/Autonomous Agent System

Anthropomimetic Design

Page 6: Nanoinformatics 2010 SMIRP-ONS Talk

Approaches to Collaborative Electronic Notebooks

rigid
• Structured
• Generally domain specific

flexible
• Adaptable
• Unstructured

SMIRP compromise:
Rigid information representation
Flexible linking of modules

http://smirp.drexel.edu

Page 7: Nanoinformatics 2010 SMIRP-ONS Talk

Fundamental Information Representation in SMIRP

Module 1 (People)
Parameter 1 (Name)
Parameter 2 (Employee of)
Parameter 3 (email)
Record 1 (instance): Bill Gates

Module 2 (Company)
Parameter 4 (Name)
Parameter 5 (Address)
Record 2 (instance): Microsoft
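
The module/parameter/record representation on this slide can be illustrated with a minimal sketch. This is not SMIRP's actual implementation; the class names `Module` and `Record` and the method `add_record` are hypothetical, chosen only to mirror the slide's vocabulary. It assumes that a parameter value may be a link to a record in another module, as "Employee of" appears to be.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """A named set of parameters, e.g. People or Company (hypothetical class)."""
    name: str
    parameters: list
    records: list = field(default_factory=list, repr=False)

    def add_record(self, values):
        # Reject values for parameters the module does not define.
        unknown = set(values) - set(self.parameters)
        if unknown:
            raise KeyError(f"{self.name} has no parameter(s): {sorted(unknown)}")
        record = Record(module=self, values=dict(values))
        self.records.append(record)
        return record


@dataclass
class Record:
    """One instance of a module; a value may be a Record from another module."""
    module: Module
    values: dict

    def get(self, parameter):
        return self.values.get(parameter)


# The example from the slide: a People record linked to a Company record.
people = Module("People", ["Name", "Employee of", "email"])
company = Module("Company", ["Name", "Address"])

microsoft = company.add_record({"Name": "Microsoft"})
gates = people.add_record({"Name": "Bill Gates", "Employee of": microsoft})

employer_name = gates.get("Employee of").get("Name")
print(employer_name)
```

Storing the link as a reference to another record, rather than as a copied string, is one way to keep the representation rigid while letting modules be connected flexibly.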

Page 8: Nanoinformatics 2010 SMIRP-ONS Talk

Two approaches to the development of databases

Communicate anticipated need

Design database structure

Let database structure evolve through use (SMIRP)

Page 9: Nanoinformatics 2010 SMIRP-ONS Talk

Case-study: Evolution of SMIRP structure in a nanoscience laboratory

Location: Drexel University, Department of Chemistry

Users: faculty, undergraduate students, graduate students, librarians and other university personnel

Period: Feb 1999 – April 2001, with a detailed focus on the last 7 months (Sept 2000 – April 2001)

Total accounts (last 7 months): 78

Active accounts (added records): 50

Administrators (changed database structure): 9

Page 10: Nanoinformatics 2010 SMIRP-ONS Talk

Most Active Module Categories (9/00 – 4/01)

Knowledge Processing 72%

Labwork 14%

Human Resource Management 13%

Maintenance 1%

118 modules; 1/3 account for 98% of activity

Page 11: Nanoinformatics 2010 SMIRP-ONS Talk

Activity Analysis by Category over Time

[Chart: activity by category over time, 2000-10-3 to 2001-4-18. Series: Maintenance, Human Resource Management, Laboratory Work, Knowledge Processing. Vertical axis: 0 to 8000.]

Page 12: Nanoinformatics 2010 SMIRP-ONS Talk

Most Active Human Resource Management Modules

Workstudy hours reporting 46%

People 28%

Productivity Tracking 14%

Project Manager 5%

Errors 5%

Recruitment events 2%

Page 13: Nanoinformatics 2010 SMIRP-ONS Talk

Most Active Maintenance Modules

SMIRP Problems 22%

Orders 19%

Invoice (TEM/SEM and other instrument charges) 19%

Laboratory materials 16%

Vendor 15%

Order forms 9%

Page 14: Nanoinformatics 2010 SMIRP-ONS Talk

Most Active Knowledge Processing Modules

Find Reference 66%

Reformat Reference requests 20%

Journal 9%

Knowledge Filter 3%

Publisher
Document Production
Reference Processing
Parameter Correlation
Data source files
Experimental Conclusion Generation
Knowledge consolidation

Page 15: Nanoinformatics 2010 SMIRP-ONS Talk

Seamless Integration of Human and Autonomous Agents in Workflows

Real-Time Workflow Designs

Automated

Human (default)

State A State B
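
The idea on this slide, that a transition from State A to State B is carried out by a human by default and by an autonomous agent once one is available, can be sketched as follows. This is only an illustration under stated assumptions, not how SMIRP was built: the `Workflow` class, its `automate` and `advance` methods, and the example handler are all hypothetical.

```python
from collections import deque


class Workflow:
    """Routes each state transition to a bot if one is registered, else to a human queue."""

    def __init__(self):
        self.automated = {}         # (from_state, to_state) -> callable taking the item
        self.human_queue = deque()  # transitions waiting for a person

    def automate(self, from_state, to_state, handler):
        # Registering a handler replaces the human default for this transition.
        self.automated[(from_state, to_state)] = handler

    def advance(self, item, from_state, to_state):
        handler = self.automated.get((from_state, to_state))
        if handler is None:
            self.human_queue.append((item, from_state, to_state))
            return "human"
        handler(item)
        return "automated"


workflow = Workflow()

# With no bot registered, the transition falls to a human.
first = workflow.advance({"title": "article 1"}, "State A", "State B")

# Hypothetical bot: marks the item as processed.
workflow.automate("State A", "State B", lambda item: item.update(processed=True))
second_item = {"title": "article 2"}
second = workflow.advance(second_item, "State A", "State B")

print(first, second)
```

Because the caller of `advance` does not need to know who performs the step, a transition can be automated later without redesigning the workflow.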

Page 16: Nanoinformatics 2010 SMIRP-ONS Talk

Workflow for Extraction of Article information and URL

Queries Web and extracts information

Page 17: Nanoinformatics 2010 SMIRP-ONS Talk

Most Active Laboratory Modules

Electrodeposition of Pd on Graphite 29%

Protocol Prototyping 25%

Pd onto Carbon Nanofibers 17%

Electroless plating on Membranes 9%

Synthesis of Pd catalysts by Bipolar electrochemistry 5%

TEM Micrographs Of Pd on C 3%

Pd particle size analysis using TEM 3%

Preparation of Silver rods for SCBE
TEM Micrographs Of Pd on C
SCBE on membranes
Hydrogenation of Crotonaldehyde using Pd Catalysts
Reduction of Methylene blue by Pd Metal Particles in a Field

Page 18: Nanoinformatics 2010 SMIRP-ONS Talk

Keyword Search Results: example “nanotube”

Page 19: Nanoinformatics 2010 SMIRP-ONS Talk

From Keyword to Orders

Page 20: Nanoinformatics 2010 SMIRP-ONS Talk

From Keyword to Article

Page 21: Nanoinformatics 2010 SMIRP-ONS Talk

From Keyword to Knowledge Filter

Page 22: Nanoinformatics 2010 SMIRP-ONS Talk

From Keyword to Protocol Prototyping

Page 23: Nanoinformatics 2010 SMIRP-ONS Talk

Sharing results semi-automatically: SMIRP Knowledge Product

• Single Experiment
• Full Context
• Supporting Data
• Not suitable for traditional peer-reviewed publications

Page 24: Nanoinformatics 2010 SMIRP-ONS Talk

Non-traditional publication options in 2003

(Elsevier)

Page 25: Nanoinformatics 2010 SMIRP-ONS Talk
Page 26: Nanoinformatics 2010 SMIRP-ONS Talk
Page 27: Nanoinformatics 2010 SMIRP-ONS Talk
Page 28: Nanoinformatics 2010 SMIRP-ONS Talk
Page 29: Nanoinformatics 2010 SMIRP-ONS Talk
Page 30: Nanoinformatics 2010 SMIRP-ONS Talk
Page 31: Nanoinformatics 2010 SMIRP-ONS Talk

To Cite or Not to Cite?

Page 32: Nanoinformatics 2010 SMIRP-ONS Talk
Page 33: Nanoinformatics 2010 SMIRP-ONS Talk

“I would never consider a claim made in a patent as blocking an author's claim of novelty.” Langmuir Editor

What is a Scientific Precedent in Academia?

What is a Scientific Precedent in Patent Law?

Page 34: Nanoinformatics 2010 SMIRP-ONS Talk

What is Scholarship?

*also indexed in Chemical Abstracts!

Page 35: Nanoinformatics 2010 SMIRP-ONS Talk

The UsefulChem Project (2005)

What would happen if a chemistry project were completely transparent in real time?

Page 36: Nanoinformatics 2010 SMIRP-ONS Talk

Motivation: Faster Science, Better Science

Page 37: Nanoinformatics 2010 SMIRP-ONS Talk

TRUST

PROOF

Page 38: Nanoinformatics 2010 SMIRP-ONS Talk

Strategy for an Open Notebook:

First record then abstract structure

In order to be discoverable use Google friendly formats (simple HTML, no login)

In order to be replicable use free hosted tools (Wikispaces, Google Spreadsheets)

Page 39: Nanoinformatics 2010 SMIRP-ONS Talk

UsefulChem Project: Open Primary Research in Drug Design using Web 2.0 tools

Docking

Synthesis

Testing

Rajarshi Guha, Indiana U

JC Bradley, Drexel U

Phil Rosenthal, UCSF (malaria)

Dan Zaharevitz, NCI (tumors)

Tsu-Soo Tan, Nanyang Inst.

Page 40: Nanoinformatics 2010 SMIRP-ONS Talk

Malaria Target: falcipain-2 involved in hemoglobin metabolism

Dana.org

Page 41: Nanoinformatics 2010 SMIRP-ONS Talk

Outcome of Guha-Bradley-Rosenthal collaboration

Page 42: Nanoinformatics 2010 SMIRP-ONS Talk

The Ugi reaction: can we predict precipitation?

Can we predict solubility in organic solvents?

Page 43: Nanoinformatics 2010 SMIRP-ONS Talk

Crowdsourcing Solubility Data

Page 44: Nanoinformatics 2010 SMIRP-ONS Talk

ONS Challenge Judges

Page 45: Nanoinformatics 2010 SMIRP-ONS Talk

ONS Submeta Award Winners

Page 46: Nanoinformatics 2010 SMIRP-ONS Talk

Data provenance: From Wikipedia to…

Page 47: Nanoinformatics 2010 SMIRP-ONS Talk

…the lab notebook and raw data

Page 48: Nanoinformatics 2010 SMIRP-ONS Talk

• Concentration (0.4, 0.2, 0.07 M)
• Solvent (methanol, ethanol, acetonitrile, THF)
• Excess of some reagents (1.2 eq.)

How does Open Notebook Science fit with traditional publication?

Page 49: Nanoinformatics 2010 SMIRP-ONS Talk

Paper written on Wiki

Page 50: Nanoinformatics 2010 SMIRP-ONS Talk

References to papers, blog posts, lab notebook pages, raw data

Page 51: Nanoinformatics 2010 SMIRP-ONS Talk

Paper on Journal of Visualized Experiments (JoVE)

Page 52: Nanoinformatics 2010 SMIRP-ONS Talk

Pre-print on Nature Precedings

Page 53: Nanoinformatics 2010 SMIRP-ONS Talk

ONSArchive: Semi-Automated Snapshot of the Entire Scientific Record

Automated Download of Spreadsheets and Parsing of Web Pages

Manual Backup of Spectral Data Files

Manual Export of Wikispaces
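
The automated part of such a snapshot (parsing notebook pages to find the spreadsheets they link to) might look like the sketch below. It is an illustration only and does not reproduce the ONSArchive code: the function `plan_snapshot`, the `spreadsheet_marker` rule for recognising spreadsheet links, and the example.org URLs are all assumptions. It uses only the standard-library `html.parser` module and does no downloading.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def plan_snapshot(html, spreadsheet_marker="spreadsheets"):
    """Split a page's links into spreadsheets to download and pages to parse.

    Assumption: spreadsheet URLs can be recognised by a substring marker.
    """
    collector = LinkCollector()
    collector.feed(html)
    plan = {"spreadsheets": [], "pages": []}
    for href in collector.links:
        key = "spreadsheets" if spreadsheet_marker in href else "pages"
        plan[key].append(href)
    return plan


# Hypothetical notebook page with one spreadsheet link and one page link.
sample_page = """
<html><body>
<a href="https://example.org/notebook/Exp001">Exp001</a>
<a href="https://example.org/spreadsheets/solubility.csv">data</a>
<a name="anchor-without-href">ignored</a>
</body></html>
"""

plan = plan_snapshot(sample_page)
print(plan)
```

The manual steps on the slide (backing up spectral data files and exporting the wiki) are outside what this sketch covers.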

Page 54: Nanoinformatics 2010 SMIRP-ONS Talk

Lulu.com Data Disks

Page 55: Nanoinformatics 2010 SMIRP-ONS Talk

Interactive NMR spectra using JSpecView and JCAMP-DX

Page 56: Nanoinformatics 2010 SMIRP-ONS Talk

Raw Data As Images

Splatter?

Some liquid

Page 57: Nanoinformatics 2010 SMIRP-ONS Talk

YouTube for demonstrating experimental set-up

Page 58: Nanoinformatics 2010 SMIRP-ONS Talk

The importance of raw data availability

Missed in a prior publication on solubility for this compound

Page 59: Nanoinformatics 2010 SMIRP-ONS Talk

The Intersection of Open Notebooks (Bradley/Todd) and IP implications

Open Notebook could have blocked patent if done earlier

Page 60: Nanoinformatics 2010 SMIRP-ONS Talk

Convenient web services for solubility measurement and prediction

(Andrew Lang)

Page 61: Nanoinformatics 2010 SMIRP-ONS Talk

Other Web Services…

(Andrew Lang)

General Transparent Solubility Prediction

Page 62: Nanoinformatics 2010 SMIRP-ONS Talk

Semi-Automated Measurement of solubility via web service analysis of JCAMP-DX files

(Andy Lang)
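
As a small illustration of why JCAMP-DX files suit this kind of automated analysis, the sketch below reads the labeled header records (lines of the form `##LABEL=value`) from a JCAMP-DX text. It is not the web service shown on the slide and does not compute solubility; the function name `read_jcamp_headers` and the sample file contents are hypothetical, and the parser ignores the spectral data lines entirely.

```python
def read_jcamp_headers(text):
    """Return a dict of JCAMP-DX labeled data records found in the text.

    Simplified: labels are upper-cased, inline comments after '$$' are dropped,
    and data lines that do not start with '##' are skipped.
    """
    headers = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("##") or "=" not in line:
            continue
        label, value = line[2:].split("=", 1)
        value = value.split("$$", 1)[0].strip()
        headers[label.strip().upper()] = value
    return headers


# Hypothetical, heavily shortened file content for illustration.
sample_jcamp = """\
##TITLE=example spectrum
##JCAMP-DX=5.01 $$ version comment
##DATA TYPE=NMR SPECTRUM
##XUNITS=HZ
##NPOINTS=4
##XYDATA=(X++(Y..Y))
100 1 2 3 4
##END=
"""

headers = read_jcamp_headers(sample_jcamp)
print(headers["TITLE"], headers["NPOINTS"])
```

A real analysis would go on to decode the data table and integrate peaks; that step depends on the data encoding used in the file and is left out here.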

Page 63: Nanoinformatics 2010 SMIRP-ONS Talk

Integration of Multiple Web Services to Recommend Solvents for Reactions

(Andrew Lang)

Page 64: Nanoinformatics 2010 SMIRP-ONS Talk
Page 65: Nanoinformatics 2010 SMIRP-ONS Talk
Page 66: Nanoinformatics 2010 SMIRP-ONS Talk

Reaction Attempts Book

Page 67: Nanoinformatics 2010 SMIRP-ONS Talk

Reaction Attempts Book: Reactants listed Alphabetically

Page 68: Nanoinformatics 2010 SMIRP-ONS Talk

For all Formats of ONS Projects

Page 69: Nanoinformatics 2010 SMIRP-ONS Talk

Dynamic links to private tagged Mendeley collections

(Andrew Lang)

Page 70: Nanoinformatics 2010 SMIRP-ONS Talk

Conclusions

• Open Notebook Science can provide an additional channel to communicate useful scientific information

• Recording first for human consumption and abstracting the semantics later works, but the format will be field-specific

• As long as proof is valued over trust there is no limit to what useful forms of scientific communication will emerge.