On the reproducibility of science

On the reproducibility of science Melissa Haendel Beyond the PDF2 20 March 2013 @ontowonka [email protected]

description

Presented at Beyond the PDF2 in Amsterdam, 2013 (http://www.force11.org/beyondthepdf2). This talk describes preliminary data showing a lack of scientific reproducibility that stems solely from an inability to identify the material resources used in the research. Final work to be published soon!

Transcript of On the reproducibility of science

Page 1: On the reproducibility of science

On the reproducibility of science

Melissa Haendel Beyond the PDF2 20 March 2013 @ontowonka [email protected]

Page 2: On the reproducibility of science

Do we know if the infrastructure is actually broken?

Slide from Gully Burns

The science cycle

Page 3: On the reproducibility of science

This is a broken data story.

The science cycle

Image: http://www.joinchangenation.org/blog/post/roadblocks-on-the-pathway-to-citizenship

Page 4: On the reproducibility of science

Journal guidelines for methods are often poor and space is limited

“All companies from which materials were obtained should be listed.” - A well-known journal

Reproducibility depends, at a minimum, on using the same resources. But…

Page 5: On the reproducibility of science

Hypothesis: Antibodies in the published literature are not uniquely identifiable

An experiment in reproducibility

Gather journal articles

5 domains: Immunology, Cell biology, Neuroscience, Developmental biology, General biology

3 impact factors: High, Medium, Low

28 journals

119 papers

454 antibodies

408 commercial antibodies

46 non-commercial antibodies

Identifying questions:

Is the antibody identifiable on the vendor site?

Is the catalog number reported?

Is the source organism reported?

Is the antibody target identifiable?
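The four identifying questions lend themselves to a simple per-antibody record and tally. The sketch below is a minimal illustration in Python, not the study's actual analysis: the `AntibodyRecord` class, its field names, and the sample records are all hypothetical. Only the counts 408, 46, and 454 come from the slide.

```python
from dataclasses import dataclass


@dataclass
class AntibodyRecord:
    """One antibody mention in a paper (hypothetical schema)."""
    commercial: bool
    found_on_vendor_site: bool
    catalog_number_reported: bool
    source_organism_reported: bool
    target_identifiable: bool


# The four identifying questions, expressed as field names.
QUESTIONS = [
    "found_on_vendor_site",
    "catalog_number_reported",
    "source_organism_reported",
    "target_identifiable",
]


def percent_yes(records, question):
    """Percentage of records that answer 'yes' to one identifying question."""
    if not records:
        return 0.0
    yes = sum(1 for r in records if getattr(r, question))
    return 100.0 * yes / len(records)


# Made-up records for illustration only; these are NOT the study's data.
sample = [
    AntibodyRecord(True, True, True, True, True),
    AntibodyRecord(True, False, False, True, True),
    AntibodyRecord(False, False, False, False, True),
    AntibodyRecord(True, True, True, False, True),
]

for q in QUESTIONS:
    print(f"{q}: {percent_yes(sample, q):.0f}%")

# Counts reported on the slide: commercial plus non-commercial equals the total.
assert 408 + 46 == 454
```

Keeping each question as a separate field, rather than collapsing them into a single "identifiable" flag, leaves open how the answers should be combined, which the slides do not specify.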

Page 6: On the reproducibility of science

The data shows…

Approximately half of antibodies are not uniquely identifiable in 119 publications

[Bar chart: percent identifiable (y-axis, 0% to 60%) for commercial antibodies (n=408) and non-commercial antibodies (n=46)]

Page 7: On the reproducibility of science

Unique identification of commercial antibodies varies across discipline and impact factor

[Bar chart: percent identifiable (y-axis, 0% to 100%) by domain (Immunology, Neuroscience, Dev Bio, Cell Bio, General Bio) and impact factor (High, Medium, Low); group sizes shown on the chart: n=124, n=94, n=87, n=95, n=56]

In some domains, high-impact journals have worse reporting; in others, the opposite holds.

Page 8: On the reproducibility of science

Maybe labs are just disorganized?

Page 9: On the reproducibility of science

Meet the Urban Lab

Page 10: On the reproducibility of science

Meet the Urban Lab

Image: Gourami Watcher

Page 11: On the reproducibility of science

A+ organization!

The Urban lab antibodies

Page 12: On the reproducibility of science

Of 14 antibodies published in 45 articles, only 38% were identifiable

[Bar chart: percent identifiable (y-axis, 0% to 90%) for five categories: commercial Ab identifiable, non-commercial Ab identifiable, catalog number reported, source organism reported, target uniquely identifiable]

Page 13: On the reproducibility of science

What does this tell us?

Page 14: On the reproducibility of science

Scientists really do put their data in cardboard boxes.

Page 15: On the reproducibility of science

What are we going to do about it?

Ø Promote better reporting guidelines in journals

Ø Include reviewing guidelines

Ø Provide tools to reference research resources with unique and persistent IDs/URIs

Ø Train librarians and other data stewards to apply data standards
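One way to picture the kind of tooling suggested in these recommendations is a small check that flags which identifying details are missing from an antibody reference before a manuscript is submitted. The following Python sketch is illustrative only: the `REQUIRED_FIELDS` tuple mirrors the questions asked in this talk plus a persistent identifier, the field names are hypothetical, and the example URI uses the reserved `example.org` domain rather than any real registry.

```python
# Hypothetical field names; they mirror the identifying questions in this talk
# plus a persistent identifier.
REQUIRED_FIELDS = (
    "vendor",
    "catalog_number",
    "source_organism",
    "target",
    "persistent_id",
)


def missing_fields(reference):
    """Return the required fields that are absent or blank in a reference dict."""
    return [
        name
        for name in REQUIRED_FIELDS
        if not str(reference.get(name) or "").strip()
    ]


# Made-up references for illustration; neither describes a real product.
incomplete = {
    "vendor": "Example Vendor Inc.",
    "target": "Example target protein",
}
complete = {
    "vendor": "Example Vendor Inc.",
    "catalog_number": "EX-0001",
    "source_organism": "rabbit",
    "target": "Example target protein",
    "persistent_id": "http://example.org/antibody/EX-0001",
}

print(missing_fields(incomplete))
print(missing_fields(complete))
```

A reference that names only the vendor and target, as the quoted journal guideline would allow, fails this check on three of the five fields.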