On the reproducibility of science
Melissa Haendel, Beyond the PDF2, 20 March 2013, @ontowonka
Do we know if the infrastructure is actually broken?
Slide from Gully Burns
The science cycle
This is a broken data story.
Image: http://www.joinchangenation.org/blog/post/roadblocks-on-the-pathway-to-citizenship
Journal guidelines for methods are often poor and space is limited
“All companies from which materials were obtained should be listed.” - A well-known journal
Reproducibility depends, at a minimum, on using the same resources. But…
Hypothesis: Antibodies in the published literature are not uniquely identifiable
An experiment in reproducibility
Gather journal articles
5 domains: Immunology, Cell biology, Neuroscience, Developmental biology, General biology
3 impact factors: High, Medium, Low
28 Journals
119 papers
454 antibodies
408 commercial antibodies
46 non-commercial antibodies
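As a quick sanity check on the sample sizes above, the commercial and non-commercial counts should add up to the total, and the commercial share comes out at roughly 90%. A minimal sketch using only the numbers from the slide; the variable names are illustrative:

```python
# Counts taken from the slide; variable names are illustrative.
papers = 119
commercial = 408
non_commercial = 46

total = commercial + non_commercial
assert total == 454  # matches the reported total of antibodies

commercial_share = commercial / total
mean_per_paper = total / papers

print(f"commercial share: {commercial_share:.1%}")
print(f"mean antibodies per paper: {mean_per_paper:.1f}")
```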
Identifying questions:
Is the antibody identifiable in the vendor site?
Is the catalog number reported?
Is the source organism reported?
Is the antibody target identifiable?
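The four questions above can be read as a yes/no checklist applied to each antibody mention, then tallied per question. The sketch below assumes that reading; the field names and the example records are hypothetical and are not data from the study.

```python
from dataclasses import dataclass, fields


@dataclass
class AntibodyReport:
    """Answers to the four identifying questions for one antibody mention.

    Field names are hypothetical; the slides do not define a schema.
    """
    found_on_vendor_site: bool
    catalog_number_reported: bool
    source_organism_reported: bool
    target_identifiable: bool


def percent_yes(reports):
    """Return {question: percent of reports that answered yes}."""
    if not reports:
        return {}
    return {
        f.name: 100.0 * sum(getattr(r, f.name) for r in reports) / len(reports)
        for f in fields(AntibodyReport)
    }


# Made-up records, for illustration only.
sample = [
    AntibodyReport(True, True, True, True),
    AntibodyReport(False, False, True, True),
]
print(percent_yes(sample))
```

Keeping the answers separate, rather than collapsing them into a single "identifiable" flag, avoids guessing at how the study combined the questions.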
The data shows…
Approximately half of antibodies are not uniquely identifiable in 119 publications
[Chart: percent identifiable (y-axis, 0-60%) for commercial antibodies (n=408) and non-commercial antibodies (n=46)]
[Chart: percent identifiable (y-axis, 0-100%) by domain (Immunology, Neuroscience, Dev Bio, Cell Bio, General Bio) and journal impact factor (High, Medium, Low); sample sizes as listed: n=124, n=94, n=87, n=95, n=56]
Unique identification of commercial antibodies varies across discipline and impact factor
In some domains, high-impact journals have worse reporting; in others, the opposite holds
Maybe labs are just disorganized?
Meet the Urban Lab
Image: Gourami Watcher
A+ organization!
The Urban lab antibodies
Of 14 antibodies published in 45 articles, only 38% were identifiable
[Chart: percent identifiable (y-axis, 0-90%) for each measure: commercial Ab identifiable, non-commercial Ab identifiable, catalog number reported, source organism reported, target uniquely identifiable]
What does this tell us?
Scientists really do put their data in cardboard boxes.
What are we going to do about it?
- Promote better reporting guidelines in journals
- Include reviewing guidelines
- Provide tools to reference research resources with unique and persistent IDs/URIs
- Train librarians and other data stewards to apply data standards
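To illustrate what referencing a research resource with a unique and persistent ID/URI could look like in practice, here is a minimal sketch. The record layout, the vendor, the catalog number, and the `example.org` URI are all placeholders invented for illustration; the slides do not prescribe a format or an identifier scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRef:
    """A hypothetical structured reference to a research resource."""
    name: str
    vendor: str
    catalog_number: str
    uri: str  # persistent identifier; example.org is a placeholder domain

    def cite(self) -> str:
        """Format the reference for a methods section."""
        return (
            f"{self.name} ({self.vendor}, "
            f"cat. no. {self.catalog_number}; {self.uri})"
        )


# Placeholder values, not a real product or identifier.
ref = ResourceRef(
    name="anti-ExampleProtein antibody",
    vendor="ExampleVendor",
    catalog_number="EX-0001",
    uri="https://example.org/resource/EX-0001",
)
print(ref.cite())
```

Carrying the vendor, catalog number, and URI together means a reader can still resolve the resource if any one of them goes stale.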