BioPharma and FAIR Data, a Collaborative Advantage
BioPharma Adoption of FAIR* Data, a Collaborative Advantage
Tom Plasterer, PhD
Research & Development Information (RDI); US Cross-Science Director
25 May 2017
* Findable, Accessible, Interoperable and Reusable
The right data is there when I need it
Your data and my data are mutually understandable
Our data can be effortlessly combined
I am permitted to use any data I can access
Data can be reshaped for a different purpose
Data sharing is rewarded
‘I’ can be a human or a machine
We Want Data Nirvana!
FAIR Data: Overview
To be Findable:
• Globally unique, resolvable and persistent identifiers
• Machine-actionable contextual information supporting discovery
To be Accessible:
• Clearly defined access protocol
• Clearly defined rules for authorization/authentication
To be Interoperable:
• Use shared vocabularies and/or ontologies
• Syntactically and semantically machine-accessible format
To be Reusable:
• Be compliant with the F, A and I Principles
• Contextual information, allowing proper interpretation
• Rich provenance information facilitating accurate citation
Mark Wilkinson, Data Interoperability and FAIRness Through Existing Web Technologies
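As a rough illustration of the four principles above, the following Python sketch describes one dataset with terms from public vocabularies (Dublin Core Terms, DCAT, PROV-O) and checks which principles still lack supporting metadata. The record, every example.org identifier, the `CHECKS` table and the `missing_fields` helper are assumptions made for illustration; they are not part of the FAIR principles or of any official FAIR tooling, and the record is flattened for brevity.

```python
import json

# Hypothetical dataset description. All identifiers and URLs are placeholders.
record = {
    "@context": {
        "dcterms": "http://purl.org/dc/terms/",
        "dcat": "http://www.w3.org/ns/dcat#",
        "prov": "http://www.w3.org/ns/prov#",
    },
    # Findable: a globally unique identifier plus descriptive metadata
    "@id": "https://example.org/dataset/assay-0001",
    "dcterms:title": "Example assay results",
    "dcat:keyword": ["assay", "example"],
    # Accessible: where to get the data and under which access rules
    "dcat:landingPage": "https://example.org/dataset/assay-0001/about",
    "dcterms:accessRights": "https://example.org/policy/internal-only",
    # Interoperable: the shared schema or vocabulary the data conforms to
    "dcterms:conformsTo": "https://example.org/vocab/assay-schema",
    # Reusable: license and provenance
    "dcterms:license": "https://creativecommons.org/licenses/by/4.0/",
    "prov:wasAttributedTo": "https://example.org/org/example-lab",
}

# Assumed mapping from each principle to the fields that support it.
CHECKS = {
    "Findable": ["@id", "dcterms:title", "dcat:keyword"],
    "Accessible": ["dcat:landingPage", "dcterms:accessRights"],
    "Interoperable": ["@context", "dcterms:conformsTo"],
    "Reusable": ["dcterms:license", "prov:wasAttributedTo"],
}


def missing_fields(rec):
    """Return, per principle, the fields that are absent or empty in rec."""
    return {
        principle: [field for field in fields if not rec.get(field)]
        for principle, fields in CHECKS.items()
    }


print(json.dumps(missing_fields(record), indent=2))
```

A record that omits the license, for example, would be reported as incomplete under Reusable only. A real assessment would also test whether the identifiers actually resolve, which this sketch does not attempt.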
FAIR Data: A Brief History
Moving away from Narrative
• Nanopublications
Incubating Standards in Open PHACTS
• VoID, PROV-O
Lorentz Center Workshop
• FORCE 11 FAIR Guiding Principles
• Participants: IMI members, US researchers, content providers, ELIXIR, European Open Science Cloud, Big Data to Knowledge (BD2K)
Current Status:
• FAIR Data Workshops (EU-ELIXIR nodes, Bio-IT)
• Inclusion in Horizon 2020, NIH Advocacy
• IMI2 Data FAIR-ification Call
• Vendors getting up to speed
Rapid Adoption of Principles
Developed and endorsed by researchers, publishers, funding agencies, industry partners.
As of May 2017, 100+ citations since the 2016 publication
Included in G20 communique, EOSC, H2020, NIH, and more…
Thanks to: @micheldumontier::2017-05-19
Introductory Nature Paper: The FAIR Guiding Principles for scientific data management and stewardship
Thanks to: @micheldumontier::2017-05-19
This Altmetric score indicates the article is:
• In the 99th percentile (ranked 615th) of the 278,235 tracked articles of a similar age in all journals
• In the 95th percentile (ranked 1st) of the 23 tracked articles of a similar age in Scientific Data
FAIR Data: Systems Biology Survey
Molecular Systems Biology
Volume 11, Issue 12, 28 DEC 2015 DOI: 10.15252/msb.20156053
http://onlinelibrary.wiley.com/doi/10.15252/msb.20156053/full#msb156053-fig-0001
FAIR Data: Data Stewardship Survey
13 Questions – One minute out of your day!
http://bit.ly/BiopharmaDataStewardship
Survey: What best describes your department?
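[Pie chart; values in %: 65.2, 4.3, 8.7, 13, 4.3, 4.3; legend: IT/IS, Target Discovery, Lead Discovery, Clinical Development, Marketing & Sales, Other - Write In; value-to-category pairing not preserved in the transcript]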
Survey: What is your scientific background?
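[Pie chart; values in %: 21.7, 13, 4.3, 34.8, 8.7, 17.4; legend: Experimentalist, Modeler (Structural), Modeler (Statistical), Informatician, Project Manager, Other - Write In; value-to-category pairing not preserved in the transcript]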
Survey: How important is data reuse to your organization?
[Bar chart; x-axis categories: 2, 3, 4, 5; y-axis: 0 to 16; bar labels: 2, 2, 4, 14]
How important is the use of public standards to structuring your data?
[Bar chart; x-axis categories: 2, 3, 4, 5; y-axis: 0 to 10; bar labels: 2, 2, 8, 9]
Survey: Is integrating internal data a challenge?
[Pie chart; values in %: 95.7, 4.3; legend: Yes, No]
Is integrating external data from partnerships a challenge?
[Pie chart; values in %: 91.3, 4.3, 4.3; legend: Yes, No, Don't know]
Are metadata and data models considered proprietary at your organization?
[Pie chart; values in %: 40, 55, 5; legend: Yes, No, Don't know]
What controlled vocabularies and/or ontologies do you use for structuring and annotating your data and models?
[Bar chart; y-axis: 0 to 90 (%); category labels not preserved in the transcript; bar labels: 31.6, 21.1, 26.3, 26.3, 36.8, 68.4, 42.1, 21.1, 15.8, 52.6, 78.9, 63.2, 15.8, 15.8, 15.8, 10.5, 52.6, 15.8, 31.6]
Are data usage requirements clearly understood within your organization?
[Bar chart; x-axis categories: 0 to 5, with end labels No and Yes; y-axis: 0 to 8; bar labels: 1, 7, 3, 2, 2, 2]
Is it easy or hard to get access to clinical data in your organization?
[Bar chart; x-axis categories: 2, 3, 4, 5, with end labels Easy and Hard; y-axis: 0 to 8; bar labels: 2, 4, 7, 2]
Is it easy or hard to get access to clinical metadata in your organization?
[Bar chart; x-axis categories: 1 to 5, with end labels Easy and Hard; y-axis: 0 to 4.5; bar labels: 3, 3, 4, 4, 1]
Survey: Who ‘owns’ clinical data at your organization?
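[Pie chart; values in %: 40, 13.3, 6.7, 26.7, 13.3; legend: A drug project team, A clinical area, A third party/vendor, Don't Know/Not Applicable, Other; value-to-category pairing not preserved in the transcript]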
How do you share models and data with your collaborators before publication?
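[Bar chart; y-axis: 0 to 60 (%); bar labels: 43.8, 43.8, 6.3, 50, 12.5, 12.5; categories: By email, Through project database/content management system, Through bespoke Systems Biology platform, Dropbox/Box/SharePoint, Software Versioning System, Don't know; value-to-category pairing not preserved in the transcript]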
FAIR Data & Biopharma?
Collaborative & Competitive Intelligence:
• Who do we want to partner with? Are there complementary assets to our portfolio?
• What space is too crowded and not our area of expertise?
• Greenfield situations?
Mergers, Acquisitions, Partnerships:
• How do we efficiently and deeply absorb data generated elsewhere into our systems? How do we efficiently share?
• Does this make a smaller biotech/start-up a more viable partner?
Improved Patient Care:
• Can we share data and outcomes more efficiently in complicated trial settings (basket trials, adaptive trials) to better engage opinion leaders and foster dialog?
• Along with Differential Privacy approaches, can we have the broader research community help mine our data?
Data (Ir)-reproducibility:
• Is preclinical data reproducible?
• Can we utilize data credentialization? (thanks to Dan Crowther @ Sanofi)
Getting Started
What’s the difference between FAIR Data and Linked Data?
What’s Critical?
• URIs, PURLs
• Standards, vocabularies, cross-mapping
• Access rules
• FAIR-ness metrics
• Data and Information Scientists
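To make the first items in this list concrete, here is a minimal Python sketch of cross-mapping local labels to shared-vocabulary URIs and of one very simple FAIR-ness style metric: the fraction of annotations that end up as a URI. The `CROSS_MAP` table, the example.org URIs and the function names are hypothetical; a real deployment would draw mappings from a maintained ontology or mapping service rather than a hard-coded dictionary.

```python
from urllib.parse import urlparse

# Hypothetical cross-mapping from local labels to shared-vocabulary URIs.
# The URIs are placeholders, not real ontology terms.
CROSS_MAP = {
    "heart attack": "https://example.org/vocab/disease/0001",
    "myocardial infarction": "https://example.org/vocab/disease/0001",
    "t2d": "https://example.org/vocab/disease/0002",
}


def is_http_uri(value):
    """True if value looks like an http(s) URI with a host part."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def to_uri(label):
    """Return a URI for label, or None when no mapping is known."""
    if is_http_uri(label):
        return label  # already annotated with a URI
    return CROSS_MAP.get(label.strip().lower())


def mapped_fraction(labels):
    """Toy metric: share of annotations that resolve to a URI."""
    if not labels:
        return 0.0
    return sum(to_uri(label) is not None for label in labels) / len(labels)


annotations = [
    "Heart attack",
    "T2D",
    "unknown thing",
    "https://example.org/vocab/disease/0003",
]
print(mapped_fraction(annotations))
```

Two synonyms mapping to the same URI is the point of cross-mapping: data annotated as "heart attack" and as "myocardial infarction" can then be combined without manual reconciliation. The check here is purely syntactic and does not confirm that a URI resolves or persists.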
FAIR and Enterprise Data Management
Adoption, Sticks and Carrots; Winners and Losers
[Diagram labels: Linked Data, FAIR Data]
R&D | RDI
Interoperable: Need clearly recognized
• Use the same plumbing and your data won’t be stuck in a silo
Accessible: Open, if permitted
• Interoperate first then govern
Reusable: Use public solutions and consortia
• Don’t reinvent the wheel (OK, Ontology…)
Invest in FAIR Data Stewardship
• Investment to future-proof your efforts
FAIR Data and Collaboration: Take-aways
Thanks
Key Influencers
David Wood
Toby Segaran
Tim Berners-Lee
Lee Harland
Bryn Williams-Jones
Eric Neumann
Dean Allemang
Barend Mons
Carole Goble
Bernadette Hyland
Bob Stanley
Eric Little
Michel Dumontier
John Wilbanks
Hans Constandt
Dan Crowther
Tim Hoctor
Bio-IT 2017 Conference Organizers
AZ/MedImmune Linked Data Community
Kerstin Forsberg
Rajan Desai
Jeff Saltzman
David Ruau
Kathy Reinold
Bridget Behringer
Nirmal Keshava
Sara Dempster
Bryan Takasaki
Nick Wright
David Fenstermacher