BioPharma and FAIR Data, a Collaborative Advantage

Post on 23-Jan-2018


BioPharma Adoption of FAIR* Data, a Collaborative Advantage

Tom Plasterer, PhD

Research & Development Information (RDI); US Cross-Science Director

25 May 2017

* Findable, Accessible, Interoperable and Reusable

We Want Data Nirvana!

The right data is there when I need it

Your data and my data are mutually understandable

Our data can be effortlessly combined

I am permitted to use any data I can access

Data can be reshaped for a different purpose

Data sharing is rewarded

‘I’ can be a human or a machine


FAIR Data: Overview

To be Findable:

• Globally unique, resolvable and persistent identifiers

• Machine-actionable contextual information supporting discovery

To be Accessible:

• Clearly defined access protocol

• Clearly defined rules for authorization/authentication

To be Interoperable:

• Use shared vocabularies and/or ontologies

• Syntactically and semantically machine-accessible format

To be Reusable:

• Be compliant with the F, A and I Principles

• Contextual information, allowing proper interpretation

• Rich provenance information facilitating accurate citation

Mark Wilkinson, Data Interoperability and FAIRness Through Existing Web Technologies
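The four groups of principles above can be read as a checklist over a dataset's metadata. The following is a minimal sketch under stated assumptions, not an official FAIR metric: the `DatasetRecord` fields, the `fair_checklist` function and the example URLs are all hypothetical names chosen for illustration, and the identifier test only checks that the value looks like an HTTP(S) URI, not that it actually resolves or persists.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    identifier: str = ""       # ideally a globally unique, resolvable, persistent URI
    description: str = ""      # contextual information supporting discovery
    access_protocol: str = ""  # clearly defined access protocol, e.g. "https"
    access_rules: str = ""     # authorization/authentication rules
    vocabularies: list = field(default_factory=list)  # shared vocabularies/ontologies
    data_format: str = ""      # machine-accessible format, e.g. "text/turtle"
    provenance: str = ""       # who produced the data and how, for citation


def is_resolvable_uri(value: str) -> bool:
    """Syntactic check only (assumption): an http(s) URI with a host."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def fair_checklist(record: DatasetRecord) -> dict:
    """Return a True/False flag per principle group for one record."""
    findable = is_resolvable_uri(record.identifier) and bool(record.description)
    accessible = bool(record.access_protocol) and bool(record.access_rules)
    interoperable = bool(record.vocabularies) and bool(record.data_format)
    # Reusable requires F, A and I plus provenance information.
    reusable = findable and accessible and interoperable and bool(record.provenance)
    return {"F": findable, "A": accessible, "I": interoperable, "R": reusable}


# Example record with no provenance, so it falls short on "R".
record = DatasetRecord(
    identifier="https://example.org/dataset/123",
    description="Hypothetical assay results",
    access_protocol="https",
    access_rules="Login required; internal researchers only",
    vocabularies=["http://purl.obolibrary.org/obo/chebi.owl"],
    data_format="text/turtle",
)
print(fair_checklist(record))  # {'F': True, 'A': True, 'I': True, 'R': False}
```

A real assessment would need richer tests (for example, dereferencing the identifier and validating the vocabulary terms); the point here is only that each principle can be expressed as a machine-checkable condition.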


FAIR Data: A Brief History

Moving away from Narrative

• Nanopublications

Incubating Standards in Open PHACTS

• VoID, PROV-O

Lorentz Center Workshop

• FORCE 11 FAIR Guiding Principles

• Participants: IMI members, US researchers, content providers, ELIXIR, European Open Science Cloud, Big Data to Knowledge (BD2K)

Current Status:

• FAIR Data Workshops (EU-ELIXIR nodes, Bio-IT)

• Inclusion in Horizon 2020, NIH Advocacy

• IMI2 Data FAIR-ification Call

• Vendors getting up to speed


Rapid Adoption of Principles

Developed and endorsed by researchers, publishers, funding agencies, industry partners.

As of May 2017, 100+ citations since 2016 publication

Included in G20 communique, EOSC, H2020, NIH, and more…

Thanks to: @micheldumontier::2017-05-19


Introductory Nature Paper: The FAIR Guiding Principles for scientific data management and stewardship

Thanks to: @micheldumontier::2017-05-19

This Altmetric score indicates the article is:

• In the 99th percentile (ranked 615th) of the 278,235 tracked articles of a similar age in all journals

• In the 95th percentile (ranked 1st) of the 23 tracked articles of a similar age in Scientific Data

FAIR Data: Systems Biology Survey

Molecular Systems Biology

Volume 11, Issue 12, 28 DEC 2015 DOI: 10.15252/msb.20156053

http://onlinelibrary.wiley.com/doi/10.15252/msb.20156053/full#msb156053-fig-0001


FAIR Data: Data Stewardship Survey

Data Stewardship Survey

13 Questions – One minute out of your day!

http://bit.ly/BiopharmaDataStewardship

Survey: What best describes your department?

[Chart] Values (%): 65.2, 4.3, 8.7, 13, 4.3, 4.3

Legend: IT/IS; Target Discovery; Lead Discovery; Clinical Development; Marketing & Sales; Other - Write In

Survey: What is your scientific background?

[Chart] Values (%): 21.7, 13, 4.3, 34.8, 8.7, 17.4

Legend: Experimentalist; Modeler (Structural); Modeler (Statistical); Informatician; Project Manager; Other - Write In

Survey: How important is data reuse to your organization?

[Bar chart] Bar values: 2, 2, 4, 14; y-axis: 0 to 16

Rating categories: 2, 3, 4, 5

How important is the use of public standards for structuring your data?

[Bar chart] Bar values: 2, 2, 8, 9; y-axis: 0 to 10

Rating categories: 2, 3, 4, 5

Survey: Is integrating internal data a challenge?

[Chart] Values (%): 95.7, 4.3

Legend: Yes; No

Is integrating external data from partnerships a challenge?

[Chart] Values (%): 91.3, 4.3, 4.3

Legend: Yes; No; Don't know

Are metadata and data models considered proprietary at your organization?

[Chart] Values (%): 40, 55, 5

Legend: Yes; No; Don't know

What controlled vocabularies and/or ontologies do you use for structuring and annotating your data and models?

[Bar chart] Bar values (%): 31.6, 21.1, 26.3, 26.3, 36.8, 68.4, 42.1, 21.1, 15.8, 52.6, 78.9, 63.2, 15.8, 15.8, 15.8, 10.5, 52.6, 15.8, 31.6, 0; y-axis: 0 to 90

Are data usage requirements clearly understood within your organization?

[Bar chart] Scale: No to Yes; bar values: 1, 7, 3, 2, 2, 2; y-axis: 0 to 8

Rating categories: 0, 1, 2, 3, 4, 5

Is it easy or hard to get access to clinical data in your organization?

[Bar chart] Scale: Easy to Hard; bar values: 2, 4, 7, 2; y-axis: 0 to 8

Rating categories: 2, 3, 4, 5

Is it easy or hard to get access to clinical metadata in your organization?

[Bar chart] Scale: Easy to Hard; bar values: 3, 3, 4, 4, 1; y-axis: 0 to 4.5

Rating categories: 1, 2, 3, 4, 5

Survey: Who ‘owns’ clinical data at your organization?

[Chart] Values (%): 40, 13.3, 6.7, 26.7, 13.3

Legend: A drug project team; A clinical area; A third party/vendor; Don't Know/Not Applicable; Other

How do you share models and data with your collaborators before publication?

[Bar chart] Bar values (%): 43.8, 43.8, 6.3, 50, 12.5, 12.5; y-axis: 0 to 60

Categories: By email; Through project database/content management system; Through bespoke Systems Biology platform; Dropbox/Box/SharePoint; Software Versioning System; Don't know

FAIR Data & Biopharma?

Collaborative & Competitive Intelligence:

• Who do we want to partner with? Are there complementary assets to our portfolio?

• What space is too crowded and not our area of expertise?

• Greenfield situations?

Mergers, Acquisitions, Partnerships:

• How do we efficiently and deeply absorb data generated elsewhere into our systems? How do we efficiently share?

• Does this make a smaller biotech/start-up a more viable partner?

Improved Patient Care:

• Can we share data and outcomes more efficiently in complicated trial settings (basket trials, adaptive trials) to better engage opinion leaders and foster dialog?

• Along with Differential Privacy approaches, can we have the broader research community help mine our data?

Data (Ir)-reproducibility:

• Is preclinical data reproducible?

• Can we utilize data credentialization? (thanks to Dan Crowther @ Sanofi)


Getting Started

What’s the difference between FAIR Data and Linked Data?

What’s Critical?

• URIs, PURLs

• Standards, vocabularies, cross-mapping

• Access rules

• FAIR-ness metrics

• Data and Information Scientists

FAIR and Enterprise Data Management

Adoption, Sticks and Carrots; Winners and Losers

Linked Data FAIR Data

R&D | RDI

FAIR Data and Collaboration: Take-aways

Interoperable: Need clearly recognized

• Use the same plumbing and your data won’t be stuck in a silo

Accessible: Open, if permitted

• Interoperate first, then govern

Reusable: Use public solutions and consortia

• Don’t reinvent the wheel (OK - Ontology…)

Invest in FAIR Data Stewardship

• Investment to future-proof your efforts

Thanks

Key Influencers

David Wood

Toby Segaran

Tim Berners-Lee

Lee Harland

Bryn Williams-Jones

Eric Neumann

Dean Allemang

Barend Mons

Carole Goble

Bernadette Hyland

Bob Stanley

Eric Little

Michel Dumontier

John Wilbanks

Hans Constandt

Dan Crowther

Tim Hoctor

Bio-IT 2017 Conference Organizers

AZ/MedImmune Linked Data Community

Kerstin Forsberg

Rajan Desai

Jeff Saltzman

David Ruau

Kathy Reinold

Bridget Behringer

Nirmal Keshava

Sara Dempster

Bryan Takasaki

Nick Wright

David Fenstermacher