BioPharma and FAIR Data, a Collaborative Advantage

Date posted: 23-Jan-2018

Category: Science

  • BioPharma Adoption of FAIR* Data, a Collaborative Advantage

    Tom Plasterer, PhD

    Research & Development Information (RDI); US Cross-Science Director

    25 May 2017

    * Findable, Accessible, Interoperable and Reusable

  • 3

    We Want Data Nirvana!

    The right data is there when I need it

    Your data and my data are mutually understandable

    Our data can be effortlessly combined

    I am permitted to use any data I can access

    Data can be reshaped for a different purpose

    Data sharing is rewarded

    I can be a human or a machine

  • 4

    FAIR Data: Overview

    To be Findable:

    Globally unique, resolvable and persistent identifiers

    Machine-actionable contextual information supporting discovery

    To be Accessible:

    Clearly defined access protocol

    Clearly defined rules for authorization/authentication

    To be Interoperable:

    Use shared vocabularies and/or ontologies

    Syntactically and semantically machine-accessible format

    To be Reusable:

    Be compliant with the F, A and I Principles

    Contextual information, allowing proper interpretation

    Rich provenance information facilitating accurate citation

    Mark Wilkinson, Data Interoperability and FAIRness Through Existing Web Technologies

    http://www.slideshare.net/markmoby/fair-data-prototype-interoperability-and-fairness-through-a-novel-combination-of-web-technologies
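
    The four groups of principles above can be read as a checklist over a dataset's metadata. The following is a minimal sketch under stated assumptions, not an official FAIR metric: the `DatasetRecord` fields, the `fair_checklist` function and the example.org URIs are all hypothetical names chosen for illustration, and the identifier check is purely syntactic (it cannot confirm persistence).

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    identifier: str = ""        # F: globally unique, resolvable, persistent ID
    description: str = ""       # F: contextual information supporting discovery
    access_protocol: str = ""   # A: clearly defined access protocol
    access_rules: str = ""      # A: authorization/authentication rules
    vocabularies: list = field(default_factory=list)  # I: shared vocabularies
    data_format: str = ""       # I: machine-accessible format
    provenance: str = ""        # R: provenance supporting accurate citation


def is_resolvable_uri(value: str) -> bool:
    """Syntactic check only: an http(s) URI with a host part."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def fair_checklist(record: DatasetRecord) -> dict:
    """Return a pass/fail flag per principle group (F, A, I, R)."""
    findable = is_resolvable_uri(record.identifier) and bool(record.description)
    accessible = bool(record.access_protocol) and bool(record.access_rules)
    interoperable = bool(record.vocabularies) and bool(record.data_format)
    # Reusable: compliant with F, A and I, plus provenance information.
    reusable = findable and accessible and interoperable and bool(record.provenance)
    return {"F": findable, "A": accessible, "I": interoperable, "R": reusable}


record = DatasetRecord(
    identifier="https://example.org/dataset/123",  # placeholder URI
    description="Example assay results",
    access_protocol="https",
    access_rules="login required",
    vocabularies=["https://example.org/vocab/assay"],  # placeholder URI
    data_format="text/turtle",
)
print(fair_checklist(record))  # provenance is empty, so "R" is False
```

    A boolean per group is deliberately coarse; a real assessment would likely score each principle separately and verify that identifiers actually resolve.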

  • 5

    FAIR Data: A Brief History

    Moving away from Narrative

    Nanopublications

    Incubating Standards in Open PHACTS

    VoID, PROV-O

    Lorentz Center Workshop

    FORCE 11 FAIR Guiding Principles

    Participants: IMI members, US researchers, content providers, ELIXIR, European Open Science Cloud, Big Data to Knowledge (BD2K)

    Current Status:

    FAIR Data Workshops (EU-ELIXIR nodes, Bio-IT)

    Inclusion in Horizon 2020, NIH Advocacy

    IMI2 Data FAIR-ification Call

    Vendors getting up to speed

  • 6

    Rapid Adoption of Principles

    Developed and endorsed by researchers, publishers, funding agencies, industry partners.

    As of May 2017,

    100+ citations since 2016 publication

    Included in G20 communique, EOSC, H2020, NIH, and more

    Thanks to: @micheldumontier::2017-05-19

  • 7

    Introductory Nature Paper: The FAIR Guiding Principles for scientific data management and stewardship

    Thanks to: @micheldumontier::2017-05-19

    This Altmetric score indicates the article is:

    In the 99th percentile (ranked 615th) of the 278,235 tracked articles of a similar age in all journals

    In the 95th percentile (ranked 1st) of the 23 tracked articles of a similar age in Scientific Data

  • 8

    FAIR Data: Systems Biology Survey

    Molecular Systems Biology

    Volume 11, Issue 12, 28 DEC 2015 DOI: 10.15252/msb.20156053

    http://onlinelibrary.wiley.com/doi/10.15252/msb.20156053/full#msb156053-fig-0001

    http://onlinelibrary.wiley.com/doi/10.1002/msb.v11.12/issuetoc

  • 9

    FAIR Data: Data Stewardship Survey

    Data Stewardship Survey: 13 Questions. One minute out of your day!

    http://bit.ly/BiopharmaDataStewardship

  • 10

    Survey: What best describes your department?

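    [Pie chart; values and category labels listed in extraction order]

    Responses (%): 65.2, 4.3, 8.7, 13, 4.3, 4.3

    Categories: IT/IS; Target Discovery; Lead Discovery; Clinical Development; Marketing & Sales; Other - Write In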

  • 11

    Survey: What is your scientific background?

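    [Pie chart; values and category labels listed in extraction order]

    Responses (%): 21.7, 13, 4.3, 34.8, 8.7, 17.4

    Categories: Experimentalist; Modeler (Structural); Modeler (Statistical); Informatician; Project Manager; Other - Write In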

  • 12

    Survey: How important is data reuse to your organization?

    [Bar chart of response counts by rating; rating axis 2 to 5]

    Counts in extraction order: 2, 2, 4, 14

  • 13

    How important is the use of public standards for structuring your data?

    [Bar chart of response counts by rating; rating axis 2 to 5]

    Counts in extraction order: 2, 2, 8, 9

  • 14

    Survey: Is integrating internal data a challenge?

    [Pie chart; values and category labels listed in extraction order]

    Responses (%): 95.7, 4.3

    Categories: Yes; No

  • 15

    Is integrating external data from partnerships a challenge?

    [Pie chart; values and category labels listed in extraction order]

    Responses (%): 91.3, 4.3, 4.3

    Categories: Yes; No; Don't know

  • 16

    Are metadata and data models considered proprietary at your organization?

    [Pie chart; values and category labels listed in extraction order]

    Responses (%): 40, 55, 5

    Categories: Yes; No; Don't know

  • 17

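    What controlled vocabularies and/or ontologies do you use for structuring and annotating your data and models?

    [Bar chart of percentage of respondents per vocabulary/ontology; percentage axis 0 to 90; category labels not recoverable]

    Values (%) in extraction order: 31.6, 21.1, 26.3, 26.3, 36.8, 68.4, 42.1, 21.1, 15.8, 52.6, 78.9, 63.2, 15.8, 15.8, 15.8, 10.5, 52.6, 15.8, 31.6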

  • 18

    Are data usage requirements clearly understood within your organization?

    [Bar chart of response counts by rating; rating axis 0 to 5, from No to Yes]

    Counts in extraction order: 1, 7, 3, 2, 2, 2

  • 19

    Is it easy or hard to get access to clinical data in your organization?

    [Bar chart of response counts by rating; rating axis 2 to 5, from Easy to Hard]

    Counts in extraction order: 2, 4, 7, 2

  • 20

    Is it easy or hard to get access to clinical metadata in your organization?

    [Bar chart of response counts by rating; rating axis 1 to 5, from Easy to Hard]

    Counts in extraction order: 3, 3, 4, 4, 1

  • 21

    Survey: Who owns clinical data at your organization?

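    [Pie chart; values and category labels listed in extraction order]

    Responses (%): 40, 13.3, 6.7, 26.7, 13.3

    Categories: A drug project team; A clinical area; A third party/vendor; Don't Know/Not Applicable; Other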

  • 22

    How do you share models and data with your collaborators before publication?

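    [Bar chart; percentage axis 0 to 60; values and category labels listed in extraction order]

    Responses (%): 43.8, 43.8, 6.3, 50, 12.5, 12.5

    Categories: By email; Through project database/content management system; Through bespoke Systems Biology platform; Dropbox/Box/SharePoint; Software Versioning System; Don't know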

  • 23

    FAIR Data & Biopharma?

    Collaborative & Competitive Intelligence:

    Who do we want to partner with? Are there complementary assets to our portfolio?

    What space is too crowded and not our area of expertise?

    Greenfield situations?

    Mergers, Acquisitions, Partnerships:

    How do we efficiently and deeply absorb data generated elsewhere into our systems? How do we efficiently share?

    Does this make a smaller biotech/start-up a more viable partner?

    Improved Patient Care:

    Can we share data and outcomes more efficiently in complicated trial settings (basket trials, adaptive trials) to better engage opinion leaders and foster dialog?

    Along with Differential Privacy approaches, can we have the broader research community help mine our data?

    Data (Ir)-reproducibility:

    Is preclinical data reproducible?

    Can we utilize data credentialization? (thanks to Dan Crowther @ Sanofi)
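
    The Differential Privacy point above can be made concrete with the Laplace mechanism, one common way to release an aggregate without exposing any single record. This is a minimal sketch under stated assumptions: the slides do not say which approach was intended, the `private_count` helper and the toy responder flags are hypothetical, and a production system would use a vetted library rather than hand-rolled sampling.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Release a noisy count of the items matching predicate.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the Laplace noise scale is 1 / epsilon.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy data: responder flags for ten trial participants.
responders = [True, False, True, True, False, True, False, True, True, False]
rng = random.Random(0)  # seeded only to make the demo repeatable
noisy = private_count(responders, lambda r: r, epsilon=0.5, rng=rng)
print(f"true count = {sum(responders)}, released count = {noisy:.2f}")
```

    Smaller values of `epsilon` add more noise and give stronger privacy; the trade-off against analytic usefulness would have to be set per dataset.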

  • 25

    Getting Started

    What's the difference between FAIR Data and Linked Data?

    What's Critical?

    URIs, PURLs

    Standards, vocabularies, cross-mapping

    Access rules

    FAIR-ness metrics

    Data and Information Scientists

    FAIR and Enterprise Data Management

    Adoption, Sticks and Carrots; Winners and Losers

    [Diagram labels: Linked Data; FAIR Data]
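
    Of the critical items listed on this slide, cross-mapping local terms to shared vocabulary URIs is the easiest to illustrate. The sketch below is an assumption-laden example, not a method from the talk: the `CROSS_MAP` table, the `map_terms` helper and the example.org URIs are hypothetical placeholders rather than real ontology terms.

```python
# Hypothetical cross-mapping from local labels to shared-vocabulary URIs.
# The example.org URIs are placeholders, not real ontology identifiers.
CROSS_MAP = {
    "heart attack": "https://example.org/vocab/disease/0001",
    "myocardial infarction": "https://example.org/vocab/disease/0001",
    "mi": "https://example.org/vocab/disease/0001",
    "t2d": "https://example.org/vocab/disease/0002",
}


def normalize(label: str) -> str:
    """Lower-case the label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def map_terms(labels):
    """Split labels into those with a shared URI and those still unmapped."""
    mapped, unmapped = {}, []
    for label in labels:
        uri = CROSS_MAP.get(normalize(label))
        if uri is None:
            unmapped.append(label)
        else:
            mapped[label] = uri
    return mapped, unmapped


mapped, unmapped = map_terms(["Heart  Attack", "MI", "T2D", "asthma"])
print(mapped)
print("needs curation:", unmapped)
```

    Reporting the unmapped labels matters as much as the mapping itself, since those are the terms a data steward would need to curate.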

  • R&D | RDI

    FAIR Data and Collaboration: Take-aways

    Interoperable: Need clearly recognized; use the same plumbing and your data won't be stuck in a silo

    Accessible: Open, if permitted; interoperate first, then govern

    Reusable: Use public solutions and consortia; don't reinvent the wheel (OKOntology)

    Invest in FAIR Data Stewardship: investment to future-proof your efforts

  • R&D | RDI

    Thanks

    Key Influencers

    David Wood

    Toby Segaran

    Tim Berners-Lee

    Lee Harland

    Bryn Williams-Jones

    Eric Neumann

    Dean Allemang

    Barend Mons

    Carole Goble

    Bernadette Hyland

    Bob Stanley

    Eric Little

    Michel Dumontier

    John Wilbanks

    Hans Constandt

    Dan Crowther

    Tim Hoctor

    Bio-IT 2017 Conference Organizers

    AZ/MedImmune Linked Data Community

    Kerstin Forsberg

    Rajan Desai

    Jeff Saltzman

    David Ruau

    Kathy Reinold

    Bridget Behringer

    Nirmal Keshava

    Sara Dempster

    Bryan Takasaki

    Nick Wright

    David Fenstermacher