2015-04-28 Atul Butte's presentation to the NIH Precision Medicine Initiative working group meeting

Page 1

Precision Medicine Workshop: Big Data Aspects

Atul Butte, MD, PhD

Director, Institute for Computational Health Science

University of California, San Francisco

[email protected]

@atulbutte

@ImmPortDB

Page 2

Disclosures

• Scientific founder and advisory board membership
– Genstruct
– NuMedii
– Personalis
– Carmenta

• Honoraria for talks
– Lilly
– Pfizer
– Siemens
– Bristol Myers Squibb
– AstraZeneca
– Roche
– Genentech
– Warburg Pincus

• Past or present consultancy
– Lilly
– Johnson and Johnson
– Roche
– NuMedii
– Genstruct
– Tercica
– Ecoeos
– Ansh Labs
– Prevendia
– Samsung
– Assay Depot
– Regeneron
– Verinata
– Pathway Diagnostics
– Geisinger Health
– Covance
– Wilson Sonsini Goodrich & Rosati
– 10X Genomics
– Medgenics
– GNS Healthcare
– Gerson Lehrman Group
– Coatue Management

• Corporate Relationships
– Northrop Grumman
– Aptalis
– Thomson Reuters
– Intel
– SAP
– SV Angel

• Speakers’ bureau
– None

• Companies started by students
– Carmenta
– Serendipity
– NuMedii
– Stimulomics
– NunaHealth
– Praedicat
– MyTime
– Flipora

Page 3

bit.ly/1b4sa7b

Page 4

Institute for Computational Health Sciences

Page 5

1. Major potential for disparities

• Will you capture any from the 2.2 million incarcerated? Nearly half black?

• The 43 million over age 65? Only 16% over age 65 with income under $50k have a smartphone.

• The 14 million disabled?

• The 4 million just born last year?

• The 2.6 million that died last year?

Page 6

2. Start with a million, or end with a million?

• Keeping it sticky and useful?

Page 7

3. Active participants

• If data returned to participants, will they alter their behavior and exposures?

• Can we tell they are doing this?

Page 8

4. Not enough power

• So must think early about downstream validation studies.

• Leave one sub-cohort out cross-validation?

• Or are you testing whether every individual gets something out of the approach?
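
One way to read "leave one sub-cohort out cross-validation" is sketched below: each sub-cohort is held out in turn, a model is fit on the rest, and error is measured on the held-out group. This is only an illustrative sketch, not a method taken from the talk. The site labels, outcome values, and the mean-of-training-outcomes "model" are all hypothetical stand-ins.

```python
from collections import defaultdict
from statistics import mean


def leave_one_subcohort_out(subcohorts):
    """Yield (held_out_label, train_idx, test_idx), holding out one sub-cohort at a time."""
    by_label = defaultdict(list)
    for i, label in enumerate(subcohorts):
        by_label[label].append(i)
    for label in sorted(by_label):
        test_idx = by_label[label]
        train_idx = [i for i, other in enumerate(subcohorts) if other != label]
        yield label, train_idx, test_idx


def evaluate(subcohorts, outcomes):
    """Mean absolute error per held-out sub-cohort, using a trivial stand-in model."""
    errors = {}
    for label, train_idx, test_idx in leave_one_subcohort_out(subcohorts):
        # Stand-in "model": predict the mean outcome of the training sub-cohorts.
        prediction = mean(outcomes[i] for i in train_idx)
        errors[label] = mean(abs(outcomes[i] - prediction) for i in test_idx)
    return errors


# Hypothetical toy data: three recruitment sites, two participants each.
subcohorts = ["site_a", "site_a", "site_b", "site_b", "site_c", "site_c"]
outcomes = [1.0, 3.0, 2.0, 4.0, 6.0, 8.0]

errors = evaluate(subcohorts, outcomes)
for label, err in errors.items():
    print(f"held out {label}: mean absolute error {err}")
```

A large error on one held-out group (here the hypothetical site_c) would suggest that findings from the other sub-cohorts do not transfer to it, which is the kind of signal a downstream validation study would then need to confirm.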

Page 9

5. If done right, reproducibility won’t matter

Page 10

6. Exploit the network effect

• Connect cohort and data to others, to gain synergy

• Need methods to connect data sets, keep confidentiality

• Not just academic cohorts, also pharma trials?

• Maybe recruit at the end of a trial, and gather starting data from pharma and contract research organizations.

• Maybe start from the 35 million discharged from a hospital last year?

• Maybe work with Quest and LabCorp to get existing lab data on patients
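
As a rough illustration of connecting data sets while limiting exposure of identifiers, the sketch below joins two record sets on a keyed hash (HMAC-SHA256) of normalized name and birth date, so the linked output carries no direct identifiers. This is one possible approach, not one named in the talk. The field names, the shared key, and the records are hypothetical, and a keyed hash alone does not guarantee confidentiality (anyone holding the key can re-derive tokens).

```python
import hashlib
import hmac

IDENTIFYING_FIELDS = ("name", "birth_date")  # hypothetical field names


def link_token(secret_key: bytes, name: str, birth_date: str) -> str:
    """Derive a pseudonymous join key from normalized identifiers."""
    normalized = f"{name.strip().lower()}|{birth_date.strip()}"
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def tokenise(records, secret_key):
    """Map token -> record with the identifying fields dropped."""
    return {
        link_token(secret_key, r["name"], r["birth_date"]): {
            k: v for k, v in r.items() if k not in IDENTIFYING_FIELDS
        }
        for r in records
    }


def link(left, right, secret_key):
    """Join two record sets on the token; output rows hold no names or birth dates."""
    a, b = tokenise(left, secret_key), tokenise(right, secret_key)
    return [{**a[t], **b[t]} for t in sorted(a.keys() & b.keys())]


# Hypothetical example data and key.
SECRET_KEY = b"hypothetical-shared-secret"
cohort = [
    {"name": "Ada Example", "birth_date": "1970-01-01", "enrolled": 2015},
    {"name": "Bo Sample", "birth_date": "1982-06-30", "enrolled": 2016},
]
labs = [{"name": " ada example ", "birth_date": "1970-01-01", "ldl_mg_dl": 130}]

linked = link(cohort, labs, SECRET_KEY)
print(linked)
```

In practice each data holder would compute tokens on its own side and share only tokens plus non-identifying fields. Exact-match hashing also misses records with typos or name changes, which is part of why better linkage methods are needed.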

Page 11

7. Success of the effort depends on 3rd party usage

• Needs to be easy to access and understand data without you.

• Easy to build useful tools and mashup data.

• Shouldn’t have to hire an insider or expert to understand the data.

• Of course, the cloud and all modern commercial tools and services should be allowed.

• Put real money into dissemination, do not assume this will happen correctly.

• Beyond data sharing agreements

• Difference between Genome and ENCODE

Page 12

8. Perfection is the enemy of the good

• Perfection delays data release.

• You won’t always make the right choices.

• Keep simple things simple (e.g. API), but complex things possible (e.g. downloading).

• Let others in to access the data, build tools, and create alternative representations.

Page 13

9. Data gets stale

• 1500 papers at Nucleic Acids Research on open databases!

• Even reference data sets get stale.

• Will soon be a struggle to get eyes on this data set.

• Shelf life from technologies, from measurements. Freshen data.

• Framingham Heart Study has great data on dbGaP. Why aren’t you using it now?

Page 14
Page 15

In August, I unveiled the Cancer Genome Anatomy Project -- the comprehensive clearinghouse of information about tens of thousands of cancer genes, which will enable scientists and researchers around the world to work together through a website available on the Internet, to bring us closer to a cure.

-- Al Gore, 1998

Page 16
Page 17

Page 18

10. Leave some interesting questions open for others

• Don’t shoot for a whole issue of Science or Nature that tries to answer everything about a million people.

• Leave some of the Nature papers for others.

• The real value of this data set will be in the questions others can see being asked and answered.

• Great success stories already with Geisinger, Million Veterans, and many more.

• Create something here that cannot be done by the academic, medical, and private world.