Secondary Use of Healthcare Data for Translational Research

40
Secondary Use of Healthcare Data for Translational Research Shawn Murphy MD, Ph.D. Indivo Conference June 18, 2012

description

An overview of the i2b2 clinical research platform, and the implications of connecting Indivo to i2b2 as a source of patient-reported outcomes. Presented at the 2012 Indivo X Users' Conference.By Shawn Murphy MD, Ph.D., Partners Healthcare.

Transcript of Secondary Use of Healthcare Data for Translational Research

Page 1: Secondary Use of Healthcare Data for Translational Research

Secondary Use of Healthcare Data for Translational Research

Shawn Murphy MD, Ph.D. Indivo Conference June 18, 2012

Page 2: Secondary Use of Healthcare Data for Translational Research

Example: PPARγ Pro12Ala and Diabetes

0.1 0.2

0.3 0.4

0.5 0.6

0.7 0.8

0.9 1

1.1 1.2 Estimated risk

(Ala allele) 1.3 2.0

Deeb et al. Mancini et al.

Ringel et al.

Meirhaeghe et al.

Clement et al.

Hara et al.

Altshuler et al.

Hegele et al.

Oh et al.

Douglas et al.

All studies

Lei et al. Hasstedt et al.

1.4 1.5

1.6 1.7

1.8 1.9

Sample size

Ala is protective

Mori et al.

Overall P value = 2 x 10-7

Odds ratio = 0.79 (0.72-0.86)

Courtesy J. Hirschhorn

Page 3: Secondary Use of Healthcare Data for Translational Research

High Throughput Methods for supporting Translational Research

Set of patients is selected from medical record data in a high throughput fashion

Investigators explore phenotypes of these patients using i2b2 tools and a translational team developed to work specifically with medical record data

Distributed networks cross institutional boundaries for phenotype selection, public health, and hypothesis testing

Working with Indivo for advancement of patient participation in Translational Research

Page 4: Secondary Use of Healthcare Data for Translational Research

High Throughput Methods for supporting Translational Research

Set of patients is selected from medical record data in a high throughput fashion

Investigators explore phenotypes of these patients using i2b2 tools and a translational team developed to work specifically with medical record data

Distributed networks cross institutional boundaries for phenotype selection, public health, and hypothesis testing

Working with Indivo for advancement of patient participation in Translational Research

Page 5: Secondary Use of Healthcare Data for Translational Research

De-identified

Data Warehouse

1) Queries for aggregate patient numbers

0000004 2185793 ... ...

0000004 2185793 ... ...

2) Returns identified patient data

Z731984X Z74902XX ... ...

Real identifiers

Query construction in web tool

Encrypted identifiers

OR - Start with list of specific patients, usually from (1) - Authorized use by IRB Protocol - Returns contact and PCP information, demographics, providers, visits, diagnoses, medications, procedures, laboratories, microbiology, reports (discharge, LMR, operative, radiology, pathology, cardiology, pulmonary, endoscopy), and images into a Microsoft Access database and text files.

- Warehouse of in & outpatient clinical data - 6.0 million Partners Healthcare patients - 1.5 billion diagnoses, medications, procedures, laboratories, & physical findings coupled to demographic & visit data - Authorized use by faculty status - Clinicians can construct complex queries - Queries cannot identify individuals, internally can produce identifiers for (2)

Research Patient Data Registry exists at Partners Healthcare to find patient cohorts for clinical research

Page 6: Secondary Use of Healthcare Data for Translational Research

Organizing data in the Clinical Data Warehouse

Binary Tree

start search

Patient-Concept FACTS patient_key concept_key start_date end_date practitioner_key encounter_key

Patient DIMENSION patient_key patient_id (encrypted) sex age birth_date race

ZIP deceased

Concept DIMENSION concept_key concept_text search_hierarchy

Encounter DIMENSION encounter_key encounter_date

Pract . DIMENSION practitioner_key name service

hospital_of_service

value_type numeric_value textual_value abnormal_flag

Star schema

1500 million

.16 .06 150 6.0

Page 7: Secondary Use of Healthcare Data for Translational Research

Query items Person who is using tool

Query construction

Results - broken down by number distinct of patients

FINDING PATIENTS

Page 8: Secondary Use of Healthcare Data for Translational Research
Page 9: Secondary Use of Healthcare Data for Translational Research

High Throughput Methods for supporting Translational Research

Set of patients is selected from medical record data in a high throughput fashion

Investigators explore phenotypes of these patients using i2b2 tools and a translational team developed to work specifically with medical record data

Distributed networks cross institutional boundaries for phenotype selection, public health, and hypothesis testing

Working with Indivo for advancement of patient participation in Translational Research

Page 10: Secondary Use of Healthcare Data for Translational Research

The National Center for Biomedical Computing entitled Informatics for Integrating Biology and the Bedside (i2b2), what is it?

Software for explicitly organizing and transforming person-oriented clinical data to a way that is optimized for clinical genomics research Allows integration of clinical data, trials data, and genotypic data

A portable and extensible application framework Software is built in a modular pattern that allows additions without

disturbing core parts Available as open source at https://www.i2b2.org

Page 11: Secondary Use of Healthcare Data for Translational Research

i2b2 Cell: The Canonical Software Module

i2b2

HTTP XML (minimum: RESTful)

Business Logic

Data Access

Data Objects

Page 12: Secondary Use of Healthcare Data for Translational Research

An i2b2 Environment (the Hive) is built from i2b2 Cells

A B

C

“Hive” of software services provided by i2b2 cells

remote

local

observation_fact

PK Patient_NumPK Encounter_NumPK Concept_CDPK Observer_CDPK Start_DatePK Modifier_CDPK Instance_Num

End_Date ValType_CD TVal_Char NVal_Num ValueFlag_CD Observation_Blob

visit_dimension

PK Encounter_Num

Start_Date End_Date Active_Status_CD Location_CD*

patient_dimension

PK Patient_Num

Birth_Date Death_Date Vital_Status_CD Age_Num* Gender_CD* Race_CD* Ethnicity_CD*

concept_dimension

PK Concept_Path

Concept_CD Name_Char

observer_dimension

PK Observer_Path

Observer_CD Name_Char

1

∞∞

1

modifier_dimension

PK Modifier_Path

Modifier_CD Name_Char

observation_fact

PK Patient_NumPK Encounter_NumPK Concept_CDPK Observer_CDPK Start_DatePK Modifier_CDPK Instance_Num

End_Date ValType_CD TVal_Char NVal_Num ValueFlag_CD Observation_Blob

visit_dimension

PK Encounter_Num

Start_Date End_Date Active_Status_CD Location_CD*

patient_dimension

PK Patient_Num

Birth_Date Death_Date Vital_Status_CD Age_Num* Gender_CD* Race_CD* Ethnicity_CD*

concept_dimension

PK Concept_Path

Concept_CD Name_Char

observer_dimension

PK Observer_Path

Observer_CD Name_Char

1

∞∞

1

modifier_dimension

PK Modifier_Path

Modifier_CD Name_Char

observation_fact

PK Patient_NumPK Encounter_NumPK Concept_CDPK Observer_CDPK Start_DatePK Modifier_CDPK Instance_Num

End_Date ValType_CD TVal_Char NVal_Num ValueFlag_CD Observation_Blob

visit_dimension

PK Encounter_Num

Start_Date End_Date Active_Status_CD Location_CD*

patient_dimension

PK Patient_Num

Birth_Date Death_Date Vital_Status_CD Age_Num* Gender_CD* Race_CD* Ethnicity_CD*

concept_dimension

PK Concept_Path

Concept_CD Name_Char

observer_dimension

PK Observer_Path

Observer_CD Name_Char

1

∞∞

1

modifier_dimension

PK Modifier_Path

Modifier_CD Name_Char

observation_fact

PK Patient_NumPK Encounter_NumPK Concept_CDPK Observer_CDPK Start_DatePK Modifier_CDPK Instance_Num

End_Date ValType_CD TVal_Char NVal_Num ValueFlag_CD Observation_Blob

visit_dimension

PK Encounter_Num

Start_Date End_Date Active_Status_CD Location_CD*

patient_dimension

PK Patient_Num

Birth_Date Death_Date Vital_Status_CD Age_Num* Gender_CD* Race_CD* Ethnicity_CD*

concept_dimension

PK Concept_Path

Concept_CD Name_Char

observer_dimension

PK Observer_Path

Observer_CD Name_Char

1

∞∞

1

modifier_dimension

PK Modifier_Path

Modifier_CD Name_Char

Data Model

Data Repository Cell

Page 13: Secondary Use of Healthcare Data for Translational Research

I2b2 Software components are distributed as open source

Page 14: Secondary Use of Healthcare Data for Translational Research

Set of patients is selected through Enterprise Repository and data is gathered into a data mart

EDR

Selected patients

Data directly from EDR

Data from other sources

Data imported specifically for project

Automated Queries search for Patients and add Data

Project Specific

Phenotypic Data

Page 15: Secondary Use of Healthcare Data for Translational Research

Interogation can occur through i2b2 web client

Page 16: Secondary Use of Healthcare Data for Translational Research

Data is available through the i2b2 Workbench

Page 17: Secondary Use of Healthcare Data for Translational Research

RPDR

Final Project

DB

RPDR Mart

Local Clinical EDC

Local sources Ex: BICS

Project Manager Biostatistician Analyst

Local data extract analyst Programmer

RPDR Support Programmers

Team support for Projects

Page 18: Secondary Use of Healthcare Data for Translational Research

Natural Language Processing

HOSPITAL COURSE: ... It was recommended that she receive …We also added Lactinax, oral form of Lactobacillus acidophilus to attempt a repopulation of her gut.

SH: widow,lives alone,2 children,no tob/alcohol.

BRIEF RESUME OF HOSPITAL COURSE: 63 yo woman with COPD, 50 pack-yr tobacco (quit 3 wks ago), spinal stenosis, ...

SOCIAL HISTORY: Negative for tobacco, alcohol, and IV drug abuse.

SOCIAL HISTORY: The patient is a nonsmoker. No alcohol.

SOCIAL HISTORY: The patient is married with four grown daughters, uses tobacco, has wine with dinner. Smoker

Non-Smoker

SOCIAL HISTORY: The patient lives in rehab, married. Unclear smoking history from the admission note…

Past Smoker

Hard to pick

Hard to pick

???

Page 19: Secondary Use of Healthcare Data for Translational Research

NLP Workflow

NLP Specialists

I2b2 Project Investigators

Page 20: Secondary Use of Healthcare Data for Translational Research

Research Investigator Workflow enabled by mi2b2

Images Retrieved

from Clinical PACS

BIRN/XNAT

Use i2b2

Request Images with

Accession #’s

Query is done To find patients

Study Images

Derive new data from images

mi2b2

Page 21: Secondary Use of Healthcare Data for Translational Research

Builds complex “Custom Study” displays

Page 22: Secondary Use of Healthcare Data for Translational Research

Ontology-driven data organization allows simplistic data models that paste together

i2b2 DB Project 1

i2b2 DB Project 2

i2b2 DB Project 3

of Project 3

of Project 2

Shared data of Project 1

[ Enterprise Shared Data ]

Ontology

Consent/Tracking

Security

Page 23: Secondary Use of Healthcare Data for Translational Research

Custom views supports clinical trials in i2b2

A Patient-Centered View is welcomed by clinical researchers when they review patients for clinical trials

The Patient-Centered View can be focused to support specific needs of each clinical trial review

New Apps can be written to assist eligibility determination during the viewing of a patient

Page 24: Secondary Use of Healthcare Data for Translational Research

1.

2. SMART Connect

SMART REST

SMART Addition to i2b2

Page 25: Secondary Use of Healthcare Data for Translational Research

A “Patient-Centered View” enabled by SMART

Page 26: Secondary Use of Healthcare Data for Translational Research

Up close …

Page 27: Secondary Use of Healthcare Data for Translational Research

Patient-Centered View can be edited by user

Page 28: Secondary Use of Healthcare Data for Translational Research

High Throughput Methods for supporting Translational Research

Set of patients is selected from medical record data in a high throughput fashion

Investigators explore phenotypes of these patients using i2b2 tools and a translational team developed to work specifically with medical record data

Distributed networks cross institutional boundaries for phenotype selection, public health, and hypothesis testing

Working with Indivo for advancement of patient participation in Translational Research

Page 29: Secondary Use of Healthcare Data for Translational Research

Community Arizona State University Beth Israel Deaconness Hospital, Boston, MA Boston University School of Medicine, Boston, MA Brigham and Women's Hospital, Boston, MA Case Western Reserve Hospital Children's Hospital, Boston, MA (Denver) Children's Hospital, Denver, CO Children's Hospital of Philadelphia, PA Childrens's National Medical Center (GWU) Cincinnati Children's Hospital, Cincinnati, OH Cleveland Clinic, Cleveland, OH (Weil Medical College of) Cornell, NYC, NY Duke Medical College Group Health Cooperative Harvard Pilgrim Healthcare Harvard Medical School, Boston, MA Health Sciences South Carolina Kaiser Permanente Health Kimmel Cancer Center (Thomas Jefferson University) Massachusetts General Hospital, Boston, MA Maine Medical Center, Portland, ME Marshfield Clinic, Wisconsin Morehouse School of Medicine, Atlanta, GA Ohio State University Medical Center, Columbus, OH Oregon Health & Science University, Portland, OR Renaissance Computing Institute, Chapel Hill, NC South Carolina Clinical and Translational Research Institute Tufts Medical Center, Boston, MA University of Alabama University of Arkansas Medical School University of California Davis, Davis, CA University of California San Francisco, SF, CA University of Chicago University of Massachusetts Medical School, Worcester, MA University of Michigan Medical Center, Ann Arbor, MI University of Pennsylvania School of Medicine, Philadelphia, PA University of Rochester Medical Center, Rochester, NY University of Texas Health Sciences Center at Houston, Houston, TX University of Texas Health Sciences Center at San Antonio, SA, TX University of Texas Health Sciences Center Southwestern, Dallas, TX Utah Health Science Center, Salt Lake City, UT University of Washington, Seattle, WA University of Wisconsin Madison Veterans Administration Boston and Utah

Georges Pompidous Hospital, Paris, France Institute for Data Technology and Informatics (IDI), NTNU, Norway Karolinska Institute, Sweden University of Erlangen-Nuremberg, Germany University of Goettingen, Goettingen, Germany University of Leicester and Hospitals, England (Biomed. Res. Informatics Ctr. for

Clin. Sci) University of Pavia, Pavia, Italy University of Seoul, Seoul, Korea

United States International

Page 30: Secondary Use of Healthcare Data for Translational Research

Aggregating across 4 hospitals, 3 i2b2 instances SHRINE (Shared Research Informatics Network) = Distributed Queries

Page 31: Secondary Use of Healthcare Data for Translational Research

Clinical data in SHRINE

10 years (2001-2011) 4 hospitals 6 million total patients >1 billion medical observations

Demographics Diagnoses (ICD9-CM) Medications (RxNorm) Labs (LOINC)

Page 32: Secondary Use of Healthcare Data for Translational Research

2012

Page 33: Secondary Use of Healthcare Data for Translational Research
Page 34: Secondary Use of Healthcare Data for Translational Research

Query Health = I2b2 queries distributed to hospitals that may or may not be hosting i2b2

Page 35: Secondary Use of Healthcare Data for Translational Research

High Throughput Methods for supporting Translational Research

Set of patients is selected from medical record data in a high throughput fashion

Investigators explore phenotypes of these patients using i2b2 tools and a translational team developed to work specifically with medical record data

Distributed networks cross institutional boundaries for phenotype selection, public health, and hypothesis testing

Working with Indivo for advancement of patient participation in Translational Research

Page 36: Secondary Use of Healthcare Data for Translational Research

Integration of i2b2 and Indivo

INDIVO

PATIENT MATCH

DATA FILTER

PATIENT PUBLISH

Page 37: Secondary Use of Healthcare Data for Translational Research

Key new components

PATIENT MATCH

DATA FILTER

PATIENT PUBLISH

Allows patients permission to be attached to their records in i2b2

Allows only specified parts of the i2b2 record on the patient to be viewed by the patient

Allows patients to publish their outcomes to i2b2 = WRITE BACK TO i2b2

Page 38: Secondary Use of Healthcare Data for Translational Research

Advantages for Patients

Able to see data on themselves from medical record Important to learn lessons form Beth Israel – Microsoft Healthvault

venture

Integration can span across several institutions May be able to take advantage of SHRINE infrastructure

SMART will allow integrated view to focus on patient’s illnesses and concerns

Indivo may provide a way to return research results to patients

Page 39: Secondary Use of Healthcare Data for Translational Research

Advantages for Research

Indivo makes it possible to collect Patient Reported Outcomes Can allow engagement of patient for recruitment for Clinical

Trials

Page 40: Secondary Use of Healthcare Data for Translational Research

Collaborators RPDR

Eugene Braunwald John Glaser Diane Keogh Henry Chueh

I2b2 and SMART

Isaac Kohane Susanne Churchill Griffin Weber Michael Mendis Nich Wattanasin Vivian Gainer Lori Phillips Rajesh Kuttan Wensong Pan Janice Donahue William Simons (SHRINE) Andy McMurry (SHRINE) Doug McFadden (SHRINE) Ken Mandl (SMART) Josh Mandel (SMART)

Medical Imaging (mi2b2) Christopher Herrick David Wang Bill Wang

i2b2 Driving Biology Projects

Vivian Gainer Victor Castro Raul Guzman Robert Plenge Scott Weiss Stan Shaw John Brownstein Qing Zeng Guergana Savova

Query Health

Jeff Klann