Evaluation considerations for EHR-based phenotyping algorithms: A case study for Drug Induced Liver Injury
Casey Overby, Chunhua Weng, Krystl Haerian, Adler Perotte, Carol Friedman, George Hripcsak
Department of Biomedical Informatics
Columbia University
AMIA TBI paper presentation, March 20th, 2013
Published genome-wide associations through 09/2011: 1,596 published GWA at p ≤ 5×10⁻⁸ for 249 traits
NHGRI GWA Catalog: www.genome.gov/GWAStudies
Success due in part to GWAS consortia that obtain the needed sample sizes
Background and Motivation
There are added challenges to studying pharmacological traits
Drug response is complex, e.g., risk factors in the pathogenesis of drug-induced liver injury (DILI)
Sample sizes are small compared to typical association studies of common disease (adverse drug events, responder types)
Source: Tujios & Fontana. Nat. Rev. Gastroenterol. Hepatol. 2011
Consortium recruitment approaches
Protocol-driven recruitment: recruit and phenotype participants prospectively
EHR phenotyping: electronic health records (EHR) linked with DNA biorepositories
Successes developing EHR algorithms within eMERGE
Type 2 diabetes, peripheral arterial disease, atrial fibrillation, Crohn disease, multiple sclerosis, rheumatoid arthritis
High PPV
(Kho et al. JAMIA 2012; Kullo et al. JAMIA 2010; Ritchie et al. AJHG 2010; Denny et al. Circulation 2010; Peissig et al. JAMIA 2012)
Source: www.phekb.org
Unique characteristics of DILI
Rare condition of low prevalence
Complex condition: a drug is the causal agent of liver injury; different drugs can cause DILI; the pattern of liver injury varies between drugs
Pattern of liver injury is based on liver enzyme elevations; no tests confirm drug causality (though some assessment tools exist)
High PPV may be challenging
Why study DILI?
DILI accounts for up to 15% of acute liver failure cases in the U.S., of which 75% require liver transplant or lead to death
Most frequent reason for withdrawal of approved drugs from the market
Underlying mechanisms of DILI are poorly understood
Computerized approaches can reduce the burden of traditional approaches to screening for rare conditions (Jha AK et al. JAMIA 1998; Thadani SR et al. JAMIA 2009)
Overview of EHR phenotyping process
Case definition (e.g., liver injury)
(Re-)Design EHR phenotyping algorithm (e.g., ICD-9 codes for acute liver injury, decreased liver function labs)
Implement EHR phenotyping algorithm
Evaluate EHR phenotyping algorithm
Disseminate EHR phenotyping algorithm
If the algorithm needs improvement, return to (re-)design; if the algorithm is sufficient to be useful, disseminate.
Methods and Results
Overview of methods to develop & evaluate the initial algorithm
DILI case definition (iSAEC)
Design EHR phenotyping algorithm
Implement EHR phenotyping algorithm
Evaluate EHR phenotyping algorithm
Disseminate EHR phenotyping algorithm
Alongside the above: develop an evaluation framework; report lessons learned. Lessons inform evaluator approach and algorithm design changes.
Lessons learned
DILI case definition
1. Liver injury diagnosis (A1)
   a. Acute liver injury (C1–C4)
   b. New liver injury (B)
2. Caused by a drug
   a. New drug (A2)
   b. Not by another disease (D)
Initial DILI EHR phenotyping algorithm (applied to the clinical data warehouse; any "no" answer excludes the patient):
A1. Diagnosed with liver injury?
B. New liver injury? (consider chronicity)
C1–C4. Laboratory criteria: ALP ≥ 2× ULN (C1), ALT ≥ 5× ULN (C2), or ALT ≥ 3× ULN (C3) with bilirubin ≥ 2× ULN (C4)
A2. Exposure to a drug?
D. Other diagnoses? (if yes, exclude)
Successive filters narrowed the cohort from 18,423 to 13,972 to 2,375 to 1,264 and finally to 560 patients meeting drug-induced liver injury criteria.
Ref: Aithal GP, et al. Case Definition and Phenotype Standardization in Drug-Induced Liver Injury. Clin Pharmacol Ther. 2011 Jun;89(6):806-15.
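The criteria cascade above can be sketched in code. This is a minimal illustration, not the study's implementation; the record fields (`liver_injury_dx`, `alt_x_uln`, etc.) are hypothetical, with lab values expressed as multiples of the upper limit of normal (ULN).

```python
# Minimal sketch of the initial DILI phenotyping filter (illustrative only).
# Lab values are expressed as multiples of the upper limit of normal (ULN);
# all field names are hypothetical stand-ins for clinical data warehouse data.

def meets_lab_criteria(rec):
    """C1-C4: liver enzyme elevations per the Aithal et al. case definition."""
    return (rec["alp_x_uln"] >= 2             # C1: ALP >= 2x ULN
            or rec["alt_x_uln"] >= 5          # C2: ALT >= 5x ULN
            or (rec["alt_x_uln"] >= 3         # C3: ALT >= 3x ULN ...
                and rec["bili_x_uln"] >= 2))  # C4: ... with bilirubin >= 2x ULN

def meets_dili_criteria(rec):
    """Apply the A1 -> B -> C1-C4 -> A2 -> D cascade; any failed step excludes."""
    return (rec["liver_injury_dx"]            # A1: diagnosed with liver injury
            and rec["new_liver_injury"]       # B:  new (not chronic) injury
            and meets_lab_criteria(rec)       # C1-C4: enzyme elevations
            and rec["new_drug_exposure"]      # A2: exposure to a new drug
            and not rec["other_diagnosis"])   # D: no competing diagnosis

patients = [
    {"liver_injury_dx": True, "new_liver_injury": True, "alp_x_uln": 1.0,
     "alt_x_uln": 6.2, "bili_x_uln": 1.1, "new_drug_exposure": True,
     "other_diagnosis": False},
    {"liver_injury_dx": True, "new_liver_injury": False, "alp_x_uln": 2.4,
     "alt_x_uln": 1.0, "bili_x_uln": 0.8, "new_drug_exposure": True,
     "other_diagnosis": False},
]
cohort = [p for p in patients if meets_dili_criteria(p)]
print(len(cohort))  # 1: the first patient passes, the second fails step B
```

In the real pipeline each step ran as a query against the clinical data warehouse, so the cohort shrank at each stage rather than being filtered in memory.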
Initial algorithm results: 100 patients randomly selected for manual review from the 560 flagged
Estimated positive predictive value: TP = 27, FP = 42, NA = 30
PPV = TP/(TP+FP) = 27/(27+42) = 39%
Preliminary kappa coefficient: 0.50 (moderate agreement)
Interpretation of the PPV is unclear given only moderate agreement among reviewers
[Figure: chart review assignment. The 100 records were divided into five sets of 20 and distributed among Reviewers 1–4.]
Included measurement and demonstration studies
Measurement study goal: "to determine the extent and nature of the errors with which a measurement is made using a specific instrument" → evaluator effectiveness
Demonstration study goal: "establishes a relation – which may be associational or causal – between a set of measured variables" → algorithm performance
Definitions from: Friedman & Wyatt, Evaluation Methods in Medical Informatics, 2006
Methods and Results
Included quantitative and qualitative assessment
Quantitative data: inter-rater reliability assessment; PPV
Qualitative data: perceptions of evaluation approach effectiveness (e.g., review tool, artifacts reviewed); perceptions of benefit of results (e.g., correct for the case definition?)
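The quantitative measures can be reproduced with a short sketch. PPV follows directly from the TP/FP counts; Cohen's kappa is shown for two raters as a simplification of the four-reviewer design, and the rating lists below are invented for illustration (they are not the study's data).

```python
# Sketch of the two quantitative measures: PPV and Cohen's kappa.
from collections import Counter

def ppv(tp, fp):
    """Positive predictive value: fraction of flagged cases that are true."""
    return tp / (tp + fp)

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    return (observed - chance) / (1 - chance)

# PPV from the reported counts: 27 TPs and 42 FPs among reviewable cases.
print(round(ppv(27, 42), 2))  # 0.39

# Hypothetical labels for two raters, for illustration only.
a = ["TP", "TP", "FP", "FP", "TP", "FP", "TP", "FP"]
b = ["TP", "FP", "FP", "FP", "TP", "TP", "TP", "FP"]
print(round(cohens_kappa(a, b), 2))  # 0.5: moderate agreement
```

Kappa of 0.50 sits in the "moderate" band of the commonly used Landis & Koch scale, which is why the PPV estimate alone was considered hard to interpret.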
An evaluation framework and results
Measurement study (evaluator effectiveness):
  Quantitative results: kappa coefficient 0.50
  Qualitative results: perceptions of evaluation approach effectiveness (differences between evaluation platforms; visualizing lab values; availability of notes; discharge summary vs. other notes)
Demonstration study (algorithm performance):
  Quantitative results: TP: 27, FP: 42, NA: 30; PPV = TP/(TP+FP) = 39%
  Qualitative results: perceptions of benefit of results, i.e., themes in FPs (babies; patients who died; overdose patients; patients who had a liver transplant)
Lesson learned: what's correct for the algorithm may not be correct for the case definition
Are we measuring what we mean to measure?
Case definition: liver injury due to a medication, not another disease
Many FPs were transplant patients: correct for the algorithm, but their liver enzymes were elevated due to the procedure
Revision: exclude transplant patients
Improved algorithm design given themes in FPs
Added exclusions: babies; overdose patients; patients who died; transplant patients
("A collaborative approach to develop an EHR phenotyping algorithm for DILI", in preparation)
Lesson learned: evaluator effectiveness influences the ability to draw appropriate inferences about algorithm performance
How does our evaluation approach influence performance estimates?
Agreement among algorithm reviewers was only moderate, so interpretation of the PPV is unclear
Revision: improve the evaluator approach
Improved evaluator approach: consensus among 4 reviewers
Assign TP/FP status by:
1. First-pass review of the temporal relationship: assign preliminary TP, FP, or unknown status
2. Chart review: confirm suspected TPs; assign TP/FP where status was unknown in the first pass
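The two-pass procedure above might be sketched as follows. The status labels and the case fields (`enzyme_peak_after_drug_start`, `chart_supports_dili`, etc.) are hypothetical stand-ins for the reviewers' judgments, not the study's actual criteria.

```python
# Illustrative sketch of the two-pass review protocol (not the authors' code).
# Pass 1 uses only the temporal relationship between drug start and enzyme
# elevation; pass 2 (chart review) confirms suspected TPs and resolves unknowns.

def first_pass(case):
    """Preliminary status from the drug/lab temporal relationship alone."""
    if case["enzyme_peak_after_drug_start"] and case["plausible_latency"]:
        return "TP?"        # suspected true positive, to be confirmed
    if not case["enzyme_peak_after_drug_start"]:
        return "FP?"        # injury predates the exposure
    return "unknown"        # temporal picture alone is inconclusive

def second_pass(case, preliminary):
    """Chart review: confirm suspected TPs; assign TP/FP to unknowns."""
    if preliminary in ("TP?", "unknown"):
        return "TP" if case["chart_supports_dili"] else "FP"
    return "FP"             # FP? cases stay false positives

def review(case):
    return second_pass(case, first_pass(case))

case = {"enzyme_peak_after_drug_start": True, "plausible_latency": True,
        "chart_supports_dili": True}
print(review(case))  # TP
```

Splitting the work this way lets a cheap first pass over lab timelines triage most cases, reserving full chart review for confirmation and for the ambiguous remainder.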
[Figure: example lab-value time-series plots for one patient (March to July 2012), showing ALP, ALT, and bilirubin values over time, as visualized during first-pass review of the temporal relationship.]
Summary of findings
Lessons learned from applying our evaluation framework:
What's correct for the algorithm may not be correct for the case definition (are we measuring what we mean to measure?)
Evaluator effectiveness influences the ability to draw appropriate inferences about algorithm performance
Demonstrated that our evaluation framework is useful: informed improvements in algorithm design and in the evaluator approach; likely most useful for rare conditions
Demonstrated that EHR phenotyping algorithm development is an iterative process; the complexity of the algorithm may influence this process
Acknowledgments
Dr. Yufeng Shen, Serious Adverse Event Consortium collaborator
eMERGE collaborators: Mount Sinai (Drs. Omri Gottesman, Erwin Bottinger, and Steve Ellis); Mayo Clinic (Drs. Jyotishman Pathak, Sean Murphy, Kevin Bruce, Stephanie Johnson, Jay Talwalker, Christopher G. Chute, Iftikhar J. Kullo); Northwestern (Dr. Abel Kho); Vanderbilt (Dr. Josh Denny)
DILIN collaborator: UNC-CH (Dr. Ashraf Farrag)
Columbia Training in Biomedical Informatics (NIH NLM #T15 LM007079)
The eMERGE network, U01 HG006380-01 (Mount Sinai)
Questions?
Casey L. Overby casey@dbmi.columbia.edu