Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Tomasz Adamusiak MD PhD, Froedtert & Medical College of Wisconsin

Presented at Epic's Research Advisory Council, April 3, 2014, Verona, WI. See a novel approach to query expansion based on pre-existing structured information within the EHR. Presenters adopted over-representation analysis to find statistically significant associations among the clinical terms extracted from Clarity reports. The study population consisted of over 7,000 patients and their 12 million observations, including labs, medications, phenotypes, diseases, and procedures. See the detailed findings and discuss computational and terminology challenges.

Transcript of Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Page 1: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Tomasz Adamusiak MD PhD

Froedtert & Medical College of Wisconsin

Page 2: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Page 3: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Conflict of interest disclosure

Tomasz Adamusiak has no real or apparent conflicts of interest to report

Page 4: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Learning objectives

• Recognize the value of structured clinical information

• Identify computational and terminology challenges in big data analytics

• Evaluate how this approach applies to different use cases

Page 5: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

What is a grouper?

Lists of specific values derived from standard vocabularies used to define clinical concepts, e.g. patients with diabetes

• SNOMED CT concepts

• ICD-9/10 codes

• EDG terms

• CQM Value Sets
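
A grouper of this kind can be pictured as a set of (code system, code) pairs. The sketch below is only an illustration: the `Code` type, the `DIABETES_GROUPER` name, and the `in_grouper` helper are assumptions made for this example, and the three codes are the ones listed in the CMS131v2 value set excerpt in this deck.

```python
from typing import NamedTuple


class Code(NamedTuple):
    system: str  # vocabulary name, e.g. "ICD10CM"
    value: str   # code within that vocabulary


# Hypothetical grouper for "patients with diabetes", built from the three
# codes shown in the value set excerpt (not the complete value set).
DIABETES_GROUPER = frozenset({
    Code("SNOMEDCT", "190330002"),
    Code("ICD9CM", "250"),
    Code("ICD10CM", "E10.10"),
})


def in_grouper(patient_codes, grouper):
    """Return True if any of the patient's codes belongs to the grouper."""
    return any(code in grouper for code in patient_codes)


# Hypothetical patient with one diabetes code and one unrelated code.
patient = [Code("ICD10CM", "E10.10"), Code("ICD10CM", "R21")]
print(in_grouper(patient, DIABETES_GROUPER))  # True
```

A static list like this only matches codes that were enumerated in advance, which is the limitation the dynamic approach in the following slides tries to address.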

Page 6: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Diabetes: Eye Exam, CMS eMeasure: CMS131v2

Value Set Name: Diabetes

Type: Grouping

Steward: National Committee for Quality Assurance

Program: CMS, MU2 EP Update 2013-06-14

Code | Description | Code System
… | … | …
190330002 | Diabetes mellitus, juvenile type, with hyperosmolar coma (disorder) | SNOMEDCT
250 | Diabetes mellitus without mention of complication, type II or unspecified type, not stated as uncontrolled | ICD9CM
E10.10 | Type 1 diabetes mellitus with ketoacidosis without coma | ICD10CM

Page 7: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Mining associations in EHR data

Glucohemoglobin measurement | Diabetes mellitus: Yes | Diabetes mellitus: No
Yes | 1509 | 5442
No | 881 | 99

Positive association

Background reference
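
A 2x2 table like the one on this slide can be filled by counting patients, not observations. The following is a minimal sketch under stated assumptions: each patient is represented as a set of concept identifiers, the `contingency_counts` helper is hypothetical, and the four toy patients are made up for illustration (only the concept identifiers come from this deck).

```python
def contingency_counts(patients, term, condition):
    """Count patients in each cell of a 2x2 table.

    patients: dict mapping a patient id to the set of concepts recorded
    for that patient. Returns (both, term_only, condition_only, neither).
    """
    both = term_only = condition_only = neither = 0
    for concepts in patients.values():
        has_term = term in concepts
        has_condition = condition in concepts
        if has_term and has_condition:
            both += 1
        elif has_term:
            term_only += 1
        elif has_condition:
            condition_only += 1
        else:
            neither += 1
    return both, term_only, condition_only, neither


# Toy data, not study data. C0011849 = Diabetes mellitus,
# C0202054 = Glucohemoglobin measurement, C0015230 = Exanthema.
patients = {
    "p1": {"C0011849", "C0202054"},
    "p2": {"C0202054"},
    "p3": {"C0011849"},
    "p4": {"C0015230"},
}
print(contingency_counts(patients, "C0202054", "C0011849"))  # (1, 1, 1, 1)
```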

Page 8: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Dynamic = expansion + association

CPT-4 83036

ICD10 E08-E13

Page 9: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Extract-Load-Transform

Page 10: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Transformation in ClinMiner

https://clinminer.hmgc.mcw.edu (user: epicdemo, pass: epicdemo)

This image by Tomasz Adamusiak is licensed under a CC BY 3.0 US license

ClinMiner is non-commercial prototype software

Page 11: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Pilot: test all possible diabetes associations

8k patients

12M observations

Labs (CPT-4/LOINC)

Medications (RxNorm)

Problems (ICD-9)

Procedures (CPT-4)

18 764 terms

162 significant associations

Page 12: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Summarize, but normalize per patient

1 + 1 = 1

Parent Concepts

ICD-10-CM
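
One reading of "1 + 1 = 1" is that two codes under the same parent, recorded for the same patient, should count that patient once. The sketch below illustrates that reading only. It assumes the parent concept of an ICD-10-CM code is its three-character category; the rows and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (patient_id, ICD-10-CM code).
observations = [
    ("p1", "E10.10"),
    ("p1", "E10.9"),   # second code under the same parent, same patient
    ("p2", "E11.9"),
]


def category(icd10cm_code):
    """Three-character ICD-10-CM category, used here as the parent concept."""
    return icd10cm_code.split(".")[0]


def patients_per_parent(rows):
    """Count distinct patients per parent concept."""
    patients = defaultdict(set)
    for patient_id, code in rows:
        patients[category(code)].add(patient_id)
    return {parent: len(ids) for parent, ids in patients.items()}


counts = patients_per_parent(observations)
print(counts)  # {'E10': 1, 'E11': 1}
```

Collecting patient ids in a set is what keeps patient p1 from being counted twice under E10.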

Page 13: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Relatively straightforward in ICD

Parent Concepts

ICD-10-CM

Page 14: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Caveat: flat hierarchy results in disconnected clinical contexts

Q: All tuberculosis codes

• 010-018.99 TUBERCULOSIS

• 137 Late effects of tuberculosis

• 647.3 Tuberculosis complicating pregnancy childbirth or the puerperium
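
The caveat can be made concrete with a range check. This is a minimal sketch, assuming purely numeric ICD-9-CM diagnosis codes; the helper names are hypothetical and "011.9" is used only as an example of a code inside the 010-018 block.

```python
def icd9_category(code):
    """Numeric three-digit category of an ICD-9-CM diagnosis code.

    Assumes a purely numeric code such as "011.9"; V and E codes are
    not handled in this sketch.
    """
    return int(code.split(".")[0])


def in_tb_range(code):
    """True if the code falls in the 010-018 tuberculosis block."""
    return 10 <= icd9_category(code) <= 18


for code in ["011.9", "137", "647.3"]:
    print(code, in_tb_range(code))
# 011.9 True
# 137 False
# 647.3 False
```

A query that expands only the 010-018 block would miss 137 and 647.3, even though both describe tuberculosis in a different clinical context.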

Page 15: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Expansion has to take into account multiple inheritance in SNOMED CT

SNOMED CT

Parent Concepts
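
With multiple inheritance, a concept can reach the same ancestor along more than one path, so an expansion has to remember what it has already visited. The sketch below shows one way to do that with a breadth-first walk. The hierarchy is a hypothetical toy, not actual SNOMED CT content, and the `ancestors` helper is an assumption made for illustration.

```python
from collections import deque

# Hypothetical is-a edges (child -> parents); not real SNOMED CT content.
PARENTS = {
    "diabetic retinopathy": ["retinal disorder", "diabetic complication"],
    "retinal disorder": ["eye disorder"],
    "diabetic complication": ["disorder"],
    "eye disorder": ["disorder"],
}


def ancestors(concept, parents):
    """All ancestors of a concept, each reported once."""
    seen = set()
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:  # skip ancestors reached by another path
                seen.add(parent)
                queue.append(parent)
    return seen


print(sorted(ancestors("diabetic retinopathy", PARENTS)))
# ['diabetic complication', 'disorder', 'eye disorder', 'retinal disorder']
```

Here "disorder" is reachable through both parents but appears once in the result.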

Page 16: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Pieter Brueghel the Elder (1526/1530–1569) [Public domain], via Wikimedia Commons

In pursuit of a single language

Page 17: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Integrating terminologies with UMLS

Donald A.B. Lindberg, M.D.

Clinical

Terminologies

UMLS

Page 18: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

UMLS is ideal for integration of heterogeneous clinical data

• Single entry point to MU terminologies

• Cross-walk between MU terms

• Terminology-agnostic

• Text-mining

Page 19: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

UMLS establishes equivalence mappings across biomedical terminologies

UMLS: Exanthema (C0015230)

ICD-10-CM: rash NOS (ICD-10:R21)

SNOMED CT: Cutaneous eruption (SCT:112625008)

SNOMED CT: Eruption (SCT:1806006)

Page 20: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

UMLS establishes equivalence mappings across biomedical terminologies

UMLS: Exanthema (C0015230)

SNOMED CT: Cutaneous eruption (SCT:112625008)

ICD-10-CM: rash NOS (ICD-10:R21)

SNOMED CT: Eruption (SCT:1806006)
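
The equivalence mapping on this slide can be sketched as a lookup through the shared UMLS concept identifier. This is an illustration only: it uses an in-memory dictionary with the three codes shown on the slide, not the UMLS distribution files or any UMLS API, and the `crosswalk` helper is hypothetical.

```python
# Source-code-to-CUI assignments as shown on the slide.
CODE_TO_CUI = {
    ("ICD-10", "R21"): "C0015230",
    ("SCT", "112625008"): "C0015230",
    ("SCT", "1806006"): "C0015230",
}


def crosswalk(system, code, target_system, code_to_cui=CODE_TO_CUI):
    """Codes in target_system that share a UMLS concept with (system, code)."""
    cui = code_to_cui.get((system, code))
    if cui is None:
        return []  # unknown source code: nothing to map
    return sorted(
        other_code
        for (other_system, other_code), other_cui in code_to_cui.items()
        if other_cui == cui and other_system == target_system
    )


print(crosswalk("ICD-10", "R21", "SCT"))  # ['112625008', '1806006']
```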

Page 21: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

6° of terminological Kevin Bacon

Acute myocardial infarction

Myocardial ischemia

Vascular Diseases

Disorder of soft tissue

Collagen Diseases

Connective Tissue Diseases

Epidermal and dermal conditions

Skin and subcutaneous tissue disorders

Dermatologic disorders

Page 22: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Expansion limited to MU terminologies and by semantic type

Finding

Disease or Syndrome

Ignore

Page 23: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Open issue: cycles due to subtle differences in meaning

Immune System

Endocrine System
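
One way to surface such cycles before expansion is a depth-first search over the merged hierarchy. The sketch below assumes the hierarchy is held as a mapping from each concept to its broader concepts; the `find_cycle` helper and the placeholder concept names are hypothetical and do not reproduce the actual relations behind this slide.

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes, or None if there is none.

    graph: dict mapping a concept to the list of its broader concepts.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:  # back edge: nxt is already on the path
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Placeholder concepts: two sources disagree on which term is broader.
merged = {"concept A": ["concept B"], "concept B": ["concept A"]}
print(find_cycle(merged))  # ['concept A', 'concept B', 'concept A']
```

The recursion is fine for a toy graph; a full terminology would likely need an iterative version to stay within Python's recursion limit.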

Page 24: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Expansion in UMLS across MU sources

Diabetes mellitus without mention of complication, type II or unspecified type, not stated as uncontrolled

ICD-9

ICD-10

SNOMED CT

NDF-RT

Situation with explicit context

Metabolic diseases

roots:

Page 25: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Statistical methods for establishing over/under-representation

• Serial contingency tables

• Chi-squared test with Bonferroni correction

• RR estimate of effect size

• Test diabetes in all 18 764 concept pairs
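
The listed steps can be sketched for a single 2x2 table as follows. This is a minimal illustration, not the presenters' implementation: it uses the Pearson chi-squared statistic without continuity correction, a one-degree-of-freedom p-value, and a Bonferroni threshold that assumes 18 764 tests at an overall alpha of 0.05. The counts are made up for the example and are not study data.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the table [[a, b], [c, d]].

    No continuity correction; assumes no row or column total is zero.
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_one_dof(chi2):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def relative_risk(a, b, c, d):
    """Risk in the first row divided by risk in the second row.

    Rows: term present / absent. Columns: condition present / absent.
    """
    return (a / (a + b)) / (c / (c + d))


N_TESTS = 18_764              # number of terms tested in the pilot
ALPHA = 0.05                  # assumed overall significance level
threshold = ALPHA / N_TESTS   # Bonferroni-corrected per-test threshold

# Illustrative counts only (not from the study).
a, b, c, d = 80, 20, 20, 80
chi2 = chi_squared_2x2(a, b, c, d)
p = p_value_one_dof(chi2)
rr = relative_risk(a, b, c, d)
print(f"chi2={chi2:.1f} RR={rr:.1f} significant={p < threshold}")
```

A relative risk above 1 suggests over-representation of the term among patients with the condition, and below 1 under-representation; the Bonferroni threshold controls for running the same test across every term.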

Page 26: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

EHR-based association rule mining

Glucohemoglobin measurement (C0202054) | Diabetes mellitus (C0011849): Yes | Diabetes mellitus (C0011849): No
Yes | 1509 | 5442
No | 881 | 99

Positive association

Background reference

Page 27: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Other positive associations

• C0785704 Blood glucose monitoring equipment

• C0935929 Antidiabetics

• C0304870 Insulin, Long-Acting

• C0770893 Metformin hydrochloride

• C0011882 Diabetic Neuropathies

• C0011880 Diabetic Ketoacidosis

• C0011884 Diabetic Retinopathy

Expansion generalization on class or system level

Page 28: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

A non-representative control background can bias the findings

Diabetes inversely associated with

• C1314183 Special EEG tests

• C0242953 Barbiturate hypnotic

• C0064636 lamotrigine

• C1719410 Epilepsy and recurrent seizures

Page 29: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Open issue: reconciling lab orders with results

Clinical Laboratory

Hemoglobin A1c/Hemoglobin.total in Blood by HPLC

LOINC:17856-6

Hemoglobin; glycosylated (A1C)

CPT-4:83036

Page 30: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Challenges

• Availability of correctly and exhaustively coded data

• Expansion with multiple inheritance memory intensive

• Testing all possible (180M) combinations computationally expensive
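
A figure near 180M is consistent with counting every unordered pair among the 18 764 terms from the pilot. That this is how the number was derived is an assumption; the arithmetic itself is shown below.

```python
import math

N_TERMS = 18_764  # number of terms reported for the pilot

# Unordered pairs of distinct terms: n * (n - 1) / 2
unordered_pairs = math.comb(N_TERMS, 2)
print(f"{unordered_pairs:,}")  # 176,034,466
```

Testing one concept against all others needs on the order of 18 764 tables, while testing all pairs needs roughly 176 million, which is where the computational cost comes from.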

Page 31: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

What can we learn from other industries?

Page 32: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Thank You!

Tomasz Adamusiak MD PhD

Human and Molecular Genetics Center

Medical College of Wisconsin

[email protected]

@7omasz

For more information

• Next-generation phenotyping using the Unified Medical Language System (UMLS). Adamusiak T, Shimoyama N, Shimoyama M, JMIR Med Inform. doi:10.2196/medinform.3172

• EHR-based phenome wide association study in pancreatic cancer. Adamusiak T, Shimoyama M, AMIA Summits Transl Sci Proc. 2014 (in press)