COMPLEX QUESTION ANSWERING BASED ON A SEMANTIC DOMAIN MODEL OF CLINICAL MEDICINE Dina Demner-Fushman 24 August 2006

Page 1:

COMPLEX QUESTION ANSWERING BASED ON

A SEMANTIC DOMAIN MODEL OF CLINICAL MEDICINE

Dina Demner-Fushman

24 August 2006

Page 2:

Informed clinical decision making

Evidence Based Medicine

• Combine:
  – the best published medical research findings
  – clinical judgment
  – expertise and experience

• Use systematic approach:
  – Formulate specific and relevant questions
  – Know where to look for an answer
  – Answer questions iteratively

[Diagram labels: clinical state and circumstances; patient’s preferences; information resources]

Page 3:

Outline

• Motivation
• Evidence Based Medicine
• Hypotheses
• Clinical Question Answering system
• Evaluation
  – System components
    • extractors
    • Document re-ranking
  – Answers
    • Multi-tier answers
    • Best answers
• Contributions, Limitations, Future work

Page 4:

Real-life questions

• How do we diagnose prostatitis?
  – Interactive multi-level answers:
    Diagnosis of chronic abacterial prostatitis: evaluate infection, inflammation, biochemistry, ultrasonography …

• How much better is Amiodarone in controlling fast atrial fibrillation with rapid ventricular response compared to Cardizem?

• Zosyn dosage regimens: 3.375 g or 4.5 g?
  – Best answer:
    The usual total daily dose of Zosyn for adults is 3.375 g every six hours. Patients with nosocomial pneumonia should start with Zosyn at a dosage of 4.5 g …

Page 5:

Evidence search and appraisal

• Convert information needs into focused questions

• Track down the best evidence with which to answer them

• Identify “bottom-line” recommendations, supporting evidence, and its strength

Page 6:

Information sources

• Paper (books, desk references, journals): 50%
• Colleagues: 40%
• Clinical librarians or services: 32%
• Online Resources: 25%
  – Primary sources:
    Bibliographic databases (MEDLINE)
  – Secondary sources:
    Systematic reviews (Cochrane Collaboration, American College of Physicians Journal Club)
    Databases of expert answers to clinical questions (FPIN, BMJ Clinical Evidence)

Page 7:

Question frame

How much better is Amiodarone in controlling fast atrial fibrillation with rapid ventricular response compared to cardizem?

Task: Therapy (one of Etiology, Diagnosis, Therapy, Prognosis)

PICO:
Population: [unspecified]
Problem: Atrial fibrillation with rapid ventricular response
Intervention 1: Amiodarone
Intervention 2: Cardizem
Outcome: achieve rate control

Users' Guides to Evidence-based Medicine (JAMA series)
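The question frame on this slide can be pictured as a small record. The following is only a sketch under stated assumptions: the class name `QuestionFrame`, its field names, and the use of `None` for "[unspecified]" are illustrative choices and are not taken from the system described in the talk.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The four clinical tasks named on the slide.
TASKS = ("Etiology", "Diagnosis", "Therapy", "Prognosis")


@dataclass
class QuestionFrame:
    """Hypothetical container for a task plus PICO elements."""
    task: str
    problem: str
    interventions: List[str] = field(default_factory=list)
    outcome: Optional[str] = None
    population: Optional[str] = None  # None stands for "[unspecified]"

    def __post_init__(self) -> None:
        # Reject tasks that are not one of the four listed on the slide.
        if self.task not in TASKS:
            raise ValueError(f"unknown clinical task: {self.task!r}")


# The example question from the slide, filled in by hand.
frame = QuestionFrame(
    task="Therapy",
    problem="Atrial fibrillation with rapid ventricular response",
    interventions=["Amiodarone", "Cardizem"],
    outcome="achieve rate control",
)
```

The second intervention plays the role of the comparison in PICO, which is why the sketch keeps the interventions in an ordered list.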

Page 8:

Strength of Evidence

• A
  – Meta-Analysis
  – Randomized Controlled Trials

• B
  – Cross-Sectional Studies
  – Retrospective Studies

• C
  – Case Report
  – Animal Studies
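The grading above amounts to a lookup from study type to grade. A minimal sketch, assuming lower-cased singular study-type strings as keys; the function name `strength_of_evidence` and the `"unknown"` fallback are illustrative and not part of the slides.

```python
# Study types and grades exactly as grouped on the slide.
SOE_GRADE = {
    "meta-analysis": "A",
    "randomized controlled trial": "A",
    "cross-sectional study": "B",
    "retrospective study": "B",
    "case report": "C",
    "animal study": "C",
}


def strength_of_evidence(study_type: str) -> str:
    """Return the slide's grade, or 'unknown' for study types it does not list."""
    return SOE_GRADE.get(study_type.strip().lower(), "unknown")
```

For example, `strength_of_evidence("Randomized Controlled Trial")` yields `"A"`, matching the "A (RCT)" label used in the semantic processing example.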

Page 9:

Hypotheses

• Document frames based on three main EBM components (clinical task, PICO, SoE) are sufficient to answer questions

– Document frames could be generated using a hybrid statistical/knowledge-based approach to leverage existing resources

– Complex clinical questions could be answered through semantic matching of the question-document frames

Page 10:

Clinical QA system architecture

[Architecture diagram. Labels on the slide:
  – PICO Query Formulation (query terms)
  – PubMed Document Retrieval (E-Utilities, MEDLINE, citations)
  – Entities & relations annotation (MetaMap, SemRep, annotated citations)
  – Semantic processing: Knowledge Extraction, Clinical Task Classification, Strength of Evidence Classification
  – Semantic matching (question frame, document frame)
  – Answer Generation (answer)
  – Resources: UMLS, EBM Domain Model]

Page 11:

Component architecture

[Component diagram. Labels on the slide:
  – Components: Question Processing, Search Engine Wrapper (ESearch, EFetch, MEDLINE), MetaMap Wrapper, Semantic processor, Semantic Matcher, Answer Generator
  – Inside the Semantic processor: Task Classifier, Strength of Evidence Classifier, Problem Extractor, Population Extractor, Intervention Extractor, Outcome Extractor
  – Data passed between components: Query, Citations, Annotated Document, Question frame, Document frame, Answer]
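The diagram names ESearch and EFetch, the two NCBI E-Utilities calls for searching PubMed and fetching citations. A sketch of how a search engine wrapper might build the request URLs, without sending them: the helper names, the example query string, and the PMIDs are placeholders chosen for illustration, and the talk does not say how its queries were actually formed.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-Utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that asks PubMed for PMIDs matching `term`."""
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_url(pmids, retmode: str = "xml") -> str:
    """Build an EFetch URL that retrieves the citations for the given PMIDs."""
    params = {
        "db": "pubmed",
        "id": ",".join(str(p) for p in pmids),
        "retmode": retmode,
    }
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


# Placeholder query built from the example question's PICO terms.
search = esearch_url("atrial fibrillation AND amiodarone AND diltiazem")
# Placeholder PMIDs; real ones would come from the ESearch response.
fetch = efetch_url([123, 456])
```

Keeping URL construction separate from the network call makes the wrapper easy to test offline.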

Page 12:

Semantic processing example

[Diagram: Semantic processor with Task Classifier, Strength of Evidence Classifier, Problem Extractor, Population Extractor, Intervention Extractor, Outcome Extractor]

… Patients with atrial fibrillation (n = 57), … were randomly assigned to one of three intravenous treatment regimens.

Amiodarone versus diltiazem for rate control in critically ill patients with atrial tachyarrhythmias.

Group 1 received diltiazem … group 2 received amiodarone ….

Sufficient rate control can be achieved in critically ill patients with atrial tachyarrhythmias using either diltiazem or amiodarone …

Task: Therapy

Strength of Evidence: A (RCT)

Page 13:

Outcome Extractor

[Diagram: classifiers (Cue-terms, Naïve Bayes, N-gram, Position, Heuristic, Length) combined by Multiple Linear Regression. The Problem Extractor, Population Extractor, and Intervention Extractor are shown alongside.]

$$P_{MLR}(x) = \sum_{k=1}^{K} \alpha_k P_k(x)$$

Training: 275 manually annotated abstracts

Score: 0.99
Sufficient rate control can be achieved in critically ill patients with atrial tachyarrhythmias using either diltiazem or amiodarone.

Score: 0.75
Although diltiazem allowed for significantly better 24-hr heart rate control, this effect was offset by a significantly higher incidence of hypotension requiring discontinuation of the drug.
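The combination step on this slide is a weighted sum of the individual classifier scores. A minimal sketch of that step only: the function name is illustrative, the example scores and equal weights are made up, and the fitting of the weights on the 275 annotated abstracts is not shown.

```python
from typing import Sequence


def combine_scores(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of K classifier scores: sum over k of alpha_k * P_k(x)."""
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per classifier score")
    return sum(alpha * p for alpha, p in zip(weights, scores))


# Hypothetical scores for one sentence from six classifiers, in the order
# cue-terms, naive Bayes, n-gram, position, heuristic, length.
sentence_scores = [0.9, 0.8, 0.7, 1.0, 0.5, 0.6]
# Hypothetical equal weights; the slides do not give the fitted values.
alphas = [1 / 6] * 6
combined = combine_scores(sentence_scores, alphas)
```

Sentences would then be ranked by the combined score, as in the two scored outcome sentences shown on the slide.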

Page 14:

Extractor accuracy

Extractor      Test set N   Correct   Unknown   Wrong
Problem        50           90%       5%        5%
Population     100          79%       11%       10%
Intervention   100          77%       -         23%
Outcome        358          90%       -         10%

Page 15:

Outline

• Motivation
• Evidence Based Medicine
• Hypotheses
• Clinical Question Answering system
• Evaluation
  – System components
    • extractors
    • Document re-ranking
  – Answers
    • Multi-level answers
    • Best answers
• Contributions, Limitations, Future work

Page 16:

Document re-ranking

[Diagram: the Semantic Matcher compares the question frame with each document frame]

Question frame
Task: THERAPY
Problem: atrial fibrillation
Intervention: Amiodarone, Cardizem

Doc1 frame
Problem: atrial fibrillation
Intervention: diltiazem, amiodarone
Outcome: ……
Outcome score: 0.79
PICO score: 0.89
Task: THERAPY score: 0.64
SoE score: 0.32

Doc2 frame
Problem: arterial hypertension
Intervention: Warfarin
Outcome: ……
Outcome score:
PICO score:
Task: THERAPY score:
SoE score:

Doc3 frame
Problem: atrial fibrillation
Intervention: coronary surgery
Outcome: ……
Outcome score:
PICO score:
Task: THERAPY score:
SoE score:

$$\text{Doc Score} = \lambda_P S_{PICO} + \lambda_S S_{SoE} + \lambda_T S_{Task}$$

$$S_{PICO} = \lambda_p S_{problem} + \lambda_{pt} S_{population} + \lambda_i S_{intervention} + \lambda_o S_{outcome}$$

$$S_{SoE} = \lambda_j S_{journal} + \lambda_s S_{study} + \lambda_d S_{date}$$

$$S_{Task} = \sum_i \lambda_i \, \text{Task\_Indicator}(i)$$
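The top-level document score is a weighted sum of the PICO, SoE, and task scores, and re-ranking sorts documents by it. A sketch under stated assumptions: the weights default to 1.0 because the slides do not give the λ values, the scores for Doc2 and Doc3 are invented (the slide leaves them blank), and all names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Hypothetical lambda weights; the actual values are not on the slide."""
    pico: float = 1.0
    soe: float = 1.0
    task: float = 1.0


def doc_score(s_pico: float, s_soe: float, s_task: float,
              w: Weights = Weights()) -> float:
    """Doc Score = lambda_P * S_PICO + lambda_S * S_SoE + lambda_T * S_Task."""
    return w.pico * s_pico + w.soe * s_soe + w.task * s_task


def rerank(docs, w: Weights = Weights()):
    """Return the documents sorted by descending document score."""
    return sorted(
        docs,
        key=lambda d: doc_score(d["pico"], d["soe"], d["task"], w),
        reverse=True,
    )


docs = [
    # Invented scores: the slide shows no values for Doc2 and Doc3.
    {"id": "Doc2", "pico": 0.10, "soe": 0.30, "task": 0.60},
    {"id": "Doc3", "pico": 0.45, "soe": 0.30, "task": 0.60},
    # Scores shown on the slide for Doc1.
    {"id": "Doc1", "pico": 0.89, "soe": 0.32, "task": 0.64},
]
ranked = [d["id"] for d in rerank(docs)]
```

With these inputs Doc1 ranks first, since it matches the question frame on both problem and interventions.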

Page 17:

Document re-ranking evaluation

[Chart. Legend: baseline, filtering, components]

Relevance judgments for 24 FPIN questions by Dr. CS

Page 18:

Answer Generation

[Diagram: Intervention Extractor and Outcome Extractor feed Semantic Clustering]

Example answer, with the slide's labels:

Cluster label: Imaging by method
Intervention: [ultrasound] [Doppler studies]
Outcome: Transrectal ultrasound (TRUS) offers a valuable complement to digital rectal examination (DRE) in diagnosing prostate diseases. A sensitivity of 90.6% and a specificity of 64.2% was reached.
Title: Automated analysis and interpretation of transrectal ultrasonography images in patients with prostatitis.
Metadata: Eur Urol. 1995;27(1):47-53. SoE, Task

Page 19:

UMLS Semantic Clustering

[Hierarchy diagram. Labels on the slide:
  – Levels: Pruned top, Interior nodes, Extracted interventions
  – Nodes: Procedures; Investigations; Operations, procedures and interventions; Diagnostic imaging …; Procedure by method; Evaluation procedure; Imaging by method; Imaging by body site; Ultrasound scan; Specific ultrasound studies; Magnetic resonance; Doppler studies; MRI of abdomen; Ultrasonography
  – Source vocabularies: SNOMED Clinical Terms, Read Codes]
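One way to read the diagram is that each extracted intervention is grouped under its highest ancestor that lies below the pruned top of the hierarchy. The sketch below follows that reading only; the parent links are a toy hierarchy assembled from the node names on the slide, not actual UMLS relations, and the clustering procedure itself is an assumption.

```python
from collections import defaultdict

# Toy child -> parent links (hypothetical, for illustration only).
PARENT = {
    "Doppler studies": "Specific ultrasound studies",
    "Specific ultrasound studies": "Ultrasound scan",
    "Ultrasound scan": "Imaging by method",
    "Ultrasonography": "Imaging by method",
    "Magnetic resonance": "Imaging by method",
    "MRI of abdomen": "Imaging by body site",
    "Imaging by method": "Diagnostic imaging",
    "Imaging by body site": "Diagnostic imaging",
    "Diagnostic imaging": "Procedures",
}

# Nodes treated as too general to serve as cluster labels (assumed).
PRUNED_TOP = {"Procedures", "Diagnostic imaging"}


def cluster_label(term: str) -> str:
    """Climb the hierarchy until the next parent would be in the pruned top."""
    node = term
    while node in PARENT and PARENT[node] not in PRUNED_TOP:
        node = PARENT[node]
    return node


def cluster(terms):
    """Group extracted interventions by their cluster label."""
    clusters = defaultdict(list)
    for term in terms:
        clusters[cluster_label(term)].append(term)
    return dict(clusters)


clusters = cluster(
    ["Doppler studies", "Ultrasonography", "Magnetic resonance", "MRI of abdomen"]
)
```

In this toy hierarchy the first three terms fall under "Imaging by method", the cluster label used in the answer example, and "MRI of abdomen" falls under "Imaging by body site".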

Page 20:

Cluster selection for evaluation

• UMLS (latest, 3 largest clusters)
• User (latest, 3 best clusters)
• PubMed (3 latest)

[Chart. Labels on the slide: Q1, Q2, … with the numbers 30, 30, and 25; example clusters: Imaging by method, biochemistry, infection, inflammation; judgments: good, OK, bad]

Page 21:

Answer evaluation

Clinical Evidence categories (Beneficial to Harmful)

Cluster selection strategy   B      LB     T      U+N    H
PubMed                       0.13   0.25   0.13   0.46   0.01
UMLS                         0.23   0.27   0.12   0.37   0.01
User                         0.35   0.28   0.11   0.26   -

Evidence support

Cluster selection strategy   good   OK     bad
PubMed                       0.57   0.14   0.27
UMLS                         0.72   0.09   0.19
User                         0.85   0.08   0.07

Distribution for 25 Clinical Evidence questions (cluster selection and judgment by Dr CA)

Page 22:

Outline

• Motivation
• Evidence Based Medicine
• Hypotheses
• Clinical Question Answering system
• Evaluation
  – System components
    • extractors
    • Document re-ranking
  – Answers
    • Multi-level answers
    • Best answers
• Contributions, Limitations, Future work

Page 23:

Answer precision at 5

221 answers to 24 questions judged by Drs. CS and KWH
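Precision at 5 is the fraction of the top five ranked answers judged relevant. A minimal sketch of the metric as it is usually defined; the function name and the example judgments are illustrative, and the slide does not say how questions with fewer than five answers were handled.

```python
from typing import Sequence


def precision_at_k(judgments: Sequence[bool], k: int = 5) -> float:
    """Fraction of the top-k ranked answers that were judged relevant.

    `judgments` is ordered by rank. The denominator is always k, so a
    question with fewer than k answers is penalized (an assumption).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for relevant in judgments[:k] if relevant) / k


# Hypothetical judgments for one question's ranked answers.
example = precision_at_k([True, True, False, True, False, True])
```

Averaging this value over all questions gives a single figure per system.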

Page 24:

Contributions

• Leveraging semantic domain model as a foundation for an end-to-end clinical question answering system.

• Identification of the domain-model components necessary and sufficient for system development.

• Demonstration of applicability of the system architecture for complex question answering in the clinical domain.

• Methods for combining information extraction based on statistical and knowledge-based methods.

• Adaptation of question answering evaluation methods for the clinical domain.

• Development of test collections for information extraction and question answering evaluation.

Page 25:

Limitations

• No user interface

• Manual question processing

• PubMed for document retrieval

• Processing speed of automatic semantic annotation

Page 26:

Future work

• Combining knowledge-based and corpus-based methods beyond outcome extractor

• Developing a corpus-based stopping condition for hierarchical ontological clustering

• In-depth study of PICO frame alternatives

• Combining ranking results of different search engines

Page 27:

Thanks to my advisory cloud!

Douglas Oard

Jimmy Lin

Philip Resnik

Dagobert Soergel

Ben Shneiderman

Susan Hauser

Thomas Rindflesch

George Thoma

Alan Aronson

Susanne Humphrey