Informatics: a scientific basis for use of IT in health care and biomedicine
William W. Stead, M.D.
McKesson Foundation Professor of Biomedical Informatics & Medicine
Associate Vice Chancellor for Strategy/Transformation & CIO
Vanderbilt University Medical Center
April 21, 2008
Disclosures: McKesson’s Horizon Expert Orders is licensed from Vanderbilt. Vanderbilt is a founder of Informatics Corporation of America. I am a Director of Healthstream.
Road Map to the Lecture
• The Science of Informatics
• Challenges to Application of IT to Health Care and Biomedicine
• Examples of the Science at Work
• Assembling a Critical Mass of Talent
• Structural Considerations
Informatics
Definition: Science that deals with information, its structure, acquisition and use
Cornerstones:
• Techniques to structure, discover, visualize & reason with information content
• Approaches to link people, process & technology together as a system
• Methods to evaluate systems and their technology components
• Processes to facilitate change
Techniques to Structure, Discover, Visualize & Reason with Information Content
Informatics
• Data knowledge structures
• Ontologies & reference terminologies
• Algorithms for matching & machine learning
• Models
+
Data sources created by people
• Text literature
• Bibliographic + evidence databases
• Patient records
• Environmental databases
=
Information useable by people & computers
• Drug x potentiates drug y
• Pt x is on the following drugs
…
Approaches to Link People, Process & Technology together as a System
People
• Compassion
• Pattern Recognition
• Judgment
+
Process
• Simplification
• Standardization
+
Informatics
• ↓ Memory Dependence
• ↑ Forcing Function
=
Systems
• Reproducible Performance
Methods to Evaluate Systems and their Technology Components
Informatics Evaluation
• Interface usability and safety
• Information dose and response
• Information recall and precision
+
Health Service Research
• System effect on quality and cost
=
Evidence-based Design
• What to do
• How to get it done
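As a rough illustration of the "information recall and precision" measures named on this slide, the sketch below scores one retrieval result against a reference set. The function name and the note identifiers are hypothetical and made up for the example; they are not from the lecture.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = fraction of retrieved items that are relevant
    recall    = fraction of relevant items that were retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up example: a search returns four notes, two of which are among
# the three notes a reviewer judged relevant.
p, r = precision_recall(["n1", "n2", "n3", "n4"], ["n2", "n3", "n5"])
print(f"precision={p:.2f} recall={r:.2f}")
```

A system tuned only for recall tends to flood the user with irrelevant items, while one tuned only for precision tends to miss cases, so the two are usually reported together.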
Processes to Facilitate Change
People & Organizational Issues
• Strategy
• Understanding
• Support
+
Informatics
• Scenario planning & use cases
• Iterative design: concept => build => use => refine
• Embedded monitoring & self correction
=
Working in New Ways
Informatics is Inter-disciplinary
• Biological Sciences
• Cognitive Sciences
• Computer Science
• Engineering
• Health Sciences
• Information Science
• Statistics
Informatics has Sub-disciplines
Information Structure & Semantic Relationships
• Database architecture
• Terminology architecture
• Knowledge architecture
• System architecture
Workflow Fit & Evaluation
• Organizational development
• Design & requirements
• Human factors & safety
• Utility & effectiveness assessment
Algorithms, Models & Systems
• Information retrieval
• Natural language processing
• Data mining
• Modeling
• Data privacy, de-identification and re-identification
• Machine learning & discovery systems
• Scientific visualization/imaging informatics
• Decision models & expert systems
Application Domains
• Health Informatics (medical, nursing, public health, regional, consumer...)
• Educational processes
• Biomolecular analysis methods & tools
• Clinical research data management & analysis
Characteristics of Biological Systems & Health Care that Challenge Conventional IT
Representative Challenges
Health care is fragmented and chaotic
Clinical measurements rarely have precise meaning
Diagnoses lack clinical detail
Biological systems are highly variable
[Figure: Facts per Decision (axis ticks 5, 10, 100, 1000) by year (1990, 2000, 2010, 2020), shown against Human Cognitive Capacity. Curves: Decisions by Clinical Phenotype; Structural Genetics (e.g. SNPs, haplotypes); Functional Genetics (gene expression profiles); Proteomics and other effector molecules.]
Examples of Informatics at Work
#1 Data Aggregation & Organization
StarChart
Sources
• Systems (ADT, Lab, Rad…)
• Dictation/Transcription
• Note Capture Tools
• Image (PACS, EKG…)
• Fax/Scan
Externalize Content as Documents
• Text Reports
• Report Header plus Pointer to Image
Electronic Patient Chart
• Index each Word or Number for Cross-Patient Queries
• Assemble “Documents” by Patient
• “Tag” Clinical Concepts for Decision Support
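The slide's step "Index each Word or Number for Cross-Patient Queries" can be pictured as an inverted index. The sketch below is only a minimal illustration under assumed names (`build_index`, `patients_with`) and made-up documents; it does not describe how StarChart is actually implemented.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each word or number to the (patient, document) pairs containing it.

    `documents` maps a document id to a (patient id, text) pair.
    """
    index = defaultdict(set)
    for doc_id, (patient_id, text) in documents.items():
        # Tokens are decimal numbers or runs of letters, lower-cased.
        for token in re.findall(r"\d+(?:\.\d+)?|[a-z]+", text.lower()):
            index[token].add((patient_id, doc_id))
    return index


def patients_with(index, *terms):
    """Patients having at least one document that contains every term."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    return {patient for patient, _ in set.intersection(*postings)}


# Made-up documents for illustration only.
docs = {
    "d1": ("pt-A", "Prior diagnosis of breast cancer. Potassium 4.1"),
    "d2": ("pt-B", "Chest x-ray clear. Potassium 5.9"),
    "d3": ("pt-A", "Follow-up visit, no complaints"),
}
index = build_index(docs)
print(patients_with(index, "breast", "cancer"))
print(patients_with(index, "potassium"))
```

Plain word matching of this kind cannot tell a current finding from a prior or negated one, which is the gap that concept tagging is meant to close.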
Vanderbilt Knowledge Map Concept Identifier
• Uses NLP techniques
• Abbreviation and acronym extraction
• Conjunctive expansion
• Score-based
• Derivational forms (stenosis → stenotic, lungs → pulmonary)
• Document-based disambiguation
Find all of my student’s notes discussing “breast cancer”
A prior diagnosis of breast cancer
The Vanderbilt DNA Databank Project Design
• Create de-identified “Synthetic Derivative” of Electronic Medical Records (650 Gb data on 1.4 million individuals)
• Extract DNA from leftover blood samples that have been de-identified (25K-75K per year)
• Enable DNA sample retrieval based on clinical queries
• Prospective genotyping panels
Examples of Informatics at Work
#2 Discovery Systems
[Figure: Performance (axis 0 to 0.9), averaged over datasets, by marker selection method, comparing random markers with selected markers. Methods: ALL, LARS, LARS-EN, GA-KNN, RFE, RFE-Guyon, RFE-POLY, RFE-POLY-Guyon, SIMCA, SIMCA-SVM, WFCCM-CCR, UAF-KW, UAF-BW, UAF2-BW, UAF-S2N, HITON-PC, HITON-PC-W, HITON-MB, HITON-MB-W, HITONgp-PC, HITONgp-MB, HITONgp-PC-W, HITONgp-MB-W.]
Advantages of Causal Algorithms for High-Throughput Discovery
1. Many genes that belong to different pathways determine phenotype
2. Our predictive algorithms select the locally causal and essential (non-redundant) genes
3. This leads to great parsimony of selected marker panels
4. It also leads to better performance than random marker selection (other methods are barely distinguishable from random)
5. Selected marker genes are localized in pathways responsible for phenotype, giving improved biological significance. Other methods are not localized and thus uninterpretable.
[Figure: two network diagrams with lettered nodes (A, B, C, …) and numbered markers 1 to 5.]
[Figure: bar chart (axis 0 to 10000) by marker selection method: ALL, LARS, LARS-EN, HITON_PC, HITON_PC_W, HITON_MB, HITON_MB_W, HITONgp_PC, HITONgp_MB, HITONgp_PC_W, HITONgp_MB_W, HITONgp_PC_S, HITONgp_MB_S, GA_KNN, RFE, RFE_Guyon, RFE_POLY, RFE_POLY_Guyon, SIMCA, SIMCA_SVM, WFCCM_CCR, UAF_KW, UAF_BW, UAF2_BW, UAF_S2N.]
Fully-Automated Discovery Software
[Figure: Accuracy, % (axis 0 to 100). Multiple specialized classification methods (original primary studies: results by human expert analysis) compared with GEMS (fully automated).]
1. Data analysis protocol that is disease & assay specific, e.g. GEMS protocol benchmarked and tailored for microarray gene expression analysis for cancer diagnosis and outcome prediction
2. Tested software system: e.g., GEMS was tested against published analyses and in cross-dataset experiments
3. Fully-automated use. The software:
• Runs many different algorithms and configurations
• Selects the best one
• Fits the data and reports results
The system avoids overfitting and gives unbiased estimation of model performance
GEMS (model generation & performance estimation mode)
Inputs
• Gene expression data and outcome variable (Normal, Cancer, Cancer, Normal, Normal, Cancer, …, Cancer, Cancer, Normal)
• Optional: Gene names & IDs (ring finger protein 1; tubulin, beta, 5; glucose-6-phosphate dehydrogenase; glutathione S-transferase M5; carnitine acetyltransferase; Rho GTPase activating protein 4; SMA3; mannose phosphate isomerase; mitogen-activated protein kinase 3; leukotriene A4 hydrolase; chromosome 21 open reading frame 1; dihydropyrimidinase-like 2; beta-2-microglobulin; discs, large (Drosophila) homolog 4)
Outputs
• Cross-validation performance estimate
• Classification model
• Reduced set of genes (Rho GTPase activating protein 4; SMA3; mannose phosphate isomerase; mitogen-activated protein kinase 3)
• Links to literature
Dataset         Reference             Diagnostic task
Bhattacharjee   Bhattacharjee, 2001   Lung cancer vs normals
Beer            Beer, 2002            Lung cancer vs normals

Random markers*
Marker selection:      Bhattacharjee   Bhattacharjee   Bhattacharjee
Classifier training:   Beer            Beer            Bhattacharjee
Classifier testing:    Beer            Beer            Beer
HITONgp_MB      98.71%   100.00%   93.79%   100.00%
HITONgp_PC_W    98.47%   98.89%    64.79%   99.88%
HITONgp_MB_W    98.47%   98.89%    58.36%   99.88%
HITONgp_PC_S    99.53%   100.00%   79.43%   100.00%
HITONgp_MB_S    99.71%   100.00%   90.32%   100.00%
Dataset (predictor variables, outcome variable) split into parts P1, P2, P3

Outer loop: Cross-validation for performance estimation
Training set   Testing set   C   Accuracy   Average Accuracy
P1, P2         P3            1   89%
P1, P3         P2            2   84%
P2, P3         P1            1   76%        83%

Inner Loop: Cross-validation for model selection
Training set   Validation set   C   Accuracy   Average Accuracy
P1             P2               1   86%
P2             P1               1   84%        85%
P1             P2               2   70%
P2             P1               2   90%        80%
Choose C=1 since it maximizes accuracy
…
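The outer and inner loops on this slide can be sketched in a few lines of Python. This is an illustrative sketch only, not the GEMS implementation. To stay self-contained it swaps in a toy one-dimensional nearest-neighbour classifier whose hyperparameter k plays the role of C on the slide. The function names and the data are assumptions made up for the example.

```python
from statistics import mean


def accuracy(model, rows):
    """Fraction of (x, label) rows the model labels correctly."""
    return mean(1.0 if model(x) == y else 0.0 for x, y in rows)


def split_folds(rows, k):
    """Deal rows into k interleaved folds (P1, P2, ... on the slide)."""
    return [rows[i::k] for i in range(k)]


def cross_validate(rows, k, fit, param):
    """Average held-out accuracy of fit(train, param) over k folds."""
    folds = split_folds(rows, k)
    scores = []
    for i, held_out in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        scores.append(accuracy(fit(train, param), held_out))
    return mean(scores)


def nested_cv(rows, outer_k, inner_k, fit, params):
    """Outer loop estimates performance; inner loop selects the parameter."""
    folds = split_folds(rows, outer_k)
    outer_scores = []
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        # Model selection sees only the outer training set, never the test fold.
        best = max(params, key=lambda p: cross_validate(train, inner_k, fit, p))
        outer_scores.append(accuracy(fit(train, best), test))
    return mean(outer_scores)


def fit_knn(train, k):
    """Toy classifier: majority label among the k nearest training points."""
    def predict(x):
        nearest = sorted(train, key=lambda r: abs(r[0] - x))[:k]
        votes = [label for _, label in nearest]
        return max(set(votes), key=votes.count)
    return predict


# Made-up data: one predictor, two classes.
data = [(float(i), "Normal" if i < 10 else "Cancer") for i in range(20)]
estimate = nested_cv(data, outer_k=3, inner_k=2, fit=fit_knn, params=[1, 3])
print(f"nested CV accuracy estimate: {estimate:.2f}")
```

Because the parameter is chosen without looking at the outer test fold, the averaged outer accuracy is not inflated by the selection step, which is the point the slide makes about unbiased performance estimation.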
[Figure axis labels (datasets): 9_Tumors, 14_Tumors, Brain_Tumor2, Brain_Tumor1, 11_Tumors, Leukemia1, Leukemia2, Lung_Cancer, SRBCT, Prostate_Tumor, DLBCL.]
Examples of Informatics at Work
#3 Assuring Quality
Assurance of Quality Outcomes
[Flowchart with Yes/No branches. Inputs: Voluntary Reporting; Algorithmic surveillance for potential adverse events. Outcomes: Need trend detection or escalation mechanism; Algorithmic false positive or under-reporting; New rule needed; No action.]
Assembling a Critical Mass of Biomedical Informatics Talent
Launch and Growth of the Department of Biomedical Informatics
[Figure: timeline, 1991 to 2007, showing number of Faculty (axis 0 to 60) with the disciplines and structures added over the same years.]
Disciplines: Nursing, Library, Computer Science, Org Behavior, Education, Bioinformatics, Biomedical Engineering, Medicine
Structures: Center for Better Health, BMI Training Grant, Division of BMI, Informatics Center, Center for BMI, Dept of BMI, McKesson Partnership, Masters & PhD Programs
Relative Importance of Competencies in Selected Application Areas (2004 DBMI Plan)
[Figure: matrix rating each competency within each application area.]
Competencies: Modeling; Database Design; System Architectures; Vocabularies; Data Mining; Expert Systems; Knowledge Architectures; Discovery Methods & Systems; Scientific Visualization; Human Factors; Educational Processes; Evaluation; Organizational Development/Management
Application areas: Alphanumeric Clinical Systems; Clinical Imaging; Molecular Biology & Genetics; Integrative (Systems) Biology; Molecular Cell and Tissue Imaging
Informatics Career Roles
• Basic informatics research
• Use of tools in research & practice
• Tool development & evaluation
Organization Structure
Relationship of Academics to Operations
[Figure: diagram linking three units and what passes between them.]
Units: Informatics Education & Training; Operation & Decision Support Systems; Informatics Research & Development
Labels: Operational Experience; Laboratory Experience; Real World Problems, Delivery Vehicle & Funding; Techniques & Technologies; Administrators and Staff; Investigators
VU Information Technology Organization
• Nick Zeppos, Chancellor
• Harry Jacobson, Vice Chancellor, Health Affairs
• Lauren Brisky, Vice Chancellor, Administration and CFO
• Steven Gabbe, Dean, School of Medicine
• Bill Stead, Associate VC Health Affairs; Director, Informatics Center; CIO
• Bill Stead, Chief Information Architect (CIA)
• Bill Stead, Chairman, Vanderbilt Center for Better Health
• John Doulis, Asst VC & Chief Operations Officer
• Nunzia Giuse, Director, Eskind Library
• Carol Aronson, Administrative Officer, IC
• Donna Forsythe, IC Administration & Finance
• Nancy Lorenzi, Asst VC Health Affairs, Organization Development
• Mark Johnson, Chief Information Security Officer
• Tim Getsay, Asst VC, MIS & Assoc CIA, Enterprise Applications
• Matt Hall, Asst VC, ITS & Assoc CIA, Enterprise Infrastructure
• Dan Masys, Chair, DBMI & Chief Academic Officer
VUMC’s Virtual Office of Research Informatics
Executive Sponsorship
• Jeff Balser, Assoc VC, Research
• Bill Stead, Assoc VC, Strategy & CIO
Tactical Leadership
• Gordon Bernard, Asst VC, Research
• John Manning, VP, Research Operations
• Dan Masys, Chair, DBMI
• John Doulis, Asst VC, Informatics
Day to Day Operations
• Jill Pulley, Director, DNA Databank
• Paul Harris, Director, GCRU, Informatics Core