FROM TINNITUS DATA TO CLASSIFIERS CONSTRUCTION:
Building Decision Support System for Diagnosis and Treatment
of Tinnitus
Zbigniew W. Ras (1) & Paul Jastreboff (2) & Pamela Thompson (1)
(1) University of North Carolina at Charlotte, College of Computing and Informatics
(2) Tinnitus and Hyperacusis Center, Emory University School of Medicine
In collaboration with Jan Rauch, Department of Computer Science, University of Economics, Prague, Czech Republic
Research partially supported by the Project ME913 of the Ministry of Education, Youth, and Sports of the Czech Republic
Methodology
◦ Domain Knowledge
◦ Data Collection
◦ Data Preparation
New Feature Construction
Tolerance Relation Based Clustering & New Temporal Features
Classifiers Construction [for Total Score or Difference in Total Score]
Action Rules Discovery [hints how to treat tinnitus]
Future Research
From Music to Emotions and Tinnitus Treatment
Introduction
Neil Young, Barbra Streisand, Pete Townshend, William Shatner, David Letterman, Paul Schaffer, Steve Martin, Ronald Reagan, Neve Campbell, Jeff Beck, Burt Reynolds, Sting, Eric Clapton, Thomas Edison, Peter Jennings, Dwight D. Eisenhower, Cher, Phil Collins, Vincent Van Gogh, Ludwig Van Beethoven, Charles Darwin, . . .
Methodology: Domain Knowledge
TRT includes
DIAGNOSIS
◦ Preliminary medical examination
◦ Completion of initial interview questionnaire
◦ Audiological testing
TREATMENT
◦ Counseling
◦ Sound Habituation Therapy
◦ Exposure to a different stimulus to reduce emotional reaction
◦ Visit questionnaire (THI)
◦ Secondary questionnaire (TFI) in new dataset
◦ Instrument tracking (instruments can be table top or in ear, from different manufacturers)
◦ Continued audiological tests
Original Dataset
◦ 555 patients
◦ Relational
◦ 11 tables
New Dataset
◦ 758 patients
◦ Relational
◦ Secondary questionnaire - Tinnitus Functional Index (TFI)
Methodology: Database Features
Initial Interview form provides basis for initial patient classification.
Category - 0 to 4 (stored in Questionnaires tables)
0 – low tinnitus only: counseling
1 – high tinnitus: sound generators set at mixing point
2 – high tinnitus w/hearing loss (subjective): hearing aid
3 – Hyperacusis: sound generators set above threshold of hearing
4 – persistent hyperacusis: sound generators set at the threshold; very slow increase of sound level
Methodology: Database Features
Tinnitus Functional Index
◦ New cognitive and emotional questions
◦ Scale of 0 to 10 and some %
◦ Includes questions related to: Anxious/worried, Bothered/upset, Depressed
This new set of features is mapped to the “arousal-valence emotion plane” used for the construction of emotion-based classifiers in the music information retrieval domain (personalization aspects are considered as well).
Arousal-valence emotion plane - used in Automatic Indexing of Music by emotions
New Feature Construction: TFI and Emotions
New Features Based on the TFI and emotions
Table 2: Tinnitus Functional Index (scale of 0 to 10)
Question                          Category of Question      E-V Scale
Q1  % of time aware               Awareness
Q2  loud                          HEARING
Q3  in control                    E11                       E1
Q4  % of time annoyed             Annoyance
Q5  cope                          E11                       E1
Q6  ignore                        E21                       E2
Q7  concentrate                   THINKING CONCENTRATION
Q8  think clearly                 THINKING CONCENTRATION
Q9  focus attention               THINKING CONCENTRATION
Q10 fall/stay asleep              E33                       E3
Q11 as much sleep                 E33                       E3
Q12 sleeping deeply               E33                       E3
Q13 hear clearly                  HEARING
Q14 understand people             HEARING
Q15 follow conversation           HEARING
Q16 quiet, resting activities     E41                       E4
Q17 relax                         E43                       E4
Q18 peace and quiet               E42                       E4
Q19 social activities             SOCIAL
Q20 enjoyment of life             E11                       E1
Q21 relationships                 SOCIAL
Q22 work on other tasks           SOCIAL
Q23 anxious, worried              E23                       E2
Q24 bothered, upset               E22                       E2
Q25 depressed                     E31                       E3
The sum of the values in each E-V group gives the score for E1 (Energetic Positive), E2 (Energetic Negative), E3 (Calm Negative), and E4 (Calm Positive).
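The E1 to E4 sums described for Table 2 can be sketched in a few lines of Python. This is only an illustration: the names `EV_GROUPS` and `emotion_scores` are hypothetical, the question-to-group mapping is copied from the E-V Scale column of the table, and every answer is assumed to be already expressed on the 0 to 10 scale.

```python
# Question numbers per emotion quadrant, copied from the E-V Scale column of Table 2.
EV_GROUPS = {
    "E1": [3, 5, 20],        # Energetic Positive
    "E2": [6, 23, 24],       # Energetic Negative
    "E3": [10, 11, 12, 25],  # Calm Negative
    "E4": [16, 17, 18],      # Calm Positive
}


def emotion_scores(answers):
    """Sum TFI answers per E-V group.

    answers: dict mapping question number (1..25) to a value on the 0..10 scale
    (assumption: percentage items have already been rescaled to 0..10).
    """
    return {group: sum(answers[q] for q in questions)
            for group, questions in EV_GROUPS.items()}


# Example: a patient who answers 1 to every question.
scores = emotion_scores({q: 1 for q in range(1, 26)})
```

With every answer equal to 1, each group score is simply the number of questions in that group.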
Methodology: Database Features
Tinnitus Handicap Inventory
◦ Questionnaire, forms Neumann-Q Table
◦ Function, Emotion, Catastrophic Scores
◦ Total Score (sum)
◦ THI severity by Total Score:
  0 to 16: slight severity
  18 to 36: mild
  38 to 56: moderate
  58 to 76: severe
  78 to 100: catastrophic
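The THI severity bands can be expressed as a small lookup. The sketch below assumes the Total Score is an even integer between 0 and 100, which is what the gaps between the bands (16 to 18, 36 to 38, and so on) suggest; the names `THI_BANDS` and `thi_severity` are hypothetical.

```python
# Upper bound of each band and its label, taken from the THI severity ranges.
THI_BANDS = [
    (16, "slight"),
    (36, "mild"),
    (56, "moderate"),
    (76, "severe"),
    (100, "catastrophic"),
]


def thi_severity(total_score):
    """Map a THI Total Score to its severity label.

    Assumption: scores are even integers in 0..100; anything else is rejected.
    """
    if not 0 <= total_score <= 100 or total_score % 2 != 0:
        raise ValueError("expected an even THI Total Score between 0 and 100")
    for upper, label in THI_BANDS:
        if total_score <= upper:
            return label
```

For example, `thi_severity(18)` falls in the second band and returns `"mild"`.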
New Feature Construction: Decision Feature
Total Score Difference Discretization    Description
(score a represents the highest T Score in all cases)
TSa    a = {s: s > 0}, b = {0}, c = {s: s < 0}
TSb    a = {s: s > 30}, b = {s: 10 < s ≤ 30}, c = {s: -10 < s ≤ 10}, d = {s: -40 < s ≤ -10}, e – remaining scores
TSc    a = {s: s > 28}, b = {s: 0 < s ≤ 28}, c = {s: -1 < s ≤ 0}, d = {s: -15 < s ≤ -1}, e – remaining scores
TSd    a = {s: s > 40}, b = {s: 10 < s ≤ 40}, c = {s: -10 < s ≤ 10}, d = {s: -40 < s ≤ -10}, e – remaining scores
TSe    a = {s: s > 50}, b = {s: 0 < s ≤ 50}, c = {s: -50 < s ≤ 0}, d – remaining scores
TSf    a = {s: s > 80}, b = {s: 60 < s ≤ 80}, c = {s: 40 < s ≤ 60}, d = {s: 20 < s ≤ 40}, e = {s: 0 < s ≤ 20}, f = {s: -20 < s ≤ 0}, g = {s: -40 < s ≤ -20}, h = {s: -60 < s ≤ -40}, i – remaining scores
TSg    a = {s: s > 28}, b = {s: 0 < s ≤ 28}, c = {s: -12 < s ≤ 0}, d – remaining scores
TSh    a = {s: s > 10}, b = {s: -10 ≤ s ≤ 10}, c – remaining scores
Eight new decision attributes based on different discretizations of the difference in Total Score (between first and last visit)
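As an illustration of how such a discretization could be applied, the sketch below encodes three of the eight schemes (TSb, TSe, TSg) as descending cut points. It assumes each interval has the form "lower < s ≤ upper", i.e. that the comparison symbol lost from the slide text is "≤". The names `CUTS` and `discretize` are hypothetical, and TSa and TSh would need separate handling because their middle class is defined differently.

```python
# Descending lower bounds of classes a, b, c, ...; the last class is "remaining scores".
# Assumption: every interval is (lower, upper], i.e. lower < s <= upper.
CUTS = {
    "TSb": [30, 10, -10, -40],  # classes a..e
    "TSe": [50, 0, -50],        # classes a..d
    "TSg": [28, 0, -12],        # classes a..d
}


def discretize(s, scheme):
    """Return the class letter for a Total Score difference s under the given scheme."""
    cuts = CUTS[scheme]
    for index, lower in enumerate(cuts):
        if s > lower:
            return "abcdefghi"[index]
    return "abcdefghi"[len(cuts)]  # remaining scores


# Example: a difference of exactly 30 belongs to class b of TSb, since a requires s > 30.
label = discretize(30, "TSb")
```

Storing only the lower bounds keeps the classes contiguous and non-overlapping by construction.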
Data Transformation – ORIGINAL DATABASE
◦ Flattened File (by Patient): from the original database, one tuple per patient with addition of features
  ◦ Discovered from Text Data
  ◦ Statistical (standard deviations, averages, ..)
  ◦ Temporal (sound level centroid, sound level spread, recovery rate)
  ◦ Decision Feature – discretized Difference in Total Score from THI
Data Transformation – NEW DATABASE
◦ Clustered patient-driven datasets (by similar visit patterns) with addition of features: coefficients, angles
New Feature Construction: Text Features
Text Mining
◦ Text fields: Demographic, Miscellaneous, Medication tables
◦ Categories may show cause of tinnitus for patient: Stress, Noise, Medical
New Boolean Features Stress, Noise, and Medical Based on Text Mining of Terms
Stress: stress, depression, emotion, work, marriage, wedding
Noise: accident, noise, concert, loud, music, shooting, blast
Medical: surgery, infection, medicine, depression, hospital
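One simple way to derive the three Boolean features from a free-text field is exact keyword matching against the term lists above. The sketch below is an assumption about the mechanics, not the authors' text-mining procedure: it does no stemming or phrase handling, and the names `TERMS` and `text_features` are hypothetical.

```python
import re

# Term lists copied from the Stress / Noise / Medical rows above.
TERMS = {
    "Stress": {"stress", "depression", "emotion", "work", "marriage", "wedding"},
    "Noise": {"accident", "noise", "concert", "loud", "music", "shooting", "blast"},
    "Medical": {"surgery", "infection", "medicine", "depression", "hospital"},
}


def text_features(text):
    """Return one Boolean per category: True if any listed term occurs as a whole word."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {category: bool(words & terms) for category, terms in TERMS.items()}


# Example on a made-up note (not taken from the tinnitus database).
flags = text_features("Tinnitus started after a loud concert")
```

Because "depression" appears in both the Stress and Medical lists, a note mentioning it sets both flags.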
New Feature Construction: Temporal Features
New Temporal Features
◦ Sound Level Centroid
T = total number of visits per patient (here T = 3)
V is some sound level feature (e.g. LDL measurement) measured at each visit: V(1), V(2), V(3)
\[ C = \frac{\frac{1}{3} V(1) + \frac{2}{3} V(2) + \frac{3}{3} V(3)}{V(1) + V(2) + V(3)} \]
New Feature Construction: Temporal Features
New Temporal Features
◦ Sound Level Spread
\[ \sqrt{\frac{V(1) \left(\frac{1}{3} - C\right)^2 + V(2) \left(\frac{2}{3} - C\right)^2 + V(3) \left(\frac{3}{3} - C\right)^2}{V(1) + V(2) + V(3)}} \]
where C is the Sound Level Centroid.
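The centroid and spread can be computed together, since the spread depends on the centroid C. The sketch below generalizes the three-visit formulas to T visits by weighting visit i with i/T; that generalization, the function names, and the assumption that the measurements are positive (so the denominator is non-zero) are not stated on the slides.

```python
import math


def sound_level_centroid(values):
    """Weighted mean of normalized visit positions i/T, weighted by the measurement V(i)."""
    T = len(values)
    weighted = sum((i / T) * v for i, v in enumerate(values, start=1))
    return weighted / sum(values)


def sound_level_spread(values):
    """Square root of the V(i)-weighted mean squared distance of i/T from the centroid."""
    T = len(values)
    c = sound_level_centroid(values)
    weighted = sum(v * (i / T - c) ** 2 for i, v in enumerate(values, start=1))
    return math.sqrt(weighted / sum(values))


# Example with three equal measurements: the centroid sits at the middle visit (2/3).
centroid = sound_level_centroid([1, 1, 1])
spread = sound_level_spread([1, 1, 1])
```

A centroid closer to 1 means the larger measurements occur at later visits; the spread indicates how concentrated they are around that point.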
New Feature Construction: Temporal Features
New Temporal Features
◦ Recovery Rate
\[ R = \frac{V_0 - V_k}{T_k - T_0}, \qquad V_k = \min\{V_i : i \in [0, N]\} \]
V = Total Score from THI
V_0 = first score (V_k should be less than V_0)
V_k = the best (minimum) score in the vector
T_k = date of the best score
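A minimal sketch of the recovery rate, read as the drop from the first THI Total Score to the best (minimum) score divided by the time elapsed between those two visits. Several choices here are assumptions: time is measured in days, ties for the minimum go to the earliest visit, a patient whose first score is already the best gets a rate of 0, and the name `recovery_rate` is hypothetical.

```python
from datetime import date


def recovery_rate(scores, dates):
    """Return (V_0 - V_k) / (T_k - T_0) in score points per day.

    scores: THI Total Scores in visit order; dates: matching datetime.date values,
    assumed strictly increasing.
    """
    # Index of the best (minimum) score; min() keeps the earliest index on ties.
    k = min(range(len(scores)), key=lambda i: scores[i])
    if k == 0:
        return 0.0  # assumption: no improvement after the first visit
    return (scores[0] - scores[k]) / (dates[k] - dates[0]).days


# Example on made-up data: best score 30 is reached 60 days after the first visit.
rate = recovery_rate(
    [60, 40, 30, 35],
    [date(2021, 1, 1), date(2021, 1, 31), date(2021, 3, 2), date(2021, 4, 1)],
)
```

In this example the score drops by 30 points over 60 days, i.e. 0.5 points per day.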
Data Mining: Unclustered Data
In search of optimal classifiers describing Total Score or changes in Total Score [new decision attributes]
◦ WEKA
◦ J48 (C4.5 Decision Tree Learner)
◦ Random Forest
◦ Multilayer Perceptron
Data Mining: Unclustered Data
Experiments and Results
1) Original data with Standard Deviations and Averages from Audiological features
2) Original data with Standard Deviations, Averages, Sound level centroid and sound level spread (Sound) only
3) Original data with Standard Deviations, Averages, and Text
4) Original Data with Standard Deviations, Averages, Text and Sound
5) Original Data with Text
6) Original Data with Sound
7) Original Data with Sound, Text, and Recovery Rate
8) Original Data with Sound and Recovery Rate /the winner/
9) ……………………………………….
Data Mining: Unclustered Data
Top Classification Results for all 8 decision variables: Original Data with Sound Level Centroid, Sound Level Spread, Recovery Rate
Data Mining: Clustered Data
Continuing the Search for Optimal Classifiers
◦ Transformation to Visit Structure
◦ Creating Tolerance-Relation based Datasets
◦ Adding New Features
Two groups of databases were constructed: three-visit and four-visit centered sets.
Clustering Techniques for Temporal Feature Extraction
Coefficients and Angles Feature Construction for Dp where p is a patient with 4 visits:
Clustering Techniques
Quadratic Equation Based New Features
Data Mining: Clustered Data
Classifiers Construction [learning differences in total score] for clustered data:
J48, Random Forest, and Multilayer Perceptron (Neural Network) have been tested on the cluster-based original datasets with:
1) standard deviations and averages
2) coefficients and text
3) coefficients and angles
4) coefficients only
5) angles only
6) angles and text
7) angles, coefficients and text /the winner/
Summary Data Mining: Clustered Data
Results are quite encouraging
◦ Top precision is .884
◦ This represents an improvement over the classification precision of .751 obtained with J48 on the original dataset with the features Sound Level Centroid, Sound Level Spread and Recovery Rate present
Action Rules
Action rule is defined as a term
[(ω) ∧ (α → β)] → (ϕ → ψ)
◦ ω: conjunction of fixed condition features shared by both groups
◦ (α → β): proposed changes in values of flexible features
◦ (ϕ → ψ): desired effect of the action
Information System
A    B    D
a1   b2   d1
a2   b2
a2   b2   d2
Action Rules
New Feature Construction: Decision Features showing change over time
New Decision Feature
◦ Boolean features + or – related to a feature such as Total Score improving or getting worse
◦ Calculated from score on next visit
◦ Stored as + or – on visit related tuple
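The change-over-time feature can be sketched as a comparison of each visit's score with the score on the next visit. The slides do not say which sign means improvement or how an unchanged score is coded, so the sketch assumes "+" means the Total Score dropped on the next visit (lower THI scores being better), "-" covers both a worse and an unchanged score, and the last visit gets no flag. The name `change_flags` is hypothetical.

```python
def change_flags(total_scores):
    """Return one flag per visit: '+' if the next visit's score is lower, else '-'.

    Assumptions: lower Total Score means improvement; an unchanged score is coded '-';
    the last visit has no next visit and therefore gets None.
    """
    flags = []
    for current, following in zip(total_scores, total_scores[1:]):
        flags.append("+" if following < current else "-")
    flags.append(None)
    return flags


# Example on made-up scores for three visits.
flags = change_flags([60, 40, 44])
```

Each flag would then be stored on the tuple of the visit it was computed for.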
Rules using LISpMiner
ACTION RULES: EXPERIMENT AND RESULTS
Analysis:
◦ Before confidence: 9/(9+0) = 1.0
◦ After confidence: 9/(9+20) ≈ 0.31
◦ Low confidence but shows promise
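The two confidence values follow the usual ratio of objects matching a rule to all objects matching its condition part. The sketch below assumes that reading of the counts (9 matching objects, with 0 and 20 non-matching objects respectively); the function name `confidence` is hypothetical.

```python
def confidence(matching, non_matching):
    """Fraction of objects satisfying the condition part that also satisfy the decision part."""
    return matching / (matching + non_matching)


before = confidence(9, 0)   # 9 / (9 + 0)
after = confidence(9, 20)   # 9 / (9 + 20)
```

The "before" side holds for every matching object, while the "after" side holds for fewer than a third of them, which is why the rule is described as low confidence.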
Summary
Future Research
◦ Continue Action Rule Study
◦ Develop GUI for patient data entry
◦ Use knowledge gained from rules to develop decision support system for treatment support for tinnitus sufferers
◦ Continue research with music, emotions, and tinnitus treatment