Data Visualization and Feature Selection: New Algorithms for Nongaussian Data
Howard Hua Yang and John Moody, NIPS’99
Contents
Data visualization: good 2-D projections for interpreting high-dimensional data
Feature selection: eliminate redundancy
Joint mutual information
ICA
Introduction
Visualization of input data and feature selection are intimately related.
Input variable selection is the most important step in the model selection process.
Model-independent approaches are needed to select input variables before model specification.
Data visualization is very important for humans to understand the structural relations among variables in a system.
Joint mutual information for input/feature selection
Mutual information
$$I(X_i; Y) = K\big(p(x_i, y) \,\|\, p(x_i)\,p(y)\big)$$
Kullback-Leibler divergence
$$K\big(p(x) \,\|\, q(x)\big) = \sum_x p(x) \log \frac{p(x)}{q(x)}$$
Joint mutual information
$$I(X_i, \ldots, X_k; Y) = K\big(p(x_i, \ldots, x_k, y) \,\|\, p(x_i, \ldots, x_k)\,p(y)\big)$$
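The definitions above can be checked numerically on a small discrete example. The following is a minimal sketch, assuming the joint distribution is available as a NumPy probability table whose last axis indexes y; the function names and the XOR toy distribution are illustrative choices and do not come from the slides.

```python
import numpy as np

def kl_divergence(p, q):
    """K(p || q) = sum_x p(x) log(p(x) / q(x)) in nats; terms with p(x) = 0 contribute 0."""
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def mutual_information(joint):
    """I(inputs; Y) = K(p(x, y) || p(x) p(y)).

    `joint` is a probability table whose last axis indexes y; all leading
    axes together index the inputs, so the same function gives the joint MI.
    """
    joint = np.asarray(joint, dtype=float)
    flat = joint.reshape(-1, joint.shape[-1])   # rows: input configurations
    flat = flat / flat.sum()
    p_x = flat.sum(axis=1, keepdims=True)
    p_y = flat.sum(axis=0, keepdims=True)
    return kl_divergence(flat, p_x * p_y)

# Hypothetical toy case: X1, X2 independent fair bits, Y = X1 XOR X2.
joint = np.zeros((2, 2, 2))                     # axes: x1, x2, y
for x1 in range(2):
    for x2 in range(2):
        joint[x1, x2, x1 ^ x2] = 0.25

mi_x1 = mutual_information(joint.sum(axis=1))   # I(X1; Y)
jmi_x1_x2 = mutual_information(joint)           # I(X1, X2; Y)
```

In this toy case each input alone carries no information about Y, while the pair determines Y completely, which is the kind of structure that single-input MI ranking cannot see.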
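Conditional MI
$$I(X_i; Y \mid X_j, X_k) = \sum_{x_j, x_k} p(x_j, x_k)\, K\big(p(x_i, y \mid x_j, x_k) \,\|\, p(x_i \mid x_j, x_k)\, p(y \mid x_j, x_k)\big)$$
$$I(X_1, \ldots, X_{n-1}, X_n; Y) - I(X_1, \ldots, X_{n-1}; Y) = I(X_n; Y \mid X_1, \ldots, X_{n-1}) \ge 0$$
When $I(X_1; Y) = I(X_2; Y) = I(X_3; Y)$:
$$I(X_1, X_2; Y) - I(X_1, X_3; Y) = I(X_2; Y \mid X_1) - I(X_3; Y \mid X_1)$$
Use the joint mutual information $I(X_i, X_j; Y)$ instead of the mutual information $I(X_i; Y)$ to select inputs for a neural network classifier and for data visualization.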
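The relation between joint and conditional MI can be verified numerically. The sketch below assumes a randomly generated joint table p(x1, x2, y) with small discrete alphabets; the helper names `mi` and `conditional_mi` are hypothetical and only serve this illustration.

```python
import numpy as np

def mi(joint):
    """I(inputs; Y) in nats; the last axis of `joint` indexes y."""
    flat = joint.reshape(-1, joint.shape[-1])
    flat = flat / flat.sum()
    prod = flat.sum(axis=1, keepdims=True) * flat.sum(axis=0, keepdims=True)
    mask = flat > 0
    return float(np.sum(flat[mask] * np.log(flat[mask] / prod[mask])))

def conditional_mi(joint):
    """I(X2; Y | X1) for a table p[x1, x2, y].

    Computed as the p(x1)-weighted average of the MI between X2 and Y
    inside each slice x1, i.e. a weighted average of KL divergences.
    """
    joint = joint / joint.sum()
    total = 0.0
    for slice_x1 in joint:            # unnormalised p(x1, ., .)
        weight = slice_x1.sum()       # p(x1)
        if weight > 0:
            total += weight * mi(slice_x1 / weight)
    return total

# Arbitrary joint distribution over (x1, x2, y), for illustration only.
rng = np.random.default_rng(0)
p = rng.random((3, 4, 2))
p /= p.sum()

lhs = mi(p) - mi(p.sum(axis=1))       # I(X1, X2; Y) - I(X1; Y)
rhs = conditional_mi(p)               # I(X2; Y | X1)
```

Here `lhs` and `rhs` agree up to floating-point error and are non-negative, matching the property stated for n = 2.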
Data visualization methods
Supervised methods based on JMI (cf. CCA)
Unsupervised methods based on ICA (cf. PCA)
Efficient method for JMI
$$\arg\max_{(i, j)} I(X_i, X_j; Y)$$
$$I(X_i, X_j; Y) = I(X_i; Y) + I(X_j; Y \mid X_i)$$
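The pair search can be illustrated with a plug-in (histogram) estimate of the JMI. This is a rough sketch and not the authors' efficient procedure: it assumes inputs and target are already coded as small non-negative integers, searches all pairs exhaustively without using the decomposition above, and runs on synthetic data in which the target depends only on columns 1 and 3.

```python
import itertools
import numpy as np

def joint_mi(xi, xj, y):
    """Plug-in estimate of I(X_i, X_j; Y) in nats from integer-coded samples."""
    pair = xi * (xj.max() + 1) + xj           # one symbol per (x_i, x_j) value
    counts = np.zeros((pair.max() + 1, y.max() + 1))
    np.add.at(counts, (pair, y), 1.0)         # contingency table of (pair, y)
    p = counts / counts.sum()
    prod = p.sum(axis=1, keepdims=True) * p.sum(axis=0, keepdims=True)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / prod[mask])))

def best_pair(X, y):
    """Exhaustive arg max over pairs (i, j), i < j, of the estimated JMI."""
    pairs = itertools.combinations(range(X.shape[1]), 2)
    return max(pairs, key=lambda ij: joint_mi(X[:, ij[0]], X[:, ij[1]], y))

# Synthetic data (an assumption for this sketch): five binary inputs,
# with the target determined by columns 1 and 3 only.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(2000, 5))
y = X[:, 1] ^ X[:, 3]

selected = best_pair(X, y)
```

The two coordinates returned by `best_pair` would then serve as the axes of a 2-D scatter plot of the data, colored by class. Plug-in estimates are biased upward for small samples, so continuous inputs would need a binning or density-estimation step that is not shown here.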
Application to Signal Visualization and Classification
JMI and visualization of radar pulse patterns
Radar pattern: 15-dimensional vector, 3 classes
Compute JMIs, select inputs
Radar pulse classification
7 hidden units
Experiments
All inputs vs. 4 selected inputs
4 inputs with the largest JMI vs. 4 randomly selected inputs
Conclusions
Advantage of single JMI
Can distinguish inputs when all of them have the same mutual information with the target
Can eliminate the redundancy in the inputs when one input is a function of other inputs
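Both points can be seen in a small constructed example. The sketch below assumes a hypothetical distribution in which X1 and X2 are independent fair bits, X3 is a copy of X1 (so it is a function of another input), and Y encodes the pair (X1, X2); none of these choices come from the radar experiments.

```python
import numpy as np

def mi(joint):
    """I(inputs; Y) in nats; the last axis of `joint` indexes y."""
    flat = joint.reshape(-1, joint.shape[-1])
    flat = flat / flat.sum()
    prod = flat.sum(axis=1, keepdims=True) * flat.sum(axis=0, keepdims=True)
    mask = flat > 0
    return float(np.sum(flat[mask] * np.log(flat[mask] / prod[mask])))

# Joint table with axes (x1, x2, x3, y): X3 = X1 and Y = 2 * X1 + X2.
p = np.zeros((2, 2, 2, 4))
for x1 in range(2):
    for x2 in range(2):
        p[x1, x2, x1, 2 * x1 + x2] = 0.25

# Single-input MI: all three inputs tie, so MI alone cannot rank them.
single = [
    mi(p.sum(axis=(1, 2))),   # I(X1; Y)
    mi(p.sum(axis=(0, 2))),   # I(X2; Y)
    mi(p.sum(axis=(0, 1))),   # I(X3; Y)
]

# Joint MI separates the candidate pairs.
jmi_12 = mi(p.sum(axis=2))    # I(X1, X2; Y)
jmi_13 = mi(p.sum(axis=1))    # I(X1, X3; Y): X3 adds nothing beyond X1
```

Each single-input MI equals log 2, while I(X1, X2; Y) equals 2 log 2 and I(X1, X3; Y) stays at log 2, so the JMI prefers the pair (X1, X2) and exposes X3 as redundant given X1.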