Exemplar-based Visualization of Large Document Corpus
Yanhua Chen, Lijun Wang, Ming Dong, and Jing Hua
{chenyanh, ljwang, mdong, jinghua}@wayne.edu
Department of Computer Science
Wayne State University, Detroit, MI
Overview
• Text Mining and Visualization
• Current Visualization Systems
• Exemplar-based Visualization (EV)
• Experiments and Results
• EV Demo
Text Mining: Clustering Definition
• Given:
  • A source of textual documents
  • Similarity measure
    • e.g., how many words are common in these documents
• Find:
  • Several clusters of documents that are relevant to each other
[Figure: a documents source (Doc, Doc, ...) and a similarity measure feed into a clustering system]
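The setup above can be illustrated with a minimal sketch. It assumes a word-overlap (Jaccard) similarity, which is only one way to read "how many words are common", and a simple greedy threshold grouping. The names `tokenize`, `jaccard`, and `cluster`, the threshold value, and the toy documents are hypothetical and do not come from the slides.

```python
def tokenize(doc):
    """Lowercase a document and split it into a set of words."""
    return set(doc.lower().split())

def jaccard(a, b):
    """Fraction of distinct words that two documents share (0.0 to 1.0)."""
    wa, wb = tokenize(a), tokenize(b)
    if not wa and not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def cluster(docs, threshold=0.3):
    """Greedy single-pass grouping (an assumed, simplistic scheme):
    a document joins the first cluster whose first member is similar
    enough; otherwise it starts a new cluster."""
    clusters = []  # each cluster is a list of document indices
    for i, doc in enumerate(docs):
        for members in clusters:
            if jaccard(doc, docs[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

# Toy documents, invented for illustration only.
docs = [
    "heart disease risk factors",
    "risk factors for heart attack",
    "protein folding simulation",
    "simulation of protein structure",
]
groups = cluster(docs)
print(groups)
```

Real systems would typically use weighted term vectors rather than raw word sets, but the shape of the problem is the same: documents plus a similarity measure go in, groups of related documents come out.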
Current Visualization Systems
• Text visualization: represent selected features of complex multi-dimensional data in a logical layout (2-D or 3-D) so that the relationships between documents can be understood
• IN-SPIRE
• InfoSky
Exemplar-based Visualization (EV)
1. Low-rank Approximation
2. Exemplar-based Clustering
3. Visualization by Parameter Embedding
[Figure: pipeline diagram; labels: Data, X, X~, G]
• Challenges:
  • Preserving the original relationships when going from multi-dimensional to low-dimensional space: Accuracy?
  • Large-scale document corpus: Efficiency?
  • Layout overlap: Exemplar?
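To make the exemplar idea concrete, here is a rough sketch of clustering around exemplars and laying documents out in 2-D. It is not the authors' algorithm. Exemplar selection is approximated with a plain k-medoids loop, the soft membership matrix (named `G` here on the assumption that this is the role `G` plays in the diagram) comes from a softmax over negative distances, and each document is placed at the membership-weighted average of hypothetical exemplar anchor positions rather than by parameter embedding proper. The low-rank approximation stage is omitted, and all function names and data are illustrative.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_medoids(points, medoids, iters=10):
    """Stand-in for exemplar selection: each exemplar is a real data point
    with the smallest total distance to the members of its cluster."""
    medoids = list(medoids)
    for _ in range(iters):
        # Assign every point to its nearest current exemplar.
        assign = [min(range(len(medoids)),
                      key=lambda c: dist(p, points[medoids[c]]))
                  for p in points]
        updated = []
        for c, old in enumerate(medoids):
            members = [i for i, a in enumerate(assign) if a == c]
            if not members:  # keep the old exemplar for an empty cluster
                updated.append(old)
                continue
            updated.append(min(
                members,
                key=lambda i: sum(dist(points[i], points[j]) for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids

def memberships(points, medoids, beta=1.0):
    """Soft cluster memberships: one row per document, rows sum to 1."""
    G = []
    for p in points:
        w = [math.exp(-beta * dist(p, points[m])) for m in medoids]
        total = sum(w)
        G.append([x / total for x in w])
    return G

def embed(G, anchors):
    """Place each document at the membership-weighted mean of the 2-D anchors."""
    return [tuple(sum(g * a[d] for g, a in zip(row, anchors)) for d in range(2))
            for row in G]

# Toy "documents" as 3-D feature vectors, invented for illustration.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
          (10.0, 0.0, 0.0), (11.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
exemplars = k_medoids(points, medoids=[0, 3])
G = memberships(points, exemplars)
anchors = [(0.0, 0.0), (1.0, 0.0)]  # hypothetical screen positions of exemplars
coords = embed(G, anchors)
print(exemplars, coords)
```

Because every document's position is a mixture of a small number of exemplar positions, the layout cost grows with the number of exemplars rather than with all pairwise document distances, which is one plausible reading of how exemplars relate to the efficiency and overlap questions above.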
Experiments and Results
Visualization of 20,000 Medical Articles
Exemplar-based Visualization Demo
Reminder
Title: Exemplar-based Visualization of Large Scale Document Corpus
Session: Text Visualization
Time: 10:30am-12:10pm, Friday, 16 October