Paper Outline
Introduction
Visualizing text
Visualization transformations: from text to pictures
Examples from the MVAB Project
Conclusions and directions for future research and development
Introduction
Current visualization approaches
– For visualizing mostly structured and/or hierarchical information
Some research in information retrieval
– Utilized graph theory or figural display
– Information returned is documents in text form
    – Users still have to read
    – Causes a severe upper limit on how much text can be processed
Open Source digital information
– Available text overwhelms the traditional reading methods of inspection, sifting, and synthesis
Visualizing text
True text visualizations
– Must represent textual content and meaning without the user having to read it
– Result from content abstraction and spatialization of the text document
Use primarily the preattentive, parallel processing powers of visual perception
Goal is to spatially transform text information into a new visual representation
Visualization transformations: from text to pictures
Four important technical considerations
– Clear definition of text
    – What comprises text
    – How it can be distinguished from other symbolic representations
– Way to transform raw text into a different visual form
– Foundation for meaningful visualization
    – Suitable mathematical procedures and analytical measures
– A database management system
Processing Text
Requirements of a text processing engine
– Identification and extraction of text features
    – Frequency-based measures on words
    – Higher-order statistics taken on the words
    – Semantic in nature
– Efficient and flexible representation of documents in terms of these text features
– Support for information retrieval and visualization
    – Pre-processing, indexing
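The frequency-based measures listed above can be illustrated with a minimal sketch. It assumes the simplest possible feature, raw word counts over a shared vocabulary; the function names (`tokenize`, `build_vocabulary`, `term_frequency_vector`) and the sample documents are hypothetical and are not taken from the paper or from SPIRE.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Return a sorted list of every distinct word in the collection."""
    vocab = set()
    for doc in documents:
        vocab.update(tokenize(doc))
    return sorted(vocab)


def term_frequency_vector(text, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


# Hypothetical two-document collection, for illustration only.
docs = [
    "text visualization maps text to pictures",
    "clustering groups similar documents",
]
vocab = build_vocabulary(docs)
vectors = [term_frequency_vector(d, vocab) for d in docs]
```

Each document becomes one row of counts with one column per vocabulary word, which is the kind of feature-based representation the later slides treat as a vector.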
Visualizing output from text processing
Representing the document
– A vector in high-dimensional feature space
    – Comparisons, filters, and transformations can be applied
Clustering using the normalized document vectors
Projection
– Principal Components Analysis
– Multi-Dimensional Scaling
– Exponential order of complexity
Clustering in the high-dimensional feature space
Visualize the cluster centroids
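As a rough illustration of the normalization and projection steps, the sketch below scales document vectors to unit length and projects them onto their first two principal components. It is only a sketch under stated assumptions: the principal axes are found by plain power iteration with deflation (an illustrative choice, not necessarily what SPIRE uses), all function names are hypothetical, and the clustering step is not shown.

```python
import math


def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def center(rows):
    """Subtract the per-feature mean from every row."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    return [[r[j] - means[j] for j in range(dim)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows (needs at least 2 rows)."""
    n, dim = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
            for i in range(dim)]


def top_eigenvector(matrix, iterations=500):
    """Power iteration: repeatedly multiply by the matrix and renormalize."""
    vec = normalize([1.0 + 0.1 * i for i in range(len(matrix))])  # fixed start
    for _ in range(iterations):
        vec = normalize([dot(row, vec) for row in matrix])
    return vec


def pca_2d(rows):
    """Project rows onto their two leading principal axes."""
    centered = center(rows)
    cov = covariance(centered)
    dim = len(cov)
    axes = []
    for _ in range(2):
        axis = top_eigenvector(cov)
        eigenvalue = dot(axis, [dot(row, axis) for row in cov])
        axes.append(axis)
        # Deflation: remove the component just found so the next pass
        # converges to the next-largest axis.
        cov = [[cov[i][j] - eigenvalue * axis[i] * axis[j] for j in range(dim)]
               for i in range(dim)]
    return [[dot(row, axis) for axis in axes] for row in centered]


# Hypothetical word-count vectors for four documents.
counts = [[2, 0, 1, 0], [1, 1, 0, 0], [0, 0, 1, 2], [0, 1, 0, 2]]
unit_vectors = [normalize(v) for v in counts]
points = pca_2d(unit_vectors)  # one (x, y) pair per document
```

For large collections the same projection could be applied to cluster centroids rather than to every document, which keeps the cost of the projection manageable.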
Managing the representation
Two basic classes of data
– Raw text files
    – Static in nature, simple in structure
    – Easy to manage
– Visual forms of the text
    – Extensive and dynamic
Object-oriented database
– Flexibility of data representation
– Power of inheritance
– Ease of data access
Interface design for text visualization
Backdrop
– Central display resource
Workshop
– Grid having resizable windows to hold multiple views
Chronicle
– Area where views are placed and linked to form a visual story
Examples from the MVAB Project
MVAB
– Multidimensional Visualization and Advanced Browsing Project
– Visualization and analysis of textual information
– Showcased in SPIRE
SPIRE
– Spatial Paradigm for Information Retrieval and Exploration
Starfield and topographical map metaphors
– Galaxies and ThemeScapes
Galaxies
Displays cluster and document interrelatedness
2D scatterplot of ‘docupoints’
Simple point-and-click exploration
Sophisticated tools
– Facilitate more in-depth analysis
– E.g., temporal slicer
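The idea of a temporal slicer over a docupoint scatterplot can be sketched as a date filter. This is a guess at the behavior, assuming each docupoint carries a 2D position and a publication date; the `DocuPoint` type, its fields, and the sample data are hypothetical and do not describe the actual Galaxies tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DocuPoint:
    """One document in the scatterplot (hypothetical structure)."""
    doc_id: str
    x: float
    y: float
    published: date


def temporal_slice(points, start, end):
    """Return the docupoints dated within [start, end], both ends inclusive."""
    return [p for p in points if start <= p.published <= end]


# Hypothetical docupoints, for illustration only.
galaxy = [
    DocuPoint("a", 0.1, 0.9, date(1995, 1, 10)),
    DocuPoint("b", 0.4, 0.2, date(1995, 3, 2)),
    DocuPoint("c", 0.8, 0.5, date(1995, 6, 21)),
]
visible = temporal_slice(galaxy, date(1995, 1, 1), date(1995, 3, 31))
visible_ids = [p.doc_id for p in visible]
```

Redrawing only the visible points for successive date windows would show how clusters grow or fade over time.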
ThemeScapes
Abstract, 3D landscapes of information
Convey relevant information about topics or themes without the cognitive load of reading
Spatial relationships reveal the intricate interconnection of themes
ThemeScapes - Advantages
Displays much of the complex content of the document database
Utilizes innate human abilities for pattern recognition and spatial reasoning
Communicative invariance across levels of textual scale
Promote analysis
Conclusions
Text visualizations can overcome many of the user's limitations
– Enhanced insight and time savings (35 mins vs 2 weeks)
– Users get creative with the tool
Querying and analytical manipulation come together in a single visualization
– Permits different kinds of querying
Text visualizations will have to access and utilize the cognitive and visual processes
Directions for Future R & D
Visual data analysis
Elaborate the visual metaphors
Addition of sensory modalities
– Virtual interaction
My Favorite Sentence
The bottleneck in the human processing and understanding of information in large amounts of text can be overcome if the text is spatialized in a manner that takes advantage of common powers of perception.
Contributions
Explorations of new visualizations
Discussion of the process for mapping Raw Data Document collections into visualizations
Notes on the Reference
Designing Interaction: Psychology at the Human Computer Interaction
Interfaces Issues and Interaction Strategies for Information Retrieval Systems
Clustering and Dimensionality Reduction in SPIRE
Critique – Strengths and Weaknesses
Strengths
– Provide natural visual metaphors
– Enable the users to see the relationships between documents with minimal required reading
Weaknesses
– No validation of some conclusions
What has happened to this topic?
1996 R&D 100 Award
OCSB
– On-line Citation Searching and Browsing in UMD
"ThemeScape" is now a trademarked term of Cartia, Inc.
WebTheme™
– An interactive tool that provides a visual display of the common themes in collections of web-based documents
Document Lens
Why:
– Text too small to read, yet still needed to perceive patterns
– Perspective wall wastes corner areas of screen
What: General visualization technique based on a common strategy for understanding paper documents when their structure is not known.
How: 3D Visualization Tool For Large Rectangular Presentations
Document Lens Features
Lens – rectangular – interested in text that is mostly rectangular
Sides are elastic and pull the surrounding parts towards the lens creating a pyramid