Visualization of Multidimensional Multivariate Large Dataset Presented by: Zhijian Pan...
Date posted: 20-Dec-2015
Visualization of Multidimensional Multivariate Large Dataset
Presented by: Zhijian Pan
[email protected] (cs.umd.edu)
University of Maryland
Description
Covered papers:
– Alfred Inselberg, Multidimensional Detective
– Ted Mihalisin, Visualizing Multivariate Functions, Data, and Distributions
The problem:
• Visualization and analysis of large datasets with multiple parameters or factors, and the key relationships among them
• The MDMV (multidimensional multivariate) problem
Key words explanation
Multidimensional:
– The dimensionality of the independent variables
Multivariate:
– The dimensionality of the dependent variables
Example:
– A 3-D volume space plus temperature and pressure produces 3D2V data
The data set could be larger than the number of pixels
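
The nDmV naming can be illustrated with a small sketch. The grid sizes and the temperature and pressure formulas below are hypothetical, chosen only to show how the independent and dependent variables are counted.

```python
import numpy as np

# Three independent variables (the "3D" part): a small 3-D spatial grid.
x = np.linspace(0.0, 1.0, 4)
y = np.linspace(0.0, 1.0, 5)
z = np.linspace(0.0, 1.0, 6)
X, Y, Z = np.meshgrid(x, y, z, indexing="ij")

# Two dependent variables (the "2V" part). These formulas are made up.
temperature = 20.0 + 5.0 * X + 2.0 * Y
pressure = 101.3 - 10.0 * Z

n_dims = X.ndim                        # number of independent variables
n_variates = len([temperature, pressure])  # number of dependent variables
label = f"{n_dims}D{n_variates}V"
n_points = X.size                      # grid points that would need to be drawn

print(label, n_points)
```

Even this tiny grid has 120 points; realistic grids quickly outgrow the pixels available on a display.
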
Four Stages of Development
1st: Graphical representation of one- or two-variate data, e.g. scatterplot, scatterplot matrix
2nd: Two-dimensional graphics that encode multiple parameters, e.g. color, size, and shape coding
3rd: High-dimensional graphics with high-speed computation on a single display, such as Parallel Coords
4th: Elaboration and assessment of various visualization techniques
MDMV Visualization Category
Broadly categorized into five groups:
– Brushing
– Panel Matrix
– Iconography
– Hierarchical Displays
– Non-Cartesian Displays
Group 1
Brushing
– Direct manipulation of an MDMV visualization display: labeling, enhanced linking
– E.g. brushing a scatterplot matrix
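
A minimal sketch of brushing with linking, assuming the data sit in a NumPy array with one row per record. The data, the `brush` helper, and the rectangle bounds are hypothetical; the point is only that one selection mask is shared by every panel.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((200, 4))  # 200 hypothetical records, 4 variates in [0, 1)

def brush(data, col_x, col_y, x_range, y_range):
    """Return a boolean mask of the records inside the brushed rectangle."""
    x, y = data[:, col_x], data[:, col_y]
    return (
        (x >= x_range[0]) & (x <= x_range[1])
        & (y >= y_range[0]) & (y <= y_range[1])
    )

# Brush the panel that plots variate 0 against variate 1.
selected = brush(data, 0, 1, (0.2, 0.5), (0.4, 0.9))

# Linking: the same mask highlights the same records in any other panel,
# here the panel that plots variate 2 against variate 3.
linked_points = data[selected][:, [2, 3]]

print(int(selected.sum()), "records highlighted in every panel")
```
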
Group 2
Panel Matrix (pairwise 2-D plot, n-D box)
– E.g. Hyperbox: n*n lines, n*(n-1)/2 faces
– Elaboration of scatterplot matrix
– Adding interactive data navigation (hyperbox cutting)
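
The n*(n-1)/2 face count matches the number of unordered pairs of variables, which is also the number of distinct panels in a pairwise plot matrix. A short sketch, with hypothetical variable names:

```python
from itertools import combinations

def panel_pairs(variables):
    """List every unordered pair of variables, one pair per 2-D panel."""
    return list(combinations(variables, 2))

variables = ["a", "b", "c", "d"]  # hypothetical variable names
pairs = panel_pairs(variables)
n = len(variables)

# The pair count agrees with the n*(n-1)/2 formula.
assert len(pairs) == n * (n - 1) // 2
print(pairs)
```
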
Group 3
Iconography: Glyphs: graphical entities which encode MDMV data with shape, size, color, and position.
– E.g. faceglyph: size and position of eyes, nose, mouth; curvature of mouth; angle of eyebrows
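
As a rough illustration of glyph encoding, the sketch below maps a four-variate record to face-glyph parameters. The parameter names and output ranges are hypothetical and are not taken from any particular faceglyph implementation.

```python
def to_face_glyph(record, lows, highs):
    """Map a 4-variate record to face-glyph parameters (hypothetical ranges)."""
    # Normalize each variate to [0, 1] using its known low and high bounds.
    t = [(v - lo) / (hi - lo) for v, lo, hi in zip(record, lows, highs)]
    return {
        "eye_size": 0.05 + 0.10 * t[0],
        "nose_length": 0.10 + 0.20 * t[1],
        "mouth_curvature": -1.0 + 2.0 * t[2],     # -1 is a frown, +1 a smile
        "eyebrow_angle_deg": -30.0 + 60.0 * t[3],
    }

glyph = to_face_glyph([5.0, 0.0, 10.0, 2.5], lows=[0.0] * 4, highs=[10.0] * 4)
print(glyph)
```

Each record then becomes one small face, so differences between records show up as differences in expression.
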
Group 4
Hierarchical Displays:
– Map a subset of variates into different hierarchical displays
– Dynamic interactive analysis
– The Ted Mihalisin paper; more details follow
Group 4 (cont’d)
New term: speed = the order of the hierarchical axes
E.g. three variables x, y, and z, each taking values in {0,1,2}
x is the fastest axis, z the slowest axis
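
One way to read "fastest" and "slowest" is as digits of a mixed-radix counter: the fastest axis changes at every step along the horizontal axis, the slowest only after all faster axes have cycled. A small sketch under that reading (the `horizontal_index` helper is hypothetical):

```python
from itertools import product

values = [0, 1, 2]

# itertools.product varies its LAST argument fastest, so the slowest axis (z)
# is listed first and the fastest axis (x) last.
order = [(x, y, z) for z, y, x in product(values, values, values)]

def horizontal_index(x, y, z, base=3):
    """Position of (x, y, z) along the hierarchical axis, x fastest, z slowest."""
    return x + base * y + base * base * z

# The enumeration order and the index formula agree.
assert all(horizontal_index(*p) == i for i, p in enumerate(order))
print(order[:4])
```
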
Group 4 (cont’d)
Visualizing 3 variables:
– 2 independent variables: x, y:
• x = -2, -1, 0, 1, 2
• y = -2, -1, 0, 1, 2
– 1 dependent variable: z = x**2 + y**2
– so, a 2D1V problem
– x fastest, y slowest
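
The 25 values of z can be laid out along a single hierarchical axis with x fastest and y slowest. A sketch using NumPy, where row-major flattening happens to give exactly that order:

```python
import numpy as np

x = np.arange(-2, 3)   # -2, -1, 0, 1, 2
y = np.arange(-2, 3)

# With the default indexing="xy", rows follow y and columns follow x.
X, Y = np.meshgrid(x, y)
Z = X**2 + Y**2

# Row-major (C order) flattening walks along x first, then steps y.
sequence = Z.ravel()

print(sequence[:5])   # the five x values for y = -2
```

Plotted left to right, the sequence shows five small parabolas (one per y value) riding on a larger parabola in y.
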
Group 4 (cont’d)
3D1V: W = (x**2) * (e**-y) + z
• Top panel speed order: x, y, z
• Bottom panel speed order: z, y, x
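
Changing the speed order reorders the same set of W values along the horizontal axis, which is why the two panels look different. A sketch, assuming each variable is sampled at 0, 1, 2 (the slide does not give the sample values) and using a hypothetical helper `hierarchical_sequence`:

```python
import math
from itertools import product

def w(x, y, z):
    return (x ** 2) * math.exp(-y) + z

def hierarchical_sequence(func, grids, speed_order):
    """Evaluate func on the grid, with speed_order[0] varying fastest."""
    slow_to_fast = list(reversed(speed_order))
    seq = []
    # product varies its last argument fastest, so pass the axes slowest first.
    for combo in product(*(grids[name] for name in slow_to_fast)):
        point = dict(zip(slow_to_fast, combo))
        seq.append(func(**point))
    return seq

grids = {"x": [0, 1, 2], "y": [0, 1, 2], "z": [0, 1, 2]}  # assumed samples

top = hierarchical_sequence(w, grids, ["x", "y", "z"])     # x fastest
bottom = hierarchical_sequence(w, grids, ["z", "y", "x"])  # z fastest

print(top[:3], bottom[:3])
```

With x fastest the quadratic growth in x dominates the fine structure; with z fastest the fine structure is the linear ramp in z.
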
Group 4 (cont’d)
What if the number of data points greatly exceeds the number of horizontal pixels assigned to the panel?
Example: 7 independent variables, each with 10 values = 10,000,000 points
Need:
– hierarchical subspace zooming to reduce dimension
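
The arithmetic behind the example can be sketched as follows. This uses one simple reading of subspace zooming, namely fixing the slowest axes to single values; the 1000-pixel panel width and both helper functions are assumptions for illustration.

```python
def points_in_view(values_per_axis, n_axes, n_fixed):
    """Points left after fixing the n_fixed slowest axes to single values."""
    return values_per_axis ** (n_axes - n_fixed)

def axes_to_fix(values_per_axis, n_axes, panel_width_px):
    """Smallest number of slow axes to fix so each point gets at least a pixel."""
    n_fixed = 0
    while points_in_view(values_per_axis, n_axes, n_fixed) > panel_width_px:
        n_fixed += 1
    return n_fixed

total = points_in_view(10, 7, 0)                   # 10**7 points in the full space
needed = axes_to_fix(10, 7, panel_width_px=1000)   # 1000 px is an assumed width

print(total, needed, points_in_view(10, 7, needed))
```
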
Group 4 (cont’d)
Example: experiment data visualization:
– Dependent: specific heat
– Independent:
• Fastest: temperature (white): Gaussian peak
• Then alloy concentration (blue): linear increase
• Then magnetic field (red): nonlinear decrease
Group 5
Parallel Coordinates
– So many class presentations have already been done!
– Everybody is already an expert at using it
– What are some basic ideas behind it?
– Cartesian vs. Parallel Coords
Group 5 (cont’d)
A Cartesian line:
– L: x2 = m x1 + b
– A set of points sampled on this line
• On Parallel Coords:
– Each point becomes a line
– The set of points becomes a set of intersecting lines
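
The point-to-line duality can be checked numerically. The sketch assumes the two parallel axes sit at horizontal positions 0 and 1, so a point (x1, x2) becomes the segment from (0, x1) to (1, x2). For points on x2 = m x1 + b with m ≠ 1, all segments pass through the common point (1/(1-m), b/(1-m)). The slope and intercept values are arbitrary.

```python
def segment_y(x1, x2, t):
    """Height at horizontal position t of the segment from (0, x1) to (1, x2)."""
    return x1 + t * (x2 - x1)

m, b = -2.0, 3.0   # arbitrary example line x2 = m*x1 + b
points = [(x1, m * x1 + b) for x1 in (-1.0, 0.0, 1.0, 2.5)]

# Predicted common intersection point of all the segments.
t_star = 1.0 / (1.0 - m)
y_star = b / (1.0 - m)

# Every segment has the same height at t_star.
heights = [segment_y(x1, x2, t_star) for x1, x2 in points]
print(t_star, y_star, heights)
```
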
Group 5 (cont’d)
The intersection point:
The location of the intersection point is important!
– Between the two axes: inversely proportional (x1 ∝ 1/x2)
– Outside the two axes: directly proportional (x1 ∝ x2)
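
In terms of the line x2 = m x1 + b, and assuming the axes sit at horizontal positions 0 and 1, the segments meet at horizontal position 1/(1-m). A negative slope puts that point between the axes and a positive slope (other than 1) puts it outside. A sketch with hypothetical helper names:

```python
def intersection_position(m):
    """Horizontal position of the common intersection point (axes at 0 and 1)."""
    if m == 1:
        return None   # all segments are parallel: no finite intersection
    return 1.0 / (1.0 - m)

def describe(m):
    t = intersection_position(m)
    if t is None:
        return "parallel segments"
    if t == 1.0:
        return "on the second axis"   # m == 0: x2 is constant
    if 0.0 < t < 1.0:
        return "between the axes"     # m < 0: x2 falls as x1 rises
    return "outside the axes"         # m > 0: x2 rises with x1

for slope in (-2.0, 0.5, 3.0, 1.0, 0.0):
    print(slope, describe(slope))
```
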
Group 5 (cont’d)
Application example
– Aircraft collision checking
– Converting the problem into detecting a four-dimensional geometric intersection
– Collision at (2,2,2,1)
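
A hedged sketch of the underlying idea: two aircraft collide when their trajectories share a point in space and time, i.e. a point in four dimensions. The trajectories below are invented so that they meet at (2,2,2,1), and the coordinate order (x, y, z, t) is an assumption, since the slide does not state it. This is a direct algebraic check, not the parallel-coordinates construction from the paper.

```python
import numpy as np

def collision(a1, v1, a2, v2, tol=1e-9):
    """Return (x, y, z, t) where two straight-line trajectories meet, else None.

    Each trajectory is position(t) = a + v * t.
    """
    a1, v1, a2, v2 = (np.asarray(u, dtype=float) for u in (a1, v1, a2, v2))
    d = a2 - a1        # a1 + v1*t = a2 + v2*t  =>  (v1 - v2) * t = a2 - a1
    dv = v1 - v2
    denom = dv @ dv
    if denom < tol:
        return None    # same velocity: the gap between them never changes
    t = (d @ dv) / denom
    if t < 0 or np.linalg.norm(d - dv * t) > tol:
        return None    # closest approach is in the past or is not a hit
    return tuple(float(c) for c in a1 + v1 * t) + (float(t),)

# Hypothetical trajectories chosen to meet at (2, 2, 2) at time 1.
hit = collision((0, 0, 0), (2, 2, 2), (4, 2, 0), (-2, 0, 2))
miss = collision((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1))
print(hit, miss)
```
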
Group 5 (cont’d)
Application example:
– Economic model of a real country
– 8 variables:
• Agriculture
• Fishing
• Mining
• Manufacturing
• Construction
• Government
• Miscellaneous
• GNP
Group 5 (cont’d)
A Least Squares function defines the boundary region in 8-dimensional space
Any point (a polygonal line in parallel coordinates) inside the boundary represents a feasible economic policy for the country
Group 5 (cont’d)
Discoveries:
– No policy would favor Agriculture without also favoring Fishing: (x1 ∝ x2)
– Inverse relationship between Fishing and Mining, i.e. resource competition: (x1 ∝ 1/x2)
Notes on the References
Inselberg’s paper:
– 11 citations found on ResearchIndex
– Applications in knowledge discovery, user interface, aircraft design, etc.
Mihalisin’s paper:
– Only one citation found
Contribution
Inselberg’s paper:
– Transforms MDMV hyperspace relations into a 2-D geometric pattern problem
– Empirical studies demonstrated its strength in trade-off analysis, sensitivity discovery, and optimization
Mihalisin’s paper:
– Hierarchical technique for visualizing data points that greatly exceed the number of pixels
Critique
Inselberg’s paper:
– No comparison with other MDMV techniques
– No examples supporting the claim that displayed objects can be recognized under projective transformations
Mihalisin’s paper:
– Limited number of values for each variable visualized in one display
– No discussion of potential information loss with a coarse-grained grid
Favorite Sentence
“You can’t be unlucky all the time!”
– Multiple techniques exist for the MDMV visualization problem
– Each has strengths and weaknesses
– Whichever you start with, you can’t be unlucky all the time!
– Integration and collaboration of existing tools remain active research topics.