Dimension Reduction - Polo Club of Data Science
Transcript of lecture slides: poloclub.gatech.edu/cse6242/2015fall/slides/dimred...

Page 1: Dimension Reduction

Dimension Reduction
CS 6242, Ramakrishnan Kannan
Thanks: Prof. Jaegul Choo and Prof. Le Song

Page 2: What is Dimension Reduction?

[Figure: a d x n data matrix, with dimension index (d) as rows and data item index (n) as columns; columns are data items. Dimension reduction maps this high-dim data to low-dim data. How big is this? Why reduce it?]

Attribute = Feature = Variable = Dimension

Page 3: Image Data

Serialized/rasterized pixel values

[Figure: raw images → pixel values → serialized pixels; a grid of values such as 3, 80, 24, 58, 63, 45, ... is flattened into one long vector]

In a 4K (4096 x 2160) image there are about 8.8 million pixels in total.

Page 4: Video Data

Serialized/rasterized pixel values lead to huge dimensions:
- A 4096 x 2160 image → 8,847,360 dimensions
- At 30 fps, a 2-minute video (3,600 frames) generates a matrix of size 8,847,360 x 3,600

[Figure: raw images → pixel values → serialized pixels, one column per frame]
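The rasterization described above is just flattening each frame into one column of a data matrix. A minimal NumPy sketch (the frame sizes here are illustrative, not 4K):

```python
import numpy as np

# A toy "video": 5 frames of 4x3 grayscale pixels (illustrative sizes).
frames = [np.random.randint(0, 256, size=(4, 3)) for _ in range(5)]

# Rasterize: flatten each frame into a column vector, stack columns.
X = np.column_stack([f.ravel() for f in frames])
print(X.shape)  # (12, 5): d = 4*3 = 12 dimensions, n = 5 data items
```

For a real 4K clip the same code would produce a matrix with d = 8,847,360 rows, which is exactly why dimension reduction matters here.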

Page 5: Text Documents

Bag-of-words vector
- Document 1 = "Life of Pi won Oscar"
- Document 2 = "Life of Pi is also a book."

Vocabulary   Doc 1   Doc 2
Life           1       1
Pi             1       1
movies         0       0
also           0       1
oscar          1       0
book           0       1
won            1       0
...
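The table above can be produced with a few lines of plain Python; this is a minimal sketch (binary presence/absence, lowercased, punctuation stripped), not a full tokenizer:

```python
# A minimal bag-of-words sketch over the two example documents.
docs = ["Life of Pi won Oscar", "Life of Pi is also a book."]
vocab = ["life", "pi", "movies", "also", "oscar", "book", "won"]

def bow(doc, vocab):
    # Lowercase, strip trailing punctuation, mark vocabulary presence (0/1).
    words = {w.strip(".,").lower() for w in doc.split()}
    return [1 if term in words else 0 for term in vocab]

vectors = [bow(d, vocab) for d in docs]
print(vectors[0])  # Doc 1: [1, 1, 0, 0, 1, 0, 1]
print(vectors[1])  # Doc 2: [1, 1, 0, 1, 0, 1, 0]
```

With a realistic vocabulary of tens of thousands of terms, each document becomes a very high-dimensional, mostly-zero vector.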

Page 6: Two Axes of a Data Set

- Data items: how many data items?
- Dimensions: how many dimensions representing each item?

Data item index (n), dimension index (d)
Columns as data items vs. rows as data items
We will use columns-as-data-items during this lecture.

Page 7: Dimension Reduction

[Diagram: Dimension Reduction takes high-dim data (dimension index d x data item index n), the number of dimensions k (user-specified), additional info about the data, and other parameters; it outputs low-dim data (reduced dimension k x n) and a dim-reducing transformation for new data.]

Page 8: Benefits of Dimension Reduction

Obviously,
- Compression
- Visualization
- Faster computation (e.g., computing distances between 100,000-dim vs. 10-dim vectors)

More importantly,
- Noise removal (improving data quality)
  - Separates the data into general pattern + sparse + noise
  - Or is the "noise" the important signal?
- Works as pre-processing for better performance
  - e.g., microarray data analysis, information retrieval, face recognition, protein disorder prediction, network intrusion detection, document categorization, speech recognition

Page 9: Two Main Techniques

1. Feature selection
   - Selects a subset of the original variables as the reduced dimensions relevant for a particular task
   - e.g., the number of genes responsible for a particular disease may be small
2. Feature extraction
   - Each reduced dimension combines multiple original dimensions
   - The original dataset is transformed into a new set of values

Feature = Variable = Dimension

Page 10: Feature Selection

What is the optimal subset of m features to maximize a given criterion?

- Widely-used criteria: information gain, correlation, ...
- Typically combinatorial optimization problems, so greedy methods are popular:
  - Forward selection: empty set → add one variable at a time
  - Backward elimination: entire set → remove one variable at a time
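Greedy forward selection can be sketched as below. This is a minimal illustration, not the lecture's method: the scoring criterion (absolute correlation of the selected features' mean with a target) and the data are invented for demonstration.

```python
import numpy as np

def forward_selection(X, y, m):
    """Greedy forward selection: start from the empty set and repeatedly
    add the single feature that most improves the score, until m features
    are chosen. The score is illustrative: absolute correlation of the
    mean of the selected columns with the target y."""
    selected = []
    remaining = list(range(X.shape[1]))

    def score(cols):
        s = X[:, cols].mean(axis=1)
        return abs(np.corrcoef(s, y)[0, 1])

    while len(selected) < m and remaining:
        best = max(remaining, key=lambda j: score(selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
y = 2 * X[:, 2] + 0.1 * rng.normal(size=100)  # feature 2 drives y
print(forward_selection(X, y, 2))
```

Backward elimination is the mirror image: start from all features and greedily remove the one whose removal hurts the score least.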

Page 11: Feature Extraction

Page 12: Aspects of Dimension Reduction

- Linear vs. Nonlinear
- Unsupervised vs. Supervised
- Global vs. Local
- Feature vectors vs. Similarity (as an input)

Page 13: Linear vs. Nonlinear

Linear
- Represents each reduced dimension as a linear combination of original dimensions
- Of the form Y = AX + b, where A is a matrix and X, b are vectors
- e.g., Y1 = 3*X1 - 4*X2 + 0.3*X3 - 1.5*X4
        Y2 = 2*X1 + 3.2*X2 - X3 + 2*X4
- Naturally capable of mapping new data to the same space

Example (two data items D1, D2 reduced from 4 dimensions to 2):

       D1   D2                  D1     D2
X1      1    1        →    Y1  1.75  -0.27
X2      1    0             Y2 -0.21   0.58
X3      0    2
X4      1    1
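A linear mapping like the one above is just a matrix multiply. A small sketch (the projection matrix W below reuses the slide's example coefficients and is illustrative; it is not the matrix that produced the slide's Y values):

```python
import numpy as np

# Data matrix X: 4 original dimensions (rows) x 2 data items (columns).
X = np.array([[1, 1],
              [1, 0],
              [0, 2],
              [1, 1]], dtype=float)

# Each row of W defines one reduced dimension as a linear combination
# of the original four dimensions (coefficients from the slide's e.g.).
W = np.array([[3.0, -4.0,  0.3, -1.5],
              [2.0,  3.2, -1.0,  2.0]])

Y = W @ X  # low-dim data: 2 x 2
print(Y)
```

Because the mapping is just W, any new data item x can be reduced by the same multiplication W @ x, which is what "mapping new data to the same space" means.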

Page 14: Linear vs. Nonlinear

Linear
- Represents each reduced dimension as a linear combination of original dimensions
- e.g., Y1 = 3*X1 - 4*X2 + 0.3*X3 - 1.5*X4, Y2 = 2*X1 + 3.2*X2 - X3 + 2*X4
- Naturally capable of mapping new data to the same space

Nonlinear
- More complicated, but generally more powerful
- A recently popular topic

Page 15: Unsupervised vs. Supervised

Unsupervised: uses only the input data.

[Diagram: Dimension Reduction takes high-dim data, the number of dimensions, and other parameters, with no additional info about the data; it outputs low-dim data and a dim-reducing transformer for new data.]

Pages 16-17: Unsupervised vs. Supervised

Supervised: uses the input data + additional info, e.g., a grouping label.

[Diagram: the same pipeline, but the additional info about the data is now provided as an input to Dimension Reduction.]

Page 18: Global vs. Local

Dimension reduction typically tries to preserve all the relationships/distances in the data, but information loss is unavoidable! So what should we emphasize more?

Global
- Treats all pairwise distances as equally important
- Focuses on preserving large distances

Local
- Focuses on small distances and neighborhood relationships
- An active research area, e.g., manifold learning

Page 19: Feature vectors vs. Similarity (as an input)

Typical setup: feature vectors as the input.

[Diagram: Dimension Reduction takes high-dim data (dimension index d x data item index n), the number of dimensions k, additional info, and other parameters; it outputs low-dim data (reduced dimension k x n) and a dim-reducing transformer for new data.]

Page 20: Feature vectors vs. Similarity (as an input)

Alternatively, takes a similarity matrix instead of feature vectors:
- The (i, j)-th component indicates the similarity between the i-th and j-th data items
- Assuming the underlying distance is a metric, the similarity matrix is symmetric

Page 21: Feature vectors vs. Similarity (as an input)

Many methods internally convert feature vectors to a similarity matrix before performing dimension reduction:

High-dim data (d x n) → Similarity matrix (n x n) → Dimension Reduction (graph embedding) → low-dim data (k x n)
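One common way to build that n x n similarity matrix from feature vectors is a Gaussian (RBF) kernel on pairwise Euclidean distances; this is a sketch of one such conversion, not the only one, and the bandwidth sigma is an illustrative choice:

```python
import numpy as np

def similarity_matrix(X, sigma=1.0):
    """X: d x n feature matrix (columns are data items).
    Returns the n x n similarity matrix S with
    S[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    diff = X[:, :, None] - X[:, None, :]   # d x n x n pairwise differences
    sqdist = (diff ** 2).sum(axis=0)       # n x n squared distances
    return np.exp(-sqdist / (2.0 * sigma ** 2))

X = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])   # d = 2 dimensions, n = 3 items
S = similarity_matrix(X)
print(S.shape)               # (3, 3)
print(np.allclose(S, S.T))   # True: Euclidean distance is a metric,
                             # so the similarity matrix is symmetric
```

Note how the result matches the slide's claim: because the distance is a metric, S is symmetric, and its diagonal is all ones (each item is maximally similar to itself).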

Page 22: Feature vectors vs. Similarity (as an input)

Why is it called graph embedding? A similarity matrix can be viewed as a graph where each similarity value is the weight of an edge between two data items.

Page 23: Methods

Traditional
- Principal component analysis (PCA)
- Multidimensional scaling (MDS)
- Linear discriminant analysis (LDA)

Advanced (nonlinear, kernelized, manifold learning)
- Isometric feature mapping (Isomap)

* Matlab codes are available at http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html

Page 24: Principal Component Analysis

- Finds the axes showing the largest variation and projects all points onto them (PC1, PC2, ...)
- Reduced dimensions are orthogonal
- Algorithm: eigendecomposition
- Pros: fast
- Cons: limited performance

Properties: Linear, Unsupervised, Global, Feature vectors

Image source: http://en.wikipedia.org/wiki/Principal_component_analysis

Page 25: PCA - Some Questions

Algorithm
1. Subtract the mean from the dataset (X - mu)
2. Form the covariance matrix (X - mu)'(X - mu)
3. Eigendecompose (e.g., via SVD) this covariance matrix to find the leading eigenvectors
4. Project the data X onto these leading eigenvectors, i.e., multiply

Key questions
- Why the covariance matrix? Can't we perform SVD on the original matrix?
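The four steps above can be sketched with NumPy. This is a minimal version for illustration; it uses rows as data items (the opposite of the lecture's columns-as-items convention) to match NumPy's habits:

```python
import numpy as np

def pca(X, k):
    """X: n x d data matrix (rows are data items).
    Returns the n x k projection onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                  # 1. subtract the mean
    C = Xc.T @ Xc / (len(X) - 1)             # 2. covariance matrix (d x d)
    eigvals, eigvecs = np.linalg.eigh(C)     # 3. eigendecomposition
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]  # leading eigenvectors
    return Xc @ top                          # 4. project (multiply)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
Y = pca(X, 2)
print(Y.shape)  # (200, 2)
```

The first output column captures the most variance, the second the most remaining variance, and so on, which is the "largest variation" property from the previous slide.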

Page 26: Multidimensional Scaling (MDS)

Main idea: tries to make the low-dim distances match the given (ideal) pairwise distances.

Metric MDS
- Preserves the given distance values

Nonmetric MDS
- When you only know/care about the ordering of distances
- Preserves only the orderings of the distance values

Algorithm: gradient-descent type
c.f. classical MDS is equivalent to PCA

Properties: Nonlinear, Unsupervised, Global, Similarity input
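Classical MDS (the variant noted above as equivalent to PCA) can be sketched in a few lines: double-center the squared-distance matrix and eigendecompose. A minimal version, assuming the input distances are Euclidean:

```python
import numpy as np

def classical_mds(D, k):
    """D: n x n matrix of pairwise Euclidean distances.
    Returns n x k coordinates whose distances approximate D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n     # centering matrix
    B = -0.5 * J @ (D ** 2) @ J             # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:k]     # top-k eigenpairs
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0))

# Points on a line embedded in 3D; the structure is intrinsically 1D.
X = np.array([[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]], dtype=float)
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
Y = classical_mds(D, 1)  # 1D embedding that preserves the spacing
```

Metric and nonmetric MDS instead optimize a stress function by gradient descent, trading this closed form for the ability to handle non-Euclidean or rank-only distance information.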

Page 27: Multidimensional Scaling

- Pros: widely used (works well in general)
- Cons: slow (an n-body problem)
  - Nonmetric MDS is even slower than metric MDS
- Fast algorithms are available:
  - Barnes-Hut algorithm
  - GPU-based implementations

Page 28: Linear Discriminant Analysis

What if clustering information is available? LDA tries to separate clusters by
- putting different clusters as far apart as possible
- making each cluster as compact as possible

[Figure: two candidate projections, (a) and (b)]

Page 29: Aspects of Dimension Reduction - Unsupervised vs. Supervised (recap)

Supervised: uses the input data + additional info, e.g., a grouping label.

Page 30: Linear Discriminant Analysis (LDA) vs. Principal Component Analysis

[Figure: 2D visualizations of a mixture of 7 Gaussians in 1000 dimensions, by linear discriminant analysis (supervised) and by principal component analysis (unsupervised)]

Page 31: LDA

Algorithm (two classes)
1. Compute the mean of each class and the global mean (mu_1, mu_2, mu)
2. Compute the within-class scatter matrix S_W
3. Compute the between-class scatter matrix S_B from the means
4. Find the leading eigenvectors of inv(S_W) * S_B

Question: why is inv(S_W) * S_B the solution?

(From CSCE 666 Pattern Analysis, Ricardo Gutierrez-Osuna, CSE@TAMU*)

LDA, C classes
- Fisher's LDA generalizes gracefully to C-class problems
- Instead of one projection $y$, we now seek $(C-1)$ projections $[y_1, y_2, \ldots, y_{C-1}]$ by means of $(C-1)$ projection vectors $w_i$, arranged by columns into a projection matrix $W = [w_1 | w_2 | \ldots | w_{C-1}]$:

  $y_i = w_i^T x \;\Rightarrow\; y = W^T x$

Derivation
- The within-class scatter generalizes as

  $S_W = \sum_{i=1}^{C} S_i$, where $S_i = \sum_{x \in \omega_i} (x - \mu_i)(x - \mu_i)^T$ and $\mu_i = \frac{1}{N_i} \sum_{x \in \omega_i} x$

- And the between-class scatter becomes

  $S_B = \sum_{i=1}^{C} N_i (\mu_i - \mu)(\mu_i - \mu)^T$, where $\mu = \frac{1}{N} \sum_{\forall x} x = \frac{1}{N} \sum_{i=1}^{C} N_i \mu_i$

- The matrix $S_T = S_B + S_W$ is called the total scatter

[Figure: three classes in the $(x_1, x_2)$ plane with per-class within-class scatters $S_{W1}, S_{W2}, S_{W3}$ and between-class scatters $S_{B1}, S_{B2}, S_{B3}$]

* http://research.cs.tamu.edu/prism/lectures/pr/pr_l10.pdf
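The scatter matrices above translate directly into code. A minimal two-class sketch with NumPy (the data are invented for illustration; a real implementation would also handle singular S_W, e.g., via regularization):

```python
import numpy as np

def lda_direction(X, labels):
    """X: n x d data (rows are items), labels are class ids.
    Returns the leading eigenvector of inv(S_W) @ S_B."""
    mu = X.mean(axis=0)
    d = X.shape[1]
    S_W = np.zeros((d, d))
    S_B = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]
        mu_c = Xc.mean(axis=0)
        S_W += (Xc - mu_c).T @ (Xc - mu_c)      # within-class scatter S_i
        diff = (mu_c - mu)[:, None]
        S_B += len(Xc) * (diff @ diff.T)        # between-class scatter term
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
    return np.real(eigvecs[:, np.argmax(np.real(eigvals))])

rng = np.random.default_rng(2)
X = np.vstack([rng.normal([0, 0], 0.5, (50, 2)),
               rng.normal([3, 3], 0.5, (50, 2))])
labels = np.array([0] * 50 + [1] * 50)
w = lda_direction(X, labels)
print(w.shape)  # (2,)
```

Projecting the data onto w (y = X @ w) puts the two class means far apart relative to the within-class spread, which is exactly the criterion the eigenproblem maximizes.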

Page 32: Linear Discriminant Analysis

Maximally separates clusters by
- putting different clusters far apart
- shrinking each cluster compactly

- Algorithm: generalized eigendecomposition
- Pros: better shows the cluster structure
- Cons: may distort the original relationships in the data

Properties: Linear, Supervised, Global, Feature vectors

Page 33: Methods (recap)

Moving from the traditional methods (PCA, MDS, LDA) to the advanced ones (nonlinear, kernelized, manifold learning): isometric feature mapping (Isomap).

Pages 34-35: Manifold Learning - Swiss Roll Data

Swiss roll data: originally in 3D. What is the intrinsic dimensionality (allowing flattening)? → 2D

What if your data has low intrinsic dimensionality but resides in a high-dimensional space?

Page 36: Isomap (Isometric Feature Mapping)

Let's preserve pairwise geodesic distances (along the manifold):
1. Compute geodesic distances as shortest-path lengths on the k-nearest-neighbor (k-NN) graph
2. Eigendecompose* the pairwise geodesic distance matrix to obtain the embedding that best preserves the given distances

* Eigendecomposition is also the main algorithm of PCA

Page 37: Isomap (Isometric Feature Mapping)

- Algorithm: all-pairs shortest path computation + eigendecomposition
- Pros: performs well in general
- Cons: slow (shortest paths), sensitive to parameters

Properties: Nonlinear, Unsupervised, Global (all pairwise distances are considered), Feature vectors
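The two Isomap steps can be sketched end to end with NumPy alone: build a k-NN graph, run all-pairs shortest paths (Floyd-Warshall here, for brevity), then embed the geodesic distances with classical MDS. This is a toy version on a noiseless 1D curve; real data needs care with the choice of k and with disconnected graphs:

```python
import numpy as np

def isomap(X, k, n_components):
    """X: n x d points. Returns an n x n_components embedding."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # k-NN graph: keep distances to the k nearest neighbors, inf elsewhere.
    G = np.full((n, n), np.inf)
    for i in range(n):
        nn = np.argsort(D[i])[1:k + 1]
        G[i, nn] = D[i, nn]
        G[nn, i] = D[i, nn]                  # symmetrize the graph
    np.fill_diagonal(G, 0.0)
    # All-pairs shortest paths (Floyd-Warshall) = geodesic distances.
    for m in range(n):
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    # Classical MDS on the geodesic distance matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0))

# A 1D arc living in 2D; Isomap should unroll it into a line.
t = np.linspace(0, np.pi, 30)
X = np.column_stack([np.cos(t), np.sin(t)])
Y = isomap(X, k=2, n_components=1)
print(Y.shape)  # (30, 1)
```

The "slow" con on this slide is visible in the sketch: Floyd-Warshall is O(n^3), which is why practical implementations use Dijkstra from each point and landmark variants.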

Page 38: Practitioner's Guide - Caveats

Trustworthiness of dimension reduction results:
- Distortion/information loss in 2D/3D is inevitable
- The best result of a method may not align with what we want, e.g., a PCA visualization of facial image data

[Figure: PCA scatterplots of the same data on the (1, 2) dimensions vs. the (3, 4) dimensions]

Page 39: Practitioner's Guide - General Recommendation

- Want something simple and fast to visualize data? PCA, force-directed layout
- Want to first try a manifold learning method? Isomap; empirically, it tends to give the best result
- Have cluster labels to use (pre-given or computed)? LDA (supervised); a supervised approach is sometimes the only viable option when your data do not have clearly separable clusters
- No labels, but still want some clusters to be revealed? Or simply want a state-of-the-art method for visualization?

Page 40: Practitioner's Guide - Results Still Not Good?

Try various pre-processing steps:
- Data centering: subtract the global mean from each vector
- Normalization: make each vector have unit Euclidean norm; otherwise, a few outliers can affect dimension reduction significantly
- Application-specific pre-processing
  - Documents: TF-IDF weighting; remove terms that are too rare and/or too short
  - Images: histogram normalization
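The first two pre-processing steps above in NumPy, using the lecture's columns-as-data-items convention (the data here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(loc=5.0, size=(4, 10))   # d = 4 dimensions, n = 10 items

# Data centering: subtract the global mean vector from every column.
Xc = X - X.mean(axis=1, keepdims=True)

# Normalization: scale each column (data item) to unit Euclidean norm.
Xn = Xc / np.linalg.norm(Xc, axis=0, keepdims=True)

print(np.allclose(Xc.mean(axis=1), 0))             # centered
print(np.allclose(np.linalg.norm(Xn, axis=0), 1))  # unit norms
```

Centering matters because methods like PCA assume the data cloud sits at the origin; normalization keeps a single long vector (an outlier) from dominating the variance or distance structure.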

Page 41: Practitioner's Guide - Too Slow?

- Apply PCA to reduce to an intermediate dimension before the main dimension reduction step; the results may even be better due to the noise removed by PCA
- See if there is an approximate but faster version:
  - Landmark versions (only using a subset of data items), e.g., landmark Isomap
  - Linearized versions (the same criterion, but only allowing a linear mapping), e.g., Laplacian eigenmaps → locality preserving projection

Page 42: Practitioner's Guide - Still Need More?

Tweak dimension reduction for your own purpose:
- Play with its algorithm, convergence criteria, etc.
- See if you can impose label information
- Restrict the number of iterations to save computation time

The main purpose of dimension reduction is to serve us in exploring data and solving complicated real-world problems.

Page 43: Take Away

                PCA   MDS   LDA   Isomap
Supervised       ✖     ✖     ✔      ✖
Linear           ✔     ✖     ✔      ✖
Global           ✔     ✔     ✔      ✔
Feature input    ✔     ✖     ✔      ✔

Page 44: Useful Resources

- Tutorial on PCA: http://arxiv.org/pdf/1404.1100.pdf
- Tutorial on LDA: http://research.cs.tamu.edu/prism/lectures/pr/pr_l10.pdf
- Review article: http://www.iai.uni-bonn.de/~jz/dimensionality_reduction_a_comparative_review.pdf
- Matlab toolbox for dimension reduction: http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html
- Matlab manifold learning demo: http://www.math.ucla.edu/~wittman/mani/