
Dimension Reduction in R

Essex Summer School in Data Analysis

Lecture 3: Multidimensional Scaling

Dave Armstrong

Department of Political Science

University of Wisconsin - Milwaukee

e: [email protected]

w: http://www.quantoid.net/teachessex/dimension/

August 13, 2015

1 / 49

Dissimilarities

• Many data are inherently (or can be transformed into) dissimilarities or distances.

• Inherently: respondents can be asked explicitly to provide information about the (dis)similarity between stimuli. Or, the data are inherently distance-based (e.g., distances between European cities).

• We can calculate the profile dissimilarity among different observations based on existing data (e.g., how often legislators vote together).

• Multidimensional Scaling tries to estimate $\delta_{jm} = f(d_{jm})$, where $\delta_{jm}$ is the observed dissimilarity between point j and point m and $d_{jm}$ is the distance between point j and point m in the low-dimensional representation of the data.

More than other techniques, MDS is used to provide a visual map of the stimuli (so applications with more than 3 latent dimensions are rare).

2 / 49

Outline

Classical MDS
  Example: Metric MDS (Torgerson)
  Example: Metric MDS using SMACOF

Non-metric MDS
  Example: Non-metric MDS

Individual Differences Scaling
  Example: French Parties

3 / 49

Classical MDS

Torgerson’s solution for the simple metric MDS problem:

1. convert the similarities/dissimilarities to a symmetric q by q matrix of squared distances

2. double-center the matrix of squared distances to remove the squared terms

3. perform an eigenvalue/eigenvector decomposition of the double-centered matrix to recover the coordinates.
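
As a hedged illustration of the three steps, base R's cmdscale() carries out classical (Torgerson) scaling in one call. The sketch below applies it to the built-in eurodist road distances, a dataset chosen here for convenience and not part of the lecture materials.

```r
# Classical (Torgerson) MDS with base R.
# eurodist is a built-in 'dist' object of road distances between 21 European cities.
fit <- cmdscale(eurodist, k = 2, eig = TRUE)

coords <- fit$points      # 21 x 2 matrix of recovered coordinates
head(round(coords, 1))

# Goodness of fit: share of the eigenvalue mass captured by the two dimensions
fit$GOF
```

The orientation of the recovered axes is arbitrary, so a map drawn from coords may need to be reflected or rotated before it resembles a conventional map.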

4 / 49


Technical Details I

We assume there exists in the world a matrix Z (which we do not observe directly) such that stimuli are in the rows and dimensions are in the columns

$$Z = \begin{bmatrix}
z_{11} & z_{12} & \cdots & z_{1s} \\
z_{21} & z_{22} & \cdots & z_{2s} \\
\vdots & \vdots & \ddots & \vdots \\
z_{q1} & z_{q2} & \cdots & z_{qs}
\end{bmatrix} \tag{1}$$

5 / 49

Technical Details II

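The matrix Z gives rise to a set of squared inter-stimulus distances (which we observe)

$$D_z = \begin{bmatrix}
\sum_{k=1}^{s}(z_{1k}-z_{1k})^2 & \sum_{k=1}^{s}(z_{1k}-z_{2k})^2 & \cdots & \sum_{k=1}^{s}(z_{1k}-z_{qk})^2 \\
\sum_{k=1}^{s}(z_{2k}-z_{1k})^2 & \sum_{k=1}^{s}(z_{2k}-z_{2k})^2 & \cdots & \sum_{k=1}^{s}(z_{2k}-z_{qk})^2 \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{k=1}^{s}(z_{qk}-z_{1k})^2 & \sum_{k=1}^{s}(z_{qk}-z_{2k})^2 & \cdots & \sum_{k=1}^{s}(z_{qk}-z_{qk})^2
\end{bmatrix}$$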

6 / 49

Double-centering D

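• Let the mean of the jth column of $D_z$ be

$$d^2_{\cdot j} = \frac{\sum_{m=1}^{q} d^2_{mj}}{q} \tag{2}$$

• Let the mean of the mth row of $D_z$ be

$$d^2_{m \cdot} = \frac{\sum_{j=1}^{q} d^2_{mj}}{q} \tag{3}$$

• Let the mean of the matrix $D_z$ be

$$d^2_{\cdot\cdot} = \frac{\sum_{m=1}^{q}\sum_{j=1}^{q} d^2_{mj}}{q^2} \tag{4}$$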

7 / 49

Double-centering II

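The double-centered matrix is defined as:

$$y_{mj} = \frac{d^2_{mj} - d^2_{\cdot j} - d^2_{m \cdot} + d^2_{\cdot\cdot}}{-2} = \sum_{k=1}^{s}(z_{mk} - \bar{z}_k)(z_{jk} - \bar{z}_k)$$

We then have:

$$Y = Z^{*}Z^{*\prime}$$

$$Z^{*} = \begin{bmatrix}
z_{11} - \bar{z}_1 & z_{12} - \bar{z}_2 & \cdots & z_{1s} - \bar{z}_s \\
z_{21} - \bar{z}_1 & z_{22} - \bar{z}_2 & \cdots & z_{2s} - \bar{z}_s \\
\vdots & \vdots & \ddots & \vdots \\
z_{q1} - \bar{z}_1 & z_{q2} - \bar{z}_2 & \cdots & z_{qs} - \bar{z}_s
\end{bmatrix}$$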
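
The identity between the double-centered squared distances and the cross-products of the column-centered coordinates can be checked numerically. The following is a minimal sketch on simulated coordinates; the matrix Z here is hypothetical (6 stimuli, 2 dimensions), not data from the lecture.

```r
set.seed(123)
Z  <- matrix(rnorm(12), nrow = 6, ncol = 2)   # hypothetical coordinates
D2 <- as.matrix(dist(Z))^2                    # squared inter-stimulus distances
q  <- nrow(D2)

# y_mj = (d2_mj - column mean_j - row mean_m + grand mean) / (-2)
col_means <- matrix(colMeans(D2), q, q, byrow = TRUE)  # entry [m, j] = mean of column j
row_means <- matrix(rowMeans(D2), q, q)                # entry [m, j] = mean of row m
Y <- (D2 - col_means - row_means + mean(D2)) / -2

# Z*: Z with each column centered at zero
Zc <- scale(Z, center = TRUE, scale = FALSE)

# Y should equal Z* Z*' up to floating-point error
all.equal(Y, Zc %*% t(Zc), check.attributes = FALSE)
```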

8 / 49


Solving for Z

Assuming that $\bar{z}_1 = \bar{z}_2 = \ldots = \bar{z}_s = 0$, then $Z = Z^{*}$.

• An eigen decomposition of Y, the double-centered distance matrix, gives

$$Y = U \Lambda U'$$

• The estimate of Z is:

$$Z = U \Lambda^{1/2}$$
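
A small sketch of this recovery step, again on simulated (hypothetical) coordinates: the double-centering is written with the centering matrix J = I - 11'/q, which is an equivalent way of applying the row, column, and grand-mean corrections. The recovered configuration matches the original only up to rotation and reflection, so the check compares inter-point distances rather than coordinates.

```r
set.seed(7)
# Hypothetical centered configuration: 8 stimuli in 2 dimensions
Z  <- scale(matrix(rnorm(16), nrow = 8, ncol = 2), scale = FALSE)
D2 <- as.matrix(dist(Z))^2

q <- nrow(D2)
J <- diag(q) - matrix(1 / q, q, q)   # centering matrix
Y <- -0.5 * J %*% D2 %*% J           # double-centered matrix

e    <- eigen(Y, symmetric = TRUE)
Zhat <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))   # U Lambda^(1/2)

# Distances among the recovered points reproduce the original distances
all.equal(as.vector(dist(Zhat)), as.vector(dist(Z)))
```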

9 / 49


Example: National Similarities

Wish (1971) asked students to rate the similarity of nations from 1 (very different) to 9 (very similar).
> u <- url("http://www.quantoid.net/files/essex/nations.rda")
> load(u)
> close(u)
> nations[1:5, 1:5]

Brazil Congo Cuba Egypt France

Brazil 9.00 4.83 5.28 3.44 4.72

Congo 4.83 9.00 4.56 5.00 4.00

Cuba 5.28 4.56 9.00 5.17 4.11

Egypt 3.44 5.00 5.17 9.00 4.78

France 4.72 4.00 4.11 4.78 9.00

Since 9 represents maximum similarity, if we want a dissimilarities matrix, then we need to subtract the similarity values from some number ≥ 9.
> d <- (9-nations)^2

11 / 49

Creating the Double-centered Matrix

> doubleCenter <- function(x){
+   p <- dim(x)[1]
+   n <- dim(x)[2]
+   # subtract row means and column means, add back the grand mean, divide by -2
+   -(x - matrix(apply(x,1,mean), nrow=p, ncol=n) -
+     t(matrix(apply(x,2,mean), nrow=n, ncol=p)) + mean(x))/2
+ }
> D <- doubleCenter(d)
> ev <- eigen(D)

We can make the coordinates with the following function:
> makeCoords <- function(obj){
+   # coordinates are U * Lambda^(1/2) for the first two eigenpairs
+   x <- obj$vectors[,1]*sqrt(obj$values[1])
+   y <- obj$vectors[,2]*sqrt(obj$values[2])
+   coords <- cbind(x,y)
+   return(coords)
+ }
> coords <- makeCoords(ev)
> rownames(coords) <- rownames(D)

12 / 49


Plotting the solution

> coords[,2] <- -1*coords[,2]
> plot(coords[,1], coords[,2], pch=16, col="gray", font=2,
+   main="Torgerson Solution", xlab="", ylab="",
+   xlim=c(-6,6), ylim=c(-6,6), asp=1)
> text(coords[,1], coords[,2], rownames(nations),
+   pos=c(4,4,2,4,4,4,4,4,4,4,1,4), offset=0.35, col="black", font=2)
> abline(a=0, b=1, lwd=2, lty=2)
> abline(a=0, b=-1, lwd=2, lty=2)

[Figure: "Torgerson Solution", a two-dimensional configuration of the 12 nations (Brazil, Congo, Cuba, Egypt, France, India, Israel, Japan, China, USSR, USA, Yugoslavia); both axes run from −6 to 6.]

13 / 49

Different Optimization

What we showed above was:

• A completely different theoretical model from factor analysis and principal components.

• A solution that relied on the same technology: eigen decomposition.

We can move in a slightly different direction now and talk about majorization as a means for optimization

• Majorization takes a complicated function and, instead of minimizing it directly, minimizes a simpler set of auxiliary functions.

• In the case of MDS, it minimizes stress

$$\sigma(X) = \sum_{j<m} w_{jm}\,\bigl(d_{jm}(X) - \delta_{jm}\bigr)^2 \tag{5}$$

where the $w_{jm}$ form a q × q matrix of weights. Standard practice is to set $w_{jm}$ equal to 1 if $\delta_{jm}$ is observed and 0 if $\delta_{jm}$ is missing
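
To make the stress formula concrete, here is a minimal sketch of a raw stress calculation. The function name raw_stress() and its arguments are hypothetical helpers written for this illustration (they are not part of smacof), and the built-in eurodist distances stand in for the observed dissimilarities.

```r
# Raw stress: weighted sum of squared differences between configuration
# distances d_jm(X) and observed dissimilarities delta_jm over pairs j < m.
raw_stress <- function(conf, delta, w = NULL) {
  d     <- as.matrix(dist(conf))          # distances in the configuration
  delta <- as.matrix(delta)               # observed dissimilarities
  # default weights: 1 if delta is observed, 0 if it is missing
  if (is.null(w)) w <- ifelse(is.na(delta), 0, 1)
  keep <- lower.tri(delta) & w > 0        # each pair counted once
  sum(w[keep] * (d[keep] - delta[keep])^2)
}

conf   <- cmdscale(eurodist, k = 2)       # any 2-d configuration will do
s_full <- raw_stress(conf, eurodist)

# A missing dissimilarity gets weight 0 and drops out of the sum
delta_miss <- as.matrix(eurodist)
delta_miss[1, 2] <- delta_miss[2, 1] <- NA
s_miss <- raw_stress(conf, delta_miss)

c(full = s_full, one_missing = s_miss)
```

smacof reports a normalized version of stress, so the values from this sketch are not directly comparable to the stress component of a smacofSym() result.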

14 / 49


SMACOF in R

The smacof package in R performs this type of majorization. In particular, we use the smacofSym() function.¹

> library(smacof)
> smacof_metric_result <- smacofSym(delta=d, ndim=2, weightmat=NULL,
+   type="ratio", itmax=1000, eps=0.000001)
> conf <- smacof_metric_result$conf

> plot(conf[,1], conf[,2],
+   main="SMACOF (Metric) Solution", xlab="", ylab="",
+   xlim=c(-1,1), ylim=c(-1,1), asp=1, type="n")
> points(conf[,1], conf[,2], pch=16, col="gray")
> text(-0.9, 0.9, paste("Stress = ",
+   round(smacof_metric_result$stress,3)), font=2)
> text(conf[,1], conf[,2], rownames(nations),
+   pos=rep(4,nrow(nations)), offset=0.25, col="black", font=2)
> abline(a=0, b=-3/5, lwd=2, lty=2)
> abline(a=0, b=5/3, lwd=2, lty=2)

¹ Note: the code in the book no longer works due to changes in the arguments to smacofSym(). The code in the slides does work.

16 / 49


Configuration

[Figure: "SMACOF (Metric) Solution", a two-dimensional configuration of the 12 nations; Stress = 0.223; both axes run from −1.0 to 1.0.]

17 / 49

Comparison of Torgerson and SMACOF Results

The comparison is really about how closely related the inter-point distances are:

> cor(dist(conf), dist(coords))

[1] 0.9247277

> plot(dist(conf), dist(coords))

[Figure: scatterplot of dist(coords) against dist(conf).]

18 / 49

Scree Plot of Stress

> stressPlot <- function(obj, maxdim=5){
+   s <- NULL
+   for(i in 1:maxdim){
+     s <- c(s, update(obj, ndim=i)$stress)
+   }
+   plot(1:maxdim, s, type="o", pch=16,
+     xlab = "# Dimensions", ylab = "Stress")
+ }
> stressPlot(smacof_metric_result)

[Figure: scree plot of Stress against # Dimensions (1 to 5); the stress axis runs from 0.15 to 0.40.]

19 / 49

SMACOF with Missing Data

We can induce some missing data in the nations matrix as follows:
> d <- (9-nations)^2
> d[1,2] <- d[2,1] <- NA
> weightmat <- !is.na(d)
> d[is.na(d)] <- mean(d, na.rm=TRUE)

• Above, we make the weight matrix equal to 1 if d is observed and 0 otherwise.

• We replace the missing values in d with the mean of the observed values. Because their weights are 0, these filled-in values contribute nothing to the solution; they only allow smacofSym() to run.

20 / 49


Missing Results

> missing_result <- smacofSym(delta=d, ndim=2, weightmat=weightmat,
+   type="ratio", itmax=1000, eps=0.000001)
> miss.conf <- missing_result$conf
> plot(miss.conf[,1], miss.conf[,2],
+   main="SMACOF (Metric) Solution with Missing Value\nfor Brazil-Congo Dissimilarity",
+   xlab="", ylab="",
+   xlim=c(-1,1), ylim=c(-1,1), asp=1, type="n")
> points(miss.conf[,1], miss.conf[,2], pch=16, col="gray")
> text(-0.9, 0.9, paste("Stress = ",
+   round(missing_result$stress,3)), font=2)
> text(miss.conf[,1], miss.conf[,2], rownames(nations),
+   pos=rep(4,nrow(nations)), offset=0.25, col="black", font=2)
> abline(a=0, b=-3/5, lwd=2, lty=2)
> abline(a=0, b=5/3, lwd=2, lty=2)

21 / 49

Configuration

[Figure: "SMACOF (Metric) Solution with Missing Value for Brazil-Congo Dissimilarity", a two-dimensional configuration of the 12 nations; Stress = 0.221; both axes run from −1.0 to 1.0.]

22 / 49


Non-metric MDS

In metric MDS, we assumed that the inter-point distances contain interval-level information such that

• $\delta_{jm} = f(d_{jm})$

In non-metric MDS, we assume rather that

• if $\delta_{jm} < \delta_{kl}$ then $d_{jm} \le d_{kl}$

• This really only requires ordinal rather than interval-level input data.
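
The ordinal constraint is commonly imposed through monotone (isotonic) regression: the configuration distances are replaced by the closest values that respect the rank order of the dissimilarities. The sketch below uses base R's isoreg() and the built-in eurodist data as stand-ins; it illustrates the idea only and is not the routine smacof runs internally. The Stress-1 formula shown is Kruskal's normalization, stated here as an assumption about which stress variant is of interest.

```r
delta <- as.vector(eurodist)              # observed dissimilarities
conf  <- cmdscale(eurodist, k = 2)        # any 2-d configuration
d     <- as.vector(dist(conf))            # configuration distances

o    <- order(delta)                      # sort pairs by dissimilarity
dhat <- isoreg(delta[o], d[o])$yf         # disparities: monotone in delta

# Kruskal's Stress-1 for this configuration under the ordinal constraint
stress1 <- sqrt(sum((d[o] - dhat)^2) / sum(d^2))
stress1
```

Because the disparities only have to preserve order, they can follow the distances more closely than any fixed linear function of the dissimilarities, which is why non-metric solutions tend to report lower stress.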

24 / 49


Non-metric MDS: The Good and the Bad

Good:

• More flexibility in estimating configurations due to fewer constraints and, as a result,

• Lower-stress solutions (and don’t we all like lower stress).

Bad:

• Fewer constraints mean the procedure is more likely to produce locally optimal (though globally sub-optimal) or otherwise degenerate solutions.

• These degenerate solutions happen when the ratio (# Stimuli)/(# Dimensions) is low (Rabinowitz suggests it should be ≥ 4).

25 / 49

Non-metric MDS

> d <- (9-nations)^2
> smacof_nonmetric_result <- smacofSym(delta=d, ndim=2, type="ordinal")
> nm.conf <- smacof_nonmetric_result$conf

> plot(nm.conf[,1], nm.conf[,2], type="n", asp=1,
+   cex.lab=1.4, cex.axis=1.3, cex.main=1.6,
+   main="Non-Metric (SMACOF) Solution",
+   xlab="",
+   ylab="",
+   xlim=c(-1.0,1.0), ylim=c(-1.0,1.0), font=2)
> points(nm.conf[,1], nm.conf[,2], pch=16, col="gray", cex=1.1)
> text(-0.8, 0.9, paste("Stress = ",
+   round(smacof_nonmetric_result$stress,3)), font=2, cex=1.2)
> text(nm.conf[,1], nm.conf[,2], rownames(nations),
+   pos=c(4,4,4,4,4,4,4,4,4,4,4,2), offset=0.25, col="black", font=2, cex=1.1)
> abline(a=0, b=-3/5, lwd=2, lty=2)
> abline(a=0, b=5/3, lwd=2, lty=2)

26 / 49

Comparing Configurations

[Figure: Metric and Non-Metric (SMACOF) Multidimensional Scaling of Nations Similarities Data. Panel (a) Metric: "SMACOF (Metric) Solution", Stress = 0.223. Panel (b) Non-metric: "Non-Metric (SMACOF) Solution", Stress = 0.185. Both panels plot the 12 nations on axes running from −1.0 to 1.0.]

27 / 49

Rotation

The two configurations from above seem to be quite highly correlated.
> diag(cor(conf, nm.conf))

       D1        D2
0.9905024 0.9397823

> cor(dist(conf), dist(nm.conf))

[1] 0.9088776

To get the two configurations into maximal similarity, we could rotate one to be closer to the other.

• To do this, we need a rotation matrix. If we wanted to rotate the solution 30° clockwise, we could use:

$$A = \begin{bmatrix} \cos(-30^\circ) & -\sin(-30^\circ) \\ \sin(-30^\circ) & \cos(-30^\circ) \end{bmatrix}$$

28 / 49


Rotating the SMACOF solution

> cos.deg <- function(x)cos((x*pi)/180)
> sin.deg <- function(x)sin((x*pi)/180)
> A <- matrix(c(cos.deg(-30), sin.deg(-30),
+   -sin.deg(-30), cos.deg(-30)), nrow=2, ncol=2)
> rot.mds <- smacof_metric_result$conf %*% A

> library(lattice)
> plot.dat <- as.data.frame(rbind(conf, rot.mds))
> names(plot.dat) <- c("Dim1", "Dim2")
> plot.dat$config <- factor(rep(c(1,2), each=nrow(conf)),
+   labels = c("Raw", "Rotated"))
> trellis.par.set(superpose.symbol=list(col=c("gray33", "gray66"), pch=c(15,16)))
> xyplot(Dim2 ~ Dim1, groups=config, data=plot.dat,
+   auto.key=list(space="top"))

29 / 49

Figure

[Figure: Raw and Rotated metric SMACOF configurations, Dim2 plotted against Dim1.]

30 / 49

Procrustes Rotation

• In factor analysis, we used numerical criteria based on the configuration matrix itself to find an appropriate rotation.

• With MDS, when comparing solutions, we want to rotate one solution to bring it into maximal similarity with another.

• A Procrustes rotation does this by specifying a matrix to be rotated and a target toward which the input is to be fit.

31 / 49

Technical Details of Procrustes Rotation

1. Center the columns of $X$ (the target configuration) and $X^{*}$ (the matrix to be rotated) so they sum to zero.²

2. Calculate the product matrix $X'X^{*}$ and its singular value decomposition: $X'X^{*} = UDV'$

The optimal rotation matrix is $T = VU'$

The optimal dilation factor is $s = \operatorname{tr}(X'X^{*}T) / \operatorname{tr}(X^{*\prime}X^{*})$ ³

The optimal translation vector is $t = n^{-1}(X - sX^{*}T)'\mathbf{1}$ ⁴

A Procrustes rotation function exists in the MCMCpack package in R.

² This normalization is done as a matter of course by most MDS software, such as smacof.

³ Dilation is stretching or shrinking the dimension (i.e., changing the variance).

⁴ Translation is changing the centroid (point of means) of the space.
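
The rotation and dilation formulas can be traced in a few lines of base R. This is a minimal sketch on simulated data, not the MCMCpack implementation: the target X is a hypothetical centered configuration, and Xs is a copy of it that has been rotated by 30 degrees and shrunk by half, so the expected answers are known in advance. The rotation matrix is stored as Tm because T is shorthand for TRUE in R.

```r
set.seed(1)
# Hypothetical target configuration: 10 points in 2 dimensions, columns centered
X <- scale(matrix(rnorm(20), ncol = 2), scale = FALSE)

# Matrix to be rotated: X turned by 30 degrees and shrunk by a factor of 0.5
th <- pi / 6
R  <- matrix(c(cos(th), sin(th), -sin(th), cos(th)), nrow = 2, ncol = 2)
Xs <- 0.5 * X %*% R

sv <- svd(t(X) %*% Xs)            # X'X* = U D V'
Tm <- sv$v %*% t(sv$u)            # optimal rotation T = V U'
s  <- sum(diag(t(X) %*% Xs %*% Tm)) / sum(diag(t(Xs) %*% Xs))  # dilation

fit <- s * Xs %*% Tm              # Xs rotated and rescaled toward X

s                                                   # recovers the factor 2
all.equal(fit, X, check.attributes = FALSE)         # fit coincides with the target
```

Both matrices are already centered here, so the translation vector is zero and is omitted from the sketch.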

32 / 49


Rotating the Non-metric Solution

> library(MCMCpack)
> rot.nm <- procrustes(nm.conf, conf, translation=FALSE, dilation=TRUE)
> # correlation between raw n-m and m configurations
> diag(cor(nm.conf, conf))

       D1        D2
0.9905024 0.9397823

> # correlation between rotated n-m and m configurations
> diag(cor(rot.nm$X.new, conf))

[1] 0.9905579 0.9398426

The correlations are pretty similar, which suggests that there is not much rotation.
> acos.deg <- function(x)acos(x)*180/pi
> acos.deg(rot.nm$R[1,1])

[1] 0.4311835

The rotation is around 0.43°.

33 / 49

MDS with Agreement Matrix

We can do MDS with an agreement matrix as well (e.g., how often do legislators vote together).

• Sometimes these matrices exist already, but we can make them from scratch.

• We’ll use data from the 112th US Senate.⁵

> library(foreign)
> dat <- read.dta("http://www.quantoid.net/files/essex/sen112kh.dta")
> votes <- as.matrix(dat[,-c(1:9)])
> votes[which(votes %in% c(1:3), arr.ind=T)] <- 1
> votes[which(votes %in% c(0, 7:9), arr.ind=T)] <- 9
> votes[which(votes %in% c(4:6), arr.ind=T)] <- 0
> votes[which(votes == 9, arr.ind=T)] <- NA
> D <- dist(votes)^2
> w <- D
> w[which(!is.na(w))] <- 1
> w[which(is.na(w))] <- 0
> D[which(is.na(D))] <- .5
> mds.agree <- smacofSym(D, ndim=2, weightmat = w)

⁵ See http://voteview.com/senate112.htm for a discussion of the data.

34 / 49

Visualizing the Solution

> library(car)
> # as.factor (as.factor.result in older versions of car) keeps the labels as characters
> pch.party <- recode(dat$party, "100='D'; 200='R'; 328='I'",
+   as.factor=FALSE)
> agree.conf <- mds.agree$conf
> agree.conf[,1] <- -agree.conf[,1]
> plot(agree.conf, pch=pch.party)

[Figure: two-dimensional configuration of senators, each plotted with a party letter (D, R, or I); horizontal axis D1 from −1.0 to 1.0, vertical axis D2 from −0.4 to 0.6.]

35 / 49

Exercise

Use the smacofSym() function to perform metric and non-metric MDS on the SupremeCourt.2011 agreement score matrix.

> u <- url("http://www.quantoid.net/files/essex/supremecourt.2011.rda")
> load(u)
> close(u)

1. Plot the metric and non-metric configurations of the Justices side-by-side, clearly labeling the Justices.

1.1 Next, perform a Procrustes rotation of the non-metric configuration using the metric configuration as the target. Provide the code used.

1.2 Plot the metric and rotated non-metric configurations side-by-side.

1.3 Why does non-metric MDS place Scalia and Thomas at virtually identical locations while metric MDS does not?

2. Report the stress values from metric and non-metric scaling in one and two dimensions.

3. Based upon your interpretation of the substantive meaning of the point configurations and the stress values, what does the first dimension represent and is there a meaningful second dimension to this data?

36 / 49


Bootstrapping MDS Solutions

1. Draw a dataset of size n × k, with replacement, from the non-scaled dimension (in this case, votes), R times.

2. For each draw above, calculate the dissimilarities and then run the MDS routine.

3. Use Procrustes rotation to make sure that each bootstrapped point configuration is in maximal geometric agreement with the original configuration.

4. Use the bootstrapped, rotated configurations to generate a variance-covariance matrix that can be used to plot confidence ellipses.
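
The four steps can be sketched in base R as follows. Everything in this block is an assumption made for illustration: the vote matrix is simulated, cmdscale() is swapped in for the smacof routine, and procrustes_fit() and mds2() are hypothetical helpers rather than functions from any package.

```r
set.seed(42)
n_leg <- 10; n_votes <- 60

# Simulated roll calls: a legislator votes 1 when a noisy ideal point exceeds the cut point
ideal <- seq(-1, 1, length.out = n_leg)
cut   <- runif(n_votes, -1, 1)
votes <- outer(ideal, cut,
               function(x, cp) as.numeric(x + rnorm(length(x), sd = 0.3) > cp))

# Rotate Xs toward the target X (rotation only: T = V U' from svd of X'Xs)
procrustes_fit <- function(Xs, X) {
  sv <- svd(t(X) %*% Xs)
  Xs %*% (sv$v %*% t(sv$u))
}

# Step 2 helper: dissimilarities among legislators, then a centered 2-d MDS solution
mds2 <- function(v) scale(cmdscale(dist(v), k = 2), scale = FALSE)

base   <- mds2(votes)                      # original configuration
n_boot <- 50
boot   <- array(NA_real_, dim = c(n_leg, 2, n_boot))
for (r in seq_len(n_boot)) {
  cols <- sample(n_votes, replace = TRUE)                       # step 1: resample votes
  boot[, , r] <- procrustes_fit(mds2(votes[, cols]), base)      # steps 2 and 3
}

# Step 4: one 2 x 2 variance-covariance matrix per legislator
covs <- lapply(seq_len(n_leg), function(i) cov(t(boot[i, , ])))
covs[[1]]
```

Each element of covs could then be passed to an ellipse-drawing routine to display the uncertainty around the corresponding point.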

37 / 49

Bootstrapped MDS in R

> library(bsmds)
> votes <- votes[-c(24,57), ]
> lopsided <- apply(votes, 2, function(x)mean(x == 1, na.rm=TRUE))
> outs <- which(lopsided > .95 | lopsided < .05)
> votes <- votes[,-outs]
> out <- bsmds(t(votes), dist.fun = "dist", R=50)

38 / 49

Plotting the Bootstrapped Configuration

> plot(out, id="none", key.side="none")

[Figure: bootstrapped MDS configuration; horizontal axis "MDS Axis 1" from −1.0 to 1.5, vertical axis "MDS Axis 2" from −0.5 to 0.5.]

39 / 49

Exercise: Bootstrapping MDS

Using the dataset on Supreme Court votes, use Bootstrapped MDS to get stability estimates of the MDS stimuli configurations.
> u <- url("http://www.quantoid.net/files/essex/supreme_court_2010-2013.rda")
> load(u)
> close(u)

The file contains data on 316 cases voted on by the current 9 Supreme Court justices. They are scored 0 for conservative and 1 for liberal decisions.

• Perform the BS MDS

• Plot the configuration with confidence ellipses.

40 / 49


Individual Differences Scaling

Individual Differences Scaling operates as follows:

• Use individual data on dissimilarities (distances) to generate a low-dimensional map for each individual.

• Generate weights that map each individual’s configuration onto a global configuration.

• These weights increase variance on important dimensions and decrease variance on less important dimensions.

• This is a more flexible way of aggregating over individuals as it does not assume each individual uses each evaluative criterion in exactly the same way.

The input to the individual differences scaling algorithm is then a list of dissimilarity matrices.

42 / 49

INDSCAL

In the INDSCAL problem, we are given n (i = 1, ..., n) matrices of the form

$$D^{*}_{z_i} = D_{z_i} + E_i = \operatorname{diag}(Z W_i Z')\,J_q' - 2\,Z W_i Z' + J_q \operatorname{diag}(Z W_i Z')' + E_i \tag{6}$$

• q (j = 1, ..., q) is the number of stimuli

• we are asked to find the q by s matrix Z (where s is the number of dimensions, k = 1, ..., s) and the n diagonal matrices $W_i$ such that:

$$Y^{*}_i = Z W_i Z' + E^{*}_i \tag{7}$$

The algorithm distinguishes Z from Z' in the following way:

$$Y^{*}_i = Z^{(L)} W_i Z^{(R)\prime} + E^{*}_i \tag{8}$$
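
The error-free version of equation (7) can be verified numerically. In the sketch below, the group space Z and the weight matrix W are hypothetical; one individual's private space is formed by stretching the first dimension and shrinking the second, and the double-centered squared distances from that private space are compared with Z W Z'.

```r
set.seed(99)
# Hypothetical group space: 5 stimuli in 2 dimensions, columns centered
Z <- scale(matrix(rnorm(10), nrow = 5, ncol = 2), scale = FALSE)

# Hypothetical weights for one individual: dimension 1 matters more than dimension 2
W  <- diag(c(2, 0.5))
Zi <- Z %*% sqrt(W)                  # the individual's private space, Z W^(1/2)

D2 <- as.matrix(dist(Zi))^2          # the individual's squared distances
q  <- nrow(D2)
J  <- diag(q) - matrix(1 / q, q, q)  # centering matrix
Yi <- -0.5 * J %*% D2 %*% J          # double-centered matrix for this individual

# With no error term, Y_i equals Z W_i Z'
all.equal(Yi, Z %*% W %*% t(Z), check.attributes = FALSE)
```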

43 / 49

INDSCAL Algorithm

1. Double center the n q by q symmetric matrices of squared distances to obtain the n $Y^{*}_i$ matrices.

2. Obtain starting estimates of $\hat{Z}^{(L)}$ and $\hat{Z}^{(R)}$ from an eigen decomposition of $\bar{Y}^{*}$.

3. Use $\hat{Z}^{(L)}$ and $\hat{Z}^{(R)}$ to construct G (cross-products in Z).

4. Run OLS to obtain estimates of the $\hat{W}_i$.

5. The $\hat{W}_i$ and $\hat{Z}^{(R)}$ are used to construct [...]$_i$.

6. The $\hat{W}_i$ and $\hat{Z}^{(L)}$ are used to construct [...]$_i$.

7. Go to Step 3.

8. Repeat Steps 3-5 until convergence.

44 / 49


French Parties

> u <- url("http://www.quantoid.net/files/essex/french.parties.individuals.rda")
> load(u)
> close(u)
> fi <- na.omit(french.parties.individuals)
> parties <- lapply(1:50, function(x)dist(t(fi[x,]))^2 + .001)
> indscal.result.2dim <- smacofIndDiff(delta=parties, ndim=2,
+   constraint="indscal")

46 / 49

Plot of the Configuration

> plot.indscal(indscal.result.2dim)

[Figure: INDSCAL group configuration of the French parties (extremeleft, leftparty, communist, socialist, greens, udfbayrou, umpsarkozy, nationalfront); axes "First Dimension" and "Second Dimension", both from −1.0 to 1.0.]

47 / 49

Plot of Weights

> w <- t(sapply(indscal.result.2dim$cweights, diag))
> plot(w)

[Figure: scatterplot of the individual weights; horizontal axis D1 from 0.0 to 1.2, vertical axis D2 from 0 to 8.]

48 / 49


Exercise

The 2010 Chapel Hill Expert Survey (CHES) asked expert informants to place European parties on three 11-point scales: general left-right (leftright), economic left-right (econlr), and social/cultural left-right (galtan). CHES2010.France is a list that includes the raw placements of six French parties by seven experts (CHES2010.France$lr.placements) and dissimilarity matrices for each expert constructed from the sum of the absolute distances between the parties across the three scales (CHES2010.France$dissimilarity.matrices).

1. Use the smacofIndDiff() function to run Individual Differences Scaling (INDSCAL) on CHES2010.France$dissimilarity.matrices in two dimensions with the indscal constraint.

2. Which respondent has the most stress? Which respondent has the least stress?

3. Which party has the most stress? Which party has the least stress?

4. Plot the group configuration, clearly labeling the party names.

49 / 49