Dimension Reduction Methods

  • statistical methods that provide information about point scatters in multivariate space; "factor analytic methods" simplify complex relationships between cases and/or variables, making it easier to recognize patterns

  • How? Identify and describe dimensions that underlie the input data; these may be more fundamental than those directly measured, yet hidden from view. Reduce the dimensionality of the research problem; the benefit is simplification: fewer variables to worry about, and identification of sets of variables with similar behaviour.

  • Basic ideas: imagine a point scatter in multivariate space. The specific values of the numbers used to describe the variables don't matter; we can do anything we want to the numbers, provided we don't distort the spatial relationships that exist among cases. Some kinds of manipulations help us think about the shape of the scatter in more productive ways.

  • imagine a two-dimensional scatter of points that shows a high degree of correlation; a line fit through the long axis of the cloud by orthogonal regression becomes the first new axis

  • Why bother? More efficient description: the 1st new axis captures the maximum variance; the 2nd captures the maximum amount of residual variance, at right angles (orthogonal) to the first. The 1st axis may capture so much of the information content in the original data set that we can ignore the remaining axes.
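A minimal numpy sketch of that idea: the orthogonal-regression line through a correlated two-dimensional scatter is the first eigenvector of its covariance matrix. The data here are simulated stand-ins, not from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated two-dimensional scatter with a high degree of correlation.
x = rng.normal(size=200)
pts = np.column_stack([x, 0.8 * x + rng.normal(scale=0.3, size=200)])

# Centre the scatter and eigendecompose its covariance matrix.
centred = pts - pts.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(centred.T))

# eigh returns eigenvalues in ascending order, so the last column is the
# first principal axis: the orthogonal-regression line through the cloud.
first_axis = eigvecs[:, -1]
print("share of variance on 1st axis:", eigvals[-1] / eigvals.sum())
```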

  • other advantages: you can score the original cases (and variables) in the new space and plot them; spatial arrangements may reveal relationships that were hidden in the higher-dimensional space, and may reveal subsets of variables based on their correlations with the new axes

  • [figure: cases plotted on new axes labelled "size" and "shape"]

  • [figure: cases plotted on axes contrasting PUBLIC vs. PRIVATE and DOMESTIC vs. RITUAL]

  • Principal Components Analysis (PCA). Why: clarify relationships among variables and among cases. When: significant correlations exist among variables. How: define new axes (components); examine the correlations between axes and variables; find the scores of cases on the new axes.
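A minimal scikit-learn sketch of those three steps on hypothetical data (the matrix X and all names are illustrative, not from the lecture): define the components, get case scores, and read loadings off the component/variable correlations.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data matrix: rows = cases, columns = measured variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
X[:, 1] += X[:, 0]                   # induce a correlation among variables

# Standardize, then define the new axes (components).
Z = (X - X.mean(axis=0)) / X.std(axis=0)
pca = PCA()
scores = pca.fit_transform(Z)        # scores of cases on the new axes

# Loadings: correlations between the original variables and each component.
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)
print(pca.explained_variance_ratio_) # proportion of variance per component
```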

  • [figure: example scatters illustrating r = 1, r = 0, and r = -1]

  • eigenvalue: the sum of all squared loadings on one component

  • eigenvalues: the sum of all eigenvalues = 100% of the variance in the original data; the proportion accounted for by each eigenvalue = ev/n (n = # of vars.). With a correlation matrix the variance in each variable = 1, so if an eigenvalue < 1 it explains less variance than one of the original variables (but 0.7 may be a better threshold). Scree plots show the trade-off between loss of information and simplification.
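The arithmetic in numpy, on hypothetical data; it also checks the definition above, that an eigenvalue is the sum of the squared loadings on one component.

```python
import numpy as np

# Hypothetical standardized data: rows = cases, columns = variables.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))
X[:, 2] += X[:, 1]                      # add some shared variance

# Eigendecomposition of the correlation matrix; the eigenvalues sum to n
# (the number of variables), so eigenvalue / n = proportion of variance.
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # descending order
print("proportions:", eigvals / len(eigvals))

# Loadings are eigenvectors scaled by sqrt(eigenvalue); summing the
# squared loadings down one component recovers its eigenvalue.
loadings = eigvecs * np.sqrt(eigvals)
print(np.allclose((loadings**2).sum(axis=0), eigvals))   # True

# Kaiser criterion: keep components whose eigenvalue exceeds 1.
print("components kept:", int((eigvals > 1).sum()))
```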

  • Mandara Region knife morphology

  • J. Yellen, San ethnoarchaeology (1977). CAMP: the camp identification number (1-16). LENGTH: the total number of days the camp was occupied. INDIVID: the number of individuals in the principal period of occupation of the camp (note that not all individuals were at the camp for the entire LENGTH of occupation). FAMILY: the number of families occupying the site. ALS: the absolute limit of scatter; the total area (m²) over which debris was scattered. BONE: the number of animal bone fragments recovered from the site. PERS_DAY: the actual number of person-days of occupation (not the product of INDIVID * LENGTH, since not all individuals were at the camp for the entire time).

  • Correspondence Analysis (CA): like a special case of PCA; transforms a table of numerical data into a graphic summary, hopefully a simplified, more interpretable display giving a deeper understanding of the fundamental relationships/structure inherent in the data: a map of basic relationships, with much of the noise eliminated. Usually reduces the dimensionality of the data.

  • CA basic ideas: derived from methods of contingency table analysis; most suited to the analysis of categorical data (counts, presence/absence data). It is possibly better to use PCA for continuous (i.e., ratio) data, but CA makes no assumptions about the distribution of the input variables.

  • simultaneously R-mode and Q-mode analysis: derives two sets of eigenvalues and eigenvectors (CA axes, analogous to PCA components); the input data are scaled so that both sets of eigenvectors occupy very comparable spaces, so you can reasonably compare both variables and cases in the same plots

  • CA output: CA (factor) scores, for both cases and variables; the percentage of total inertia per axis (like variance in PCA; relates to the dispersal of points around an average value; inertia not accounted for by the displayed axes shows up as distortion in a graphic display); loadings: correlations between rows/columns and axes (which of the original entities are best accounted for by which axis?)

  • mass: as in PCA, the new axes maximize the spread of observations in rows/columns, but spread is measured as inertia, not variance; it is based on a chi-squared distance, and is assessed separately for cases and variables (rows and columns). Contributions to the definition of CA axes are weighted on the basis of row/column totals (e.g., pottery counts from different assemblages: larger collections will have more influence than smaller ones).
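A minimal numpy sketch of the CA machinery (the contingency table is invented for illustration): mass-weighted chi-squared residuals, then one SVD that yields comparable scores for rows and columns at once.

```python
import numpy as np

# Hypothetical contingency table: rows = assemblages, columns = pottery types.
N = np.array([[30.0, 10.0,  5.0],
              [12.0, 25.0,  8.0],
              [ 6.0,  9.0, 20.0]])

P = N / N.sum()                  # correspondence matrix
r = P.sum(axis=1)                # row masses
c = P.sum(axis=0)                # column masses

# Chi-squared standardized residuals, weighted by mass: rows/columns with
# larger totals contribute more to the definition of the axes.
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

# One SVD yields both sets of axes (simultaneous R- and Q-mode analysis).
U, sv, Vt = np.linalg.svd(S, full_matrices=False)
inertia = sv**2                  # the "variance" of each CA axis
print("inertia per axis:", inertia / inertia.sum())

# Principal coordinates: scores of rows (cases) and columns (variables)
# in the same comparable space, ready to plot together.
row_scores = (U * sv) / np.sqrt(r)[:, None]
col_scores = (Vt.T * sv) / np.sqrt(c)[:, None]
```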

  • Israeli political economic concerns; residential codes: As/Af (Asia or Africa); Eu/Am (Europe or America); Is/AA (Israel, dad lives in Asia or Africa); Is/EA (Israel, dad lives in Europe or America); Is/Is (Israel, dad lives in Israel)

  • Israeli political economic concerns; worry codes: ENR (enlisted relative), SAB (sabotage), MIL (military situation), POL (political situation), ECO (economic situation), OTH (other), MTO (more than one worry), PER (personal economics)

  • Ksar Akil, Upper Palaeolithic, Lebanon

  • menu path: Data > Frequency > COUNT, then Statistics > Data Reduction > CA

  • Multidimensional Scaling (MDS). Aim: define a low-dimensional space that preserves the distances between cases in the original high-dimensional space. Closely related to CA/PCA, but uses an iterative location-shifting procedure; may produce a lower-dimensional solution than CA/PCA; not simultaneously Q- and R-mode.

  • non-metric MDS vs. metric MDS

  • Shepard Diagram
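A sketch with scikit-learn's MDS (assumes a recent scikit-learn; the data are invented): a non-metric fit from a precomputed distance matrix, plus the distance pairs that would populate a Shepard diagram.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

# Hypothetical high-dimensional cases: rows = cases, columns = variables.
rng = np.random.default_rng(3)
X = rng.normal(size=(30, 8))
D = squareform(pdist(X))         # original high-dimensional distances

# Non-metric MDS iteratively shifts point locations in 2-D to preserve the
# rank order of the original distances (metric=True preserves the
# distances themselves).
mds = MDS(n_components=2, metric=False, dissimilarity="precomputed",
          normalized_stress="auto", random_state=0)
coords = mds.fit_transform(D)

# Shepard diagram: original dissimilarities vs. distances in the
# low-dimensional configuration; a tight monotone band means a good fit.
fitted = squareform(pdist(coords))
iu = np.triu_indices_from(D, k=1)
shepard_pairs = np.column_stack([D[iu], fitted[iu]])
print("stress:", mds.stress_)
```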

  • Discriminant Function Analysis (DFA). Aims: calculate a function that maximizes the ability to discriminate among 2 or more groups, based on a set of descriptive variables; assess variables in terms of their relative importance and relevance to discrimination; classify new cases not included in the original analysis.

  • DFA: number of discriminant functions (DFs) = number of groups - 1; each subsequent function is orthogonal to the last; each is associated with an eigenvalue that reflects how much work the function does in discriminating between groups. Stepwise vs. complete DFA.
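A scikit-learn sketch on invented data; LDA here stands in for complete DFA. With three groups it derives groups - 1 = 2 discriminant functions, reports how much discriminating work each does, and classifies new cases.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical training data: rows = cases, columns = descriptive
# variables; y = known group membership for each case.
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 1, size=(40, 3)),
               rng.normal(2, 1, size=(40, 3)),
               rng.normal(4, 1, size=(40, 3))])
y = np.repeat([0, 1, 2], 40)

# Fit the discriminant functions and score the cases on them.
lda = LinearDiscriminantAnalysis()
scores = lda.fit_transform(X, y)      # cases on the discriminant functions

# Classify new cases not included in the original analysis.
new_cases = rng.normal(2, 1, size=(5, 3))
print(lda.predict(new_cases))
print(lda.explained_variance_ratio_)  # discriminating "work" per function
```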

  • [Figures 6.4, 6.5, and 6.6]