SEM With AMOS
By Hui Bian, Office for Faculty Excellence
Spring 2012
What is structural equation modeling (SEM)?
SEM is used to test hypotheses about potential interrelationships among constructs, as well as their relationships to the indicators or measures assessing them.
Theory of planned behavior (TPB)
Goals of SEM
To determine whether the theoretical model is supported by the sample data, that is, whether the model fits the data well.
SEM helps us understand the complex relationships among constructs.
[Figure: Example of SEM — a path diagram with two factors (Factor 1, Factor 2), six indicators (Indicator 1-6), and six measurement error terms (error 1-6).]
[Figure: Example of SEM — the diagram annotated to show the two measurement models and the structural model.]
Basic components of SEM
Latent variables (constructs/factors)
Are the hypothetical constructs of interest in a study, such as self-control, self-efficacy, intention, etc.
They cannot be measured directly.
Observed variables (indicators)
Are the variables actually measured during data collection by the researchers, using a developed instrument or test.
They are used to define or infer the latent variable or construct.
Each observed variable represents one definition of the latent variable.
Basic components of SEM
Endogenous variables (dependent variables): variables that have at least one arrow leading into them from another variable.
Exogenous variables (independent variables): any variable that does not have an arrow leading into it.
Basic components of SEM
Measurement error terms
Represent the amount of variation in an indicator that is due to measurement error.
Structural error terms (disturbance terms)
Unexplained variance in the latent endogenous variables due to all unmeasured causes.
Basic components of SEM
Covariance: a measure of how much two variables change together.
We use a two-way arrow to show covariance.
Graphs in AMOS
A rectangle represents an observed variable.
A circle or ellipse represents an unobserved variable.
Two-way arrow: covariance or correlation.
One-way arrow: a unidirectional relationship.
[Figure: an annotated path diagram labeling the latent variables, observed variables, measurement error terms, a covariance, a path, and a structural error term.]
Model parameters
Are those characteristics of the model unknown to the researchers.
They have to be estimated from the sample covariance or correlation matrix.
Model parameters
Regression weights/factor loadings
Structural coefficients
Variances
Covariances
Each potential parameter in a model must be specified to be a fixed, free, or constrained parameter.
Model parameters
Free parameters: unknown and need to be estimated.
Fixed parameters: not free, but fixed to a specified value, typically 0 or 1.
Constrained parameters: unknown, but constrained to equal one or more other parameters.
[Figure: a path diagram with one loading labeled fixed and the others labeled free; if opp_v1 = opp_v2, they are constrained parameters.]
Build SEM models
Model specification: the exercise of formally stating a model. Prior to data collection, develop a theoretical model based on theory, empirical studies, etc.
Which variables are included in the model.
How these variables are related.
Misspecified model: results from errors of omission and/or inclusion of any variable or parameter.
Model identification: whether the model can, in theory and in practice, be estimated with observed data.
Under-identified model: one or more parameters may not be uniquely determined from the observed data. A model for which it is not possible to estimate all of the model's parameters.
Model identification
Just-identified model (saturated model): all of the parameters are uniquely determined. For each free parameter, a value can be obtained through only one manipulation of the observed data.
The degrees of freedom equal zero (the number of free parameters exactly equals the number of known values).
The model fits the data perfectly.
Over-identified model: a model for which all the parameters are identified and for which there are more knowns than free parameters.
A just- or over-identified model is an identified model.
If a model is under-identified, additional constraints may make the model identified.
The number of free parameters to be estimated must be less than or equal to the number of distinct values in the matrix S.
The number of distinct values in the matrix S is equal to p(p+1)/2, where p is the number of observed variables.
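This counting rule is easy to check when specifying a model. A minimal sketch (the function names are illustrative, not AMOS terminology):

```python
def distinct_moments(p):
    # number of distinct variances and covariances among p observed variables
    return p * (p + 1) // 2

def degrees_of_freedom(p, n_free_params):
    # distinct sample moments minus free parameters;
    # must be >= 0 for the model to be identified
    return distinct_moments(p) - n_free_params

# Example: 6 observed variables give 6*7/2 = 21 distinct values in S,
# so a model with 13 free parameters has 21 - 13 = 8 degrees of freedom.
```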
How to avoid identification problems
To achieve identification, one of the factor loadings must be fixed to one. The variable with a fixed loading of one is called a marker variable or reference item. This method solves the scale indeterminacy problem.
Have enough indicators of each latent variable. A simple rule that works most of the time is that there need to be at least two indicators per latent variable, and those indicators' errors are uncorrelated.
Use a recursive model.
Design a parsimonious model.
Rules for building SEM model
All variances of independent variables are model parameters.
All covariances between independent variables are model parameters.
All factor loadings connecting the latent variables and their indicators are parameters.
All regression weights between observed or latent variables are parameters.
Rules for building SEM model
The variances and covariances between dependent variables, and covariances between dependent and independent variables, are NOT parameters.
For each latent variable included in the model, the metric of its latent scale needs to be set.
For any independent latent variable: a path leaving the latent variable is set to 1.
Paths leading from the error terms to their corresponding observed variables are assumed to be equal to 1.
Build SEM models: Model estimation
How do SEM programs estimate the parameters?
The proposed model makes certain assumptions about the relationships between the variables in the model.
The proposed model has specific implications for the variances and covariances of the observed variables.
How do SEM programs estimate the parameters?
We want to estimate the parameters specified in the model that produce the implied covariance matrix Σ.
We want the matrix Σ to be as close as possible to the matrix S, the sample covariance matrix of the observed variables.
If the elements in the matrix S minus the elements in the matrix Σ equal zero, then chi-square is equal to zero and we have a perfect fit.
How do SEM programs estimate the parameters?
In SEM, the parameters of a proposed model are estimated by minimizing the discrepancy between the empirical covariance matrix S and a covariance matrix implied by the model, Σ. How should this discrepancy be measured? This is the role of the discrepancy function.
S is the sample covariance matrix calculated from the observed data.
Σ is the covariance matrix implied by the proposed model, also called the reproduced (or model-implied) covariance matrix determined by the proposed model.
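For maximum likelihood, the discrepancy function is F_ML = ln|Σ| + tr(SΣ⁻¹) − ln|S| − p, which is zero exactly when Σ = S. A minimal sketch for p = 2 observed variables (2×2 matrix algebra written out by hand to stay self-contained; this illustrates the fit function, not AMOS's internal code):

```python
import math

def fml_2x2(S, Sigma):
    # ML discrepancy F = ln|Sigma| + tr(S Sigma^-1) - ln|S| - p, for p = 2
    p = 2
    detS = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    detSig = Sigma[0][0] * Sigma[1][1] - Sigma[0][1] * Sigma[1][0]
    # explicit 2x2 inverse of Sigma
    inv = [[ Sigma[1][1] / detSig, -Sigma[0][1] / detSig],
           [-Sigma[1][0] / detSig,  Sigma[0][0] / detSig]]
    # trace of the matrix product S @ inv
    tr = (S[0][0] * inv[0][0] + S[0][1] * inv[1][0]
        + S[1][0] * inv[0][1] + S[1][1] * inv[1][1])
    return math.log(detSig) + tr - math.log(detS) - p
```

When the model-implied matrix equals S the discrepancy is 0 (a perfect fit); any mismatch makes it positive, and (N − 1) times the minimized value gives the model chi-square.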
How do SEM programs estimate the parameters?
In SEM, if the difference between S and Σ (the distance between the matrices) is small, then one can conclude that the proposed model is consistent with the observed data.
If the difference between S and Σ is large, one can conclude that the proposed model doesn't fit the data:
The proposed model is deficient, or
The data are not good.
Build SEM models
Model estimation
Estimation of parameters.
The estimation process uses a particular fit function to minimize the difference between S and Σ.
If the difference = 0, one has a perfect model fit to the data.
Model estimation methods
The two most commonly used estimation techniques are maximum likelihood (ML) and normal theory generalized least squares (GLS).
ML and GLS: large sample size, continuous data, and the assumption of multivariate normality.
Unweighted least squares (ULS): scale dependent.
Asymptotically distribution free (ADF) estimation (weighted least squares, WLS): for serious departures from normality.
[Figure: AMOS estimation-method options, grouped by whether multivariate normality is assumed.]
Model testing
We want to know how well the model fits the data.
If S and Σ are similar, we may say the proposed model fits the data.
Model fit indices.
For an individual parameter, we want to know whether a free parameter is significantly different from zero.
Whether the estimate of a free parameter makes sense.
Chi-square test
The value ranges from zero for a saturated model with all paths included to a maximum for the independence model (the null model, or the model with no parameters estimated).
Build SEM models
Model modification
If the model doesn't fit the data, then we need to modify the model.
Perform a specification search: change the original model in the search for a better-fitting model.
Goodness-of-fit tests based on predicted vs. observed covariances (absolute fit indexes)
Chi-square (CMIN): a non-significant χ² value indicates that S and Σ are similar. χ² should NOT be significant if there is a good model fit.
Goodness-of-fit index (GFI) and adjusted goodness-of-fit index (AGFI). GFI measures the amount of variance and covariance in S that is predicted by Σ. AGFI adjusts for the degrees of freedom of a model relative to the number of variables.
Goodness-of-fit tests based on predicted vs. observed covariances (absolute fit indexes)
Root-mean-square residual index (RMR): the closer RMR is to 0, the better the model fit.
Hoelter's critical N, also called the Hoelter index, is used to judge whether the sample size is adequate. By convention, the sample size is adequate if Hoelter's N > 200. A Hoelter's N under 75 is considered unacceptably low to accept a model by chi-square. Two N's are output, one at the .05 and one at the .01 level of significance.
Information-theory goodness of fit: absolute fit indexes
Measures in this set are appropriate when comparing models estimated with maximum likelihood.
AIC, BIC, CAIC, and BCC.
For model comparison, a lower AIC reflects the better-fitting model. AIC also penalizes for lack of parsimony.
BIC (Bayesian information criterion): penalizes for sample size as well as model complexity. It is recommended when the sample size is large or the number of parameters in the model is small.
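Both criteria are simple functions of the model chi-square (CMIN), the number of free parameters q, and the sample size N. A hedged sketch of the usual formulas (AMOS's exact definitions may differ in minor details, e.g. the sample-size term used in BIC):

```python
import math

def aic(chi_square, q):
    # AIC = chi-square + 2 * (number of free parameters)
    return chi_square + 2 * q

def bic(chi_square, q, n):
    # BIC = chi-square + q * ln(N); the ln(N) factor penalizes
    # complexity more heavily than AIC once N exceeds about 7
    return chi_square + q * math.log(n)
```

When comparing two candidate models fit to the same data, the model with the lower AIC (or BIC) is preferred.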
Information-theory goodness of fit: absolute fit indexes
CAIC: an alternative to AIC that also penalizes for sample size as well as model complexity (lack of parsimony). The penalty is greater than for AIC or BCC but less than for BIC. The lower the CAIC measure, the better the fit.
BCC (Browne-Cudeck criterion): as with AIC, a lower value reflects the better-fitting model. BCC penalizes for model complexity (lack of parsimony) more than AIC.
Goodness-of-fit tests comparing the given model with a null or an alternative model: CFI, NFI, NNFI.
Goodness-of-fit tests penalizing for lack of parsimony: parsimony ratio (PRATIO), PNFI, PCFI.
Scaling and normality assumption
Maximum likelihood and normal theory generalized least squares assume that the measured variables are continuous and have a multivariate normal distribution.
In the social sciences, we use many variables that are dichotomous or ordered-categorical rather than truly continuous.
In the social sciences, it is common for the distribution of observed variables to depart substantially from multivariate normality.
Scaling and normality assumption
Nominal or ordinal variables should have at least five categories and not be strongly skewed or kurtotic.
Values of skewness and kurtosis should be within -1 and +1.
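This screening rule can be checked per variable before fitting. A self-contained sketch using moment-based skewness and excess kurtosis (without the small-sample corrections some packages apply, so values may differ slightly from SPSS output):

```python
def skew_and_kurtosis(x):
    # moment-based skewness and excess kurtosis; both are 0 for a normal distribution
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def roughly_normal(x):
    # the slide's rule of thumb: both statistics within [-1, +1]
    s, k = skew_and_kurtosis(x)
    return -1.0 <= s <= 1.0 and -1.0 <= k <= 1.0
```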
Problems of non-normality (practical implications)
Inflated χ² goodness-of-fit statistics.
Inappropriate modifications made to theoretically adequate models.
Findings can be expected to fail to replicate, contributing to confusion in research areas.
Solutions to non-normality
Asymptotically distribution free (ADF) estimation: ADF produces asymptotically unbiased estimates of the χ² goodness-of-fit test, parameter estimates, and standard errors.
Limitation: requires a large sample size.
Solutions to non-normality
Unweighted least squares (ULS): no assumption of normality and no significance tests available. Scale dependent.
Bootstrapping: it doesn't rely on a normal distribution.
Bayesian estimation: if ordered-categorical data are modeled.
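The bootstrap idea can be illustrated outside AMOS: resample cases with replacement, recompute the statistic each time, and take percentiles of the resampled values as a confidence interval that makes no normality assumption. A sketch for a single covariance (AMOS's bootstrap options differ in detail; function names here are illustrative):

```python
import random

def sample_cov(x, y):
    # unbiased sample covariance of paired observations
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

def bootstrap_cov_ci(x, y, n_boot=2000, alpha=0.05, seed=1):
    # percentile-bootstrap confidence interval for the covariance
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(sample_cov([x[i] for i in idx], [y[i] for i in idx]))
    stats.sort()
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```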
Sample size (Rules of thumb)
10 subjects per variable or 20 subjects per variable
250-500 subjects (Schumacker & Lomax, 2004)
Computer programs for SEM
AMOS
EQS
LISREL
Mplus
SAS
AMOS is short for Analysis of MOment Structures.
It is software used for the data-analysis technique known as structural equation modeling (SEM).
It is a program for visual SEM.
Path diagrams
They are the way to communicate a SEM model.
They are pictures drawn to show the relationships among latent and observed variables.
In AMOS, rectangles represent observed variables and ellipses represent latent variables.
Examples of using the AMOS toolbar to draw a diagram.
Example
Two latent variables: intention and self-efficacy
Four observed variables: intention01, intention02, self_efficacy01, and self_efficacy02
Five error terms
The model should be like this
Go to All Programs from Start > IBM SPSS Statistics > IBM SPSS AMOS 19 > AMOS Graphics
[Figure: the AMOS Graphics window, with the toolbar and example latent and observed variables labeled.]
Draw observed variables using the rectangle tool.
Draw latent variables using the ellipse tool.
Draw error terms using the unique-variable tool.
Open data: File > Data Files
Put observed variable names into the graph
Go to View > Variables in Dataset
Then drag each variable onto its rectangle
Put latent variables in the graph
Put the mouse over one latent variable and right-click to get a menu.
Click Object Properties.
Type Self-efficacy as the variable name.
For error terms, double-click the ellipse to get the Object Properties window.
Constrain parameters: double-click a path from Self-efficacy to self_efficacy01, type 1 for the regression weight, then click Close.
The data are from the AMOS examples (IBM SPSS).
Attig repeated the study with the same 40 subjects after a training exercise intended to improve memory performance. There were thus three performance measures before training and three performance measures after training.
Draw diagram
Conduct analysis: Analyze > Calculate Estimates
Text output
1. Number of distinct sample moments: sample means, variances, and covariances (AMOS ignores means here). We can also use 4(4+1)/2 = 10.
2. Number of distinct parameters to be estimated: 4 variances and 6 covariances.
3. Degrees of freedom: the number of distinct sample moments minus the number of distinct parameters.
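The bookkeeping for this four-variable example can be reproduced by hand; a short sketch mirroring those three counts:

```python
# 4 observed variables (AMOS ignores means in this analysis)
p = 4
sample_moments = p * (p + 1) // 2      # 10 distinct variances and covariances
free_parameters = 4 + 6                # 4 variances + 6 covariances, all free
df = sample_moments - free_parameters  # 10 - 10 = 0: a saturated model
```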
Text output
There is no null hypothesis being tested for this example, so the chi-square result is not very interesting.
For a hypothesis test, the chi-square value is a measure of the extent to which the data are incompatible with the hypothesis.
For a hypothesis test, the result will have positive degrees of freedom.
A chi-square value of 0 indicates no departure from the null hypothesis.
Text output
Minimum was achieved: this line indicates that Amos successfully estimated the variances and covariances. When Amos fails, it is because you have posed a problem that has no solution, or no unique solution (a model identification problem).
Text output
1. Estimate gives the estimated covariance: for example, the covariance between recall1 and recall2 is 2.556.
2. S.E. is an estimate of the standard error of the covariance, 1.16.
3. C.R. is the critical ratio, obtained by dividing the covariance estimate by its standard error.
4. For a significance level of 0.05, a critical ratio that exceeds 1.96 would be called significant. This ratio is relevant to the null hypothesis that the covariance between recall1 and recall2 is 0.
Text output
5. In this example, 2.203 is greater than 1.96, so the covariance between recall1 and recall2 is significantly different from 0 at the 0.05 level.
6. The P value of 0.028 (two-tailed) is for testing the null hypothesis that the parameter value is 0 in the population.
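The C.R. and the two-tailed P value in this table can be reproduced with a standard normal approximation (a sketch using the error function for the normal CDF; values match the text output up to rounding):

```python
import math

def critical_ratio(estimate, se):
    # C.R. = estimate / standard error; compare against +/-1.96 at the .05 level
    return estimate / se

def two_tailed_p(z):
    # two-tailed p-value under the standard normal distribution
    phi = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return 2.0 * (1.0 - phi)

cr = critical_ratio(2.556, 1.16)  # the recall1-recall2 covariance and its S.E.
p = two_tailed_p(cr)              # about 0.028, matching the output above
```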