INTRODUCTORY STRUCTURAL EQUATION MODELING WITH...

19-12-2012 1

Bidin Yatim, PhD Assoc. Prof. School of Quantitative Sciences

College of Arts & Sciences

[email protected]

019-3394959

PhD Applied Statistics (Exeter, UK) MSc Industrial Mathematics (Aston, UK) BSc Mathematics and Statistics (Nottingham, UK)

INTRODUCTORY STRUCTURAL EQUATION MODELING WITH AMOS WORKSHOP


TOPICS TO DISCUSS

TOPIC 1: Introduction to SEM

TOPIC 2: Modeling and Computing

5 Basic SEM steps

Model Specification

Model Identification

Model Estimation

Model Testing

Model Modification

TOPIC 3: Implementing SEM Using AMOS

How to draw model

Running AMOS & evaluating outputs

How to modify and re-specify a non-fitting model

Part One

Introduction to SEM

An overview of SEM

Chapters 1 and 2 of Structural Equation Modeling with AMOS: Basic Concepts, Applications, and Programming, by Barbara M. Byrne

Welcome to SEM The Musical Lyrics by Alan Reifman

(May be sung to the tune of "Matchmaker," Bock/Harnick,

from Fiddler on the Roof)

SEM, SEM, it can be sung, You'll be amazed, at what we've sprung, We hope you'll learn more 'bout this stats technique, Through songs of which you're among, SEM, SEM, we like to run, It takes awhile, but we get it done, We hope you'll learn of the steps that we take, And take home from this, some fun…

A Simple SEM

SEM is an attempt to model causal relations between variables by including all variables that are known to have some involvement in the process of interest.

SEM is a family of statistical techniques that incorporates and integrates:

Path analysis

Linear regression

Factor analysis

SEM conveys two important aspects of the procedure:

The causal processes under study are represented by a series of structural relations, and

these structural relations can be modeled pictorially to enable a clearer conceptualization of the theory under study.

The model can be tested simultaneously to determine the extent to which it is consistent with the data: if the goodness of fit is adequate, the model is not rejected; otherwise, the hypothesized relations are rejected.

The Basic Idea Behind SEM

Statisticians have developed procedures for testing whether a set of variances and covariances in a covariance matrix fits a specified structure. The way SEM works is as follows:

1. You state how you (or the theory) believe the variables are interrelated, often with the use of a path diagram.

2. You (in practice, AMOS) work out, via some complex internal rules, what this implies for the variances and covariances of the variables.

3. You test whether the variances and covariances fit the model.

4. The results of the statistical testing are reported, along with parameter estimates and standard errors for the numerical coefficients in the linear equations.

5. On the basis of this information, you decide whether the model seems like a good fit to your data.
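Steps 2 and 3 can be illustrated with the smallest possible case. The sketch below is not what AMOS does internally; it assumes a single hypothetical regression path y = b*x + e with the error uncorrelated with x, and all numbers are made up. It derives the covariance matrix the model implies and compares it with a sample covariance matrix.

```python
def implied_covariance(b, var_x, var_e):
    """Covariance matrix of (x, y) implied by y = b*x + e, with cov(x, e) = 0."""
    return [
        [var_x, b * var_x],                  # var(x), cov(x, y)
        [b * var_x, b * b * var_x + var_e],  # cov(y, x), var(y)
    ]


def max_abs_residual(sample, implied):
    """Largest absolute difference between sample and implied covariances."""
    return max(
        abs(s - m)
        for sample_row, implied_row in zip(sample, implied)
        for s, m in zip(sample_row, implied_row)
    )


# Hypothetical sample covariance matrix for (x, y).
sample_cov = [[4.0, 2.0], [2.0, 5.0]]

# Hypothetical parameter values: path b, variance of x, error variance.
implied = implied_covariance(b=0.5, var_x=4.0, var_e=4.0)
residual = max_abs_residual(sample_cov, implied)
print(implied, residual)
```

With these values the implied matrix reproduces the sample matrix exactly, because this tiny model has as many free parameters as there are distinct variances and covariances. Models with fewer parameters than moments generally leave nonzero residuals, and those residuals are what the fit test evaluates.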

SEM

Is a statistical methodology that takes a confirmatory (hypothesis-testing) approach to the analysis of a structural theory.

Most other multivariate procedures are descriptive or exploratory in nature.

The theory represents “causal” processes that generate observations on multiple variables.

SEM is a very general, powerful and popular multivariate analysis technique.

It provides a comprehensive method for the quantification and testing of theories.

It has been applied in econometrics, psychology, sociology, political science, education, and market and medical research, among other fields.

Also known as covariance structure analysis, covariance structure modeling, latent variable modeling, confirmatory factor analysis, linear structural relationships, and analysis of covariance structures.

SEM serves purposes similar to multiple regression, but in a more powerful way that takes into account the modeling of interactions, nonlinearities, correlated IVs, measurement error, correlated error terms, multiple latent IVs each measured by multiple indicators, and one or more latent DVs, each also with multiple indicators.

SEM may be used as a more powerful alternative to multiple regression, path analysis, factor analysis, time series analysis, and analysis of covariance. These procedures are special cases of SEM.

Advantages of SEM compared to multiple regression

more flexible assumptions (particularly allowing interpretation even in the face of multicollinearity),

use of confirmatory factor analysis to reduce measurement error by having multiple indicators per latent variable,

the attraction of SEM's graphical modeling interface,

the desirability of testing models overall rather than coefficients individually,

the possibility for researchers to test a set of regression equations simultaneously,

the ability to

test models with multiple dependents,

model mediating variables,

model error terms,

test coefficients across multiple between-subjects groups, and

handle difficult data (time series with autocorrelated error, non-normal data, incomplete data).

Major applications of structural equation modeling

1. causal modeling, or path analysis - hypothesizes causal relationships among variables and tests the causal models with a linear equation system. Causal models can involve either manifest variables, latent variables, or both;

2. confirmatory factor analysis - extension of factor analysis in which specific hypotheses about the structure of the factor loadings and intercorrelations are tested;

3. regression models, in which regression weights may be constrained to be equal to each other, or to specified numerical values;

4. covariance structure models, which hypothesize that a covariance matrix has a particular form. For example, you can test the hypothesis that a set of variables all have equal variances with this procedure;

5. correlation structure models, which hypothesize that a correlation matrix has a particular form.

How Useful is a Statistical Model

“All models are wrong, but some are useful.” G.E.P. Box

SEM models can never be accepted; they can only fail to be rejected. This leads researchers to provisionally accept a given model.

While models that fit the data well can only be provisionally accepted, models that do not fit the data well can be absolutely rejected.

Statistical Modeling

A statistical model DOES NOT necessarily have a theoretical basis; it may be interpreted as either “making sense” or “nonsense”.

[Diagram: example variables Smoking, Heart Disease, Weight, Income, Number of Newspaper Readers, Number of Road Accidents]

SEM: Terminologies

Independent variables (IVs), assumed to be measured without error, are called exogenous variables.

Dependent variables (DVs) or mediating variables are called endogenous variables.

Manifest/observed variables (indicators) are directly measured by researchers, while

Latent/unobserved variables are not directly measured but are inferred from the relationships/correlations among measured variables in the analysis. Examples: self-concept, motivation, powerlessness, anomie, verbal ability, capitalism, social class.

SEM Terminologies

SEM illustrates relationships between observed and unobserved variables using path diagrams.

Ovals or circles: latent variables

Rectangles or squares: measured variables (indicators)

Residuals are always unobserved, so they are represented by ovals or circles.

Model identification

Identified: a unique solution exists, so the parameters are estimable.

Not identified: the parameters are subject to arbitrariness, meaning that different parameter values define the same model; hence there are no consistent estimates for all parameters, and the model cannot be tested. For example, the values of X and Y cannot be determined uniquely when the only information is X+Y=15.

Just-identified: the model can never be rejected (zero degrees of freedom); over-identified: the model can sometimes be rejected (positive degrees of freedom); under-identified: insufficient information.
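The three cases can be told apart, to a first approximation, by counting. The sketch below assumes the usual counting rule: p observed variables supply p(p+1)/2 distinct variances and covariances, and the degrees of freedom are that number minus the number of freely estimated parameters. This rule is a necessary condition only, so a model that passes it may still be unidentified. The function names and the parameter counts are hypothetical.

```python
def degrees_of_freedom(n_observed, n_free_params):
    """Distinct variances/covariances minus freely estimated parameters."""
    n_moments = n_observed * (n_observed + 1) // 2
    return n_moments - n_free_params


def identification_status(n_observed, n_free_params):
    """Classify a model by the counting rule (necessary, not sufficient)."""
    df = degrees_of_freedom(n_observed, n_free_params)
    if df < 0:
        return "under-identified"
    if df == 0:
        return "just-identified"
    return "over-identified"


# Nine observed indicators give 9 * 10 / 2 = 45 distinct moments.
print(degrees_of_freedom(9, 21))      # hypothetical model with 21 free parameters
print(identification_status(9, 21))
print(identification_status(2, 3))    # two variables, three parameters
print(identification_status(2, 4))    # more parameters than moments
```

The X+Y=15 example on the slide is the under-identified case: one piece of information, two unknowns.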

SEM, Oh, SEM Lyrics by Alan Reifman, dedicated to Peter Westfall (article of his)

(May be sung to the tune of "Galveston," Jimmy Webb, popularized by Glen Campbell)

Ultimately, SEM, Your LV's cannot be measured, Which gives the critics some displeasure, There's nothing physical to grab on, When you run SEM, SEM, Oh, SEM, You make many an assumption, Is it recklessness or gumption? Assume the e's uncorrelated... When you run SEM, I can see the critics' point of view, now, They're saying the models aren't unique, That, we must willingly acknowledge, In response to the critique, if we want to keep on using... SEM, Oh, SEM...

http://www2.tltc.ttu.edu/Westfall/images/6348/AreLatentMeasuresBetterthanFormativeMeasures.pdf

I Am an Indicator Lyrics by Alan Reifman

(May be sung to the tune of "The Entertainer," Billy Joel)

I am an indicator, a latent construct I represent, I'm measurable, sometimes pleasurable, A manifestation of what is meant, I am an indicator, I usually come in a multiple set, With other signs of the same construct, you may instruct, I'm correlated with my co-indicators, you can bet, I am an indicator, from my presence the construct is inferred, I'm tap-able, the construct is not palpable, The distinction should not be blurred

At Least Three Lyrics by Alan Reifman

(May be sung to the tune of "Think of Me," Lloyd

Webber/Hart/Stilgoe, from Phantom of the Opera)

At least three, indicators are urged, For each latent construct shown, At least three, indicators should help, Avoid output where you groan, With less than three, your construct sure will be, locally unidentified, Though the model might still run, you could have a rough ride

SEM

Consists of two components

Structural model

Illustrates the relationships among latent constructs or endogenous variables

Measurement models

Represent how the constructs are related to their indicators or manifest variables.

Example

Ability: exogenous latent construct

Aspiration: endogenous latent construct

Achievement: endogenous latent construct

Structural Model

The structural model allows for certain relationships among latent variables, depicted by lines or arrows in the path diagram.

In the path diagram earlier, we specified that Ability and Achievement were related in a specific way, i.e., ability had some influence on later achievement.

One result of the structural model is an indication of the extent to which these a priori hypothesized relationships are supported by our sample data.

Structural Model

The structural equation addresses the following questions:

Are Ability and Achievement related?

Exactly how strong is the influence of Ability on Achievement?

Could there be other latent variables that we need to consider to get a better understanding of the influence on Achievement?

Measurement Model

Specifies the relationships between latent variables and observed variables.

Answers the questions:

To what extent are the observed variables actually measuring the hypothesized latent variables?

Which observed variable is the best measure of a particular latent variable?

To what extent are the observed variables actually measuring something other than the hypothesized latent variable?

EFA or CFA is used to determine which observed variables are significantly related to each of the latent variables.

Using Factor Analysis: Factor Loadings

[Diagram: Academic Skill, Interpersonal Skill and Communication Skill load on Ability]

Using Factor Analysis: Factor Loadings

[Diagram: Family Status, Father's Occupation and Peer's Influence load on Aspiration]

Using Factor Analysis: Factor Loadings

[Diagram: Personal Actualization, Professional Status and Social Status load on Achievement]

Measurement Model

The relationships between observed variables and latent variables are described by factor loadings

Factor loadings provide information about the extent to which a given observed variable is able to measure the latent variable. They serve as validity coefficients.

Measurement error is defined as that portion of an observed variable that is measuring something other than what the latent variable is hypothesized to measure. It serves as a measure of unreliability
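A small numerical sketch may help connect loadings to measurement error. It assumes a standardized solution in which each indicator loads on exactly one latent variable, so that the squared loading is the share of the indicator's variance explained by the latent variable and the remainder is error variance. The loading values are hypothetical, not estimates from any data set.

```python
def decompose_indicator(loading):
    """Split a standardized indicator's unit variance into explained and error parts."""
    explained = loading ** 2   # variance shared with the latent variable
    error = 1.0 - explained    # variance due to measurement error
    return explained, error


# Hypothetical standardized loadings for the Ability indicators.
loadings = {
    "Academic Skill": 0.8,
    "Interpersonal Skill": 0.6,
    "Communication Skill": 0.7,
}

for name, loading in loadings.items():
    explained, error = decompose_indicator(loading)
    print(f"{name}: explained={explained:.2f}, error={error:.2f}")

# The indicator with the largest loading is the best single measure.
best_indicator = max(loadings, key=loadings.get)
print(best_indicator)
```

Under these assumptions, a higher loading means both higher validity and less measurement error for that indicator.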

Full Latent Variable Model

[Path diagram: Ability, Aspiration and Achievement are the latent variables, with the indicators listed below]

Ability: Academic Skill (x1), Interpersonal Skill (x2), Communication Skill (x3)

Aspiration: Family Status (y1), Father's Occupation (y2), Peer's Influence (y3)

Achievement: Personal Actualization (x4), Professional Status (x5), Social Status (x6)

Total, Direct and Indirect Effects

There is a direct effect between two latent variables when a single directed line or arrow connects them.

There is an indirect effect between two latent variables when the second latent variable is connected to the first through one or more other latent variables.

The total effect between two latent variables is the sum of any direct effect and all indirect effects that connect them.
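These definitions can be worked through for the Ability, Aspiration, Achievement example. The sketch assumes a linear model in which an indirect effect equals the product of the path coefficients along the route; the three coefficient values are hypothetical.

```python
# Hypothetical standardized path coefficients.
ability_to_aspiration = 0.5      # Ability -> Aspiration
aspiration_to_achievement = 0.4  # Aspiration -> Achievement
ability_to_achievement = 0.3     # Ability -> Achievement (direct path)


def indirect_effect(*path_coefficients):
    """Product of the coefficients along one indirect route."""
    product = 1.0
    for coefficient in path_coefficients:
        product *= coefficient
    return product


direct = ability_to_achievement
indirect = indirect_effect(ability_to_aspiration, aspiration_to_achievement)
total = direct + indirect  # sum of the direct effect and all indirect effects

print(f"direct={direct:.2f}, indirect={indirect:.2f}, total={total:.2f}")
```

With more than one indirect route, each route's product would be computed separately and all of them added to the direct effect.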

Types of Measurement Scale

There are 4 types of measurement scale in a scale instrument:

Nominal scale

Ordinal scale

Interval scale

Ratio scale

Some other common scales, such as Likert scales, semantic differential scales, and dichotomous scales, can be categorized into the 4 above.

This is important because the assumptions of SEM depend on the scale types described on this page.

SEM Assumptions (1)

SEM in general assumes that observed variables are measured on a linear continuous scale.

Dichotomous and ordinal variables cause problems because correlations/covariances tend to be truncated. These scores are not normally distributed, and responses to individual items may not be very reliable.

SEM Assumptions (2)

Many SEM estimation procedures assume multivariate normality.

Lack of univariate normality is indicated when the absolute skew index is > 3.0 or the kurtosis index is > 10.

Multivariate non-normality can be detected by indices of multivariate skew or kurtosis.

Non-normality can (sometimes) be corrected by transforming variables.
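The univariate screening rule can be sketched in plain Python. The cutoffs of 3.0 and 10 are the ones quoted on this slide; two details are assumptions made here: the kurtosis index is taken to be excess kurtosis (zero for a normal distribution), and a variable is flagged when either cutoff is exceeded in absolute value. The data are made up.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skew_index(values):
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def looks_nonnormal(values, skew_cutoff=3.0, kurtosis_cutoff=10.0):
    """Flag a variable whose skew or kurtosis index exceeds the cutoffs."""
    return (abs(skew_index(values)) > skew_cutoff
            or abs(excess_kurtosis(values)) > kurtosis_cutoff)


symmetric_scores = [1, 2, 3, 4, 5]
extreme_scores = [0] * 99 + [100]  # one very large score among many zeros

print(looks_nonnormal(symmetric_scores))
print(looks_nonnormal(extreme_scores))
```

AMOS reports its own skewness and kurtosis statistics; this sketch only illustrates the logic of the cutoffs.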

Violations of Assumptions

The best-known distribution with no excess kurtosis is the multivariate normal.

Leptokurtic (more peaked) distributions result in too many rejections of H0 based on the Chi-Square statistic.

Platykurtic distributions lead to Chi-Square estimates that are too low.

Correlation Matrix

If the input matrix is the covariance matrix, the estimated coefficients are unstandardized estimates.

If the input matrix is the correlation matrix, the estimated coefficients are standardized estimates.

So what?

Correlation Matrix

When we want to test a theory, we use the variance-covariance matrix (to validate the causal relationships among constructs).

When we just want to explain the pattern of the relationships, we use the correlation matrix (theory testing is not required).

Covariance

The covariance between two variables equals the correlation times the product of the variables' standard deviations. The covariance of a variable with itself is the variable's variance
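The identity can be checked numerically. The sketch below uses made-up scores and sample (n - 1) formulas; the function names are illustrative only.

```python
import math


def covariance(x, y):
    """Sample covariance with an n - 1 denominator."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def std_dev(x):
    # The covariance of a variable with itself is its variance.
    return math.sqrt(covariance(x, x))


def correlation(x, y):
    return covariance(x, y) / (std_dev(x) * std_dev(y))


# Made-up scores on two variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = covariance(x, y)
rebuilt = correlation(x, y) * std_dev(x) * std_dev(y)

print(cov_xy, rebuilt)    # the two values agree up to rounding
print(covariance(x, x))   # variance of x
```

Going the other way, dividing each covariance by the product of the two standard deviations turns a covariance matrix into a correlation matrix.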

Covariance Lyrics by Alan Reifman

(May be sung to the tune of "Aquarius," Rado/Ragni/MacDermot, from Hair, also popularized by the Fifth Dimension)

You draw paths to show relationships, You hope align with the known r's, Your model will guide the tracings, From constructs near to constructs far, You will compare this with the data's covariance, The data's covariance... Covariance! Covariance! Similar to correlation, With the variables unstandardized, Does each known covariance match up with, The one the model tracings will derive? Covariance! Covariance!

Factors Affecting the Correlation/Covariance Coefficient

Type of scale and range of values

Pearson correlation is the basis for analysis in regression, path analysis, factor analysis and SEM. Hence the data must be in metric form.

There must be enough variation in scores to allow a correlational relationship to manifest.

Linearity

The Pearson correlation coefficient measures the degree of linear relationship between two variables, hence the need to test linearity.

Sample size

SEM requires a large sample size. Rule of thumb: 10-20 times the number of variables. Ding, Velicer and Harlow (1995): 100-150; Boomsma (1982, 1983): 400; Hu, Bentler and Kano (1992): in some cases 5000 is still insufficient; Schumacker and Lomax (1999): many articles use 250-500; Bentler and Chou (1987): for normal data, 5 subjects per variable is sufficient.
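As a quick worked example of the 10-20 rule of thumb, the sketch below applies it to a model with nine observed variables, the number of indicators in the Ability, Aspiration, Achievement example. The function name is hypothetical, and the result is only a rough planning figure given how widely the published recommendations differ.

```python
def rule_of_thumb_sample_size(n_variables, low_ratio=10, high_ratio=20):
    """Return the (lower, upper) sample size suggested by the cases-per-variable rule."""
    return n_variables * low_ratio, n_variables * high_ratio


n_observed = 9  # nine indicators in the workshop example
lower, upper = rule_of_thumb_sample_size(n_observed)
print(f"Suggested sample size: {lower} to {upper} cases")
```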

What if the sample is small?

Consequences of small sample size

Convergence failures (software cannot obtain a satisfactory solution)

Improper solutions (including negative error variance estimates for measured variables)

Lowered accuracy of parameter estimates and, in particular, standard errors (SEs)

SEM Assumptions (3)

SEM programs compute standard errors under the assumption of a large sample size.

Violation of Assumptions

High degrees of skewness lead to excessively large Chi-Square estimates.

In small samples (N<100), the Chi-Square statistic tends to be too large.

SEM Assumptions (4)

Multicollinearity

Occurs when intercorrelations among some variables are so high that certain mathematical operations are impossible or results are unstable because denominators are close to 0.

Indicated by bivariate correlations > 0.85 or multiple correlations > 0.90

May cause a non-positive definite/singular covariance matrix

May be due to inclusion of both individual and composite variables

Detection: Tolerance = 1 - R² < 0.10; Variance Inflation Factor (VIF) = 1/(1 - R²) > 10

Can be corrected by eliminating or combining redundant variables.
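The two detection statistics are simple functions of R², the squared multiple correlation obtained when one variable is regressed on all the others. The sketch below assumes the conventional cutoffs of tolerance below 0.10 and VIF above 10; the R² values are hypothetical.

```python
def tolerance(r_squared):
    """Share of a variable's variance not explained by the other variables."""
    return 1.0 - r_squared


def variance_inflation_factor(r_squared):
    """VIF is the reciprocal of tolerance."""
    return 1.0 / (1.0 - r_squared)


def is_collinear(r_squared, tolerance_cutoff=0.10, vif_cutoff=10.0):
    """Flag a variable whose tolerance or VIF crosses the cutoff."""
    return (tolerance(r_squared) < tolerance_cutoff
            or variance_inflation_factor(r_squared) > vif_cutoff)


# Hypothetical R-squared values for two variables.
for r_squared in (0.50, 0.95):
    print(r_squared,
          round(tolerance(r_squared), 2),
          round(variance_inflation_factor(r_squared), 2),
          is_collinear(r_squared))
```

Because VIF is the reciprocal of tolerance, the two cutoffs describe the same boundary, R² = 0.90.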

SEM Assumptions (5)

Outliers

Univariate outliers: scores more than 3 standard deviations from the mean

Detection: inspection of frequency distributions and univariate skewness and kurtosis

Multivariate outliers may have extreme scores on two or more variables, or their configuration of scores may be unusual

Detection: inspection of multivariate skewness and kurtosis

Can be remedied by correcting errors, dropping these cases, or transforming the variables.
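The univariate rule can be sketched with the standard library. The example assumes the 3-standard-deviation cutoff is applied to z-scores computed from the sample mean and sample standard deviation; the scores are made up, and the function name is hypothetical. Multivariate outliers need a different approach and are not covered here.

```python
import statistics


def univariate_outliers(scores, cutoff=3.0):
    """Return the positions of scores more than `cutoff` SDs from the mean."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1)
    return [
        index
        for index, score in enumerate(scores)
        if abs(score - mean) / sd > cutoff
    ]


# Made-up data: nineteen typical scores and one extreme score.
scores = [10] * 19 + [100]
flagged = univariate_outliers(scores)
print(flagged)
```

A flagged case should be checked for data-entry errors before deciding whether to drop it or transform the variable.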

Applications involving single-group analyses - Section II

First-order confirmatory factor analytic (CFA) models demonstrate testing for the validity of the multidimensionality of the theoretical structure of a construct (Chapter 3). Specifically, this application tests the hypothesis that self-concept (SC), for early adolescents (grade 7), is a multidimensional construct composed of four factors: general SC (GSC), academic SC (ASC), English SC (ESC), and mathematics SC (MSC).

A second-order CFA model bears on the factorial structure of a measuring instrument (Chapter 5); this application examines a CFA model that comprises a second-order factor.

The final single-group application tests for the validity of an empirically derived causal structure (Chapter 6). Here we take our first look at a full structural equation model (SEM). The hypothesis to be tested relates to the pattern of causal structure linking several stressor variables that bear on the construct of burnout. The original study from which this application is taken (Byrne, 1994a) tested and cross-validated the impact of organizational and personality variables on three dimensions of burnout for elementary, intermediate, and secondary teachers. For purposes of illustration here, however, the application is limited to the calibration sample of elementary teachers only (N = 599).

In Section III, I present three applications relat

6 JULY 2011

Testing for the factorial validity of a theoretical construct (First-order CFA model)


Testing for the factorial validity of scores from a measuring instrument (Second-order CFA model)

Testing for the validity of a full causal structure


Applications involving multiple-group analyses - Section III

These applications test for measurement and structural equivalence across groups with respect to a measuring instrument, i.e., a first-order CFA model (Chapter 7), and to a full causal structure (Chapter 9).

We focus on applications involving more than one sample where the central concern is whether or not components of the measurement model and/or the structural model are equivalent (i.e., invariant) across particular groups of interest.


17-12-2012 53

THE END

thank you

