Value Added Analysis and background predictor effects. A case study from Spain.

Paper presented at the second meeting of the EARLI SIG18 Educational Effectiveness, Leuven, 25‐27 August 2010.

Enrique Navarro

Complutense University of Madrid

SPAIN

Co‐author(s):

María Castro, Yeow Meng Thum

Keywords:

Value‐Added Analysis

Background effect

Multilevel analysis

Extended abstract

Abstract

There are many approaches to using longitudinal achievement data to measure teacher or school effectiveness. Most of these approaches differ on methodological issues that reflect different conceptions of Value Added Analysis, and some attempts have therefore been made to classify Value Added Analysis according to the statistical analysis used. Using one modeling approach or another could lead to different results, but which one leads to a more defensible analysis? We present a case study with Spanish data. Multilevel models were used to estimate a status point and growth parameters that describe school achievement over time, and to illustrate how different background effects change the description of schools. We develop three-level models (individual time line, students and schools). The main goal of this work is to compare models in order to look for over-adjustments caused by the use of some predictors and for differences in school rankings, since critical decisions about schools could be based on these analyses. The predictors used were socioeconomic status and prior performance. The results show how the predictors may produce variations in the residual estimates and in the school rankings. The correlation study also shows the importance of the socioeconomic status effect in the analysis and its power as a performance predictor.

1. Objectives or purposes

The aim of this paper is to build a statistical model to measure school effects on learning growth, using a longitudinal data set. The model belongs to the value added perspective, understood as a statistical tool for estimating performance and growth at different points in time. There are many approaches to using longitudinal achievement data to measure teacher or school effectiveness. Most of these approaches differ on methodological issues that reflect different conceptions of Value Added Models. Using one modeling approach or another could lead to different results, but which one leads to a more defensible analysis? We present a case study with Spanish data. Multilevel models were used to estimate a status point and growth parameters that describe school achievement over time, and to illustrate how different background effects change the description of schools. We develop three-level models (individual time line, students and schools). The main goal of this work is to compare models in order to look for over-adjustments caused by the use of some predictors and for differences in school rankings, since critical decisions about schools could be based on these analyses.

2. Perspective(s) or theoretical framework

There is a great diversity of approaches to analyzing the effectiveness of schools, and value added (hereafter VA) analysis is one of them. The term VA models refers to various statistical analyses used to estimate the contributions of schools or teachers to growth in student performance. Operationally, the VA estimated for a particular school is simply the difference between the growth observed over a period of time and the expected growth. In this study we focus on the effect of including predictors in the analysis. The introduction of background predictors in VA analysis is a contentious issue that can affect the final scores (Ballou, Sanders, & Wright, 2004; Ferrão, 2009; Haegeland & Kirkeboen, 2008; Hibpshman, 2004; Tekwe et al., 2004; Choi, Goldschmidt, & Yamashiro, 2006; Lockwood, McCaffrey, Hamilton, Stecher, Le, & Martínez, 2007; Keeves, Hungi, & Afrassa, 2005). On the one hand, Sanders, Saxton, & Horn (1997) note that one advantage of a system based on the analysis of student gains is that it does not need to incorporate student covariates in the models because each student, so to speak, is his or her own control. On the other hand, Raudenbush and Bryk (2002, p. 111) suggest that adjusting for variables that describe the student's context is important for two reasons:

‐ Because persons are not usually assigned at random to organizations, failure to control for background may bias the estimates of organization effects.

‐ If these student covariates are strongly related to the outcome of interest, controlling for them will increase the precision of any estimates of organizational effects and the power of hypothesis tests by reducing unexplained variance at the student level.

Therefore, whether or not to introduce covariates is a key decision in building VA models. The use of socioeconomic characteristics in contextualized VA models can have a negative impact on the equity and efficiency of decision making; however, much of this depends on how the information provided by the VA analysis is used (OCDE, 2008, p. 33). In an empirical study, Choi, Goldschmidt, & Yamashiro (2006) compare the rankings produced by different contextualized models and find high correlations, of about 0.97, between the model that includes initial status as a predictor and the model that incorporates socioeconomic status (measured by eligibility for free or reduced-price lunch) as the main covariate.
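The operational definition of VA given earlier in this section can be restated compactly. The symbols below are introduced here only for illustration and are not the paper's own notation:

```latex
\mathrm{VA}_j = g_j^{\mathrm{obs}} - g_j^{\mathrm{exp}}
```

Here $g_j^{\mathrm{obs}}$ stands for the growth observed in school $j$ over the period and $g_j^{\mathrm{exp}}$ for the growth expected for that school under the chosen model. A positive value means the school's students grew more than expected. Which covariates enter the model changes $g_j^{\mathrm{exp}}$, and this is why the choice of background predictors can change the VA score.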

3. Methods, techniques, or modes of inquiry

The statistical model used in the study with the Spanish data is a nested model. It treats the performance data as nested in different levels, with no possibility of units moving between them: several performance scores are nested within each student, and students are clustered in classes or schools. We implement this model using multilevel analysis.

The outcome variable was mathematics performance, and performance in reading comprehension at the first measurement occasion was used as a predictor.

The background predictors used in this study were student-level characteristics. Socioeconomic status (SES), representing the student's environment, was built from the educational level of the parents. Prior performance was measured in a different subject, reading comprehension in this case, to avoid the collinearity that would arise if the same subject were used. Both SES and prior reading performance are centered on the grand mean.
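Grand-mean centering, as described above, can be sketched in a few lines. This is a minimal illustration: the values, the variable names and the helper `grand_mean_center` are hypothetical and do not come from the study's data.

```python
import numpy as np

# Hypothetical student-level values; not taken from the study's data.
ses = np.array([2.0, 3.0, 5.0, 4.0, 1.0])
read_prior = np.array([240.0, 255.0, 270.0, 262.0, 233.0])


def grand_mean_center(x):
    """Subtract the mean over all students, ignoring school membership."""
    x = np.asarray(x, dtype=float)
    return x - x.mean()


ses_c = grand_mean_center(ses)
read_prior_c = grand_mean_center(read_prior)

print(ses_c)
print(read_prior_c)
```

With centered predictors, the model intercept can be read as the expected initial status of a student with average SES and average prior reading performance.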

We built four models using the combinations of SES and prior performance in reading comprehension, and then estimated the initial status and growth residuals for every school:

1. Null model: without predictors

2. Model 1: with socioeconomic status (SES)

3. Model 2: with prior performance in reading

4. Model 3: with socioeconomic status (SES) plus prior performance in reading

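In the null model, performance is a linear function of time, and growth is defined as the performance curve measured over time. The growth residual associated with schools ($\upsilon_{1j}$) is taken as the school value added score: it is the part of growth that the model leaves unexplained. The model without predictors is as follows:

$$ y_{tij} = \beta_0 + \beta_1\,\mathrm{time}_{tij} + \mu_{0ij} + \mu_{1ij}\,\mathrm{time}_{tij} + \upsilon_{0j} + \upsilon_{1j}\,\mathrm{time}_{tij} + \varepsilon_{tij} $$

where $t$ indexes measurement occasions, $i$ students and $j$ schools; $\beta_0$ and $\beta_1$ are the fixed initial status and growth rate; the $\mu$ terms are student-level random effects, the $\upsilon$ terms are school-level random effects, and $\varepsilon_{tij}$ is the occasion-level residual. In the results table these appear as b0, b1, the u and v terms, and e.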
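As a hedged sketch, the models with predictors can be written by adding the grand-mean-centered student covariates to the fixed part of the null model. The paper does not print these equations. The results table reports a single coefficient for SES and for prior reading, which suggests that each covariate enters as a main effect only. That reading, the coefficient names $\beta_2$ and $\beta_3$, and the index notation ($t$ for occasions, $i$ for students, $j$ for schools) are assumptions. Under them, Model 3 would be:

```latex
y_{tij} = \beta_0 + \beta_1\,\mathrm{time}_{tij}
        + \beta_2\,(\mathrm{SES}_{ij} - \overline{\mathrm{SES}})
        + \beta_3\,(\mathrm{Read}_{ij} - \overline{\mathrm{Read}})
        + \mu_{0ij} + \mu_{1ij}\,\mathrm{time}_{tij}
        + \upsilon_{0j} + \upsilon_{1j}\,\mathrm{time}_{tij}
        + \varepsilon_{tij}
```

Model 1 corresponds to dropping the $\beta_3$ term and Model 2 to dropping the $\beta_2$ term. In every case the school growth residual $\upsilon_{1j}$ is the quantity compared across models.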

4. Data sources or evidence

The data set used in this study comes from an R&D project entitled "Value added in education and the education production function: a longitudinal study", carried out in Madrid in 2007. Cohort 2 corresponds to the first and second years of compulsory secondary education. The sample was composed of 2128 students nested in 65 schools, measured four times over two school years.
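The nested layout described above (occasions within students within schools) can be illustrated with simulated data. The sketch below is not the estimation procedure used in the study. It generates scores from a three-level linear growth structure with made-up parameter values, then applies a naive two-stage shortcut (an ordinary least squares slope per student, averaged by school) in place of a multilevel estimator. The time coding 0 to 3, all numeric values and both function names are assumptions.

```python
import numpy as np


def simulate_scores(n_schools=8, n_students=25, n_occasions=4, seed=0):
    """Simulate scores from a three-level linear growth structure (illustrative values)."""
    rng = np.random.default_rng(seed)
    beta0, beta1 = 260.0, 10.0          # hypothetical fixed initial status and growth rate
    time = np.arange(n_occasions)       # assumed coding: occasions 0, 1, 2, 3

    # School-level random intercepts and slopes (hypothetical standard deviations).
    v0 = rng.normal(0.0, 15.0, size=n_schools)
    v1 = rng.normal(0.0, 3.0, size=n_schools)
    # Student-level random intercepts and slopes, one per student within each school.
    u0 = rng.normal(0.0, 30.0, size=(n_schools, n_students))
    u1 = rng.normal(0.0, 5.0, size=(n_schools, n_students))
    # Occasion-level residuals.
    e = rng.normal(0.0, 20.0, size=(n_schools, n_students, n_occasions))

    intercept = beta0 + v0[:, None] + u0    # shape (schools, students)
    slope = beta1 + v1[:, None] + u1        # shape (schools, students)
    scores = intercept[:, :, None] + slope[:, :, None] * time + e
    return time, scores, v1


def naive_school_growth_residuals(time, scores):
    """Two-stage shortcut: OLS slope per student, then school mean minus grand mean."""
    n_schools, n_students, n_occasions = scores.shape
    flat = scores.reshape(-1, n_occasions).T     # (occasions, students) as polyfit expects
    slopes = np.polyfit(time, flat, deg=1)[0]    # first row holds the linear coefficients
    school_mean = slopes.reshape(n_schools, n_students).mean(axis=1)
    return school_mean - school_mean.mean()


time, scores, true_v1 = simulate_scores()
residuals = naive_school_growth_residuals(time, scores)

print(scores.shape)              # (schools, students, occasions)
print(np.round(residuals, 2))    # naive school growth residuals
```

Unlike the multilevel model, this shortcut does not shrink the estimates of small or noisy schools toward the mean, so it only conveys the idea of a school growth residual.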

Scores obtained on different tests that increase in difficulty must be placed on a common scale to be compared. The construction of the achievement tests and the development of scales to measure longitudinal growth are critical factors in the process of estimating VA scores.

Linking is a broad term used to describe the different methods available to establish a relationship between the scores of two or more tests that differ in difficulty and content. There are different kinds of score linking; vertical scaling is one of those used to put data from longitudinal assessments on a common scale in VA models, for example in TVAAS (Ballou, Sanders, & Wright, 2004, p. 38).

5. Results and/or conclusions/point of view

The estimation results for the four models are shown in Table 1.

The results show that the unexplained variance between schools and between students is reduced when the predictors are included. Model 3 has the best fit, with the lowest deviance under maximum likelihood estimation. The correlations between the growth residuals show how the estimates change across models (Table 2).

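Table 2. Correlations between the school growth residuals (v1) of the null model and Models 1 to 3

            v1_null   v1_1   v1_2   v1_3
v1_null     1         0.97   0.99   0.94
v1_1        0.97      1      0.96   0.99
v1_2        0.99      0.96   1      0.97
v1_3        0.94      0.99   0.97   1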

The model with SES differs slightly more from the null model (0.97) than the model with prior reading performance does (0.99). The model with both predictors correlates highly with the SES-only model (0.99) and somewhat less (0.97) with the model that includes only prior reading performance. This suggests that the SES effect is the more powerful one in this relationship, in contrast to the conclusions of Choi, Goldschmidt, & Yamashiro (2006).
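The comparison above rests on correlating the school growth residuals produced by two models. A minimal sketch of that calculation follows. The residual values are hypothetical, not the study's estimates, and `ranks` is an illustrative helper.

```python
import numpy as np

# Hypothetical school growth residuals from two models; not the study's estimates.
va_null = np.array([-3.1, -1.4, -0.2, 0.5, 1.1, 3.1])
va_model3 = np.array([-2.6, -1.9, 0.4, -0.1, 1.5, 2.7])


def ranks(x):
    """Rank schools from lowest (1) to highest value added; assumes no ties."""
    order = np.argsort(x)
    r = np.empty(len(x), dtype=int)
    r[order] = np.arange(1, len(x) + 1)
    return r


# Pearson correlation between the two sets of residuals.
pearson_r = np.corrcoef(va_null, va_model3)[0, 1]
# Change in rank position of each school between the two models.
rank_shift = ranks(va_model3) - ranks(va_null)

print(round(pearson_r, 3))
print(rank_shift)
```

Even with a correlation well above 0.9, individual schools can swap positions, which is why the rankings are inspected as well as the correlations.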

Comparing the school residual rankings from the null model with those from the model with SES and prior performance (Model 3) shows how the results change (plots 1 and 2).

Table 1. Estimation results for the four models

                    Null            Model 1         Model 2         Model 3
Fixed part
  b0                261.90 (2.39)   263.46 (2.27)   262.99 (1.99)   264.06 (1.92)
  b1                10.05 (0.43)    9.77 (0.44)     9.99 (0.44)     9.70 (0.45)
  SES               --              3.22 (0.45)     --              2.37 (0.41)
  Read prior        --              --              0.36 (0.01)     0.36 (0.01)
Random part
  School
    var(v0)         295.90          251.56          195.41          172.76
    cov(v0, v1)     -30.54          -30.66          -26.25          -27.39
    var(v1)         7.33            7.27            7.54            7.8
  Student
    var(u1)         1182.02         1172.97         850.82          835.11
    cov(u1, u0)     -156.08         -153.97         -125.72         -122.9
    var(u0)         26.53           26.51           28.22           27.85
  Residual
    var(e)          408.25          403.44          409.29          404.75
Deviance            81276.71        69881.98        74186.19        64350.57
N (school)          61              61              61              61

-- : predictor not included in the model
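The reduction in unexplained variance reported in the text can be quantified from the variance estimates in the table above. The sketch below takes the school-level v0 variance and the larger of the two student-level variances (labelled u1 in the table) and computes the proportional reduction relative to the null model. The choice of these two rows and the function name are assumptions made for illustration.

```python
# Variance estimates copied from the results table (null model and Models 1 to 3).
school_var = {"null": 295.90, "model1": 251.56, "model2": 195.41, "model3": 172.76}
student_var = {"null": 1182.02, "model1": 1172.97, "model2": 850.82, "model3": 835.11}


def proportional_reduction(variances, baseline="null"):
    """Share of the baseline variance removed by each model with predictors."""
    base = variances[baseline]
    return {name: (base - v) / base for name, v in variances.items() if name != baseline}


school_red = proportional_reduction(school_var)
student_red = proportional_reduction(student_var)

for name in school_red:
    print(f"{name}: school {school_red[name]:.1%}, student {student_red[name]:.1%}")
```

On these figures, Model 3 removes roughly 42% of the school-level variance and about 29% of the student-level variance left by the null model. SES alone (Model 1) removes about 15% at the school level and less than 1% at the student level.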

Schools 1 and 9 show remarkable changes. With the predictors included, school 1 becomes different from the global mean, and school 9 changes its position with respect to the mean. Although the predictors produce only small changes in the school results in this study, the variation could be greater elsewhere; in this design the effect is small.
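Whether a school is "different from the global mean" can be read from its 95% confidence interval: the school differs when the interval around its growth residual excludes zero. The sketch below assumes a normal approximation (estimate plus or minus 1.96 standard errors). The paper does not state how its intervals were computed, and the school labels and values here are hypothetical.

```python
# Hypothetical (growth residual, standard error) pairs for three schools.
schools = {
    "A": (1.8, 0.7),
    "B": (-0.4, 0.9),
    "C": (-2.5, 1.0),
}
Z_95 = 1.96  # normal-approximation multiplier for a 95% interval (assumption)


def confidence_interval(estimate, std_error, z=Z_95):
    """Return the lower and upper bounds of the interval around the estimate."""
    return estimate - z * std_error, estimate + z * std_error


def differs_from_mean(estimate, std_error):
    """True when the interval excludes zero, the global mean of the residuals."""
    low, high = confidence_interval(estimate, std_error)
    return low > 0 or high < 0


flags = {name: differs_from_mean(est, se) for name, (est, se) in schools.items()}
print(flags)
```

A school can move from "not different" to "different" between two models through a small shift in its estimate or its standard error, without a large change in rank.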

6. Educational or scientific importance of the study

In educational evaluation, access to extensive series of performance data is growing, along with the concern of many states to carry out the assessment process in a reliable and fair way. If high-stakes decisions are to be made on the basis of VA analysis, we should think carefully about which of these models better represents the real circumstances of schools.


[Plot 1. School VA & CI (95%), null model. Schools are identified by number; the value added axis runs from -10 to 10.]

[Plot 2. School VA & CI (95%), Model 3. Schools are identified by number; the value added axis runs from -8 to 10.]

Works cited

Ballou, D., Sanders, W., & Wright, P. (2004). Controlling for student background in value-added assessment of teachers. Journal of Educational and Behavioral Statistics, 29(1), 37-66.

Choi, K., Goldschmidt, P., & Yamashiro, K. (2006). Exploring models of school performance: From theory to practice. University of California, CRESST, CSE. Los Angeles: University of California.

Downes, D., & Oanh, V. (2007). Value-added measures for school improvement. Victoria: Department of Education and Early Childhood Development.

Ferrão, M. E. (2009). Sensibilidad de las especificaciones del modelo de valor añadido: midiendo el estatus socioeconómico. Revista de Educación, (348), 137-152.

Haegeland, T., & Kirkeboen, L. (2008). School performance and value-added indicators: What is the importance of controlling for socioeconomic background? A simple empirical illustration using Norwegian data. Report 2008/8. Statistics Norway.

Hibpshman, T. L. (2004). A review of value-added models. Kentucky: Kentucky Education Professional Standards Board.

Keeves, J. P., Hungi, N., & Afrassa, T. (2005). Measuring value added effects across schools: Should schools be compared in performance? Studies in Educational Evaluation, 31, 247-266.

Lockwood, J. R., McCaffrey, D. F., Hamilton, L. S., Stecher, B., Le, V.-N., & Martínez, J. F. (2007). The sensitivity of value-added teacher effect estimates to different mathematics achievement measures. Journal of Educational Measurement, 44(1), 47-67.

McCaffrey, D. F., Koretz, D., Louis, T. A., & Hamilton, L. (2004). Models for value-added modeling of teacher effects. Journal of Educational and Behavioral Statistics, 29(1), 67-101.

OCDE. (2008). Measuring improvements in learning outcomes. Paris: OCDE.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). California: Sage Publications.

Sanders, W., Saxton, A., & Horn, S. (1997). The Tennessee Value-Added Accountability System: A quantitative, outcomes-based approach to educational assessment. In J. Millman (Ed.), Grading teachers, grading schools: Is student achievement a valid evaluation measure? (pp. 137-162). Thousand Oaks: Corwin Press.

Tekwe, C. D., Carter, R. L., Ma, C. X., Lucas, M. E., Roth, J., Ariet, M., et al. (2004). An empirical comparison of statistical models for value-added assessment of school performance. Journal of Educational and Behavioral Statistics, 29(1), 11-36.
