
 Research in Higher Education, Vol. 42, No. 1, 2001

PERFORMANCE FUNDING IN HIGHER EDUCATION: The Effects of Student Motivation on the Use of Outcomes Tests to Measure Institutional Effectiveness

Jeff E. Hoyt
Director of Institutional Research, Utah Valley State College

Address correspondence to: Jeff E. Hoyt, Utah Valley State College, 800 West University Parkway, Orem, UT 84058-5999; [email protected].

This study obtained data from 1,633 students who took the Collegiate Assessment of Academic Proficiency (CAAP) to evaluate whether the English, math, and critical thinking exams should be used as performance funding measures in the state. Analysis of Covariance (ANCOVA) was utilized to demonstrate that the use of the exams was problematic because students failed to give the assessments their best effort. Other variables in the analysis included level of English and math completed, grade point averages for English and math, cumulative grade point averages, and student characteristics. Policy implications for performance funding are also addressed in the study.

KEY WORDS: performance funding; outcome measures; testing.

INTRODUCTION

Performance funding and outcomes assessment have captured the attention of 

public institutions of higher education, legislators, and accrediting associations.

All the regional accrediting associations now require outcomes assessment for 

accreditation (Banta and Pike, 1989). A 1997 survey of state higher education

finance officers found that 20 percent of states had implemented performance

funding, and 52 percent “were likely to continue or adopt it in the next five

years” (Burke, 1998a, p. 6).

Due to the increasing use of performance funding, it is important that policy-

makers understand the impact and validity of the outcomes measures they are

selecting to fund higher education. If outcomes are poorly measured or indicators are not valid, the goals of performance funding to improve the higher education sector may not be realized. Performance funding may also create unintended consequences.

The present study analyzes the validity of using the Collegiate Assessment of 

Academic Proficiency (CAAP) as an outcomes measure. This outcomes measure

was being considered in a pilot test for the Utah System of Higher Education

(USHE) with possibilities of being tied to performance funding. Banta and Pike

(1989) previously evaluated the use of the College Outcome Measures Project

(COMP) developed by the American College Testing Service and the Academic

Profile designed by the Educational Testing Service as outcomes measures. This

research supports the findings of Banta and Pike that the use of outcomes testing

to measure student learning is problematic due to a lack of student motivation

to perform on the assessments.

In the following sections, a literature review of performance funding in higher education is presented first, followed by the study methodology and findings.

Implications for practice are then discussed.

LITERATURE REVIEW

The Emergence of Performance Funding

Legislatures have traditionally funded higher education institutions on enroll-

ment growth. Enrollment-driven funding has increased access to higher educa-

tion, but “many state leaders believe that the expansion of access has come at

the cost of adequate standards of performance” (Folger, 1984, p. 1). In other 

words, “obtaining students gain[ed] more importance than giving them a good

education.” (Bogue and Brown, 1982, p. 124). There have also been complaints about the productivity of higher education institutions. In response, states have

adopted performance funding to give colleges and universities incentives to im-

prove higher education.

Some authors believe that the increased emphasis on accountability was

driven in part by limited resources (Ewell, 1994). Performance funding shifts

the focus from the need for more resources to justifying existing support and

accountability. It changes the “budget question from what states should do for 

their campuses to what campuses did for their states” (Burke and Serban, 1998,

p. 1).

Selection of Performance Measures

Institutions have multiple outputs that make the selection of performance measures difficult. In addition to educational training, colleges and universities

produce basic research, facilitate technology transfer to business and industry,


promote business development, offer medical services and hospital care, athletic

events, theater and performing arts programs, recreational activities, radio and

television programs, and other public services. In the instructional area, they

provide “career training, occupational retraining, developmental course work,

continuing education programs, contract training for business and industry, and

a variety of other educational offerings” (Mayes, 1995, p. 13).

When limiting outcomes assessment to educational outcomes, there are many

areas of learning that could be measured. These might include:

Scientific/mathematics problem solving, [writing skills], interpersonal skills/group dynamics, humanistic outcomes, basic communication skills, life skills, knowledge and intellectual development, critical thinking, cognitive complexity . . . , liberalization of political and social values, increased tolerance for others, community and civic responsibility . . . , self assurance and relationships with others, job satisfaction, relationship between college major and current work . . . , task or problem solving skills, analytical thinking skills, self directed learning skills, humanistic or artistic appreciation . . . , [and] consumer awareness. (Graham and Cockriel, 1990, pp. 280–282; Pace, 1984; Kuh and Wallman, 1986)

Due to this complexity, multiple measures are used to assess quality in higher 

education. States may use several outcomes measures because they do not want

to neglect important outputs of higher education. Different measures may be

relevant because the missions of community colleges, four-year institutions, and

research universities are not the same. Policymakers may advocate periodically

changing outcome measures to improve other important areas within the college

(Bogue and Brown, 1982). It may be necessary to rely on a combination of 

measures because any one measure may not accurately reflect an outcome.

Some view it as a way to spread around the resources so that everyone gets a

share: “A university scoring low on one or two standards should score higher on others due to the variety among standards. No institution [would] be left out

of earning some of the performance funds available” (Ashworth, 1994, p. 13).

For example, the Tennessee legislature used the number of academic pro-

grams accredited, standardized tests of general education and major fields of 

study, satisfaction surveys, and peer evaluation of programs for their outcomes

measures (Bogue and Brown, 1982; Banta, Rudolph, Van Dyke, and Fisher,

1996). Indicators promoted in Texas included transfer student retention, effec-

tiveness of remedial education programs, the number of tenure-track faculty

teaching undergraduates, degrees awarded, the number of course completers and

graduates in critical skill areas, scores on national graduate exams, federal re-

search dollars, income from intellectual property, and the number of faculty

involved in public service (Ashworth, 1994). Other measures were student eval-

uations of faculty, faculty credentials, teaching loads, space utilization, credits at graduation, job placement rates, employer surveys, student performance on

licensure and certification exams, implementation of strategic plans or post-tenure reviews, faculty compensation, and administrative costs (Burke and Serban, 1998; Mayes, 1995; Van Dyke, Rudolph, and Bowyer, 1993; South Carolina Commission on Higher Education, 1997).

Burke and Serban (1998) provide an excellent summary of performance indi-

cators used in eleven states. The number of indicators in states ranged from

three to thirty-seven (Burke, 1998b). The authors described several categories

for outcomes measures such as input, process, output, or outcomes indicators

(Burke, p. 49). Although there is substantial variation in the indicators selected

by states, performance funding creates a shift from inputs to process and output

or outcomes measures. The authors also found that the choice of indicators often

varied depending on whether they were prescribed by the legislature or set

through a participative process within the higher education system. The weights

given to indicators for funding purposes were also different among states.

There are several possible indicators that could be selected, which can create confusion among policymakers. Ewell and Jones (1994) discuss the importance

of having a clear understanding of the purpose of the indicators and articulating

the policies that the system wants to promote. Layzell (1999) advocates a “small

number of well-defined and well-conceived indicators” (p. 238).

Linking Funding to Performance Indicators

Funding for outcomes measures may be provided using different mechanisms.

Funds may be awarded for improvement in performance compared with prior 

years. Another option is to appropriate funding to institutions that perform better 

than the national norm (Mayes, 1995). A fixed rate may be set for each outcome

without making any comparisons (Ashworth, 1994). Funds may be appropriated

using a formula or a competitive grant (Layzell and Caruthers, 1995). Value-added assessment also has support (Astin, 1990; Banta and Fisher, 1984; Folger,

1984; Eyler, 1984). This may require testing when students enter college and

after graduation to measure what they learned from their college experience.

The linkage between funding and the indicators may be direct or indirect

(Layzell, 1999).

Because gathering information is costly, the “availability of data, simplicity,

and flexibility” are important considerations (Ashworth, 1994, p. 11). During

the first four years of performance funding in Tennessee, the legislature pro-

vided state appropriations for the development of assessment procedures (Eyler,

1984). In other states, higher education officials wanted increased autonomy “in

return for increased accountability of outcomes” (Jemmott and Morante, 1993,

p. 309).

The amount of state funds set aside for performance funding also varies. It ranges from less than one percent to four percent of state appropriations (Serban,

1998). The idea is to provide enough incentive for change, but not so much that


it disrupts the operations of colleges and universities. One exception is South

Carolina where the state was moving toward placing their entire higher educa-

tion budget into performance based funding (Serban).

The Effects of Performance Funding

There have been mixed reactions to performance funding. Several authors

reported that it created a new environment on campus where faculty and staff 

became more concerned with educational outcomes (Van Dyke et al., 1993;

Jemmott and Morante, 1993). Due to the low scores of students on problem

solving, one campus initiated a series of workshops and trained faculty in teach-

ing critical thinking skills (Van Dyke et al.). They also distributed student re-

sumes to potential employers to increase their job placement rates, and required

all administrators to provide student advising to improve retention (Van Dyke et al.). To achieve gains in student test scores, other authors recommended

teaching “critical thinking, problem solving, quantitative reasoning, and writing”

across the curriculum (Jemmott and Morante, p. 310). Spiral techniques of in-

struction were promoted on some campuses to give students a frequent review

of the curriculum (Van Dyke et al.).

Others were concerned about the impact and utility of performance funding.

They believed that “the process becomes increasingly bland since the temptation

is to include all interests and then reduce them to what is most easily measur-

able” (Ecclestone, 1999, p. 36). In some cases, it “has not had much impact on

the curriculum” and was more useful for grants and reporting purposes (Amiran,

Schilling, and Schilling, 1993, p. 78). There were concerns that institutions with

underprepared students operated “under a severe handicap” when competency

testing was used, and those with better prepared students had “an unfair advantage” (Astin, 1990, p. 38). High-risk students who require remedial education

may be less likely to graduate. In a survey of staff in Tennessee “none of the

coordinators believed student success (as measured by persistence to graduation)

improved learning” (Mayes, 1995, p. 19).

Outcomes measures based on competency tests and surveys were also ques-

tioned. In a survey conducted in Tennessee, “only 26 percent of the campuses

ascribed any positive influence on student learning to the massive efforts made

to test most graduates using a standardized test of general education” (Banta et

al., 1996, p. 30). Other educators found that competency testing narrows instruc-

tional goals because faculty teach to the test to increase their funding levels

(Banta and Pike, 1989; Jemmott and Morante, 1993; Astin, 1990). Standardized

tests may also promote the acquisition of facts rather than more complex think-

ing skills (Astin). There may be “too much emphasis on tests” without givingconsideration to other ways of measuring effective outcomes (Banta and Fisher,

1984, p. 40).


There was concern among educators with the validity of test instruments.

Yorke (1997) reported that teaching quality assessments may not be valid. Pike

(1999) found that the responses of students who graduate and report their per-

ceptions of institutional effectiveness on surveys may be biased by a “halo ef-

fect” (p. 81). Several other authors noted that general education tests often did

not measure what was being taught in the classroom (Amiran et al., 1993; Banta

and Pike, 1989; Yarbrough, 1992).

Outcomes assessment and performance funding have not always been suc-

cessful. Arkansas, Kentucky, and Texas are three states that initiated and then

later dropped the use of performance funding (Burke 1998a; Layzell and Caruth-

ers 1995). Colorado “suspended it in preparation for a new program in 1999”

(Burke, 1998b, p. 49). In a 1996 survey of 1,813 state policymakers and campus

administrators, state policymakers were more likely to believe that performance

funding will achieve its intended goals than administrators on campus (Serban, 1998).

Student Motivation and Outcomes Assessment

Several authors have demonstrated that student motivation on assessment

tests is a concern. Banta and Pike (1989) evaluated two commonly used tests,

the College Outcome Measures Project (COMP) and the Academic Profile. A

critical problem with the testing was student lack of effort on the exams. Stu-

dents had little incentive to try on the tests, which took more than two hours to

complete. Only “26 percent of those taking the COMP said they had ‘tried as

hard as they could have’ on the test; while, only 20 percent of those taking the

Profile gave this response” (Banta and Pike, p. 461). Yarbrough (1992) provides

an excellent review of research that examined the ACT COMP exam. Student motivation was lacking in several studies, and the authors concluded that “stu-

dent involvement and motivation must be ensured” for an effective assessment

program (Yarbrough, p. 231). Other critics believed that efforts to measure teaching quality in terms of outputs were a “zeal for quantification carried to its inher-

ent and logical absurdity” (Enarson, 1983, p. 8).

Educators at the secondary level have also studied the effect of motivation

on assessment. Bracey (1996) found that students in secondary school districts

were not taking the National Assessment of Educational Progress (NAEP) test

seriously. School districts were able to increase the test performance of these

students through praise, setting up competition among classrooms, and giving

financial incentives to students. Rothman (1995) cited several ways to increase

student motivation on assessments, such as basing grade promotion on test re-

sults, giving students a special diploma signifying greater achievement, and cash awards. In another study, gender differences in math performance were ex-

plained by motivation (Terwilliger and Titus, 1995). Males earned higher


scores than females because they were more motivated to perform on the exams.

The present study contributes to the literature in several ways. The results

demonstrate that low student motivation on outcomes assessment can adversely

impact the revenues institutions receive through performance funding. In addi-

tion, it illustrates the complexity and uncertainty surrounding the effect of per-

formance measures and the need to give careful thought to the implementation

of performance funding programs.

METHODS

The population for the present study included 1,633 students who received

an associate’s degree from Utah Valley State College (UVSC) and took the

CAAP test developed by the American College Testing Program (ACT) during

the 1997–1998 academic year. UVSC is a public institution that enrolled 18,174 students during the fall of 1998. Students on average were 22 years of age.

About 54 percent were males, and 46 percent were females. The student popula-

tion was predominantly Caucasian (93 percent).

Students who did not complete any English and math courses at UVSC were

excluded from the study for two primary reasons. First, there was difficulty in

determining whether the English courses transferred to UVSC were college-

level writing courses. Second, the focus of the study was on the curriculum at

UVSC versus the curriculum at other institutions.

The CAAP test scores, student ratings of their effort on the test, and their 

self-categorization as a full-time student were merged with other demographic

and transcript data captured in the student information system using social secu-

rity numbers. The total transfer credit students earned, their age at the time they

took the test, and cumulative grade point averages were calculated for the analysis. The three tests analyzed in this study were the mathematics, writing skills,

and critical thinking assessments.
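To make the data assembly concrete, the following sketch shows how such a record-level merge and the derived variables might be computed in Python with pandas. It is illustrative only: the study used SPSS and social security numbers as the matching key, and the file and column names below (student_id, caap_results.csv, and so on) are hypothetical.

    import pandas as pd

    # Hypothetical extracts (column names are illustrative, not the study's):
    # CAAP results with self-reported effort, and demographic/transcript data
    caap = pd.read_csv("caap_results.csv")    # student_id, test_date, caap_math, caap_writing,
                                              # caap_critical_thinking, effort, full_time_self_report
    sis = pd.read_csv("student_records.csv")  # student_id, sex, ethnicity, birth_date,
                                              # transfer_credits, grade_points, gpa_credits

    # Record-level merge on a shared identifier (the study matched on social security numbers)
    df = caap.merge(sis, on="student_id", how="inner")

    # Derived variables described in the text
    df["age_at_test"] = (
        pd.to_datetime(df["test_date"]) - pd.to_datetime(df["birth_date"])
    ).dt.days // 365
    df["cumulative_gpa"] = df["grade_points"] / df["gpa_credits"]
    df["total_transfer_credit"] = df["transfer_credits"].fillna(0)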

This analysis found a serious problem with student effort on the tests. De-

scriptive statistics were used to report this concern. Analysis of Covariance

(ANCOVA) was also used to demonstrate that there was a statistically signifi-

cant difference in the test scores of students depending on their level of effort.

The assumptions underlying ANCOVA were assessed in the study. The as-

sumption of homogeneous group regression coefficients was met for all three

CAAP tests. The interaction terms were obtained and entered into a linear re-

gression with the covariates and the independent variable to determine whether 

there was a significant relationship between the interaction terms and the depen-

dent variable. In all cases, the interaction terms were not significant predictors

of student performance on the exams at the .05 level. There were weak correlations among the covariates and the independent variable. The covariates were independent.
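As an illustration of this check of the equal-slopes assumption, the sketch below fits the ANCOVA model with and without covariate-by-group interaction terms and compares the two with a nested F test. It assumes hypothetical variable names (caap_math, effort_group, and so on) and is not the SPSS procedure actually used in the study.

    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    df = pd.read_csv("caap_analysis_file.csv")  # hypothetical merged analysis file

    covariates = ("math_gpa + age_at_test + female + caucasian + full_time"
                  " + transfer + college_algebra")

    # ANCOVA model: effort group plus covariates (main effects only)
    reduced = smf.ols(f"caap_math ~ C(effort_group) + {covariates}", data=df).fit()

    # Same model with covariate-by-group interaction terms added
    full = smf.ols(f"caap_math ~ C(effort_group) * ({covariates})", data=df).fit()

    # A nonsignificant F for the added interaction block supports the
    # assumption of homogeneous regression slopes across effort groups
    print(anova_lm(reduced, full))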


The assumptions of linear regression were also met. The residual scores for 

each test had means of zero. The distributions of the residuals for the writing

skills and critical thinking tests were slightly positively skewed, but reasonably

close to normal. The distribution of the residuals for math was normal. There

were no curvilinear relationships among the variables.

In addition to these procedures, Levene’s test for equality of error variances

was used to test the null hypothesis that the error variance of the dependent

variable was equal across groups.

The test was not significant for the writing skills and critical thinking exams.

The Levene test was significant for math, but given the very large sample sizes

and small value of Fmax (.55), it would have a minimal impact on the results.

The omnibus or overall F-tests were calculated for each exam followed by

pairwise comparisons based on estimated marginal means. The adjustment for 

multiple comparisons was made using the Bonferroni test. The type III sum of squares was used in the present study because it was more appropriate for an

unbalanced model with no missing cells (SPSS, 1997, p. 34).
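The remaining steps, Levene’s test, the omnibus ANCOVA with Type III sums of squares, and Bonferroni-adjusted pairwise comparisons, could be approximated along the following lines. The variable names are again hypothetical, and the pairwise test shown compares observed rather than covariate-adjusted (estimated marginal) means, so it is only a rough stand-in for the procedure reported here.

    import pandas as pd
    import scipy.stats as stats
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm
    from statsmodels.stats.multicomp import MultiComparison

    df = pd.read_csv("caap_analysis_file.csv")  # hypothetical merged analysis file

    # Levene's test: H0 is that error variance is equal across effort groups
    groups = [g["caap_math"].to_numpy() for _, g in df.groupby("effort_group")]
    print(stats.levene(*groups))

    # ANCOVA table with Type III sums of squares (sum-to-zero contrasts)
    model = smf.ols(
        "caap_math ~ C(effort_group, Sum) + math_gpa + age_at_test + female"
        " + caucasian + full_time + transfer + college_algebra",
        data=df,
    ).fit()
    print(anova_lm(model, typ=3))

    # Pairwise comparisons of effort groups with a Bonferroni adjustment
    mc = MultiComparison(df["caap_math"], df["effort_group"])
    print(mc.allpairtest(stats.ttest_ind, method="bonf")[0])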

STUDENT EFFORT AND ITS IMPACT ON OUTCOMES TESTING

The main finding of the present study was that UVSC students did not give

the CAAP test their best effort, which resulted in an inaccurate measure of 

learning outcomes. The testing center monitored students taking the exams, and

those who attempted to leave early without giving the test any real effort were

counseled to remain in the center and continue working on the assessments.

Staff informed these students that the graduation office would be contacted and

their degree placed on hold. These actions had a limited impact as most students

did not give the exams their best effort.

The majority of students did not give the mathematics and critical thinking

tests their best effort (Table 1). Nearly half the students taking the writing test

did not give their best effort. Only 32.5 percent of the students taking the mathe-

matics test gave the exam their best effort. A substantial number of students

gave the tests little or no effort.

TABLE 1. Student Effort and Outcomes Testing

                     Group 1          Group 2            Group 3          Group 4
Test                 Best Effort      Moderate Effort    Little Effort    No Effort
                     n (%)            n (%)              n (%)            n (%)
Mathematics          460 (32.5%)      637 (45.0%)        273 (19.3%)      45 (3.2%)
English              823 (55.9%)      525 (35.7%)        96 (6.5%)        28 (1.9%)
Critical Thinking    660 (42.0%)      671 (42.6%)        192 (12.2%)      50 (3.2%)
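A distribution such as the one in Table 1 can be tabulated directly from the merged file. A minimal sketch, assuming each student’s self-reported effort on each exam is stored in its own column (hypothetical names), is:

    import pandas as pd

    df = pd.read_csv("caap_analysis_file.csv")  # hypothetical merged analysis file

    # Counts and percentages of self-reported effort on the mathematics exam;
    # the same tabulation applies to the writing and critical thinking exams
    counts = df["math_effort"].value_counts()
    percents = (df["math_effort"].value_counts(normalize=True) * 100).round(1)
    print(pd.DataFrame({"n": counts, "percent": percents}))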


ACT provides reference group scores or user norms for the CAAP test (Amer-

ican College Testing Program, 1998a). These reference group scores reflect na-

tional data for comparative purposes. The CAAP test is a norm referenced exam,

and there are no interpretations for absolute levels of achievement. UVSC stu-

dent performance on the exams was similar or slightly higher than most other 

students at two-year colleges taking the exams. For example, the average writing

score for UVSC students was 63.0 compared with the reference group writing

score of 62.7. The average math score for UVSC students was 57.1 compared

with 56.1 for students nationally. The average critical thinking score for UVSC

students was 62.2 compared with 61.2 for students nationally.

Data provided by ACT also showed that students at two-year colleges nation-

ally gave the test somewhat more effort than students at UVSC; however, a

substantial number did not give the exams their best effort (American College

Testing Program, 1998b). Excluding the non-responses, 68 percent of the students nationally gave the English assessment their best effort, and 52 percent

gave the critical thinking exam their best effort. However, only 43 percent of 

the students nationally gave the mathematics assessment their best effort. The

non-response rates on the exams were as follows: writing skills 8 percent, math-

ematics 10 percent, critical thinking 24 percent. In the national data provided

by ACT, UVSC students made up about 19 percent of the students taking the

writing exam, 20 percent of those taking the mathematics exam, and 24 percent

of students taking the critical thinking exam. In other words, students who did

not give the exams their best effort were included in overall average scores.

UVSC was comparing the average mathematics performance of their students

with students nationally who, in general, did not report giving the mathematics

assessment their best effort. A substantial percentage of students on the other 

assessments did not give their best effort, particularly on the critical thinking exam.

Another concern with the comparison is the inconsistency in how colleges na-

tionally use the test. Colleges and universities may require all students to take the

exam during their sophomore year before graduation or obtain exam scores for a

sample of students who may be volunteers or enrolled in selected programs.

As expected, the average test scores of students increased when they gave

more effort on the exams (Table 2). The same was true for the national sample.

If funding were based on overall averages, institutions with students who take

the testing more seriously may receive more funding than institutions with stu-

dents failing to try their best on exams.

Analysis of Covariance Results

Rather than use only descriptive statistics to examine student effort, more advanced statistical procedures were adopted to control for the impact of other variables.


TABLE 2. Mean Test Scores and Student Effort

                     Group 1          Group 2            Group 3          Group 4
Test                 Best Effort      Moderate Effort    Little Effort    No Effort
                     Mean (SD)        Mean (SD)          Mean (SD)        Mean (SD)
Mathematics          58.0 (3.9)       57.2 (3.2)         55.8 (2.9)       53.1 (3.3)
English              64.1 (5.2)       62.6 (4.6)         59.0 (4.2)       54.6 (3.3)
Critical Thinking    63.9 (5.3)       62.3 (5.3)         58.4 (4.8)       53.2 (3.4)

The interest was in assessing whether effort on the test was still a significant factor when controlling for other relevant variables. ANCOVA was used to control for any differences among the groups and removed the effect of

the covariates. Students were placed into three groups based on ratings of their 

effort on the tests: (1) Tried my best, (2) Gave moderate effort, and (3) Gave

little or no effort. The students who gave little or no effort were combined to

keep the sample sizes for each of the groups more equivalent.

Several variables were used as covariates in the ANCOVA procedure. Some

covariates were entered as dummy variables. These included sex (1 = female, 0

= male), minority status (1 = Caucasian, 0 = minority), full-time status (1 = full-

time, 0 = part-time), transfer student (1 = yes, 0 = no), took college algebra (1 =

yes, 0 = no), and completed college writing (1 = yes, 0 = no).

Other quantitative variables included in the analysis were student age at the

time of the test, math and English grade point averages, and cumulative grade

point averages. A student’s math and English grade point averages were believed to be more accurate measures of student ability in math and English; a

student’s cumulative grade point average was used to control for academic abil-

ity when analyzing the critical thinking exam.
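For concreteness, the grouping and dummy coding just described might be constructed as in the sketch below. The raw field names are hypothetical placeholders rather than the actual fields in the student information system.

    import pandas as pd

    raw = pd.read_csv("merged_records.csv")  # hypothetical merged file

    df = pd.DataFrame({
        "female": (raw["sex"] == "F").astype(int),                   # 1 = female, 0 = male
        "caucasian": (raw["ethnicity"] == "Caucasian").astype(int),  # 1 = Caucasian, 0 = minority
        "full_time": (raw["status"] == "full-time").astype(int),     # 1 = full-time, 0 = part-time
        "transfer": (raw["transfer_credits"] > 0).astype(int),       # 1 = transfer student
        "college_algebra": raw["took_college_algebra"].astype(int),
        "college_writing": raw["took_college_writing"].astype(int),
        "age_at_test": raw["age_at_test"],
        "math_gpa": raw["math_gpa"],
        "english_gpa": raw["english_gpa"],
        "cumulative_gpa": raw["cumulative_gpa"],
    })

    # Collapse the four self-reported effort levels into three groups; "little"
    # and "no" effort are combined to keep group sizes more comparable
    effort_map = {"best": "best", "moderate": "moderate",
                  "little": "little_or_none", "none": "little_or_none"}
    df["effort_group"] = raw["effort"].map(effort_map)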

With the exception of full-time status and age, the covariates and independent

variable were always significant in the ANCOVA results (Table 3). Eta-squared

was used to estimate the effect size for each parameter. Student demographic

variables had a small effect on student performance. There were moderate ef-

fects for academic variables. However, student effort on the exam was just as

important as the academic variables. For example, student effort on the exams

explained about nine and sixteen percent of the variation in test scores on the

English and critical thinking assessments respectively, and it was the variable

that explained the largest amount of variance for these exams. A student’s math

grade point average explained about seven percent of the variation in mathematics test scores, followed by student effort on the test, which explained six percent of the variance. In other words, student effort had an important impact on a student’s test scores, when controlling for other variables. As students increased their effort on the exams, their test scores improved.


TABLE 3. Analysis of Covariance and Outcomes Tests

                        Mathematics             English                 Critical Thinking
                        (N = 1,399)             (N = 1,462)             (N = 1,569)
Source                  F          Eta Squared  F          Eta Squared  F          Eta Squared
Corrected Model         45.005     .226*        38.555     .193*        76.667     .282*
Intercept               7945.210   .851*        2053.256   .586*        1612.007   .508*
Sex                     16.419     .012*        13.270     .009*        12.943     .008*
Minority Status         5.457      .004**       15.232     .010*        7.543      .005*
Full-time Status        8.530      .006*        .606       .000         .218       .000
Transfer Student        14.524     .010*        26.082     .018*        16.946     .011*
Age                     27.410     .019*        16.583     .011*        1.928      .001
College Algebra         59.684     .041*        —          —            —          —
College Writing         —          —            5.183      .004**       —          —
Average Math GPA        100.175    .067*        —          —            —          —
Average English GPA     —          —            49.543     .033*        —          —
Cumulative GPA          —          —            —          —            212.411    .120*
Effort                  43.643     .059*        75.446     .094*        145.916    .158**

*p < .01; **p < .05.

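The eta-squared values reported in Table 3 correspond to each term’s sum of squares divided by the total sum of squares of the test score. A minimal sketch of that calculation, again with hypothetical variable names rather than the study’s SPSS output, follows; the classical definition of eta-squared is assumed here.

    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    df = pd.read_csv("caap_analysis_file.csv")  # hypothetical merged analysis file

    model = smf.ols(
        "caap_math ~ C(effort_group, Sum) + math_gpa + age_at_test + female"
        " + caucasian + full_time + transfer + college_algebra",
        data=df,
    ).fit()

    table = anova_lm(model, typ=3)
    ss_total = ((df["caap_math"] - df["caap_math"].mean()) ** 2).sum()
    table["eta_sq"] = table["sum_sq"] / ss_total   # eta-squared for each term
    print(table[["sum_sq", "F", "PR(>F)", "eta_sq"]].round(3))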

The overall F test was significant indicating that there were significant differ-

ences in test scores among the groups. All the pair-wise comparisons were statis-

tically significant. There was only a small difference between the estimated

marginal means and the unadjusted means for the groups. The models only explained 19 to 28 percent of the variation in test scores. However, these results

were not surprising given that the majority of students failed to give the assess-

ments their best effort.

Because the explained variance was small, the fit between math and English

courses taken by students and the exams was examined in more detail. College

algebra and traditional college writing composition courses were not required

for an associate’s degree at UVSC for several programs. About 94 percent of 

the graduates completed college writing versus alternative writing courses. Most

of the students who did not take college writing completed business English

courses to earn their degrees. About 78 percent of the graduates completed

college algebra or a higher level of math at UVSC; however, a substantial num-

ber of graduates took business math or specialized math courses for drafting,

fire science, auto mechanics, or electronics. In other words, these students were being tested on college algebra and calculus when it was not required for their

degrees.


IMPLICATIONS FOR RESEARCH AND PRACTICE

The present study has important implications for policymakers who are at-

tempting to use outcomes testing and performance funding. It calls into question

the use of outcomes testing without effective motivational factors that encourage

students to give assessments their best effort. Administrators have a limited

ability to provide incentives, and it is unrealistic to assume that they can force

students to give their best effort. The goal of rewarding institutions for actual

student learning is not achievable when students fail to try on assessments. It is

displaced with rewards for institutions that have more motivated or compliant

students.

It is conceivable that students could be required to retake general education

courses if they fail to pass general education exams, but this would be unpopular 

with students and difficult to carry out. Exams required for licensure or for entry

into graduate school may be more appropriate because of student incentives to perform.

An important research question is whether student self-reported effort on ex-

ams is accurate. Students may indicate little effort when they know that they

have not performed well on the exam. Is actual performance on the exam an

accurate indicator, or is self-reported effort valid? The present study found that

both academic variables and self-reported effort were significant predictors of 

student performance on the exams. However, possible confounding effects dem-

onstrate that quantification of student learning is problematic. A sound approach

is to ensure that students have sufficient incentives to give proficiency exams a

sincere effort.

General education tests covering traditional areas of study may not be appro-

priate for students in vocational and trades programs. Students in these programs

do not take the same courses as those in traditional four-year degree programs.

This requires different measures to assess the effectiveness of instruction and

student learning. If institutions offer substantially different programs within the

system, comparisons with other institutions may be inappropriate.

The study findings also have broader implications for performance funding

in general. If state policymakers rush into performance funding without under-

standing the impact of their choice of indicators, they may not achieve their 

goals. Indicators may be subject to manipulation, and performance funding may

result in unintended consequences. This study provides one example of a perfor-

mance indicator that was intended to reward institutions for student learning—

but student learning was not accurately measured.

States should consider implementing pilot programs to assess the appropriate-

ness of using specific performance measures. This practice can minimize the negative effects of implementing performance funding programs that may not work as planned. Some colleges and universities, however, may live with questionable measures because they serve political purposes and appease legislators

who may fail to fully understand the realities of outcomes testing.

A closer study of specific indicators is needed to evaluate whether they truly

provide the benefits envisioned by legislators or waste resources. Conley and

McLaughlin (1999) provide a paradigm for evaluating performance measures.

The authors emphasize the importance of assessing content and construct valid-

ity. In other words, do indicators measure what they are supposed to measure?

Are appropriate data available and are the “elements stable, objective, and con-

sistent” (Conley and McLaughlin, p. 11)? Is internal validity compromised when institutions “slant or spin numbers” (Conley and McLaughlin, p. 12)? Is there

external validity?

What are the intended and unintended consequences? If legislatures fund in-

stitutions for graduation and persistence rates, student satisfaction, or time to

graduation, will the integrity of the system be compromised? These indicators could provide incentives for institutions to award degrees to students who have

not adequately learned the material. Will students be rushed through programs

without learning all the basic knowledge that is needed for the profession? Grad-

uates might become less prepared for their future careers, creating dissatisfaction

among employers. What impact will these incentives have on providing educa-

tion to adult learners who study part-time? Will institutions become more selec-

tive to increase their graduation and persistence rates? An overemphasis on

satisfying students might create incentives for grade inflation and lower stan-

dards. How are the numbers calculated by institutions, and are they accurate?

Additional research is needed to address these and many other concerns.

Legislators may be trying to achieve more control than is possible. Human

relations or behavior complicates data gathering efforts, and the use of rough

overall indicators often does not provide information that is appropriate for funding purposes. The motivation problem will complicate the performance

funding effort. Administrators and faculty at colleges and universities will not

necessarily respond to performance funding in expected ways. Individuals may

be unmotivated or uncertain on how to improve upon dubious indicators that

are largely beyond their control. Student motivation will always be a concern

when implementing state mandated assessment tests. All of these effects will

substantially limit any measurable improvement tied to performance funding.

REFERENCES

American College Testing Program. (1998a). CAAP User Norms. Iowa City: ACT.

American College Testing Program. (1998b). CAAP Institutional Summary Report: Self-Reported Performance Effort Breakdown by Test Chance Scores. Iowa City: ACT.

Amiran, M., Schilling, K. M., and Schilling, K. L. (1993). Assessing outcomes of general education. In T. W. Banta (ed.), Making a Difference: Outcomes of a Decade of Assessment in Higher Education, pp. 71–86. San Francisco: Jossey-Bass.

Ashworth, K. H. (1994). Performance-based funding in higher education. Change 26(6): 8–15.

Astin, A. (1990). Can state-mandated assessment work? Educational Record 71(4): 34–42.

Banta, T. W., and Fisher, H. S. (1984). In J. Folger (ed.), New Directions for Higher Education: No. 48. Financial Incentives for Academic Quality. San Francisco: Jossey-Bass.

Banta, T. W., and Pike, G. R. (1989). Methods for comparing outcomes assessment instruments. Research in Higher Education 30(5): 455–469.

Banta, T. W., Rudolph, L. B., Van Dyke, J. V., and Fisher, H. S. (1996). Performance funding comes of age in Tennessee. Journal of Higher Education 67(1): 23–45.

Bogue, E. G., and Brown, W. (1982, November–December). Performance incentives for state colleges. Harvard Business Review, pp. 123–128.

Bracey, G. W. (1996). Altering motivation in testing. Phi Delta Kappan 78(3): 251–252.

Burke, J. C. (1998a). Performance funding: Present status and future prospects. In J. C. Burke and A. M. Serban (eds.), New Directions for Institutional Research: No. 97. Performance Funding for Public Higher Education: Fad or Trend? San Francisco: Jossey-Bass.

Burke, J. C. (1998b). Performance funding indicators: Concerns, values, and models for state colleges and universities. In J. C. Burke and A. M. Serban (eds.), New Directions for Institutional Research: No. 97. Performance Funding for Public Higher Education: Fad or Trend? San Francisco: Jossey-Bass.

Burke, J. C., and Serban, A. M. (1998). Editor’s notes. In J. C. Burke and A. M. Serban (eds.), New Directions for Institutional Research: No. 97. Performance Funding for Public Higher Education: Fad or Trend? San Francisco: Jossey-Bass.

Conley, V. M., and McLaughlin, G. W. (1999). Performance measures: How accurate are they? Paper presented at the Association for Institutional Research Annual Forum, Seattle, WA.

Ecclestone, K. (1999). Empowering or ensnaring?: The implications of outcome-based assessment in higher education. Higher Education Quarterly 53(1): 29–47.

Enarson, H. (1983). Quality indefinable but not unattainable. Educational Record 64(1): 7–9.

Ewell, P. T. (1994). A matter of integrity, accountability and the future of self-regulation. Change 26(6): 25–29.

Ewell, P. T., and Jones, D. (1994). Pointing the way: Indicators as policy tools in higher education. In S. Rupert (ed.), Charting Higher Education Accountability: A Sourcebook on State-Level Performance Indicators. Denver, CO: Education Commission of the States.

Eyler, J. (1984). The politics of quality in higher education. In J. Folger (ed.), New Directions for Higher Education: No. 48. Financial Incentives for Academic Quality. San Francisco: Jossey-Bass.

Folger, J. (1984). Editor’s notes. In J. Folger (ed.), New Directions for Higher Education: No. 48. Financial Incentives for Academic Quality. San Francisco: Jossey-Bass.

Graham, S. W., and Cockriel, I. (1990). College outcome assessment factors: An empirical approach. College Student Journal 23(3): 280–287.

Jemmott, N. D., and Morante, E. A. (1993). The college outcomes evaluation program. In T. W. Banta (ed.), Making a Difference: Outcomes of a Decade of Assessment in Higher Education, pp. 306–321. San Francisco: Jossey-Bass.

Kuh, G. D., and Wallman, G. H. (1986). Outcomes oriented marketing. In D. Hossler (ed.), New Directions for Higher Education: No. 53. Managing College Enrollments. San Francisco: Jossey-Bass.

Layzell, D. T. (1999). Linking performance to funding outcomes at the state level for public institutions of higher education: Past, present and future. Research in Higher Education 40(2): 233–246.

Layzell, D. T., and Caruthers, J. K. (1995). Performance Funding at the State Level: Trends and Prospects. Paper presented at the 1995 Association for the Study of Higher Education Annual Meeting, Orlando, FL. (ERIC Document Reproduction Service No. 391406)

Mayes, L. D. (1995). Measuring community college effectiveness: The Tennessee model. Community College Review 23(1): 13–21.

Pace, C. R. (1984). Historical perspectives on student outcomes: Assessment with implications for the future. NASPA Journal 22(2): 10–18.

Pike, G. R. (1999). The constant error of the halo in educational outcomes research. Research in Higher Education 40(1): 61–86.

Rothman, R. (1995). Measuring Up: Standards, Assessment, and School Reform. San Francisco: Jossey-Bass.

Serban, A. M. (1998). Opinions and attitudes of state and campus policymakers. In J. C. Burke and A. M. Serban (eds.), New Directions for Institutional Research: No. 97. Performance Funding for Public Higher Education: Fad or Trend? San Francisco: Jossey-Bass.

South Carolina Commission on Higher Education (1997). Performance Funding: A Report of the Commission on Higher Education to the General Assembly. Special Report No. 4. Columbia, SC: South Carolina Commission on Higher Education. (ERIC Document Reproduction Service No. 405786)

SPSS. (1997). SPSS Advanced Statistics 7.5. Chicago: SPSS.

Terwilliger, J. S., and Titus, J. C. (1995). Gender differences in attitude and attitude changes among mathematically talented youth. Gifted Child Quarterly 39(1): 29–35.

Van Dyke, J. V., Rudolph, L. B., and Bowyer, K. A. (1993). Performance funding. In T. W. Banta (ed.), Making a Difference: Outcomes of a Decade of Assessment in Higher Education, pp. 283–293. San Francisco: Jossey-Bass.

Yarbrough, D. B. (1992). Some lessons to be learned from a decade of general education outcomes assessment. Innovative Higher Education 16(3): 223–234.

Yorke, M. (1997). Can Performance Indicators Be Trusted? Paper presented at the 1997 Association for Institutional Research Annual Forum, Orlando, FL. (ERIC Document Reproduction Service No. 418660)