Running head: HIGH STAKES TESTING

UNIVERSITY OF LA VERNE

LA VERNE, CALIFORNIA

HIGH STAKES TESTING: THE INSTITUTION OF NCLB AND ITS IMPACT UPON

DISADVANTAGED POPULATIONS

A Paper Prepared for EDUC 596

In Fulfillment of

The Requirements for the Degree

Master of Education

Roberta M. Taylor

August 2008


Abstract

The purpose of this literature survey was to conduct an in-depth examination of the impact of

high stakes testing in education upon some particularly disadvantaged subgroups, such as students

of low socio-economic status and minority race, special needs and at-risk populations, and English language

learners. The institution of NCLB in 2002 has prompted nationwide escalation toward more

standardized testing in education. The principle governing NCLB is that all diverse children can

equally learn to read, write and calculate math problems at pre-determined levels, thus leaving

no child behind. However, research reveals that the negative effects of NCLB and high stakes

testing are inadvertently leaving all manner of children behind and undermining the teaching

profession as we know it today.


High Stakes Testing: The Institution of NCLB and Its Impact upon Disadvantaged Populations

Admittedly, the major education initiative of the Bush Administration is the No Child

Left Behind Act (NCLB), which was intended to raise educational achievement and close the

racial/ethnic gap. Its policies were calculated to focus schools' attention on raising test scores,

mandating better qualified teachers, and providing the public with educational choice.

Disappointingly, the complexities of the law have been unsuccessful in achieving these goals and

have instead provoked a number of negative consequences which inadvertently harm the students

it is intended to help (Darling-Hammond, 2007).

Since the advent of the No Child Left Behind Act signed into law by President George

W. Bush in 2002, there has been an escalation in education toward increasingly more

standardized testing. The compelling force behind this movement is the expectation that all

children can learn to read, write and calculate math problems at pre-established levels, regardless

of any diversity and in spite of any differences. All classifications of students, be they mentally

handicapped, English as a second language, minority race or low socio-economic status,

behaviorally disordered, learning disabled or otherwise disadvantaged, are required to be

included in state-mandated tests. Furthermore, NCLB requires that all students score

proficient by the year 2014. In addition to this position, it is asserted that teachers and schools

are ultimately accountable for students' performance (Campbell, 2008).

More than six years after its inception, the controversy of high stakes testing perpetuated

by the federal mandates of NCLB continues. The heightened obsession with raising test scores to

meet proficiency standards has, in actuality, had a negative effect within some variable

subgroups. Even beyond this, the paradigm of education has shifted from the benefit of the

student to satisfying political objectives. This has fashioned some unanticipated changes in the


attitudes and approaches of those involved in the teaching profession. The purpose of this

literature review is to disclose the results of the investigation into the role of high stakes testing

and to uncover the injustices encountered by some particularly disadvantaged groups.

Discrimination exists in the biases experienced by students of low socio-economic status and minority race, in the

discrepancies that besiege special education or exceptional students and at-risk populations, and

in the disparities for English language learners. Funding problems also hinder the educational

progress of these subgroups. Although the focus of this investigation is the impact of high stakes

testing upon students, it cannot go unnoticed that the impact extends far into the field of

education, and that the thrust of NCLB accountability has impacted professionals in education

from teachers all the way up to the state level.

The History Behind NCLB

State testing programs began in the wake of a 1983 report called A Nation at Risk

released during the Reagan Administration. At the request of Capitol Hill, Daniel Koretz at the

Congressional Budget Office tracked student performance from the immediate postwar period

forward. The news was good, but Koretz emphasized that state assessments would have to

provide annual tests of comparable students each year in order to yield valid and reliable data on

student achievement over time. The rise of systematic reform in the states was crafted to generate

clearer learning standards and stronger student testing results that would remedy institutional

limitations. This restructured system was to focus on achievement outcomes measured by state

education departments. By and large, states adopted the National Assessment of Educational

Progress (NAEP) style of reporting achievement levels that communicated student levels of

mastery as criterion referenced, rather than percentile rankings, known as norm referenced, that

are comparisons between states. In the late 1990s analysts began to compare the number of


students considered proficient against state versus NAEP definitions. However, the NAEP exams

are not necessarily aligned to any state's curriculum, and caution is advised in making

comparisons between the two. Furthermore, the definition of proficiency and cutpoints varies

among and between states. The legislative creators of the current act of NCLB assumed that

proficiency defined a parallel level of student performance across states and tests (Fuller, Wright,

Gesicki, & Kang, 2007).

The federal policy of NCLB that governs our nation's schools today evolved from a

model of an accountability system first established in Texas, President George W. Bush's home

state, during the years just prior to his presidency in 2000. Texas was one of at least 27 states to

use the results of standardized tests to make such high stakes decisions as holding back students

a grade, withholding their diplomas, and penalizing teachers, principals and schools that

performed poorly. Scholars, educators, and policymakers agree that tests are useful for tracking

students' progress and identifying weaknesses in teaching. Yet, educators even then began to

describe testing's dark side. They alleged that standardized tests are too limited, imprecise, and

too easily misunderstood to use as the basis for crucial decisions concerning students. They also

maintained that the consequences of high stakes testing interfere with good teaching and

discriminate against disadvantaged and minority students who need the most help (Miller,

2001). Since its institution, NCLB has created intense pressure nationwide to show increases in

test scores within a very limited time period. This given time period has proved to be far less

time than realistically needed for improving instruction, updating curricula, or enhancing the

skills of teachers to perform miracles within all subgroups and grade levels across the board.

In Texas, it was eventually discovered that although test scores rose in the early grades,

middle and high school grades proved less successful. In time, performance at the high school


level exposed the weakness in the system. At the end of the day, Texas finally responded by

producing a loophole that permitted principals to exclude from their school-level scores those

students considered to be liabilities (Bracey, 2008). The outcome of those students not testing up

to standard was to find ways to exclude them. This is a practice that appears to be gaining steam

nationally. Ironically, the idea of leaving no child behind and exclusion appear to be

functioning synonymously today. This is a contradiction of the ideals that govern the original

policy of NCLB. Still, both the state and the national system claim that they are improving

education and that they leave no children behind. They continue down the same beaten path

while research shows that there is imminent need of paving a new road.

The Role of Politics in Determining Standards

Standards have grown out of a number of key issues in education. These include a

century-old deliberation involving tracking, a 50-year-old breakthrough concerning the impact of

teacher expectations, a 40-year struggle over educational equity, and the enduring desire for

highly skilled workers to drive the nation's economy. Not surprisingly, they converge to create

support from politicians, educators, and business leaders alike. They serve two primary

purposes. The first is to address an economic concern that America is losing its competitiveness

in the global market. Fear that America is falling behind, prompted through international studies

of achievement, has propagated the need to press students to learn more and learn faster. The

argument is that our students can't compete because of poor preparation, and that our economy

will suffer. The second purpose for standards is to address the disparity between low achieving

and high achieving students. There is a growing gap in income made worse by a growing gap in

education (Gratz, 2000).


In their most common function, standards are fundamentally the descriptors of what

should be taught, otherwise known as the content. They are intended to provide waypoints for

helping teachers and parents gain understanding of where children are cognitively. Achievement

standards are complex and greatly subjective. They involve the two components of performance

and proficiency. Performance standards are reflective of how students show their determination,

while proficiency standards unearth their level of achievement. These are the mixtures that

compel states toward the accountability and assessment that they design to present what students

know to the public. In this scenario, comparison then shifts focus from the needs of learners to

the needs of the school, which can be detrimental to students (Schultz, 2008).

Unfortunately, many education reforms are influenced by political ideology rather than

what really works in schools. The statements of the problems are usually compelling, and more

complete and accurate than the proposed solutions. They characteristically overpromise and

underdeliver. Even the most promising initiatives predictably fail when poorly implemented on a

broad scale. Some are adopted too quickly, some too rigidly. Some are adopted in name, but not

genuinely implemented. Little attention is paid to differences between schools, and little regard

is given toward unintended consequences. It should come as no surprise that inadequate

implementation and unanticipated consequences are fueling a growing rebellion against high

standards and tough tests (Gratz, 2000). The unsettling reality in today's political climate is that

many political leaders believe that the best way to transform schools is through an end of a gun

barrel approach, rather than by building consensus. Clearly, accountability to NCLB and its

corollary regulations at state levels support this approach (Casborro, 2005).

Still, proponents believe that raising standards for all students, teachers, and schools will

improve the education of poor and minority populations, particularly in urban schools where


student scores generally fall way below current standards. These students are placed at a great

disadvantage when so little is expected of them. They should be motivated to competition

through rigorous and world class standards. Yet, there is significant agreement that standards

should address the essentials, be grounded in core academic disciplines, and should cover

content students should know and be able to perform. American schools and students need an

incentive to strive for higher levels of performance, and standards and accountability have

emerged as the means to contend with these problems. Even so, standards are ultimately about

the ends and not the means. They should not be a prescription for teaching methods, a device for

classroom strategies, or a substitute for lesson plans. Furthermore, their objectives should be

crystal clear to students, parents, and teachers and they should enjoy broad support among

teachers and the general public. The component of ownership is crucial when standards imply

critical assessment and accountability (Gratz, 2000).

The Significance of High Stakes Tests

High stakes tests are tests that have the potential to significantly change the lives of

children and adults through direct consequences for individuals, groups, or organizations, in

which the stakes are high. They include such tests as state-wide assessments used to measure

adequate yearly progress (AYP) required by NCLB, college entrance exams, certification and

licensure exams like those used in education, business, medicine, and legal professions, and

some aptitude exams. Thus, it is the use of a test that determines whether the

assessment is high stakes, and not the design of the test. The hope is that increased pressure

upon states and schools to help all groups perform better on the tests will diminish the gaps in

performance between diverse population groups (Polnick & Reed, 2006; Nichols & Berlinger,

2008).


Requiring states to assess and report student outcomes by diverse groups places emphasis

on all students to achieve standards regardless of their socio-economic status, race, ethnicity, or

gender. Incentives for schools to perform stimulate support from faculty. Motivators for

individual students come in the form of praise and recognition. Promoting mastery of national

standards through high stakes assessments may focus curriculum toward meaningful content and

research-based instructional practices. However, there are some concerns as to the consequences

of high stakes tests for schools and students. Assessments associated with accountability systems

can result in rewards and/or sanctions for students and schools, thereby having the potential to

also deliver a negative impact. Sanctions for schools whose students fail to pass these tests in

adequate numbers can result in a change in faculty, a change in leadership or a total

reorganization of the school. This can lead to an increased gap between low and high performing

schools, reduced opportunities for minorities and students with disabilities, and the threat of

decreased support for national curriculum efforts. The negative effect for low performing schools

and students is potentially greater than the negative effect on high performing schools and

students. This results in an increased gap between the schools. Another critical area of concern is

high school. High stakes assessments that translate into high school exit exams that students

must pass in order to graduate can result in students being marginalized from mainstream

society. This can have significant consequences for the many disadvantaged students of low

socio-economic status, minority races, and students with disabilities (Polnick & Reed, 2006;

Nichols & Berlinger, 2008).

Some characteristics of high stakes tests given to elementary children would astound

parents and the general public. An application of the Flesch Reading Ease Scale and the Flesch-

Kincaid Grade Level Scale to the high stakes test used in California in 2004-05 rated 50% of the


passages as exceeding normal grade level reading norms by one year or more. Almost 20% were

above grade level by more than two years, while some passages exceeded grade level by three

years or more. Compounding this demanding situation is the depth and breadth of the questions.

Yet, when children cannot engage the material in the exam, their scores provide no meaningful

data about their academic progress (Meek, 2006). What is more, the policy of high stakes testing

is not scientifically based. Improvement on test scores is not a true indicator of what students

really know. The reliability coefficient of standardized tests for elementary and secondary

students is around .9, which is not especially dependable. Tests are imprecise because of design

error and the small sampling upon which broad conclusions are drawn. Education consultant W.

James Popham, retired from the University of California at Los Angeles, states that policymakers

erroneously assume that standardized achievement tests measure what a school has taught, when

in fact they do not. Traditionally, designers eliminate questions that too many test-takers may

answer correctly in order to spread students out along the familiar bell curve of statistics. This

leads them to use test items that are unsuitable for measuring the quality of instruction. He

estimates that 20 percent of mathematics questions, 40-50 percent of reading items, and 70-85

percent of language arts items on two common achievement tests for elementary students were

more suitable for measuring IQ or socioeconomic advantages. Another profound criticism of

standardized testing is that teachers learn to teach to the test. This posture encourages the

substitution of the shallow content of test preparation for more challenging curriculum. In effect,

teachers become better at preparing their classes for these tests, which also defeats the purpose of

giving them (as cited in Miller, 2001).
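
For readers unfamiliar with the two readability scales applied to the California test passages, the following is a minimal sketch of how each is conventionally computed from counts of words, sentences, and syllables. The coefficients are the standard published ones; the function names, the helper years_above_grade, and the sample counts are illustrative assumptions and are not taken from Meek (2006).

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate U.S. school grade of the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def years_above_grade(words: int, sentences: int, syllables: int,
                      tested_grade: int) -> float:
    """Hypothetical helper: how far a passage sits above the grade being tested."""
    return flesch_kincaid_grade(words, sentences, syllables) - tested_grade


# Hypothetical passage: 100 words, 5 sentences, 150 syllables, given to 4th graders.
sample = dict(words=100, sentences=5, syllables=150)
print(round(flesch_reading_ease(**sample), 1))
print(round(flesch_kincaid_grade(**sample), 1))
print(round(years_above_grade(**sample, tested_grade=4), 1))
```

Under these assumed counts the passage scores at roughly a tenth-grade level, which would place it well above a fourth-grade norm. A comparison of this kind, applied passage by passage, is the sort of analysis the percentages above summarize.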

The Impact of the California High School Exit Exam


The California High School Exit Exam (CAHSEE) is the test created to ensure

accountability for California's high school graduates. It is a measure of these graduates'

proficiency in English language arts and mathematics. Scheduled for implementation in 2006,

the potential for this test's impact upon students is great, and has been the subject of much

debate. Testing as a means of demonstrating accountability demands that the CAHSEE must be

scrutinized for fairness, particularly within the context of validity, absence of bias, access,

administration, and social consequence (Callet, 2005).

In 2003, Californians for Justice, a statewide grassroots advocacy group for poor and

minority communities, declared that the California High School Exit Exam (CAHSEE) is harmful

to minority and low-income students. Considering the quality of education that they receive from

unqualified teachers, and the run-down, hazardous, and unhealthy schools they attend, they

claimed that the CAHSEE falls short of its stated goals and these students are at a disadvantage

(Calif. Group, 2003). After an internal evaluation of the test done by the state education department, it was

found that some fundamental components of the test, such as algebra, had not been taught to

many of the students. The evaluation also determined that more than 20 percent of students,

predominantly students limited in English skills or students with disabilities, would have failed

the test and been denied their diplomas. As a result, the California state school board voted

unanimously to delay the mandatory exit exam from 2004 to 2006. State officials anticipated that

the two years would be sufficient time for younger students to learn the skills required for the

exam. The exam is initially administered in 10th grade, and students who do not pass on the first

try are given a number of opportunities to pass before graduation (Sack, 2003).

As early as 2001, the Oakland, California based Disability Rights association moved to

action in a San Francisco District Court. In a class action lawsuit known as Chapman v.


California Department of Education, the Superintendent of Public Instruction, and the State

Board of Education, plaintiffs declared the high school exit exam to be invalid and discriminatory

against students with disabilities. The case was put on hold until 2004, when Alameda County

Superior Court Judge Ronald M. Sabraw commissioned a regional education laboratory known

as WestEd to carry out a study investigating the impact of the exam on students with disabilities.

The study, completed in 2005, recommended that the requirement of the exit exam for

graduation be delayed for at least two years. Results of the state data of these students revealed

that 54 percent of them had passed the English section of the test, while 51 percent had passed

the math section. The total percentage of students that had passed each section statewide was

about 88 percent (Jacobson, 2005).

By early 2005, new legislation known as SB 517 appeared close to a final agreement. The

terms of SB 517 would permit students with disabilities who had not passed the exit exam to

graduate with their senior class under certain conditions. These students could graduate provided

they had satisfied all their other graduation requirements and met the terms of the original

Chapman settlement. This included taking the exit exam at least once during their senior year,

and taking advantage of remediation. Proposals involving delays or changes in the graduation

requirement, including legislation to permit alternatives, such as portfolios or locally developed

assessments, would require an amendment to state law. These were vetoed by California

Governor Schwarzenegger prior to 2005 at the urging of Superintendent of Public Instruction

Jack O'Connell. While the California education department had reviewed a number of

alternatives to measure students' demonstration of mastery of the standards, Mr. O'Connell

concluded that none of the alternative methods was an appropriate substitute. He asserted that

students must have the mathematics and English skills measured by the exam in order to thrive in


an increasingly competitive global economy. This position drew immediate fire from civil rights

advocates and other interest groups. Still, he maintained that there were plenty of options that

remained for students to continue their education if they had failed the test and said that he

would propose legislation to help expand their options. Among the list of proposals to be offered

was the provision of more money, lifting enrollment caps for remedial education, summer

school, adult education, and independent-study programs that could help students pass the exam.

He also called for students to be eligible for Cal Grants to attend community college if they had

not passed the test, but had met all other graduation requirements and grade-point-average

requirements.

In early 2006, the Human Resources Research Organization (HumRRO) estimated that

only 35 percent of special education students and 51 percent of English language learners in the

class of 2006 had passed both portions of the exam (Olson, 2006). A new law reflecting the

terms of SB 517 was signed by Governor Schwarzenegger in early 2006, which signaled a

settlement in the case of Chapman v. California Department of Education. Under the measure,

students with an individualized education plan would be able to graduate without passing the exit

exam, as long as they had met all other state requirements for a diploma and received available

remediation for the test (Jacobson, 2006).

Also in 2006, after about 41,000 seniors in the class of 2006 failed to receive diplomas

because they did not pass the exit exam, a group of high school seniors sued the state. In

Valenzuela v. O'Connell, the Superior Court of the State of California ruled that schools in

California could no longer require an exit exam for high school graduation. The judge in that

case concluded that the exam discriminates against students who attend schools in low socio-

economic areas and students who are learning English (Kravetz & Sawchuck, 2006). The


California Department of Education immediately retaliated by asking the state Supreme Court to

overturn the decision. The state's exam was reinstated as a graduation requirement for the class

of 2006, but the Supreme Court sent the case to an appeals court to make the ultimate decision of

whether or not the exam would continue to be required of the nearly 47,000 seniors who had yet

to pass it in 2006 (Sawchuck & Kravetz, 2006). Rejecting the claim that the test discriminates

against English language learners and poor students, the California Appeals Court upheld the

CAHSEE (Kravetz, 2006a). Finally, in October 2006, the governor signed new

legislation bringing to an end the lawsuit against the state, Valenzuela v. O'Connell. The

measure allows students to receive up to two years of extra help beyond the 12th grade.

California districts are now eligible to receive a share of more than $70 million for supplemental

instruction and counseling services that specifically target students who have reached the end of

their senior year, but have not passed the state's high school exit exam (Jacobson, 2007).

There are currently 23 states that have exit exam mandates, and more are moving in

that direction. The struggles in California are mirrored by other states as decisions are made to

uphold exit exam mandates or to delay the requirements. The state of Maryland is still deciding

whether to implement or push back the requirement for their class of 2009. Washington State

requires students to pass the reading and writing portions of their state test, but the 10th grade

math section of the Washington Assessment of Student Learning exam is delayed until 2013.

The performance of economically disadvantaged students, minority students, and students with

disabilities is what is at issue in most states. In California, of the students of the graduating

class of 2007 who have not yet passed the state exam, about 18,000 are English language

learners and 24,000 are of low socio-economic status (Jacobson, 2007).


The federal government does provide states with grants for adult education and the

General Equivalency Diploma (GED) test. However, Don Soifer, the executive vice president of

the Lexington Institute, a think tank in Arlington, Virginia, remarked that the ability to offer and

take the GED in Spanish provides an incentive for schools to put the limited-English-proficient

kid on a GED track rather than a regular high school education track, which undermines the

goals of inclusiveness (as cited in Zehr, 2006). The California Department of Education finally

released a list of dropout rates for the 2006-07 school year that puts the state's average at 24.2

percent for California public high schools. Data reveals the dropout rates at 41.3 percent of black

students, 31.3 percent of Native Americans, 30.3 percent of Hispanics, 27.9 percent of Pacific

Islanders, 15.2 percent of whites, and 10.2 percent of Asians. Critics have charged that in the

past, state education officials have used unaudited data from school districts that reported a

graduation rate of 85 percent in order to comply with NCLB requirements. The new numbers

paint a more realistic picture. Yet, these figures pale in comparison to the 54.5 percent dropout

rate of the Victor Valley Union High School District, which bears the worst rates in the county of

San Bernardino. Duneen DeBruhl, VVUHSD assistant superintendent of educational services,

remarked that district officials believe students are frustrated in meeting class requirements,

among other things. They are inclined to drop out if they feel they don't have enough credits to

graduate with the required state exam (as cited in VVUHSD Worst Dropout, 2008). VVUHSD

reported 1,552 dropouts of the county's total of 7,082. Statewide, the number of dropouts was

reported to be 87,456 (List of Dropout, 2008).
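
To put the reported dropout counts in proportion, the short sketch below computes the district's share of the county total and the county's share of the statewide total. It assumes that the three counts are directly comparable as reported (1,552 for VVUHSD, 7,082 for San Bernardino County, and 87,456 statewide); the function name share is an illustrative choice.

```python
def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Counts as reported in the text; comparability across sources is assumed.
vvuhsd_dropouts = 1552
county_dropouts = 7082
state_dropouts = 87456

print(round(share(vvuhsd_dropouts, county_dropouts), 1))  # district share of county
print(round(share(county_dropouts, state_dropouts), 1))   # county share of state
```

On these figures, the single district accounts for roughly one in five of the county's reported dropouts.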

Disadvantages of Low Socio-economic Status and Race

Studies by Onwuegbuzie and Daley (2001), Roth et al., and Holman (1995) reveal that

all ethnic minority populations score significantly lower on traditional standardized tests than do


whites, with the exception of Asian Americans. Scores from the Standard Assessment of

Intelligence (SAI) show that on the average, Asian Americans score three points higher than

white Americans, while white Americans achieve 15 points higher than African Americans, and

11 to 22 points more than Hispanic students. The College Board reported similar differences in

1998 Scholastic Aptitude Test (SAT) results. White Americans scored an average combined

score of 1054, Hispanic students averaged 927, Mexican American students averaged 913, and

African American students averaged a combined 900 (as cited in Altshuler & Schmautz, 2006).

In 2000, a study by Wolfensberger stated that people's welfare is significantly contingent

upon the social roles they occupy. Social roles are a combination of functions, behaviors,

relationships, duties, privileges, and responsibilities that are socially defined, commonly

understood, and largely recognized within a society. Generally, those who occupy roles that

others value as positive are treated well. People who fill devalued roles are typically treated

poorly. In this model, being poor or of low socio-economic status is a devalued role, and

devalued people are apt to face a range of detrimental experiences (as cited in Lustig & Strauser,

2007). The poor are consigned to low social status and are systematically rejected by society.

They are more likely to be subjected to loss of control over their lives as others eventually end up

making some of life's decisions for them. These individuals are usually associated with the

negative images of being dependent, lazy and often welfare recipients. The poor often live in

neighborhoods of high crime, poor schools, and limited social networks. Children of low-income

neighborhoods are particularly vulnerable to the impact of poverty, crime, violence, divorce, and

frequent changes in family structure. They are less likely to have qualified teachers and more

likely to sit in classrooms that are noisy compared with middle-income students. They are also

more likely to report more physical assaults and incidences of weapons in schools. A 1997 study


by Gephart has shown that there is a strong relationship between poverty, low academic

achievement, and rates of school dropout (as cited in Lustig & Strauser, 2007). The

socioeconomic level of one's peers also has a significant influence as peer groups and role

models are the strongest reference for conformity. It is believed that poverty is characterized by

inconsistent and unreliable experiences. This lack of coherence and instability translates into

speculation that children of poverty or low socio-economic status are less likely to persevere in

pursuing educational and vocational objectives (Lustig & Strauser, 2007).

Researchers have long recognized that standardized tests have built-in biases related to

socio-economic status and race. Unfortunately, NCLB disregards the realities of racial

discrepancies when the expectation is the same for all classifications of students. Nonetheless,

high stakes testing provides administration with vital educational standards that help to track,

sort and label students. Moreover, it is when the stakes are high that people tend to seek

assistance from professional resources. The more affluent schools, districts and families are at an

advantage to be able to make use of these types of benefits. Predictably, the lower-performing

schools and districts and the economically disadvantaged students are less able to afford this type

of support (Smyth, 2008). As a result, Black and Hispanic dropout and retention rates are

typically higher than those of Whites and Asians. Poor attendance, yet another contributing factor to

low performance in testing, is characteristically a greater problem among those students of low

socio-economic status as well (Bracey, 2008).

In spite of the guarantee brought by Brown v. Board of Education, segregation still exists

in today's schools. Not only are African Americans subjected to persistent educational

inequality, they are overrepresented in special education, suspended and expelled at greater rates,

and have higher dropout rates (Green, McIntosh, Cook-Morales, & Robinson-Zanartu, 2005).


Psychological testing done on black and white army draftees and on the large masses of

immigrants that were arriving in the U. S. in the pre-World War I period generally showed that

black draftees and refugees scored lower than the white middle class. No one questioned the

appropriateness of these tests for people whose backgrounds, lifestyles, and language were

different from the mainstream population. The cultural bias inherent in such tests served the

useful purpose of labeling then, as it does now. Many still oppose the use of standardized tests

with ethnic and minority groups, such as blacks. They believe that the cultural frame of the

individual's environment is an influence from which one cannot be separated. While procedures

of test development are somewhat less biased today, there still remains the problem of the

appropriate use of tests and test results, as well as the development of test content that should be

representative of a pluralistic society (Spencer, 1975).

Several investigations have found that negative attitudes toward testing

may be affecting the performances of African Americans on tests. African Americans are

commonly contemptuous of the testing endeavor because of the social stigma attached to being

viewed as intellectually inferior. Arvey, Strickland, Drauden and Martin developed and validated

the Test Attitude Survey (TAS). This was an attitude instrument designed to measure the

opinions of employees toward the tests they had completed. An analysis of their responses

revealed nine factors of influence. These included motivation, belief in tests, level of anxiety,

lack of concentration, test ease, external acknowledgment, achievement requisites, future

consequences, and preparation. A second study by Arvey and colleagues specifically examined

the differences in motivation between black and white test takers, and whether differences in

motivation might explain the racial gap in test performance. They found that race was

significantly related to the TAS motivation factor. Results revealed that white applicants had


more test-taking motivation than blacks. Interestingly, it was found that where the TAS factors

were held constant between both races, test score differences were significantly reduced between

black and white applicants. This indicated that the lack of test-taking motivation among African-

American applicants undermines their performance on employment tests (as cited in McKay &

Doverspike, 2001).

Steele and Aronson argued that negative stereotyping of African Americans also causes

anxiety that compromises their performance. Using a sample of 117 Stanford undergraduates,

they learned that when described as a test of mental ability, whites significantly outperformed

African American participants on cognitive ability tests. When the same test was structured as a

measure of problem-solving ability, with no indication of intellectual ability, not only did the test

performance of African Americans increase dramatically, it varied only modestly from that of their white

counterparts. Apparently, the social stigma concerning the inferior intellect of blacks has a

negative effect on their test performance. Chan, Schmitt, Deshon, Clause, and Delbridge did

similar test studies, which found that differences discovered in test performance are affected to

some extent by motivational differences in test-taking between races (as cited in McKay &

Doverspike, 2001). This information suggests that we might be able to increase African

American performance on traditional tests by modifying these particular attitudes. Various

training programs exist that emphasize the adoption of positive attitudes toward the testing

process, and the importance of hard work and accomplishment. Another feature that could be

included in pretest programs is the use of techniques to help deal with anxiety. Still another

solution would be an alternative format to the paper-and-pencil test that is valid and criterion

referenced. The quest for video-based tests has resulted in challenges to validity, and computer-


based tests present practical and logistical problems for public use of large-scale testing (McKay

& Doverspike, 2001).

A large body of literature has illustrated the imbalance in opportunities to learn because

of the inequalities between schools and communities. The fact remains that students of low

socio-economic and minority backgrounds are also far more likely to attend schools with

uncertified teachers, large class sizes, more violence and substandard educational resources.

Raising and enforcing educational opportunities through high stakes standardized tests without

first addressing these and other important issues is not only inherently biased, but intellectually

and morally dishonest. For instance, the problems encountered by the Native Alaskan student are

acute. The nature of their problems is associated with the sociocultural and geographic isolation

of their communities, as described by Estrin and Nelson-Barber in 1995, and Sowell in 1994.

The imposition of testing methods that are often inconsistent with native cultural practices only

compounds the problems. An example presented by Estrin and Nelson-Barber is that elders of

Native American societies often assume the responsibility for educating their young and

establishing the standards for learning (Sloan, 2007).

Currently, Hispanics are the largest ethnic minority group in the United States, and there

are estimates that Hispanics will outnumber white people by 2046. While they represent more

than 13 percent of the nation's total population, they comprise more than 25 percent of those

below the age of 18, many as students in our schools (Altshuler & Schmautz, 2006). Valenzuela

did a study in 1999 that explains why high stakes testing exacerbates the negative consequences

of the U. S. educational system for many of our Mexican and Mexican-American students who

are Hispanic. In it, she effectively expresses how academic achievement is a social process that

materializes through the experiences of these students' lives as they navigate the many social,


cultural, historical, and linguistic relationships that fashion their lives, both in and out of school.

She became increasingly convinced that the organization of public schools systematically

removes, or subtracts from, Mexican cultural resources in the context of the classroom. She

illustrates how the school systems are structured around cultural assimilation and resocialization,

commonly referred to as Americanization. This tool, in the form of a standardized graduation

exam, presents a vast hurdle for many Mexican-origin students who can demonstrate academic

competence in the classroom, but cannot achieve passing scores on the state tests. She argued

that the state's high stakes accountability system is a mechanism to vigorously enforce

monolingual policies, which has serious repercussions for the Mexican minority. High stakes

accountability led to a shift away from dual-language and late-exit bilingual education programs

structured to help these populations, to early-exit and English immersion programs, as described

by Pennington, 2004, and Sloan, 2004 (as cited in Sloan, 2007).

In the fall of 1999, the state of Texas was sued by a Mexican-American civil rights group.

They claimed that the Texas Assessment of Academic Skills (TAAS) was exceedingly harmful

to the educational prospects of minority students. In this case it was Latinos. Falling short of the

state-established passing score by only one point could mean attending summer school, or even

flunking. Scholars testified on behalf of the plaintiffs. They asserted not only that classroom

instruction was being harmed by high stakes testing, but also that it encouraged minority students

to drop out. Walter M. Haney, a professor of education at Boston College, declared that 10th

grade success on the TAAS is an illusion. He estimated that minority students were three times

as likely as whites to repeat ninth grade before the TAAS was employed, and that their dropout

rate was higher in that grade than any other. What he found was that a large part of the evident

increase in test scores was fundamentally due to the rising exclusion of students, particularly


black and Hispanic students. He explained that there are three ways to exclude poor test takers.

You can flunk them in ninth grade, classify them as learning disabled, or persuade them to

pursue a general-equivalency diploma (GED) instead of a high school diploma. Unfortunately,

the TAAS makes no accommodation for limited proficiency in English, and their performance is

also hampered by other inequities, such as school funds and teacher training (as cited in Miller,

2001). Consider that learning enough English to earn a diploma is all but impossible for most

minority students from other countries who come to the United States in their late teens. Many of

these students from Mexico and elsewhere in Latin America are discovering the option of the

GED test, instead. The GED can be earned by taking the GED test in Spanish or French, as well

as English, and is the equivalent of a high school diploma. It is a test taken mostly by adults who

either dropped out of school or never were able to meet graduation requirements (Zehr, 2006a).

Removing Bias within Low Socio-economic Status and Race

These glaring observations have not gone without attempts to mitigate the problems of

bias in high stakes assessments. There have been attempts to modify traditional standardized

instruments, such as the well-known Wechsler Intelligence Scale for Children (WISC). Yet, as

Hood noted in 1998, each modification has resulted in the discovery of more biases, resulting in

perpetual changes. What's more, modifications have been attempted in order to connect

culturally responsive instructional strategies to testing results. According to Newman &

Associates, 1996, these are found to be difficult to accomplish within the constraints of public

education. Another approach that has been considered is the development of nondiscriminatory

assessment tools. However, the practical capacity for improving discriminatory results through

changes in test design remains uncertain. Hood, 1998, Linn, 1993, and Shavelson, Baxter, &

Pine, 1992, all have documented that performance based assessments (PBAs) are plagued by


lack of empirical support and low reliability for their nondiscriminatory claims. Furthermore, this

approach is rendered ineffective when colleges and universities do not utilize these assessments.

One other approach focuses on language proficiency. While this is an empirically validated

source of test effect and remediation, there are subtle and underexposed environmental norms

and contexts in the use of language. It perpetuates a discriminatory bias against students who

lack proficiency in white, middle-class English, which is the socioeconomic norm of the majority

of U. S. society. In the long run, this type of discrimination can also have damaging

consequences for Hispanics and minority students of other ethnic heritage.

Toward this end, underachievement on high stakes tests can then result in lowered

academic self-concept. Academic self-concept is a reflection of both the descriptive and

evaluative aspects of the self. It involves an individual's description of their own strengths and

weaknesses, as well as an appraisal of their academic competencies. Most will agree that positive

self-concept cultivates achievement, and successful achievement reinforces self-concept.

Conversely, it is also recognized that negative academic self-concept tends to limit academic

achievement. By and large, the ability of Hispanic students to adopt dominant behavioral norms,

which may be contradictory to their Hispanic cultural norm, appears to be a major determining

factor of their academic achievement. Conflicts in values and procedures, which are inherent in

high stakes test design, serve to systematically discriminate against Hispanic students

(Altshuler & Schmautz, 2006). To date, solutions, such as multicultural education, remain

superficial and limited by shortsightedness. Sleeter observed in 1996 that multicultural practices,

such as the study of food, music, and dance, are superficial. The attempt should be to reposition

perspectives that can foster understanding and equitable academic success (as cited in

Altshuler & Schmautz, 2006).


Disadvantages of Students with Special Needs and At-risk Populations

States are requiring students in special education, like their peers in general education, to

take state-mandated tests. Students with disabilities often require special accommodation, such

as extra time, a word processor, Braille, or a scribe. Schools are responsible for determining

appropriate accommodations for students with disabilities, but a proctor administering a test may

have no idea that there is a student who requires any accommodations (Samuels, 2006). Students

with disabilities are eligible to receive accommodations as defined in their IEP. Possessing a

learning disability in the area being assessed, such as reading, written language or math, is not

grounds to exempt a student from meeting district or state standards (Swain, 2006).

As state assessment systems continued to evolve to satisfy federal requirements, an

examination of states' participation and accommodations policies revealed that policies for both

participation and accommodations were becoming more explicit compared to those in place at

the initiation of NCLB. Participation options customarily included the standard three choices.

These were participation without accommodation, participation with accommodations, and

alternate assessments. Over time, states continued to increase the number of these

accommodations. In some cases, they even began allowing accommodations for

students who were not receiving special education services. The most controversial of these

accommodations being widely offered were read aloud, calculator, and scribe (Thurlow, Lazarus,

Thompson, & Morse, 2005).

Consequently, the implementation of NCLB has affected both general and special

education by neglecting to justify the purpose of an Individualized Education Program (IEP),

neglecting to adequately define a highly qualified teacher, as well as introducing unforeseen

pressures upon both general and special educators. In effect, NCLB has been widening the gap in


student achievement rather than leveling the playing field. Schools with special needs

populations have found it increasingly difficult to stay in compliance with federal legislation.

Students with disabilities are now being held accountable for general curriculum that they were

not even exposed to previous to NCLB. Furthermore, the individualized aspect of the IEP is

being lost as IEP teams attempt to align with the general curriculum now mandated by the state

standards. IEPs are being overshadowed by the new standards based reform and focus is shifting

from the student's overall development to the general curriculum. Accordingly, the original plan

of the Individuals with Disabilities Education Act (IDEA) to individualize instruction for the

student is being ignored. This is because the goal of state academic standards involves equal

standardized instruction performance for all students (Jara, 2006).

While NCLB also demands highly qualified teachers, it does not support this demand

with adequate funds. Highly qualified teachers are challenging to train and special education

programs are suffering because skilled training for such programs lacks suitable compensation.

Magnet and charter schools are especially finding it unfeasible to meet Adequate Yearly

Progress. Many of these schools have come to specialize in educating these exceptional students

and at-risk populations. Although they are exempt from reporting two percent of their population

from state assessments, their high numbers of exceptional students make this compensation

irrelevant. In many of these cases AYP is nearly impossible to achieve. Sooner or later these

schools will be labeled as failing, leaving at-risk and special needs students behind (Smyth,

2008).

The quality of special education is also suffering today because it is extremely

challenging for interested individuals to become special education teachers. Not only is a degree

in special education required, but individuals must receive certification in the subject they plan to


teach. As a result, the field of special education is suffering because of the growing shortage of

professionals in special education and the increasing difficulty of becoming a special educator.

This has led to an intense movement for inclusion of these students in the general classroom, yet

another challenge for general educators. What is more, general educators and special educators

alike do not feel adequately prepared with effective pedagogy for this environment (Jara, 2006).

In schools of inclusion the majority of students receive services in the general setting. However,

the structure of the school day and the demands of the inclusion setting provide precious little

time for explicit instruction for individual students. Students with disabilities need the strategies

that scaffold learning provides. The strategies of activating students' background knowledge,

discussing objectives, teacher modeling, mastery, collaborative practice, and independent

practice are critical when students are expected to transfer this skill to new contexts, such as in

high stakes testing (Swain, 2006).

In a 1999 report called Assistance to States, federal regulations defined the term multiple

disabilities as "concomitant impairments ... which cause such severe educational needs that they

cannot be accommodated in special education programs solely for one of the impairments" (as

cited in Zebehazy, Hartmann, & Durando, 2006, p. 1). Thus, it is crucial to analyze the problems

involved in the assessment of accurate information on the performance of students with

disabilities and other impairments. Visual impairments are an example. Students with

disabilities often have vision impairments that adversely affect their educational performance,

and many of these students have an additional disability. In 2005, the U.S. Department of

Education released its 25th Annual Report to Congress, which stated that

137,768 students from ages 3-21 have multiple disabilities (as cited in Zebehazy, Hartmann, &

Durando, 2006). Alternative assessments should be considered for students with the most severe


disabilities, which include students with visual impairments. These alternatives must reflect

the complex nature of a visual impairment that is combined with additional impairments

(Zebehazy, Hartmann, & Durando, 2006). It was discovered that less than ten percent of special

education students who spent the majority of their time in special education classes managed to

pass the English language portion of the California High School Exit Exam in 2006, and still

fewer passed the mathematics requirement. This developing pattern of low scores among

students with disabilities has caused investigation into the CAHSEE to recommend that schools

provide more support and guidance to IEP teams in order to help them with key placement

decisions (Kravetz, 2006b).

Statistics collected from a number of states over time show that

less than one-third of learning-disabled students can be expected to pass high school competency

exams. There are some among this group who make good-to-adequate progress, yet they are still

not at grade level. These have mild learning disabilities. Yet, many have strengths in other areas,

such as art, drama, music, or athletics, upon which success can be capitalized. There is also the

group of students, likely the lowest third, whose disabilities are in the moderate range. While

they are not so profoundly developmentally delayed that they qualify to take alternative

assessments, their disabilities are more severe. In all probability, they will never come close to

meeting the stringent standards on which NCLB exams are based. These exams are too densely

written, too lengthy, and too difficult in level of reading and comprehension to warrant their

blanket administration, even with such accommodations as extra time and extra breaks. Even

after the best instruction, coaching, and proper accommodations, mild to moderately disabled

students frequently give up and resort to marking questions at random. The expectation that all

students with disabilities can meet the same academic targets set for their nondisabled


counterparts implies that they are the same. While they are equal, they are not the same. It is an

involuntary cruelty to demand academic proficiency of the entire population of disabled students

(Meek, 2006).

Children with autism spectrum disorders are another example. They pose unique

challenges to professionals working in the field of developmental pediatrics and early childhood

intervention because of their restricted patterns of relating, communicating, and behaving. They

exhibit difficulties following adult directives because of linguistic limitations and problems with

focusing their attention. They often will not respond to unfamiliar adults or an unfamiliar

environment. They present challenges to modifying typical routines in order to participate in

assessments. They have restricted patterns of expressive communication. These and other such

problems have rendered formal attempts to evaluate these children difficult and unsuccessful by

professionals who consider them to be untestable. In order for professionals to interact and

communicate with them, a more sophisticated alternative of assessment is needed. However, this

is a serious concern because there is a rise in the number of children with autism today (Vacca,

2007).

Remedies for Students with Special Needs and At-risk Populations

Possible options have been sought to remedy the challenges of special needs and at-risk

populations. One alternative that has emerged to meet the needs of this population is Response to

Intervention (RTI). This method utilizes students' responses to the high-quality instruction to

which they are exposed for directing educational decisions involving the effectiveness of

instruction and intervention. Eligibility for special programs and design of individualized

education programs (IEPs) are determined from there. This alternative is meant to provide early

intervention for at-risk students while the student waits for evaluation of necessary services and


support (Casey, Bicard, Bicard, & Nichols, 2008). However, innovations for remediation such as

this require funding for special training, something for which NCLB has not provided. Then

again, even if significant growth does occur, it is still unlikely that the student will be able to

achieve grade-level proficiency in the year the student becomes covered by an IEP (Salend,

2008). Also, a great need for special populations and at-risk students is the adequate funding for

teachers to have special training. Unfortunately, NCLB does not provide this funding.

Furthermore, as a result of the recent turbulence in economics, states are increasingly facing the

problem of budget cuts in education. Something else that still remains essential for this group is

the accommodation of more time to show progress. Special needs populations typically are

slower learners than general education populations. Students at-risk are typically behind already.

Yet, the expectations for achievement in high stakes testing have increased while the time span

has been narrowed to proficiency by the year 2014.

Disadvantages of English Language Learners

The validity of reporting proficiency of English language learners is debatable. Students

with limited English proficiency (LEP) are inconsistently labeled from state to state. States with

higher LEP populations such as California, Texas, Florida and New Mexico all face greater

challenges in educating their LEP students and meeting AYP than all others. The linguistic

complexity of state exams demands high levels of English language ability. As a result, many

schools in these states lose state and federal funding because they cannot report AYP (Smyth,

2008). Efforts at carrying out requirements for testing English language learners are receiving

intensified scrutiny as hundreds of schools across the country fail to meet AYP as charged by

NCLB. Ten school districts in California filed a lawsuit in the state superior court of San

Francisco. The lawsuit charged that the state is not complying with the federal mandate to test


English-language learners in a valid and reliable manner. The law says that states must provide

accommodations for such students in a language and form most likely to yield accurate data on

what students know and can do. District officials claimed that their schools were being

penalized for not meeting AYP goals, and implied that they were made to look bad by tests

that they considered as invalid for these students. They argued that some English language

learners would score better if standardized tests were administered in their native languages.

Hilary McLean, California's spokeswoman for State Superintendent of Public Instruction

O'Connell, responded by saying that the test is designed to disclose whether students have

learned English language arts. She said it would also be a complex and costly process to devise

tests in all of the native languages.

In Reading, Pennsylvania, a similar suit was filed by the 17,000-student school district

requesting relief from sanctions placed on those schools. The district ultimately filed an appeal

with the Pennsylvania Supreme Court when relief was rejected. The bottom line, according to the

U. S. Department of Education, is that states must provide testing accommodations for English

language learners. However, the native language choice is optional. Most states have elected to

provide for their LEP populations written translations of the regular academic tests they give to

their students. Most of these translations have been in Spanish. In March 2005, Education Week

stated that of the 36 states that reported complete information for English-language learners on

AYP to the federal Education Department for December, only Alabama and Michigan met their

goals in reading and math for the 2003-04 school year. This is a strong indication that hundreds of

schools are not making AYP for English language learners (Zehr, 2005).

Standardized tests are also culturally biased. High stakes tests are not only a measure of a

child's intellect and language, but also their culture. Many non-English speakers are failing these


tests and being held back a grade due to the lack of understanding of the complex English

language. However, their culture and environment also impact their performance on tests. Gifford

did a study in 1989, which showed that differences in family background, school experiences,

previous test exposure and training, experiences with racial discrimination, and individual

motivation can have significant impact on test scores. Another influence upon students' test

performance is their culture's attitude towards tests and performance in schools. Some cultures

are more focused on family and personal values than on testing and academic performance.

In addition, tracking poses a problem when the typically low scores of these

minority students label them. In many instances, labeling will only allow them to take certain

levels of classes. These are classes usually taught at the lowest level where students are not being

challenged. They are basically being taught how to pass the test and nothing more. This

syndrome continues to cause them to fall behind and to increase the gap in performance of

minority students (Phillips, 2006).

Solutions for English Language Learners

The Department of Education issued what it termed final regulations for testing

English learners, effective October 13, 2006. This required that all English language learners be

included in mathematics testing at the first administration of the test after their arrival in the

United States. However, math scores are to remain exempt from being calculated into AYP until

after a student has been in U. S. schools for at least 12 months. After they arrive in U. S. schools,

English language learners remain exempt from taking the state's reading test for the first

administration of the test. After that, however, they must take the test and their scores are

required to be used for accountability purposes. Additionally, states are permitted to put the test

scores of former English language learners into the pool with other English language learners to


calculate AYP for two years after they have been reclassified as fluent in English. These students

cannot, however, be included in the limited English proficiency (LEP) subgroup for reporting

purposes in state and district report cards. Parents and the public are entitled to have a clear

picture of the academic achievement of LEP students, and states and districts are required to give

an annual report of recently arrived LEP students who do not take the state reading tests in their

school (Zehr, 2006b).

In October 2007, the Education Department released a draft of a framework for creating

English language proficiency standards and tests. To achieve this goal, the Education

Department even provided a total of $10 million to a consortium of four states to draft new tests.

By the end of 2007, all states and the District of Columbia had ushered in new English language

proficiency tests to comply with NCLB requirements for those still learning English. Whereas

the previous tests were mostly designed to assess only speaking and listening, federal law now

requires states to assess these students in reading, writing, speaking, and listening. This new

generation of tests is designed to assess their academic language, or the language needed in order

to learn subjects in school. Under the federal education law, states are also required to set annual

measurable achievement objectives for students to progress in English and attain proficiency in

the language. While experts agree that this is a positive move, more work needs to be done to

ensure that these tests are valid and meaningful. Inconsistencies concerning implementation of

these tests raise questions about the validity of the scores. These tests are not necessarily a

predictor of how well a student will do in a classroom or on other mandatory tests (Zehr, 2007c).

Also at issue is whether alternative tests being offered by states are comparable to the

regular state reading and math tests that are used to test other students for AYP under NCLB

mandates. Although the federal government has not rejected other forms of testing alternatives,


some states have been forced to stop using their alternatives because they were unable to satisfy

federal concerns involving comparability. When Indiana stopped using portfolio components of

its reading and math tests, its 22,000 English language learners scored worse than

the previous year in reading at every grade level tested from grades 3-10. Those who passed still

dropped as much as 3 percentage points for 3rd graders, to 13 points for 8th graders. North

Carolina, on the other hand, managed to submit enough evidence in its portfolio component to

satisfy the federal government. The state of Virginia was told by the federal government that it

must stop using its test of English language proficiency to calculate AYP in reading for

beginning English learners. Teddi Predaris, the director of services for English language learners

in the Fairfax County district, stated that about 10,000 students have been taking this test

statewide as a substitute for the regular reading test. A request to U.S. Secretary of Education

Margaret Spellings solicited a one-year delay in changing that state's policy. The Virginia

education officials were denied a similar request in a December 11 meeting. Four other states,

Minnesota, Nebraska, New York, and Texas, were also using English language proficiency tests

as substitutes for regular reading tests. While the others have dropped that practice, Nebraska has

yet to decide what to do. Overall, states have not been able to demonstrate comparability for such

tests and the federal government has not made it an option for states to use an English-language-

proficiency test instead of a regular reading test for purposes of NCLB accountability. The most

common practice for states to include English learners in large-scale testing is by giving them the

regular state test with the use of accommodations. These accommodations normally comprise the

use of a dictionary or extra time. About eight states, including New York and Texas, are also

providing some tests in languages other than English for certain grades or subjects (Zehr, 2007a).


Peter Zamora, lawyer and former high school teacher, advocates for the needs of English

language learners as co-chairman of a diverse coalition of advocacy groups. These groups are

now advancing the concerns of English learners as Congress considers reauthorization of NCLB.

As spokesman for his group, he promotes better assessments for English learners through more

technical assistance in testing issues for the states, and endorses the importance of fully including

English learners in NCLB's accountability system. While some question the validity of these

tests for English language learners, he maintains that too much flexibility will mean that districts

will fail to be serious about meeting the needs of those students. He attributes his advocacy

policies to his own experiences of teaching high school in Hayward, California, in the 1990s. His

college-preparatory classes were mostly made up of Anglos, while his basic-skills classes were

generally composed of Latinos and African-Americans. He attributes this to their economic

status and the decisions that were made early in their school careers. He recognized that the

basic-skills kids were getting a slowed-down curriculum that created a gulf between them and the

college-prep students at the end of their careers (Zehr, 2007b). Ironically, half the Americans

who have studied at the graduate level cannot comprehend complex material with specialized

language and make inferences from it either. Yet, these are the people who are considered to be

the success stories of education (Smith, 2008).

Impact on Education and the Teaching Profession

In January 2000, Carol Minnick Santa, president of the International Reading

Association, wrote an open letter to then U.S. Secretary of Education Richard Riley. Within this

letter, she acknowledges that the gap between children of poverty and their wealthier

counterparts is unacceptable, and that he helped to initiate the standards and assessment

movement for improving reading. In it, however, she charges that standards have become


synonymous with high stakes assessment. She states that the trends of high stakes testing will

politicize education, narrow curriculum, and reduce professionalism. Furthermore, they are

unfair to children, and especially to those of minority families. She describes the need for a more

definitive testing instrument that could actually capture the many facets of a great student and

reading, rather than multiple choice tests that assess mostly memorization of factual knowledge

and discrete skills. She alleges that research does not prove these types of tests to be valid

indicators of success in school, college, work, or life in general. Yet, they are already being used

to make life-altering decisions. Not only do early results show that students are not doing well

and failing, but teachers, especially those in high-poverty areas, are spending more time in test

preparation at the expense of teaching a broader, more interesting curriculum that encourages

children to come to school. These teachers are spending more time trying to prepare students for

multiple choice questions than discussing literature or doing science experiments. She goes on to

relate that qualitative studies show that next to death, flunking is a child's greatest fear, and that

conversations with children have been troubling for her. She specifically mentions that the

poor and English learners are especially targeted, and that these tests will create even more

disenfranchisement from school. She writes about the astounding amount of money being spent

on high stakes testing, and offers more productive ways to spend this money. These alternatives

are plans such as hiring reading specialists for schools for reading intervention programs, and

planning staff development programs for all teachers of special education, Title I, and teachers

in the general classroom; programs planned so that all involved can have a better understanding

of multiple ways to teach reading to a child. She expresses her dreams of a future where teachers

have time to plan, become more professional, and are paid as professionals who are charged with

preparing our country's most important resource, our children, and where curriculum engages


students in broad and rich thematic studies. This is the environment in which students and their

teachers have opportunities to learn, think critically, and create new content (2000).

On the whole, as pressure to improve students' test scores has increased, teacher job

satisfaction has decreased. In a survey of school-based teacher educators, or mentor teachers,

respondents revealed that they felt increased pressures to improve their students'

standardized test scores. The majority of these teachers worked in grades 1-5. A few represented

kindergarten and 6th grade, and some were specialists, such as reading specialists. The pressures

came from school boards, principals, and the media. On average, these elementary

teachers are required to administer seven different local, state and national standardized tests in

an academic year. Another variable relating to teacher job satisfaction that was considered

involved changes in job climate. Statistics show significant relationships exist between job

satisfaction and such things as control over classroom curriculum, perception in ability to meet

individual needs, teachers' effectiveness as educators, and changes in teachers' control of the

daily schedule. Curriculum and instruction are a primary aspect of assessment. Whether

intentional or not, curriculum is largely influenced by NCLB implementation and accountability.

More than 82 percent of these teachers focused intensely on covering test objectives in their

instruction as well as adjusting instructional plans according to students' test scores. In 68

percent of the classrooms, significant time was regularly spent on test preparation activities.

Furthermore, students practiced the various test question formats and were given worksheets that

reviewed anticipated test content. Attention to science had decreased for 66 percent of the

teachers, while attention to social studies had decreased for 60 percent. Emphasis on preparing

students to test well occurred in 93 percent of the classrooms, while student choice of curricular

decisions decreased in more than 50 percent of the classrooms. The hard drive for test results had


caused all other activities, such as art, to all but disappear. Mentor teachers particularly noted a

pressing need for an emphasis on assessment literacy in initial teacher preparation. Assessment is

the process of gathering evidence about a student's understanding of content and making

constructive inferences from that evidence. Evaluation is determining the worth of something

upon careful examination and judgment, and this is the merit of assessment. There is a mounting

need to help new teachers manage the stresses of tests and accountability, while at the same time

doing what is best for students' education. Public education is in need of teachers who are able to

assess the strengths of each child, while still providing students with an engaging learning

environment of rich content (Snow-Gerono & Franklin, 2006).

A qualitative study done by Skrla, 2001, suggests that teacher expectation levels relating

to the abilities of minority youth in their classrooms have been influenced in a positive way.

Skrla concluded that the accountability factor had changed teachers' expectations for the

achievement of their students for the better, and that these expectations provoked improvements

in both the quality and equitability of classroom instruction. Other researchers, however, have

illustrated how accountability policies work against teachers. Anderson and Ginsberg, 1998,

noted that such policies monitor, scrutinize, and finally end up controlling teachers, their

teaching, and their interaction with their students. The assumption that teachers do not produce

unless pushed, forced or coerced has not escaped the attention of teachers, and especially those

of minority populations. In fact, studies by McNeil in 2000, Pennington in 2004, and Sloan in

2004 and 2006, have concluded that high stakes testing has not only placed teachers of minority

youth under emotional strain, but that many of them contemplate leaving the profession.

Pennington and Sloan describe the move toward high stakes accountability as a reconfiguration

of teachers' professional knowledge of curriculum, pedagogy, and of students in the classroom.


Pennington explains in detail how the increased use of standardized testing to enforce

accountability actually undermines teachers' knowledge of more complex and culturally

responsive positions of literacy. Teachers end up abandoning these views of literacy to meet the

demands of high stakes testing. Conceptions of literacy that are multifaceted, socially

constructed, culturally responsive, and highly contextual are reduced to mere test reading and

test writing. As a result, the quality of classroom literacy declines, as does teacher expectation

for minority populations.

McNeil described the declining process in 2000 as the de-skilling and deprofessionalizing

of teachers. She depicted the harmful effects on curriculum and pedagogy as the collateral

damage that is inherent in the new school reform. She concluded that the results of the

policies led to a narrow, proficiency-based curriculum, which led to systematic defensive

teaching. With this style of teaching, teachers asked less of their students aside from that which

led to satisfying district mandates. In her 2004 research, Pennington describes those teachers as

devoting increasing amounts of instructional time to test-taking skills and test preparation with

commercially developed materials. These materials resembled the form and content of state tests,

and paralleled classroom instruction to match objectives on the state test. Yet, these techniques to

bolster test scores and earn schools high ratings are ultimately polluted. In the end, the test scores

serve to undermine the value and validity of the inferences made about students' literacy skills.

She does not, however, investigate the fact that some schools still received an unacceptable

rating from the state, or explore the overall quality of teachers' classroom practices. The lack of

counterexamples leaves the net effect underexplored.

Teachers' varied experiences and responses to accountability policies are a pivotal

measure through which to gain perception and discernment of the true effects. Even so,


Valenzuela described in 1999 and 2000 how the absence of more authentic expressions of care

from teachers was alienating them from many of their students of Mexican origin. She explained

that standardization and high stakes testing undermined and even dismissed students' culture and

their conception of what should be a meaningful education. She remarks that reducing these

youth to test scores reduces them to becoming objects for the mere purpose of shaping them into

the dominant, monocultural mold (as cited in Sloan, 2007).

Today, the demands of NCLB and objectives of IDEA for a least restrictive environment

(LRE) have led to a strong movement for inclusion in the classroom. Both general educators and

special educators are feeling inadequately prepared. Inclusion is promoting larger class sizes

with students being crammed into classrooms, and the profession of teaching has reached a high

stress level. More teachers are focusing on simply reaching the requirements that the standards

mandate, and this trend is also occurring in IEPs (Jara, 2006). The apparently conflicting

mandates of IDEA 2004 and NCLB 2002 also continue to pose a growing dilemma for special

education professionals. Designing IEP programs for students with significant disabilities

is challenging for teachers and IEP teams. These programs must assure access to the general

curriculum, yet provide highly individualized instruction that is responsive to varied needs.

Moreover, appropriate and valid assessment strategies for student progress and evaluation of

goals must continue to be maintained within statewide accountability systems under NCLB. One

possible solution may be that the development of specific standards-based objectives for these

populations could provide a means for reconciling all of these requirements. At any rate, these

students must be adequately supported if state accountability measures demand inclusion for

them (Lynch & Adams, 2008).


Furthermore, of the hundreds of failing state schools that are under restructuring in

California, few have been able to exit restructuring by pulling up satisfactory test scores. The

Center on Education Policy based in Washington determined in a report that only 33 of the 700

schools in restructuring in the 2006-07 year made enough progress to exit what is known as

program improvement. That number accounts for about 5 percent of the schools that were in

restructuring in California. The large number of schools in that phase makes California a perfect

model for research. The number of schools in restructuring has risen to 1,013 for the year 2008. Interestingly, California

actually began implementing its standards-based accountability system much earlier than most

states. What is more, it even identified schools for improvement before NCLB became law. A

report, titled Managing More Than a Thousand Remodeling Projects, said that federal

restructuring strategies rarely have helped schools to raise student achievement enough to attain

AYP to exit restructuring. It stated that the findings in California indicate the need to rethink

restructuring nationwide (as cited in Jacobson, 2008).

California Superintendent of Public Instruction Jack O'Connell is preparing a plan to

intervene in 98 school districts that are facing sanctions because their students have not met their

targets for at least five years. This can range from merely revising local-educational agency plans

for districts that came close to their goals, to the extreme of completely abolishing failing

districts. Schools that are in program improvement have several options presently available to

them. One involves contracting with an outside organization to run the school. Another is to

become a charter school. They can also choose to replace staff, turn school operation over to the

state, or any other restructuring effort that would show promise of significant changes. The

Center on Education Policy (CEP) study shows that the last option was overwhelmingly

California's choice in 2006-07. The option of choice might include adopting a new curriculum,


improving technology, or even having teachers reapply for their jobs. Mr. O'Connell referred to

his new plan as a system of triage that would initially assign technical assistance teams to

facilitate the lowest-performing of these districts. Then, the state would capitalize on some pilot

intervention projects already in progress elsewhere in the state. This leaves many wondering who

will pay for this plan when the state is already facing a $14 billion deficit for fiscal year 2009.

The CEP study showed that most districts and school administrators were optimistic about their

schools making AYP. Others reported that they would be using strategies proven to be successful

for other schools. Perhaps it was most aptly put by Merrill Vargo, the executive director of

Springboard Schools, a school improvement organization in San Francisco. She wondered if

anyone had a plan instead, that might help these schools to better serve the kids who get up in the

morning to go to these schools (as cited in Jacobson, 2008).

Overall Achievement toward Proficiency

Several studies done in 2002 suggested early on that high stakes testing might actually

hamper academic achievement among students. Audrey L. Amrein, a researcher at Arizona State

University in Tempe, coauthored a report with ASU education professor David C. Berliner, which

spanned the results over two decades of the initial federal thrust for accountability in education.

The studies described how efforts of states to link major penalties to student test scores were

generating little academic improvement. They speculated that, in some cases, high stakes testing

might be goading struggling students to seek alternatives to the traditional path to a high

school diploma. Research showed that after adopting new high stakes testing policies, 19 of 28

states witnessed their 4th grade mathematics scores decrease on the federally supported NAEP,

compared to the national average. Eighteen states saw increases compared to the norm on the

8th grade NAEP math tests. While states' test scores are used to calculate AYP, NAEP dat
