High Stakes Testing: Help or...

  • High Stakes Testing 1

    Running head: HIGH STAKES TESTING

    UNIVERSITY OF LA VERNE

    LA VERNE, CALIFORNIA

    HIGH STAKES TESTING: THE INSTITUTION OF NCLB AND ITS IMPACT UPON

    DISADVANTAGED POPULATIONS

    A Paper Prepared for EDUC 596

    In Fulfillment of

    The Requirements for the Degree

    Master of Education

    Roberta M. Taylor

    August 2008

    Abstract

    The purpose of this literature survey was to conduct an in-depth examination of the impact of

    high stakes testing in education upon some particularly disadvantaged subgroups, such as students of low

    socio-economic status and minority race, special needs and at-risk populations, and English language

    learners. The institution of NCLB in 2002 has prompted nationwide escalation toward more

    standardized testing in education. The principle governing NCLB is that all diverse children can

    equally learn to read, write and calculate math problems at pre-determined levels, thus, leaving

    no child behind. However, research reveals that the negative effects of NCLB and high stakes

    testing are inadvertently leaving all manner of children behind and undermining the teaching

    profession as we know it today.

    High Stakes Testing: The Institution of NCLB and Its Impact upon Disadvantaged Populations

    Admittedly, the major education initiative of the Bush Administration is the No Child

    Left Behind Act (NCLB), which was intended to raise educational achievement and close the

    racial/ethnic gap. Its policies were calculated to focus schools' attention on raising test scores,

    mandating better qualified teachers, and providing the public with educational choice.

    Disappointingly, the complexities of the law have been unsuccessful in achieving these goals and

    have instead provoked a number of negative consequences which inadvertently harm the students

    it is intended to help (Darling-Hammond, 2007).

    Since the advent of the No Child Left Behind Act signed into law by President George

    W. Bush in 2002, there has been an escalation in education toward increasingly more

    standardized testing. The compelling force behind this movement is the expectation that all

    children can learn to read, write and calculate math problems at pre-established levels, regardless

    of any diversity and in spite of any differences. All classifications of students, be they mentally

    handicapped, English as a second language, minority race or low socio-economic status,

    behaviorally disordered, learning disabled or otherwise disadvantaged, are required to be

    included in state-mandated tests. Furthermore, NCLB is requiring that all students score

    proficient by the year 2014. In addition to this position, it is asserted that teachers and schools

    are ultimately accountable for students' performance (Campbell, 2008).

    More than six years after its inception, the controversy of high stakes testing perpetuated

    by the federal mandates of NCLB continues. The heightened obsession with raising test scores to

    meet proficiency standards has, in actuality, had a negative effect within some variable

    subgroups. Even beyond this, the paradigm of education has shifted from the benefit of the

    student to satisfying political objectives. This has fashioned some unanticipated changes in the

    attitudes and approaches of those involved in the teaching profession. The purpose of this

    literature review is to disclose the results of the investigation into the role of high stakes testing

    and to uncover the injustices encountered by some particularly disadvantaged groups.

    Discriminations exist in the biases experienced by low socio-economic status and race, in the

    discrepancies that besiege special education or exceptional students and at-risk populations, and

    in the disparities for English language learners. Funding problems also hinder the educational

    progress of these subgroups. Although the focus of this investigation is the impact of high stakes

    testing upon students, it cannot go unnoticed that the impact extends far into the field of

    education, and that the thrust of NCLB accountability has impacted professionals in education

    from teachers all the way up to the state level.

    The History Behind NCLB

    State testing programs began in the wake of a 1983 report called A Nation at Risk

    released during the Reagan Administration. At the request of Capitol Hill, Daniel Koretz at the

    Congressional Budget Office tracked student performance from the immediate postwar period

    forward. The news was good, but Koretz emphasized that state assessments would have to

    provide annual tests of comparable students each year in order to yield valid and reliable data on

    student achievement over time. The rise of systematic reform in the states was crafted to generate

    clearer learning standards and stronger student testing results that would remedy institutional

    limitations. This restructured system was to focus on achievement outcomes measured by state

    education departments. By and large, states adopted the National Assessment of Educational

    Progress (NAEP) style of reporting achievement levels that communicated student levels of

    mastery as criterion referenced, rather than percentile rankings, known as norm referenced, that

    are comparisons between states. In the late 1990s analysts began to compare the number of

    students considered proficient against state versus NAEP definitions. However, the NAEP exams

    are not necessarily aligned to any state's curriculum, and caution is advised in making

    comparisons between the two. Furthermore, the definition of proficiency and cutpoints varies

    among and between states. The legislative creators of the current act of NCLB assumed that

    proficiency defined a parallel level of student performance across states and tests (Fuller, Wright,

    Gesicki, & Kang, 2007).

    The federal policy of NCLB that governs our nation's schools today evolved from a

    model of an accountability system first established in Texas, President George W. Bush's home

    state, during the years just prior to his presidency in 2000. Texas was one of at least 27 states to

    use the results of standardized tests to make such high stakes decisions as holding back students

    a grade, withholding their diplomas, and penalizing teachers, principals and schools that

    performed poorly. Scholars, educators, and policymakers agree that tests are useful for tracking

    students' progress and identifying weaknesses in teaching. Yet, educators even then began to

    describe testing's dark side. They alleged that standardized tests are too limited, imprecise, and

    too easily misunderstood to use as the basis for crucial decisions concerning students. They also

    maintained that the consequences of high stakes testing interfere with good teaching and

    discriminate against disadvantaged and minority students who need the most help (Miller,

    2001). Since its institution, NCLB has created intense pressure nationwide to show increases in

    test scores within a very limited time period. This given time period has proved to be far less

    time than realistically needed for improving instruction, updating curricula, or enhancing the

    skills of teachers to perform miracles within all subgroups and grade levels across the board.

    In Texas, it was eventually discovered that although test scores rose in the early grades,

    middle and high school grades proved less successful. In time, performance at the high school

    level exposed the weakness in the system. At the end of the day, Texas finally responded by

    producing a loophole that permitted principals to exclude from their school-level scores those

    students considered to be liabilities (Bracey, 2008). The outcome of those students not testing up

    to standard was to find ways to exclude them. This is a practice that appears to be gaining steam

    nationally. Ironically, the ideas of leaving no child behind and exclusion appear to be

    functioning synonymously today. This is a contradiction of the ideals that govern the original

    policy of NCLB. Still, both the state and the national system claim that they are improving

    education and that they leave no children behind. They continue down the same beaten path

    while research shows that there is imminent need of paving a new road.

    The Role of Politics in Determining Standards

    Standards have grown out of a number of key issues in education. These include a

    century-old deliberation involving tracking, a 50-year-old breakthrough concerning the impact of

    teacher expectations, a 40-year struggle over educational equity, and the enduring desire for

    highly skilled workers to drive the nation's economy. Not surprisingly, they converge to create

    support from politicians, educators, and business leaders, alike. They serve two primary

    purposes. The first is to address an economic concern that America is losing its competitiveness

    in the global market. Fear that America is falling behind, prompted through international studies

    of achievement, has propagated the need to press students to learn more and learn faster. The

    argument is that our students can't compete because of poor preparation, and that our economy

    will suffer. The second purpose for standards is to address the disparity between low achieving

    and high achieving students. There is a growing gap in income made worse by a growing gap in

    education (Gratz, 2000).

    In their most common function, standards are fundamentally the descriptors of what

    should be taught, otherwise known as the content. They are intended to provide waypoints for

    helping teachers and parents gain understanding of where children are cognitively. Achievement

    standards are complex and greatly subjective. They involve the two components of performance

    and proficiency. Performance standards are reflective of how students show their determination,

    while proficiency standards unearth their level of achievement. These are the mixtures that

    compel states toward the accountability and assessment that they design to present what students

    know to the public. In this scenario, comparison then shifts focus from the needs of learners to

    the needs of the school, which can be detrimental to students (Schultz, 2008).

    Unfortunately, many education reforms are influenced by political ideology rather than

    what really works in schools. The statements of the problems are usually compelling, and more

    complete and accurate than the proposed solutions. They characteristically overpromise and

    underdeliver. Even the most promising initiatives predictably fail when poorly implemented on a

    broad scale. Some are adopted too quickly, some too rigidly. Some are adopted in name, but not

    genuinely implemented. Little attention is paid to differences between schools, and little regard

    is given toward unintended consequences. It should come as no surprise that inadequate

    implementation and unanticipated consequences are fueling a growing rebellion against high

    standards and tough tests (Gratz, 2000). The unsettling reality in today's political climate is that

    many political leaders believe that the best way to transform schools is through an end-of-a-gun-barrel

    approach, rather than by building consensus. Clearly, accountability to NCLB and its

    corollary regulations at state levels support this approach (Casborro, 2005).

    Still, proponents believe that raising standards for all students, teachers, and schools will

    improve the education of poor and minority populations, particularly in urban schools where

    student scores generally fall way below current standards. These students are placed at a great

    disadvantage when so little is expected of them. They should be motivated to competition

    through rigorous and world class standards. Yet, there is significant agreement that standards

    should address the essentials, be grounded in core academic disciplines, and should cover

    content students should know and be able to perform. American schools and students need an

    incentive to strive for higher levels of performance, and standards and accountability have

    emerged as the means to contend with these problems. Even so, standards are ultimately about

    the ends and not the means. They should not be a prescription for teaching methods, a device for

    classroom strategies, or a substitute for lesson plans. Furthermore, their objectives should be

    crystal clear to students, parents, and teachers and they should enjoy broad support among

    teachers and the general public. The component of ownership is crucial when standards imply

    critical assessment and accountability (Gratz, 2000).

    The Significance of High Stakes Tests

    High stakes tests are tests that have the potential to significantly change the lives of

    children and adults through direct consequences for individuals, groups, or organizations, in

    which the stakes are high. They include such tests as state-wide assessments used to measure

    adequate yearly progress (AYP) required by NCLB, college entrance exams, certification and

    licensure exams like those used in education, business, medicine, and legal professions, and

    some aptitude exams. Thus, it is the use to which a test is put, and not its design, that

    determines whether the assessment is high stakes. The hope is that increased pressure

    upon states and schools to help all groups perform better on the tests will diminish the gaps in

    performance between diverse population groups (Polnick & Reed, 2006; Nichols & Berlinger,

    2008).

    Requiring states to assess and report student outcomes by diverse groups places emphasis

    on all students to achieve standards regardless of their socio-economic status, race, ethnicity, or

    gender. Incentives for schools to perform stimulate support from faculty. Motivators for

    individual students come in the way of praise and recognition. Promoting mastery of national

    standards through high stakes assessments may focus curriculum toward meaningful content and

    research-based instructional practices. However, there are some concerns as to the consequences

    of high stakes tests for schools and students. Assessments associated with accountability systems

    can result in rewards and/or sanctions for students and schools, thereby having the potential to

    also deliver a negative impact. Sanctions for schools whose students fail to pass these tests in

    adequate numbers can result in a change in faculty, a change in leadership or a total

    reorganization of the school. This can lead to an increased gap between low and high performing

    schools, reduced opportunities for minorities and students with disabilities, and the threat of

    decreased support for national curriculum efforts. The negative effect for low performing schools

    and students is potentially greater than the negative effect on high performing schools and

    students. This results in an increased gap between the schools. Another critical area of concern is

    high school. High stakes assessments that translate into high school exit exams that students

    must pass in order to graduate can result in students being marginalized from mainstream

    society. This can have significant consequences for the many disadvantaged students of low

    socio-economic status, minority races, and students with disabilities (Polnick & Reed, 2006;

    Nichols & Berlinger, 2008).

    Some characteristics of high stakes tests given to elementary children would astound

    parents and the general public. An application of the Flesch Reading Ease Scale and the Flesch-

    Kincaid Grade Level Scale to the high stakes test used in California in 2004-05 rated 50% of the

    passages as exceeding normal grade level reading norms by one year or more. Almost 20% were

    above grade level by more than two years, while some passages exceeded grade level by three

    years or more. Compounding this demanding situation is the depth and breadth of the questions.

    Yet, when children cannot engage the material in the exam, their scores provide no meaningful

    data about their academic progress (Meek, 2006). What is more, the policy of high stakes testing

    is not scientifically based. Improvement on test scores is not a true indicator of what students

    really know. The reliability coefficient of standardized tests for elementary and secondary

    students is around .9, which is not especially dependable. Tests are imprecise because of design

    error and the small sampling upon which broad conclusions are drawn. Education consultant, W.

    James Popham, retired from the University of California at Los Angeles, states that policymakers

    erroneously assume that standardized achievement tests measure what a school has taught, when

    in fact they do not. Traditionally, designers eliminate questions that too many test-takers may

    answer correctly in order to spread students out along the familiar bell curve of statistics. This

    leads them to use test items that are unsuitable for measuring the quality of instruction. He

    estimates that 20 percent of mathematics questions, 40-50 percent of reading items, and 70-85

    percent of language arts items on two common achievement tests for elementary students were

    more suitable for measuring IQ or socioeconomic advantages. Another profound criticism of

    standardized testing is that teachers learn to teach to the test. This posture encourages the

    substitution of the shallow content of test preparation for more challenging curriculum. In effect,

    teachers become better at preparing their classes for these tests, which also defeats the purpose of

    giving them (as cited in Miller, 2001).
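
    The readability comparison described above rests on two published formulas, the Flesch Reading Ease score and the Flesch-Kincaid Grade Level. The following is a minimal sketch of how a passage might be scored against the grade it is given to. It is not the procedure used in the cited study (Meek, 2006), whose details are not given here. The function names are hypothetical, and the syllable counter is a crude vowel-group heuristic assumed only for illustration, so the results are approximate.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count runs of vowels, minimum one per word."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_counts(text: str) -> tuple:
    """Return (words, sentences, syllables) for a passage of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate U.S. school grade of the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def years_above_grade(text: str, tested_grade: int) -> float:
    """Positive values mean the passage reads above the grade being tested."""
    words, sentences, syllables = text_counts(text)
    return flesch_kincaid_grade(words, sentences, syllables) - tested_grade
```

    Under these assumptions, a passage given to third graders for which years_above_grade(passage, 3) returns 1.0 or more would fall among the passages described above as exceeding grade level norms by one year or more.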

    The Impact of the California High School Exit Exam

    The California High School Exit Exam (CAHSEE) is the test created to ensure

    accountability for California's high school graduates. It is a measure of these graduates'

    proficiency in English language arts and mathematics. Scheduled for implementation in 2006,

    the potential for this test's impact upon students is great, and has been the subject of much

    debate. Testing as a means of demonstrating accountability demands that the CAHSEE must be

    scrutinized for fairness, particularly within the context of validity, absence of bias, access,

    administration, and social consequence (Callet, 2005).

    In 2003, Californians for Justice, a statewide grassroots advocacy group for poor and

    minority communities, declared that the California High School Exit Exam (CAHSEE) is harmful

    to minority and low-income students. Considering the quality of education that they receive from

    unqualified teachers, and the run-down, hazardous, and unhealthy schools they attend, they

    claimed that the CAHSEE falls short of its stated goals and these students are at a disadvantage

    (Calif. Group, 2003). After an internal evaluation of the test done by the state education department, it was

    found that some fundamental components of the test, such as algebra, had not been taught to

    many of the students. The evaluation also determined that more than 20 percent of students,

    predominantly students limited in English skills or students with disabilities, would have failed

    the test and been denied their diplomas. As a result, the California state school board voted

    unanimously to delay the mandatory exit exam from 2004 to 2006. State officials anticipated that

    the two years would be sufficient time for younger students to learn the skills required for the

    exam. The exam is initially administered in 10th grade, and students who do not pass on the first

    try are given a number of opportunities to pass before graduation (Sack, 2003).

    As early as 2001, the Oakland, California based Disability Rights association moved to

    action in a San Francisco District Court. In a class action lawsuit known as Chapman v.

    California Department of Education, the Superintendent of Public Instruction, and the State

    Board of Education, plaintiffs declared the high school exit exam to be invalid and discriminatory

    against students with disabilities. The case was put on hold until 2004, when Alameda County

    Superior Court Judge Ronald M. Sabraw commissioned a regional education laboratory known

    as WestEd to carry out a study investigating the impact of the exam on students with disabilities.

    The study, completed in 2005, recommended that the requirement of the exit exam for

    graduation be delayed for at least two years. Results of the state data of these students revealed

    that 54 percent of them had passed the English section of the test, while 51 percent had passed

    the math section. The total percentage of students that had passed each section statewide was

    about 88 percent (Jacobson, 2005).

    By early 2005, new legislation, known as SB 517, appeared close to a final agreement. The

    terms of SB 517 would permit students with disabilities who had not passed the exit exam to

    graduate with their senior class under certain conditions. These students could graduate provided

    they had satisfied all their other graduation requirements and met the terms of the original

    Chapman settlement. This included taking the exit exam at least once during their senior year,

    and taking advantage of remediation. Proposals involving delays or changes in the graduation

    requirement, including legislation to permit alternatives, such as portfolios or locally developed

    assessments, would require an amendment to state law. These were vetoed by California

    Governor Schwarzenegger prior to 2005 at the urging of Superintendent of Public Instruction

    Jack O'Connell. While the California education department had reviewed a number of

    alternatives to measure students' demonstration of mastery of the standards, Mr. O'Connell

    concluded that none of the alternative methods was an appropriate substitute. He asserted that

    students must have the mathematics and English skills measured by the exam in order to thrive in

    an increasingly competitive global economy. This position drew immediate fire from civil rights

    advocates and other interest groups. Still, he maintained that there were plenty of options that

    remained for students to continue their education if they had failed the test and said that he

    would propose legislation to help expand their options. Among the list of proposals to be offered

    was the provision of more money, lifting enrollment caps for remedial education, summer

    school, adult education, and independent-study programs that could help students pass the exam.

    He also called for students to be eligible for Cal Grants to attend community college if they had

    not passed the test, but had met all other graduation requirements and grade-point-average

    requirements.

    In early 2006, the Human Resources Research Organization (HUMRRO) estimated that

    only 35 percent of special education students and 51 percent of English language learners in the

    class of 2006 had passed both portions of the exam (Olson, 2006). A new law reflecting the

    terms of SB 517 was signed by Governor Schwarzenegger in early 2006, which signaled a

    settlement in the case of Chapman v. California Department of Education. Under the measure,

    students with an individualized education plan would be able to graduate without passing the exit

    exam, as long as they had met all other state requirements for a diploma and received available

    remediation for the test (Jacobson, 2006).

    Also in 2006, after about 41,000 seniors in the class of 2006 failed to receive diplomas

    because they did not pass the exit exam, a group of high school seniors sued the state. In

    Valenzuela v. O'Connell, the Superior Court of the State of California ruled that schools in

    California could no longer require an exit exam for high school graduation. The judge in that

    case concluded that the exam discriminates against students who attend schools in low socio-

    economic areas and students who are learning English (Kravetz & Sawchuck, 2006). The

    California Department of Education immediately retaliated by asking the state Supreme Court to

    overturn the decision. The state's exam was reinstated as a graduation requirement for the class

    of 2006, but the Supreme Court sent the case to an appeals court to make the ultimate decision of

    whether or not the exam would continue to be required of the nearly 47,000 seniors who had yet

    to pass it in 2006 (Sawchuck & Kravetz, 2006). Rejecting the claim that the test discriminates

    against English language learners and poor students, the California Appeals Court upheld the

    CAHSEE (Kravetz, 2006a). Finally, in October 2006, the governor signed new

    legislation bringing to an end the lawsuit against the state, Valenzuela v. O'Connell. The

    measure allows students to receive up to two years of extra help beyond the 12th grade.

    California districts are now eligible to receive a share of more than $70 million for supplemental

    instruction and counseling services that specifically target students who have reached the end of

    their senior year, but have not passed the state's high school exit exam (Jacobson, 2007).

    There are currently 23 states that have exit exam mandates, and more are moving in

    that direction. The struggles in California are mirrored by other states as decisions are made to

    uphold exit exam mandates or to delay the requirements. The state of Maryland is still deciding

    whether to implement or push back the requirement for their class of 2009. Washington State

    requires students to pass the reading and writing portions of their state test, but the 10th grade

    math section of the Washington Assessment of Student Learning exam is delayed until 2013.

    The performance of economically disadvantaged students, minority students, and students with

    disabilities is what is at issue in most states. In California, of the students of the graduating

    class of 2007 who have not yet passed the state exam, about 18,000 are English language

    learners and 24,000 are of low socio-economic status (Jacobson, 2007).

    The federal government does provide states with grants for adult education and the

    General Equivalency Diploma (GED) test. However, Don Soifer, the executive vice president of

    the Lexington Institute, a think tank in Arlington, Virginia, remarked that the ability to offer and

    take the GED in Spanish provides an incentive for schools to put the limited-English-proficient

    kid on a GED track rather than a regular high school education track, which undermines the

    goals of inclusiveness (as cited in Zehr, 2006). The California Department of Education finally

    released a list of dropout rates for the 2006-07 school year that puts the state's average at 24.2

    percent for California public high schools. Data reveals the dropout rates at 41.3 percent of black

    students, 31.3 percent of Native Americans, 30.3 percent of Hispanics, 27.9 percent of Pacific

    Islanders, 15.2 percent of whites, and 10.2 percent of Asians. Critics have charged that in the

    past, state education officials have used unaudited data from school districts that reported a

    graduation rate of 85 percent in order to comply with NCLB requirements. The new numbers

    paint a more realistic picture. Yet, these figures pale in comparison to the 54.5 percent dropout

    rate of the Victor Valley Union High School District, which bears the worst rates in the county of

    San Bernardino. Duneen DeBruhl, VVUHSD assistant superintendent of educational services

    remarked that district officials believe students are frustrated in meeting class requirements,

    among other things. They are inclined to drop out if they feel they don't have enough credits to

    graduate with the required state exam (as cited in VVUHSD Worst Dropout, 2008). VVUHSD

    reported 1,552 dropouts of the county's total of 7,082. Statewide, the number of dropouts was

    reported to be 87,456 (List of Dropout, 2008).

    Disadvantages of Low Socio-economic Status and Race

    Studies done by Onwuegbuzie & Daley, 2001, Roth et al., and Holman, 1995, reveal that

    all ethnic minority populations score significantly lower on traditional standardized tests than do

    whites, with the exception of Asian Americans. Scores from the Standard Assessment of

    Intelligence (SAI) show that on the average, Asian Americans score three points higher than

    white Americans, while white Americans achieve 15 points higher than African Americans, and

    11 to 22 points more than Hispanic students. The College Board reported similar differences in

    1998 Scholastic Aptitude Test (SAT) results. White Americans scored an average combined

    score of 1054, Hispanic students averaged 927, Mexican American students averaged 913, and

    African American students averaged a combined 900 (as cited in Altshuler & Schmautz, 2006).

    In 2000, a study by Wolfensberger stated that people's welfare is significantly contingent

    upon the social roles they occupy. Social roles are a combination of functions, behaviors,

    relationships, duties, privileges, and responsibilities that are socially defined, commonly

    understood, and largely recognized within a society. Generally, those who occupy roles that

    others value as positive are treated well. People who fill devalued roles are typically treated

    poorly. In this model, being poor or of low socio-economic status is a devalued role, and

    devalued people are apt to face a range of detrimental experiences (as cited in Lustig & Strauser,

    2007). The poor are consigned to low social status and are systematically rejected by society.

    They are more likely to be subjected to loss of control over their lives as others eventually end up

    making some of life's decisions for them. These individuals are usually associated with the

    negative images of being dependent, lazy and often a welfare recipient. The poor often live in

    neighborhoods of high crime, poor schools, and limited social networks. Children of low-income

    neighborhoods are particularly vulnerable to the impact of poverty, crime, violence, divorce, and

    frequent changes in family structure. They are less likely to have qualified teachers and more

    likely to sit in classrooms that are noisy compared with middle-income students. They are also

    more likely to report physical assaults and incidents involving weapons in schools. A 1997 study

    by Gephart has shown that there is a strong relationship between poverty, low academic

    achievement, and rates of school dropout (as cited in Lustig & Strauser, 2007). The

    socioeconomic level of one's peers also has a significant influence as peer groups and role

    models are the strongest reference for conformity. It is believed that poverty is characterized by

    inconsistent and unreliable experiences. This lack of coherence and instability translates into

    speculation that children of poverty or low socio-economic status are less likely to persevere in

    pursuing educational and vocational objectives (Lustig & Strauser, 2007).

    Researchers have long recognized that standardized tests have built-in biases related to

    socio-economic status and race. Unfortunately, NCLB disregards the realities of racial

    discrepancies when the expectation is the same for all classifications of students. Nonetheless,

    high stakes testing provides administration with vital educational standards that help to track,

    sort and label students. Moreover, it is when the stakes are high that people tend to seek

    assistance from professional resources. The more affluent schools, districts and families are at an

    advantage to be able to make use of these types of benefits. Predictably, the lower-performing

    schools and districts and the economically disadvantaged students are less able to afford this type

    of support (Smyth, 2008). As a result, Black and Hispanic dropout and retention rates are

    typically higher than those of Whites and Asians. Poor attendance, yet another contributing factor to

    low performance in testing, is characteristically a greater problem among those students of low

    socio-economic status as well (Bracey, 2008).

    In spite of the guarantee brought by Brown v. Board of Education, segregation still exists

    in today's schools. Not only are African Americans subjected to persistent educational

    inequality, they are overrepresented in special education, suspended and expelled at greater rates,

    and have higher dropout rates (Green, McIntosh, Cook-Morales, & Robinson-Zanartu, 2005).

    Psychological testing done on black and white army draftees and on the large masses of

    immigrants that were arriving in the U. S. in the pre-World War I period generally showed that

    black draftees and refugees scored lower than the white middle class. No one questioned the

    appropriateness of these tests for people whose backgrounds, lifestyles, and language were

    different from the mainstream population. The cultural bias inherent in such tests served the

    useful purpose of labeling then, as it does now. Many still oppose the use of standardized tests

    with ethnic and minority groups, such as blacks. They believe that the cultural frame of the

    individual's environment is an influence from which one cannot be separated. While procedures

    of test development are somewhat less biased today, there still remains the problem of the

    appropriate use of tests and test results, as well as the development of test content that should be

    representative of a pluralistic society (Spencer, 1975).

    Several investigations have found that negative attitudes toward testing

    may be affecting the performances of African Americans on tests. African Americans are

    commonly contemptuous of the testing endeavor because of the social stigma attached to being

    viewed as intellectually inferior. Arvey, Strickland, Drauden and Martin developed and validated

    the Test Attitude Survey (TAS). This was an attitude instrument designed to measure the

    opinions of employees toward the tests they had completed. An analysis of their responses

    revealed nine factors of influence. These included motivation, belief in tests, level of anxiety,

    lack of concentration, test ease, external acknowledgment, achievement requisites, future

    consequences, and preparation. A second study by Arvey and colleagues specifically examined

    the differences in motivation between black and white test takers, and whether differences in

    motivation might explain the racial gap in test performance. They found that race was

    significantly related to the TAS motivation factor. Results revealed that white applicants had

    more test-taking motivation than blacks. Interestingly, it was found that where the TAS factors

    were held constant between both races, test score differences were significantly reduced between

    black and white applicants. This indicated that the lack of test-taking motivation among African-

    American applicants undermines their performance on employment tests (as cited in McKay &

    Doverspike, 2001).

    Steele and Aronson argued that negative stereotyping of African Americans also causes

    anxiety that compromises their performance. Using a sample of 117 Stanford undergraduates,

    they learned that when described as a test of mental ability, whites significantly outperformed

    African American participants on cognitive ability tests. When the same test was structured as a

    measure of problem-solving ability, with no indication of intellectual ability, not only did the test

    performance of African Americans increase dramatically, it varied only modestly from that of their white

    counterparts. Apparently, the social stigma concerning the inferior intellect of blacks has a

    negative effect on their test performance. Chan, Schmitt, Deshon, Clause, and Delbridge conducted

    similar studies, which found that differences in test performance are affected to

    some extent by motivational differences in test-taking between races (as cited in McKay &

    Doverspike, 2001). This information suggests that we might be able to increase African

    American performance on traditional tests by modifying these particular attitudes. Various

    training programs exist that emphasize the adoption of positive attitudes toward the testing

    process, and the importance of hard work and accomplishment. Another feature that could be

    included in pretest programs is the use of techniques to help deal with anxiety. Still another

    solution would be an alternative format to the paper-and-pencil test that is valid and criterion

    referenced. The quest for video-based tests has resulted in challenges to validity, and computer

    based tests present practical and logistical problems for public use of large-scale testing (McKay

    & Doverspike, 2001).

    A large body of literature has illustrated the imbalance in opportunities to learn because

    of the inequalities between schools and communities. The fact remains that students of low

    socio-economic and minority backgrounds are also far more likely to attend schools with

    uncertified teachers, large class sizes, more violence and substandard educational resources.

    Raising and enforcing educational opportunities through high stakes standardized tests without

    first addressing these and other important issues is not only inherently biased, but intellectually

    and morally dishonest. For instance, the problems encountered by the Native Alaskan student are

    acute. The nature of their problems is associated with the sociocultural and geographic isolation

    of their communities, as described by Estrin and Nelson-Barber in 1995, and Sowell in 1994.

    The imposition of testing methods that are often inconsistent with native cultural practices only

    compounds the problems. An example presented by Estrin and Nelson-Barber is that elders of

    Native American societies often assume the responsibility for educating their young and

    establishing the standards for learning (Sloan, 2007).

    Currently, Hispanics are the largest ethnic minority group in the United States, and there

    are estimates that Hispanics will outnumber white people by 2046. While they represent more

    than 13 percent of the nation's total population, they comprise more than 25 percent of those

    below the age of 18, many as students in our schools (Altshuler & Schmautz, 2006). Valenzuela

    did a study in 1999 that explains why high stakes testing exacerbates the negative consequences

    of the U. S. educational system for many of our Mexican and Mexican-American students who

    are Hispanic. In it, she effectively expresses how academic achievement is a social process that

    materializes through the experiences of these students' lives as they navigate the many social,

    cultural, historical, and linguistic relationships that fashion their lives, both in and out of school.

    She became increasingly convinced that the organization of public schools systematically

    removes, or subtracts, Mexican cultural resources in the context of the classroom. She

    illustrates how the school systems are structured around cultural assimilation and resocialization,

    commonly referred to as Americanization. This tool, in the form of a standardized graduation

    exam, presents a vast hurdle for many Mexican-origin students who can demonstrate academic

    competence in the classroom, but cannot achieve passing scores on the state tests. She argued

    that the state's high stakes accountability system is a mechanism to vigorously enforce

    monolingual policies, which has serious repercussions for the Mexican minority. High stakes

    accountability led to a shift away from dual-language and late-exit bilingual education programs

    structured to help these populations, to early-exit and English immersion programs, as described

    by Pennington, 2004, and Sloan, 2004 (as cited in Sloan, 2007).

    In the fall of 1999, the state of Texas was sued by a Mexican-American civil rights group.

    They claimed that the Texas Assessment of Academic Skills (TAAS) was exceedingly harmful

    to the educational prospects of minority students, in this case Latinos. Falling short of the

    state-established passing score by only one point could mean attending summer school, or even

    flunking. Scholars testified on behalf of the plaintiffs. They asserted not only that classroom

    instruction was being harmed by high stakes testing, but also that it encouraged minority students

    to drop out. Walter M. Haney, a professor of education at Boston College, declared that 10th

    grade success on the TAAS is an illusion. He estimated that minority students were three times

    as likely to repeat ninth grade as whites before the TAAS was employed, and that their dropout

    rate was higher in that grade than any other. What he found was that a large part of the evident

    increase in test scores was fundamentally due to the rising exclusion of students, particularly

    black and Hispanic students. He explained that there are three ways to exclude poor test takers.

    You can flunk them in ninth grade, classify them as learning disabled, or persuade them to

    pursue a general-equivalency diploma (GED) instead of a high school diploma. Unfortunately,

    the TAAS makes no accommodation for limited proficiency in English, and these students' performance is

    also hampered by other inequities, such as school funds and teacher training (as cited in Miller,

    2001). Consider that learning enough English to earn a diploma is all but impossible for most

    minority students from other countries who come to the United States in their late teens. Many of

    these students from Mexico and elsewhere in Latin America are discovering the option of the

    GED test, instead. The GED can be earned by taking the GED test in Spanish or French, as well

    as English, and is the equivalent of a high school diploma. It is a test taken mostly by adults who

    either dropped out of school or never were able to meet graduation requirements (Zehr, 2006a).

    Removing Bias within Low Socio-economic Status and Race

    These glaring observations have not gone without attempts to mitigate the problems of

    bias in high stakes assessments. There have been attempts to modify traditional standardized

    instruments, such as the well-known Wechsler Intelligence Scale for Children (WISC). Yet, as

    Hood noted in 1998, each modification has resulted in the discovery of more biases, resulting in

    perpetual changes. What's more, modifications have been attempted in order to connect

    culturally responsive instructional strategies to testing results. According to Newman &

    Associates, 1996, these are found to be difficult to accomplish within the constraints of public

    education. Another approach that has been considered is the development of nondiscriminatory

    assessment tools. However, the practical capacity for improving discriminatory results through

    changes in test design remains uncertain. Hood, 1998, Linn, 1993, and Shavelson, Baxter, &

    Pine, 1992, all have documented that performance based assessments (PBAs) are plagued by

    lack of empirical support and low reliability for their nondiscriminatory claims. Furthermore, this

    approach is rendered ineffective when colleges and universities do not utilize these assessments.

    One other approach focuses on language proficiency. While this is an empirically validated

    source of test effect and remediation, there are subtle and underexposed environmental norms

    and contexts in the use of language. It perpetuates a discriminatory bias against students who

    lack proficiency in white, middle-class English, which is the socioeconomic norm of the majority

    of U. S. society. In the long run, this type of discrimination can also have damaging

    consequences for Hispanics and minority students of other ethnic heritage.

    In turn, underachievement on high stakes tests can result in lowered

    academic self-concept. Academic self-concept is a reflection of both the descriptive and

    evaluative aspects of the self. It involves an individual's description of their own strengths and

    weaknesses, as well as an appraisal of their academic competencies. Most will agree that positive

    self-concept cultivates achievement, and successful achievement reinforces self-concept.

    Conversely, it is also recognized that negative academic self-concept tends to limit academic

    achievement. By and large, the ability of Hispanic students to adopt dominant behavioral norms,

    which may be contradictory to their Hispanic cultural norm, appears to be a major determining

    factor of their academic achievement. Conflicts in values and procedures, which are inherent in

    high stakes test design, serve to systematically discriminate against Hispanic students

    (Altshuler & Schmautz, 2006). To date, solutions, such as multicultural education, remain

    superficial and limited by shortsightedness. Sleeter observed in 1996 that multicultural practices,

    such as the study of food, music, and dance, are superficial. The attempt should be to reposition

    perspectives that can foster understanding and equitable academic success (as cited in

    Altshuler & Schmautz, 2006).

    Disadvantages of Students with Special Needs and At-risk Populations

    States are requiring students in special education, like their peers in general education, to

    take state-mandated tests. Students with disabilities often require special accommodation, such

    as extra time, a word processor, Braille, or a scribe. Schools are responsible for determining

    appropriate accommodations for students with disabilities, but a proctor administering a test may

    have no idea that there is a student who requires any accommodations (Samuels, 2006). Students

    with disabilities are eligible to receive accommodations as defined in their IEP. Possessing a

    learning disability in the area being assessed, such as reading, written language or math is not

    grounds to exempt a student from meeting district or state standards (Swain, 2006).

    As state assessment systems continued to evolve to satisfy federal requirements, an

    examination of states' participation and accommodations policies revealed that policies for both

    participation and accommodations were becoming more explicit compared to those in place at

    the initiation of NCLB. Participation options customarily included the standard three choices.

    These were participation without accommodation, participation with accommodations, and

    alternate assessments. Over time, states continued to increase the number of these

    accommodations. In some cases, they even began allowing accommodations for

    students who were not receiving special education services. The most controversial of these

    accommodations being widely offered were read aloud, calculator, and scribe (Thurlow, Lazarus,

    Thompson, & Morse, 2005).

    Consequently, the implementation of NCLB has affected both general and special

    education by neglecting to justify the purpose of an Individualized Education Program (IEP),

    neglecting to adequately define a highly qualified teacher, as well as introducing unforeseen

    pressures upon both general and special educators. In effect, NCLB has been widening the gap in

    student achievement rather than leveling the playing field. Schools with special needs

    populations have found it increasingly difficult to stay in compliance with federal legislation.

    Students with disabilities are now being held accountable for general curriculum that they were

    not even exposed to prior to NCLB. Furthermore, the individualized aspect of the IEP is

    being lost as IEP teams attempt to align with the general curriculum now mandated by the state

    standards. IEPs are being overshadowed by the new standards based reform and focus is shifting

    from the student's overall development to the general curriculum. Accordingly, the original plan

    of the Individuals with Disabilities Education Act (IDEA) to individualize instruction for the

    student is being ignored. This is because the goal of state academic standards involves equal

    standardized instruction performance for all students (Jara, 2006).

    While NCLB also demands highly qualified teachers, it does not support this demand

    with adequate funds. Highly qualified teachers are challenging to train and special education

    programs are suffering because skilled training for such programs lacks suitable compensation.

    Magnet and charter schools are especially finding it unfeasible to meet Adequate Yearly

    Progress. Many of these schools have come to specialize in educating these exceptional students

    and at-risk populations. Although they are exempt from reporting two percent of their population

    from state assessments, their high numbers of exceptional students make this compensation

    irrelevant. In many of these cases AYP is nearly impossible to achieve. Sooner or later these

    schools will be labeled as failing, leaving at-risk and special needs students behind (Smyth,

    2008).

    The quality of special education is also suffering today because it is extremely

    challenging for interested individuals to become special education teachers. Not only is a degree

    in special education required, but individuals must receive certification in the subject they plan to

    teach. As a result, the field of special education is suffering because of the growing shortage of

    professionals in special education and the increasing difficulty of becoming a special educator.

    This has led to an intense movement for inclusion of these students in the general classroom, yet

    another challenge for general educators. What is more, general educators and special educators

    alike do not feel adequately prepared with effective pedagogy for this environment (Jara, 2006).

    In schools of inclusion the majority of students receive services in the general setting. However,

    the structure of the school day and the demands of the inclusion setting provide precious little

    time for explicit instruction for individual students. Students with disabilities need the strategies

    that scaffold learning provides. The strategies of activating students' background knowledge,

    discussing objectives, teacher modeling, mastery, collaborative practice, and independent

    practice are critical when students are expected to transfer this skill to new contexts, such as in

    high stakes testing (Swain, 2006).

    In a 1999 report called Assistance to States, federal regulations defined the term multiple

    disabilities as concomitant impairments ... which cause such severe educational needs that they

    cannot be accommodated in special education programs solely for one of the impairments (as

    cited in Zebehazy, Hartmann, & Durando, 2006, p. 1). Thus, it is crucial to analyze the problems

    involved in the assessment of accurate information on the performance of students with

    disabilities and other impairments. Visual impairments are an example. Students with

    disabilities often have vision impairments that adversely affect their educational performance,

    and many of these students have an additional disability. In 2005, the U.S. Department of

    Education released its 25th Annual Report to Congress, which stated that

    137,768 students from ages 3-21 have multiple disabilities (as cited in Zebehazy, Hartmann, &

    Durando, 2006). Alternative assessments should be considered for students with the most severe

    disabilities, which include students with visual impairments. These alternatives must reflect

    the complex nature of a visual impairment that is combined with additional impairments

    (Zebehazy, Hartmann, & Durando, 2006). It was discovered that less than ten percent of special

    education students who spent the majority of their time in special education classes managed to

    pass the English language portion of the California High School Exit Exam in 2006, and still

    fewer passed the mathematics requirement. This developing pattern of low scores among

    students with disabilities has caused investigation into the CAHSEE to recommend that schools

    provide more support and guidance to IEP teams in order to help them with key placement

    decisions (Kravetz, 2006b).

    Statistics collected from a number of states over time show that

    less than one-third of learning-disabled students can be expected to pass high school competency

    exams. There are some among this group who make good-to-adequate progress, yet they are still

    not at grade level. These have mild learning disabilities. Yet, many have strengths in other areas,

    such as art, drama, music, or athletics, upon which success can be capitalized. There is also the

    group of students, likely the lowest third, whose disabilities are in the moderate range. While

    they are not so profoundly developmentally delayed that they qualify to take alternative

    assessments, their disabilities are more severe. In all probability, they will never come close to

    meeting the stringent standards on which NCLB exams are based. These exams are too densely

    written, too lengthy, and too difficult in level of reading and comprehension to warrant their

    blanket administration, even with such accommodations as extra time and extra breaks. Even

    after the best instruction, coaching, and proper accommodations, mild to moderately disabled

    students frequently give up and resort to marking questions at random. The expectation that all

    students with disabilities can meet the same academic targets set for their nondisabled

    counterparts implies that they are the same. While they are equal, they are not the same. It is an

    involuntary cruelty to demand academic proficiency of the entire population of disabled students

    (Meek, 2006).

    Children with autism spectrum disorders are another example. They pose unique

    challenges to professionals working in the field of developmental pediatrics and early childhood

    intervention because of their restricted patterns of relating, communicating, and behaving. They

    exhibit difficulties following adult directives because of linguistic limitations and problems with

    focusing their attention. They often will not respond to unfamiliar adults or an unfamiliar

    environment. They present challenges to modifying typical routines in order to participate in

    assessments. They have restricted patterns of expressive communication. These and other such

    problems have rendered formal attempts to evaluate these children difficult and unsuccessful by

    professionals who consider them to be untestable. In order for professionals to interact and

    communicate with them, a more sophisticated alternative form of assessment is needed. This

    is a serious concern because there is a rise in the number of children with autism today (Vacca,

    2007).

    Remedies for Students with Special Needs and At-risk Populations

    Possible options have been sought to remedy the challenges of special needs and at-risk

    populations. One alternative that has emerged to meet the needs of this population is Response to

    Intervention (RTI). This method utilizes students' responses to the high-quality instruction to

    which they are exposed for directing educational decisions involving the effectiveness of

    instruction and intervention. Eligibility for special programs and design of individualized

    education programs (IEPs) are determined from there. This alternative is meant to provide early

    intervention for at-risk students while the student waits for evaluation of necessary services and

    support (Casey, Bicard, Bicard, & Nichols, 2008). However, innovations for remediation such as

    this require funding for special training, something for which NCLB has not provided. Then

    again, even if significant growth does occur, it is still unlikely that the student will be able to

    achieve grade-level proficiency in the year the student becomes covered by an IEP (Salend,

    2008). Also, a great need for special populations and at-risk students is the adequate funding for

    teachers to have special training. Unfortunately, NCLB does not provide this funding.

    Furthermore, as a result of the recent turbulence in economics, states are increasingly facing the

    problem of budget cuts in education. Something else that still remains essential for this group is

    the accommodation of more time to show progress. Special needs populations typically are

    slower learners than general education populations. Students at-risk are typically behind already.

    Yet, the expectations for achievement in high stakes testing have increased while the time span

    has been narrowed to proficiency by the year 2014.

    Disadvantages of English Language Learners

    The validity of reporting proficiency of English language learners is debatable. Students

    with limited English proficiency (LEP) are inconsistently labeled from state to state. States with

    higher LEP populations such as California, Texas, Florida and New Mexico all face greater

    challenges in educating their LEP students and meeting AYP than all others. The linguistic

    complexity of state exams demands high levels of English language ability. As a result, many

    schools in these states lose state and federal funding because they cannot report AYP (Smyth,

    2008). Efforts at carrying out requirements for testing English language learners are receiving

    intensified scrutiny as hundreds of schools across the country fail to meet AYP as charged by

    NCLB. Ten school districts in California filed a lawsuit in the state superior court of San

    Francisco. The lawsuit charged that the state is not complying with the federal mandate to test

    English-language learners in a valid and reliable manner. The law says that states must provide

    accommodations for such students in a language and form most likely to yield accurate data on

    what students know and can do. District officials claimed that their schools were being

    penalized for not meeting AYP goals, and implied that they were made to look bad by tests

    that they considered as invalid for these students. They argued that some English language

    learners would score better if standardized tests were administered in their native languages.

    Hilary McLean, California's spokeswoman for State Superintendent of Public Instruction

    O'Connell, responded by saying that the test is designed to disclose whether students have

    learned English language arts. She said it would also be a complex and costly process to devise

    tests in all of the native languages.

    In Reading, Pennsylvania, a similar suit was filed by the 17,000-student school district

    requesting relief from sanctions placed on those schools. The district ultimately filed an appeal

    with the Pennsylvania Supreme Court when relief was rejected. The bottom line, according to the

    U. S. Department of Education, is that states must provide testing accommodations for English

    language learners. However, the native language choice is optional. Most states have elected to

    provide for their LEP populations written translations of the regular academic tests they give to

    their students. Most of these translations have been in Spanish. In March, 2005, Education Week

    stated that of the 36 states that reported complete information for English-language learners on

    AYP to the federal Education Department for December, only Alabama and Michigan met their

    goals in reading and math for the 2003-04 school year. This is strong indication that hundreds of

    schools are not making AYP for English language learners (Zehr, 2005).

    Standardized tests are also culturally biased. High stakes tests are not only a measure of a

    child's intellect and language, but also their culture. Many non-English speakers are failing these

    tests and being held back a grade due to the lack of understanding of the complex English

    language. However, their culture and environment also impact their performance on tests. Gifford

    did a study in 1989, which showed that differences in family background, school experiences,

    previous test exposure and training, experiences with racial discrimination, and individual

    motivation can have significant impact on test scores. Another influence upon students' test

    performance is their culture's attitude towards tests and performance in schools. Some cultures

    are more focused on family and personal values than on testing and academic performance.

    In addition, tracking poses a problem when the typically low scores of these

    minority students label them. In many instances, labeling will only allow them to take certain

    levels of classes. These are classes usually taught at the lowest level where students are not being

    challenged. They are basically being taught how to pass the test and nothing more. This

    syndrome continues to cause them to fall behind and to increase the gap in performance of

    minority students (Phillips, 2006).

    Solutions for English Language Learners

    The Department of Education issued what it termed final regulations for testing

    English learners, effective October 13, 2006. The regulations required that all English language learners be

    included in mathematics testing at the first administration of the test after their arrival in the

    United States. However, math scores are to remain exempt from being calculated into AYP until

    after a student has been in U. S. schools for at least 12 months. After they arrive in U. S. schools,

    English language learners remain exempt from taking the state's reading test for the first

    administration of the test. After that, however, they must take the test and their scores are

    required to be used for accountability purposes. Additionally, states are permitted to put the test

    scores of former English language learners into the pool with other English language learners to

    calculate AYP for two years after they have been reclassified as fluent in English. These students

    cannot, however, be included in the limited English proficiency (LEP) subgroup for reporting

    purposes in state and district report cards. Parents and the public are entitled to have a clear

    picture of the academic achievement of LEP students, and states and districts are required to give

    an annual report of recently arrived LEP students who do not take the state reading tests in their

    school (Zehr, 2006b).

    In October, 2007, the Education Department released a draft of a framework for creating

    English language proficiency standards and tests. To achieve this goal, the Education

    Department even provided a total of $10 million to a consortium of four states to draft new tests.

    By the end of 2007, all states and the District of Columbia had ushered in new English language

    proficiency tests to comply with NCLB requirements for those still learning English. Whereas

    the previous tests were mostly designed to assess only speaking and listening, federal law now

    requires states to assess these students in reading, writing, speaking, and listening. This new

    generation of tests is designed to assess their academic language, or the language needed in order

    to learn subjects in school. Under the federal education law, states are also required to set annual

    measurable achievement objectives for students to progress in English and attain proficiency in

    the language. While experts agree that this is a positive move, more work needs to be done to

    ensure that these tests are valid and meaningful. Inconsistencies concerning implementation of

    these tests raise questions about the validity of the scores. These tests are not necessarily a

    predictor of how well a student will do in a classroom or on other mandatory tests (Zehr, 2007c).

    Also at issue is whether alternative tests being offered by states are comparable to the

    regular state reading and math tests that are used to test other students for AYP under NCLB

    mandates. Although the federal government has not rejected other forms of testing alternatives,

    some states have been forced to stop using their alternatives because they were unable to satisfy

    federal concerns involving comparability. When Indiana stopped using portfolio components of

    its reading and math tests, its population of 22,000 English language learners scored worse than

    the previous year in reading at every grade level tested from grades 3-10. Passing rates

    dropped by between 3 percentage points for 3rd graders and 13 points for 8th graders. North

    Carolina, on the other hand, managed to submit enough evidence in its portfolio component to

    satisfy the federal government. The state of Virginia was told by the federal government that it

    must stop using its test of English language proficiency to calculate AYP in reading for

    beginning English learners. Teddi Predaris, the director of services for English language learners

    in the Fairfax County district, stated that about 10,000 students have been taking this test

    statewide as a substitute for the regular reading test. A request to U. S. Secretary of Education

    Margaret Spellings solicited a one-year delay in changing that state's policy. The Virginia

    education officials were denied a similar request in a December 11 meeting. Four other states,

    Minnesota, Nebraska, New York, and Texas, were also using English language proficiency tests

    as substitutes for regular reading tests. While the others have dropped that practice, Nebraska has

    yet to decide what to do. Overall, states have not been able to demonstrate comparability for such

    tests and the federal government has not made it an option for states to use an English-language-

    proficiency test instead of a regular reading test for purposes of NCLB accountability. The most

    common practice for states to include English learners in large-scale testing is by giving them the

    regular state test with the use of accommodations. These accommodations normally comprise the

    use of a dictionary or extra time. About eight states, including New York and Texas, are also

    providing some tests in languages other than English for certain grades or subjects (Zehr, 2007a).

    Peter Zamora, lawyer and former high school teacher, advocates for the needs of English

    language learners as co-chairman of a diverse coalition of advocacy groups. These groups are

    now advancing the concerns of English learners as Congress considers reauthorization of NCLB.

    As spokesman for his group, he promotes better assessments for English learners through more

    technical assistance in testing issues for the states, and endorses the importance of fully including

    English learners in NCLB's accountability system. While some question the validity of these

    tests for English language learners, he maintains that too much flexibility will mean that districts

    will fail to be serious about meeting the needs of those students. He attributes his advocacy

    policies to his own experiences of teaching high school in Hayward, California, in the 1990s. His

    college-preparatory classes were mostly made up of Anglos, while his basic-skills classes were

    generally composed of Latinos and African-Americans. He attributes this to their economic

    status and the decisions that were made early in their school career. He recognized that the

    basic-skills kids were getting a slowed-down curriculum that created a gulf between them and the

    college-prep students at the end of their careers (Zehr, 2007b). Ironically, half the Americans

    who have studied at the graduate level cannot comprehend complex material with specialized

    language and make inferences from it either. Yet, these are the people who are considered to be

    the success stories of education (Smith, 2008).

    Impact on Education and the Teaching Profession

    In January, 2000, Carol Minnick Santa, president of the International Reading

    Association, wrote an open letter to then U. S. Secretary of Education Richard Riley. Within this

    letter, she acknowledges that the gap between children of poverty and their wealthier

    counterparts is unacceptable, and that he helped to initiate the standards and assessment

    movement for improving reading. In it, however, she charges that standards have become

    synonymous with high stakes assessment. She states that the trends of high stakes testing will

    politicize education, narrow curriculum, and reduce professionalism. Furthermore, they are

    unfair to children, and especially to those of minority families. She describes the need for a more

    definitive testing instrument that could actually capture the many facets of a great student and

    reading, rather than multiple choice tests that assess mostly memorization of factual knowledge

    and discrete skills. She alleges that research does not prove these types of tests to be valid

    indicators of success in school, college, work, or life in general. Yet, they are already being used

    to make life-altering decisions. Not only do early results show that students are not doing well

    and failing, but teachers, especially those in high-poverty areas, are spending more time in test

    preparation at the expense of teaching a broader, more interesting curriculum that encourages

    children to come to school. These teachers are spending more time trying to prepare students for

    multiple choice questions than discussing literature or doing science experiments. She goes on to

    relate that qualitative studies show that next to death, flunking is a child's greatest fear, and that

    conversations with children have been troublesome for her. She specifically mentions that the

    poor and English learners are especially targeted, and that these tests will create even more

    disenfranchisement from school. She writes about the astounding amount of money being spent

    on high stakes testing, and offers more productive ways to spend this money. These alternatives

    are plans such as hiring reading specialists for schools for reading intervention programs, and

    planning staff development programs for all teachers of special education, Title 1, and teachers

    in the general classroom; programs planned so that all involved can have a better understanding

    of multiple ways to teach reading to a child. She expresses her dreams of a future where teachers

    have time to plan, become more professional, and are paid as professionals who are charged with

    preparing our country's most important resource, our children, and where curriculum engages

    students in broad and rich thematic studies. This is the environment in which students and their

    teachers have opportunities to learn, think critically, and create new content (2000).

    On the whole, as pressure to improve students' test scores has increased, teacher job

    satisfaction has decreased. In a survey of school-based teacher educators, or mentor teachers,

    respondents revealed that they felt increased pressure to improve their students'

    standardized test scores. The majority of these teachers worked in grades 1-5. A few represented

    kindergarten and 6th grade, and some were specialists, such as reading specialists. The pressures

    came from school boards, principals, and the media. On average, these elementary

    teachers are required to administer seven different local, state and national standardized tests in

    an academic year. Another variable relating to teacher job satisfaction that was considered

    involved changes in job climate. Statistics show that significant relationships exist between job

    satisfaction and such things as control over classroom curriculum, perceived ability to meet

    individual needs, teachers' effectiveness as educators, and changes in teachers' control of the

    daily schedule. Curriculum and instruction are a primary aspect of assessment. Whether

    intentional or not, curriculum is largely influenced by NCLB implementation and accountability.

    More than 82 percent of these teachers focused intensely on covering test objectives in their

    instruction as well as adjusting instructional plans according to students' test scores. In 68

    percent of the classrooms, significant time was regularly spent on test preparation activities.

    Furthermore, students practiced the various test question formats and were given worksheets that

    reviewed anticipated test content. Attention to science had decreased for 66 percent of the

    teachers, while attention to social studies had decreased for 60 percent. Emphasis on preparing

    students to test well occurred in 93 percent of the classrooms, while student choice in curricular

    decisions decreased in more than 50 percent of the classrooms. The hard drive for test results had

    caused all other activities, such as art, to all but disappear. Mentor teachers particularly noted a

    pressing need for an emphasis on assessment literacy in initial teacher preparation. Assessment is

    the process of gathering evidence about a student's understanding of content and making

    constructive inferences from that evidence. Evaluation is determining the worth of something

    upon careful examination and judgment, and this is the merit of assessment. There is a mounting

    need to help new teachers manage the stresses of tests and accountability, while at the same time

    doing what is best for students' education. Public education is in need of teachers who are able to

    assess the strengths of each child, while still providing students with an engaging learning

    environment of rich content (Snow-Gerono & Franklin, 2006).

    A qualitative study by Skrla (2001) suggests that teacher expectation levels relating

    to the abilities of minority youth in their classrooms have been influenced in a positive way.

    Skrla concluded that the accountability factor had changed teachers' expectations for the

    achievement of their students for the better, and that these expectations provoked improvements

    in both the quality and equitability of classroom instruction. Other researchers, however, have

    illustrated how accountability policies work against teachers. Anderson and Ginsberg (1998)

    noted that such policies monitor, scrutinize, and finally end up controlling teachers, their

    teaching, and their interaction with their students. The assumption that teachers do not produce

    unless pushed, forced or coerced has not escaped the attention of teachers, and especially those

    of minority populations. In fact, studies by McNeil in 2000, Pennington in 2004, and Sloan in

    2004 and 2006, have concluded that high stakes testing has not only placed teachers of minority

    youth under emotional strain, but that many of them contemplate leaving the profession.

    Pennington and Sloan describe the move toward high stakes accountability as a reconfiguration

    of teachers' professional knowledge of curriculum, pedagogy, and of students in the classroom.

    Pennington explains in detail how the increased use of standardized testing to enforce

    accountability actually undermines teachers' knowledge of more complex and culturally

    responsive positions of literacy. Teachers end up abandoning these views of literacy to meet the

    demands of high stakes testing. Conceptions of literacy that are multifaceted, socially

    constructed, culturally responsive, and highly contextual are reduced to mere test reading and

    test writing. As a result, the quality of classroom literacy declines, as does teacher expectation

    for minority populations.

    McNeil described the declining process in 2000 as the de-skilling and deprofessionalizing

    of teachers. She depicted the harmful effects on curriculum and pedagogy as the collateral

    damage that is an inherent flaw of the new school reform. She concluded that the results of the

    policies led to a narrow, proficiency-based curriculum, which led to systematic defensive

    teaching. With this style of teaching, teachers asked less of their students aside from that which

    led to satisfying district mandates. In her 2004 research, Pennington describes those teachers as

    devoting increasing amounts of instructional time to test taking skills, and test preparation with

    commercially developed materials. These materials resembled the form and content of state tests,

    and paralleled classroom instruction to match objectives on the state test. Yet, these techniques to

    bolster test scores and earn schools high ratings are ultimately polluted. In the end, the test scores

    serve to undermine the value and validity of the inferences made about students' literacy skills.

    She does not, however, investigate the fact that some schools still received an unacceptable

    rating from the state, or explore the overall quality of teachers' classroom practices. The lack of

    counterexamples leaves the net effect underexplored.

    Teachers' varied experiences and responses to accountability policies are a pivotal

    measure through which to gain perception and discernment of the true effects. Even so,

    Valenzuela described in 1999 and 2000 how the absence of more authentic expressions of care

    from teachers was alienating them from many of their students of Mexican origin. She explained

    that standardization and high stakes testing undermined and even dismissed students' culture and

    their conception of what should be a meaningful education. She remarks that reducing these

    youth to test scores reduces them to becoming objects for the mere purpose of shaping them into

    the dominant, monocultural mold (as cited in Sloan, 2007).

    Today, the demands of NCLB and objectives of IDEA for a least restrictive environment

    (LRE) have led to a strong movement for inclusion in the classroom. Both general educators and

    special educators are feeling inadequately prepared. Inclusion is promoting larger class sizes

    with students being crammed into classrooms, and the profession of teaching has reached a high

    stress level. More teachers are focusing on simply reaching the requirements that the standards

    mandate, and this trend is also occurring in IEPs (Jara, 2006). The apparently conflicting

    mandates of IDEA 2004 and NCLB 2002 also continue to pose a growing dilemma for special

    education professionals. Designing IEP programs for students with significant disabilities

    is challenging for teachers and IEP teams. These programs must assure access to the general

    curriculum, yet provide highly individualized instruction that is responsive to varied needs.

    Moreover, appropriate and valid assessment strategies for student progress and evaluation of

    goals must continue to be maintained within statewide accountability systems under NCLB. One

    possible solution may be that the development of specific standards-based objectives for these

    populations could provide a means for reconciling all of these requirements. At any rate, these

    students must be adequately supported if state accountability measures demand inclusion for

    them (Lynch & Adams, 2008).

    Furthermore, of the hundreds of failing state schools that are under restructuring in

    California, few have been able to exit restructuring by pulling up satisfactory test scores. The

    Center on Education Policy based in Washington determined in a report that only 33 of the 700

    schools in restructuring in the 2006-07 year made enough progress to exit what is known as

    program improvement. That number accounts for about 5 percent of the schools that were in

    restructuring in California. The large number of schools in that phase makes California a perfect

    model for research. This number has risen to 1,013 for the year 2008. Interestingly, California

    actually began implementing its standards-based accountability system much earlier than most

    states. What is more, it even identified schools for improvement before NCLB became law. A

    report, titled Managing More Than a Thousand Remodeling Projects, said that federal

    restructuring strategies rarely have helped schools to raise student achievement enough to attain

    AYP to exit restructuring. It stated that the findings in California indicate the need to rethink

    restructuring nationwide (as cited in Jacobson, 2008).

    California Superintendent of Public Instruction Jack O'Connell is preparing a plan to

    intervene in 98 school districts that are facing sanctions because their students have not met their

    targets for at least five years. This can range from merely revising local-educational agency plans

    for districts that came close to their goals, to the extreme of completely abolishing failing

    districts. Schools that are in program improvement have several options presently available to

    them. One involves contracting with an outside organization to run the school. Another is to

    become a charter school. They can also choose to replace staff, turn school operation over to the

    state, or any other restructuring effort that would show promise of significant changes. The

    Center on Education Policy (CEP) study shows that the last option was overwhelmingly

    California's choice in 2006-07. The option of choice might include adopting a new curriculum,

    improving technology, or even having teachers reapply for their jobs. Mr. O'Connell referred to

    his new plan as a system of triage that would initially assign technical assistance teams to

    facilitate the lowest-performing of these districts. Then, the state would capitalize on some pilot

    intervention projects already in progress elsewhere in the state. This leaves many wondering who

    will pay for this plan when the state is already facing a $14 billion deficit for fiscal year 2009.

    The CEP study showed that most districts and school administrators were optimistic about their

    schools making AYP. Others reported that they would be using strategies proven to be successful

    for other schools. Perhaps it was most aptly put by Merrill Vargo, the executive director of

    Springboard Schools, a school improvement organization in San Francisco. She wondered if

    anyone had a plan instead, that might help these schools to better serve the kids who get up in the

    morning to go to these schools (as cited in Jacobson, 2008).

    Overall Achievement toward Proficiency

    Several studies done in 2002 suggested early on that high stakes testing might actually

    hamper academic achievement among students. Audrey L. Amrein, a researcher at Arizona State

    University in Tempe, produced a report with ASU education professor David C. Berliner, which

    spanned the results over two decades of the initial federal thrust for accountability in education.

    The studies described how efforts of states to link major penalties to student test scores were

    generating little academic improvement. They speculated that, in some cases, high stakes testing

    might be goading struggling students to seek alternatives to the traditional path to a high

    school diploma. Research showed that after adopting new high stakes testing policies, 19 of 28

    states witnessed their 4th grade mathematics scores decrease on the federally supported NAEP,

    compared to the national average. Eighteen states saw increases compared to the norm on the 8th

    grade NAEP math tests. While states' test scores are used to calculate AYP, NAEP dat