MSSAA District Determined Measures October 1, 2015 Dr. Deborah Brady [email protected].
Agenda Defining DDMs Growth Versus Achievement DDM Quality Criteria
Lunch
Some Good, Bad, and Mediocre Examples Determining Validity and Reliability Next Class—Using the Results, CASW, Calibration, Protocols, Excel
On-line access to materials now at: https://wikispaces.com/join/HQ4FDRM
Code HQ4FDRM
1. What are DDMs?
2. Assessment Selection & Development
3. Piloted a few DDMs
4. June Report Data; Waivers done
5. Ready to Interpret scores into L, M, H
• The state has given us too much to do
• We can’t possibly do DDMs plus teacher evaluation plus PARCC plus the Common Core!
• Which tests that we give now can we use or adapt?
• Defining alignment with the curriculum
• Defining Rigor
• Defining good assessments
• Defining the next steps for your district
• Looking at the results
• Do they reflect what happened in the classroom?
• Can we change these pilots
• Where do we keep this data?
• Name of DDM
• Kind of assessment
• Grade level
• Course
• Indirect DDMs for nurses, SISP, administrators
• Request Waivers
• What’s high growth?
• What’s a year’s growth?
• What’s low growth?
• Who scores this data?
• Where do we store this data?
• Can we change these assessments?
Where are you on this journey?
Ready to Interpret Results into Low, Moderate, High Growth
Deciding which Assessments to Use
June Report; Waiver Requests
all set
Piloted a few DDMs
What are DDMs?
District Determined Measures
DEFINITION
DDMs are defined as:
“Measures of student learning, growth, and achievement related to the Curriculum Frameworks, that are comparable across grade or subject level district-wide”
TYPES OF MEASURES
Portfolio assessments
Approved commercial assessments
MCAS ELA 4-8; MCAS Math 4-8
District developed pre- and post- unit and course common assessments
Capstone projects
Quality Assessments
Almost any kind of assessment can work
Must be a "substantive" assessment (DESE)
Aligned with (at least 2) standards of the Frameworks or 1 Power Standard, and/or local standards
Rigorous (appropriately challenging; locally defined)
Consistent with K-12 DDMs in substance, alignment, and rigor
Consistent with the district's values, initiatives, expectations
Measures growth (to be contrasted with achievement) and shifts the focus of teaching:
From achievement to growth for all students
From teaching to learning
From the teacher to the learner
Timeline for DDMs and Impact Ratings Implementation
Last Year
District-wide training, development of assessments and piloting of 5 required DDMs (Primary ELA or math; Elementary ELA or Math, MS Math, HS Writing to Text plus un-tested area)
June 2014 Report: List of DDMs from District plus any waivers granted by DESE.
2014-2015
Year 1 (1st Cohort): Non-waivered areas; scores are divided into High, Moderate, and Low and stored locally
Year 2 (2nd Cohort) Areas waivered by DESE based on June report
2015-2016
Second year data is collected for 1st Cohort.
First year of DDM data is collected for the 2nd Cohort.
2016-2017
October 2016: The first DDM rating of High, Moderate, or Low is given to the 1st Cohort. The Impact Rating, linked to the educator's EPIMS number, is sent to DESE with the October 1 Report, based on 2 years of data for two DDMs.
A second year of data is collected for 2nd Cohort. Their Impact Rating will be calculated and sent to DESE by October 2017.
June Report Data and Waivers NAME of the DDM (Math 6 Proportions; ELA 7 Argument)
SOURCE; District Developed, Commercial, Open Source
GRADE (Pk-2, 3-5, 6-8, HS, Multiple)
CONTENT AREA or ROLE Assessment Type
TYPE OF ASSESSMENT
DIRECT: Pencil and Paper (on demand), Performance/Project, Writing Prompt/Essay, Portfolio, Hybrid (a combination of any 2)
INDIRECT: mainly for administrators and SISP (nurses, guidance, OT/PT, principals, directors)
Number of Educators using this DDM: 1, 2, 3-5, 6-10, 11-20, 20-40, 40+
All waivers that were requested were granted. Varies by district.
NOTE: All of the grades, teacher names, assessments are internal to your district.
DDM Quick Review Must measure growth, not achievement
Growth equals one year’s progress
Each educator will have at least 2 DDMs
Teachers’ DDMs will be assessments of student work called a DIRECT Measure Most growth will be based on a pre-test before teaching and a post-test after teaching
MCAS SGP for grades 4-8 for ELA and math ONLY can be used (not grade 10)
Scores (100%) or Rubrics can be used to measure progress
One measure must not be MCAS; it must be a District Determined Measure, which can include local assessments and normed assessments (Galileo, DRA, MAP, SAT). However, self-contained classroom teachers may use both ELA and math MCAS SGPs if the district makes that decision.
Some SISPs, administrators, nurses may have to have 1 INDIRECT Measure (a few who do not work with students may have 2 indirect) INDIRECT Measures are like SMART goals—attendance, graduation rates, MCAS
Performance & Impact Ratings
Performance Rating
Ratings are obtained through data collected from observations, walk-throughs and artifacts Exemplary Proficient Needs Improvement Unsatisfactory
Impact Rating ( October 1, 2016 for 1st Cohort)
Ratings are based on trends and patterns in student learning, growth and achievement over a period of at least 2 years Data gathered from DDM’s and State-wide testing
High
Moderate
Low
Year 1, Year 2: PROFESSIONAL JUDGMENT
There are no weights or percentages that dictate how an evaluator must interpret pattern and trend data to determine a Student Impact Rating for an individual educator.
Analyzing the 4 DDM Data Points
Impact Ratings 4 data Points
Example Overall Rating for Student Impact
If more than half of the ratings point to the same rating, that is the rating:
Low, Moderate, Moderate, Moderate = Moderate
If there is no clear result, the default is Moderate:
L, L, M, M = Moderate (the default without compelling evidence otherwise)
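The four-data-point rule above can be sketched in code. This is an illustrative aggregation only, not DESE-official logic; the regulation leaves final ratings to professional judgment.

```python
# Sketch of the "more than half, else Moderate" rule applied to
# four yearly DDM data points ('L', 'M', 'H').
from collections import Counter

def overall_impact(ratings):
    """Return the majority rating, or 'M' when there is no clear result."""
    label, count = Counter(ratings).most_common(1)[0]
    if count > len(ratings) / 2:   # more than half of the points agree
        return label
    return "M"                     # no clear result: default to Moderate

print(overall_impact(["L", "M", "M", "M"]))  # M (3 of 4 agree)
print(overall_impact(["L", "L", "M", "M"]))  # M (tie: default)
```

An evaluator can still override this default with compelling evidence, which is why the slide stresses professional judgment over any fixed formula.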
Summative Rating by Rating of Impact on Student Learning (Low / Moderate / High):
Exemplary: 1-yr Self-Directed Growth Plan (Low impact) or 2-yr Self-Directed Growth Plan (Moderate or High impact)
Proficient: 1-yr Self-Directed Growth Plan (Low impact) or 2-yr Self-Directed Growth Plan (Moderate or High impact)
Needs Improvement: Directed Growth Plan
Unsatisfactory: Improvement Plan
Performance and Impact Ratings
Massachusetts Department of Elementary and Secondary Education
Performance Rating
Impact Rating
Results of Impact Ratings on the Evaluation Cycle
Exemplary + High or Moderate: District negotiated result may include recognition. No meeting required.
Proficient + High or Moderate: District negotiated result may include eligibility for additional roles (mentoring). No meeting required.
Exemplary or Proficient + Low: 1-year Self-Directed Plan. Required meeting when the evaluator confirms the rating; educator and evaluator analyze the data; Educator Plan goals may be impacted. If the evaluator judges that the rating should be higher, the change must be approved by the superintendent.
Needs Improvement + High, Moderate, or Low: Directed Growth Plan.
Unsatisfactory + High, Moderate, or Low: Improvement Plan.
Five Requirements of DDMs (DESE)
1. Must measure growth
2. Employ a common administration procedure
3. Use a common scoring process
4. Translate these assessments to an Impact Rating (High-Moderate-Low)
5. Assure comparability of assessments within the school (rigor, validity).
Comparable Assessments
1. Comparable across schools and within grades
Example: Teachers with the same job (e.g., all 5th grade teachers or all teachers of English 8) give the same assessment
Where possible, measures are identical
Easier to compare identical measures
But the district can determine whether or not these identical measures provide meaningful information about all students
Exceptions: When might assessments not be identical?
Different content (different sections of Algebra I)
Differences in untested skills (reading and writing on a math test for ELL students)
Other accommodations (fewer questions to students who need more time)
NOTE: MCAS modifications and accommodations based on IEPs and 504s are appropriate
Co-Taught Classes and Teacher of Record
• In some co-taught classes or when teachers share in the results of the assessment because of additional time spent in a pull-out class, districts may need to determine the teacher or teachers responsible for content covered by statewide testing
• Co-teachers may share equally if both co-teach all students or the district may determine one teacher as the teacher of record.
Table Talk
Communication Plan
DDMs
Quality Assessments
Roster Verification When DDM Results Have Been Calculated
Educators confirm the accuracy of their rosters
Student must be on roster by October 1, and the student must remain on roster through last day of DDM testing.
Student must be present for 90% of instructional time.
Direct Measures
Direct measures of student learning, growth, or achievement provide information about student growth from student work.
Most classroom teachers will have 2 direct measures
Indirect Measures for Guidance, Some Caseload Educators, Some Administrators
Indirect measures of student learning, growth, or achievement provide information about students from means other than student work.
These measures may include student record information (e.g., grades, attendance or tardiness records, or other data related to student growth or achievement such as high school graduation or college enrollment rates).
ESE recommends that at least one of the measures used to determine each educator’s student impact rating be a direct measure and MCAS SGP if available and appropriate.
Indirect Measure Examples Consider the teams that you are a part of, for example, what many schools call the “child study team” in which
many specialists participate, or all of your job-alike colleagues.
Discuss the goals that you have as a district group or as a school-based group. For example, you may be working on increasing the attendance of a group of students who are frequently out or tardy. Or, you may be working to return students to class quickly. Or, you may be focusing on working with a few families or students to support students with, for example, school anxiety.
Note that the measures can focus on a small group and do not have to include all students. For example, students with emotional problems that prevent them from participating fully can be your subgroup. For those educators with a small caseload, supporting your caseload to return to class and to participate fully in the class may be your goal.
Select a goal on something that is already an initiative or is recognized as something worthwhile for students. This goal needs to connect indirectly to student growth. Making sure students are in class and are able to participate fully is an appropriate goal.
Social-emotional growth is an appropriate direct measure.
Develop a method for measuring this goal. For example, nurses in one district realized that the software that they used could also track their goal of quick returns of students to classes. In another district, a school-based team decided that their "Student Support Team" had been reviewing numbers of classroom incidents and absences, but had not put aside time to study the most at-risk students fully. Their goal was to change their weekly process so that time was set aside to discuss a student more thoroughly and to develop an action plan for that student. Their record was their weekly agenda, and they counted the increased number of action plans and tracked the progress of the most at-risk students.
Indirect Measures More Specifics:
School-based student support teams: Increase attendance, decrease tardiness, increase the number of families involved with their students’ success plans, increase the in-depth studies of students and track their progress, decrease the number of behavior incidents.
Librarians: Increase the number of classes that work on research and projects in the library. Increase the number of teachers with whom you work to support specific units with materials.
Behavior specialists, speech pathologists, social workers, school psychologists: Increase the number of students who participate in class more fully. You can look at a sub-set of your students or caseload. For example, a behavior specialist was going to count the number of responses to a “non-classroom adult” that students made as they walked to gym or to lunch. This is a DIRECT MEASURE.
Example: A group of school psychologists were concerned that many students were referred to Special Education testing before interventions had taken place. They wanted to make sure the "Student Study Team" processes were uniform. They provided an instrument for referral that was consistent within the district. A goal was to increase the number of times the instrument was used to provide appropriate interventions before a student was considered for assessment for special education.
A group of speech pathologists used a developmental scale (like a holistic rubric) to measure the improvement of students’ speaking skills. They assessed these skills in their classroom and in general education classrooms. This is a DIRECT MEASURE.
If IEP goals measure student academic or social growth, attainment of success in working toward these goals can be used as DIRECT MEASURES. Thus, achieving IEP growth goals for my caseload can be a goal. Please note that DESE is still looking at subgroup sizes under 7 as being too small for growth measurement.
Guidance counselors set increasing the number of applications for college as a goal. Their comprehensive initiative included visiting classrooms and encouraging more students to take the PSAT, to work with the administration to provide SAT workshops, and to offer workshops through the school in writing college application letters.
Table Talk
Direct and Indirect Measures
Teacher of Record
Roster Verification
Growth versus Achievement
288 to 244/ 25 SGP
230 to 230/ 35 SGP
214 to 225/ 92 SGP
"Cut Scores" for MCAS SGP
[Figure: SGP scale from 0 (lower growth) to 100 (higher growth), showing percent of students; typical growth (one year's growth) sits in the middle. Classroom cut points: 35 and 65. Whole-grade cut points: 40 and 60.]
MCAS SGP Cut Scores
Teachers: Low = 35* or lower; Moderate = greater than 35 but less than 65; High = 65 or higher
Administrators: Low = 40 or lower; Moderate = greater than 40 but less than 60; High = 60 or higher
• More latitude is given to teachers because of the statistically small numbers in a classroom (at least 20 students).
• Administrative "cut scores" for Low and High are based upon the entire grade's scores
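The cut scores above translate directly into a lookup. A minimal sketch, using the teacher and administrator thresholds as stated on this slide:

```python
# Illustrative only: DESE's MCAS SGP cut scores as given above.
# Teachers: Low <= 35, High >= 65; administrators: Low <= 40, High >= 60.
CUTS = {"teacher": (35, 65), "administrator": (40, 60)}

def sgp_rating(median_sgp, role="teacher"):
    """Classify a median SGP into Low, Moderate, or High for a role."""
    low_cut, high_cut = CUTS[role]
    if median_sgp <= low_cut:
        return "Low"
    if median_sgp >= high_cut:
        return "High"
    return "Moderate"

print(sgp_rating(35))                   # Low (35 or lower)
print(sgp_rating(50, "administrator"))  # Moderate
```

The wider Moderate band for teachers reflects the smaller, statistically noisier classroom samples the slide mentions.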
Median student growth percentile
Last name / SGP
Lennon 6
McCartney 12
Starr 21
Harrison 32
Jagger 34
Richards 47
Crosby 55
Stills 61
Nash 63
Young 74
Joplin 81
Hendrix 88
Jones 95
Imagine that the list of students to the left are all the students in your 6th grade class. Note that they are sorted from lowest to highest SGP.
The point where 50% of students have a higher SGP and 50% have a lower SGP is the median; that value is the median SGP for the 6th grade class.
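As a quick check of the worked example, the median of the 13 listed SGPs is the 7th sorted value:

```python
# The 13 students' SGPs from the roster above, already sorted low to high.
from statistics import median

sgps = [6, 12, 21, 32, 34, 47, 55, 61, 63, 74, 81, 88, 95]
class_median = median(sgps)
print(class_median)  # 55: half the class is above, half below
```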
Using median student growth percentiles: growth by achievement for schools
[Figure: scatter plot of schools, ELA median SGP, 2009 (x-axis, 0-100) against ELA % proficient or advanced, 2009 (y-axis, 0-100). Quadrants: higher achieving/lower growing; lower achieving/lower growing; higher achieving/higher growing; lower achieving/higher growing.]
Student growth percentiles
Each student's rate of change is compared to other students with a similar test score history ("academic peers"). The rate of change is expressed as a percentile.
How much did John improve in mathematics from 5th grade to 6th grade, relative to his academic peers? If John improved more than 65 percent of his academic peers, then his student growth percentile would be 65.
NOTE: Differences of fewer than 10 SGP points are likely not educationally meaningful.
Growth and MCAS SGP
The median SGP must be used when a teacher has 20 or more students (altogether) in a content area
Median SGPs for 8-19 students have validity and may be used if the district determines this is appropriate
More than one educator (a classroom teacher, a support teacher, a curriculum supervisor, and a supervising administrator) may be considered responsible for a content-area SGP. Different students may be in their respective rosters.
Galileo (software) computes growth in the same way as the MCAS (subgroup comparisons).
Figure 1: Theoretical illustration of Low Growth, Moderate Growth, and High Growth.
Beginning to Consider a Plan: Core Areas (MCAS SGP; Direct and Indirect Measures)
Grade | ELA | Math | Science | Social Studies
12 | CA/CA | CA/CA | CA/CA | CA/CA
11 | CA/CA | CA/CA | CA/CA | CA/CA
10 | CA/CA | CA/CA | CA/CA | CA/CA
9 | CA/CA | CA/CA | CA/CA | CA/CA
8 | MCAS SGP/CA | MCAS SGP/CA | CA/CA | CA/CA
7 | MCAS SGP/CA | MCAS SGP/CA | CA/CA | CA/CA
6 | MCAS SGP/CA | MCAS SGP/CA | CA/CA | CA/CA
5 | MCAS SGP/CA | MCAS SGP/CA | |
4 | MCAS SGP/CA | MCAS SGP/CA | |
3 | CA/CA | CA/CA | |
2 | CA/CA | CA/CA | |
1 | CA/CA | CA/CA | |
K | CA | CA | |
"Sample Plan" for Core Areas Only
"Sample Plan" for Other Educators (Grades K-12: CA/CA)
Singletons, Art, Music, Technology, PE (by grade or grade-spans): Specialists measure goals; can develop a K-12 rubric (e.g., Speech, PE); adaptation of the SMART goal process; measure a goal that has an impact on student growth (attendance, college applications, technology growth)
Special Education specialists, co-teachers, substantially separate: Inclusion co-teachers can "share" scores with General Ed teachers, or, if their students' goals are substantially different, the assessments can be modified or can focus on the goal of inclusion
Central Office, psychologists: Indirect Measures (IM)
Administrators (principals, assistant principals, superintendent): MCAS growth, either ELA or math, plus IM
Table Talk
Initial Planning: Core Subjects; Specialists (PE, Art, Music, Technology); SPED; Singletons (HS mainly); Indirect (Specialists, Central Office); Principals, Assistant Principals, Superintendent, Curriculum Directors, Department Heads
Local Cut Scores Are District Determined (for 100 students)
The teacher score is based on the MEDIAN score of her class for each DDM; student difference scores are sorted low to high, and the district sets the cut-score percentages.
Pre-test | Post-test | Difference | Sorted differences
20 | 35 | 15 | 5 (cut score: LOW growth, lowest ___%)
25 | 30 | 5 | 15
30 | 50 | 20 | 20
35 | 60 | 25 | 25
35 | 60 | 25 | 25 (median teacher score)
40 | 70 | 30 | 25 (median teacher score)
40 | 65 | 25 | 25
50 | 75 | 25 | 30
50 | 80 | 30 | 30 (top 20%)
50 | 85 | 35 | 35 (cut score: HIGH growth, highest ___?)
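The local cut-score process above can be sketched in a few lines. This uses the ten pre/post pairs from the slide; the cut-score percentages (the blank "___%") remain a district decision, so none are hard-coded here:

```python
# Sketch: compute each student's pre/post gain, sort the gains, and
# take the class median as the teacher's score for this DDM.
from statistics import median

pre_post = [(20, 35), (25, 30), (30, 50), (35, 60), (35, 60),
            (40, 70), (40, 65), (50, 75), (50, 80), (50, 85)]

gains = sorted(post - pre for pre, post in pre_post)
teacher_score = median(gains)
print(gains)          # sorted differences, low to high
print(teacher_score)  # 25.0: the median gain for the class
```

Districts would then mark the lowest and highest percentage bands of the sorted gains as the Low- and High-growth cut scores.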
Fountas and Pinnell Scale: P to Q; N to P; D to K
Fountas and Pinnell growth for each student is based on 10 months of growth (10 months = one year). Second grade student levels, from the beginning to the end of the year:
End-of-year Level Q (Above Benchmark): P, Q = 7 months of growth = LOW growth
End-of-year Level P (At Benchmark): N, O, P = 10 months of growth = MODERATE growth
End-of-year Level K (Below Benchmark): D, E, F, G, H, I, J, K = 17 months of growth = HIGH growth
Per-student growth scores for six classes (Teachers A-F; individual scores omitted):
Median: Teacher A 6.5; Teacher B 9; Teacher C 12; Teacher D 10; Teacher E 16; Teacher F 12
Cut scores: below 6.5 = LOW; between 6.5 and 16 = Moderate
Ratings: A = LOW; B, C, D, E, F = Moderate
All Classes: Whole-Grade Distribution (103 third graders; individual DDM scores, sorted low to high, omitted)
6.5 = cut score, lowest 15%
12.0 = median for whole Grade 3 DDM
16 = cut score, highest 15%
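Deriving cut scores from a whole-grade distribution, as in the Grade 3 example above, can be sketched as follows. The 15% tails mirror that example but are not a mandate, and the scores below are hypothetical, not the actual Grade 3 data:

```python
# Sketch: given a sorted whole-grade score list, find the values that
# bound the lowest and highest tails (15% by default, per the example).
def cut_scores(sorted_scores, tail=0.15):
    n = len(sorted_scores)
    k = int(n * tail)                 # number of students in each tail
    low_cut = sorted_scores[k - 1]    # top of the lowest tail
    high_cut = sorted_scores[n - k]   # bottom of the highest tail
    return low_cut, high_cut

grade = sorted(range(1, 101))         # 100 hypothetical scores, 1..100
print(cut_scores(grade))              # (15, 86)
```

With the district's real data, the same computation yields the 6.5 and 16 cut scores shown on the slide.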
Details Rubric (scores 1-4)
1: No improvement in the level of detail. One is true:
* No new details across versions
* New details are added, but not included in future versions
* A few new details are added that are not relevant, accurate, or meaningful
2: Modest improvement in the level of detail. One is true:
* There are a few details included across all versions
* Many added details are included, but they are not included consistently, or none are improved or elaborated upon
* There are many added details, but several are not relevant, accurate, or meaningful
3: Considerable improvement in the level of detail. All are true:
* There are many examples of added details across all versions
* At least one example of a detail that is improved or elaborated in future versions
* Details are consistently included in future versions
* The added details reflect relevant and meaningful additions
4: Outstanding improvement in the level of detail. All are true:
* On average there are multiple details added across every version
* There are multiple examples of details that build and elaborate on previous versions
* The added details reflect the most relevant and meaningful additions
Example taken from Austin, a first grader from Answer Charter School in Boise, Idaho. Used with permission from Expeditionary Learning. Learn more about this and other examples at http://elschools.org/student-work/butterfly-drafts
Holistic Rubrics Show Progress across a Scale, Continuum, Descriptors
Preconventional (Ages 3-5)
Relies primarily on pictures to convey meaning.
Begins to label and add "words" to pictures.
Writes first name.
Emerging (Ages 4-6)
Uses pictures and print to convey meaning.
Writes words to describe or support pictures.
Copies signs, labels, names, and words (environmental print).
Developing (Ages 5-7)
Writes 1-2 sentences about a topic.
Writes names and familiar words.
Beginning (Ages 6-8)
Writes several sentences about a topic.
Writes about observations and experiences.
Writes short nonfiction pieces (simple facts about a topic) with guidance.
Expanding (Ages 7-9)
Writes short fiction and poetry with guidance.
Writes a variety of short nonfiction pieces (e.g., facts about a topic, letters, lists) with guidance.
Writes with a central idea.
Writes using complete sentences.
Bridging (Ages 8-10)
Writes about feelings and opinions.
Writes fiction with clear beginning, middle, and end.
Writes poetry using carefully chosen language with guidance.
Writes organized nonfiction pieces (e.g., reports, letters, and lists) with guidance.
Begins to use paragraphs to organize ideas.
Uses strong verbs, interesting language, and dialogue with guidance.
Fluent (Ages 9-11)
Begins to write organized fiction and nonfiction (e.g., reports, letters, biographies, and autobiographies).
Develops stories with plots that include problems and solutions with guidance.
Creates characters in stories with guidance.
Writes poetry using carefully chosen language.
Begins to experiment with sentence length and complex sentence structure.
Varies leads and endings with guidance.
Uses description, details, and similes with guidance.
Uses dialogue with guidance.
Proficient (Ages 10-13)
Writes persuasively about ideas, feelings, and opinions.
Creates plots with problems and solutions.
Begins to develop the main characters and describe detailed settings.
Begins to write organized and fluent nonfiction, including simple bibliographies.
Writes cohesive paragraphs including reasons and examples with guidance.
Uses transitional sentences to connect paragraphs.
Varies sentence structure, leads, and endings.
Begins to use descriptive language, details, and similes.
Uses voice to evoke emotional response from readers.
Begins to integrate information on a topic from a variety of sources.
Connecting (Ages 11-14)
Writes in a variety of genres and forms for different audiences and purposes independently.
Creates plots with a climax.
Creates detailed, believable settings and characters in stories.
Writes organized, fluent, and detailed nonfiction independently, including bibliographies with correct format.
Writes cohesive paragraphs including supportive reasons and examples.
Uses descriptive language, details, similes, and imagery to enhance ideas independently.
Begins to use dialogue to enhance character development.
Incorporates personal voice in writing with increasing frequency.
Independent
Writes organized, fluent, accurate, and in-depth nonfiction, including references with correct bibliographic format.
Writes cohesive, fluent, and effective poetry and fiction.
Uses a clear sequence of paragraphs with effective transitions.
Begins to incorporate literary devices (e.g., imagery, metaphors, personification, and foreshadowing).
Weaves dialogue effectively into stories.
Develops plots, characters, setting, and mood (literary elements) effectively.
Begins to develop personal voice and style of writing.
Criterion Referenced Rubric and Raw Scores or % of 100
4(25) = 100
4(22) = 88
4(18) = 72
4(15) = 60
Example: 25 + 18 + 22 + 15 = 80%
AP Rubric of Rubrics: Prose Analysis (holistic; 9 levels give students room to improve)
9-8 Answers all parts of the question completely. Using specific evidence from the work and showing how that evidence is relevant to the point being made. Fashions a convincing thesis and guides reader through the intricacies of argument with sophisticated transitions. Demonstrates clear understanding of the work and recognizes complexities of attitude/tone. Demonstrates stylistic maturity by an effective command of sentence structure, diction, and organization. Need not be without flaws, but must reveal an ability to choose from and control a wide range of the elements of effective writing.
7-6 Also accurately answers all parts of the question, but does so less fully or effectively than essays in the top range. Fashions a sound thesis. Discussion will be less thorough and less specific, not so responsive to the rich suggestiveness of the passage or precise in discussing its impact. Well written in an appropriate style, but with less maturity than the top papers. Some lapses in diction or syntax may appear, but demonstrates sufficient control over the elements of composition to present the writer’s ideas clearly. Confirms the writer’s ability to read literary texts with comprehension and to write with organization and control.
5 Discusses the question, but may be simplistic or imprecise. Constructs a reasonable if reductive thesis. May attempt to discuss techniques or evidence in the passage, but may be overly general or vague. Adequately written, but may demonstrate inconsistent control over the elements of composition. Organization is attempted, but may not be fully realized or particularly effective.
4-3 Attempts to answer the question, but does so either inaccurately or without the support of specific evidence. May confuse the attitude / tone of the passage or may overlook tone shift(s) or otherwise misrepresent the passage. Discussion of illustrations / techniques / necessary parts of the prompt may be omitted or inaccurate. Writing may convey the writer’s ideas, but reveals weak control over diction, syntax, or organization. May contain many spelling or grammatical errors. Essays scored three are even less able and may not refer to illustrations / techniques at all.
2-1 Fails to respond adequately to the question. May misunderstand the question or the passage. May fail to discuss techniques / evidence used or otherwise fail to respond adequately to the question. Unacceptably brief or poorly written on several counts. Writing reveals consistent weakness in grammar or other basic elements of composition. Although may make some attempt to answer the question, response has little clarity and only slight, if any, evidence in its support. Although the writer may have made some attempt to answer the prompt, the views presented have little clarity or coherence; significant problems with reading comprehension seem evident. Essays that are especially inexact, vacuous, and /or mechanically unsound should be scored 1.
0 A blank paper or one that makes no attempt to deal with the question receives no credit.
Rubric from Sharon Kingston
Rubric "Cut Scores"
Create a "growth" rubric and describe a typical year's growth. Translate it into a percentage of 100: www.roobrix.com
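The "translate into 100%" step can be sketched as simple weighting. This assumes a 4-level rubric with equally weighted criteria; actual weights are a district (or roobrix.com) decision, so the numbers here are illustrative:

```python
# Sketch: convert per-criterion rubric scores (0-4) to a 0-100 percent.
# With four criteria, each rubric level is worth 100 / 16 = 6.25 points.
def rubric_to_percent(scores, max_level=4, criteria=4):
    """scores: one 0..max_level rubric score per criterion."""
    weight = 100 / (max_level * criteria)
    return sum(scores) * weight

print(rubric_to_percent([4, 4, 4, 4]))  # 100.0
print(rubric_to_percent([4, 3, 3, 2]))  # 75.0
```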
Who Assesses the DDMs?
Teams of Teachers (e.g., all 5th grade teachers)
Team members rate each other’s students’ responses
Multiple raters score each response
Individual Teachers
Random auditing (rechecking) by principal, coordinator, or department head; this is a district decision
Table Talk
Scoring
Grade Levels
Departments
Who Scores?
Who Stores?
DDM Quality Criteria
Collaborative Process
Validity
Reliability
Rigor
Aligned to standards
Assessment Quality Requirements and Definitions from DESE
• Alignment to Frameworks and District Curriculum content and/or district standards
• Rigor
• Comparability across all classes and in all disciplines
• "Substantial" assessment of the course; core content and skills
• Modifications are allowed as with MCAS
Rigor Measures Are District Determined:
Bloom (next slide); Hess (next slide); DESE's "Curriculum Embedded Performance Assessment" Rubric in the Model Curriculum Units (slide 36)
Hess Rigor Matrix (rows = Bloom; columns = depth of thinking)
Columns: Level 1 Recall & Reproduction; Level 2 Skills & Concepts; Level 3 Strategic Thinking/Reasoning; Level 4 Extended Thinking
Remember: (L1) Recall, locate basic facts, details, events. Not appropriate at higher levels.
Understand: (L1) Select appropriate words to use when intended meaning is clearly evident [MCAS]. (L2) Specify or explain relationships; summarize; identify central idea. (L3) Explain, generalize, or connect ideas using supporting evidence (quote, example…). (L4) Explain how concepts or ideas specifically relate to other content domains or concepts.
Apply: (L1) Use language structure (pre/suffix) or word relationships (synonym/antonym) to determine meaning. (L2) Use context to identify meaning of word; obtain and interpret information using text features. (L3) Use concepts to solve non-routine problems. (L4) Devise an approach among many alternatives to research a novel problem.
Analyze: (L1) Identify whether information is contained in a graph, table, etc. (L2) Compare literary elements, terms, facts, events; analyze format, organization, and text structures. (L3) Analyze or interpret author's craft (literary devices, viewpoint, or potential bias) to critique a text. (L4) Analyze multiple sources; analyze complex/abstract themes.
Evaluate: (L3) Cite evidence and develop a logical argument for conjectures [PARCC]. (L4) Evaluate relevancy, accuracy, and completeness of information.
Create: (L1) Brainstorm ideas about a topic. (L2) Generate conjectures based on observations or prior knowledge. (L3) Synthesize information within one source or text. (L4) Synthesize information across multiple sources or texts.
Hess: Rigor Relevance Matrix
Post-Test Only
Use the district's history to predict scores on the AP Calculus Exam, for example.
Previous grade's math score | Low Growth | Moderate Growth | High Growth
A | 3 | 4 | 5
B | 2 | 3 | 4
C | 1 | 2 | 3
D | | 1 | 2
Identifying and Selecting DDMs
The DDM process must be collaborative
District establishes a DDM Working Group
Co-chaired by superintendent and president of local bargaining unit or their designees.
Surveys the district for available assessments
Recruits educators to identify assessments and make recommendations
Identifies at least two measures for each educator
Collects feedback on the quality of the DDMs (continuous improvement)
Makes recommendations to the superintendent
Generic Rubric for CEPAs in Massachusetts (score points 1-6)
Topic Development
1. Little topic/idea development, organization, and/or details; little or no awareness of audience and/or task
2. Limited or weak topic/idea development, organization, and/or details; limited awareness of audience and/or task
3. Rudimentary topic/idea development and/or organization; basic supporting details; simplistic language
4. Moderate topic/idea development and organization; adequate, relevant details; some variety in language
5. Full topic/idea development; logical organization; strong details; appropriate use of language
6. Rich topic/idea development; careful and/or subtle organization; effective/rich use of language
Evidence and Content Accuracy
1. Little or no evidence is included and/or content is inaccurate
2. Use of evidence and content knowledge is limited or weak
3. Use of evidence and content is included but is basic and simplistic
4. Use of evidence and accurate content is relevant and adequate
5. Use of evidence and accurate content is logical and appropriate
6. A sophisticated selection and inclusion of evidence and accurate content contribute to an outstanding submission
Use of Visuals/Media
1. Visuals and/or media are missing or do not contribute to the quality of the submission
2. Visuals and/or media demonstrate a limited connection to the submission
3. Visuals and/or media are basically connected to the submission and contribute to its quality
4. Visuals and/or media are connected to the submission and contribute to its quality
5. Visuals and/or media contribute to the quality of the submission in a logical and appropriate way
6. Visuals and/or media are carefully and strategically selected to enhance the content of the submission
Standards for English Conventions (score points 1-4)
1. Errors seriously interfere with communication, and there is little control of sentence structure, grammar and usage, and mechanics
2. Errors interfere somewhat with communication, and/or there are too many errors relative to the length of the submission or complexity of sentence structure, grammar and usage, and mechanics
3. Errors do not interfere with communication, and/or there are few errors relative to the length of the submission or complexity of sentence structure, grammar and usage, and mechanics
4. Control of sentence structure, grammar and usage, and mechanics (length and complexity of submission provide opportunity for student to show control of standard English conventions)
Table Talk
Quality of Assessments
Quality Rubrics
Validity
Rigor
Inter-Rater Reliability
Specific Examples
Examples of DDMs: The Good, the Bad, and the Not-So-Good
Quality Assessments: Developed Locally, Adapted, or Adopted
Dr. Deborah Brady
The GOOD
- Substantive
- Aligned with standards of the Frameworks, vocational standards, and/or local standards
- Rigorous
- Consistent with K-12 DDMs in substance, alignment, and rigor
- Consistent with the district's values, initiatives, and expectations
- Measures growth (to be contrasted with achievement) and shifts the focus of teaching:
  - from achievement to growth for all students
  - from teaching to learning
  - from the teacher to the learner
As a Result of the GOOD
In districts, schools, and departments:
- Educators have collaborated thoughtfully
- Initiatives are one step more unified
- The district, school, department, or specific course:
  - moves forward (a baby step or a giant step)
  - gains a collaborative understanding of the purpose of a course, discipline, or year's work
Some GOOD Examples
- A 9-12 ELA portfolio measured by a locally developed rubric that assesses progress throughout the four years of high school
- A district that required that at least one DDM be "writing to text," based on CCSS-appropriate text complexity
- A HS science department assessment of lab-report growth for each course (focus on conclusions)
- A HS science department assessment of data, diagram, or video analysis
More
- A HS math department's use of PARCC examples that require writing, asking students to "justify your answer"
- A self-created, PARCC-style social studies exam: the anchor text is Wilson's speech to Congress advocating entering the war for high-minded purposes, paired with a 1980 text that describes Wilson's true motives as financial gain; students summarize the excerpts, then write an essay with a thesis asserting the US motives for going to war
More
- SPED: Social-emotional development of independence (a whole-collaborative measure; each educator is measuring it)
- SPED "directed study" model: study skills are now explicitly recorded by the week for each student, and by quarter on a manila folder: note-taking skills, text comprehension, reading, writing, preparing for an exam, time management
- A vocational school's use of Jobs USA assessments for one DDM, and the local safety protocols for each shop
- High school SST team example (frequent absentees)
- Child Study Team example (universal process)
- School psychologists (did not follow procedure for referral)
- School psychologists (subgroup of students studied)
- High school guidance example (PSAT, SAT, college applications)
- IEP goals can be used as long as they are measuring growth (academic or social-emotional)
Possibly GOOD
- Fountas and Pinnell individual assessment of reading comprehension
- Galileo growth determination using the Galileo question bank aligned to standards
- DIBELS
- Text-based, locally created assessments
- MCAS-like ORQ plus multiple-choice assessments
- Mid-terms, benchmarks, final exams
The possibility of goodness depends upon district capacity and time to collaborate:
- Data teams
- PLCs
- Leaders/coaches to provide context and meaning to student work
- Looking-at-student-work protocols
- Diagnosing student needs and developing action plans
Without time and capacity, it's all just…
Low, Moderate, and High in Human Terms: a story of two teachers
Effective Teaching
All levels of learners
Curriculum
Goals/Agenda
Notebook
Group work
Routines
Math Practices: Communicating Mathematical Ideas
Clearly constructs and communicates a complete response based on:
- a response to a given equation or system of equations
- a chain of reasoning to justify or refute algebraic, function, or number-system propositions or conjectures
- a response based on data
How can you assess these standards?
Demonstrating Growth
Billy Bob's work is shown below. He has made a mistake. In the space to the right, solve the problem on your own. Then find Billy Bob's mistake, circle it, and explain how to fix it.

Billy Bob's work:
  ½X - 10 = -2.5
      +10 = +10
  ½X + 0 = +12.5
  (2/1)(1/2)X = 12.5(2)
  X = 25

Your work:

Explain the changes that should be made in Billy Bob's work.
"Find the mistake" provides students with a model, requires understanding, and requires writing in math. A resource for DDMs.
A small step? A giant step? The district decides.
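Redoing the arithmetic pinpoints Billy Bob's slip; a minimal sketch in Python, assuming the equation is (1/2)X - 10 = -2.5 as transcribed:

```python
from fractions import Fraction

# The equation as shown: (1/2)X - 10 = -2.5
rhs = Fraction(-5, 2)            # -2.5, kept exact
half_x = rhs + 10                # add 10 to both sides: (1/2)X = 7.5
x = half_x * 2                   # multiply both sides by 2

print(x)                         # 15, not Billy Bob's 25
# Billy Bob wrote (1/2)X = 12.5, i.e. he computed -2.5 + 10 as 12.5
# instead of 7.5, and the error propagated to X = 25.
assert Fraction(1, 2) * x - 10 == rhs   # check by substitution
```

The check is only illustration; the classroom value is in students writing the explanation themselves.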
Which of the three conjectures are true? Justify your answer.
Essay Prompt from a Textbook
Read a primary source about Muhammad based on Muhammad's wife's memories of her husband.
Essay: Identify and describe Muhammad's most admirable quality based on this excerpt. Select someone from your life who has this quality. Identify who they are and describe how they demonstrate this trait.
What's wrong with this prompt? Is it a text-based question? (See PARCConline.org.)
Where are the CLAIMS and EVIDENCE?
Science Open Response from a Textbook
Again from a textbook: is this acceptable? Is this recall?
Scoring Guide from a Textbook
- Lou Vee Air Car built to specs: 50 points
- Propeller spins freely: 60 points
- Distance car travels: 1 m = 70, 2 m = 80, 3 m = 90, 4 m = 100 points
- Best distance: 10, 8, or 5 points
- Best car: 10, 8, or 5 points
- Best all-time distance, all classes: +5 points
- Total: 235 points
A scoring guide from a textbook for building a Lou Vee Air Car. Is it good enough to ensure inter-rater reliability?
Technology/Media Rubric
A multi-criteria rubric for technology. What is good, bad, problematical? (Don't try to read it!)

PE Rubric in Progress
Grade 2, overhand throw and catching. Look good?

Music: Teacher and Student Instructions
Are numbers good or a problem?
The UGLY
- Complies with regulations
- Brings about no change or understanding
The Best?
- Build on what is in the district, school, or department
- A small step or a larger step in cognitive complexity
- Use the results to learn about students' needs and how to address those needs
- Use time to look at student work and to collaboratively plan to improve
Table Talk
District Values
Consistent Conditions for Assessments
Looking at the Results: A Preview
A psychometrician's view

How Do We Determine Cut Scores? Growth Scores?
- Both are new areas for learning
- Growth is not achievement; Moderate = a year's growth
- What if a student is below benchmark?
- Again, setting these specific parameters is district determined ("common sense")
- Psychometricians are still figuring out what a good/fair assessment is
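One common way to set growth cut scores, though not the only one and not a DESE mandate, is by percentiles of the gain-score distribution. A minimal sketch with hypothetical gains and assumed 20th/80th-percentile cuts:

```python
# Hypothetical gain scores (post-test minus pre-test) for one class.
gains = [2, 5, 7, 8, 8, 9, 10, 11, 12, 15, 18, 21]

def percentile_cut(sorted_gains, pct):
    """Nearest-rank percentile of an already-sorted list."""
    k = max(0, round(pct / 100 * len(sorted_gains)) - 1)
    return sorted_gains[k]

gains.sort()
low_cut = percentile_cut(gains, 20)    # bottom ~20% = Low growth (assumed cut)
high_cut = percentile_cut(gains, 80)   # top ~20% = High growth (assumed cut)

def label(gain):
    if gain <= low_cut:
        return "Low"
    if gain > high_cut:
        return "High"
    return "Moderate"

print({g: label(g) for g in (3, 10, 20)})
```

The percentile choices are a local decision, which is exactly the "district determined" point above.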
Objectivity versus Subjectivity: Calibration
- Human judgment and assessment: what is objective about a multiple-choice test?
- Calibrating standards in using rubrics: a common understanding of descriptors
- What does "insightful," "in-depth," or "general" look like?
- Use exemplars to keep people calibrated
- Assess collaboratively with a uniform protocol
Continuum of responses: insightful and deep understanding; general details; many misconceptions
Assessment Drift
- Spot checking; recording; assessing blind
- Develop EXEMPLARS (simple protocol)
- For the F&P comprehension "conversation": the grade-level team calibrates with a below-benchmark, at-benchmark, and above-benchmark sample to begin, and discusses differences
- Then sample recorded F&P sessions
Protocols for Administration of Assessments
- Directions to teachers need to define rules for giving support, dictionary use, etc.
- What can be done? What cannot? ("Are you sure you are finished?")
- How much time?
- Accommodations and modifications?
- Feedback from teachers indicated some confusion about procedures
- Update instructions (common format)
Qualitative Methods of Determining an Assessment's VALIDITY
- Looking at the "body of the work": validating an assessment based upon the students' work
- Floor and ceiling effects
- If you piled the gain scores (not achievement) into High, Moderate, and Low gain: is there a mix of at-risk, average, and high achievers throughout each pile, or can you see one group mainly represented?
Low, Moderate, High Growth Validation
Did your assessment accurately pinpoint differences in growth?
1. Look at the LOW pile. If you think about their work during this unit, were they struggling?
2. Look at the MODERATE pile. Are these the average learners, who learn about what you'd expect of your school's students in your class?
3. Look at the HIGH pile. Did you see them learning more than most of the others in your class?
Based on your answers to 1, 2, and 3:
- Do you need to add questions (for the very high or the very low)?
- Do you need to modify any questions (because everyone missed them, or because everyone got them correct)?
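The last two checks, questions everyone missed and questions everyone got right, can be flagged automatically once item-level results are tabulated; a minimal sketch with made-up responses (1 = correct, 0 = incorrect):

```python
# Rows = students, columns = items; 1 = correct, 0 = incorrect (made-up data).
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 1],
    [1, 1, 0, 0],
]

n_students = len(responses)
flags = {}
for item in range(len(responses[0])):
    p = sum(row[item] for row in responses) / n_students  # proportion correct
    if p == 1.0:
        flags[item] = "ceiling: everyone correct -- consider a harder question"
    elif p == 0.0:
        flags[item] = "floor: everyone missed -- consider revising the question"

print(flags)  # items 0 and 2 get flagged
```

With real classes, a near-1.0 or near-0.0 proportion deserves the same scrutiny as the exact extremes.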
Tracey is a student who was rated as having high growth. James had moderate growth. Linda had low growth.
Investigate each student's work:
- Effort
- Teachers' perception of growth
- Other evidence of growth
Do the scores assure you that the assessment is assessing what it says it is?
Look at specific students' work: a psychometric process called "body of the work" validation.
Objectivity versus Subjectivity: Multiple-Choice Questions
- Human judgment and assessment: What is objective about a multiple-choice test? What is subjective about one?
- Make sure the question's complexity did not cause a student to make a mistake.
- Make sure the choices are all about the same length, in similar phrases, and clearly different.
Rubrics and Inter-Rater Reliability
Getting words to mean the same to all raters:
- Resources: 4 = Effective use; 3 = Adequate use; 2 = Limited use; 1 = Inadequate use
- Development: 4 = Highly focused; 3 = Focused response; 2 = Inconsistent response; 1 = Lacks focus
- Organization: 4 = Related ideas support the writer's purpose; 3 = Has an organizational structure; 2 = Ideas may be repetitive or rambling; 1 = No evidence of purposeful organization
- Language conventions: 4 = Well-developed command; 3 = Command, errors don't interfere; 2 = Limited or inconsistent command; 1 = Weak command
Protocol for Developing Inter-Rater Reliability
Before scoring a whole set of papers, develop inter-rater reliability:
1. Bring high, average, and low samples (1 or 2 of each)
2. Use your rubric or scoring guide to assess these samples
3. Discuss differences until a clear definition is established
4. Use these first papers as your exemplars
5. When there's a question, select one person as the second reader
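Once two raters have scored the same sample papers, their consistency can be quantified before full scoring begins. A minimal sketch with hypothetical 1-4 rubric scores, computing percent exact agreement and Cohen's kappa (agreement corrected for chance):

```python
from collections import Counter

# Hypothetical 1-4 rubric scores from two raters on the same ten papers.
rater_a = [4, 3, 3, 2, 4, 1, 2, 3, 4, 2]
rater_b = [4, 3, 2, 2, 4, 1, 2, 3, 3, 2]

n = len(rater_a)

# Percent exact agreement: how often the raters gave the same score.
agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Cohen's kappa: observed agreement adjusted for the agreement
# expected by chance from each rater's score distribution.
counts_a, counts_b = Counter(rater_a), Counter(rater_b)
p_chance = sum(counts_a[s] * counts_b[s] for s in set(rater_a) | set(rater_b)) / n**2
kappa = (agreement - p_chance) / (1 - p_chance)

print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

A kappa around 0.7 or higher is often read as acceptable agreement, but the rubric discussion in the protocol above is what actually moves the number.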
Annotated Exemplar: How does the author create the mood in the poem?
- Answer and explanation in the student's words: "The speaker's mood is greatly influenced by the weather."
- Specific substantiation from the text: "The author uses dismal words such as 'ghostly,' 'dark,' 'gloom,' and 'tortured.'"
"Growth Rubrics" Can Be Developed: a developmental writing continuum

Pre-conventional (Ages 3-5)
- Relies primarily on pictures to convey meaning
- Begins to label and add "words" to pictures
- Writes first name
- Demonstrates awareness that print conveys meaning
- Makes marks other than drawing on paper (scribbles)
- Writes random recognizable letters to represent words
- Tells about own pictures and writing

Emerging (Ages 4-6)
- Uses pictures and print to convey meaning
- Writes words to describe or support pictures
- Copies signs, labels, names, and words (environmental print)
- Demonstrates understanding of letter/sound relationship
- Prints with upper-case letters
- Matches letters to sounds
- Uses beginning consonants to make words
- Uses beginning and ending consonants to make words
- Pretends to read own writing
- Sees self as writer
- Takes risks with writing

Developing (Ages 5-7)
- Writes 1-2 sentences about a topic
- Writes names and familiar words
- Generates own ideas for writing
- Writes from top to bottom, left to right, and front to back
- Intermixes upper- and lower-case letters
- Experiments with capitals
- Experiments with punctuation
- Begins to use spacing between words
- Uses growing awareness of sound segments (e.g., phonemes, syllables, rhymes) to write words
- Spells words on the basis of sounds without regard for conventional spelling patterns
- Uses beginning, middle, and ending sounds to make words
- Begins to read own writing
Next Workshop: Protocols, defining Low, Moderate, High, Using Excel
DESE Quote
"It is expected that districts are building their knowledge and experience with DDMs. DDMs will undergo both small and large modifications from year to year. Changing or modifying scoring procedures is part of the continuous improvement of DDMs over time."
We are all learners in this initiative.
Next Class
- Bring sample assessment results
- Bring assessments to discuss quality
- Excel
- Setting up protocols for each assessment
- Other? Please suggest!
On-line access to materials at: https://wikispaces.com/join/HQ4FDRM
Code HQ4FDRM