Post on 18-Nov-2014
ASSESSMENT OF LEARNING
-- Edutopia: Success Stories for Learning in the Digital Age
“Superficial forms of assessment tend to lead to superficial forms of teaching and learning.”
Teaching to the Test
Why Assess?
• Provide diagnosis
• Set standards
• Evaluate progress
• Communicate results
• Motivate performance
Standardized Tests
• Are not prescriptive
• Give a capsule view of a student’s learning
• Are used in conjunction with performance-based assessment
Authentic Assessment
• Observation
• Teacher-made tests, quizzes, exams
• Written compositions
• Oral presentations
• Projects, experiments, performance tasks
• Portfolios
• TEST – the instrument or systematic procedure. It answers the question: “How does the individual student perform?”
• TESTING – the method used to measure the level of achievement or performance of the students
• MEASUREMENT – the process of obtaining a NUMERICAL DESCRIPTION. It answers the question: “How much?” The score.
• EVALUATION – judging the performance through a descriptive rating (satisfactory, VS, O, or excellent)
TYPES OF MEASUREMENT
• NORM-REFERENCED TEST – compares a student with other students using a score expressed as a PERCENTILE, GRADE-EQUIVALENT SCORE, or STANINE
– Purpose: to rank students with respect to the achievement of others and to discriminate between high and low achievers
• CRITERION-REFERENCED TEST – measures performance with respect to a particular criterion or standard. A student’s score is expressed as a PERCENTAGE, and achievement is reported for individual skills
– Purpose: to know whether the student has achieved specific skills or concepts, and to find out how much students know before instruction begins and after it has finished
– Also called objective-referenced, domain-referenced, or universe-referenced
TYPES OF EVALUATION
• PLACEMENT – determines prerequisite skills, degree of mastery, and the best mode of learning
• DIAGNOSTIC – determines the level of competence, identifies students with previous knowledge of the lesson, uncovers the causes of learning problems, and informs plans for remedial action
• FORMATIVE – provides feedback, identifies learning errors needing correction, and lets the teacher modify instruction to improve learning and instruction
• SUMMATIVE – determines whether objectives have been met, assigns grades, and gauges the effectiveness of instruction
MODES OF ASSESSMENT
• TRADITIONAL – multiple choice, fill in the blanks, true or false, matching type
• PERFORMANCE – responses, performances, and products
• PORTFOLIO – a collection of a student’s work; contains a purposefully selected subset of student work
KEY TO EFFECTIVE TEACHING
• OBJECTIVES – aims of instruction
• INSTRUCTION – elements of the curriculum designed to teach the subject, including lesson plans, study guides, and assignments
• ASSESSMENT – testing components of the subject
• EVALUATION – extent of understanding of the lesson
INSTRUCTIONAL OBJECTIVES
• Guides for teaching and learning
• Intent of the instruction
• Guidelines for assessing learning
• Behavioral objectives clearly describe anticipated learning outcomes
• Specific, measurable, attainable, realistic, and time-bound
Bloom’s Taxonomy
A Focus on Higher-Level Thinking Skills
Background
In 1956, Benjamin Bloom, a professor at the University of Chicago, shared his famous "Taxonomy of Educational Objectives."
Bloom identified six levels of cognitive complexity that have been used over the past four decades to make sure that instruction stimulates and develops students' higher-order thinking skills.
Evaluation
Synthesis
Analysis
Application
Comprehension
Knowledge
Higher-Level Thinking Skills
Knowledge – Recall or recognition of information.
list, name, identify, show, define, recognize, recall, match, classify, describe, locate, outline, give examples, distinguish opinion from fact
Comprehension – The ability to understand, translate, paraphrase, interpret, or extrapolate material. (Predict outcomes and effects.)
paraphrase, differentiate, demonstrate, visualize, restate, rewrite, give examples, summarize, explain, interpret, describe, compare, convert, distinguish, estimate
Application – The capacity to use information and transfer knowledge from one setting to another. (Use learned material in a new situation.)
apply, classify, modify, put into practice, demonstrate, compute, operate, solve, illustrate, calculate, interpret, manipulate, predict, show
Analysis – Identifying detail and having the ability to discover and differentiate the component parts of a situation or information.
contrast, compare, distinguish, categorize, outline, relate, analyze, organize, deduce, choose, diagram, discriminate
Synthesis – The ability to combine parts to create the big picture.
discuss, plan, compare, create, construct, rearrange, compose, organize, design, hypothesize, support, write, report, combine, comply, develop
Evaluation – The ability to judge the value or use of information using appropriate criteria. (Support judgment with reason.)
criticize, justify, debate, support your reason, conclude, assess, rate, evaluate, choose, estimate, judge, defend, appraise
KRATHWOHL’S AFFECTIVE TAXONOMY
• Refers to a person’s awareness and internalization of objects and stimulation
• ANDERSON and KRATHWOHL – revised Bloom’s original taxonomy by combining the cognitive process and knowledge dimensions, from lowest level to highest level
• Receiving – listens to ideas: identify, select, give
• Responding – answers questions about ideas: read, select, tell, write, assist, present
• Valuing – thinks about how to take advantage of ideas and is able to explain them well: explain, follow, initiate, justify, propose
• Organizing – commits to using ideas and incorporates them into activity: prepare, follow, explain, relate, synthesize, integrate, join, generalize
• Characterization – puts ideas into practice: solve, verify, propose, modify, practice, qualify
Illustrative Behavioral Terms for Stating Specific Learning Outcomes
• RECEIVING – asks, chooses, describes, follows, gives, holds, identifies, locates, names, points to, selects, replies, uses
• RESPONDING – answers, assists, complies, conforms, discusses, greets, helps, labels, performs, practices, presents, reads, recites, reports, selects
• VALUING – completes, describes, differentiates, explains, follows, forms, initiates, invites, justifies, proposes, reads, reports, selects, shares, studies, works
• ORGANIZATION – alters, arranges, combines, compares, completes, defends, explains, generalizes, integrates, modifies, orders, organizes, prepares, relates, synthesizes
PSYCHOMOTOR DOMAIN
• OBSERVING – actively attending mentally to a physical event
• IMITATING – attempted copying of a physical behavior
• PRACTICING – trying a specific activity over and over
• ADAPTING – fine-tuning, making minor adjustments in the physical activity in order to perfect it
CRITERIA WHEN CONSTRUCTING A GOOD TEST
• VALIDITY – the test measures what it is intended to measure
• RELIABILITY – consistency of the scores obtained when the test is repeated
• ADMINISTRABILITY – ease, clarity, and uniformity of administration: time limits and instructions
• SCORABILITY – easy to score; the directions for scoring are clear and simple, and answer sheets are provided
• ECONOMY – the test should be given in the cheapest way and should be reusable from time to time
• ADEQUACY – wide sampling of items to represent the areas measured
• AUTHENTICITY – stimulating, real-life situations
Table Of Specifications
• Determine the total number of items desired
• Determine the number of days taught for each lesson and the total for all lessons
• Divide the number of days taught for each topic by the total number of days taught for all topics, then multiply by the total number of items
• Distribute the number of questions across the levels of the cognitive domain
• Identify each test item’s number placement in the test
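The proportional-allocation step above can be sketched in a few lines of Python; the topic names and day counts here are made-up examples, not from the source:

```python
# Allocate test items to topics in proportion to days taught,
# as in the Table of Specifications steps above.
# Topic names and day counts are hypothetical.

def allocate_items(days_per_topic, total_items):
    """Items per topic = (days for topic / total days) * total items, rounded."""
    total_days = sum(days_per_topic.values())
    return {
        topic: round(days / total_days * total_items)
        for topic, days in days_per_topic.items()
    }

days = {"Fractions": 10, "Decimals": 5, "Percent": 5}
print(allocate_items(days, 40))  # {'Fractions': 20, 'Decimals': 10, 'Percent': 10}
```

With 20 days of instruction and a 40-item test, a topic taught for 10 days gets 10/20 × 40 = 20 items.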
ITEM ANALYSIS
• Analysis of students’ responses to each item in the test, classifying each item as desirable or undesirable
• A desirable item can be retained for subsequent use
• An undesirable item can be revised or rejected
Criteria of an Item
• Difficulty of an item
• Discriminating power of an item
• Effectiveness of an item
Steps of Item Analysis
• Arrange the scores from highest to lowest
• Select the 27% of the papers within the upper group and 27% within the lower group
• Set aside the remaining 46% of the papers; they will not be used
• Tabulate the number of students in the UG and the LG who selected each choice
• Compute the difficulty and discrimination of each item
• Evaluate the effectiveness of the distracters
Difficulty Index (DF)
• Proportion of the students in the upper and lower groups who answered the item correctly

DF = (UG + UL) / N

where UG and UL are the number of correct answers in the upper and lower groups, and N is the total number of students in both groups.
Interpretation
Index of difficulty Item Evaluation
0.86 – 1.00 Very easy
0.61 – 0.85 Moderately easy
0.36 – 0.60 Moderately difficult
0.00 – 0.35 Very difficult
Index of Discrimination DI
• The difference between the proportion of high-performing students who got the item right and the proportion of low-performing students who got the item right

DI = (RU – RL) / N
• Positive Discrimination – more students in the upper group got the item right
• Negative Discrimination – more students in the lower group got the item right
• Zero Discrimination – equal number of students in both groups got the item right
Interpretation
DI Item Evaluation
0.40 – up Good item
0.30 – 0.39 Reasonably good but subject to improvement
0.20 – 0.29 Marginal item, needs improvement
below 0.19 Poor item; needs to be rejected or revised
• Maximum Discrimination (DM) – the sum of the proportions of the upper and lower groups who answered the item correctly

DM = UG + LG

• Discrimination Efficiency (DE) – the index of discrimination divided by the maximum discrimination

DE = DI / DM
Distracter’s Effectiveness
• A good distracter attracts more students in the lower group than in the upper group
• A poor distracter attracts more students in the upper group
• This provides information for improving the item
• No. of examinees = 84
• 1/3 or 27% from the highest = 28
• 1/3 or 27% from the lowest = 28
• Item #4

Options    *a    b    c    d
UG – 28    26    2    0    0
UL – 28    10   17    1    0
*correct choice

Index of Difficulty = (UG + UL) / N = 36/56 = 0.64 (moderately easy)
Index of Discrimination = (RU – RL) / N = (26 – 10)/56 = 0.29 (marginal item, needs improvement)

Option b functions effectively as a distracter because it attracts more students from the lower group. Options c and d are poor distracters because almost no one from either group is attracted to them.
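The worked example above can be reproduced in a few lines of Python (option counts copied from the Item #4 table):

```python
# Item analysis for the worked example above (Item #4).
# Counts per option for the upper group (UG) and lower group (LG).
ug = {"a": 26, "b": 2, "c": 0, "d": 0}   # option a is the correct choice
lg = {"a": 10, "b": 17, "c": 1, "d": 0}
correct = "a"
n = 28 + 28  # total students in both groups

df = (ug[correct] + lg[correct]) / n  # difficulty index: (UG + UL) / N
di = (ug[correct] - lg[correct]) / n  # discrimination index: (RU - RL) / N

print(round(df, 2))  # 0.64 -> moderately easy
print(round(di, 2))  # 0.29 -> marginal item, needs improvement
```

A distracter such as option b is working when its lower-group count (17) exceeds its upper-group count (2); options chosen by almost nobody, like c and d, contribute nothing.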
VALIDITY – the test measures what it is supposed to measure
• CONTENT VALIDITY – the test covers the content and objectives of instruction
• CRITERION-RELATED VALIDITY – test scores are related to scores on other test instruments
• CONSTRUCT VALIDITY – the test measures a non-observable (hypothetical) variable
• PREDICTIVE VALIDITY – the test result can be used to estimate what a person’s score will be at a later time
FACTORS AFFECTING VALIDITY
• Poorly constructed test items
• Unclear directions
• Ambiguous items
• Reading vocabulary too difficult
• Complicated syntax
• Inadequate time limit
• Inappropriate level of difficulty
• Unintended clues
• Improper arrangement of items
RELIABILITY – consistency of measurement
• FACTORS AFFECTING RELIABILITY:
– Length of the test
– Moderate item difficulty
– Objective scoring
– Heterogeneity of the student group
– Limited time
DESCRIPTIVE STATISTICS
• MEASURES OF CENTRAL TENDENCY – AVERAGES
• MEASURES OF VARIABILITY – SPREAD OF SCORES
• MEASURES OF RELATIONSHIP - CORRELATION
MEASURE OF CENTRAL TENDENCY - MEAN
• Easy to compute
• Each data point contributes to the mean value
• Easily affected by extreme values
• Applied to interval data

Mean = Σx / n
MEDIAN
• The point that divides the scores in a distribution into 2 equal parts when the scores are arranged from highest to lowest
• If the number of scores is ODD, the median is the MIDDLE SCORE
• If the number of scores is EVEN, the median is the average of the 2 middlemost scores
MODE
• Refers to the score/s that occurred most in the distribution
• Unimodal if the distribution consists of only 1 mode
• Bimodal if the distribution contains 2 modes
• Multimodal if the distribution consists of more than 2 modes
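The three averages above are all in Python’s standard `statistics` module; a quick sketch on a made-up score list:

```python
import statistics

# Hypothetical score list: 7 scores, so the median is the middle score.
scores = [70, 75, 75, 80, 85, 85, 90]

print(statistics.mean(scores))       # 80
print(statistics.median(scores))     # 80 (middle score, odd count)
print(statistics.multimode(scores))  # [75, 85] -> bimodal
```

`multimode` returns every most-frequent score, so a bimodal distribution comes back as a two-element list.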
MEASURES OF VARIABILITY
• A single value used to describe the spread of the scores in a distribution, that is, how far they lie above or below the measure of central tendency
• Range
• Quartile deviation
• Standard Deviation
RANGE
• Simplest and crudest measure
• A rough measure of variation
• The smaller the value, the closer the scores are to each other
• The larger the value, the more scattered the scores are
• The value fluctuates easily

R = HV – LV
QUARTILE DEVIATION
• Half of the difference between the third quartile (Q3) and the first quartile (Q1)
• The value of the QD indicates the distance we need to go above or below the median to include approximately the middle 50% of the scores

QD = (Q3 – Q1) / 2
• The standard deviation for Math is 10.20 and for Science 10.10, which means MATH scores have greater variability than SCIENCE scores: the scores in MATH are more scattered than in SCIENCE
• LARGE SD value = scores FAR from the mean
• SMALL SD value = scores CLOSE to the mean
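All three spread measures can be computed on one hypothetical score list; Q1 and Q3 here come from `statistics.quantiles` with the inclusive method, which interpolates between observed scores:

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

r = max(scores) - min(scores)   # Range: R = HV - LV
q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
qd = (q3 - q1) / 2              # Quartile deviation: (Q3 - Q1) / 2
sd = statistics.pstdev(scores)  # Population standard deviation

print(r)   # 7
print(qd)  # 0.75
print(sd)  # 2.0
```

Note that `pstdev` treats the list as the whole population; use `stdev` when the scores are a sample.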
PERCENTILE RANK
• The percentage of scores in the frequency distribution which are lower, i.e., the percentage of examinees in the norm group who scored below the score of interest
• Used to clarify the interpretation of scores on standardized tests
• Example: a score of 66 at the 90th percentile means 90% of the examinees got a score lower than 66
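The definition above translates directly into code; the ten-score norm group is a made-up example sized so that a score of 66 lands at the 90th percentile:

```python
def percentile_rank(scores, x):
    """Percentage of scores in the group that fall below x."""
    below = sum(1 for s in scores if s < x)
    return 100 * below / len(scores)

# Hypothetical norm group of 10 scores.
group = [50, 52, 55, 58, 60, 61, 62, 63, 65, 66]
print(percentile_rank(group, 66))  # 90.0 -> 66 is at the 90th percentile
```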
Z SCORE – STANDARD SCORE
• Measures HOW MANY SDs an observation is ABOVE or BELOW the MEAN
• A +z score gives the number of SDs a score is ABOVE the MEAN
• A –z score gives the number of SDs a score is BELOW the MEAN
• Used to locate the student’s score along the base of the curve

Z-score Formula

z = (X – µ) / σ  for a population, or  z = (X – X̄) / s  for a sample

• Where:
– X is a raw score
– σ is the standard deviation of the population
– µ is the mean of the population
– s is the standard deviation of the sample
T-score
• T-score = 10z + 50
• A higher value indicates better performance on the test
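The two standard scores chain together as a one-liner each; the raw score, mean, and SD below are made-up numbers:

```python
# z-score and T-score, per the formulas above:
# z = (X - mean) / SD, T = 10z + 50.
def z_score(x, mean, sd):
    return (x - mean) / sd

def t_score(z):
    return 10 * z + 50

z = z_score(66, 60, 5)  # raw score 66, class mean 60, SD 5 (hypothetical)
print(z)                # 1.2 -> 1.2 SDs above the mean
print(t_score(z))       # 62.0
```

A T-score of 50 always corresponds to the mean (z = 0), which removes the negative numbers that make raw z-scores awkward to report.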
COEFFICIENT OF VARIATION (CV)
CV = (SD / Mean) × 100%. The LOWER the value of the coefficient of variation, the MORE the overall data cluster around the MEAN, i.e., the MORE HOMOGENEOUS THE PERFORMANCE OF THE GROUP.
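A sketch of CV = SD / mean × 100, reusing the SDs from the Math/Science example with hypothetical means to show why relative spread can differ even when SDs are nearly equal:

```python
# Coefficient of variation (illustrative numbers; the means are assumptions).
def cv(sd, mean):
    return sd / mean * 100

math_cv = cv(10.20, 85)      # SD 10.20, hypothetical Math mean 85
science_cv = cv(10.10, 70)   # SD 10.10, hypothetical Science mean 70
print(round(math_cv, 1))     # 12.0
print(round(science_cv, 1))  # 14.4
# Science has the higher CV here: relative to its mean, its scores
# are less homogeneous even though the two SDs are almost the same.
```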
SKEWNESS
• Describes the degree of departure of the data from symmetry
• The degree of skewness is measured by the coefficient of skewness, denoted SK:

SK = 3(Mean – Median) / SD
NORMAL CURVE
[Figure: normal curve with score labels from left to right: Abnormal (Imbecile, Moron, Idiot) – Border-line – Below Average – Average (S) – Above Ave. (VS) – O – Abnormal (Genius)]
POSITIVELY SKEWED
• The curve is skewed to the RIGHT: it has a LONG tail extending off to the right and a short tail to the left
• When the computed value of SK is positive, most of the students’ scores are VERY LOW; they performed poorly on the exam
NEGATIVELY SKEWED
• The distribution is skewed to the left: it has a long tail extending to the left but a short tail to the right
• When the computed value of SK is negative, most of the students got very high scores; they performed WELL on the exam
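The sign rule above follows directly from the SK formula; a sketch with made-up summary statistics:

```python
# Pearson's coefficient of skewness: SK = 3(mean - median) / SD.
# The summary statistics below are hypothetical.
def sk(mean, median, sd):
    return 3 * (mean - median) / sd

print(sk(52, 50, 6))  # 1.0  -> positive: scores pile up at the low end
print(sk(48, 50, 6))  # -1.0 -> negative: scores pile up at the high end
```

When the mean exceeds the median, the long right tail of low-frequency high scores has dragged the mean upward, so SK comes out positive; the reverse gives a negative SK.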
COEFFICIENT OF CORRELATION