8/8/2019 Validity of Measure Semi
1/77
Measurement in Nursing and Health Research
Validity of Measures
Presented by: Wesam Almagharbeh
Supervised by: Muayyad Ahmad, PhD, RN
Introduction: Measurement
Measurement is the assignment of numbers to represent the amount of an attribute present in an object or person, using specific rules.
L. L. Thurstone: "Whatever exists, exists in some amount and can be measured."
The rules for measuring temperature, weight, and other physical attributes are widely known and accepted.
Rules for measuring many variables, however, have to be invented, e.g., rules for measuring pain, satisfaction, and depression.
Measurement
Measurement rules specify according to what criteria the numeric values are to be assigned to the characteristic of interest.
In measuring attributes, researchers strive to use good, meaningful rules.
With a new instrument, researchers seldom know in advance if their rules are the best possible.
Key Criteria for Evaluating Quantitative Measures
Reliability
Validity
Validity
Validity refers to the extent to which a measure achieves the purpose for which it was intended.
"Validity is a unitary concept. It is the degree to which evidence and theory support the interpretation entailed by proposed use of tests" (AERA & NCME, 1985, 1999).
The type of validity information to be obtained depends upon the aims or purposes for the measure rather than upon the type of measure.
Two Frameworks of Measurement
Norm-referenced measures are employed when the interest is in evaluating a subject's performance relative to the performance of other subjects in some well-defined comparison group. The focus is on the variance between subjects' performances.
Criterion-referenced measures are employed when the interest is in determining a subject's performance relative to, or whether or not the subject has acquired, a predetermined set of target behaviors. The focus is on the variance between subject performance and the predetermined set of behaviors (process and outcome variables).
NORM-REFERENCED VALIDITY PROCEDURES
Four aspects:
Content validity
Face (logical) validity
Construct validity
Criterion-related validity
NORM-REFERENCED MEASURES
Content validity
Its focus is on determining whether or not the items sampled for inclusion in the tool adequately represent the domain of content addressed by the instrument, and on the relevance of the content domain to the proposed interpretation of scores obtained when the measure is employed.
Important for all measures (especially instruments designed to assess cognition).
NORM-REFERENCED MEASURES
Content validity
Procedures: experts judge the specific items in terms of their relevance, sufficiency, and clarity in representing the concepts underlying the measure's development.
When two judges are employed, the content validity index (CVI) is used: the proportion of items given a rating of quite/very relevant by both raters.
When more than two experts rate the items on a measure, the alpha coefficient is used: 0 indicates lack of agreement; 1.00 indicates complete agreement.
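The two-rater CVI described above can be sketched in a few lines. The function name `cvi`, the 4-point relevance scale, and the sample ratings are illustrative assumptions, not part of the source.

```python
# Sketch of the two-rater content validity index (CVI): the proportion of
# items rated "quite/very relevant" by BOTH raters. Ratings here use a
# hypothetical 4-point scale: 1 = not relevant ... 4 = very relevant,
# where 3 and 4 count as quite/very relevant.

def cvi(rater1, rater2, relevant_min=3):
    """Proportion of items rated quite/very relevant by both raters."""
    agree = sum(1 for a, b in zip(rater1, rater2)
                if a >= relevant_min and b >= relevant_min)
    return agree / len(rater1)

# Example: 10 items; 8 of them are rated 3 or 4 by both experts.
r1 = [4, 4, 3, 2, 4, 3, 4, 3, 4, 4]
r2 = [3, 4, 4, 3, 4, 1, 4, 3, 4, 4]
print(cvi(r1, r2))  # 0.8
```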
NORM-REFERENCED MEASURES
Content validity
Content validity depends largely on:
the selection, preparation, and use of experts
the optimal number of experts
NORM-REFERENCED MEASURES
Face (logical) validity
Face validity is not validity in the true sense; it refers only to the appearance of the instrument to the layperson.
When present, it does not provide evidence that the instrument actually measures what it purports to measure.
NORM-REFERENCED MEASURES
Construct validity
Refers to the extent to which an individual, event, or object actually possesses the characteristic being measured by the instrument.
The primary concern is the extent to which relationships among items included in the measure are consistent with the theory and concepts as operationally defined.
The more abstract the concept, the more difficult it is to establish the construct validity of the measure.
Some Methods of Assessing Construct Validity
Contrasted-groups approach
Hypothesis-testing approach
Multitrait-multimethod approach
NORM-REFERENCED MEASURES
Construct validity: contrasted-groups approach
The instrument is administered to groups expected to differ on the critical attribute because of some known characteristic (i.e., groups expected to be extremely high and extremely low in the characteristic being measured).
E.g., fear of labor experiences between primiparas and multiparas.
If there is a significant difference between the mean scores: evidence for construct validity.
If there is no significant difference, three possibilities exist:
(1) the test is unreliable;
(2) the test is reliable, but not a valid measure of the characteristic;
(3) the test constructor's conception of the construct of interest is faulty and needs reformulation.
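A minimal sketch of the contrasted-groups comparison, using the fear-of-labor example above. The group scores are invented, and the Welch-style t statistic is one common choice of significance test (an assumption; the slides do not name a specific test). In practice the statistic would be compared against a critical value for the appropriate degrees of freedom.

```python
# Contrasted-groups sketch: compare mean scores of two groups expected to
# differ on the attribute (primiparas vs. multiparas on a hypothetical
# fear-of-labor scale). All data are invented for illustration.
from statistics import mean, stdev
from math import sqrt

def two_sample_t(high, low):
    """Welch-style t statistic for the difference between group means."""
    return (mean(high) - mean(low)) / sqrt(
        stdev(high) ** 2 / len(high) + stdev(low) ** 2 / len(low))

primipara = [38, 42, 45, 40, 44, 41]   # expected HIGH fear of labor
multipara = [25, 30, 28, 27, 31, 26]   # expected LOW fear of labor

t = two_sample_t(primipara, multipara)
# A large positive t (significant difference) is evidence for construct validity.
print(round(t, 2))
```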
NORM-REFERENCED MEASURES
Construct validity: hypothesis-testing approach
Hypotheses are derived from the theory or conceptual framework underlying the instrument.
The researcher gathers data to test the hypotheses and to determine whether the rationale underlying the instrument's construction is adequate to explain the data collected.
NORM-REFERENCED MEASURES
Construct validity: hypothesis-testing approach
According to theory, construct X is positively related to construct Y.
Instrument A is a measure of construct X; instrument B is a measure of construct Y.
Scores on A and B are correlated positively, as predicted by theory.
Therefore, it is inferred that A and B are valid measures of X and Y.
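The inference above rests on observing a positive correlation between scores on instruments A and B. A minimal Pearson r, with invented instrument scores:

```python
# Pearson product-moment correlation between scores on two instruments.
# The scores below are invented; a strong positive r is the pattern the
# hypothesis-testing approach predicts from theory.
from statistics import mean

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

a_scores = [10, 14, 12, 18, 20, 16]  # instrument A (construct X)
b_scores = [22, 27, 24, 33, 36, 30]  # instrument B (construct Y)
print(round(pearson_r(a_scores, b_scores), 3))
```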
NORM-REFERENCED MEASURES
Construct validity: multitrait-multimethod approach
Is appropriately employed whenever it is feasible to:
1. Measure two or more different constructs.
2. Use two or more different methodologies to measure each construct.
3. Administer all instruments to every subject at the same time.
4. Assume that performance on each instrument employed is independent, that is, not influenced by, biased by, or a function of performance on any other instrument.
NORM-REFERENCED MEASURES
Construct validity: multitrait-multimethod approach
Conclusions depend largely on the size and pattern of the correlations.
Trait variance is the variability in a set of scores resulting from individual differences in the trait being measured.
Method variance is variance resulting from individual differences in a subject's ability to respond appropriately to the type of measure used.
NORM-REFERENCED MEASURES
Construct validity: multitrait-multimethod approach
The reliability estimates lie on the reliability diagonal.
Convergent validity is shown on the validity diagonal.
The heterotrait-monomethod coefficients should be lower than the values on the validity diagonal (construct validity).
The heterotrait-heteromethod coefficients should be lower than the values on the validity diagonal (discriminant validity).
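The expected pattern can be checked mechanically. This toy matrix (two traits by two methods, all coefficients invented for illustration) sketches the convergent/discriminant comparison described above:

```python
# Toy multitrait-multimethod check: convergent (validity-diagonal)
# correlations should exceed heterotrait-monomethod and
# heterotrait-heteromethod correlations. Traits: anxiety (anx) and
# depression (dep); methods: self-report (self) and observer rating (obs).
# All correlation values are invented.

corr = {
    ("anx_self", "anx_obs"): 0.62,   # same trait, different method (convergent)
    ("dep_self", "dep_obs"): 0.58,   # same trait, different method (convergent)
    ("anx_self", "dep_self"): 0.30,  # different trait, same method
    ("anx_obs",  "dep_obs"): 0.28,   # different trait, same method
    ("anx_self", "dep_obs"): 0.15,   # different trait, different method
    ("dep_self", "anx_obs"): 0.12,   # different trait, different method
}

convergent = [corr[("anx_self", "anx_obs")], corr[("dep_self", "dep_obs")]]
heterotrait = [v for k, v in corr.items()
               if k[0].split("_")[0] != k[1].split("_")[0]]

# Evidence for construct and discriminant validity in this toy matrix:
print(min(convergent) > max(heterotrait))  # True
```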
NORM-REFERENCED MEASURES
Construct validity
Confirmatory factor analysis
NORM-REFERENCED MEASURES
Criterion-related validity
When one wishes to infer from a measure an individual's probable standing on some other variable or criterion, criterion-related validity is of concern.
It is the degree to which the instrument is related to an external criterion.
Check the measure against a relevant criterion.
NORM-REFERENCED MEASURES
Criterion-related validity
Two types of criterion-related validity:
Predictive validity indicates the extent to which an individual's future level of performance on a criterion can be predicted from knowledge of performance on a prior measure.
Concurrent validity refers to the extent to which a measure may be used to estimate an individual's present standing on the criterion.
NORM-REFERENCED MEASURES
Criterion-related validity: predictive validity
Looks at a measure's ability to predict something it should be able to predict.
Test → Criterion
NORM-REFERENCED MEASURES
Criterion-related validity: concurrent validity
E.g., a measure of empowerment should show higher scores for managers and lower scores for their workers.
NORM-REFERENCED MEASURES
Criterion-related validity
The difference between predictive and concurrent validity, then, is the difference in the timing of obtaining measurements on the criterion.
NORM-REFERENCED MEASURES
Criterion-related validity
Activities to obtain evidence for criterion-related validity:
correlational studies of the type and extent of the relationships between scores and external variables
studies of the extent to which scores predict future behavior, performance, or scores on measures obtained at a later point in time
studies of the effectiveness of selection, placement, and/or classification decisions made on the basis of the scores resulting from the measure
studies of differential group predictions or relationships
assessment of validity generalization
NORM-REFERENCED MEASURES
Criterion-related validity
Factors to be considered in planning and interpreting criterion-related studies relate to:
(1) the target population,
(2) the sample,
(3) the criterion,
(4) measurement reliability,
(5) the need for cross-validation.
NORM-REFERENCED ITEM-ANALYSIS PROCEDURES
Item analysis: a procedure used to further assess the validity of a measure by separately evaluating each item to determine whether or not that item discriminates in the same manner in which the overall measure is intended to discriminate.
Three item-analysis procedures are:
(1) item p level
(2) discrimination index
(3) item-response chart
NORM-REFERENCED ITEM-ANALYSIS PROCEDURES
Item p level
The p level (difficulty level) is the proportion of correct responses to an item.
It is determined by counting the number of subjects selecting the correct or desired response to a particular item and then dividing this number by the total number of subjects.
The closer the value of p is to 1.00, the easier the item; the closer p is to zero, the more difficult the item.
p levels between 0.30 and 0.70 are desirable; extremely easy or extremely difficult items have very little power to discriminate or differentiate among subjects.
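The p-level computation above, sketched with invented responses (True = correct):

```python
# Item p level: the proportion of subjects answering an item correctly.
# `responses` holds one True/False entry per subject for a single item.

def p_level(responses):
    """Proportion of subjects answering the item correctly."""
    return sum(responses) / len(responses)

responses = [True, True, False, True, False, True, True, False, True, True]
p = p_level(responses)
print(p)                   # 0.7
print(0.30 <= p <= 0.70)   # within the desirable range -> True
```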
NORM-REFERENCED ITEM-ANALYSIS PROCEDURES
Discrimination index
The discrimination index (D) assesses an item's ability to discriminate.
If performance on a given item is a good predictor of performance on the overall measure, the item is said to be a good discriminator.
NORM-REFERENCED ITEM-ANALYSIS PROCEDURES
Discrimination index
To determine the D value for a given item:
1. Rank all subjects' performance on the measure by using total scores from high to low.
2. Identify those individuals who ranked in the upper 25%.
3. Identify those individuals who ranked in the lower 25%.
4. Place the remaining scores aside.
5. Determine the proportion of respondents in the top 25% who answered the item correctly (Pu).
6. Determine the proportion of respondents in the lower 25% who answered the item correctly (PL).
7. Calculate D by subtracting PL from Pu.
8. Repeat steps 5 through 7 for each item on the measure.
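Steps 1 through 8 can be sketched as follows for a single item. The total scores and item responses are invented; with 12 subjects, the upper and lower 25% are the top and bottom 3 scorers.

```python
# Discrimination index D = Pu - PL for one item.
# `scores` are total test scores; `item_correct` marks whether each subject
# answered this particular item correctly (1) or not (0).

def discrimination_index(scores, item_correct):
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    k = len(scores) // 4                      # size of the upper/lower 25%
    upper, lower = order[:k], order[-k:]
    p_u = sum(item_correct[i] for i in upper) / k
    p_l = sum(item_correct[i] for i in lower) / k
    return p_u - p_l

scores       = [95, 90, 88, 85, 70, 65, 60, 55, 40, 35, 30, 25]
item_correct = [1,  1,  1,  0,  1,  0,  1,  0,  0,  1,  0,  0]

# Pu = 3/3, PL = 1/3, so D = 2/3: a good discriminator (> +0.20).
print(discrimination_index(scores, item_correct))
```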
NORM-REFERENCED ITEM-ANALYSIS PROCEDURES
Discrimination index
D values range from -1.00 to +1.00.
D values greater than +0.20 are desirable for a norm-referenced measure.
A positive D value is desirable and indicates that the item is discriminating in the same manner as the total test.
A negative D value suggests that the item is not discriminating in the same way as the total test.
NORM-REFERENCED ITEM-ANALYSIS PROCEDURES
Item-response chart
Like D, the item-response chart assesses an item's ability to discriminate.
The respondents ranking in the upper and lower 25% are identified as in steps 1 through 4 for determining D.
Responses are cross-tabulated in two categories: high/low scorers and correct/incorrect responses for a given item.
Chi-square: a value as large as or larger than 3.84 for a chi-square with one degree of freedom is significant at the 0.05 level.
A significant value means a significant difference exists in the proportion of high and low scorers who have correct responses. Items that meet this criterion should be retained, while those that do not should be discarded or modified to improve their ability to discriminate.
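The item-response chart reduces to a 2x2 table, so the chi-square can be computed directly with the standard shortcut formula for a 2x2 table; the counts below are invented.

```python
# Chi-square (1 df) for a 2x2 item-response chart:
# rows = high/low scorers, columns = correct/incorrect on the item.

def chi_square_2x2(a, b, c, d):
    """a,b = high scorers correct/incorrect; c,d = low scorers correct/incorrect."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Invented counts: 20 high scorers (16 correct), 20 low scorers (6 correct).
chi2 = chi_square_2x2(16, 4, 6, 14)
print(round(chi2, 2))
print(chi2 >= 3.84)   # significant at the 0.05 level with 1 df -> retain the item
```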
CRITERION-REFERENCED VALIDITY ASSESSMENT
The validity of a criterion-referenced measure can be analyzed to ascertain if the measure functions in a manner consistent with its purposes.
Validity in terms of criterion-referenced interpretations relates to the extent to which scores result in the accurate classification of objects in regard to their domain status.
CRITERION-REFERENCED VALIDITY ASSESSMENT
Three aspects:
Content validity
Construct validity
Criterion-related validity
CRITERION-REFERENCED VALIDITY ASSESSMENT
Content validity
Focuses on the representativeness of a cluster of items in relation to the specified content domain.
For a measure to provide a clear description of domain status, the content domain must be consistent with its domain specifications or objective.
Content validity is a prerequisite for all other types of validity.
The a posteriori content-validity approach in criterion-referenced measurement uses content specialists to assess the quality and representativeness of the items within the test for measuring the content domain.
CRITERION-REFERENCED Validity Assessment by Content Specialists
Specialists should be conversant with the domain treated in the measuring tool.
Two or more content specialists are employed.
An item-objective congruence measure is applied (item level).
If more than one objective is used for a measure, the items that are measures of each objective usually are treated as separate tests when interpreting the results of validity assessments.
CRITERION-REFERENCED Validity Assessment by Content Specialists
Determination of interrater agreement
Average congruency percentage
Validity Assessment by Content Specialists
Determination of interrater agreement
Content specialists are provided with the conceptual definition of the variable(s) to be measured with the set of items.
The content specialists then independently rate the relevance of each item to the specified content domain.
P0 (observed agreement) ≥ 0.80; K (kappa) ≥ 0.25.
The index of content validity (CVI).
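A sketch of the two indices named above: P0 as the observed proportion of agreement and K as Cohen's kappa, a standard chance-corrected agreement index (an assumption here, since the slides do not spell out the formula). Ratings are invented; 1 = relevant, 0 = not relevant.

```python
# Interrater agreement for two content specialists rating item relevance:
# P0 = observed proportion of agreement;
# K  = (P0 - Pc) / (1 - Pc), with Pc the chance agreement expected from
#      each rater's marginal proportions (Cohen's kappa).
from collections import Counter

def p0_and_kappa(r1, r2):
    n = len(r1)
    p0 = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    pc = sum((c1[cat] / n) * (c2[cat] / n) for cat in set(r1) | set(r2))
    return p0, (p0 - pc) / (1 - pc)

r1 = [1, 1, 1, 1, 0, 1, 1, 0, 1, 1]   # specialist 1 (invented)
r2 = [1, 1, 1, 0, 0, 1, 1, 0, 1, 1]   # specialist 2 (invented)
p0, kappa = p0_and_kappa(r1, r2)
print(p0 >= 0.80 and kappa >= 0.25)   # meets both thresholds -> True
```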
Validity Assessment by Content Specialists
Determination of interrater agreement
If P0 and K, or either of these values, is too low, one problem or a combination of two problems could be operating.
First, the items lack homogeneity, are ambiguous, or the domain is not well defined.
E.g., agreement proportions of 0.67 (20 out of 30), 0.50 (15 out of 30), and 0.60 (18 out of 30).
If the majority of the item writers had at least one item that was judged not/somewhat relevant (1 or 2) by the three content specialists, this would be support for a lack of clarity in the domain definition.
Validity Assessment by Content Specialists
Determination of interrater agreement
Second, the problem may be due to the raters: they interpret the rating-scale labels differently or use the rating scale differently.
E.g., agreement proportions of 0.90 (27 out of 30), 0.93 (28 out of 30), and 0.93 (28 out of 30).
Each of the items judged to be unlike the rest had been prepared by one item writer. In this case the flaw is not likely to be in the domain definition as specified, but in the interpretations of one item writer.
Validity Assessment by Content Specialists
Determination of interrater agreement
Refinement of the domain specifications is required in the first case.
If the latter is the problem, the raters are given more explicit directions and guidelines in the use of the scale to reduce the chance of differential use.
A clear and precise domain definition is essential: domain specifications communicate what the results of measurements mean to those people who must interpret them, and what types of items and content should be included in the measure to those people who must construct the items.
CRITERION-REFERENCED Validity Assessment
Average congruency percentage
Content specialists judge the congruence of each item on a measure.
The proportion of items rated congruent by each judge is calculated and converted to a percentage.
Then the mean percentage for all judges is calculated to obtain the average congruency percentage.
E.g., if the percentages of congruent items for the judges are 95, 90, 100, and 100%, the average congruency percentage would be 96.25%.
A value of 90 percent or higher can safely be considered acceptable.
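The worked example above, computed directly:

```python
# Average congruency percentage: the mean of each judge's percentage of
# items rated congruent.

def average_congruency(percentages):
    return sum(percentages) / len(percentages)

judges = [95, 90, 100, 100]               # percent congruent per judge
acp = average_congruency(judges)
print(acp)             # 96.25
print(acp >= 90)       # acceptable -> True
```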
CRITERION-REFERENCED CONSTRUCT VALIDITY
Evidence of content validity is no guarantee that the measure is useful for its intended purpose.
"We may say that a test's results are accurately descriptive of the domain of behaviors it is supposed to measure; it is quite another thing to say that the function to which you wish to put a descriptively valid test is appropriate" (Popham, 1978, p. 159).
The major focus of construct validation is to establish support for the measure's ability to accurately categorize phenomena in accordance with the purpose for which the measure is being used.
CRITERION-REFERENCED CONSTRUCT VALIDITY
Approaches used to assess construct validity:
Experimental methods and the contrasted-groups approach
Decision validity
CONSTRUCT VALIDITY
Experimental methods and the contrasted-groups approach
The basic principles and procedures for these two approaches are the same for criterion-referenced measures as for norm-referenced measures.
CONSTRUCT VALIDITY
Decision validity
(1) A student may be allowed to progress to the next unit of instruction if test results indicate that the preceding unit has been mastered.
(2) A woman in early labor may be allowed to ambulate if the nurse assesses, on pelvic examination, that the fetal head is engaged (as opposed to unengaged) in the pelvis.
(3) A diabetic patient may be allowed to go home if the necessary skills for self-care have been mastered.
CONSTRUCT VALIDITY
Decision validity
The measurements obtained from criterion-referenced measures are often used to make decisions.
"Criterion-referenced tests have emerged as instruments that provide data via which mastery decisions can be made, as opposed to providing the decision itself" (Hashway, 1998, p. 112).
The decision validity of a measure is supported when the set standard(s) or criterion classifies subjects or objects with a high level of confidence.
CONSTRUCT VALIDITY
Decision validity
In most instances, two criterion groups (low and high) are used to test the decision validity of a measure.
E.g., "by summing the percentage who exceed the performance standard and the percentage who did not."
Decision validity can range from 0 to 100%, with high percentages reflecting high decision validity.
Criterion groups for testing the decision validity of a measure also can be created.
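A sketch of the two-criterion-group check, with invented scores and cut-score. Since the slide gives a 0-100% range, decision validity is computed here as the overall percentage of correct classifications across both groups; treat that reading as an assumption.

```python
# Decision validity sketch: subjects known to be high (masters) should score
# at or above the cut-score, and subjects known to be low should score below
# it. Reported as the overall percentage of correct classifications.

def decision_validity(high_scores, low_scores, cut):
    correct = (sum(s >= cut for s in high_scores)
               + sum(s < cut for s in low_scores))
    return 100 * correct / (len(high_scores) + len(low_scores))

masters    = [85, 90, 78, 88, 95, 74]   # expected to meet the standard
nonmasters = [60, 72, 55, 81, 58, 49]   # expected not to meet it

# 10 of 12 subjects are classified correctly at a cut-score of 75.
print(round(decision_validity(masters, nonmasters, cut=75), 1))
```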
CONSTRUCT VALIDITY
Decision validity
Decision validity is influenced by:
the quality of the measure
the appropriateness of the criterion groups
the characteristics of the subjects
the level of performance or cut-score required
CRITERION-REFERENCED Criterion-Related Validity
Criterion-related validity studies of criterion-referenced measures are conducted in the same manner as for norm-referenced measures.
CRITERION-REFERENCED ITEM-ANALYSIS PROCEDURES
Content specialists' ratings hold the most merit for assessing item validities and for determining which items should be retained or discarded.
Empirical item-discrimination indices should be used primarily to detect aberrant items in need of revision or correction.
Empirical Item-Analysis Procedures
Criterion-referenced item-analysis procedures determine the effectiveness of a specific test item in discriminating between subjects who have acquired the target behavior and those who have not.
Two approaches are used for item-analysis procedures:
(1) the criterion-groups technique, which also may be referred to as the uninstructed/instructed-groups approach
(2) the pretreatment/post-treatment measures approach, which in appropriate instances may be called the preinstruction/postinstruction measurements approach
Advantages and disadvantages
The criterion-groups technique is highly practical. One disadvantage is the difficulty of defining criteria for identifying the groups; another is the requirement of equivalence of the groups.
The pretreatment/post-treatment measures approach allows analysis of individual as well as group gains. Its disadvantages are impracticality, the amount of time that may be required, and a potential problem with testing effects.
CRITERION-REFERENCED ITEM-ANALYSIS PROCEDURES
Three item-analysis procedures are:
(1) item-objective or item-subscale congruence
(2) item difficulty
(3) discrimination index
ITEM-ANALYSIS PROCEDURES
Item-objective or item-subscale congruence
Provides an index of the validity of an item based on the ratings of two or more content specialists.
In this method, content specialists are directed to assign a value of +1, 0, or -1 to each item:
If an item definitely measures the objective or subscale, a value of +1 is assigned.
A rating of 0 indicates that the judge is undecided about the item.
The assignment of a -1 rating reflects a definite judgment that the item is not a measure of the objective or subscale.
ITEM-ANALYSIS PROCEDURES
Item-objective or item-subscale congruence
The limits of the index range from -1.00 to +1.00.
An index of +1.00 will occur when perfect positive item-objective or item-subscale congruence exists, that is, when all content specialists assign a +1 to the item for its related objective or subscale and a -1 to the item for all other objectives or subscales that are measured by the tool.
An index of -1.00 represents the worst possible value of the index and occurs when all content specialists assign a -1 to the item for what was expected to be its related objective or subscale and a +1 to the item for all other objectives or subscales.
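The +1/0/-1 ratings can be combined into the index described above. This sketch uses the Rovinelli-Hambleton formulation of the item-objective congruence index, a standard choice but an assumption here, since the slides do not state the formula. `mean_ratings[k]` is the mean specialist rating of one item against objective k.

```python
# Item-objective congruence index (Rovinelli-Hambleton form, assumed):
#   I = N / (2N - 2) * (mean rating on the target objective - grand mean),
# where N is the number of objectives. Objective 0 is the item's intended
# objective in the examples below.

def item_objective_congruence(mean_ratings, target=0):
    n = len(mean_ratings)                 # number of objectives (must be >= 2)
    grand_mean = sum(mean_ratings) / n
    return n / (2 * n - 2) * (mean_ratings[target] - grand_mean)

# Perfect congruence: every specialist gives +1 on the intended objective
# and -1 on every other objective.
print(item_objective_congruence([1.0, -1.0, -1.0]))   # 1.0
# Worst case: the reverse pattern.
print(item_objective_congruence([-1.0, 1.0, 1.0]))    # -1.0
```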
ITEM-ANALYSIS PROCEDURES
Item-objective or item-subscale congruence
The index does not depend on the number of content specialists used or on the number of objectives measured by the test or questionnaire.
However, the tool must include more than one objective or subscale in order for this procedure to be used.
A cut-off score is derived by the test developer; this is done by creating the poorest set of content specialists' ratings that would still be acceptable.
Items below the cut-off score are considered nonvalid and are discarded from the measure or analyzed and revised to improve their validity.
Items above the cut-off score are considered valid.
ITEM-ANALYSIS PROCEDURES
Item difficulty
The purpose is to examine the difficulty level of items and to compare them between criterion groups.
The approaches to calculating item p levels and their interpretation were discussed earlier.
The item p level should be higher for the group known to possess more of the specified trait or attribute than for the group known to possess less.
ITEM-ANALYSIS PROCEDURES
Item discrimination
The focus is on the measurement of performance changes (e.g., pretest/posttest) or differences (e.g., experienced/inexperienced) between the criterion groups.
Also referred to as D.
Item discrimination is directly related to the property of decision validity: items with high positive discrimination indices improve the decision validity of a test.
ITEM-ANALYSIS PROCEDURES
Item discrimination
Criterion-groups difference index (CGDI)
Pretreatment/post-treatment measurements approach indices
ITEM-ANALYSIS PROCEDURES
Item discrimination
The criterion-groups difference index (CGDI) is the proportion of respondents in the group known to possess more of the trait or attribute of interest who answered the item correctly, minus the proportion of respondents in the group known to have less of the trait or attribute who answered it appropriately or correctly.
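The CGDI definition above, computed for one item with invented criterion-group responses (1 = correct):

```python
# CGDI = proportion correct in the "more of the trait" group
#      - proportion correct in the "less of the trait" group.

def cgdi(more_group, less_group):
    p_more = sum(more_group) / len(more_group)
    p_less = sum(less_group) / len(less_group)
    return p_more - p_less

experienced   = [1, 1, 1, 0, 1, 1, 1, 1]   # group known to have MORE of the trait
inexperienced = [0, 1, 0, 0, 1, 0, 0, 1]   # group known to have LESS of the trait

print(cgdi(experienced, inexperienced))  # 0.875 - 0.375 = 0.5
```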
ITEM-ANALYSIS PROCEDURES
Item discrimination
Pretreatment/post-treatment measurements approach: three item-discrimination indices are
(1) pretest/posttest difference
(2) individual gain
(3) net gain
ITEM-ANALYSIS PROCEDURES
Item discrimination
The pretest/posttest difference index (PPDI) is the proportion of respondents who answered the item correctly on the posttest minus the proportion who responded to the item correctly on the pretest.
The individual gain index (IGI) is the proportion of respondents who answered the item incorrectly on the pretest and correctly on the posttest.
The net gain index (NGI) is the proportion of respondents who answered the item incorrectly on both occasions subtracted from the IGI.
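The three indices defined above, computed for one item from paired pre/post responses (True = correct; data invented):

```python
# PPDI, IGI, and NGI for one item, from paired pretest/posttest responses.

def gain_indices(pre, post):
    n = len(pre)
    ppdi = sum(post) / n - sum(pre) / n                       # post % - pre %
    igi = sum((not a) and b for a, b in zip(pre, post)) / n   # wrong -> right
    both_wrong = sum((not a) and (not b) for a, b in zip(pre, post)) / n
    ngi = igi - both_wrong                                    # IGI minus wrong-both
    return ppdi, igi, ngi

pre  = [False, False, True,  False, False, True,  False, False]
post = [True,  True,  True,  True,  False, True,  True,  False]

ppdi, igi, ngi = gain_indices(pre, post)
print(ppdi, igi, ngi)  # 0.5 0.5 0.25
```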
ITEM-ANALYSIS PROCEDURES
Item discrimination
The NGI provides the most conservative estimate of item discrimination and uses the most information.
The range of values for each of the indices discussed above is -1.00 to +1.00, except for the IGI, which has a range of 0 to +1.00.
A high positive value for each of these item-discrimination indices is desirable.