Validity of Measure Semi

  • 8/8/2019 Validity of Measure Semi

    1/77

    Measurement in nursing and health research

    Validity of measures

    Presented by: Wesam Almagharbeh

    Supervised by: Muayyad Ahmad, PhD, RN

  • 8/8/2019 Validity of Measure Semi

    2/77

    Introduction: Measurement

    The assignment of numbers to represent the amount of an attribute present in an object or person, using specific rules.

    L. L. Thurstone: Whatever exists, exists in some amount and can be measured.

    The rules for measuring temperature, weight, and other physical attributes are widely known and accepted.

    Rules for measuring many variables, however, have to be invented, e.g., rules for measuring pain, satisfaction, and depression.

  • 8/8/2019 Validity of Measure Semi

    3/77

    Measurement

    Rules specify the criteria according to which numeric values are to be assigned to the characteristic of interest.

    In measuring attributes, researchers strive to use good, meaningful rules.

    With a new instrument, researchers seldom know in advance if their rules are the best possible.

  • 8/8/2019 Validity of Measure Semi

    4/77

    Key Criteria for Evaluating Quantitative Measures

    Reliability

    Validity

  • 8/8/2019 Validity of Measure Semi

    5/77

    Validity

    Validity refers to the extent to which a measure achieves the purpose for which it was intended.

    "Validity is a unitary concept. It is the degree to which evidence and theory support the interpretation entailed by proposed use of tests" (AERA, NCME, 1985, 1999)

  • 8/8/2019 Validity of Measure Semi

    6/77

    The type of validity information to be obtained depends upon the aims or purposes for the measure rather than upon the type of measure.

  • 8/8/2019 Validity of Measure Semi

    7/77

    Two frameworks of measurement

    Norm-referenced measures are employed when the interest is in evaluating a subject's performance relative to the performance of other subjects in some well-defined comparison group.

    The focus is on the variance between subjects' performances.

    Criterion-referenced measures are employed when the interest is in determining a subject's performance relative to, or whether or not the subject has acquired, a predetermined set of target behaviors.

    The focus is on the variance between the subject's performance and the predetermined set of behaviors (process and outcome variables).

  • 8/8/2019 Validity of Measure Semi

    8/77

    NORM-REFERENCED

    VALIDITY PROCEDURES

    Four aspects:

    Content validity

    Face (logical) validity

    Construct validity

    Criterion-related validity

  • 8/8/2019 Validity of Measure Semi

    9/77

    NORM-REFERENCED MEASURES

    Content validity

    Its focus is on:

    Determining whether or not the items sampled for inclusion on the tool adequately represent the domain of content addressed by the instrument.

    The relevance of the content domain to the proposed interpretation of scores obtained when the measure is employed.

    Important for all measures (especially instruments designed to assess cognition).

  • 8/8/2019 Validity of Measure Semi

    10/77

    NORM-REFERENCED MEASURES

    Content validity

    Procedures: experts judge the specific items in terms of their relevance, sufficiency, and clarity in representing the concepts underlying the measure's development.

    When two judges are employed, the content validity index (CVI) is used (the proportion of items given a rating of quite/very relevant by both raters).

    When more than two experts rate the items on a measure, the alpha coefficient is used:

    0 indicates lack of agreement

    1.00 indicates complete agreement
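
    The two-judge CVI described on this slide can be sketched in a few lines of Python. This is only an illustration: it assumes a 4-point relevance scale on which ratings of 3 (quite relevant) and 4 (very relevant) count as relevant, and the function name and sample ratings are hypothetical, not taken from the presentation.

```python
def content_validity_index(rater_a, rater_b, relevant_min=3):
    """Proportion of items rated at least `relevant_min` by BOTH raters.

    Assumes each list holds one rating per item on a 1-4 relevance scale
    (an assumption; the slides only say "quite/very relevant").
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    both_relevant = sum(
        1 for a, b in zip(rater_a, rater_b)
        if a >= relevant_min and b >= relevant_min
    )
    return both_relevant / len(rater_a)


# Hypothetical ratings for a 10-item tool (1 = not relevant ... 4 = very relevant)
judge_1 = [4, 3, 4, 2, 4, 3, 3, 4, 1, 4]
judge_2 = [3, 4, 4, 3, 4, 2, 3, 4, 2, 4]
cvi = content_validity_index(judge_1, judge_2)
print(f"CVI = {cvi:.2f}")
```

    Here seven of the ten items are rated 3 or 4 by both judges, so the sketch reports a CVI of 0.70.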

  • 8/8/2019 Validity of Measure Semi

    13/77

    NORM-REFERENCED MEASURES

    Content validity

    Content validity depends largely on:

    Selection, preparation, and use of experts

    Optimal number of experts

  • 8/8/2019 Validity of Measure Semi

    14/77

    NORM-REFERENCED MEASURES

    Face (logical) validity

    Face validity is not validity in the true sense and refers only to the appearance of the instrument to the layperson.

    When it is present, it does not provide evidence for validity, that is, evidence that the instrument actually measures what it purports to measure.

  • 8/8/2019 Validity of Measure Semi

    15/77

    NORM-REFERENCED MEASURES

    Construct validity

    Refers to the extent to which an individual, event, or object actually possesses the characteristics being measured by the instrument.

    The primary concern is the extent to which relationships among items included in the measure are consistent with the theory and concepts as operationally defined.

    The more abstract the concept, the more difficult it is to establish the construct validity of the measure.

  • 8/8/2019 Validity of Measure Semi

    17/77

    Some Methods of Assessing Construct Validity

    Contrasted groups approach

    Hypothesis testing approach

    Multitrait-multimethod approach

  • 8/8/2019 Validity of Measure Semi

    18/77

    NORM-REFERENCED MEASURES

    Construct validity

    Contrasted groups approach

    The instrument is administered to groups expected to differ on the critical attribute because of some known characteristic (i.e., expected to be extremely high and extremely low in the characteristic being measured).

    E.g., fear of labor experiences between primiparas and multiparas.

    If there is a significant difference between the mean scores: evidence for construct validity.

    If there is no significant difference, three possibilities exist:

    (1) the test is unreliable;

    (2) the test is reliable, but not a valid measure of the characteristic;

    (3) the constructor's conception of the construct of interest is faulty and needs reformulation.

  • 8/8/2019 Validity of Measure Semi

    19/77

    NORM-REFERENCED MEASURES

    Construct validity

    Hypothesis testing approach

    The investigator states hypotheses according to the theory or conceptual framework,

    gathers data to test the hypotheses,

    and infers whether the rationale underlying the instrument's construction is adequate to explain the data collected.

  • 8/8/2019 Validity of Measure Semi

    20/77

    NORM-REFERENCED MEASURES

    Construct validity

    Hypothesis testing approach

    According to theory, construct X is positively related to construct Y.

    Instrument A is a measure of construct X; instrument B is a measure of construct Y.

    Scores on A and B are correlated positively, as predicted by theory.

    Therefore, it is inferred that A and B are valid measures of X and Y.

  • 8/8/2019 Validity of Measure Semi

    21/77

    NORM-REFERENCED MEASURES

    Construct validity

    Multitrait-multimethod approach

    Is appropriately employed whenever it is feasible to:

    1. Measure two or more different constructs

    2. Use two or more different methodologies to measure each construct

    3. Administer all instruments to every subject at the same time

    4. Assume that performance on each instrument employed is independent, that is, not influenced by, biased by, or a function of performance on any other instrument

  • 8/8/2019 Validity of Measure Semi

    22/77

    NORM-REFERENCED MEASURES

    Construct validity

    Multitrait-multimethod approach

    Depends largely on the size and pattern of the correlations, which reflect two sources of variance:

    Trait variance is the variability in a set of scores resulting from individual differences in the trait being measured.

    Method variance is variance resulting from individual differences in a subject's ability to respond appropriately to the type of measure used.

  • 8/8/2019 Validity of Measure Semi

    24/77

    NORM-REFERENCED MEASURES

    Construct validity

    Multitrait-multimethod approach

    The reliability estimates (reliability diagonal)

    Convergent validity (validity diagonal)

    The heterotrait-monomethod coefficients should be lower than the values on the validity diagonal (construct validity).

    The heterotrait-heteromethod coefficients should be lower than the values on the validity diagonal (discriminant validity).

  • 8/8/2019 Validity of Measure Semi

    26/77

    NORM-REFERENCED MEASURES

    Construct validity

    CONFIRMATORY FACTOR ANALYSIS

  • 8/8/2019 Validity of Measure Semi

    27/77

    NORM-REFERENCED MEASURES

    Criterion-related validity

    When one wishes to infer from a measure an individual's probable standing on some other variable or criterion, criterion-related validity is of concern.

    The degree to which the instrument is related to an external criterion.

    Check the measure against a relevant criterion.

  • 8/8/2019 Validity of Measure Semi

    28/77

    NORM-REFERENCED MEASURES

    Criterion-related validity

    Two types of criterion-related validity:

    Predictive validity indicates the extent to which an individual's future level of performance on a criterion can be predicted from knowledge of performance on a prior measure.

    Concurrent validity refers to the extent to which a measure may be used to estimate an individual's present standing on the criterion.

  • 8/8/2019 Validity of Measure Semi

    29/77

    NORM-REFERENCED MEASURES

    Criterion-related validity

    Predictive Validity

    Look at the measure's ability to predict something it should be able to predict.

    Test → Criterion

  • 8/8/2019 Validity of Measure Semi

    31/77

    NORM-REFERENCED MEASURES

    Criterion-related validity

    Concurrent Validity

    A measure of empowerment should show higher scores for managers and lower scores for their workers.

  • 8/8/2019 Validity of Measure Semi

    32/77

    NORM-REFERENCED MEASURES

    Criterion-related validity

    The difference between predictive and concurrent validity, then, is the difference in the timing of obtaining measurements on a criterion.

  • 8/8/2019 Validity of Measure Semi

    33/77

    NORM-REFERENCED MEASURES

    Criterion-related validity

    Activities to obtain evidence for criterion-related validity:

    correlation studies of the type and extent of the relationships between scores and external variables

    studies of the extent to which scores predict future behavior, performance, or scores on measures obtained at a later point in time

    studies of the effectiveness of selection, placement, and/or classification decisions on the basis of the scores resulting from the measure

    studies of differential group predictions or relationships

    assessment of validity generalization

  • 8/8/2019 Validity of Measure Semi

    34/77

    NORM-REFERENCED MEASURES

    Criterion-related validity

    Factors to be considered in planning and interpreting criterion-related studies relate to:

    (1) the target population,

    (2) the sample,

    (3) the criterion,

    (4) measurement reliability,

    (5) the need for cross-validation

  • 8/8/2019 Validity of Measure Semi

    35/77

    NORM-REFERENCED

    ITEM-ANALYSIS PROCEDURES

    Item analysis: a procedure used to further assess the validity of a measure by separately evaluating each item to determine whether or not that item discriminates in the same manner in which the overall measure is intended to discriminate.

    Three item-analysis procedures are:

    (1) item p level

    (2) discrimination index

    (3) item-response chart

  • 8/8/2019 Validity of Measure Semi

    36/77

    NORM-REFERENCED

    ITEM-ANALYSIS PROCEDURES

    Item p level

    The p level (the difficulty level) is the proportion of correct responses to that item.

    It is determined by counting the number of subjects selecting the correct or desired response to a particular item and then dividing this number by the total number of subjects.

    The closer the value of p is to 1.00, the easier the item; the closer p is to zero, the more difficult the item.

    p levels between 0.30 and 0.70 are desirable.

    Extremely easy or extremely difficult items have very little power to discriminate or differentiate among subjects.
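
    The p-level calculation on this slide can be sketched as follows. The function names and the small response matrix are hypothetical; each response is assumed to be scored 1 (correct or desired) or 0 (otherwise).

```python
def item_p_levels(responses):
    """Return the p level (proportion correct) of every item.

    `responses` is a list with one entry per subject; each entry is a
    list of 0/1 item scores (an assumed data layout).
    """
    n_subjects = len(responses)
    n_items = len(responses[0])
    return [
        sum(subject[i] for subject in responses) / n_subjects
        for i in range(n_items)
    ]


def items_outside_desirable_range(p_levels, low=0.30, high=0.70):
    """Indices of items whose p level falls outside the 0.30-0.70 range."""
    return [i for i, p in enumerate(p_levels) if not (low <= p <= high)]


# Hypothetical data: 5 subjects, 3 items
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
]
p_levels = item_p_levels(scores)
flagged = items_outside_desirable_range(p_levels)
print(p_levels, flagged)
```

    In this made-up data the first item is answered correctly by everyone (p = 1.0, too easy) and the third by one subject in five (p = 0.2, too difficult), so both are flagged for review.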

  • 8/8/2019 Validity of Measure Semi

    37/77

    NORM-REFERENCED

    ITEM-ANALYSIS PROCEDURES

    Discrimination Index

    The discrimination index (D) assesses an item's ability to discriminate.

    If performance on a given item is a good predictor of performance on the overall measure, the item is said to be a good discriminator.

  • 8/8/2019 Validity of Measure Semi

    38/77

    NORM-REFERENCED

    ITEM-ANALYSIS PROCEDURES

    Discrimination Index

    To determine the D value for a given item:

    1. Rank all subjects' performance on the measure by using total scores from high to low.

    2. Identify those individuals who ranked in the upper 25%.

    3. Identify those individuals who ranked in the lower 25%.

    4. Place the remaining scores aside.

    5. Determine the proportion of respondents in the top 25% who answered the item correctly (PU).

    6. Determine the proportion of respondents in the lower 25% who answered the item correctly (PL).

    7. Calculate D by subtracting PL from PU.

    8. Repeat steps 5 through 7 for each item on the measure.
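
    The eight steps above can be sketched in Python. This is a minimal illustration under stated assumptions: responses are scored 0/1, the group size is 25% of the sample rounded down (at least one subject), and ties at the group boundaries are broken by the original order of the data. The function name and data are hypothetical.

```python
def discrimination_index(responses, item, fraction=0.25):
    """D = PU - PL for one item, following steps 1-7."""
    # Step 1: rank subjects by total score, high to low
    ranked = sorted(responses, key=sum, reverse=True)
    # Steps 2-4: keep the upper and lower groups, set the rest aside
    group_size = max(1, int(len(ranked) * fraction))
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    # Steps 5-6: proportion answering the item correctly in each group
    p_upper = sum(subject[item] for subject in upper) / group_size
    p_lower = sum(subject[item] for subject in lower) / group_size
    # Step 7: subtract PL from PU
    return p_upper - p_lower


# Hypothetical data: 8 subjects, 4 items
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
# Step 8: repeat for every item
d_values = [discrimination_index(scores, i) for i in range(4)]
print(d_values)
```

    With eight subjects the upper and lower groups hold two subjects each. The first three items separate the groups perfectly (D = 1.0), while the fourth is answered correctly by one subject in each group (D = 0.0) and does not discriminate.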

  • 8/8/2019 Validity of Measure Semi

    39/77

    NORM-REFERENCED

    ITEM-ANALYSIS PROCEDURES

    Discrimination Index

    D values range from -1.00 to +1.00.

    D values greater than +0.20 are desirable for a norm-referenced measure.

    A positive D value is desirable and indicates that the item is discriminating in the same manner as the total test.

    A negative D value suggests that the item is not discriminating in the same way as the total test.

  • 8/8/2019 Validity of Measure Semi

    40/77

    NORM-REFERENCED

    ITEM-ANALYSIS PROCEDURES

    Item Response Chart

    Like D, the item-response chart assesses an item's ability to discriminate.

    The respondents ranking in the upper and lower 25% are identified as in steps 1 through 4 for determining D.

    Responses are then cross-tabulated using the two categories, high/low scorers and correct/incorrect, for a given item.

    Chi-square: a value as large as or larger than 3.84 for a chi-square with one degree of freedom is significant at the 0.05 level.

    This means a significant difference exists in the proportion of high and low scorers who have correct responses. Items that meet this criterion should be retained, while those that do not should be discarded or modified to improve their ability to discriminate.
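
    The chi-square check for an item-response chart can be sketched as below. This assumes the standard Pearson chi-square for a 2 x 2 table without a continuity correction (the slides do not say whether a correction is applied) and uses the 0.05 critical value for one degree of freedom, 3.841. The function name and counts are hypothetical.

```python
CHI2_CRITICAL_DF1_05 = 3.841  # critical value, 1 degree of freedom, alpha = 0.05


def item_response_chi_square(high_correct, high_incorrect, low_correct, low_incorrect):
    """Pearson chi-square (no continuity correction) for a 2 x 2 item-response chart."""
    a, b, c, d = high_correct, high_incorrect, low_correct, low_incorrect
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column of the chart must contain responses")
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical chart: 20 high scorers (18 correct), 20 low scorers (8 correct)
chi2 = item_response_chi_square(18, 2, 8, 12)
retain_item = chi2 >= CHI2_CRITICAL_DF1_05
print(f"chi-square = {chi2:.2f}, retain item: {retain_item}")
```

    For these made-up counts the statistic is about 10.99, which exceeds 3.841, so the item would be retained.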

  • 8/8/2019 Validity of Measure Semi

    42/77

    CRITERION-REFERENCED

    VALIDITY ASSESSMENT

    The validity of a criterion-referenced measure can be analyzed to ascertain if the measure functions in a manner consistent with its purposes.

    Validity in terms of criterion-referenced interpretations relates to the extent to which scores result in the accurate classification of objects in regard to their domain status.

  • 8/8/2019 Validity of Measure Semi

    43/77

    CRITERION-REFERENCED

    VALIDITY ASSESSMENT

    Three aspects:

    Content validity

    Construct validity

    Criterion-related validity

  • 8/8/2019 Validity of Measure Semi

    44/77

    CRITERION-REFERENCED

    VALIDITY ASSESSMENT

    Content Validity

    The focus is on the representativeness of a cluster of items in relation to the specified content domain.

    For a measure to provide a clear description of domain status, the content domain must be consistent with its domain specifications or objective.

    Content validity is a prerequisite for all other types of validity.

    The a posteriori content validity approach in criterion-referenced measurement uses content specialists to assess the quality and representativeness of the items within the test for measuring the content domain.

  • 8/8/2019 Validity of Measure Semi

    45/77

    CRITERION-REFERENCED

    Validity Assessment by Content Specialists

    Specialists should be conversant with the domain treated in the measuring tool.

    Two or more content specialists are employed.

    Item-objective congruence measure (item level)

    If more than one objective is used for a measure, the items that are measures of each objective usually are treated as separate tests when interpreting the results of validity assessments.

  • 8/8/2019 Validity of Measure Semi

    46/77

    CRITERION-REFERENCED

    Validity Assessment by Content Specialists

    Determination of Interrater Agreement

    Average Congruency Percentage

  • 8/8/2019 Validity of Measure Semi

    47/77

    Validity Assessment by Content Specialists

    Determination of Interrater Agreement

    Content specialists are provided with the conceptual definition of the variable(s) to be measured with the set of items.

    The content specialists then independently rate the relevance of each item to the specified content domain.

    P0 ≥ 0.80,

    K ≥ 0.25.

    The index of content validity (CVI)
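
    The two agreement statistics named on this slide can be sketched as follows, assuming P0 is the proportion of items on which two raters agree and K is Cohen's kappa, computed after collapsing a 4-point relevance scale into relevant (3 or 4) versus not relevant (1 or 2). The slides do not give the formulas, so the function name, the collapsing rule, and the data are illustrative assumptions.

```python
def interrater_agreement(rater_a, rater_b, relevant_min=3):
    """Return (P0, K) for two raters after dichotomizing their ratings."""
    n = len(rater_a)
    a = [rating >= relevant_min for rating in rater_a]
    b = [rating >= relevant_min for rating in rater_b]
    # P0: observed proportion of items on which the raters agree
    p0 = sum(x == y for x, y in zip(a, b)) / n
    # Pc: agreement expected by chance from each rater's marginal proportions
    pa, pb = sum(a) / n, sum(b) / n
    pc = pa * pb + (1 - pa) * (1 - pb)
    if pc == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (p0 - pc) / (1 - pc)
    return p0, kappa


# Hypothetical ratings of 10 items by two content specialists
specialist_1 = [4, 3, 4, 2, 4, 3, 3, 4, 1, 4]
specialist_2 = [3, 4, 4, 3, 4, 2, 3, 4, 2, 4]
p0, kappa = interrater_agreement(specialist_1, specialist_2)
print(f"P0 = {p0:.2f}, K = {kappa:.3f}")
```

    For these made-up ratings the specialists agree on eight of ten items (P0 = 0.80) and K is 0.375, so both values reach the levels listed on the slide.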

  • 8/8/2019 Validity of Measure Semi

    49/77

    Validity Assessment by Content Specialists

    Determination of Interrater Agreement

    If P0 and K, or either of these values, is too low, one or a combination of two problems could be operating:

    First, the items lack homogeneity, or the domain is ambiguous or not well defined.

    E.g., 0.67 (20 out of 30), 0.50 (15 out of 30), and 0.60 (18 out of 30).

    If the majority of the item writers had at least one item that was judged not/somewhat relevant (1 or 2) by the three content specialists, then this would be support for lack of clarity in the domain definition.

  • 8/8/2019 Validity of Measure Semi

    50/77

    Validity Assessment by Content Specialists

    Determination of Interrater Agreement

    Second, the problem may be due to the raters, who interpret the rating scale labels differently or use the rating scale differently.

    E.g., 0.90 (27 out of 30), 0.93 (28 out of 30), and 0.93 (28 out of 30).

    Each of the items judged to be unlike the rest had been prepared by one item writer. In this case the flaw is not likely to be in the domain definition as specified, but in the interpretations of one item writer.

  • 8/8/2019 Validity of Measure Semi

    51/77

    Validity Assessment by Content Specialists

    Determination of Interrater Agreement

    Refinement of the domain specifications is required in the first case.

    If the latter is the problem, the raters are given more explicit directions and guidelines in the use of the scale to reduce the chance of differential use.

    A clear and precise domain definition is needed:

    domain specifications function to communicate what the results of measurements mean to those people who must interpret them,

    and what types of items and content should be included in the measure to those people who must construct the items.

  • 8/8/2019 Validity of Measure Semi

    52/77

    CRITERION-REFERENCED

    Validity Assessment

    Average Congruency Percentage

    Content specialists judge the congruence of each item on a measure.

    The proportion of items rated congruent by each judge is calculated and converted to a percentage.

    Then the mean percentage for all judges is calculated to obtain the average congruency percentage.

    E.g., if the percentages of congruent items for the judges are 95, 90, 100, and 100%, the average congruency percentage would be 96.25%.

    An average congruency percentage of 90% or higher can safely be considered acceptable.
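
    The worked example on this slide can be reproduced with a short sketch. The function names are hypothetical, and the per-judge helper assumes each item is simply marked congruent (True) or not (False).

```python
def judge_congruency_percentage(item_judgments):
    """Percentage of items one judge rated congruent (True/False per item)."""
    return 100.0 * sum(item_judgments) / len(item_judgments)


def average_congruency_percentage(judge_percentages):
    """Mean of the judges' congruency percentages."""
    return sum(judge_percentages) / len(judge_percentages)


# Percentages from the slide's example: four judges
acp = average_congruency_percentage([95, 90, 100, 100])
acceptable = acp >= 90
print(f"Average congruency percentage = {acp}%, acceptable: {acceptable}")
```

    The mean of 95, 90, 100, and 100 is 96.25, matching the value given above and exceeding the 90% level.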

  • 8/8/2019 Validity of Measure Semi

    53/77

    CRITERION-REFERENCED

    CONSTRUCT VALIDITY

    Evidence of content validity is no guarantee that the measure is useful for its intended purpose.

    "we may say that a test's results are accurately descriptive of the domain of behaviors it is supposed to measure, it is quite another thing to say that the function to which you wish to put a descriptively valid test is appropriate" (Popham, 1978, p. 159).

    The major focus of construct validation is to establish support for the measure's ability to accurately categorize phenomena in accordance with the purpose for which the measure is being used.

  • 8/8/2019 Validity of Measure Semi

    54/77

    CRITERION-REFERENCED

    CONSTRUCT VALIDITY

    Approaches used to assess the construct validity

    Experimental Methods and the Contrasted Groups Approach

    Decision Validity

  • 8/8/2019 Validity of Measure Semi

    55/77

    CONSTRUCT VALIDITY

    Approaches used to assess the construct validity

    Experimental Methods and the Contrasted Groups Approach

    The basic principles and procedures for these two approaches are the same for criterion-referenced measures as for norm-referenced measures.

  • 8/8/2019 Validity of Measure Semi

    56/77

    CONSTRUCT VALIDITY

    Approaches used to assess the construct validity

    Decision Validity

    (1) a student may be allowed to progress to the next unit of instruction if test results indicate that the preceding unit has been mastered.

    (2) a woman in early labor may be allowed to ambulate if the nurse assesses, on pelvic examination, that the fetal head is engaged (as opposed to unengaged) in the pelvis.

    (3) a diabetic patient may be allowed to go home if the necessary skills for self-care have been mastered.

  • 8/8/2019 Validity of Measure Semi

    57/77

    CONSTRUCT VALIDITY

    Approaches used to assess the construct validity

    Decision Validity

    The measurements obtained from criterion-referenced measures are often used to make decisions.

    "Criterion-referenced tests have emerged as instruments that provide data via which mastery decisions can be made, as opposed to providing the decision itself" (Hashway, 1998, p. 112).

    The decision validity of a measure is supported when the set standard(s) or criterion classifies subjects or objects with a high level of confidence.

  • 8/8/2019 Validity of Measure Semi

    58/77

    CONSTRUCT VALIDITY

    Approaches used to assess the construct validity

    Decision Validity

    In most instances, two criterion groups (low and high) are used to test the decision validity of a measure.

    E.g.

    Decision validity is calculated "by summing the percentage of [subjects in the high group] who exceed the performance standard and the percentage [in the low group] who did not".

    Decision validity can range from 0 to 100%, with high percentages reflecting high decision validity.

    Criterion groups for testing the decision validity of a measure also can be created.

    E.g.

  • 8/8/2019 Validity of Measure Semi

    59/77

    CONSTRUCT VALIDITY

    Approaches used to assess the construct validity

    Decision Validity

    Decision validity is influenced by:

    the quality of the measure

    the appropriateness of the criterion groups

    the characteristics of the subjects

    the level of performance or cut-score required
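
    The decision-validity percentage described on the preceding slide can be sketched as follows. This is one reading of the summing rule, not a definitive formula: it assumes a high criterion group expected to meet the standard and a low group expected not to, treats a score at or above the cut-score as meeting the standard, and takes both percentages over the combined sample so the result stays between 0 and 100%. The function name, cut-score, and scores are hypothetical.

```python
def decision_validity(high_group_scores, low_group_scores, cut_score):
    """Percentage of all subjects classified as expected by the cut-score."""
    total = len(high_group_scores) + len(low_group_scores)
    # High-group subjects who reach the standard (assumed: score >= cut_score)
    high_pass = sum(score >= cut_score for score in high_group_scores)
    # Low-group subjects who do not reach the standard
    low_fail = sum(score < cut_score for score in low_group_scores)
    return 100.0 * (high_pass + low_fail) / total


# Hypothetical scores with a cut-score of 70
high_group = [85, 90, 78, 92, 65]
low_group = [50, 62, 71, 40, 55]
dv = decision_validity(high_group, low_group, cut_score=70)
print(f"Decision validity = {dv}%")
```

    Four of the five high-group subjects reach the standard and four of the five low-group subjects do not, so eight of ten subjects are classified as expected and the sketch reports 80%. Moving the cut-score changes this figure, which illustrates why the required level of performance appears in the list above.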

  • 8/8/2019 Validity of Measure Semi

    60/77

    CRITERION-REFERENCED

    Criterion-Related Validity

    Criterion-related validity studies of criterion-referenced measures are conducted in the same manner as for norm-referenced measures.

  • 8/8/2019 Validity of Measure Semi

    61/77

    CRITERION-REFERENCED

    ITEM-ANALYSIS PROCEDURES

    Content specialists' ratings hold the most merit for assessing item validities and for determining which items should be retained or discarded.

    Empirical item-discrimination indices should be used primarily to detect aberrant items in need of revision or correction.

  • 8/8/2019 Validity of Measure Semi

    62/77

    Empirical Item-Analysis

    Procedures

    Criterion-referenced item-analysis procedures determine the effectiveness of a specific test item in discriminating between subjects who have acquired the target behavior and those who have not.

  • 8/8/2019 Validity of Measure Semi

    63/77

    Two approaches are used for item-analysis procedures:

    (1) the criterion-groups technique, which also may be referred to as the uninstructed-instructed groups approach

    (2) the pretreatment/post-treatment measures approach, which in appropriate instances may be called the preinstruction/postinstruction measurements approach

  • 8/8/2019 Validity of Measure Semi

    64/77

    Advantages and disadvantages

    The criterion-groups technique is highly practical. One disadvantage is the difficulty of defining criteria for identifying groups. Another is the requirement of equivalence of groups.

    The pretreatment/post-treatment measures approach allows analysis of individual as well as group gains. Its disadvantages are impracticality, the amount of time that may be required, and a potential problem with testing effects.

  • 8/8/2019 Validity of Measure Semi

    65/77

    CRITERION-REFERENCED

    ITEM-ANALYSIS PROCEDURES

    Three item-analysis procedures are:

    (1) Item-Objective or Item-Subscale Congruence

    (2) Item Difficulty

    (3) Discrimination Index

  • 8/8/2019 Validity of Measure Semi

    66/77

    ITEM-ANALYSIS PROCEDURES

    Item-Objective or Item-Subscale Congruence

    Provides an index of the validity of an item based on the ratings of two or more content specialists.

    In this method content specialists are directed to assign a value of +1, 0, or -1 for each item.

    If an item definitely measures the objective or subscale, a value of +1 is assigned.

    A rating of 0 indicates that the judge is undecided about the item.

    The assignment of a -1 rating reflects a definite judgment that the item is not a measure of the objective or subscale.

    ITEM-ANALYSIS PROCEDURES

    Item-Objective or Item-Subscale Congruence

    The limits of the index range from -1.00 to +1.00.

    An index of +1.00 will occur when perfect positive item-objective or subscale congruence exists, that is, when all content specialists assign a +1 to the item for its related objective or subscale and a -1 to the item for all other objectives or subscales that are measured by the tool.

    An index of -1.00 represents the worst possible value of the index and occurs when all content specialists assign a -1 to the item for what was expected to be its related objective or subscale and a +1 to the item for all other objectives or subscales.

    ITEM-ANALYSIS PROCEDURES

    Item-Objective or Item-Subscale Congruence

    The index does not depend on the number of content specialists used or on the number of objectives measured by the test or questionnaire.

    The tool must include more than one objective or subscale in order for this procedure to be used.

    A cut-off score is derived by the test developer.

    This is done by creating the poorest set of content specialists' ratings that the developer would still accept for a valid item, and computing the index for that set.

    Items with an index below the cut-off score are considered nonvalid; they are discarded from the measure or analyzed and revised to improve their validity.

    Items with an index above the cut-off score are considered valid.
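
    The slides describe the index but do not give its formula. The sketch below assumes the commonly cited Rovinelli-Hambleton form, in which the summed ratings on the item's intended objective are weighted against the summed ratings on all other objectives. The function name, argument layout, and ratings are hypothetical and chosen only for illustration.

```python
def item_objective_congruence(ratings, target):
    """Index of item-objective congruence for ONE item (assumed formula).

    ratings: one row per content specialist; each row holds that
             specialist's rating (+1, 0, or -1) of the item on every objective.
    target:  position of the objective the item is intended to measure.
    """
    n_judges = len(ratings)
    n_objectives = len(ratings[0])
    if n_objectives < 2:
        # The procedure needs more than one objective or subscale.
        raise ValueError("at least two objectives or subscales are required")

    target_sum = sum(row[target] for row in ratings)
    other_sum = sum(sum(row) for row in ratings) - target_sum

    numerator = (n_objectives - 1) * target_sum - other_sum
    denominator = 2 * (n_objectives - 1) * n_judges
    return numerator / denominator


# Invented ratings from three specialists on three objectives;
# the item is intended to measure the first objective (position 0).
perfect = [[1, -1, -1], [1, -1, -1], [1, -1, -1]]
worst = [[-1, 1, 1], [-1, 1, 1], [-1, 1, 1]]
mixed = [[1, -1, -1], [1, 0, -1], [0, -1, -1]]

print(item_objective_congruence(perfect, 0))  # 1.0
print(item_objective_congruence(worst, 0))    # -1.0
print(item_objective_congruence(mixed, 0))    # 0.75
```

    Under this assumed formula, a developer who regards the "mixed" pattern as the poorest acceptable set of ratings would obtain a cut-off score of 0.75 for a three-objective tool.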

    ITEM-ANALYSIS PROCEDURES

    Item Difficulty

    The purpose is to examine the difficulty level of items and compare them between criterion groups.

    The approaches to calculating item p levels and their interpretation were discussed earlier.

    The item p level should be higher for the group that is known to possess more of a specified trait or attribute than for the group known to possess less.

    ITEM-ANALYSIS PROCEDURES

    Item Discrimination

    The focus is on the measurement of performance changes (e.g., pretest/posttest) or differences (e.g., experienced/inexperienced) between the criterion groups.

    The index is referred to as D.

    It is directly related to the property of decision validity: items with high positive discrimination indices improve the decision validity of a test.

    ITEM-ANALYSIS PROCEDURES

    Item Discrimination

    Criterion groups difference index (CGDI)

    Pretreatment/post-treatment measurements approach indices

    ITEM-ANALYSIS PROCEDURES

    Item Discrimination

    The criterion groups difference index (CGDI) is the proportion of respondents in the group known to have less of the trait or attribute of interest who answered the item appropriately or correctly, subtracted from the proportion of respondents in the group known to possess more of the trait or attribute of interest who answered it correctly.
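
    As a minimal sketch of this definition, the CGDI can be computed as the difference between the two groups' item p levels. The function names and the response data are hypothetical; responses are assumed to be scored 1 (correct) or 0 (incorrect).

```python
def p_level(responses):
    """Proportion of respondents who answered the item correctly (1 = correct)."""
    return sum(responses) / len(responses)


def cgdi(more_trait_group, less_trait_group):
    """Criterion groups difference index for one item.

    p level of the group known to possess more of the trait, minus
    p level of the group known to possess less of it.
    """
    return p_level(more_trait_group) - p_level(less_trait_group)


# Invented scores on a single item for two criterion groups.
experienced = [1, 1, 1, 0]    # p level 0.75
inexperienced = [0, 1, 0, 0]  # p level 0.25

print(cgdi(experienced, inexperienced))  # 0.5
```

    A negative value would mean the group with less of the trait outperformed the group with more of it on that item, which would flag the item for review.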

    ITEM-ANALYSIS PROCEDURES

    Item Discrimination

    Pretreatment/post treatment measurements approach

    Three item-discrimination indices are

    (1) pretest/posttest difference.

    (2) individual gain.

    (3) net gain.

    ITEM-ANALYSIS PROCEDURES

    Item Discrimination

    The pretest/posttest difference index (PPDI) is the proportion of respondents who answered the item correctly on the posttest minus the proportion who responded to the item correctly on the pretest.

    The individual gain index (IGI) is the proportion of respondents who answered the item incorrectly on the pretest and correctly on the posttest.

    The net gain index (NGI) is the proportion of respondents who answered the item incorrectly on both occasions subtracted from the IGI.
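
    The three definitions above can be sketched as follows for a single item, assuming each respondent has a paired pretest and posttest score of 1 (correct) or 0 (incorrect). The function name and the data are hypothetical.

```python
def pre_post_indices(pretest, posttest):
    """Return (PPDI, IGI, NGI) for one item from paired 0/1 scores."""
    if len(pretest) != len(posttest):
        raise ValueError("pretest and posttest must cover the same respondents")
    n = len(pretest)
    pairs = list(zip(pretest, posttest))

    # PPDI: proportion correct on the posttest minus proportion correct on the pretest.
    ppdi = sum(posttest) / n - sum(pretest) / n

    # IGI: proportion who were incorrect on the pretest and correct on the posttest.
    igi = sum(1 for pre, post in pairs if pre == 0 and post == 1) / n

    # NGI: IGI minus the proportion who were incorrect on both occasions.
    wrong_both = sum(1 for pre, post in pairs if pre == 0 and post == 0) / n
    ngi = igi - wrong_both

    return ppdi, igi, ngi


# Invented paired scores for eight respondents on one item.
pretest = [0, 0, 0, 1, 0, 1, 0, 0]
posttest = [1, 1, 0, 1, 1, 1, 0, 1]

ppdi, igi, ngi = pre_post_indices(pretest, posttest)
print(ppdi, igi, ngi)  # 0.5 0.5 0.25
```

    In this invented data set the NGI is lower than the IGI because two respondents answered incorrectly on both occasions.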

    ITEM-ANALYSIS PROCEDURES

    Item Discrimination

    NGI provides the most conservative estimate of item discrimination and uses more information.

    The range of values for each of the indices discussed above is -1.00 to +1.00, except for IGI, which has a range of 0 to +1.00.

    A high positive index for each of these item discrimination indices is desirable.