Characteristics of effective tests and hiring

I/O Psych

Transcript of Characteristics of effective tests and hiring

Page 1: Characteristics of effective tests and hiring

Characteristics of Effective Tests and Hiring

TeAm Unicorn Poof

Jan Augustine Paterno-Yamsuan

The “i” in teAm is hidden in the big “A”.

Page 2: Characteristics of effective tests and hiring

Test

Refers to any technique used to evaluate someone.

Employment tests include such methods as references, interviews and assessment centers.

Page 3: Characteristics of effective tests and hiring

4 characteristics of effective selection techniques:

Reliable

Valid

Cost-efficient

Legally defensible

Page 4: Characteristics of effective tests and hiring

Reliability

Extent to which a score from a test or from an evaluation is consistent and free from error.

Determined in four ways: test-retest, alternate-forms, internal, and scorer reliability.

Page 5: Characteristics of effective tests and hiring

Test-retest reliability

Extent to which repeated administration of the same test will achieve similar results.

Scores from the first administration of the test are correlated with scores from the second to determine whether they are similar.

Temporal stability: consistency of test scores across time.
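
A minimal sketch of this computation, assuming hypothetical scores and using Python's built-in Pearson correlation:

```python
# Test-retest reliability as the Pearson correlation between two
# administrations of the same test (hypothetical scores).
from statistics import correlation  # Python 3.10+

time1 = [23, 31, 28, 35, 40, 27, 33, 38]  # first administration
time2 = [25, 30, 29, 36, 39, 26, 35, 37]  # same examinees, second administration

r_test_retest = correlation(time1, time2)
print(f"Test-retest reliability: {r_test_retest:.2f}")
```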

Page 6: Characteristics of effective tests and hiring

Test-retest reliability

Time interval should be long enough so that specific test answers have not been memorized, but short enough so that the individual has not changed significantly (e.g., when administering a personality inventory).

Typical time intervals range from 3 days to 3 months.

Page 7: Characteristics of effective tests and hiring

Test-retest reliability

Longer interval = lower reliability coefficient.

The average test-retest reliability coefficient for tests in use is .86 (Hood, 2001).

Page 8: Characteristics of effective tests and hiring

Alternate-Forms Reliability

Extent to which two forms of the same test are similar

Counterbalancing- method of controlling for order effects by giving half of a sample Test A first, followed by Test B, and giving the other half of the sample Test B first, followed by Test A.
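
A minimal sketch of counterbalanced order assignment, assuming a hypothetical sample of 20 examinees:

```python
# Counterbalancing: half of the sample takes Form A first, the other half
# takes Form B first, so order effects cancel when the two forms are correlated.
import random

examinees = [f"P{i:02d}" for i in range(1, 21)]  # hypothetical sample of 20
random.shuffle(examinees)

half = len(examinees) // 2
order_ab = examinees[:half]  # Form A first, then Form B
order_ba = examinees[half:]  # Form B first, then Form A

print("A then B:", order_ab)
print("B then A:", order_ba)
```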

Page 9: Characteristics of effective tests and hiring

Alternate-Forms Reliability

Scores on forms A and B are then correlated to determine whether they are similar. If yes, then the test has form stability

Form Stability- extent to which the scores on two forms of a test are similar.

Page 10: Characteristics of effective tests and hiring

Alternate-Forms Reliability

Why use this method? To prevent cheating.

Time interval should be as short as possible.

The average correlation between alternate forms of tests is .89 (Hood, 2001).

Page 11: Characteristics of effective tests and hiring

Internal Reliability

Internal consistency- extent to which similar items are answered in similar ways. Measures item stability

Item stability- extent to which responses to the same test items are consistent.

Longer test = higher internal consistency. Example: a test with 5 items vs. a test with 20 items.

Page 12: Characteristics of effective tests and hiring

Internal Reliability

Item homogeneity- extent to which test items measure the same construct.

The more homogeneous the items, the higher the internal consistency.

3 methods to determine internal consistency: split-half, coefficient alpha, and K-R 20 (Kuder-Richardson formula 20)

Page 13: Characteristics of effective tests and hiring

Split-Half method

Form of internal reliability in which the consistency of item responses is determined by comparing scores on half of the items with scores on the other half of the items.

Odd-numbered items in one group, even-numbered items in another.

Scores of the 2 groups are then correlated

Page 14: Characteristics of effective tests and hiring

Split-Half method

Spearman-Brown prophecy formula- used to correct reliability coefficients resulting from the split-half method.
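
A minimal sketch of the split-half method with the Spearman-Brown correction, assuming hypothetical dichotomous item responses:

```python
# Split-half reliability with the Spearman-Brown correction
# (hypothetical responses, 1 = correct, 0 = incorrect).
from statistics import correlation  # Python 3.10+

# Each row is one examinee's responses to a 10-item test (hypothetical data).
responses = [
    [1, 1, 0, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 0, 1, 0, 0, 0],
    [1, 0, 1, 1, 1, 0, 1, 1, 0, 1],
]

# Odd-numbered items in one group, even-numbered items in the other.
odd_scores = [sum(r[0::2]) for r in responses]
even_scores = [sum(r[1::2]) for r in responses]

r_half = correlation(odd_scores, even_scores)

# Spearman-Brown prophecy formula corrects for the fact that each half
# is only half as long as the full test.
r_full = (2 * r_half) / (1 + r_half)
print(f"Half-test correlation: {r_half:.2f}, corrected: {r_full:.2f}")
```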

Page 15: Characteristics of effective tests and hiring

Cronbach’s Coefficient Alpha

A statistic used to determine internal reliability of tests that use interval or ratio scales.
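
A minimal sketch of coefficient alpha, assuming hypothetical ratings on a 5-point (interval) scale:

```python
# Cronbach's coefficient alpha for interval-scale items
# (hypothetical ratings on a 4-item scale).
from statistics import pvariance

# Rows = examinees, columns = items.
data = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
]

k = len(data[0])                                    # number of items
item_vars = [pvariance(col) for col in zip(*data)]  # variance of each item
total_var = pvariance([sum(row) for row in data])   # variance of total scores

alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"Coefficient alpha: {alpha:.2f}")
```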

Page 16: Characteristics of effective tests and hiring

K-R 20

Statistic used to determine internal reliability of tests that use items with dichotomous answers. [yes/no, true/false]
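
A minimal sketch of K-R 20, assuming hypothetical yes/no responses scored 1/0:

```python
# Kuder-Richardson formula 20 for dichotomously scored items
# (hypothetical responses, 1 = yes/true/correct, 0 = no/false/incorrect).
from statistics import pvariance

data = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
]

k = len(data[0])                                   # number of items
n = len(data)                                      # number of examinees
p = [sum(col) / n for col in zip(*data)]           # proportion passing each item
pq = sum(pi * (1 - pi) for pi in p)                # sum of item variances (p * q)
total_var = pvariance([sum(row) for row in data])  # variance of total scores

kr20 = (k / (k - 1)) * (1 - pq / total_var)
print(f"K-R 20: {kr20:.2f}")
```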

Page 17: Characteristics of effective tests and hiring

Scorer reliability

Extent to which two people scoring a test agree on the test score, or extent to which a test is scored correctly.

When human judgement of performance is involved, scorer reliability is discussed in terms of interrater reliability.

Page 18: Characteristics of effective tests and hiring

Evaluating the reliability of a test

Consider the magnitude of the reliability coefficient and the people who will be taking the test.

Page 19: Characteristics of effective tests and hiring

Validity

Degree to which inferences from test scores are justified by the evidence.

Reliability is necessary but not sufficient for validity.

5 common strategies to investigate validity of scores on a test: content, criterion, construct, face, and known-group.

Page 20: Characteristics of effective tests and hiring

Content Validity

Extent to which tests or test items sample the content that they are supposed to measure.

Page 21: Characteristics of effective tests and hiring

Criterion Validity

Extent to which a test score is related to some measure of job performance.

Criterion- measure of job performance, such as attendance, productivity, or a supervisor rating.

Criterion validity is established using one of two research designs: concurrent or predictive.

Page 22: Characteristics of effective tests and hiring

Criterion Validity

Concurrent validity- correlates test scores with measures of job performance for employees currently working for an organization.

Predictive Validity- test scores of applicants are compared at a later date with a measure of job performance.

Page 23: Characteristics of effective tests and hiring

Criterion Validity

Concurrent design is weaker than predictive because of the homogeneity of performance scores.

Restricted range- narrow range of performance scores that makes it difficult to obtain a significant validity coefficient.

Validity generalization- inferences from test scores from one organization can be applied to another organization

Page 24: Characteristics of effective tests and hiring

Criterion Validity

Research has indicated that a test valid for a job in one organization is also valid for the SAME job in another organization

Synthetic validity- form of VG in which validity is inferred on the basis of a match between job components and tests previously found valid for those job components.

Page 25: Characteristics of effective tests and hiring

Criterion Validity

Key difference between VG and SV is that in VG we are trying to generalize the results of studies conducted on a particular job to the same job at another organization. SV tries to generalize the results of studies of different jobs to a job that shares a common component

Page 26: Characteristics of effective tests and hiring

Construct Validity

Extent to which a test actually measures the construct that it purports to measure.

Construct validity is concerned with inferences about test scores; content validity is concerned with inferences about test construction.

Page 27: Characteristics of effective tests and hiring

Construct Validity

Construct validity is usually determined by correlating scores on a test with scores from other tests.

Convergent validity- the test correlates with other tests that measure the same construct.

Discriminant validity- the test does not correlate with tests that measure different constructs.

Page 28: Characteristics of effective tests and hiring

Construct Validity

Known-group validity- form of validity in which test scores from two contrasting groups “known” to differ on a construct are compared.

If known groups do not differ on test scores, test is invalid.

If known groups differ, validity is still unknown.

Page 29: Characteristics of effective tests and hiring

Face validity

Extent to which a test appears to be valid.

Face-valid tests result in high levels of test-taking motivation.

One downside is that it is tempting to fake answers.

Barnum statements- statements that are so general that they can be true of almost anyone.

Page 30: Characteristics of effective tests and hiring

MMY

Mental Measurements Yearbook- book containing information about the reliability and validity of various psychological tests.

Page 31: Characteristics of effective tests and hiring

Cost-efficiency

Choose the cheaper and easier to administer test without compromising validity and reliability.

Computer-adaptive testing (CAT)- type of test taken on a computer in which the computer adapts the difficulty of questions asked to the test-taker’s success in answering previous questions.

Page 32: Characteristics of effective tests and hiring

Taylor-Russell Tables

Series of tables based on the selection ratio, base rate, and test validity that yield information about the percentage of future employees who will be successful if a particular test is used.

A test will be useful to an organization if 1) test is valid, 2) organization can be selective in its hiring because it has more applicants than openings, and 3) there are plenty of current employees who are not performing well, thus there is room for improvement.

Page 33: Characteristics of effective tests and hiring

Taylor-Russell Tables

First piece of information needed is the test’s criterion validity coefficient, which can be obtained in two ways.

The best way is to conduct a criterion validity study in which test scores are correlated with some measure of job performance.

The second way is to use VG.

The higher the validity coefficient, the greater the possibility that the test will be useful.

Page 34: Characteristics of effective tests and hiring

Taylor-Russell Tables

Second piece of information that must be obtained is the Selection ratio.

Selection ratio- percentage of applicants an organization hires.

Formula: SR = number hired / number of applicants.

Lower selection ratio = greater potential usefulness of the test.

Page 35: Characteristics of effective tests and hiring

Taylor-Russell Tables

Final piece of information needed is the base rate of current performance

Base rate- percentage of current employees who are considered successful.

Base rate can be obtained in two ways.

Page 36: Characteristics of effective tests and hiring

Taylor-Russell Tables

The first method is simple but least accurate: split employees into two equal groups based on their scores on some criterion.

The base rate using this method is always .50 because one half of the employees are considered satisfactory.

Page 37: Characteristics of effective tests and hiring

Taylor-Russell Tables

Second method is to choose a criterion measure score above which all employees are considered successful.

After validity, selection ratio, and base rate figures have been obtained, consult the Taylor-Russell tables.
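
A minimal sketch of assembling these three inputs, with all figures hypothetical; the resulting percentage of successful future employees still has to be read from the published Taylor-Russell tables:

```python
# Gathering the three inputs for a Taylor-Russell lookup (hypothetical figures).

validity = 0.40                            # criterion validity coefficient (study or VG)

n_applicants = 200
n_hired = 50
selection_ratio = n_hired / n_applicants   # SR = number hired / number of applicants

n_employees = 120
n_successful = 72                          # employees at or above the chosen criterion score
base_rate = n_successful / n_employees     # percentage of current employees who are successful

print(f"Validity: {validity:.2f}")
print(f"Selection ratio: {selection_ratio:.2f}")
print(f"Base rate: {base_rate:.2f}")
# With these three values, consult the Taylor-Russell tables to estimate
# the percentage of future hires expected to be successful.
```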

Page 38: Characteristics of effective tests and hiring
Page 39: Characteristics of effective tests and hiring

Proportion of correct decisions

Utility method that compares the percentage of times a selection decision was accurate with the percentage of successful employees.

Easier to do, but less accurate than Taylor-Russell tables.

Only information needed is employee test scores and the scores on the criterion

Page 40: Characteristics of effective tests and hiring

Proportion of correct decisions

The two scores are graphed on a chart. Lines are drawn from the point on the Y-axis (criterion score) that represents a successful applicant, and from the point on the X-axis that represents the lowest score of a hired applicant.

Page 41: Characteristics of effective tests and hiring

Proportion of correct decisions

Quadrant I- employees who scored poorly on the test and were successful on the job

Quadrant II- employees who scored well on the test and were successful on the job

Quadrant III- employees who scored well on the test yet did poorly on the job

Quadrant IV- employees who scored low on the test and did poorly on the job.

Page 42: Characteristics of effective tests and hiring

Proportion of correct decisions

To estimate a test’s effectiveness, the number of points in each quadrant is totaled, and the following formula is used: points in Quadrants II and IV / total points in all quadrants.

The quotient represents the percentage of time that we expect to be accurate in making a selection decision in the future.

Page 43: Characteristics of effective tests and hiring

Proportion of correct decisions

To determine whether this is an improvement, we use the following formula: points in Quadrants I and II / total points in all quadrants.

If percentage from first formula is higher than that from the second, proposed test should increase selection accuracy. If not, stick to selection method currently used.
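
A minimal sketch of both formulas, assuming hypothetical test scores, criterion scores, and cutoffs:

```python
# Proportion of correct decisions (hypothetical data for current employees).

# Each tuple: (test score, criterion score) for one employee.
employees = [
    (55, 7), (62, 8), (48, 4), (70, 9), (47, 6),
    (45, 3), (66, 5), (58, 8), (40, 4), (63, 9),
]

test_cutoff = 50       # lowest test score of a hired applicant (X-axis line)
criterion_cutoff = 6   # criterion score marking a successful employee (Y-axis line)

q1 = q2 = q3 = q4 = 0
for test, crit in employees:
    if test < test_cutoff and crit >= criterion_cutoff:
        q1 += 1   # Quadrant I: scored poorly on the test, successful on the job
    elif test >= test_cutoff and crit >= criterion_cutoff:
        q2 += 1   # Quadrant II: scored well on the test, successful on the job
    elif test >= test_cutoff and crit < criterion_cutoff:
        q3 += 1   # Quadrant III: scored well on the test, did poorly on the job
    else:
        q4 += 1   # Quadrant IV: scored low on the test, did poorly on the job

total = q1 + q2 + q3 + q4
accuracy_with_test = (q2 + q4) / total   # Quadrants II and IV / total points
baseline = (q1 + q2) / total             # Quadrants I and II / total points

print(f"Expected accuracy with test: {accuracy_with_test:.2f}")
print(f"Baseline (satisfactory employees): {baseline:.2f}")
# If accuracy_with_test exceeds baseline, the proposed test should
# increase selection accuracy.
```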

Page 44: Characteristics of effective tests and hiring

Lawshe Tables

Uses base rate, test validity, and applicant percentile on a test to determine the probability of future success for that applicant.

Page 45: Characteristics of effective tests and hiring

Brogden-Cronbach-Gleser Utility formula

Method of ascertaining the extent to which an organization will benefit from the use of a particular selection system.

To use this formula, 5 items of information must be known.

Number of employees hired per year (n)

Average tenure (t)- average amount of time employees in the position tend to stay with the company. This number is computed by using information from company records to identify the time that each employee in that position stayed with the company. The number of years of tenure for each employee is then summed and divided by the total number of employees.

Page 46: Characteristics of effective tests and hiring

Brogden-Cronbach-Gleser Utility formula

Test validity (r)- this figure is the criterion validity coefficient that was obtained through either a validity study or VG.

Standard deviation of performance in dollars (SDy)- 40% of employees’ annual salary. Total salaries of current employees in the position in question should be averaged.

Page 47: Characteristics of effective tests and hiring

Brogden-Cronbach-Gleser Utility formula

Mean standardized predictor score of selected applicants (m)- can be obtained in one of two ways.

1) Obtain the average score on the selection test for both the applicants who are hired and the applicants who are not hired. The average test score of the nonhired applicants is subtracted from the average test score of the hired applicants, and the difference is divided by the standard deviation of all test scores.

Page 48: Characteristics of effective tests and hiring

Brogden-Cronbach-Gleser Utility formula

2) Compute the proportion of applicants who are hired and then use a conversion table to convert the proportion into a standard score. This method is used when an organization plans to use a test and knows the probable selection ratio based on previous hirings, but does not know the average test scores because the organization has never used the test.
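
A minimal sketch that estimates the savings by multiplying the five pieces of information together, i.e., assuming savings = (n)(t)(r)(SDy)(m); all figures are hypothetical:

```python
# Brogden-Cronbach-Gleser utility estimate (hypothetical figures).

n = 10                     # employees hired per year
t = 2.5                    # average tenure in years
r = 0.40                   # criterion validity coefficient (validity study or VG)
avg_salary = 30_000
sdy = 0.40 * avg_salary    # standard deviation of performance in dollars

# Mean standardized predictor score of selected applicants (method 1):
mean_hired = 82.0          # average test score of hired applicants
mean_not_hired = 70.0      # average test score of nonhired applicants
sd_all_scores = 10.0       # standard deviation of all test scores
m = (mean_hired - mean_not_hired) / sd_all_scores

utility_gain = n * t * r * sdy * m
print(f"Estimated gain from using the test: ${utility_gain:,.2f}")
```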

Page 49: Characteristics of effective tests and hiring

Determining the fairness of a test

Measurement bias- group differences in test scores that are unrelated to the construct being measured

Adverse impact- employment practice that results in members of a protected class being negatively affected at a higher rate than members of the majority class. Adverse impact is usually determined by the four-fifths rule.
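
A minimal sketch of a four-fifths-rule check, assuming hypothetical applicant and hire counts:

```python
# Four-fifths rule: compare the selection rate of the protected class
# with that of the majority class (hypothetical counts).

majority_applicants, majority_hired = 100, 40
protected_applicants, protected_hired = 50, 12

majority_rate = majority_hired / majority_applicants
protected_rate = protected_hired / protected_applicants

ratio = protected_rate / majority_rate
print(f"Selection-rate ratio: {ratio:.2f}")
if ratio < 0.80:
    print("Below four-fifths (80%): potential adverse impact.")
else:
    print("Meets the four-fifths rule.")
```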

Page 50: Characteristics of effective tests and hiring

Determining the fairness of a test

Predictive bias- situation in which the predicted level of job success falsely favors one group over another

Single-group validity- characteristic of a test that significantly predicts a criterion for one class of people but not for another

Differential validity- characteristic of a test that significantly predicts a criterion for two groups, such as both minorities and nonminorities, but predicts significantly better for one of the two groups.

Page 51: Characteristics of effective tests and hiring

Making the hiring decision

Multiple regression- statistical procedure in which the scores from more than one criterion-valid test are weighted according to how well each test score predicts the criterion
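
A minimal sketch of such weighting via ordinary least squares, assuming hypothetical scores on two selection tests and a performance criterion:

```python
# Weighting two criterion-valid tests with multiple regression
# (hypothetical scores; ordinary least squares via numpy).
import numpy as np

# Columns: two hypothetical selection tests (e.g., an ability test and an interview).
X = np.array([
    [55, 7], [62, 8], [48, 4], [70, 9], [47, 6],
    [45, 3], [66, 5], [58, 8], [40, 4], [63, 9],
], dtype=float)
y = np.array([6.8, 8.1, 4.5, 9.0, 6.2, 3.9, 6.5, 7.8, 4.1, 8.6])  # job performance

# Add an intercept column and solve for the regression weights.
X1 = np.column_stack([np.ones(len(X)), X])
weights, *_ = np.linalg.lstsq(X1, y, rcond=None)
print("Intercept and test weights:", np.round(weights, 3))

# Predicted criterion score for a new applicant (hypothetical scores of 60 and 7).
applicant = np.array([1.0, 60.0, 7.0])
print("Predicted performance:", round(float(applicant @ weights), 2))
```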

Linear approaches to hiring usually take one of four forms: unadjusted top-down selection, rule of three, passing scores, or banding.

Page 52: Characteristics of effective tests and hiring

Unadjusted top-down selection

Selecting applicants in straight rank order of their test scores.

Advantage: the organization will gain the most utility (Schmidt, 1991).

Disadvantage: can result in high levels of adverse impact and reduces an organization’s flexibility to use nontest factors such as references or organizational fit.

Page 53: Characteristics of effective tests and hiring

Unadjusted top-down selection

Compensatory approach- method of making selection decisions in which a high score on one test can compensate for a low score on another test.

To determine whether a score on one test can compensate for a score on another, multiple regression is used in which each test score is weighted according to how well it predicts the criterion.

Page 54: Characteristics of effective tests and hiring

Rule of three

Variation on top-down selection in which the names of the top three applicants are given to a hiring authority who can then select any of the three.

Page 55: Characteristics of effective tests and hiring

Passing scores

Minimum test score that an applicant must achieve to be considered for hire. A means for reducing adverse impact and increasing flexibility.

Multiple-cutoff strategy – selection strategy in which applicants must meet or exceed the passing score on more than one selection test.

Page 56: Characteristics of effective tests and hiring

Passing scores

Multiple-hurdle approach – selection practice of administering one test at a time so that applicants must pass that test before being allowed to take the next test.

Page 57: Characteristics of effective tests and hiring

Banding

Statistical technique based on the standard error of measurement that allows similar test scores to be grouped.

Standard error (SE) – number of points that a test score could be off due to test unreliability
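
A minimal sketch of SE-based banding, assuming hypothetical test statistics and a bandwidth of 1.96 × SE × √2 (one common choice, based on the standard error of the difference between two scores):

```python
# Banding with the standard error of measurement (hypothetical figures).
import math

sd_test = 10.0        # standard deviation of test scores (hypothetical)
reliability = 0.90    # reliability coefficient (hypothetical)

sem = sd_test * math.sqrt(1 - reliability)   # standard error of measurement
bandwidth = 1.96 * sem * math.sqrt(2)        # assumed band width (see lead-in)

scores = [95, 93, 91, 90, 88, 85, 82]
top = max(scores)
band = [s for s in scores if s >= top - bandwidth]

print(f"SE = {sem:.2f}, bandwidth = {bandwidth:.2f}")
print("Scores grouped with the top score:", band)
```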