Why Paul, not Peter? - CPHR BC · Peters score on the test is 64. We draw a line from his test...

Why Paul, not Peter? The Science of Candidate & Employee Assessment

Presented by

Drs. Larry Stefan, Jane Gayton, & Bjorn Leiren

Stefan, Fraser & Associates Inc.

&

Mr. Len Garis, Fire Chief

Fire Services, City of Surrey

Introduction

Presenters

Introductions

Dr. Larry Stefan

Dr. Jane Gayton

Dr. Bjorn Leiren

Chief Len Garis

Stefan Fraser & Associates

Industrial Organizational psychologists

Longstanding experience with diverse local, national and international organizations

Specializing in:

Individual management assessment

Selection systems (Chief Garis to present)

Test development

Experts in arbitration on selection


Introduction

What are common methods to assess talent for hire or promotion?

What works?

Evidence of effectiveness for assessment practices

Best professional, defensible practices in testing and assessment


Short History of Testing

Roman Army – interview, ability to read and write, height restriction

200 BC Chinese civil service exam

Late 1800’s Sir Francis Galton – differential psychology and early statistics

WWI – Army Alpha and Beta


History Cont’d

Thurstone – 1931 factor analysis

WWII – start of job analysis and assessment centers

1950’s – biodata

1960’s – US Civil Rights Act (1964)

1970’s – Schmidt & Hunter – validity generalization


History Cont’d

1980’s Canadian Charter of Rights and Freedoms

1990’s – more pivotal work by Schmidt & Hunter

2000’s – explosion of internet testing and big data

Newest trends – online simulations


Introduction- Value of Assessment

BAD HIRES ARE EXPENSIVE


Introduction- Our Hopes & Dreams

1. Familiarize yourself with best professional practice in workplace psychometric testing and assessment

2. Gain a better appreciation of the benefits and pitfalls of workplace testing

3. Equip yourself with information to ask the “right” questions as you contemplate and implement testing and assessment programs in your organization

4. Learn about a “real” organization’s experience with implementing an entry-level testing program


Assessment vs Testing

Assessment

“A process of measuring a person’s knowledge, skills, abilities and personal style to evaluate characteristics and behaviour that are relevant to (predictive of) successful job performance.”

Jeanneret & Silzer, 1998

“An integration of varied information about individuals, their backgrounds, the anticipated position requirements and the organizational circumstances in order to make meaningful descriptions and predictions about them.”

McPhail & Jeanneret, 2012


Assessment vs Testing

Test

“Any procedure (for example, ability test, structured interview, work sample) used to measure an individual’s employment or career-related qualifications and interests.”

U.S. Department of Labor, Testing and Assessment: An Employer’s Guide to Good Practices, 2000


What Are Psychometric Tests Used For?


Who is the best candidate?

Who shows potential?

What are this person’s development needs?

How should training be targeted?

Types of Assessments

1. Screening Assessments

Can be used with individuals or groups

More often used with non-managerial populations to “select in” to next steps in a process

With managerial and non-managerial groups to ascertain training (or coaching) needs or understand how teams work together


Types of Assessments

2. Comprehensive Individual Assessments

For management positions or for key positions (e.g., technical or professional)

Used in selection, promotion, succession, and development contexts



Professional/Best Practices

What is the purpose of the process?

How should I measure?

What should I measure?

How do I interpret test score results?

How do I integrate results?

How do I make decisions?

Organizational considerations

Implementation of Testing/Assessment

Before anything, some considerations:

Organizational objectives

Position of testing in the overall process

In-house or “out source”

Proctored or un-proctored

Web-based or paper-and-pencil

Collective agreement

Who has access to results

How/where are the data stored

Do applicants/participants get feedback

“Shelf-life” of results

Test re-write policy


“What” to Measure?

What “should” be measured?

Depends on the purpose of the testing

Qualities to be measured must:

relate as directly as possible to the personal and technical qualities that are required to be successful on the job (incl. in the organization)

connect to the organization’s vision, purpose and strategic objectives

Well-designed testing and assessment initiatives reflect and promote strategic HR management


“What” to Measure?

Great 8 Competency Model (Bartram, 2005)

1. Leading and deciding

2. Supporting and cooperating

3. Interacting and presenting

4. Analyzing and interpreting

5. Creating and conceptualizing

6. Organizing and executing

7. Adapting and coping

8. Enterprising and performing


“What” to Measure?: GMA

“Intelligence is a very general mental ability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly and learn from experience. It is not merely book learning, a narrow academic skill, or test-taking smarts. Rather, it reflects a broader and deeper capability for comprehending our surroundings - ‘catching on’, ‘making sense’ of things, or ‘figuring out’ what to do.”

Gottfredson, 1997



Abilities & aptitudes

General mental ability (GMA or ‘g’)

Specific aptitudes (‘s’): verbal reasoning, numerical reasoning, abstract reasoning, mechanical reasoning, critical thinking etc.

Referred to as the “can-do” factors



‘g’ is the best predictor of performance across jobs and job levels

Predictive validity coefficient increases with the complexity of the job

*Understand that GMA tests can demonstrate adverse impact (risk for discrimination against protected groups)


“What” to Measure?: Personality

“Personality refers to relatively enduring patterns of thoughts, ideas, emotions, and behaviors that are generally consistent over situations and time, and that distinguish individuals from each other. ”

Barrick & Mount, 2010



1. Measures of normal personality

Multidimensional inventories of traits (e.g., HPI, OPQ)

2. Aggregated measures of normal personality (aka “Personality at Work”)

Measure higher-order constructs such as “customer service”, “conflict style”, “leadership”, “emotional intelligence”, “counter-productive work behaviour”

3. Typologies

Measures of personality “types” (e.g., MBTI, Insights)

(*the “will-do” factors)



Five Factor Model (FFM) aka “The Big 5”

N - Neuroticism (emotional stability)

e.g., positive self-concept, not anxious

E - Extraversion

e.g., social, assertive

O - Openness to Experience

e.g., tolerant, inquisitive, open to change and novelty

A - Agreeableness

e.g., cooperative, considerate, compliant

C - Conscientiousness

e.g., dependable, achievement-oriented, organized, careful



A few words about some of the familiar (well-marketed) “personality at work” constructs…

“Emotional Intelligence”- arguably not much more than ‘g’, + some Big 5 dimensions (e.g., emotional stability and agreeableness)

Typologies (e.g., based on Jungian theory) have not been shown to predict job performance and research recommends that they should not be used for selection

Research evidence indicates that “integrity” testing (CWB) adds predictive information beyond measuring traits


“What” to Measure?: Interests

Indicate the “fit” to a particular career path, specific position, to the overall organization

Do not predict job performance but are good indicators of job satisfaction and tenure in a job

Best used in development, coaching, succession planning contexts


How to Measure: Quick Points on Administration*

Obtain informed consent to (a) testing and (b) to the release of information

Follow the standardized administration procedures

Ensure that the test items and test materials are secure

Establish options to increase accessibility (accommodate)

Create respectful candidate experience (*See SIOP reference in handout)



Choosing Tests: Determining Test Quality

Key indicators

Reliability

Good: above 0.8

Acceptable: 0.7 to 0.8

Marginal: 0.6 to 0.7

Unacceptable: below 0.6

Validity

Single measure

Good: above 0.35

Useful: 0.20 to 0.35

Multiple measures

Minimal: 0.35

Typical: 0.35 to 0.50

Good: above 0.50

Utility (ROI)

At minimum wage: $1,500 per hire per year

At $60,000: $4,000 per hire per year

Standardized (appropriate norms available)


Typical Test Validities

0.02 - Graphology

0.10 - Years of Education

0.15 - Training & Education

0.15 - Unstructured Interviews

0.18 - Years of Experience

0.20 - Reference Checks*

0.35 - Structured Interviews

0.35 - Personality Test

0.40 - Assessment Centres

0.45 - General Mental Ability Tests

0.60 - Multi-test Profiling Systems

The Science of Selection


The science requires that individual job performance and competence be measured.

Test scores come from questionnaires but are also generated by application forms, reference checks, interviews, etc.

NB: The legal definition of a test is “any basis on which a hiring decision is made.”



Each person’s performance and test score can be plotted on a graph as a single point, located where the two values intersect.



This particular scatter plot illustrates a relationship between performance and test scores of r = 0.70. When there is no correspondence between the measures, r = 0; when the correspondence is exact, r = 1.
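As a minimal sketch of how the correlation coefficient r is obtained from paired test and performance scores, the Python below computes the Pearson correlation by hand. The function name `pearson_r` and the sample data are hypothetical and invented for illustration; they are not the data behind the scatter plot.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data, invented for illustration only.
test_scores = [40, 45, 50, 55, 60, 65, 70]
performance = [2.9, 3.4, 3.1, 3.6, 3.5, 4.2, 3.9]

r = pearson_r(test_scores, performance)
print(f"r = {r:.2f}")
```

A perfectly linear pairing such as `pearson_r([1, 2, 3], [2, 4, 6])` gives 1, and unrelated measures give values near 0, which matches the two limiting cases described above.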



Knowing the correspondence between the test and performance scores enables us to predict a candidate’s future performance based on his or her current test score.

To do so, we identify a “regression line” based on the correlation coefficient.



Having identified the “regression line” that maximizes the accuracy of our predictions of candidates’ future job performance, we can use it to make our selection decisions.



Peter’s score on the test is 64.




We draw a line from his test score to the regression line and, from the point where we hit the regression line, we draw a second line to the performance axis.

Our prediction is that Peter’s future performance will be 4.2 on this scale.



Repeating the process for Paul, we predict that his future performance will be 3.3.
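The graphical procedure above is equivalent to evaluating the regression equation at the candidate's test score. The sketch below assumes a simple least-squares line whose slope is r times the ratio of the two standard deviations. The value r = 0.70 and the average performance of 3.4 come from the slides; the test mean, test standard deviation and performance standard deviation are hypothetical values chosen for illustration, so the output is not a reconstruction of the presenters' actual line.

```python
def predict_performance(test_score, r, mean_test, sd_test, mean_perf, sd_perf):
    """Predict job performance from a test score with a least-squares line."""
    slope = r * sd_perf / sd_test              # performance units per test point
    intercept = mean_perf - slope * mean_test  # line passes through both means
    return intercept + slope * test_score


# r and MEAN_PERF are taken from the slides; the other values are hypothetical.
R = 0.70
MEAN_TEST, SD_TEST = 50.0, 10.0   # assumed test-score distribution
MEAN_PERF, SD_PERF = 3.4, 0.8     # assumed spread of performance ratings

peter = predict_performance(64, R, MEAN_TEST, SD_TEST, MEAN_PERF, SD_PERF)
print(f"Predicted performance for a test score of 64: {peter:.2f}")
```

With these assumed values a score of 64 yields a prediction of roughly 4.2, similar to the figure quoted for Peter, but that agreement reflects the illustrative inputs rather than known parameters. Note that a candidate scoring exactly at the test mean is always predicted to perform at the performance mean.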



In an employment setting, there is a level of performance that employees are expected to demonstrate and maintain in order to be considered a successful hire.



And, when hiring employees, there is typically a “cut score” on the test that candidates must exceed in order to “pass” and be hired.



Here we see the performance standard and the “cut score” superimposed on the points in the scatter plot.



Candidates who score above the cut-off and perform above the job standard can be described as “hits” … as would those falling below the cut-off who would have performed below the job standard had they been hired.



Similarly, candidates who score above the cut-off but perform below the job standard can be described as “misses” … as can those falling below the cut-off who would have performed above the job standard had they been hired.



Hit rate = 80%
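A hit rate can be computed by counting the candidates for whom the test decision (pass or fail) agrees with the performance outcome (above or below standard). The sketch below is one way to do this; the function name, the cut score of 50, the standard of 3.5 and the ten data points are all hypothetical, and the data were chosen so that the result matches the 80% shown above. Scores exactly equal to the cut score or the standard are treated as not exceeding it, which is an assumption.

```python
def hit_rate(test_scores, performances, cut_score, standard):
    """Share of candidates whose test result agrees with their job outcome."""
    if len(test_scores) != len(performances) or not test_scores:
        raise ValueError("need two non-empty sequences of the same length")
    hits = 0
    for score, perf in zip(test_scores, performances):
        passed = score > cut_score      # candidate exceeds the cut score
        successful = perf > standard    # candidate performs above standard
        if passed == successful:        # pass+success or fail+failure is a hit
            hits += 1
    return hits / len(test_scores)


# Hypothetical data, invented for illustration only.
scores = [35, 42, 48, 52, 55, 58, 61, 64, 70, 75]
ratings = [2.8, 3.1, 3.9, 3.0, 3.6, 3.8, 3.7, 4.2, 4.0, 4.4]

rate = hit_rate(scores, ratings, cut_score=50, standard=3.5)
print(f"Hit rate = {rate:.0%}")
```

In this invented sample the two misses are one candidate who failed the test but would have performed above standard and one who passed but performed below it.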



Back to the title of this presentation, we predict that Peter will perform above standard should he be hired and that Paul will perform below standard.

And that’s why Peter and not Paul!

Value of Improved Selection

Where does the value come from?


If candidates are hired without valid screening, their average job performance would be about 3.4 on the measurement scale used.


When a cut-score is applied, candidates scoring below it are eliminated and the average performance of those hired increases, in this case to about 4.0.
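The effect of a cut-score on average performance can be illustrated by comparing the mean performance of all candidates with the mean of only those who score above the cut. The sketch below uses hypothetical data and a hypothetical cut-score of 50, so the averages it prints differ from the 3.4 and 4.0 quoted in the slides.

```python
from statistics import mean


def mean_performance(test_scores, performances, cut_score=None):
    """Average performance of everyone, or only of those above the cut-score."""
    if len(test_scores) != len(performances):
        raise ValueError("need two sequences of the same length")
    if cut_score is None:
        selected = list(performances)
    else:
        selected = [p for s, p in zip(test_scores, performances) if s > cut_score]
    if not selected:
        raise ValueError("no candidate scored above the cut-score")
    return mean(selected)


# Hypothetical data, invented for illustration only.
scores = [35, 42, 48, 52, 55, 58, 61, 64, 70, 75]
ratings = [2.8, 3.1, 3.9, 3.0, 3.6, 3.8, 3.7, 4.2, 4.0, 4.4]

everyone_avg = mean_performance(scores, ratings)
hired_avg = mean_performance(scores, ratings, cut_score=50)
print(f"Without screening: {everyone_avg:.2f}; with cut-score: {hired_avg:.2f}")
```

The size of the improvement depends on the validity of the test and on how selective the cut-score is.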


Where does the value come from?



What ROI can one reasonably expect?

Assumptions:

Current process validity = 0.20

Wage = $10.45/hour (new BC minimum)

200 hires per year

Cost of process: old = $50/candidate; new = $300/candidate

Annual utility for 200 employees earning minimum wage, by replacement process validity and % of applicants hired:

% hired    0.4         0.5         0.6         0.7
90%        $12,200     $46,200     $80,000     $114,000
75%        $80,600     $154,200    $228,000    $301,800
50%        $177,400    $316,200    $454,800    $593,800
25%        $242,000    $463,000    $684,000    $905,000
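Figures of this kind are commonly produced with a Brogden-Cronbach-Gleser style utility model, and the sketch below shows one such calculation. The validity of 0.20, the wage, the 200 hires and the per-candidate costs come from the slide. The 2,080 working hours per year and the value of one standard deviation of performance (set to 40% of annual wage) are assumptions of this sketch and are not stated in the presentation. The function name is hypothetical.

```python
from statistics import NormalDist


def annual_utility(new_validity, selection_ratio, *, old_validity=0.20,
                   hires=200, hourly_wage=10.45, hours_per_year=2080,
                   sd_fraction=0.40, old_cost=50.0, new_cost=300.0):
    """Estimated annual net dollar gain from replacing the selection process."""
    if not 0.0 < selection_ratio < 1.0:
        raise ValueError("selection_ratio must be strictly between 0 and 1")
    std_normal = NormalDist()
    # Test cut-off (in z units) that passes the top `selection_ratio` share.
    z_cut = std_normal.inv_cdf(1.0 - selection_ratio)
    # Mean standardized test score of the candidates above that cut-off.
    mean_z_selected = std_normal.pdf(z_cut) / selection_ratio
    # Assumed dollar value of one standard deviation of job performance.
    sd_dollars = sd_fraction * hourly_wage * hours_per_year
    gain = hires * (new_validity - old_validity) * sd_dollars * mean_z_selected
    # All applicants are tested, so the number tested is hires / selection ratio.
    applicants = hires / selection_ratio
    extra_cost = applicants * (new_cost - old_cost)
    return gain - extra_cost


for ratio in (0.90, 0.75, 0.50, 0.25):
    row = [annual_utility(v, ratio) for v in (0.4, 0.5, 0.6, 0.7)]
    print(f"{ratio:.0%}", [f"${value:,.0f}" for value in row])
```

Under these assumptions the results land within a few hundred dollars of the tabled values (for example, about $177,500 against $177,400 for a validity of 0.4 with 50% hired), which suggests, but does not confirm, that a similar model lies behind the table.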

Choosing Test Instruments:

Has the test been used in similar circumstances (relevant norm groups)

Is it available in the preferred format

Length (time) considerations

Supervision required

Location and time period of data storage

Test fairness

Meet professional standards of confidentiality, item security and data integrity


Choosing Tests: Sample Reputable Publishers*

Wonderlic

SHL/CEB

IPAT

MHS

Pearson Assessments

PsyCor

CPP

PSI

Psychometrics Canada

Hogan Assessment Systems

Sigma Assessment Systems

*not exhaustive


Choosing Tests

Expect professional test publishers to:

Provide comprehensive documentation (including administration and technical materials)

Report acceptable psychometric standards

Restrict tests to qualified users

Provide training, certification (as appropriate)

Provide support (technical and consultation)


Choosing Tests: Test User Qualifications

A-level products do not require any specific qualifications.

e.g., Differential Aptitude Tests; WPT, WGCT suite

B-level products require that the user has completed graduate-level courses in tests and measurement at a university or has received equivalent documented training.

e.g., HPI, OPQ32, 16PF, EQ/EI tests

C-level products require fulfillment of B-level qualifications, and users must have training and/or experience in the use of tests, and must have completed an advanced degree in an appropriate profession (e.g., psychology, psychiatry) and/or hold a license to practice as a psychologist.

e.g., CPI, MMPI-2, WAIS-IV


Scoring Instruments

Review data input for errors

Where possible, use online scoring or computerized scoring programs

Randomly compare results with data input (routine audits for accuracy)


Interpreting Results

Review the test manual or obtain consultation from the publisher

Consider limitations to the results (e.g., indicators of measurement error)

Do not over-interpret small raw score differences between candidates

Consider factors that may have impacted results

Use appropriate normative groups for comparison

Understand the metric in which the results are expressed (e.g., percentiles, stens, “standard scores” etc.)
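Two of the points above lend themselves to a short worked sketch: the standard error of measurement (SEM) shows how large a score difference must be before it means much, and the common score metrics are all transformations of the same standardized (z) score. The formulas are the conventional ones (SEM = SD × √(1 − reliability); sten = 5.5 + 2z, rounded and limited to 1 to 10; T = 50 + 10z; percentile from the normal curve). The function names and example numbers are hypothetical, and a test manual's own norm tables should take precedence over these normal-curve approximations.

```python
from math import floor, sqrt
from statistics import NormalDist


def standard_error_of_measurement(sd, reliability):
    """Expected spread of observed scores around a person's true score."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1.0 - reliability)


def z_to_percentile(z):
    """Percentage of a normal norm group scoring below this z-score."""
    return 100.0 * NormalDist().cdf(z)


def z_to_sten(z):
    """Standard ten score: mean 5.5, SD 2, whole numbers from 1 to 10."""
    sten = floor(2.0 * z + 6.0)  # same as rounding 5.5 + 2z half-up
    return min(10, max(1, sten))


def z_to_t_score(z):
    """T-score: mean 50, SD 10."""
    return 50.0 + 10.0 * z


# Hypothetical test with SD = 10 raw points and reliability = 0.84.
sem = standard_error_of_measurement(10.0, 0.84)
print(f"SEM = {sem:.1f} raw points")
print(z_to_percentile(1.0), z_to_sten(1.0), z_to_t_score(1.0))
```

In this hypothetical case the SEM is about 4 raw points, so two candidates whose raw scores differ by 2 points cannot be reliably distinguished on that test alone.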


Making Decisions: Combining Information

Can be challenging with data from multiple sources and, potentially, conflicting results for each individual

1. Multiple Hurdle

Individual must “pass” each step before moving to the next one

“Hurdles” organized from the least to most resource-intensive

2. Compensatory Model

Individual’s performance on all process elements is evaluated

Individuals are then compared against one another
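The two decision models can be contrasted in a short sketch. In the multiple-hurdle version a candidate is dropped at the first step they fail; in the compensatory version every score contributes to a weighted composite and candidates are ranked on it. The measure names, minimum scores, weights and candidate data are all hypothetical, and the sketch assumes every measure has already been placed on a common scale.

```python
def passes_all_hurdles(candidate_scores, hurdles):
    """Multiple hurdle: True only if every step's minimum score is met."""
    for measure, minimum in hurdles:   # ordered least to most resource-intensive
        if candidate_scores[measure] < minimum:
            return False               # stop at the first failed hurdle
    return True


def composite_score(candidate_scores, weights):
    """Compensatory: weighted sum, so strengths can offset weaknesses."""
    return sum(weight * candidate_scores[measure]
               for measure, weight in weights.items())


def rank_candidates(all_scores, weights):
    """Candidate names ordered from highest to lowest composite score."""
    return sorted(all_scores,
                  key=lambda name: composite_score(all_scores[name], weights),
                  reverse=True)


# Hypothetical process, weights and candidates, invented for illustration.
HURDLES = [("application", 50), ("aptitude_test", 60), ("interview", 55)]
WEIGHTS = {"application": 0.2, "aptitude_test": 0.5, "interview": 0.3}
CANDIDATES = {
    "A": {"application": 70, "aptitude_test": 55, "interview": 90},
    "B": {"application": 60, "aptitude_test": 65, "interview": 60},
    "C": {"application": 55, "aptitude_test": 62, "interview": 58},
}

survivors = [name for name, scores in CANDIDATES.items()
             if passes_all_hurdles(scores, HURDLES)]
ranking = rank_candidates(CANDIDATES, WEIGHTS)
print("Multiple hurdle survivors:", survivors)
print("Compensatory ranking:", ranking)
```

In this invented example candidate A is screened out by the aptitude hurdle yet ranks first under the compensatory model because of a strong interview, which illustrates how the choice of model can change the decision.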


Communicating Decisions

Good practice is to provide feedback on the assessment (test) results to candidates


Limitations & Pitfalls

Not having a clear purpose for the testing (assessment)

Not understanding the job or business objective

Not planning the program (before you start testing)

Not choosing well-designed instruments that meet professional standards

Not understanding the limitations of test scores and related concepts, expecting “too much” from a test

Relying on score(s) from a single test to make significant decisions

Over-reliance on a single test score (when there are many others in the profile) for decision-making

Operating beyond level of expertise


Legal Consideration

Legal discrimination

No protected group is discriminated against

A protected group is discriminated against BUT the basis of discrimination is a BFOR or Business Necessity

Illegal discrimination

A protected group is discriminated against AND the basis of discrimination is NOT a BFOR or Business Necessity

A protected group is discriminated against AND the basis of discrimination is a BFOR or Business Necessity BUT a Reasonable Accommodation has not been attempted



Entry Level (Profiling Systems)

Designed to handle large numbers of applicants

Best positioned early in the applicant screening process

Validity: 0.45 – 0.75 for job performance

Adverse Impact: typically little or none but it depends on the specific dimensions assessed

Development costs: moderate

On-going costs: low

Utility (ROI): high; costs are typically recouped within the first year of use

[Figure: candidate profile chart comparing the candidate with the group average on a score axis from 20 to 80. Rows: Final Score; Aptitude Average; Temperament Average; Interests Average; General Learning Ability; Mechanical Aptitude; Desire to Learn; Teamwork; Getting Along with Others; Stress Resistance; Responsibility; Courage; Activity Level; Cleanliness; Socialization; Medical Interest; Construction Interest.]


Profiling Systems – Surrey Fire

Predictive validity = 0.55

ROI

$12,055/hire/year

20 hires per year = $241,110/year

A CLIENT’S POINT OF VIEW

Mr. Len Garis, Fire Chief

City of Surrey


QUESTIONS?


THANK YOU FOR TODAY!


References & Resources

Bartram, D. (2005). The Great Eight competencies: A criterion-centric approach to validation. Journal of Applied Psychology.

Testing and Assessment: An Employer’s Guide to Good Practices https://www.onetcenter.org/dl_files/empTestAsse.pdf

Dattner, B. (2013). How to Use Psychometric Testing in Hiring. HBR https://hbr.org/2013/09/how-to-use-psychometric-testin/

Revised Standards for Educational and Psychological Testing (2014) http://www.apa.org/science/programs/testing/standards.aspx

**What We Know about Applicant Reactions on Attitudes and Behavior: Research Summary and Best Practices http://www.siop.org/WhitePapers/White%20Paper%20Series%2020112012ApplicantReactions.pdf

BPS Code of Good Practice for Psychological Testing & other helpful practice guidelines https://ptc.bps.org.uk/ptc/guidelines-and-information

http://ptc.bps.org.uk/sites/ptc.bps.org.uk/files/Documents/Guidelines%20and%20Information/International%20Guidelines%20for%20Test%20Use.pdf

Rights of test takers http://www.apa.org/science/programs/testing/rights.aspx



References & Resources

Psychometric Test Reviews

http://buros.org/test-reviews-information

https://ptc.bps.org.uk/test-registration-test-reviews

Association of Test Publishers (ATP) http://www.testpublishers.org/


http://www.hrcosting.com/hr/














