Evaluation of usability tests


Page 1: Evaluation of usability tests

Evaluation of usability tests

Page 2: Evaluation of usability tests

Why evaluate?

1. Choose the most suitable data-collection techniques
2. Identify methodological strengths and weaknesses of a user test

Page 3: Evaluation of usability tests

Evaluation criteria for data-collection techniques

Utility: how useful are the data?
Costs: what resources are needed?
Objectivity: how much subjective judgement is involved?
Level of detail: is the amount and resolution of the data suitable?
Intrusiveness: does the method interfere with the user's performance?

Page 4: Evaluation of usability tests

Observations in real time

Strengths:
Level of detail: Allows you to experience the context in which performance takes place

Weaknesses:
Level of detail: Difficult to keep up with the pace of the user
Objectivity: Based on your own subjective judgement as an observer

Page 5: Evaluation of usability tests

Observations from video

Strengths:
Utility: Allows you to conduct detailed analysis of various usability attributes
Utility: Can obtain data about the user's reasoning ("think-aloud")

Weaknesses:
Costs: Time-consuming
Utility: Lots of data not being used
Intrusiveness: "Think-aloud" may disturb the user

Page 6: Evaluation of usability tests

Observations: Real time or Video?

[Diagram comparing real-time and video observation in terms of context, product, and level of detail]

Page 7: Evaluation of usability tests

Event logs

Strengths:
Objectivity: The data are collected automatically
Costs: Automated data collection requires little effort from the test team

Weaknesses:
Level of detail: Both the amount of data and the resolution can be too high
Utility: It can be difficult to create useful measures
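As a rough illustration of turning raw log events into measures, the sketch below reduces a small event log to a task time and an error count. The log format (a timestamp in seconds plus an event name) and the event names `task_start`, `task_end`, and `error` are assumptions made for this example; the slides do not specify any log format.

```python
from typing import NamedTuple


class Event(NamedTuple):
    timestamp: float  # seconds since the session started (assumed unit)
    name: str         # hypothetical event name, e.g. "task_start"


def summarize_task(events: list[Event]) -> dict[str, float]:
    """Reduce a detailed event log to two coarse measures."""
    start = next(e.timestamp for e in events if e.name == "task_start")
    end = next(e.timestamp for e in events if e.name == "task_end")
    errors = sum(1 for e in events if e.name == "error")
    return {"task_time_s": end - start, "error_count": errors}


# Invented log for a single task.
log = [
    Event(0.0, "task_start"),
    Event(4.2, "click"),
    Event(9.8, "error"),
    Event(15.1, "click"),
    Event(21.5, "task_end"),
]
summary = summarize_task(log)
print(summary)
```

Most of the logged events (here the clicks) are discarded, which reflects the level-of-detail weakness: the log is far more fine-grained than the measures derived from it.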

Page 9: Evaluation of usability tests

Questionnaire, self-made

Strengths:
Level of detail: Can be tailored to fit the purpose of the test
Utility: Can be used in several settings with different products
Costs: Does not take long to develop

Weaknesses:
Objectivity: Based on subjective judgement
Utility: Difficult to construct good items

Page 10: Evaluation of usability tests

Questionnaire, validated

Strengths:
Utility: Can be used in several settings with different products
Costs: The data are typically easy to transform into measures

Weaknesses:
Level of detail: Validated questionnaires may not address the features of the interface you are interested in
Objectivity: Based on subjective judgement
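To illustrate how questionnaire data can be transformed into a measure, the sketch below scores the System Usability Scale (SUS). SUS is chosen here only as a well-known example of a validated questionnaire; the slides do not name any particular instrument. It assumes ten responses on a 1 to 5 scale, with odd-numbered items worded positively and even-numbered items worded negatively.

```python
def sus_score(responses: list[int]) -> float:
    """Convert ten SUS responses (1-5 each) into a 0-100 score."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs exactly ten responses between 1 and 5")
    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribution is response - 1.
            total += response - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribution is 5 - response.
            total += 5 - response
    return total * 2.5


best_case = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])
neutral = sus_score([3] * 10)
print(best_case, neutral)
```

The scoring rule is fixed in advance, so the transformation costs little, but the items themselves cannot be adapted to a specific interface feature.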

Page 11: Evaluation of usability tests

Summary of data-collection techniques

Data-collection technique   Utility  Costs  Objectivity  Level of detail  Intrusiveness
Interview                   -        -      -            +                -
Questionnaire, self-made    ++       ++     -            ++               +
Questionnaire, validated    +        -      -            +                +
Observation, real time      +        +      -            -                +
Observation, video          ++       -      +            +                +
Event logs                  -        -      ++           +                ++
Physiological               -        --     ++           +                --

The assessment concerns MEASURES, not use/problem descriptions.
++ = very good; + = good; - = not so good; -- = poor
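One possible way to use such a summary when choosing a technique is to convert the symbols to numbers and weight each criterion by its importance for the test at hand. The sketch below does this; the numeric mapping (++ = 2, + = 1, - = -1, -- = -2) and the example weights are assumptions for illustration and are not part of the slides.

```python
# Assumed numeric mapping for the rating symbols; not part of the slides.
SCORE = {"++": 2, "+": 1, "-": -1, "--": -2}

CRITERIA = ["utility", "costs", "objectivity", "level_of_detail", "intrusiveness"]

# Ratings copied from the summary table, in the order of CRITERIA.
RATINGS = {
    "Interview":               ["-", "-", "-", "+", "-"],
    "Questionnaire self-made": ["++", "++", "-", "++", "+"],
    "Questionnaire validated": ["+", "-", "-", "+", "+"],
    "Observation real time":   ["+", "+", "-", "-", "+"],
    "Observation video":       ["++", "-", "+", "+", "+"],
    "Event logs":              ["-", "-", "++", "+", "++"],
    "Physiological":           ["-", "--", "++", "+", "--"],
}


def rank_techniques(weights: dict[str, float]) -> list[tuple[str, float]]:
    """Return (technique, weighted score) pairs sorted best first."""
    totals = []
    for technique, symbols in RATINGS.items():
        total = sum(weights[c] * SCORE[s] for c, s in zip(CRITERIA, symbols))
        totals.append((technique, total))
    return sorted(totals, key=lambda item: item[1], reverse=True)


# Hypothetical weights: this imaginary test team cares most about objectivity.
weights = {"utility": 1, "costs": 1, "objectivity": 3,
           "level_of_detail": 1, "intrusiveness": 1}
ranking = rank_techniques(weights)
for technique, score in ranking:
    print(f"{technique}: {score}")
```

Changing the weights changes the ranking, so the result says more about the priorities of a particular test than about the techniques themselves.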

Page 12: Evaluation of usability tests

…Use/problem descriptions

Observation and interviews are the most suitable data-collection techniques for use/problem descriptions.

Data-collection technique   Utility
Interview                   ++
Observation, real time      ++
Observation, video          ++
Event logs                  +

Page 13: Evaluation of usability tests

Evaluation of measures

The evaluation criteria of the data-collection techniques
Validity
Reliability

Page 14: Evaluation of usability tests

Validity

Do you measure what you believe you measure?

Page 15: Evaluation of usability tests

Reliability

Do you obtain the same results when you measure the same thing under similar conditions at different points in time?

Page 16: Evaluation of usability tests

Relationship between Validity & Reliability

Evaluating the validity of a measure is primarily based on subjective judgement, while reliability is typically evaluated by means of statistics

It is possible to obtain reliable results that are invalid, but not unreliable results that are valid!
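As a sketch of what evaluating reliability "by means of statistics" could look like, the code below computes a test-retest correlation: the Pearson correlation between the same measure taken from the same users in two sessions. The slides do not name a specific statistic, so this choice is an assumption, and the task times are invented.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Invented task times (seconds) for the same five users in two sessions.
session_1 = [42.0, 55.0, 38.0, 61.0, 47.0]
session_2 = [44.0, 53.0, 40.0, 63.0, 45.0]

r = pearson_r(session_1, session_2)
print(f"test-retest correlation: {r:.2f}")
```

A correlation close to 1 suggests the measure is stable across sessions. It says nothing about whether task time actually captures the usability attribute of interest, which remains a judgement about validity.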

Page 17: Evaluation of usability tests

How can you avoid invalid results?

Use several measures!
Triangulation
Multiple operationalism

Page 18: Evaluation of usability tests

Ethical issues

Be well prepared - act professionally!
Create a script:
  Introduction
  During test
  Debriefing
Create a consent form

Page 19: Evaluation of usability tests

Ethical issues

The product is being tested, not the user!
Respectful treatment: preserve integrity
Informed consent:
  Inform the user what will happen, how the collected data will be used, etc.
  Make sure the user understands and agrees
  The user may leave whenever she/he wants
Confidentiality

Page 20: Evaluation of usability tests

Types of measures

Experience-attitude
Performance
Cognitive

Page 21: Evaluation of usability tests

Experience-attitude

Strengths:
Utility: Can address most usability attributes
Validity: User-centered; we ask for the user's opinions

Weaknesses:
Validity/Objectivity: Based on the user's subjective judgement

Page 22: Evaluation of usability tests

Performance: completeness

Strengths:
Utility: Can be used for most tasks and in different settings
Cost-effective: Quite easy to create a list of activities

Weaknesses:
Validity/reliability: The user may choose a solution path you didn't think of, but that is nevertheless satisfactory
Validity (sensitivity): Ceiling or floor effects: the task is too easy or too difficult
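A minimal sketch of a completeness measure, assuming it is scored as the fraction of a predefined activity list that the user completed. The activity names, the "book a train ticket" task, and the 80 % cut-off used to flag ceiling or floor effects are all hypothetical.

```python
from typing import Optional

# Hypothetical activity checklist for a "book a train ticket" task.
REQUIRED = ["open_search", "enter_destination", "select_date", "choose_seat", "pay"]


def completeness(completed: set[str]) -> float:
    """Fraction of the required activities that the user completed."""
    return sum(1 for activity in REQUIRED if activity in completed) / len(REQUIRED)


def ceiling_or_floor(scores: list[float], share: float = 0.8) -> Optional[str]:
    """Flag a task as too easy or too hard when most users hit an extreme.

    The 80 % cut-off is an arbitrary assumption for illustration.
    """
    n = len(scores)
    if sum(1 for s in scores if s == 1.0) / n >= share:
        return "ceiling"
    if sum(1 for s in scores if s == 0.0) / n >= share:
        return "floor"
    return None


all_steps = set(REQUIRED)
users = [
    all_steps,
    all_steps,
    all_steps,
    all_steps,
    {"open_search", "enter_destination", "pay"},  # paid via a shorter path
]
scores = [completeness(u) for u in users]
effect = ceiling_or_floor(scores)
print(scores, effect)
```

The last user illustrates the validity weakness: the ticket was paid for through a path that skipped two listed activities, yet the checklist scores the attempt as only partly complete. With four of five users at the maximum, the task is also flagged as a possible ceiling effect.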

Page 23: Evaluation of usability tests

Summary of measures

Metric type                 Validity  Construct validity  Utility
Experience-attitude         ++        ++                  ++
Performance time            +         +                   +
Performance completeness    ++        +                   +
Performance failures        -         ++                  +
Situation awareness         -         +                   -
Workload                    -         +                   -

Validity: are we able to measure it?
Construct validity: importance to usability
Utility: how useful it currently is for making design decisions

++ = very good; + = good; - = not so good; -- = poor

Page 24: Evaluation of usability tests

Relation between data-collection techniques and measures

Data-collection technique   Experience-  Performance  Performance   Performance  Situation  Workload
                            attitude     time         completeness  failures     awareness
Interview                   +            -            -             +            -          -
Questionnaire               ++           -            +             -            +          +
Observation, real time      -            -            +             +            -          -
Observation, video          -            ++           ++            ++           +          +
Event log                   -            ++           +             -            -          -
Physiological               -            --           --            --           -          +

++ = very good; + = good; - = not so good; -- = poor
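The relation between techniques and measures can also be treated as a lookup: given the measure you need, list the techniques rated at or above some level. The sketch below encodes the ratings from the slide; the function name `techniques_for` and the idea of a minimum rating are assumptions for illustration.

```python
MEASURES = [
    "experience-attitude", "performance time", "performance completeness",
    "performance failures", "situation awareness", "workload",
]

# Ratings copied from the slide, in the order of MEASURES.
TABLE = {
    "Interview":             ["+", "-", "-", "+", "-", "-"],
    "Questionnaire":         ["++", "-", "+", "-", "+", "+"],
    "Observation real time": ["-", "-", "+", "+", "-", "-"],
    "Observation video":     ["-", "++", "++", "++", "+", "+"],
    "Event log":             ["-", "++", "+", "-", "-", "-"],
    "Physiological":         ["-", "--", "--", "--", "-", "+"],
}

ORDER = ["--", "-", "+", "++"]  # worst to best


def techniques_for(measure: str, minimum: str = "+") -> list[str]:
    """Techniques whose rating for `measure` is at least `minimum`."""
    column = MEASURES.index(measure)
    threshold = ORDER.index(minimum)
    return [name for name, row in TABLE.items()
            if ORDER.index(row[column]) >= threshold]


time_options = techniques_for("performance time", minimum="++")
workload_options = techniques_for("workload")
print(time_options)
print(workload_options)
```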

Page 25: Evaluation of usability tests

Relation between data-collection techniques and measures

[Diagram with the labels: Measure; Data-collection technique; Practical limitations; Purpose of test]