CHI99 Panel Comparative Evaluation of Usability Tests


CHI99 Panel Comparative Evaluation of Usability Tests

Presentation by

Rolf Molich, DialogDesign

Denmark

[email protected]

CHI99 Panel Comparative Evaluation of Usability Tests

Take a web-site.

Take nine professional usability teams.

Let each team usability test the web-site.

Are the results similar?

What Have We Done?

Nine teams have usability tested the same web-site
– Seven professional teams
– Two student teams

Test web-site: www.hotmail.com
Free e-mail service

Panel Format

Introduction (Rolf Molich)

Five minute statements from five participating teams

The Customer’s point of view (Meeta Arcuri, Hotmail)

Conclusions (Rolf Molich)

Discussion - 30 minutes

Purposes of Comparison

Survey the state of the art in professional usability testing of web-sites.

Investigate the reproducibility of usability test results

NON Purposes of Comparison

To pick a winner

To make a profit

Basis for Usability Test

Web-site address: www.hotmail.com

Client scenario

Access to client through intermediary

Three weeks to carry out test

What Each Team Did

Run standard usability test

Anonymize the usability test report

Send the report to Rolf Molich

Problems Found

Total number of different usability problems found 300

Found by
seven teams   1
six teams     1
five teams    4
four teams    4
three teams   15
two teams     49
one team      226 (75%)
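The tally above can be checked with a few lines of arithmetic. The sketch below only re-adds the figures from the slide; the variable names are illustrative and not part of the original material.

```python
# Number of distinct problems, keyed by how many teams reported each one
# (figures copied from the "Problems Found" slide).
found_by = {7: 1, 6: 1, 5: 4, 4: 4, 3: 15, 2: 49, 1: 226}

total = sum(found_by.values())            # all distinct problems
single_team_share = found_by[1] / total   # fraction reported by one team only
multi_team = total - found_by[1]          # problems reported by two or more teams

print(f"Total distinct problems: {total}")
print(f"Found by one team only: {found_by[1]} ({single_team_share:.0%})")
print(f"Found by two or more teams: {multi_team}")
```

The rows add up to the stated total of 300, and the single-team row accounts for roughly three quarters of it, which is the reproducibility concern the panel goes on to discuss.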

Comparative Usability Evaluation 2

Barbara Karyukina, SGI (USA)

Klaus Kaasgaard & Ann D. Thomsen, KMD (Denmark)

Lars Schmidt and others, Networkers (Denmark)

Meghan Ede and others, Sun Microsystems, Inc., (USA)

Wilma van Oel, P5 (The Netherlands)

Meeta Arcuri, Hotmail, Microsoft Corp. (USA) (Customer)

Rolf Molich, DialogDesign (Denmark) (Coordinator)

Comparative Usability Evaluation 2

Joseph Seeley, NovaNET Learning Inc. (USA)

Kent Norman, University of Maryland (USA)

Torben Norgaard Rasmussen and others, Technical University of Denmark

Marji Schumann and others, Southern Polytechnic State University (USA)

CHI99 Panel Comparative Evaluation of Usability Tests

Presentation by

Barbara Karyukina, SGI, Wisconsin

USA

[email protected]

Challenges:

Twenty functional areas

+

User preferences questions

Possible Solutions:

Two usability tests

Surveys

User notes

Focus groups

Results:

26 tasks + 10 interview questions

100 findings

Problems Found

Total number of different usability problems found 300

Found by
seven teams   1
six teams     1
five teams    4
four teams    4
three teams   15
two teams     49
one team      226 (75%)

CHI99 Panel Comparative Evaluation of Usability Tests

Presentation by

Klaus Kaasgaard, Kommunedata

Denmark

[email protected]

Slides currently not available

CHI99 Panel Comparative Evaluation of Usability Tests

Presentation by

Lars Schmidt, Framtidsfabriken Networkers

Denmark

[email protected]

Team E

Framtidsfabriken Networkers Testlab, Denmark

Key learnings CUE-2

Setting up the test
– Insist on dialog with customer
– Secure complete understanding of user groups and user tasks
– Narrow down test goals

Writing the report
– Use screen dumps
– State conclusions - skip the premises
– Test the usability of the usability report

Improving Test Methodology

Searching for usability and usefulness
– Hook up with different methodologies (e.g. interviews)

Focus on website context
– Test against e.g. YahooMail
– Test against software-based email clients

CHI99 Panel Comparative Evaluation of Usability Tests

Presentation by

Meghan Ede, Sun Microsystems

California, USA

[email protected]

Hotmail Study Requests

18 Specific Features e.g. Registration, Login, Compose...

6 Questions e.g. "How do users currently do email?"

24 Potential Study Areas

Usability Methods

Expert Review
– 6 Reviewers
– 6 Questions

Usability Study
– 6 Participants (3 + 3)
– 5 Tasks (with sub-tasks)

Report Description

1. Executive Summary - 4 Main High-Level Themes - Brief Study Description

2. Debriefing Meeting Summary - 7 Areas (e.g. overall, navigation, power features, ...)

3. Findings - 31 Sections

- Study Requests, Extra Areas, Bugs, Task Times, Study Q & A

4. Study Description

Total: 36 Pages - 150 Findings

Lessons Learned

Importance of close contact with product team

Consider including:
– severity ratings
– more specific recommendations
– screen shots

Discussion Issues

How can we measure the usability of our reports?

How to deal with the difference between number of problems found and number included in report?

CHI99 Panel Comparative Evaluation of Usability Tests

Presentation by

Wilma van Oel, P5

The Netherlands

[email protected]

Wilma van Oel

P5 adviseurs voor produkt- & kwaliteitsbeleid (quality & product management consultants)

Amsterdam, the Netherlands

Structure of Presentation

1. Introduction

2. Deviations in approach
– Test design
– Results and recommendations

3. Lessons for the future
– Change in approach?
– Was it worth the effort?

Introduction

• Company: P5 Consultants

• Personal background: psychologist

Test design
– Subjects: n=11, pilot, ‘critical users’, 1 hour session
– Data collection: log software, video recording
– Methods: lab evaluation + informal approach
– Techniques: exploration, task execution, think aloud, interview, questionnaire
– Tool: SUS
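Assuming "SUS" here means the System Usability Scale (the slide does not spell it out), a questionnaire of that kind is scored as sketched below. This follows the standard published SUS scoring rule, not anything taken from the P5 report; the function name and the sample ratings are hypothetical.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one participant's ten 1-5 ratings."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs ten ratings, each from 1 to 5")
    total = 0
    for item, rating in enumerate(responses, start=1):
        # Odd-numbered items are positively worded: contribute rating - 1.
        # Even-numbered items are negatively worded: contribute 5 - rating.
        total += (rating - 1) if item % 2 == 1 else (5 - rating)
    # The summed contributions (0-40) are scaled to a 0-100 range.
    return total * 2.5


# Hypothetical ratings for one participant (not data from the study).
example = [4, 2, 4, 1, 5, 2, 4, 2, 4, 3]
print(sus_score(example))
```

A single score like this summarizes perceived usability for one participant; it does not say which problems were found, so it complements rather than replaces the task-based findings.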

A Test Session

Results and recommendations

Negative: n = median

Positive: n > mean

Recommendations: general, not 'how'

Results: 'general', severity?

Lessons for the future

Change in approach?
– Methods: add a usability inspection method
– Procedure: extensive analysis, add session time
– Results: less general, severity?

Was it worth the effort?
– Company: to get experience & benchmarking
– Personally: to improve skills, knowledge

CHI99 Panel Comparative Evaluation of Usability Tests

Presentation by

Meeta Arcuri, Microsoft Corporation

California, USA

[email protected]

Meeta Arcuri

User Experience Manager

Microsoft Corp., San Jose, CA

CUE - 2 The Customer’s Perspective

Customer Summary of Findings

New findings: ~4%

Validation of known issues: ~67%
– Previous findings from our lab tests
– Findings from on-going inspections

Remainder: beyond Hotmail usability
– Business reasons for not changing
– Out of Hotmail’s control (partner sites)
– Problems generic to the web

Report Content: Positive Observations

Quick and dirty results

Recommendations for problem fixes

Participant quotes – get tone/intensity of feedback

Exact number of participants who encountered each issue

Background of participants

Environment (browser, speed of connection, etc.)

Additional Strengths of Reports

Fresh perspectives

Lots of data on non-US users

Recommendations from participants

Trend reporting

Report of outdated material on site (some help files)

Appreciate positive findings, comments

Report Content: Weaknesses

Some recommendations not sensitive to web issues (performance, security)

At least one finding irreproducible (not preserving fields in Reg. Form)

Frequency of issue reported was sometimes vague

Some descriptions terse, vague - had to decipher

How Hotmail Will Use Results

Cross-validate new findings with Hotmail Customer Service reports

Lots of good data to cite in planning meetings

Some good recommendations given by labs and participants

Conclusion

Focused, iterative testing would give better results

Wide array of user data very valuable

Overall - good qualitative and quantitative data to help prioritize, schedule, and improve usability of Hotmail.

CHI99 Panel Comparative Evaluation of Usability Tests

Presentation by

Rolf Molich, DialogDesign

Denmark

[email protected]

Comparison of Tests

Based only on test reports

Liberal scoring

Focus on major differences

Two generally recognized textbooks:

– Dumas and Redish, “A Practical Guide to Usability Testing”

– Jeff Rubin, “Handbook of Usability Testing”

Resources

Team                        A    B    C    D     E    F    G    H    J
Person hours used for test  136  123  84   (16)  130  50   107  45   218
# Usability professionals   2    1    1    1     3    1    1    3    6
Number of tests             7    6    6    50    9    5    11   4    6

Usability Results

Team                 A    B    C    D    E    F    G    H    J
# Positive findings  0    8    4    7    24   25   14   4    6
# Problems           26   150  17   10   58   75   30   18   20
% Exclusive          42   71   24   10   57   51   33   56   60

Usability Results

Team                        A    B    C    D    E    F    G    H    J
# Problems                  26   150  17   10   58   75   30   18   20
% Core problems (100%=26)   38   73   35   8    58   54   50   27   31
Person hours used for test  136  123  84   NA   130  50   107  45   218
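To make the percentages easier to read, the sketch below converts each team's share of core problems back into an approximate count out of 26 and, as a rough illustration only, divides each team's problem count by its person hours. The problems-per-hour ratio is an assumption added here, not a measure used in the slides, and it ignores differences in test scope and reporting style.

```python
CORE_TOTAL = 26  # "100%=26" in the table above

teams = ["A", "B", "C", "D", "E", "F", "G", "H", "J"]
problems = [26, 150, 17, 10, 58, 75, 30, 18, 20]
core_pct = [38, 73, 35, 8, 58, 54, 50, 27, 31]
hours = [136, 123, 84, None, 130, 50, 107, 45, 218]  # None = NA for team D

# Approximate number of core problems each team reported.
core_found = {t: round(p / 100 * CORE_TOTAL) for t, p in zip(teams, core_pct)}

# Problems reported per person hour, skipping the team with no hours given.
per_hour = {t: n / h for t, n, h in zip(teams, problems, hours) if h is not None}

for t in teams:
    rate = f"{per_hour[t]:.2f}" if t in per_hour else "NA"
    print(f"Team {t}: ~{core_found[t]} of {CORE_TOTAL} core problems, "
          f"{rate} problems/hour")
```

Each percentage lands close to a whole number of core problems, which is consistent with the "100%=26" note in the table.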

Problems Found

Total number of different usability problems found 300

Found by
seven teams   1
six teams     1
five teams    4
four teams    4
three teams   15
two teams     49
one team      226 (75%)

Conclusion

If Hotmail is typical, then the total number of usability problems for a typical web-site is huge, much larger than you can hope to find in one series of usability tests

Usability testing techniques can be improved

We need more awareness of the Usability of Usability work

http://www.dialogdesign.dk/cue2.htm

Download Test Reports and Slides