
EXTENDED HAND ATTRIBUTES FOR TOUCH INPUT, TOUCH OUTPUT AND TOUCHLESS INTERACTION

by

Aakar Gupta

A thesis submitted in conformity with the requirements
for the degree of Doctor of Philosophy

Graduate Department of Computer Science
University of Toronto

© Copyright 2018 by Aakar Gupta


Abstract

Extended Hand Attributes for Touch Input, Touch Output and Touchless Interaction

Aakar Gupta
Doctor of Philosophy

Graduate Department of Computer Science
University of Toronto

2018

As newer computing devices vary in their screen size and proximity to the user, they limit old and enable new affordances. However, our interactions still rely on a limited set of our anatomical capabilities which do not take advantage of these new affordances and usage scenarios. This thesis investigates how to use three different hand attributes - distinct fingers, the hand tactile sense, and hand dexterity - for novel interactions. We investigate these three attributes for three different device scenarios respectively - touch input for small touchscreen devices, touch output for wrist wearables, and touchless interactions for distant, large displays. With distinct fingers, we investigate how they can be used to solve the problem of limited input space in small touchscreen interactions and enable novel interfaces. With the hand tactile sense, we investigate how the persistent skin contact of wearable devices can be utilized to move the use of haptics beyond simple vibrations and make it a more central component of interaction. With hand dexterity, we investigate how it can solve the problems inherent in semaphoric and manipulative in-air gestures. Overall, we describe six investigations pertaining to these areas which show how the extended hand attributes can make existing tasks more efficient, make existing devices more expressive, and make novel interfaces possible.


Publications and Copyright Notices

Sections of this document have appeared in publications or are forthcoming (at the time of writing). In all cases, permission has been granted by the publisher for these works to appear here. Below, the publisher's copyright notice and the list of publications are given.

Association for Computing Machinery

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Aakar Gupta, Thomas Pietrzak, Cleon Yau, Nicolas Roussel, Ravin Balakrishnan. Summon and Select: Rapid Interaction with Interface Controls in Mid-air. In Proceedings of ACM ISS 2017. 10 Pages. (to appear)

Aakar Gupta, Antony Irudayaraj, Ravin Balakrishnan. HapticClench: Investigating Squeeze Sensations using Memory Alloys. In Proceedings of ACM UIST 2017. 9 Pages. (to appear)

Aakar Gupta, Muhammed Anwar, Ravin Balakrishnan. Porous Interfaces for Small Screen Multitasking using Finger Identification. In Proceedings of ACM UIST 2016. 12 Pages.

Aakar Gupta, Antony Irudayaraj, Vimal Chandran, Goutham Palaniappan, Khai Truong, Ravin Balakrishnan. Haptic Learning of Semaphoric Finger Gestures. In Proceedings of ACM UIST 2016. 8 Pages.

Aakar Gupta, Thomas Pietrzak, Nicolas Roussel and Ravin Balakrishnan. Direct Manipulation in Tactile Displays. In Proceedings of ACM CHI 2016. 11 Pages. Honorable Mention Award.

Aakar Gupta and Ravin Balakrishnan. DualKey: Miniature Screen Text Entry via Finger Identification. In Proceedings of ACM CHI 2016. 12 Pages. Honorable Mention Award.


Contents

1 Introduction 1
1.1 The Problem Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Research Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.2.1 Touch Input for Small Touchscreens using Distinct Fingers . . . . . . . . . . . . . . . . . 2

1.2.2 Touch Output from Wrist Wearables using the hand’s Tactile Sense . . . . . . . . . . . . 4

1.2.3 Touchless Interaction for distant large screens using Hand Dexterity & Tactile Sense . . . 4

1.3 Research Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

2 Background Work 7
2.1 Touch Input for Small Touchscreens . . . . . . . . . . . . . . . . . . . . . . . . 7

2.1.1 Enhancing Touch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

2.1.2 Alternate Inputs for Small Screens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2.1.3 Screen Occlusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2.2 Touch Output from Mobile & Wearable Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

2.2.1 Skin & Perception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

2.2.2 Actuation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.2.3 Tactile Feedback in Wearable Interaction . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.3 Touchless Gestural Interaction in Mid-air . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.3.1 Taxonomy of Gestures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.3.2 Detection & Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.3.3 Freehand Gestures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

3 Using Distinct Fingers on Smartphones for Multitasking 21
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

3.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3.2.1 Small Screen Multitasking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3.2.2 Partially Transparent Windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3.3 Porous Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

3.3.1 Porosity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

3.4 Why Partial Transparency? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

3.5 Why Finger Identification? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3.6 Fidelity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3.7 An Implementation of Porous Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

3.7.1 Finger Identification Prototype . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28


3.7.2 Software System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.8 Porous Window Management . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

3.8.1 Window Environment Setup . . . . . . . . . . . . . . . . . . . . . . . . 29
3.8.2 Window Environment Switching . . . . . . . . . . . . . . . . . . . . . . 30
3.8.3 Foreground-Background Indicator . . . . . . . . . . . . . . . . . . . . . 31
3.8.4 Dynamic Transparency Control . . . . . . . . . . . . . . . . . . . . . . 31
3.8.5 The Vanishing Notification . . . . . . . . . . . . . . . . . . . . . . . . . 31

3.9 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.9.1 Camera+Messaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.9.2 Drag and Drop in Droppable Zones . . . . . . . . . . . . . . . . . . . . 34
3.9.3 Simultaneous Keyboard use in two apps . . . . . . . . . . . . . . . . . . 34

3.10 User Feedback on Porous Interfaces . . . . . . . . . . . . . . . . . . . . . . . 34
3.10.1 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

3.11 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.12 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

4 Using Distinct Fingers on Wrist Wearables for Text Entry 38
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

4.2.1 Finger Identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.2.2 Miniature Screen Text-Entry . . . . . . . . . . . . . . . . . . . . . . . . 39

4.3 DualKey . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.3.1 Finger Identification Prototype . . . . . . . . . . . . . . . . . . . . . . . 41
4.3.2 Keyboard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

4.4 Performance Evaluation: DualKey QWERTY . . . . . . . . . . . . . . . . . . . 42
4.4.1 Participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.4.2 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.4.3 Apparatus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

4.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.5.1 Text-Entry Speed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.5.2 Error Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.5.3 Comparison with Existing Techniques . . . . . . . . . . . . . . . . . . . 44

4.6 Further analysis of accuracy and speed . . . . . . . . . . . . . . . . . . . . . . 45
4.6.1 Analyzing DualKey Errors . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.6.2 Analyzing DualKey Speed: Finger-Switching Time . . . . . . . . . . . . 46

4.7 Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.7.1 Step I: Finger Assignment Optimization . . . . . . . . . . . . . . . . . . 48
4.7.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.7.3 Step II: Nearest-qwerty Layout Optimization . . . . . . . . . . . . . . . 50

4.8 Performance Evaluation: DualKey SWEQTY . . . . . . . . . . . . . . . . . . . 51
4.8.1 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

4.9 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.9.1 Performance Improvements . . . . . . . . . . . . . . . . . . . . . . . . 53
4.9.2 Limitations and Future Work . . . . . . . . . . . . . . . . . . . . . . . . 53


4.9.3 Designing Finger Identification Interactions . . . . . . . . . . . . . . . . . . . . . . . . . 53

4.10 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

5 Tactile Squeezing Sensations for Wrist Wearables 55
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

5.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

5.3 Squeezing Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

5.4 Design Process: Making of a Strong SMA Squeezing Actuator . . . . . . . . . . . . . . . . . . . 57

5.5 The HapticClench System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

5.6 Psychophysics of HapticClench . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

5.6.1 Absolute Detection Threshold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

5.6.2 Discrimination Thresholds (JND) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

5.7 HapticClench: Capabilities and Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

5.7.1 MultiClench: Spatial patterns using multiple springs . . . . . . . . . . . . . . . . . . . . 62

5.7.2 RingClench: HapticClench on a Finger . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

5.8 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

5.8.1 Visual+Haptic Feedback: Squeezing Bracelets . . . . . . . . . . . . . . . . . . . . . . . 65

5.8.2 Design Guidelines for SMA Squeezing . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

5.8.3 Design Guidelines for Squeezing Perception . . . . . . . . . . . . . . . . . . . . . . . . 66

5.8.4 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

5.8.5 Applications and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

5.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

6 Tactile Direct Manipulation in Wrist Wearables 69
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

6.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

6.3 Direct Manipulation in Tactile Displays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

6.3.1 The Tactile Screen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

6.3.2 The Tactile Indirect Pointing Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

6.3.3 Tactile Indirect Pointing Interface Actions . . . . . . . . . . . . . . . . . . . . . . . . . . 73

6.3.4 Control & Progress in Tactile Displays . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

6.4 Proof of Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

6.4.1 Tactile Screen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

6.4.2 Tactile Pointer Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

6.4.3 Tactile Target Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

6.4.4 Tactile Interface Actions Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . 77

6.4.5 Transfer Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

6.5 Study I: Movement & Target Distinction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

6.5.1 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

6.6 Study II: A Performance Model for DMTs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

6.6.1 Experiment Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

6.6.2 Experiment Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

6.6.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

6.7 Study III: A Tactile Menu Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82


6.7.1 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

6.7.2 Task Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

6.7.3 Results: Menu Application Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

6.8 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

6.8.1 Contexts of Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

6.8.2 Design Guidelines for DMTs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

6.8.3 Limitations and Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

6.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

7 Learning Freehand Semaphoric Gestures using Tactile Finger Wearables 89
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

7.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

7.2.1 Haptics for Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

7.2.2 Semaphoric Gestural Commands Learning . . . . . . . . . . . . . . . . . . . . . . . . . 90

7.3 Design of a Freehand Semaphoric Gesture Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

7.3.1 Visually Meaningful Gestures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

7.4 Hardware Implementation of Haptic Rings . . . . . . . . . . . . . . . . . . . . . 92

7.5 Study of Active Haptic Learning of Gestures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

7.5.1 Experiment Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

7.5.2 Apparatus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

7.5.3 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

7.6 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

7.6.1 Quantitative Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

7.6.2 Subjective Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

7.7 Discussion, Research and Design Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . 97

7.7.1 Active vs. Passive Haptic Learning Approach . . . . . . . . . . . . . . . . . . . . . . . . 97

7.7.2 Active Haptic Learning Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

7.7.3 Freehand Finger Tap Gestures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

7.7.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

8 Easing Freehand Manipulative Gestures by moving beyond Point & Select 99
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

8.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

8.2.1 Manipulative gestures for virtual objects . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

8.2.2 Reducing Fatigue in Mid-air . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

8.2.3 Alternative to Point & Select . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

8.3 Summon & Select . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

8.4 Design Elements & Midas Touch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

8.4.1 1. Summon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

8.4.2 2. Disambiguate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

8.4.3 3. Manipulate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

8.4.4 4. Release . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

8.5 Summon & Select: Advantages and Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

8.5.1 Advantages & Applicability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104


8.5.2 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
8.6 Prototype Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
8.7 Study I: Slider Disambiguation & Dragging . . . . . . . . . . . . . . . . . . . . 106

8.7.1 Study Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
8.7.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

8.8 Study II: Summon & select vs Point & Select . . . . . . . . . . . . . . . . . . . 108
8.8.1 Study Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
8.8.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

8.9 Haptic Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
8.9.1 Study design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
8.9.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

8.10 Discussion & Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
8.10.1 Integration with Point & select . . . . . . . . . . . . . . . . . . . . . . . 113
8.10.2 Generalizability and Learnability . . . . . . . . . . . . . . . . . . . . . . 114

8.11 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

9 Conclusion 115
9.1 Summary and Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . 116

9.1.1 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
9.2 Congruent Themes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

9.2.1 From Peripheral to Central . . . . . . . . . . . . . . . . . . . . . . . . . 118
9.2.2 Wearables as Tools for Interaction . . . . . . . . . . . . . . . . . . . . . 118
9.2.3 Learning of Novel Interfaces . . . . . . . . . . . . . . . . . . . . . . . . 118

9.3 The Final Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

Bibliography 120


List of Figures

1.1 The computer’s image of a human user. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 The diversity of screens wrt size and proximity to the user . . . . . . . . . . . . . . . . . . . . . 2

1.3 The thesis investigates three hand attributes within respective scenarios: Distinct fingers for small touchscreens, Hand tactile sense for wrist wearables, Hand dexterity for distant, large screens. . . 3

1.4 Chapters 3-8 flow. Chapters 3-4 pertain to distinct fingers for touch input, 5-6 pertain to the wrist tactile sense for touch output, and 7-8 pertain to hand dexterity for touchless interaction using freehand gestures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2.1 The hierarchy of touch input literature we cover in our review. . . . . . . . . . . . . . . . . . . . 7

2.2 The hierarchy of touch output literature we cover in our review. . . . . . . . . . . . . . . . . . . . 12

2.3 Figure shows the amount of cortical area dedicated to a particular body part which represents the degree of innervation (density of nerves) of that part. For instance, lips have a very high density of tactile nerves. The image is reproduced from [Kandel et al., 2000] . . . . . . . . . . . . . . . 13

2.4 Figure shows the distribution of receptors in the skin. The image is reproduced from Goldstein 1999 [Goldstein, 1999]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2.5 The organization of touchless interaction we cover in our review . . . . . . . . . . . . . . . . . . 19

3.1 Porous Interfaces enable overlaid semi-transparent apps accessible using different fingers as input 22

3.2 (Left) Messaging on photo gallery, (Right) the semantically transparent overlapped version . . . . 24

3.3 Single-step content transfer using the beat gesture: Sharing an image from gallery in back to messaging in front using the middle-to-index beat . . . . . . . . . . . . . . . . . . . . . . . . 25

3.4 Single-step content transfer using the beat gesture: copy-pasting text from messaging in front to maps in back using the index-to-middle beat . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

3.5 Finger Identification Prototype with IR sensors . . . . . . . . . . . . . . . . . . . . . . . . . 29

3.6 Porous window setup. (Left) Middle tap on app that goes in the back. (Right) Index tap on app that goes in front . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

3.7 Paired app switcher. (Left) Middle tap on app switch icon shows (Right) the switcher for pairs of porous windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

3.8 (Left) Dynamic Transparency Control invoked using a middle swipe from top, (Right) the messaging app appears in background when it receives a new message notification . . . . . . . . . 32

3.9 Interaction Flow Diagram for Porous Interfaces which maintain fidelity with the existing smartphone interface. Porous interactions and screens are in orange and existing smartphone interactions and screens are in grey . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

3.10 Camera+messaging: the beat gesture captures the picture, and sends it to the chat recipient instantly 33


3.11 Keyboard works in sync with the porous interface. (Left) Middle tap types in the back messaging app. (Right) Index tap types the address in the front maps app . . . . . . . . . . . . . . . . 34

3.12 Questionnaire results boxplot. 7 indicates high usefulness and easiness on a Likert scale. . . . . . 35

4.1 DualKey (a) Index finger types 'ty' key's left letter 't' (b) Middle finger types 'ty' key's right letter 'y' . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

4.2 (a) Sensor mounted on finger (b) Hardware Setup . . . . . . . . . . . . . . . . . . . . . . . . 41

4.3 DualKey’s Mean Speed WPM by Day . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

4.4 Speed of two participants P1 and P2 over 15 days . . . . . . . . . . . . . . . . . . . . . . . . . . 43

4.5 DualKey’s TER, UER by Day . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

4.6 Mean Corrected Error Rate (CER), Finger Error Rate (FER), Swap-Correction Rate (SCR) over 10 days . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

4.7 Mean time duration between characters for each of the 4 finger configurations over 10 days. I - Index, M - Middle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

4.8 (a) DualKey QWERTY (b) DualKey SWEQTY . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

4.9 Mean Speeds of DualKey: SWEQTY vs. QWERTY . . . . . . . . . . . . . . . . . . . . . . . . . 52

4.10 Mean TER% of DualKey: SWEQTY v/s QWERTY . . . . . . . . . . . . . . . . . . . . . . . . . 52

5.1 HapticClench’s squeezing tangential & shear forces . . . . . . . . . . . . . . . . . . . . . . . . . 56

5.2 HapticClench wire design iterations. (a) SMA wire + Velcro (b) Wire + Velcro + Restorative Spring (c) Coiled wire in series + insulation (d) Wires in parallel (e) SMA Spring + Restorative Spring + Hook (f) Final: SMA Spring + Hook, no restorative spring . . . . . . . . . . . . . . . 58

5.3 HapticClench circuit+spring assembly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

5.4 HapticClench Load vs Power supplied . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

5.5 ∆Load (kg) (y-axis), Base Load Values (x-axis). Table shows fraction of responses that judged two stimuli as equal. As base load increases, the offset needs to be higher. . . . . . . . . . . . 61

5.6 The 75% and 95% JND values for base loads . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

5.7 (left) MultiClench, (right) RingClench . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

5.8 Accuracy of MultiClench patterns (95% CI) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

5.9 Comfort & annoyance boxplots for MultiClench . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

5.10 Mean Accuracy % for all three pulses for both durations. Continuous pulse is least accurate. [95%CI] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

5.11 A loose bracelet squeezing into the skin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

6.1 A 1D circular 360x1 tactile display around the wrist. On left, tactile pointer is over a void whose tactile response frequency is represented in green. On right, user navigates the pointer to a target, where the tactile response frequency is different (orange). . . . . . . . . . . . . . . . . . . . . 73

6.2 State Transitions for Pointing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

6.3 State Transitions for Target Execution & Selection . . . . . . . . . . . . . . . . . . . . . . . . . . 74

6.4 State Transitions for Dragging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

6.5 (left) Wristband, (middle) Actuator positions, (right) Study 1 sample layout . . . . . . . . . . . 76

6.6 Influence of target width, distance, block and position on pointing time . . . . . . . . . . . . . . . 80

6.7 Influence of width and block on accuracy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81


6.8 Study 3 Menu Layout: 4, 8 items. Blue regions are items. Dot partitions are activation zones for contained items. Pointer starts at 22°. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

6.9 Study 3 Menu Application Results: Mean Accuracy for Execution and Drag & Drop tasks . . . 84

6.10 Study 3 Menu Application Results: Mean Time for Execution and Drag & Drop tasks. (e) User approach to finding an item. (f) Mental and Physical Demand of the tasks. . . . . . . . . . . . 85

6.11 Study 3 Menu Application Results: (left) User approach to finding an item, (right) Mental and Physical Demand of the tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

7.1 (left) A participant wearing the haptic rings setup for the Visual-Haptic condition. (right) The screen for the Visual-Visual condition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

7.2 Object images and names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
7.3 Recall Rate % for all techniques by blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
7.4 Hint Rate % for all techniques by blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
7.5 (left) Questionnaire Results, (right) Mnemonics used by participants by learning technique . . . 96

8.1 Steps of summon & select for the bottom slider. (0) Idle (1) Summoning gesture for slider (2) Disambiguating by zoning to the desired slider (blue focus moves to the bottom slider) (3.1-3.3) Manipulation: (3.1) Enter Drag gesture to enter dragging mode (green box around the bar) (3.2) Dragging the slider bar (3.3) Exit Drag gesture to exit dragging mode (4) Release gesture to release the control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

8.2 Example summoning gestures and manipulation gestures (in blue arrows) for different control types: Button (airtaps), dial knob (pinched rotation), switch (lateral thumb-tap), spinbox (lateral index, middle airtaps), paired buttons (vertical index, middle airtaps) . . . . . . . . . . . . . . . 102

8.3 Summon & select state machine. The numbers correspond to Figure 8.1. . . . . . . . . . . . . 104

8.4 (left) Study I interface: Sliders are numbered 1-5 from the bottom. (right) Study II interface: Buttons are numbered 1-3 from top to bottom on left and 4-6 on right. The original screens had more free space around these snapshots which have been cropped here for space. . . . . . . . . 106

8.5 Study I: Mean reach times for Tabbing and Zoning for the five sliders. Tabbing is faster for SLIDER 1, while zoning is faster for SLIDER 5. Error bars are 95% CI. . . . . . . . . . . . . . 107

8.6 Study II: Reach Time per TECHNIQUE for left & right buttons. Error bars are 95% CI. . . . . . 110

8.7 Study II: Mean selection time per BUTTON SIZE per TECHNIQUE. Error bars are 95% CI. . . . 110

8.8 Study II: Mean click time per BUTTON SIZE per TECHNIQUE. Error bars are 95% CI. . . . . . 111

8.9 Study II: Qualitative results for Button Selection . . . . . . . . . . . . . . . . . . . . . . . . . 112

9.1 The three hand attributes being investigated: Chapters 3,4: Distinct Fingers (Blue), Chapters 5,6: Tactile Sense on the Wrist (Purple), Chapters 7,8: Hand Dexterity (Yellow) . . . . . . . . . . . 115

9.2 Chapter distribution for interactions with wearables, using wearables for other devices, and both. . 119


Chapter 1

Introduction

1.1 The Problem Space

In 2004, O'Sullivan et al. illustrated how a human would be perceived by a computer if that perception were based only on our interactions with its input and output devices, which are usually the mouse, the screen, the touchscreen, and the headphones. To such a computer, we would look like just a finger, two ears, and an eye, as shown in Figure 1.1.

While we have progressed from this 2004 image to a certain extent, our current interactions with laptops and smartphones would more or less still lead to this image. The hand, undisputedly the most important body part for our input to computers, is still limited in its expression as an interaction tool. While it has served our interactions with laptops and smartphones well, today we see a proliferation of new computing devices with a diversity of sizes and proximities (Figure 1.2). These devices differ in their affordances and demand interaction techniques that treat these differences not as constraints but as opportunities to devise novel interaction techniques for them. We propose that the hand, with a larger set of its capabilities deployed for interaction, can yield interaction techniques for these new classes of devices that (a) make existing tasks more efficient, (b) enable tasks that are considered infeasible for a device, and (c) enable their use in constrained usage scenarios.

Figure 1.1: The computer's image of a human user.

Figure 1.2: The diversity of screens with respect to size and proximity to the user

We divide the hands' use in interactions into three parts: touch input using hands and fingers, touch output on the hand (haptics), and touchless interaction (freehand interaction). We investigate these three aspects for three different device contexts: touch input for small touchscreens, touch output from wearable devices, and touchless interaction for distant large screens.

1.2 Research Overview

For each of the above interaction contexts, a specific capability of the hand is used: distinct fingers, the hand tactile sense, and hand dexterity. We now look at these contexts, the associated hand capabilities and the research questions in more detail. Figure 1.3 illustrates the three threads.

1.2.1 Touch Input for Small Touchscreens using Distinct Fingers

As touchscreens shrink for smaller devices while the users' fingers remain the same size, the input space is severely constrained. An obvious solution is to make the touch input richer by using other distinct properties of finger touches, such as using knuckles, finger pads, pressure, or even hover. However, an area which has been overlooked by most literature is enabling touchscreen interactions that can distinguish between different fingers. The first thread of this thesis therefore tackles the problem of investigating interactions that utilize finger identification.

While there have been works that have touched upon finger identification (which we detail in Chapter 2), there were no end-to-end investigations into how finger identification can be used to enable interfaces and applications that perform better and solve problems inherent in small screen devices. Consequently, we focus on two specific problems pertaining to small screens –


Figure 1.3: The thesis investigates three hand attributes within respective scenarios: Distinct fingers for small touchscreens, Hand tactile sense for wrist wearables, Hand dexterity for distant, large screens.

1. The small screen size of smartphones does not allow the side-by-side app display that speeds up multitasking on desktops. This results in users engaging in constant app switching to do a sequential form of multitasking. In addition to the obvious temporal cost, it requires physical and cognitive effort which increases multifold as the back and forth switching becomes more frequent. How can we design interfaces that enable desktop-style single-step multitasking on small smartphone screens?

2. The even smaller touchscreens of wrist wearables are unable to support a high performance text entry mechanism. Fast and accurate access to keys for text entry remains a challenge for miniature screens. How can we design text entry for wrist wearables with a significant quantitative improvement?

To this end, we introduce two novel interaction techniques that solve these problems using finger identification –

1. Using Distinct Fingers on Smartphones for Multitasking

We propose Porous Interfaces, which use finger identification on the touchscreen combined with translucent windows to support efficient multitasking on small screens. Translucent app windows are overlaid on top of each other, each of them being accessible simultaneously using a different finger as input (a minimal event-routing sketch follows this list). Porous interfaces include a broad range of multitasking interactions with and between windows while ensuring fidelity with the existing smartphone interactions. Chapter 3 details the concept, implementation, and a qualitative study of porous interfaces.

2. Using Distinct Fingers on Wrist Wearables for Text Entry

Existing works typically use a cumbersome two-step selection process, first to zero in on a particular zone and second to make the key selection. We propose DualKey, which uses finger identification to multiplex two letters in one key, enabling single-tap entry (a small letter-resolution sketch also follows this list). Chapter 4 details the technique, the prototype, and the results of a 10-day longitudinal study with 10 participants that evaluated speed, accuracy, and learning. It also details a layout optimization method that shows increased speed and reduced errors in another 10-day study.
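To make the porous-interface idea concrete, the following minimal sketch shows how a touch event tagged with a finger identity could be routed to either the front or the back translucent window. The class and method names are hypothetical; this is not the Chapter 3 prototype (which identifies fingers with IR sensors), only an illustration of the routing idea.

    # Minimal sketch of finger-identified event routing for porous interfaces.
    # Names are hypothetical; the Chapter 3 prototype differs in its details.
    from dataclasses import dataclass

    @dataclass
    class TouchEvent:
        x: float
        y: float
        finger: str  # "index" or "middle", as reported by a finger-identification sensor

    class PorousWindowManager:
        def __init__(self, front_app, back_app):
            self.front_app = front_app  # translucent window drawn on top
            self.back_app = back_app    # window visible through the front one

        def dispatch(self, event: TouchEvent):
            # Index-finger touches operate the front app; middle-finger touches
            # pass "through" the translucent surface to the back app.
            target = self.front_app if event.finger == "index" else self.back_app
            target.handle_touch(event.x, event.y)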
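Similarly, a sketch of DualKey's core idea, assuming the finger-to-letter convention of Figure 4.1 (index selects a key's left letter, middle its right letter); the key groupings other than 'ty' are placeholders, not the studied layout.

    # Sketch of DualKey's single-tap disambiguation: each key carries two letters
    # and the identified finger picks between them.
    DUAL_KEYS = {
        "ty": ("t", "y"),  # as in Figure 4.1: index -> left letter, middle -> right letter
        "qw": ("q", "w"),  # illustrative grouping only
    }

    def resolve_letter(key: str, finger: str) -> str:
        left, right = DUAL_KEYS[key]
        return left if finger == "index" else right

    assert resolve_letter("ty", "index") == "t"
    assert resolve_letter("ty", "middle") == "y"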


1.2.2 Touch Output from Wrist Wearables using the hand’s Tactile Sense

As with the input space, the output space for wearable devices such as smartwatches and rings is limited by their screen size. The on-body placement of wearables provides an opportunity to potentially overcome these output limitations of the visual modality by using another modality - the tactile one. The sense of touch is capable of perceiving a wide range of sensations, and few body locations have higher tactile acuity than the hand [Mancini et al., 2014]. Wearable devices are in steady, persistent contact with the body which opens the opportunity for haptics to play a more central role in interactions. The second thread of the thesis investigates how touch output or haptics in wrist wearables can play an enhanced role in our interactions. We draw on two particular insights –

1. The sense of touch is varied in its detection of multiple different sensations on the skin, and expanding to sensations beyond the standard vibrations can lead to novel interaction possibilities. How do we enhance the role of haptics in smart wrist wearables in providing feedback beyond the standard vibrotactile feedback?

2. While haptics does have value in assisting interactions, the approach for haptics in HCI should go beyond its role as a secondary channel. How can smart wrist wearables enable haptic displays that not just enhance, but make haptics the central modality of interaction?

Consequently, we introduce two different ways of using haptics in this space, each drawing from the above insights -

1. Tactile Squeezing Sensations for Wrist Wearables

Squeezing sensations are one of the most common and intimate forms of human contact. We build HapticClench, a wrist-worn device that generates squeezing sensations using shape memory alloys. Chapter 5 details the formalized concept of squeezing feedback in terms of its perceptual properties. It further details a psychophysical evaluation of HapticClench (the discrimination-threshold note after this list gives the intuition behind distinguishable levels). HapticClench is capable of generating up to four levels of distinguishable load and works well in distracted scenarios. Further, it details the study of multiple capabilities such as a high spatial acuity that can generate recognizable spatial patterns on the wrist and ambient communication of gradual progression of an activity.

2. Tactile Direct Manipulation in Wrist Wearables

Tactile displays have predominantly been used for information transfer using patterns or as assistive feedback for interactions. With recent advances in hardware for conveying increasingly rich tactile information that mirrors visual information, and the increasing viability of wearables that remain in constant contact with the skin, there is a compelling argument for exploring tactile interactions as rich as visual displays. Direct Manipulation underlies much of the advances in visual interactions. Chapter 6 introduces the concept of a Direct Manipulation-enabled Tactile display (DMT) and defines the concepts of a tactile screen, tactile pixel, tactile pointer, and tactile target which enable tactile pointing, selection and drag & drop (a minimal sketch follows this list). The chapter details a proof of concept DMT and three studies that investigate DMT's precision limits, target acquisition and performance in a tactile menu application.
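As background for the discrimination-threshold (JND) measurements mentioned above, the standard Weber-fraction relation suggests why distinguishable intensity levels are spaced multiplicatively; the numbers below are purely illustrative and are not HapticClench's measured thresholds.

\[
\frac{\Delta I}{I} = k
\qquad\Rightarrow\qquad
I_{n+1} = (1 + k)\, I_n
\]

For example, with an illustrative Weber fraction k = 0.3 and a base load of 0.5 kg, the next reliably distinguishable levels would fall near 0.65 kg, 0.85 kg, and 1.1 kg.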
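The following minimal sketch, assuming the 1D circular layout of Figure 6.1 and a hypothetical actuate() callback, illustrates the DMT idea: an indirectly controlled tactile pointer is an angle around the wrist, rendered at one vibration frequency over empty space and at another when it lies inside a tactile target. The frequencies and the API are assumptions, not the Chapter 6 prototype values.

    # Sketch of a 1D circular tactile display: the tactile pointer is an angle on
    # the wrist, rendered at a different frequency over targets than over voids.
    VOID_HZ, TARGET_HZ = 80, 250  # illustrative frequencies, not prototype values

    class TactileDisplay:
        def __init__(self, actuate, targets):
            self.actuate = actuate    # callable: actuate(angle_deg, frequency_hz)
            self.targets = targets    # list of (start_deg, end_deg) tactile targets
            self.pointer = 0.0        # pointer angle in degrees

        def over_target(self) -> bool:
            return any(start <= self.pointer <= end for start, end in self.targets)

        def move(self, delta_deg: float):
            # Indirect pointing: input displacement moves the tactile pointer, and
            # the tactile response on the skin tells the user what lies under it.
            self.pointer = (self.pointer + delta_deg) % 360
            self.actuate(self.pointer, TARGET_HZ if self.over_target() else VOID_HZ)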

1.2.3 Touchless Interaction for distant large screens using Hand Dexterity & Tactile Sense

The hand, including the forearm, palm and fingers, is the most dexterous body part. In the previous subsection, we focused on how the tactile sense can be used on the wrist to make haptics more central to our interactions. We now investigate how the tactile sense of our fingers can couple with hand dexterity to enhance touchless interaction for distant large screens.

Freehand interactions are touchless interactions that involve gesturing in air with just the hands and fingers without any proxy control devices. Freehand interactions in air are becoming popular as a result of their potential use with large screens, public displays, virtual reality, augmented reality and other scenarios where the user cannot easily touch the screen or a controlling device. We investigate freehand interactions for distant large screen displays. Freehand gestures can be broadly divided into two classes - semaphoric and manipulative. Both these gesture classes have inherent challenges -

1. Semaphoric gestures are predefined gesture sets which are difficult to learn if the sets are large. While this problem has been explored in desktop and touchscreen contexts, the problem has been underexplored in mid-air interactions, where the inertia to learn new gestures is high since the user has to spend a lot of time in front of the large display. How can we enable a learning method that can be used in on-the-go scenarios?

2. Manipulative gestures in mid-air are riddled with problems such as arm fatigue and the need for precision pointing. This stems from their reliance on point & select, an interaction model that was never designed for mid-air. How can we design an interaction paradigm that solves these problems of point & select?

We propose solutions to these challenges that are aided by haptics -

1. Learning Freehand Gestures using Tactile Finger Wearables

Semaphoric freehand gesture sets require learning the gestures and their associations. Most work in semaphoric gestural learning is in the space of touchscreens and desktops and relies on visual learning. Minimal work has been done in this space for freehand gestures. Further, these methods require the user to be in front of the screen and do not allow learning anywhere and everywhere. For instance, language learning while traveling is very popular. We propose haptic learning of freehand semaphoric gestures that can be performed anywhere without the need for screens. Chapter 7 details a two-day study of 30 participants where we couple haptic stimuli from finger rings with visual and audio stimuli, and compare their learning performance with wholly visual learning. The results indicate that with <30 minutes of learning, haptic learning of finger tap semaphoric gestures is comparable to visual learning and maintains its recall on the second day.

2. Easing Freehand Manipulative Gestures by moving beyond Point & Select

Current freehand interactions with large displays rely on point & select as the dominant paradigm. However, constant hand movement in air for pointer navigation leads to hand fatigue very quickly. We introduce Summon & Select, a new model for freehand interaction where, instead of navigating to the control, the user summons it into focus and then manipulates it (a simplified state-machine sketch follows this list). Summon & select solves the problems of constant pointer navigation, need for precise selection, and out-of-bounds gestures that plague point & select. Chapter 8 describes the interaction model and two studies that evaluate its feasibility & design and compare its performance against point & select in a multi-button selection study. The results show that summon & select is significantly faster and has less physical and mental demand than point & select.
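A compact sketch of the summon & select flow, following the step numbering of Figure 8.1; the gesture names are placeholders for the recognizer outputs, and Chapter 8 gives the full state machine (Figure 8.3).

    # Simplified summon & select state machine (step numbers follow Figure 8.1).
    # Gesture names are placeholders for the recognizer outputs used in Chapter 8.
    TRANSITIONS = {
        ("idle",         "summon_gesture"): "summoned",      # 1. summon a control type
        ("summoned",     "zone"):           "summoned",      # 2. disambiguate between matching controls
        ("summoned",     "enter_drag"):     "manipulating",  # 3.1 start manipulation
        ("manipulating", "drag"):           "manipulating",  # 3.2 e.g. drag a slider bar
        ("manipulating", "exit_drag"):      "summoned",      # 3.3 stop manipulation
        ("summoned",     "release"):        "idle",          # 4. release the control
    }

    def step(state: str, gesture: str) -> str:
        # Unrecognized gestures leave the state unchanged, limiting Midas-touch activations.
        return TRANSITIONS.get((state, gesture), state)

    state = "idle"
    for g in ["summon_gesture", "zone", "enter_drag", "drag", "exit_drag", "release"]:
        state = step(state, g)
    assert state == "idle"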

1.3 Research Organization

In Chapter 2, we describe the overarching background work pertaining to all three areas. Chapters 3 and 4 describe the two works on finger identification. Chapters 5 and 6 describe the two works on wearable haptics. Chapters 7 and 8 describe the two works on freehand interaction. Chapter 3 starts with finger identification for smartphone screens, chapter 4 investigates finger identification for even smaller touchscreens on wrist wearables, chapter 5 investigates enhancing haptic actuation for wrist wearables, chapter 6 investigates enhancing haptic interactions using wrist wearables by making haptics the central interaction modality, chapter 7 investigates making haptics central to learning of semaphoric freehand gestures in mid-air, and chapter 8 investigates a model combination of semaphoric and manipulative freehand gestures for mid-air interaction and how haptics can help realize the model. All contain their specific descriptions of related work, conclusions, and future work. Chapter 9 describes the general directions of future work that our work opens up. An illustration of the chapters' primary questions and how they follow from one to the next is shown in Figure 1.4.

Figure 1.4: Chapters 3-8 flow. Chapters 3-4 pertain to distinct fingers for touch input, 5-6 pertain to the wrist tactile sense for touch output, and 7-8 pertain to hand dexterity for touchless interaction using freehand gestures.


Chapter 2

Background Work

In the subsequent three sections, we review the background work for the three areas: Touch Input, Touch Output, and Touchless Interaction.

2.1 Touch Input for Small Touchscreens

For touch input, we focus on techniques that specifically address the small screen problem. We classify these approaches into three categories: enhancing touch input for small screens, alternate input techniques to interact with small screens, and techniques that address screen occlusion. Figure 2.1 shows the detailed hierarchy of touch input literature we cover in this review.

Figure 2.1: The hierarchy of touch input literature we cover in our review.


2.1.1 Enhancing Touch

We first look at techniques that enhance touch input on small screens. While most of these techniques have been proposed for smartphones, they inform our understanding of precise touch input, which is useful for smaller screen devices such as smartwatches. Existing works have addressed the problem of precise and occlusion-free touch on touchscreens using two broad approaches: expanding the touch gesture vocabulary, or extracting more nuanced information from the touch attributes themselves. A third, relatively underexplored, approach is identifying different fingers on a touchscreen, which is where our work lies. We discuss each of these one by one.

Touch Gestures

In this subclass, we discuss works that focus on improving touch without using any external modalities, purely based on the information from single or multiple touches and the extended vocabulary.

Touch Extensions: Different finger touch behaviors have been used to extend the vocabulary of touch events on a touchscreen to address precision and occlusion – precise selection using dual finger selection [Benko et al., 2006], finger roll events [Bonnet et al., 2013], bezel-swiping [Roth and Turner, 2009], and distant double taps [Heo and Lee, 2013]. In this subset of extended touch vocabulary, there have been a few works that focus specifically on wearables – Beats [Oakley et al., 2015] proposes rapid, sequential or overlapping taps of the index and middle fingers that are easy and intuitive in a smartwatch context.

Gesturing: Aside from the ubiquitous swipe, pinch and zoom, using touch-surface stroke gestures to operate mobile interfaces has been studied extensively to reduce the need for precision and the need for visual attention. Graffiti and Unistrokes [Castellucci and MacKenzie, 2008] are gestural text-entry techniques. Gesture Avatar [Lü and Li, 2011] uses gestures for precise operation of mobile interfaces. Another popular class of gestures that are location dependent are crossing gestures. Crossing gestures can be modeled as well as pointing tasks [Accot and Zhai, 2002]. Luo and Vogel [Luo and Vogel, 2014] study crossing-based selection for touch input. Zhai et al. [Zhai, 2012] provide a foundational review of touch stroke gestures. The primary issues concerning these gestures are measuring and modeling the stroke complexity, differentiating between gestures, and the cognitive and memory requirements.
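For reference, both pointing and, per Accot and Zhai, goal-crossing movement times are commonly modeled with Fitts' law; the Shannon formulation is shown below, with a and b as empirically fitted constants for a given device and task.

\[
MT = a + b \log_2\!\left(\frac{D}{W} + 1\right)
\]

Here D is the distance to the target (or crossing goal) and W is its width; the logarithmic term is the task's index of difficulty in bits.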

Touch Attributes

In this subclass, we discuss works that use additional touch attributes that cannot be detected by standard software and require additional hardware or raw capacitive data.

Finger Hover: Multiple works have investigated hover on large touch surfaces [Annett et al., 2011, Hilliges et al., 2009, Marquardt et al., 2011, Rekimoto and Jun, 2002, Takeoka et al., 2010] and commercial applications of it can be seen in smartphones as well [Samsung, 2017, Sony, 2017]. Hover has been used to emulate the three-state model of a mouse [Buxton, 1990a], to show a preview of the target under hover [Samsung, 2017, Sony, 2017, Gu et al., 2013, Marquardt et al., 2011], or to show a GUI widget [Marquardt et al., 2011, Annett et al., 2011, Hinckley et al., 2016]. Hover has also been used for in-air gestures above the phone screen or as a combination of in-air + touch gestures to enhance interactions [Kratz and Rohs, 2009]. Further, it can be used to activate the intended touch target to reduce latency [Hinckley et al., 2016] and provide instant feedback [Xia et al., 2014]. Touch latency has also been shown to improve using specialized hardware and software [Leigh et al., 2014].
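As a rough illustration of how hover maps onto Buxton's three-state model (the state names below are paraphrases, not Buxton's exact terminology):

    # Buxton's three-state input model, as approximated by a hover-capable touchscreen:
    # state 0 = out of sensing range, state 1 = tracking (hover), state 2 = engaged (touch).
    def input_state(hovering: bool, touching: bool) -> int:
        if touching:
            return 2  # engaged, analogous to moving a mouse with the button pressed
        if hovering:
            return 1  # tracking, analogous to moving a mouse with no button pressed
        return 0      # out of range: no position is being reported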

Touch Force: Force or pressure touch on touchscreens has seen a lot of exploration. The force can either be normal to the touchscreen or tangential (shear). Multiple methods for normal force/pressure detection have been proposed, including using force sensors under the screen [Heo and Lee, 2012], around the screen [Essl et al., 2010], based on device motion [Hinckley and Song, 2011], vibration changes [Goel et al., 2012], muscle sensing [Benko et al., 2009], fingertip [Abu-Khalaf et al., 2009] or fingernail color or force change [Hsiu et al., 2016, Marshall et al., 2008]. The interactions using normal force define distinct additional states for each distinct force value [Buxton, 1990a]. This includes Blackberry's Storm 2 [Schulman, 2017] for previews and tooltips and Apple's force touch for a right-click menu. Hinckley et al. [Hinckley and Song, 2011] use hard touch for faster access to information or to simulate a click state [Han and Lee, 2015, Benko et al., 2006]. Goel et al. [Goel et al., 2012] use three levels of pressure for one-handed zoom operations, while Heo et al. [Heo and Lee, 2012] use it to change dragging modes. Normal force has also been used for continuous manipulation of different parameters such as scroll and zoom levels, speed of video navigation [Apple, 2017], in layering tasks to change item depths, and for navigation in 3D space [Clarkson et al., 2006].

Shear forces have been detected using sensors in the touchscreen [Harrison and Hudson, 2012, Heo and Lee, 2013, Lee et al., 2012a], but such approaches fail for multi-finger use cases. Other methods use changes in the fingernail [Mascaro and Asada, 2001], fingertip [Delhaye et al., 2016], gel deformations under the finger [Nakai et al., 2014], or finger movement [Heo and Lee, 2013] to detect shear angles of touch. Shear forces are two dimensional in nature and have been used for 2D pointing interactions such as variable-speed scrolling through an e-book, panning, and zooming [Heo and Lee, 2013]. Further, they have been used for continuous parameter control [Harrison and Hudson, 2012] and secure, invisible password input [Lee et al., 2012b].

Touch Pose: Different touch angles and poses of the finger have been investigated. Xiao et al. [Xiao et al., 2015] estimate the yaw and pitch of the finger touch angle, enabling finger rocking and rotation on the touchscreen. Huang et al. detect which part of the finger, such as the flat or the side, is used for touch [Huang et al., 2014]. Rogers et al. [Rogers et al., 2011] show how the finger's 3D orientation can be used to improve touch accuracy. Hwang et al. [Hwang et al., 2015] and Webb et al. [Webb et al., 2016] use the smartwatch to estimate touch angles and hand orientation. Further, Boring et al. [Boring et al., 2012] show how thumb touch size can be detected and used for interactions. Harrison et al. [Harrison et al., 2011] show how nails and knuckles can be detected and used. Similarly, Cao et al. [Cao et al., 2015] and Hinckley et al. [Hinckley et al., 2016] use different touch shapes to perform different actions. In addition, touch precision has been shown to improve using the finger's Bayesian touch distance from the nearest target [Bi and Zhai, 2013], skewing touches based on crowd data [Henze et al., 2011], and personalized touch models [Baldwin and Chai, 2012].

Finger Identification

Fast and accurate access to keys remains an open question for miniature screens. This is especially problematic in interfaces that necessitate a large number of buttons, like a keyboard. The question we explore is how to make the keys effectively bigger without increasing the number of steps or requiring gestures beyond finger tapping. One potential solution is to make the touchscreen sensitive to finger identity when the tap hits the screen. Although there are no market-ready technical solutions, multiple promising threads of research [Benko et al., 2009, Marquardt et al., 2011, Sugiura and Koseki, 1998, Goguey et al., 2014b, Colley and Häkkilä, 2014, Ewerling et al., 2012] have demonstrated the feasibility of finger identification in larger scale systems like the tabletop. It is reasonable to expect that techniques for finger identification will eventually become viable for smaller screen devices, and hence it is worth exploring now whether finger identification can improve the text input experience on small screens.

Prior work on finger identification either tackles the technical problem of identifying fingers or focuses on potential interactions that result from identifying fingers. A variety of sensing methods have been used for finger identification on touchscreens, including fiduciary tags [Goguey et al., 2014a, Marquardt et al., 2011], multitouch finger arrangements [Au and Tai, 2010, Ewerling et al., 2012, Wagner et al., 2014, Westerman and Elias, 2006], and muscle sensing [Benko et al., 2009]. Further, there are user touch authentication technologies that make use of optical fingerprinting [Fitzpatrick et al., 1994, Holz and Baudisch, 2013], capacitive coupling [Dietz and Leigh, 2001] that uniquely grounds each user, or capacitive fingerprinting [Harrison et al., 2012b] that senses the body's electrical properties. Most of these techniques were demonstrated for tabletops. However, some of them, notably fingerprinting and muscle sensing, hold promise for future miniaturization. Recent efforts by Apple and others [Swanner, 2015, Yousefpor and Bussat, 2014] on embedding fingerprinting into displays show commercial interest in this space.

Another set of techniques that enable finger identification interactions use a fixed orientation of the hand onthe touchscreen to initialize a mapping of fingers [Au and Tai, 2010], or detect the shape of the hand to identifyfingers on a tabletop using vision [Ewerling et al., 2012]. However, these techniques do not actually identifyfingers and therefore only work with multitouch chording gestures where the relative position of the fingers onthe touchscreen makes it known which finger is which.

The exploration of interactions utilizing finger identification, however, is limited [Benko et al., 2009, Marquardt et al., 2011, Sugiura and Koseki, 1998, Goguey et al., 2014b, Colley and Häkkilä, 2014, Ewerling et al., 2012]. A popular use case is using different fingers to signal different modes or buttons – cut/copy/paste, menu shortcuts, etc. [Benko et al., 2009, Roy et al., 2015, Sugiura and Koseki, 1998]. Also suggested is using fingers as containers for URLs [Sugiura and Koseki, 1998], copied items [Benko et al., 2009], and brush colors [Marquardt et al., 2011]. Finger-specific chording has been used for multi-step commands [Goguey et al., 2014b], mixing brush colors [Marquardt et al., 2011], and a virtual mouse [Ewerling et al., 2012].

These works mostly focus on how specific actions within an application can be modified for the better usingfinger identification. Our works (Chapters 3,4) propose novel interactions that utilize finger identification on anapplication or an interface level and conduct in-depth investigations of the design and performance.

2.1.2 Alternate Inputs for Small Screens

Aside from touch, the most popular alternate input modality for touchscreens is the stylus. Nanostylus [Xia et al.,2015] is a fine-tip stylus specifically designed for high precision interaction on smartwatches. Further, multimodaltechniques that use pen+touch [Hinckley et al., 2010] and grip+motion sensing [Hinckley et al., 2014] have alsobeen proposed. Besides the stylus, the other techniques typically perform input by using the area around thedevice.

Multiple works use additional instrumentation to enable around-the-device interactions for small mobile screens. Tilting operations using accelerometers have been explored for interactions including scrolling and typing [Hinckley et al., 2000, Rekimoto, 1996, Wigdor and Balakrishnan, 2003]. Multiple works have explored in-air gestures around the device. PalmSpace detects mid-air gestures around the device using a depth camera for 3D tasks [Kratz et al., 2012], and SideSwipe detects in-air gestures around mobile devices using the GSM signal [Zhao et al., 2014]. SideSight senses touch around the device using IR sensors [Butler et al., 2008]. TouchZoom uses motion trackers on the finger for precise target selection by zooming in on target areas as the finger approaches [Yang et al., 2011]. AD-Binning [Hasan et al., 2013] allows users to off-load content into the space around the device and interact with it using finger motion tracking.

Physical properties of the device have been utilized as well. LucidTouch [Wigdor et al., 2007] and NanoTouch [Baudisch and Chu, 2009] use back-of-device interactions to overcome occlusion. GripSense detects hand grip posture and pressure to augment interactions [Goel et al., 2012]. Tangible interactions with physical, controllable widgets around the device have also been explored. TUIC and Clip-On use multi-touch instruments on the device that convey unique capacitive information to the screen [Yu et al., 2011]. MagGetz [Hwang et al., 2013] uses passive magnetic instruments around the device, and Acoustruments use acoustically-driven plastic instruments near the microphone [Laput et al., 2015] to enable tangible interactions on the mobile phone.

A lot of work within this space focuses on the smartwatch specifically. Multiple touchable extensions of the watchface have been proposed around the watch. Watchit uses the watch strap for touch input [Perrault et al., 2013]. Oakley et al. use the perpendicular edge of the watch as a touchpad [Oakley and Lee, 2014]. Facet replaces the watch strap with multiple displays around the wrist [Lyons et al., 2012]. iSkin [Weigel et al., 2015] uses a paper-thin and flexible on-body touch sensor to augment wearable devices. Further, IR sensing has been used to enable touch interaction on the skin around the smartwatch: Skin-buttons [Laput et al., 2014] uses tiny projectors embedded in the watch to render touchable icons on the skin, and Nakatsuma et al. [Nakatsuma et al., 2011] embed IR sensors at the side of the watch to enable a touch interface on the back of the hand. In-air input techniques have also been explored: Abracadabra [Harrison and Hudson, 2009] uses magnets for finger input around the device, and Minput [Harrison and Hudson, 2010] uses IR sensors to detect device movement in the plane and control interaction. zSense enables shallow depth sensing over wearable devices like rings, smartwatches, and eyeglasses to enable finger gestures [Withana et al., 2015]. Wrist rotation has also been explored as a way of interaction [Crossan et al., 2008]. Xiao et al. use a mechanical watch face's pan, twist, and click to increase input expressivity [Xiao et al., 2014]. Orbits [Esteves et al., 2015] detects selection by tracking the user's gaze as it follows a target moving around the desired control.

These alternate modalities and instrumentations have also been explored in multimodal contexts with or without touch. Air+Touch [Chen et al., 2014c] interweaves in-air gestures with touch using a depth camera. Sensor Synaesthesia [Hinckley and Song, 2011] combines touch with device-motion gestures, and Dual-Surface Input [Yang et al., 2009] augments one-handed touch with coordinated behind-the-screen input using a touchpad. Duet [Chen et al., 2014b] explores joint smartwatch and phone interactions based on the devices' spatial configurations, motion, and touch input. Weigel et al. [Weigel et al., 2014] study the space of on-skin input and people's preferences for which skin areas to use; the hand and forearm were found to be the most preferred.

Aside from these, speech interaction is a natural alternative for devices with small or no screens, but it has problems associated with accuracy, consistency, and social acceptability.

2.1.3 Screen Occlusion

Multiple works have tried to address the screen occlusion problem on touchscreens by enhancing the display space. They can be classified into augmented reality extensions, display hardware extensions, and interaction techniques that overcome real-estate issues. MultiFi combines augmented reality projections with small screen devices [Grubert et al., 2015]. Naildisplay [Su et al., 2013] overcomes screen occlusion by mounting a small display on the nail.

Multiple display interaction techniques have been proposed to handle occlusion on small screens – Shift duplicates the occluded content above the finger [Vogel and Baudisch, 2007]. Escape enables selection gestures cued by icon position and appearance [Yatani et al., 2008]. Peephole displays [Yee, 2003] propose interactions for spatially-aware displays which provide a window on a larger virtual space. Pointing Lenses [Ramos et al., 2007] use pressure-activated magnification to acquire small targets. An interesting category of devices where visual output is significantly limited is implanted devices, which do not seemingly have any display capability – Holz et al. [Holz et al., 2012] use LED intensity variations to demonstrate simple output capabilities for such devices. Harrison et al. [Harrison et al., 2012a] studied the LED point lights embedded in current smartphones and provide recommendations on how to use them efficiently. NotiRing [Roumen et al., 2015] also studies light notifications for rings.

2.2 Touch Output from Mobile & Wearable Devices

Haptics refers to both the tactile and kinesthetic sense of the body. However, our focus will primarily be on devicesthat operate on the skin either in a tactile fashion or using localized muscle stimulation, and not on kinestheticsthat move the limbs and body. Further, we’ll focus on technologies that have been applied specifically in thecontext of wearables. We look at the literature in three parts - 1) Skin & Perception, 2) Haptic Actuation and 3)Haptics in Interaction. Figure 2.2 shows the hierarchy of touch output as covered in our review.

Figure 2.2: The hierarchy of touch output literature we cover in our review.

2.2.1 Skin & Perception

Human Body and Tactile Sensors

The tactile sensations sensed by the skin can be categorized into light touch, vibration, temperature, pressure, pain, and itch. The skin consists of three broad types of tactile receptors in different layers whose mechanical and neurological properties [Goldstein, 1999, Rock, 1984] lead to tactile sensations. (1) Mechanoreceptors detect light touch, vibration, and pressure. (2) Thermoreceptors detect temperature and its variations. (3) Nociceptors detect pain and itching. The feeling of the various sensations depends on the varying spatial and temporal properties of the receptors. The density of these receptors varies across the regions of the body. Figure 2.3 shows the distribution of tactile sensors from high to low. The response time of the different receptors determines the range and resolution of the sensations that can be felt. For instance, vibration receptors are fast adapting, which lets them distinguish between a large range of high and low vibration frequencies, while pressure is associated with slow adapting receptors. Based on these, there are two psychophysical properties associated with tactile actuation - spatial acuity, which is the spatial resolution of actuation and is studied heavily in localization studies, and tactile sensitivity, which determines the range and resolution of the detection of stimuli.

Figure 2.4 illustrates common receptors. Merkel receptors respond to pressure at about 0–10 Hz, Meissner corpuscles respond to touch within 3–50 Hz, Ruffini endings respond to stretching of skin, and Pacinian corpuscles respond to vibration within 100–500 Hz.


Figure 2.3: Figure shows the amount of cortical area dedicated to a particular body part which represents the degree of innervation (density of nerves) of that part. For instance, lips have a very high density of tactile nerves. The image is reproduced from [Kandel et al., 2000].

Jones et al. [Jones and Sarter, 2008] and Pasquero [Pasquero, 2006] present detailed research reviews of the psychophysical investigations for tactile sensing. These psychophysical studies involve investigating multiple spatial and temporal tactile attributes including intensity, frequency, and localization via two basic experiment schemes - the absolute detection threshold, which gives an estimate of the lowest stimulus value a user can detect, and the just noticeable difference, which gives an estimate of the lowest difference in values at which a user can tell two stimuli apart.

Perception

Besides this, a lot of work has gone into understanding the perceptual properties of tactile feedback, particularly vibrotactile feedback.

Single Body Site

The simplest tactile feedback is single point stimulation on the skin. Psychophysics research has broken this down into four questions. First, the absolute detection threshold of the stimulus, which is an estimate of the lowest stimulus value that can be detected by a particular area of the skin. For vibrotactile feedback, this is measured in terms of the frequency and intensity of vibration. As is common in tactile perception, absolute thresholds are affected by a number of factors, such as body site, contact area, stimulus duration, stimulus waveform, contact force, skin temperature, the presence of other masking stimuli, and age. See [Jones and Lederman, 2006, Johansson and Flanagan, 2009, Morioka and Griffin, 2006] for detailed analyses of these factors.
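
To illustrate how such absolute thresholds are estimated in practice, the sketch below implements a simple one-up/two-down adaptive staircase, a standard psychophysical procedure that converges near the 70.7% detection point. The simulated observer, the "true" threshold, and the step size are placeholder assumptions for illustration only, not values from the studies cited above.

import math
import random

# Minimal sketch of a one-up/two-down adaptive staircase for estimating an
# absolute detection threshold. All numeric values are illustrative.

def simulated_detect(intensity, true_threshold=0.5, slope=10.0):
    """Probabilistic observer: detection probability rises with intensity."""
    p = 1.0 / (1.0 + math.exp(-slope * (intensity - true_threshold)))
    return random.random() < p

def staircase(start=1.0, step=0.05, reversals_needed=8):
    intensity, streak, direction, reversals = start, 0, 0, []
    while len(reversals) < reversals_needed:
        if simulated_detect(intensity):
            streak += 1
            if streak == 2:               # two detections in a row -> step down
                streak = 0
                if direction == +1:
                    reversals.append(intensity)
                direction = -1
                intensity = max(0.0, intensity - step)
        else:                             # any miss -> step up
            streak = 0
            if direction == -1:
                reversals.append(intensity)
            direction = +1
            intensity += step
    return sum(reversals[-6:]) / len(reversals[-6:])  # average of late reversals

if __name__ == "__main__":
    print("estimated detection threshold:", round(staircase(), 3))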

The second question is whether the user can distinguish between different frequencies or intensities of the vibrotactile cues being displayed.


Figure 2.4: Figure shows the distribution of receptors in the skin. The image is reproduced from Goldstein 1999[Goldstein, 1999].

This capability is quantified by the discrimination threshold or just noticeable difference (JND), which gives an estimate of the lowest difference in values at which the user can tell two stimuli apart. JNDs depend on the strength of the reference stimulus, usually in adherence to the Weber fraction, the ratio of the difference threshold to the reference level. According to Weber's law [Gescheider et al., 1990], Weber fractions for the same stimuli tend to remain constant regardless of reference level. Although there are exceptions [Jones and Sarter, 2008], Weber fractions are around 10%–30% for vibration intensity (but can range between 4.7% and 100%) and mostly around 15%–30% for vibration frequency (ranging between 2% and 72%) [Israr et al., 2006].
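
One practical consequence of Weber's law is that the number of reliably distinguishable levels of a tactile parameter grows only logarithmically with its range. The short sketch below computes that count; the example ranges and the 20% fraction are illustrative assumptions drawn loosely from the figures above, not measurements from any particular device.

import math

# Number of discriminable steps between a minimum and maximum stimulus value
# when each just-noticeable difference is a constant Weber fraction `w` of the
# current level: n = floor(log(max / min) / log(1 + w)).

def discriminable_levels(lo, hi, weber_fraction):
    return int(math.floor(math.log(hi / lo) / math.log(1.0 + weber_fraction)))

if __name__ == "__main__":
    # Illustrative values only: a 10x intensity range and a 50-300 Hz
    # frequency range, both at an assumed 20% Weber fraction.
    print(discriminable_levels(1.0, 10.0, 0.20))    # ~12 intensity levels
    print(discriminable_levels(50.0, 300.0, 0.20))  # ~9 frequency levels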

The third question is what the response time for tactile cues is. Vibrotactile perception generally has high temporal acuity. For example, we can distinguish successive pulses with a latency of 5 ms [Gescheider et al., 1990], which is better than vision (25 ms), but worse than auditory acuity (0.01 ms) [Jones and Lederman, 2006]. Further, temporal variations of intensity can form rhythms, which we can recognize and discriminate with high accuracy [van Erp and Spapé, 2003].

Multiple Body Sites

With multiple actuation points around the body, there are two questions. First, what factors govern the ability to distinguish tactile cues at neighboring locations on the body? This is traditionally measured using a two-point threshold, the smallest distance at which two simultaneous stimuli are reliably perceived as distinct. Although researchers still use aesthesiometers to gather two-point thresholds for clinical studies, the accuracy of such measurements as an indicator of spatial acuity is questionable: two closely placed probes can lead to tactile illusions. Current methods therefore include measuring point localization thresholds and grating orientation discrimination thresholds. In the former, only one stimulus is applied on the skin at a time [Stevens and Choo, 1996]. The latter yields the smallest spatial period of gratings for which orientation can be reliably distinguished [Vega-Bermudez and Johnson, 2001]. Spatial discrimination depends heavily on frequency, amplitude, and the body part [Jones and Lederman, 2006]. For example, the localization acuity of 250-Hz vibrotactile stimuli around the waist was 74% with 12 equidistant tactile actuators, 92% with eight tactors, and 97% with six tactors [Cholewiak et al., 2004]. This high discrimination ability can be useful for multiple applications.

The second question here is how the user perceives multipoint tactile cues when they are closer together. This question has led to the uncovering of various tactile illusions pertaining to vibrotactile actuation. For example, sensory saltation, or the cutaneous rabbit, leads to a feeling of a continuously moving pulse when proximal linear points on the skin are actuated rapidly in sequence [27]. A phantom tactile illusion leads to a single phantom tactile sensation in between two simultaneously actuating motors. TactileBrush [Israr and Poupyrev, 2011] investigates how variations of these can be generated algorithmically and used for multiple applications. A thorough review of tactile illusions can be found in [Hayward, 2008].
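
To make the phantom illusion concrete, the sketch below computes the two actuator amplitudes for a phantom sensation using the energy-summation model described for Tactile Brush [Israr and Poupyrev, 2011]: a phantom at normalized position beta between two tactors with desired intensity Av is rendered with A1 = sqrt(1 - beta) * Av and A2 = sqrt(beta) * Av. The function and variable names and the sweep loop are illustrative.

import math

# Energy-summation model for a phantom tactile sensation between two tactors
# (after Israr and Poupyrev, 2011): the amplitudes satisfy a1^2 + a2^2 = a_v^2,
# so the perceived energy stays constant as the phantom moves.

def phantom_amplitudes(beta, a_v=1.0):
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be in [0, 1]")
    return math.sqrt(1.0 - beta) * a_v, math.sqrt(beta) * a_v

if __name__ == "__main__":
    # Sweep the phantom from tactor 1 to tactor 2 in five steps.
    for i in range(6):
        beta = i / 5.0
        a1, a2 = phantom_amplitudes(beta)
        print(f"beta={beta:.1f}  a1={a1:.2f}  a2={a2:.2f}")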

While we touch upon some important aspects above, please see [Choi and Kuchenbecker, 2013] for a thorough review of the psychophysics of vibrotactile feedback. Aside from the above fundamental issues, psychophysical explorations have focused on a vast expanse of issues. They have dealt with how users perform haptic exploration [Lawson, 2014], the perception of material properties such as shape, texture, and weight [Martínez et al., 2014, Martínez et al., 2016, Martínez et al., 2013], haptic realism [Hoffman, 1998], and sensory substitution [Novich and Eagleman, 2014, Shull and Damian, 2015, Simeone, 2015].

2.2.2 Actuation

While vibrotactile actuation is the most popular, others such as point-pressure and skin stretch have also been proposed. Vibrotactile actuators can be further divided into linear resonant actuators, rotary electromagnetic actuators, and non-electromagnetic actuators that take advantage of the piezoelectric effect. A summary of these can be found in [Choi and Kuchenbecker, 2013]. Pressure and force feedback investigations include comparisons with vibrotactile feedback [Tejeiro et al., 2012], pressure onto the forearms [Mitsuda, 2013], and on the fingertips [Chung et al., 2015, Achibet et al., 2016]. More recently, Lopes et al. [Lopes and Baudisch, 2013] generate force feedback using electrical muscle stimulation instead of physical motors, thus miniaturizing force feedback. Skeletouch [Kajimoto, 2012] is a miniature transparent electrical tactile display for touchscreens. Traxion [Rekimoto, 2013] induces a virtual push or pull force using an electromagnetic coil. TeslaTouch [Bau et al., 2010] varies the electrostatic friction between the fingers and the touchpad to induce electrovibration in the finger. Skin Drag displays [Ion et al., 2015] drag a physical tactor on the user's forearm to produce tactile shapes. Tactile Brush [Israr and Poupyrev, 2011] demonstrates phantom sensations where two separate skin actuations lead to a single illusory actuation on the skin.

2.2.3 Tactile Feedback in Wearable Interaction

We now discuss the literature pertaining to the exploration of tactile feedback in interactions, with special empha-sis on wearables. We divide this into three categories - Guidance, Communication and Information display, andAssistive Feedback.

Guidance

Tactile feedback is used for providing physical guidance to the user via tactile cues in a wide variety of areas including athletics [Spelmezan et al., 2009b, Pakkanen et al., 2008, Marchal-Crespo et al., 2013, Sigrist et al., 2013], performing arts [Drobny and Borchers, 2010, van der Linden et al., 2011b, Seim et al., 2015], and rehabilitation [Sienko et al., 2013, Krebs et al., 1998]. The guidance can be feedforward (direct guidance) or corrective, which only informs users when and what they are doing wrong. Most systems use vibrotactile feedback and vary its spatial and temporal patterns and the intensity of vibrations. Existing work [Spelmezan et al., 2009a, McDaniel et al., 2011] uses saltation motion patterns to convey direction of motion. McDaniel et al. [McDaniel et al., 2011] found that the most intuitive saltation patterns are applied in a feedforward "follow me" style where the vibration direction is tangential to the movement direction. Spelmezan et al. [Spelmezan et al., 2009a] explored the more traditional "push/pull" model where vibration patterns are applied along the length of the limb. Stationary vibration at the location indicating the direction of error has been shown to evoke shorter response times for dynamic motion guidance than continuous patterns [Lieberman and Breazeal, 2007, Stanley and Kuchenbecker, 2012]. Further, tactile feedback has been used to guide pointing in air [Lehtinen et al., 2012].
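
As a concrete illustration of such directional patterns, the sketch below generates a simple saltation-style "follow me" sequence on a linear array of tactors by pulsing each tactor briefly in order along the intended direction of motion. The actuate() stub and the burst and gap durations are placeholder assumptions; real systems tune these per body site, and the values are not taken from the studies above.

import time

# Illustrative saltation-style pattern on a linear tactor array: pulse each
# tactor in order along the intended movement direction. `actuate` is a stub
# standing in for real motor-driver calls.

def actuate(tactor_id, on):
    print(f"tactor {tactor_id}: {'ON' if on else 'off'}")

def saltation_pattern(tactor_ids, burst_s=0.06, gap_s=0.05, repeats=2):
    for _ in range(repeats):
        for t in tactor_ids:
            actuate(t, True)
            time.sleep(burst_s)
            actuate(t, False)
            time.sleep(gap_s)

if __name__ == "__main__":
    saltation_pattern([0, 1, 2, 3])   # e.g. wrist-to-elbow = "move forward"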

Communication and Information Display

Multiple works investigate a range of tactile patterns for communication, from single vibrations for notifications to complex rhythms that encode messages. Before the advent of wearables, work on handheld devices dominated the literature. Spatiotemporal patterns around the edge or the back of the mobile device have been shown to have high recognition accuracy [Yatani et al., 2012, Rantala et al., 2013].

Studies have investigated tactile information displays on various body parts. Existing work has found high acuity for tactor localization [Chen et al., 2008] and placement [Matscheko et al., 2010a] on the wrist. Lee et al. [Lee et al., 2015] study the information transfer efficiency of a 3x3 vibrotactile grid at the watch-back. Karuei et al. [Karuei et al., 2011] found that for detecting a single vibration, the thighs and feet are the least sensitive body parts, followed by the waist, arm, and chest. Other studies have shown the wrist to be one of the most sensitive body parts to vibrotactile stimuli [Craig, 1977]. Buzzwear [Lee and Starner, 2010] evaluated wrist-worn displays for pattern distinction and found that users were able to distinguish 24 tactile patterns with 99% accuracy. Pasquero et al. [Pasquero et al., 2011] found that depending on the length of each vibration, participants could easily count up to 10 vibrations generated on their wrist. NotiRing [Roumen et al., 2015] studies tactile notifications for rings. Other body parts such as the forearm [Sergi et al., 2008] or cheek [Park and Han, 2010] have also been investigated.

Beyond localization and recognition of patterns, associations of patterns to notifications and icons have been investigated. Brown et al. [Brown et al., 2006] found that iconic associations of vibrations to phone contacts, formed from variations of roughness, intensity, and rhythm, were learned with a 72% recognition rate. Long-term information like progress bars [Brewster and King, 2005] or direction [Jin et al., 2014] has also been investigated and shown to work well. Further, devices on other body parts have been investigated for notifications, such as vests and belts [Jones et al., 2004, van Erp and van Veen, 2003], the back [Tan and Pentland, 2001, Yanagida et al., 2004], and the tongue [Bach-y Rita et al., 1998].

Other than vibrotactile, tactile skin stretch has been investigated for notifications in mobile devices [Luk et al.,2006] and for associations to pictorial concepts [Wang et al., 2006].

Assistive Feedback

Multiple works investigate how tactile feedback can assist with interactions by acting as a reinforcement for them. The simplest example of this is the vibration feedback upon a key press on a touchscreen. Yatani et al. [Yatani and Truong, 2009] study spatial vibrotactile patterns used to accompany visual feedback in spatial coordination tasks and show that vibrotactile feedback can reduce the information workload on the visual channel.

Multiple works show benefits of tactile feedback in pointing tasks or gestures [Carter et al., 2013, Lehtinen et al., 2012, Pfeiffer et al., 2014, Sodhi et al., 2013, Schonauer et al., 2012], target acquisition [Oron-Gilad et al., 2007], and visual search tasks [Lehtinen et al., 2012]. Further, existing work investigates vibrotactile feedback for tasks such as navigating through one-dimensional lists or menus [Linjama and Kaaresoja, 2004, Oakley and O'Modhrain, 2005, Poupyrev et al., 2002]. Schonauer et al. [Schönauer et al., 2015] studied user perception of vibrotactile feedback on the forearm for mid-air gestures such as circle, swipe left, and swipe right, and found it to be 80% accurate. "Shoogle" uses a shaking metaphor for interactive exploration of information [Williamson et al., 2007]. Luk et al. [Luk et al., 2006] and Pasquero et al. [Pasquero et al., 2007] miniaturized a piezo skin-stretch tactile display and mounted it on a passive slider, allowing for thumb movement in browsing and navigation. Tactile feedback has also been used in visually impaired scenarios such as eyes-free gestures [Lee and Starner, 2010], braille-reading [Nicolau et al., 2013], or color identification [Carcedo et al., 2016]. This is a wide space and out of the scope of this thesis.

Other than the above, some works integrate tactile feedback more tightly with the interaction. For instance, Pasquero et al. [Pasquero et al., 2011] use piezoelectric actuators on the wrist to convey notifications and gesture feedback. Harmonious Haptics [Hwang et al., 2015] uses a combination of smartwatch and phone vibrotactile actuation to enable enhanced tactile feedback. Lopes et al. [Lopes et al., 2015] propose proprioceptive interactions where the user conveys input using wrist flexion and output is conveyed to the user via electrical muscle stimulation.

Our work on tactile squeeze sensation (Chapter 5) falls under the Haptic Actuation category, specifically forwrist wearable devices. The work on tactile direct manipulation (Chapter 6) again deals with wrist wearables, butthe focus is on interaction as opposed to actuation. However, it does not fall under any of the three subcategories(Guidance/Information Display/Assistive Feedback). We detail how the work differs from these in Chapter 6.

2.3 Touchless Gestural Interaction in Mid-air

Freehand gestures involve using hand and finger movements in air to convey direct input without any proxy such as a mouse, touchscreen, or controller in between. There are multiple lenses through which we can look at the freehand gesture literature - (1) gesture detection systems, which include glove-based systems [Wang and Popovic, 2009, Zimmerman et al., 1987, Bowman et al., 2002], vision [Krueger et al., 1985, Wilson, 2006], or novel signal sensing [Saponas et al., 2009, Cohn et al., 2011]; (2) different application domains, which include virtual and augmented reality, large screen displays, desktops, and smart environments; (3) type of exploration, which includes proposing new gestures, conducting internal validity evaluations, conducting in-the-wild explorations, etc.; (4) type of gestural framework, which can broadly be divided into deictic, manipulative, semaphoric, and gesticulation [Karam and Schraefel, 2005]. In this review, we briefly discuss the taxonomy of overall gesture styles, both touch-based and touchless. We then focus the review on freehand gestures and discuss detection and application context, followed by manipulative and semaphoric gestures. Figure 2.5 shows the organization of our review.

2.3.1 Taxonomy of Gestures

Karam et al. [Karam and Schraefel, 2005] define five categories of gestures based on their style of invocation: Deictic, Manipulative, Semaphoric, Gesticulation, and Language gestures. We look at these one by one.

Deictic Gestures

In deictic gestures, users point to an object or location and perform subsequent actions such as manipulating the object by hand or using voice. "Put that there" [Bolt, 1980] is a deictic gesture that allows the user to point to virtual objects and move them using speech commands. Deictic gestures have further been explored to identify objects in virtual reality [Zimmerman et al., 1995], in CSCW applications [Kuzuoka et al., 1994], for desktop applications [Wellner, 1991], and for identifying appliances [Swindells et al., 2002, Nickel and Stiefelhagen, 2003].


Manipulative Gestures

Manipulative gestures are those which “control an entity [typically on the screen] by applying a tight relationshipbetween the actual movements of the gesturing hand or arm with the object being manipulated” [Quek et al.,2002]. There are different kinds of manipulative gestures as described by Karam et al. [Karam and Schraefel,2005]: gesturing in 2 degrees of freedom (dof) for 2D interactions like controlling a desktop mouse pointer,gesturing in 2.5 or 3 dof for 2D interactions like using pressure or finger gestures [Matsushita and Rekimoto, 710, Minsky, 1984], gesturing with tangible objects for 3D interactions [Hinckley et al., 1998], and gestures forreal-world physical object interactions like controlling a robotic arm [Fisher et al., 1987].

When gesturing with tangible objects, the users either perform touch and tap based gestures or grasp-basedgestures. Fitzmaurice et al. introduced graspable user interfaces [Fitzmaurice, 1996] for grasping physical objectsto control widgets in the interface. These have been explored in multiple scenarios. Kry et al. [Kry and Pai,2008] use a spherical device called Tango for whole handed interaction with 3D environments. Doring et al.[Döring et al., 2011] study grasping on the steering wheel. Other works explore doing grasping gestures in airwhich simulate physical grasping [Song et al., 2012, Steins et al., 2013]. Angelini et al. [Angelini et al., 2015]provide a framework for tangible grasping gestures. The gesture consists of three components: an optional handmovement, holding the physical object (or simulating the grasp of holding a physical object), and touching thephysical object. All gestures are formed from one or more instances of these three components. Hoven et al.[van den Hoven and Mazalek, 2011] also provide a comprehensive review of grasping gestures based on wherethey stand in the wider context of hand gestures.

Semaphoric Gestures

According to Quek et al. [Quek et al., 2002], semaphoric gestures "employ a stylized dictionary of static or dynamic hand or arm gestures that serve as a universe of symbols to be communicated to the machine." These can be static or dynamic. For example, joining the thumb and index finger to represent OK is a static pose, while waving the hand from left to right is a dynamic gesture. Stroke gestures are a subcategory which refers to strokes made by the fingers, stylus, etc. that map onto commands. Stroke gestures have been used for issuing commands to desktop applications [Segen and Kumar, 1998, Wu and Balakrishnan, 3 11] or for screen navigation or marking menu selections [Zhao and Balakrishnan, 4 10, Lenman et al., 2002]. A more detailed exploration can be found in [Karam and Schraefel, 2005].

Gesticulation Gestures

Gesticulation gestures are gestures which are interpreted within the context of the user’s speech. They are used incombination with conversational speech [Quek et al., 2002, Wexelblat, 1995, Kopp et al., 2004, Bolt and Herranz,1 15, Kettebekov, 0 13].

Language Gestures

These are gestures used for sign languages. They are linguistically based, consisting of series of independent gestures that combine to form grammatical structures. They are intended to carry conversations as opposed to issuing commands.

Out of the five gesture styles, our work focuses on manipulative and semaphoric gestures in the context of freehand gestures; we discuss them in more detail after discussing the work on the detection of freehand gestures.


Figure 2.5: The organization of touchless interaction we cover in our review

2.3.2 Detection & Context

A lot of earlier work went into detecting the gestures and applying them to different application domains forsimple tasks such as pointing and manipulating virtual objects. Since freehand gestures are apt for manipulatingvirtual 3D objects, multiple earlier works investigate navigation and object manipulation in virtual environmentstypically for first-person scenarios. Bowman et al. use finger pinch gloves to enable a bimanual navigationtechnique in a virtual environment [Bowman et al., 2004]. Gesture VR [Segen and Kumar, 1998] performs 3Dnavigation and object manipulation in desktop spatial simulations. Sharma et al. [Sharma et al., 1996] demonstratea VR interface for manipulating 3D molecular models. Multiple works use bimanual freehand gestures for 3Dmodeling [Krueger, 1993, Shaw and Green, 1997, Nishino et al., 1998]. Song et al. [Song et al., 2000] use fingergestures for selection and manipulation technique in virtual environments. Voodoo Dolls [Pierce and Pausch, 420] uses two-handed pinch gestures for object selection and manipulation. Sturman et al. [Sturman et al., 1989]use hand and finger gestures for their virtual environment graphical system for task-level animation. FingARtips[Buchmann et al., 2004] uses finger gestures such as grabbing, pointing and pressing for interacting with virtualobjects in AR. Gordon et al. [Gordon et al., 2002] use vision to track fingertips for pointing. Wilson demonstratedTAFFI [Wilson, 2006], a vision technique to detect simple pinching gestures for desktop tasks like cursor control,translation, rotation and scaling. Freeman et al. [Freeman and Weissman, 1995] propose television control usinghand gestures detected by vision. Hardenberg et al. [Hardenberg et al., 1 15] used finger tracking for virtualpainting and moving virtual items on a wall. Freeman et al. [Freeman et al., 1996] controlled movement andorientation of game objects such as cars by tracking hand position.

2.3.3 Freehand Gestures

We now look at the recent literature based on the type of gestural framework. Deictic gestures involve pointing to an object within the context of an application domain. Put-That-There [Bolt, 1980] is the most popular example of gestures used in conjunction with speech. Further, they've been used to identify objects in virtual environments [Zimmerman et al., 1995], in CSCW applications [Kuzuoka et al., 1994], for pointing to appliances in ubicomp contexts [Swindells et al., 2002, Nickel and Stiefelhagen, 2003], and for desktop applications [Wellner, 1991]. We will detail the work on freehand manipulative and semaphoric gestures in air as our work pertains directly to them.


Although most freehand gestural interactions in the current literature are manipulative gestures [Quek et al., 2002] (which "control an entity [on the screen] by applying a tight relationship between the actual movements of the gesturing hand or arm with the object being manipulated"), we focus our study on a second class of gestures, called semaphoric gestures [Quek et al., 2002] (which "employ a stylized dictionary of static or dynamic hand or arm gestures that serve as a universe of symbols to be communicated to the machine"). Keyboard shortcuts for commands are semaphoric gestures which involve a fixed pattern of finger movement on the keyboard. Semaphoric gesture sets hold similar potential to be used as freehand gestures to rapidly invoke command shortcuts in a diversity of contexts through carefully designed actions that would be less prone to heavy hand and arm fatigue, known as the gorilla arm effect [Hincapié-Ramos et al., 2014].

Manipulative

The manipulative gestural work has mostly focused on point & select, with the work falling under one of two categories – (a) absolute pointing: aiming the pointer at a target on a large display using raycasting, and (b) relative pointing in a virtual 2D plane. Vogel et al. [Vogel and Balakrishnan, 2005] found raycasting to be slow for smaller targets as compared to relative pointing techniques in a virtual plane, but also found relative techniques to be highly time-consuming due to inconvenient clutching requirements. Ray-casting has been altered to improve its performance – marker cone [Ren, 2013] points a cone in the ray direction. Multiple ways of selecting a target after pointing have been proposed, including a thumb–index join [Vogel and Balakrishnan, 2005], a breach and trigger gesture [Banerjee et al., 2012], wrist tilt and pinch [Ni et al., 2011], double crossing the pointer [Nakamura et al., 2008], and using the speed and distance of pointer movement to select menu items. We further discuss the specific works related to our exploration in Chapter 9.
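
For reference, absolute raycast pointing reduces to intersecting a ray from the tracked hand with the display plane. The sketch below shows that computation for a display assumed to lie in the z = 0 plane, with the hand position and pointing direction given in display coordinates in metres; the coordinate conventions and numbers are assumptions for illustration, not taken from the systems cited above.

# Minimal sketch of absolute raycast pointing: intersect a ray from the
# tracked hand with a display assumed to lie in the z = 0 plane, then convert
# the hit point to pixel coordinates (display origin at its lower-left corner).

def raycast_to_display(hand_pos, hand_dir, display_w_m, display_h_m,
                       res_x, res_y):
    px, py, pz = hand_pos
    dx, dy, dz = hand_dir
    if dz >= 0:                      # ray points away from the display plane
        return None
    t = -pz / dz                     # ray parameter where the ray hits z = 0
    hit_x, hit_y = px + t * dx, py + t * dy
    if not (0 <= hit_x <= display_w_m and 0 <= hit_y <= display_h_m):
        return None                  # pointing off-screen
    return (hit_x / display_w_m * res_x, hit_y / display_h_m * res_y)

if __name__ == "__main__":
    # Hand 2 m in front of a 4 m x 2.25 m display, pointing slightly up-left.
    print(raycast_to_display((2.0, 1.0, 2.0), (-0.1, 0.05, -1.0),
                             4.0, 2.25, 3840, 2160))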

Mid-air interactions are devoid of any haptic feedback, thus prompting interesting explorations in the contactless mid-air haptics space. Two primary methods are used for this - air jets and ultrasound acoustics. AIREAL [Sodhi et al., 2013] produces pressurized air that preserves form and speed across long distances. Ultrahaptics [Carter et al., 2013] uses ultrasound to provide uninstrumented haptic feedback in air. A review of these methods is presented in [Arafsha et al., 2015].

Semaphoric

Multiple works have investigated freehand semaphoric gestures [Baudel and Beaudouin-Lafon, 1993], includingproposing a library of gestures for common actions [Ismair et al., 2015], eliciting gestures from users [Kou et al.,2015, Ruiz and Vogel, 2015], and studying them in comparison to navigational gestures for menu selection [Baillyet al., 2011], their discoverability [Walter et al., 2013], and their use in various tasks [Ackad et al., 2015, Aslanet al., 2014]. Gunslinger [Liu et al., 2015] uses bimanual relaxed arms-down gestures with one hand doing thepointing and the other performing semaphoric gestures. Digits [Kim et al., 2012] uses a wrist worn sensor forfinger gesture recognition that can be used for performing both pointing and semaphoric gestures.

Our work on semaphoric gestures focuses on how to ease their learning by eliminating the requirement forvisual learning (Chapter 7). The work on manipulative gestures addresses the problem of slow speed and fatiguecaused by constant pointer navigation by proposing an alternative to point and select (Chapter 8).


Chapter 3

Using Distinct Fingers on Smartphonesfor Multitasking

3.1 Introduction

Salvucci et al. define multitasking as the ability to integrate, interleave, and perform multiple tasks and/or component subtasks of a larger complex task [Salvucci et al., 2004]. Multitasking forms an integral part of the way we interact with a myriad of data on computers with reasonably large screens. Smartphones, with much smaller screens, do not lend themselves naturally to such traditional forms of multitasking. While smartphone operating systems support multiple apps running simultaneously, the limited screen size limits the interface's ability to support multiple concurrently visible and rapidly accessible windows, which is the de facto solution for multitasking in larger screen interfaces. The lack of dedicated multitasking interface features has resulted in smartphone users attempting a sequential form of multitasking via frequent app switching. Böhmer et al. [Böhmer et al., 2011] describe how users switch repeatedly among already open apps within a short span and how users use groups of apps frequently in sequence. The single window constraint makes this frequent back and forth switching between apps inefficient [Leiva et al., 2012]. In addition to the obvious temporal cost, it requires physical and cognitive effort which increases multifold as the back and forth switching becomes more frequent. Given the numerous obvious mobile form-factor benefits of maintaining a small screen for smartphones, it is clearly worth exploring alternative ways in which to support multitasking on such small screen devices.

We propose Porous Interfaces, a paradigm to support efficient multitasking on small screens. Porous interfaces enable partially transparent app windows overlaid on top of each other, each of them being accessible simultaneously using a different finger as input. The semi-transparency allows at least coarse characteristics of data in multiple windows to be discerned concurrently, while finger identification enables concurrent interaction with multiple windows without necessarily bringing the window being interacted with to the top. Further, the interactions enable easier, more visible, and more fluid data transfer between overlaid windows than is currently possible with traditional "cut-copy-paste" interactions. We designed porous interfaces to include a set of characteristics that enable a broad range of multitasking interactions with and between windows, while ensuring that the interface maintains the commonly used and understood interactions of existing smartphone interfaces. We developed an end-to-end demonstration smartphone interface, including a hardware prototype that performs finger identification on the touchscreen and a series of applications that showcase porous interfaces.

1The contents of this chapter were published at UIST 2016 [Gupta et al., 2016a].


Figure 3.1: Porous Interfaces enable overlaid semi-transparent apps accessible using different fingers as input

In a qualitative study, participants found porous interfaces easy, intuitive, and very likely to be used regularly if integrated into smartphones.

3.2 Related Work

Chapter 2 discusses how different features of the finger, including finger identification, have been explored in the literature. However, none of the works address how multitasking can be made more efficient on smartphones using enhanced finger features. Section 2.1.1 further shows that the explorations of finger identification have only shown how it can be used for multiplexing the same button for different operations. None of the prior work designs or investigates end-to-end interfaces based on finger identification. There is a lack of compelling investigation into how these interactions can be uniquely ingrained in interfaces, and how they might lead to better performance. We now look at existing work on small screen multitasking and translucent windows.

3.2.1 Small Screen Multitasking

Prior research on small screen multitasking is surprisingly sparse. Nagata et al. [Nagata, 2003] found that messaging interruptions on a PDA significantly disrupt task performance. Choi et al. [Choi et al., 2016] propose easy notification peeking by flipping the smartphone cover. Mobile device designers have started to recognize the need for easy multitasking on today's mobile devices and have introduced side-by-side windowing on tablets [Budiu, 2015]. However, similar solutions are not easily adapted to smartphones owing to their small screen size.

3.2.2 Partially Transparent Windows

While there have been no recent attempts at exploring partially transparent windows in the context of mobile devices, earlier research has explored how they affect content visibility and comprehension on desktops. Harrison et al. [Harrison et al., 1995a] introduced transparent layered windows and found that the information density in the layers governed the degree of visual distinction between them. Follow-up research concluded that image-on-text or image-on-image layering is better than text-on-text layering [Harrison et al., 1995b, Harrison and Vicente, 1996].


Ishak et al. propose content-aware transparency of windows which obscures the unimportant regions in a window[Ishak and Feiner, 2004]. Besides work on layered windows, ToolGlass and MagicLens explore transparenttoolboxes on top of an application [Bier et al., 1993]. However, there have been no published investigations intooverlaying full-screen partially transparent windows, and how the user would interact with such an interface.

In summary, no existing works tackle the problem of small screen multitasking in its entirety. We use anovel combination of partially transparent overlaid windows and finger identification on smartphones to designan end-to-end interface for small screen multitasking.

3.3 Porous Interfaces

As described by Spink et al. [Spink et al., 2009], typical multitasking involves the desire to task switch, the actual task switch, the task execution, and switching back to the previous task. We observe this behavior in daily smartphone use in the form of app switching, where a user working on app A desires to switch to app B, invokes the app switcher or goes back to the home screen (1), performs optional swipe(s) to go to the appropriate location on the screen (2), performs a quick visual search to locate app B on the switcher or home screen (3), selects the app (4), waits for the app to open (5), performs tasks in the app, then invokes the app switcher or home screen again to switch back to app A (6), performs an optional swipe again (7), performs a quick visual search again (8), selects app A (9), and waits for it to reopen (10). Excluding task execution in app B, these are a total of 10 temporal, physical, or cognitive steps after the user's desire to switch apps, consuming time, physical, and cognitive effort. While most of these steps are not overly taxing, users perform app switching frequently, and the cumulative effect over time greatly affects user experience and performance. Leiva et al. [Leiva et al., 2012] found that smartphone app switching due to intended and unintended app interruptions may delay completion of a task by up to 4 times.

The situation gets worse when the user has to perform rapid repeated back and forth switching between theapps. Our aim is to reduce this switching time and effort and free users’ cognitive resources to work on task relatedoperations rather than window management operations. To this end, we classify three types of operations that theuser performs when requiring app switching – switching and viewing the content in app B without interaction,switching and interacting with content in app B, switching and transferring content from app B to app A. The aimconsequently becomes enabling the execution of these three operations to happen as rapidly as possible.

Importing the above operations to application windows in smartphones, we delineate three primary prerequi-sites of porous interfaces that enable efficient multitasking: concurrent visibility of windows, single-step interac-tion with windows, and instant content transfer between windows. To attain these prerequisites, we define threeprimary characteristics: transparent overlapped windows for concurrent visibility of windows, different fingers toaccess different windows in a single step, and multi-identified-finger gestures for instant content transfer betweenwindows. These characteristics are collectively termed as Porosity. Another integral property of our system isFidelity which ensures that the porous multitasking interface will not affect the prevailing single window use fortouchscreens and will rather be an augmentation to the existing interface.

3.3.1 Porosity

Transparent Overlapped Windows for Concurrent Visibility

Concurrent visibility of windows is the most obvious way to rapidly switch viewing between multiple apps. Currently, the way users perform app switching is wholly time multiplexed. Desktops solve the problem by space multiplexing the applications next to each other. This, however, is not possible for small screens. We therefore apply depth multiplexing [Harrison et al., 1995a], where multiple app windows overlap on top of each other while being partially transparent.

As explored in previous work on transparent windows on desktops, the potential interference of the content of one window with another could be a real problem [Harrison et al., 1995b]. We limit the number of overlapped app windows to two to minimize this interference. The two overlapping apps comprise a background (or back) app and a foreground (or front) app. The front app is made partially transparent so that both app windows are visible.

Figure 3.2: (Left) Messaging on photo gallery, (Right) the semantically transparent overlapped version

Figure 3.2 (left) shows a user scenario that we implemented. A user is looking at messages from a friend in the front app while looking at their picture gallery in the back app. The information from both apps is coarsely visible. To improve visibility, we use another form of overlap, the semantically transparent overlap, where the non-useful parts of the front app are made completely transparent so the user can see through those parts. Figure 3.2 (right) shows the semantically transparent version of Figure 3.2 (left), which has improved visibility. While the semantically transparent apps have been custom-built in our demonstration to showcase the interactions, the concept is an adaptation of Ishak et al.'s content-aware transparency [Ishak and Feiner, 2004] for desktop windows.
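
A minimal sketch of this style of compositing is shown below, assuming each front-app pixel carries a flag marking whether it is useful: non-useful pixels get zero alpha so the back app shows through unchanged, while useful pixels are alpha-blended over the back app. The names and the 0.6 alpha are arbitrary illustrative choices, not the values of our prototype.

# Illustrative per-pixel compositing for a semantically transparent overlay.
# Colours are (r, g, b) tuples in [0, 255].

FRONT_ALPHA = 0.6  # partial transparency for "useful" front-app pixels

def composite_pixel(front_rgb, back_rgb, front_is_useful):
    alpha = FRONT_ALPHA if front_is_useful else 0.0
    return tuple(round(alpha * f + (1.0 - alpha) * b)
                 for f, b in zip(front_rgb, back_rgb))

if __name__ == "__main__":
    chat_bubble, photo = (240, 240, 240), (30, 90, 160)
    print(composite_pixel(chat_bubble, photo, True))    # blended chat bubble
    print(composite_pixel(chat_bubble, photo, False))   # photo shows through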

Different Fingers for Different Windows

Concurrent visibility of windows poses an immediate problem: if two windows are visible at the same time, then how does the user interact with each window? Prior works address this by keeping only one window interactable at a time. The user can perform a selection command to choose which window to interact with. This mode change approach is suited to desktop interfaces where the windows are only partially overlaid, so the user can easily select one or the other by accessing their non-overlaid parts. For small smartphone screens, however, the partial overlay would have to be considerably smaller to make such selections possible. Further, for tasks that require rapid back and forth interactions with both windows, the intermediate selection command upsets the interaction flow and speed significantly, thus making the interaction frustrating. So even though the visibility of windows is depth-multiplexed, the interaction with them is still time-multiplexed. In fact, in a study by Kamba et al. [Kamba et al., 1996] on semi-transparent overlaid widgets over news text, participants overwhelmingly found their long-press selection of the back layer tedious and requested immediate responsiveness of both layers.

To solve the problem for small touchscreens, we propose different fingers for interacting with different windows at the same time. This reduces the interaction with each concurrently visible window to an immediate single step. In our system, the index finger corresponds to the front app and the middle finger corresponds to the back app. When the two app windows are overlaid, every interaction on screen with the index finger corresponds to the front app and every interaction on screen with the middle finger corresponds to the back app. For example, if the back app uses taps, swipes, drags, long presses, pressure touches, etc., then these operations can be performed using the middle finger. In Figure 3.2 (right), the user performs messaging with the index finger and browses the gallery using the middle finger. We will refer to taps with a specific finger as an index tap or a middle tap, and similarly for other gestures such as swipes.
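
Below is a minimal sketch of the resulting event routing, assuming the touch controller tags each event with an identified finger: index-finger events are delivered to the front window and middle-finger events to the back window. The class and event names are placeholders for illustration, not the actual implementation described later in this chapter.

# Sketch of routing finger-identified touch events in a porous interface.

class AppWindow:
    def __init__(self, name):
        self.name = name
    def handle(self, event):
        print(f"{self.name} handles {event['type']} at {event['pos']}")

class PorousDispatcher:
    def __init__(self, front, back):
        # index finger -> front (partially transparent) app, middle -> back app
        self.routes = {"index": front, "middle": back}
    def dispatch(self, event):
        window = self.routes.get(event["finger"])
        if window is not None:          # other fingers are ignored in this sketch
            window.handle(event)

if __name__ == "__main__":
    dispatcher = PorousDispatcher(AppWindow("messaging (front)"),
                                  AppWindow("gallery (back)"))
    dispatcher.dispatch({"finger": "index", "type": "tap", "pos": (120, 640)})
    dispatcher.dispatch({"finger": "middle", "type": "swipe", "pos": (200, 400)})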

Multi-identified-finger gestures for instant content transfer between apps

The third primary reason a user switches from app A to app B, besides viewing or interacting with app B, is the need to transfer content between apps. In the smartphone, we can see this manifest in the form of sharing content, getting attachment objects, and copy-pasting content. These operations again take up a lot of interaction steps for the user. We propose a pair of gestures that utilize finger identification to rapidly transfer content between concurrently visible app windows.

To transfer content selected in the front app to the back app, the user index taps the screen (without lifting up), and then middle taps in rapid succession, thus performing a “beat gesture”. Multi-finger beat gestures have been shown [Oakley et al., 2015] to operate without interfering with single touch input owing to the duration between the two taps or beats being very small. We augment the beating gesture with finger identification to make an index-to-middle finger beat distinct from a middle-to-index finger beat. Content transfer between the concurrently visible windows is thus made symmetric. Transfer from the front to back app is done using the index-to-middle beat and from the back to front app is done using the middle-to-index beat.
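The beat detection itself can be sketched roughly as follows. The 150 ms window and the class names are illustrative assumptions; the thesis only states that the duration between the two taps is very small.

    // Sketch only: detects an identified-finger "beat" – a second tap by the other
    // finger arriving in rapid succession while the first finger is still down.
    class BeatGestureDetector {
        static final int FINGER_INDEX = 0;
        static final int FINGER_MIDDLE = 1;
        static final long BEAT_WINDOW_MS = 150;   // assumed value, not from the thesis

        interface Listener {
            void onIndexToMiddleBeat();   // transfer content from the front app to the back app
            void onMiddleToIndexBeat();   // transfer content from the back app to the front app
        }

        private final Listener listener;
        private int firstFinger = -1;
        private long firstDownTime;

        BeatGestureDetector(Listener listener) {
            this.listener = listener;
        }

        // Called on every finger-down event with the identified finger.
        void onFingerDown(int finger, long timeMs) {
            if (firstFinger != -1 && finger != firstFinger
                    && timeMs - firstDownTime <= BEAT_WINDOW_MS) {
                if (firstFinger == FINGER_INDEX) {
                    listener.onIndexToMiddleBeat();
                } else {
                    listener.onMiddleToIndexBeat();
                }
                firstFinger = -1;
            } else {
                // Start (or restart) a potential beat with this finger.
                firstFinger = finger;
                firstDownTime = timeMs;
            }
        }

        // Called when all fingers have lifted; an unfinished beat is discarded.
        void onAllFingersUp() {
            firstFinger = -1;
        }
    }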

Figure 3.3: Single-step content transfer using the beat gesture: Sharing an image from gallery in back to messaging in front using the middle-to-index beat


Figures 3.3 and 3.4 show two user scenarios that we implemented to demonstrate rapid image sharing and text copy-pasting enabled by the beat gesture. In Figure 3.3, a user messaging in the front app needs to send a picture to her friend from the gallery in the back app. She selects the required picture using a middle long press, and performs the middle-to-index beat to bring it instantly into the messaging textbox, then index taps Send to send it. In Figure 3.4 the user is messaging with a friend about a restaurant and needs to look it up in the maps app in the background. She copies the restaurant address and performs an index-to-middle beat to paste it into the maps app's search box.

Figure 3.4: Single-step content transfer using the beat gesture: copy-pasting text from messaging in front to maps in back using the index-to-middle beat

Based on the above three characteristics, we see how the porous interface simplifies multitasking for small touchscreen devices. For two app windows that are concurrently visible, the number of steps, if the user wants to view the two apps in rapid succession, is reduced from ten to zero. If the user wants to interact with the two apps in rapid succession, the number of steps is reduced from ten to one. If the user wants to transfer selected content, it is reduced to a single multi-finger gesture.

3.4 Why Partial Transparency?

There are no easy ways for displaying windows simultaneously on a smartphone. While semantic transparency offers a solution, good visibility is conditional upon the windows' contents and the solution will not work for all app pairings. Apps that allow for sufficient white space would work well both with apps that do and do not allow white space. Messaging is immensely popular, and allows significant white space. Similarly, apps with item lists like music, Twitter, stocks, file browsing, calendar, voice calls, all allow for white spaces. These will work with apps with limited white space like maps, video, news, Facebook, web browser, shopping, photos, email. These may not work well with each other. However, app combinations with an image+text overlay will work well [Harrison and Vicente, 1996] (e.g. photo+email). Even with image+text overlays, color interference will cause problems, such as when the text and the image color are of similar shades. Semantically transparent overlaps offer a solution to make visibility work more consistently. However, while we custom-built our apps to illustrate the concept, more automated methods need to be explored. A direct way would be to allow developers to indicate segments in their app views which contain less information, which can in turn be used automatically by the system to generate porous windows or given to the user to pick and choose the segments they want. Another way is to dedicatedly explore content-aware transparency for overlapped apps on smartphones. A starting point would be to recognize instances of text-on-text overlap and color interference in image+text. A third way would be to devise mechanisms such that the content of the two apps intelligently mesh together into a single layer such that the user does not need to use the second finger or suffer from visibility problems. However, this would also be dependent on particular app pairings.

3.5 Why Finger Identification?

The second question is why finger identification is appropriate in this situation rather than another modality which is already present in smartphones. In porous interfaces, the challenge is of enabling access to windows layered below the topmost window without bringing those windows to the forefront. One possibility is to use a modality such as finger pressure which is already out there in the market. There are two reasons why this would not have worked well. First, we envision the porous interface as an augmentation to the currently ubiquitous smartphone interface. Consequently, we need to ensure that all existing gestures supported by an app currently will work in the porous interface seamlessly. With pressure gestures already being used in apps as “force touch”, we could not overload the gesture with the new porous interactions. Second, even if we assume that apps using pressure are in the minority, we want seamless multitasking where all gestures including tap, swipe, pinch etc. work well with the overlaid apps. It is difficult to do such gestures with added pressure consistently while switching between less pressure and more at will. Finger identification allows us to address both these issues.

Further, other methods such as using two-finger taps simultaneously and two-finger movements in the same direction (opposite direction is pinch-to-zoom) for the back app would also present fidelity problems with existing apps. For instance, browser apps with embedded maps use two-finger scrolling for map navigation, and 3D model viewing apps use two-finger movements for orienting 3D objects. Further, precise interactions such as tapping keyboard keys within the back layer will be difficult to do with two fingers.

3.6 Fidelity

The current interaction paradigm on touchscreens is pervasive and works really well for single app use cases. A new interface paradigm that addresses multitasking should make sure that the existing paradigm is not affected in any disruptive way. We term this property of our porous multitasking interface the fidelity constraint. Fidelity implies two things – that the single app interaction should keep working like before, and that when the apps are overlaid in the porous mode, all the interactions associated with those apps should be supported.

We will address the first part in detail in the next section. For the second part, we have mentioned how all the single finger interactions associated with both the overlaid apps will be supported by the corresponding finger. The use of the beating gesture was motivated by the fact that it is currently unused in the smartphone interface, besides being quick and appropriate for our use case. The system also ensures that multitouch interactions such as pinch-and-zoom are supported in the overlaid apps too. When the user wants to zoom into a foreground maps app, she can simply use the standard pinch and zoom gesture using the thumb and the index finger. Similarly, a thumb and middle finger pinch and zoom can be used for a background app. However, we recognize that some users prefer to perform pinch and zoom gestures using the index and middle finger. The system currently does not support such a use case. However, a mechanism where the index finger touches slightly earlier than the middle finger to act on the foreground app and vice versa for the background app could alleviate this problem.

Users hold their smartphones in three styles of grips [Hoober, 2013]: a) one handed – the phone is held in the fingers of one hand while interacting using the thumb of the same hand, b) cradled – the phone is cradled in one hand and tapped with a finger of the other hand, c) two handed – the phone is held in the fingers of both hands with both thumbs providing the input. Since porosity requires two fingers being used in tandem, it can only be used when the user is holding the phone in the cradled style. However, when the second hand is free, the switch from one handed to cradled is easy. We design our system such that existing interactions work in all three usage styles, whether the index finger or thumb is used, and the porous interface is invoked when the middle finger is used.

3.7 An Implementation of Porous Interfaces

We developed a demo operating system interface as an application within Android that demonstrates porous interfaces and their fidelity with existing interactions.

3.7.1 Finger Identification Prototype

The finger tap is detected by the touchscreen. The prototype then needs to identify the finger whose tap was registered. We explored multiple techniques to identify index and middle fingers distinctly – tracking individual finger movement with optical markers, color markers, and Leap Motion; capturing differences in finger motion with inertial motion unit (IMU) rings; and muscle sensing. However, none of these approaches worked proficiently within the constraints of our application requirements – a high precision accuracy for miniature screens, minimum instances of failure, low communication latency to the device, and unobtrusive instrumentation. After multiple rounds of experimentation, we settled on our final working design that uses two combined miniature photo-transistor and optical detector sensors mounted on the index and middle fingers.

As shown in Figure 3.5, the sensor is connected to an Arduino, which processes the sensor data and sends it to a Nexus 4 Android smartphone via a Bluetooth chip. The sensor detects its distance from the touchscreen. It is mounted and calibrated such that when the index finger touches the screen, the distance value is noticeably lower than when the index finger hovers over the screen. We have a similar sensor on the middle finger. We define thresholds for the distance values for both fingers for when they are touching the screen. When the touchscreen registers a touch, the system checks the distance from both sensors and, depending on the thresholds, determines which finger was used. If the distance values are high for both fingers, it determines that a finger other than index or middle was used (usually the thumb). An initial pilot with three users showed that upon individual calibration, 99.5% accuracy was achieved. Each user performed a series of 75 taps and swipes with index, middle, and thumb in random order. However, the high accuracy is achieved at the cost of certain usability constraints. For instance, since the sensor is mounted just below the finger pad, the users can only tap the screen using their fingertips.
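The finger classification on the phone side can be sketched roughly as follows; the threshold fields are placeholders for the per-user calibrated values, and the class is an illustrative assumption rather than the actual prototype code.

    // Sketch only: classifies which finger produced a touch from the distance
    // readings streamed over Bluetooth from the two finger-mounted sensors.
    class FingerClassifier {
        static final int FINGER_INDEX = 0;
        static final int FINGER_MIDDLE = 1;
        static final int FINGER_OTHER = 2;          // e.g. the thumb

        private final float indexTouchThreshold;    // calibrated per user
        private final float middleTouchThreshold;   // calibrated per user

        FingerClassifier(float indexTouchThreshold, float middleTouchThreshold) {
            this.indexTouchThreshold = indexTouchThreshold;
            this.middleTouchThreshold = middleTouchThreshold;
        }

        // Called when the touchscreen registers a touch, with the latest readings.
        int classify(float indexDistance, float middleDistance) {
            if (indexDistance < indexTouchThreshold) {
                return FINGER_INDEX;     // index sensor is at screen level
            }
            if (middleDistance < middleTouchThreshold) {
                return FINGER_MIDDLE;    // middle sensor is at screen level
            }
            return FINGER_OTHER;         // both sensors far from the screen
        }
    }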

3.7.2 Software System

We built an Android app that simulated an end-to-end smartphone interface that supported porous multitasking. This includes the home screen, notifications bar, window switcher, lock screen, settings features, and nine demo apps. To maintain fidelity, the system should ensure that when the interface is not displaying porous windows, the interaction remains unaffected. The central principle that helps achieve this is gesture overloading. When the system is not in the porous mode, user interaction on screen with any other finger besides the middle finger results in the system behaving in the usual way. If the user performs an interaction with the middle finger, for instance, middle tapping an app icon, it results in a response associated with the porous interface. Thus, the middle finger works as a means for implicitly indicating the porous mode to the system.

Figure 3.5: Finger Identification Prototype with IR sensors

3.8 Porous Window Management

Kandogan et al. [Kandogan and Shneiderman, 1997] define three processes that impact user performance in a multi-window interface – task window environment setup, environment switching, and task execution. Earlier we described two-window task execution with the window environment already set up. We now look at how porous interfaces enable efficient setup and switching via gesture overloading with the middle finger.

3.8.1 Window Environment Setup

The interface enables invocation of two overlapping app windows while ensuring two things – a) the user can easily designate which app will be the back app and which one will be the front, and b) the single finger invocation of apps should not be affected. A user scenario explains this below.

Figure 3.6: Porous window setup. (Left) Middle tap on app that goes in the back. (Right) Index tap on app that goes in front

A user starting a road-trip wants to play songs while constantly looking at their location in the maps app to make sure they're following the route correctly. The user starts at the home screen as shown in Figure 3.6 (left). To open Maps+Music, with Maps in the background, the user first middle taps the Maps app, which lets the system know that the user intends to open the app in the porous mode, and the Maps app is shown to be selected as in Figure 3.6 (right). The user then index taps the Music app and both app windows are instantly opened with the Music app overlaid over Maps. To open just a single app, the user can use the index finger (or thumb or any other finger besides the middle finger) in the usual way. The interface achieves both stated objectives and keeps the interaction limited to two steps.

Since we designed the interface in Android, we overloaded the standard Android soft keys Back and App Switcher as well. When two overlaid windows are open, an index tap on the Back button corresponds to its action in the front app and a middle tap corresponds to its action in the back app. If the user continually invokes Back, with an index tap for instance, there will come a point when the corresponding front app window will close. At this point, the system will transition out of porous mode. The Home button will always take the user to the home screen.

3.8.2 Window Environment Switching

Window environment switching is the act of changing the screen contents to an existing environment setup [Kandogan and Shneiderman, 1997]. In our system, the window environment consists of a pair of apps and their overlay order. We overload the App Switcher icon such that when it is middle tapped (Figure 3.7 (left)), it opens up the porous window switcher (Figure 3.7 (right)). It shows the thumbnails of the pairs of apps the user is currently working with. Tapping on one of these pairs with either finger opens up the overlaid window pair. The paired app switcher window can also include app pairs that the user frequently works with to enable faster access than the home screen invocation. Index tapping the window switcher will open the usual single app switcher window where index tapping a thumbnail opens a single app. However, a user might need to invoke a background app while she is working on a single app. For instance, a user messaging with a friend might want to capture a photo instantly to send it. To enable this on-demand porosity without a prior setup, all the app thumbnails in the single app switcher window are overloaded such that middle tapping an app thumbnail opens it in the background and the front app becomes partially transparent.

Figure 3.7: Paired app switcher. (Left) Middle tap on app switch icon shows (Right) the switcher for pairs of porous windows


3.8.3 Foreground-Background Indicator

While developing porous interfaces, we sought user feedback at various stages. One piece of feedback was that while users intuitively remembered to use the index finger for the front app and the middle finger for the back app, at times it took them a moment to figure out which app window was at the front and which one at the back, even when they had launched the apps in the first place. We included two kinds of feedback to solve this. First, at launch time, the background window was shown instantly on the screen, followed by a delay of 250ms, followed by the animated appearance of the front app such that the animation looked like the front app jumping from above the screen onto the back app. This established an immediate context in the user's mind that the app that appeared from above is the front app corresponding to the index finger. The animation appeared every time the overlaid windows were launched, either from the home screen or from the app switcher. However, if the user does not interact with the two apps for some time, she might forget this association. Therefore, we designed an indicator icon at the bottom of the screen that showed the overlaid app icons in accordance with the window ordering. Figure 3.8 shows the maps icon on top of the messaging icon. The app icon doubles as a soft key such that when the icon is tapped with any finger, the ordering of the overlaid windows is toggled, which is in turn reflected in the icon. This allows the user to switch their front and back apps at any moment if they so desire.

3.8.4 Dynamic Transparency Control

The default partial transparency of the front window is set at 50%, which, as noted by prior work, works equally as well as a single app [Harrison et al., 1995b, Kamba et al., 1996]. However, depending on the windows' content and the tasks, the user might want to alter the transparency themselves. We provide a dynamic transparency control (Figure 3.8 (left)) which is invoked by middle swiping from the top bezel. The usual notifications bar appears when using the index or any other finger.

An implicit feature of note here is the hidden window. The user can make the front app completely opaque at 0% transparency. In this scenario, the user can still interact with the back app using the middle finger, but can't see it. This would be useful in situations where the user does not want the back app to affect the visibility of the front app, but still wants to interact with the back app. For instance, prior work suggests that when a text window is overlaid on another text window, the visibility of both windows suffers, especially if either of the windows is text heavy [Harrison et al., 1995b]. A user reading an eBook while playing music in the back app would not like any visual interference but still want to skip songs and replay them using middle swipe gestures that the Music app supports. The hidden window enables these kinds of user scenarios. Similarly, the transparency can be set at 100% to make the front app the hidden window.
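Concretely, the control reduces to mapping the user-facing transparency percentage onto the alpha of the front window's root view. A minimal sketch follows, assuming a hypothetical TransparencyController wrapping the prototype's front window view; the mapping is our assumption, not code from the thesis.

    import android.view.View;

    // Sketch only: applies the porous transparency level to the front app window.
    class TransparencyController {
        private final View frontAppRoot;   // root view of the front app window

        TransparencyController(View frontAppRoot) {
            this.frontAppRoot = frontAppRoot;
        }

        // transparencyPercent: 0 = fully opaque front app (back app becomes the
        // hidden window), 50 = default porous mode, 100 = front app hidden.
        void setTransparency(int transparencyPercent) {
            float alpha = 1f - (transparencyPercent / 100f);
            frontAppRoot.setAlpha(alpha);
        }
    }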

3.8.5 The Vanishing Notification

So far we have talked about situations where the user explicitly intends to multitask and launches and works with overlaid apps. However, there is another kind of situation where the multitasking is not initiated by the user [Nagata, 2003]. While performing single app tasks on a mobile device, there are high rates of external interruptions in the form of notifications that distract the users from the main task, hamper user performance and delay task completion [Leiva et al., 2012, Nagata, 2003]. While one part of the problem is the cognitive context switch, the second part is the interaction the user needs to perform, especially when the user wants to attend to the notification immediately. For instance, when using app A, the user receives a messaging notification which prompts her to stop the current task, swipe down the notification bar, read the preview and decide to attend to it now, select the notification to launch the messaging app, write a reply, and then perform more steps to get back to her initial app.

Figure 3.8: (Left) Dynamic Transparency Control invoked using a middle swipe from the top, (Right) the messaging app appears in the background when it receives a new message notification

We alleviate this problem via the vanishing notification. If the user is interacting with a single app and she receives an important notification, the porous mode auto-activates, with the corresponding app window being opened in the background and the front app becoming partially transparent (Figure 3.8 (right)). The porous mode auto-exits after 3s, the background window disappears, and the front app regains its opacity. The duration is chosen based on prior work which showed that users take a median of 2s to decide if they want to engage with a notification [Banovic et al., 2014]. Since the interface is in porous mode for 3s, the user can interact with the background app using the middle finger. When this happens, the interface stays in the porous mode and does not auto-exit. The user can then use the two windows in porous mode and can switch back to the single window mode by index tapping the app switcher and index tapping the single app or by continually middle tapping Back. Since the vanishing notification could feel intrusive to the user, we make it a setting such that it is only allowed for apps that the user explicitly marks as a priority app. Figure 3.9 summarizes the porous interface interactions and their integration into the existing interface in a flow diagram.
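The timing logic behind the vanishing notification can be sketched as below; PorousWindowManager is a hypothetical stand-in for the prototype's window management layer, and only the 3-second timeout comes from the thesis.

    import android.os.Handler;
    import android.os.Looper;

    // Sketch only: auto-activates the porous mode on a priority notification and
    // auto-exits it after 3 s unless the user interacts with the background app.
    class VanishingNotificationController {
        static final long AUTO_EXIT_MS = 3000;

        interface PorousWindowManager {
            void enterPorousMode(String backAppPackage);   // open app in the background
            void exitPorousMode();                         // restore single-app mode
        }

        private final Handler handler = new Handler(Looper.getMainLooper());
        private final PorousWindowManager windows;
        private final Runnable exitPorousMode;

        VanishingNotificationController(PorousWindowManager windows) {
            this.windows = windows;
            this.exitPorousMode = windows::exitPorousMode;
        }

        // Called when a notification arrives from an app the user marked as priority.
        void onPriorityNotification(String appPackage) {
            windows.enterPorousMode(appPackage);
            handler.postDelayed(exitPorousMode, AUTO_EXIT_MS);
        }

        // Called when the user middle-taps the background app within the 3 s window;
        // the porous mode then persists instead of vanishing.
        void onBackAppInteraction() {
            handler.removeCallbacks(exitPorousMode);
        }
    }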

3.9 Applications

We built a series of demo applications to showcase various user scenarios of how porous interfaces work – Messaging, Photo Gallery, Maps, Music, Video Player, Twitter, Call, News, and Camera. We have already touched upon some of these scenarios such as playing music and navigating on maps during a road-trip, messaging while browsing a photo gallery and messaging a picture from the gallery instantly, a messaging notification while reading, and copying text from the messaging app to maps. Other scenarios demonstrated by our apps could include watching a live game while scrolling through live tweets about the game, messaging while watching a video, playing and changing songs without switching out from reading the news, etc. We now discuss applications that show how porous interfaces can lead to even more interesting and optimal extensions.


Figure 3.9: Interaction Flow Diagram for Porous Interfaces which maintain fidelity with the existing smartphone interface. Porous interactions and screens are in orange and existing smartphone interactions and screens are in grey

3.9.1 Camera+Messaging

So far we have discussed how porous interfaces will ease multitasking based on existing app designs. However, if developers can design features specific to porous interfaces in their apps, it could result in unforeseen benefits. We developed a modified camera app (Figure 3.10), which, when overlaid with another app such as messaging, can perform photo capture, in addition to sending it to the messaging app, all with the beating gesture alone, thus allowing the user to capture and send images even more rapidly than a usual camera app overlaid with the messaging app.

Figure 3.10: Camera+messaging: the beat gesture captures the picture, and sends it to the chat recipient instantly


3.9.2 Drag and Drop in Droppable Zones

While the beat gesture enabled content transfer between apps, we exploit a property of the Android operating system to enable drag and drop between two porous windows. In standard Android, when the user starts dragging an object, the system automatically detects droppable zones on the window when the object hovers over the zone. With overlaid windows this leads to the following scenario: if the user is browsing a photo gallery in the back app and messaging in the front app, and she long presses an image and starts dragging it, then the text box on the messaging app is detected as a droppable zone where the user can simply drop the object without using any other gesture. This could be useful in multiple applications such as for email attachments, image sharing between apps, and dragging text items. However, such a technique will not work if the droppable zones of two apps overlap each other at the same location. One solution to this problem that might be implemented in the future is to allow for dynamically movable drop zones that move away from each other across layers so that they never overlap.
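A minimal drop target using the standard Android drag-and-drop API might look like the sketch below; the listener class and its wiring are illustrative assumptions, and starting the drag from the back app's image is omitted.

    import android.view.DragEvent;
    import android.view.View;
    import android.widget.EditText;

    // Sketch only: the messaging text box in the front app acting as a droppable
    // zone for an item dragged from the back app.
    class MessageBoxDropTarget implements View.OnDragListener {
        @Override
        public boolean onDrag(View view, DragEvent event) {
            switch (event.getAction()) {
                case DragEvent.ACTION_DRAG_STARTED:
                    return true;   // report this text box as a droppable zone
                case DragEvent.ACTION_DROP:
                    CharSequence dropped = event.getClipData()
                            .getItemAt(0).coerceToText(view.getContext());
                    ((EditText) view).append(dropped);   // insert the dragged content
                    return true;
                default:
                    return true;
            }
        }
    }

    // Usage (hypothetical): messageTextBox.setOnDragListener(new MessageBoxDropTarget());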

3.9.3 Simultaneous Keyboard Use in Two Apps

The keyboard is a system artifact that is usually the same across all apps. In certain scenarios, users perform back and forth typing in two apps in quick succession. For instance, when messaging with a friend about places to eat in the vicinity, the user might want to search maps at the same time, while going back and forth between them. We modify the behavior of the interface such that if both apps contain text boxes, index tapping the keyboard will type in the front app and middle tapping it will type in the back app, thus reducing the effort even more (Figure 3.11). The users can stop this behavior via an off key on the keyboard.
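Routing keystrokes from the shared keyboard follows the same finger-based rule as the rest of the interface; a rough sketch, with TextTarget and the finger code as illustrative stand-ins rather than the prototype's actual classes:

    // Sketch only: a key press from the system keyboard is committed to the text
    // box of the front or back app depending on the identified finger.
    class PorousKeyboardRouter {
        static final int FINGER_MIDDLE = 1;

        interface TextTarget {
            void commitText(String text);
        }

        private final TextTarget frontAppTextBox;
        private final TextTarget backAppTextBox;
        private boolean porousTypingEnabled = true;   // toggled by the keyboard's off key

        PorousKeyboardRouter(TextTarget frontAppTextBox, TextTarget backAppTextBox) {
            this.frontAppTextBox = frontAppTextBox;
            this.backAppTextBox = backAppTextBox;
        }

        void onKeyTapped(String character, int finger) {
            if (porousTypingEnabled && finger == FINGER_MIDDLE) {
                backAppTextBox.commitText(character);    // middle tap types in the back app
            } else {
                frontAppTextBox.commitText(character);   // index tap types in the front app
            }
        }

        void setPorousTypingEnabled(boolean enabled) {
            porousTypingEnabled = enabled;
        }
    }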

Figure 3.11: Keyboard works in sync with the porous interface. (Left) Middle tap types in the back messaging app. (Right) Index tap types the address in the front maps app

3.10 User Feedback on Porous Interfaces

We gathered feedback from 8 regular smartphone users (3 female), ages 22 to 30. Our goal was to get user reactions and comments on porous interfaces. The study took 60 minutes per participant. In the first 15 mins of the study, the researcher briefed the participants on the system goals, followed by a demo of porosity interactions, followed by participants playing with each interaction. The next 30 mins consisted of a demo of the window management interactions, and an intro to the 9 demo apps, followed by participants playing with each interaction and different app combinations. Participants were free to talk to the researcher, ask questions and comment on their experience. In the last 15 mins, participants were asked to rate the usefulness (how useful is a feature) and the easiness (how easy it is to perform) for every interaction feature on a Likert scale while assuming a less obtrusive hardware.

3.10.1 Results

Participants uniformly liked porous interfaces and found them intuitive. Seven of eight participants indicated that they are very likely to use the system if integrated into smartphones. Figure 3.12 shows subjective rankings for each interaction.

Figure 3.12: Questionnaire results boxplot. 7 indicates high usefulness and easiness on a Likert scale.

Participants liked that they were able to see overlaid apps, however, they had issues with the visibility for some app combinations – “I want to watch the game and see the twitter stream but when the video gets white, it's hard to see the tweets because their font is white too”. However, they really liked semantic transparency – “The chat bubbles are great. I can use them anywhere. It will be great if they can change their color according to my background app.”

Participants loved concurrent access to two apps via different fingers and found finger switching easy but mentioned that they would prefer a less obtrusive hardware setup. A couple of users found a way to interact by tapping the indicator icon to switch app ordering whenever they needed to interact with the back app and kept interacting without changing the finger – “Usually my other fingers are folded inwards so that I can see the screen completely. I don't want to unclench them, so I just bring the app front.”

An anticipated problem was that long-nailed users use the index and middle fingers to pinch and zoom instead of the thumb, which is not possible in our interface. One participant faced this issue and switched to using her ring finger.

The most uniformly liked features were the beat gesture and the vanishing notification. “I spend all my time attaching files on email or copy pasting things on phone. This copy paste is the killer feature.”, “It's so fluid. I can just beat pictures to my friends all day.”, “I don't want to check the notification, but can't stop myself and it stops my work. May be if I see it instantly I can let it vanish without doing anything.” Participants suggested functional extensions – “The only reason I switch apps is to copy paste or for notification. If the video automatically pauses when I start replying to the notification in background, it would be great!” One participant had privacy concerns with the vanishing notification – “So if I look at the chat notification in the background, will it still tell my friend that I read the message? I don't want to tell them immediately every time”

The most polarizing feature was the dynamic transparency control. Some users liked it. However, others wanted it to be easier and involve only a single step. Participants suggested creative alternatives – “I don't want to touch the screen to change the screen. Can we use the (hard) volume buttons so that tapping the buttons with middle finger changes transparency?” We believe the feature is useful, however, it needs a more user friendly execution. An alternative is finger sliding on the bottom soft key row.

Messaging was the most popular app – “I use chat all the time. I can use it with video, with Facebook, with maps, with news, everything.” In fact, three participants mentioned using video+chat as one of their intended use cases even when we had not explicitly demonstrated them as a pair – “I can't really focus on the TED talks but if they keep playing in the background while I am chatting, it'll be easier”. Another participant mentioned using chat with an online shopping browser window – “It'll be cool to chat with my friend while looking at the products.” The TED comment demonstrates how they could do productive but boring activities while being simultaneously engaged in another app. This desire to do productive things if not for a lack of focus was echoed a lot and they saw a solution in porous interfaces. “I can play mindless games while reading news... maybe I'll read news more that way.”

Participants also mentioned using video and music in frequent combination with other apps – “I'll watch Game of Thrones and switch on the camera in the background to record my reaction video!”, “I want to play music while recording with my camera so that my personal video will have an automatic background score.” Some more apps that participants mentioned were Calendar+Email for scheduling, Contact List+Calling for adding friends to an ongoing call without multiple taps, and Browser+Chat/Email to open links and see them then and there. One participant mentioned an interesting use case for the phone lock screen – “If I unlock the phone with my thumb or index finger, the phone opens in the mode without any porous interface. If I unlock with the middle finger, it opens the phone with the porous interface.” This is a compelling way to think about the fidelity challenge. If a user intentionally unlocks the phone in the porous interface, she won't expect it to follow the standard interface.

3.11 Discussion

Kandogan et al. [Kandogan and Shneiderman, 1997] define five requirements of multi-window systems for multitasking – i) allow organization of windows according to tasks, ii) allow fast task-switching and resumption, iii) free the user to work on tasks rather than window management, iv) use screen space efficiently, and v) spatially indicate the relationship between windows. Looking back, porous interfaces fulfill all five of these. Porous interfaces, by definition, satisfy iv), and enable i) and ii) using their window setup, app switching, and notification handling. Porous interfaces free the user to work on tasks rather than incessant switching by enabling simultaneous visibility and interaction with frequently paired apps, and rapid content transfer between them, delivering a big improvement over the existing smartphone interface. The indicator animation and icon show the spatial app ordering, which is adequate for our paired app use.

However, aside from the visibility issues discussed earlier, there are multiple areas for improvement. First, although our system places a high emphasis on fidelity, it isn't able to maintain 100% fidelity in all situations, as evidenced by the multitouch with long nails issue. Our system is designed to handle a variety of touch input modalities which continue to function as usual, such as pressure. However, if we move to other modalities such as voice and accelerometer motion sensing, the user needs a way to indicate which app should respond to that input. The simplest way to address this would be to always have the front app respond to these, with the user having the control to change the behavior. Another way would be to intelligently predict this based on app behavior and past use. A third solution is to disregard fidelity altogether and design an interface from the ground up that is rooted in finger identification and transparency. Porous interfaces assume that finger identification isn't used within apps. However, both could work together. Porous interfaces only need one additional finger and others can be used for within-app operations. Or, individual apps that use finger identification can choose to disable porous multitasking while they are open.

Second, almost all participants said that if the hardware can be compacted into a typical smartring, they will not mind wearing it on their fingers in order to be able to use the interface. However, we recognize that this is not tenable for widespread use and future work will have to focus on developing a ring-free finger identification technique. The point of porous interfaces isn't simply optimizing existing app switching, but opening the space for newer multitasking interactions that result from complete concurrent interactivity of both apps. Consequently, something like explicit mode switching to access different windows is not desirable. As mentioned before, recent commercial efforts could mean that the problem solves itself in the coming years. Plus, misidentified finger handling can be incorporated into the interface by delaying the action to give the user time to undo it when the finger is identified with low certainty, or by blocking high impact operations like email sending in porous mode.

Aside from existing scenarios, advanced multitasking on smartphones where people perform complex tasks that require multiple sources of information is infrequent. We need solutions that solve the inefficiencies in existing app switching scenarios and that stimulate newer multitasking use-cases and applications for small screen devices.

3.12 Conclusion

We proposed porous interfaces that enable small screen multitasking using window transparency and finger identification. To that end, we defined their primary characteristics of task execution, the window setup and switching, and helper features. We built an end-to-end interface with fidelity to the existing smartphone interface. We further demonstrated user scenarios using nine demo applications. We gathered detailed user feedback which reinforced the usefulness and ease of porous interfaces. Two instances where users have to switch apps most frequently are for content transfer and attending to notifications. The beat gesture and the vanishing notification directly addressed the two scenarios and received highly enthusiastic feedback from the participants.

This work dealt with how finger identification can overcome the limited screen space on smartphones to enable depth-multiplexed multitasking. However, the challenge is even bigger in form factors smaller than smartphones, such as smartwatches, where both the touch input and screen output space are severely constrained, so much so that even a qwerty keyboard cannot fit the screen in full for easy use. A second question is that while finger identification can enable novel interactions, can those interactions quantitatively perform better than other alternatives, or does finger switching slow down the interaction significantly? We address these questions in the next chapter, which quantitatively investigates a text entry method for wrist-worn small screen devices.


Chapter 4

Using Distinct Fingers on Wrist Wearables for Text Entry

4.1 Introduction

The miniature screens of wrist wearables such as smartwatches, fitness bands and smart bracelets are challenging for text entry. When the keys are in a full qwerty layout, the extremely small keys coupled with the fat-finger problem make efficient text entry difficult. While the previous chapter dealt with how finger identification can enable desktop-style multitasking on smartphones, in this chapter we explore how the even smaller smartwatch touchscreens can enable high performance text entry that can compete with the performance on smartphones.

Prior work [Chen et al., 2014a, Hong et al., 2015, Oney et al., 2013] has tackled this by making the text-entry interaction a two-step process. In the first step, the user typically zooms in or scrolls to a particular zone on the keyboard which reduces the number of keys on the screen, followed by key selection in the second step. While the techniques show impressive improvement over standard qwerty, the two-step interaction is cumbersome. Further, the best performing techniques [Chen et al., 2014a, Hong et al., 2015] use swipe gestures as a way of discerning user intent on the miniature screen. Swipe or slide gestures are also used for shape writing [Zhai et al., 2009] on regular soft keyboards. It is common to see people use both character-by-character entry and shape writing in conjunction, often in the same phrase. However, the use of swipe gestures for character-by-character entry precludes the possibility of shape writing on the proposed keyboards.

The question we explore is how to make the keys effectively bigger without expanding the number of steps or gestures beyond finger tapping. One potential solution to both is to make the touchscreen sensitive to finger identity when the tap hits the screen. Although there are no market-ready technical solutions, multiple promising threads of research [Benko et al., 2009, Holz and Baudisch, 2013, Marquardt et al., 2011] have demonstrated the feasibility of finger identification in larger scale systems like the tabletop. It is reasonable to expect that techniques for finger identification will eventually become viable for smaller screen devices, and hence it is worth exploring now whether finger identification can improve the text input experience on small screens.

We propose DualKey – a solution to efficient text-entry on miniature screen devices using finger identification. DualKey eliminates the need for a two-step selection process, plus it does not require any swipe gestures for character-by-character entry. DualKey has a qwerty layout, but with half the keys of a full layout (Figure 4.1). Every key corresponds to two characters, each character associated with a different finger. The left character on a key associates with the index finger, and the right associates with the middle finger. Typing ‘the’, for instance, would require the user to tap the keys ‘ty’, ‘gh’, and ‘er’ with the index, middle, and index finger respectively. Prior research on single finger tapping [Colley and Häkkilä, 2014] has shown that users are more comfortable, fast, and accurate when using their index or middle fingers as compared to the other fingers. We built a finger identification prototype for evaluation that has an accuracy of 99.5% on smartwatch scale touchscreens, implemented the DualKey technique using it, and ran two studies to evaluate its effectiveness.

Figure 4.1: DualKey (a) Index finger types ‘ty’ key’s left letter ‘t’ (b) Middle finger types ‘ty’ key’s right letter ‘y’

The contents of this chapter were published at CHI 2016 [Gupta and Balakrishnan, 2016].

4.2 Related Work

4.2.1 Finger Identification

As described in Chapter 2, prior work on finger identification either tackles the technical problem of identifying fingers or the potential interactions that result from identifying fingers. However, there are no end-to-end applications based on robust finger identification that have been evaluated for performance using a working prototype. There is a lack of compelling investigation into how these interactions can be uniquely ingrained in applications, and how their performance can show improvements over basic finger interactions, if any. DualKey fills this gap and hopes to fuel the conversation on finger identification interactions.

4.2.2 Miniature Screen Text-Entry

A lot of work has been done in text-entry for smartphones. These techniques, however, do not translate well for miniature devices like smartwatches whose size makes it impossible to have 26 unambiguously accessible keys. Text-entry for this form factor has recently been addressed in multiple works [Chen et al., 2014a, Oney et al., 2013, Leiva et al., 2015, Dunlop et al., 2014, Cho et al., 2014, Hong et al., 2015]. They study typing performance in terms of speed (words-per-minute, WPM), uncorrected error rate (UER), and total error rate (TER), which includes corrected and uncorrected errors. GPC refers to gestures-per-character, which indicates how many gestures (or taps) it takes, on average, to type a character. GPC includes the space bar and has been estimated for all techniques [Hong et al., 2015] using the 500 phrase-set [MacKenzie and Soukoreff, 2003].
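For reference, these metrics follow the standard definitions from the text entry literature (our restatement of the commonly used formulas, which the thesis does not spell out here), where C, INF, and IF denote correct, incorrect-and-not-fixed, and incorrect-but-fixed keystrokes, |T| the length of the transcribed phrase, and S the entry time in seconds:

    \[ \mathrm{WPM} = \frac{|T| - 1}{S} \times 60 \times \frac{1}{5}, \qquad
       \mathrm{UER} = \frac{INF}{C + INF + IF} \times 100\%, \qquad
       \mathrm{TER} = \frac{INF + IF}{C + INF + IF} \times 100\% \]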

Zoomboard [Oney et al., 2013] uses a two-step tapping technique, to first zoom into the desired region, followed by selecting the key. With two taps for every character, besides space, Zoomboard had a GPC of 1.85.


Zoomboard’s speed after 15 minutes of use on a smartwatch was 9.8 WPM with a TER of 7.1% [Hong et al., 2015]. As a baseline, standard qwerty achieved 13.7 WPM with a whopping 21.2% TER. Zoomboard was the first text-entry technique dedicated to miniature screens, but its performance was surpassed by later techniques.

Splitboard [Hong et al., 2015] splits the keyboard into left and right sections and shows one of these sections on the display at a time, with the user required to swipe left or right to get the desired section. With two columns of keys common to both sections, Splitboard has a GPC of 1.28. After 15 minutes of use, Splitboard reported a speed of 15.3 WPM and a TER of 7.35%. Splitboard did not report a significant effect of blocks on speed and was not evaluated long-term beyond initial use. Given the simplicity of the technique, there doesn't seem to be an argument that performance might improve significantly over long term use.

Swipeboard [Chen et al., 2014a] has a GPC of 2 and is based on zooming-in where the first swipe leads to 1 of 9 designated regions and the second swipe selects the key. It relies on a training protocol to maximize learning and reports a text entry speed of 19.5 WPM after 2 hours of training, with an error rate of 17.48% for the whole experiment (including hard and soft errors [Chen et al., 2014a]). While the keyboard size is smaller than the smartwatch form factor, it does not account for the errors as the technique is not based on precise key selection, but on memorization and performing gesture-pairs on the layout. The reported speed of text-entry accommodates this error rate by allowing for corrections. However, the high error rate is still a significant cause of concern because, in practice, this effectively means that 1 in 5 characters is typed wrong. This changes the dynamic of the text-input task: instead of a user typing relatively fluently with occasional glances at the output text area, here, she has to constantly focus on the output text and intermittently keep correcting the input. Further, Swipeboard did not use the standard experimental protocol of phrase entry. Instead, a single four letter word drawn from a limited word-set made up of only five characters E, T, A, N, and S, and no others was entered. The authors acknowledge that these deviations from standard protocol artificially accelerate novice-expert performance growth. However, this does not provide an accurate representation of the expert performance of the technique. The results could vary on a longitudinal text-entry study with phrases derived from common language. Considering the highly reduced word-set, the initial performance of Swipeboard at the end of 15 minutes of practice was low at 9.09 WPM. This is due to the emphasis on memorization of two-step gestures for every letter.

Other proposed techniques [Leiva et al., 2015, Dunlop et al., 2014, Cho et al., 2014] are either slower or less accurate than the aforementioned or are not evaluated for performance. Flick input is used in Japanese keyboards to select different characters on the same key based on the flick gesture. Hong et al. [Hong et al., 2015] evaluated a form of flick input called Slideboard and found that Splitboard outperformed Slideboard. A potential direction of work in this space is related to keyboards that allow ambiguous input and rely on a language model and a miss model for predicting the word [Li et al., 2011]. However, our work is focused on per-character entry, which is not supported by such keyboards. No reports have been published on the performance of such keyboards on small screens. While language models are an important part of today's text entry systems, per-character entry is supported by all popular soft keyboards. It allows entering words not in the lexicon, an ability which is routinely used by users not just for English conversations, but also for conversations in other languages that are typed in English. For instance, Hindi speakers simply type Hindi words in English which are not present in the dictionaries.

In summary, while existing works have advanced the topic of smartwatch text-entry, there are multiple issues: Firstly, all techniques follow a two-step selection process, with a GPC > 1. Secondly, besides Zoomboard which has a slower performance, all techniques utilize swipes which precludes shape writing on the keyboards. Thirdly, the best performing techniques either have low peak performance or have high entry barriers. Splitboard was not evaluated beyond initial use and the data suggests that its performance plateaus quickly at 15 WPM. Swipeboard has a high error rate, low initial speed, and a very high entry barrier due to the memorization requirements.


4.3 DualKey

DualKey has a GPC of exactly 1, relying solely on single finger taps. It requires no swipe gestures, only taps, thus allowing for the possibility of simultaneous shape writing on the layout. The current prototype is built for a smartwatch scale touchscreen. DualKey leverages the distinction between the index and middle fingers to enable a single finger tap for a character. Instinctively, simply tapping the two fingers interchangeably on a smartwatch feels comfortable. We developed a prototype that identifies the finger touching the screen using an optical sensor, and sends the finger identity to the keyboard app on the watch via Bluetooth.

4.3.1 Finger Identification Prototype

The prototype is the same as the one used for Porous Interfaces. However, this time the sensor was mounted only on the index finger (Figure 4.2). When the touchscreen registers a touch, the system checks if the distance value is low enough for the index finger to be touching. If not, the system determines that the touch was made by the middle finger. The single sensor performed well enough to discard the sensor on the middle finger. This minimized instrumentation. An initial pilot with three users showed that upon individual calibration, 99.5% accuracy was achieved. Each user typed in 25 phrases from the 500 phrase-set [MacKenzie and Soukoreff, 2003]. The sensor somewhat constrains typing as the users cannot tap laterally with their finger pads. Also, even the tiny sensor adds to screen occlusion.

Figure 4.2: (a) Sensor mounted on finger (b) Hardware Setup

4.3.2 Keyboard

Figure 4.1 shows the keyboard layout. The qwerty layout was used to retain familiarity. Every key is associated with two letters, with the left letter corresponding to the index finger and the right letter corresponding to the middle finger. The ‘**’ key is a swap key which enables the user to swap the letter just typed with its same-key counterpart. For example, on the ‘er’ key, if a user intends to type ‘r’, but mistakenly uses the index finger and types ‘e’, she can immediately swap it to ‘r’ by using the swap key, instead of using backspace and typing the character again. The swap key was the result of an observation during pilot experimentation wherein almost half the user errors were due to using the incorrect finger on the correct key. The enter, space, and backspace keys are at the bottom of the screen. The Enter and Space keys can be pressed with any finger. A middle finger touch on the ‘Back’ key changes the keyboard layout to enable access to numbers and special characters.
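The mapping from an identified tap to a character, and the swap key's recovery behavior, can be sketched as follows; the class and the handful of populated keys are illustrative, not the published implementation.

    import java.util.HashMap;
    import java.util.Map;

    // Sketch only: resolves a DualKey tap to a character from the tapped key and
    // the identified finger, and implements the '**' swap key.
    class DualKeyResolver {
        static final int FINGER_INDEX = 0;
        static final int FINGER_MIDDLE = 1;

        // keyId -> {index-finger letter, middle-finger letter}
        private final Map<String, char[]> keys = new HashMap<>();
        private final StringBuilder typed = new StringBuilder();

        DualKeyResolver() {
            keys.put("ty", new char[] {'t', 'y'});
            keys.put("gh", new char[] {'g', 'h'});
            keys.put("er", new char[] {'e', 'r'});
            // ... remaining keys of the halved qwerty layout
        }

        // A single identified-finger tap resolves directly to one character (GPC = 1).
        void onKeyTapped(String keyId, int finger) {
            char[] pair = keys.get(keyId);
            typed.append(finger == FINGER_MIDDLE ? pair[1] : pair[0]);
        }

        // The swap key replaces the last letter with its same-key counterpart,
        // recovering from a correct-key / wrong-finger error without backspacing.
        void onSwapKey() {
            if (typed.length() == 0) return;
            char last = typed.charAt(typed.length() - 1);
            for (char[] pair : keys.values()) {
                if (pair[0] == last) { typed.setCharAt(typed.length() - 1, pair[1]); return; }
                if (pair[1] == last) { typed.setCharAt(typed.length() - 1, pair[0]); return; }
            }
        }

        String text() {
            return typed.toString();
        }
    }

Typing ‘the’, for example, corresponds to onKeyTapped("ty", FINGER_INDEX), onKeyTapped("gh", FINGER_MIDDLE), and onKeyTapped("er", FINGER_INDEX).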


4.4 Performance Evaluation: DualKey QWERTY

A 10-day study was conducted to investigate the long-term performance and learning curve of the technique. Since using finger identification is an entirely new interaction experience for the user, it is expected that initial performance will start slow, and will improve over time based upon how good the interaction technique is and how easy the learning curve is, in the absence of any explicit training. This makes it imperative to study this technique longitudinally. Further, there is no prior work that informs our understanding of input performance for finger identification, which makes it all the more important to investigate the learning curve of such interactions.

4.4.1 Participants

10 participants (9 male, mean age = 25.3) took part in the study. Eight did the study for 10 sessions, one per day, while two did the study for 15 sessions. Only one participant was a native English speaker, while the others studied or worked in an English-speaking environment. All participants were regular smartphone users. None of them had experience with a smartwatch. All participants were right-handed and wore the watch on their non-dominant left hand. We did not account for left-handedness in the study.

4.4.2 Design

The study consisted of a series of sessions, one session per day. In each session, participants typed 25 random phrases sourced from MacKenzie et al.'s phrase sets [MacKenzie and Soukoreff, 2003]. No phrases were repeated, even across different sessions. The 25 phrases were divided up into five blocks with five phrases each, with an optional break between blocks. Each session lasted 6-15 minutes, depending on the participants' speed. Since speed improved over sessions, each participant went through 90 minutes of typing by the end of 10 sessions. To explore if and when the performance plateaus, we extended the run for two randomly selected participants to 15 sessions. The sessions ran on consecutive days, with breaks during weekends. No breaks exceeded 2 days. Text-entry speed in WPM, and error rates UER and TER were recorded.

As is standard, participants were instructed to correct their mistakes as they went. If they did not detect the mistake until several characters later, they were asked to ignore it and continue. They were also asked to ignore system errors caused by occasional incorrect finger detections. The experiment started after a calibration phrase. In total, ((8 participants x 10 sessions) + (2 participants x 15 sessions)) x 5 blocks x 5 phrases resulted in 2750 phrases entered.

4.4.3 Apparatus

The sensor was mounted on the right hand index finger pad at a distance of 10 mm from the fingertip. The experiment ran on an LG G Android watch with a 29.6 x 29.6 mm touchscreen. The size of the letter keys was 5.6 x 6.5 mm. The presented phrase and the typed phrase were shown above the keyboard area. Aside from the longitudinal aspect, our study adhered closely to the study design and keyboard layout used by Hong et al. in their five-smartwatch-keyboard study [Hong et al., 2015], in order to aid in cross-study comparisons.


4.5 Results

4.5.1 Text-Entry Speed

Figure 4.3 shows the mean WPM by day (including error correction time). The steep curve over the first six days (sessions) shows a substantial performance improvement, followed by a slower improvement until the 10th day.

Figure 4.3: DualKey’s Mean Speed WPM by Day

Figure 4.4 shows the curve for the 2 participants who did 15 sessions. Their performance continued to improve even after 10 days, with an average of 22.42 WPM on the 15th day. This is notable given that DualKey is not designed as an expert technique which requires explicit training. The maximum speed reached was 24.7 WPM at the last session for P1. When considering the speed of individual blocks, the maximum speed reached was 26.6 WPM. Interestingly, P2's curve appears to plateau during days 6-11, but then starts climbing again. This might suggest that DualKey's expert performance could potentially be even higher. The power-law curve WPM = a × session^b was a good fit with a = 11.00, b = 0.256, and R² = .989.
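As a quick sanity check on this fit (our own arithmetic, not a number reported in the thesis), plugging the tenth session into the fitted power law gives

    \[ \mathrm{WPM}(10) = 11.00 \times 10^{0.256} \approx 11.00 \times 1.80 \approx 19.8, \]

which is in line with the roughly 19.6 WPM observed after 10 sessions (90 minutes) of practice.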

Figure 4.4: Speed of two participants P1 and P2 over 15 days
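As an illustration of the reported fit, the following sketch shows how such a power-law learning curve could be fit to per-session means with scipy; the WPM values below are hypothetical placeholders, not the study data.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical per-session mean speeds (placeholders, not the study data).
sessions = np.arange(1, 11)
wpm = np.array([11.0, 13.1, 14.6, 15.7, 16.4, 17.2, 17.8, 18.3, 18.9, 19.6])

def power_law(session, a, b):
    # Learning-curve model from the text: WPM = a * session**b
    return a * session ** b

(a, b), _ = curve_fit(power_law, sessions, wpm, p0=(10.0, 0.3))

# Coefficient of determination for the fitted curve.
residuals = wpm - power_law(sessions, a, b)
r_squared = 1 - np.sum(residuals**2) / np.sum((wpm - wpm.mean())**2)
print(f"a = {a:.2f}, b = {b:.3f}, R^2 = {r_squared:.3f}")
```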


4.5.2 Error Rate

Figure 4.5 shows UER and TER by day. UER was low and remained relatively constant throughout, in the 1-2% range. This includes detection errors made by the system, which are approximately 0.5%. For a perfect finger detection system, UER, and consequently TER, would be even lower. CER (= TER - UER) is shown in Figure 4.6.

Figure 4.5: DualKey’s TER, UER by Day

TER started out at 7.54%, eventually dropping to 5.26% on the last day. Participants made more errors on the first day and then stabilized for the rest of the days.

4.5.3 Comparison with Existing Techniques

Aside from GPC and swipe gesture use, we look at four performance metrics: novice speed, long-term speed, novice TER%, and long-term TER%. Novice performance refers to performance on the last block of the first session, which equates to performance after 15 minutes of practice. Long-term performance refers to performance at the end of 90 minutes of practice. Even though we mirrored the smartwatch keyboard study [Hong et al., 2015], it is not viable to compare these metrics across studies with statistical techniques due to slight implementation differences between studies. However, for comparisons where the difference is visibly large, we can infer which techniques will potentially perform better. We compare DualKey with Splitboard and Swipeboard as they reported the best novice and long-term speeds respectively, and with Zoomboard since it does not require swipe gestures.

Technique         Uses Swipe   GPC    Novice WPM   Long-term WPM   Novice TER   Long-term TER
Zoomboard         No           1.85   9.80         17.08           27.5%        19.64%
Splitboard        Yes          1.28   15.30        NA              7.35%        NA
Swipeboard        Yes          2      9.09         19.58           19.5%        17.48%
DualKey QWERTY    No           1      12.31        19.61           6.29%        5.25%
DualKey SWEQTY    No           1      14.69        21.59           5.67%        3.27%

Table 4.1: DualKey’s comparison with existing techniques. Swipeboard followed an accelerated evaluation design.

DualKey's novice speed is 12.3 WPM. This exceeds the equivalent speeds of Zoomboard (9.8 WPM) and Swipeboard (9.1 WPM).


It is lower than Splitboard's 15.3 WPM. However, as mentioned earlier, Splitboard does not suggest a learning effect and reaches its peak performance early. No long-term study was conducted for Splitboard. Swipeboard, on the other hand, recorded a long-term speed of 19.58 WPM. Zoomboard was at 17.08 under the same conditions. However, these speeds were achieved after artificially accelerated learning. DualKey achieved a comparable speed of 19.61 WPM after 90 minutes of practice under a standard evaluation design. DualKey showed even more improvement with 5 more days (about 2 hours of practice). Further, the TERs of Swipeboard at both novice and long-term stages are prohibitively high at > 17%. DualKey's TER was 6.29% at the novice stage. This is comparable to Splitboard's 7.35%. Table 4.1 lays out these comparisons for easy perusal. We'll discuss the last row in the next section.

DualKey's novice performance exceeds every other technique barring Splitboard. Since finger identification interactions are completely new for the user, an initial inertia is to be expected. As we see in Figure 4.3, the performance jumps sharply from Day 1 to Day 2, showing that the first-day inertia is quickly overcome. However, it is important to understand what causes the initial inertia. We therefore analyzed the data in more detail to understand the variables that negatively impact accuracy and speed.

4.6 Further Analysis of Accuracy and Speed

4.6.1 Analyzing DualKey Errors

To understand if it was too confusing for users to switch fingers, we analyzed all corrected errors to see how many were caused by using the incorrect finger on the correct key. Figure 4.6 shows this finger-switching error rate (FER) alongside the corrected error rate (CER). FER hovers around 2.5%. Essentially, close to half of the corrected errors were the result of using the incorrect finger on the correct key. Again, FER is highest on the first day.

Figure 4.6: Mean Corrected Error Rate (CER), Finger Error Rate (FER), Swap-Correction Rate (SCR) over 10 days

We further analyze how often the participants used the swap key ‘**’ to correct these finger-switching errors. SCR% is the percentage of swap-corrected errors. SCR tells us how helpful the swap key was, based on how frequently participants used it to correct finger errors as opposed to backspacing and retyping. Figure 4.6 shows that in the initial days, the swap key was infrequently used even when finger-switching errors were high.


Less than a quarter of finger-switching errors were corrected using the swap key. This usage considerably improves over time. Participants use the swap key for more than half of the corrected finger errors after the fourth day. We asked participants about this at the end. Their responses indicate that initially they found it difficult to give up their habit of using backspace. As they used the swap key more, they found it easier to use. However, even at the end, the swap key is still not used for all finger-switching errors. One participant commented that it depended on the flow of corrections – if the previous error was not a finger-switching error, they would use backspace corrections, and so even on a finger-switching error they would habitually go for the backspace key.

4.6.2 Analyzing DualKey Speed: Finger-Switching Time

In addition to movement and tapping time, the time taken to type a character relies on the user's swiftness in deciding if the finger needs to be switched for the character, and then in switching the fingers if required:

T = T_{deciding} + [T_{switching}] + T_{movement} + T_{tapping}

In certain situations, switching + movement time could actually be faster than when there is no switching – for instance, when typing ‘k’ after ‘a’, the middle finger will naturally be positioned close to ‘k’ after ‘a’ has been typed, and the user simply needs to tap ‘k’ with minimal movement. Depending on the deciding time, such a situation could be faster than one where there is no switching involved – to type ‘x’ after ‘i’, the user needs to decide and then move to the other diagonal end of the layout.

Consequently, we analyzed the time duration between two characters with respect to their finger configurations. There are four possible finger configurations for the index (I) and middle (M) fingers – II, IM, MI, and MM. For instance, typing ‘k’ after ‘a’ requires a switch from the index finger to the middle finger, and therefore is in IM. The analysis ignores data points where space is involved because it can be typed with any finger. It also ignores the first character of each phrase. Figure 4.7 shows the time duration between characters in seconds for the four finger configurations. The finger configuration has a significant effect on the time between characters (F(3, 297) = 257.725, p < .001). Pairwise comparisons found the difference in means between all finger combinations to be significant (p < .001).
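The bookkeeping behind this analysis is straightforward; the sketch below is a minimal illustration (not the study's analysis code) that derives the finger configuration and inter-key time for each character pair from a typed sequence, using the DualKey QWERTY finger assignment for a handful of letters.

```python
# Minimal sketch: derive finger configurations (II, IM, MI, MM) and inter-key
# times from a typed sequence. FINGER holds a few letters of the DualKey
# QWERTY assignment (I = index, M = middle); a full table would cover a-z.
FINGER = {'a': 'I', 'k': 'M', 'i': 'M', 't': 'I', 'h': 'M', 'x': 'M'}

def finger_configurations(keys, timestamps):
    """Yield (configuration, inter-key time in seconds) for successive taps.

    Pairs involving a space or an unknown letter are skipped, mirroring the
    exclusions described in the text.
    """
    taps = list(zip(keys, timestamps))
    for (k1, t1), (k2, t2) in zip(taps, taps[1:]):
        if k1 not in FINGER or k2 not in FINGER:
            continue
        yield FINGER[k1] + FINGER[k2], t2 - t1

# Example: typing 'k' after 'a' is an IM transition.
for config, dt in finger_configurations("akith", [0.0, 0.7, 1.3, 2.1, 2.8]):
    print(config, round(dt, 2))
```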

The participants were fastest in the II configuration, taking the least amount of time by a large margin. Interestingly, MM takes the maximum time, even without the need to switch fingers. Further, IM takes longer than MI. In summary:

T_{II} < T_{MI} < T_{IM} < T_{MM}

While T_{II} < T_{MI} is expected since T_{MI} involves switching fingers, the reasons for T_{MI} < T_{IM} and T_{IM} < T_{MM} are not immediately obvious. We examine them in more detail.

T_{MI} < T_{IM}: The data effectively says that regardless of the prior finger, it takes longer to type with the middle finger than the index finger. This can be explained by the fact that the middle finger is bigger, used less, and users have less precise control over its movements, which causes them to spend more time correctly positioning it over the right key.

T_{IM} < T_{MM}: Even though switching is not involved, typing with a middle finger when it follows another middle finger takes longer. One explanation is found in the frequency of occurrence of the MM finger configuration data points in the complete data. Among the four configurations, MM's frequency of occurrence is half of the others: II: 30.5%, IM: 27.6%, MI: 27.8%, and MM: 14.1%. Participants encountered the MM configuration less frequently than the others and so were less used to MM than IM, which causes the MM speed to be low.


Figure 4.7: Mean time duration between characters for each of the 4 finger configurations over 10 days. I - Index, M - Middle

This hypothesis is confirmed by a closer look at Figure 4.7. Initially, when the users had no practice, both IM and MM start off at the same time duration. As the users gain more experience, they encounter MM less frequently than IM, and the curves eventually diverge after the second session.

The increasing order of time for finger configurations indicates that if we maximize the instances of lower-duration finger configurations and minimize the higher ones, we can get an overall increase in speed. This will change the keyboard layout from qwerty. The optimal layout will assign the 26 letters to the index and middle fingers such that the average time between characters is reduced to a minimum. However, such a new assignment will mean that we deviate from the familiarity of the qwerty layout for a novice user. We perform an optimization of the keyboard layout that accounts for both the optimal assignment of letters to fingers and the closeness to qwerty, and results in a final finger-optimal near-qwerty layout.

4.7 Optimization

The optimization is performed in two steps:

I. Optimizing the assignment of fingers to letters such that the average time between characters is minimum.

II. Optimizing the keyboard layout to be as close to qwerty as possible for the optimized assignment of fingers we get in Step I.

While the finger configuration analysis is instructive and promises to be useful for later optimizations, it admittedly does not capture all aspects of user latency. As we stated earlier, the positioning of fingers with respect to the next character has a bearing on movement time. We did a coarse position-based analysis where we added another variable in addition to the finger configuration: left and right halves of the screen, assuming that moving to the other half of the screen takes up time for the same finger. However, the analysis did not yield significant results beyond what we already knew from the finger configuration analysis.

We also considered an optimization based on individual character pairs, i.e., bigram times. However, this would require selecting the most optimized layout out of 26!*26! layouts, which would require a non-deterministic approach such as simulated annealing. This would also mean that the resulting layout would be a random one and not a near-qwerty one.


Since the novice performance depends on familiarity with qwerty, and because of the uncertainty associated with a non-deterministic approach, the finger-level analysis was preferred.

4.7.1 Step I: Finger Assignment Optimization

Procedure

Multiple participants remarked on the first day that they found finger switching to be difficult and that it required a certain mental effort to make the decision. They subsequently reported that it became much easier after the first day, and they were able to type more freely. During the middle sessions, participants mentioned that they could type several words without having to consciously think. One participant commented:

“You develop a rhythm with the fingers after some time and now it does not feel that different from normal typing.”

However, to make DualKey more acceptable to users initially, we need to reduce the decision-making effort, as well as the time taken by finger switching. The analysis reflects this in that the average time for different finger configurations differed from the first day to the last day (aside from MI and MM initially). We therefore perform an optimization such that the overall time between characters is reduced based on these differences in finger configuration time. Our aim is to find an optimal assignment of letters to fingers such that the more frequent letter pairs (bigrams) are associated with finger configurations having lower time duration. To this end, we define the following objective cost function:

Cost = \sum_{i=a}^{z} \sum_{j=a}^{z} f_{ij} \cdot t_{ij}

Here f_{ij} is the frequency of the bigram ij in the English language corpus [22]; t_{ij} is the average time duration of the finger configuration associated with the bigram ij. In effect, t_{ij} has only four possible values, the average time durations T_{II}, T_{MI}, T_{IM}, and T_{MM}. For instance, both ‘th’ and ‘ef’ bigrams have the same t_{ij} = T_{IM} for the qwerty layout. The cost function effectively calculates the normalized time it would take for a user to type the entire English corpus if she were to type every pair of letters in the time taken by that finger configuration on average. To minimize the overall time, the cost function needs to be minimized. For different keyboard layouts, the same letter could be assigned to different fingers, and consequently the same bigram will be allocated to different configurations, thus giving different cost outputs. For example, if a new layout is the same as qwerty, just with the key er replaced by re, then all t_{ij} associated with e and r will change – for instance, qwerty's t_{ef} = T_{IM} will be changed to t_{ef} = T_{MM} for the new layout. However, two layouts will have equal cost functions if the finger assignment is the same for every letter.
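A minimal sketch of this cost computation is shown below; the bigram frequencies are illustrative placeholders and the configuration times are the averages reported later in this section. Replacing the time table with a 0/1 switch indicator yields the Cost_switches variant discussed further on.

```python
# Sketch of the Step I cost function: Cost = sum over bigrams of f_ij * t_ij.
# BIGRAM_FREQ holds a few illustrative entries; a real run would use a full
# normalized bigram table from an English corpus.
BIGRAM_FREQ = {('t', 'h'): 0.0356, ('h', 'e'): 0.0307, ('e', 'r'): 0.0205}

# Mean inter-key times (seconds) for the four configurations over 10 sessions.
CONFIG_TIME = {'II': 0.581, 'MI': 0.765, 'IM': 0.691, 'MM': 0.831}

def assignment_cost(index_letters, freq=BIGRAM_FREQ, times=CONFIG_TIME):
    """Cost of assigning `index_letters` to the index finger (rest to middle)."""
    def finger(c):
        return 'I' if c in index_letters else 'M'
    return sum(f * times[finger(i) + finger(j)] for (i, j), f in freq.items())

# The QWERTY assignment from the text: letters typed with the index finger.
qwerty_index = set("abcdegjlmoqtuz")
print(round(assignment_cost(qwerty_index), 4))
```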

After optimization, we need the optimal assignment of letters divided into two sets, S_I and S_M, such that the cost function is minimized. For the qwerty layout, the assignment is:

S_I: A B C D E G J L M O Q T U Z

S_M: F H I K N P R S V W X Y

Notice that the optimization is not affected by the specific positions of the letters on the layout, only by the assignment of letters to one of the fingers. There will be multiple layouts that satisfy the final assignments.

In the current DualKey layout, two keys have special characters (;<) for their middle finger association. Consequently, there are 14 slots for the index finger and 12 slots for the middle finger. We keep the positions of the special characters fixed, keeping the number of index and middle finger slots constant at 14 and 12. This retains qwerty's number of letters per row and simplifies the analysis.


The four t_{ij} values used for the optimization are the average values of T_{II}, T_{MI}, T_{IM}, and T_{MM} over all 10 sessions: .581, .765, .691, and .831 s respectively. For an assignment of 26 letters into groups of 14 and 12, there are a total of 26C14 possible combinations that need to be evaluated. We wrote a MATLAB script that gave the optimal assignment of letters to fingers with the least value of the cost function.
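A Python equivalent of that search is sketched below, assuming a full bigram-frequency table; it simply enumerates all 26C14 = 9,657,700 index-finger sets and keeps the one with the lowest cost. The tiny demo table makes it runnable, but a pure-Python loop over all candidates is slow.

```python
from itertools import combinations
from string import ascii_lowercase

CONFIG_TIME = {'II': 0.581, 'MI': 0.765, 'IM': 0.691, 'MM': 0.831}

def cost(index_set, bigram_freq):
    # Cost = sum of f_ij * t_ij, with each letter's finger decided by
    # membership in `index_set`.
    finger = lambda c: 'I' if c in index_set else 'M'
    return sum(f * CONFIG_TIME[finger(i) + finger(j)]
               for (i, j), f in bigram_freq.items())

def optimal_assignment(bigram_freq):
    """Enumerate all 26C14 index-finger sets and return the cheapest one."""
    best = min(combinations(ascii_lowercase, 14),
               key=lambda letters: cost(frozenset(letters), bigram_freq))
    return set(best), cost(frozenset(best), bigram_freq)

# Illustrative call with a tiny placeholder table (a real run needs all bigrams).
demo_freq = {('t', 'h'): 0.0356, ('h', 'e'): 0.0307, ('e', 'r'): 0.0205}
print(optimal_assignment(demo_freq))
```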

Simplifying the time taken for a bigram to the average time of its finger configuration admittedly does not capture individual key-level times. However, finger configuration is a significant contributor to a bigram's time, and while the optimization's predicted values may not replicate exactly in practical use, the optimized assignment should still impact the speed positively and lessen decision-making effort.

4.7.2 Results

The baseline value of the cost function for the qwerty layout is .7006 s. The globally optimal assignment is as follows:

S^o_I: A C D E H I L M N O R S T U

S^o_M: B F G J K P Q V W X Y Z

The corresponding optimal value of the cost function is .6148 s. This is a 12.25% improvement over qwerty's .7006 s. This implies that, theoretically, this assignment should increase the speed of DualKey by 12.25%. As a comparison, we ran the optimization for maximizing the cost function to get the worst possible assignment sets. The resultant cost value was .7952 s. The difference between the best and worst times is in a narrow range, and qwerty lies somewhere in the middle.

Looking closely at the assignment sets, we see that the letters assigned to the index finger are mostly high-frequency letters (E, T, A, etc.), whereas the middle finger assignments are low-frequency letters (X, J, Q, Z, etc.). In fact, a closer look reveals that S^o_I contains the top 14 English letters by frequency [Norvig, 2013], and S^o_M contains the bottom 12. This fits in perfectly with the notion that with the highest-frequency letters associated with the index finger, users would need to switch to the middle finger rarely, thus reducing the MM, IM, and also MI configurations and increasing II configurations. Using the frequency table [Norvig, 2013], we see that the optimal assignment results in 86.5% of the taps being made by the index finger.

Our optimization algorithm was designed to minimize the time taken between characters. To see how an optimization based on minimizing the number of finger-switching instances would perform, we ran another optimization with the same cost function, but with t_{ij} replaced by b_{ij}:

Cost_{switches} = \sum_{i=a}^{z} \sum_{j=a}^{z} f_{ij} \cdot b_{ij}

Here, b_{ij} is a binary variable with a value of 1 for the IM and MI configurations and 0 for II and MM. This reduces finger switching regardless of the individual time taken. Not surprisingly, the optimization resulted in the exact same optimal assignment sets. The cost of switching for qwerty was .5792, which implies that on average, to type the English corpus, the number of finger switches required would be .5792 per bigram, or one switch for every second bigram. The optimal cost value came out to be .2220, which is an improvement of 61.7%. The number of finger switches is reduced to less than one switch for every four bigrams. This is a sizeable reduction in the number of finger switches the user has to deal with, thus lessening the decision-making effort. While theoretically the speed should improve by 12.25%, the potential reduction in decision-making effort should also impact participants' speed positively, improving the speed even further.


The new assignment sets mean that the layout can no longer be qwerty. This could impact the initial performance of users who are habituated to soft qwerty keyboards. Our optimization, though, only results in optimal assignment sets and not a fixed keyboard layout. Consequently, we perform a layout optimization that results in a keyboard layout which is closest to the qwerty layout amongst all the eligible layouts for the optimal assignment set.

4.7.3 Step II: Nearest-qwerty Layout Optimization

Procedure

With 14 letters assigned to the index finger and 12 to the middle finger, there are 14! x 12! = 4.1E+19 eligible layouts for the optimal assignment set. S^o_I and S^o_M were optimized individually over their 14 and 12 slots.

A keyboard is considered near to qwerty based on the distances of the positions of all letters in the new layout from their positions in qwerty. The coordinates were assigned to the 26 slots in the following manner: the first top-left slot is assigned (x, y) = (0, 0), and the x-coordinate increases by 1 for every move to a slot on the right. The y-coordinate similarly increases by 1 for every move to a slot below. For instance, for the qwerty layout, q has coordinates (0, 0), p has (9, 0), and m has (6, 2). The distance of a letter in layout L from its position in the qwerty layout Q is defined as:

d_{iLQ} = \sqrt{(x_{iL} - x_{iQ})^2 + (y_{iL} - y_{iQ})^2}

Here x_{iL} refers to the x-coordinate of the ith letter in a layout L, and x_{iQ} refers to the x-coordinate of the ith letter in the qwerty layout Q. Based on this distance, we define the distance cost function for a layout L:

Cost = \sum_{i=a}^{z} d_{iLQ} \cdot f_i

Here f_i is the frequency of the ith letter in the English language corpus. This is a weighted distance cost function such that more weight is given to the near-qwerty positioning of the most frequent letters.
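The sketch below illustrates this weighted-distance cost for a candidate layout, assuming the slot coordinate scheme described above; the single-letter frequencies shown are a small illustrative subset, not a full table.

```python
# Slot coordinates follow the scheme above: (0, 0) is the top-left slot, x grows
# to the right and y grows downward, so q=(0,0), p=(9,0), m=(6,2) for QWERTY.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
QWERTY_POS = {c: (x, y) for y, row in enumerate(QWERTY_ROWS)
              for x, c in enumerate(row)}

LETTER_FREQ = {'e': 0.1249, 't': 0.0928, 'a': 0.0804}  # illustrative subset

def distance_cost(layout_pos, letter_freq=LETTER_FREQ):
    """Weighted Euclidean distance of each letter from its QWERTY position."""
    return sum(f * ((layout_pos[c][0] - QWERTY_POS[c][0]) ** 2 +
                    (layout_pos[c][1] - QWERTY_POS[c][1]) ** 2) ** 0.5
               for c, f in letter_freq.items())

# QWERTY itself has zero cost; any letter moved away adds its weighted distance.
print(distance_cost(QWERTY_POS))
```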

The globally optimal layout will minimize the weighted distance cost function to give us the layout that is closest to qwerty. Since the cost function involves a sum of weighted distances, we can write it as the sum of two independent parts: the weighted distance cost function for letters in S^o_I, and the weighted distance cost function for letters in S^o_M. Because the letters in S^o_I can never occupy the slots designated for letters in S^o_M, and vice versa, these two cost functions can be optimized independently. Thus, we obtain the globally optimal layout by iterating through only 14! + 12! = 8.8E+10 layout permutations.
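Note that because each letter's weighted distance depends only on the slot it lands in, the within-set search can also be posed as a standard assignment problem. The sketch below solves a toy instance exactly with the Hungarian algorithm (scipy's linear_sum_assignment); this is an alternative to the enumeration described in the text, and the letters, slots, and frequencies are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Toy instance: four letters of one finger's set and the four slots reserved
# for that set. Minimizing total weighted distance to the QWERTY positions is
# an assignment problem, solved exactly below.
letters = ['e', 't', 'a', 'o']
slots = [(2, 0), (4, 0), (0, 1), (8, 0)]
qwerty_pos = {'e': (2, 0), 't': (4, 0), 'a': (0, 1), 'o': (8, 0)}
freq = {'e': 0.1249, 't': 0.0928, 'a': 0.0804, 'o': 0.0764}

cost = np.array([[freq[c] * np.hypot(sx - qwerty_pos[c][0], sy - qwerty_pos[c][1])
                  for (sx, sy) in slots] for c in letters])
rows, cols = linear_sum_assignment(cost)
layout = {letters[r]: slots[c] for r, c in zip(rows, cols)}
print(layout, "cost =", round(float(cost[rows, cols].sum()), 4))
```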

Results

Figure 4.8(b) shows the optimal layout that resulted from the optimization. We term it the SWEQTY layout. The optimal weighted distance cost value is .53. The unweighted distance of SWEQTY from QWERTY is 16.9. As the figure shows, the highest-frequency letters in both S^o_I and S^o_M retained their qwerty positions whenever their finger assignment was the same as in qwerty. As a comparison, the farthest layout from qwerty had a weighted distance cost of 2.95 and an unweighted distance of 93.6.

SWEQTY is the optimal layout in terms of time taken between characters, finger-switching instances, and closeness to qwerty for any DualKey keyboard with the same slot layout. The same process can easily be applied to any other slot distribution for any DualKey variation.


Figure 4.8: (a) DualKey QWERTY (b) DualKey SWEQTY

4.8 Performance Evaluation: DualKey SWEQTY

With SWEQTY, the problem of initial unfamiliarity with the keyboard is somewhat alleviated due to the closeness with QWERTY. Combined with the potentially reduced decision-making effort and higher speed, we hypothesize that we will see a net improvement in the novice as well as the long-term speed and error rates. To validate our hypothesis, we ran another 10-day study with eight participants with the exact same design as the first QWERTY study. All participants were different from the earlier study. In total, we had 8 participants x 10 sessions x 5 blocks x 5 phrases = 2000 typed phrases.

4.8.1 Results

The SWEQTY speed by day is shown in Figure 4.9. The effect of day on speed was significant (F(9, 63) = 34.887, p < .001). The mean novice speed of SWEQTY is 14.69 WPM, and the long-term speed is 21.59 WPM, both of which are higher than QWERTY's. SWEQTY's curve is higher than QWERTY's. The novice speed shows a 19% improvement over QWERTY.

SWEQTY speeds show a higher variance than QWERTY. This is because a subset of participants texted heavily on their smartphones every day. Consequently, they were using qwerty soft keyboards daily, in addition to SWEQTY. The performance of these users suffered as a result, leading to the higher variance.

The novice and expert TERs of SWEQTY are 5.67% and 3.27% respectively. A mixed ANOVA on error rates over the 10 days yielded a significant difference between the TERs of QWERTY and SWEQTY: F(1, 16) = 6.737, p < .05. Figure 4.10 shows the TER percentages. The novice TER improved by 10.93%, and the long-term TER improved by 37.71%, which is a rather large improvement. The SWEQTY TER starts off close to QWERTY's, but falls rapidly to values considerably lower than QWERTY's. The relatively high error rates initially in both techniques are because the techniques were unfamiliar to the users. The fall after the first day is pronounced in SWEQTY because of less finger switching and the low frequency of middle finger taps.

Participants mentioned that they felt they needed to switch fingers very infrequently and that the technique was easy to pick up. When asked about the non-qwerty layout, one of the participants commented: “It looked weird. But when I actually started typing, I did not have to actively search for the alphabets a lot.”

Looking back at Table 4.1, we see that all four metrics show improvements over DualKey QWERTY. While the SWEQTY vs. QWERTY speeds do not show significance, there is a significant difference between their TERs. At the novice stage, the speed improves by 19% and TER improves by 10.9% despite a non-qwerty layout, which shows that the gains of switching to a near-qwerty optimal layout outweigh the familiarity of a qwerty layout for DualKey even at the starting point.


Figure 4.9: Mean Speeds of DualKey: SWEQTY vs. QWERTY

Figure 4.10: Mean TER% of DualKey: SWEQTY v/s QWERTY

The large improvement in error rates boosts the usability of SWEQTY further, especially in the long term.

SWEQTY further extends the improvement that DualKey showed over existing techniques. Although its novice speed does not exceed Splitboard's 15.30 WPM, its value of 14.69 WPM can be considered comparable. The novice TER, on the other hand, at 5.67% is lower than Splitboard's 7.35%. Both QWERTY and SWEQTY perform comparably or better than the others on all four metrics. To put it another way, DualKey at least performs comparably to the best-performing novice technique while exceeding the others, and performs better than all existing techniques on long-term performance.


4.9 Discussion

4.9.1 Performance Improvements

Even after optimization using the SWEQTY layout, there is further scope for improvement in multiple areas. First, our custom hardware limited the degrees of freedom of the user's fingers and thus constrained the user's ability to do free-form typing. Second, even though it was small, the finger sensor occluded the screen slightly, which further impeded users' performance. An unobtrusive finger identification technology will further improve the performance. In fact, aside from fingerprinting techniques, we can imagine miniature sensor rings that the user can wear that will reduce movement constraints or occlusion.

Third, incorporating auto-correction or prediction [MacKenzie et al., 2001] into the keyboard will allow the user to make more imprecise taps, which will increase the overall typing speed. Using such statistical disambiguation might preclude the need for detecting fingers altogether. However, that would imply not only that words outside the dictionary cannot be typed, but also that new words cannot be added to the dictionary by the user, since there is no way to perform per-character entry. This will be even more problematic for users who type non-English words in English, a frequent occurrence in countries such as India.

Finger identification leads to a huge number of possibilities for simplifying and augmenting interactions. Therefore, as and when the technology matures, we will potentially see such applications on the rise and slowly becoming part of the natural interaction ecosystem. In an ecosystem where finger identification interactions are a norm, users' novice performance will certainly be superior.

4.9.2 Limitations and Future Work

Our study does not account for left-handedness. There is an added wrinkle when DualKey is implemented for the left hand. Since the index finger is to the right of the middle finger, the characters corresponding to the index finger should be on the right part of a key. This will change the layout from qwerty/SWEQTY to something that is farther from qwerty. While the process will remain the same, left-handed DualKey needs to be studied independently.

DualKey has been designed and optimized for a smartwatch form factor. However, there are even smaller screens where DualKey's efficacy needs to be evaluated. As an extension, a TriKey model can be studied where a single key corresponds to three letters associated with three fingers (or 2 fingers + thumb), reducing the space requirements even further. The insights we have gained from DualKey can be helpful in designing optimized versions of such TriKey keyboards. As we have seen, the optimal finger assignment results in allocating the highest-frequency letters to the index finger. We can apply the same principle for TriKey, where the highest-frequency letters are allocated to the index finger, followed by the middle finger, and finally the ring finger or the thumb.

4.9.3 Designing Finger Identification Interactions

Our analysis provided some insights on designing interactions that use finger identification. First, reducing finger switching, even at the cost of impacting the familiarity of an interface, is an inquiry worth undertaking for any new application. Second, middle finger taps expectedly take longer and should be assigned low-frequency keys. Third, such interactions will be prone to incorrect-finger mistakes, and providing a quick Swap or Undo method will be useful. It also reduces the cost of an incorrect-finger tap in the user's mind, thus relaxing the interaction without the user having to worry about making finger errors.


Text entry has a latent implication that it can be undone easily. However, keys like Send cannot be undone. We recommend that buttons triggering commands that cannot be undone easily should not be associated with multiple fingers.

The space of interactions using finger identification is starting to grow out of its infancy. What is needed is not just a library of interactions, but also end-to-end applications and in-depth analyses that provide a compelling argument for these interactions and for how they improve and augment our current interactions. DualKey hopes to be a progressive step in that direction.

4.10 Conclusion

We presented DualKey, a novel technique for miniature-screen text entry via finger identification. We built custom hardware that performed finger identification with 99.5% accuracy by detecting subtle finger movements. We conducted a comprehensive long-term study of the technique and an in-depth analysis of its speed and error rates. The error analysis showed the usefulness of having the swap button. Based on the speed analysis, we optimized the assignment of letters to fingers to reduce the finger-switching time. Based on the new assignments, we optimized the layout to get a nearest-to-qwerty layout that is as close to familiar as possible. We conducted another long-term study with eight participants for the new SWEQTY layout to validate the theoretical premise. Despite being a non-qwerty layout, SWEQTY improves upon DualKey QWERTY in all respects, including improved novice and long-term speeds and error rates, as well as reduced initial decision-making effort.

DualKey (QWERTY/SWEQTY) beats the existing techniques in multiple respects: (1) Using single-step selection, instead of the prevalent two-step selection, it has a GPC of 1 in contrast to a minimum GPC of 1.28 among existing techniques. (2) In contrast to the best-performing techniques, DualKey does not require swipe gestures, thus allowing for parallel shape writing on the layout. (3) DualKey's long-term performance, both in terms of speed and error rate, is better than the reported long-term performance of other techniques, even while not being a typical expert technique. (4) DualKey QWERTY's novice performance exceeds all others barring Splitboard. DualKey SWEQTY improves upon this novice performance and is comparable to Splitboard in terms of novice speed and betters Splitboard in terms of error rate.

Overall, based on Porous Interfaces and DualKey, we show that finger identification can be useful both for enabling novel interaction capabilities and for enhancing existing interactions. Further, finger identification is useful for a range of small touchscreens and has both qualitative and quantitative benefits. However, wrist-worn wearables also allow persistent contact with the skin, which can serve as another channel for output and for input. Further, not all wearables have touchscreens, owing to power or aesthetic reasons, and in such situations it becomes imperative to explore wearable haptics for interactions. In the next two chapters, we see how wrist-worn haptics can be used for enhanced feedback and as a central component of our interactions.


Chapter 5

Tactile Squeezing Sensations for Wrist Wearables

5.1 Introduction

1 As described in Section 2.2.1, our cutaneous senses comprise the submodalities of stimuli that can be perceived by our skin – light touch, vibration, pressure, temperature, pain, and itch [McGlone and Reilly, 2010]. While vibrations have been researched extensively, the pressure modality remains underexplored. Pressure communicates more intimate [Wang et al., 2012] and pleasant feedback [Suhonen et al., 2012]. For instance, the simple holding of a hand or a finger relies on the pressure conveyed by the clench. This clenching around a body part is termed squeezing feedback. A miniature, portable, and controllable squeezing feedback mechanism that can fit within today's smartrings and smartwatches would be very useful.

In this chapter, we investigate the use of shape memory alloys (SMAs) for squeezing pressure feedback on the wrist. Most work on wrist pressure feedback uses pneumatic actuation, similar to blood pressure devices, which involves pumps and valves that are relatively bulky for a wearable device. Secondly, the size of the inflated cuffs makes it impossible to generate highly localized compression sensations. Thirdly, pneumatic feedback is limited in its ability to provide instant compression sequences because of the inflation and deflation time. Finally, pneumatic actuation provides compression feedback whose perceptual properties are different from squeezing. We investigate squeezing feedback using SMA springs that are lightweight and thin, enable a high localization acuity, and can quickly generate strong squeezing feedback.

In the following sections, we formalize the definition of squeezing feedback, describe HapticClench – an SMA squeezing actuator – and detail its design process and electro-mechanical properties. We report on the psychophysical analysis of HapticClench, including absolute detection and JND thresholds. We then investigate how HapticClench's low bulk and spatial acuity lead to capabilities that are not present or investigated in earlier work in this domain. We end with design guidelines and a discussion.

1 The contents of this chapter were published at UIST 2017 [Gupta et al., 2017a].


Figure 5.1: HapticClench’s squeezing tangential & shear forces

5.2 Related Work

Pressure actuation on the body can be point-based [Antfolk et al., 2010, Kikuuwe and Yoshikawa, 2001], planar [Ying Zheng et al., 2013, Zheng and Morrell, 2012], or around a body part, in which case it can be compression- or squeezing-based. Research on compression predominantly uses pneumatic actuation [Pohl et al., 2017]. Although squeezing is not much explored, motor-based squeezing is the predominant approach. Both approaches use the wrist as the common ground for such exploration.

Pneumatic actuation inflates air into a cuff around the wrist. Multiple works use blood pressure cuffs for compression to provide sensory replacement when using prosthetics [Patterson and Katz, 1992, Tejeiro et al., 2012]. Mitsuda et al. [Mitsuda, 2013] and Pohl et al. [Pohl et al., 2017] study the psychophysics of pneumatic compression, with both establishing a connection between air pressure and users' detection thresholds. Pohl et al. further found that users took more time to react to compression than to vibrations, which is attributed to the inflation time of the straps.

Motor-based actuation tightens a band around the wrist by pulling it towards the top using a motor [Baumann et al., 2010, Chinello et al., 2014, Hinckley and Song, 2011, Stanley and Kuchenbecker, 2011]. Song et al. [Song et al., 2015] show that temporal squeezing pulses are recognized as well as vibration cues. Baumann et al. [Baumann et al., 2010] show that participants described different squeezing pulses with a larger range of adjectives than tapping and considered it more organic. Chinello et al. [Chinello et al., 2014] show that squeeze actuation using three motor-driven moving plates resembled the squeezing action of the hand. However, no existing work investigates the squeezing load limits and interaction capabilities that a miniature motor-based device can provide. While some of the proposed devices could be small enough for wrist wearables, their strength is limited and the presence of a small mechanical motor still adds considerable thickness, which makes it impractical for flatter wearables, fitness bands, and rings. Prior work also notes the presence of noticeable gear noise and vibrations proportional to speed and load.

Suhonen et al.'s [Suhonen et al., 2012] work is the only instance of using SMA wires for squeezing feedback. The users found the squeezes “surprisingly weak”, such that they would not be able to feel them in distracted situations. However, they found the sensations to be “very pleasant” and “massage-like”.


5.3 Squeezing Feedback

Prior literature uses the terms squeezing and compression interchangeably to describe pressure that encompasses the wrist. Pohl et al. [Pohl et al., 2017] describe compression feedback as tangential-only compression of the skin all around the body part, as when an inflatable strap around the wrist pushes into the skin and compresses it tangentially. We delineate the term squeezing to refer to pressure sensations around a body part that consist of tangential and shear forces on it (Figure 5.1). When a band tightens around the wrist, instead of directly pushing against it, it results in both shear and compression forces. This is a different perceptual phenomenon, since shear forces result in skin stretch that acts upon the Ruffini endings [Mountcastle, 2005] in the cutaneous tissue, while compression acts upon the Pacinian and Merkel endings in the cutaneous and subcutaneous skin tissue [Lederman and Klatzky, 2009, Pohl et al., 2017]. When the stimulus solely relies on compression, it can lead to constriction similar to blood pressure monitors, affecting comfort [Pohl et al., 2017]. However, while squeezing and compression are biologically different [Kikuuwe and Yoshikawa, 2001], their perception might not be exclusive of each other [Mountcastle, 2005]. An exploration into this requires designing a pneumatic device with the same width and load properties as HapticClench and is a subject for future work.

Squeezing happens when a wire or a band tightens around the wrist. We define three properties of a squeezing feedback device that inform us of its squeezing type and competence: span, load capacity, and load throughput. 1) Span is the width of the actuating wire or band on the skin. 2) Load capacity is the maximum load a squeezing device generates. 3) Load throughput is the maximum load a squeezing device can provide in a second. The higher the span of a device, the higher the load required to generate the same amount of pressure on the user's skin. The load capacity and load throughput inform us about the limits of strength and speed of a device. However, these properties can only give us a sense of a device's squeezing prowess. Squeezing load can vary differently based on the power supplied and the time increment. We investigate these issues for HapticClench after discussing its design process.

We desire three properties in HapticClench: a small span that enables higher spatial acuity, thus allowing for multiple squeezing actuations; a high load capacity to offer a wide range of stimulus strength; and a high load throughput to minimize latency upon actuation.

5.4 Design Process: Making of a Strong SMA Squeezing Actuator

Shape-memory alloys have the ability to deform to a preset shape when heated. HapticClench uses Flexinol®, a commercially available nickel-titanium SMA with a low span that contracts like a muscle when electrically driven. However, as mentioned, SMA wires were perceived as surprisingly weak. The challenge was to increase the load capacity of the wires while keeping the span small and achieving a high load throughput. The basic prototype has a Flexinol® wire tied around the wrist whose contraction modulates based on the power supply. The strength of an SMA wire depends on two factors: (1) the absolute contraction limit of the wire from its original length, and (2) the restoration of the contracted wire to its exact original length so that subsequent contractions are consistently strong.

Figure 5.2 catalogues the HapticClench wire design iterations. The initial prototype (Figure 5.2a) consisted of a 0.5 mm diameter wire attached to the wrist with Velcro straps. However, while the sensation was easily discernible initially, it did not consistently produce the same load capacity. This was because the wires need an external pull force while cooling to restore to their exact original length. While we assumed that the force exerted by the squeezed skin on the contracted wire would restore the wire, it was not enough.


Figure 5.2: HapticClench wire design iterations. (a) SMA wire + Velcro (b) Wire + Velcro + restorative spring (c) Coiled wire in series + insulation (d) Wires in parallel (e) SMA spring + restorative spring + hook (f) Final: SMA spring + hook, no restorative spring

To exert the required pull force, we added a 4x12 mm extension spring (Figure 5.2b) to the ends of the wire, which was enough for restoration. However, the wire now had to overcome the restorative spring force during its contraction, which weakened the squeezing sensation.

To solve this, we considered two solutions: (1) a longer wire with multiple coils around the wrist that would increase the maximum overall contraction (Figure 5.2c); however, there was no way to incorporate springs in this design. (2) Multiple wires placed in parallel on the skin (Figure 5.2d); while this increased the overall strength, the resultant force was spatially distributed over the skin, resulting in a spread-out but weak sensation.

We tried other SMAs such as generic Nitinol, but the sensations, although strong for the thickest Nitinol, were highly contingent on an initial manual calibration using an open flame, which could not be made consistent. Flexinol®, on the other hand, is pre-treated. Finally, we used Flexinol® springs (Figure 5.2e). The springs had higher contraction force and maximum possible contraction without any spatial distribution, and resulted in a much-improved sensation. The springs also required less restorative force, which the squeezed skin itself could successfully provide. The restorative spring was therefore removed in the final prototype (Figure 5.2f). While the SMA spring's span is slightly larger than the wire's, it delivers much higher load capacity and throughput.

5.5 The HapticClench System

The final HapticClench prototype (Figure 5.2f) uses 30-coil Flexinol® springs with a 0.5 mm wire diameter and a 3.45 mm span (outer diameter) that have a load capacity of 1.63 kg. The load throughput is the same as the load capacity, i.e., it can reach a load of 1.63 kg in 1 s. The system (Figure 5.3) consists of the spring connected via crimps to a hook that ties around the user's wrist. The ends of the spring connect to the driving circuit. The circuit consists of an Arduino Pro Mini that supplies PWM pulses to drive a MOSFET, which in turn drives the SMA spring. Varying the PWM varies the supplied power, resulting in different squeezing loads.

In absolute terms, the strength of the sensation is expressed in terms of the load of the force exerted by the spring. This depends on the power and duration of the supply. We measured the load of the spring horizontally using a digital scale. Figure 5.4 shows the load strength for different power values supplied for a duration of 2 s.

A similar load curve can be obtained within a 1 s duration by supplying higher power. The load rises slowly, then rises almost linearly until 1.25 kg, and then starts leveling off as it reaches its load limit. The spring maintained its consistency over multiple uses and across other similar springs within an error range of +-5%. However, large changes in ambient temperature could affect this consistency. The prototype was always used at an ambient temperature of 21-25°C. We derive a two-degree polynomial equation for the curve (R^2 > 0.95), excluding the first three points, and use it for the studies.
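A minimal sketch of such a calibration fit is shown below; the (power, load) samples are illustrative placeholders rather than the measured data, and the fitted polynomial is then inverted numerically to find the power needed for a target load, as done when administering stimuli in the studies.

```python
import numpy as np

# Illustrative (power, load) calibration samples; not the measured data.
power = np.array([5, 10, 15, 20, 25, 30, 33])                  # watts
load = np.array([0.16, 0.40, 0.70, 1.00, 1.25, 1.50, 1.63])    # kg

# Second-degree polynomial fit: load ≈ c2*P^2 + c1*P + c0.
fit = np.poly1d(np.polyfit(power, load, deg=2))

# Invert the fit numerically: find the power that produces a target load.
target_load = 0.16
candidates = np.linspace(power.min(), power.max(), 1000)
required_power = candidates[np.argmin(np.abs(fit(candidates) - target_load))]
print(fit.coeffs, round(float(required_power), 1))
```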

As evident, HapticClench combines a small and lightweight apparatus with a small span, high load capacity, and high load throughput.


Figure 5.3: HapticClench circuit+spring assembly

Figure 5.4: HapticClench Load vs Power supplied

The system, however, has its drawbacks. The power required to generate the maximum load of 1.63 kg is 33 W. This is high and, given current battery sizes, could be a limiting factor in its portability. Multiple SMAs that use lower power with a higher efficiency have been proposed [Darling et al., 2002, Faran and Shilo, 2015]. However, other SMAs are not commercially available for non-bulk orders. Another limitation of Flexinol® is that it reaches a temperature of up to 90°C at maximum contraction. This temperature peak stays for only a fraction of a second, rapidly cooling down to room temperature within 15 s. We insulate the user's wrist using two overlaid 33% rubber, 67% polyester bands of 0.9 mm thickness each and 19 mm width, with 2 layers of insulating Kapton tape in between. To encapsulate the spring into a band, a heat venting mechanism such as holes in the band is necessary. It should be noted, though, that the temperature remains within comfortable levels for lower loads. As we will see in the next section, users' detection thresholds are much lower than the maximum load.

5.6 Psychophysics of HapticClench

To establish the fundamental psychophysical properties of squeezing feedback, we study its absolute detection & discrimination (JND) thresholds. While earlier work in pneumatic compression has studied these thresholds, this work is the first such investigation into squeezing feedback.


5.6.1 Absolute Detection Threshold

What is the minimum load of the squeezing feedback that a user can feel? To determine this, we conducted a standard two-down, one-up staircase study. Every squeezing stimulus was applied for a duration of 2 s. The experiment started with a high-load stimulus that reached 0.6 kg at 2 s. For every stimulus that was felt two consecutive times, the load was decreased by a factor of 0.6; it was increased by the same factor if the stimulus was not felt once. After three reversals, the factor was changed to 0.75. This ensured faster convergence towards the threshold level initially, and a fine-grained threshold estimation later. The load values were administered by supplying the equivalent power based on the curve in Figure 5.4. The experiment ended after nine total reversals, and the average of the last five was taken as the threshold estimate. We hypothesized that the absolute detection threshold (ADT) would fall between 0.1 and 0.3 kg. The 0.6 kg initial load ensured an easily detectable start. The 0.6 step factor ensured quick jumps to the 0.1-0.3 kg vicinity, thus minimizing trials, after which the 0.75 factor ensured convergence to a fine-grained value.
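For clarity, the sketch below simulates the staircase bookkeeping just described; the simulated observer (`felt`) is a stand-in assumption for a participant's Felt/Not Felt responses, not part of the actual study apparatus.

```python
import random

def felt(load, threshold=0.16):
    # Simulated observer: detects loads above threshold, with occasional noise.
    return load >= threshold or random.random() < 0.1

def staircase(start=0.6, reversals_needed=9):
    """Two-down, one-up staircase with step factor 0.6, then 0.75 after 3 reversals."""
    load, factor, direction = start, 0.6, -1      # start by stepping down
    consecutive_felt, reversals = 0, []
    while len(reversals) < reversals_needed:
        if felt(load):
            consecutive_felt += 1
            if consecutive_felt == 2:             # two consecutive detections: decrease load
                consecutive_felt = 0
                if direction == +1:
                    reversals.append(load)        # switch from increasing to decreasing
                direction = -1
                load *= factor
        else:
            consecutive_felt = 0
            if direction == -1:
                reversals.append(load)            # switch from decreasing to increasing
            direction = +1
            load /= factor
        if len(reversals) == 3:
            factor = 0.75                         # finer steps after three reversals
    return sum(reversals[-5:]) / 5                # threshold estimate: mean of last five

print(round(staircase(), 3))
```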

Ten participants (2 female, age 22-27, mean = 23.8), all right-handed, took part. We recorded wrist circumference (WC) (13.9-17.2, mean = 15.9 cm) and wrist-top skinfold thickness (SFT) [Fletcher et al., 1962] (4.3-8.1, mean = 5.5 mm), which is a reliable indicator of body composition [Wells and Fewtrell, 2006]. Participants wore the band on their left wrist with the arm resting on a table and hidden from their view during the study. While the squeezing sound was minimal, the participants wore headphones playing Brown noise for complete insulation. They controlled the mouse with the right hand to respond Felt/Not Felt after every stimulus. A gap of 20 s between consecutive stimuli ensured that the spring completely returned to the ambient temperature.

Results

The mean threshold estimate is 0.16 kg (95% CI [0.12, 0.20]). No correlation with SFT or WC was observed. A 0.16 kg load corresponds to 5 W of power, which is higher than vibration actuators require, but manageable in today's wearables.

Distracted Absolute Detection Threshold

In situations where the user is distracted, a stronger stimulus might be needed. We conducted an absolute detection threshold study while the user performed a primary task. The primary task was playing the game CandyCrush [King, 2017] on a desktop. The participants first played a practice level to familiarize themselves with the game. Participants were instructed that their goal was to score as much as possible. At the start of the experiment, the participants began playing the game, and the squeezing stimulus was played at a random time between 30 s and 75 s after the previous pulse. The interface only had one button for Felt on a second screen, and participants were instructed to press it whenever they felt something and then get back to the game. If there was no response within 8 s of the pulse, it was considered Not Felt. The user could pause and resume CandyCrush at any time without timing constraints. The 30-75 s random gap allowed the participants to get their focus back into the game and not anticipate the next pulse. The study followed a one-down, one-up staircase design since the users were not prompted to respond and only responded when they felt the sensation. The other parameters were the same as in the prior study. Ten participants (3 female, age 22-26, mean = 23.34), different from the prior study and all right-handed, took part.

Results

The mean threshold estimate is 0.17 kg (95% CI [0.14, 0.20]). This shows that even though the squeezing modality is perceived as less attention-demanding, its minimum perception threshold remains similar with or without distraction. However, more cognitively demanding tasks might lead to a different outcome.


We also studied the response times of the participants to see how quickly they perceive and respond to the squeezing. The mean response time was 1.4 s after the pulse stopped playing. Including the pulse time of 2 s, a total of 3.4 s demonstrates quick responsiveness to the squeezing sensation.

5.6.2 Discrimination Thresholds (JND)

To determine the different levels of squeezing load users can feel, we conducted a JND study using the method of constant stimuli. For each trial, participants had to respond whether a pair of stimuli felt the same or different. Each pair consisted of a base load and an offset load value. We used four base loads (0.125, 0.25, 0.5, 1.0 kg) and four offsets (ΔLoad) (0, 0.125, 0.25, 0.5 kg), giving 16 different conditions. In keeping with the JND standard, the bases and offsets increased exponentially while keeping within the spring's load limits. Each condition was repeated once. The order of conditions and the order within the stimulus pairs were randomized. Participants could play back a stimulus pair if they wanted to be sure of their response. The same participants as in the absolute detection threshold study took part 4 days later.

Results

Our JND analysis follows from Pohl et al.'s pneumatic JND study [Pohl et al., 2017], which estimates a conservative JND to better fit its use for real-world differentiation. JND is defined as the load difference at which 95% of the users are able to differentiate two stimuli. Figure 5.5 shows the aggregate fraction of responses that found the two stimuli equal for each base-offset condition. For a 0.25 kg base load, 0.25 kg more is required for it to be distinguishable 95% of the time.

Figure 5.5: ΔLoad (kg) (y-axis), base load values (x-axis). The table shows the fraction of responses that judged two stimuli as equal. As the base load increases, the offset needs to be higher.

We determined the 95% JND for each base value by fitting a logarithmic function to the ΔLoad vs. aggregate % data (R^2 > 0.90 for all) and calculating the ΔLoad at 95%. Figure 5.6 shows the resultant JND values in blue. While the 95% JND's general trend adheres loosely to Weber's law (ΔL/L = 1.79), the 75% JND follows it more closely.
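A sketch of that calculation for a single base load is shown below; the response fractions are illustrative placeholders. The fitted logarithmic curve is inverted to read off the offset discriminated 95% (or 75%) of the time.

```python
import numpy as np

# Illustrative data for one base load: fraction of "different" responses at
# each non-zero offset (placeholders, not the study data).
offsets = np.array([0.125, 0.25, 0.5])        # kg
frac_different = np.array([0.70, 0.88, 0.99])

# Fit frac = a * ln(offset) + b via least squares on log-transformed offsets.
a, b = np.polyfit(np.log(offsets), frac_different, deg=1)

def jnd(level):
    # Invert the fitted curve: the offset at which `level` of responses differ.
    return float(np.exp((level - b) / a))

print(round(jnd(0.95), 3), round(jnd(0.75), 3))
```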

The 95% JND for Pohl et al.'s pneumatic actuation [Pohl et al., 2017] was 2.77, which is much higher than 1.79. Further, participants in our study played back an average of 0.15 times per trial, compared to 3.1 times for pneumatic actuation. While part of this difference is attributed to the different nature of compression vs. squeezing, and to the surface area of actuation, the time pneumatic actuation took to reach the target load was also higher. Our 2 s actuation time is much lower, and earlier work has shown that feedback around the wrist is better perceived when applied in quick time durations [Suhonen et al., 2012].


Figure 5.6: The 75% and 95% JND values for base loads

5.7 HapticClench: Capabilities and Use

We have established the psychophysical properties of HapticClench's squeezing feedback and shown that the system can easily generate the minimum detection thresholds and has a range of up to four levels of load that the user can differentiate. This can be useful in a variety of scenarios, such as notifications, information communication via patterns, gradual progression of temporal activities, and even emotion communication on the wrist or finger. We investigate three capabilities of HapticClench that bolster these user scenarios: MultiClench, SlowClench, and RingClench.

5.7.1 MultiClench: Spatial patterns using multiple springs

HapticClench's small span ensures a narrow surface area of actuation, thus enabling multiple springs to be placed side-by-side (Figure 5.7), which can generate multiple squeezing patterns that can communicate more details about standard notifications, or a codified message. For instance, a user wearing a force-sensing band could squeeze her own wrist to make a spatial pattern, and then send it to a friend who feels the analogous sensations via the HapticClench band and understands the message. The question is, are the generated squeezing sensations good enough for the user to distinguish spatially? If yes, what are the constraints on these patterns, and to what accuracy can the user correctly perceive them?

Figure 5.7: (left) MultiClench, (right) RingClench

We investigated duration patterns lasting 3s for the three springs. Considering triplet patterns of 1s each, 27 such patterns are possible. In initial explorations with all 27 patterns, we observed that while the users could accurately identify whether the current sensation was to the left or right of the previous one, it was somewhat confusing to identify the exact sensation. For instance, if 1-2-3 was played, it was clear, but with 2-3-1, there are three potential answers: 1-2-1, 2-3-2, and 2-3-1. To get rid of the confusion, we removed repeating patterns. Further, we included all single and double stimulus patterns to make a set of 15 patterns with no repeating sensations: 1, 2, 3, 12, 13, 21, 23, 31, 32, 123, 132, 213, 231, 312, 321. For double patterns, each sensation played for 1.5s; for singles, it played for 3s. We conducted a study to see how well the users can disambiguate these patterns.
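As a quick illustration of how this pattern set is formed, the snippet below (a sketch, not study code) enumerates every ordering of one to three distinct springs, which yields exactly the 15 patterns listed above.

    from itertools import permutations

    SPRINGS = (1, 2, 3)
    PATTERN_DURATION_S = 3.0  # every pattern lasts 3 s in total

    def multiclench_patterns():
        """All orderings of 1-3 distinct springs: 3 singles + 6 doubles + 6 triples."""
        return [p for length in (1, 2, 3) for p in permutations(SPRINGS, length)]

    def element_duration(pattern):
        """Each spring in a pattern plays for an equal share of the 3 s window."""
        return PATTERN_DURATION_S / len(pattern)

    patterns = multiclench_patterns()
    print(len(patterns))               # 15
    print(element_duration((1, 3)))    # 1.5 s per spring for a double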

Study Design

In the study, each pattern was played twice for a total of 30 trials, which were randomized. Participants chose one out of the 15 pattern options in the interface. They were not given feedback on the correctness of their responses. All sensations were played at a load of 0.9kg. 8 participants (1 female, all right-handed, age 19-25, mean age = 23.5), all different from prior studies, took part. Before the experiment, participants were introduced to the sensations from the three springs and did a practice run of 8 random patterns.

Results

The mean accuracy was 85% and varied heavily depending on the patterns. Figure 5.8 shows accuracy by pattern.

Figure 5.8: Accuracy of MultiClench patterns (95% CI)

For patterns 1, 3, 13, 32, 123, 132, 231, 312, 321, at least 14 out of 16 total trials (87.5%) were correct. We noticed a trend in the patterns with low accuracy:

• Pattern 2: 75% confusion with pattern 3
• Pattern 12: 80% confusion with pattern 13
• Pattern 21: 80% confusion with pattern 31
• Pattern 31: 100% confusion with pattern 32

The high triplet but low doubles accuracy indicates that participants used the relative positions of springs. Post-study interviews confirmed that while the users could accurately identify whether a sensation was to the left or right of the previous one, they had trouble with its exact location. Absolute positioning might thus depend on separating the springs further. For singles, 1 & 3 have higher accuracy than 2. However, participants erred more on the side of misidentifying the middle spring as 3. While we did not record each pattern's perceived difficulty, participants reported judging spring position based on how far the squeeze was from the head of the ulna. This might have resulted in a bias for spring 3. Based on these results, we recommend removing the middle spring from single and double patterns, giving the final set of patterns as: 1, 3, 13, 31, 123, 132, 213, 231, 312, 321.

In addition to this, we asked the users to rate the sensations felt in the study. Figure 5.9 shows the comfort and annoyance ratings. Participants felt that the sensations were fairly comfortable. One participant mentioned that they felt a bit tighter than they liked. Since the perception of the load varies from participant to participant, an initial calibration where participants can set their comfort levels would be useful.

Figure 5.9: Comfort & annoyance boxplots for MultiClench

SlowClench: A Gradual Progression of Squeezing

Given its intimate nature, the squeezing sensation could be perfect for ambient delivery of a gradual change of state or progress. For instance, while waiting for a friend to arrive, a slowly incrementing squeezing sensation could signal the narrowing distance. A user waiting for a download could passively track it via the incremental sensation. While the evaluation of temporal progression is a wide topic, we evaluate the feasibility of HapticClench to deliver moderately slow squeezing sensations of 1min and of 30s durations. For a continuing task, its progress can either be increasing or be paused. Further, as the duration of a tactile sensation increases, its perception by the user decreases [Chung et al., 2015]. Therefore, the question is, can the user passively differentiate a squeezing sensation that is either increasing or paused? To evaluate this, we propose two different types of squeezing load increments: continuous and staggered. Drawing from visual progress bars that update continuously or in a staggered way, the continuous squeezing sensation rises uniformly from a low 0.3kg load to 1.1kg in 30s/1min. The staggered pulse rises in four steps of 0.2, 0.4, 0.8 and 1.6kg at equal intervals in 30s/1min. The third pulse, denoting a pause, is a holding pulse that rises to 0.6kg in 1s and then stays there until the 30s/1min ends. One problem was that the longer actuation times generated higher heat. To combat this, we added another polyester band with a layer of Kapton tape.
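The three pulse profiles can be summarized as load-versus-time functions. The sketch below encodes them as described above; the function names and sampling interface are ours and not part of the HapticClench implementation.

    def continuous_load(t, duration):
        """Continuous pulse: load rises uniformly from 0.3 kg to 1.1 kg over `duration` s."""
        frac = min(max(t / duration, 0.0), 1.0)
        return 0.3 + frac * (1.1 - 0.3)

    def staggered_load(t, duration):
        """Staggered pulse: four equal-length steps of 0.2, 0.4, 0.8 and 1.6 kg."""
        steps = (0.2, 0.4, 0.8, 1.6)
        idx = min(int(t / (duration / len(steps))), len(steps) - 1)
        return steps[idx]

    def holding_load(t, duration):
        """Holding pulse: ramps to 0.6 kg within 1 s and stays there until `duration` ends."""
        return 0.6 * min(t, 1.0)

    # Example: sample the 30 s staggered pulse every 10 s.
    print([staggered_load(t, 30.0) for t in (0, 10, 20, 29)])  # [0.2, 0.4, 0.8, 1.6]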

Study Design

Participants play an obstacle game as their primary task. The 30s/1min pulse plays at a random time 30s-1min after the participant starts playing the game. After the pulse is over, a pop-up appears after a random time of 10-20s has elapsed, instructing the participant to select whether they felt a Holding or an Increasing pulse earlier. The users are not asked to distinguish between Continuous and Staggered increasing pulses. After the response, participants get back to the game. The next pulse then plays at a random time 30s-1min after the participant's previous response. The procedure ensures that the participants respond based on the general feeling and not by tracking the starting and stopping of pulses. In contrast with the distracted detection threshold study, where the participant has to pause the game on their own when they feel a sensation, here the user is prompted to respond. Therefore, we selected an addictive obstacle game that requires constant user attention [Helicopter, 2017]. The participants were introduced to the three pulses before starting the game. Given the nature of the squeezing sensation and the fact that user perception decreases as the stimulus duration increases, we hypothesize that the staggered pulse will perform better.

10 participants (3 female, all right-handed, age 20-27, mean age = 24.5), different from earlier studies, took part. Each participant did 2 trials for every duration and pulse combination. All trials were randomized. In total, we have 10 participants x 3 pulses x 2 durations x 2 trials = 120 trials.


Results

Figure 5.10 shows the mean accuracy. A two-way repeated measures ANOVA shows that the pulse type significantly affected accuracy (F(2,18) = 6.00, p < 0.05, η²partial = 0.4). Pairwise comparisons with Bonferroni correction show a significant difference between Continuous and Staggered (mean diff = 30%, p = 0.039). While no effect of time or time*pulse was observed, the higher 1min accuracy could be because participants were distracted and had more time (opportunities) to register what's happening with the ambient squeeze. Whether this effect continues for durations >1min needs investigation, as increments spread over longer durations might not be as easily perceivable.

Figure 5.10: Mean Accuracy % for all three pulses for both durations. Continuous pulse is least accurate. [95% CI]

Participants successfully recognized Holding and Staggered pulses with reasonable accuracy, but, expectedly, the Continuous pulse was not recognized well. Consequently, staggered pulses should be preferred for designing gradual progression. Deeper investigations into staggered pulses with different periodic increments will help expand them to larger time durations and optimize accuracy.

5.7.2 RingClench: HapticClench on a Finger

The small spring assembly can also be used on other body parts such as the finger. In addition to the obvious possibilities for squeezing multiple fingers, squeezing a finger is one of the most intimate ways of communicating emotions. No existing work has proposed finger squeezing feedback devices or studied squeezing sensations on the finger. We study the absolute detection threshold of a miniature HapticClench prototype at the bottom of the ring finger (Figure 5.7). The spring is the same, with a shorter length. We conducted the same absolute detection threshold study as conducted earlier for the wrist, with the same parameters. The same participants as in the earlier absolute threshold study were recruited to draw a direct comparison. They participated 7 days after they completed the JND study.

The mean threshold estimate came out to be 0.25kg (95% CI [0.18, 0.32]). This is higher than the wrist threshold.

5.8 Discussion

5.8.1 Visual+Haptic Feedback: Squeezing Bracelets

So far we have talked about squeezing sensations in bands that are tight around the skin to begin with. We can also use HapticClench's system to shrink loose bracelets that hang on the wrist. Figure 5.11 shows a faux-leather bracelet with the spring and circuit inside, which shrinks from a loose grip to a tighter one. In this case, the user does not feel the force as much as they simply feel the material sticking to the skin. The feedback is both visual and haptic. Since the restorative force of the skin is only enough to restore it to a less tight grip, the user simply pulls the bracelet to stretch it back to its loose form.

Figure 5.11: A loose bracelet squeezing into the skin

Such bracelets provide a visual+haptic stimulus which could be used for priority notifications that grab the user's attention. The visual component could be useful for providing socially meaningful notifications.

5.8.2 Design Guidelines for SMA Squeezing

Based on our design process, we suggest guidelines for designing squeezing sensations using SMAs – 1) Commercially available SMA wires do not provide enough load for satisfactory squeezing sensations and require a spring for restoration. SMA springs provide higher loads and do not require external restoration. 2) The actuation span, load capacity, and load throughput all have an effect on perception and must be fixed before designing the sensations. 3) SMA springs do not conform to a linear power vs. load curve. The power-load curve should be derived for a particular spring type to make the design process simpler. The curve might be subject to fluctuations based on ambient temperature, overheating, or physically extending the spring beyond its limits. 4) Thicker springs provide a very high load, but also consume and dissipate more energy and have higher cool-down times. Thinner springs can provide loads that are comfortably detectable, consume & dissipate less energy, and cool down quickly.

5.8.3 Design Guidelines for Squeezing Perception

Based on our studies, we suggest guidelines for the perceptual properties of squeezing hardware – 1) The detection thresholds (wrist: 0.16kg, ring: 0.25kg) and JND (95% JND: ∆L/L = 1.79) from our study should hold for squeezing sensations with similar spans even if they are generated using other wires or motor-based mechanisms. These can be directly used to get a sense of a wrist squeezing device's perceptual capabilities. 2) Rings require a higher load than wristbands, implying a higher minimum power. 3) For applications requiring multiple patterns, spatial patterns with three springs that remove the ambiguities of the middle sensation work well. Given reasonable JNDs, spatio-temporal patterns can also yield promising results. 4) For gradually increasing pulses, a staggered increase in load is more effective than a continuous increase. The step increases should be exponential in accordance with Weber's law. 5) However, spring temperature rises with time and needs to be guarded against when designing longer-duration pulses.

5.8.4 Limitations

As mentioned, SMA springs have increasing power and temperature constraints as the load requirement increases. The problem for continuous longer-duration actuations isn't the heated spring, but heat accumulation over time. For this, 1) a mechanism for gradual heat release needs to be devised; a holed insulation with a micro-fan can speed up this process, and 2) doing the actuation in smaller bursts rather than continued actuation will also accumulate less heat. Second, while the spring relaxes to its original length instantly, quick temporal pulses of a very high load are not possible because of higher cooling times. An easy workaround is to use multiple springs alternately to mirror repetitive actuation. Third, while the springs perform consistently in a controlled environment, we need more investigations to ensure they maintain their loads in the wild. Since Flexinol's contraction has a direct relationship with its temperature, a closed feedback loop with constant, accurate temperature measurements would be most useful. This won't be trivial.

5.8.5 Applications and Future Work

We have touched upon some of the applications of squeezing sensations. However, in-depth investigations need to be carried out to see how squeezing sensations, with or without SMAs, fare for different use cases. There are three specific areas where we intend to continue our work:

Replicating human squeezing attributes

One of the most intriguing applications is the reliable communication of human-generated squeezing sensations. If a user uses their own hand to send a squeeze of a particular duration and intensity variation to a friend, can that be exactly replicated for the friend? This requires gathering accurate data from the first user and converting it into a stimulus from the device that takes into account the perception of the receiving user. Further, the extent of replication needs to be investigated, which means replicating not just the overall intensity, but its subtle temporal variations and width.

Affective communication

Affective communication has been investigated for vibrations and ultrasound-based haptics. Squeezing sensations could be even more relevant for communicating different emotions, probably more so than vibrations. They could be used for communicating different classifications of emotions, such as the eight basic emotions - fear, anger, sadness, joy, disgust, surprise, trust, and anticipation. Alternatively, the dual axes of arousal and valence could be indicated by the intensity and the suddenness of the squeezing sensation. A reliable mapping of squeezing sensations to emotions could be very useful.

Ambient awareness

The SlowClench investigation suggests that squeezing could be useful for ambient temporal awareness, whereas vibrotactile feedback can get annoying when applied for a long time. Squeezing could be used in applications for tracking temporal progress, such as the progress of a microwave or other cooking equipment, of a download, of a journey, or of a friend's imminent arrival. It could even be used to signal ambient factors such as the outside temperature.


Deeper investigations into different contexts, tasks, and longer durations could shed more light on the capabilities of squeezing sensations for ambient use.

5.9 Conclusion

We investigated squeezing sensations using HapticClench, a system for generating squeezing feedback with SMA springs. To this end, we formalized squeezing feedback and its attributes. We described the design process of HapticClench and its load properties. We conducted psychophysical evaluations of squeezing using HapticClench that gave us the baseline detection thresholds for active and passive use, as well as the JND values. We further investigated different capabilities of HapticClench for gradual progression, spatial patterns, actuation on the finger, and squeezing loose bracelets to tighten onto the skin. We conclude by suggesting design guidelines for future squeezing investigations with and without SMA springs. Squeezing is one of the most intimate forms of human contact, but it has not seen much investigation in HCI. Our work intends to address that gap.

HapticClench dealt with expanding the vocabulary of touch output sensations that can be delivered by wrist-worn devices. However, the role of haptics in our interactions doesn't have to be limited to feedback. In the next chapter, we propose a paradigm that imports direct manipulation from the visual realm to the tactile realm. We investigate this paradigm for a wrist-worn device.


Chapter 6

Tactile Direct Manipulation in Wrist Wearables

6.1 Introduction

Engaging with visual displays is not always feasible or desired. They could be absent in certain contexts due to their fragility, cost or high battery consumption. Visual attention could be engaged elsewhere in an activity or be impaired due to disabilities. With the rise of wearables that remain in constant, sturdy contact with the skin, and the advances in tactile display capabilities, it is imperative for HCI to research tactile interactions that are as independent and as advanced as visual interactions. In this chapter we explore how we can make tactile output central to our interaction, beyond its use for assistive feedback and notifications.

The concept of direct manipulation [Shneiderman, 1983] underlies much of the advances in visual interactions in the past decades. Direct manipulation is an interaction system which represents task objects on a display (visual, in general) and allows the user to directly interact with them through actions whose effects are also continuously displayed. This continuous cycle of control and progress forms the core of direct manipulation.

The notion of direct manipulation in this form is absent in current tactile displays. A great deal of work in digital tactile displays is on tactile pattern recognition and feedback for interactions. In fact, sensory substitution techniques have successfully delivered tactile patterns akin to visual or auditory information [Maidenbaum et al., 2014, Lenay et al., 1997]. However, in almost all cases, tactile displays function as feedback or pattern notification tools and not as displays that enable direct manipulation of tactile objects in a Tactile User Interface. The advances in actuation technology and understanding of tactile perception encourage tactile interactions analogous to visual interactions.

We introduce the concept of direct manipulation for tactile displays. We describe the constructs of a tactile screen, tactile pixels, tactile pointer, and tactile targets. We describe a tactile indirect pointing interface that enables user actions like pointing, selecting, and drag & drop on tactile targets. Based on this conceptual framework, we build a tactile display for the wrist that demonstrates the feasibility of the concept. We establish the precision limits of this display via a target distinction study. We then construct a performance model for tactile target acquisition along the lines of Fitts' law. Finally, we investigate the use of a tactile-only menu application on the tactile display. The results show that the users are able to use the tactile indirect pointing interface and select, drag and drop tactile menu items with relative ease.

The contents of this chapter were published at CHI 2016 [Gupta et al., 2016c].

6.2 Related Work

Section 2.2.3 details the use of tactile feedback in wearable interaction. To contextualize our work in this chapter better, we revisit that classification: guidance, communication and information display, and assistive feedback. We further talk about exploratory tactile displays as something of a combination of information display and assistive feedback. For brevity, we mention 1-2 recent works that are indicative of each group.

Guidance is when guidance information is tactually conveyed to the user. Such systems can dynamically change the tactile response based on the user's movements to guide them better for purposes like motor learning [Schonauer et al., 2012]. The user, here, is not interacting with the computer, only being guided.

Information Displays display different spatial and/or temporal tactile patterns to convey different information to users. Typically, this involves either short-term alerts [Pasquero et al., 2011], or long-term information like progress bars [Brewster and King, 2005] or direction [Jin et al., 2014].

Assistive feedback is when the user is aided in an interaction by tactile feedback, such as eyes-free gestures [Lee and Starner, 2010], braille-reading [Nicolau et al., 2013] or color identification [Carcedo et al., 2016].

A unique class of tactile displays are Exploratory tactile displays, which can be thought of as a combination of information displays and assistive feedback. These require the user to explore them to convey continuous information in a tactile fashion. A braille book, for instance, is the easiest example. Other examples include conveying texture or shape information upon user exploration. Such displays mostly work when users use their fingers [Xu et al., 2011] to explore the display.

However, none of the above involve the user performing direct manipulation on tactile displays similar to visual displays. In a very simplified sense, however, DMTs can be thought of as exploratory tactile displays that can be instrumented on body parts other than the finger, and that separate the body part doing the exploration from the body part that feels the tactile response. This will become clearer in the next section.

6.3 Direct Manipulation in Tactile Displays

Imagine a pointer controlled by a regular computer mouse but which the user can feel moving around her wrist instead of seeing it moving on a screen. At a particular location on her wrist, she feels a change that indicates a target. She performs a double click, which executes an associated command. This is the tactile direct manipulation experience, which enables the loop of control and progress for tactile displays attached to the skin. In this section we develop the concept of direct manipulation for tactile displays by deconstructing the components that make direct manipulation in visual displays work end-to-end and adapting each of them for tactile displays. We illustrate the concept using the example of a wrist-mounted circular tactile display which eventually ties into our proof of concept (POC). However, the concept applies independently to any potential tactile display on any region of the skin.

There are two basic components of a visual direct manipulation system - display and input. Display refers to a combination of a screen and an interface. Our focus in this work is on Direct Manipulation-enabled Tactile displays (DMTs) for the body's tactile sense. Analogous to a visual screen, we define a tactile screen as a cluster of tactile pixels that enable tactile perception of system responses. A tactile screen consists of tactile objects that are combinations of tactile pixels.


An interface combines these pixels into objects that users can individually perceive. It then defines user actions to interact with these objects. Before we get to the tactile interface, we need to understand tactile perception first. With visual displays, the user is always aware of the overall state of the screen via a simple glance at the screen. This is fundamentally different from tactile perception. If multiple tactile objects in close proximity on the skin get simultaneously stimulated to make the user aware of the tactile screen's overall state, the user feels a single composite sensation and is not able to discern individual tactile objects accurately [von Bekesy, 1957]. The tactile sense requires active exploration to identify objects and their forms [Gibson, 1962]. Consequently, users need a way to explore or track objects on the tactile screen without performing actions on them.

This leads us to Indirect Pointing Interfaces where the input space is different from the display space. A tracking state where the system is aware of the tracking by the user and responds accordingly is an inherent property of indirect pointing interfaces as described in Buxton's 3-state model [Buxton, 1990a]. In direct touch interfaces there is typically no tracking state (unless an additional modality like pressure is used) and interaction with the screen results in an action on the objects. In indirect pointing, the virtual pointer tracks objects without performing actions on them. The pointer is perfect as the virtual tool using which a user can explore the tactile interface. Thus, we choose indirect pointing interfaces as the interface paradigm for DMTs. A tactile indirect pointing interface will analogously have a tactile pointer that allows tracking of the tactile screen, and tactile targets that are tactile objects that the user can act upon. The user actions will analogously be tactile pointing, selection & execution, and drag & drop.

We formalize the DMT by defining the tactile screen and tactile indirect pointing interface and their respective components. For the input, we illustrate the concepts using a touchpad, which we use in our POC as well. However, any indirect pointing device can potentially be used for DMTs.

6.3.1 The Tactile Screen

Analogous to a visual screen, the tactile screen has three defining attributes - shape (or geometry), size, and resolution. The tactile screen is composed of a cluster of individual pixels, termed tactile pixels, each of which has three defining attributes - coordinates, stimulus pattern range, and stimulus strength range.

For a tactile device worn or affixed on the skin, the shape and size of the tactile screen are defined by the region on the skin within which the device can convey tactile sensations. In contrast to visual screens, this could depend both on the device hardware, as well as the anatomy of the body part being instrumented. For example, for a stretchable wrist band consisting of tactile actuators, the exact shape depends on wrist anatomy, and size depends on the circumference of the wrist. However, for a tactile device to be used across multiple users with the same interface, these should have fixed values. An abstraction is required which fixes the effective shape and size for a particular device. For our POC, we fix it to a circular screen comprising 360 degrees. The absolute measure of each degree would be different for every wrist, but the abstraction allows a developer to make apps which can work within a certain range of wrist sizes.

Before we come to tactile resolution, we define a tactile pixel. A tactile pixel is the smallest tactile sensation that the device can distinctly modify on the tactile screen. This distinction might or might not be perceivable by the user, depending on the pixel size (as is the case with visual displays). The modifiable attributes are stimulus pattern & stimulus strength. Tactile resolution is the total number of tactile pixels on the screen. In our POC, we use phantom sensations such that four actuators around the wrist enable 360 tactile pixels, one for each degree, making its resolution 360x1. It is a one-dimensional (1D) tactile screen where the pixels are spread along the wrist circumference. Theoretically, in a very simple tactile display, an individual actuator could be the tactile pixel.

Every tactile pixel has a location coordinate depending on where it is on the screen. For the POC wristband, coordinates span from 0° to 360°. The stimulus pattern and strength attributes are analogous to the visual pixel's color and brightness. Both of them should be similarly distinguishable for certain discrimination thresholds within the stimulus's range. This would depend on the tactile actuation. For vibrotactile actuators, frequency and amplitude have been shown to have discrimination thresholds within a certain range and therefore the POC uses them for pattern and strength respectively.

6.3.2 The Tactile Indirect Pointing Interface

The indirect pointing interface consists of a tactile pointer and tactile targets. Interaction with tactile targets happens via tactile pointing, selection, execution, and drag & drop.

Tactile Pointer

The tactile pointer enables the user to navigate the tactile screen. To avoid conflicting sensations, the device's tactile sensation is stimulated only at the location coordinate of the pointer. We refer to this as the tactile response of the DMT. Initially, the pointer is at a certain coordinate on the screen. As the user controls the pointer using an input device, the pointer starts moving from its current location to the neighboring pixels until the user stops. The user feels this pointer movement as a tactile sensation going from one skin location to another depending on the speed of the user's control movement. In the ideal case, the user feels a precise point-like sensation going smoothly from one location on the skin to another.

The specific tactile sensation felt by the user at the pointer location depends on the tactile object that is at the current pointer location. The tactile pointer is only the virtual tool for navigating the screen and does not have a tactile response of its own. There are only two types of tactile objects: voids and targets. Targets are objects that the user can act upon. In places where there is no target, there is a void.

Tactile Targets & Voids

The tactile interface relies on users differentiating between i) a void from a target, and ii) a target from another target. While voids are all the same, targets can be different from each other. The tactile response of a void when the pointer is at its location should be perceived differently from that of a target. To achieve this, we dig into the attributes of a tactile object.

A tactile object is a combination of multiple pixels with the following properties - size, location, and the pixel attributes of all pixels in the object. Size and location are attributes derived from the pixel attributes of all pixels, since each of them has a coordinate. In a 2D tactile screen, the objects do not necessarily have to be rectangular, which means that aside from size and location, shape would also be a derived attribute. To keep things simple, that case is not addressed. Since voids and targets are next to each other, they need to be distinguished based on perceivable properties of the pixel like pattern or strength. Voids are assigned a low strength or pattern stimulus distinguishable from higher values reserved for targets. For the POC, we use a fixed lower frequency for every pixel on the void to signify when the pointer is over it, and a higher frequency for the targets.

The frequency and amplitude discrimination is relative. Since targets are explored one by one, the variety of different targets needs to be limited on the tactile screen or else the user will lose track of the relative distinction. For the POC, we use the simplest case where only the location property is used to distinguish between multiple targets. All targets have the same amplitude and frequency over all of their pixels. Figure 6.1 illustrates the void and target discrimination. In Figure 6.1(a), the pointer is over a void where the user can feel the void's low frequency tactile response. The user then moves the pointer on the screen, all the while feeling the void tactile response following the pointer movement. In Figure 6.1(b), when the pointer reaches a target, the user feels the target's high frequency tactile response. Going further, the user will encounter a void and then a second target, which the user can discriminate using localization.

Figure 6.1: A 1D circular 360x1 tactile display around the wrist. On the left, the tactile pointer is over a void whose tactile response frequency is represented in green. On the right, the user navigates the pointer to a target, where the tactile response frequency is different (orange).

However, the exploratory movement of the pointer on the screen adds another dimension to target discrimination besides its location. Unlike localization, where the user needs to localize a random tactile sensation, here the user starts moving from a certain location on the screen to eventually reach the desired target location and therefore has an additional temporal sense of how far the pointer has traveled. Further, users can devise mechanisms such as maintaining a count of their favorite targets from a fixed location and simply counting the targets to reach the desired one. In fact, users can use location, time, and counting in conjunction to guide their movements. We term this exploratory usage of finding targets that are differentiated only by location on a DMT as exploratory localization.

6.3.3 Tactile Indirect Pointing Interface Actions

Before describing user actions on tactile objects using the tactile pointer, we describe the input used for illustration: a multitouch screen used as a touchpad with no visuals. A finger down (FD) and move (FM) is equivalent to the mouse hover motion. A second finger tap while the first is down, followed by a lift up of both fingers (SFT), is the mouse left click. The second finger moving (SFM) while the first finger is down and static is a mouse move with the left button down. A second finger double tap (SFDT) while the first finger is down, followed by a lift up of both fingers, is a mouse double click. We could have used any input device, including the mouse, but this helps maintain consistency with our POC.

Each action description is supported by a state transition diagram. Every state is a unique state of the tactile pointer: over void (OV), over target (OT), over selected target (ST), dragging over void (DV), dragging over target (DT). The color of the state bubbles in the figures refers to the mode: Idle (Blue), Tracking (Light Green), Dragging (Yellow). Every state has a unique tactile response, which is shown by a green Rx label. The input action causing the transition is shown by a black label. When an action results in a command (Select, Execute etc.), the command is shown with a red label. Every command can either be a success or a failure. This is conveyed to the user via short instant pulses of tactile responses at the pointer location. These are different from state tactile responses and are termed ephemeral tactile responses. Each command's success and failure has a unique ephemeral response.


The ephemeral responses are played during a short pause after the tactile response of the current state stops and the next state starts.

Figure 6.2: State Transitions for Pointing

Pointing When no finger is on the screen, the pointer is idle (I - idle), and the tactile response is R0, which should be zero in normal instances. When the user puts the finger down, she feels the tactile response Rv of a void (OV) or Rt of a target (OT) depending on what's under the current pointer location. As the finger moves (FM), the pointer moves correspondingly over voids and targets and the user feels Rv/Rt depending on the state. The size of a target can be small or large and the pointer can keep moving over the same target for some time if the pointer is moving slowly. In our POC, we enable the distinction between Rv and Rt using frequency, but other attributes can also be used. On finger up (FU), the pointer is Idle again, but still maintains its last position.

Figure 6.3: State Transitions for Target Execution & Selection

Target Execution and Selection When over a target (OT), the user executes it by doing a double tap with the second finger and lifting up both fingers (SFDT). An alternate way to execute a target under the pointer while in the idle state, without going to the intermediate OT state, is to do a double finger double tap (DFDT). Execution is a command action which results in the ephemeral response Re. Re has two possible values, one each for success and failure. The tactile response after execution is Idle because no fingers are on the screen. The ensuing responses will depend on what happens after target execution. For instance, a target can be an icon to an application that takes over the entire screen or it could simply be a button that performs a task (like volume up) while the screen remains the same.

Similarly, after a finger down (FD), a second finger tap and lift off of both fingers (SFT) when the pointer is in OT selects the target (ST). Doing the same action in ST deselects the target. The alternate way while in the idle state is a double finger single tap (DFST) (not depicted in the figure). The select/deselect command results in the ephemeral response Rs. Again, Rs has two possible success and failure values. It is not necessary to keep Rs different from Re since the user knows the action she performed and perceives success or failure for that action. The tactile response in ST is R0 because the fingers are up immediately after selection.

Figure 6.4: State Transitions for Dragging

Target Manipulation: Dragging When a target is selected (ST), the user drags the target by moving the second finger (SFM). The pointer can drag the target over a void (DV) with response Rdv, or over another target (DT) with response Rdt. Dragging is a different mode of pointer movement than tracking, and therefore the responses in DV and DT are Rdv and Rdt, which can be different from Rv and Rt, but don't necessarily have to be. The user eventually drops the target by lifting both fingers up (FU) at the desired location. The drop command results in the ephemeral response Rd, which again has success and failure values.

We see that the tactile response of the display depends on three variables: whether the pointer is over a void or a target (Rv, Rt), whether the pointer is in tracking mode or dragging mode (Rdv, Rdt), and whether a command was invoked (Re, Rs). The first two are continuous responses as long as the pointer is in that state, while the third is a short-term response signal. While our focus is on the tactile responses of each state and action, the state transitions with inputs help us understand the end-to-end interaction. These transitions can change depending on the input mechanism used. A complete state machine of a DMT can be constructed by combining the three figures. We now have the basic building blocks of an indirect pointing interface which can be used to create higher level widgets such as lists, menus, sliders, etc.
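As a rough illustration only, the sketch below combines the three diagrams into one simplified transition function. It omits deselection in ST and the alternate idle-state shortcuts (DFDT/DFST), and the event and function names are ours rather than part of the POC software.

    from enum import Enum, auto

    class S(Enum):
        IDLE = auto(); OVER_VOID = auto(); OVER_TARGET = auto()
        SELECTED = auto(); DRAG_VOID = auto(); DRAG_TARGET = auto()

    # Continuous tactile response per state, as frequencies (Hz) from the POC.
    STATE_RESPONSE = {S.IDLE: 0, S.SELECTED: 0,                 # R0
                      S.OVER_VOID: 75, S.DRAG_VOID: 75,         # Rv, Rdv
                      S.OVER_TARGET: 320, S.DRAG_TARGET: 320}   # Rt, Rdt

    def transition(state, event, over_target):
        """Return (next_state, command) for one input event.

        event is one of 'FD', 'FM', 'FU', 'SFT', 'SFDT', 'SFM'; over_target says
        whether the pointer currently sits on a target; command is 'select',
        'execute', 'drop' or None (its success or failure is signalled
        separately via the ephemeral responses)."""
        hover = S.OVER_TARGET if over_target else S.OVER_VOID
        drag = S.DRAG_TARGET if over_target else S.DRAG_VOID
        if state == S.IDLE and event == 'FD':
            return hover, None
        if state in (S.OVER_VOID, S.OVER_TARGET):
            if event == 'FM':
                return hover, None
            if event == 'FU':
                return S.IDLE, None
            if state == S.OVER_TARGET and event == 'SFDT':
                return S.IDLE, 'execute'
            if state == S.OVER_TARGET and event == 'SFT':
                return S.SELECTED, 'select'
        if state == S.SELECTED and event == 'SFM':
            return drag, None
        if state in (S.DRAG_VOID, S.DRAG_TARGET):
            if event == 'SFM':
                return drag, None
            if event == 'FU':
                return S.IDLE, 'drop'
        return state, None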

6.3.4 Control & Progress in Tactile Displays

As described, while the adaptation of direct manipulation for the tactile context requires delving into the specific tactile affordances, the theoretical construct of DM remains the same. Looking at direct manipulation through a tactile lens, Control & Progress are the two salient properties that distinguish DMTs from other existing work in the tactile space. Control refers to the user being in control of the objects on the tactile display, which governs the tactile responses that the user feels. Progress refers to the ability of the tactile display to update continuously in real time, in line with the user's actions, informing her about her progress. In effect, the user engages in a cycle of control and progress so that later input actions depend on earlier output responses and vice versa. The user controls the pointer, whose progress is continuously conveyed tactually to the user, which, in turn, guides the user's further navigation of the pointer.


6.4 Proof of Concept

To demonstrate that DMTs can be functional, usable systems, we build and study a prototype 1D DMT display around the wrist. With the popularization of wrist wearables that remain in constant, sturdy skin contact, the wrist presents a fitting space for tactile interventions. Prior studies show that wrist regions support rich tactile stimulation [Matscheko et al., 2010a]. We use vibrotactile stimulation for its proven phantom sensation capabilities [Israr and Poupyrev, 2011]. We now describe how the tactile screen, pointer, targets, and actions are realized in our POC. In indirect pointing, movements in the input space are mapped to display space movements via a transfer function, which is described later.

6.4.1 Tactile Screen

A circular 1D tactile screen is apt for the wrist circumference and efficiently demonstrates the concept. Assuming the wrist to be a crude circle, the screen size is set to 360 degrees. We use only a subset of values within these ranges. The hardware consisted of four EAI C2 actuators, called tactors, attached to an elastic sports wristband, which has an Arduino, Bluetooth, battery, and a signal amplification board (Figure 6.5). The actuators were placed at the top (0°), left (90°), bottom (180°), and right (270°) positions on the wrist. The device allowed tactor amplitude, frequency and duration control via Bluetooth commands. We paired it with an LG G Android watch on the wristband whose touchscreen was the input touchpad. Since the tactile display is 1D, the touchpad registered finger motion in only one dimension, parallel to the wrist circumference (Figure 6.5). The pointer moved in the direction corresponding to the direction of finger movement. Looking at Figure 6.5, if the finger moved in the thumb-to-little-finger direction on the watch touchpad, the pointer moved in the 0° to 90° direction, and similarly in reverse for the reverse finger movement. During tactile interactions, the touchpad showed no visuals. During experiments, the touchscreen was used between trials for the logistics of the experiment. The tactor frequency range is 0-400Hz. The C2 tactor prescribes an amplitude range between 0-255. We only use the maximum amplitude and a select few frequencies for our DMT.

Figure 6.5: (left) Wristband, (middle) Actuator positions, (right) Study 1 sample layout

6.4.2 Tactile Pointer Implementation

We simulate a tactile pointer whose movement can be controlled over every pixel and felt continuously on the skin. The continuity is crucial to the feeling of smooth pointer movement and something like an array of mini actuators would not be ideal. We utilize the funneling sensory illusion [von Bekesy, 1957]: the simultaneous stimulation of two tactile actuators in close proximity on the skin results in a single illusory stimulation at a location between the real actuators. These are called phantom sensations. While phantom sensations have been extensively explored to generate localized stimulation, their use for stimulating continuous tactile movement has been limited to two actuators [Rahal et al., 2009] which simply generate a quick pulse in a particular direction. They do not go into the details of movement speed and precision, as well as of giving the control of the sensation to the user.

The aim of the tactile pointer is to traverse over every pixel with a speed controlled by the user. To stimulate a tactile response at the 45° tactile pixel, the actuators at 0° and 90° are played at equal intensities. A tactile pixel at θ° is stimulated by varying the amplitude of the two actuators between which θ° lies. We adapt Israr et al.'s point stimulation algorithm [Israr and Poupyrev, 2011] for our screen to get the amplitudes:

A1(θ) = A sin(θ),  A2(θ) = A cos(θ)

where A1 corresponds to the actuator at the lower angle among the two. A is the effective amplitude of the stimulated pixel, which is preset and follows the formula A = √(A1(θ)² + A2(θ)²). We fix A at the C2 maximum of 255 for all vibrations. Stimulating adjacent pixels in rapid succession based on user control produces the sensation of the tactile pointer moving around the wrist.
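A minimal sketch of this pixel stimulation follows. It assumes θ in the formula is the pixel's angular offset within the 90° segment between the two neighbouring tactors; that interpretation, and the helper names, are ours rather than part of the POC.

    import math

    A_MAX = 255                        # C2 amplitude ceiling used in the POC
    TACTOR_ANGLES = (0, 90, 180, 270)  # tactor positions around the wrist (degrees)

    def tactor_amplitudes(theta_deg, A=A_MAX):
        """Amplitudes for the two tactors bracketing the pixel at theta_deg.

        Implements A1 = A*sin(theta), A2 = A*cos(theta), so that
        A1**2 + A2**2 == A**2 for any offset within the segment."""
        theta_deg %= 360
        lower = (int(theta_deg) // 90) * 90       # tactor at the lower angle
        upper = (lower + 90) % 360                # tactor at the higher angle
        offset = math.radians(theta_deg - lower)  # 0 .. pi/2 within the segment
        return {lower: A * math.sin(offset), upper: A * math.cos(offset)}

    # The 45-degree pixel drives the 0-degree and 90-degree tactors equally.
    print({k: round(v) for k, v in tactor_amplitudes(45).items()})  # {0: 180, 90: 180}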

6.4.3 Tactile Target Implementation

To implement tactile targets, we designate different actuator vibration frequencies for the target and the void. The user tracks the pointer over voids (OV) using phantom vibrations playing at a frequency Fv = 75Hz for the C2 tactors. When the pointer is over a target (OT), the vibrations switch to Ft = 320Hz, which is perceived differently from Fv.

6.4.4 Tactile Interface Actions Implementation

The difference in tactile responses depends on frequency. Rv & Rdv have the same frequency Fv, and Rt & Rdt have Ft. The ephemeral responses Re, Rs, and Rd all have the same success response and the same failure response. Success is one quick pulse and failure is two quick pulses. Success response sequence: 100ms pause, 300ms pulse at 320Hz, 100ms pause. Failure response sequence: 100ms pause, 100ms pulse at 320Hz, 100ms pause, 100ms pulse at 320Hz, 100ms pause.
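For illustration, the two sequences can be written as simple (duration, frequency) segment lists. The set_tactor call below is a hypothetical driver function, not part of the actual firmware.

    import time

    # Ephemeral responses as (duration_ms, frequency_hz) segments; 0 Hz = pause.
    SUCCESS = [(100, 0), (300, 320), (100, 0)]
    FAILURE = [(100, 0), (100, 320), (100, 0), (100, 320), (100, 0)]

    def play(sequence, set_tactor):
        """Play a segment list at the pointer location.

        set_tactor(freq_hz) starts vibration at the given frequency (0 stops it);
        a blocking sleep is used here purely for simplicity."""
        for duration_ms, freq in sequence:
            set_tactor(freq)
            time.sleep(duration_ms / 1000.0)
        set_tactor(0)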

6.4.5 Transfer Function

Our transfer function maps the watch touchpad pixels to display degrees. The scale factor between movement velocities in the two spaces is termed the CD gain = Vdisplay/Vinput [Casiez et al., 2008]. We use a discrete logistic function approximation as this gain:

gain = gl,  if vinput < vl
gain = gu,  if vinput > vu
gain = gl + (gu − gl) × (vinput − vl)/(vu − vl),  otherwise

Distance moved by the tactile pointer is: ddisplay (degrees) = gain × dinput (mm). gl = 0.5, gu = 1.29, vl = 113px/s, vu = 10,000px/s were fixed after a pilot. All values in the implementation were arrived at after a series of pilots with four users.
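A minimal sketch of this mapping using the pilot constants above follows; the linear interpolation in the middle branch reflects our reading of the piecewise definition.

    def cd_gain(v_input, g_l=0.5, g_u=1.29, v_l=113.0, v_u=10_000.0):
        """CD gain for an input velocity (px/s on the watch touchpad)."""
        if v_input < v_l:
            return g_l
        if v_input > v_u:
            return g_u
        # Linear interpolation between g_l and g_u for intermediate velocities.
        return g_l + (g_u - g_l) * (v_input - v_l) / (v_u - v_l)

    def pointer_delta(d_input_mm, v_input):
        """Degrees moved by the tactile pointer for a finger movement of d_input_mm."""
        return cd_gain(v_input) * d_input_mm

    # Example: a 5 mm finger move at a moderate speed of 500 px/s.
    print(round(pointer_delta(5.0, 500.0), 2))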

While localization has been studied on the wrist [Matscheko et al., 2010b, Chen et al., 2008], DMTs use exploratory localization, which potentially enables users to work with more targets. We study how precisely the DMT allows users to move and perceive the pointer and make a distinction between targets. We then study target acquisition in DMTs and formulate a DMT performance model. Finally, we study a DMT menu application and see how users perform and use the DMT.

6.5 Study I: Movement & Target Distinction

Phantom sensations and frequency discrimination worked well during our pilot studies. However, the limits of movement precision and accurate target discrimination that these enable in a DMT need to be studied. We conducted a study where participants were asked to count a random number of targets on the screen. This required participants to control their speed of movement depending on the number of targets present. If the system allows precise movement, accurate perception, and precise target discrimination, participants should be able to count large numbers of targets accurately.

The participants did multiple trials. The initial number of targets to be counted was 4. For the next trial, the count was exponentially increased (e.g., round(4 targets × 10^0.5) = 13 targets) or decreased based on whether the answer was correct or not. A change from increasing to decreasing targets, and vice versa, is referred to as a reversal. After the first 3 reversals, the exponential factor was reduced from 0.5 to 0.1. As the target count increased, the corresponding target size and distance between targets decreased, thus requiring the user to be more controlled in their movement to perceive targets differently from the void and from other targets. The large initial factor ensured faster convergence of the target count to the participant's performance threshold. The smaller value later ensured fine resolution of the threshold estimate. The experiment was terminated after 8 reversals at the smaller value. The average number of targets from these 8 reversals was taken as an estimate of the threshold level. This threshold represented the upper limit of the target count which can be usefully put on the tactile screen.
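The adaptive procedure can be sketched as follows; this is a simplified reconstruction, and the callback-based structure and names are ours rather than the study software.

    def run_staircase(count_is_correct, start=4, big=0.5, small=0.1,
                      switch_after=3, stop_after=8):
        """Adaptive staircase from Study I (sketch).

        count_is_correct(n) runs one trial with n targets and returns True if
        the participant counted correctly. The count is multiplied (correct)
        or divided (incorrect) by 10**factor; after `switch_after` reversals
        the factor drops from `big` to `small`, and the threshold estimate is
        the mean count over the next `stop_after` reversals."""
        count, factor = start, big
        last_direction, reversals, small_reversal_counts = None, 0, []
        while len(small_reversal_counts) < stop_after:
            correct = count_is_correct(count)
            direction = 'up' if correct else 'down'
            if last_direction is not None and direction != last_direction:
                reversals += 1
                if factor == small:
                    small_reversal_counts.append(count)
                if reversals == switch_after:
                    factor = small
            last_direction = direction
            count = max(1, round(count * 10 ** (factor if correct else -factor)))
        return sum(small_reversal_counts) / len(small_reversal_counts)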

Equal-sized targets were uniformly distributed on the screen. The ratio between target and void size was kept constant at 1:2. So, as the count increased, target size decreased. The pointer started at the origin at 0°. Figure 6.5-b shows the 8-target layout. The experiment ran in an app on the smartwatch. Since the task only involved pointing and not target selection, a double finger tap was used to stop a trial, which led to a visual screen on the touchscreen for the user to enter the answer for that trial. On pressing Ok on this screen, the blank touchpad appeared again and the next trial began.

7 participants (6 male, mean age = 25.6 years) with wrist sizes within 150-178mm took part. Participants were recruited using flyers on a university campus. Participants wore the wristband on the left hand, which is usually the watch hand. They also wore headphones playing pink noise to mask tactor sounds. Each session lasted 15-25 minutes. The session started with a system demo, followed by the task. The instruction was to “give an accurate count and try to be as quick as possible”. The experiment began after a dummy trial with 2 targets.

6.5.1 Results

The mean threshold count was 19.1 (SD = 7.2). This is lower than the mean of the maximum counts that the participants got right - 29.6. Essentially, for a normal wrist of 160mm, a target at every 8mm is recognizable in normal use, which is quite precise. Therefore, phantom sensations were found to work well with frequency discrimination to enable precise movement and target discrimination on the DMT. No correlation was found with wrist size.


6.6 Study II: A Performance Model for DMTs

Reliable prediction models of user performance are imperative to understand device capabilities and human limits. Movement Time (MT) models for target acquisition have been studied using Fitts' law for a number of input devices where the output is a visual display. Fitts' law relates MT to target width (W) and distance (A): MT = a + b log2(2A/W). Tactile displays are different from visual displays. They do not give an overview of the screen at a glance and are subject to variances in perception. Whether the same model and dependencies hold for tactile displays is a nontrivial question that needs to be investigated.

A target acquisition task has a ballistic phase, where the pointer moves to the general region of the target, and a corrective phase, when the pointer makes small movements to reach the exact location. The ballistic phase is responsible for large distances leading to higher MTs. The corrective phase is responsible for smaller widths leading to higher MTs. We hypothesize that this general rule should apply to the tactile display - however, perception variations of different target positions could be a significant factor that affects MT and its relationship with width and distance. We conducted a target acquisition study that asked the following questions:

• How do target width (size) & distance affect MT in DMTs?

• Is there an effect of target position on MT in DMTs?

• What is the target acquisition performance? If and how does it improve over time for DMTs?

The last question informs the performance metrics of our specific wrist DMT; it will also give us a sense of what to expect from such systems in general. The basic relationship between MT and width, distance, and target position that the experiment yields should hold more generally for DMTs.

6.6.1 Experiment Method

A point and select task was designed with the following independent variables: DISTANCE (3 levels), WIDTH (3 levels), POSITION (8 levels) and BLOCK (4 levels). 9 participants, different from the previous study (6 male, 8 right-handed, mean age = 29.8), took part. The experiment lasted 20-30 minutes.

We followed the no-distractor target acquisition study design [Douglas et al., 1999], where during a single trial, only one of the positions functioned as the target. There were eight fixed targets, which allowed exploration of all wrist regions and kept the experiment duration reasonable. The mid-points of the targets were at 0°, 45°, 90°, 135°, 180°, 225°, 270° & 315°. The three width levels were selected such that the void width:target width ratios were 1, 3, and 5. After rounding off, the widths were 24°, 13° and 9° respectively. For instance, a target at 90° with a width of 13° spanned 84°-96°. The log scale separation in width levels was in line with earlier pointing studies [Douglas et al., 1999].

Participants needed to go from one target to the next without reinitializing the pointer. As is standard practice [Douglas et al., 1999], distance was the number of degrees between the mid-points of the previous target and the current target. The three distance levels were 45°, 90° & 180°. Since the display was circular, the minimum distance between two targets cannot exceed a maximum of 180°. The user could go in either direction, and figuring out the optimal direction was part of user performance. The final evaluations used the minimum distance, no matter which direction the user took. In total, 8 x 3 x 3 = 72 conditions (trials) = 1 block. We study 4 blocks of 72 trials each, with the first one intended as a training block.


6.6.2 Experiment Design

A within-subjects study was run for all variables. The sequence of 72 conditions in a block was such that, starting with the first, each condition led to the next, while ensuring that no position, width, or distance value appeared consecutively. 36 such sequences were designed for the 9 × 4 = 36 blocks.

The experiment ran in a smartwatch app. The task was designed to be analogous to a visual target acquisition study. As soon as participants made a selection for the current target, ephemeral responses informed them of their success or failure. After a 1s delay, participants were informed of the next target by a vibration pulse at the target position. This utilized the participants' ability to coarsely locate a stimulation on the skin. We hypothesized that as blocks progressed and participants located the positions repeatedly, they would start making quick shorthand finger movements as opposed to active exploration.

The task was preceded by a demo and a dummy trial. Participants were instructed to work as fast as possible while still maintaining high accuracy [Douglas et al., 1999]. The first trial of every block began with the pointer at 0°. For subsequent trials the pointer position was where the participant left the pointer in the previous trial. Participants were prompted to take a break after every 24 trials, with the pointer position at 0° after the break. Participants wore headphones playing pink noise. In total, we had 9 PARTICIPANTS x 4 BLOCKS x 3 DISTANCES x 3 WIDTHS x 8 POSITIONS = 2592 measurements.

6.6.3 Results

For 3 blocks of trials (excluding the training block), the mean MT per trial was 3.36s (SD = 0.48). The mean accuracy was 93.7% (SD = 9.21). A four-way repeated measures ANOVA found significant main effects of width, distance, and block on MT: block (F(2, 16) = 7.292, p < .01, ηp² = .477), distance (F(2, 16) = 17.944, p < .01, ηp² = .692), width (F(2, 16) = 40.708, p < .01, ηp² = .836). No interaction effects were found. We now report on participant performance and improvement, followed by the effects of width, distance and position, and finally propose a performance model.
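A sketch of how such a repeated measures analysis can be run in Python with statsmodels, assuming a long-format trial log; the file name and column names are hypothetical, not from the study software.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format trial log with columns:
# participant, block, distance, width, position, mt (column names are ours).
trials = pd.read_csv("trial_log.csv")
trials = trials[trials.block > 1]  # exclude the training block

# Average MT per within-subject cell, then run the repeated measures ANOVA.
cells = (trials.groupby(["participant", "block", "distance", "width"], as_index=False)
               .mt.mean())
print(AnovaRM(cells, depvar="mt", subject="participant",
              within=["block", "distance", "width"]).fit())
```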

Figure 6.6: Influence of target width, distance, block and position on pointing time

Performance and Improvement

A 93.7% accuracy is promising for an interface relying completely on the tactile sense. The MT of 3.36s, however, is relatively slow. A post-hoc test with Bonferroni correction for block showed a significant difference between blocks 2 and 4 (p < 0.05). Figure 6.6c shows that mean MT decreases as the blocks progress. The mean accuracy in the initial training block was low at 85.03%, with a high standard deviation of 24.6. We conducted a multiple logistic regression analysis of the success output of all four blocks and found a significant overall effect of block on accuracy (Wald = 51.007, p < .01). Figure 6.7 (right) shows a sharp increase in accuracy after the first block, followed by a relative stabilizing of the accuracy. Both the speed and accuracy curves show that despite initial friction, users get better with practice on the DMT.

Figure 6.7: Influence of width and block on accuracy

Effects of width and distance

The post-hoc tests showed a significant difference between the MTs of all 3 width pairs: 9° & 13° (p < .05), 13° & 24° (p < .01), 9° & 24° (p < .01). Figure 6.6-a shows the mean MTs decreasing as the width increases. The 9° width translates to a minuscule 4mm target for an average 160mm wrist. MT reduces by an entire second from a 9° target to a 24° target. Thus, limiting minimum target widths on DMTs is recommended.

We ran a multiple logistic regression to model the binary success output based on the four variables. A significant effect of width on accuracy was found (Wald = 14.547, p < .01, βwidth = .065, OR = 1.067). Figure 6.7 (left) shows accuracy increasing with width. Even for a target as small as 4mm, accuracy is at a healthy 91.5%. Aside from requiring more corrective adjustments, acquiring the smallest targets was challenging because of constant clutching. Participants found it difficult to gauge whether the pointer slightly slid off the target when lifting the finger. This issue is addressed in the next study.
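A sketch of such a logistic regression in Python with statsmodels, assuming a per-trial log with a binary success column; the file and column names are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical per-trial log with a binary 'success' column (names are ours).
trials = pd.read_csv("trial_log.csv")

# Multiple logistic regression of trial success on the four design variables.
model = smf.logit("success ~ width + distance + block + C(position)", data=trials).fit()
print(model.summary())
print("odds ratio for width:", float(np.exp(model.params["width"])))
```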

For distance, the post-hoc tests showed a significant difference between the MTs of 45° & 180° (p < .01), and 90° & 180° (p < .01). Figure 6.6-b shows that the mean MT increases with distance. Between 45° & 90°, a 45° difference is not enough to significantly impact MT due to the fast ballistic movement. Participants took the optimal direction for minimum distance in 93.7% of trials (excluding the symmetric 180° distances). Even without an overview of the display, participants knew the shortest-path direction. This demonstrates their use of exploratory localization, where they select a direction based on the target location and then explore locally to find the target. The effects of width & distance confirm our hypotheses: the respective inverse and direct relations with MT hold true for tactile displays.

Effects of Position

There was no position effect in the initial analysis. However, participants noted that perception was better in certain regions. We re-ran the analysis, this time grouping the positions into categories. A significant effect of position on MT was found for the categorization Anchor (0°, 90°, 180°, 270°) & Intermediate (45°, 135°, 225°, 315°) (F(2, 8) = 7.608, p < 0.05, ηp² = .487). Figure 6.6-d shows the anchor positions had a lower MT than intermediates. This has two possible explanations. First, the actuators were placed at anchors. Second, the wrist top, bottom, and sides are natural anchors on the skin which a user uses to gauge relative pointer position; this increases tactile acuity due to the anchor effect [Cholewiak and Collins, 2003]. Looking at the difference between means and the effect sizes (ηp²), the effect of width & distance is stronger than that of position. Therefore, while position should be a design consideration for DMTs, it does not supersede width & distance. The assertion, of course, is based on the wrist DMT and generalizations should be carefully considered.

Performance Model

We fit a variant of Fitts' law for MT in DMTs. A multiple linear regression was run for MT based on log2(2A/W), the combined width-distance variable from Fitts' equation, and P, a nominal variable with the values Anchor/Intermediate. To model end performance, the test was run on the last block. It gave the following equation for MT (F(2, 646) = 1196.781, p < .001, R²(adjusted) = .787):

MT = b1 log2(2A/W) + b2 P

Here b1 = .802 and b2 = .120. Both variables added significantly to the prediction, p < .001. The model adheres to Fitts' law with an additional position term. These values are obviously affected by the choice of touchpad as input. It is a reasonable claim that DMTs will generally follow the width & distance relationships with movement time while adhering to Fitts' law. Position's effect is less impactful, but it is subject to the body part and actuation and should not be discarded.
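A sketch of how this zero-intercept fit could be reproduced in Python, assuming a last-block trial log; the file name, column names, and the anchor/intermediate coding are our own assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical last-block log: mt (s), amplitude A (deg), width W (deg),
# and position labelled 'anchor' or 'intermediate' (names are ours).
df = pd.read_csv("block4_log.csv")
df["ID"] = np.log2(2 * df["amplitude"] / df["width"])     # Fitts index of difficulty
df["P"] = (df["position"] == "intermediate").astype(int)  # nominal position term

# Zero-intercept model MT = b1*ID + b2*P, mirroring the equation above.
fit = smf.ols("mt ~ 0 + ID + P", data=df).fit()
print(fit.params, fit.rsquared_adj)
```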

Overall, while target acquisition in DMTs adheres to Fitts' law, the fit is not as strong as it is for visual displays. A visual target acquisition is composed of a big ballistic movement followed by smaller corrective movements to reach the target. Here, the ballistic movement is present; however, the search in exploratory localization is not the same as corrective movements. A more complete model for DMTs would account for this search phase and remains open to research.

6.7 Study III: A Tactile Menu Application

We have investigated the limits of tactile target distinction and established a performance model for DMTs. These show that DMTs are capable and usable for direct manipulation. The next question is how well users can perform in an actual application. The question also pertains to how users get started with such a tactile interface and whether it can be used independently, without a visual display. To investigate this, we study a DMT menu application where the users perform tactile target execution and drag & drop. The study starts with a static visual aid (Figure 6.8) to help users understand the visual-to-tactile mapping. The aid is subsequently removed to study independent DMT use. Eight participants, different from the previous studies (all male, mean age=26.5), took part. The experiment lasted 75 minutes. Participants were debriefed following the study and were asked to rate the physical and mental demand of the tasks on a five-point Likert scale.

6.7.1 Design

The menu application is modeled after a frequent contacts list. Two menu sizes were studied: 4 items (targets) and 8 items (Figure 6.8). The study was divided into 4 stages in the following order: item execution for 4 items, item execution for 8 items, drag & drop for 4 items, and drag & drop for 8 items. This progression of stages enabled the users to carry their learning over from one stage to the next.

For each stage in item execution, users performed 15 blocks. With 4 items, each block consisted of 3 execution trials, with Bob, Dan, and Helen randomly distributed in each. With 8 items, each block consisted of 4 trials, with Bob, Dan, Emma, and Gary randomly distributed. The 15 blocks were divided into 3 sessions of 5 blocks each. In blocks 1-5, the users could see a static visual map of the menu on a desktop (Figure 6.8). In blocks 6-10, the map was hidden and was shown only upon request in a trial. In blocks 11-15, the visual map was not shown at all. Each individual session took 1-3 minutes. As evident, users were presented with only 3 out of 4 items, and 4 out of 8 items. Limiting the items follows the idea that not all menu items are accessed equally and some are rarely used. The screen still had 4 or 8 items depending on the stage, all of which the users felt during a trial. Item size was fixed at 11°. Before each trial, users were shown the name of the item to be executed on the watch. Pressing "Ok" on the touchscreen started the blank touchpad. The tactile pointer started at 22° for every trial. Upon the execution action, ephemeral responses informed users of their success or failure.

The drag & drop task was dragging a source item to a destination item. For each stage in drag & drop, users performed 10 blocks of 2 trials each. With 4 items, the two trials had mutually exclusive random combinations of source & destination items, thus involving all 4 items. It was similar for 8 items, using the 4 items Bob, Dan, Emma, and Gary. Since users were familiar with the menu layouts after the execution tasks, the map was shown only upon request for a trial in blocks 1-5, and no visual map was allowed in blocks 6-10. Each individual session lasted 2-4 minutes. Before each trial, users were shown a text instruction, for example, "Drag Bob to Dan".

6.7.2 Task Input

Based on user feedback, we modified the input mechanism for efficiency. Finger movement in the earlier studies was along the single touchpad dimension parallel to the wrist circumference. However, as mentioned earlier, constant clutching was a problem. To remove clutching, the input was modified to a circular path on the touchpad, close to the bezel, like an iPod wheel, where the pointer movement corresponded to the degrees traveled on the circular path. A constant CD gain of 0.5 was fixed: a 2° movement on the touchpad led to a 1° movement on the tactile screen. The input actions for drag & drop were changed accordingly: the user selects the source item using a double-finger single tap, then moves it with a single finger down & move. The item is dropped when the user lifts the finger.
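A minimal sketch of this wheel-style mapping with the 0.5 CD gain; the function names and coordinate conventions are our own illustrative assumptions, not the study implementation.

```python
import math

CD_GAIN = 0.5  # a 2-degree movement on the touchpad wheel -> 1 degree on the tactile screen

def touch_angle(x: float, y: float, cx: float, cy: float) -> float:
    """Angle (degrees) of a touch point on the circular wheel, measured about the pad centre."""
    return math.degrees(math.atan2(y - cy, x - cx)) % 360

def update_pointer(pointer_deg: float, prev_touch_deg: float, touch_deg: float) -> float:
    """Advance the tactile pointer by the signed, wrap-aware degrees travelled on the wheel."""
    delta = (touch_deg - prev_touch_deg + 180) % 360 - 180  # shortest signed difference
    return (pointer_deg + CD_GAIN * delta) % 360
```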

Figure 6.8: Study 3 Menu Layout: 4 and 8 items. Blue regions are items. Dot partitions are activation zones for contained items. Pointer starts at 22°.

A visual menu does not have spaces between different items. However, our DMT needs the void between two items for item distinction. To solve this, an activation zone was designated for each target, so that users could perform actions (execute/select/drop) for an item anywhere within that item's zone, even though the item itself was felt only in its 11° span. This activation zone is the arc within the dotted lines around an item (Figure 6.8). This ensured menu-like functionality while keeping the item distinction intact. It also addresses the problem of incorrect answers when the user is only off by a few degrees. The experiment procedure was similar to the previous study. For 8 PARTICIPANTS, for each stage (BLOCKS x TRIALS): 8 × (15 × 3 + 15 × 4 + 10 × 2 + 10 × 2) = 1160 trials.
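A small sketch of the activation-zone hit test described above, assuming equally sized zones centred on each item; the item mid-points below are illustrative, not the exact layout in Figure 6.8.

```python
def active_item(pointer_deg: float, item_midpoints: list) -> int:
    """Index of the item whose activation zone contains the tactile pointer.

    Assumes equally sized activation zones centred on each item's mid-point, so
    every pointer position maps to exactly one item even though the felt item
    spans only its 11-degree arc.
    """
    def arc(a: float, b: float) -> float:
        d = abs(a - b) % 360
        return min(d, 360 - d)

    return min(range(len(item_midpoints)),
               key=lambda i: arc(pointer_deg, item_midpoints[i]))

# Illustrative 4-item layout with items centred every 90 degrees.
print(active_item(100.0, [45.0, 135.0, 225.0, 315.0]))  # -> 1
```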

6.7.3 Results: Menu Application Study

Out of the 440 trials which allowed visual aid upon request, it was asked for in only 5 trials, showing that the visual-to-tactile mapping was quickly understood by the users. This was, of course, aided by the alphabetical design of the menu.

Figure 6.9: Study 3 Menu Application Results: Mean Accuracy for Execution and Drag & Drop tasks

Item Execution

The mean item execution accuracy % for 4 & 8 items over the three sessions (with visual aid, visual aid upon request, and without visual aid) is shown in Figure 6.9. For 4 items, accuracy remains relatively consistent at > 95% for all three sessions. For 8 items, the accuracy is relatively lower, starting at 86.3% in the first session and ending at 92.5% in the third.

The mean item execution time is shown in Figure 6.10. For 4 items, execution time started out at 5.2s in the first session and improved to 3.1s in the final session. For 8 items, it went from 5.8s to 4.5s. The effect of session on execution time was significant for both: for 4 items, after Greenhouse-Geisser correction, F(1.129, 7.903) = 13.074, p < .01; for 8 items, F(2, 14) = 5.320, p < .05. With < 5 minutes of visual aid, not only were participants able to perform tactile direct manipulation in a tactile menu without visual aid, they improved upon their speeds while preserving an accuracy of > 90%.


Figure 6.10: Study 3 Menu Application Results: Mean Time for Execution and Drag & Drop tasks. (e) User approach to finding an item. (f) Mental and Physical Demand of the tasks.

Drag & Drop

The mean drag & drop accuracy % for 4 & 8 items over the two sessions (visual aid upon request, and no visual aid) is shown in Figure 6.9. The accuracy hovers in the range of 85-90% for both 4 and 8 items. The mean drag & drop time is shown in Figure 6.10. For 4 items, it goes from 11.6s to 9.9s. For 8 items, it goes from 13.3s to 12.3s. Even for drag & drop, these time values are high. We take a deeper dive into the participants' approach.

Results Discussion

We queried the participants on which of three approaches they used for finding items: i) only localization, when they found an item based only upon its location on the wrist; ii) only counting, when they only counted items to get to the desired one; or iii) using Both i) and ii). The terms were clearly explained to the participants. Figure 6.11 shows the distribution of their approaches for 4 and 8 items. For 4 items, all 3 were used equally. For 8 items, no participant used localization alone, and more than half of them used Both. One participant remarked: "I knew that Dan was the bottom-mid item and felt it very clearly, so I reached Dan quickly and then counted from there." Multiple participants echoed similar strategies that involved using location and counting in tandem. In fact, all three participants who used localization for 4 items switched to using Both for 8 items, indicating that they found location to be an insufficient indicator when it came to a higher number of items. This shows that exploratory localization in DMTs can overcome the limitations of relying on localization alone. However, our results are for an alphabetical menu, which eases counting. The same performance and strategies might not apply without a logical ordering and need to be studied.

We queried participants on each stage's mental and physical demand. Figure 6.11 shows the results. Unsurprisingly, execution with 4 items was the least mentally demanding, and drag & drop with 8 items the most. Four items were easier than eight items for both tasks. However, even for 8 items, target execution's mental demand was low-to-medium. Given that both speed and accuracy of execution are also reasonable, target execution in such applications shows promise. Further, performance improves with practice. One participant remarked: "The concept was difficult to grab initially, but once I understood, I was surprised by the ease." Some participant remarks hinted at motor learning: "I started with counting the names but after some time I was not really counting, I just repeated what I did earlier without thinking."


Figure 6.11: Study 3 Menu Application Results: (left) User approach to finding an item, (right) Mental and Physical Demand of the tasks

6.8 Discussion

6.8.1 Contexts of Use

DMTs open up a new space of tactile interactions that enables the adaptation of visual interface analogues for the tactile sense. Rich actuation technologies in future wearables can be easily imagined; we already see instances such as smartwatches with traditional dials that focus solely on tactile feedback [Brian, 2015]. We now list some routine scenarios of DMT use.

Private Interactions: Visual passcode entry procedures are vulnerable to over-the-shoulder glances. With a DMT, users can enter tactile passcodes that are secure in such situations. Public Subtle Interactions: DMTs can aid public performers who need to interact with devices as part of their performance but prefer to keep their visual attention on the audience. On-the-go Interactions: Users can perform eyes-free interactions while walking, running or driving. For example, when driving, the steering wheel can have an input controller for radio channels which are tactilely laid out on the user's wrist. This requires an investigation of the limits of complexity the tactile interface can reach before users start getting significantly distracted from their primary activity. Stealth Interactions: In situations where users want to completely hide their device use, like a meeting or lecture, DMTs can be used stealthily.

The above scenarios are meant only to give a sense of DMT use. We are only scratching the surface of what DMTs are capable of, and the application scenarios can potentially range from routine wearable contexts, to medical and visually impaired contexts, to unpredictable use cases like underwater apps.

6.8.2 Design Guidelines for DMTs

Based on our observations, we summarize nine design guidelines for building and interacting with DMTs.

1) Phantom sensations & frequency discrimination work well together to simulate tactile pixels and targets.
2) Using only location as a target discriminator works well provided exploratory localization is easy in the DMT. A precision counting study will be useful in this regard.
3) Exploratory localization is aided by targets positioned at anchor locations.
4) In the absence of strong anchors on the skin, the pointer start position is the default anchor and should therefore be reset to its start position every time the screen is switched on, a new application is opened, or even when a single interaction sequence is complete. This requirement might be relaxed in a non-circular display where the ends can act as anchors.
5) Maintaining a constant ratio of target:void widths helps the user discern targets and voids in regions of the skin where perceptual acuity is not as strong.
6) A logical ordering of targets on the screen eases exploratory localization and should be implemented where possible.
7) Even in non-menu applications, a buffer space around targets is useful to prevent marginally incorrect actions.
8) Since the Idle state has no response, users lose track of the interface state when clutching. The input needs to have minimal or no clutching to prevent loss of interaction continuity.
9) It will be useful to derive the performance model indices for various input devices for a particular DMT. This will help in selecting the apt input device.

An issue with extended-duration tactile stimulation is that humans become desensitized to it. However, in our system, the tactile response is not a single continuous tactile stimulation at a single frequency. The constant punctuation of low-frequency voids with high-frequency targets ensured that users did not become numb to a single tactile stimulation.

6.8.3 Limitations and Future Directions

The menu study shows that the initial friction in use is overcome quickly. However, the memory & attention demands of real-world DMT tasks need investigation. Our target acquisition study intended to establish the baseline dependencies of a no-distractor DMT task. It confirmed our assumptions about adherence to Fitts' width & distance relationships and gave a sense of the performance of the wrist DMT. However, the exact performance model will not apply to a with-distractor design. Even with distractors, the performance model will apply only to that particular layout of distractors on screen. It will still be useful to study a with-distractor design to see how the performance model compares. The no-distractor performance model, however, is a useful baseline to compare multiple DMTs, as well as multiple inputs for the same DMT.

Many extensions of the current work are potential research directions. Interface paradigms other than indirect pointing, as well as 2D DMTs, are rich conceptual extensions. Other actuation technologies and skin regions can further shed light on DMT capabilities. Finally, advanced inputs such as single-handed use for wrist DMTs can improve usability and performance.

6.9 Conclusion

We proposed the concept of direct manipulation for tactile displays and defined DMTs - Direct Manipulation-enabled Tactile displays. To this end, we deconstructed the elements of a display and adapted them for tactile displays: tactile screen, tactile pixels, tactile pointer, and tactile targets. We described the tactile indirect pointing interface that enables the user to track system state and perform user actions: tactile pointing, selection, execution, and drag & drop. We implemented a proof-of-concept DMT that uses a unique combination of phantom sensations and frequency discrimination to generate a controllable pointer over a seemingly continuous tactile screen around the wrist using just 4 actuators, and to simulate targets distinct from voids. We defined exploratory localization for DMTs and studied and reported its precision limits for our DMT, which are reasonably high at a target count of 19. We investigated the performance dependencies of target acquisition in DMTs and validated its adherence to Fitts' law; we found position to be an additional significant factor. Based on the dependencies, we derived a performance model for our DMT. To demonstrate a DMT app analogous to visual apps, we studied a 4- and 8-item menu for target execution and drag & drop performance. With < 5 minutes of visual aid, not only were participants able to perform tactile direct manipulation in a tactile menu without visual aid, they improved upon their speeds while preserving an accuracy of > 90% for target execution. Drag & drop was relatively harder to perform. The study evidenced the use of exploratory localization, more so for a higher number of targets. We end with guidelines for DMT design. Even with the fundamental differences in our visual and tactile senses, the gap between visual and tactile interactions is too wide. Tactile direct manipulation is a pivotal step in reducing it.

Overall, this work and HapticClench show how haptics for small-screen or no-screen wearable devices can be expanded from their current roles, both in terms of feedback and interaction capabilities. So far we have discussed how we can expand the interaction capabilities of small-screen and wearable devices. However, another use case for wearables is not to interact directly with them, but to interact with other devices using them. Touchless interaction using freehand gestures in air can especially benefit from wearables in solving its interaction challenges. Freehand gestures can be divided broadly into semaphoric and manipulative gestures. In the next two chapters, we propose two methods that respectively overcome the primary challenges of these gestures. We start with semaphoric gestures in the next chapter.


Chapter 7

Learning Freehand Semaphoric Gestures using Tactile Finger Wearables

7.1 Introduction

With the concurrent rise of wearables and sensors, there is a renewed interest in freehand gestures that can be used in stationary or mobile contexts (e.g., while sitting at a desk, or while walking on the street). They can be used to interact with large screen displays using Kinect, desktops using Leap Motion, smartwatches and smartphones using motion or muscle sensing, and even with devices without a visual display using novel technologies such as smartrings. This diversity of user contexts demands not just freehand gestures that can be used across contexts, but also freehand gestural learning methods that can be integrated across contexts. Existing methods that support the learning of gestures typically rely on visual learning. However, constant engagement with visual displays is not always feasible or desired. We propose and evaluate haptic learning as a possible solution. Despite its potential benefits for eyes-free learning, haptic learning has never been explored for freehand gestural learning.

In terms of perceptual accuracy, the haptic modality has long been considered inferior to vision [Connolly and Jones, 1970, Feygin et al., 2002, Rock and Victor, 1964]. However, learning also depends on phenomena different from perception. Haptic feedback has been extensively used in motor training because 1) haptic training occurs in a body-centered manner through motor coordinates as opposed to visuospatial coordinates, and 2) certain complex motor movement information, like 3D movements, is difficult to explain visually or verbally, and haptics removes the need for complex sensorimotor transformations [Feygin et al., 2002]. The promising haptic explorations for motor learning suggest the potential applicability of haptics for gestural learning. Gestural learning is fundamentally different from motor learning in that it is associative: the user learns an association between two stimuli, one corresponding to the command and the other to its gesture. The most pervasive example is the keyboard shortcut, where one stimulus corresponds to the keyboard shortcut and the other to the command action. The value of haptics for gestural learning has several open sub-questions: a) Can a haptic stimulus be used for associative learning of gestures? b) Is it better or worse than visual learning? c) Does it support immediate recall, or longer-term recall, or both?

To answer these questions about haptic learning of freehand gesture shortcuts, we conduct a two-day study with 30 participants comparing the learning of haptic stimuli for gestures, coupled with visual and audio command stimuli, against visual-only learning. Although most freehand gestural interactions in the current literature are manipulative gestures [Quek et al., 2002] (which "control an entity [on the screen] by applying a tight relationship between the actual movements of the gesturing hand or arm with the object being manipulated"), we focus our study on a second class of gestures, called semaphoric gestures [Quek et al., 2002] (which "employ a stylized dictionary of static or dynamic hand or arm gestures that serve as a universe of symbols to be communicated to the machine"). Keyboard shortcuts for commands are semaphoric gestures which involve a fixed pattern of finger movement on the keyboard. Semaphoric gesture sets hold similar potential to be used as freehand gestures to rapidly invoke command shortcuts in a diversity of contexts, through carefully designed actions that would be less prone to heavy hand and arm fatigue, known as the gorilla arm effect [Hincapié-Ramos et al., 2014]. Learning of freehand semaphoric gesture sets, however, is an open question. The most popular gestural learning techniques rely on self-revelation of the gesture while the user is using the interface with manipulative gestures. This, however, is not possible in situations where the freehand gestures are designed for non-visual scenarios. Freehand semaphoric gestures are therefore a perfect fit for our investigation of haptic learning of gesture shortcuts. Our study shows that with < 30 minutes of learning, haptic learning of semaphoric finger tap gestures is comparable to visual learning and maintains its recall on the second day.

The contents of this chapter were published at UIST 2016 [Gupta et al., 2016b].

7.2 Related Work

Chapter 2 describes haptics for guidance in Section 2.2.3 and provides an overview of semaphoric gestures in Section 2.3.3. Here we describe relevant work on haptics for guidance and learning, and on semaphoric gestural command learning.

7.2.1 Haptics for Learning

Haptics has commonly been used for feedback or notifications. Haptic feedback can help guide users' movements for medical rehabilitation purposes [Shull and Damian, 2015] and computer-based interactions, such as target acquisition [Oron-Gilad et al., 2007] and visual search tasks [Lehtinen et al., 2012]. With respect to training, it has been used for posture [van der Linden et al., 2011a] or trajectory training [Bark et al., 2015], mostly consisting of repetitive motor movements. Yang et al. found the effectiveness of visual-haptic and visual training to be comparable for helping people develop motor skills [Yang et al., 2008]. Passive haptic learning allows acquisition of motor skills (like a piano tune) while the user is engaged in a distraction task [Seim et al., 2015]. All these works focus on non-associative learning of a movement through repeated exposure to the haptic stimulus. Seim et al.'s work on braille training [Seim et al., 2014] using passive haptic learning is the only instance of haptics in associative learning. While haptics have been used as feedback to assist or guide freehand manipulative gestures [Carter et al., 2013, Lehtinen et al., 2012, Pfeiffer et al., 2014, Sodhi et al., 2013, Schonauer et al., 2012], active haptic learning has never been used for associative learning of semaphoric gesture sets.

7.2.2 Semaphoric Gestural Commands Learning

Aside from cheat-sheets and video instruction, the common approaches to semaphoric gestural command learning are: self-revelation of gestures while performing manipulative gestures [Jain and Balakrishnan, 2012, Kurtenbach, 1993], dedicated or active learning of gestures [Appert and Zhai, 2009, Ghomi et al., 2013, Ghomi et al., 2012, Ismair et al., 2015], and a combination of the two (i.e., revelation while actively learning gestures, also called dynamic guides [Anderson and Bischof, 2013, Bau and Mackay, 2008, Freeman et al., 2009]). Most use only visual learning, except a few [Ghomi et al., 2012, Grossman et al., 2007] where audio cues are investigated for gesture presentation and have been found to be similar in performance to visual cues. Some active learning works include gesture cues in the visual interface [Appert and Zhai, 2009, Grossman et al., 2007], similar to keyboard shortcuts being displayed next to the menu item. However, unless the user actively engages in learning, observational learning does not help [Blandin et al., 2010, Laguna, 2000]. Our research investigates whether active haptic learning can support the learning of semaphoric gestures without the use of visual stimuli in the learning process. Haptic cues are arguably easier to integrate into interfaces because they do not take up any visual space.

7.3 Design of a Freehand Semaphoric Gesture Set

Because this is the first investigation in this space, we needed to see if haptic learning works for a simple set of gestures. We design a simple set of freehand semaphoric gestures that are amenable to simple haptic learning. The set consists of 14 gestures, which is the standard number used in earlier shortcut gesture set learning investigations [Appert and Zhai, 2009, Ghomi et al., 2012, Grossman et al., 2007]. Each gesture is composed of a sequence of exactly three finger tap movements in air using the index (I), middle (M), and ring (R) fingers. For instance, the IMR gesture involves the index air-tap, followed by the middle air-tap, followed by the ring air-tap. Any finger can repeat, and therefore we have a potential set of 27 gestures in total.

We selected 14 gestures from these 27 based on i) the ease of performing them, and ii) minimizing the confusion between them. We conducted a pilot with four users to test the ease of performance for the 27 gestures. Although the users performed most gestures with ease, they were most uncomfortable performing gestures that involved using the ring finger twice, consecutively or alternately. Thus, we removed all gesture sequences which involved a repetition of the ring finger, except RRR, which would have high memorability. For every gesture sequence, there are six other gesture sequences that differ by only a single finger tap. For instance, IMR has IMI, IMM, IIR, IRR, MMR, and RMR. To minimize confusion, we next removed gestures such that for every gesture there is a maximum of three other gestures that differ by a single finger tap. The final gesture set is as follows: III, IIM, IMR, IRI, IRM, MIM, MIR, MMM, MMR, MRI, RII, RMI, RMM, and RRR.
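The single-tap-difference property of the final set can be checked mechanically; a small Python sketch (the gesture strings are from the set above, everything else is illustrative):

```python
from itertools import product

FINGERS = "IMR"
ALL = {"".join(seq) for seq in product(FINGERS, repeat=3)}            # 27 sequences
FINAL = {"III", "IIM", "IMR", "IRI", "IRM", "MIM", "MIR",
         "MMM", "MMR", "MRI", "RII", "RMI", "RMM", "RRR"}

def single_tap_neighbours(g, pool):
    """Gestures in 'pool' that differ from g by exactly one finger tap."""
    return {h for h in pool if sum(a != b for a, b in zip(g, h)) == 1}

# Every sequence has six single-tap neighbours within the full set of 27 ...
assert all(len(single_tap_neighbours(g, ALL)) == 6 for g in ALL)
# ... and at most three within the final 14-gesture set.
assert all(len(single_tap_neighbours(g, FINAL)) <= 3 for g in FINAL)
```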

7.3.1 Visually Meaningful Gestures

Although we designed the finger tap gestures to explore haptic learning of gestures, we note that no semaphoric finger gesture sets have previously been proposed in the literature. The finger tap gestures are uniquely positioned to fill this gap because they satisfy multiple requirements for practical gesture sets: i) they form a gesture vocabulary that uses the same style of invocation, ii) finger air-taps cause minimal fatigue, iii) their subtlety allows them to be performed across diverse contexts, and iv) they are potentially less prone to variations in scale and user drawings of the gesture, allowing consistent detection by the computer. However, air-taps are also fairly generic gestures and might have problems with respect to memorizing the association. One possible alternative is for users to define their own gesture sets, which they would easily remember. However, user-defined gestures have problems of their own. First, accurate algorithms for recognizing these on-the-fly gesture sets would have to be generated automatically. Second, users might come up with meaningful gestures for a few commands, but coming up with a consistent design for many commands might be difficult. The possibility of more visually meaningful gestures (which can be static or dynamic) and how haptics can play a role in that certainly needs to be explored. However, we first need to see if haptics can help users learn generic gesture sets in mid-air.


7.4 Hardware Implementation of Haptic Rings

We designed the haptic setup such that i) the fingers are actuated directly, ii) the system can be made compact, cost-effective, and practical across diverse contexts, iii) the gesture is presented in a duration no longer than an optimal visual presentation, and iv) there is zero confusion over which fingers were actuated. We built a set of three vibrotactile rings, one for each finger. Each ring contained a mini coin Dura Vibe™ motor, 8mm in diameter and 2.5mm in height. The motor is housed within elastic straps to ensure firm positioning of the motors across finger sizes. The motor is controlled by pulses from an Arduino Pro Mini and driven through the ULN2003 chip to minimize the load on the microcontroller. For any gesture, the rings on the respective fingers vibrated one by one in order. Because a finger can repeat in some patterns (e.g., IIM), we included a time gap between two vibrations to keep the patterns distinct. The ON duration of a single vibration ring was fixed at 100ms and the OFF duration at 350ms, resulting in a total gesture presentation time of 1000ms. The circuit assembly is mounted on an armband as shown in Figure 7.1.
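A minimal host-side sketch of this presentation timing, assuming a serial link to the microcontroller; the one-byte finger commands and the firmware behaviour are our own illustrative assumptions, not the actual LabVIEW-Arduino protocol.

```python
import time
import serial  # pyserial

ON_MS, OFF_MS = 100, 350  # per-finger pulse and inter-pulse gap: 3*100 + 2*350 = 1000 ms
FINGER_CMDS = {"I": b"i", "M": b"m", "R": b"r"}  # hypothetical one-byte commands to the Arduino

def present_gesture(port: serial.Serial, gesture: str) -> None:
    """Play a three-tap gesture such as 'IMR' on the vibrotactile rings, one pulse per finger."""
    for k, finger in enumerate(gesture):
        port.write(FINGER_CMDS[finger])   # firmware is assumed to pulse the ring for ON_MS
        time.sleep(ON_MS / 1000)
        if k < len(gesture) - 1:
            time.sleep(OFF_MS / 1000)     # gap keeps repeated fingers (e.g. IIM) distinct

# Example: present_gesture(serial.Serial("/dev/ttyUSB0", 9600), "IMR")
```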

Figure 7.1: (left) A participant wearing the haptic rings setup for the Visual-Haptic condition. (right) The screen for the Visual-Visual condition.

Psychophysics has established that vibrations are best perceived in the ventral regions (like the palm) because they have less hair and more Pacinian corpuscles, which are responsible for detecting vibrations on the skin. However, motors placed in the ventral region of one finger can produce vibrations incorrectly perceived by the adjacent fingers. We conducted a small pilot with three participants to test whether the ventral or the dorsal region was the better position on a finger to place the vibration motors. Testing with all 14 gestures, we found that participants incorrectly perceived the gesture in 7.9% of the cases with the ventral placement, as opposed to 0.8% with the dorsal placement.

7.5 Study of Active Haptic Learning of Gestures

To investigate the possibility of haptic learning with no visual engagement, we conducted a study comparing three conditions: Sound-Haptic (SH), where the command stimulus is given via sound and the associated gesture is presented haptically; Visual-Haptic (VH), where the command stimulus is visual; and Visual-Visual (VV), the baseline condition where both the command stimulus and the associated gesture are presented visually. The SH and VH conditions inform our understanding of the performance of haptic learning with different command stimulus modalities. Because visual stimuli may lead to higher engagement, we hypothesize that users will perform best in VV, followed by VH and then SH.


7.5.1 Experiment Design

We conducted a between-subjects study for the three conditions. The experiment closely followed the study design used by Ghomi et al. [Ghomi et al., 2012], which itself followed earlier studies by Appert and Zhai [Appert and Zhai, 2009] and Grossman et al. [Grossman et al., 2007]. In total, there were 30 participants (9 female) who were randomly assigned a condition (10 per condition). The participants were all right-handed and aged between 18 and 29 (mean = 23.3). Each participant took part in the study over two days. The experiment consisted of one learning block and two testing blocks on day 1 to test immediate learning, and a third testing block on day 2 to test mid-term recall. In the learning block, depending on the condition, the image or the audio stimulus for an object (representing a command) was presented to the participant, and then the associated gesture was presented visually or haptically. We asked the participant to perform the gesture after the presentation of each object and gesture pair. We repeated these steps with another object-gesture pair until all pairs had been presented. Then, in the testing blocks, the participant was presented with the object stimulus only and asked to perform the associated gesture. If the participant did not remember the gesture, she could request a hint, which showed the associated gesture. If the participant performed the gesture incorrectly, the correct gesture was presented and she was asked to perform the gesture again.

Figure 7.2: Object images and names

We defined a set of 14 common objects to act as commands that can be associated with the gestures (see Figure 7.2). Because the finger tap gestures vary in difficulty, both in terms of required motor control and memorability, we generated 10 random associations between the 14 gestures and the 14 objects; i.e., a different association was used for each of the 10 participants in a condition. To simulate a more realistic setup where some commands may appear more frequently than others, an appearance frequency was randomly assigned to each of the 14 objects following a Zipf distribution. For the learning block, we used the frequencies (6, 6, 3, 3, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1) and for the testing blocks (12, 12, 6, 6, 4, 4, 3, 3, 2, 2, 2, 2, 1, 1). Therefore, there were 30 trials in total in the learning block, and 60 trials in a testing block. The presentation of trials was randomized.
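A small sketch of how such frequency-weighted trial sequences can be generated (Python); the placeholder object names and the seed are illustrative, only the frequency lists come from the design above.

```python
import random

LEARNING_FREQS = [6, 6, 3, 3, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1]    # sums to 30 trials
TESTING_FREQS  = [12, 12, 6, 6, 4, 4, 3, 3, 2, 2, 2, 2, 1, 1]  # sums to 60 trials

def build_trials(objects, freqs, seed=0):
    """Expand per-object appearance frequencies into a shuffled trial sequence."""
    trials = [obj for obj, f in zip(objects, freqs) for _ in range(f)]
    random.Random(seed).shuffle(trials)
    return trials

objects = [f"object{i}" for i in range(14)]  # placeholder names
assert len(build_trials(objects, LEARNING_FREQS)) == 30
assert len(build_trials(objects, TESTING_FREQS)) == 60
```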

7.5.2 Apparatus

During the experiment, participants sat in front of a laptop which displayed the study interface built in the LabVIEW software. During the learning block of the VV condition, participants saw an interface with the object image on the left and three visual indicators that highlighted in sequence according to the associated gesture (see Figure 7.1, right). For VH and SH, visual indicators were not shown on the interface. Instead, the participants wore the haptic rings along with the armband on their right hand. For SH, audio of the object name was played instead of showing the object image. LabVIEW connects serially with the Arduino Pro Mini and administers the haptic pattern for gesture presentation without delay. We asked all participants to use the laptop's trackpad with their left hand to press the Hint/Next button, and to perform the gestures with their right hand, which rested vertically on an armrest.

We used manual detection to decide whether participants performed the gestures correctly or incorrectly. To achieve this without making the participants overly conscious of the experimenter's gaze, the setup was such that their fingers were visible in the laptop webcam, which was live-streamed to the experimenter in another area of the room. When a participant performed a gesture, the experimenter watched the live stream and sent a Correct/Incorrect signal to LabVIEW, which displayed a popup window informing the participant whether the response was correct. LabVIEW kept a log of all the events.

7.5.3 Procedure

When a participant arrives, she is introduced to the task and its interface using a sample object image or the audio of its name and an associated gesture outside the sets of 14. The experiment starts after the introduction, with 2-minute breaks after every 15 trials. After the learning block, we inform the participant that the testing block will begin and instruct the participant to seek a hint whenever she is unsure of the response. We ask the participant to complete a questionnaire at the end of Day 1 and interview the participant about her approach at the end of the experiment on Day 2. The experiment takes about 50 minutes on Day 1 and 20 minutes on Day 2. Each participant completes in total 30 learning block trials and 60 testing block 1 trials + 60 testing block 2 trials + 60 testing block 3 trials. Thus, there were (30+180) × 30 participants = 6,300 trials.

7.6 Results

The primary measures of the study are recall rate, the percentage of correct answers in a testing block without a hint, and hint rate, the percentage of trials where participants used a hint. We examine these two measures by learning technique for each block.
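For clarity, a minimal sketch of how these two measures can be computed from a per-trial log; the column names are our own assumptions, not those of the study software.

```python
import pandas as pd

def block_measures(block: pd.DataFrame) -> pd.Series:
    """Recall rate and hint rate for one testing block.

    Assumes a per-trial log with boolean columns 'correct' and 'used_hint'
    (the column names are ours, not from the study software).
    """
    recall = (block["correct"] & ~block["used_hint"]).mean() * 100
    hint = block["used_hint"].mean() * 100
    return pd.Series({"recall_rate_pct": recall, "hint_rate_pct": hint})

# Example: log.groupby(["technique", "block"]).apply(block_measures)
```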

7.6.1 Quantitative Results

We found a significant effect of learning technique on the recall rate in Block 1 (F(2, 27) = 4.590, p < 0.05) and Block 2 (F(2, 27) = 3.703, p < 0.05). Post hoc Tukey tests revealed that the recall rate for Sound-Haptic was significantly lower than Visual-Visual in both Block 1 (p < 0.05) and Block 2 (p < 0.05). Figure 7.3 shows the mean recall rate % for each technique by block. The recall rate for SH is 52.8%, compared to 62.7% and 73.2% for VH and VV respectively in Block 1. In Block 2, SH closes its gap with VH and VV with a recall rate of 78%, compared to 87.5% and 89.2% for VH and VV. However, it is still significantly lower than VV. By Block 3 on Day 2, there is no significant difference between the techniques. Further, there is a significant effect of Block on the recall rates for all techniques (VV: Greenhouse-Geisser corrected F(1.172, 10.547) = 12.964, p < 0.01; VH: F(1.260, 11.339) = 76.830, p < 0.001; SH: F(2, 18) = 46.594, p < 0.001). The post hoc tests with Bonferroni correction revealed that the difference was between blocks 1 and 2, and blocks 1 and 3 (p < 0.01 for all pairs). There was no significant difference between blocks 2 and 3 for any technique.

Figure 7.3: Recall Rate % for all techniques by blocks

Participants associated object-gesture pairs most easily when the object commands were presented visually (VV & VH). No statistical differences could be found in the recall rates of VH and VV. This suggests that in less than 20 minutes of active haptic learning with visual command stimuli, users can learn gestures with recall rates similar to wholly visual learning. However, even though the difference between VH and VV is not significant after Block 1, VH seems to trail VV in magnitude, which might be worth looking into in larger studies. On the other hand, SH had the lowest recall rate. However, the improvement in the SH recall rate from Block 1 to Block 2 is 47.7%, which indicates that learning with SH may start slowly but can briskly catch up with the other two techniques; by Block 3, no statistical difference could be found in the recall rates of the 3 conditions (ηadj² = .026). This suggests that in less than 30 minutes, the learning of gesture shortcuts with haptic gesture presentation and audio command stimuli, without any visual engagement, is comparable to wholly visual learning.

The mid-term recall rate from Block 3 is unaffected by a day's gap for all three techniques. This is interesting because a majority of the studies in the prior haptic literature for motor training have reported that participants get overly dependent on haptic feedback during training, thus harming performance later [Lee, 2010, Schmidt and Lee, 2005]. Although prior work had already shown that visual learning is retained by participants in the medium term, this has now proven to be the case for haptic learning as well. Participants across all techniques invariably mentioned that after Day 1, they expected to forget most gestures by Day 2. In the words of one participant: "I just sat for the experiment on Day 2 thinking I don't remember anything, but then when the objects appeared, the gesture just automatically came to my fingers." This suggests that the techniques result in muscle memory of the gestures in at least the mid-term.

Figure 7.4: Hint Rate % for all techniques by blocks

Figure 7.4 shows the hint rate for each technique by block. There is a significant effect of Block on hint rate for all techniques (VV: Greenhouse-Geisser corrected F(1.223, 11.003) = 6.774, p < 0.01; VH: F(1.094, 9.845) = 29.920, p < 0.001; SH: F(1.015, 9.132) = 7.739, p < 0.05). The post hoc tests revealed the significant difference to be between blocks 1 & 2 and 1 & 3 for all techniques (p < 0.05 for all pairs). There is no significant effect of technique on hint rate in any block. Essentially, participants took a lot of hints uniformly across all techniques in Block 1, after which they did not need as many. 43% of the hints in Block 1 were for objects that appeared only once or twice per block, even though they were only 10% of the total trials. Participants reported that remembering the gestures for objects that rarely appeared was the most difficult.

Although the recall rate for SH was much lower than for VV and VH, the hint rate for SH was the same as for VV and VH. This suggests that the participants in SH did not seek hints when they were unsure and instead gave more incorrect responses. Participants reported that they got confused between similar-sounding words such as Chair and Cherry, Gloves and Guitar, and Key and Kiwi. To a lesser extent, there was also confusion because of similar images; for example, Banana and Lemon, both yellow-colored, confused some participants. Consequently, the performance of SH could potentially have been closer to VV and VH if the images had contained a similar number of confusing elements.

7.6.2 Subjective Feedback

The results from the questionnaire are illustrated in Figure 7.5 (left). A Kruskal-Wallis test revealed no effect of technique on the participants' perception of task difficulty, mental effort, or how successfully they thought they performed in the task.

Figure 7.5: (left) Questionnaire Results, (right) Mnemonics used by participants by learning technique

Participants reported confusion with gestures in which only two distinct fingers were involved, for instance IIM and MIM. Participants found the gestures with three distinct fingers to be easier, probably because they enabled the fingers to move in a smooth wave, for instance IMR, MIR, etc. Participants found patterns with one distinct finger (i.e., III, MMM, RRR) the easiest to remember.

Half of the participants reported using mnemonics of some kind, while the other half did not use any memory techniques. We classified their approaches into the following five categories:

• No technique.

• Image mnemonics. Participants developed tricks from how the object images looked and associated them with the gesture labels; for example, "Cycle has rims. And the gesture was MIR, so I remembered opposite of rim is mir."

• Sound mnemonics. Similar to the above, SH participants also developed tricks from the way the object names sound; for example, "Chair ends with IR, so MIR" and "Ba- -na- -na, so it is 1 2 2".

• Rhythm mnemonics. Some participants were music enthusiasts and used rhythm techniques to help them remember; for example, "I associated the patterns with piano playing and imagined every image in front of the piano playing the gesture."

• Grouping mnemonics. Participants created associations with similar objects or similar gestures; for example, "123 for Tomato, so 321 for Cherry."

Figure 7.5 (right) shows the distribution of these approaches by participant count for each learning technique. Although grouping mnemonics were used by many participants, only two participants reported it as their dominant memorization technique. As we can see, mnemonics based on images or sounds were the most popular. However, almost half of the participants reported using no technique to remember the associations. We performed an ANOVA test to see if there was a difference between participants who used mnemonics and those who used no memorization technique, across all three learning techniques. We found that the recall rate was significantly higher for participants who used mnemonics in all three blocks (B1: F(1, 28) = 5.347, p < 0.05; B2: F(1, 28) = 9.579, p < 0.01; B3: F(1, 28) = 7.326, p < 0.05). Although the differences are significant, with the first block having a recall rate of 69% for participants using mnemonics compared to 56% for those with no technique, by the second block, the recall rate of participants with no technique was 80% compared to 90% for participants using mnemonics. This suggests that even without mnemonics, participants would be able to achieve fairly good recall rates with less than 20 minutes of active learning.

7.7 Discussion, Research and Design Considerations

7.7.1 Active vs. Passive Haptic Learning Approach

Passive Haptic Learning (PHL) allows the acquisition of motor skills via haptic stimulation while no perceived attention is given to learning [Seim et al., 2015]. If users could learn gestures passively in this way, it would remove the need for active or self-revelation learning.

Before we investigated active haptic learning, we conducted a short pilot study with four participants to investigate PHL's potential for gestural learning. We randomly selected 8 gestures from the set and associated them with 8 object names randomly selected from the set shown in Figure 7.2. We employed a testing procedure similar to the design used in earlier PHL studies [Seim et al., 2015]. First, in the introductory phase, we played the audio of an object name and the haptic sequence for the associated gesture. We then asked participants to perform the gesture once. This was done once for all 8 object-gesture pairs. Then, in the PHL phase, we asked participants to play the Candy Crush game [King, 2017] on a smartphone and score as much as they could. At the same time, we played the audio cue followed by the associated haptic sequence for all 8 pairs in random order repeatedly. The PHL phase lasted 40 minutes, within which each audio-haptic pair was played exactly 20 times. At the end of the PHL phase, we asked participants to reproduce the associated gesture for the different audio cues. Participants could only reproduce an average of 2 gestures correctly out of 8. We concluded that associative learning tasks, where the user has to learn multiple command-gesture associations, are not conducive to passive learning. As a result, we did not use PHL in our study.

7.7.2 Active Haptic Learning Strategies

Given that our results show gestures can be learned through haptics, an open research question to investigate next is how to naturally integrate haptic learning of gestures into actual systems. One possibility, for instance, is that whenever a user selects an icon (e.g., by dwelling on it), the associated shortcut gesture’s haptic cue can be played.
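To make this idea concrete, the following minimal sketch (in Python) plays a command’s haptic cue once an icon has been dwelled on long enough. It is illustrative only, not part of our prototype: the dwell threshold, the FINGER_PATTERNS table, and the play_haptic_sequence helper are hypothetical placeholders for whatever ring driver a real system would use.

import time

DWELL_THRESHOLD_S = 0.8  # hypothetical dwell time before a selection counts
FINGER_PATTERNS = {"Tomato": "IMR", "Cherry": "RMI"}  # command -> I/M/R tap sequence

def play_haptic_sequence(pattern):
    # Placeholder: vibrate the index (I), middle (M), and ring (R) finger rings in order.
    for finger in pattern:
        print(f"vibrate ring {finger}")
        time.sleep(0.2)

def on_icon_dwell(command, dwell_start, now):
    """Called while the cursor rests on an icon; once the dwell threshold is crossed,
    the shortcut gesture's haptic cue is played, reinforcing the association."""
    if now - dwell_start >= DWELL_THRESHOLD_S:
        play_haptic_sequence(FINGER_PATTERNS[command])
        return True   # selection triggered
    return False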


Mnemonic associations depended heavily on the object-gesture pairings that participants randomly received. However, even when there were no natural associations, participants found creative ways to make them. We believe these associations have been uniquely bolstered by the type of gestures we used. The I, M, R finger patterns are amenable to different strategies that participants may use to connect them to images, sounds, and rhythms. Such connections are arguably not as easy to make with other types of gestures (e.g., drawing gestures), thus making finger tap gestures more compelling in practice. However, participants performed well even without mnemonics, and participants in haptic conditions reported that the vibrations helped them build muscle memory without much conscious effort. An interesting question to explore would be the extent of active engagement that is elicited by each technique.

7.7.3 Freehand Finger Tap Gestures

Our results show that haptic learning of gestures can work comparably to visual learning. However, the extent of learning will depend on the kind of gestures that are involved; this is, of course, true for visual learning as well.

We observed that while some users performed the gestures in a wavy, rhythmic form, others performed them in a jerky motion. The gesture detection algorithm needs to handle these different input scenarios. Additionally, gestures that are very similar to each other should be avoided. In particular, if possible, the number of gestures with exactly two distinct fingers should be minimized to reduce initial confusion in learning the gesture set. Gestures with a single distinct finger were the easiest to remember, followed by the ones with three distinct fingers, and then the ones with two. Because the gestures are of varying difficulty, different assignment strategies can be developed and tested. For instance, frequent commands could be assigned to the easiest gestures so that the user can start using them immediately. On the other hand, if the application requires all commands to be learnt at the same rate, then perhaps the easiest gestures should be assigned to less frequent commands, and the hardest gestures assigned to the most frequent ones. Alternatively, users could also be given the freedom to choose which gestures to associate with different commands.

7.7.4 Conclusion

Our work is the first exploration into haptic learning of semaphoric gestures. To this end, we have designed freehand finger tap gestures and haptic rings which can potentially be used across diverse user scenarios. Through a two-day study with 30 participants, we learn that haptic learning can be successfully used for associative learning of freehand finger tap gestures and command shortcuts. We show that when combined with a visual command stimulus, haptic learning of freehand semaphoric gestures performs comparably to visual learning. Further, haptic learning with an audio command stimulus and no visual engagement initially has a lower recall rate, but with less than 30 minutes of learning, it becomes comparable to visual learning. We also show that the mid-term recall rate for haptic learning stays constant, as has been reported earlier for visual learning. Our results suggest that gestural learning can be integrated into many contexts with minimal visual engagement. As the first work that investigates the active haptic learning of semaphoric gestures, our findings are highly encouraging for future explorations in this space.

After investigating the learning of semaphoric gestures, in the next chapter, we turn our attention to manipulative gestures. Manipulative gestures in air are riddled with multiple problems, the foremost being fatigue due to constant hand movement in air. We investigate a new technique that solves some of those problems.


Chapter 8

Easing Freehand Manipulative Gestures by moving beyond Point & Select

8.1 Introduction

1 Manipulative gestures in mid-air for large displays follow the traditional point & select desktop model to manipulate controls such as buttons and sliders. Finger tracking devices like the Kinect and Leap Motion propagate the use of the hand as a pointer proxy in their developer demos and commercial applications. Further, research in freehand interaction also centers on optimizing the pointing paradigm [Liu et al., 2015, Vogel and Balakrishnan, 2005, Ren, 2013, Bateman et al., 2013]. However, the hand as an input device in air is vastly different from a mouse. While the mouse has limited degrees of freedom (x-y movement, left-right click, scroll wheel), the hand enables a much wider range. Each of the five fingers is capable of x-y-z movement (with individuation constraints), the hand itself can move in the x-y plane, and the arm can move freely in x-y-z. The mouse is limited in that it cannot signal a variety of different input intentions in a single action, thus necessitating navigation to the targets. The hand, however, can signal a variety of intentions without the need for navigating. Why, then, do freehand interactions in air still rely on a paradigm that catered to the limitations of traditional input devices? One possible solution is given by semaphoric gestures. However, as discussed, these gestures can only be performed for fixed commands and need to be learned extensively for each command.

The question is, once the user decides on the target control, why should there be a need to navigate to it? We introduce summon & select, a fast, low-fatigue freehand interaction model for interacting with interface controls, where the user summons the intended control on the large display to bring it into focus and then manipulates it. In essence, it is a combination of semaphoric and manipulative gesturing, such that the user neither needs to navigate to the control, nor needs to learn an extensive gesture vocabulary. The user simply summons a control using a semaphoric gesture and then controls it using manipulative gestures. In the upcoming sections, we describe the design of the summon & select interaction, its features and constraints, and report on a study that investigates its design. We then report on a second study that compares it to pointing and shows that summon & select outperforms point & select for a multi-button interface on both performance and preference. We also conduct a preliminary investigation of how haptic feedback aids summon & select.

1 The contents of this chapter were published at ISS 2017 [Gupta et al., 2017b].


Figure 8.1: Steps of summon & select for the bottom slider. (0) Idle. (1) Summoning gesture for slider. (2) Disambiguating by zoning to the desired slider (blue focus moves to the bottom slider). (3.1-3.3) Manipulation: (3.1) Enter Drag gesture to enter dragging mode (green box around the bar), (3.2) dragging the slider bar, (3.3) Exit Drag gesture to exit dragging mode. (4) Release gesture to release the control.

8.2 Related Work

Multiple works have focused on mid-air interactions using touch, pens, and other controllers [Jansen et al., 2012, Bragdon and Ko, 2011, Vatavu and Radu-Daniel, 2012, Nancel et al., 2013]. Our focus in this chapter is on freehand interactions, whose existing literature can be divided into manipulative and semaphoric gestures. Chapter 2 discusses semaphoric gestures and describes how the existing work in manipulative gestures can be divided into absolute and relative pointing. We now discuss the specific techniques that deal with manipulating virtual objects in mid-air and reducing fatigue.

8.2.1 Manipulative gestures for virtual objects

Song et al. [Song et al., 2012] propose an interaction technique to manipulate virtual objects using a handle bar metaphor. Gustafson et al.’s imaginary interfaces [Gustafson et al., 2010] investigate freehand interactions without any screen or visual feedback by studying how accurately a user can point on an imaginary plane fixed at the intersection of an L-shaped hand gesture. Steins et al.’s imaginary devices [Steins et al., 2013] take this concept further by imagining physical devices, such as a keyboard, under the fingers in air and using them. This is similar to our work in that making the pose for a physical device like a keyboard allows the user to use the keyboard. However, imaginary devices is a technique to simulate physical devices in air. Summon & select, on the other hand, is an interaction model for interacting with controls in software interfaces.

8.2.2 Reducing Fatigue in Mid-air

Existing works tackle the fatigue problem in different ways. Freeman et al. [Freeman et al., 2016] tackle the problem of devices with limited tracking range by using light and tactile cues to guide where and how to gesture. Other works [Rateau et al., 2014, Lin et al., 2013] address the problem of gesturing outside of the hand’s comfort zone by enabling user creation of virtual planes. Gunslinger [Liu et al., 2015] uses bimanual, relaxed, arms-down gestures with one hand doing the pointing and the other performing semaphoric gestures.

8.2.3 Alternative to Point & Select

Balakrishnan [Balakrishnan, 2004] analyzes methods to beat Fitts’ law in non-mid-air contexts, including jumps to the next target, which is similar to our Zoning method. Within mid-air work, finger-count menus [Bailly et al., 2011] allow menu item selection using a correspondence between the menu item ordering and the number of


stretched out fingers. This method could possibly be used for buttons in interfaces as well. However, the number of targets is limited by the fingers unless higher order combinations of fingers are used. Further, the technique will not apply across other common controls. PathSync [Carter et al., 2016] and TraceMatch [Clarke et al., 2016] enable object selection by mimicking the motion of the object [Clarke et al., 2016] or the motion pattern displayed around the object [Carter et al., 2016].

All the techniques stated above either offer improvements to pointing or enable target selection (mostly buttons) using semaphoric gestures. But none of these offer an end-to-end alternative to point & select, starting with selection of a specific control amongst different types of controls, followed by a seamless transition to manipulation of the selected control. Summon & select combines semaphoric and manipulative gestures to offer a viable alternative to point & select.

8.3 Summon & Select

Summon & select takes place in four essential steps: 1) Summon: The user performs a specific summoning gesture to summon a type of control, like Button or Slider. This is denoted by a focus box on the control. 2) Disambiguate: If there is more than one control of the summoned type, the user moves the focus box to the desired control by either tabbing through the controls or zoning coarsely into the region of the intended control. 3) Manipulate: The user can now manipulate the control by clicking or dragging it. 4) Release: The user releases the control by performing a release gesture.

Figure 8.1 illustrates how summon & select works for a slider control type. The interface consists of three buttons and three slider controls. The user intends to manipulate the bottom-most slider. 1) Summon: The user makes the thumb+index finger summoning gesture (as if to hold the slider bar) to summon the slider control (Figure 8.1-(1)). This brings the middle slider into focus. At this stage, the region around the user’s hand is divided into three large virtual zones vertically, each corresponding to one of the sliders. 2) Disambiguate: The user moves the hand down to reach the bottom slider’s zone, which brings it into focus (Figure 8.1-(2)). 3) Manipulate: To enter the drag state, the user performs the enter drag gesture, which is to pinch the thumb and finger closer (as if to tighten the grip on the bar) (Figure 8.1-(3.1)). The user then moves the hand horizontally to reposition the bar (Figure 8.1-(3.2)). To exit the drag state, the user performs the exit drag gesture, which is to move the thumb and finger apart (Figure 8.1-(3.3)). The user can perform 3.1-3.3 repeatedly to adjust the slider bar until she is satisfied. 4) Release: The user then releases the slider by performing an open palm gesture (Figure 8.1-(4)).

The above example illustrates a dragging manipulation for the slider control. A button click is simpler. The user makes an index finger-up summoning gesture (as if raised to press a button), zones into the desired button’s region, performs an air tap to click the button, and then releases it. Clicking a button also leads to a new screen in some applications, in which case the control auto-releases upon clicking. Multiple controls of multiple control types can exist in a single interface. For instance, a video player application can consist of play, reverse, and forward buttons and playback and volume sliders. The user can use the appropriate summoning gesture to summon a control type.

8.4 Design Elements & Midas Touch

One of the biggest challenges for freehand gestures is Midas touch, which is the detection of an unintended gesture because of a combination of imprecise detection and inefficient gesture language design. We design summon &


select such that the possibility of accidental detection of unintended gestures is minimized. In summon & select, Midas touch can occur at each of the four steps: accidental summoning, accidental control switching, accidental manipulation, and accidental release. We now look at each of the four steps in depth and detail how we overcome the design challenges, including Midas touch.

8.4.1 1. Summon

For applications consisting of a multitude of control types, the user will have to learn and remember a large number of summoning gestures. To solve this, we propose that gestures be designed such that they evoke the physical manipulation of the control. For instance, the index finger-up button summoning gesture is similar to how a user would approach a physical button. Similarly, the thumb-index slider gesture is how a user would approach dragging a physical slider. Figure 8.2 shows a small snippet of summoning gestures for five different control types.

Midas Touch: Accidental Summoning

Accidental summoning can occur when a) the user’s hand pose summons something when they did not intend to summon anything, b) the summon gesture summons an unintended control type, or c) the transition to the following disambiguate or manipulation gestures is mis-recognized as part of the summon gesture or as a new summon gesture. To prevent (a), we only summon a control when the summoning gesture has been held in the same pose for at least 500ms. To prevent (c), once a summoning gesture is detected and a control type is summoned, the system does not register any new summoning gestures until the explicit release gesture releases the control. Solving (b) depends on a clear distinction between the summoning gestures enabled on an interface, such that users can perform them easily and the algorithm can detect them consistently.
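A minimal sketch of the guard logic for (a) and (c) is shown below, assuming a per-frame pose label coming from the hand tracker. The names (classify output labels, SummonGuard) are illustrative, not our actual implementation.

SUMMON_HOLD_MS = 500   # pose must be held this long before a control is summoned

class SummonGuard:
    def __init__(self):
        self.candidate = None      # pose currently being held
        self.hold_start = None     # timestamp when the candidate pose began
        self.summoned = None       # control type currently summoned, if any

    def update(self, pose, t_ms):
        """pose: per-frame label, e.g. 'slider', 'button', 'open_palm', or None."""
        if self.summoned is not None:
            # (c): ignore new summoning gestures until the explicit release gesture
            if pose == "open_palm":
                self.summoned = None
            return self.summoned
        if pose in ("slider", "button"):
            if pose != self.candidate:
                self.candidate, self.hold_start = pose, t_ms
            elif t_ms - self.hold_start >= SUMMON_HOLD_MS:
                self.summoned = pose           # (a): only summon after a stable hold
        else:
            self.candidate = self.hold_start = None
        return self.summoned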

Figure 8.2: Example summoning gestures and manipulation gestures (in blue arrows) for different control types: Button (airtaps), dial knob (pinched rotation), switch (lateral thumb-tap), spinbox (lateral index, middle airtaps), paired buttons (vertical index, middle airtaps).

8.4.2 2. Disambiguate

While the interaction consists of four steps, the design focuses on making it as fluent as possible. This involves ensuring that the summoning gesture segues into manipulation seamlessly so as to complete the physical metaphor. The slider, for instance, starts with the thumb-index grip gesture that is then tightened to transition to manipulation. Similarly, the Button’s index finger-up gesture leads to the air-tap. Thus, the Disambiguate step should not alter the


summoning gesture pose. We propose two control disambiguation techniques that satisfy this criterion: Zoning and Tabbing.

Zoning: As illustrated in Figure 8.1, in Zoning, after the user summons a control type, the hand tracking zone within which the user can comfortably move her hand is divided according to the number and position of controls of the summoned type. For example, in Figure 8.1, summoning the slider type divides the zone into three parts, divided horizontally, each part associated with its slider. Similarly, for buttons, the zone is divided into three vertical zones. When the user performs a summoning gesture, the control in the zone in which the user’s hand resides is summoned. The user then moves her hand into the desired control’s zone, which summons it. The user can now manipulate the control. Once the user starts manipulation, zoning is turned off so that the user is not required to stay within a zone while manipulating.
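In essence, zoning reduces to mapping the summoning hand’s position within the comfortable tracking area to one of N equal zones. The sketch below illustrates this (an assumption-level sketch, not the study code; the axis used depends on how the controls are laid out, and out-of-range positions are simply clamped):

def zone_index(hand_pos_cm, zone_min_cm, zone_max_cm, n_controls):
    """Map a 1-D hand coordinate (e.g. height for stacked sliders) to a zone index.
    Positions outside the tracking range are clamped to the nearest zone."""
    span = zone_max_cm - zone_min_cm
    frac = (hand_pos_cm - zone_min_cm) / span
    frac = min(max(frac, 0.0), 1.0 - 1e-9)   # clamp, keep index < n_controls
    return int(frac * n_controls)

# Example: three sliders summoned over a 27.5cm vertical tracking range;
# a hand 20cm above the bottom of the range falls in the top slider's zone.
assert zone_index(20.0, 0.0, 27.5, 3) == 2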

Accidental Zoning could occur if the user’s hand accidentally stumbles into another zone while doing the manipulation gesture, such as gripping the slider bar. This is the classic Midas touch problem which frequently occurs in point & select interactions, especially when trying to select smaller targets. Here, since the comfortable interaction area around the user is divided into zones, each zone will be large enough to avoid accidental zoning unless there is a large number of controls of the same type. Study I and Study II shed light on this issue for a television sliders interface and a multi-button interface.

Tabbing: Even though Zoning does not ask the user to change their hand pose, since it involves using the same hand, there is a possibility that it interferes with the enter drag gesture for certain control types. Consequently, we propose an alternate Disambiguate gesture which involves Tabbing through the controls by repeatedly opening and closing the second hand. Each hand closing pose results in a tab through to the next control of the summoned type in the logical ordering on the interface (see the sketch below). This means that (a) the first hand is not required to move, and (b) the second hand only needs to change its pose without any movement, thus minimizing the first hand’s wobbling even further. While Tabbing insulates the interaction from Midas touch when disambiguating, it is a more tedious gesture that requires the second hand and tabbing through the controls one by one. Study I compares Tabbing and Zoning for quantitative and qualitative performance.
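The sketch below shows the core of the tabbing behavior: the focus advances on each open-to-closed transition of the second hand, not on the closed pose itself. This is illustrative only; the class and its wrap-around policy are our own simplification.

class Tabber:
    """Advance focus to the next control of the summoned type each time the
    second hand transitions from open to closed (edge-triggered, not level)."""
    def __init__(self, n_controls, start_index=0):
        self.n = n_controls
        self.index = start_index      # default focus, e.g. the bottom slider
        self.was_closed = False

    def update(self, second_hand_closed):
        if second_hand_closed and not self.was_closed:
            self.index = (self.index + 1) % self.n
        self.was_closed = second_hand_closed
        return self.index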

8.4.3 3. Manipulate

Control manipulation can be static, such as a button click, or dynamic, such as dragging the slider bar. Just like point & select for the mouse has a Dragging state distinct from its Tracking state [Buxton, 1990b], summon & select for the hand has a Dragging state distinct from its Summoned state, which the user needs to enter and exit explicitly. For button clicks, the enter drag-manipulate-exit drag transition happens all in one air-tap. For controls that involve dragging, such as sliders, an explicit enter-exit gesture pair ensures that accidental manipulation of a control is a remote possibility during Disambiguation. The gesture pair should be designed such that (a) the state transition is seamless, as already discussed, and (b) the gesture pair itself does not cause accidental dragging in the middle of performing it. As mentioned, the physical metaphor of tightening and loosening the grip on the object being dragged satisfies both requirements. For instance, the knob control in Figure 8.2 involves a three-finger summoning gesture followed by tightening the grip to rotate the knob. Static operations such as button clicks are less prone to (b). The blue arrows in Figure 8.2 indicate how different control types’ summoning gestures lead to manipulation.

Note that once the user starts manipulating a control, disambiguation is turned off, and to select a new control, the user needs to release and summon again. This allows the user to freely move their hand and position it comfortably while not in the dragging state.


Clutching

The explicit drag state enables clutching. For example, for video playback sliders, precise dragging might be an issue if the hand tracking zone is mapped to the entire duration of a very long video. We can instead map it to a smaller period and let users clutch by entering drag, dragging, and exiting drag repeatedly. Study I investigates how accurately users can position the bar at a specific value.
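The following sketch illustrates clutched, relative dragging with the 1cm-per-tick mapping used in Study I. It is a simplified illustration, not the study software; the class and its method names are ours.

CM_PER_TICK = 1.0     # 1cm of hand movement moves the slider by 1 tick (Study I mapping)

class ClutchedSlider:
    def __init__(self, value=50, lo=0, hi=100):
        self.value, self.lo, self.hi = value, lo, hi
        self.drag_anchor = None    # (hand x, slider value) at the moment drag was entered

    def enter_drag(self, hand_x_cm):
        self.drag_anchor = (hand_x_cm, self.value)

    def move(self, hand_x_cm):
        if self.drag_anchor is None:
            return self.value                      # not dragging: hand motion is ignored
        anchor_x, anchor_val = self.drag_anchor
        delta_ticks = round((hand_x_cm - anchor_x) / CM_PER_TICK)
        self.value = max(self.lo, min(self.hi, anchor_val + delta_ticks))
        return self.value

    def exit_drag(self):
        self.drag_anchor = None   # the hand can now reposition freely (clutch)

# Example: starting at 50, two clutched strokes of +4cm and +3cm land on 57.
s = ClutchedSlider()
s.enter_drag(10.0); s.move(14.0)   # value 54
s.exit_drag()                      # reposition the hand without moving the bar
s.enter_drag(0.0);  s.move(3.0)    # value 57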

8.4.4 4. Release

The gesture that releases the summoned control should be such that (a) it does not lead to accidental manipulation, and (b) it is not confused with summoning, disambiguate, or manipulation gestures, to prevent accidental release. We designate an open palm gesture with the summoning hand as the release gesture, which satisfies both requirements to a very high degree. The release gesture doubles as a quick Undo when the user summons the wrong control. Further, the open palm pose also acts as the perfect reset of the hand before beginning the next summoning gesture. Figure 8.3 shows the summon & select state machine.

Figure 8.3: Summon & select state machine (states: Idle, Summoned, Dragging; transitions: summoning gesture, Tabbing/Zoning, enter drag gesture, exit drag gesture, release gesture). The numbers correspond to Figure 8.1.
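A compact sketch of this state machine is shown below (illustrative only; the event names are ours, not a prescribed API). The trace at the end follows the steps of Figure 8.1.

# States: 'idle' -> 'summoned' <-> 'dragging', and 'summoned' -> 'idle' on release.
TRANSITIONS = {
    ("idle",     "summon_gesture"): "summoned",
    ("summoned", "tab_or_zone"):    "summoned",   # disambiguation keeps the state
    ("summoned", "enter_drag"):     "dragging",
    ("dragging", "exit_drag"):      "summoned",
    ("summoned", "release"):        "idle",
}

def step(state, event):
    """Return the next summon & select state; unrecognized events are ignored."""
    return TRANSITIONS.get((state, event), state)

# Example trace for Figure 8.1: summon, zone, grip, drag, ungrip, release.
state = "idle"
for ev in ["summon_gesture", "tab_or_zone", "enter_drag", "exit_drag", "release"]:
    state = step(state, ev)
assert state == "idle"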

8.5 Summon & Select: Advantages and Limitations

8.5.1 Advantages & Applicability

Point & select in mid-air is fatiguing due to multiple issues which summon & select overcomes by virtue of its design: 1) Navigational motion: Summon & select gets rid of the constant hand movement for navigating to desired controls. 2) Precise pointing & selection: To click a button, point & select requires the user to precisely position the hand over it and dwell/air-tap. Dwell requires the user to wait and keep the hand stationary, while air-tap requires the user to not displace from the button while performing the air-tap due to hand wobble. This is physically and mentally fatiguing, especially for smaller targets. In summon & select, once a target has been summoned, the user only needs to remain within the zone of a button (for Zoning), which would typically be much larger than the size of the button. For Tabbing, the user is free to move the hand anywhere in the tracking zone and click without losing the summoned target. 3) Out-of-bounds controls: The user’s hand can comfortably move only within a certain area relative to the body. If the user is positioned close to the center of the screen and the target is closer to the screen edges, the user needs to either stretch out the hand far out of the comfort zone or move the body itself. The problem is especially exacerbated with ultra-sized displays. 4) Out-of-tracking range:

A related issue is that hand tracking devices have limited tracking range, which might not be enough for precise control of a large display if mapped directly for point & select. Relative pointing is a possible solution; however,


there are no easy ways to clutch when using point & select in air. Summon & select can be comfortably performed within the hand comfort zone and the tracking range. Study II investigates whether these factors actually yield better performance and user experience for summon & select when compared with point & select.

Summon & select can be useful on large screen displays for applications that involve interacting with multiple controls, without restricting them to low-precision, big-button interfaces; for instance, video players, menu grids (such as game menus), and television interfaces (as in Figure 8.4 (left)). It can also be useful in other display scenarios such as car dashboards or virtual reality. Since summon & select with Tabbing requires minimal hand movement and is not affected by casual hand motion (unless it is in the dragging state), it can be useful in scenarios where the user is mobile (for instance, a user walking with an augmented reality headset).

8.5.2 Limitations

Current large screen interfaces are designed for point & select. Summon & select is limited in its capabilities to substitute point & select across all interfaces. First, while some control types can be adapted easily for summon & select, others like checkbox groups or date-pickers cannot. For both, there is no obvious summoning gesture aside from the index finger-up gesture, which is already reserved for the button. Further, their manipulation is entirely dependent on point & select. While such controls are used less often in large screen applications, it is a drawback. Second, summon & select will not work for large screen applications that involve selecting any random point on the interface. For instance, in maps, users can place pins at any point on the map. Given these constraints, summon & select can only replace point & select if the application is not constrained by the above two limitations. We discuss how summon & select can work alongside point & select in the Discussion section. An additional factor here is the use of basic semaphoric gestures on large screens, such as swiping. These gestures work at the scope of the entire screen, such as a left swipe leading to the next page. These can easily work together with summon & select as long as they are performed with an open palm or with poses that do not interfere with the summoning gestures of an application.

8.6 Prototype Implementation

To evaluate different aspects of summon & select, we built a prototype system consisting of a 60cm × 33cm screen and a Leap Motion for hand tracking. The interaction zone area is fixed at 50cm × 27.5cm to account for tracking range, hand-comfort bounds, and an appropriate resolution of mapping. We investigate two control types in our studies: Button and Slider. A blue focus frame around the control acts as visual feedback for the summoned slider. As the user zones or tabs to other sliders, the focus moves accordingly. Upon entering the drag state, the slider’s shading changes. Similar visual feedback is implemented for the button.

The hand gesture recognition was built on Leap’s hand model data such that the gestures were recognized correctly and there was minimal conflict between the various gestures and transitions. For example, the slider summon gesture was recognized when the thumb and index finger were stretched out and the other three fingers were curled in. The thresholds for stretched out and curled in were defined such that they struck a balance between the exact gesture and a more relaxed gesture. After multiple iterations on the algorithm, we conducted a pilot study with three users to investigate the gesture detection accuracy within lab conditions. After a 10 minute training and practice session where the users learned how to perform the gestures so as to extract maximum accuracy, each user performed 32 trials: 8 trials for summoning the button, zoning into one of nine buttons laid out on the interface, clicking it, and releasing it; 8 trials for summoning the slider, zoning into one of five sliders on the


interface, tightening the grip, dragging the bar to any value they wanted, loosening the grip, and releasing; and 16 similar ones with tabbing. We only looked at the accuracy of the detection of each gesture. 92.7% of trials were performed without any detection errors, which was deemed good enough for the studies. During the studies, we discarded the trials with erroneous detection and redid them. No major Midas touch issues were observed.
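As a rough sketch of this kind of threshold-based pose classification, consider the function below. The input is a generic per-finger extension measure; the field names and threshold values are illustrative assumptions, not those of our prototype or of the Leap SDK.

EXTENDED_MIN = 0.8   # normalized extension above which a finger counts as "stretched out"
CURLED_MAX   = 0.3   # ...and below which it counts as "curled in"

def is_slider_summon(extension):
    """extension: dict of normalized finger extensions in [0, 1], e.g.
    {'thumb': 0.9, 'index': 0.95, 'middle': 0.1, 'ring': 0.15, 'pinky': 0.2}."""
    stretched = all(extension[f] >= EXTENDED_MIN for f in ("thumb", "index"))
    curled    = all(extension[f] <= CURLED_MAX  for f in ("middle", "ring", "pinky"))
    return stretched and curled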

8.7 Study I: Slider Disambiguation & Dragging

We conducted a study to investigate the following factors: 1) Tabbing vs. Zoning performance: How fast can the user tab or zone to the right slider amongst five vertically adjacent sliders? Further, how is the transition from the summoning gesture to manipulation affected by Tabbing vs. Zoning? 2) Dragging performance: With clutching enabled, how quickly and accurately can the user drag the bar to the correct value? This might differ for Tabbing vs. Zoning because the user’s hand is not always centered in zoning.

8.7.1 Study Design

The study follows a within-subjects design with two independent variables, disambiguation technique (Tabbing, Zoning) and slider number (Figure 8.4 (left), sliders 1-5). The sliders, modeled on television settings, go from 0-100 and have a relative mapping where a 1cm hand displacement corresponds to 1 tick on the slider. 1cm is just enough for the user to accurately position the slider at an exact value in combination with clutching. In each trial, the participant is instructed to summon a particular slider and set it to a certain target value. The trial ends when the participant releases the slider. The next trial begins 3s after this release. The slider bar is repositioned at 50 at the start of each trial. The default focus at the start of each trial in the Tabbing condition is at the bottom slider (slider 1).

Figure 8.4: (left) Study I interface: Sliders are numbered 1-5 from the bottom. (right) Study II interface: Buttons are numbered 1-3 from top to bottom on the left and 4-6 on the right. The original screens had more free space around these snapshots, which have been cropped here for space.

12 right-handed participants (mean age = 23, range: 21-26), none of whom had experience with freehand interaction, took part. The disambiguation technique was counterbalanced among participants. Participants were introduced to the gestures initially and allowed to play and practice with them. Before each technique, participants performed four practice trials. Each participant performed two trials per slider per technique, with a different target value for each of the two trials. The two values remained the same across sliders. The ordering of sliders and the target values was randomized. Participants were instructed to be as fast and accurate as possible. For Zoning,


participants were asked to rest their elbow on the table after every trial, which brings them into slider 1’s zone. However, participants with longer forearms would sometimes overshoot into an upper zone. The study lasted 30min. Participants were given a 2min break between the techniques. In total, we had 12 PARTICIPANTS × 2 TECHNIQUES × 5 SLIDERS × 2 REPETITIONS = 240 total trials.

8.7.2 Results

Midas Touch

Of the 120 Zoning trials, 8 trials resulted in zoning to an incorrect SLIDER (6.67% error). These were a result of accidental zoning when the participant tried to do the enter drag gesture with a strong jerk, which switched the hand’s zone. For Tabbing, 3 of 120 trials resulted in tabbing to an incorrect SLIDER. Apart from this, no other Midas touch problems were observed.

Tabbing vs. Zoning Performance

A user can potentially reach the correct SLIDER multiple times while tabbing or zoning before starting to manipulate it. We measured the reach time for the intended slider starting at the time of instruction until the final instance the user reaches the correct SLIDER. So, if the user passes over the desired control more than once, the time for the last such instance before they start manipulation is used. If the user selects the incorrect SLIDER, the trial is discarded. A two-way repeated measures ANOVA with Greenhouse-Geisser correction showed a significant interaction effect of TECHNIQUE and SLIDER on the reach time (F(1.891, 44) = 4.927, p < .01, η2p = .309). Figure 8.5 shows the mean reach times: 0.1s, 1.8s, 2.4s, 3.0s and 3.4s for tabbing to sliders 1-5, and 1.1s, 2.9s, 1.8s, 2.9s and 2.2s for zoning to sliders 1-5.

For SLIDER 1, Tabbing’s reach time is faster. This is confirmed by posthoc tests for SLIDER 1 (F(1, 11) = 11.262, p < .01, η2p = .506). In fact, Tabbing’s reach time is almost zero since it is the default tab. Even for Zoning, most participants’ hands started in SLIDER 1’s zone, but a few of them started in different zones. The reach times are comparable for sliders 2, 3, and 4. For SLIDER 5, Zoning is faster (F(1, 11) = 14.104, p < .01, η2p = .562), partly because tabbing 5 times takes longer, but also because it is easy to reach the top zone quickly without many hand adjustments. The results show that after a certain number of controls of the same type, zoning will start to outperform tabbing, but not immediately. However, this depends on the arrangement of the controls as well. We explore a 2D arrangement in the next study.

Figure 8.5: Study I: Mean reach times for Tabbing and Zoning for the five sliders. Tabbing is faster for SLIDER 1, while Zoning is faster for SLIDER 5. Error bars are 95% CI.


Seamless Transition

The transition time is measured starting from the last time the user reaches the target SLIDER control until the user performs the enter drag gesture. We found that Zoning had a significantly faster transition (mean = 0.87s, 95% CI [.77, .97]) than Tabbing (mean = 1.46s, 95% CI [1.05, 1.87]) (F(1, 11) = 13.917, p < .01, η2p = .559). This was expected since the user can zone and do the drag state gesture in one single flow with the same hand.

Dragging Performance

We measure the dragging time starting from the point at which the drag state is entered to when the control is released. No significant main or interaction effects were found. Dragging took a mean time of 6.7s (95% CI [5.1, 7.8]) for Tabbing and 7.8s (95% CI [6.5, 9.1]) for Zoning. Although the means differ, the difference is not significant. However, one participant mentioned the dragging-while-zoning issue: “Sliding with my hand in a high or low position was tougher.” Overall, 7 participants preferred Zoning, while 5 preferred Tabbing.

Participants performed dragging highly precisely. The mean difference error across all 240 trials was 0.17. (Three outliers with a difference >25 were removed.) This shows that clutching was effective. Further, it shows that accidental manipulation due to ungrasping is not an issue. Participants performed an average of 3.1 clutches, which implies that they clutched every 8.3cm on average. Even though participants can easily move their hand within a range of 25cm, they chose to perform smaller movements with more clutching.

Summary

In summary, (a) Tabbing’s summoning speed is similar to Zoning’s for vertically distributed controls. Tabbing is obviously faster for the default control, while Zoning is faster for the control with the fifth tab. Tabbing vs. Zoning is governed by the number of controls and their positioning. (b) Further, the transition to drag with Zoning is significantly faster than with Tabbing. (c) Even though sliding in low or high positions was considered tougher, dragging times for Zoning and Tabbing are similar. (d) Participants can perform dragging with very high accuracy and prefer clutching in air over moving a larger distance. (e) Dragging is sufficiently immune to accidental manipulation from the enter/exit drag gestures.

8.8 Study II: Summon & select vs Point & Select

Study I showed that Tabbing and Zoning have similar performance for five vertically aligned slider controls. It also showed that the summon & select interaction works well without Midas touch issues. In Study II, we see if summon & select can outperform point & select for a multi-button selection task. We investigate three specific metrics: 1) Reach time: the time it takes to reach the desired button just before clicking. We hypothesize that the reach time will be affected by the technique, with pointing being the slowest and zoning being the fastest. 2) Selection time: the total time it takes to select the desired button. We hypothesize that this will be affected both by technique and button size. 3) Qualitative aspects. In addition, we also hope to see how a different layout of controls, which involves both horizontal and vertical movement, affects Tabbing and Zoning.

8.8.1 Study Design

The study followed a within-subjects design with three independent variables: technique, button size, and button direction. The three techniques were summon & select with Tabbing vs. summon & select with Zoning vs. point


& select. The interface consists of six buttons, with three different sizes and in two directions (Figure 8.4 (right)). For Zoning, six zones are formed from two vertical and three horizontal divisions.

The button sizes are 2.25cm², 4.5cm², and 9cm². For pointing, the screen (60cm × 33cm) is mapped to the tracking zone (50cm × 27.5cm) centered at the center of the screen. The setup was designed such that the user can comfortably reach the buttons, which are placed 20cm from the center on either side. For button clicks in pointing, we use airtaps, the same as in summon & select. Earlier work [van de Camp et al., 2013] suggests that when pointing, once the pointer is over a button, users overwhelmingly prefer the airtap gesture for button clicks over eight other alternatives (dwell, grab, etc.). For Zoning, the same 50cm × 27.5cm area is mapped to six equal zones corresponding to each button. The default focus for Tabbing is at Button 1, which is small. Button 6, the other small button, is last in the tabbing order.

12 right-handed participants (mean age = 24.1, range 20-28), all different from the prior studies, took part. The technique was counterbalanced amongst participants using a Latin square design. Each trial starts with the user’s hand in the center of the tracking zone, directly above the Leap Motion, with the pointer visible only in the Pointing condition. The user is instructed to select a particular button. The user summons and tabs/zones, or reaches over a button depending on the condition, then clicks it, and finally performs the release gesture to end the trial. The user is then instructed to bring the hand back to the center, after which the next trial starts. The selection time data from incorrect button selections is discarded. In pointing, the user might airtap multiple times on screen areas without any button. These are recorded as missed clicks.

There are 4 trials for each of the 6 buttons per technique. The ordering of the 24 trials for each technique is randomized. Participants practiced the interactions for 10mins and did four practice trials before every technique. At the end, a Likert scale questionnaire was given and a brief interview was conducted. The study lasted 40mins. In total, we had 12 PARTICIPANTS × 3 TECHNIQUES × 3 BUTTON SIZES × 2 DIRECTIONS (left/right) × 4 REPETITIONS = 864 button clicks.

8.8.2 Results

Reach Time

For Zoning and Tabbing, the reach time is measured the same way as in Study I, starting from the instruction until the control is in focus just before clicking it. For pointing, the reach time is the first time the pointer reaches over the target. The pointer can leave and enter the button area repeatedly in an attempt to click it, but the reach time is when the user reaches it the first time. A three-way repeated measures ANOVA showed an interaction effect of TECHNIQUE and DIRECTION (F(2, 22) = 17.300, p < .001, η2p = .611). No effects of BUTTON SIZE were found. Figure 8.6 shows the reach time for the three TECHNIQUES for both DIRECTIONS: mean 2.4s/2.0s for pointing, 1.5s/2.6s for tabbing, and 1.9s/1.9s for zoning (left/right button). Because of the higher tabbing requirement, Tabbing on the right is slower than both (pairwise comparison with pointing: F(1, 11) = 5.113, p < .05, η2p = .317; with zoning: F(1, 11) = 18.792, p < .01, η2p = .631). However, Tabbing outperforms Pointing on the left (F(1, 11) = 20.895, p < .01, η2p = .655) and is clearly benefited by the default focus and low tabbing requirement. Zoning performs similar to Tabbing on the left. Although their means are different, Pointing and Zoning did not differ significantly in their reach times in either DIRECTION. This is because the buttons were easily reachable from the center. A variable distance study could shed light on whether Zoning improves reach time.

In summary, 1) Tabbing has a better reach time than pointing for up to three controls. 2) Pointing has a better reach time than tabbing for more than three controls. 3) Zoning’s reach time is highly consistent in both DIRECTIONS and is not outperformed in any of the cases, including by tabbing for three or fewer controls.

Figure 8.6: Study II: Reach time per TECHNIQUE for left and right buttons. Error bars are 95% CI.

Selection Time

The total selection time is logged starting from the instruction prompt. For the summon conditions, selection time is measured until the completion of the release gesture. For pointing, it is measured until the button click, since release is not a part of the point & select interaction and only serves to end the trial. A three-way repeated measures ANOVA showed significant main effects of TECHNIQUE (F(2, 22) = 8.328, p < .01, η2p = .431) and BUTTON SIZE (F(1.177, 12.947) = 5.548, p < .05, η2p = .339), and significant interaction effects of Technique × Button size (F(1.912, 21.030) = 4.723, p < .01, η2p = .300) and Technique × Direction (F(1.294, 14.233) = 17.018, p < .001, η2p = .607). (The last three are with Greenhouse-Geisser correction.) The effects of DIRECTION can be attributed to the inclusion of reach time in selection time. The pairwise significant effects in reach time pertaining to DIRECTION are mirrored in selection time. We delve deeper into the Technique and Button size effects.

Figure 8.7 shows the selection time for the three TECHNIQUES for each BUTTON SIZE (large, medium, small): mean 3.2s/3.5s/4.6s for pointing, 3.4s/3.5s/3.5s for tabbing, and 2.6s/2.7s/2.8s for zoning. Posthoc tests with Bonferroni correction show that Zoning has a significantly lower selection time than both Pointing and Tabbing across all BUTTON SIZES (p < 0.05 for both). There were no significant differences between Tabbing and Pointing for any of the BUTTON SIZES. Expectedly, pointing speed for the small button was significantly slower than for the other two.

Figure 8.7: Study II: Mean selection time per BUTTON SIZE per TECHNIQUE. Error bars are 95% CI.

Considering that the selection time for Zoning and Tabbing included the release gesture, the across-the-board faster selection time is a very strong argument in favor of Zoning. Considering that the reach times for Pointing and Zoning were not significantly different, it implies that users took more time air-tapping in pointing, even for the largest button. This affirms our earlier contention that precise selection is a problem for pointing which is alleviated by summon & select with Zoning.


As mentioned earlier, some buttons could auto-release by opening up a new window. Consequently, we take a look at how the three TECHNIQUES compare when the selection time for Tabbing and Zoning is also measured until the button click. Figure 8.8 shows the means: 3.2s/3.5s/4.6s for pointing, 2.7s/2.6s/2.8s for tabbing, and 1.9s/1.9s/2.0s for zoning. In this case, the difference between Pointing and the summon TECHNIQUES is starkly visible. In addition to the earlier effects, posthoc tests with Bonferroni correction show that Tabbing’s time for clicking is also significantly faster than Pointing’s (p < 0.01).

Figure 8.8: Study II: Mean click time per BUTTON SIZE per TECHNIQUE. Error bars are 95% CI.

Errors

We also calculated the incorrect button selection error percentage for every TECHNIQUE. Out of 288 total selections per TECHNIQUE, Zoning had 14 errors (4.8%), Tabbing had 18 (6.2%), and Pointing had 1 (0.3%). The errors were mostly due to users trying to do the task quickly and making mistakes as a result. For Pointing, trying to do the tasks quickly only resulted in missed clicks and not incorrect button selections. Therefore, missed clicks were not counted as errors since they did not result in an incorrect selection. While the difference in errors is not significant, the differing values show that even though missed clicks are much more prevalent in Pointing, incorrect selections might be more common in Zoning and Tabbing. However, most of these errors were a result of confident users trying to do the task very quickly, and can therefore be minimized by the user. Another way to reduce errors would be to improve hand tracking such that it is tolerant to very quick transitions.

Qualitative Results

Figure 8.9 shows the Likert scale response box-plot. Friedman tests show significant differences between the techniques on mental demand (χ2(2) = 8.296, p < 0.05), physical demand (χ2(2) = 11.617, p < 0.01), and frustration (χ2(2) = 18.476, p < 0.001). Posthoc Wilcoxon signed-rank tests with Bonferroni correction showed that mental demand for Zoning was significantly lower than for Tabbing (Z = −2.06, p < 0.05) and Pointing (Z = −2.392, p < .05). Further, physical demand and frustration for both Zoning and Tabbing were significantly lower than for Pointing (Z = −2.946, p < .01; Z = −2.614, p < 0.01; Z = −3.082, p < .01; Z = −2.85, p < 0.01).

Participants uniformly liked the concept of summoning. One participant with prior experience with the Kinect said, “I like that I can keep my hand anywhere and still do what I want.” 10 of 12 participants ranked Zoning as the most preferred TECHNIQUE for real-world use, citing the use of two hands in Tabbing as a significant issue. Only two users preferred Tabbing, while Pointing was preferred by none. Compared to the sliders in Study I, here Zoning is heavily preferred. This could be because sliding in a high/low position was difficult or because tabbing through six buttons is more cumbersome.

Participants who preferred tabbing suggested it could be useful in gaming: “It felt like a game to tab quickly and accurately. Tabbing is good for games where it’ll be more immersive. Zoning can be used for public displays where the second hand is not free and people have less time.” One participant suggested a less physically demanding gesture for tabbing: “I liked the second hand one for buttons 1,2,3 because it only required hand gestures, not movement of hand. But it gets too much to make a fist again and again 5-6 times. Can it just continue tabbing when the fist is made and then stop when I open the fist?” The Likert scale responses and qualitative feedback both indicated that participants felt less physically and mentally fatigued when using summon & select.

However, as the box plot shows, even though it is better than point & select, summon & select is still fatiguing to a certain extent. Summon & select does not require the user to focus while selecting small targets, but users initially felt that remembering the gestures and their transitions was mentally demanding. This got easier once they got used to it during practice. Consequently, the barrier of entry for point & select is much lower than for summon & select.

Figure 8.9: Study II: Qualitative results (mental demand, physical demand, frustration; 7-point Likert scale) for button selection.

In summary, (a) Zoning performance is consistent across BUTTON SIZES and directions and outperforms Pointing and Tabbing for all BUTTON SIZES. (b) Tabbing performance is consistent across BUTTON SIZES but dependent on the tab ordering of the buttons. This makes it faster than Zoning for buttons 1-3, but slower than Pointing for buttons 4-6. (c) Both Zoning and Tabbing had lower frustration and physical demand than Pointing. Zoning also had less mental demand than Pointing and Tabbing. (d) Pointing was slowest and least preferred, but it resulted in the least amount of errors.

8.9 Haptic Feedback

One piece of feedback that we received from multiple participants was that the interaction involves paying close attention to the visual feedback, especially during dragging when the user repeatedly grips and releases the slider bar. In such a scenario, they suggested it would be beneficial to have other modes of feedback, such as audio or haptic. Since the audio channel can be engaged elsewhere, we conducted a preliminary study to explore how haptic feedback can aid users in performing dragging in summon & select.

We designed the haptic feedback such that, in addition to confirming user actions, it also gives the user a hint of physically manipulating the control. Therefore, we used miniature vibrotactile rings similar to [Gupta et al., 2016b] on the tips of the thumb and index finger. The rings are connected to a driving circuit placed on a wristband that communicates with the application via Bluetooth. Upon summoning, a 150ms pulse is played in both rings to indicate that the slider is summoned. When the user enters the drag state, a continuous pulse starts playing in both rings to mirror the grip on the slider bar. The pulse stops upon exit from the drag state. To reduce any perceived irritability from the vibration, the amplitude was set just above the perceivable level and the frequency was set at 350Hz. The 150ms pulse is played again upon release.
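A sketch of this feedback logic is shown below. The transport to the wristband’s driver circuit is abstracted behind hypothetical send_pulse and set_continuous callbacks; the class itself is illustrative, not our study code.

PULSE_MS = 150   # confirmation pulse duration on summon and on release
FREQ_HZ  = 350   # vibration frequency used in the study
# The amplitude was set just above the perceivable level (handled by the driver).

class HapticFeedback:
    def __init__(self, send_pulse, set_continuous):
        # send_pulse(duration_ms) and set_continuous(on) are hypothetical callbacks
        # that forward commands to both finger rings over Bluetooth.
        self.send_pulse = send_pulse
        self.set_continuous = set_continuous

    def on_summon(self):
        self.send_pulse(PULSE_MS)

    def on_enter_drag(self):
        self.set_continuous(True)    # continuous pulse mirrors gripping the slider bar

    def on_exit_drag(self):
        self.set_continuous(False)

    def on_release(self):
        self.send_pulse(PULSE_MS)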


8.9.1 Study design

The study followed a within-subjects design to compare user performance and experience on dragging with and without haptic feedback. The interface consists of a single slider which the user is instructed to summon and then drag the bar to a target value, similar to Study I. Each participant performed 10 trials, 5 each for haptic and no-haptic feedback. Five predefined target values, the same for both conditions, were presented in a random order. We recorded the total time for a trial starting from the instruction until the release. Visual feedback was provided in both conditions.

12 different right-handed participants (mean age = 23.36), none of whom had experience with freehand interaction, took part. The conditions were counterbalanced. Participants wore the rings in both conditions to reduce bias due to inconvenience from the wires and setup. Before each condition, the participants performed four practice trials. A short interview was conducted at the end. The study lasted 30mins. In total, we had 12 PARTICIPANTS × 2 HAPTIC × 2 SLIDERS × 5 REPETITIONS = 240 total trials.

8.9.2 Results

The mean total time per trial was 7.13s (95% CI [6.1, 8.2]) for the HAPTIC condition and 6.07s (95% CI [5.3, 6.9]) for NO HAPTIC. No significant effect of haptics was found. However, 8 of 12 PARTICIPANTS preferred the HAPTIC condition. Participants liked the haptic confirmation: “It just confirms that I’m doing it correctly, so I’m not focusing on the shading color changes every time I release it and grab it.” Multiple PARTICIPANTS shared similar sentiments, where they felt they could relax with the haptic feedback and not have to focus intently while doing the sliding. The relaxed focus could also mean that the PARTICIPANTS did not do the interaction as quickly as they could have. Although the difference is not significant, this might explain the HAPTIC condition’s higher mean time. Participants also mentioned the physical feeling: “It’s like I’m really holding something. I think I can do the task without visual feedback.” However, some PARTICIPANTS did not like the sensations and wanted them to be more realistic: “It’s weird. Maybe I could just feel discrete notches go by. I don’t like the continuous sensation.”

In summary, haptic feedback shows promise for summon & select, and further studies should focus on improving the haptic actuation, and on whether haptic feedback alone, without any visual feedback, can enable similar performance, thus allowing for a less visually engaged interaction.

8.10 Discussion & Future Work

The results show that summon & select does not suffer from Midas touch issues, enables pinpoint dragging accuracy, and is agnostic to button sizes. When used with Zoning, it is faster on selection time, has lower physical and mental demand, and is preferred over pointing. Consequently, it offers a realistic solution to point & select’s problems of constant navigation, precise pointing, and out-of-bounds controls, and lowers physical and mental fatigue for the user. Zoning is suitable for most scenarios, enables a faster, seamless gestural interaction, and can accommodate a large number of controls. However, Tabbing could be useful when the number of controls is small and two-handed interaction is not an issue. Plus, as mentioned earlier, it can be useful in mobile scenarios.

8.10.1 Integration with Point & select

While the merits of using summon & select in certain scenarios are clear, it does not apply to all large screen applications, as discussed earlier. Consequently, it needs to be explored how summon & select can work alongside

Page 125: XTENDED H A T I , T OUTPUT AND TOUCHLESS INTERACTION€¦ · three attributes for three different device scenarios respectively - touch input for small touchscreen devices, touch

CHAPTER 8. EASING FREEHAND MANIPULATIVE GESTURES BY MOVING BEYOND POINT & SELECT 114

point & select. A crude way would be to have summon & select as the default for some apps and point & select forthe others and the user can switch the defaults according to their preference using an explicit gesture. However,it would be better to integrate summon & select in the current interfaces as a shortcut alternative similar to aright click menu or keyboard shortcut. For instance, the user navigates the pointer with an open palm and if theon-screen pointer is not over a clickable area, the user can perform summon & select starting with the summoninggesture. If the pointer is over a clickable area, then the hand gestures are not recognized as summoning gesturessimilar to how right click does not work on controls. In such a scenario, integration with global semaphoricgestures such as swipes needs to be investigated. This is a topic for future exploration.
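The sketch below illustrates this dispatch rule. It assumes a simplified event loop in which each recognized hand pose arrives with the current on-screen pointer position; the control description, the set of summoning poses, and the returned action labels are illustrative assumptions rather than the prototype's actual implementation.

    # Hypothetical control description: (name, x, y, width, height) on the large display.
    def control_under(pointer, controls):
        """Return the control whose bounds contain the pointer, if any (simple hit test)."""
        px, py = pointer
        for name, x, y, w, h in controls:
            if x <= px <= x + w and y <= py <= y + h:
                return name
        return None

    SUMMONING_POSES = {"fist", "pinch"}   # illustrative poses assumed to act as summoning gestures

    def handle_hand_event(pointer, hand_pose, controls):
        """Decide whether a freehand event is point & select, summon & select, or ignored."""
        target = control_under(pointer, controls)
        if target is not None:
            # Pointer is over a clickable control: ordinary point & select applies,
            # and summoning gestures are not recognized here.
            return ("point&select", target)
        if hand_pose in SUMMONING_POSES:
            # Pointer over empty space: the summoning pose starts summon & select.
            return ("summon&select", hand_pose)
        # Anything else is ignored, avoiding Midas touch activations.
        return ("ignore", None)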

8.10.2 Generalizability and Learnability

Because summon & select is independent of target size and is less dependent on distance, we believe that the results will apply to interfaces with a larger number of buttons and sliders. However, for interfaces that consist of a larger diversity of controls with different summon gestures, the cognitive load might be higher, which could affect performance when compared to point & select. This needs to be investigated in detail. Summon & select also needs to be investigated for an end-to-end real-world application that consists of multiple control types and more complex interaction sequences. With more control types, it will be worth exploring how to incorporate learning-while-doing mechanisms for the gestures. These could be, for instance, iconic representations beside the controls, or animated representations invoked when the user hovers over the controls.

8.11 Conclusion

We introduced summon & select, a fast, low-fatigue freehand interaction model for interface controls in mid-air. To this end, we described its design, how it overcomes challenges including the Midas touch problem, and its conceptual advantages and limitations relative to point & select. We conducted two studies. The first study showed that the interaction can be performed without Midas touch issues, that the drag state enables highly precise dragging, and that Zoning outperforms Tabbing as the number of controls gets higher. In a second study, we compared summon & select to pointing and showed that it outperforms pointing both quantitatively and qualitatively. We ended with a discussion, suggestions for future work, and a preliminary investigation of haptic feedback in summon & select. Point & select was never designed to be performed in mid-air. We believe summon & select is a significant step towards offering a compelling alternative.


Chapter 9

Conclusion

This dissertation explored the use of three different hand attributes for interactions in different scenarios. Figure 9.1 shows the three hand attributes we investigated. Our work provides evidence that utilizing the hand in novel ways can lead to interactions that enable novel interfaces and make existing tasks faster and more efficient. We conclude with a summary and a discussion.

Figure 9.1: The three hand attributes being investigated. Chapters 3, 4: Distinct Fingers (Blue); Chapters 5, 6: Tactile Sense on the Wrist (Purple); Chapters 7, 8: Hand Dexterity (Yellow)


9.1 Summary and Future Directions

Our work shows the utility of using the hand attributes for novel interactions. Chapters 3 and 4 investigated distinct fingers for touch input on small screens. Chapters 5 and 6 investigated the use of the hand's tactile sense for touch output in wearables. Chapters 7 and 8 investigated hand dexterity for touchless interaction in air. We now summarize our contributions from each chapter.

In Chapter 3, we propose porous interfaces, which enable small-screen multitasking using window transparency and finger identification. We define their primary characteristics and design the window setup helper features. We build custom hardware that performs finger identification with 99.5% accuracy, which involves detecting subtle finger movements. We build an end-to-end porous interface with fidelity to the existing smartphone interface and demonstrate usage scenarios using nine demo applications. We report on a detailed qualitative study which reinforced the usefulness and ease of porous interfaces. Two instances where users have to switch apps most frequently are content transfer and attending to notifications. The beat gesture and the vanishing notification directly addressed these two scenarios and received highly enthusiastic feedback from the participants.
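As a concrete illustration of the organizing principle, the sketch below routes an identified touch to one of the two overlapping apps in a porous interface. The index-to-foreground and middle-to-background assignment is an assumption made for the example, not necessarily the exact mapping used in the prototype.

    # Illustrative finger-to-layer assignment for a porous interface;
    # the actual mapping in the prototype is a design choice.
    FINGER_TO_LAYER = {
        "index": "foreground",    # index finger operates the app on top
        "middle": "background",   # middle finger reaches the app showing through the transparency
    }

    def route_touch(finger, touch_point):
        """Return (layer, touch_point) for an identified touch, or None to ignore it."""
        layer = FINGER_TO_LAYER.get(finger)
        if layer is None:
            return None           # unidentified or unassigned fingers are ignored in this sketch
        return (layer, touch_point)

    # route_touch("middle", (120, 340)) -> ("background", (120, 340))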

In Chapter 4, we shift our focus to even smaller wrist wearable touchscreens and how they can benefit from finger identification. We propose DualKey, a novel technique for miniature screen text entry using finger identification. We report on a comprehensive 10-day study that shows that DualKey outperforms existing smartwatch text-entry techniques on long-term speed and error rates and is comparable on novice performance. We then optimize the DualKey keyboard layout and propose a new Sweqty layout based on a quantitative latency & error analysis of the study results. We report on a second 10-day study of Sweqty which shows that DualKey Sweqty outperforms DualKey Qwerty on both short-term & long-term performance.
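The core idea behind DualKey, described in Chapter 4, is that each key carries two characters and the identified finger disambiguates between them, so every character needs only a single tap. The sketch below illustrates that lookup; the key groupings and the index/middle assignment shown here are simplified placeholders, not the actual DualKey or Sweqty layouts.

    # Stand-in layout fragment: each key holds two characters,
    # selected by which finger performs the tap.
    EXAMPLE_LAYOUT = {
        "key_qw": ("q", "w"),
        "key_er": ("e", "r"),
        "key_ty": ("t", "y"),
    }

    def character_for_tap(key_id, finger):
        """Resolve a single tap into a character using the identified finger."""
        first, second = EXAMPLE_LAYOUT[key_id]
        return first if finger == "index" else second

    # character_for_tap("key_er", "middle") -> "r"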

In Chapter 5, we focus on enhancing another aspect of wrist wearables: their haptic output. We propose HapticClench, a system for generating squeezing sensations on the wrist. We formalize the concept of squeezing sensations and outline its differences from the compression sensation. We describe the design challenges and build a miniature wearable prototype using shape memory alloy springs. We analyze the load properties and report on three psychophysical evaluations of wrist squeezing using HapticClench. These give us the baseline detection thresholds for active & distracted use, and the JNDs, which show an improved Weber fraction over compression feedback. We report on a pattern recognition study for a three-spring setup on the wrist which shows a >90% recognition accuracy after removing ambiguous patterns. We report on a study that investigates slow squeezing over 30s/1min durations for conveying information ambiently. Results show that staggered squeezing pulses perform better than continuously increasing squeezing. We further demonstrate other squeezing applications of SMAs such as squeezing rings on the finger and tightening loose bracelets on the skin.
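For reference, the Weber fraction mentioned above relates the just noticeable difference to the baseline stimulus intensity. In its standard formulation (stated here generically, not with HapticClench's measured values):

    \[
      k = \frac{\Delta I}{I}
    \]

where \Delta I is the just noticeable change in squeezing intensity and I is the reference intensity; a smaller k indicates finer discriminability, which is what an improved Weber fraction over compression feedback implies.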

In Chapter 6, we take a step further from using wrist wearable haptics as simple feedback to using it as an end-to-end display for the skin, a la visual displays. We introduce the concept of direct manipulation for tactile displays and define DMTs - Direct Manipulation-enabled Tactile displays. To this end, we deconstruct the elements of a display and adapt them for tactile displays: tactile screen, tactile pixels, tactile pointer, and tactile targets. We describe the tactile indirect pointing interface that enables the user to track system state and perform user actions: tactile pointing, selection, execution, and drag & drop. We implement a proof-of-concept DMT that uses a combination of phantom sensations and frequency discrimination to generate a controllable pointer over a seemingly continuous tactile screen around the wrist using just 4 actuators, and to simulate targets as distinct from voids. We define exploratory localization for DMTs and report on a study of its precision limits for our DMT, which is reasonably high at a count of 19 tactile targets around the wrist. We report on a second study of the performance dependencies of target acquisition in DMTs and validate its adherence to Fitts' law. The study shows target position on the wrist to be an additionally significant factor for performance. Based on the dependencies, we derive a performance model for our DMT. We report on a third study of a DMT menu app with 4 or 8 items for target execution and drag & drop performance. With < 5 minutes of visual aid, not only were participants able to perform tactile direct manipulation in a tactile menu without visual aid, they improved upon their speeds while preserving an accuracy of > 90% for target execution. We delineate 10 guidelines to aid in the future design and application of DMTs.
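The tactile pointer in our DMT relies on phantom sensations between neighbouring actuators. One common formulation from the phantom-sensation literature, used here purely as an illustration and not necessarily the exact mapping in our implementation, places a virtual point between two adjacent actuators by weighting their amplitudes with an energy model:

    import math

    NUM_ACTUATORS = 4   # evenly spaced around the wrist in this sketch

    def actuator_amplitudes(pointer_position, intensity=1.0):
        """Map a pointer position in [0, 1) around the wrist onto per-actuator amplitudes.

        Energy-based phantom-sensation model: the two actuators adjacent to the
        virtual point are driven with amplitudes sqrt(1 - beta) and sqrt(beta)."""
        amplitudes = [0.0] * NUM_ACTUATORS
        scaled = pointer_position * NUM_ACTUATORS
        lower = int(scaled) % NUM_ACTUATORS
        upper = (lower + 1) % NUM_ACTUATORS
        beta = scaled - int(scaled)          # relative position between the two actuators
        amplitudes[lower] = intensity * math.sqrt(1.0 - beta)
        amplitudes[upper] = intensity * math.sqrt(beta)
        return amplitudes

    # actuator_amplitudes(0.375) drives actuators 1 and 2 to create a point between them.

With four actuators spaced around the wrist, sweeping pointer_position from 0 to 1 moves the perceived point continuously around the band, which is what makes the seemingly continuous tactile screen described above possible.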

In Chapter 7, we focus our attention on how wearables can help solve the problems of touchless interaction. We conduct the first exploration into haptic learning of semaphoric gestures. We design freehand finger tap gestures and haptic rings. We report on a two-day study with 30 participants which shows that haptic learning, in combination with either audio or visual command stimulus, can be successfully used for associative learning of freehand finger tap gestures and command shortcuts. The results show that gestural learning can be integrated across multiple contexts, including on-the-go scenarios with minimal visual engagement.

In Chapter 8, we focus on how touchless interaction can move beyond the point & select paradigm. We propose summon & select, a novel freehand technique to manipulate interface controls on large, distant displays. Summon & select addresses the problems of fatigue and control precision in mid-air point & select. We define the four-step interaction and its design elements, designed so as to overcome the Midas touch challenge. We implement a prototype and report on a study that shows that users are able to manipulate sliders to a high degree of precision. We conduct a second study to compare it against point & select for a multi-button interface. Results show that summon & select is faster than point & select across button sizes and has lower physical & mental demand.

9.1.1 Future Directions

While the future work pertaining to the individual chapters is included in the relevant chapters, here we list some overall directions to pursue.

Porous Interfaces and DualKey demonstrate the utility of identifying distinct fingers to solve problems inherent in small touchscreens. In particular, it helps to solve the problem of multi-step interactions by collapsing them into a single step. While DualKey enables single-step key entry for smartwatch typing, porous interfaces enable single-step task switching for multitasking between app pairings on a smartphone. Future work in this space can follow one of many threads: 1) To take full advantage of our distinct fingers, we need finger identification to work on the entire screen with low latency. There are two approaches to do this: instrumenting the finger (or using biosensing), and instrumenting the touchscreen. 2) Investigating interactions that involve using all five fingers distinctly. 3) Investigating other multi-step problems and tasks on small touchscreens that can benefit from finger identification, and further investigating it on other small touch devices with space constraints such as rings or head mounted displays.

The central theme of our work in wearable haptics is expanding the role of haptics in interactions, given the immense opportunity that wearables offer. Future explorations can aim for three types of investigations: 1) Expanding the types of haptic feedback that wearables can enable. For tactile feedback, these are variations of light touch, vibration, pressure, pain, itch, and temperature; for kinesthetic feedback, these are variations of weight, shape, and texture. 2) Moving beyond haptics' use as assistive feedback and informational patterns to making it more central to the interaction. Our work on direct manipulation is one instance of such an investigation. 3) More applications where haptic substitutes or additions to visual or audio interfaces can enable better on-the-go use of certain applications. For instance, our work on haptic learning demonstrates that gestural learning does not need to be restricted to the visual domain. These explorations can focus on different wearables on different parts of the body, ranging from rings, watches and head-mounted gear to clothes and jewelry.

While in-air interactions are being intensively explored right now, a shift away from thinking of point & select as the operating model would be beneficial. Again, the investigations will be highly dependent on the display mechanisms as well as the scenarios of use. For instance, large screens, augmented reality, virtual reality, or even small- or no-screen scenarios such as smart speakers or wearables will all benefit from touchless interactions. Further, for semaphoric gestures, haptic feedback for other types of gestures needs to be studied. Specifically, learning-while-doing of mid-air gestures should be explored.

9.2 Congruent Themes

9.2.1 From Peripheral to Central

This thesis spans a range of input and output techniques. However, aside from the specific research contributions, there is one observation which is reflective of all of them: conceptualizing novel techniques and capabilities as a central part of the interaction instead of something peripheral. DMTs show that if we consider haptics as central to the interaction, instead of just using it for feedback or notifications, it can lead to some interesting directions. Similarly, with finger identification, we saw how using it as an organizing principle for the interface can be useful. We have many new capabilities and techniques that show promise, as well as a range of new usage scenarios spanning different screens and contexts. However, we often imagine those new scenarios being driven by the same interaction concepts that were originally intended for use in other contexts. Point & select in mid-air is one prominent example of this, and the new capabilities are only presented as augmentations to optimize or extend the existing interactions. While novel capabilities can surely be great add-ons to our interactions, this framing might be hiding highly capable interaction concepts that could drive the whole interaction if they were thought of as a central organizing principle for the context. Haptics as the driver for non-visual scenarios, or finger dexterity as the driver for mid-air interactions, are just a few examples of how interesting and powerful this approach can be.

9.2.2 Wearables as Tools for Interaction

Smartwatches and HMDs (head mounted displays) are smart devices that need to be interacted with. Another category of work that sometimes takes a backseat is using wearables as tools for our interactions with other devices. Figure 9.2 shows how the chapters use wearables. DualKey and Porous Interfaces use a finger-mounted detector for distinguishing between fingers. While this might not be the most practical solution, it works exceedingly well as a prototype that is precise and accurate. Other potential solutions, such as fiduciary-tag-based optical tracking, were not accurate enough for the use case. Beyond prototyping, the haptic rings show how wearables can be used for learning semaphoric gestures or assisting with manipulative gestures. The wearability of these devices enables them to be used in situations where other methods might not work. Perhaps the killer app of a smartwatch lies not in the interactions with the watch, but in the interactions using the watch.

9.2.3 Learning of Novel Interfaces

Most of our interaction models can be argued as not being “intuitive” at first. While they are easy to use, given their different conceptual models, they require some time to learn. This learning element can be found in almost all of our individual contributions. While this might seem to be a limiting factor initially, in almost all instances we found that users did not have much difficulty in performing the interactions. In porous interfaces, users quickly understood the fingers and their organizing principle. In DualKey Sweqty, even after deviating from the QWERTY layout, users performed better even on Day 1 than with the QWERTY counterpart. In DMTs, users could browse 8-item menus based solely on their tactile sense, without any visual or aural assistance, with just 5 minutes of learning. Users performed better on summon & select, and gave it more preferable subjective ratings, than the point & select they had always been used to. Consequently, if the interactions are easy to use and the interaction model is internally coherent, users quickly learn to perform them and can even do better than with existing interactions, in both quantitative and qualitative terms.

Figure 9.2: Chapter distribution for interactions with wearables, using wearables for other devices, and both.

9.3 The Final Word

Going back to the first image in Chapter 1, the computer's image of its user is an illustration of how our current interactions underutilize the human body's capabilities. Our work shows how the hand and its various attributes can be used to boost performance and enable novel interfaces. It only scratches the surface of utilizing the human body's capabilities. We sincerely hope that this work will serve both as an inspiration and as a guide to future work in these spaces.


Bibliography

[Abu-Khalaf et al., 2009] Abu-Khalaf, J. M., Park, J. W., Mascaro, D. J., and Mascaro, S. A. (2009). Stretchablefingernail sensors for measurement of fingertip force. In World Haptics 2009 - Third Jt. EuroHaptics Conf.

Symp. Haptic Interfaces Virtual Environ. Teleoperator Syst., pages 625–626. IEEE.

[Accot and Zhai, 2002] Accot, J. and Zhai, S. (2002). More than dotting the i’s — foundations for crossing-basedinterfaces. In Proc. SIGCHI Conf. Hum. factors Comput. Syst. Chang. our world, Chang. ourselves - CHI ’02,page 73.

[Achibet et al., 2016] Achibet, M., Casiez, G., and Marchal, M. (2016). DesktopGlove: A multi-finger force feedback interface separating degrees of freedom between hands. In Proc. 3DUI 2016, pages 3–12.

[Ackad et al., 2015] Ackad, C., Clayphan, A., Tomitsch, M., and Kay, J. (2015). An in-the-wild study of learningmid-air gestures to browse hierarchical information at a large interactive public display. In Proc. UbiComp

’15, pages 1227–1238.

[Anderson and Bischof, 2013] Anderson, F. and Bischof, W. F. (2013). Learning and performance with gestureguides. In Proc. CHI ’13, page 1109.

[Angelini et al., 2015] Angelini, L., Lalanne, D., Hoven, E. v. d., Khaled, O. A., and Mugellini, E. (2015). Move,hold and touch: A framework for tangible gesture interactive systems. Machines, 3(3):173–207.

[Annett et al., 2011] Annett, M., Grossman, T., Wigdor, D., and Fitzmaurice, G. (2011). Medusa. In Proc. 24th

Annu. ACM Symp. User interface Softw. Technol. - UIST ’11, page 337, New York, New York, USA. ACMPress.

[Antfolk et al., 2010] Antfolk, C., Balkenius, C., Lundborg, G., Rosén, B., and Sebelius, F. (2010). Design andtechnical construction of a tactile display for sensory feedback in a hand prosthesis system. Biomed. Eng.

Online, 9(1):50.

[Appert and Zhai, 2009] Appert, C. and Zhai, S. (2009). Using strokes as command shortcuts. In Proc. 27th Int.

Conf. Hum. factors Comput. Syst. - CHI 09, page 2289.

[Apple, 2017] Apple (2017). Apple Force Touch. https://support.apple.com/en-us/HT204352.

[Arafsha et al., 2015] Arafsha, F., Zhang, L., Dong, H., and Saddik, A. E. (2015). Contactless haptic feedback:state of the art. In 2015 IEEE Int. Symp. Haptic, Audio Vis. Environ. Games, pages 1–6.

[Aslan et al., 2014] Aslan, I., Uhl, A., Meschtscherjakov, A., and Tscheligi, M. (2014). Mid-air AuthenticationGestures: An Exploration of Authentication Based on Palm and Finger Motions. In Proc. ICMI ’14, pages311–318.


[Au and Tai, 2010] Au, O. K.-C. and Tai, C.-L. (2010). Multitouch finger registration and its applications. InProc. 22nd Conf. Comput. Interact. Spec. Interes. Gr. Aust. Comput. Interact. - OZCHI ’10, page 41.

[Bach-y Rita et al., 1998] Bach-y Rita, P., Kaczmarek, K., Tyler, M. E., and Garcia-Lara, J. (1998). Form per-ception with a 49-point electrotactile stimulus array on the tongue: a technical note. J. Rehabil. Res. Dev.,35(4):427–430.

[Bailly et al., 2011] Bailly, G., Walter, R., Müller, J., Ning, T., and Lecolinet, E. (2011). Comparing free handmenu techniques for distant displays using linear, marking and finger-count menus. In Proc. INTERACT’11,pages 248–262.

[Balakrishnan, 2004] Balakrishnan, R. (2004). "beating" fitts’ law: Virtual enhancements for pointing facilitation.Int. J. Hum.-Comput. Stud., 61(6):857–874.

[Baldwin and Chai, 2012] Baldwin, T. and Chai, J. (2012). Towards online adaptation and personalization ofkey-target resizing for mobile devices. In Proc. 2012 ACM Int. Conf. Intell. User Interfaces - IUI ’12, page 11.

[Banerjee et al., 2012] Banerjee, A., Burstyn, J., Girouard, A., and Vertegaal, R. (2012). MultiPoint: Comparinglaser and manual pointing as remote input in large display interactions. Int. J. Hum. Comput. Stud., 70(10):690–702.

[Banovic et al., 2014] Banovic, N., Brant, C., Mankoff, J., and Dey, A. (2014). ProactiveTasks. In Proc. 16th Int.

Conf. Human-computer Interact. with Mob. devices Serv. - MobileHCI ’14, pages 243–252.

[Bark et al., 2015] Bark, K., Hyman, E., Tan, F., Cha, E., Jax, S. A., Buxbaum, L. J., and Kuchenbecker, K. J.(2015). Effects of vibrotactile feedback on human learning of arm motions. IEEE Trans. Neural Syst. Rehabil.

Eng., 23(1):51–63.

[Bateman et al., 2013] Bateman, S., Mandryk, R. L., Gutwin, C., and Xiao, R. (2013). Analysis and comparisonof target assistance techniques for relative ray-cast pointing. Int. J. Hum. Comput. Stud., 71(5):511–532.

[Bau and Mackay, 2008] Bau, O. and Mackay, W. E. (2008). OctoPocus. In Proc. 21st Annu. ACM Symp. User

interface Softw. Technol. - UIST ’08, page 37.

[Bau et al., 2010] Bau, O., Poupyrev, I., Israr, A., and Harrison, C. (2010). TeslaTouch. In Proc. 23nd Annu.

ACM Symp. User interface Softw. Technol. - UIST ’10, page 283.

[Baudel and Beaudouin-Lafon, 1993] Baudel, T. and Beaudouin-Lafon, M. (1993). Charade: remote control ofobjects using free-hand gestures. Commun. ACM, 36(7):28–35.

[Baudisch and Chu, 2009] Baudisch, P. and Chu, G. (2009). Back-of-device interaction allows creating verysmall touch devices. In Proc. 27th Int. Conf. Hum. factors Comput. Syst. - CHI 09, page 1923.

[Baumann et al., 2010] Baumann, M. A., MacLean, K. E., Hazelton, T. W., and McKay, A. (2010). Emulatinghuman attention-getting practices with wearable haptics. In 2010 IEEE Haptics Symp., pages 149–156. IEEE.

[Benko et al., 2009] Benko, H., Saponas, T. S., Morris, D., and Tan, D. (2009). Enhancing input on and abovethe interactive surface with muscle sensing. In Proc. ACM Int. Conf. Interact. Tabletops Surfaces - ITS ’09,page 93.


[Benko et al., 2006] Benko, H., Wilson, A. D., and Baudisch, P. (2006). Precise selection techniques for multi-touch screens. In Proc. CHI ’06, page 1263.

[Bi and Zhai, 2013] Bi, X. and Zhai, S. (2013). Bayesian touch. In Proc. UIST ’13, pages 51–60.

[Bier et al., 1993] Bier, E. A., Stone, M. C., Pier, K., Buxton, W., and DeRose, T. D. (1993). Toolglass and magiclenses. In Proc. 20th Annu. Conf. Comput. Graph. Interact. Tech. - SIGGRAPH ’93, pages 73–80.

[Blandin et al., 2010] Blandin, Y., Lhuisset, L., and Proteau, L. (2010). Cognitive Processes Underlying Obser-vational Learning of Motor Skills. Q. J. Exp. Psychol. Sect. A.

[Böhmer et al., 2011] Böhmer, M., Hecht, B., Schoning, J., Kruger, A., and Bauer, G. (2011). Falling asleep withAngry Birds, Facebook and Kindle. In Proc. 13th Int. Conf. Hum. Comput. Interact. with Mob. Devices Serv. -

MobileHCI ’11, page 47.

[Bolt, 1980] Bolt, R. (1980). “put-that-there”: Voice, gesture at the graphics interface. In Computer and Graph-

ics, volume 14, page 262–270.

[Bolt and Herranz, 1992] Bolt, R. and Herranz, E. (1992). Two-handed gesture in multi-modal natural dialog. In Proceedings of the 5th Annual ACM Symposium on User Interface Software and Technology, pages 7–14, Monterey, California, United States.

[Bonnet et al., 2013] Bonnet, D., Appert, C., and Beaudouin-Lafon, M. (2013). Extending the vocabulary oftouch events with ThumbRock. pages 221–228.

[Boring et al., 2012] Boring, S., Ledo, D., Chen, X. A., Marquardt, N., Tang, A., and Greenberg, S. (2012).The fat thumb. In Proc. 14th Int. Conf. Human-computer Interact. with Mob. devices Serv. - MobileHCI ’12,page 39.

[Bowman et al., 2004] Bowman, D. A., Kruijff, E., LaViola, J. J., and Poupyrev, I. (2004). 3D User Interfaces:

Theory and Practice. Addison Wesley Longman Publishing Co., Inc., Redwood City, CA, USA.

[Bowman et al., 2002] Bowman, D. A., Wingrave, C. A., Campbell, J. M., Ly, V. Q., and Rhoton, C. J. (2002).Novel Uses of Pinch Gloves™ for Virtual Environment Interaction Techniques. Virtual Real., 6(3):122–129.

[Bragdon and Ko, 2011] Bragdon, A. and Ko, H.-S. (2011). Gesture select: acquiring remote targets on largedisplays without pointing. In Proc. CHI ’11, page 187.

[Brewster and King, 2005] Brewster, S. A. and King, A. (2005). The design and evaluation of a vibrotactileprogress bar. In World Haptics 2005, pages 499–500. IEEE.

[Brian, 2015] Brian, M. (2015). Sony is crowdfunding a smart watch with a dumb face.http://www.engadget.com/2015/08/31/sony-wena-smartwatch/.

[Brown et al., 2006] Brown, L. M., Brewster, S. A., and Purchase, H. C. (2006). Multidimensional tactons fornon-visual information presentation in mobile devices. In Proceedings of the 8th conference on Human-

computer interaction with mobile devices and services, pages 231–238. ACM.

[Buchmann et al., 2004] Buchmann, V., Violich, S., Billinghurst, M., and Cockburn, A. (2004). Fingartips: ges-ture based direct manipulation in augmented reality. In Proceedings of the 2nd international conference on

Computer graphics and interactive techniques in Australasia and South East Asia, pages 212–221. ACM.


[Budiu, 2015] Budiu, R. (2015). Multitasking on Mobile Devices.https://www.nngroup.com/articles/multitasking-mobile/.

[Butler et al., 2008] Butler, A., Izadi, S., and Hodges, S. (2008). SideSight. In Proc. 21st Annu. ACM Symp. User

interface Softw. Technol. - UIST ’08, page 201.

[Buxton, 1990a] Buxton, W. (1990a). A three-state model of graphical input. In Proc. INTERACT’90, pages449–456.

[Buxton, 1990b] Buxton, W. (1990b). A three-state model of graphical input. In Proc. INTERACT ’90, pages449–456, Amsterdam, The Netherlands, The Netherlands. North-Holland Publishing Co.

[Cao et al., 2015] Cao, N., Lin, Y.-R., Li, L., and Tong, H. (2015). g-Miner. In Proc. 33rd Annu. ACM Conf.

Hum. Factors Comput. Syst. - CHI ’15, pages 279–288, New York, New York, USA. ACM Press.

[Carcedo et al., 2016] Carcedo, M. G., Chua, S. H., Perrault, S., Wozniak, P., Joshi, R., Obaid, M., Fjeld, M., andZhao, S. (2016). HaptiColor. In Proc. CHI ’16, pages 3572–3583.

[Carter et al., 2016] Carter, M., Velloso, E., Downs, J., Sellen, A., O’Hara, K., and Vetere, F. (2016). Pathsync:Multi-user gestural interaction with touchless rhythmic path mimicry. In Proc. CHI ’16, pages 3415–3427,New York, NY, USA. ACM.

[Carter et al., 2013] Carter, T., Seah, S. A., Long, B., Drinkwater, B., and Subramanian, S. (2013). UltraHaptics:multi-point mid-air haptic feedback for touch surfaces. In Proc. UIST ’13, pages 505–514.

[Casiez et al., 2008] Casiez, G., Vogel, D., Balakrishnan, R., and Cockburn, A. (2008). The impact of control-display gain on user performance in pointing tasks. Human–Computer Interaction, 23(3):215–250.

[Castellucci and MacKenzie, 2008] Castellucci, S. J. and MacKenzie, I. S. (2008). Graffiti vs. unistrokes. InProceeding twenty-sixth Annu. CHI Conf. Hum. factors Comput. Syst. - CHI ’08, page 305.

[Chen et al., 2008] Chen, H.-y., Santos, J., Graves, M., Kim, K., and Tan, H. Z. (2008). Tactor localization at thewrist. In Proc. Eurohaptics’08, pages 209–218.

[Chen et al., 2014a] Chen, X. A., Grossman, T., and Fitzmaurice, G. (2014a). Swipeboard. In Proc. UIST ’14,pages 615–620.

[Chen et al., 2014b] Chen, X. A., Grossman, T., Wigdor, D. J., and Fitzmaurice, G. (2014b). Duet. In Proc. 32nd

Annu. ACM Conf. Hum. factors Comput. Syst. - CHI ’14, pages 159–168.

[Chen et al., 2014c] Chen, X. A., Schwarz, J., Harrison, C., Mankoff, J., and Hudson, S. E. (2014c). Air+touch.In Proc. UIST ’14, pages 519–525.

[Chinello et al., 2014] Chinello, F., Aurilio, M., Pacchierotti, C., and Prattichizzo, D. (2014). The HapBand: ACutaneous Device for Remote Tactile Interaction. pages 284–291. Springer, Berlin, Heidelberg.

[Cho et al., 2014] Cho, H., Kim, M., and Seo, K. (2014). A text entry technique for wrist-worn watches with tinytouchscreens. In Proc. Adjun. Publ. UIST’14 Adjun., pages 79–80.

[Choi et al., 2016] Choi, K., Song, H., Koh, K., Bok, J., and Seo, J. (2016). Peek-a-view: Smartphone coverinteraction for multi-tasking. In Proceedings of the 2016 CHI Conference on Human Factors in Computing

Systems, CHI ’16, pages 4658–4662, New York, NY, USA. ACM.


[Choi and Kuchenbecker, 2013] Choi, S. and Kuchenbecker, K. J. (2013). Vibrotactile Display: Perception, Tech-nology, and Applications. Proc. IEEE, 101(9):2093–2104.

[Cholewiak et al., 2004] Cholewiak, R. W., Brill, J. C., and Schwab, A. (2004). Vibrotactile localization on theabdomen: Effects of place and space. Percept. Psychophys., 66(6):970–987.

[Cholewiak and Collins, 2003] Cholewiak, R. W. and Collins, A. A. (2003). Vibrotactile localization on the arm:Effects of place, space, and age. Percept. Psychophys., 65(7):1058–1077.

[Chung et al., 2015] Chung, Y. G., Han, S. W., Kim, H.-S., Chung, S.-C., Park, J.-Y., Wallraven, C., and Kim,S.-P. (2015). Adaptation of cortical activity to sustained pressure stimulation on the fingertip. BMC Neurosci.,16:71.

[Clarke et al., 2016] Clarke, C., Bellino, A., Esteves, A., Velloso, E., and Gellersen, H. (2016). Tracematch:A computer vision technique for user input by tracing of animated controls. In Proc. UbiComp ’16, pages298–303, New York, NY, USA. ACM.

[Clarkson et al., 2006] Clarkson, E. C., Patel, S. N., Pierce, J. S., and Abowd, G. D. (2006). Exploring continuouspressure input for mobile phones. Technical report, Georgia Institute of Technology.

[Cohn et al., 2011] Cohn, G., Morris, D., Patel, S. N., and Tan, D. S. (2011). Your noise is my command:sensing gestures using the body as an antenna. In Proceedings of the SIGCHI Conference on Human Factors

in Computing Systems, pages 791–800. ACM.

[Colley and Häkkilä, 2014] Colley, A. and Häkkilä, J. (2014). Exploring finger specific touch screen interactionfor mobile phone user interfaces. In Proc. 26th Aust. Comput. Interact. Conf. Des. Futur. Futur. Des. - OzCHI

’14, pages 539–548.

[Connolly and Jones, 1970] Connolly, K. and Jones, B. (1970). A developmental study of afferent-reafferentintegration. Br. J. Psychol., 61(2):259–266.

[Craig, 1977] Craig, J. (1977). Vibrotactile pattern perception: Extraordinary observers. Science,196(4288):450–452.

[Crossan et al., 2008] Crossan, A., Williamson, J., Brewster, S., and Murray-Smith, R. (2008). Wrist rotationfor interaction in mobile contexts. In Proc. 10th Int. Conf. Hum. Comput. Interact. with Mob. devices Serv. -

MobileHCI ’08, page 435.

[Darling et al., 2002] Darling, T., Chu, F., Migliori, A., Thoma, D., Lopez, M., Lashley, J., Lang, B., Boerio-Goates, J., and Woodfield, B. (2002). Elastic and thermodynamic properties of the shape-memory alloy AuZn. Philosophical Magazine B, 82(7):825–837.

[Delhaye et al., 2016] Delhaye, B., Barrea, A., Edin, B. B., Lefèvre, P., and Thonnard, J.-L. (2016). Surfacestrain measurements of fingertip skin under shearing. J. R. Soc. Interface, 13(115):20150874.

[Dietz and Leigh, 2001] Dietz, P. and Leigh, D. (2001). DiamondTouch. In Proc. UIST ’01, page 219.

[Döring et al., 2011] Döring, T., Kern, D., Marshall, P., Pfeiffer, M., Schöning, J., Gruhn, V., and Schmidt, A.(2011). Gestural interaction on the steering wheel: Reducing the visual demand. In Proceedings of the SIGCHI

Conference on Human Factors in Computing Systems, CHI ’11, pages 483–492, New York, NY, USA. ACM.


[Douglas et al., 1999] Douglas, S. A., Kirkpatrick, A. E., and MacKenzie, I. S. (1999). Testing pointing deviceperformance and user assessment with the iso 9241, part 9 standard. In Proc. CHI ’99, pages 215–222.

[Drobny and Borchers, 2010] Drobny, D. and Borchers, J. (2010). Learning basic dance choreographies withdifferent augmented feedback modalities. In Proc. 28th Int. Conf. Ext. Abstr. Hum. factors Comput. Syst. - CHI

EA ’10, page 3793, New York, New York, USA. ACM Press.

[Dunlop et al., 2014] Dunlop, M. D., Komninos, A., and Durga, N. (2014). Towards high quality text entry onsmartwatches. In Proc. Ext. Abstr. 32nd Annu. ACM Conf. Hum. factors Comput. Syst. - CHI EA ’14, pages2365–2370.

[Essl et al., 2010] Essl, G., Rohs, M., and Kratz, S. (2010). Use the Force (or something) - Pressure and Pressure-Like Input for Mobile Music Performance.

[Esteves et al., 2015] Esteves, A., Velloso, E., Bulling, A., and Gellersen, H. (2015). Orbits. In Proc. 2015 ACM

Int. Jt. Conf. Pervasive Ubiquitous Comput. Proc. 2015 ACM Int. Symp. Wearable Comput. - UbiComp ’15,pages 419–422.

[Ewerling et al., 2012] Ewerling, P., Kulik, A., and Froehlich, B. (2012). Finger and hand detection for multi-touch interfaces based on maximally stable extremal regions. In Proc. 2012 ACM Int. Conf. Interact. tabletops

surfaces - ITS ’12, page 173.

[Faran and Shilo, 2015] Faran, E. and Shilo, D. (2015). Ferromagnetic Shape Memory Alloys-Challenges, Ap-plications, and Experimental Characterization. Exp. Tech., pages n/a–n/a.

[Feygin et al., 2002] Feygin, D., Keehner, M., and Tendick, R. (2002). Haptic guidance: experimental evaluationof a haptic training method for a perceptual motor skill. In Proc. 10th Symp. Haptic Interfaces Virtual Environ.

Teleoperator Syst. HAPTICS 2002, pages 40–47.

[Fisher et al., 1987] Fisher, S. S., McGreevy, M., Humphries, J., and Robinett, W. (1987). Virtual environmentdisplay system. In Proceedings of the 1986 Workshop on Interactive 3D Graphics, I3D ’86, pages 77–87, NewYork, NY, USA. ACM.

[Fitzmaurice, 1996] Fitzmaurice, G. W. (1996). Graspable user interfaces. PhD Thesis, University of Toronto.

[Fitzpatrick et al., 1994] Fitzpatrick, G., Haynes, T., and Williams, M. (1994). Method and apparatus for access-ing touch screen desktop objects via fingerprint recognition. EP Patent App. EP19,930,480,135.

[Fletcher et al., 1962] Fletcher, R. F. et al. (1962). The measurement of total body fat with skinfold calipers.Clinical science, 22:333–346.

[Freeman et al., 2009] Freeman, D., Benko, H., Morris, M. R., and Wigdor, D. (2009). ShadowGuides. In Proc.

ACM Int. Conf. Interact. Tabletops Surfaces - ITS ’09, page 165.

[Freeman et al., 2016] Freeman, E., Brewster, S., and Lantz, V. (2016). Do That, There: An Interaction Techniquefor Addressing In-Air Gesture Systems. In Proc. CHI ’16, pages 2319–2331.

[Freeman and Weissman, 1995] Freeman, W. and Weissman, C. (1995). Television control by hand gestures. InIEEE International Workshop on Automatic Face, Gesture Recognition, pages 179–183.


[Freeman et al., 1996] Freeman, W. T., Tanaka, K.-i., Ohta, J., and Kyuma, K. (1996). Computer vision forcomputer games. In Automatic Face and Gesture Recognition, 1996., Proceedings of the Second International

Conference on, pages 100–105. IEEE.

[Gescheider et al., 1990] Gescheider, G. A., Bolanowski Jr, S. J., Verrillo, R. T., Arpajian, D. J., and Ryan, T. F.(1990). Vibrotactile intensity discrimination measured by three methods. The Journal of the Acoustical Society

of America, 87(1):330–338.

[Ghomi et al., 2012] Ghomi, E., Faure, G., Huot, S., Chapuis, O., and Beaudouin-Lafon, M. (2012). Usingrhythmic patterns as an input method. In Proc. CHI ’12, page 1253.

[Ghomi et al., 2013] Ghomi, E., Huot, S., Bau, O., Beaudouin-Lafon, M., and Mackay, W. E. (2013). Arpège. InProc. 2013 ACM Int. Conf. Interact. tabletops surfaces - ITS ’13, pages 209–218.

[Gibson, 1962] Gibson, J. J. (1962). Observations on active touch. Psychological Review, 69(6):477–491.

[Goel et al., 2012] Goel, M., Wobbrock, J., and Patel, S. (2012). GripSense. In Proc. UIST ’12, page 545.

[Goguey et al., 2014a] Goguey, A., Casiez, G., Pietrzak, T., Vogel, D., and Roussel, N. (2014a). Adoiraccourcix.In Proc. 26th Conf. l’Interaction Homme-Machine - IHM ’14, pages 28–37.

[Goguey et al., 2014b] Goguey, A., Casiez, G., Vogel, D., Chevalier, F., Pietrzak, T., and Roussel, N. (2014b). Athree-step interaction pattern for improving discoverability in finger identification techniques. In Proc. Adjun.

Publ. UIST’14 Adjun., pages 33–34.

[Goldstein, 1999] Goldstein, E. (1999). Sensation and perception. Wadsworth, Belmont, CA, 5th ed edition.

[Gordon et al., 2002] Gordon, G., Billinghurst, M., Bell, M., Woodfill, J., Kowalik, B., Erendi, A., and Tilander,J. (2002). The use of dense stereo range data in augmented reality. In Proceedings of the 1st International

Symposium on Mixed and Augmented Reality, ISMAR ’02, pages 14–, Washington, DC, USA. IEEE ComputerSociety.

[Grossman et al., 2007] Grossman, T., Dragicevic, P., and Balakrishnan, R. (2007). Strategies for acceleratingon-line learning of hotkeys. In Proc. SIGCHI Conf. Hum. factors Comput. Syst. - CHI ’07, page 1591.

[Grubert et al., 2015] Grubert, J., Heinisch, M., Quigley, A., and Schmalstieg, D. (2015). MultiFi. In Proc. 33rd

Annu. ACM Conf. Hum. Factors Comput. Syst. - CHI ’15, pages 3933–3942.

[Gu et al., 2013] Gu, J., Heo, S., Han, J., Kim, S., and Lee, G. (2013). Longpad: A touchpad using the entirearea below the keyboard of a laptop computer. In Proceedings of the SIGCHI Conference on Human Factors

in Computing Systems, CHI ’13, pages 1421–1430, New York, NY, USA. ACM.

[Gupta et al., 2016a] Gupta, A., Anwar, M., and Balakrishnan, R. (2016a). Porous interfaces for small screenmultitasking using finger identification. In Proceedings of the 29th Annual Symposium on User Interface

Software and Technology, pages 145–156. ACM.

[Gupta and Balakrishnan, 2016] Gupta, A. and Balakrishnan, R. (2016). Dualkey: Miniature screen text entry viafinger identification. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems,pages 59–70. ACM.


[Gupta et al., 2016b] Gupta, A., Irudayaraj, A., Chandran, V., Palaniappan, G., Truong, K. N., and Balakrishnan,R. (2016b). Haptic learning of semaphoric finger gestures. In Proceedings of the 29th Annual Symposium on

User Interface Software and Technology, pages 219–226. ACM.

[Gupta et al., 2017a] Gupta, A., Irudayaraj, A. A. R., and Balakrishnan, R. (2017a). Hapticclench: Investigatingsqueeze sensations using memory alloys. In Proceedings of the 30th Annual ACM Symposium on User Interface

Software and Technology, pages 109–117. ACM.

[Gupta et al., 2016c] Gupta, A., Pietrzak, T., Roussel, N., and Balakrishnan, R. (2016c). Direct manipulation intactile displays. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages3683–3693. ACM.

[Gupta et al., 2017b] Gupta, A., Pietrzak, T., Yau, C., Roussel, N., and Balakrishnan, R. (2017b). Summon andselect: Rapid interaction with interface controls in mid-air. In Proceedings of the 2017 ACM International

Conference on Interactive Surfaces and Spaces, pages 52–61. ACM.

[Gustafson et al., 2010] Gustafson, S., Bierwirth, D., and Baudisch, P. (2010). Imaginary interfaces: Spatialinteraction with empty hands and without visual feedback. In Proc. UIST ’10, pages 3–12.

[Han and Lee, 2015] Han, J. and Lee, G. (2015). Push-push: A drag-like operation overlapped with a pagetransition operation on touch interfaces. In Proceedings of the 28th Annual ACM Symposium on User Interface

Software & Technology, UIST '15, pages 313–322, New York, NY, USA. ACM.

[Hardenberg et al., 2001] von Hardenberg, C. and Bérard, F. (2001). Bare-hand human-computer interaction. In Proceedings of the 2001 Workshop on Perceptive User Interfaces, volume 1, pages 1–8, Orlando, Florida. ACM.

[Harrison et al., 1995a] Harrison, B. L., Ishii, H., Vicente, K. J., and Buxton, W. A. S. (1995a). Transparentlayered user interfaces. In Proc. SIGCHI Conf. Hum. factors Comput. Syst. - CHI ’95, pages 317–324.

[Harrison et al., 1995b] Harrison, B. L., Kurtenbach, G., and Vicente, K. J. (1995b). An experimental evaluationof transparent user interface tools and information content. In Proc. UIST ’95, pages 81–90.

[Harrison and Vicente, 1996] Harrison, B. L. and Vicente, K. J. (1996). An experimental evaluation of transparentmenu usage. In Proc. SIGCHI Conf. Hum. factors Comput. Syst. common Gr. - CHI ’96, pages 391–398.

[Harrison et al., 2012a] Harrison, C., Horstman, J., Hsieh, G., and Hudson, S. (2012a). Unlocking the expressiv-ity of point lights. In Proc. CHI ’12, page 1683.

[Harrison and Hudson, 2012] Harrison, C. and Hudson, S. (2012). Using shear as a supplemental two-dimensional input channel for rich touchscreen interaction. In Proc. CHI ’12, page 3149.

[Harrison and Hudson, 2009] Harrison, C. and Hudson, S. E. (2009). Abracadabra: Wireless, high-precision, andunpowered finger input for very small mobile devices. In Proceedings of the 22Nd Annual ACM Symposium

on User Interface Software and Technology, UIST ’09, pages 121–124, New York, NY, USA. ACM.

[Harrison and Hudson, 2010] Harrison, C. and Hudson, S. E. (2010). Minput: Enabling interaction on smallmobile devices with high-precision, low-cost, multipoint optical tracking. In Proceedings of the SIGCHI

Conference on Human Factors in Computing Systems, CHI ’10, pages 1661–1664, New York, NY, USA.ACM.


[Harrison et al., 2012b] Harrison, C., Sato, M., and Poupyrev, I. (2012b). Capacitive fingerprinting: Exploringuser differentiation by sensing electrical properties of the human body. In Proceedings of the 25th Annual

ACM Symposium on User Interface Software and Technology, UIST ’12, pages 537–544, New York, NY,USA. ACM.

[Harrison et al., 2011] Harrison, C., Schwarz, J., and Hudson, S. E. (2011). Tapsense: Enhancing finger interac-tion on touch surfaces. In Proceedings of the 24th Annual ACM Symposium on User Interface Software and

Technology, UIST ’11, pages 627–636, New York, NY, USA. ACM.

[Hasan et al., 2013] Hasan, K., Ahlström, D., and Irani, P. (2013). Ad-binning. In Proc. CHI ’13, page 899.

[Hayward, 2008] Hayward, V. (2008). A brief taxonomy of tactile illusions and demonstrations that can be donein a hardware store. Brain Research Bulletin (special issue on robotics and neuroscience), 75(6):742–752.

[Helicopter, 2017] Helicopter (2017). Helicopter.

[Henze et al., 2011] Henze, N., Rukzio, E., and Boll, S. (2011). 100,000,000 taps. In Proc. 13th Int. Conf. Hum.

Comput. Interact. with Mob. Devices Serv. - MobileHCI ’11, page 133.

[Heo and Lee, 2012] Heo, S. and Lee, G. (2012). ForceDrag. In Proc. 24th Aust. Comput. Interact. Conf.

- OzCHI ’12, pages 204–207, New York, New York, USA. ACM Press.

[Heo and Lee, 2013] Heo, S. and Lee, G. (2013). Ta-Tap. In Proc. Adjun. Publ. UIST ’13 Adjun., pages 91–92.

[Hilliges et al., 2009] Hilliges, O., Izadi, S., Wilson, A. D., Hodges, S., Garcia-Mendoza, A., and Butz, A. (2009).Interactions in the air. In Proc. 22nd Annu. ACM Symp. User interface Softw. Technol. - UIST ’09, page 139,New York, New York, USA. ACM Press.

[Hincapié-Ramos et al., 2014] Hincapié-Ramos, J. D., Guo, X., Moghadasian, P., and Irani, P. (2014). Consumedendurance. In Proc. 32nd Annu. ACM Conf. Hum. factors Comput. Syst. - CHI ’14, pages 1063–1072.

[Hinckley et al., 2016] Hinckley, K., Buxton, W., Heo, S., Pahud, M., Holz, C., Benko, H., Sellen, A., Banks, R.,O’Hara, K., and Smyth, G. (2016). Pre-Touch Sensing for Mobile Interaction. In Proc. 2016 CHI Conf. Hum.

Factors Comput. Syst. - CHI ’16, pages 2869–2881, New York, New York, USA. ACM Press.

[Hinckley et al., 1998] Hinckley, K., Pausch, R., Proffitt, D., and Kassell, N. F. (1998). Two-handed virtualmanipulation. ACM Trans. Comput.-Hum. Interact., 5(3):260–302.

[Hinckley et al., 2000] Hinckley, K., Pierce, J., Sinclair, M., and Horvitz, E. (2000). Sensing techniques formobile interaction. In Proc. UIST ’00, pages 91–100.

[Hinckley and Song, 2011] Hinckley, K. and Song, H. (2011). Sensor synaesthesia. In Proc. CHI ’11, page 801.

[Hinckley et al., 2014] Hinckley, K., Wilson, A., Pahud, M., Benko, H., Irani, P., Guimbretière, F., Gavriliu, M.,Chen, X. A., Matulic, F., and Buxton, W. (2014). Sensing techniques for tablet+stylus interaction. In Proc.

UIST ’14, pages 605–614.

[Hinckley et al., 2010] Hinckley, K., Yatani, K., Pahud, M., Coddington, N., Rodenhouse, J., Wilson, A., Benko,H., and Buxton, B. (2010). Pen + touch = new tools. In Proc. 23nd Annu. ACM Symp. User interface Softw.

Technol. - UIST ’10, page 27.


[Hoffman, 1998] Hoffman, H. (1998). Physically touching virtual objects using tactile augmentation enhancesthe realism of virtual environments. In Proceedings. IEEE 1998 Virtual Real. Annu. Int. Symp. (Cat.

No.98CB36180), pages 59–63.

[Holz and Baudisch, 2013] Holz, C. and Baudisch, P. (2013). Fiberio. In Proc. UIST ’13, pages 41–50.

[Holz et al., 2012] Holz, C., Grossman, T., Fitzmaurice, G., and Agur, A. (2012). Implanted user interfaces. InProc. CHI ’12, page 503.

[Hong et al., 2015] Hong, J., Heo, S., Isokoski, P., and Lee, G. (2015). SplitBoard. In Proc. 33rd Annu. ACM

Conf. Hum. Factors Comput. Syst. - CHI ’15, pages 1233–1236.

[Hoober, 2013] Hoober, S. (2013). How do users really hold mobile devices. UXmatters, http://www.uxmatters.com/mt/archives/.

[Hsiu et al., 2016] Hsiu, M.-C., Wang, C., Huang, D.-Y., Lin, J.-W., Lin, Y.-C., Yang, D.-N., Hung, Y.-p., andChen, M. (2016). Nail+. In Proc. 18th Int. Conf. Human-Computer Interact. with Mob. Devices Serv. -

MobileHCI ’16, pages 1–6, New York, New York, USA. ACM Press.

[Huang et al., 2014] Huang, D.-Y., Tsai, M.-C., Tung, Y.-C., Tsai, M.-L., Yeh, Y.-T., Chan, L., Hung, Y.-P., andChen, M. Y. (2014). TouchSense. In Proc. 32nd Annu. ACM Conf. Hum. factors Comput. Syst. - CHI ’14,pages 189–192.

[Hwang et al., 2013] Hwang, S., Ahn, M., and Wohn, K.-y. (2013). MagGetz. In Proc. UIST ’13, pages 411–416.

[Hwang et al., 2015] Hwang, S., Song, J., and Gim, J. (2015). Harmonious Haptics. In Proc. 33rd Annu. ACM

Conf. Ext. Abstr. Hum. Factors Comput. Syst. - CHI EA ’15, pages 295–298.

[Ion et al., 2015] Ion, A., Wang, E. J., and Baudisch, P. (2015). Skin Drag Displays. In Proc. 33rd Annu. ACM

Conf. Hum. Factors Comput. Syst. - CHI ’15, pages 2501–2504.

[Ishak and Feiner, 2004] Ishak, E. W. and Feiner, S. K. (2004). Interacting with hidden content using content-aware free-space transparency. In Proc. UIST ’04, page 189.

[Ismair et al., 2015] Ismair, S., Wagner, J., Selker, T., and Butz, A. (2015). MIME: Teaching Mid-Air Pose-Command Mappings. In Proc. MobileHCI ’15, pages 199–206.

[Israr and Poupyrev, 2011] Israr, A. and Poupyrev, I. (2011). Tactile brush. In Proc. CHI ’11, page 2019.

[Israr et al., 2006] Israr, A., Tan, H. Z., and Reed, C. M. (2006). Frequency and amplitude discrimination alongthe kinesthetic-cutaneous continuum in the presence of masking stimuli. J. Acoust. Soc. Am., 120(5):2789–2800.

[Jain and Balakrishnan, 2012] Jain, M. and Balakrishnan, R. (2012). User learning and performance with bezelmenus. In Proc. CHI ’12, page 2221.

[Jansen et al., 2012] Jansen, Y., Dragicevic, P., and Fekete, J.-D. (2012). Tangible remote controllers for wall-sizedisplays. In Proc. CHI ’12, page 2865.

[Jin et al., 2014] Jin, Y. S., Chun, H. Y., Kim, E. T., and Kang, S. (2014). VT-ware: A wearable tactile device forupper extremity motion guidance. In 23rd IEEE Int. Symp. Robot Hum. Interact. Commun., pages 335–340.


[Johansson and Flanagan, 2009] Johansson, R. S. and Flanagan, J. R. (2009). Coding and use of tactile signalsfrom the fingertips in object manipulation tasks. Nat. Rev. Neurosci., 10(5):345–359.

[Jones et al., 2004] Jones, L., Nakamura, M., and Lockyer, B. (2004). Development of a tactile vest. In Proceed-

ings of Haptic Inter-faces for Virtual Environment and Teleoperator Systems (HAPTICS ’04, page 82–89, LosAlamitos, CA. IEEE Computer Society.

[Jones and Lederman, 2006] Jones, L. A. and Lederman, S. J. (2006). Human Hand Function. Oxford UniversityPress.

[Jones and Sarter, 2008] Jones, L. A. and Sarter, N. B. (2008). Tactile Displays: Guidance for Their Design andApplication. Hum. Factors, 50(1):90–111.

[Kajimoto, 2012] Kajimoto, H. (2012). Skeletouch. In SIGGRAPH Asia 2012 Emerg. Technol. - SA ’12, pages1–3.

[Kamba et al., 1996] Kamba, T., Elson, S. A., Harpold, T., Stamper, T., and Sukaviriya, P. (1996). Using smallscreen space more efficiently. In Proc. SIGCHI Conf. Hum. factors Comput. Syst. common Gr. - CHI ’96, pages383–390.

[Kandel et al., 2000] Kandel, E., Schwartz, J., and Jessell, T., editors (2000). Principles of neural science.McGraw-Hill, New York, 4th ed edition.

[Kandogan and Shneiderman, 1997] Kandogan, E. and Shneiderman, B. (1997). Elastic Windows. In Proc.

SIGCHI Conf. Hum. factors Comput. Syst. - CHI ’97, pages 250–257.

[Karam and Schraefel, 2005] Karam, M. and Schraefel, m. c. (2005). A Taxonomy of Gestures in Human Com-puter Interactions.

[Karuei et al., 2011] Karuei, I., MacLean, K. E., Foley-Fisher, Z., MacKenzie, R., Koch, S., and El-Zohairy, M.(2011). Detecting vibrations across the body in mobile contexts. In Proc. CHI ’11, page 3267.

[Kettebekov, 2004] Kettebekov, S. (2004). Exploiting prosodic structuring of coverbal gesticulation. In Proceedings of the 6th International Conference on Multimodal Interfaces, pages 105–112, State College, PA, USA.

[Kikuuwe and Yoshikawa, 2001] Kikuuwe, R. and Yoshikawa, T. (2001). Haptic display device with finger-tip presser for motion/force teaching to human. In Proc. 2001 ICRA. IEEE Int. Conf. Robot. Autom. (Cat.

No.01CH37164), volume 1, pages 868–873. IEEE.

[Kim et al., 2012] Kim, D., Hilliges, O., Izadi, S., Butler, A. D., Chen, J., Oikonomidis, I., and Olivier, P. (2012).Digits. In Proc. UIST ’12, page 167.

[King, 2017] King (2017). CandyCrush. https://king.com/game/candycrush.

[Kopp et al., 2004] Kopp, S., Tepper, P., and Cassell, J. (2004). Towards integrated microplanning of languageand iconic gesture for multimodal output. In Proceedings of the 6th International Conference on Multimodal

Interfaces, ICMI ’04, pages 97–104, New York, NY, USA. ACM.


[Kou et al., 2015] Kou, Y., Kow, Y. M., and Cheng, K. (2015). Developing Intuitive Gestures for Spatial Inter-action with Large Public Displays. In Proc. Third Int. Conf. Distrib. Ambient. Pervasive Interact. - Vol. 9189,pages 174–181.

[Kratz and Rohs, 2009] Kratz, S. and Rohs, M. (2009). Hoverflow: Expanding the design space of around-deviceinteraction. In Proceedings of the 11th International Conference on Human-Computer Interaction with Mobile

Devices and Services, MobileHCI ’09, pages 4:1–4:8, New York, NY, USA. ACM.

[Kratz et al., 2012] Kratz, S., Rohs, M., Guse, D., Müller, J., Bailly, G., and Nischt, M. (2012). PalmSpace. InProc. Int. Work. Conf. Adv. Vis. Interfaces - AVI ’12, page 181.

[Krebs et al., 1998] Krebs, H., Hogan, N., Aisen, M., and Volpe, B. (1998). Robot-aided neuro-rehabilitation.IEEE—Transactions on Rehabilitation Engineering, 6(1):75–87.

[Krueger, 1993] Krueger, M. (1993). Environmental technology: making the real world virtual. Commun. ACM,36(7):36–37.

[Krueger et al., 1985] Krueger, M. W., Gionfriddo, T., and Hinrichsen, K. (1985). Videoplace—an artificial reality. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '85, pages 35–40, New York, NY, USA. ACM.

[Kry and Pai, 2008] Kry, P. and Pai, D. (2008). Grasp recognition and manipulation with the tango. In Experi-

mental Robotics, pages 551–559. Springer.

[Kurtenbach, 1993] Kurtenbach, G. P. (1993). The Design and Evaluation of Marking Menus.

[Kuzuoka et al., 1994] Kuzuoka, H., Kosuge, T., and Tanaka, M. (1994). Gesturecam: A video communicationsystem for sympathetic remote collaboration. In Proceedings of the 1994 ACM Conference on Computer

Supported Cooperative Work, CSCW ’94, pages 35–43, New York, NY, USA. ACM.

[Laguna, 2000] Laguna, P. (2000). The effect of model observation versus physical practice during motor skillacquisition and performance. J. Hum. Mov. Stud., 39(3):171–191.

[Laput et al., 2015] Laput, G., Brockmeyer, E., Hudson, S. E., and Harrison, C. (2015). Acoustruments. In Proc.

33rd Annu. ACM Conf. Hum. Factors Comput. Syst. - CHI ’15, pages 2161–2170.

[Laput et al., 2014] Laput, G., Xiao, R., Chen, X. A., Hudson, S. E., and Harrison, C. (2014). Skin buttons. InProc. UIST ’14, pages 389–394.

[Lawson, 2014] Lawson, R. (2014). Recognizing familiar objects by hand and foot: Haptic shape perceptiongeneralizes to inputs from unusual locations and untrained body parts. Attention, Perception, Psychophys.,76(2):541–558.

[Lederman and Klatzky, 2009] Lederman, S. J. and Klatzky, R. L. (2009). Haptic perception: A tutorial. Atten.

Percept. Psychophys., 71(7):1439–1459.

[Lee et al., 2012a] Lee, B., Lee, H., Lim, S.-C., Lee, H., Han, S., and Park, J. (2012a). Evaluation of humantangential force input performance. In Proc. 2012 ACM Annu. Conf. Hum. Factors Comput. Syst. - CHI ’12,page 3121, New York, New York, USA. ACM Press.

[Lee, 2010] Lee, J. (2010). Effects of haptic guidance and disturbance on motor learning: Potential advantage of haptic disturbance. In 2010 IEEE Haptics Symp., pages 335–342.

[Lee et al., 2015] Lee, J., Han, J., and Lee, G. (2015). Investigating the Information Transfer Efficiency of a 3x3 Watch-back Tactile Display. In Proc. 33rd Annu. ACM Conf. Hum. Factors Comput. Syst. - CHI ’15, pages 1229–1232.

[Lee et al., 2012b] Lee, J., Kim, Y., and Kim, G. (2012b). Funneling and saltation effects for tactile interaction with virtual objects. In Proc. CHI ’12, page 3141.

[Lee and Starner, 2010] Lee, S. C. and Starner, T. (2010). BuzzWear. In Proc. 28th Int. Conf. Hum. factors Comput. Syst. - CHI ’10, page 433.

[Lehtinen et al., 2012] Lehtinen, V., Oulasvirta, A., Salovaara, A., and Nurmi, P. (2012). Dynamic tactile guidance for visual search tasks. In Proc. UIST ’12, page 445.

[Leigh et al., 2014] Leigh, D., Forlines, C., Jota, R., Sanders, S., and Wigdor, D. (2014). High rate, low-latency multi-touch sensing with simultaneous orthogonal multiplexing. In Proc. 27th Annu. ACM Symp. User interface Softw. Technol. - UIST ’14, pages 355–364, New York, New York, USA. ACM Press.

[Leiva et al., 2012] Leiva, L., Böhmer, M., Gehring, S., and Krüger, A. (2012). Back to the app. In Proc. 14th Int. Conf. Human-computer Interact. with Mob. devices Serv. - MobileHCI ’12, page 291.

[Leiva et al., 2015] Leiva, L. A., Sahami, A., Catala, A., Henze, N., and Schmidt, A. (2015). Text Entry on Tiny QWERTY Soft Keyboards. In Proc. 33rd Annu. ACM Conf. Hum. Factors Comput. Syst. - CHI ’15, pages 669–678.

[Lenay et al., 1997] Lenay, C., Canu, S., and Villon, P. (1997). Technology and perception: the contribution of sensory substitution systems. In Proc. ICCT’97, pages 44–53.

[Lenman et al., 2002] Lenman, S., Bretzner, L., and Thuresson, B. (2002). Using marking menus to develop command sets for computer vision based hand gesture interfaces. In Proc. Second Nord. Conf. Human-computer Interact. - Nord. ’02, page 239, New York, New York, USA. ACM Press.

[Li et al., 2011] Li, F. C. Y., Guy, R. T., Yatani, K., and Truong, K. N. (2011). The 1line keyboard. In Proc. UIST ’11, page 461.

[Lieberman and Breazeal, 2007] Lieberman, J. and Breazeal, C. (2007). TIKL: Development of a Wearable Vibrotactile Feedback Suit for Improved Human Motor Learning. IEEE Trans. Robot., 23(5):919–926.

[Lin et al., 2013] Lin, S.-Y., Shie, C.-K., Chen, S.-C., and Hung, Y.-P. (2013). AirTouch panel: a re-anchorable virtual touch panel. In Proc. MM ’13, pages 625–628.

[Linjama and Kaaresoja, 2004] Linjama, J. and Kaaresoja, T. (2004). Novel, minimalist haptic gesture interaction for mobile devices. In Proceedings of the Third Nordic Conference on Human-Computer Interaction, pages 457–458, New York. ACM Press.

[Liu et al., 2015] Liu, M., Nancel, M., and Vogel, D. (2015). Gunslinger: Subtle Arms-down Mid-air Interaction. In Proc. UIST ’15, pages 63–71.

[Lopes and Baudisch, 2013] Lopes, P. and Baudisch, P. (2013). Muscle-propelled force feedback: bringing force feedback to mobile devices. In Proc. CHI ’13, page 2577.

[Lopes et al., 2015] Lopes, P., Ion, A., Mueller, W., Hoffmann, D., Jonell, P., and Baudisch, P. (2015). Proprioceptive Interaction. In Proc. 33rd Annu. ACM Conf. Hum. Factors Comput. Syst. - CHI ’15, pages 939–948.

[Lü and Li, 2011] Lü, H. and Li, Y. (2011). Gesture avatar. In Proc. CHI ’11, page 207.

[Luk et al., 2006] Luk, J., Pasquero, J., Little, S., MacLean, K., Levesque, V., and Hayward, V. (2006). A role for haptics in mobile interaction. In Proc. CHI ’06, page 171.

[Luo and Vogel, 2014] Luo, Y. and Vogel, D. (2014). Crossing-based selection with direct touch input. In Proc. 32nd Annu. ACM Conf. Hum. factors Comput. Syst. - CHI ’14, pages 2627–2636.

[Lyons et al., 2012] Lyons, K., Nguyen, D., Ashbrook, D., and White, S. (2012). Facet. In Proc. UIST ’12, page 123.

[MacKenzie et al., 2001] MacKenzie, I. S., Kober, H., Smith, D., Jones, T., and Skepner, E. (2001). Letterwise: Prefix-based disambiguation for mobile text input. In Proceedings of the 14th Annual ACM Symposium on User Interface Software and Technology, UIST ’01, pages 111–120, New York, NY, USA. ACM.

[MacKenzie and Soukoreff, 2003] MacKenzie, I. S. and Soukoreff, R. W. (2003). Phrase sets for evaluating text entry techniques. In CHI ’03 Ext. Abstr. Hum. factors Comput. Syst. - CHI ’03, page 754.

[Maidenbaum et al., 2014] Maidenbaum, S., Abboud, S., and Amedi, A. (2014). Sensory substitution: closing the gap between basic research and widespread practical visual rehabilitation. Neuroscience & Biobehavioral Reviews, 41:3–15.

[Mancini et al., 2014] Mancini, F., Bauleo, A., Cole, J., Lui, F., Porro, C. A., Haggard, P., and Iannetti, G. D.(2014). Whole-body mapping of spatial acuity for pain and touch. Annals of neurology, 75(6):917–924.

[Marchal-Crespo et al., 2013] Marchal-Crespo, L., van Raai, M., Rauter, G., Wolf, P., and Riener, R. (2013). The effect of haptic guidance and visual feedback on learning a complex tennis task. Exp. Brain Res., 231(3):277–291.

[Marquardt et al., 2011] Marquardt, N., Kiemer, J., Ledo, D., Boring, S., and Greenberg, S. (2011). Designing user-, hand-, and handpart-aware tabletop interactions with the TouchID toolkit. In Proc. ACM Int. Conf. Interact. Tabletops Surfaces - ITS ’11, page 21.

[Marshall et al., 2008] Marshall, J., Pridmore, T., Pound, M., Benford, S., and Koleva, B. (2008). Pressing the Flesh: Sensing Multiple Touch and Finger Pressure on Arbitrary Surfaces. In Proc. 6th Int. Conf. Pervasive Comput., pages 38–55. Springer-Verlag.

[Martínez et al., 2016] Martínez, J., Garcia, A., Oliver, M., Molina, J. P., and Gonzalez, P. (2016). Identifying Virtual 3D Geometric Shapes with a Vibrotactile Glove. IEEE Comput. Graph. Appl., 36(1):42–51.

[Martínez et al., 2013] Martínez, J., García, A. S., Molina, J. P., Martínez, D., and González, P. (2013). An empirical evaluation of different haptic feedback for shape and texture recognition. Vis. Comput., 29(2):111–121.

[Martínez et al., 2014] Martínez, J., Garcia, A. S., Oliver, M., Molina, J. P., and Gonzalez, P. (2014). Weight and Size Discrimination with Vibrotactile Feedback. In 2014 Int. Conf. Cyberworlds, pages 153–160.

[Mascaro and Asada, 2001] Mascaro, S. and Asada, H. (2001). Finger posture and shear force measurement using fingernail sensors: initial experimentation. In Proc. 2001 ICRA. IEEE Int. Conf. Robot. Autom. (Cat. No.01CH37164), volume 2, pages 1857–1862. IEEE.

[Matscheko et al., 2010a] Matscheko, M., Ferscha, A., Riener, A., and Lehner, M. (2010a). Tactor placement inwrist worn wearables. In Int. Symp. Wearable Comput. 2010, pages 1–8.

[Matscheko et al., 2010b] Matscheko, M., Ferscha, A., Riener, A., and Lehner, M. (2010b). Tactor placement inwrist worn wearables. In Int. Symp. Wearable Comput. 2010, pages 1–8. IEEE.

[Matsushita and Rekimoto, 1997] Matsushita, N. and Rekimoto, J. (1997). HoloWall: designing a finger, hand, body, and object sensitive wall. In Proceedings of the 10th Annual ACM Symposium on User Interface Software and Technology, UIST ’97, pages 209–210, Banff, Alberta, Canada. ACM.

[McDaniel et al., 2011] McDaniel, T., Goldberg, M., Villanueva, D., Viswanathan, L. N., and Panchanathan, S. (2011). Motor learning using a kinematic-vibrotactile mapping targeting fundamental movements. In Proc. 19th ACM Int. Conf. Multimed. - MM ’11, page 543.

[McGlone and Reilly, 2010] McGlone, F. and Reilly, D. (2010). The cutaneous sensory system. Neurosci. Biobehav. Rev., 34(2):148–159.

[Minsky, 1984] Minsky, M. (1984). Manipulating simulated objects with real-world gestures using a force, position sensitive screen. In SIGGRAPH Comput. Graph., volume 18, pages 195–203.

[Mitsuda, 2013] Mitsuda, T. (2013). Pseudo Force Display that Applies Pressure to the Forearms. Presence Teleoperators Virtual Environ., 22(3):191–201.

[Morioka and Griffin, 2006] Morioka, M. and Griffin, M. J. (2006). Magnitude-dependence of equivalent comfort contours for fore-and-aft, lateral and vertical hand-transmitted vibration. J. Sound Vib., 295(3-5):633–648.

[Mountcastle, 2005] Mountcastle, V. B. (2005). The sensory hand: neural mechanisms of somatic sensation. Harvard University Press.

[Nagata, 2003] Nagata, S. F. (2003). Multitasking and Interruptions during Mobile Web Tasks. Proc. Hum. Factors Ergon. Soc. Annu. Meet., 47(11):1341–1345.

[Nakai et al., 2014] Nakai, Y., Kudo, S., Okazaki, R., Kajimoto, H., and Kuribayashi, H. (2014). Detection of tangential force for a touch panel using shear deformation of the gel. In Proc. Ext. Abstr. 32nd Annu. ACM Conf. Hum. factors Comput. Syst. - CHI EA ’14, pages 2353–2358, New York, New York, USA. ACM Press.

[Nakamura et al., 2008] Nakamura, T., Takahashi, S., and Tanaka, J. (2008). Double-Crossing: A New Interaction Technique for Hand Gesture Interfaces. In Comput. Interact., pages 292–300.

[Nakatsuma et al., 2011] Nakatsuma, K., Shinoda, H., Makino, Y., Sato, K., and Maeno, T. (2011). Touch interface on back of the hand. In ACM SIGGRAPH 2011 Emerg. Technol. - SIGGRAPH ’11, pages 1–1.

[Nancel et al., 2013] Nancel, M., Chapuis, O., Pietriga, E., Yang, X.-D., Irani, P. P., and Beaudouin-Lafon, M. (2013). High-precision pointing on large wall displays using small handheld devices. In Proc. CHI ’13, page 831.

[Ni et al., 2011] Ni, T., Bowman, D. A., North, C., and McMahan, R. P. (2011). Design and evaluation of freehand menu selection interfaces using tilt and pinch gestures. Int. J. Hum. Comput. Stud., 69(9):551–562.

[Nickel and Stiefelhagen, 2003] Nickel, K. and Stiefelhagen, R. (2003). Pointing gesture recognition based on 3d-tracking of face, hands and head orientation. In Proceedings of the 5th International Conference on Multimodal Interfaces, ICMI ’03, pages 140–146, New York, NY, USA. ACM.

[Nicolau et al., 2013] Nicolau, H., Guerreiro, J., Guerreiro, T., and Carriço, L. (2013). UbiBraille. In Proc. 15th Int. ACM SIGACCESS Conf. Comput. Access. - ASSETS ’13, pages 1–8.

[Nishino et al., 1998] Nishino, H., Utsumiya, K., and Korida, K. (1998). 3d object modeling using spatial and pictographic gestures. In Proceedings of the ACM Symposium on Virtual Reality Software and Technology, VRST ’98, pages 51–58, New York, NY, USA. ACM.

[Norvig, 2013] Norvig, P. (2013). English Letter Frequency Counts: Mayzner Revisited or ETAOIN SRHLDCU.http://norvig.com/mayzner.html.

[Novich and Eagleman, 2014] Novich, S. D. and Eagleman, D. M. (2014). [D79] A vibrotactile sensory substitution device for the deaf and profoundly hearing impaired. In 2014 IEEE Haptics Symp., pages 1–1.

[Oakley and Lee, 2014] Oakley, I. and Lee, D. (2014). Interaction on the edge. In Proc. 32nd Annu. ACM Conf. Hum. factors Comput. Syst. - CHI ’14, pages 169–178.

[Oakley et al., 2015] Oakley, I., Lee, D., Islam, M. R., and Esteves, A. (2015). Beats. In Proc. 33rd Annu. ACM Conf. Hum. Factors Comput. Syst. - CHI ’15, pages 1237–1246.

[Oakley and O’Modhrain, 2005] Oakley, I. and O’Modhrain, S. (2005). Tilt to scroll: evaluating a motion based vibrotactile mobile interface. In Proc. WHC ’05, pages 40–49.

[Oney et al., 2013] Oney, S., Harrison, C., Ogan, A., and Wiese, J. (2013). ZoomBoard. In Proc. CHI ’13, page 2799.

[Oron-Gilad et al., 2007] Oron-Gilad, T., Downs, J. L., Gilson, R. D., and Hancock, P. A. (2007). Vibrotactile Guidance Cues for Target Acquisition. IEEE Trans. Syst. Man Cybern. Part C (Applications Rev.), 37(5):993–1004.

[Pakkanen et al., 2008] Pakkanen, T., Lylykangas, J., Raisamo, J., Raisamo, R., Salminen, K., Rantala, J., and Surakka, V. (2008). Perception of low-amplitude haptic stimuli when biking. In Proc. 10th Int. Conf. Multimodal interfaces - IMCI ’08, page 281, New York, New York, USA. ACM Press.

[Park and Han, 2010] Park, Y. S. and Han, S. H. (2010). One-handed thumb interaction of mobile devices from the input accuracy perspective. Int. J. Ind. Ergon., 40(6):746–756.

[Pasquero, 2006] Pasquero, J. (2006). Survey on communication through touch (no. tr-cim 06.04). Technicalreport, Center for Intelligent Machines, McGill University, Montreal, Canada.

[Pasquero et al., 2007] Pasquero, J., Luk, J., Levesque, V., Wang, Q., Hayward, V., and MacLean, K. (2007). Haptically enabled handheld information display with distributed tactile transducer. IEEE Transactions on Multimedia, 9(4):746–753.

[Pasquero et al., 2011] Pasquero, J., Stobbe, S. J., and Stonehouse, N. (2011). A haptic wristwatch for eyes-free interactions. In Proc. CHI ’11, page 3257.

[Patterson and Katz, 1992] Patterson, P. E. and Katz, J. A. (1992). Design and evaluation of a sensory feedback system that provides grasping pressure in a myoelectric hand. J. Rehabil. Res. Dev., 29(1):1–8.

[Perrault et al., 2013] Perrault, S. T., Lecolinet, E., Eagan, J., and Guiard, Y. (2013). Watchit. In Proc. CHI ’13, page 1451.

[Pfeiffer et al., 2014] Pfeiffer, M., Schneegass, S., Alt, F., and Rohs, M. (2014). Let me grab this. In Proc. 5th Augment. Hum. Int. Conf. - AH ’14, pages 1–8.

[Pierce and Pausch, 2002] Pierce, J. and Pausch, R. (2002). Comparing voodoo dolls and HOMER: exploring the importance of feedback in virtual environments. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Changing Our World, Changing Ourselves, CHI ’02, pages 105–112, Minneapolis, Minnesota, USA. ACM.

[Pohl et al., 2017] Pohl, H., Brandes, P., Ngo Quang, H., and Rohs, M. (2017). Squeezeback: Pneumatic compression for notifications. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, CHI ’17, pages 5318–5330, New York, NY, USA. ACM.

[Poupyrev et al., 2002] Poupyrev, I., Maruyama, S., and Rekimoto, J. (2002). Ambient touch. In Proc. 15th Annu. ACM Symp. User interface Softw. Technol. - UIST ’02, page 51, New York, New York, USA. ACM Press.

[Quek et al., 2002] Quek, F., McNeill, D., Bryll, R., Duncan, S., Ma, X.-F., Kirbas, C., McCullough, K. E., and Ansari, R. (2002). Multimodal human discourse: gesture and speech. ACM Trans. Comput. Interact., 9(3):171–193.

[Rahal et al., 2009] Rahal, L., Cha, J., Saddik, A., Kammerl, J., and Steinbach, E. (2009). Investigating the influence of temporal intensity changes on apparent movement phenomenon. In Virtual Environments, Human-Computer Interfaces and Measurements Systems, 2009. VECIMS ’09. IEEE International Conference on, pages 310–313.

[Ramos et al., 2007] Ramos, G., Cockburn, A., Balakrishnan, R., and Beaudouin-Lafon, M. (2007). Pointing lenses. In Proc. SIGCHI Conf. Hum. factors Comput. Syst. - CHI ’07, page 757.

[Rantala et al., 2013] Rantala, J., Salminen, K., Raisamo, R., and Surakka, V. (2013). Touch gestures in communicating emotional intention via vibrotactile stimulation. International Journal of Human-Computer Studies, 71(6):679–690.

[Rateau et al., 2014] Rateau, H., Grisoni, L., and De Araujo, B. (2014). Mimetic interaction spaces: controlling distant displays in pervasive environments. In Proc. IUI ’14, pages 89–94.

[Rekimoto, 1996] Rekimoto, J. (1996). Tilting operations for small screen interfaces. In Proc. UIST ’96, pages167–168.

[Rekimoto, 2013] Rekimoto, J. (2013). Traxion. In Proc. UIST ’13, pages 427–432.

[Rekimoto and Jun, 2002] Rekimoto, J. and Jun (2002). SmartSkin. In Proc. SIGCHI Conf. Hum. factors Comput. Syst. Chang. our world, Chang. ourselves - CHI ’02, page 113, New York, New York, USA. ACM Press.

[Ren, 2013] Ren, G. (2013). Designing for Effective Freehand Gestural Interaction. PhD thesis, University of Bath.

[Rock, 1984] Rock, I. (1984). Perception. Scientific American Library, New York.

[Rock and Victor, 1964] Rock, I. and Victor, J. (1964). Vision and touch: an experimentally created conflict between the two senses. Science, 143(3606):594–6.

[Rogers et al., 2011] Rogers, S., Williamson, J., Stewart, C., and Murray-Smith, R. (2011). AnglePose. In Proc. CHI ’11, page 2575.

[Roth and Turner, 2009] Roth, V. and Turner, T. (2009). Bezel swipe. In Proc. 27th Int. Conf. Hum. factors Comput. Syst. - CHI ’09, page 1523.

[Roumen et al., 2015] Roumen, T., Perrault, S. T., and Zhao, S. (2015). NotiRing. In Proc. 33rd Annu. ACM Conf. Hum. Factors Comput. Syst. - CHI ’15, pages 2497–2500.

[Roy et al., 2015] Roy, Q., Guiard, Y., Bailly, G., Lecolinet, É., and Rioul, O. (2015). Glass+Skin: An Empirical Evaluation of the Added Value of Finger Identification to Basic Single-Touch Interaction on Touch Screens. In INTERACT IFIP Int. Conf. Human-Computer Interact.

[Ruiz and Vogel, 2015] Ruiz, J. and Vogel, D. (2015). Soft-Constraints to Reduce Legacy and Performance Bias to Elicit Whole-body Gestures with Low Arm Fatigue. In Proc. CHI ’15, pages 3347–3350.

[Salvucci et al., 2004] Salvucci, D. D., Kushleyeva, Y., and Lee, F. J. (2004). Toward an ACT-R general executive for human multitasking. pages 267–272.

[Samsung, 2017] Samsung (2017). What is Air View™, and what can I see when hovering over the display withmy Samsung Galaxy S4? http://www.samsung.com/us/support/answer/ANS00044011/.

[Saponas et al., 2009] Saponas, T. S., Tan, D. S., Morris, D., Balakrishnan, R., Turner, J., and Landay, J. A. (2009). Enabling always-available input with muscle-computer interfaces. In Proc. 22nd Annu. ACM Symp. User interface Softw. Technol. - UIST ’09, page 167.

[Schmidt and Lee, 2005] Schmidt, R. A. and Lee, T. D. (2005). Motor control and learning: A behavioral emphasis (4th ed.).

[Schonauer et al., 2012] Schonauer, C., Fukushi, K., Olwal, A., Kaufmann, H., and Raskar, R. (2012). Multimodal motion guidance. In Proc. 14th ACM Int. Conf. Multimodal Interact. - ICMI ’12, page 133.

[Schönauer et al., 2015] Schönauer, C., Mossel, A., Zaiţi, I.-A., and Vatavu, R.-D. (2015). Touch, movement and vibration: User perception of vibrotactile feedback for touch and mid-air gestures. In Human-Computer Interaction, pages 165–172. Springer.

[Schulman, 2017] Schulman, J. (2017). BlackBerry Storm 2 – and its piezoelectric soul – dissected at last. https://www.engadget.com/2009/08/25/blackberry-storm-2-and-its-piezoelectric-soul-finally-diss/.

[Segen and Kumar, 1998] Segen, J. and Kumar, S. (1998). Gesture VR: vision-based 3d hand interface for spatial interaction. In Proceedings of the Sixth ACM International Conference on Multimedia (Bristol, United Kingdom, September 13 - 16, 1998), MULTIMEDIA ’98, pages 455–464, New York, NY. ACM.

[Seim et al., 2014] Seim, C., Chandler, J., DesPortes, K., Dhingra, S., Park, M., and Starner, T. (2014). Passive haptic learning of Braille typing. In Proc. 2014 ACM Int. Symp. Wearable Comput. - ISWC ’14, pages 111–118.

[Seim et al., 2015] Seim, C., Estes, T., and Starner, T. (2015). Towards Passive Haptic Learning of piano songs. In 2015 IEEE World Haptics Conf., pages 445–450.

[Sergi et al., 2008] Sergi, F., Accoto, D., Campolo, D., and Guglielmelli, E. (2008). Forearm orientation guidance with a vibrotactile feedback bracelet: On the directionality of tactile motor communication. In 2008 2nd IEEE RAS EMBS Int. Conf. Biomed. Robot. Biomechatronics, pages 433–438.

[Sharma et al., 1996] Sharma, R., Huang, T., Pavovic, V., Zhao, Y., Lo, Z., Chu, S., Schulten, K., Dalke, A., Phillips, J., Zeller, M., and Humphrey, W. (1996). Speech/gesture interface to a visual computing environment for molecular biologists. In Proceedings of the International Conference on Pattern Recognition (ICPR ’96), page 964. IEEE Computer Society.

[Shaw and Green, 1997] Shaw, C. and Green, M. (1997). Thred: a two-handed design system. Multimedia Syst., 5(2):126–139.

[Shneiderman, 1983] Shneiderman, B. (1983). Direct manipulation: A step beyond programming languages.Computer, 16(8):57–69.

[Shull and Damian, 2015] Shull, P. B. and Damian, D. D. (2015). Haptic wearables as sensory replacement, sensory augmentation and trainer – a review. J. Neuroeng. Rehabil., 12(1):59.

[Sienko et al., 2013] Sienko, K. H., Balkwill, M., Oddsson, L. I. E., and Wall, C. (2013). The effect of vibrotactile feedback on postural sway during locomotor activities. J. Neuroeng. Rehabil., 10(1):93.

[Sigrist et al., 2013] Sigrist, R., Rauter, G., Riener, R., and Wolf, P. (2013). Terminal Feedback Outperforms Concurrent Visual, Auditory, and Haptic Feedback in Learning a Complex Rowing-Type Task. J. Mot. Behav., 45(6):455–472.

[Simeone, 2015] Simeone, A. L. (2015). Substitutional reality: Towards a research agenda. In 2015 IEEE 1st Work. Everyday Virtual Real., pages 19–22.

[Sodhi et al., 2013] Sodhi, R., Poupyrev, I., Glisson, M., and Israr, A. (2013). AIREAL: interactive tactile experiences in free air. ACM Trans. Graph., 32(4):1.

[Song et al., 2000] Song, C. G., Kwak, N. J., and Jeong, D. H. (2000). Developing an efficient technique of selection and manipulation in immersive v.e. In Proceedings of the ACM Symposium on Virtual Reality Software and Technology, VRST ’00, pages 142–146, New York, NY, USA. ACM.

[Song et al., 2012] Song, P., Goh, W. B., Hutama, W., Fu, C.-W., and Liu, X. (2012). A handle bar metaphor for virtual object manipulation with mid-air interaction. In Proc. CHI ’12, page 1297.

[Song et al., 2015] Song, S., Noh, G., Yoo, J., Oakley, I., Cho, J., and Bianchi, A. (2015). Hot & tight: exploring thermo and squeeze cues recognition on wrist wearables. In Proc. 2015 ACM Int. Symp. Wearable Comput. - ISWC ’15, pages 39–42, New York, New York, USA. ACM Press.

[Sony, 2017] Sony (2017). Floating touch™ – Developer World. https://developer.sonymobile.com/knowledge-base/technologies/floating-touch/.

[Spelmezan et al., 2009a] Spelmezan, D., Jacobs, M., Hilgers, A., and Borchers, J. (2009a). Tactile motion instructions for physical activities. In Proc. 27th Int. Conf. Hum. factors Comput. Syst. - CHI ’09, page 2243, New York, New York, USA. ACM Press.

[Spelmezan et al., 2009b] Spelmezan, D., Schanowski, A., and Borchers, J. (2009b). Wearable Automatic Feedback Devices for Physical Activities. In Proc. 4th Int. ICST Conf. Body Area Networks. ICST.

[Spink et al., 2009] Spink, A., Cole, C., and Waller, M. (2009). Multitasking behavior. Annu. Rev. Inf. Sci. Technol., 42(1):93–118.

[Stanley and Kuchenbecker, 2011] Stanley, A. A. and Kuchenbecker, K. J. (2011). Design of body-grounded tactile actuators for playback of human physical contact. In 2011 IEEE World Haptics Conf., pages 563–568.

[Stanley and Kuchenbecker, 2012] Stanley, A. A. and Kuchenbecker, K. J. (2012). Evaluation of Tactile Feedback Methods for Wrist Rotation Guidance. IEEE Trans. Haptics, 5(3):240–251.

[Steins et al., 2013] Steins, C., Gustafson, S., Holz, C., and Baudisch, P. (2013). Imaginary devices: Gesture-based interaction mimicking traditional input devices. In Proc. MobileHCI ’13, pages 123–126.

[Stevens and Choo, 1996] Stevens, J. C. and Choo, K. K. (1996). Spatial Acuity of the Body Surface over the Life Span. Somatosens. Mot. Res., 13(2):153–166.

[Sturman et al., 1989] Sturman, D. J., Zeltzer, D., and Pieper, S. (1989). Hands-on interaction with virtual environments. In Proceedings of the 2nd Annual ACM SIGGRAPH Symposium on User Interface Software and Technology, UIST ’89, pages 19–24, New York, NY, USA. ACM.

[Su et al., 2013] Su, C.-H., Chan, L., Weng, C.-T., Liang, R.-H., Cheng, K.-Y., and Chen, B.-Y. (2013). NailDisplay. In Proc. CHI ’13, page 1461.

[Sugiura and Koseki, 1998] Sugiura, A. and Koseki, Y. (1998). A user interface using fingerprint recognition. InProc. UIST ’98, pages 71–79.

[Suhonen et al., 2012] Suhonen, K., Väänänen-Vainio-Mattila, K., and Mäkelä, K. (2012). User experiences and expectations of vibrotactile, thermal and squeeze feedback in interpersonal communication. Proc. 26th Annu. BCS Interact. Spec. Gr. Conf. People Comput.

[Swanner, 2015] Swanner, N. (2015). Sonavation has bonded 3D fingerprint sensors to Gorilla Glass. http://thenextweb.com/insider/2015/07/21/sonovation-has-bonded-3d-fingerprint-sensors-to-gorilla-glass-kiss-your-home-button-goodbye/.

[Swindells et al., 2002] Swindells, C., Inkpen, K. M., Dill, J. C., and Tory, M. (2002). That one there! pointing to establish device identity. In Proceedings of the 15th Annual ACM Symposium on User Interface Software and Technology, UIST ’02, pages 151–160, New York, NY, USA. ACM.

[Takeoka et al., 2010] Takeoka, Y., Miyaki, T., and Rekimoto, J. (2010). Z-touch. In ACM Int. Conf. Interact. Tabletops Surfaces - ITS ’10, page 91, New York, New York, USA. ACM Press.

[Tan and Pentland, 2001] Tan, H. and Pentland, A. (2001). Tactual displays for sensory substitution and wearable computers. In Barfield, W. and Caudell, T., editors, Fundamentals of wearable computers and augmented reality, pages 579–598. Erlbaum, Mahwah, NJ.

[Tejeiro et al., 2012] Tejeiro, C., Stepp, C. E., Malhotra, M., Rombokas, E., and Matsuoka, Y. (2012). Comparison of remote pressure and vibrotactile feedback for prosthetic hand control. In 2012 4th IEEE RAS EMBS Int. Conf. Biomed. Robot. Biomechatronics, pages 521–525. IEEE.

[van de Camp et al., 2013] van de Camp, F., Schick, A., and Stiefelhagen, R. (2013). How to Click in Mid-Air,pages 78–86. Springer Berlin Heidelberg, Berlin, Heidelberg.

[van den Hoven and Mazalek, 2011] van den Hoven, E. and Mazalek, A. (2011). Grasping gestures: Gesturing with physical artifacts. AI EDAM, 25(3):255–271.

[van der Linden et al., 2011a] van der Linden, J., Johnson, R., Bird, J., Rogers, Y., and Schoonderwaldt, E.(2011a). Buzzing to play. In Proc. CHI ’11, page 533.

[van der Linden et al., 2011b] van der Linden, J., Schoonderwaldt, E., Bird, J., and Johnson, R. (2011b). MusicJacket—Combining Motion Capture and Vibrotactile Feedback to Teach Violin Bowing. IEEE Trans. Instrum. Meas., 60(1):104–113.

[van Erp and Spapé, 2003] van Erp, J. and Spapé, M. (2003). Distilling the underlying dimensions of tactile melodies. In Proceedings of Eurohaptics, pages 111–120. Eurohaptics Society, Paris.

[van Erp and van Veen, 2003] van Erp, J. and van Veen, H. (2003). A multi-purpose tactile vest for astronauts in the International Space Station. In Proceedings of Eurohaptics, pages 405–408, Paris. Eurohaptics Society.

[Vatavu and Radu-Daniel, 2012] Vatavu, R.-D. and Radu-Daniel (2012). Point & click mediated interactions for large home entertainment displays. Multimed. Tools Appl., 59(1):113–128.

[Vega-Bermudez and Johnson, 2001] Vega-Bermudez, F. and Johnson, K. O. (2001). Differences in spatial acuity between digits. Neurology, 56(10):1389–1391.

[Vogel and Balakrishnan, 2005] Vogel, D. and Balakrishnan, R. (2005). Distant freehand pointing and clicking on very large, high resolution displays. In Proc. UIST ’05, pages 33–42.

[Vogel and Baudisch, 2007] Vogel, D. and Baudisch, P. (2007). Shift. In Proc. SIGCHI Conf. Hum. factors Comput. Syst. - CHI ’07, page 657.

[von Bekesy, 1957] von Bekesy, G. (1957). Sensations on the skin similar to directional hearing, beats, and harmonics of the ear. J. Acoust. Soc. Am., 29(4):489–501.

[Wagner et al., 2014] Wagner, J., Lecolinet, E., and Selker, T. (2014). Multi-finger chords for hand-held tablets. In Proc. 32nd Annu. ACM Conf. Hum. factors Comput. Syst. - CHI ’14, pages 2883–2892.

[Walter et al., 2013] Walter, R., Bailly, G., and Müller, J. (2013). StrikeAPose: revealing mid-air gestures on public displays. In Proc. CHI ’13, page 841.

[Wang et al., 2006] Wang, Q., Levesque, V., Pasquero, J., and Hayward, V. (2006). A haptic memory game using the stress2 tactile display. In CHI ’06 Extended Abstracts on Human Factors in Computing Systems, pages 271–274. ACM Press, New York.

[Wang and Popovic, 2009] Wang, R. and Popovic, J. (2009). Real-time hand-tracking with a color glove. In Hoppe, H., editor, ACM SIGGRAPH 2009 Papers (SIGGRAPH ’09), Article 63, New York, NY, USA. ACM.

[Wang et al., 2012] Wang, R., Quek, F., Tatar, D., Teh, K. S., and Cheok, A. (2012). Keep in touch. In Proc. 2012 ACM Annu. Conf. Hum. Factors Comput. Syst. - CHI ’12, page 139, New York, New York, USA. ACM Press.

[Webb et al., 2016] Webb, A. M., Pahud, M., Hinckley, K., and Buxton, B. (2016). Wearables as Context for Guiard-abiding Bimanual Touch. In Proc. 29th Annu. Symp. User Interface Softw. Technol. - UIST ’16, pages 287–300, New York, New York, USA. ACM Press.

[Weigel et al., 2015] Weigel, M., Lu, T., Bailly, G., Oulasvirta, A., Majidi, C., and Steimle, J. (2015). iSkin. In Proc. 33rd Annu. ACM Conf. Hum. Factors Comput. Syst. - CHI ’15, pages 2991–3000.

[Weigel et al., 2014] Weigel, M., Mehta, V., and Steimle, J. (2014). More than touch. In Proc. 32nd Annu. ACM Conf. Hum. factors Comput. Syst. - CHI ’14, pages 179–188.

[Wellner, 1991] Wellner, P. (1991). The digitaldesk calculator: Tangible manipulation on a desk top display. InProceedings of the 4th Annual ACM Symposium on User Interface Software and Technology, UIST ’91, pages27–33, New York, NY, USA. ACM.

[Wells and Fewtrell, 2006] Wells, J. and Fewtrell, M. (2006). Measuring body composition. Archives of disease in childhood, 91(7):612–617.

[Westerman and Elias, 2006] Westerman, W. C. and Elias, J. G. (2006). System and method for packing multi-touch gestures onto a hand. US Patent 7,705,830.

[Wexelblat, 1995] Wexelblat, A. (1995). An approach to natural gesture in virtual environments. ACM Trans. Comput.-Hum. Interact., 2(3):179–200.

[Wigdor and Balakrishnan, 2003] Wigdor, D. and Balakrishnan, R. (2003). TiltText. In Proc. UIST ’03, pages 81–90.

[Wigdor et al., 2007] Wigdor, D., Forlines, C., Baudisch, P., Barnwell, J., and Shen, C. (2007). Lucid touch. In Proc. UIST ’07, page 269.

[Williamson et al., 2007] Williamson, J., Murray-Smith, R., and Hughes, S. (2007). Devices as interactive physical containers: The shoogle system. In CHI ’07 Extended Abstracts on Human Factors in Computing Systems, pages 2013–2018. ACM Press, New York.

[Wilson, 2006] Wilson, A. D. (2006). Robust computer vision-based detection of pinching for one and two-handed gesture input. In Proc. 19th Annu. ACM Symp. User interface Softw. Technol. - UIST ’06, page 255,New York, New York, USA. ACM Press.

[Withana et al., 2015] Withana, A., Peiris, R., Samarasekara, N., and Nanayakkara, S. (2015). zSense. In Proc. 33rd Annu. ACM Conf. Hum. Factors Comput. Syst. - CHI ’15, pages 3661–3670.

[Wu and Balakrishnan, 2003] Wu, M. and Balakrishnan, R. (2003). Multi-finger and whole hand gestural interaction techniques for multi-user tabletop displays. In Proceedings of the 16th Annual ACM Symposium on User Interface Software and Technology, UIST ’03, pages 193–202, Vancouver, Canada. ACM.

[Xia et al., 2015] Xia, H., Grossman, T., and Fitzmaurice, G. (2015). NanoStylus: Enhancing input on ultra-small displays with a finger-mounted stylus. In Proc. UIST ’15.

[Xia et al., 2014] Xia, H., Jota, R., McCanny, B., Yu, Z., Forlines, C., Singh, K., and Wigdor, D. (2014). Zero-latency tapping. In Proc. 27th Annu. ACM Symp. User interface Softw. Technol. - UIST ’14, pages 205–214, New York, New York, USA. ACM Press.

[Xiao et al., 2014] Xiao, R., Laput, G., and Harrison, C. (2014). Expanding the input expressivity of smartwatches with mechanical pan, twist, tilt and click. In Proc. 32nd Annu. ACM Conf. Hum. factors Comput. Syst. - CHI ’14, pages 193–196.

[Xiao et al., 2015] Xiao, R., Schwarz, J., and Harrison, C. (2015). Estimating 3D Finger Angle on CommodityTouchscreens. In Proc. 2015 Int. Conf. Interact. Tabletops Surfaces - ITS ’15, pages 47–50, New York, NewYork, USA. ACM Press.

[Xu et al., 2011] Xu, C., Israr, A., Poupyrev, I., Bau, O., and Harrison, C. (2011). Tactile display for the visually impaired using TeslaTouch. In Ext. Abs. CHI’11, pages 317–322.

[Yanagida et al., 2004] Yanagida, Y., Kakita, M., Lindeman, R., Kume, Y., and Tetsutani, N. (2004). Vibrotactile letter reading using a low-resolution tactor array. In Proceedings of Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems (HAPTICS ’04), pages 400–406, Los Alamitos, CA. IEEE Computer Society.

[Yang et al., 2008] Yang, X.-D., Bischof, W. F., and Boulanger, P. (2008). Validating the Performance of Haptic Motor Skill Training. In 2008 Symp. Haptic Interfaces Virtual Environ. Teleoperator Syst., pages 129–135.

[Yang et al., 2011] Yang, X.-D., Grossman, T., Irani, P., and Fitzmaurice, G. (2011). TouchCuts and TouchZoom. In Proc. CHI ’11, page 2585.

[Yang et al., 2009] Yang, X.-D., Mak, E., Irani, P., and Bischof, W. F. (2009). Dual-Surface input. In Proc. 11th Int. Conf. Human-Computer Interact. with Mob. Devices Serv. - MobileHCI ’09, page 1.

[Yatani et al., 2012] Yatani, K., Banovic, N., and Truong, K. (2012). SpaceSense. In Proc. 2012 ACM Annu. Conf. Hum. Factors Comput. Syst. - CHI ’12, page 415, New York, New York, USA. ACM Press.

[Yatani et al., 2008] Yatani, K., Partridge, K., Bern, M., and Newman, M. W. (2008). Escape. In Proceeding twenty-sixth Annu. CHI Conf. Hum. factors Comput. Syst. - CHI ’08, page 285.

[Yatani and Truong, 2009] Yatani, K. and Truong, K. N. (2009). SemFeel. In Proc. 22nd Annu. ACM Symp. User interface Softw. Technol. - UIST ’09, page 111.

[Yee, 2003] Yee, K.-P. (2003). Peephole displays. In Proc. Conf. Hum. factors Comput. Syst. - CHI ’03, page 1.

[Ying Zheng et al., 2013] Ying Zheng, Su, E., and Morrell, J. B. (2013). Design and evaluation of pactors for managing attention capture. In 2013 World Haptics Conf., pages 497–502. IEEE.

[Yousefpor and Bussat, 2014] Yousefpor, M. and Bussat, J. (2014). Fingerprint Sensor in an Electronic Device. US Pat. App. 14/451,076.

[Yu et al., 2011] Yu, N.-H., Tsai, S.-S., Hsiao, I.-C., Tsai, D.-J., Lee, M.-H., Chen, M. Y., and Hung, Y.-P. (2011).Clip-on gadgets. In Proc. UIST ’11, page 367.

[Zhai, 2012] Zhai, S. (2012). Foundational Issues in Touch-Surface Stroke Gesture Design — An Integrative Review. Found. Trends® Human–Computer Interact., 5(2):97–205.

[Zhai et al., 2009] Zhai, S., Kristensson, P. O., Gong, P., Greiner, M., Peng, S. A., Liu, L. M., and Dunnigan, A. (2009). Shapewriter on the iphone. In Proc. 27th Int. Conf. Ext. Abstr. Hum. factors Comput. Syst. - CHI EA ’09, page 2667.

[Zhao et al., 2014] Zhao, C., Chen, K.-Y., Aumi, M. T. I., Patel, S., and Reynolds, M. S. (2014). SideSwipe. In Proc. UIST ’14, pages 527–534.

[Zhao and Balakrishnan, 2004] Zhao, S. and Balakrishnan, R. (2004). Simple vs. compound mark hierarchical marking menus. In Proceedings of the 17th Annual ACM Symposium on User Interface Software and Technology, UIST ’04, pages 33–42, Santa Fe, NM, USA. ACM.

[Zheng and Morrell, 2012] Zheng, Y. and Morrell, J. B. (2012). Haptic actuator design parameters that influence affect and attention. In 2012 IEEE Haptics Symp., pages 463–470. IEEE.

[Zimmerman et al., 1987] Zimmerman, T. G., Lanier, J., Blanchard, C., Bryson, S., and Harvill, Y. (1987). A hand gesture interface device. In Proceedings of the SIGCHI/GI Conference on Human Factors in Computing Systems and Graphics Interface, CHI ’87, pages 189–192, New York, NY, USA. ACM.

[Zimmerman et al., 1995] Zimmerman, T. G., Smith, J. R., Paradiso, J. A., Allport, D., and Gershenfeld, N. (1995). Applying electric field sensing to human-computer interfaces. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’95, pages 280–287, New York, NY, USA. ACM Press/Addison-Wesley Publishing Co.