The State of the AR(t) of the User Interface

Prof. Dr. Otmar Hilliges

11/16/2014

whoami

• Professor of Computer Science at ETHZ since March 2013


Computer Science @ ETH

• Networked Systems & Parallel Computing
• Data Management & Machine Learning
• Pervasive Computing & Embedded Systems
• Information & System Security
• Visual Computing
• Theory & Algorithms
• Programming Languages & Software Engineering

Visual Computing @ ETH

People:
• 4 Professors
• 65+ PhD students and Postdocs

Areas:
• Computer Graphics
• Computer Vision
• Image Processing
• Human Computer Interaction
• Robotics
• Digital Fabrication
• …

[Figure: input vs. optimized result]

Computer Graphics Lab (CGL)

Prof. Markus Gross is head of CGL and Disney Research Zurich.

CGL has 7 senior researchers and 21 PhD students.

Research topics:
• Physics, Animation and Fabrication
• Computer-based Learning
• Capture and Geometry
• Images and Video

Prof. Markus Gross

Disney Research Zurich


Research areas: Computer Graphics, Vision and Sensing, Animation, Stereo and Displays, Rendering, Imaging and Video, Materials, Effects and Capture, Robotics

Interactive Geometry Lab (IGL)

Interactive shape modeling

Digital geometry processing


Modeling for 3D fabrication

[Figure: input vs. optimized result]

Prof. Olga Sorkine-Hornung

Computer Vision Lab

Prof. Marc Pollefeys

ETH Games Programming Lab




Advanced Interactive Technologies Lab

http://ait.ethz.ch/

Faculty:

Otmar Hilliges

PhD Students:

Tobias Nägeli, Jie Song, Karthik Sheshadri

Postdoc:

Fabrizio Pece

“How will Humans use Technology in the Future?” “How do Humans Interact with Technology?”

The AIT Agenda


Develop novel algorithms for input recognition & semantic interpretation

Understand user needs and capabilities

Push the boundaries of what humans can do with computers

Build complex and interactive systems to verify algorithmic work

What is a Computer?


Our mental model

The computer’s mental model


New Computers


Research map — device classes: Desktop, Mobile, Stationary; themes: Input Recognition, Environment Reconstruction.

Projects: Motion Keyboard, Human-Centric Flight, 2Ft PC, TUI 3D, Augmented Reality, Human Robot Interaction, Wearables, HoloDesk, Digits, Poke A Process, KinectFusion, RoomWare, Beamatron, GesturePhone, AR Projection

HoloDesk: Direct 3D Interactions with a Situated See-Through Display

[Hilliges, Kim, Izadi, Weiss. In CHI ’12]

Video and Project Links

Project page: http://ait.inf.ethz.ch/projects/2012/holodesk/

Video: http://youtu.be/JHL5tJ9ja_w

[Research map slide repeated as a transition: Desktop · Mobile · Stationary — Input Recognition, Environment Reconstruction]


A Motion Sensing Mechanical Keyboard [Taylor, Keskin, Hilliges, et al. In ACM CHI ’14 (Best Paper Award)]

Keyboard Gestures

Sensor Data

Traditional Machine Learning Pipeline [Nowozin et al., MSR TR ’11] [Song et al. CHI ’11]:

Sensor Image → Extract Features → Form Feature Vector → Spot Gesture → Classify Gesture

One-Shot Gesture Classification [Zhang et al. CHI ’14]
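The pipeline stages above can be sketched as plain functions. The quadrant features, the motion-energy spotting rule, and the nearest-template classifier below are illustrative stand-ins, not the methods from the cited papers:

```python
import numpy as np

def extract_features(frame):
    # Hypothetical features: mean intensity per image quadrant.
    h, w = frame.shape
    quads = [frame[:h // 2, :w // 2], frame[:h // 2, w // 2:],
             frame[h // 2:, :w // 2], frame[h // 2:, w // 2:]]
    return np.array([q.mean() for q in quads])

def form_feature_vector(frames):
    # Concatenate per-frame features over a short temporal window.
    return np.concatenate([extract_features(f) for f in frames])

def spot_gesture(feature_vec, energy_threshold=0.1):
    # A gesture is "spotted" when feature variation exceeds a threshold.
    return feature_vec.std() > energy_threshold

def classify_gesture(feature_vec, templates):
    # Nearest-template matching as a stand-in for a learned classifier.
    names = list(templates)
    dists = [np.linalg.norm(feature_vec - templates[n]) for n in names]
    return names[int(np.argmin(dists))]
```

In the real systems the last stage is a learned model (e.g. a decision forest, as described next), but the stage boundaries are the same.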

Decision Trees

Is top part blue?

Is bottom part green?

Is bottom part blue?

A decision tree

Learned Categories

[Criminisi et al. 2011]

Learned Splits


Tree structure learned from training data


Input as vector v = (x₁, x₂, …, x_d)

Internal nodes split the data in binary fashion: S_j = S_j^L ∪ S_j^R

Probability of label c depending on input v: p(c | v)

Learned Probability distributions

[Criminisi et al. 2011]

Randomized Decision Forest

[Tree diagram: root split F0 with threshold τ0; children F1 (threshold τ1) and F2 (threshold τ2); leaves F3–F6; an input is routed to a leaf, yielding e.g. “Swipe Down”]
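The tree traversal and the forest's averaging of leaf distributions can be sketched as follows. The class names, thresholds, and labels are illustrative, not from the talk:

```python
class Node:
    """Internal node: feature/threshold split. Leaf: class-probability dict."""
    def __init__(self, feature=None, threshold=None,
                 left=None, right=None, probs=None):
        self.feature, self.threshold = feature, threshold
        self.left, self.right = left, right
        self.probs = probs  # set only at leaves

def tree_predict(node, v):
    # Route input vector v down the learned binary splits to a leaf,
    # then return that leaf's stored distribution p(c | v).
    while node.probs is None:
        node = node.left if v[node.feature] < node.threshold else node.right
    return node.probs

def forest_predict(trees, v):
    # A randomized decision forest averages the leaf distributions of its
    # independently trained trees (cf. Criminisi et al. 2011).
    avg = {}
    for t in trees:
        for label, p in tree_predict(t, v).items():
            avg[label] = avg.get(label, 0.0) + p / len(trees)
    return avg
```

Randomizing the training data and split candidates per tree is what makes the averaged prediction more robust than any single tree.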

Is single-frame gesture classification possible?

Temporal information is crucial!
a) Raise/Lower Hand  b) Swipe Down  c) Swipe Left  d) Swipe Right

Motion Signatures – Binary Motion History Images

bMHI(t) = Σ_{i=0}^{k−1} w_{k−1−i} · I^m_{t−i},  where w_i = 2i / (k(k−1)) and I^m_t = [I_t(x) > τ]
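Under this reading of the formula (weights w_i = 2i/(k(k−1)), which sum to 1 and emphasize recent frames; binary mask I^m via threshold τ), the bMHI can be sketched in NumPy. The frame ordering and the value of τ are assumptions:

```python
import numpy as np

def bmhi(frames, tau=0.1):
    # Binary Motion History Image over the last k frames (assumes k >= 2).
    # frames is ordered oldest -> newest, so frames[-1 - i] is frame t - i.
    k = len(frames)
    out = np.zeros_like(frames[0], dtype=float)
    for i in range(k):
        w = 2.0 * (k - 1 - i) / (k * (k - 1))  # weight w_{k-1-i}
        out += w * (frames[-1 - i] > tau)      # binary motion mask I^m_{t-i}
    return out
```

Because the weights decay toward older frames, a bright region in the output indicates recent motion and fades as the motion recedes into the past.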

Motion Signatures – Intensity Motion History Images

iMHI(t) = Σ_{i=0}^{k−1} w_{k−1−i} · I^d_{t−i},  where w_i = 2i / (k(k−1)) and I^d_t is the intensity image at time t


Classifying Motion Signatures

[Figure: motion history images MHI(t₀), MHI(t₁), MHI(t₂) computed at successive frames t₀, t₁, t₂]

Gesture Recognition

Confusion matrix (rows: true class, columns: predicted class):

              Static  Transition  Dynamic  Hover  Combination  Non-Gesture
Static         0.93     0.00       0.00     0.05     0.00         0.02
Transition     0.03     0.95       0.00     0.02     0.00         0.00
Dynamic        0.00     0.00       0.94     0.00     0.04         0.02
Hover          0.05     0.01       0.00     0.94     0.00         0.00
Combination    0.00     0.04       0.07     0.00     0.89         0.00
Non-Gesture    0.02     0.00       0.01     0.02     0.00         0.95

• 93% per-frame accuracy
• Almost perfect gesture classification performance

Video and Project Links

Project page: http://ait.inf.ethz.ch/projects/2014/96Bytes/

Video: http://youtu.be/Y3dUeGNIX4M

[Research map slide repeated as a transition: Desktop · Mobile · Stationary — Input Recognition, Environment Reconstruction]

In-air Gestures Around Unmodified Mobile Devices [Song, Sörös, Pece, Fanello, Izadi, Keskin, Hilliges. In ACM UIST ’14]

Image: pixgood.com

Issues with Touch Input

Complement Touch with Gestures

[Hilliges et al. CHI’12]

[Oikonomidis et al. ICCV’11]

Problem Statement


[Figure: input → segmentation → labeled output]

Runtime Efficiency vs. Memory Footprint

Tree depth H = log(N); number of nodes N = 2^(H−1):
• H = 30: 2^(30−1) = 536,870,912 nodes ≈ 4.3 GB
• H = 5: 2^(5−1) = 16 nodes ≈ 16 bytes
• …

[Keskin et al. ECCV 2012]
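The slide's arithmetic is easy to reproduce; note that the bytes-per-node figure below is an assumption chosen to match the ≈4.3 GB quoted for depth 30, not a number from the talk:

```python
def tree_nodes(depth):
    # Number of nodes in a binary tree of depth H, per the slide: N = 2^(H-1).
    return 2 ** (depth - 1)

def tree_memory_bytes(depth, bytes_per_node=8):
    # Memory footprint grows exponentially with depth -- the motivation for
    # cascading shallow forests instead of training one very deep forest.
    return tree_nodes(depth) * bytes_per_node
```

At 8 bytes per node, depth 30 yields 2^29 × 8 = 2^32 bytes, i.e. exactly the ≈4.3 GB on the slide, while depth 5 stays trivially small.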

The Pipeline

Raw data → Coarse Depth Classification (too close / active range / too far) → Skin Detection → Connected Components → Hand Segmentation

Skin detection: S_t(u) = 1 if min(R_t(u) − G_t(u), R_t(u) − B_t(u)) > τ_t, and 0 otherwise
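The skin detection rule above translates directly to NumPy: a pixel counts as skin when its red channel exceeds both green and blue by more than the threshold τ_t. A minimal sketch (the example threshold is an assumption):

```python
import numpy as np

def skin_mask(rgb, tau):
    # S_t(u) = 1 if min(R - G, R - B) > tau, else 0: skin pixels are
    # noticeably redder than they are green or blue.
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    return (np.minimum(r - g, r - b) > tau).astype(np.uint8)
```

The resulting binary mask is what the connected-components stage then groups into candidate hand regions.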

Depth Classification Forest (DCF)

The Pipeline: Raw data → Coarse Depth Classification (too close / active range / too far) → Skin Detection → Connected Components → Hand Segmentation → Shape Classification

Shape Classification Forest (SCF)

The Pipeline: Raw data → Coarse Depth Classification (too close / active range / too far) → Skin Detection → Connected Components → Hand Segmentation → Shape Classification → Part Classification

Part Classification Forest (PCF)

DCF+SCF vs SCF Only
• DCF + SCFs: depth 5+10, 4.5 MB, 85% accuracy
• SCFs only: depth 15, 110 MB, 50% accuracy
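The gain in the comparison above comes from cascading: a shallow coarse forest (DCF) first picks a bucket, and only then does a specialized expert forest (SCF) for that bucket run. A minimal sketch of the cascade idea, with hypothetical stage functions standing in for the trained forests:

```python
def cascade_classify(x, coarse, experts):
    # Stage 1: cheap coarse classifier picks a bucket (e.g. quantized depth).
    bucket = coarse(x)
    # Stage 2: the expert trained only for that bucket does fine-grained work.
    return experts[bucket](x)
```

Because each expert only has to cover one bucket, every stage can stay shallow, which is exactly what keeps the combined memory footprint at megabytes instead of gigabytes.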

Confusion Matrix

93% per-frame accuracy (Leave-One-Subject-Out). Rows: true class, columns: predicted class:

               Pinch Open  Pinch Close  Pointing  Gun   Splayed  Flat  No-Gesture
Pinch Open        0.88        0.03        0.0     0.0     0.0    0.0     0.02
Pinch Close       0.0         0.93        0.05    0.0     0.0    0.0     0.01
Pointing          0.02        0.01        0.9     0.04    0.0    0.0     0.01
Gun               0.0         0.0         0.02    0.95    0.0    0.0     0.0
Splayed Hand      0.0         0.0         0.0     0.01    0.99   0.0     0.0
Flat Hand         0.05        0.0         0.0     0.0     0.01   0.99    0.11
No-Gesture        0.05        0.03        0.03    0.0     0.0    0.01    0.85

Applications

Not Only Mobile Phones


Video and Project Links

Project page: http://ait.inf.ethz.ch/projects/2014/InAirGesture/

Video: http://youtu.be/T9e-VYPBir8

[Research map slide repeated as a transition: Desktop · Mobile · Stationary — Input Recognition, Environment Reconstruction]

ETH Zürich – Department of Computer Science – AIT Lab – Prof. Dr. Otmar Hilliges