Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control...

33
Human motion prediction for “human-aware” robots Jim Mainprice, PhD “HUMANS TO ROBOTS MOTION” (HRM) Research Group Leader

Transcript of Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control...

Page 1: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Human motion prediction for “human-aware” robots

Jim Mainprice, PhD

“HUMANS TO ROBOTS MOTION” (HRM)

Research Group Leader

Page 2: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice

Human Space Sharing Skills

• Dynamic Environment

- Reactive

- Accurate

• Multi-Agent Coordination

- "Body language"

- Anticipation

2

Page 3: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice

Human Interactive Manipulation

• Pick

• Place

➡ Give

➡ Receive

➡ Co-Manipulate

3

• Pick

• Place

➡ Give

➡ Receive

➡ Co-Manipulate

Page 4: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice4

Dynamic Social Movement

Page 5: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice5

Anticipation

Page 6: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice6

Legibility

Anticipation

?

Page 7: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice7

Legibility

Coordination

Anticipation

Page 8: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice8

Legibility

Anticipation

Coordination

Learning

Page 9: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice9

Human-Robot Interactions

Model of the human, context of the task

and the context of the interaction

Robot Models

Mental model of the robot, social cognition

and interpersonal perception difference

Human Models

Task

actionac

tion

Interaction Models

Models of the communication and interaction

[Multu, Springer Handbook of Robotics, 2016]

Page 10: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

yt yt st

11 at

st st−11

State

Estimation

Optimal

ControlPrediction

Human

Estimationt ht−1

ht

Perception

Action

Uncertainty

Robot Models

Page 11: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Outline

1. Human Aware Motion Planning

2. Inverse Optimal Control of Collaborative Motion

3. Combining with Data Driven Dynamical Models

Page 12: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Outline

1. Human Aware Motion Planning

2. Inverse Optimal Control of Collaborative Motion

3. Combining with Data Driven Dynamical Models

Page 13: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice13

Anthropological Studies :

Theory of « proxemics » [Hall66]

1 - Distance 2 - Visibility

Collaborators : Rachid Alami, Thierry Siméon, Daniel Sidobre, LAAS-CNRS

Publications : IARP 2010, ICRA 2011, ROMAN 2012, SORO 2014

“Human-Aware” extension of motion planning algorithms

3 – Musculoskeletal Effort

Workspace

W = {x 2 R3}

Elementary Interaction Costs

Trajectory Space

·Explores trajectories globally

Explores trajectories locally

Ξ = {ξ | ξ : [0, 1] ! C}

T-RRT

STOMP

ξ1

ξ1

Configurations Space

C = {q 2 Rd}

Page 14: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice14

(a) Chemin RRT (b) Chemin Costmap

Collaborators : Rachid Alami, Thierry Siméon, Daniel Sidobre, et al. LAAS-CNRS

Publications : IARP 2010, ICRA 2011, ROMAN 2012, SORO 2014

Motion planning withbinaire vs. continu cost maps

RRT T-RRT + STOMP

Motion planning with statique vs. mobile human receiver

Statique Mobile

“Human-Aware” extension of motion planning algorithms

Page 15: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice

C1

ξ11 C2

ξ1 ξ2

Collaborators : Dmitry Berenson, Worcester Polytechnic Institute

Publication : IROS 2013

Anticipation of human movement in dynamic motion planning

15

Number of classes

Partial trajectory Probability to belong to class mencoded in a GMM

Probability of traversein class m

Voxel

p(x|ξ) =M∑

m=1

p(x|Cm)p(Cm|ξ)

Red and green, high and low probability to be traversed

Solution : Prediction of swept volume

Page 16: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Outline

1. Human Aware Motion Planning

2. Inverse Optimal Control of Collaborative Motion

3. Combining with Data Driven Dynamical Models

Page 17: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice17Collaborators : Rafi Haynes et Dmitry Berenson, Worcester Polytechnic Institute

Publications : ICRA 2015, TRO 2016

Imitation of interactive behaviors

yt yt st

st st−11

State

Estimation

Optimal

Control

Perception

Action

How to balance the elementary interaction

features in the cost function ?

11 at

1) Demonstration

2) Features

Demonstrations Sampling Cost function

ξ1

Solution : Inverse Optimal Control

c(ξ)|{z}

cost function

=

Z

T

X

i=1:N

wi|{z}

weight

e�kξ(t)�xik2

| {z }

RBF

dt

Page 18: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice

Collaborative Manipulation Experiment

18Collaborators : Rafi Haynes et Dmitry Berenson, Worcester Polytechnic Institute

Publications : ICRA 2015, TRO 2016

Page 19: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice

Goalset Stochastic Inverse Optimal Control

Demonstration Trajectory Sampling Learned cost landscape

Learning

Prediction

PIIRL [Kalakrishnan 13]

STOMP [Kalakrishnan 11]

Goalset Trajectory Sampling

- Modified covariance

- Project the samples to the goal region

with respect to the smoothness metric

19

minimize∆ξ

1

2||∆⇠||2R

subject to h(⇠t +∆⇠) = 0

X �

ξ =⇥q1 . . . qN

⇤T

Trajectory vector

Smoothness Metric

R ← KTK ;

Collaborators : Rafi Haynes et Dmitry Berenson, Worcester Polytechnic Institute

Publications : ICRA 2015, TRO 2016

Page 20: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice

Learning Collaborative Motion Objectives

20

Start State Final State

Collaborative Experiment

Trajectory Sampling

Goalset

Trajectory

Sampling

Inverse Optimal Control

Human

Motion

Library

θ

Collaborators : Rafi Haynes et Dmitry Berenson, Worcester Polytechnic Institute

Publications : ICRA 2015, TRO 2016

Page 21: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice

Interactive Features Importance

210

50

10

0

15

0

20

0

25

0

30

0

Recovered Weights

JLength

JVelocity

JAcceleration

JJerk

TLength

TVelocity

TAcceleration

TJerk

d(Pelvis, Pelvis)

d(Pelvis , rWristX)

d(Pelvis , rElbowZ)

d(Pelvis , rShoulderX)

d(rWristX , Pelvis)

d(rWristX , rWristX)

d(rWristX , rElbowZ)

d(rWristX , rShoulderX)

d(rElbowZ , Pelvis)

d(rElbowZ , rWristX)

d(rElbowZ , rElbowZ)

d(rElbowZ , rShoulderX)

d(rShoulderX , Pelvis)

d(rShoulderX , rWristX)

d(rShoulderX , rElbowZ)

d(rShoulderX , rShoulderX)

CONFIG INDEX SPINE 0

CONFIG INDEX SPINE 1

CONFIG INDEX SPINE 2

CONFIG INDEX ARM RIGTH SHOULDER 0

CONFIG INDEX ARM RIGTH SHOULDER 1

CONFIG INDEX ARM RIGTH SHOULDER 2

CONFIG INDEX ARM RIGTH ELBOW 0

CONFIG INDEX ARM RIGTH ELBOW 1

CONFIG INDEX ARM RIGTH ELBOW 2

CONFIG INDEX ARM RIGTH WRIST 0

CONFIG INDEX ARM RIGTH WRIST 1

CONFIG INDEX ARM RIGTH WRIST 2

Significant interference example

• Interpersonal distances – Between the center of each link

• Shoulder

• Elbow

• Wrist

• C-space distance to a

resting posture

– 12 DoFs considered

• Smoothness

– Length

– Sum of squared velocity,

acceleration and jerk

Interpersonal distances

Collaborators : Rafi Haynes et Dmitry Berenson, Worcester Polytechnic Institute

Publications : ICRA 2015, TRO 2016

Page 22: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice22

15 users x 8 execution = 2120 trajectories

Anticipation of Human-Robot movement by Inverse Optimal Control

Collaborators : Rafi Haynes et Dmitry Berenson, Worcester Polytechnic Institute

Publications : ICRA 2015, TRO 2016

IOC better than manual tuning of the

cost function and GMMs

d(T1, T2) = kp1 � p2k+ 0.1 ⇤ cos−1(|hv1, v2i|)

• The robot is executing fixed trajectories

• Compare against baseline tunings

– Conservative : all distances active

– Agressive: no-interlink distances

• Compare with multiple metrics

– Joint center distances

– Task space metric

Page 23: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Outline

1. Human Aware Motion Planning

2. Inverse Optimal Control of Collaborative Motion

3. Combining with Data Driven Dynamical Models

Page 24: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice24Collaborators : Philipp Kartzer, Stuttgart Uni.

Publication : IEEE Humanoids 2018, IROS Workshop 2018, ECCV Workshop 2018

Long-term activity and motion prediction

Activity

Anticipation

Kinematic

Prediction 1kHz10Hz

Mode

Trajectory

Perception

Prediction

- Reaching

- Placing

- Pouring

- Drinking

- ….

X �

ξ =⇥q1 . . . qN

⇤T

Torque

Prediction1Hz

Page 25: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice25

Data gathering of activity and motion

~1 hour of data

7 different activities

Tracking objects and human

Collaborators : Philipp Kartzer, Jan Hoffman Stuttgart Uni.

PUPIL sensor Fullbody Data

Page 26: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice26

Combining Data Driven Dynamical Models with Trajectory Optimization

Collaborators : Philipp Kartzer, Jan Hoffman Stuttgart Uni.

Publications : IEEE Humanoids 2018, IROS Workshop 2018, ECCV Workshop 2018

• Discrete trajectory of states: ξt = (s0, . . . , st)

• Step 1:

• Learn dynamic behavior of humans: st+1 = f (ξt)

• We do this using a Gaussian process (GP)

• Step 2:

• Unroll the prediction

• Iteratively apply st+1 = f (ξt)

• Step 3:

• Account for constraints e.g. target state

• Optimize the trajectory

Page 27: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice27Collaborators : Jan Hoffman, Philipp Kartzer, Stuttgart Uni.

Publications : IROS Workshop 2018

Prediction of human activity

- Sample Affordances

- Generate Spline Trajectories

- Evaluate the features

- Evaluate Energy/Cost for sampled frame

Affordance Sampling

[Koppula 16]

Algorithm:

Page 28: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Outline

Architectures for Reactive Manipulation

Page 29: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice29

World

Model

Trajectory

Optimization

Kinematic

ControlTorque

Control

1kHz100Hz10Hz

Distance Field, 10Hz

Closest Points, 100Hz

Perception Action

Collaborators : Nathan Ratliff, Jeannette Bogh et al., Max Planck Institute, Tübingen

Publications : IROS 2016, LBR-ICRA 2017, RA-L 2018

300+ experiments on 4 dynamic

scenarios with dynamical geometries

Integrating feedback at multiple time scales

Page 30: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice30

Configuration Space

Smoothness

Collision

Terminal Potential

Joint Limits

Joint Postural

Task Space

Smoothness

Constraints Objectives

Compute Lagrange Multipliers

Unconstrained Proxy

Local Approximation

Optimization Step

Constraints

Objectives

X �

ξ =⇥q1 . . . qN

⇤T

Trajectory vector

Local Optimizer

while not StopCondition() do

v f(ξi) ;

gξ rf(ξi) ;

ξi+1 ξi � ηiB−1i gξi

;

Collaborators : Nathan Ratliff et al., Max Planck Institute, Tübingen

Publications : IROS 2016

“Gauss-Newton” for motion planning

Page 31: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Slide Jim Mainprice31

Modeling the Environment

Collaborators : Nathan Ratliff et al., Max Planck Institute, Tübingen

Publications : IROS 2016, LBR-ICRA 2017, RA-L 2018

Smooth occupancy grids

using tri-cubic splines

Page 32: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Conclusion

- Motion planning for manipulation ➡ High-DOF, Continuous state-space

➡ Sampling-based + local methods

- Human-awareness requires a predictive

models of human motion ➡ safety, comfort (i.e, social awareness)

- Challenges: ➡ Model environment : affordances

➡ Model awareness : theory of mind

➡ Hierarchical representations

Page 33: Jim Mainprice, PhD - GitHub Pages · Slide Jim Mainprice Goalset Stochastic Inverse Optimal Control Demonstration Trajectory Sampling Learned cost landscape Learning Prediction PIIRL

Questions

Collaborators Past and Present

Rachid Alami, Thierry Siméon

LAAS-CNRS, Toulouse

Dmitry Berenson, Sonia Chernova,Rafi Haynes, Paul Oh

Worcester Polytechnic Institute, MA, USA

Nathan Ratliff, Daniel Kappler, Jeannette Bogh, Stefan Schaal Max Planck Institute, Tübingen

Martin GieseUniversité de Tübingen

Philipp Krazner, Yoojin Oh, Marc ToussaintUniversité de Stuttgart