
Online Bayesian Changepoint Detection for Articulated Motion Models

Scott Niekum1,2 Sarah Osentoski3 Christopher G. Atkeson1 Andrew G. Barto2

Abstract— We introduce CHAMP, an algorithm for online Bayesian changepoint detection in settings where it is difficult or undesirable to integrate over the parameters of candidate models. CHAMP is used in combination with several articulation models to detect changes in articulated motion of objects in the world, allowing a robot to infer physically-grounded task information. We focus on three settings where a changepoint model is appropriate: objects with intrinsic articulation relationships that can change over time, object-object contact that results in quasi-static articulated motion, and assembly tasks where each step changes articulation relationships. We experimentally demonstrate that this system can be used to infer various types of information from demonstration data including causal manipulation models, human-robot grasp correspondences, and skill verification tests.

I. INTRODUCTION

Many manipulation problems in robotics are characterized by articulation properties of objects such as rotating doors, sliding drawers, and rigidly connected parts. Several methods have recently been proposed to identify articulation relationships from visual observations [10], [16] but have assumed that these relationships remain static over time. However, much articulated motion is of a changing nature—a door can both rotate and then lock rigidly in place; a loose grasp allows for quasi-static revolute and prismatic motion of an object when external force is applied; assembly procedures aim to change articulation relationships between parts with each step. Learning generalizable policies for many tasks requires understanding the nature of these changing relationships and identifying what brings about these changes. To address this, we propose an online method to detect changes in the underlying articulation models that are generating the robot’s observations and infer the parameters of each of these models.

First, we introduce an algorithm for online Bayesian changepoint detection in settings where it is difficult to analytically or numerically integrate over the parameters of candidate models. Building on the work of Fearnhead and Liu [7], we show that with some modifications, approximate online Bayesian changepoint detection can be performed using estimates of the maximum likelihood parameters for each segment—for example, via regression or a sample consensus method. Our modifications also remove a significant restriction on model definition when detecting parameter

1 Scott Niekum and Christopher G. Atkeson are with the Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA [email protected]

2 Scott Niekum and Andrew G. Barto are with the School of Computer Science, University of Massachusetts Amherst, Amherst, MA 01020, USA

3 Sarah Osentoski is with Robert Bosch Research and Technology Center, Palo Alto, CA 94304, USA

changes within a single model. We call this new algorithm CHAMP (Changepoint detection using Approximate Model Parameters). This algorithm is a general contribution that is useful beyond finding changes in articulation models.

Next, we use CHAMP in conjunction with a set of articulation models (revolute, prismatic, and rigid) to detect changes in the articulated motion of objects in the world and to infer the parameterized models that describe those articulations. We examine three settings where a changepoint model is appropriate: objects with intrinsic articulation relationships that can change over time, object-object contact that results in quasi-static articulated motion, and assembly tasks where each step changes articulation relationships. Experiments are performed in a learning from demonstration setting and show how this system can be used to infer various types of information from demonstration data, including causal manipulation models, human-robot grasp correspondences, and skill verification tests.

II. RELATED WORK

A. Changepoint Detection

Hidden Markov Models (HMMs) are largely the de facto tool of choice when analyzing time series data, but the standard HMM formulation has several undesirable properties. The number of hidden states must be known ahead of time (or chosen using model selection), inference is often costly and subject to local minima when algorithms like Expectation-Maximization are used, and segment lengths are inherently geometrically distributed. Nonparametric Bayesian models like the HDP-HMM [8] relax some of these conditions, but incur a new set of challenges, including the need for MCMC-based inference. In settings where the primary objective is to identify model changes without considering shared hidden states across segments, changepoint detection methods can be a more appropriate algorithmic choice.

Frequentist approaches to changepoint detection and piecewise regression include methods such as PELT [11] that can perform exact inference in linear time over a wide range of cost functions. Alternately, Chopin [5] introduces a Bayesian changepoint detection algorithm that uses a recursive filtering approach, but requires MCMC steps for parameter inference. Building on this work, Fearnhead and Liu present an approximate Bayesian changepoint detection algorithm [7] that can perform online inference efficiently, finding the distribution of locations of the changepoints and the model parameters of each segment using computational time linear in the number of observations. However, this work requires that model parameters can be marginalized.


Other approaches to multiple model fitting have been proposed, such as MultiRANSAC [18], but cannot take advantage of the time-series nature of our setting.

B. Articulation Models

Many previous methods have been proposed to identify static articulation relationships between objects from sensor observations. Two of the most salient recent examples are the work of Katz and Brock [10] and Sturm et al. [16]. Katz and Brock use interactive perception to actively manipulate objects and visually identify object parts and articulation relationships. Visual features are identified and tracked in the scene to identify rigid bodies and a kinematic graph of connections between parts described by planar kinematic models. Sturm et al. infer kinematic graphs that describe fully 3-D articulation relationships between object parts by analyzing the movement of known objects or object parts. Rather than tracking visual features, they either track visual fiducials or use a markerless framework to detect rectangular objects. In addition to rigid, revolute, and prismatic models, they also allow for the discovery of more complex articulations (like the straight-and-curved track of a garage door opening) using a Gaussian process model. To the best of our knowledge, we present the first work that addresses articulation relationships that can change over time.

C. Learning from Demonstration

Learning from Demonstration (LfD) methods [2], [3] have emerged as a fast, effective way to provide robots with human insight into tasks. To date, the majority of these methods have focused on approaches to policy learning, including trajectory libraries [14], inverse reinforcement learning [1], and dynamic movement primitives [9]. By contrast, other recent work has used LfD to learn information about tasks and objects in the world that is not directly policy-related. Examples of this include incremental learning of object-centric skills and global task structure [13], inference of object affordances from demonstration data [12], and the discovery of articulated relationships between objects [16].

While the above types of information do not encode a policy directly, they provide an understanding of tasks and the world that can be used to generate generalized behavior. Direct policy learning can be highly effective for monolithic actions like a tennis swing [15], but our belief is that complex manipulation tasks are best approached by learning physically interpretable task-related qualities, which can be used to generate appropriate behavior in any given situation.

III. CHANGEPOINT DETECTION USING APPROXIMATE MODEL PARAMETERS

A. Online MAP Changepoint Detection

First, we describe the online MAP (maximum a posteriori) changepoint detection model of Fearnhead and Liu [7]. Assume we have time-series observations y1:n = (y1, y2, . . . , yn) and a set of candidate models Q. Our goal is to infer the MAP set of changepoint times τ1, τ2, . . . , τm, with τ0 = 0 and τm+1 = n, giving us m + 1 segments. Thus, the ith segment consists of observations yτi+1:τi+1 and has an associated model qi ∈ Q with parameters θi.

We assume that data after a changepoint is independent of data prior to that changepoint, and we model the changepoint positions as a Markov chain in which the transition probabilities are defined by the time since the last changepoint:

p(τi+1 = t|τi = s) = g(t− s), (1)

where g(·) is a probability distribution over time and G(·) is its cumulative distribution function.

Given a segment from time s to t and a model q, define the model evidence for that segment as:

L(s, t, q) = p(ys+1:t|q) = ∫ p(ys+1:t|q, θ) p(θ) dθ. (2)
It can be shown how the standard Bayesian filtering recursions and an online Viterbi algorithm can be used to efficiently estimate Ct, the distribution over the position of the first changepoint prior to time t [7]. Define Ej as the event that given a changepoint at time j, the MAP choice of changepoints has occurred prior to time j and define:

P_t(j, q) = p(Ct = j, q, Ej, y1:t) (3)
P_t^MAP = p(Changepoint at t, Et, y1:t). (4)

This results in the equations:

P_t(j, q) = (1 − G(t − j − 1)) L(j, t, q) p(q) P_j^MAP (5)

P_t^MAP = max_{j,q} [ g(t − j) / (1 − G(t − j − 1)) · P_t(j, q) ]. (6)

At any point, the Viterbi path can be recovered by finding the (j, q) values that maximize P_t^MAP. This process can then be repeated for the values that maximize P_j^MAP, until time zero is reached. A straightforward alternate formulation [7] allows for the simulation of the full posterior distribution of changepoint locations, though in this work, we focus only on the MAP changepoints.
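To make the recursion concrete, the following is a minimal sketch (our own toy code, not the authors' implementation) of Equations 5 and 6 for a single hypothetical model: Gaussian segments with unknown mean and unit variance, scored with the segment's maximum likelihood mean as a stand-in for true parameter marginalization. With a single model, p(q) = 1, and the (1 − G) factor in Equation 5 cancels against the one in Equation 6, leaving only g(t − j):

```python
import math

def seg_loglik(y, s, t):
    """ln p(y[s+1..t] | q, theta_hat): Gaussian with unit variance and
    the segment's ML mean (a stand-in for parameter marginalization)."""
    seg = y[s:t]
    mu = sum(seg) / len(seg)
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (x - mu) ** 2 for x in seg)

def map_changepoints(y, g):
    """O(n^2) version of the recursion in Eqs. (5)-(6); [7] prunes this
    to constant work per step with particles."""
    n = len(y)
    log_pmap = [0.0] * (n + 1)   # ln P_t^MAP, with ln P_0^MAP = 0
    back = [0] * (n + 1)         # maximizing changepoint position j
    for t in range(1, n + 1):
        best, best_j = -math.inf, 0
        for j in range(t):
            # ln of Eq. (5) combined with Eq. (6); (1 - G) terms cancel
            v = math.log(g(t - j)) + seg_loglik(y, j, t) + log_pmap[j]
            if v > best:
                best, best_j = v, j
        log_pmap[t], back[t] = best, best_j
    # Recover the Viterbi path by repeatedly jumping to the maximizing j
    cps, t = [], n
    while t > 0:
        t = back[t]
        if t > 0:
            cps.append(t)
    return sorted(cps)
```

On a sequence with a single mean shift, the recursion recovers one changepoint at the shift; shorter spurious segments are suppressed by the ln g(t − j) penalty each extra segment incurs.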

The algorithm is fully online, but requires O(n) computation at each time step, since P_t(j, q) values must be calculated for all j < t. To reduce computation time to a constant, ideas from particle filtering can be leveraged to keep only a constant number of particles, M, at each time step, each of which represents a support point in the approximate density p(Ct = j, y1:t). At each time step, if the number of particles exceeds M, stratified optimal resampling [7] can be used to choose which particles to keep in a manner that minimizes the KL divergence from the true distribution in expectation.
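To illustrate only the interface of the particle-capping step, here is a stand-in sketch. Note that it uses plain systematic resampling, not the stratified optimal resampling of [7], which additionally keeps high-weight particles deterministically; treat it as a simplified placeholder:

```python
import random

def systematic_resample(particles, weights, M, rng=random.random):
    """Subsample M particles with probability proportional to weight.
    A simplified stand-in for stratified optimal resampling [7]."""
    total = sum(weights)
    step = total / M
    u = rng() * step                 # single random offset in [0, step)
    kept, cum, i = [], 0.0, 0
    for k in range(M):
        target = u + k * step        # evenly spaced positions in [0, total)
        while cum + weights[i] < target:
            cum += weights[i]        # advance past lighter particles
            i += 1
        kept.append(particles[i])
    return kept
```

High-weight particles occupy wide cumulative intervals and so survive (possibly duplicated), while low-weight particles are usually dropped.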

B. CHAMP

The model evidence shown in Equation 2 requires that the parameters of the underlying model can be marginalized. This requires the use of either conjugate priors, allowing analytical integration, or a low dimensional parameter space that can be efficiently numerically integrated. However, many models do not fit into either of these categories, requiring an alternate solution for when only point-estimates of model parameters are available. Furthermore, marginalization of the model parameters can prevent the detection of changepoints in which the model stays the same, but the parameters of the model change. This can happen when the model being considered treats each data point as independent; since the likelihood can be factorized into a product and the model parameters are marginalized, the likelihood function shows no preference for multiple segments in the case of a parameter change within a model.

For example, imagine generating a set of independent data points under model q with parameters θab for ya:b and parameters θbc for yb:c. Despite the different underlying parameters for each segment,

∫ p(ya:c|q, θ) p(θ) dθ = ∫ p(ya:b|q, θ) p(θ) dθ ∫ p(yb:c|q, θ) p(θ) dθ.

Notice that this is not the case for some models, such as the autoregressive models originally used by Fearnhead and Liu [7].

We present CHAMP (Changepoint detection using Approximate Model Parameters)—a modified version of Fearnhead and Liu’s changepoint algorithm that allows the use of models of any form (with independent emissions or otherwise), in which parameter estimates are available via means such as maximum likelihood fit, MCMC, or sample consensus methods. We propose three primary changes to best accommodate this new setting.

1) Approximate model evidence: The Bayesian Information Criterion (BIC) is a well-known approximation to integrated model evidence [4] that provides a principled penalty against more complex models by assuming a Gaussian posterior distribution of parameters around the estimated parameter value θ̂. Using the BIC, the model evidence can be approximated as:

ln L(s, t, q) ≈ ln p(ys+1:t|q, θ̂) − (1/2) kq ln(t − s), (7)

where kq is the number of free parameters of model q. This approximation allows us to avoid directly evaluating the model evidence integral.
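As a sketch, assuming a hypothetical 1-D Gaussian segment model whose ML parameters are the sample mean and variance (so kq = 2), the BIC-penalized score of Equation 7 could be computed as:

```python
import math

def bic_log_evidence(y, s, t):
    """ln L(s, t, q) ≈ ln p(y[s+1..t] | q, theta_hat) - (k_q / 2) ln(t - s),
    for a toy Gaussian model with ML mean and variance (k_q = 2)."""
    seg = y[s:t]
    n = len(seg)
    mu = sum(seg) / n
    var = max(sum((x - mu) ** 2 for x in seg) / n, 1e-12)  # ML variance, floored
    loglik = sum(-0.5 * math.log(2 * math.pi * var)
                 - (x - mu) ** 2 / (2 * var) for x in seg)
    k_q = 2  # free parameters: mean and variance
    return loglik - 0.5 * k_q * math.log(n)
```

The penalty grows with segment length, so a more complex model must buy a genuinely better fit to win the comparison.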

2) Minimum segment length: Since we are now assuming that parameter estimates come from some type of model fitting procedure, the quantity L(s, t, q) is no longer well-defined for all t > s. Instead, each model q has a minimum value of t − s for which the model is defined. For example, a line requires a minimum of two points to define, whereas a plane requires three. As a simplification, and to prevent overfitting, some sufficient minimum segment length α can be chosen for all models. This requires three changes: changepoints can only begin to be considered at time t = 2α (when a changepoint in the center would create two equal halves of length α), P_t(j, q) must only be calculated for values of t − j > α, and the choice of a segment length distribution g(·) must be reconsidered.

Fearnhead and Liu suggest the use of a geometric length distribution [7], as it arises naturally from a constant probability of seeing a changepoint at each time step. However, it is a monotonically decreasing distribution with a mode of 1 that favors shorter segments, which can lead to overfitting, especially in a setting with fitted model parameters. As an alternative, Chopin [5] suggests using a uniform prior over limited support to ensure it is well-defined. However, this artificially places a hard limit on segment lengths, regardless of the data. We propose the use of a truncated normal distribution, which enforces a minimum segment length naturally, has easily interpretable parameters, and is less prone to overfitting:

g(t) = [ (1/σ) φ((t − µ)/σ) ] / [ 1 − Φ((α − µ)/σ) ] (8)

G(t) = [ Φ((t − µ)/σ) − Φ((α − µ)/σ) ] / [ 1 − Φ((α − µ)/σ) ], (9)

where φ is the standard normal PDF, Φ is its CDF, and α is the minimum segment length. Since the mode of the distribution is close to the mean (or identical if no truncation occurs), segment lengths are pushed toward the mean, instead of being pushed toward 1. By using a broad value of σ, we can support a wide range of segment lengths, while leaving µ as an adjustable parameter that can be tuned if over-segmentation or under-segmentation is an issue. Alternatively, if specific prior knowledge about segment length is known, µ can be set accordingly with a narrower value of σ to restrict segment length appropriately.
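A direct transcription of Equations 8 and 9 (with the truncated normal normalized over [α, ∞), as assumed above) needs only the standard library:

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def g_trunc(t, mu, sigma, alpha):
    """Truncated normal PDF over segment lengths t >= alpha (Eq. 8)."""
    if t < alpha:
        return 0.0
    z_norm = 1.0 - norm_cdf((alpha - mu) / sigma)
    return (norm_pdf((t - mu) / sigma) / sigma) / z_norm

def G_trunc(t, mu, sigma, alpha):
    """Truncated normal CDF over segment lengths (Eq. 9)."""
    if t < alpha:
        return 0.0
    z_norm = 1.0 - norm_cdf((alpha - mu) / sigma)
    return (norm_cdf((t - mu) / sigma) - norm_cdf((alpha - mu) / sigma)) / z_norm
```

With µ = 50, σ = 10, and α = 2 (the 1-D Gaussian experiment's settings), the mass concentrates around length 50 rather than piling up at 1 as a geometric prior would.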

3) Particle definition: Finally, since model fitting can be an expensive procedure, we suggest a slight revision of the definition of a particle from that of Fearnhead and Liu. Previously, each particle represented a support point to approximate the joint distribution p(Ct = j, y1:t), marginalizing over models q. To potentially save on the number of required model fits, we suggest each particle also include the model, so that our approximated distribution is p(Ct = j, q, y1:t), allowing particular models to be selectively discarded at each time step. This also prevents us from overlooking the possibility of a changepoint at a given time step when only one model is a reasonable fit and the others are very poor.

Figure 1 provides pseudocode for CHAMP. Additionally, an open-source implementation of CHAMP as a ROS service is available online1.

IV. ARTICULATION CHANGEPOINT DETECTION

In order to use CHAMP to detect changes in observed articulated motion between objects, we require an articulation likelihood model and a parameter estimation method for any particular data segment. In this work, we use the articulation models of Sturm et al. [16] and MLESAC [17] for parameter estimation. Define the relative transform between two objects

1REDACTED_FOR_BLIND_REVIEW


Input: Observations y1:n, candidate models q1, . . . , qr, prior distribution π(q), minimum segment length α, and maximum number of particles M.
Output: Viterbi path of changepoint times and models

// Initialize data structures
1: max_path, prev_queue, particles = {}
2: prev_queue.push(1/r)
3: for i = 1 : r do
4:     new_p = newParticle(pos = 0, model = qi, prev_MAP = 1/r)
5:     particles.add(new_p)
6: end for

// Do for all incoming data, starting at time α
7: for t = α : n do
       // Add new particles
8:     if t >= 2α then
9:         prev = prev_queue.pop()    // P^MAP from α steps ago
10:        for i = 1 : r do
11:            new_p = newParticle(pos = t − α, model = qi, prev_MAP = prev)
12:            particles.add(new_p)
13:        end for
14:    end if
       // Compute fit probabilities for all particles
15:    for p ∈ particles do
16:        p_tjq = L(p.pos, t, p.model) · π(p.model) · p.prev_MAP
17:        p.MAP = g(t − p.pos) · p_tjq
18:    end for
       // Find max particle and update Viterbi path
19:    max_p = argmax_p p.MAP
20:    prev_queue.push(max_p.MAP)
21:    max_path.add(j = max_p.pos, q = max_p.model)
       // Resample if too many particles
22:    if particles.length > M then
23:        particles = stratOptResample(particles, M)
24:    end if
25: end for

// Recover the Viterbi path
26: v_path = {}
27: curr_cp = n
28: while curr_cp > 0 do
29:     ⟨j, q⟩ = max_path[curr_cp − α]
30:     v_path.add(start = j, end = curr_cp, model = q)
31:     curr_cp = j
32: end while
33: return v_path

Fig. 1. CHAMP

with poses xi and xj ∈ SE(3) as ∆ij = xi ⊖ xj2. Given covariance Σy, the observation model is defined as:

y ∼ { ∆ + N(0, Σy)  if v = 1
    { U             if v = 0, (10)

where the probability of the observation being an outlier is p(v = 0) = γ, in which case it is drawn from a uniform distribution U. The data likelihood is then defined as:

p(y|∆) = p(y|∆, γ)p(γ) (11)

with:

p(y|∆, γ) = (1− γ)p(y|v = 1) + γp(y|v = 0) (12)

p(γ) ∝ e−wγ , (13)

where w is a weighting constant. This formulation allows for the overall data likelihood to be only minimally affected by a small number of outliers.
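A minimal 1-D sketch (our own simplification) of the mixture in Equation 12, with the uniform outlier component taken over an assumed support of width R:

```python
import math

def obs_likelihood(y, delta, sigma, gamma, support):
    """p(y | delta, gamma) = (1 - gamma) * N(y; delta, sigma^2) + gamma * (1 / support).
    `support` is the assumed width R of the uniform outlier component."""
    inlier = math.exp(-0.5 * ((y - delta) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    outlier = 1.0 / support
    return (1.0 - gamma) * inlier + gamma * outlier
```

A gross outlier contributes at worst ln(γ/R) to the log-likelihood, rather than the large quadratic penalty a pure Gaussian model would impose, which is what keeps the segment score robust.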

We then use three candidate articulation models Mrigid, Mprismatic, and Mrevolute to describe the relationship between two objects. Define p(y|M, θ) as the likelihood function of the observed relative transform y between two objects under model M with parameters θ. The rigid model has 6 parameters that encode the rigid transform between the two objects. The prismatic model has 8 parameters which define the transform between the base object and the origin of the prismatic joint and its axis. The revolute model has 9 parameters that define the transform between the base object and the center of rotation, and a transform from the center of rotation to the moving point.

The full derivation of the forward and inverse kinematic functions defined by these articulation models is outside the scope of this work; for more details, see Sturm et al. [16]. However, each model defines functions of the form:

f⁻¹_{M,θ}(y) = c (inverse kinematics)
f_{M,θ}(c) = ∆ (forward kinematics),

where c is a configuration (e.g., a scalar position on a prismatic axis, or an angle in radians about an axis of rotation). These can be used to project an observation y onto a model, giving us a predicted relative pose ∆̂, and allowing us to define a likelihood function, using the observation model introduced in Equation 11:

p(y1:n|M, θ̂) = ∏_{t=1}^{n} p(yt|∆̂t). (14)

Finally, we use these models to define our BIC-penalized likelihood function for CHAMP:

ln L(s, t, q) ≈ ln p(ys+1:t|Mq, θ̂) − (1/2) kq ln(t − s), (15)

where estimated parameters θ̂ are inferred using MLESAC (Maximum Likelihood Estimation Sample Consensus) [17].

2 For example, if xi, xj ∈ R4×4 are homogeneous matrices, then xi ⊖ xj = xi⁻¹ xj.


Fig. 2. 5 segments of mean-zero Gaussian data with changing variance: an accurate segmentation by CHAMP (left) and an inaccurate segmentation using Fearnhead and Liu’s original algorithm (right).

MLESAC is a sample consensus method that generates a guess for the model parameters θ by randomly choosing a minimal set of points that can be used to define the model. After generating many guesses, the guess that maximizes the data likelihood is retained and can be improved iteratively with a nonlinear optimization method, BFGS, and by estimating the outlier ratio γ with Expectation-Maximization.
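The following toy sketch (ours, not the authors' implementation, and omitting the BFGS and EM refinement steps) shows the MLESAC pattern for a hypothetical 1-D model "y = offset": a minimal set is a single point, and candidates are scored by the full mixture likelihood of Equation 12 rather than by an inlier count as in plain RANSAC:

```python
import math
import random

def mlesac_offset(ys, sigma=0.1, gamma=0.2, support=10.0, iters=50, seed=0):
    """Pick the candidate offset (drawn from minimal sets) that maximizes
    the robust mixture log-likelihood over all observations."""
    rng = random.Random(seed)

    def score(offset):
        total = 0.0
        for y in ys:
            inlier = (math.exp(-0.5 * ((y - offset) / sigma) ** 2)
                      / (sigma * math.sqrt(2 * math.pi)))
            total += math.log((1 - gamma) * inlier + gamma / support)
        return total

    best, best_score = None, -math.inf
    for _ in range(iters):
        cand = rng.choice(ys)   # minimal set: one point defines this model
        s = score(cand)
        if s > best_score:
            best, best_score = cand, s
    return best
```

Because each outlier's contribution is floored at γ/support, a candidate drawn from an outlier scores poorly on every inlier and loses, so the estimate stays near the consensus value.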

V. EXPERIMENTS

A. 1-D Gaussian Experiment

First, we present an experiment to demonstrate the ability of CHAMP to reliably detect changepoints using maximum likelihood parameter estimates for models with independent emissions. Five segments of data were generated (of lengths 40, 60, 30, 50, and 70) by making draws from a zero-mean Gaussian distribution with parameterized variance (σ = 2.0, 1.0, 3.0, 1.5, and 2.5), shown in the left panel of Figure 2. CHAMP was then used to try to recover the locations of the changepoints with the following parameters: a truncated Gaussian length distribution with µ = 50 and σ = 10, a minimum segment length of 2, and 100 maximum particles. We compared this analysis with that of the original Fearnhead and Liu algorithm under the same parameters (where applicable) by integrating the likelihood function with a conjugate Gamma prior on the precision τ, and setting hyperparameters a = 4.0, b = 0.5:

L(s, t, q) = ∫₀^∞ ∏_{i=s+1}^{t} N(xi|µ = 0, τ⁻¹) Gam(τ|a, b) dτ = [ b^a Γ(a + n/2) / ( Γ(a) (2π)^{n/2} ) ] ( b + (1/2) ∑_{i=s+1}^{t} xi² )^{−(a+n/2)},

where n = t − s.

The center panel of Figure 2 shows a segmentation of the data by CHAMP that correctly divides the data into 5 segments. Identical changepoint locations were found in all 100 runs of CHAMP (the stratified optimal resampling step can introduce stochasticity), and were all found to be within 2 data points of the true changepoint locations. The right panel of Figure 2 shows the failure of Fearnhead and Liu’s algorithm to properly detect parameter switches within the single model, since the data emissions were independent. This result held across a wide range of parameter settings, as it is a fundamental deficit in the original algorithm.
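For reference, the conjugate-prior marginal likelihood used for the baseline comparison can be sketched directly (our code; we use the general n-point closed form, which we assume is what the per-segment integral evaluates to):

```python
import math

def log_marginal(xs, a=4.0, b=0.5):
    """ln of the marginal likelihood of zero-mean Gaussian data with a
    conjugate Gamma(a, b) prior on the precision, integrated analytically."""
    n = len(xs)
    ssq = sum(x * x for x in xs)
    return (a * math.log(b) + math.lgamma(a + n / 2) - math.lgamma(a)
            - (n / 2) * math.log(2 * math.pi)
            - (a + n / 2) * math.log(b + ssq / 2))
```

Because this marginal depends on the data only through the sum of squares, scoring a whole segment or its concatenated sub-segments uses the same statistic, which is exactly why the original algorithm is indifferent to within-model variance changes here.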

B. Articulation Experiments

We now present three scenarios in which CHAMP is

used to detect changes in articulated motion between objects during task demonstrations. The timing of these changes and the parameters of each articulated motion are then used to infer physically-grounded information about the task and the associated objects. In each experiment, a human demonstration is observed via an RGB-D sensor mounted on a PR2 mobile manipulator. We have restricted each scene to contain only two relevant objects, or object parts, which are marked with AR-tags, a type of visual fiducial that allows for easy recovery of the 6-DOF object pose. Data is collected at 10 Hz, recording the relative pose of the two objects y ∈ SE(3), and ignoring the data point if either object is not visible.

For all experiments, the following parameters were used for CHAMP: a truncated Gaussian length distribution with µ = 100 and σ = 5, a minimum segment length of 10, and 10 maximum particles. Additionally, the following parameters were used for the articulation models: observation covariance diag(Σy) = {σpos, σorient} = {0.0075 m, π/12 rad}, 50 sample consensus iterations, and no parameter optimization, except for the final set of segment parameters, which go through 10 iterations of optimization.

In each experiment, only a single demonstration is given, emphasizing the possibility for one-shot learning when using our system, in contrast to the larger number of demonstrations often required by LfD methods that learn policies directly. In all experiments, segmentation was performed 10 times, always yielding highly similar results as in the Gaussian example. As a proof-of-concept, the changepoint times, models, and model parameters discovered by CHAMP are used to infer physically-interpretable task information in each case that could be used in the future to improve decision making, planning, and policy learning.

1) Stapler Manipulation: The first experiment focuses on an object that has multiple intrinsic articulation states that can change over time—a stapler that can act as a revolute joint or lock rigidly when rotated beyond a certain point. The robot is shown a demonstration of a human manipulating the top arm of the stapler, as shown in the top row of Figure 3, first showing its ability to rotate, then pushing it into a locked position and demonstrating rigid movement of the whole object.

The left panel of Figure 4 shows the segmentation produced by CHAMP, in which each data point is a relative pose between the two tags, and points are colored to indicate the segment they belong to. Two segments were found—first a rotational segment, denoted by the red circle drawn to show all possible relative poses allowed by the model, and a rigid segment, indicated by the green sphere with radius equal to 3σpos. Thus, CHAMP was able to accurately detect the two separate articulation states.

Using this segmentation data, a simple proof-of-concept causal model can be learned to explain the cause of the articulation model change from revolute to rigid. First, all the data prior to the changepoint is projected onto the model


Fig. 3. Demonstrations of stapler manipulation (top), re-grasp of a whiteboard eraser (middle), and partial table assembly (bottom).

using its inverse kinematic equation, providing rotational configurations c. We then examine the configurations that occur in the 2-second window prior to the changepoint and check whether those configurations are outside the range of configurations observed at any other prior time; if they are unique in that sense, then we take that as evidence that those configurations may have caused the model change.
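The window test described above can be sketched as follows; the helper name and the frame-count stand-in for the 2-second window are assumptions, not part of the paper's implementation:

```python
# Sketch of the pre-changepoint uniqueness check: configurations in the
# window before the change are flagged if they fall outside the range of
# all earlier configurations. The fixed frame window approximating 2 s
# is an assumption.
def novel_window_configs(configs, changepoint_idx, window=20):
    # configs: per-frame rotational configurations c obtained by projecting
    # observations through the revolute model's inverse kinematics
    recent = configs[max(0, changepoint_idx - window):changepoint_idx]
    earlier = configs[:max(0, changepoint_idx - window)]
    if not earlier:
        return recent
    lo, hi = min(earlier), max(earlier)
    # keep only configurations never observed before the window
    return [c for c in recent if c < lo or c > hi]
```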

The left column of Figure 5 shows a green cone of configurations (where a configuration angle corresponds to the location of the center of the tag on the top arm of the stapler) that appears to cause the model change. The bottom image shows the stapler resting at a position just short of becoming locked; it can be seen that the center of the tag is just above the beginning of the cone, showing that our system has accurately learned that if the arm were pushed any further, it would likely change to a rigid, locked state. This is a simple example, but it illustrates the power of being able to identify changepoints, allowing relevant facts to be learned about objects in the world. It is easy to imagine extending this to mine a larger set of experiences for causal structure and to use additional types of data, like force, to understand, for example, how much force is required to unlock the stapler and make it rotate again.

2) Eraser Regrasp: The next experiment examines the case of frictional contact that can create quasi-static articulated motion—a loose grasp of an object can produce rigid motion with the hand, but can also afford revolute and prismatic motion when external forces are applied. We are motivated by the concept of extrinsic dexterity [6], in which in-hand regrasps of objects are accomplished by leveraging forces like gravity and contact with external objects, rather than relying on complex, high-DOF manipulation with the hand. Thus far, such re-grasps have been hand-coded for particular objects [6], but we provide a proof-of-concept that it is possible to learn these re-grasps from demonstration by identifying changes in articulation models.

The robot is shown a demonstration of a re-grasp of a whiteboard eraser, as shown in the center row of Figure 3: it begins with a side-grasp from a table top, and then uses the table to rotate the eraser 90 degrees, followed by a repositioning, and another assist from the table to prismatically slide the grasp to the center of mass so that the grasp is suitable for using the eraser. The center panel of Figure 4 shows the segmentation produced by CHAMP, in which 3 segments are found—a rotation, a rigid segment (moving into position for the next table contact), and a prismatic shift.

It would be desirable for the robot to be able to reproduce this behavior, possibly in a new environment, after seeing the demonstration, but there is the correspondence problem, in which the robot does not know how its body corresponds to that of the demonstrator. For example, from the robot's point of view, it only saw two tags moving around, and has no concept of hands, fingers, or the grasp locations utilized by the demonstrator. However, using the segmentation data, we can work backwards and infer the human's initial grasp location, without resorting to computer vision or other more complicated means.

Examining the first rotational segment, we simply identify the perpendicular axis that goes through the center of rotation, giving us the initial grasp location. The center column of Figure 5 shows a green arrow representing this axis, which can serve as an approximately correct grasp location at the edge of the eraser. From this initial location, it would then be trivial to calculate where to apply forces on the eraser to produce the desired motion from the demonstration, and also


Fig. 4. Segmentation via CHAMP of the stapler data (left), eraser data (center), and table data (right). Data points (relative poses), along with their corresponding articulation, are colored by segment. A colored sphere indicates a rigid model, where the radius corresponds to 3σpos in the observation model. A line and circle illustrate a fitted prismatic and revolute model, respectively.

Fig. 5. (Left) The green cone shows the range of configurations that are associated with a change from a revolute to a rigid model. (Center) The green arrow indicates the estimated initial grasp position used by the human demonstrator. (Right) The purple arrows indicate the directions (relative to a fixed table coordinate frame) in which rigidity between the parts has been observed at various times during the task. When the leg/screw is inserted, rigidity in the plane is observed (top), but after the leg is screwed in, there is also rigidity out of the plane (bottom).

how the expected grasp location would change, giving us a full map of the re-grasp to follow in the future.
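One way to recover such a rotation axis from the positions in a rotational segment is a plane fit followed by a circle fit. The sketch below assumes clean, inlier positions sampled around the joint; it is an illustrative alternative to the sample-consensus fitting used in our system, and the function name is ours:

```python
import numpy as np

# Estimate the rotation axis of a revolute segment: fit the plane of
# motion (its normal is the axis direction), then fit a circle in that
# plane to locate the center of rotation. The grasp line is the axis
# through the fitted center. Assumes low-noise positions for clarity.
def estimate_grasp_axis(positions):
    # positions: (N, 3) array of tag positions from the rotational segment
    centroid = positions.mean(axis=0)
    centered = positions - centroid
    # plane normal = right singular vector of the smallest singular value
    _, _, vt = np.linalg.svd(centered)
    axis = vt[-1]
    # project into the plane and fit a circle by algebraic least squares:
    # x^2 + y^2 = 2a x + 2b y + c, center = (a, b)
    basis = vt[:2]                      # two in-plane directions
    xy = centered @ basis.T             # (N, 2) in-plane coordinates
    A = np.hstack([2 * xy, np.ones((len(xy), 1))])
    b = (xy ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center3d = centroid + basis.T @ sol[:2]
    return center3d, axis               # grasp line: center3d + t * axis
```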

3) Table Assembly: The final experiment involves an assembly task, in which each step of the task intentionally changes the articulation relationships between objects. Our goal is to analyze demonstrations to discover the underlying number of steps and the associated articulation changes. These expected changes in articulation relationships can then be used to automatically design kinematic tests that can be used to verify whether an assembly step has been successfully accomplished.

The robot is shown an example of a partial table assembly task, as shown in the bottom row of Figure 3. First, the table leg and screw are inserted into the main table surface. The demonstrator then moves the leg in the plane, causing the table to move as well, showing the solid connection that has been made. The demonstrator then screws the leg in until it is rigid and again moves the leg, this time showing rigidity out of the plane as well. We preprocess the data, identifying segments of free-body movement when the leg and table are too far apart to be touching. The right panel of Figure 4 shows the segmentation produced by CHAMP, in which 6 segments are found—a free-body segment (the insertion approach), a rigid segment (the movement in the plane), 3 rotational segments during the screwing-in process (as the screw went further in, the center of rotation got lower, and CHAMP correctly detected this as a change in parameters), and a final rigid segment (the movement in and out of the plane).
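The free-body preprocessing step mentioned above amounts to a simple distance test between the two tags; the threshold value and function name below are illustrative assumptions:

```python
import math

# Label frames as free-body movement when the tags are too far apart for
# the parts to be touching; the 0.3 m threshold is an assumed value.
def label_free_body(relative_positions, contact_threshold=0.3):
    # relative_positions: (x, y, z) offsets between the leg tag and table tag
    return [math.dist(p, (0.0, 0.0, 0.0)) > contact_threshold
            for p in relative_positions]
```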

In sequential tasks that require precision, it is often important to verify that each step has been completed correctly before moving on to the next step. In this case, the rigid


data segments can be leveraged by using them as templates for verification tests. Each rigid segment is analyzed to determine in which axis-aligned directions rigid movement between the objects was actually observed, and in which directions no movement was observed, using a threshold to account for noise.
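The per-axis analysis can be sketched as a range test with a noise threshold; the 2 cm threshold and the function name are assumptions for illustration:

```python
# Report the axis-aligned directions in which movement exceeded a noise
# threshold during a rigid segment; 0.02 m is an assumed threshold.
def observed_rigid_directions(positions, noise_threshold=0.02):
    # positions: list of (x, y, z) object positions recorded during the segment
    moved = []
    for i, name in enumerate(("x", "y", "z")):
        values = [p[i] for p in positions]
        if max(values) - min(values) > noise_threshold:
            moved.append(name)
    return moved
```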

The right column of Figure 5 shows purple arrows that have been drawn relative to the table to illustrate the directions in which rigid motion was observed after the insertion (top image, arrows only in the x-y plane) and after fully screwing in the leg (bottom image, arrows in the x-y plane and in the z-direction as well). We see that our approach is able to infer that after the insertion, rigidity is expected in the plane of the table, but not out of it (since that would pull out the screw), and that after the screw has been tightened, rigidity is expected in all directions. In the future, such information could be used to automatically design skill verification tests—for example, applying force in each direction in which rigidity is expected and checking visually or through force sensing for the appropriate response.

VI. DISCUSSION AND CONCLUSION

Many everyday tasks in robotics are dependent on understanding the changing physical relationships between objects in the world. We introduced a general-purpose changepoint detection algorithm, CHAMP, that extends Bayesian changepoint detection to settings in which it is difficult to integrate out the parameters of candidate models, or when it is desirable to be able to detect changes in the parameters of a single model with independent emissions. We then used CHAMP in conjunction with several models of articulated movement to identify changes in articulation relationships between observed objects in the world. This changepoint information was then used in several proof-of-concept systems to infer physically grounded, task-relevant information that could be used to assist in future robotic decision making and policy generation. Specifically, we inferred a simple causal manipulation model, solved a human-robot grasp correspondence problem, and automatically generated skill verification tests for a task.

More generally, we view this work as one particular example of how robots can learn about the physical world from demonstration data to inform robust decision making, rather than learning potentially brittle policies directly from demonstration. When using structured models that reflect prior knowledge about the world (in this case, a limited number of basic articulation types), we show that this kind of inference can be performed robustly with very little data.

Our application of CHAMP advances the state of the art in both changepoint detection and articulation identification, but leaves several interesting issues to be addressed in future work. First, we could consider additional articulation models (such as a helical screw), compositions of models, or even use CHAMP on entirely different types of data, like force measurements. Second, the MLESAC method of fitting articulation models is entirely passive and can be tricked easily, either by a malicious demonstrator or by accident.

For example, if two unconnected objects move only along a line with respect to each other, our method will assume that a prismatic joint connects them, despite having no evidence that they are connected rigidly in other directions. To remedy this, future work could include a model of uncertainty over various hypotheses, and could additionally use active learning to allow the robot to actively reduce uncertainty and learn about articulation relationships autonomously. Finally, though CHAMP is a fully online model, we did not take advantage of its online capabilities; it could be used to implement and learn do-until behaviors, continuous anomaly/failure checking, and a number of other functions to support robot decision making.

REFERENCES

[1] P. Abbeel and A. Y. Ng. Apprenticeship learning via inverse reinforcement learning. In Proceedings of the Twenty-First International Conference on Machine Learning, pages 1–8, 2004.

[2] B. Argall, S. Chernova, M. Veloso, and B. Browning. A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5):469–483, 2009.

[3] A. Billard, S. Calinon, R. Dillmann, and S. Schaal. Handbook of Robotics, chapter Robot programming by demonstration. Springer, Secaucus, NJ, USA, 2008.

[4] C. M. Bishop. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer, 2007.

[5] N. Chopin. Dynamic detection of change points in long time series. Annals of the Institute of Statistical Mathematics, 59(2):349–366, 2007.

[6] N. C. Dafle, A. Rodriguez, R. Paolini, B. Tang, S. S. Srinivasa, M. Erdmann, M. T. Mason, I. Lundberg, H. Staab, and T. Fuhlbrigge. Extrinsic dexterity: In-hand manipulation with external forces. In International Conference on Robotics and Automation, 2014.

[7] P. Fearnhead and Z. Liu. On-line inference for multiple changepoint problems. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(4):589–605, 2007.

[8] E. Fox, E. Sudderth, M. Jordan, and A. Willsky. An HDP-HMM for systems with state persistence. In Proceedings of the 25th International Conference on Machine Learning, pages 312–319. ACM, 2008.

[9] A. Ijspeert, J. Nakanishi, and S. Schaal. Learning attractor landscapes for learning motor primitives. Advances in Neural Information Processing Systems 16, pages 1547–1554, 2003.

[10] D. Katz and O. Brock. Extracting planar kinematic models using interactive perception. In Unifying Perspectives in Computational and Robot Vision, pages 11–23. 2008.

[11] R. Killick, P. Fearnhead, and I. Eckley. Optimal detection of changepoints with a linear computational cost. Journal of the American Statistical Association, 107(500):1590–1598, 2012.

[12] L. Montesano, M. Lopes, A. Bernardino, and J. Santos-Victor. Learning object affordances: From sensory-motor coordination to imitation. IEEE Transactions on Robotics, 24(1):15–26, 2008.

[13] S. Niekum, S. Chitta, B. Marthi, S. Osentoski, and A. G. Barto. Incremental semantically grounded learning from demonstration. In Robotics: Science and Systems, 2013.

[14] P. Pastor, M. Kalakrishnan, L. Righetti, and S. Schaal. Towards associative skill memories. In IEEE-RAS International Conference on Humanoid Robots, 2012.

[15] S. Schaal. Dynamic movement primitives—a framework for motor control in humans and humanoid robotics. In The International Symposium on Adaptive Motion of Animals and Machines, 2003.

[16] J. Sturm, C. Stachniss, and W. Burgard. A probabilistic framework for learning kinematic models of articulated objects. Journal of Artificial Intelligence Research, 41(2):477–526, 2011.

[17] P. H. Torr and A. Zisserman. MLESAC: A new robust estimator with application to estimating image geometry. Computer Vision and Image Understanding, 78(1):138–156, 2000.

[18] M. Zuliani, C. S. Kenney, and B. Manjunath. The multiRANSAC algorithm and its application to detect planar homographies. In IEEE International Conference on Image Processing, volume 3, pages III-153, 2005.