

Active Exploration for Robust Object Detection

Javier Velez† Garrett Hemann† Albert S. Huang† Ingmar Posner‡ Nicholas Roy†

†MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
‡Mobile Robotics Group, Dept. of Engineering Science, Oxford University, Oxford, UK

Abstract

Today, mobile robots are increasingly expected to operate in ever more complex and dynamic environments. In order to carry out many of the higher-level tasks envisioned, a semantic understanding of a workspace is pivotal. Here our field has benefited significantly from successes in machine learning and vision: applications in robotics of off-the-shelf object detectors are plentiful. This paper outlines an online, any-time planning framework enabling the active exploration of such detections. Our approach exploits the ability to move to different vantage points and implicitly weighs the benefits of gaining more certainty about the existence of an object against the physical cost of the exploration required. The result is a robot which plans trajectories specifically to decrease the entropy of putative detections. Our system is demonstrated to significantly improve detection performance and trajectory length in simulated and real robot experiments.

INTRODUCTION

Years of steady progress in robotic mapping and navigation techniques have made it possible for robots to construct accurate traversability maps of relatively complex environments and to robustly navigate within them (see, for example, Newman et al., 2009). Such maps generally represent the world as regions which are traversable by a robot and regions which are not (see Fig. 1(a)) and are ideal for low-level navigation tasks such as moving through the environment without collisions. However, mobile robots are increasingly tasked to perform high-level requests in complex and dynamic environments. Sophisticated interactions between an agent and its workspace require the addition of semantic information into traditional environment representations, such as information about the location and identity of objects in the workspace (see Fig. 1). For example, a cleaning robot knows that dirty dishes are usually on top of tables and benefits from knowing where the tables are in the environment.

This work was performed under NSF IIS Grant #0546467 and ONR MURI N1141207-236214 and their support is gratefully acknowledged.

Recently, advances in vision- and laser-based object recognition have been leveraged to enrich maps with higher-order, semantic information (e.g., Posner, Cummins, and Newman, 2009; Douillard, Fox, and Ramos, 2008; Mozos, Stachniss, and Burgard, 2005; Anguelov et al., 2004). A straightforward approach to adding semantic information is to accept the results of a standard object detector framework prima facie, irrespective of sensor noise. A consequence of directly using the results of an object detector is that the quality of the map built strongly depends on the shortcomings of the object detector. Vision-based object detection, for example, is oftentimes plagued by significant performance degradation caused by a variety of factors including a change of aspect compared to that encountered in the training data and, of course, occlusion (e.g., Coates and Ng, 2010; Mittal and Davis, 2008). Both of these factors can be addressed by choosing the location of the camera carefully before acquiring an image and performing object detection.

Rarely, however, are the capabilities of the mobile robot building the map exploited to improve the robustness of the detection process by specifically counteracting known detector issues. Rather than placing the burden of providing perfect detections on the detector itself, the robot can act to improve its perception.

In this work, we explore how a robot’s ability to selectively gather additional information about a possible object by moving to a specified location — a key advantage over more conventional, static deployment scenarios for object detectors — can improve the precision of object detections included in the map. In particular, we propose a planning framework that reasons about taking detours from the shortest path to a goal in order to explore potential object detections based on noisy observations from an off-the-shelf object detector. At each step, the agent weighs the potential benefit of increasing its confidence about a potential object against the cost of taking a detour to a more suitable vantage point.

We make two primary contributions in this paper. Firstly, we explicitly incorporate the costs of motion when planning sensing detours. Secondly, we give an approximate sensor model that captures correlations between subsequent observations. Previous work has largely ignored motion costs and has typically assumed observations are conditionally independent given the sensor position. Inspired by recent progress in forward search for planning under uncertainty, we show that motion costs can be incorporated and measurement correlations modeled, allowing us to efficiently find robust observation plans.¹


Figure 1: A traditional environment map (a) with regions that are traversable (cyan) and not (red), useful for navigation; (b) the traversable map augmented with semantic information about the identity and location of objects in the world, allowing for richer interactions between agent and workspace.


RELATED WORK

Planning trajectories for a mobile sensor has been explored in various domains. The approach presented here is inspired by forward search strategies for solving partially observable Markov decision processes (Ross et al., 2008; Prentice and Roy, 2009) but incorporates a more complex model that approximates the correlations between observations.

The controls community and sensor placement community formulate the problem as minimizing a norm of the posterior belief, such as entropy, without regard to motion cost. As a consequence, greedy strategies that choose the next-most valuable measurement can be shown to be boundedly close to the optimal, and the challenge is to generate a model that predicts this next-best measurement (Guestrin, Krause, and Singh, 2005; Krause et al., 2008). Formulating the information-theoretic problem as a decision-theoretic POMDP, Sridharan, Wyatt, and Dearden (2008) showed that true multi-step policies did improve the performance of a computer vision system in terms of processing time. However, the costs of the actions are independent (or negligible), leading to a submodular objective function and limited improvement over greedy strategies.

A few relevant pieces of work from the active vision domain include Arbel and Ferrie (1999) and, more recently, Laporte and Arbel (2006), who use a Bayesian approach to model detections that is related to ours, but only search for the next-best viewpoint rather than computing a full plan. The work of Deinzer, Denzler, and Niemann (2003) is perhaps most similar to ours in that the viewpoint selection problem is framed using reinforcement learning, but again the authors “neglect costs for camera movement” and identify the absence of costs as a limitation of their work.

The contribution of our work over the existing work is primarily to describe a planning model that incorporates both action costs and detection errors, and specifically to give an approximate observation model that captures the correlations between successive measurements while still allowing forward-search planning to operate, leading to an efficient multi-step search to improve object detection.

¹ This paper presents results that originally appeared at ICAPS 2011 (Velez et al., 2011).


Figure 2: A conceptual illustration of (a) the robot at viewpoint x while following the original trajectory (bold line) towards the goal (red star), (b) the perception field for a particular object detector centered around an object hypothesis and the alternative path (bold dash-dotted line) taken by a curious robot to investigate the object of interest. Lighter cell shading values indicate a lower conditional entropy and therefore desirable vantage points.


PROBLEM FORMULATION

Consider an agent navigating through an environment with potential objects of interest at unknown locations. The agent has a goal destination but is allowed to inspect the locations of said objects of interest; for example, a rescue robot looking for people in a first-responder scenario. Traditionally, an object detector is used at waypoints along the way and an object is either accepted into the map or rejected based upon a simple detector threshold.

However, the lack of introspection of this approach regarding both the confidence of the object detector and the quality of the data gathered can lead to an unnecessary acceptance of spurious detections; what looked like a ramp from a particular viewpoint may in fact have been a trick of the light. Most systems simply discard lower confidence detections, and have no way to improve the estimate with further, targeted measurements. In contrast, we would like the robot to modify its motion to both minimize total travel cost and the cost of errors when deciding whether or not to add newly observed objects to the map.

Let us represent the robot as a point x ∈ R²; without loss of generality, we can express a robot trajectory as a set of waypoints x1:K. As the robot moves, it receives output from its object detector that gives rise to a belief over whether a detected object truly exists at the location indicated. Let the presence of an object at some location (x, y) be captured by the random variable Y ∈ {object, no-object}². Let us also define a decision action a ∈ {accept, reject}, where the detected object is either accepted into the map (the detection is determined to correspond to a real object) or rejected (the detection is determined to be spurious).

² We assume the robot knows its own location, and has a sufficiently well-calibrated camera to assign the location of the object. In this work, the uncertainty is whether or not an object is present at the given location (x, y).

Additionally, we have an explicit cost ξdec : {accept, reject} × {object, no-object} → R for a correct or incorrect accept or reject decision. We cannot know the true costs of the decisions because we ultimately do not know the true state of objects in the environment. However, we can use a probabilistic sensor model for object detections to minimize the expected cost of individual decision actions ξdec given the prior over objects. We therefore formulate the planning problem as choosing a sequence of waypoints to minimize the total travel cost along the trajectory and the expected costs of the decision actions at the end of the trajectory.
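To make the decision-cost term concrete, the following is a minimal sketch of computing the expected cost of an accept/reject decision under a scalar belief p(Y = object). The specific cost values and helper names (XI_DEC, expected_decision_cost, best_decision) are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch: expected cost of an accept/reject decision under the
# current belief. Cost values below are placeholders, not from the paper.

XI_DEC = {
    ("accept", "object"): 0.0,      # correct accept
    ("accept", "no-object"): 10.0,  # false positive added to the map
    ("reject", "object"): 5.0,      # missed object (the experiments later set this to 0)
    ("reject", "no-object"): 0.0,   # correct reject
}

def expected_decision_cost(action, p_object):
    """E_y[xi_dec(action, y)] for belief p(Y = object) = p_object."""
    return (p_object * XI_DEC[(action, "object")]
            + (1.0 - p_object) * XI_DEC[(action, "no-object")])

def best_decision(p_object):
    """Choose the decision with the minimum expected cost."""
    return min(("accept", "reject"),
               key=lambda a: expected_decision_cost(a, p_object))
```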

OBJECT DETECTION SENSOR MODEL

In order to compute the expected cost of decision actions, we must estimate the probability of objects existing in the world, and therefore require a probabilistic model of the object detector. The key idea is that we model the object detector as a spatially varying process; around each potential object, we characterize every location with respect to how likely it is to give rise to useful information.

A measurement, z, at a particular viewpoint consists of the output of the object detector, assumed to be a real number indicating the confidence of the detector that an object exists. The distribution over the range of confidence measurements is captured by the random variable Z defined over a range Z of discretized states (bins). At every location x the posterior distribution over Y can be expressed as

p(y | z, x) = p(z | y, x) p(y) / Σ_{y′} p(z | y′, x) p(y′),    (1)

where p(z | y, x) denotes the likelihood of observing a particular detector confidence at x given the true state of the object. This likelihood can be obtained empirically.
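As an illustration of Equ. 1, here is a small sketch of the single-viewpoint Bayes update over discretized confidence bins. The likelihood histograms are invented; in the paper they are estimated empirically and depend on the viewpoint x.

```python
import numpy as np

# Sketch of the Bayes update in Equ. 1 for one viewpoint. The likelihood
# histograms are placeholders; the learned model varies with the viewpoint x.

def posterior_object(z_bin, likelihood, prior=0.5):
    """p(y = object | z, x) for a discretized detector confidence z_bin.

    likelihood: dict mapping 'object' / 'no-object' to a histogram over
                confidence bins, i.e. the empirical p(z | y, x) at viewpoint x.
    """
    num = likelihood["object"][z_bin] * prior
    den = num + likelihood["no-object"][z_bin] * (1.0 - prior)
    return num / den

# Hypothetical empirical likelihoods over 5 confidence bins (low -> high).
likelihood = {
    "object":    np.array([0.05, 0.10, 0.15, 0.30, 0.40]),
    "no-object": np.array([0.40, 0.30, 0.15, 0.10, 0.05]),
}
print(posterior_object(z_bin=4, likelihood=likelihood))  # high confidence -> ~0.89
```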

When observations originate from a physical device such as a camera, a straightforward approach is to treat observations as conditionally independent given the state of the robot (see Fig. 3(a)). This model is easy to work with, but assumes that measurements vary only as a result of sensor noise and the object state. In practice, measurements vary as a function of many different factors such as scene geometry, lighting, occlusion, etc. The approach in Fig. 3(a) simply ignores these factors. If all of these (often unobservable) sources of variation were correctly modeled and estimated in an environmental variable Ψ (Fig. 3(b)), then the conditional independence would hold, but constructing and maintaining a model sufficient to capture the image generation process is an intractable computational and modeling burden.

To correct our observation model without an explicit model Ψ, we can maintain a history of observation viewpoints. As more viewpoints are visited, knowledge regarding Y and future observations can be integrated recursively. As shown in Fig. 3(c), we remove Ψ and add a dependency between previous viewpoints and the current observation zK. Let T^K denote a trajectory of K locations traversed in sequence. At each location a measurement is obtained, giving a possible detection and the corresponding confidence. The trajectory is thus described by a set T^K = {{x1, z1}, {x2, z2}, . . . , {xK, zK}} of K location-observation pairs. Knowledge gained at each step along the trajectory can be integrated into the posterior distribution over Y such that

p(y | T^K) = p(zK | y, xK, T^{K−1}) p(y | T^{K−1}) / p(zK | xK, T^{K−1}),    (2)

where zK is the Kth observation, which depends not only on the current viewpoint but also on the history of measurements T^{K−1}. In principle, K can be arbitrarily long, so the primary challenge is to develop an efficient way of conditioning our observation model on previous viewpoints.

To overcome this difficulty, we approximate the real process of object detection with a simplistic model of how the images are correlated. We replace the correlating influence of environment Ψ with a convex combination of a fully uncorrelated and a fully correlated model, such that the new posterior belief over the state of the world is computed as

p(y | T^K) = [ (1 − α) p(zK | y, xK) / p(zK | xK) + α ] p(y | T^{K−1}).    (3)

This captures the intuition that repeated observations from the same viewpoint add little to the robot’s knowledge about the state of the world. Observations from further afield, however, become increasingly independent; Ψ has less of a correlating effect. The mixing parameter, α, can be chosen such that no information is gained by taking additional measurements at the same location and the information content of observations increases linearly with distance from previous ones (see Velez et al., 2011).
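To make Equ. 3 concrete, here is a sketch of one recursive belief update with a distance-dependent mixing weight. The linear distance schedule (alpha_from_history with a saturation distance) is an illustrative assumption; the paper only requires that repeated observations from the same location add no information and that independence grows with distance from previous viewpoints.

```python
import numpy as np

# Sketch of the recursive update in Equ. 3. alpha = 1 means fully correlated
# (a repeated view adds nothing); alpha = 0 means a standard independent
# Bayes update. The distance-based schedule below is an assumption made
# for illustration only.

def alpha_from_history(x_new, visited, saturation_dist=2.0):
    """1.0 at an already-visited viewpoint, decaying to 0 with distance."""
    if not visited:
        return 0.0
    d = min(np.linalg.norm(np.asarray(x_new) - np.asarray(xv)) for xv in visited)
    return max(0.0, 1.0 - d / saturation_dist)

def update_belief(p_obj, z_bin, x_new, visited, likelihood):
    """One step of Equ. 3 on the scalar belief p(Y = object)."""
    alpha = alpha_from_history(x_new, visited)
    pz_obj = likelihood["object"][z_bin]
    pz_no = likelihood["no-object"][z_bin]
    pz = pz_obj * p_obj + pz_no * (1.0 - p_obj)             # p(z_K | x_K)
    new_obj = ((1.0 - alpha) * pz_obj / pz + alpha) * p_obj
    new_no = ((1.0 - alpha) * pz_no / pz + alpha) * (1.0 - p_obj)
    return new_obj / (new_obj + new_no)                     # guards against numerical drift
```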

Perception Field

Using the observation model and Equ. 3, it is possible to evaluate how much additional information a future viewpoint xK provides on the object state Y. Given xK and the trajectory T^{K−1} visited thus far, the expected reduction in uncertainty is captured by the mutual information between Y and the observation Z received at xK:

I(Y, Z; xK, T^{K−1}) = H(Y; T^{K−1}) − H(Y | Z; xK, T^{K−1}).    (4)

Of these two terms, the first, H(Y; T^{K−1}), is independent of xK and can be ignored. The second term, the conditional entropy, can be readily evaluated using empirically determined quantities. In particular, we use the conditional entropy evaluated at all viewpoints in the robot’s surrounding workspace to form the perception field (Velez et al., 2011) for a particular object hypothesis (see Fig. 2(b)). This field denotes locations with high expected information gain. Due to the correlations between individual observations made over a trajectory of viewpoints, the perception field changes as new observations are added. Note that if the robot’s only goal was to determine Y with the greatest confidence, it would repeatedly visit locations with the least conditional entropy, as indicated by the perception field.
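A sketch of how such a perception field could be tabulated from the learned model: for each candidate viewpoint, evaluate the conditional entropy H(Y | Z; x) under the current belief using the viewpoint-dependent likelihood. The likelihood_at lookup is a stand-in for the learned detector model, and the function names are hypothetical.

```python
import numpy as np

# Sketch: the perception field maps each candidate viewpoint to the
# conditional entropy H(Y | Z; x); lower values mark better vantage points.
# likelihood_at(x) is a stand-in for the learned, viewpoint-dependent model.

def conditional_entropy(p_obj, likelihood):
    """H(Y | Z) for one viewpoint; likelihood[y] is a histogram over z-bins."""
    h = 0.0
    for z in range(len(likelihood["object"])):
        joint_obj = likelihood["object"][z] * p_obj
        joint_no = likelihood["no-object"][z] * (1.0 - p_obj)
        pz = joint_obj + joint_no
        for joint in (joint_obj, joint_no):
            if joint > 0.0:
                h -= joint * np.log2(joint / pz)   # -sum_{y,z} p(y,z) log p(y|z)
    return h

def perception_field(p_obj, viewpoints, likelihood_at):
    """Tabulate the conditional entropy over a set of candidate viewpoints."""
    return {x: conditional_entropy(p_obj, likelihood_at(x)) for x in viewpoints}
```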

PLANNING DETOURS

We now describe a planning algorithm that trades off the benefit of gaining additional information about an object hypothesis against the operational cost of obtaining this information.


Figure 3: Different graphical models representing the observation function. (a) A naive Bayes approximation that assumes every observation is conditionally independent given knowledge of the object. (b) The true model, which assumes that observations are independent given knowledge of the environment and the object. (c) The model employed here, in which the correlations are approximated by way of a mixture model parameterized by α as per Equ. 3.

When an object is first detected, a new path to the original goal is planned based on the total cost function, which includes both the motion cost along the path and the value of measurements from locations along the path, expressed as a reduction in the expected cost of decision actions. The cost function consists of two terms: the motion cost cmot(x1:K) and the decision cost cdec(x1:K, a), such that the optimal plan π∗ is given by

π∗ = arg min_{x1:K, a} [ cmot(x1:K) + cdec(x1:K, a) ],    (5)

cdec(x1:K, a) = E_{y|x1:K}[ ξdec(a, y) ],    (6)

where E_{y|x1:K}[·] denotes the expectation with respect to the robot’s knowledge regarding the object, after having executed path x1:K. We choose cmot(x1:K) to be a standard motion cost function proportional to the path length.

The planning process therefore proceeds by searching over sequences of x1:K, evaluating paths by computing expectations with respect to both the observation sequences and the object state. The paths with the lowest decision cost will tend to be those leading to the lowest posterior entropy, avoiding the large penalty for false positives or negatives.
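The following sketch scores a candidate waypoint sequence in the spirit of Equ. 5: path length plus the expected cost of the final accept/reject decision, with the expectation over detector responses taken by a simple recursive enumeration. For brevity the observations are treated as independent here (ignoring the α-correlation above), and the cost values, likelihood_at lookup, and function names are illustrative assumptions.

```python
import numpy as np

# Sketch of evaluating Equ. 5 for one candidate path: motion cost plus the
# expected cost of the best final accept/reject decision. Observations are
# treated as independent here for brevity; cost values are placeholders.

XI = {("accept", True): 0.0, ("accept", False): 10.0,   # false-alarm penalty
      ("reject", True): 5.0, ("reject", False): 0.0}

def decision_cost(p_obj):
    """Cost of the best accept/reject decision under belief p_obj."""
    return min(p_obj * XI[(a, True)] + (1.0 - p_obj) * XI[(a, False)]
               for a in ("accept", "reject"))

def motion_cost(waypoints, weight=1.0):
    """Motion cost proportional to the path length through the waypoints."""
    pts = np.asarray(waypoints, dtype=float)
    return weight * float(np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1)))

def expected_final_cost(p_obj, remaining, likelihood_at):
    """Expectation over detector responses along the remaining waypoints of
    the best final decision cost."""
    if not remaining:
        return decision_cost(p_obj)
    lik = likelihood_at(remaining[0])
    total = 0.0
    for z in range(len(lik["object"])):
        pz_obj = lik["object"][z] * p_obj
        pz = pz_obj + lik["no-object"][z] * (1.0 - p_obj)
        if pz > 0.0:
            total += pz * expected_final_cost(pz_obj / pz, remaining[1:], likelihood_at)
    return total

def plan_cost(p_obj, waypoints, likelihood_at):
    """Equ. 5: motion cost plus expected cost of the final decision."""
    return motion_cost(waypoints) + expected_final_cost(p_obj, list(waypoints), likelihood_at)
```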

Multi-step Planning

A naive approach to searching over trajectories scales exponentially with the planning horizon K and rapidly becomes intractable as more observations are considered. We therefore adopt a roadmap scheme in which a fixed number of locations are sampled every time a new viewpoint is to be added to the current trajectory. A graph is built between the sampled poses, with straight-line edges between samples. The perception field is used to bias sampling towards locations with high expected information gain.
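A sketch of perception-field-biased sampling for the roadmap step just described: candidate poses are drawn with weights that favour low conditional entropy. The softmax-style weighting and parameter names are illustrative choices, not taken from the paper.

```python
import numpy as np

# Sketch: sample roadmap viewpoints with a bias towards locations of high
# expected information gain (low conditional entropy). The weighting scheme
# is an illustrative choice.

rng = np.random.default_rng(0)

def sample_viewpoints(candidates, entropy_of, n_samples=10, temperature=0.2):
    """Draw n_samples poses from candidates, favouring low-entropy viewpoints."""
    h = np.array([entropy_of(x) for x in candidates])
    weights = np.exp(-(h - h.min()) / temperature)   # lower entropy -> larger weight
    weights /= weights.sum()
    idx = rng.choice(len(candidates), size=n_samples, replace=False, p=weights)
    return [candidates[i] for i in idx]
```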

The planning approach described so far can be extended to planning in an environment with M object hypotheses by considering a modified cost function which simply adds the cost for each object and treats the existence of each object independently, such that individual perception fields add at a particular location. See Velez et al. (2011) for more details.
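As a small illustration of this multi-hypothesis extension, assuming per-object helpers like those in the sketches above: decision costs are summed over independent object hypotheses, and per-object perception fields add at a candidate location. The names here are hypothetical.

```python
# Sketch of the multi-hypothesis extension: decision costs sum over
# independent object hypotheses, and per-object perception fields add
# at a candidate location. Helper names are hypothetical.

def total_decision_cost(beliefs, decision_cost):
    """Sum of best-decision costs over independent object hypotheses."""
    return sum(decision_cost(p) for p in beliefs)

def combined_field(x, perception_fields):
    """Per-object perception fields add at a particular location x."""
    return sum(field(x) for field in perception_fields)
```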

EXPERIMENTS

We tested our approach on both a simulated robot with an empirically derived model of an object detector and on an actual autonomous wheelchair (Fig. 4) using a vision-based object detector (Felzenszwalb, McAllester, and Ramanan, 2008).

Figure 4: Robotic wheelchair platform.

In both the simulation and physical experiments, the robot was tasked with reaching a manually specified destination. The robot was rewarded for correctly detecting and reporting the location of doors in the environment, penalized for false alarms, and incurred a cost proportional to the length of its total trajectory. We chose doors as objects of interest due to their abundance in indoor environments and their utility to mobile robots – identifying doorways is often a component of higher-level tasks.

Our autonomous wheelchair is equipped with onboard laser range scanners, primarily used for obstacle sensing and navigation, and a Point Grey Bumblebee2 color stereo camera. The simulation environment is based on empirically constructed models of the physical robot and object detector. We set ξdec(reject, ·) to zero, indicating no penalty for missed objects.

Learned Perception Field

Fig. 5 shows the perception field for the detector model learned from 3400 training samples, with each cell indicating the conditional entropy of the posterior distribution over the true state of an object given an observation from this viewpoint, p(y | z, x), and a uniform initial belief.

Average             Random β=0.8    Random β=0.6    Greedy β=0.8    Greedy β=0.6    Planned         RTBSS
Precision           0.27 ±0.03      0.26 ±0.04      0.31 ±0.06      0.60 ±0.07      0.75 ±0.06      0.45 ±0.06
Recall              0.72 ±0.06      0.60 ±0.07      0.44 ±0.07      0.62 ±0.07      0.80 ±0.06      0.58 ±0.07
Path Length (m)     62.63 ±0.02     62.03 ±0.67     67.08 ±2.23     41.95 ±0.88     54.98 ±3.04     47.57 ±0.19
Total Trials        50              50              50              50              50              50

Table 1: Simulation performance on the single door scenario, with standard error values.

Average             Random β=0.8    Random β=0.6    Greedy β=0.8    Greedy β=0.6    Planned         RTBSS
Precision           0.64 ±0.03      0.67 ±0.03      0.64 ±0.03      0.54 ±0.03      0.53 ±0.05      0.70 ±0.03
Recall              0.64 ±0.04      0.69 ±0.03      0.63 ±0.02      0.57 ±0.03      0.76 ±0.03      0.66 ±0.03
Path Length (m)     199.62 ±11.24   161.36 ±6.13    153.32 ±4.37    121.35 ±1.32    138.21 ±7.12    160.74 ±6.08
Total Trials        50              50              50              50              50              50

Table 2: Simulation performance on the multiple door scenario, with standard error values.

Figure 5: Learned perception field for a possible door. The unknown object is centered at the origin (blue). Brighter regions correspond to viewpoints more likely to result in higher-confidence posterior beliefs.

Intuitively, this suggests that the viewpoint from which a single observation is most likely to result in a low-entropy posterior belief is approximately 9 m directly in front of the door. Viewpoints too close to or far from the door, and those from oblique angles, are less likely to result in high-confidence posterior beliefs.

Simulation Results

We first assessed our planning approach using the learned model in a simulated environment. Our simulation environment consisted of a robot navigating through an occupancy map, with object detections triggered according to the learned object detection model and correlation model α. We also simulated false positives by placing non-door objects that probabilistically triggered object detections using the learned model for false alarms. The processing delay incurred by the actual object detector was also simulated (the object detector requires approximately 2 seconds to process a spatially decimated 512×384 pixel image).

Comparison Algorithms

For the simulation trials we compared our algorithm against three other algorithms. The Randomβ algorithm repeatedly obtained observations from randomly selected viewpoints near detected objects until the belief of each object exceeded a threshold β, and then continued on to the original destination. The Greedyβ algorithm selected the best viewpoint according to our perception field for each potential object until the belief of each object exceeded a threshold β.

Figure 6: Simulation scenario with multiple doors (cyan triangles) and non-doors (black bars). The robot starts at the bottom-left (S) with a goal near the bottom-right (G). The robot’s object detector responds to both doors and non-doors. Top: example trajectory (purple) executed using random viewpoint selection. Bottom: example trajectory (purple) executed using planned viewpoints.

Lastly, we compared our algorithm against the RTBSS online POMDP algorithm (Paquet and Tobin, 2005) with a maximum depth of 2.

Single Door Simulation

First, we tested our planning algorithm on a small simulation environment with one true door and two non-doors. Table 1 shows the results of 50 trials. Overall, explicitly planning viewpoints resulted in significantly higher performance. The planned viewpoints algorithm performed better than RTBSS in terms of precision and recall, most likely because our algorithm sampled continuous-space viewpoints whereas the RTBSS algorithm used a fixed discrete representation; RTBSS paths, however, were shorter.

Multi Door Simulation

Next, we evaluated our algorithm in a larger, more complex scenario containing four doors and six non-door objects. Figure 6 shows the multiple door simulation environment and example trajectories planned and executed by the planned and Randomβ algorithms.

Figure 7: Trajectory executed on the actual robot wheelchair using planned viewpoints from ’S’ to ’G’, where the robot discovers one true door (cyan triangle). Near the goal, it detects two more possible doors (red dots), detours to inspect them, and (correctly) decides that they are not doors.

Average             Greedy β=0.8     Planned
Precision           0.53 ±0.14       0.7 ±0.15
Recall              0.60 ±0.14       0.7 ±0.15
Path Length (m)     153.86 ±33.34    91.68 ±15.56
Total Trials        10               10

Table 3: Results of real-world trials using the robot wheelchair.


Table 2 shows the simulation results for the multi-door scenario. Our planned viewpoints algorithm resulted in the second shortest paths after Greedyβ=0.6 but with superior detection performance. Planned viewpoints also resulted in significantly shorter paths than RTBSS given the same operating point on the ROC curve.

Physical Wheelchair Trials

We conducted a small experiment comparing our planned viewpoints algorithm and the Greedyβ=0.8 algorithm on a robotic wheelchair platform. The robot was given a goal position such that a nominal trajectory would bring it past one true door and near several windows that trigger object detections.

Figure 7 illustrates the trajectory executed during a single trial of the planned viewpoints algorithm, and Table 3 summarizes the results of all trials. Our planned viewpoints algorithm resulted in significantly shorter trajectories while maintaining comparable precision and recall. For doors detected with substantial uncertainty, our algorithm planned more advantageous viewpoints to increase its confidence and ignored far-away detections because of high motion cost.

CONCLUSIONS AND FUTURE WORK

Previous work in planned sensing has largely ignored the motion costs of planned trajectories and has used simplified sensor models with strong independence assumptions. In this paper, we presented a sensor model that approximates the correlation in observations made from similar vantage points, and an efficient planning algorithm that balances moving to highly informative vantage points against the motion cost of taking detours. The result is a robot which plans trajectories specifically to decrease the entropy of putative detections. The performance of our algorithm could be further improved by future work on both the sensor model and the planning technique.

References

Anguelov, D.; Koller, D.; Parker, E.; and Thrun, S. 2004. Detecting and modeling doors with mobile robots. In Proc. ICRA.

Arbel, T., and Ferrie, F. P. 1999. Viewpoint selection by navigation through entropy maps. In Proc. ICCV.

Coates, A., and Ng, A. Y. 2010. Multi-camera object detection for robotics. In Proc. ICRA.

Deinzer, F.; Denzler, J.; and Niemann, H. 2003. Viewpoint selection - planning optimal sequences of views for object recognition. In Proc. ICCV. Springer.

Douillard, B.; Fox, D.; and Ramos, F. 2008. Laser and Vision Based Outdoor Object Mapping. In Proc. RSS.

Felzenszwalb, P.; McAllester, D.; and Ramanan, D. 2008. A discriminatively trained, multiscale, deformable part model. In Proc. CVPR.

Guestrin, C.; Krause, A.; and Singh, A. 2005. Near-optimal sensor placements in Gaussian processes. In Proc. ICML.

Krause, A.; Leskovec, J.; Guestrin, C.; VanBriesen, J.; and Faloutsos, C. 2008. Efficient sensor placement optimization for securing large water distribution networks. Journal of Water Resources Planning and Management 134:516.

Laporte, C., and Arbel, T. 2006. Efficient discriminant viewpoint selection for active Bayesian recognition. International Journal of Computer Vision 68(3):267–287.

Mittal, A., and Davis, L. 2008. A general method for sensor planning in multi-sensor systems: Extension to random occlusion. International Journal of Computer Vision 76:31–52.

Mozos, O. M.; Stachniss, C.; and Burgard, W. 2005. Supervised Learning of Places from Range Data using AdaBoost. In Proc. ICRA.

Newman, P.; Sibley, G.; Smith, M.; Cummins, M.; Harrison, A.; Mei, C.; Posner, I.; Shade, R.; Schroeter, D.; Murphy, L.; Churchill, W.; Cole, D.; and Reid, I. 2009. Navigating, recognising and describing urban spaces with vision and laser. International Journal of Robotics Research.

Paquet, S., and Tobin, L. 2005. Real-Time Decision Making for Large POMDPs. In Proc. Canadian Conference on Artificial Intelligence.

Posner, I.; Cummins, M.; and Newman, P. 2009. A generative framework for fast urban labeling using spatial and temporal context. Autonomous Robots.

Prentice, S., and Roy, N. 2009. The belief roadmap: Efficient planning in belief space by factoring the covariance. International Journal of Robotics Research 8(11-12):1448–1465.

Ross, S.; Pineau, J.; Paquet, S.; and Chaib-draa, B. 2008. Online planning algorithms for POMDPs. Journal of Artificial Intelligence Research 32(1):663–704.

Sridharan, M.; Wyatt, J.; and Dearden, R. 2008. HiPPo: Hierarchical POMDPs for Planning Information Processing and Sensing Actions on a Robot. In Proc. ICAPS.

Velez, J.; Hemann, G.; Huang, A.; Posner, I.; and Roy, N. 2011. Planning to Perceive: Exploiting Mobility for Robust Object Detection. In Proc. ICAPS.