ICRA 2015 Spotlight

Semantic Octree: Unifying Recognition, Reconstruction and Representation via an Octree Constrained Higher Order MRF

Paul Sturgess and Sunando Sengupta, Oxford Brookes University

ICRA 2015

*Joint First Author, {paul.sturgess.cv,sunando.sengupta}@gmail.com



Semantic Octree

Recognition: Structured prediction is widely adopted in vision (e.g. AHRF), but the efficiency of the output structure is not the focus.

Reconstruction: Octrees are widely adopted in robotics (e.g. OctoMap), but incorporating high-level semantic information is not the focus.

Unifying representation: Complementary to recognition and reconstruction, and efficient for further manipulation of the underlying data.

Combine OctoMap and AHRF to get the best of both.


Recognition

● AHRF - Associative Higher-order Random Fields Framework.

● Multi-resolution approach to semantic image segmentation.

● Efficient and bounded inference with alpha-expansion.

Reconstruction

The main elements of an occupancy-based scene reconstruction are:

● Occupied: objects present in the world.

● Free: space required for collision avoidance and path planning.

● Unmapped: unknown areas of the scene, which need to be avoided.
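The three occupancy states can be illustrated with a small sketch. It assumes a log-odds occupancy update in the style commonly used by occupancy mapping; the constants `L_HIT`, `L_MISS`, `L_MIN`, `L_MAX` and the helper names are hypothetical and not taken from the slides.

```python
from enum import Enum


class State(Enum):
    OCCUPIED = "occupied"
    FREE = "free"
    UNMAPPED = "unmapped"


# Hypothetical sensor-model constants in log-odds units (assumptions).
L_HIT, L_MISS = 0.85, -0.4
L_MIN, L_MAX = -2.0, 3.5


def update(grid, voxel, hit):
    """Add one hit or miss observation to a voxel's clamped log-odds value."""
    value = grid.get(voxel, 0.0) + (L_HIT if hit else L_MISS)
    grid[voxel] = max(L_MIN, min(L_MAX, value))


def classify(grid, voxel):
    """A voxel that was never observed is unmapped, which is not the same as free."""
    if voxel not in grid:
        return State.UNMAPPED
    return State.OCCUPIED if grid[voxel] > 0.0 else State.FREE


grid = {}
update(grid, (1, 2, 3), hit=True)   # a range measurement ended in this voxel
update(grid, (1, 2, 2), hit=False)  # a ray passed through this voxel
for voxel in [(1, 2, 3), (1, 2, 2), (0, 0, 0)]:
    print(voxel, classify(grid, voxel).value)
```

Keeping unobserved voxels out of the map, rather than storing them as free, is what lets a planner treat unknown space conservatively.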

Representation

• Efficient access to, and manipulation of, 3D object models are at the heart of robotics.
  o Point clouds, meshes: cannot map free and unknown areas.
  o Stixels / height maps / 2.5D: one height value per cell of a 2D grid, and free area is not accurately mapped.
  o Fixed-size grid of voxels: voxels are not indexed, which makes it inefficient.

• Octree-based volumetric representation
  o Accurately represents 3D space, with efficient indexing of the volume.

Image courtesy: A. Hornung et al., OctoMap: An efficient probabilistic 3D mapping framework based on octrees.
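To make "efficient indexing of the volume" concrete, here is a minimal sketch of how a 3D point can be mapped to a leaf voxel and to the sequence of child indices followed from the root. The depth, resolution, and function names are assumptions chosen for illustration, and coordinates are assumed non-negative.

```python
MAX_DEPTH = 4      # hypothetical tree depth: 2**4 = 16 cells per axis
RESOLUTION = 0.5   # hypothetical leaf edge length in metres


def point_to_key(point):
    """Map a 3D point with non-negative coordinates to integer leaf-voxel indices."""
    return tuple(int(c // RESOLUTION) for c in point)


def child_path(key):
    """Child index (0 to 7) chosen at each level when descending from root to leaf."""
    x, y, z = key
    path = []
    for level in range(MAX_DEPTH - 1, -1, -1):
        bx = (x >> level) & 1
        by = (y >> level) & 1
        bz = (z >> level) & 1
        path.append(bx | (by << 1) | (bz << 2))
    return path


key = point_to_key((1.2, 3.9, 0.1))
print(key, child_path(key))
```

Each level consumes one bit per axis, so a lookup costs a number of steps equal to the tree depth rather than a scan over all voxels.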


Semantic Octree - framework

Input stereo images are used to generate point clouds, which are fused into an octree through a pre-estimated camera pose.

Leaf nodes (x_i) are the smallest voxels.

Any internal node (x_c) gives a natural grouping of 3D space.
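The grouping of leaf voxels under an internal node can be sketched as follows. This assumes leaves are addressed by integer (x, y, z) keys, so that the ancestor a given number of levels up is obtained by dropping low-order bits; the function names are hypothetical.

```python
from collections import defaultdict


def ancestor(key, levels_up):
    """Key of the internal node `levels_up` levels above a leaf voxel key."""
    return tuple(c >> levels_up for c in key)


def group_leaves(leaf_keys, levels_up):
    """Collect leaf voxels x_i under their shared internal node x_c."""
    groups = defaultdict(list)
    for key in leaf_keys:
        groups[ancestor(key, levels_up)].append(key)
    return dict(groups)


leaves = [(0, 0, 0), (1, 1, 0), (2, 0, 0), (5, 4, 4)]
for levels_up in (1, 2):
    print(levels_up, group_leaves(leaves, levels_up))
```

Moving one level up merges the eight children of a node into one group, which gives coarser and coarser groupings of the same leaves without any extra clustering step.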


Semantic Octree - framework

Perform inference over the 3D voxels to give a labelled scene.
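As a rough illustration of what such inference minimises, the sketch below evaluates an energy made of per-voxel unary costs plus a higher-order term that penalises label disagreement inside each octree group. The truncated-linear penalty, the parameters `gamma` and `trunc`, and all names are assumptions for illustration; the slides do not give the exact potentials, and this code only scores a labelling rather than running alpha-expansion.

```python
from collections import Counter


def energy(labels, unary, groups, gamma=1, trunc=2):
    """Unary costs plus a truncated penalty for label disagreement in each group."""
    total = sum(unary[v][labels[v]] for v in labels)
    for members in groups.values():
        counts = Counter(labels[v] for v in members)
        disagreeing = len(members) - max(counts.values())
        total += gamma * min(disagreeing, trunc)
    return total


# Toy problem: three leaf voxels under one internal node (hypothetical costs).
unary = {
    "a": {"road": 0, "car": 2},
    "b": {"road": 0, "car": 2},
    "c": {"road": 2, "car": 1},
}
groups = {"node": ["a", "b", "c"]}

consistent = {"a": "road", "b": "road", "c": "road"}
mixed = {"a": "road", "b": "road", "c": "car"}
print(energy(consistent, unary, groups, gamma=3))
print(energy(mixed, unary, groups, gamma=3))
```

With a strong enough group penalty, the consistent labelling scores lower even though voxel "c" on its own prefers a different label, which is the effect a hierarchical grouping is meant to have.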


Results

Hierarchical grouping during inference vs. leaf-level voxel labelling (much sparser)

Conclusion

● Proposed a method which performs reconstruction in an efficient representation aided by semantics of the scene

● Combined AHRF and OctoMap to get the best of both.

● Some future applications
  ○ Scene interaction and manipulation.
  ○ Collision detection, with known object types.
  ○ Path planning with known affordances.