3D OBJECT RECONSTRUCTION FROM SWISSRANGER SENSOR DATA ... · 3D OBJECT RECONSTRUCTION FROM...

3D OBJECT RECONSTRUCTION FROM SWISSRANGER SENSOR DATA USING A SPRING-MASS MODEL

Babette Dellen1,2, Guillem Alenya 2, Sergi Foix2, and Carme Torras 2

1Bernstein Center for Computational Neuroscience, Max-Planck Institute for Dynamics and Self-Organization, Bunsenstrasse 10, Gottingen, Germany

2Institut de Robotica i Informatica Industrial (CSIC-UPC), Llorens i Artigas 4-6, 08028 Barcelona

Keywords: Swissranger Sensor, 3D Reconstruction, Spring-Mass Model.

Abstract: We register close-range depth images of objects using a Swissranger sensor and apply a spring-mass model for 3D object reconstruction. The Swissranger sensor delivers depth images in real time, but compared with other types of sensors, such as laser scanners, these images have a lower resolution and are afflicted with larger uncertainties. To reduce noise and remove outliers in the data, we treat the point cloud as a system of interacting masses connected via elastic forces. We investigate two models, one with and one without a surface-topology preserving interaction strength. The algorithm is applied to synthetic and real Swissranger sensor data, demonstrating the feasibility of the approach. This method represents a preliminary step before fitting higher-level surface descriptors to the data, which will be required to define object-action complexes (OACs) for robot applications.

1 INTRODUCTION

The automatic reconstruction and model building of complex physical objects is important to guide robotic tasks such as grasping and view-point changes, and more generally to predict the outcome of robot manipulations. Due to improvements during the last decade in the field of 3D time-of-flight (TOF) sensors, data can now be obtained faster and more accurately from these sensors. The Swissranger sensor provides an excellent tool for 3D robot exploration tasks. The camera can be mounted on a robot arm, depth images can be acquired in real time, and objects can be simultaneously manipulated and reconstructed. The resulting depth images in this specific application are of low resolution and contain large uncertainties, requiring the adaptation and development of suitable models.

Previously, several methods have been proposed to reconstruct 3D object surfaces from a cloud of 3D data points, e.g. dynamic particles [Szeliski et al., 1992], implicit-surface based methods [Hoppe et al., 1992], volumetric methods [Curless and Levoy, 1996], tensor-voting based methods [Tang and Medioni, 2002], Voronoi-based surface reconstruction [Amenta et al., 1998], level sets [Zhao et al., 2000], and surface fitting with radial basis functions [Carr et al., 2001]. Techniques based on dynamic particles [Szeliski et al., 1992] have the advantage that a microscopic model of the object is constructed, which can be used to model the outcome of robot actions applied to deformable objects, e.g. the manipulation of a table cloth [Cuen et al., 2008]. In this paper, we propose a method based on a system of elastically interacting masses. The proposed model differs from previous approaches based on dynamic particles in the way the data is incorporated into the model and in the specific choices of the interaction functions. A novel feature of this algorithm is the inclusion of a damping and a noise term which drive the system towards a local minimum, an idea similar to simulated annealing. The model constitutes a preliminary step before fitting higher-level surface descriptors to the data. These entities will then provide a solid basis for guiding object actions, i.e. viewpoint changes and actions of the robot gripper.

The reconstruction process consists of the following steps: (a) image capture, (b) coarse registration, (c) fine registration, and (d) implicit surface modeling via a spring-mass model, which is introduced in Section 2. Steps a-c are described in Section 3.

2 THE MODEL SYSTEM

The basic framework we consider can be defined as follows. Let $P$ be a set of data points with measured positions $\mathbf{X}_p = (X_p, Y_p, Z_p)$ in three-dimensional space. To each data point, we assign a mass point defined by a continuous position variable $\mathbf{x}_p = (x_p, y_p, z_p)$ and a velocity variable $\mathbf{v}_p = (u_p, v_p, w_p)$.

The masses move under the influence of a data force

$$\mathbf{F}_{\mathrm{data}} = k(\mathbf{X}_p - \mathbf{x}_p), \tag{1}$$

where $k$ is a parameter. Each mass $q$ exerts a force

$$\mathbf{F}_{\mathrm{int}}(p,q) = \kappa(p,q)(\mathbf{x}_q - \mathbf{x}_p) \tag{2}$$

on $p$ if $q$ is in the neighborhood $N(p)$ of $p$.

We consider two different models, A and B. In model A, $\kappa(p,q)$ is equal to a constant $c$, while in model B the function $\kappa(p,q)$ depends on the angle between the surface normal of particle $q$ and the difference vector $\mathbf{x}_p - \mathbf{x}_q$, such that

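$$\kappa(p,q) = \frac{2c}{\pi}\left[\frac{\pi}{2} - \theta(\mathbf{x}_p - \mathbf{x}_q, \mathbf{n}_q)\right], \tag{3}$$

where $\mathbf{n}_q$ is the surface normal at $q$ and

$$\theta(\mathbf{a},\mathbf{b}) = \begin{cases} \angle(\mathbf{a},\mathbf{b}) & \text{if } \angle(\mathbf{a},\mathbf{b}) \le \pi/2, \\ \pi - \angle(\mathbf{a},\mathbf{b}) & \text{if } \angle(\mathbf{a},\mathbf{b}) > \pi/2. \end{cases} \tag{4}$$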

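Note that $\angle(\mathbf{a},\mathbf{b})$ is the inner angle enclosed by $\mathbf{a}$ and $\mathbf{b}$ and thus does not grow larger than $\pi$. Eq. 4 ensures that the interaction force depends only on the orientation of the surface normal relative to the mass, so that flipping the sign of the normal leaves $\kappa(p,q)$ unchanged.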
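As an illustration of Eqs. 3 and 4, the interaction strength of model B could be evaluated as in the following minimal sketch. The function names are hypothetical and not taken from the original implementation, and the sketch assumes that the two masses do not coincide and that the normal has nonzero length.

```python
import numpy as np

def inner_angle(a, b):
    """Inner angle enclosed by vectors a and b, in [0, pi]."""
    cos_ab = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip to guard against rounding errors slightly outside [-1, 1].
    return np.arccos(np.clip(cos_ab, -1.0, 1.0))

def theta(a, b):
    """Eq. 4: fold the inner angle into the range [0, pi/2]."""
    ang = inner_angle(a, b)
    return ang if ang <= np.pi / 2 else np.pi - ang

def kappa_model_b(x_p, x_q, n_q, c):
    """Eq. 3: interaction strength of model B (hypothetical helper name)."""
    return 2.0 * c * (np.pi / 2 - theta(x_p - x_q, n_q)) / np.pi
```

With these definitions, a mass lying along the normal of its neighbor is coupled with the full strength c, while a mass lying in the tangent plane of its neighbor is not coupled at all.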

A schematic comparison of models A and B is provided in Fig. 1. While the interaction forces in model A only depend on the relative distances of neighboring particles, model B takes into account the position of a particle with respect to the surface orientation of the neighboring particles. This has the advantage that the data is preferably smoothed in the direction of the local surface normal, thus reducing undesired contractions of the object model and oversmoothing at edge discontinuities.

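The dynamics of the system are described by a system of coupled ordinary differential equations

$$\frac{d\mathbf{x}_p}{dt} = \mathbf{v}_p, \tag{5}$$

$$\frac{d\mathbf{v}_p}{dt} = \mathbf{F}_{\mathrm{data}} + \mathbf{F}_{\mathrm{int}} + \tau - \gamma\mathbf{v}_p, \tag{6}$$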

where $\tau$ is a noise term and $\gamma\mathbf{v}_p$ a damping term with damping constant $\gamma$. These additional forces have been added to move the dynamical system towards a local minimum.
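The right-hand side of Eq. 6 can be sketched for model A as follows. This is only an illustration under stated assumptions: the interaction force acting on a mass is taken to be the sum of Eq. 2 over its neighborhood, the neighborhoods are stored as an integer index array, and the function name and array layout are hypothetical.

```python
import numpy as np

def acceleration_model_a(x, v, X, neighbors, k, c, gamma, tau):
    """Right-hand side of Eq. 6 for model A, where kappa(p, q) = c.

    x, v, X, tau : arrays of shape (N, 3) holding positions, velocities,
                   measured positions and the current noise term.
    neighbors    : integer array of shape (N, kn); row p lists N(p).
    """
    f_data = k * (X - x)                                     # Eq. 1
    # Eq. 2 summed over all neighbors q of each mass p (assumed form).
    f_int = c * (x[neighbors] - x[:, None, :]).sum(axis=1)
    return f_data + f_int + tau - gamma * v
```

For model B, the constant c would be replaced by a per-pair strength computed from the surface normals according to Eq. 3.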

Figure 1: Schematic of particle interaction models. (a) In model A, the interaction forces (thick arrow) are independent of the orientation of the local surface (thin line). (b) In model B, the interaction forces are stronger in the direction of the local surface normal.

2.1 Finding a Local Minimum

The system of differential equations is solved using a fourth-order Runge-Kutta technique with a step size of 0.1, starting from random initial conditions. Cooling is introduced through the damping force and the noise term $\tau$. Over time, we decrease the noise according to

$$\tau = p_r (n_i - t)/n_i, \tag{7}$$

where $n_i$ is the total number of iterations and $t$ is the current iteration number. The number $p_r$ is drawn from a Gaussian distribution with a standard deviation of 5 pixels.
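The integration and cooling schedule described in this section might be sketched as follows. This is a minimal illustration, not the original implementation. It assumes that the noise term is drawn once per iteration and held constant within a Runge-Kutta step, and that the random initial conditions are the data positions perturbed by unit Gaussian noise with zero initial velocity. The names `noise_term`, `rk4_step` and `relax` are hypothetical.

```python
import numpy as np

def noise_term(t, n_i, shape, rng, sigma=5.0):
    """Eq. 7: Gaussian noise whose amplitude decays linearly to zero."""
    return rng.normal(0.0, sigma, size=shape) * (n_i - t) / n_i

def rk4_step(x, v, accel, h=0.1):
    """One classical RK4 step for dx/dt = v, dv/dt = accel(x, v)."""
    k1x, k1v = v, accel(x, v)
    k2x, k2v = v + 0.5 * h * k1v, accel(x + 0.5 * h * k1x, v + 0.5 * h * k1v)
    k3x, k3v = v + 0.5 * h * k2v, accel(x + 0.5 * h * k2x, v + 0.5 * h * k2v)
    k4x, k4v = v + h * k3v, accel(x + h * k3x, v + h * k3v)
    x_new = x + h / 6.0 * (k1x + 2 * k2x + 2 * k3x + k4x)
    v_new = v + h / 6.0 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return x_new, v_new

def relax(X, accel, n_i=1000, h=0.1, sigma=5.0, seed=0):
    """Integrate Eqs. 5-6 for n_i iterations; accel(x, v, tau) is Eq. 6."""
    rng = np.random.default_rng(seed)
    x = X + rng.normal(0.0, 1.0, size=X.shape)  # assumed random start
    v = np.zeros_like(X)
    for t in range(n_i):
        tau = noise_term(t, n_i, X.shape, rng, sigma)
        x, v = rk4_step(x, v, lambda xx, vv: accel(xx, vv, tau), h)
    return x
```

Here `accel` stands for any function implementing the right-hand side of Eq. 6 for the chosen model. With the noise switched off (`sigma=0`) and only the data force active, the masses simply relax onto the measured positions.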

2.2 Surface Normals

Following Liang and Todhunter [Liang and Tod-hunter, 1990], we define the local covariance matrixat p as

C = ∑q∈Ns(p)

(xq−xm) · (xq−xm)T /n , (8)

where n is the number of points in the local neighbor-hood Ns(p) of p, and

xm = ∑q∈Ns(p)

xq/n (9)

is the mean position vector. The local plane whichminimizes, in the least squares sense, the orthogo-nal projections of all points in Ns(p) onto the planeis given by the two eigenvectors corresponding to thetwo largest eigenvalues.
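The normal estimation of Eqs. 8 and 9 can be illustrated with the following sketch. It uses a brute-force nearest-neighbor search, which is only practical for small point clouds, and it assumes that the neighborhood of a point includes the point itself. The function name is hypothetical, and the sign of each returned normal is arbitrary.

```python
import numpy as np

def estimate_normals(points, k_s=12):
    """Unit normal per point from the covariance of its k_s nearest neighbors."""
    points = np.asarray(points, dtype=float)
    # Pairwise squared distances, shape (N, N).
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    idx = np.argsort(d2, axis=1)[:, :k_s]      # includes the point itself
    normals = np.empty_like(points)
    for p, nbrs in enumerate(idx):
        q = points[nbrs]
        x_m = q.mean(axis=0)                   # Eq. 9
        d = q - x_m
        C = d.T @ d / len(q)                   # Eq. 8
        eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
        normals[p] = eigvecs[:, 0]             # smallest eigenvalue -> normal
    return normals
```

Because Eq. 4 folds the angle into the range up to π/2, the arbitrary sign of the estimated normal does not affect the interaction strength of model B.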

Figure 2: (a-b) Original data, seen from two different views. (c-d) Results of model A, seen from two different views. (e-f) Results of model B, seen from two different views.

3 IMAGE REGISTRATION

For our experiments a Swissranger SR3100 camera has been used. This new type of sensor is able to measure depth and intensity simultaneously for every pixel in the image. The camera is provided with its own illumination system, composed of a set of modulated infra-red LEDs. The sensor has a low pixel resolution of 176 × 144 but a high average frame rate of 25 fps. This high frame rate makes the Swissranger camera a suitable sensor for real-time applications.

Depth images of the object are taken for different views of the scene. Coarse global registration is achieved via coarse pose estimation using multiple correspondences of invariant geometric features extracted from the 3D point clouds [Chua and Jarvis, 1997]. After the coarse pose registration, an iterative closest point algorithm (ICP) is applied in order to achieve a fine pose registration [Besl and McKay, 1992].
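For readers unfamiliar with the fine registration step, the following is a minimal sketch of a generic point-to-point ICP loop. It is a textbook variant given for illustration only and is not claimed to match the implementation used for the experiments. It assumes a brute-force closest-point search and a fixed number of iterations, and the function names are hypothetical.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src_i + t ~ dst_i."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)        # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Correction factor that prevents a reflection in the solution.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, dst_c - R @ src_c

def icp(src, dst, n_iter=20):
    """Align point cloud src (N, 3) to dst (M, 3); returns the moved copy."""
    cur = src.copy()
    for _ in range(n_iter):
        # Closest point in dst for every point in cur (brute force).
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
        matched = dst[d2.argmin(axis=1)]
        R, t = best_rigid_transform(cur, matched)
        cur = cur @ R.T + t
    return cur
```

Since ICP only converges to a local optimum, it depends on a reasonable starting pose, which is the role of the coarse registration step.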

3.1 TOF Errors and Integration Time

The 3D-Range images captured with TOF sensors contain different systematic errors that falsify depth measurements [Oprisescu et al., 2007]. Systematic errors due to environmental electromagnetic signals such as sunlight and auto-lighting reflections have been minimised by restricting the environmental scene to a suitable non-reflecting uniform textured one. After recalibration by the manufacturer and focus distance tuning, short range image extraction has proved to be appropriate. As a consequence, the intrinsic errors due to oversaturation have been considered as negligible.

One of the TOF sensor parameters that makes it possible to obtain close-range pictures is the integration time. This parameter has to be very low in order to ensure good readings from the camera at a range of 30 cm. However, a low integration time adds depth variance to the pixels of the image. To deal with this problem, the average of 15 3D-Range images is computed. To further enhance the quality of the images, depth discontinuities due to jump edge surfaces are filtered using the method proposed by [Fuchs and May, 2008].

In the constrained environment of our set-up, the object is segmented from the background using a depth threshold.
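The frame averaging and depth-threshold segmentation described above can be sketched as follows. The array layout, the threshold value and the function name are assumptions made for illustration. If the per-frame depth noise is independent between frames, averaging 15 frames reduces its standard deviation by a factor of about √15 ≈ 3.9.

```python
import numpy as np

def average_and_segment(depth_frames, max_depth):
    """Average depth frames of a static scene and mask the foreground.

    depth_frames : array of shape (n_frames, H, W), e.g. (15, 144, 176).
    max_depth    : depth threshold; pixels closer than this are kept.
    """
    mean_depth = depth_frames.mean(axis=0)   # per-pixel temporal average
    mask = mean_depth < max_depth            # True for object pixels
    return mean_depth, mask
```

A possible usage with a hypothetical object at 0.3 m in front of a background at 1.0 m:

```python
frames = np.full((15, 144, 176), 1.0)
frames[:, 50:90, 60:120] = 0.3
mean_depth, mask = average_and_segment(frames, max_depth=0.5)
```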

3.2 Invariant Geometric Feature Extraction and Rigid Registration

Different invariant geometric feature extraction and rigid registration methods have been studied and used in the literature [Seeger and Laboureux, 2000]. We use a method proposed by Chua and Jarvis [Chua and Jarvis, 1997]. This method defines rotation and translation invariant features called “Point Signatures” for every point of a free-form surface, with the benefit that it does not need partial derivative computations. This approach has been chosen for its simplicity and speed.

4 RESULTS

For parameter choices $\gamma = 0.5$, $k_n = 30$, $k = 1$, $c = 20/k_n$, and $n_i = 1000$, we compute the results of model A and model B for a synthetic example. Model B is then applied to the Swissranger sensor data. The data was scaled by a factor 0.04 before application of the model, and returned to its original scale afterwards.

Each acquired data point is assigned a mass point in the model. We find the $k_n$ nearest neighbors defining the neighborhood $N(p)$ of each mass $p$. The surface normals are computed by finding $k_s = 12$ nearest neighbors. The neighbors are not altered during the simulation of the dynamical system.
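The set-up described above, with fixed neighborhoods and the stated parameter values, might look as follows in a minimal sketch. The brute-force search, the exclusion of each point from its own neighborhood and all names are assumptions made for illustration.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of every point (self excluded)."""
    points = np.asarray(points, dtype=float)
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)             # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]

# Parameter values as stated in the text.
k_n, k_s = 30, 12
params = dict(gamma=0.5, k=1.0, c=20.0 / k_n, n_i=1000)
scale = 0.04

def prepare(points):
    """Scale the data and compute the neighborhoods once, before simulation."""
    scaled = scale * np.asarray(points, dtype=float)
    return scaled, knn_indices(scaled, min(k_n, len(scaled) - 1))
```

Whether a mass counts as its own neighbor makes no difference to the force in Eq. 2, because the difference vector vanishes in that case. After the simulation, the relaxed positions would be divided by `scale` to return to the original units.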

Figure 3: Modeling process of Swissranger data taken from a mug. (a) Raw data with superimposed unmerged views. (b) Coarse registration. (c) Fine registration. (d) Final result using model B.

As a synthetic test example, we choose an object in the shape of an open box, which is then acquired from two different view points, separated by an angle $\pi$, via an orthographic projection. We further assume that the data acquisition process is noisy. In the data space, the object is represented by a noisy point cloud. Two different views of this point cloud are shown in Fig. 2(a-b). For illustration purposes, we mark all data points whose surface normal is closer to a vector pointing toward the upper left corner in red, and all other data points in blue. We simulate the dynamical system of model A for this synthetic example. Two views of the model data, corresponding to the views shown for the raw data, are depicted in Fig. 2(c-d). The model data is much smoother than the original data. However, since the interaction forces do not take into account the topology of the object, oversmoothing is observed at edge discontinuities. Further, the object model is smaller than the original object. In Fig. 2(e-f), the results of model B are depicted. Unlike in model A, the modeled object has the same size as the original object, and edge discontinuities have been preserved during the smoothing process.

We apply model B to the depth images of a ceramic mug acquired with the Swissranger sensor. The different stages of the modeling process, i.e. raw data acquisition, coarse and fine registration, and results of the subsequent application of the spring-mass model, are shown in Fig. 3(a-d), respectively. At the end of the process, the mug model has a clear shape and most of the outliers have been removed, except for the handle. In Fig. 4, the color-coded orientations of local surface patches for the fine registration and the final result after applying model B are depicted. The surfaces of the mug after applying model B are smoother than the surfaces at the fine registration stage, thus providing a good basis for subsequent applications.

Figure 4: Color-coded orientation of local surface patches for the mug result. (a) Fine registration. (b) Final result of model B.

5 CONCLUSIONS

5.1 Summary

Within the larger context of a robot manipulation task, we proposed a 3D model for the reconstruction of 3D objects from Swissranger sensor data. Depth images are captured with a Swissranger sensor. After taking depth images from different views, the images are merged via a coarse registration method [Chua and Jarvis, 1997]. The shape of the 3D point cloud is enhanced through application of an iterative closest point algorithm [Besl and McKay, 1992]. The resulting data is of low resolution and afflicted with large uncertainties. Modeling the 3D point cloud as a system of dynamic masses interacting via spring-like elastic forces reduces these deficiencies. Since the interaction between masses is dependent on the orientation of local surfaces fitted to the mass points, noise and outliers are removed without causing oversmoothing at edge discontinuities.

5.2 Future Work

Our long-term goal is to model 3D objects during the execution of a robot task. The proposed model constitutes a preliminary step before fitting higher-level surface descriptors which may then be used for view planning or action selection. The spring-mass model further provides a framework for modeling deformable objects. The outcome of actions can be predicted by adding an external force to the dynamical system of the object, e.g. representing a robot gripper.

ACKNOWLEDGEMENTS

This work has received support from the BMBF funded BCCN Gottingen, the EU Project PACO-PLUS under contract FP6-2004-IST-4-027657, and the Generalitat de Catalunya through the Robotics group. G. Alenya was supported by the CSIC under a Jae-Doc Fellowship.

REFERENCES

Amenta, N., Bern, M., and Kamvysselis, M. (1998). A new voronoi-based surface reconstruction algorithm. In SIGGRAPH ’98, pages 415–421, New York, NY, USA. ACM.

Besl, P. J. and McKay, N. D. (1992). A method for registration of 3-d shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2):239–256.

Carr, J., Beatson, R., Cherrie, J., Mitchell, T., Fright, W., McCallum, B., and Evans, T. (2001). Reconstruction and representation of 3d objects with radial basis functions. In Fiume, E., editor, Proceedings of SIGGRAPH 2001, pages 67–76. ACM Press/ACM SIGGRAPH.

Chua, C. S. and Jarvis, R. (1997). Point signatures: a new representation for 3d object recognition. International Journal of Computer Vision, 25(1):63–85.

Cuen, S., Andrade, J., and Torras, C. (2008). Action selection for robotic manipulation of deformable objects. In Proc. ESF-JSPS Conference on Experimental Cognitive Robotics, Kanagawa.

Curless, B. and Levoy, M. (1996). A volumetric method for building complex models from range images. In Proceedings of SIGGRAPH’96.

Fuchs, S. and May, S. (2008). 3d pose estimation and mapping with time-of-flight cameras. In Workshop on 3D-Mapping, IEEE International Conference on Intelligent Robots and Systems (IROS 2008).

Hoppe, H., DeRose, T., Duchamp, T., McDonald, J., and Stuetzle, W. (1992). Surface reconstruction from unorganized points. In SIGGRAPH ’92, pages 71–78, New York, NY, USA. ACM.

Liang, P. and Todhunter, J. S. (1990). Representation and recognition of surface shapes in range images. Computer Vision, Graphics and Image Processing, 52(10):78–109.

Oprisescu, S., Falie, D., Ciuc, M., and Buzuloiu, V. (2007). Measurements with tof cameras and their necessary corrections. In Proc. ISSCS, pages 13–14.

Seeger, S. and Laboureux, X. (2000). Feature extraction and registration: An overview. In Girod, B., Greiner, G., and Niemann, H., editors, Principles of 3D Image Analysis and Synthesis, pages 153–166, Boston-Dordrecht-London. Kluwer Academic Publishers.

Szeliski, R., Tonnesen, D., and Terzopoulos, D. (1992). Modeling surfaces of arbitrary topology with dynamic particles. CVPR’93, pages 82–87.

Tang, C. and Medioni, G. (2002). Curvature-augmented tensor voting for shape inference from noisy 3d data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(6):858–864.

Zhao, H., Osher, S., Merriman, B., and Kang, M. (2000). Implicit and non-parametric shape reconstruction from unorganized data using a variational level set method. In Proceedings of CVIU.
