An Authoring Solution for a Façade-Based AR Platform: Infrastructure, Annotation and...

DEPARTMENT OF COMPUTER SCIENCE NATIONAL UNIVERSITY OF IRELAND MAYNOOTH An Authoring Solution for a Façade-Based AR Platform: Infrastructure, Annotation and Visualization Guillaume GALES Eric MCCLEAN John MCDONALD

description

Slides from our presentation at ISMAR 2012 Workshop Authoring Solutions for Augmented Reality

Transcript of An Authoring Solution for a Façade-Based AR Platform: Infrastructure, Annotation and...

Page 1: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

DEPARTMENT OF COMPUTER SCIENCE
NATIONAL UNIVERSITY OF IRELAND MAYNOOTH

An Authoring Solution for a Façade-Based AR Platform: Infrastructure, Annotation and Visualization

Guillaume GALES
Eric MCCLEAN
John MCDONALD

Page 2: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Motivation


Input Image Augmented Image

Page 3: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Overall System


Mobile client side

Server side

Authoring client

Façade extraction

Façade Matching

Database

Widget Retrieval

Rendering

Page 4: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Rendering


Page 5: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Outline

Façade Extraction
Authoring solution
• Façade database
• 3D annotation
Visualization
Conclusion, perspectives and discussion

Page 6: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Façade Extraction

Page 7: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Façade Extraction

Key to the authoring solution and visualization
• Façades are the frames of reference for 3D content
• Augment a view with 3D content in an urban environment from a single image

Input: image of a façade
Output: homography between the façade and its image
Advantages:
• Robust matching (invariant to rotation and perspective changes)
• Façade normalization (used to build a representation of the environment)
• Camera pose estimation (used by the visualization system)
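The input/output relationship on this slide can be illustrated with a short NumPy sketch. It assumes the extraction step returns a 3x3 homography that maps façade-plane coordinates to image pixels; the matrix values and the helper name `apply_homography` are hypothetical and are not taken from the presentation.

```python
import numpy as np

def apply_homography(H, points):
    """Map an (N, 2) array of 2D points through the 3x3 homography H."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])  # (N, 3)
    mapped = homogeneous @ H.T
    return mapped[:, :2] / mapped[:, 2:3]  # divide by the third coordinate

# Hypothetical homography taking facade-plane coordinates to image pixels.
H_facade_to_image = np.array([
    [0.9, 0.1, 40.0],
    [-0.05, 1.1, 25.0],
    [1e-4, 2e-4, 1.0],
])

facade_corners = np.array([[0.0, 0.0], [200.0, 0.0], [200.0, 100.0], [0.0, 100.0]])
image_corners = apply_homography(H_facade_to_image, facade_corners)

# Normalization goes the other way: image pixels back to the facade frame.
recovered = apply_homography(np.linalg.inv(H_facade_to_image), image_corners)
```

Mapping image points through the inverse homography is what gives the viewpoint-normalized façade; the same matrix is later reused for matching and for pose estimation.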


Page 8: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Façade Extraction


Page 9: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Authoring Solution

Page 10: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Façade Database

For each street, take pictures of façades
Automatic façade extraction and matching
• Geometrical constraint makes matching robust
• Invariance to rotation and perspective changes

Page 11: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Façade Database


Page 12: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Façade Database

Candidate selection
• GPS coordinates
• Bag-of-words description for selecting candidates

Similarity constraint
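A minimal sketch of the candidate selection step is given here, under stated assumptions. The slide only says that GPS coordinates and a bag-of-words description are used; the haversine radius filter, the cosine similarity between histograms, the dictionary layout of a strip and every name in the code are illustrative choices, not the authors' implementation.

```python
import math
import numpy as np

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    earth_radius_m = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))

def select_candidates(query_gps, query_bow, strips, radius_m=100.0, top_k=3):
    """Keep strips near the query position, ranked by bag-of-words similarity."""
    q = np.asarray(query_bow, dtype=float)
    q = q / (np.linalg.norm(q) or 1.0)
    scored = []
    for strip in strips:
        if haversine_m(*query_gps, *strip["gps"]) > radius_m:
            continue  # too far away to be the photographed facade
        h = np.asarray(strip["bow"], dtype=float)
        h = h / (np.linalg.norm(h) or 1.0)
        scored.append((float(q @ h), strip["id"]))  # cosine similarity (assumed)
    scored.sort(reverse=True)
    return [strip_id for _, strip_id in scored[:top_k]]

# Hypothetical database entries: two nearby strips and one far away.
strips = [
    {"id": "a", "gps": (53.3815, -6.5920), "bow": [2, 0, 1, 0]},
    {"id": "b", "gps": (53.3810, -6.5915), "bow": [0, 2, 0, 1]},
    {"id": "c", "gps": (53.3498, -6.2603), "bow": [3, 0, 1, 0]},
]
candidates = select_candidates((53.3813, -6.5918), [3, 0, 1, 0], strips)
```

Only the strips that survive this cheap preselection need to go through descriptor matching and geometric verification.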

Figure 2: Database infrastructure for computing planar facade mosaics. The individual facades are stitched together into planes to build a frame of reference for authoring.

where $x_1$ are the homogeneous coordinates of a pixel within the first planelet and $x_2$ are the homogeneous coordinates of a pixel within the second planelet.

Here, we want to match a planelet against a plane. We have already computed the feature points for our planelet. Let $X_{\text{planelet}}$ be the matrix of their homogeneous coordinates in the coordinate system of the planelet. We have also selected subsections of the plane, the strips, where the planelet is more likely to match. For each strip, we start by retrieving the matrix $X_{\text{plane}}$ of the homogeneous coordinates of its feature points in the coordinate system of the plane. The origin of a plane is given by the origin of the first planelet used to build this plane. We have:

$X_{\text{plane}} = T X_{\text{planelet}} \quad (2)$

where we need to estimate the parameters of $T$.

For each feature point from the planelet, we measure the Hamming distance between its ORB descriptor and the descriptors of the feature points from the strip. If this distance is below a threshold, we add this match to the set of putative matches.
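The thresholded Hamming test can be sketched as follows. This assumes the ORB descriptors are stored in the usual packed form (32 bytes, i.e. 256 bits, per feature point); the threshold value and the function names are hypothetical.

```python
import numpy as np

def hamming_distances(desc_a, desc_b):
    """Pairwise Hamming distances between packed binary descriptors.

    desc_a: (Na, 32) uint8, desc_b: (Nb, 32) uint8 -> (Na, Nb) distances.
    """
    xor = desc_a[:, None, :] ^ desc_b[None, :, :]   # differing bits, still packed
    return np.unpackbits(xor, axis=2).sum(axis=2)   # count the set bits

def putative_matches(desc_planelet, desc_strip, max_distance=64):
    """Return (planelet_index, strip_index) pairs below the distance threshold."""
    distances = hamming_distances(desc_planelet, desc_strip)
    rows, cols = np.nonzero(distances < max_distance)
    return list(zip(rows.tolist(), cols.tolist()))

# Hand-built descriptors so the distances are easy to verify.
desc_strip = np.stack([
    np.full(32, 0x00, dtype=np.uint8),
    np.full(32, 0xFF, dtype=np.uint8),
    np.full(32, 0x0F, dtype=np.uint8),
])
first = np.zeros(32, dtype=np.uint8)
first[0] = 0x01                                   # one bit away from desc_strip[0]
desc_planelet = np.stack([first, np.full(32, 0xF0, dtype=np.uint8)])

distances = hamming_distances(desc_planelet, desc_strip)
matches = putative_matches(desc_planelet, desc_strip)
```

The resulting set still contains wrong pairings; it is only the input to the geometric verification step.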

Next, we apply the RANSAC algorithm to get the largest consensus set of feature point matches that satisfies the geometric constraint (2). If the number of matches from this best consensus set is greater than some threshold, we assume that our planelet matches the current plane with a transformation $T$.
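The following is a sketch of this verification step, not the authors' implementation. The excerpt does not give the parametrization of $T$; the code assumes a 2D similarity (scale, rotation, translation), which is consistent with the "Similarity constraint" label on the slide. The iteration count, tolerance and consensus threshold are arbitrary, and all function names are hypothetical.

```python
import numpy as np

def similarity_from_two(src, dst):
    """Similarity transform (3x3) mapping two source points onto two targets."""
    p = src[:, 0] + 1j * src[:, 1]
    q = dst[:, 0] + 1j * dst[:, 1]
    if p[0] == p[1]:
        return None                      # degenerate sample
    a = (q[1] - q[0]) / (p[1] - p[0])    # complex scale-and-rotation
    b = q[0] - a * p[0]                  # complex translation
    return np.array([[a.real, -a.imag, b.real],
                     [a.imag, a.real, b.imag],
                     [0.0, 0.0, 1.0]])

def ransac_similarity(src, dst, n_iters=200, inlier_tol=2.0, min_consensus=10, seed=0):
    """Return (T, inlier_mask); T is None if the best consensus set is too small."""
    rng = np.random.default_rng(seed)
    src_h = np.hstack([src, np.ones((len(src), 1))])
    best_T, best_inliers = None, np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(src), size=2, replace=False)
        T = similarity_from_two(src[[i, j]], dst[[i, j]])
        if T is None:
            continue
        residuals = np.linalg.norm((src_h @ T.T)[:, :2] - dst, axis=1)
        inliers = residuals < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_T, best_inliers = T, inliers
    if best_inliers.sum() <= min_consensus:
        return None, best_inliers        # not enough support: no match
    return best_T, best_inliers

# Synthetic check: 30 matches follow a known similarity, 10 are shifted outliers.
rng = np.random.default_rng(1)
angle = np.deg2rad(30.0)
T_true = np.array([[1.5 * np.cos(angle), -1.5 * np.sin(angle), 100.0],
                   [1.5 * np.sin(angle), 1.5 * np.cos(angle), 20.0],
                   [0.0, 0.0, 1.0]])
src = rng.uniform(0, 500, size=(40, 2))
dst = (np.hstack([src, np.ones((40, 1))]) @ T_true.T)[:, :2]
dst[30:] += np.array([50.0, -50.0])
T_est, inliers = ransac_similarity(src, dst)
```

A production system would normally refit $T$ on the whole consensus set (least squares) after the loop; that refinement is omitted to keep the sketch short.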

If a match is found for different strips from the same plane, we only keep the one found with the largest consensus set. When a match is validated, the planelet is saved to the database and marked as belonging to the plane with the geometric relationship $T$. Following this, the feature points of the planelet are transformed into the frame of reference of the plane (by transforming their coordinates by $T$) and added to the database.

Figure 3: Merging. When a planelet matches two planes, it is first matched to the best one, i.e. the one giving the largest consensus set satisfying the geometric constraint (2). Then, all the planelets from the second best plane are merged into the first one. (Figure labels: $T_1$, $T_2$, $T_1 T_2^{-1}$.)

Finally, the strips for that plane are updated (or created) by updating (or calculating) the bag-of-words, taking into account the newly added feature points.

3.3 Merging

If a planelet matches more than one plane, and if the number of feature point matches satisfying (2) is greater than a threshold for each of them, we consider that the planelet matches both planes. In this case, we match the planelet with the best result as described in the previous section and combine the whole second plane with the first, as shown in Figure 3. The planelets defining the second plane are marked as belonging to the first plane, where the relationship is calculated as follows:

$T_{\text{plane}_1} = T_1 T_2^{-1} T_{\text{plane}_2} \quad (3)$

where $T_{\text{plane}_2}$ is the geometric relationship of a given planelet to the second best plane. The coordinates of their feature points and strips are also updated.
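Equation (3) is a plain composition of transforms, which the sketch below spells out with 3x3 homogeneous matrices. It reads $T_1$ and $T_2$ as the transforms of the shared planelet into the first and second plane respectively, as suggested by Figure 3; the `similarity` helper and the numeric values are hypothetical.

```python
import numpy as np

def similarity(scale, angle_rad, tx, ty):
    """Build a 3x3 homogeneous 2D similarity transform."""
    c, s = scale * np.cos(angle_rad), scale * np.sin(angle_rad)
    return np.array([[c, -s, tx],
                     [s, c, ty],
                     [0.0, 0.0, 1.0]])

def merge_into_first_plane(T1, T2, T_plane2):
    """Re-express a planelet of the second plane in the first plane, as in (3)."""
    return T1 @ np.linalg.inv(T2) @ T_plane2

# T1, T2: transforms of the shared planelet into plane 1 and plane 2.
T1 = similarity(1.2, 0.10, 300.0, 15.0)
T2 = similarity(0.8, -0.05, 40.0, -5.0)

# Another planelet that so far only belonged to plane 2.
T_other_plane2 = similarity(1.0, 0.02, 500.0, 0.0)
T_other_plane1 = merge_into_first_plane(T1, T2, T_other_plane2)

# Sanity check: the shared planelet itself must end up with transform T1.
T_shared_plane1 = merge_into_first_plane(T1, T2, T2)
```

The same product is applied to every planelet of the second plane, and to the coordinates of its feature points and strips.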

4 DISCUSSION ON POSSIBILITIES AND CHALLENGES FOR AUTHORING SOLUTIONS

The infrastructure presented in this paper provides an intuitive and flexible platform for augmented reality applications. In particular, we distinguish between two levels of authoring:

• Base level authoring (populating the image database) – As detailed in Section 3, the creation of the image database is automatic thanks to the facade extraction algorithm. However, we assume that the facades from one street can be approximated by a planar surface and that they exhibit discriminant features. This planar representation has a number of advantages. For one street, only the descriptors of the feature points from the plane image need to be stored in the database, which leads to a “light” database. Furthermore, once a street has been created, the planar representation can be reused and shared by many different applications that can easily extend it to a 3D space to add augmented content to that street.

• Content authoring (adding 3D content) – Different applications can add their own augmented content on top of the base layer given by the planar representation of the facades of a street. For example, we can imagine an application showing the opening hours or the promotions of a store, virtual exhibits, social tagging, etc. One of the advantages of using facades is that we can also orient 3D objects within the user view and are therefore not restricted to the 2D facades. For example, Figure 5 shows how the system can be used in a 3D navigation context. As detailed in [9], the facade based model also provides an easy and intuitive interface for manual

Page 13: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

3D Annotation

Extension of façade to 3D: frame of reference

An Image Database Infrastructure for Authoring, Storing and Retrieval in Augmented Reality Mobile Applications

Guillaume GALES, Eric MCCLEAN, John MCDONALD
National University of Ireland Maynooth, Ireland

ABSTRACT

Content authoring is an important stage in the workflow of creating rich augmented reality applications. In this paper we describe a facade-based database infrastructure for authoring and storing 3D content for use in urban environments. It provides frames of reference for the environment as well as a mechanism to match new images with the facades and thus retrieve the associated 3D content. The infrastructure is flexible in that we can add different 3D “layers” of content on top of the facades, which opens many possibilities for augmented reality applications in urban environments. Furthermore, the system provides a representation suitable for both manual and automatic content authoring.

Keywords: Augmented Reality, Infrastructure, Authoring, Facade-based Database, Content Storing and Retrieving.

1 INTRODUCTION

Mobile augmented reality applications provide rich and useful information to their users about their surrounding environment. To create usable augmented reality applications, an efficient infrastructure is required. Such infrastructures involve:

1. building a map of the environment;

2. adding content;

3. retrieving content.

In this paper we propose an infrastructure that makes authoring easy, intuitive and flexible. Our goal is to create a platform for mobile applications to be used in an urban environment. Users take an image of a building facade with their mobile phone, then 3D widgets providing information about the building viewed in the scene are displayed and correctly oriented relative to the scene. The authoring of such information is made easy by the infrastructure design, which provides an image of the facade in a viewpoint normalized space. This space is used as a frame of reference to store 3D content.

We use images of these facades to build a map of the environment, as well as a frame of reference to link 3D content. Our system is based on the fact that, in an urban environment, many of the facades of a building can be approximated by planes.

We start by creating an image database where the images of the facades of the same street are stitched together to provide a 2D frame of reference for that particular street. This frame of reference is then extended to 3D by using the normal of the plane as the third dimension. Next, given any particular location within that frame of reference, we can easily add 3D widgets providing information about that location. Finally, when a user submits an image of a facade from her/his mobile, this image is matched against the database to recognise the facade. Since there is a homography between the image of the input facade and the image stored in the database, we can easily retrieve the coordinates of the augmented content associated with the user view. A video showing these steps is available at http://www.cs.nuim.ie/research/vision/data/ismar2012/

Figure 1: Overall system. The infrastructure of the image database provides a 3D frame of reference to position and to store 3D content.

The infrastructure described in this paper has many advantages:

• Easy and intuitive authoring – It makes authoring easy and intuitive by providing each building facade as a reference frame for 3D content. For example, we can position a 3D cup of coffee in front of the doors of a building to indicate there is a coffee shop inside this building.

• Flexible and expandable – The process of adding new streets to the database is automatic. The system only requires the images of the facades from new streets. Furthermore, once a street has been added, it can be reused for many different applications by using different layers of 3D content.

• Scalable content model – When the database is queried, a quick preselection of potential facade matches based on GPS coordinates and bag-of-words makes our solution fast and suitable for a large amount of data. Furthermore, we do not need

Page 14: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

3D Annotation

Desktop application
• Easy and intuitive interface for non-expert users
• Predefined list of 3D models

Page 15: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

3D Annotation

Mobile Application (ongoing work)

Manual Accurate Geolocalisation
Long Tap to add widgets

Page 16: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Visualization

Page 17: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Data Retrieval

Façade Extraction
Façade Matching
Widget Retrieval


Page 18: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Camera Pose Estimation

Intrinsics are known
Extrinsics are given by the façade extraction algorithm (homography between the plane and its image, decomposed into rotation and translation)

[Figure labels: Camera, Façade, $H^{-1}$]

$H = K^{-1} [R \; t]$
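One common way to carry out this decomposition is sketched below. It is an illustration under explicit assumptions rather than the presenters' code: it uses the widely used convention in which the plane-to-image homography factors as $H \propto K [r_1 \; r_2 \; t]$ for a façade lying in the plane $Z = 0$, which may differ from the convention behind the formula on the slide. The intrinsics, the pose values and all function names are hypothetical.

```python
import numpy as np

def pose_from_homography(H, K):
    """Recover (R, t) from a plane-to-image homography, assuming H ~ K [r1 r2 t]."""
    A = np.linalg.inv(K) @ H
    scale = 1.0 / np.linalg.norm(A[:, 0])   # r1 must have unit length
    if A[2, 2] < 0:
        scale = -scale                      # keep the facade in front of the camera
    r1, r2, t = scale * A[:, 0], scale * A[:, 1], scale * A[:, 2]
    r3 = np.cross(r1, r2)                   # third axis: the plane normal
    U, _, Vt = np.linalg.svd(np.column_stack([r1, r2, r3]))
    return U @ Vt, t                        # nearest rotation matrix, translation

def project(K, R, t, X):
    """Project a 3D point given in the facade frame into the image."""
    x = K @ (R @ np.asarray(X, dtype=float) + t)
    return x[:2] / x[2]

def rot_x(a):
    return np.array([[1, 0, 0], [0, np.cos(a), -np.sin(a)], [0, np.sin(a), np.cos(a)]])

def rot_y(a):
    return np.array([[np.cos(a), 0, np.sin(a)], [0, 1, 0], [-np.sin(a), 0, np.cos(a)]])

# Synthetic ground truth (hypothetical intrinsics and pose).
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R_true = rot_y(np.deg2rad(20.0)) @ rot_x(np.deg2rad(10.0))
t_true = np.array([0.1, -0.2, 5.0])
H = -3.0 * K @ np.column_stack([R_true[:, 0], R_true[:, 1], t_true])  # arbitrary scale

R_est, t_est = pose_from_homography(H, K)

# A widget placed 0.5 units in front of the facade, along its normal.
widget_pixel = project(K, R_est, t_est, [1.0, 2.0, 0.5])
```

With a pose available, content stored off the plane (a non-zero third coordinate along the façade normal) can be rendered with the correct orientation, which a homography alone cannot do.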

Page 19: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Results


Page 20: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Conclusion, Perspectives and

Discussion

Page 21: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Conclusion, Perspectives & Discussion

Easy and intuitive authoring solution for AR applications based on façade extraction

Extension to real time computation on mobile
• Use of GPU and accelerometer

Extension to automatic dynamic authoring
• Link with OpenStreetMap through GPS coordinates

Collaborative authoring client (and database updates)
• HTML5
• Mobile version

Page 22: An Authoring Solution for  a Façade-Based AR Platform: Infrastructure, Annotation  and Visualization

Acknowledgment

Research presented in this paper was funded by a Strategic Research Cluster grant (07/SRC/I1169) by Science Foundation Ireland under the National Development Plan. The authors gratefully acknowledge this support.

Thank you for your attention