Light Field based 360º Panoramas - ULisboa
Light Field based 360o Panoramas
André Alexandre Rodrigues Oliveira
Thesis to obtain the Master of Science Degree in
Electrical and Computer Engineering
Supervisors: Prof. Fernando Manuel Bernardo Pereira
Prof. João Miguel Duarte Ascenso
Prof. Catarina Isabel Carvalheiro Brites
Examination Committee
Chairperson: Prof. José Eduardo Charters Ribeiro da Cunha Sanguino
Supervisor: Prof. João Miguel Duarte Ascenso
Members of the Committee: Prof. Pedro António Amado Assunção
November 2016
Resumo
360º panoramas offer users more immersive experiences since they allow a freer and more intuitive
navigation in the 3D visual world, notably in any desired direction. Recently, this type of content has
been increasingly used in several application domains, offering users the opportunity to understand
the 3D world more deeply, without a priori restrictions or hidden viewing angles. 360º panoramas
stimulate user interaction, leading to a growing number of viewers and to an increase in total
consumption time. The creation of 360º panoramas is usually achieved through a stitching process
that combines several images with some overlap between their fields of view.
Generally, the sensors present in conventional cameras only capture the total sum of the light
striking a given position of the lens. However, this is a limited representation of the real scene's light
field, which can be more faithfully expressed through a function characterizing the amount of light
traveling in every direction through every point in space, the so-called plenoptic function. Recently,
new sensors and cameras, called plenoptic or light field cameras, have emerged with the capacity to
capture higher dimensional representations of the visual information of the real world, for example by
using micro-lens arrays in the optical path to capture the light striking each spatial position (x,y) from
any angular direction (θ, Φ). This representation offers richer images and hence additional
functionalities such as, for example, the possibility of refocusing on any part of the image after
capture, slightly changing the user's viewpoint, relighting and recoloring, or automatically selecting
objects based on depth information, among others. With these new light field cameras, images
become (3D) volumes, changing the conventional image representation paradigm based on (2D) flat
surfaces.
Naturally, the creation of 360º panoramas using light field images instead of conventional images
is an exciting path to pursue, considering the potential additional functionalities and the constant need
to offer the user a more intense and immersive experience.
The main objective of this Master's Thesis is the development of a light field based panoramic
image creation solution, able to exploit the potential of the emerging light field cameras for the
production and consumption of 360º panoramas. This work will combine the creation, stitching,
processing, manipulation, interaction and adequate visualization of light field based 360º panoramas
in a user-friendly way. To reach the intended objective, this dissertation starts by reviewing, analyzing
and discussing the most important and representative conventional 360º panorama creation solutions
in the literature. Although research on light field based 360º panorama creation is still at an early
stage, some solutions have already been proposed in the literature. Accordingly, this dissertation also
analyzes and reviews two representative light field based 360º panorama creation solutions. Next, the
proposed solution for light field based 360º panorama creation is presented. Finally, the performance
of this solution is assessed through the presentation and analysis of some panoramas created with
the proposed solution.
Keywords: digital photography, 360º panorama creation, stitching, plenoptic function, light fields
Abstract
360º panoramas bring more intense and immersive experiences to users since they support free and
intuitive navigation in the 3D visual world, notably in any desired direction. Recently, this type of content
has been increasingly used in many application domains, providing users the chance to understand the
3D world in depth, without a priori constraints or hidden viewing angles. 360º panoramas stimulate user
interaction, leading to an increase in viewership numbers and total media consumption time. The
creation of 360º panoramas is normally achieved by means of a stitching procedure which combines
multiple images with overlapping fields of view.
Generally, the sensors present in conventional cameras merely capture the total sum of the light
impinging on a given position of the lens. This is clearly a limited representation of the real scene light
field, which can be more faithfully expressed through a well-known function characterizing the amount
of light traveling in every direction through every point in space, the so-called plenoptic function.
Recently, new sensors and cameras, the so-called light field or plenoptic cameras, have emerged with
the capacity to capture higher dimensional representations of the world's visual information, for example
using a micro-lens (i.e. lenslet) array in the optical path, which is able to capture the light for each spatial
position (x,y) coming from any angular direction (θ, Φ). This richer imaging representation offers
additional functionalities such as refocusing on any part of the image after capture, slightly changing
the user viewpoint, relighting and recoloring, or selecting objects automatically based on depth
information, among others. With these new light field cameras, images become (3D) volumes, changing
the conventional imaging representation model that uses (2D) flat planes.
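The dimensional reduction described above can be written compactly. As a sketch, assuming the parametrizations commonly used in the light field literature (the 7D plenoptic function and its two-plane 4D reduction, matching the 7D and 4D acronyms listed later in this document):

```latex
% Full plenoptic function: radiance observed at position (x, y, z), in direction
% (\theta, \phi), at wavelength \lambda and time t -- a 7D function.
P(x, y, z, \theta, \phi, \lambda, t)

% For a static, monochromatic scene observed from outside its convex hull,
% radiance is constant along a ray, so the function reduces to the 4D light
% field, often parametrized by a ray's intersections (u, v) and (s, t) with
% two parallel reference planes:
L(u, v, s, t)
```

It is this 4D function that lenslet-based cameras such as the Lytro sample, and that a light field panorama must stitch consistently across all its dimensions.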
Naturally, the creation of 360º panoramas using light field images instead of conventional images is an
exciting path to pursue, considering the potential additional functionalities and the constant need to
offer the user a more intense and immersive experience.
The main objective of this Master of Science Thesis is the development of a light field based 360º
panorama creation solution able to exploit the potential of the emerging light field cameras for 360º
panorama production and consumption. This work will combine the creation, stitching, processing,
manipulation, interaction and adequate visualization of light field based 360º panoramas in a user-
friendly way. To reach the intended objective, this dissertation starts with the review, analysis and
discussion of the most important and representative conventional 360º panorama creation solutions in
the literature. While the research area of light field based 360º panorama creation is still in its infancy,
some first solutions already exist in the literature. Thus, this dissertation also reviews and analyzes two
representative light field based 360º panorama creation solutions. After that, the solution proposed for
light field based 360º panorama creation is presented. Finally, the performance of this solution will be
discussed through the presentation and analysis of some representative light field panoramas created
with the proposed solution.
Keywords: digital photography, 360º panorama creation, stitching, plenoptic function, light field
Table of Contents
Resumo ................................................................................................................................................... iii
Abstract.....................................................................................................................................................v
List of Figures ...........................................................................................................................................x
List of Tables ......................................................................................................................................... xiii
Acronyms ............................................................................................................................................... xiv
1. Introduction ...................................................................................................................................... 1
1.1. Context and Motivation ................................................................................................................ 1
1.2. Objectives and Structure ............................................................................................................. 2
2. State-of-the-Art on Conventional 360º Panoramas Creation .......................................................... 4
2.1. Proposing an Architecture for Conventional 360º Panorama Creation ....................................... 4
2.2. Types of 360º Panoramas ........................................................................................................... 8
2.3. Reviewing the Main Conventional 360º Panorama Creation Solutions .................................... 12
2.3.1. Solution 1: Panoramic Image Creation Combining Patch-based Global and Local Alignment
Techniques ............................................................................................................................................ 12
A. Objectives and Technical Approach .............................................................................................. 12
B. Architecture and Main Tools .......................................................................................................... 13
C. Performance and Limitations ......................................................................................................... 14
2.3.2. Solution 2: Panoramic Image Creation using Invariant Feature based Alignment and Multi-
Band Blending ....................................................................................................................................... 16
A. Objectives and Technical Approach .............................................................................................. 16
B. Architecture and Main Tools .......................................................................................................... 17
C. Performance and Limitations ......................................................................................................... 19
2.3.3. Solution 3: Panoramic Image Creation using Invariant Feature based Alignment and
Seamless Image Stitching ..................................................................................................................... 21
A. Objectives and Technical Approach .............................................................................................. 21
B. Architecture and Main Tools .......................................................................................................... 21
C. Performance and Limitations ......................................................................................................... 23
2.3.4. Solution 4: Panoramic Image Creation using a Locally Adaptive Alignment Technique based
on Invariant Features ............................................................................................................................. 25
A. Objectives and Technical Approach .............................................................................................. 25
B. Architecture and Main Tools .......................................................................................................... 25
C. Performance and Limitations ......................................................................................................... 27
3. Light Field based 360º Panoramas Creation ................................................................................. 28
3.1. Basic Concepts .......................................................................................................................... 28
3.2. Reviewing the Main Light Field based 360º Panorama Creation Solutions ............................... 31
3.2.1. Solution 1: Light Field based 360º Panorama Creation using Invariant Features based
Alignment ............................................................................................................................................... 31
A. Objectives and Technical Approach .............................................................................................. 31
B. Architecture and Main Tools .......................................................................................................... 32
C. Performance and Limitations ......................................................................................................... 33
3.2.2. Solution 2: Light Field based 360º Panorama Creation using Regular Ray Sampling ......... 34
A. Objectives and Technical Approach .............................................................................................. 34
B. Architecture and Main Tools .......................................................................................................... 35
C. Performance and Limitations ......................................................................................................... 36
4. Light Field based 360º Panorama Creation: Architecture and Tools ............................................ 39
4.1. Global System Architecture and Walkthrough ........................................................................... 39
4.2. Light Field Toolbox Processing Description .............................................................................. 42
4.3. Main Tools: Detailed Description ............................................................................................... 46
4.3.1. Central Perspective Images Registration Processing Architecture ....................................... 46
4.3.2. Composition Processing Architecture ................................................................................... 49
5. Light Field based 360º Panoramas Creation: Assessment ........................................................... 53
5.1. Test Scenarios and Acquisition Conditions ............................................................................... 53
5.1.1. Test Scenarios ....................................................................................................................... 53
5.1.2. Acquisition Conditions ........................................................................................................... 56
5.2. Example Results and Analysis .................................................................................................. 57
5.2.1. Perspective Shift Capability Assessment .............................................................................. 57
5.2.2. Refocus Capability Assessment ............................................................................................ 69
6. Summary and Conclusions ............................................................................................................ 78
6.1. Summary and Conclusions ........................................................................................................ 78
6.2. Future Work ............................................................................................................................... 79
Bibliography ........................................................................................................................................... 87
Appendix A ............................................................................................................................................ 82
List of Figures
Figure 1 – Light Field Cameras: Lytro (a) first and (b) second generation camera, respectively [1]; (c)
Raytrix camera [2]. .................................................................................................................................. 2
Figure 2 - Proposed architecture for the creation and interactive consumption of conventional 360º
panoramas. .............................................................................................................................................. 4
Figure 3 – The 3D sphere of vision displaying all the acquired images [14]. .......................................... 7
Figure 4 - Panorama projection impact and corresponding example: (a) and (b) Cylindrical projection
[14] [22] ; (c) and (d) Spherical projection [14] [22]. ................................................................................ 9
Figure 5 - Panorama projection impact and corresponding example: (a) and (b) Rectilinear projection
[14] [22]; (c) and (d) Fisheye projection [14] [22]; (e) and (f) Stereographic projection [14] [22]. ......... 11
Figure 6 - Panorama examples: (a) Sinusoidal projection; (b) Panini projection [22]. .......................... 12
Figure 7 – Architecture of the panoramic image creation solution combining pixel-based global and local
alignment techniques [23]...................................................................................................................... 13
Figure 8 – Mitigating misregistration errors by applying global alignment: (a) image mosaics with visible
gaps/overlaps; (b) corresponding image mosaics after applying the global adjustment technique; (c) and
(d) close-ups of left middle regions of (a) and (b), respectively [23]. .................................................... 15
Figure 9 – Mitigating the effect of motion parallax by applying local alignment: (a) image mosaic with
parallax; (b) image mosaic after applying a single deghosting step (patch size of 32); (c) image mosaic
after applying three times deghosting steps (patch sizes of 32, 16 and 8) [23]. ................................... 16
Figure 10 - Architecture of the invariant feature based automatic panoramic image creation solution. 17
Figure 11 – Recognizing panorama capability: (a) image collection containing connected sets of images
that will later form different panoramas and noise images; (b) 4 different blended panoramas outputted
by the panorama creation solution [24]. ................................................................................................ 19
Figure 12 – Panoramas produced: (a) without applying gain compensation technique; (b) with gain
compensation technique; (c) with both gain compensation and multi-band blending technique [24]. .. 20
Figure 13 - Architecture of the invariant feature based seamless HDR panorama creation solution [30].
............................................................................................................................................................... 22
Figure 14 - HDR panorama creation [30]: (a) Registered input images; (b) Results after applying the first
step of image selection: reference labels (left), resulting panoramic image (center) and tone-mapped
version of the panoramic image created (right); (c) Results after applying the second step of image
selection: final reference label (left), HDR panorama (center) and tone-mapped version of the HDR
compressed panorama (right). .............................................................................................................. 24
Figure 15 – Architecture of panoramic image creation solution using a locally adaptive alignment
technique based on invariant features. ................................................................................................. 25
Figure 16 – Panoramas created with the AutoStitch and APAP solutions [36]. ..................................... 27
Figure 17 – Illustrating the plenoptic function [37]. ............................................................................... 29
Figure 18 – Light field cameras and imaging acquisition system: (a) Lytro Illum camera [1]; (b) Raytrix
camera [2]; (c) imaging acquisition system [39]; (d) micro images formed behind the micro-lens array.
............................................................................................................................................................... 30
Figure 19 – Architecture of the creation and interactive consumption of the light field based panoramic
image creation using invariant features based alignment. .................................................................... 32
Figure 20 – Panoramic Image created showing the regions of overlap between the all-in-focus grayscale
images [40]. ........................................................................................................................................... 33
Figure 21 - Architecture of the light field based panoramic image creation solution using regular ray
sampling. ............................................................................................................................................... 35
Figure 22 - Panoramas created with the AutoStitch solution and the light field based 360º panorama
creation solution reviewed in this section [42]. ...................................................................................... 37
Figure 23 – Illustration of the stitching process of light field images (represented as perspective images
stacks). .................................................................................................................................................. 40
Figure 24 – Global system architecture of the proposed light field based 360º panorama creation
solution. ................................................................................................................................................. 40
Figure 25 - Lytro Illum light field camera: (a) GRBG Bayer-pattern filter mosaic [49]; and (b) imaging
acquisition system [50]. ......................................................................................................................... 41
Figure 26 - Light Field Toolbox software: light field images processing architecture............................ 43
Figure 27 - Hexagonal micro-lens array: (a) close up [52]; and (b) example of a white image and
associated estimated lenslet centers represented as red dots [51]. ..................................................... 43
Figure 28 - Example of: (a) calibration pre-processed light field checkerboard image; (b) checkerboard
corners identification [51]....................................................................................................................... 44
Figure 29 - Example of an image: (a) before and (b) after demosaicing [53]. ...................................... 45
Figure 30 - Example of a demosaiced raw lenslet image before devignetting [43]............................... 45
Figure 31 – Central perspective images registration architecture of the proposed light field based 360º
panorama creation solution. .................................................................................................................. 47
Figure 32 – Features detected and extracted from 2 overlapping central perspective images. ........... 47
Figure 33 – Image Matching after applying RANSAC algorithm (inlier matches). ................................ 48
Figure 34 – Wave correction examples: (a) without and (b) with applying the panorama straightening
technique. Both examples presented are the final panorama that was obtained after all composition
steps. ..................................................................................................................................................... 49
Figure 35 – Composition architecture of the proposed light field based 360º panorama creation solution.
............................................................................................................................................................... 50
Figure 36 - Image warping example: (a) before (b) after applying image warping in a central perspective
image. .................................................................................................................................................... 51
Figure 37 – Image mask example. ........................................................................................................ 51
Figure 38 – Central view for the Room with toys 1 light field 360º panorama corresponding to test
scenario A.1. .......................................................................................................................................... 54
Figure 39 – Central view for the Room with toys 2 light field 360º panorama corresponding to test
scenario A.2. .......................................................................................................................................... 54
Figure 40 – Light field 270º panoramas corresponding to test scenario B.3: (a) Sea landscape; and (b)
Park landscape. ..................................................................................................................................... 55
Figure 41 – Central view for the Empty Park light field 300º panorama corresponding to test scenario
C.3. ........................................................................................................................................................ 55
Figure 42 – Full acquisition system used. ............................................................................................. 56
Figure 43 – Light field panorama presented as a 2D matrix of perspective panoramic images. .......... 58
Figure 44 – Extreme left perspective panorama example (position (8,1)) with undesired effects (such as
vignetting and blurring): (a) perspective panorama located at the border of the perspective panoramas;
(b) first and (c) second perspective images (extracted from the first and second acquired light field
images belonging to the presented light field panorama) used to compose the presented perspective
panorama............................................................................................................................................... 59
Figure 45 – Five perspectives extracted from the Room with toys 1 light field 360º panorama created for
the test scenario A.1: (a) central perspective (8,8); (b) left perspective (8,3); (c) right perspective (8,13);
(d) top perspective (2,8); and (e) bottom perspective (14,8). ................................................................ 60
Figure 46 – Horizontal perspective shift close-ups: (a) and (d) correspond to the two close-ups from the
left perspective (8,3); (b) and (e) correspond to the two close-ups from the central perspective (8,8);
lastly (c) and (f) correspond to the two close-ups from the right perspective (8,13). ............................ 61
Figure 47 - Vertical perspective shift close-ups: (a) and (d) correspond to the two close-ups from the top
perspective (2,8); (b) and (e) correspond to the two close-ups from the central perspective (8,8); lastly
(c) and (f) correspond to the two close-ups from the bottom perspective (14,8). ................................. 62
Figure 48 - Perspective extracted from the Room with toys 2 light field 360º panorama created for the
test scenario A.2: (a) central perspective (8,8); (b) and (c) two close-ups presenting camera
overexposure problems and 2D stitching artifacts. ............................................................................... 64
Figure 49 – Five perspectives extracted from the Sea landscape light field 270º panorama created for
the test scenario B.3: (a) central perspective (8,8); (b) left perspective (8,3); (c) right perspective (8,13);
(d) top perspective (2,8); and (e) bottom perspective (14,8). ................................................................ 66
Figure 50 – Horizontal perspective shift close-ups: (a) and (d) correspond to the two close-ups from the
left perspective (8,3); (b) and (e) correspond to the two close-ups from the central perspective (8,8);
lastly (c) and (f) correspond to the two close-ups from the right perspective (8,13). ............................ 67
Figure 51 - Vertical perspective shift close-ups: (a) and (d) correspond to the two close-ups from the top
perspective (2,8); (b) and (e) correspond to the two close-ups from the central perspective (8,8); lastly
(c) and (f) correspond to the two close-ups from the bottom perspective (14,8). ................................. 68
Figure 52 – Three depth planes extracted from the Room with toys 1 light field 360º panorama and two
corresponding close-ups for each depth plane extracted: (a) depth plane extracted with slope = - 0.05
where (d) and (e) are the corresponding close-ups; (b) depth plane extracted with slope = 0.25 where
(f) and (g) are the corresponding close-ups; (c) depth plane extracted with slope = 0.6 where (h) and (i)
are the corresponding close-ups. .......................................................................................................... 71
Figure 53 - Last depth plane extracted from the Room with toys 2 light field 360º panorama and two
corresponding close-ups: (a) depth plane extracted with slope = 0.6 where (b) and (c) are the
corresponding close-ups. Red rectangles highlight the close-ups that will be used to help visualizing the
focus in specific parts of the light field image. ....................................................................................... 72
Figure 54 - Three different depth planes extracted from the Sea landscape light field 270º panorama
and two corresponding close-ups for each depth plane extracted: (a) depth plane extracted with slope
= 0.15 where (d) and (e) are the corresponding close-ups; (b) depth plane extracted with slope = 0.45
where (f) and (g) are the corresponding close-ups; (c) depth plane extracted with slope = 0.55 where
(h) is the corresponding close-up. ......................................................................................................... 74
Figure 55 - Three different depth planes extracted from the Park landscape light field 270º panorama
and two corresponding close-ups for each depth plane extracted: (a) depth plane extracted with slope
= 0 where (d) and (e) are the corresponding close-ups; (b) depth plane extracted with slope = 0.15
where (f) and (g) are the corresponding close-ups; (c) depth plane extracted with slope = 0.25 where
(h) is the corresponding close-up. ......................................................................................................... 76
Figure 56 – Horizontal perspective shift close-ups: (a) and (d) correspond to the two close-ups from the
left perspective (8,3); (b) and (e) correspond to the two close-ups from the central perspective (8,8);
lastly (c) and (f) correspond to the two close-ups from the right perspective (8,13). ............................ 83
Figure 57 - Vertical perspective shift close-ups: (a) and (d) correspond to the two close-ups from the top
perspective (2,8); (b) and (e) correspond to the two close-ups from the central perspective (8,8); lastly
(c) and (f) correspond to the two close-ups from the bottom perspective (14,8). ................................. 84
Figure 58 - Two different depth planes extracted from the Empty park light field 300º panorama and two
corresponding close-ups for each depth plane extracted: (a) depth plane extracted with slope = 0.25
where (d) and (e) are the corresponding close-ups; (b) depth plane extracted with slope = 0.5 where (f)
and (g) are the corresponding close-ups. .............................................................................................. 85
List of Tables
Table 1 – Test scenarios characteristics. .............................................................................................. 54
Acronyms
2D Two Dimensional
3D Three Dimensional
4D Four Dimensional
7D Seven Dimensional
APAP As-Projective-As-Possible
CMOS Complementary Metal-Oxide Semiconductor
DLT Direct Linear Transformation
EXIF Exchangeable Image File Format
FOV Field of View
GRBG Green Red Blue Green
HDR High Dynamic Range
JSON JavaScript Object Notation
LF Light Field
LFT Light Field Toolbox
LMS Least Median of Squares
PNG Portable Network Graphics
RANSAC RANdom SAmple Consensus
RGB Red Green Blue
RMSE Root Mean-Squared Error
SIFT Scale-Invariant Feature Transform
SNR Signal-to-Noise Ratio
SURF Speeded Up Robust Features
Chapter 1
1. Introduction
This chapter will introduce the topic of this Thesis, which is the creation of light field based 360º
panoramas. It begins by presenting the context and motivation behind this work, proceeding to the
definition of its objectives and, finally, to the structure of this document.
1.1. Context and Motivation
Photography is the process of recording visual information by capturing light rays on a light-sensitive
recording medium, e.g. film or digital sensors. The images resulting from this process are one of the
most important communication media for human beings, largely employed in a variety of application
areas, from art and science to all types of businesses. Over the years, photography related methods
and gadgets, e.g. digital cameras, have become increasingly refined to better suit growing user
needs. However, the most common photography cameras developed to date present an important
limitation: whether analog or digital, they have a limited field of view (i.e. the part of the visual world that
is visible through the camera at a particular position and orientation in space), which is in general much
smaller than the human field of view. In summary, it is not an easy task to encompass wide fields of
view in a single camera shot.
With the desire to capture, in a single image, wide fields of view, panoramic photography has
emerged as a technique that combines a set of partially overlapping elementary images of a visual
scene acquired from a single camera location to obtain an image with a wide field of view. Contrary to
what might be thought, this type of photography is not that recent; however, with the emergence of digital
cameras, a number of new possibilities have opened up for panoramic image creation.
Conventionally, the creation of 360º panoramas involves a sequence of different steps, starting
with the acquisition of several images representing different parts of the scene and ending with a
stitching process that combines the multiple overlapping images, resulting in the desired panorama.
However, this procedure suffers from several limitations associated with the conventional imaging
representation paradigm, in which images are just a collection of rectangular 2D projections for
some specific wavelength components. Generally, conventional cameras can merely capture the total
sum of the light rays that reach a certain point in the lens using the two available spatial dimensions at
the camera sensor (vertical and horizontal resolutions), thus leading to loss of valuable information from
the real scene light field. This valuable information is the directional distribution of the light of the scene,
which can be used in a number of different ways to improve the creation of panoramas and also to
provide several new functionalities to the users. Without the use of this precious visual information, the
user experience becomes greatly restricted. The full real scene light field can be expressed by a
well-known function characterizing the amount of light traveling through every point in space, in
every direction, for any wavelength, along time: the so-called (7D) plenoptic function. For this
reason, the search for novel, more complete representations of the world visual information (i.e. higher
dimensional representations) has become a hot research field, in a demand to offer the users more
immersive, intense and faithful experiences.
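To make the dimensionality concrete (a standard formulation from the light field literature, not specific to this Thesis), the plenoptic function may be written as:

```latex
P = P(x, y, z, \theta, \phi, \lambda, t)
```

where (x, y, z) is the observation position, (θ, φ) the ray direction, λ the wavelength and t the time instant; dropping wavelength and time, and assuming the radiance constant along each ray in free space, reduces it to the 4D light field L(u, v, s, t), commonly parameterized by the intersections of each ray with two parallel planes.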
Recently, these problems have started to be addressed with the emergence of new sensors and
cameras, e.g. the plenoptic or light field cameras illustrated in Figure 1, which allow higher dimensional
representations of the visual information by capturing the angular light information. These new cameras
can conveniently capture and record more light information thanks to an innovative design in which a
micro-lens array captures the light for each position and from each angular direction, i.e. they
capture and record a 4D light field. Each of these micro-lenses captures a slightly different
perspective/view from the scene, allowing these cameras to record not only the location and intensity of
light, as a conventional digital camera does, but also to differentiate and record the light intensity for
each incident direction.
Figure 1 – Light Field Cameras: Lytro (a) first and (b) second generation camera, respectively [1]; (c) Raytrix camera [2].
This important characteristic allows capturing a much richer visual information representation of the
scene, which can be used to overcome the limitations related to the conventional imaging representation
and capture. For instance, this richer representation of the visual information brings additional interesting
features like the ability to refocus any part of the image a posteriori, to relight and recolor the scene, and
to slightly change the user viewpoint, among others. All these new capabilities will inevitably lead to the
reinvention of concepts and functionalities associated with digital photography, and thus the 360º
panoramic image creation solutions will also suffer the impact of this new imaging representation
paradigm and associated available cameras.
1.2. Objectives and Structure
In the context described above, the main objective of this Master Thesis is the design, implementation
and assessment of a novel light field imaging based 360º panorama creation solution based on the best
available technologies. To achieve this goal, this dissertation starts by proposing a basic architecture
for the panorama creation process and then reviewing and analyzing the most relevant conventional
360º panorama creation solutions in the literature. After reviewing the panorama creation solutions for
conventional imaging,
this dissertation will move to the light field representation paradigm based on the plenoptic function.
Even though the research on light field based 360º panorama creation is still at an early stage, some
first solutions already exist in the literature. Thus, this dissertation will first review the few available
light field based 360º panorama creation solutions. Finally, the Thesis will address the
design, implementation and assessment of a powerful light field based 360º panorama creation
solution, extending the available (non-light field based) methods and software.
To report the work developed, this dissertation is structured as follows: Chapter 2 will review the
state-of-the-art on conventional 360º panorama creation by, firstly, proposing a global architecture for
360º panorama creation, then presenting the most common panorama types and,
lastly, reviewing the most relevant conventional 360º panorama creation solutions in the literature.
Then, Chapter 3 will review the state-of-the-art on light field based panorama creation by, firstly,
presenting the basic concepts behind the light field representation paradigm and then reviewing two
available light field based panorama creation solutions. Next, Chapter 4 will present the light field based
360º panorama creation solution proposed, starting by describing the global system architecture and
walkthrough, followed by a detailed description of the main parts, namely the light field data pre-
processing module and the key modules used to create the 360º panorama light field image. Then,
Chapter 5 will introduce the performance analysis of the developed panorama creation solution, starting
by presenting the test scenarios and adopted acquisition conditions, and then moving to the visual inspection
and analysis of some representative light field panoramas. Lastly, Chapter 6 will conclude with a
summary and the future work plan.
Chapter 2
2. State-of-the-Art on Conventional 360º Panoramas
Creation
This chapter will present the main concepts, approaches and tools involved in the creation of
conventional 360º panoramas, also known as full-view panoramas. With this goal in mind, it starts by
presenting a global architecture for the creation of 360º panoramas, then proceeding to the presentation
of several types of 360º panoramas and their characteristics. Finally, a brief review of some of the main
conventional panorama creation solutions available in the literature is presented.
2.1. Proposing an Architecture for Conventional 360º Panorama Creation
Creating a full-view 360º panorama is a complex task that involves a series of steps, from the acquisition
of image data from the scene to the final generation of a seamless 360º panorama, to be experienced
through a rendered view whose position is interactively selected by the user. The first target of this
chapter is to propose a global architecture for the creation of 360º panoramas designed to embrace the
main approaches available in the literature. The proposed architecture is presented in Figure 2:
Figure 2 - Proposed architecture for the creation and interactive consumption of conventional 360º panoramas.
In the following, a brief description of the various modules present in the proposed architecture is
presented:
Image Acquisition: This first step regards the acquisition of all images representing the 3D world
scene. In this architecture, it is considered that the camera performing the acquisition stays in the
same position while rotating around its nodal point or no-parallax point; this requires that the
camera is carefully mounted on a tripod, or held level by hand, at a chosen stable position that is
kept throughout the acquisition of all images. One image is taken for each rotation step of the
camera, so that the final set of images covers the full scene while keeping fixed overlapping
areas, which greatly facilitates their later alignment and makes it possible to produce a full-view
panorama with all the image contents fitted into a single frame [3].
Pre-Processing: In this step, some pre-processing of the acquired images may be needed, e.g.
to minimize differences between the used camera-lens combination and an ideal lens model with
the goal of correcting some optical defects such as distortion and different exposures between
the images [4].
Calibration: In this step, some important calibration data is extracted, notably the camera intrinsic
and extrinsic parameters [3], which are computed based on the acquired images/textures. The
intrinsic or internal camera parameters allow a mapping between the coordinates of each image
point and the corresponding coordinates in the camera reference frame (relying only on camera
characteristics). The extrinsic or external camera parameters relate the orientation and location
of the camera center with a known world reference coordinate system [5].
Registration: In this step, the set of acquired images is registered, meaning that the images are
brought into a single coordinate system where they fit together with the corresponding
overlaps. There are two main types of techniques in the literature to perform the image
registration process:
- Direct (pixel-based) techniques: These solutions directly minimize the pixel-to-pixel
intensity dissimilarities to align the images. The main advantage of the direct (pixel-based)
techniques is that they make optimal/full use of the information available for the image
alignment process since they consider the contribution of every single pixel from all images;
their main disadvantage is that they have a limited range of convergence (they also need to
be initialized), implying that for photo-based panoramas they fail too often to be useful [6].
These techniques typically involve motion based alignment where a suitable error metric is
used to compare the pixel intensities in all images and a convenient search technique is
used to find the alignment where the most pixels agree. Generally speaking, there are two
ways of performing the alignment search: i) the first is to exhaustively try all possible
alignments, i.e. to perform a full search [6]; and ii) the alternative, faster way is to perform a
hierarchical coarse-to-fine search, i.e. a hierarchical motion estimation [7]. There are many
techniques for performing the image registration, notably by using methods based on Fourier
analysis [7]. In general, for panoramic applications, high accuracy is required in the
alignment process to obtain acceptable results; thus, it is necessary to use sub-pixel precision
by adopting incremental methods, e.g. based on the Taylor series expansion [6].
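As an illustration only (not code from this Thesis), the full search and its hierarchical coarse-to-fine variant can be sketched for a simple translational motion model, using the sum of squared differences as error metric; the image contents, pyramid depth and the circular shift used for warping are assumptions of the sketch:

```python
import numpy as np

def ssd(a, b):
    """Error metric: sum of squared pixel intensity differences."""
    d = a.astype(np.float64) - b.astype(np.float64)
    return float((d * d).sum())

def full_search(ref, tgt, max_shift=4):
    """Exhaustively try all integer translations of tgt (full search) and
    return the (dy, dx) shift that best aligns it with ref."""
    best_err, best = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err = ssd(ref, np.roll(np.roll(tgt, dy, axis=0), dx, axis=1))
            if err < best_err:
                best_err, best = err, (dy, dx)
    return best

def coarse_to_fine(ref, tgt, levels=3):
    """Hierarchical search: estimate the shift on coarse (subsampled)
    versions of the images first, then refine it at each finer level."""
    dy, dx = 0, 0
    for lvl in reversed(range(levels)):
        s = 2 ** lvl
        warped = np.roll(np.roll(tgt, dy, axis=0), dx, axis=1)
        rdy, rdx = full_search(ref[::s, ::s], warped[::s, ::s], max_shift=2)
        dy, dx = dy + rdy * s, dx + rdx * s
    return dy, dx
```

The hierarchical variant searches only a small window (±2) at each level but covers a larger total displacement, which is the speed advantage mentioned above.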
- Feature-based techniques: These methods work by extracting a sparse set of interest
points in each image and then matching these points to equivalent points in other images in
the collection [8]. The main advantage of feature-based techniques is that they are
computationally more efficient than the direct techniques and have a better range of
convergence without the need for initialization. The main disadvantage is that they have to
deal with image regions that do not fit well the selected motion model when matching the
points of interest in an image with similar points in other images, due to either moving
objects in the scene [9] or parallax differences, among others [6]. Registration solutions based
on features typically consider two main steps:
Feature Detection and Extraction: In this step, the goal is to detect and extract a set of
distinctive local features for previously detected keypoints (or points of interest) in each
image. Local features can be defined as distinctive parts of an image, like edges, corners
or blobs (i.e. regions of interest); it is desirable that these features are present in the
highest number of images for an easier alignment [8]. To extract those features, a
keypoint detector is first employed, which corresponds to a low-level image processing
operator that examines every pixel to check if there is a good feature present at that pixel.
The most important property for a keypoint detector is its representability, meaning that
keypoints should correspond to positions with high image expressive power. Some other
desirable characteristics for keypoint detectors include invariance to image noise, scale,
rotation and translation, affine transformations and blur. After finding the set of distinctive
keypoints using keypoint detectors, local feature descriptors are used to describe the
texture on a patch (in general a square) defined around the corresponding keypoint [6].
The most common local feature descriptors are the Scale-Invariant Feature Transform
(SIFT) [10] and Speeded Up Robust Features (SURF) [11], which offer scale and rotation
invariant properties. Feature descriptors must also be robust to small deformations or
keypoint localization errors and allow finding the corresponding pixel locations in other
images which capture the same information about spatial intensity patterns under
different conditions or perspectives [6].
Feature Matching: This step performs a match between the set of features detected and
extracted in the previous step in various images. With this goal in mind, it is necessary to
determine which features correspond to the same locations in different images and then
determine the appropriate mathematical model (i.e. estimate the homography) which
relates the features found in a given image with the corresponding features in another
image. After finding this model, it is expected that some features do not fit well in this
model, so they are classified as outlier features in opposition to the features fitting well in
the model, called inlier features; the outlier features are removed from the model. Finally,
the matching between the neighboring images is performed using the inlier features [6].
The two main methods to perform the selection between inlier and outlier features and
the appropriate matching models are RANSAC (RANdom SAmple Consensus) [12] and
LMS (Least Median of Squares) [13].
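As an illustration of the inlier/outlier selection principle only (RANSAC is normally applied here to homography estimation; a simple 2D translational model and synthetic point matches are used to keep the sketch short):

```python
import numpy as np

def ransac_translation(src, dst, iters=100, tol=1.0, seed=0):
    """RANSAC for a 2D translation model: repeatedly fit the model to a
    random minimal sample (a single correspondence here), count the
    matches that agree with it (inliers), keep the best model, and
    finally refit it using only the inliers (outliers are discarded)."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        i = rng.integers(len(src))
        t = dst[i] - src[i]                            # model from minimal sample
        inliers = np.linalg.norm(src + t - dst, axis=1) < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # least-squares refit on the inlier set only
    t = (dst[best_inliers] - src[best_inliers]).mean(axis=0)
    return t, best_inliers
```

The same sample-score-refit loop applies to the homography case, with a minimal sample of four correspondences instead of one.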
Global Alignment: In this step, the goal is to find a generally consistent set of alignment
parameters that reduce the accumulated registration errors between all pairs of images, thus
obtaining an optimally registered panoramic image. Generally, in most panoramic applications, it
is necessary to register more than just a pair of images and thus it is fundamental to extend the
pairwise matching criteria used to establish an alignment between a simple pair of images to a
global energy function that involves all images and their pose parameters (parameters coming
from the calibration step) [6]. For the creation of a 360º panorama, it is necessary to find the
precise location of all the acquired images in a single coordinate system which defines the 3D
sphere of vision as shown in Figure 3.
Figure 3 – The 3D sphere of vision displaying all the acquired images [14].
To make this happen, every pixel in each image will be represented in spherical coordinates
(yaw, pitch and roll) and thus associated with a position in the 3D sphere of vision surface. The most
relevant technique to adjust the pose parameters for a collection of overlapping images, thus
finding the optimal overall registration, is called bundle adjustment [15]. After combining multiple
images of the same scene into the 3D sphere of vision, it may be necessary to perform local
adjustments such as parallax removal to reduce double images and blurring due to local
misregistration [6]. If the acquisition provides an unordered set of images to register, it is
necessary to identify which images have overlapping areas with each other and fit them together
to form one or more different panoramas, using a process called recognizing panoramas [16].
Projection: This step will project the aligned images into a final composing surface which
depends on the type of selected panorama. Section 2.2 will introduce the main types of available
360º panoramas, which depend on the choice of the projection. In this process, each image is
successively projected according to the chosen composing surface coordinates, i.e. the mapping
between each source image's pixels and the composite surface is performed, giving the
360º panorama its final visual signature [6].
Blending: Since there are typically overlapping areas and possibly moving objects, this step will
define the pixel values in each position of the final panorama, notably how to optimally weight or
blend the pixels in such a way that visible seams (due to exposure differences), blurring (due to
misregistration) and ghosting (due to moving objects) effects are minimized [6]. This final step
provides the 360º panorama, which should then allow rendering attractive looking views.
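As a minimal sketch of the weighting idea only (the per-image weight masks, e.g. distance-to-border weights, and the RGB image shape are assumptions; real blenders are more elaborate), feather blending of registered, overlapping images can be written as:

```python
import numpy as np

def feather_blend(images, masks):
    """Feather (weighted average) blending: each output pixel is the sum
    of the overlapping images' pixels weighted by per-image masks, so
    that seams due to exposure differences fade out gradually."""
    acc = np.zeros(images[0].shape, dtype=np.float64)
    wsum = np.zeros(images[0].shape[:2], dtype=np.float64)
    for img, mask in zip(images, masks):
        acc += img.astype(np.float64) * mask[..., None]
        wsum += mask
    wsum = np.maximum(wsum, 1e-12)   # avoid division by zero outside coverage
    return acc / wsum[..., None]
```

Where only one image contributes, its pixels pass through unchanged; in overlap regions the weighted average hides the seam.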
Rendering: In this final step, the created panorama is rendered with appropriate tools to create
a virtual tour, i.e. an appropriate view for each user viewing direction. This is possible by
projecting the created 360º panorama onto a spherical grid or map representing the 360º sphere
of vision around the perspective point where the user's eyes are positioned [17]. The user may
interact with the panorama, e.g. using his/her mouse to rotate it in all directions, navigate
through the whole scene, zoom in and out, etc., thus enjoying an interactive
user experience.
2.2. Types of 360º Panoramas
Panoramas are basically defined by the type of projection used for their creation; the projection is one
of the modules included in the architecture proposed in the previous section. There are many different
types of panoramas, some of them full-view 360º panoramas and others only considering a limited part
of the 3D sphere of vision. Each of these different types of panoramas is characterized by a specific
geometric surface associated to the projection transformation equations, e.g. rectilinear panoramas
among many others. The different projections, also called transformations, that can be applied to the
set of acquired images produce a variety of available panorama types [18]. Each of the projections has
its specific field of view (FOV) corresponding to the extent of the visible world that may be presented to
the user when interacting with the resulting panorama.
The panoramic projections differ from each other both in their mathematical definition and in the
resulting panorama characteristics; thus, each type has its specific attributes and limitations [18]. Naturally,
there is some inevitable distortion when mapping the 3D sphere of vision onto a 2D flattened image.
This happens because, with the increase of the viewing angle, the viewing arc becomes more curved;
this effect is also where the differences between the various panorama projection types become more
evident. Typically, each panorama projection type tries to reduce one type of distortion at the expense
of other types of distortions, thus the decision of what panorama projection should be used largely
depends on the application scenario [19].
Although there is a large diversity of panorama types, the reality is that there are just a few which are
rather popular. There are also some more complex panorama types with some additional properties,
e.g. resulting from the combination of two or more basic panorama types. Some of the most common
projections and thus panorama types are:
Rectangular projections: In this projection, the horizontal distance is proportional to the
horizontal viewing/rotation angle or yaw angle (the horizontal field of view is an angle up to 360º)
and the vertical distance is proportional to the vertical viewing angle, i.e. the angle from
below to above the horizon, or pitch angle. There are several types of rectangular projections,
notably:
- Cylindrical projection: This type of projection results from wrapping the 3D sphere of
vision with a 2D plane forming a cylinder tangent to the equator, while light is projected
from the center of the sphere outwards. The cylindrical projection makes it possible to
produce panoramas with a large latitude range (vertical viewing range), e.g. larger than 120º.
Although it can cover even larger latitudes, near the poles the panorama becomes highly
distorted, making a large range of latitudes not really usable; in practice, the maximum
range is typically 180º for both FOVs [20]. This is related to the fact that this
projection shows all the vertical straight lines in the scene as straight lines in the final
panorama [21]. Figure 4(a) shows the impact produced by this projection on a globe and
Figure 4(b) shows an example of this type of panorama.
- Spherical projection (or equirectangular projection): This type of projection transforms
all the points in the 3D sphere of vision into latitude and longitude coordinates which are
directly converted into horizontal and vertical coordinates in a 2D flattened image. This
projection preserves all vertical lines and converts the horizon into a straight line across
the middle of the image (but does not preserve the remaining horizontal lines). The north
and south poles of the 3D sphere of vision are stretched across the entire width of the 2D
flattened resulting image [20]. The maximum FOV for this projection is up to 360º both in
the vertical and horizontal directions [21]. Figure 4(c) shows the impact produced by this
projection on a globe and Figure 4(d) shows an example of this type of panorama.
- Mercator projection: This type of projection represents a trade-off between the
Cylindrical and Spherical projections as it provides less vertical stretching and a larger
usable vertical FOV than the Cylindrical projection but shows more line curvature [19]. This
projection can be used up to 360º FOV horizontally and up to 180º FOV vertically. One of
the variations of this type of panorama is the Transverse Mercator projection which
corresponds to the 90º rotation of the traditional Mercator projection; this projection is
appropriate for very tall vertical panoramas [19].
Figure 4 - Panorama projection impact and corresponding example: (a) and (b) Cylindrical projection [14] [22]; (c) and (d) Spherical projection [14] [22].
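As a minimal illustration of the spherical (equirectangular) mapping just described (the panorama width and height, with width equal to twice the height, and the axis conventions are assumptions of the sketch), a 3D viewing direction can be converted to panorama pixel coordinates via its longitude and latitude:

```python
import math

def dir_to_equirect(x, y, z, width, height):
    """Map a 3D viewing direction (x right, y up, z forward) to
    equirectangular pixel coordinates: longitude spans the 360º image
    width, latitude spans the 180º image height."""
    lon = math.atan2(x, z)                                  # -pi .. pi (yaw)
    lat = math.asin(y / math.sqrt(x * x + y * y + z * z))   # -pi/2 .. pi/2
    u = (lon / (2.0 * math.pi) + 0.5) * width               # horizontal pixel
    v = (0.5 - lat / math.pi) * height                      # vertical pixel (0 = north pole)
    return u, v
```

The inverse of this mapping, applied per output pixel, is what projects the registered images onto the equirectangular composing surface.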
Azimuthal projections: These projections are characterized by rotational symmetry around
the center of the image and may take several forms:
- Rectilinear projection: This type of projection can be imagined as placing a flat 2D plane
tangent to the 3D sphere of vision at a single point, and projecting the light from the
sphere’s center. It has the property of preserving all the straight lines in the real 3D space
into the final projected panoramic image. The maximum FOV for this type of projection is,
for both directions, an angle up to 180º, making it inappropriate for images with very large
angles of view [21]. For large angles of view, it can exaggerate the perspective of the
objects in the panorama, which appear distorted at the edges [19]. Figure 5(a) shows the
impact produced by this projection on a globe and Figure 5(b) shows an example of this
type of panorama. There is a sub-class of this projection type called Cubic projection that
organizes the images like the faces of a cube (90º x 90º FOVs) viewed from its center,
thus maintaining all straight lines straight [20].
- Fisheye projection: This type of projection creates a 2D flattened grid where the distance
from the center of the image to a certain point is proportional to the viewing angle; this
implies that straight lines become more curved as they move away from the center of the
final panorama. One of the limitations is that both the vertical and horizontal FOVs must
be 180º or less to fit the projected image into a circle [19] [21]. Figure 5(c) shows the impact
produced by this projection on a globe and Figure 5(d) shows an example of this type of
panorama.
- Stereographic projection (or little planet projection): This type of projection can be
used to create the illusion of a ‘little planet’ as it corresponds to a projection of the 3D
sphere of vision, as seen from the pole, onto a 2D flat plane. It is very similar to the Fisheye
projection. However, the distance to the center of the image is not equivalent to the spatial
angle, thus offering a better sense of perspective [20]. Large FOVs show the same
perspective-exaggerating characteristic as in the rectilinear projection but less
pronounced. This type of projection has a maximum FOV of 360º, in both directions, and
does not preserve either the horizontal or vertical lines [21]. Figure 5(e) shows the impact
produced by this projection on a globe and Figure 5(f) shows an example of this type of
panorama.
- Equisolid projection: This type of projection is similar to a ‘mirror ball’ where the straight
lines passing close to the center are maintained but they become more curved when
approaching the boundaries of the panorama. Although the field of view can go close to
360º, the image is circularly limited at the edges, making it ideal when the distortion is not
critical [20].
Figure 5 - Panorama projection impact and corresponding example: (a) and (b) Rectilinear projection [14] [22];
(c) and (d) Fisheye projection [14] [22]; (e) and (f) Stereographic projection [14] [22].
Other more complex projections based on some of the previously presented projections are:
- Sinusoidal projection: This type of projection aims to guarantee equal areas throughout
all sections of the image, making it possible to flatten the 3D sphere of vision and roll it
back up again into the original sphere, similarly to the Fisheye and Stereographic
projections. This characteristic is useful as it facilitates the projection of the 3D sphere of
vision onto a 2D plane while maintaining the resolution along all axes throughout the image,
which results in perfect horizontal latitude lines [19]. The maximum FOV for this type of projection is
360º in the horizontal direction and 180º in the vertical direction; it does not preserve either
horizontal or vertical lines [21]. Figure 6(a) shows an example of this type of panorama.
- Panini projection (or Vedutismo panorama): This type of projection maintains the
vertical lines vertical and the radial lines straight but displays the original horizontal straight
lines as curves; it offers a sense of correct perspective for wide angles of view with a single
central vanishing point. Straight lines which do not pass through the center will become
curved [20]. Figure 6(b) shows an example of this type of panorama.
Figure 6 - Panorama examples: (a) Sinusoidal projection; (b) Panini projection [22].
2.3. Reviewing the Main Conventional 360º Panorama Creation Solutions
Conventional 360º panoramas capture the whole FOV around the point where the image collection is
acquired, allowing an interactive user experience, e.g. through navigation over the whole scene. For this
to happen, it is first necessary to create a full-view panorama from the acquired images. In this section,
four representative conventional 360º panorama creation solutions from the literature will be reviewed.
These solutions were selected considering their technical approach in order to make this review more
conceptually varied and thus more useful for the reader. While the first solution considers a direct (pixel-
based) registration approach, the remaining solutions consider feature-based registration.
2.3.1. Solution 1: Panoramic Image Creation Combining Patch-based Global and
Local Alignment Techniques
This section will review the solution proposed by Shum and Szeliski in [23]. This solution proposes a
framework for full-view panoramic image creation where patch-based global and local alignment
techniques are combined to improve the quality of the created panorama. In the context of the previously
proposed architecture (see Section 2.1), this solution corresponds to a 360º panorama creation solution
based on a direct (pixel-based) registration technique.
A. Objectives and Technical Approach
The main objective of this solution is to enable the creation of high quality full-view panoramas from
images taken with handheld cameras by combining patch-based global and local alignment techniques.
To achieve this goal, this solution uses a rotational panorama representation, where each input
image is associated with a rotation matrix (and optionally a focal length), instead of an explicit projection
of all input images into a common composing surface. After a pairwise alignment of all images according
to the respective motion models, a global alignment technique is applied over the image collection to
reduce possible accumulated registration errors. A local alignment technique is then applied at the block
level, for each image, to further compensate for local misregistration (due to motion model inadequacy
or inaccurate camera model estimation). By combining global and local alignment techniques, the
quality of the final panorama is significantly improved.
B. Architecture and Main Tools
Figure 7 depicts the architecture of the panoramic image creation solution reviewed in this section based
on the combination of both global and local alignment techniques [23].
Figure 7 – Architecture of the panoramic image creation solution combining pixel-based global and local
alignment techniques [23].
In the following, a short walkthrough is presented, with the most interesting tools described in more
detail:
1. 8-Parameter Perspective Mosaics – Firstly, if the camera intrinsic parameters are unknown,
an initial estimate for the transformation associated with each input image is obtained by
performing motion estimation between each input image and a warped version of the mosaic
(panoramic image) resulting from the previous images’ pairwise registration; in this case, an 8-
parameter perspective transformation (i.e. homography) is used in the warping process.
2. Estimate Focal Length – Based on the initial homography estimate associated with each
image pair (computed in step 1), a rough estimate of the lens focal length is computed.
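The closed-form focal length expressions are not reproduced here; the underlying idea can, however, be illustrated with a minimal Python sketch (the function names and the search-based formulation are illustrative, not taken from [23]): under pure camera rotation, H ≈ K R K⁻¹ with K = diag(f, f, 1), so a good focal length estimate is the one that makes K⁻¹ H K closest to a rotation matrix.

```python
import numpy as np

def rotation_deviation(H, f):
    """How far K^-1 H K is from a rotation for a candidate focal length f.

    Under pure camera rotation, H ~ K R K^-1 (up to scale), so the
    conjugated matrix should be orthogonal with unit determinant.
    """
    K = np.diag([f, f, 1.0])
    M = np.linalg.inv(K) @ H @ K
    M = M / np.cbrt(np.linalg.det(M))   # remove the arbitrary scale
    return np.linalg.norm(M @ M.T - np.eye(3))

def estimate_focal_length(H, candidates):
    """Pick the candidate focal length making H closest to a rotation."""
    return min(candidates, key=lambda f: rotation_deviation(H, f))
```

In [23] the estimate is obtained in closed form from the homography entries; the exhaustive search above merely conveys the geometric constraint being exploited.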
3. Rotational Mosaics – Once the images’ focal lengths are known, and assuming that the
camera is rotating around its optical center, one rotation matrix is estimated for each input image
considering that the mapping (transformation) between two images is described by a 3-
parameter rotational model (instead of the homography model previously used). Compared to
the homography, the 3-parameter rotational model has fewer degrees of freedom, which allows
faster convergence in the rotation matrix estimation process and makes it more appropriate
to the scenario where the camera is rotating around its optical center. After associating one
rotation matrix with each input image, the image registration process can be performed in the
input image’s coordinate system, thus creating the rotational mosaics.
4. Patch-based Image Alignment – In this step, a patch-based image alignment algorithm is used
to align each image with a previously composited mosaic (resulting from the previous images'
registration) based on the rotational motion models computed in step 3. For this purpose, each
image is divided into a number of blocks or patches Pj (e.g. 8x8-sample blocks) and, for each
patch center belonging to a textured area, the corresponding point is searched for in an overlapping
image. By sequentially applying this algorithm to each input image, an initial panorama is
assembled.
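The correspondence search in this step can be sketched as follows (an illustrative, single-level exhaustive SSD search in Python; the actual solution restricts the search to textured patches and refines hierarchically):

```python
import numpy as np

def match_patch(ref, tgt, center, patch=8, search=4):
    """Find the displacement of a `patch` x `patch` block of `ref` centered
    at `center` inside `tgt` by exhaustive SSD search over a small window.

    A toy stand-in for the patch-based alignment step.
    """
    r, c = center
    h = patch // 2
    block = ref[r - h:r + h, c - h:c + h].astype(float)
    best, best_d = None, np.inf
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            cand = tgt[r + dr - h:r + dr + h, c + dc - h:c + dc + h]
            d = np.sum((block - cand.astype(float)) ** 2)
            if d < best_d:
                best, best_d = (dr, dc), d
    return best
```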
5. Block Adjustment - In this step, a global alignment (block adjustment) technique is applied to
the whole set of images, adjusting each image’s transformation (i.e. rotation and focal length)
to minimize the accumulated registration errors (resulting from the previous step); this results
in an optimally (in the least-squares sense) registered mosaic. The pairwise alignment
performed in the previous step may not be optimal since it assumes that all pixels contained in
a given patch share the same motion; thus, a hierarchical or pyramid motion estimation
technique is adopted [7]. The adopted global alignment technique is based on establishing point
correspondences between images that have overlapping areas (from the patch-based
alignment step). After dividing each image into a number of patches (e.g. 16x16 samples), patch
center correspondences are found by inter-frame transformation.
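The principle behind this global adjustment can be illustrated with a deliberately simplified 1D analogue in Python, in which per-image translations (standing in for rotations and focal lengths) are jointly adjusted in the least-squares sense from noisy pairwise measurements:

```python
import numpy as np

def block_adjust(n_images, pairwise):
    """Globally adjust per-image offsets from noisy pairwise measurements.

    `pairwise` is a list of (i, j, offset) meaning "image j should sit
    `offset` to the right of image i". Solving all constraints jointly in
    the least-squares sense (image 0 anchored at 0) spreads the
    accumulated drift over the whole image set.
    """
    A, b = [], []
    for i, j, off in pairwise:
        row = np.zeros(n_images)
        row[j], row[i] = 1.0, -1.0
        A.append(row)
        b.append(off)
    # anchor image 0 to remove the global translation ambiguity
    row = np.zeros(n_images)
    row[0] = 1.0
    A.append(row)
    b.append(0.0)
    x, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return x
```

For instance, four images measured 10 units apart with a loop-closure measurement of 31 end up with the 1-unit drift spread evenly over the four constraints.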
6. Deghosting - After performing the global alignment (previous step), there may still be localized
registration errors present in the image mosaic, due to effects that were not taken into account
in the adopted camera model, such as camera translation and radial distortion, among others.
To compensate for these registration errors, thus making the images globally consistent, local
(patch-based optical flow) motion estimation is performed between overlapping images’ pairs.
The resulting (motion) displacements are then used to warp the respective input image to
reduce localized registration errors (ghosting) that might have survived the global alignment
step. At the end of this step, the final panorama is available.
7. Panoramic Image Mosaics - The final panorama (created in the previous step) is stored as a
collection of images with associated geometrical (rotational) transformations.
8. Environmental Maps - In this post-processing step, the final panorama is converted (mapped)
into an arbitrary texture-mapped polyhedron surrounding the origin, called environmental map,
with the goal of exploring the virtual environment. The shape of the environmental map, i.e. the
panorama type or panorama projection, is a decision left up to the user.
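As an illustration, one common environment map shape is a cube; the following Python sketch maps a viewing direction to a cube face and normalized texture coordinates (the face naming and (u, v) conventions are illustrative, since such conventions vary between systems, and a direction with a non-zero dominant axis is assumed):

```python
def cube_face(direction):
    """Map a 3D viewing direction to a cube-map face and (u, v) in [0, 1].

    The face holding a direction is the one of the axis with the largest
    absolute component; (u, v) locate the intersection on that face.
    """
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        face, u, v = ('+x' if x > 0 else '-x'), -z / x, y / ax
    elif ay >= az:
        face, u, v = ('+y' if y > 0 else '-y'), x / ay, -z / y
    else:
        face, u, v = ('+z' if z > 0 else '-z'), x / az, y / az
    return face, (u + 1) / 2, (v + 1) / 2
```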
C. Performance and Limitations
All the experiments reported in [23] were performed using a rotational panorama representation with
unknown focal length. The overlapping area between two neighboring images was around
50%. For the patch-based and global alignment steps, a patch size of 16, an alignment accuracy of 0.04
pixel and 3 levels in the pyramid were considered.
Figure 8 illustrates a panorama created from 6 images, acquired with a leveled and tilted-up camera,
before (Figure 8(a)) and after (Figure 8(b)) applying the global alignment technique described in Section
2.3.1. B (step 5); Figure 8(c) and Figure 8(d) correspond to close-ups of Figure 8(a) and Figure 8(b),
respectively. In Figure 8, the images do not cover the entire horizontal FOV of the 3D sphere of vision
(only 6 images were used).
Figure 8 – Mitigating misregistration errors by applying global alignment: (a) image mosaics with visible gaps/overlaps; (b) corresponding image mosaics after applying the global adjustment technique; (c) and (d)
close-ups of left middle regions of (a) and (b), respectively [23].
In Figure 8(c), which shows a close-up of double images on the middle left side of Figure 8(a), it is
possible to see a misalignment which is no longer noticeable in Figure 8(b) and Figure 8(d), after
applying the global alignment technique.
Figure 9(a) shows a panorama created from two images acquired with a handheld camera where
some camera translation occurs. The local misregistration resulting from the motion parallax introduced
by the camera translation is visible in Figure 9(a), notably through the double image (i.e. ghosting effect)
of the stop sign. This effect is significantly reduced using the local alignment (deghosting) technique
(described in Section 2.3.1. B – step 6), as shown in Figure 9(b). Nevertheless, some artifacts are still
visible due to the fact that this technique is patch-based (with a patch size of 32 in this example) instead
of pixel-based. To overcome this problem, the local alignment technique is repeatedly applied with
successively smaller patch sizes. Figure 9(c) shows the panorama image in Figure 9(a) after applying
the local alignment technique three times, with patch sizes of 32, 16 and 8.
Figure 9 – Mitigating the effect of motion parallax by applying local alignment: (a) image mosaic with parallax; (b) image mosaic after applying a single deghosting step (patch size of 32); (c) image mosaic after applying
three deghosting steps (patch sizes of 32, 16 and 8) [23].
As can be observed in Figure 9(c), this iterative local alignment process is able to refine the local
alignment and handle large motion parallax, thus considerably improving the quality of the created
panorama.
The most important limitations of the panoramic image creation solution presented in this section,
combining both global and local alignment techniques, are: 1) filling a gap (between the first and last
image in the input set, occurring due to accumulated misregistration errors) or removing an overlap
present in a panoramic image only works well for a set of images with uniform motion steps (i.e. pure
panning motions) and requires that the set of images encompasses the entire horizontal FOV of the 3D
sphere of vision; and 2) the global and local alignment techniques are patch-based (rather than
performing direct, pixel-wise intensity difference minimization) and thus may not remove all visible
artifacts present in the panorama.
2.3.2. Solution 2: Panoramic Image Creation using Invariant Feature based
Alignment and Multi-Band Blending
This section reviews the solution developed by Brown and Lowe in [24]: an automatic panoramic
image creation solution using feature-based alignment and multi-band blending techniques. Regarding
the architecture previously proposed in Section 2.1, this work corresponds to a 360º panorama creation
solution based on a feature-based registration technique.
A. Objectives and Technical Approach
The major objective of this second solution is to allow a fully automatic panoramic image creation, where
no input information on the image collection (e.g. images order) and no initialization of the image
alignment process is required from the user. This solution addresses the full-view panorama creation
problem as a multi-image matching problem, making it possible to recognize panoramas in a collection
of input images containing several panoramas; invariant local features are used to establish matches
between images in the collection, and multi-band blending is used to create seamless panoramas.
In this context, this solution starts by establishing accurate matches between the set of input images
using invariant local features, which remain unchanged with varying orientation, zoom and illumination
(due to changes in exposure/aperture and flash settings) in the input images; due to the features’
invariance properties, no input information on the image collection (e.g. image ordering) is required from
the user. After that, pairwise matching is established between each input image and the overlapping
images with the largest number of matched features; each set of matching images defines a panoramic
sequence. A global alignment technique (bundle adjustment) is then applied over each set of matching
images to reduce possible accumulated registration errors resulting from the previous registration step.
Then, an automatic panorama straightening technique is used to correct a possible wavy effect that
might be present in the panorama, due to relative camera motion around its optical center. Gain
compensation is applied afterwards to reduce the effect of different intensities in overlapping images.
Lastly, a multi-band blending technique is used to minimize the effect of false edges in the overlapping
regions of images that might still be visible in the panorama (due to unmodelled effects such as
vignetting or even some unwanted motion parallax of the camera's optical center), ensuring a smoother
transition between images and allowing this solution to output seamless panoramas.
B. Architecture and Main Tools
Figure 10 illustrates the architecture of the invariant feature-based, fully automatic panoramic image creation solution reviewed in this section.
Figure 10 - Architecture of the invariant feature based automatic panoramic image creation solution.
A brief walkthrough of the architecture depicted in Figure 10 is presented in the following, reviewing
the most interesting tools in more detail:
1. Feature-based Registration: First, SIFT features [10] are detected and extracted from all input
images. Each feature location is assigned a characteristic orientation and scale with the
objective of selecting stable features, i.e. features that remain constant under
changes of illumination, viewpoint or other viewing conditions and, therefore, can be extracted
even in images exhibiting orientation and zoom variations. The orientation and scale of each
SIFT feature is then saved in a feature descriptor vector. The feature orientation is useful when
the target image (i.e. image where a feature correspondence is looked for) is rotated with
respect to the reference image where the initial feature was extracted. The SIFT descriptor is
scale-invariant since it is computed by accumulating local gradients in orientation histograms
that are measured at the selected scale in a region/patch around each keypoint; this
characteristic provides robustness to affine changes since it enables edges to shift smoothly
without changing the local descriptor. By making use of gradients and normalizing the
descriptor, it is also possible to achieve illumination invariance. After all features have been
extracted from all input images in the collection, each feature is matched to its k nearest neighbors in
the descriptor space using a k-d tree method [25] (this solution considered k = 4).
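The nearest-neighbour matching in this step can be sketched as follows (a brute-force Python stand-in for the k-d tree search; the descriptor arrays and the helper name are illustrative):

```python
import numpy as np

def knn_matches(desc_a, desc_b, k=4):
    """For each descriptor in `desc_a` (one per row), return the indices of
    its k nearest neighbours in `desc_b` (Euclidean distance in descriptor
    space; SIFT descriptors would be 128-D rows).

    A brute-force stand-in for the k-d tree search used by the solution.
    """
    # squared distances between every row of desc_a and every row of desc_b
    d2 = ((desc_a[:, None, :] - desc_b[None, :, :]) ** 2).sum(axis=2)
    return np.argsort(d2, axis=1)[:, :k]
```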
2. Image Matching: For each input image, the overlapping images with the highest number of
descriptor matches to that image are selected, forming the set of potentially matching images;
in practice, a constant number of 6 images was considered for the potentially matching images
set size. Then, the RANSAC algorithm [12] with DLT (Direct Linear Transformation) [26] is
applied to each pair of candidate images, estimating the transformation (i.e. homography)
between the images. Feature/descriptor matches lying inside
the overlapping area that are geometrically consistent with the estimated homography form the
inlier features set while the remaining features (inside the overlapping area) that are not
geometrically consistent form the outlier features set. Afterwards, a probabilistic model is used to
verify each image match based on the number of inliers. After establishing the pairwise
matches between images, it is possible to recognize different panoramas in the image collection
by clustering it into connected sets of matching images; images that do not match any other
image in the collection do not belong to a panorama and are therefore considered noise images.
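The homography estimation in this step can be sketched in Python as follows (a simplified illustration of the DLT inside a RANSAC loop; it omits, e.g., data normalization and the probabilistic verification model):

```python
import numpy as np

def dlt_homography(src, dst):
    """Estimate a homography from >= 4 point pairs via the DLT: stack two
    linear constraints per pair and take the null vector of the system."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
    _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def ransac_homography(src, dst, iters=200, thresh=2.0, seed=0):
    """Toy RANSAC loop: fit H on random 4-point samples and keep the
    hypothesis with the most geometrically consistent matches (inliers)."""
    rng = np.random.default_rng(seed)
    best_H, best_inl = None, []
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = dlt_homography(src[idx], dst[idx])
        p = np.c_[src, np.ones(len(src))] @ H.T
        proj = p[:, :2] / p[:, 2:3]
        err = np.linalg.norm(proj - dst, axis=1)
        inl = np.where(err < thresh)[0]
        if len(inl) > len(best_inl):
            best_H, best_inl = H, inl
    return best_H, best_inl
```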
3. Bundle Adjustment - Once the set of geometrically consistent matches between the images is
found, it is necessary to perform global alignment over each cluster of matching images in order
to minimize accumulated errors resulting from the pairwise image registration in the previous
steps. To do that, a bundle adjustment technique [27] is used to estimate all the camera
parameters (i.e. rotation and focal length) simultaneously. Each image and its best matching
image (i.e. the one with the highest number of consistent matches) are added to the bundle
adjuster, one at a time; each time a new image is added, the bundle adjuster is initialized with
the same camera parameters (i.e. focal length and orientation) as the image to which it best
matches. After projecting each feature into overlapping images with corresponding features, the
camera parameters are then updated using the Levenberg-Marquardt algorithm [28] by
minimizing the sum of squared projection errors.
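The Levenberg-Marquardt update can be illustrated with a deliberately reduced, one-parameter Python example, fitting a single 2D rotation angle to point correspondences (the actual bundle adjuster solves for all rotations and focal lengths jointly):

```python
import math

def lm_fit_rotation(pts, obs, theta=0.0, lam=1e-3, iters=50):
    """Minimal Levenberg-Marquardt loop fitting one rotation angle so
    that rotating `pts` matches `obs` in the least-squares sense."""
    def residuals(t):
        c, s = math.cos(t), math.sin(t)
        return [v for (x, y), (u, w) in zip(pts, obs)
                for v in (c * x - s * y - u, s * x + c * y - w)]

    def jacobian(t):
        c, s = math.cos(t), math.sin(t)
        return [v for (x, y) in pts for v in (-s * x - c * y, c * x - s * y)]

    for _ in range(iters):
        r, J = residuals(theta), jacobian(theta)
        sse = sum(v * v for v in r)
        JtJ = sum(j * j for j in J)
        Jtr = sum(j * v for j, v in zip(J, r))
        step = -Jtr / (JtJ * (1 + lam))   # damped normal-equation step
        if sum(v * v for v in residuals(theta + step)) < sse:
            theta += step
            lam *= 0.5                    # good step: trust the model more
        else:
            lam *= 4.0                    # bad step: damp harder
    return theta
```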
4. Automatic Panorama Straightening - The goal here is to remove a wavy effect that may be
present in the final panorama due to unknown 3D rotations relative to a chosen world coordinate
frame, which were not taken into account in previous registration steps (only the relative rotation
between different positions of the camera was considered). It is unlikely that the camera
acquiring all the images is perfectly leveled and untilted. However, it is reasonable to assume
that twisting the camera relative to the horizon is something people rarely do, which minimizes
the impact of this problem. In this context, the automatic panorama
straightening technique corrects the wavy effect of the panorama by applying a global rotation
such that the vector perpendicular to the plane containing the camera center and the horizon
becomes vertical in the projection plane.
5. Gain Compensation - The overall intensity gain between images is first computed by defining
an error function (in this case, the sum of gain-normalized intensity errors) over all overlapping
samples. Once the gains are known, gain compensation is performed over all overlapping
samples thus reducing the intensity differences between overlapping images.
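The gain computation can be sketched as follows in Python (a simplified version with a single mean intensity per overlapping pair; the prior term keeps the gains close to 1, since the trivial all-zero solution would otherwise minimize the error):

```python
import numpy as np

def gain_compensation(n, overlaps, prior=0.1):
    """Solve for per-image gains g minimizing
        sum over overlaps (g_i * Ii - g_j * Ij)^2 + prior * sum (g_i - 1)^2,
    where Ii, Ij are the mean intensities the two images measure in their
    common region. The normal equations are assembled and solved directly.
    """
    A = np.zeros((n, n))
    b = np.zeros(n)
    for i, j, Ii, Ij in overlaps:
        A[i, i] += Ii * Ii
        A[j, j] += Ij * Ij
        A[i, j] -= Ii * Ij
        A[j, i] -= Ii * Ij
    A += prior * np.eye(n)   # prior pulls every gain towards 1
    b += prior
    return np.linalg.solve(A, b)
```

For two images measuring 100 and 50 in their overlap, the solution balances the two constraints: the gains end up near 0.6 and 1.2, so both images map the overlap to roughly the same intensity.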
6. Multi-band Blending - Lastly, a multi-band blending technique [29] is used to reduce some
unwanted effects present in the final panorama (e.g. image edges that are still visible due to a
number of unmodelled effects, such as parallax effects, radial distortion, and vignetting, among
others). To solve this problem, each image, expressed in spherical coordinates, is iteratively
filtered, using a Gaussian filter with a different standard deviation value in each iteration, and a
high-pass version of each image is created (in each iteration) by subtracting the filtered version
from the original image; the high-pass version of the image represents the spatial frequencies
in the range established by the Gaussian filter standard deviation value. Blending weight maps
are also created for each image in each (image filtering) iteration. The blending weight map of
each image is initialized by finding the set of samples in the previously created panoramic
image for which that image is the most responsible (for the sample values), and is then iteratively
filtered (while the image filtering takes place) using the same filter applied
to the image. The final panorama results from a weighted sum of all high-pass filtered versions
of each overlapping image, where the blending weights are obtained as previously described.
Therefore, low frequencies are blended over a large spatial range, while high frequencies use
a short range, thus allowing smooth transitions between images even with illumination changes
(while at the same time preserving high frequency details).
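The frequency-dependent blending ranges can be illustrated with a deliberately reduced, two-band, 1D Python sketch (the actual technique uses several Gaussian-filtered bands of 2D images in spherical coordinates):

```python
import numpy as np

def two_band_blend(a, b, seam):
    """1D, two-band sketch of multi-band blending of signals `a` and `b`
    joined at index `seam`: low frequencies are mixed with a wide linear
    ramp, while high frequencies switch sharply at the seam."""
    def lowpass(x, k=5):
        return np.convolve(x, np.ones(k) / k, mode='same')

    la, lb = lowpass(a), lowpass(b)
    ha, hb = a - la, b - lb            # high-pass = original minus filtered
    n = len(a)
    ramp = np.clip((np.arange(n) - seam) / 10.0 + 0.5, 0.0, 1.0)  # wide blend
    hard = (np.arange(n) >= seam).astype(float)                   # sharp seam
    return la * (1 - ramp) + lb * ramp + ha * (1 - hard) + hb * hard
```

Away from the seam each signal is reproduced exactly; around the seam, only the low band is mixed over a wide range, which is what hides exposure differences without blurring high-frequency detail.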
C. Performance and Limitations
In the reported experiments [24], a set of 24 input images, depicted in Figure 11(a), was used: 19 images
that match other images in the set (and, thus, belong to a panorama) and 5 images that do
not match any other image in the collection (i.e. noise images). In Figure 11(a), each arrow
connecting a pair of images indicates that a coherent set of feature matches was detected between
that image pair. As can be noticed from Figure 11(a), this solution can effectively establish accurate
matches between overlapping images.
Figure 11 – Recognizing panorama capability: (a) image collection containing connected sets of images that will later form different panoramas, plus noise images; (b) 4 different blended panoramas outputted by the panorama creation solution [24].

Figure 11(b) shows the four different panoramas resulting from the automatic panoramic image
creation solution reviewed in this section. By observing Figure 11(b), it is possible to conclude that this
solution can effectively recognize panoramas in a collection of input images containing several
panoramas while ignoring the noise images present in the collection.
Figure 12 shows a final panorama created from 57 different input images, with a spatial resolution of
2272x1704 pixels, acquired using the camera’s automatic mode, which allows the camera aperture and
exposure time to change, and the flash to fire, when appropriate during the acquisition of some images.
Figure 12(a) shows the panorama created without applying the gain compensation
technique and Figure 12(b) shows the same panorama with gain compensation applied. The
multi-band blending technique was applied to create a seamless 360ºx100º final panorama (with a
resolution of 8908x2552 pixels) projected in spherical coordinates.
Figure 12 – Panoramas produced: (a) without gain compensation; (b) with gain compensation; (c) with both gain compensation and multi-band blending [24].
The final panorama presented in Figure 12(c) shows (by comparison with Figure 12(a) and (b)) how
the gain compensation technique effectively reduces the effect produced by large changes in brightness
present in the input images; it also shows the multi-band blending technique capability to conveniently
smooth the remaining edges (after gain compensation) that were still noticeable due to unmodelled
effects such as vignetting.
The biggest limitations of the solution described above are: 1) creating panoramic images of 3D
world scenes containing many moving objects or several large objects will introduce visible
artifacts in the final panorama, as the multi-band blending technique (described in Section 2.3.2. B – step
6) was not designed to accommodate that type of scene; 2) image collections with large changes in
brightness will also cause noticeable artifacts in the panorama because strong brightness changes cannot
always be corrected by gain compensation and multi-band blending (described in Section 2.3.2. B –
steps 5 and 6); and 3) the cameras’ radial distortion will create noticeable artifacts in the final panorama as
well because this effect was not taken into account in the bundle adjustment technique (described in
Section 2.3.2. B – step 3).
2.3.3. Solution 3: Panoramic Image Creation using Invariant Feature based
Alignment and Seamless Image Stitching
The third solution to be reviewed was developed by Eden et al. in [30]. This work focuses on the
problem of seamless high dynamic range (HDR) panorama creation from scenes containing
considerable exposure differences, large motions and other misregistrations between the images in the
input set. In the context of the previously proposed architecture (see Section 2.1), this work fits in the
class of 360º panorama creation solutions based on feature-based registration.
A. Objectives and Technical Approach
This solution has the primary objective of enabling the creation of seamless HDR panoramas from a
collection of input images taken with standard handheld cameras that may contain large scene motions,
exposure differences and misregistrations between images, e.g. due to parallax and camera calibration
errors, among others.
To reach the intended objective, this solution starts by geometrically aligning the input images,
acquired at different orientations and exposures, using a feature based registration technique similar to
the one described in Section 2.3.2 [16] [24], which is able to handle exposure differences. After the input
images are geometrically aligned, they are radiometrically aligned by mapping the images to a common
global radiance space map; this is done by computing a radiance value for each image sample relating
the camera settings with each sample’s measured intensity (inversely mapped through the camera
model used). After determining the radiance values corresponding to each image, the final panorama is
created (in the radiance space) by selecting for each sample a radiance value from one of the images
in the input collection. Two distinct steps are here involved: first, a reference panorama covering the full
angular extent of the image collection is created from a subgroup of the (geometrically and
radiometrically) aligned images using a graph-cut technique similar to that described in [31] (while
working in the radiance space); this first step makes it possible to identify and fix the presence of moving objects
in the acquired scene. In the second step, the dynamic range of the reference panorama is extended to
the one present in the collection of input images by using cost functions favoring sample values with
higher signal-to-noise ratio (SNR) while trying to ensure smooth transitions between the images. Lastly,
a blending technique can be applied to the reference panorama, resulting in the final seamless HDR
panorama.
B. Architecture and Main Tools
Figure 13 illustrates the architecture of the seamless HDR panorama creation solution using invariant
feature based alignment and seamless image stitching, which was designed to handle large motion
scenes and exposure differences.
Figure 13 - Architecture of the invariant feature based seamless HDR panorama creation solution [30].
A brief walkthrough of this solution is presented in the following, giving more emphasis to the most
relevant tools (names for the modules come from the reference):
1. Capture with Varying Orientation and Exposure: In this first step, the acquisition of all images
that will later be used to create the desired HDR panorama is performed. The input image
collection is characterized by including images with large scene motion and considerable
exposure differences.
2. Geometrically Align: In this step, an automatic feature based alignment technique, similar to
the one described in Section 2.3.2 [16] [24], is used to geometrically align the input images. This
technique was chosen since it is capable of handling sets of input images acquired under
different exposure conditions.
3. Radiometrically Align: In this step, the set of input images already geometrically aligned (from
the previous step) undergoes a radiometric alignment process by computing a radiance value
for each image sample. This is done by relating the camera settings (shutter speed, aperture,
ISO and white balance) with the intensity measured at each sample; the camera is pre-
calibrated using a technique similar to the one in [32]. While the shutter speed, aperture and
ISO settings are extracted from EXIF (Exchangeable Image File Format) tags typically available
in digital still cameras [33], the white balance is computed from a rough estimate of the radiance.
This is achieved by selecting a reference image from the image collection with the desired color
balance and computing a per color channel gain (via least squares) so that the remaining
images match the color balance of the chosen reference image. After determining the gains,
they are applied inversely to the rough radiance estimate to normalize the color balance
differences, thus obtaining the final radiance value for each sample in each image.
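The radiometric mapping can be sketched as follows (an illustrative Python sketch assuming an already-linearized intensity and a simple exposure model; the actual solution inverts a calibrated camera response):

```python
def to_radiance(intensity, shutter, aperture, iso, wb_gain=1.0):
    """Map a (linearized) sample intensity to a relative radiance value by
    dividing out the exposure implied by the camera settings.

    Simplified model: exposure ~ iso * shutter / aperture^2, and the
    per-channel white-balance gain is applied inversely, as described above.
    """
    exposure = iso * shutter / (aperture ** 2)
    return (intensity / wb_gain) / exposure
```

The key property is that the same scene radiance, recorded under two different camera settings, maps to the same value.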
4. Image Selection: In this step, the two major objectives are to detect and fix the presence of
moving objects in the scene and also to fill the entire dynamic range of the reference panorama
from the dynamic range available in all images from the input collection. This can be achieved
through two steps:
a. Firstly, a reference panorama is created using a subgroup of images from the input
collection, after the geometric and radiometric alignment performed in the
previous steps. The chosen subgroup of images covers the entire FOV present in the
collection of input images, but not necessarily the entire available dynamic range. The
reference panorama is created using a two-step graph-cut technique similar to [31],
but with the particularity of working in the radiance space, where each sample of the
reference panorama has a corresponding value from one of the images in the subgroup;
b. Lastly, the entire dynamic range available from all images contained in the input
collection is used to extend the dynamic range of the previously created reference
panorama (adding some details whenever possible). To do that, data and seam cost
functions are used to select (preferably) samples with higher signal-to-noise ratios
(samples with higher intensity have higher signal-to-noise ratios) while keeping smooth
transitions. Each selected sample is assigned a label identifying the input image it
comes from. An energy-minimizing graph cut algorithm [34]
[35] is used to optimally select the sample (and corresponding label) minimizing both
cost functions.
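To convey the cost structure being minimized, the following Python sketch solves a deliberately reduced 1D version of the labeling problem by dynamic programming instead of a graph cut (data term: a per-position, per-label cost, e.g. inverse SNR; smoothness term: a constant seam cost per label switch):

```python
def select_labels(data_cost, seam_cost):
    """Pick one label per position minimizing
        sum_t data_cost[t][l_t] + seam_cost * [l_t != l_{t-1}]
    by dynamic programming over positions (Potts smoothness model)."""
    n, L = len(data_cost), len(data_cost[0])
    cost = list(data_cost[0])   # best cost ending at each label
    back = []                   # backpointers for recovering the labels
    for t in range(1, n):
        prev_best = min(cost)
        step, new_cost = [], []
        for l in range(L):
            stay = cost[l]
            switch = prev_best + seam_cost
            if stay <= switch:
                step.append(l)
                new_cost.append(stay + data_cost[t][l])
            else:
                step.append(cost.index(prev_best))
                new_cost.append(switch + data_cost[t][l])
        back.append(step)
        cost = new_cost
    labels = [cost.index(min(cost))]
    for step in reversed(back):
        labels.append(step[labels[-1]])
    return labels[::-1]
```

With a moderate seam cost the labeling follows the cheaper source and switches once; with a very large seam cost a single source is kept throughout, mirroring the data/smoothness trade-off of the graph cut.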
5. Blend Images: In this step, an image blending technique is applied to create the HDR
panorama. In this case, the sample value for each position in the final HDR panorama is directly
copied from the corresponding input image according to the samples (and corresponding labels)
selection performed in the previous step.
6. Tonemap: Finally, a tone-mapper may be applied to the HDR panorama if it is to be shown
on a display with a lower dynamic range; the tone-mapper used is not specified in [30].
C. Performance and Limitations
In the reported experiment [30], a set of input images acquired with a handheld rotating camera in auto-
bracket mode (i.e. the camera takes 3 different shots under different exposure conditions) was used to
create the HDR panorama. Figure 14(a) shows the set of input images after performing geometric
registration; it should be noticed that the set of input images presents a considerable amount of parallax
and also includes a person moving in the scene. Figure 14(b) center illustrates the reference panorama
after applying the first step of the image selection process, created from the input images with the
shortest exposure, and Figure 14(c) center illustrates the final HDR panorama generated after applying
the second step of the image selection process. Figure 14(b) right illustrates the result of applying a
tone-mapper to the reference panorama; this was done only to demonstrate the presence of noise in
the darker areas of the scene, since the reference panorama does not consider the full dynamic range
available in the image collection. Figure 14(b) and (c) left show the reference labels identifying the input
image (more specifically, the pixel value) selected to compose the desired panorama, before and after
the second step of the image selection process, respectively. Figure 14(c) shows the result of applying
the second step of the image selection process to the reference panorama; as can be seen, this step
considerably improves the quality of the reference panorama, but there are still visible artifacts due to
parallax and motion (notice that the tone-mapper also introduced some artifacts).
Figure 14 - HDR panorama creation [30]: (a) Registered input images; (b) Results after applying the first step of image selection: reference labels (left), resulting panoramic image (center) and tone-mapped version of the panoramic image created (right); (c) Results after applying the second step of image selection: final reference label (left), HDR panorama (center) and tone-mapped version of the
HDR compressed panorama (right).
From Figure 14(b) center and right, it is possible to conclude that, after the first step of the image
selection process, the reference panorama encompasses the entire FOV of all images (Figure 14(c)
center and right show the same FOV) but not the full high dynamic range available in the input set.
As shown in Figure 14(b) left, only two images from the input collection (each color corresponds to an
image) were necessary for the reference panorama to include the entire FOV of the
input set. From Figure 14(c) center and right, it is possible to notice that, by using a few more images
(see Figure 14(c) left), the dynamic range of the previously created reference panorama is conveniently
extended, adding more detail in many places. Figure 14(c) center and right show the final HDR
panorama without significant misalignments but the transitions between the images used to form this
panorama are not very smooth. The reason for this behavior is that, as previously mentioned, the sample
value for each position in the final HDR panorama is directly copied from the corresponding input image
according to the samples (and corresponding labels) selection process.
The most important limitations of this third solution are: 1) lack of an automatic mechanism to properly
select which images describe the location of the objects in the creation of the reference image; 2) lack
of image blending and tone-mapping techniques specifically designed for the HDR panorama creation
scenario; and 3) the vignetting effect present in images acquired with low-quality digital still cameras is
not addressed.
2.3.4. Solution 4: Panoramic Image Creation using a Locally Adaptive Alignment
Technique based on Invariant Features
The fourth solution to be reviewed was developed by Zaragoza et al. and presented in [36]. This work
focuses on improving the accuracy of the alignment of multiple images by estimating a piecewise perspective
transformation to account for deviations of the input data from the global perspective model, while preserving
the geometric realism of the scene in the resulting full-view panorama. In the context of the previously
proposed architecture in Section 2.1, this solution corresponds to a 360º panorama creation solution
based on a feature-based registration approach.
A. Objectives and Technical Approach
The foremost objective of this solution, denominated the As-Projective-As-Possible (APAP) method, is
to enable natural-looking panoramic image creation (ideally without visible misalignment artifacts) from
a collection of input images that differ from each other not only by rotation but also by translation; rotation
and translation variations are typical of a scenario where images are acquired by a casual user with a
handheld camera.
To achieve this goal, this solution begins by establishing accurate matches between the input images
in the collection using invariant features. Then, pairwise matching is performed between all overlapping
images, resulting in a set of global (rigid) homographies. The estimated global homographies are then
used to map all the keypoints from all overlapping images into a reference image selected from the
images collection. This reference image is then uniformly divided into a number of n x m cells, taking
each cell center as the cell representative sample. After that, a set of sample location dependent
homographies are determined between the reference image and overlapping images (i.e. mapping each
sample cell center and the remaining samples within the same cell of the reference image to one of the
remaining overlapping images) using the proposed Moving DLT method [36]. Then, a locally weighted
bundle adjustment [36] technique is used to simultaneously refine all the previously estimated adaptive
homographies, thereby improving the alignment between all overlapping images. Finally, an image
blending technique available in the literature is used to produce the final full-view blended panorama.
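The core of the approach, the sample-location-dependent homography, can be illustrated with a minimal numpy sketch of the weighted DLT estimation at one cell center. The weighting scheme follows the idea described in [36] (Gaussian decay floored at a small offset), but the parameter values and helper names below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def dlt_rows(p, q):
    """Two DLT constraint rows for one correspondence p -> q (2D points)."""
    x, y = p
    u, v = q
    return np.array([
        [0, 0, 0, -x, -y, -1,  v * x,  v * y,  v],
        [x, y, 1,  0,  0,  0, -u * x, -u * y, -u],
    ], dtype=float)

def moving_dlt_homography(src, dst, center, sigma=8.0, gamma=0.05):
    """Location-dependent homography at one cell center (Moving DLT sketch).

    src, dst: (N, 2) matched keypoints; center: (2,) cell representative
    sample. Matches near the cell center get weights close to 1; weights
    decay with distance but are floored at gamma, so distant matches keep
    a small influence and the warp stays close to a global homography.
    """
    d2 = np.sum((src - center) ** 2, axis=1)
    w = np.maximum(np.exp(-d2 / sigma ** 2), gamma)
    A = np.vstack([dlt_rows(p, q) for p, q in zip(src, dst)])
    W = np.repeat(w, 2)                        # one weight per pair of rows
    _, _, vt = np.linalg.svd(A * W[:, None])   # weighted least squares
    return vt[-1].reshape(3, 3)                # h = smallest singular vector
```

When the matches are exactly consistent with a single global homography, every cell recovers that homography (up to scale), which is the "reduces smoothly to a global homography" behavior noted above.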
B. Architecture and Main Tools
Figure 15 depicts the architecture of the panoramic image creation solution using a locally adaptive
alignment technique based on invariant features reviewed in this section.
Input Images → Feature-based Registration → Image Matching → Locally Weighted Bundle Adjustment → Image Blending → Final 360º Panorama
Figure 15 – Architecture of the panoramic image creation solution using a locally adaptive alignment technique
based on invariant features.
A brief walkthrough of the architecture is presented in the following, where the most interesting tools
deserve more detail:
1. Feature-based Registration: In this step, a set of SIFT features is extracted from all input
images. Those features are then matched to their k-nearest neighbors using a k-d tree algorithm
[23], in a similar way to the process described in Section 2.3.2 B – Step 1.
2. Image Matching: In this step, pairwise matching between overlapping images is performed in
a similar way to the process described in Section 2.3.2 B (step 2) [24]. This matching method
applies the RANSAC algorithm [12] with DLT [26] over each potential overlapping matching pair
of images to estimate a global (rigid) homography between them. An image connection graph
is then constructed based on the estimated set of global homographies, thus relating
overlapping pairs of images, with the goal of identifying the input image with the highest number
of feature matches with the remaining overlapping images; this image is chosen as the
reference image. The set of estimated global homographies are then used to map all the
keypoints from all overlapping images into the reference image. The keypoints coordinates in
the reference image sharing the same identity (determined by the pairwise matches established)
are averaged. The outcome of this image matching step is a set of coordinates (i.e. a set of
sample locations) within the reference image, where each sample location is potentially
matched to a particular keypoint present in one of the remaining overlapping images.
3. Locally Weighted Bundle Adjustment: After determining the reference image and the set of
sample locations (within the reference image) potentially matched to keypoints in the remaining
overlapping images, the reference image is uniformly divided into n × m cells; the center sample
of each cell is taken as the cell’s representative sample. For each cell, local weights are
computed for the cell center based on the distance between each point matching (i.e. sample
location potentially matched to a keypoint in one overlapping image) and the cell’s
representative sample. After that, for the representative sample of each cell in the reference image
for which a match has been found in one of the remaining overlapping images, a set of sample
location dependent homographies is estimated using the proposed Moving DLT method [36]; this
method aims at estimating the projective warping (sample location dependent homography) that
best respects the local structure around the cell's center sample. These homographies, locally
adapted to the cell’s representative sample, are then applied to every sample within the cell.
This way, it is possible to create an overall warping that flexibly adapts to the input data while
attempting to preserve the projective trend of the warp (homography); in fact, these adaptive
homographies reduce smoothly to a global homography as the camera translation becomes
zero. Then, bundle adjustment is used to simultaneously refine the set of sample location
dependent homographies by minimizing the transfer error of all established correspondences.
The transfer error is a weighted sum over all correspondences of the distance between each
keypoint in the overlapping images and the projective warp (obtained using the locally adapted
homographies previously estimated) of the potentially matched keypoint in the reference image.
4. Image Blending: Finally, the sample values for each position of the final panorama are obtained
by applying a blending method over all overlapping images, which minimizes seam effects that
may remain visible after the previous steps. In this work [36], this module has been implemented
using image blending techniques available in the literature (e.g. pixel intensity averaging,
feathering blending [6] and seam cutting [31]). For the results presented in Figure 16, the
overlapping images are simply blended using sample intensity averaging.
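Among the blending options mentioned in step 4, feathering can be sketched generically as follows. This is an illustrative implementation (not the one used in [36]): each warped image is weighted by every pixel's distance to the edge of its coverage mask, and the weighted images are normalized. It assumes the masks leave some uncovered canvas pixels; otherwise the distance transform is unbounded.

```python
import numpy as np

def feather_blend(images, masks):
    """Feathering blend sketch over images warped into a common canvas.

    images: list of HxW float arrays already warped to the panorama frame
    masks:  list of HxW bool arrays marking valid (covered) pixels
    """
    def border_distance(mask):
        # two-pass chamfer distance to the nearest invalid pixel
        h, w = mask.shape
        d = np.where(mask, np.inf, 0.0)
        for y in range(h):
            for x in range(w):
                if y: d[y, x] = min(d[y, x], d[y - 1, x] + 1)
                if x: d[y, x] = min(d[y, x], d[y, x - 1] + 1)
        for y in range(h - 1, -1, -1):
            for x in range(w - 1, -1, -1):
                if y < h - 1: d[y, x] = min(d[y, x], d[y + 1, x] + 1)
                if x < w - 1: d[y, x] = min(d[y, x], d[y, x + 1] + 1)
        return d

    weights = [border_distance(m) for m in masks]
    total = np.maximum(sum(weights), 1e-9)     # avoid division by zero
    return sum(w * im for w, im in zip(weights, images)) / total
```

In overlap regions the contribution of each image fades out toward its own border, which is exactly what suppresses visible seams relative to plain intensity averaging.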
C. Performance and Limitations
The results reported in [36] were obtained using 7 images of 2000x1329 pixels each, corresponding
to views differing in terms of rotations and translations. The reference image was
partitioned into 100x100 cells and the number of keypoints matched within this image was 13380. The
aligned images were projected into a cylindrical surface and image blending was performed using simple
pixel intensity averaging. Figure 16 illustrates the resulting panoramas obtained with the APAP solution
and the solution reviewed in Section 2.3.2 (denominated here as Autostitch). The red circles in both
panoramic images highlight alignment errors.
Figure 16 – Panoramas created with the Autostitch and APAP solutions [36].
As can be observed in Figure 16, the panorama created with the Autostitch solution presents
considerable misalignment artifacts. This is due to the fact that this solution aligns the images using
global homographies (applied to all samples of each pair of overlapping images), which are estimated
assuming that the overlapping images correspond to views that differ purely by a rotation. Conversely,
the APAP solution, based on location dependent homography estimation (via Moving DLT) and locally
weighted bundle adjustment, is able to handle more accurately the input images' deviations from the
aforementioned assumption; when compared to the Autostitch solution, the APAP approach leads to
far fewer misalignment artifacts in the static areas of the scene. However, neither solution can handle
large objects moving through the scene (such as people) without the use of pixel blending methods.
The most important limitation of this fourth solution is that it is not able to properly handle large
moving objects in the scene (which inevitably produce visible artifacts in the final panoramic image),
without the use of advanced pixel blending methods.
Chapter 3
3. Light Field based 360º Panoramas Creation
In this chapter, the basic concepts as well as some first methodologies and tools involved in the creation
of 360º panoramas using a light field imaging representation will be addressed. To achieve this goal,
this chapter begins by presenting the basic concepts regarding the light field imaging representation,
proceeding afterwards to a brief review of two representative solutions for light field based 360º
panorama creation. Contrary to the previous chapter, where conventional imaging 360º panorama
creation was addressed, this chapter addresses 360º panorama creation in the context of a new
imaging representation paradigm: light fields.
3.1. Basic Concepts
Since the beginning of photography, photographers have had the desire to change and edit their
pictures after they have been acquired. Conventional pictures are largely defined at the acquisition
moment, since parameters like the focus and the viewpoint are fixed at that time and cannot be changed
afterwards, thus creating several limitations associated to the traditional concept of photography. After
the acquisition moment, some visual information is irreversibly lost since conventional cameras only
capture in their sensors the total sum of the light rays that strike the same position in the camera lens,
rather than capturing the amount of light carried by each single light ray that contributes to the acquired
image. Whether they are analog or digital, conventional cameras only capture a two-dimensional
representation of the 3D world scene using the two available dimensions (the x and y axes) of the
camera film/sensor. This is clearly a limited imaging representation of the real visual scene.
Nowadays, with the recent emergence of new sensors and cameras capturing higher dimensional
representations of the visual world, the conventional imaging representation model corresponding to a
collection of rectangular samples for some wavelength ranges (i.e. 2D trichromatic image) is being
challenged. These tremendous developments, e.g. the Lytro [1] and Raytrix [2] cameras, are associated
with the need to provide the user with a more immersive, powerful and faithful visual experience of the
world around them.
To better understand the impact of these innovative developments, it is useful to revisit the
fundamentals of the human vision process in an attempt to discover and design new ways of representing
the visual world information [37]. In this context, the so-called plenoptic function assumes particular
relevance; it represents all the visual information in the world by considering the intensity of light that
can be seen from any 3D spatial position (viewpoint at x,y,z coordinates), from every possible angle
(angular viewing direction 𝜃, 𝛷), for every wavelength, λ, and at every time, t. This complete function
provides a powerful representation framework of the world visual information using 7 degrees of
freedom, P(x,y,z,𝜃, 𝛷, 𝜆,t) [38], as shown in Figure 17. The intensity of the light transported in a light ray
is denoted as radiance, i.e. the magnitude of each light ray.
Figure 17 – Illustrating the plenoptic function [37].
Despite the plenoptic function being able to provide a complete, powerful and conceptually simple
7D representation of the visual world, its high dimensionality implies a tremendous amount of data.
Thus, for practical and realistic applications, it is necessary to conveniently sample it, perhaps reducing
its dimensionality in an appropriate way. Moreover, the adopted plenoptic function sampling will have to
be performed while avoiding the occurrence of aliasing. There are some useful assumptions to simplify
the sampling of the plenoptic function, notably: 1) considering that the light rays travel in a 3D space
without regions of occlusion (i.e. free space), the radiance carried by a light ray remains unchanged along
its path through empty space, reducing one spatial dimension [39]; 2) fixing the time to the capturing
moment; and 3) considering only the wavelength of the electromagnetic waves in the visible spectrum
[37].
Considering the situation where the three previous sampling assumptions can be applied (i.e. a static
4D sampling of the plenoptic function), the notion of a 4D light field emerges corresponding to the
amount of light flowing through each point in space (x,y), for all directions (𝜃, 𝛷) [39], i.e. formally the
magnitude of each light ray (radiance) traveling in the empty space.
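In discrete form, a 4D light field is simply a four-dimensional array of radiance samples, commonly written L(u, v, s, t) with two angular and two spatial coordinates. The toy numpy sketch below (synthetic data, illustrative naming) shows the two canonical slices of such an array:

```python
import numpy as np

# Toy 4D light field L[u, v, s, t]: (u, v) index the angular direction,
# (s, t) the spatial position; the values here are synthetic.
U, V, S, T = 5, 5, 32, 32
L = np.random.rand(U, V, S, T)

# A sub-aperture image: fix one direction, keep all spatial samples.
center_view = L[U // 2, V // 2]          # shape (S, T), one 2D perspective

# The angular samples at one spatial position: all ray directions
# passing through a single point.
angular_patch = L[:, :, S // 2, T // 2]  # shape (U, V)
```

A conventional camera records only one such 2D slice (effectively integrating over the angular axes), which is exactly the information loss discussed above.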
As mentioned above, the recent development of new sensors and cameras aims to provide a more
immersive and powerful visual experience to the user. The most notable emerging developments in
this area are the so-called light field cameras which embrace this innovative imaging representation
approach; examples of the recent Lytro [1] and Raytrix [2] light field cameras are shown in Figure 18(a)
and (b), respectively. These new cameras make it possible to conveniently sample the plenoptic function as a 4D
light field, since they have a micro-lens (i.e. lenslet) array on the optical path, inserted between the
digital sensor and the main lens, as illustrated in Figure 18(c). Each lenslet captures a slightly different
perspective of the scene by considering directional information, meaning that the radiance for each
wavelength range is captured for each position and for each angle. Every sub-image (i.e. each
perspective view captured by each lenslet) differs a little bit from its neighbor sub-image, since the
incoming light rays are deflected slightly differently according to the corresponding lenslet position in the
array, as shown in Figure 18(d).
(a) (b)
(c) (d)
Figure 18 – Light field cameras and imaging acquisition system: (a) Lytro Illum camera [1]; (b) Raytrix camera [2]; (c) imaging acquisition system [39]; (d) micro images formed behind the micro-lens array.
Since these new cameras capture a higher dimensional representation of the visual information in a
scene, the resulting data allows new interesting functionalities compared to conventional photography,
notably: 1) to define which parts of the image should be in focus or not (i.e. interactive focus shifting)
after the image has been acquired (i.e. using post-processing); 2) to use larger apertures (i.e. the
opening of the lens's diaphragm through which light passes), facilitating photographs taken in low-light
environments without using a flash; 3) to change slightly the viewpoint for the visualization process [39];
4) to perform realistic recoloring and relighting of the acquired scene by using the abundant visual
information captured; and 5) to generate 3D stereo images from a single light field camera by using the
captured information associated to the scene depth [37].
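The first functionality, post-capture refocusing, is commonly implemented by "shift-and-add": each sub-aperture view is shifted proportionally to its angular offset from the center view and all views are averaged. The sketch below is a generic textbook formulation (not the Lytro processing chain), assuming a 4D array L[u, v, s, t] and an integer-pixel shift for simplicity:

```python
import numpy as np

def refocus(light_field, alpha):
    """Shift-and-add refocusing sketch over a 4D light field L[u, v, s, t].

    alpha selects the synthetic focal plane: each sub-aperture view is
    shifted by alpha times its angular offset from the center view, then
    all shifted views are averaged.
    """
    U, V, S, T = light_field.shape
    cu, cv = U // 2, V // 2
    out = np.zeros((S, T))
    for u in range(U):
        for v in range(V):
            dy = int(round(alpha * (u - cu)))
            dx = int(round(alpha * (v - cv)))
            out += np.roll(light_field[u, v], (dy, dx), axis=(0, 1))
    return out / (U * V)
```

Scene points at the depth matching alpha are reinforced coherently by the averaging, while points at other depths are blurred, which is what makes interactive focus shifting possible after acquisition.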
While this innovative imaging representation paradigm yields richer images produced by light field
cameras, these images cannot be directly and immediately displayed on traditional displays, which are
designed to display conventional 2D arrays of radiance. Thus, to visualize the scene
in some chosen way, it is first necessary to computationally extract from the 4D light field representation
a specific 2D image corresponding to a selected viewpoint. To perform this task, it is necessary to use
the so-called computational imaging methods that are able to render 2D images from the captured 4D
light fields [39]. However, to fully benefit at the display time of the 4D light field representation, new light
field displays are already being developed which are able to replicate in front of the viewer the originally
captured directional light field.
Naturally, if 360º panoramas were until now created from multiple conventional 2D images, it is
expected that 360º panoramas will in the future be created from multiple 4D light field images. Since
they are based on richer elementary images, these new panoramas are expected to provide the user
with additional functionalities. While this research field is still in its infancy, some first solutions already
exist in the literature and will be reviewed in the following.
3.2. Reviewing the Main Light Field based 360º Panorama Creation Solutions
This section will review two representative light field based 360º panorama creation solutions from the
literature. The solutions that will be reviewed were selected considering their technical approach in order
to make this review more thorough and valuable for the reader. The first reviewed solution follows an
approach similar to the traditional 360º panorama creation solutions (reviewed earlier in Section 2.3).
The second reviewed solution addresses light field based 360º panorama creation in a more innovative
way since it does not require performing depth reconstruction or extraction and matching of image
features as it happened in the first light field based solution reviewed.
3.2.1. Solution 1: Light Field based 360º Panorama Creation using Invariant
Features based Alignment
This section will review the solution developed by Lu et al. in [40]. This solution is designed to enable
light field based full-view panorama image creation using a feature based alignment technique to register
the set of input images. This solution corresponds to a light field based 360º panorama creation solution
using a feature-based registration approach.
A. Objectives and Technical Approach
The main objective of this solution is to enable light field based 360º panoramic image creation from
images acquired by the Lytro light field camera [1] using a feature-based image registration approach.
To accomplish this objective, this solution begins with image acquisition of the 3D world scene using
the Lytro light field camera [1]. Next, the depth information of the acquired scene is extracted by creating
a light field image stack per input collection image, where each image within the stack is focused at a
different depth. Then, each light field image stack is flattened into an all-in-focus image by copying, for
each sample position of the all-in-focus image, the (co-located) sample data from the most focused
image within each stack. After this, all the all-in-focus images (associated to the light field image stacks)
are converted to grayscale and features are extracted from each of them. Once the features have been
extracted from all grayscale all-in-focus images, feature matching is performed between a given all-in-
focus image (the reference image) and each one of the remaining all-in-focus images, estimating the
corresponding transformation between them. Using the estimated transformation, the non-reference
grayscale all-in-focus images are translated and warped to the reference image coordinate system.
Then, image blending and stitching techniques are applied to all grayscale all-in-focus images (aligned
in the same coordinate system), generating the final light field panorama. Lastly, using a viewer tool
created by the authors of this solution, it is possible to enjoy an interactive view over the created light
field based panorama, with the possibility of performing zoom-ins and zoom-outs at any specific location.
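The stack-flattening step of this pipeline (copying, per position, the sample from the most focused layer) can be sketched with a simple per-pixel sharpness measure. The absolute Laplacian used below is a common contrast proxy assumed for illustration; [40] does not specify the exact measure:

```python
import numpy as np

def all_in_focus(stack):
    """Flatten a focal stack into an all-in-focus image (sketch).

    stack: (N, H, W) images focused at different depths. For every pixel,
    a local contrast measure (absolute Laplacian) picks the sharpest
    layer, and that layer's sample is copied to the output.
    """
    def laplacian_abs(img):
        padded = np.pad(img, 1, mode="edge")
        lap = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
               padded[1:-1, :-2] + padded[1:-1, 2:] - 4 * img)
        return np.abs(lap)

    contrast = np.stack([laplacian_abs(im) for im in stack])  # (N, H, W)
    best = np.argmax(contrast, axis=0)                        # sharpest layer
    ys, xs = np.indices(best.shape)
    return stack[best, ys, xs], best

# toy usage: layer 0 is smooth, layer 1 has a sharp feature at (2, 2)
layer0 = np.full((5, 5), 0.5)
layer1 = np.zeros((5, 5)); layer1[2, 2] = 1.0
img, best = all_in_focus(np.stack([layer0, layer1]))
```

As noted above, working on the flattened image makes feature detection easier because defocus blur is removed before descriptors are computed.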
B. Architecture and Main Tools
Figure 19 depicts the architecture of the light field based panoramic image creation solution using
invariant features based alignment.
Figure 19 – Architecture of the light field based panoramic image creation solution using invariant features
based alignment, including its interactive consumption.
A succinct walkthrough of the architecture depicted in Figure 19 is presented in the following, giving
more detail to the most relevant technical tools used:
1. Light Field Acquisition: Firstly, images are acquired using the Lytro light field camera [1];
besides recording the location and intensity of light as a regular digital still camera does, the
Lytro camera also records the direction of the light rays. This camera produces pictures
originally in the RAW format.
2. 3D Data Extraction: In this step, the depth information of the photographed scene is extracted.
For that purpose, a stack of JPEG images, each one focused at a different depth, is created from
each image in the input collection (in the RAW format), using the software provided by Lytro [1].
For each input image (in RAW format), this software generates the desired light field image stack
as a number of different images with a spatial resolution of 1080 x 1080 pixels each and
associates a depth value to each image, corresponding to the depth of the focus plane; each
image within the stack defines a layer. The number of images (layers) in the stack depends on
how many different depth planes are necessary to cover the full focus range of the RAW image.
3. Feature Detection and Extraction: This step aims at detecting and extracting distinctive
features from each image. To achieve this goal, each light field image stack is first flattened into
an all-in-focus image. This is done by copying, for each sample position of the all-in-focus image,
the (co-located) sample data from the most focused layer; the most focused layer (image within
the stack) is determined by comparing the contrast at each sample position between layers.
This process allows easier and more accurate feature detection because it reduces the blurring
effect associated to image areas that are poorly focused. Each all-in-focus image (associated
to a light field image stack) is then converted to grayscale and SURF features [11] are extracted
from it. SURF descriptors characterize how pixel intensities are distributed within a scale
dependent neighborhood of each keypoint detected with the Fast Hessian detector, and offer
scale and rotation invariance (similarly to SIFT descriptors).
4. Image Alignment: After extracting features from all grayscale all-in-focus images, image
matching is performed. For that purpose, a reference image is first selected. In this case, the
first all-in-focus image is considered as the reference image, which means that the remaining
all-in-focus images will be aligned relative to it. Feature matching is then performed between
the reference (all-in-focus) image and the remaining all-in-focus images. After that, the RANSAC
algorithm [12] is applied to each pair of all-in-focus images in order to estimate the
transformation (homography) between them. As mentioned in Sections 2.1 and 2.3.2 B – Step 2,
feature matches that are geometrically consistent with the estimated homography are considered
inlier features, while the remaining features are considered outliers and, for that reason, are
discarded. The estimated homographies are then used to translate and warp
each image to the reference image coordinate system. Once the images are correctly aligned,
a JSON (JavaScript Object Notation) file is created in order to store the correspondences
found, including the location of the corners where the images align and the matching vertices.
5. Image Blending and Stitching: In this step, image blending and stitching techniques are
applied to the all-in-focus images previously aligned (step 4) to create the light field panorama.
Unfortunately, no details have been provided in [40] on the blending and stitching techniques
used.
6. Viewer: In this step, the user can enjoy an interactive view of the final light field based
panorama using an appropriate viewer tool. The viewer was created by the
authors and is an extension of the work developed by Behnam [41]. The user has the possibility
to interact with the created panorama, and thus enjoy an interactive experience, using the
mouse; it is possible to rotate, zoom-in and zoom-out the created panorama, etc. The zoom-ins
and zoom-outs are achieved by using the light field image stacks, which contain images focused
at different depths.
C. Performance and Limitations
Figure 20 shows a panoramic image created using 3 input images, acquired with the Lytro light field
camera, differing from each other only by rotation; each input image has a spatial resolution of 1080 x
1080 pixels.
Figure 20 – Panoramic Image created showing the regions of overlap between the all-in-focus grayscale
images [40].
Observing Figure 20, it is possible to conclude that this solution is able to perform accurate matching
between the all-in-focus grayscale images. Furthermore, it is also possible to observe that the final light
field panoramic image created is globally consistent, with only a few visible artifacts, e.g. the visible
discontinuity in the lamp post in the overlapping region between the second and the third image.
The most significant limitations of this light field based panoramic image creation solution are: 1) not
all depth information (i.e. depth map) generated by the Lytro light field software [1] about the acquired
scene is used, although it could lead to better results in the alignment process and help reducing the
presence of visible artifacts in the final panoramic image; 2) the viewer created by the authors shows
alignment problems in the zoom out transition between the child image (i.e. the most focused image at
that specific location present in the stack) and the parent image (i.e. the all-in-focus image corresponding
to that specific location in the final panorama). This happens when the child image is not perfectly aligned
with the large panorama; 3) the image stacks created by the Lytro software are optimized for web use,
so this software tries to make the stacks as small as possible, which may result in a stack with only one
image, from which it is impossible to extract accurate depth information (if this happens, the final
panoramic image may not be focused and the zoom-in and zoom-out accuracy may be compromised).
3.2.2. Solution 2: Light Field based 360º Panorama Creation using Regular Ray
Sampling
In this section, the solution developed by Birklbauer and Bimber in [42] will be reviewed. This solution
presents an innovative approach to record and compute light field based cylindrical 360º panoramas
where light rays are processed directly and, thus, it is not necessary to perform depth reconstruction or
extraction and matching of image features as it happened in the first light field based solution reviewed.
A. Objectives and Technical Approach
This solution has the primary objective of enabling light field based cylindrical 360º panoramic image
creation using all the information available from the set of input light fields acquired by the Lytro light
field camera [1].
To achieve the primary objective, this solution starts with the calibration of a Lytro camera based on the
calibration method described in [43], but with the particularity of supporting regular ray sampling. Then,
the scene's light field is acquired using the Lytro camera [1] mounted on a panoramic tripod head
specially designed and constructed by the authors [42] for this purpose. Next,
the RAW data produced by the Lytro camera is decoded to a regular 4D light field based on the approach
presented in [43]. Afterward, all input light fields are registered based on a minimization of a global root
mean square (RMS) per-pixel luminance registration error related to a set of extrinsic registration
parameters. Then, this registration procedure is applied over all possible panorama projections (related
with the cylindrical parameterization adopted). Finally, a blending step is applied to the set of registered
input light rays, generating the desired final cylindrical 360º light field panorama.
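The registration criterion can be illustrated with a much-simplified 1D analogue: a brute-force search over a horizontal pixel offset minimizing the RMS per-pixel luminance error in the overlap. This is an assumption-laden sketch; the actual method searches over the rotation angles αi and the ring distance dr of the cylindrical parameterization rather than a plain image shift:

```python
import numpy as np

def register_shift(ref, mov, max_shift):
    """1D registration sketch: find the horizontal offset of `mov` against
    `ref` that minimizes the RMS per-pixel luminance error in the overlap.
    """
    best, best_err = 0, np.inf
    w = ref.shape[1]
    for s in range(1, max_shift + 1):
        overlap_ref = ref[:, s:]          # right part of the reference
        overlap_mov = mov[:, :w - s]      # left part of the moving image
        err = np.sqrt(np.mean((overlap_ref - overlap_mov) ** 2))
        if err < best_err:
            best, best_err = s, err
    return best, best_err
```

In the full method this per-pair error is accumulated over all successive light field pairs, and the dr value minimizing the accumulated (global) registration error is retained.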
B. Architecture and Main Tools
Figure 21 illustrates the architecture of the light field based panoramic image creation solution using regular ray sampling.
Figure 21 - Architecture of the light field based panoramic image creation solution using regular ray sampling.
In the following, a short walkthrough of the architecture depicted in Figure 21 is presented while
reviewing in more detail the most relevant tools:
1. Calibration: In this step, the Lytro camera is calibrated according to the calibration method
described in [43]. This method determines the intrinsic and extrinsic parameters of a light field
camera and, furthermore, undistorts and rectifies the acquired input light fields. The outcome of
calibration and rectification (as explained in [43]) is an intrinsic transformation matrix that
transforms ray-coordinates i, j, k, l (i.e. micro-image pixel coordinates and lenslet indices) to
coordinates of ray intersection with two parameterized planes S’T’ and U’V’ (where S’T’ and
U’V’ represent the perspective and focal planes inside and outside the camera housing).
However, this intrinsic transformation matrix does not allow the desired regular ray sampling
and, thus, it had to be updated. To do that, the light field was re-parameterized by shifting the
S’T’ and U’V’ planes along their normal (i.e. camera’s optical axis) such that all rays with equal
i, j pixel coordinates focus in the same position on the shifted S’T’ (referred as ST plane) and all
rays with equal lenslet indices k, l have the same coordinates on the shifted U’V’ plane (referred
as UV plane). This led to a new intrinsic transformation matrix that allows regular ray sampling,
as desired. The intrinsic parameters depend on the desired zoom and focus settings of the camera's
main optics and must be pre-calibrated according to those settings.
2. Light Field Acquisition: Firstly, the scene's light field is acquired by rotating the panoramic
tripod head on which the Lytro light field camera [1] is mounted; the exposure and ISO speed
were kept constant. The Lytro camera has a 331 x 382 hexagonal micro-lens array ahead of a
3280 x 3280 CMOS sensor. Thus, the input light fields have a spatial resolution of 331 x
382 and an angular resolution of 11 x 11. This camera produces data originally in the RAW
format.
3. Decoding Lytro RAW Data: In this step, the RAW data produced by the Lytro camera (i.e. a
sensor image of nested hexagonally disposed micro-images) is converted into 9 x 9 perspective
images, each one having a resolution of 331 x 382, through sampling of corresponding entries
per perspective in each micro-image. Decoding the RAW data produced by the Lytro
camera into a regular 4D light field involves several steps, such as demosaicing, vignetting
correction, alignment, rectification, color correction and white balancing; while the former steps
are similar to the ones presented in [43], the latter two steps were performed by applying the
parameters extracted from the metadata of the first input light field to the remaining acquired
light fields of the same panorama, to prevent visible seams after the blending step. Then, each
perspective image is upsampled by a factor of two using linear interpolation (which depends on
image gradients in order to preserve edges) of the sub-pixel-shifted neighborhood.
4. Registration: In this step, all the input light fields are registered. Firstly, a set of (ideal) extrinsic
registration parameters is determined based on the pre-calibrated intrinsic parameters of the
light field camera (obtained for the desired focus and zoom settings of the camera’s main optics)
and the input light fields acquired under camera rotation. The extrinsic registration parameters
(i.e. the distance dr of the rotated ST planes to a common rotation axis and the angles of rotation
αi between successive input light field pairs) are related to the chosen cylindrical light field
parameterization, which requires a multi-view circular projection in one direction and a multi-view
perspective projection in the other direction [44]. Therefore, in a cylindrical light field
parameterization, rays are expressed on nested UV and ST cylinders instead of on two parallel
planes. The horizontal and vertical perspectives are characterized by
an angle β and a height h, respectively. Thus, each ray is parameterized by the parameters β,
h and by its intersection with the UV cylinder. For a given rotation angle between a pair of input
light fields, all rays belonging to that pair (in two-plane parameterization) are transformed with
respect to the current extrinsic parameters (dr and αi) and the rays (in cylindrical
parameterization) corresponding to the center perspective (β and h equal to zero) are selected.
The selected rays are projected from the ST plane onto the UV cylinder and are linearly
interpolated to match the sampling grid on the UV cylinder. Thus, for each pair of input light
fields, two partly overlapping cylindrically projected images (within the same cylindrical
parameter space) are computed. By minimizing the RMS per-pixel luminance error in the
overlapping image regions the optimal angle of rotation between the corresponding input light
field pair is found. This procedure is then applied successively to the remaining pairs of input
light fields and the corresponding RMS per-pixel luminance error (i.e. registration error) is
accumulated, obtaining the global registration error of the panorama for a given dr value. The
optimal dr value is the one that minimizes the global registration error.
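The angle search that minimizes the RMS per-pixel luminance error can be sketched as follows; this toy version reduces the cylindrical re-projection to a simple horizontal column shift between two already-projected images, which is a strong simplification of the procedure described above:

```python
import numpy as np

def rms_error(a, b):
    """RMS per-pixel luminance error between two overlapping image regions."""
    return np.sqrt(np.mean((a - b) ** 2))

def best_overlap(pano_a, pano_b, widths):
    """Brute-force search over candidate overlap widths (a hypothetical
    stand-in for the search over rotation angles alpha_i): pick the width
    minimizing the RMS error between the overlapping regions."""
    best = None
    for w in widths:
        # Overlap: last w columns of pano_a against first w columns of pano_b.
        err = rms_error(pano_a[:, -w:], pano_b[:, :w])
        if best is None or err < best[1]:
            best = (w, err)
    return best

# Toy example: pano_b overlaps pano_a by exactly 10 columns.
rng = np.random.default_rng(0)
pano_a = rng.random((20, 40))
pano_b = np.hstack([pano_a[:, -10:], rng.random((20, 30))])
width, err = best_overlap(pano_a, pano_b, range(2, 20))
print(width)  # 10
```

The same exhaustive-search idea extends to the dr parameter: the global (accumulated) registration error is evaluated per candidate dr and the minimizer is kept.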
5. Re-parameterization: In this step, the procedure adopted to register the input light fields
according to the center perspective (β and h equal to zero) is repeated for all remaining
panorama perspectives (β and h different from zero), considering simultaneously an appropriate
sampling in horizontal and vertical angular perspective directions. This is done with the goal of
guaranteeing that the final cylindrical 360º light field panorama supports the same perspective
sampling as the input light fields and that this sampling is symmetric in both directions.
6. Blending: In this step, the different input light rays are blended in order to create the final
seamless cylindrical 360º light field panorama. Firstly, every necessary ray belonging to the
same input light field is linearly weighted using a vertically constant hat function. The hat function
is centered with respect to each pair of light field projections at the center of the input ray bundle.
After that, and considering each horizontal perspective, the overlapping light field projections
are alpha-blended in order to create a pleasant panorama perspective. At the end of this step,
the final cylindrical 360º light field panorama is available.
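The hat-function weighting and blending of two aligned overlapping projections can be sketched as follows (a minimal grayscale sketch; the function names are illustrative):

```python
import numpy as np

def hat_weights(width):
    """Vertically constant hat (triangle) weighting across the columns,
    peaking at the center of the ray bundle and falling to zero at the
    borders."""
    x = np.linspace(0.0, 1.0, width)
    return 1.0 - np.abs(2.0 * x - 1.0)

def alpha_blend(img_a, img_b):
    """Alpha-blend two aligned overlapping projections with hat weights
    (minimal sketch of the per-perspective blending step; border columns
    where both weights vanish are degenerate in this toy version)."""
    wa = hat_weights(img_a.shape[1])[None, :]
    wb = hat_weights(img_b.shape[1])[None, :]
    return (wa * img_a + wb * img_b) / (wa + wb + 1e-12)

a = np.full((4, 5), 1.0)
b = np.full((4, 5), 3.0)
out = alpha_blend(a, b)
print(out[0, 1:-1])  # [2. 2. 2.] (equal weights -> average)
```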
C. Performance and Limitations
In the reported experiments [42], the light field acquisition was performed using a Lytro light field camera
[1] (in everyday mode, i.e. default refocus range, with 1.5x zoom) mounted on a panoramic tripod head.
Figure 22 depicts the resulting light field panoramas obtained with the proposed solution (first, second
and third rows) and with the solution reviewed in Section 2.3.2 (fourth row), denominated here as
AutoStitch; in the latter solution (AutoStitch), the panorama has been obtained through an independent
stitching procedure of the perspective images. The AutoStitch solution failed to globally register full
360º panoramas of complex acquired scenes, thus only fractions of the set of input light fields were
registered together. The resulting light field panorama created with the solution reviewed in this section
has a resolution of 8,805 x 662 x 9 x 9. The first and second rows of Figure 22 illustrate the final light
field panorama focused on the furthest (back focus) and closest (front focus) plane of the acquired scene.
The green squares in both light field panoramas (back and front focus) highlight two specific sections
that were compared between both solutions (the solution reviewed in this section and the AutoStitch
solution) in terms of refocusing (Figure 22 third and fourth rows).
Figure 22 - Panoramas created with the AutoStitch solution and the light field based 360º panorama creation solution reviewed in this section [42].
Since the AutoStitch solution processes the spatial and directional domains of the light field
independently, the light field panorama output by this solution is inconsistent in the directional
domain. As can be observed in Figure 22 (fourth row), this leads to noticeable artifacts, particularly in
refocusing, mainly due to the relatively large parallax in the acquired scene. Conversely, the solution
reviewed in this section processes the spatial and directional domains of the acquired light field jointly and,
therefore, provides correct results, particularly when refocusing (Figure 22, third row).
The biggest limitations of the solution reviewed in this section are: 1) it shares the
limitations of common light field acquisition (e.g. artifacts in the input light fields, due to under-sampling
in the spatial or directional domain, will also be noticeable in the output light field panorama); 2) it
demands a dense ray sampling (i.e. small parallax between light field perspectives) since it
uses linear interpolation to estimate missing light rays in the registration and re-parameterization
processes; and 3) it relies on some assumptions (e.g. the rotation point belongs to the
optical axis, the camera does not rotate around its optical axis, among others) which, if strongly violated,
can lead to failures.
Chapter 4
4. Light Field based 360º Panorama Creation:
Architecture and Tools
In this chapter, the proposed light field based 360º panorama creation solution is presented. In this
context, it starts by describing the global system architecture and walkthrough, followed by a detailed
description of the main parts, namely the light field data pre-processing module (which corresponds to
a solution already available in the literature [45]) and the key modules used to create the panorama light
field image. The creation of a light field panoramic image requires taking multiple light field images with
a suitable camera, at different camera angles. The created light field panorama should preserve the
reality of the scene as much as possible, thus preserving the directional ray information, which allows
the perspective and the objects in focus to be changed a posteriori.
4.1. Global System Architecture and Walkthrough
The main goal of this section is to describe the global system architecture and walkthrough of the
proposed light field based 360º panorama creation solution. Each light field image is acquired with the
Lytro Illum camera [46] and is represented as a 2D matrix of 15x15 sub-aperture images (called here
a perspective image stack). The idea followed here is to create a light field panorama from a set of
2D panoramas obtained by stitching all the perspective image stacks (light field images). This method,
named here multi-perspective image stitching, stitches together the set of perspective images
at the same location of the image stack, using classical 2D panorama creation techniques. Figure 23
illustrates 3 different light field images (as perspective image stacks) and the association between
perspective images of different light field captures which will be used as input for the stitching process.
The idea is to first perform stitching on the central images (yellow rectangles and arrows) located at
position (8,8) of the sub-aperture light field image (note that the first sub-aperture
image, which is a black sub-aperture image, has index (1,1)) and, then, derive some
(registration) parameters which are used to perform the stitching of the remaining perspective images (red
rectangles and arrows). This keeps the disparity between the different perspectives of
the light field panorama similar to the disparity between the perspectives of each light field image
(before stitching). In addition, by reusing the same (registration) parameters obtained from the central
perspective image, the process becomes more coherent, i.e. any stitching errors will occur in
all perspective images, and the perspective image content will only change due to occlusions or new
content and not due to a different deformation or blending between adjacent perspective images. The
stitching process yields a set of 2D perspective panoramic images which are regarded as the
final panoramic light field. The perspective images within the 4 corner regions of the matrix (for each light
field image presented) highlighted in green and labeled with a green letter “B” are not used to create
the final light field 360º panorama because they are black or very dark images and thus not useful.
These perspectives are replaced in the final light field 360º panorama by black panoramas. All the
perspective panoramas used (i.e. the 2D perspective panoramas and the black panoramas) have the
same resolution.
Figure 23 – Illustration of the stitching process of light field images (represented as perspective images stacks).
The proposed multi-perspective image stitching solution is inspired by the work of Brown and Lowe
in [24] previously reviewed in Section 2.3.2. This solution is able to create a light field based 360º
panorama using a feature-based registration approach. The projection type used to create the final light
field 360º panorama is the spherical projection. Figure 24 depicts the global system architecture of the
proposed light field based 360º panorama creation solution.
Figure 24 – Global system architecture of the proposed light field based 360º panorama creation solution.
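The core idea of multi-perspective image stitching (register once on the central perspectives and reuse the parameters for every other perspective) can be sketched as follows; the brute-force overlap search here is a hypothetical stand-in for the feature-based registration actually used:

```python
import numpy as np

def register_central_overlap(center_a, center_b, max_width):
    """Estimate the overlap width between two central perspective images by
    brute-force sum-of-squared-differences over candidate column overlaps
    (stand-in for the feature-based registration described in the text)."""
    best, best_err = 1, np.inf
    for w in range(1, max_width):
        err = np.mean((center_a[:, -w:] - center_b[:, :w]) ** 2)
        if err < best_err:
            best, best_err = w, err
    return best

def stitch_all_perspectives(stack_a, stack_b):
    """Register ONCE on the central perspective, then reuse the same
    parameter (here: a single overlap width) for every other perspective,
    so all perspectives undergo exactly the same deformation."""
    n_i, n_j = stack_a.shape[:2]
    overlap = register_central_overlap(stack_a[n_i // 2, n_j // 2],
                                       stack_b[n_i // 2, n_j // 2], 15)
    panos = np.concatenate([stack_a, stack_b[:, :, :, overlap:]], axis=3)
    return overlap, panos

rng = np.random.default_rng(1)
scene = rng.random((3, 3, 10, 50))                 # 3x3 perspective stacks
stack_a, stack_b = scene[:, :, :, :30], scene[:, :, :, 20:]
overlap, panos = stitch_all_perspectives(stack_a, stack_b)
print(overlap, panos.shape)  # 10 (3, 3, 10, 50)
```

Because the parameters come only from the central views, every perspective panorama is assembled identically, which is the coherence property argued for above.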
In the following, a brief walkthrough of the proposed solution depicted in Figure 24 is presented:
1. Light Field Acquisition: In this step, the light field of a visual scene is acquired from different
perspectives using a Lytro Illum light field camera [46], a Nodal Ninja 4 panoramic tripod head
[47] and a Manfrotto 190CXPRO 4 tripod [48]. The Lytro Illum camera is mounted on the
panoramic tripod head which rotates around a central rotation point, with a constant rotation
angle between each acquisition, to perform the acquisition of all parts (in the horizontal plane)
of the visual scene. Due to the acquisition procedure, the light field panorama obtained may
have a FOV of 360º in the horizontal direction and approximately 62º in the vertical direction
(corresponding to the vertical FOV of the Lytro Illum camera, since no vertical rotation is performed
in this acquisition step). Regarding the Lytro Illum camera, the light rays are collected by a
CMOS sensor (with 7728 × 5368 samples) containing an array of pixel sensors organized in a
Bayer-pattern filter mosaic as illustrated in Figure 25(a); this sensor produces GRBG RAW
samples with 10 bit/sample. Naturally, a lenslet array on the optical path (illustrated in Figure 25(b))
allows capturing the different light directions. The Lytro camera stores the acquired
information in the so-called .LFR files; this is a container format that stores various types of data,
notably the Raw Bayer pattern GRBG image, associated metadata, a thumbnail in PNG format
and system settings, among others. The remaining acquisition conditions are described in
Section 5.1.
Figure 25 - Lytro Illum light field camera: (a) GRBG Bayer-pattern filter mosaic [49]; and (b) imaging acquisition system [50].
2. Light Field Data Pre-Processing: In this step, the RAW light field data produced by the Lytro
Illum camera is pre-processed to obtain a 4D light field, i.e. a 4D array with two
ray direction indices and two spatial indices of pixel RGB data. Several operations are
performed, namely demosaicing, devignetting, transforming and slicing, and finally color
correction (the rectification light field processing is not performed). The pre-processing applied to
the RAW light field data uses the Light Field Toolbox developed by D. Dansereau [51],
which is explained in detail in Section 4.2. Afterwards, all the 2D perspective images (of the 4D light
field) are stored, i.e. 193 perspective images are extracted from the 4D Light Field (LF) array
and stored in bitmap format (the images obtained from one light field image
correspond to a perspective image stack). Also, a directory file listing all extracted
perspective images obtained from the set of light field images that compose the final panorama
is written. This file indicates the order of the input light field images and associated perspective
image stacks which will be used in the creation of the final light field 360º panorama; note that
the order needs to be sequential, i.e. the images need to be processed according to their position
in the final light field panorama. All perspective images
have a resolution of 623 x 434 pixels. Ideally, the resolution would be 625 x 434 pixels but, due
to the presence of black pixels in the first and last columns of each perspective image, it was
necessary to remove the first and the last column of each perspective image.
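The column cropping and the exclusion of the dark corner perspectives can be sketched as follows; the circular distance threshold is an illustrative assumption chosen only to reproduce the 193 kept / 32 excluded split stated above, not the exact criterion used:

```python
import numpy as np

def prepare_perspective_stack(lf):
    """Crop the black first/last columns of every perspective image
    (625 -> 623) and flag the dark corner perspectives of the 15x15 grid.
    `lf` has shape (15, 15, rows, cols)."""
    cropped = lf[:, :, :, 1:-1]              # drop first and last column
    i, j = np.meshgrid(np.arange(15), np.arange(15), indexing="ij")
    # Assumed circular mask: perspectives far from the grid center are
    # vignetted to (near) black and excluded from stitching.
    dark = (i - 7) ** 2 + (j - 7) ** 2 > 61
    return cropped, dark

lf = np.zeros((15, 15, 434, 625))
cropped, dark = prepare_perspective_stack(lf)
print(cropped.shape, int((~dark).sum()), int(dark.sum()))
# (15, 15, 434, 623) 193 32
```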
3. Central Perspective Images Registration: In this step, the central perspective images located
at position (8,8) of the sub-aperture light field image (one for each different perspective image
stack created in the previous step) are registered. The goal of this step is to obtain a set of
registration parameters that will be used to perform the composition (next step) of all perspective
images of each different light field. The main processes involved in this step are feature
detection and extraction, image matching, an initial (rough) pairwise camera parameters
estimation (intrinsic and extrinsic parameters), global camera parameters refinement, wave
correction and final perspective panorama scale estimation. This processing module is
described in detail in Section 4.3. The outcome of this process is the registration parameters,
which are the camera intrinsic and extrinsic parameters.
4. Composition: In this step, all corresponding perspective images in each different perspective
image stack are composed to produce the perspective panoramic images required to create
the final light field panorama. Thus, the goal of this step is to compose all 2D panoramas, i.e.
one 2D panorama for each different perspective of the 4D light field. The composition
of these 2D panoramas uses the central perspective images registration parameters
previously estimated (camera parameters). The main processes involved in this step are image
warping (where the spherical projection is applied), exposure compensation, seam detection
and blending. The first perspective panorama created is the central perspective panorama, since
the creation of the remaining panoramas uses information (image warped masks and corners)
obtained from the composition of the central perspective in the blending process. The perspective
images that correspond to the corners of the 2D array of light field images and are too dark (as
previously described) are replaced by black panoramic images with the same resolution as
the created 2D panoramas. The outcome of this process is a set of perspective 2D
panoramas (193 perspective panoramic images and 32 black panoramas), all with the same
resolution.
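The spherical projection applied during image warping can be sketched with the standard forward mapping from textbook panorama stitching; this is assumed to be the same idea as OpenCV's spherical warper, not its exact implementation:

```python
import numpy as np

def spherical_forward_map(x, y, f):
    """Forward spherical projection: a pixel (x, y), expressed relative to
    the principal point of a camera with focal length f, maps to panorama
    coordinates (f * theta, f * phi)."""
    theta = np.arctan2(x, f)                        # longitude
    phi = np.arctan2(y, np.sqrt(x * x + f * f))     # latitude
    return f * theta, f * phi

f = 500.0
u, v = spherical_forward_map(np.array([0.0, f]), np.array([0.0, 0.0]), f)
print(u, v)  # u = [0, f*pi/4] ~ [0, 392.7], v = [0, 0]
```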
5. Light Field 360º Panorama Creation: In this step, all perspective 2D panoramic images are
rearranged into a 4D light field in the same way as the input light field is represented after the
pre-processing module previously described in Step 2. By storing the final 360º light field
in this 4D format, it is possible to perform some rendering, e.g. extract a single
perspective panoramic image or refocus a posteriori on a specific depth plane of the acquired
visual scene, in the same way as with a usual light field image.
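The a posteriori refocusing enabled by the 4D representation can be sketched with a minimal shift-and-sum renderer (integer shifts only; real renderers use sub-pixel interpolation, and the `slope` parameter controlling the focal plane is illustrative):

```python
import numpy as np

def refocus(lf, slope):
    """Shift-and-sum refocusing over a 4D light field LF(i, j, k, l): each
    perspective is shifted proportionally to its angular offset from the
    center view and the shifted views are averaged."""
    n_i, n_j = lf.shape[:2]
    ci, cj = n_i // 2, n_j // 2
    acc = np.zeros(lf.shape[2:], dtype=float)
    for i in range(n_i):
        for j in range(n_j):
            dy, dx = round(slope * (i - ci)), round(slope * (j - cj))
            acc += np.roll(lf[i, j], (dy, dx), axis=(0, 1))
    return acc / (n_i * n_j)

# A perspective-independent toy scene refocuses to itself at slope 0.
lf = np.tile(np.arange(20.0).reshape(4, 5), (3, 3, 1, 1))
img = refocus(lf, slope=0)
print(np.allclose(img, lf[1, 1]))  # True
```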
4.2. Light Field Toolbox Processing Description
This section describes the internal processing of the Light Field Toolbox developed by D.
Dansereau [51], which is largely used in this Thesis. Figure 26 illustrates the architecture of the light field
image processing flow in the Light Field Toolbox (LFT) software [45]. The dashed modules represent
processing steps that need to be performed only once (offline) to obtain parameters that are
relevant for the processing of all remaining input light field images.
Figure 26 - Light Field Toolbox software: light field images processing architecture.
In the following, the walkthrough of the architecture in Figure 26 is presented, notably by highlighting
for each module its main objectives as well as inputs and outputs:
1. Lenslet Grid Model/Structure Estimation: In this step, a collection of white light field images
is used to create a set of lenslet grid models/structures; each Lytro Illum camera has its own
collection of white light field images internally stored, which is also available in the external
memory card of the camera. A white image is an image captured using a real-world setup that
requires a white diffuser (i.e. a translucent material used to soften the hard light produced, for
example, by a strobe light). The acquisition of the white light field images is performed by the
manufacturer (only once) when each camera is produced. For each white image in the
camera memory card, a lenslet grid model/structure (which is hexagonally packed in the case
of the Lytro Illum camera, as shown in Figure 27(a)) is generated and stored as a *.grid.json file. An
example of a white image also displaying its predicted lenslet centers (illustrated as red dots) is
shown in Figure 27(b). The zoom and focus settings of each white light field image in the camera
memory card are stored in a file called WhiteFileDatabase.mat; this file will later be used to select
the proper white light field image to perform the transforming and slicing process, but also the
devignetting operation, as will be further explained. This step is executed using the
LFUtilProcessWhiteImages command of the LFT software and is explained in [51].
Figure 27 - Hexagonal micro-lens array: (a) close up [52]; and (b) example of a white image and associated estimated lenslet centers represented as red dots [51].
2. Camera Calibration Parameters Estimation: In this step, a set of camera calibration
parameters is estimated (in advance of the processing of each of the light field
images); this is performed using a collection of calibration grid/checkerboard light field images
(previously acquired), each one acquired with a different pose. The camera calibration
parameters will be mainly used to compensate the angular difference between the
corresponding rays of adjacent pixels (within a micro-lens) and the radial distortion due to the
shape of the micro-lenses. To estimate the camera calibration parameters from the
checkerboard images it is necessary to perform: i) feature detection: corners are located in the
checkerboard light field images, as illustrated in Figure 28; ii) initialization: pose and intrinsic
parameters for each image are initialized; iii) optimization: pose and intrinsic parameters without
lens distortion (minimizing an RMSE) are coarsely estimated. Then, a second optimization
considering lens distortions is performed; and finally iv) refinement: camera intrinsic and poses
estimated parameters are refined. The outcome of this step is a set of camera calibration
parameters, notably plenoptic camera intrinsic and micro-lens radial distortion parameters
(stored in a file called CalibrationDatabase.mat). The rectification process will use the estimated
calibration parameters to attenuate the radial distortion effect which is present due to the
characteristics of the lenselet array as well as all the main optical lenses. This procedure is
explained in detail in [51]. This process is executed using the LFUtilCalLensletCam and
LFUtilProcessCalibrations commands of the LFT software.
Figure 28 - Example of: (a) calibration pre-processed light field checkerboard image; (b) checkerboard corners identification [51].
3. Demosaicing: In this step, a conventional linear demosaicing technique is applied to the
acquired raw image. The demosaicing process is responsible for producing a full color RGB lenslet
light field image from the non-overlapping Bayer pattern color samples of the Lytro Illum camera.
This process may produce undesirable effects for some pixels, notably those closer to the
micro-lens edges due to the reduction of light intensity; for that reason, the edge pixels may have to
be ignored. This process is described in [43] and is executed using the LFUtilDecodeLytroFolder
command of the LFT software. An example of an image before and after demosaicing is
illustrated in Figure 29.
Figure 29 - Example of an image: (a) before and (b) after demosaicing [53].
4. Devignetting: In this step, the vignetting effect in the RGB lenslet light field image is corrected.
The vignetting effect is the darkening of the pixels near the border of each micro-lens. Thus, from
the white images database previously created (as described in Step 1), the appropriate
white image for each acquired light field is selected; the RGB lenslet image is then divided by
the chosen white image to compensate the lower intensity close to the micro-image edges [43].
The outcome of this step is a devignetted light field, i.e. a light field where the vignetting
effect is (almost) not present. This process is described in [43] and is executed using the
LFUtilDecodeLytroFolder command of the LFT software. An example of a demosaiced raw
lenslet image without vignetting effect correction is illustrated in Figure 30.
Figure 30 - Example of a demosaiced raw lenslet image before devignetting [43].
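The division by the white image can be sketched as follows (a minimal sketch; the toolbox's actual implementation also handles normalization and hot-pixel details):

```python
import numpy as np

def devignet(lenslet_img, white_img, eps=1e-6):
    """Divide the demosaiced lenslet image by the matching white image to
    compensate the intensity fall-off near micro-image borders [43]."""
    white = np.maximum(white_img, eps)      # avoid division by zero
    return np.clip(lenslet_img / white, 0.0, 1.0)

# Toy white image: micro-image center bright, borders darker.
white = np.array([[0.5, 0.8, 0.5],
                  [0.8, 1.0, 0.8],
                  [0.5, 0.8, 0.5]])
captured = 0.6 * white                      # uniform grey scene, vignetted
print(devignet(captured, white))            # ~0.6 everywhere
```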
5. Transforming and Slicing: In this step, the devignetted RGB lenslet light field image is aligned
and sliced to a square grid of micro-images. Originally, the accurate placement of the micro-lens
array in the camera’s optical path is unknown and each lenslet element is spaced from
its neighbors by a non-integer multiple of the pixel/sample pitch (i.e. the distance from the
center of one pixel to the center of the next pixel). Thus, the outcome of this step is a lenslet
light field image organized into a square grid of lenslets and thus also micro-images. Each
micro-image has a resolution corresponding to the number of directions for which light intensity
is measured (the angular resolution). For Lytro Illum, the lenslet grid includes 625 x 434
elements and each lenslet element captures the light rays coming from 15 x 15 different
directions. After this step, each light field image is stored as a 4D matrix called LF(i,j,k,l,c) which
contains all sub-aperture images (i.e. perspective images), called here a perspective
image stack. This process is also explained in [43] and is executed using the
LFUtilDecodeLytroFolder command of the LFT software.
6. Color Correction: In this step, the previously obtained 4D light field image organized as a grid of
micro-images is color corrected. The main operations involved in this LFT module are gamma
correction, RGB color correction and color balancing (i.e. a global adjustment of the intensity of
the colors). This is achieved by using some light field metadata, e.g. the basic RGB color and
the gamma correction parameters. The outcome of this step is a color corrected lenslet light
field image and this process is executed using the LFUtilDecodeLytroFolder command of the
LFT software while including the ColourCorrect task.
7. Rectification: In this final step, each color corrected light field image is rectified with the goal
of significantly reducing the radial distortion of the micro-lenses (mainly due to their fly’s-eye
shape). The light field rectification process relies on the camera calibration parameters (i.e.
plenoptic camera intrinsic model and micro-lens radial distortion parameters) previously
estimated in Step 2, as described before. Thus, the outcome of this step is a rectified lenslet
light field image corresponding to a rectangular grid of lenslet images, aka micro-images. This
last step is executed using the LFUtilDecodeLytroFolder command of the LFT software by
including the Rectify argument and is explained in [43].
The final output light field image is stored as a 4D matrix called LF(i,j,k,l,c) (available after
the transforming and slicing processing module) with size (15,15,434,625,4). This matrix has the
following data indexing: (i,j) corresponds to the coordinates of each pixel within each micro-image, i.e.
when the first two indices i and j are fixed it corresponds to a 2D perspective image; (k,l) corresponds
to the spatial coordinates of the lenslet element in the array; (c) corresponds to the color component,
notably the 3 RGB components, but can also include a weight channel representing the confidence
associated to each pixel intensity value.
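This indexing convention translates directly into array slicing; for instance, extracting the central perspective image (1-based position (8,8) in the text, index 7 in 0-based arrays):

```python
import numpy as np

# LF(i, j, k, l, c): (i, j) selects the perspective, (k, l) the pixel inside
# it, and c the color (3 RGB components plus an optional weight channel).
LF = np.zeros((15, 15, 434, 625, 4), dtype=np.uint8)

# Central perspective (8, 8) in the 1-based indexing used in the text:
central = LF[7, 7, :, :, :3]
print(central.shape)  # (434, 625, 3)
```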
4.3. Main Tools: Detailed Description
This section describes in detail the main tools of the proposed light field based 360º panorama creation
solution. In the implementation of the proposed solution, the OpenCV library [54] was used, in which
some of the processing modules described in the following are implemented. The main tools are the
central perspective images registration and composition processes of the global system architecture
presented in Section 4.1.
4.3.1. Central Perspective Images Registration Processing Architecture
The central perspective images registration architecture of the proposed light field based 360º
panorama creation solution is shown in Figure 31. The main goal of the central perspective images
registration is to compute a set of registration parameters from all central perspective images of the
different perspective image stacks (obtained from the several 4D LF images covering different areas of
the visual scene). These central perspective images registration parameters will be used to compose all the
different perspective panoramas in the composition process described in detail in Section 4.3.2.
Figure 31 – Central perspective images registration architecture of the proposed light field based 360º
panorama creation solution.
In the following, a walkthrough of the registration process architecture illustrated in Figure 31 is
presented, with the main tools described in more detail:
1. Feature Detection and Extraction: In this step, local features [11] are detected and extracted
from all central perspective images (one for each perspective image stack) using the SURF
feature detector and extractor [10]. The SURF detector is a blob detector which is based on
the Hessian matrix to find points of interest. The SURF descriptors characterize how pixel
intensities are distributed within a neighborhood of each detected point of interest (keypoint).
SURF descriptors are robust to rotation, scale and perspective changes in a similar way to the
SIFT descriptors. Figure 32 shows the features detected from 2 overlapping central perspective
images (no feature scale or orientation is shown to allow the visualization of the content and
keypoint descriptor location).
Figure 32 – Features detected and extracted from 2 overlapping central perspective images.
2. Sequential Image Matching: In this step, the set of features detected and extracted (from all
central perspective images of each perspective image stack) in the previous step is pairwise
matched according to the order presented by the directory file (created in Section 4.1 - Step 2).
This order reflects the position of each acquired light field image in the final light field 360º
panorama. The feature matcher proceeds as follows: 1) for a given feature in one image, the two
best-matching descriptors in the other image are identified, yielding two candidate matches;
2) the two corresponding distances, which express how similar the two descriptors involved
in each match are, are computed; 3) the ratio between these two distances is computed, and
the best match is preserved only if it is sufficiently better than the second best (i.e. the distance
ratio passes a given threshold). This process is
repeated for every feature detected in one of the images. Afterwards, the RANSAC algorithm [12] with
DLT [29] is applied to each pair of central perspective images, estimating the transformation
model (i.e. homography) between them. After estimating the homography between each pair of
overlapping central perspective images, the features that are coherent with the estimated
transformation model are classified as inliers and the remaining ones are classified as outliers
and filtered out (removed). Figure 33 illustrates the image matching between the 2 overlapping
central perspective images after applying the RANSAC algorithm (i.e. inlier matches). Again,
the scale and orientation of the descriptors are not shown.
Figure 33 – Image Matching after applying RANSAC algorithm (inlier matches).
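The distance-ratio test used by the matcher can be sketched in a few lines; this pure-numpy version keeps a match when the best distance is clearly smaller than the second best, which is one common way of expressing the ratio criterion described above:

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.7):
    """Lowe-style ratio test: for each descriptor in desc_a, find its two
    nearest descriptors in desc_b and keep the match only when the best
    distance is below `ratio` times the second-best distance."""
    matches = []
    for ia, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        best, second = dists[order[0]], dists[order[1]]
        if best < ratio * second:
            matches.append((ia, int(order[0])))
    return matches

rng = np.random.default_rng(2)
desc_b = rng.random((6, 64))
desc_a = desc_b[:3] + 0.01 * rng.random((3, 64))   # 3 true correspondences
print(ratio_test_matches(desc_a, desc_b))  # [(0, 0), (1, 1), (2, 2)]
```

The surviving matches would then be fed to RANSAC homography estimation exactly as described in the text.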
3. Rough Camera Parameters Estimation: In this step, the camera intrinsic (focal length) and
extrinsic parameters (camera rotation) are roughly estimated. For each pair of overlapping
central perspective images, the camera intrinsic (focal length) and extrinsic (rotation)
parameters are estimated from the corresponding homography under the assumption that the
camera undergoes a pure rotation to capture different areas of the visual scene. All
transformations (i.e. homographies) used to estimate the camera intrinsic and extrinsic
parameters are generated from the previously estimated sequential pairwise matches (Step 2).
Then, the median of all estimated focal length values (one for each pair of overlapping
central perspective images) is taken as the focal length value to be used in the next step.
Camera translation is assumed to be zero during the whole light field 360º panorama creation
pipeline.
4. Global Camera Parameters Refinement: In this step, the camera intrinsic (focal length) and
extrinsic parameters (rotation) roughly estimated in the previous step are globally refined with a
global alignment procedure over each pair of matching images thus reducing accumulated
registration errors resulting from the sequential pairwise image registration. This is achieved
using a bundle adjustment technique [27] which simultaneously refines the camera intrinsic
(focal length) and extrinsic (camera rotation) parameters. The bundle adjustment technique only
considers the overlapping pairs of images that have a confidence value (which expresses the
reliability of the estimated homography for each pair) above a given threshold. In this case, the
bundle adjustment technique minimizes the sum of the distances between the rays passing
through the camera centers and the SURF features matched in Step 2. The
Levenberg-Marquardt algorithm [28] is used to update the camera parameters by minimizing
the sum of squared projection errors associated with the projection of each feature into the
overlapping images with corresponding features.
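The Levenberg-Marquardt iteration used for this refinement can be illustrated on a generic toy least-squares problem; this is a simplified sketch of the damped update rule, not the camera refinement itself (the toy exponential model is an assumption made purely for illustration):

```python
import numpy as np

def levenberg_marquardt(residual, jac, x0, iters=50, lam=1e-3):
    # Minimize ||residual(x)||^2 with the damped normal equations
    #   (J^T J + lam I) dx = -J^T r,
    # shrinking lam after a successful step and growing it otherwise.
    x = np.asarray(x0, float)
    cost = np.sum(residual(x) ** 2)
    for _ in range(iters):
        r, J = residual(x), jac(x)
        A = J.T @ J + lam * np.eye(len(x))
        dx = np.linalg.solve(A, -J.T @ r)
        new_cost = np.sum(residual(x + dx) ** 2)
        if new_cost < cost:
            x, cost, lam = x + dx, new_cost, lam * 0.5
        else:
            lam *= 10.0
    return x

# Toy problem: recover (a, b) in y = a*exp(b*t) from exact samples
t = np.linspace(0, 1, 8)
y = 2.0 * np.exp(1.5 * t)
res = lambda p: p[0] * np.exp(p[1] * t) - y
jacf = lambda p: np.stack([np.exp(p[1] * t),
                           p[0] * t * np.exp(p[1] * t)], axis=1)
p = levenberg_marquardt(res, jacf, [1.0, 1.0])
```

In the thesis pipeline the residuals are the feature projection errors and the unknowns are the focal length and camera rotations; the damping makes the iteration robust to the rough initialization produced in Step 3.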
5. Wave Correction: In this step, a panorama straightening technique is used with the goal of
reducing the wavy effect that may occur in each final 2D perspective panoramic image. This
technique straightens the final panorama by correcting the camera extrinsic parameters
(i.e. rotation) to keep the horizon level. The wavy effect is due to the unknown motion of the
camera rotation central point relative to a chosen world coordinate frame, since it is rather hard
to keep the camera rotation central point perfectly static and stable during the acquisition of all
the light field images that compose the final panorama. Since only horizontal camera rotations
are considered during the whole light field 360º panorama creation pipeline, the unknown
motion of the camera rotation central point is not accounted for in the previous registration
steps. The camera parameters are updated according to a global rotation applied such that the
vector normal to the horizontal plane containing both the horizon and the camera centers is
vertical in the projection plane. Figure 34 illustrates the result of applying the described
panorama straightening technique to a perspective panoramic image.
(a)
(b)
Figure 34 – Wave correction examples: (a) without and (b) with applying the panorama straightening technique. Both examples presented are the final panorama that was obtained after all composition steps.
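The global rotation described above can be sketched as follows; this is a simplified illustration in which the global 'up' direction is estimated as the vector least aligned with the camera x-axes, and then applied as a single correcting rotation (a sketch under these assumptions, not the exact straightening implementation used in this work):

```python
import numpy as np

def wave_correct(rotations):
    # Estimate the global 'up' direction as the vector least aligned with all
    # camera x-axes (smallest eigenvector of the sum of x x^T), then apply one
    # global rotation so that this direction becomes the world vertical.
    X = np.stack([R[0] for R in rotations])         # camera x-axes (first rows)
    _, vecs = np.linalg.eigh(X.T @ X)
    up = vecs[:, 0]                                 # smallest-eigenvalue vector
    if up[1] < 0:
        up = -up
    z = np.mean([R[2] for R in rotations], axis=0)  # average viewing direction
    x = np.cross(up, z)
    x /= np.linalg.norm(x)
    G = np.stack([x, up, np.cross(x, up)])          # global correction rotation
    return [R @ G.T for R in rotations]

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

# Demo: horizontal pans contaminated by a common 5 degree tilt of the tripod
tilt = rot_x(np.deg2rad(5))
cams = [rot_y(np.deg2rad(t)) @ tilt for t in (0, 15, 30, 45)]
straight = wave_correct(cams)
```

After the correction, the cameras in the demo become pure horizontal rotations again, i.e. their y-axes coincide with the world vertical, which is exactly the straightening effect shown in Figure 34.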
6. Final Perspective Panorama Scale Estimation: In this step, the perspective panoramic image
scale of all 2D perspective panoramas is estimated according to a specific focal length value.
This is done by sorting in ascending order all the focal length values previously refined (i.e.
updated in the global camera parameters refinement step) and selecting the middle value of
this set. This step can be performed in parallel with the previous one since the focal length
values are not changed further and are already available after Step 4. The selected value will
be used later in the image warping of all perspective panoramic images.
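The middle-value selection described above amounts to the following; note that for an even number of focal lengths this sketch picks the upper middle element, an assumption since the text does not specify that case:

```python
def panorama_scale(focals):
    # Sort the refined focal lengths in ascending order and select the middle
    # element, which becomes the scale of the final perspective panoramas.
    ordered = sorted(focals)
    return ordered[len(ordered) // 2]
```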
4.3.2. Composition Processing Architecture
This section describes in detail the composition process architecture of the proposed light field based
360º panorama creation solution. Figure 35 depicts the composition process architecture. The
composition process aims to create all perspective panoramic 2D images by using the previously
estimated registration parameters of the central perspective image panorama. The registration
parameters required by the composition module are the camera (intrinsic and extrinsic) parameters. The
first perspective panorama created is always the central perspective panorama. The dashed modules
and arrows represent a processing step (seam detection) that is only performed for the central
perspective images. The creation of the remaining panoramas requires some information from
the image warping and seam detection processes of the central perspective panorama, namely top-left
corners (relating the position of each image in the final light field 360º panorama) and image warped
masks, respectively. The orange arrow going from the blending process back to the image warping process
symbolizes the iteration loop over all perspective images of the different perspective image stacks, i.e. the
proposed solution iterates over all perspective stacks to create a set of perspective panoramas.
Figure 35 – Composition architecture of the proposed light field based 360º panorama creation solution.
In the following, the walkthrough of the composition process architecture illustrated in Figure 35 is
presented while describing in detail the main tools:
1. Image Warping: In this step, image warping is performed using all perspective image stacks
and the central perspective images registration parameters (i.e. camera intrinsic and rotation
parameters) previously estimated in the registration process. The goal of this process is to apply
a deformation of all input images according to the selected projection and to obtain a set of top-
left corners that will be used in the blending process later described. Thus, all perspective
images are projected/warped using a spherical rotation warper according to the final perspective
panorama scale value and the camera parameters (i.e. intrinsic parameters and rotation)
previously estimated in the central perspective images registration process. Besides the warped
images, the output of this step is also a collection of top-left corners (one corner for each warped
image). The top-left corners obtained from the image warping of the central perspective image
are used later in the blending process of all remaining perspectives of the final light field 360º
panorama. All warped images will be used later in the exposure compensation and blending
processes. Figure 36 illustrates a central perspective image before (Figure 36(a)) and after
(Figure 36(b)) undergoing the described image warping process (to help visualize the difference
between the two images, it is advised to look at the left border of Figure 36(b)).
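The spherical warping of a single pixel can be sketched as follows; this is a simplified forward mapping under an assumed pinhole model with the principal point at the origin, not the exact warper used in this work:

```python
import numpy as np

def warp_point(p, K, R, scale):
    # Back-project pixel p through K, rotate the ray by the camera rotation R,
    # and map it onto the sphere: u is the longitude, v the colatitude,
    # both multiplied by the panorama scale (the selected focal length).
    ray = R @ np.linalg.inv(K) @ np.array([p[0], p[1], 1.0])
    x, y, z = ray / np.linalg.norm(ray)
    u = scale * np.arctan2(x, z)
    v = scale * (np.pi / 2 - np.arcsin(y))
    return u, v

# Demo: the principal point of a camera panned by 30 degrees lands at
# longitude 30 degrees on the spherical panorama
K = np.array([[500.0, 0, 0], [0, 500.0, 0], [0, 0, 1.0]])
a = np.deg2rad(30)
R_pan = np.array([[np.cos(a), 0, np.sin(a)],
                  [0, 1, 0],
                  [-np.sin(a), 0, np.cos(a)]])
u0, v0 = warp_point((0, 0), K, np.eye(3), 1.0)
u1, v1 = warp_point((0, 0), K, R_pan, 1.0)
```

Applying this mapping to every pixel of an image, and recording the minimum (u, v) over the image, yields the warped image and its top-left corner used later in blending.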
(a) (b)
Figure 36 - Image warping example: (a) before and (b) after applying image warping to a central perspective image.
2. Exposure Compensation: In this step, an exposure compensation technique [55] is used with
the goal of attenuating the intensity differences between the warped images that compose each
final 2D perspective panorama. The technique tries to remove exposure differences
between overlapping perspective images by adjusting image block intensities. By dividing each
warped image into blocks and making use of the overlapping and non-overlapping information for
each pixel, soft transitions are achieved within a perspective panorama containing various
overlapping regions and also between different overlapping perspective images.
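The core of such gain compensation can be sketched as a small linear least-squares problem over per-image gains; this is a simplified global (one gain per image) version, not the block-based technique of [55], and the regularization weight is an illustrative assumption:

```python
import numpy as np

def gain_compensation(pairs, n, lam=0.01):
    # pairs: tuples (i, j, mean_i, mean_j) with the mean intensities of images
    # i and j inside their common overlap region. Minimize
    #   sum (g_i*mean_i - g_j*mean_j)^2 + lam * sum (g_k - 1)^2
    # over the per-image gains g via the normal equations.
    A = lam * np.eye(n)
    b = lam * np.ones(n)
    for i, j, mi, mj in pairs:
        A[i, i] += mi * mi
        A[j, j] += mj * mj
        A[i, j] -= mi * mj
        A[j, i] -= mi * mj
    return np.linalg.solve(A, b)

# Demo: image 1 is 20% darker than image 0 inside their overlap
gains = gain_compensation([(0, 1, 100.0, 80.0)], 2)
```

The data term equalizes the overlap intensities (the darker image is brightened, the brighter one dimmed) while the prior keeps all gains close to one, which produces the soft transitions described above.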
3. Seam Detection: In this step, a graph-cut seam detection technique [56] is used with the goal
of estimating seams, i.e. lines which define how the overlap areas in the warped images will
contribute to the creation of the final perspective panoramic image. With this goal in mind, image
masks and seams are estimated jointly so as to find the optimal seams between
overlapping central perspective images (note that this step is only performed for the central
perspective images, which is the reason why this module is dashed). The graph-cut seam
detection technique determines the optimal position of each seam between all warped central
perspective images, enabling the composition process of all perspective panoramas. This
technique creates the image masks which define the seams used to compose all images of the
panorama, using the top-left corners obtained from the image warping process previously
described. Figure 37 illustrates an image mask resulting from the seam detection process over all
central perspective images. The white region defines the position of the central perspective
image corresponding to the presented image mask, and the lines that separate the white region
from the black one are the detected seams.
Figure 37 – Image mask example.
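Although the thesis uses a graph-cut formulation [56], the idea of placing the seam where the overlapping images agree best can be illustrated with a simpler dynamic-programming vertical seam (this sketch is not the graph-cut method; it is a minimal stand-in over a synthetic difference map):

```python
import numpy as np

def find_seam(diff):
    # diff: per-pixel squared difference between two warped images inside
    # their overlap (H x W). Returns one column index per row forming the
    # minimal 8-connected vertical seam via dynamic programming.
    h, w = diff.shape
    cost = diff.astype(float)
    for r in range(1, h):
        left = np.r_[np.inf, cost[r - 1, :-1]]
        right = np.r_[cost[r - 1, 1:], np.inf]
        cost[r] += np.minimum(np.minimum(left, cost[r - 1]), right)
    seam = np.empty(h, int)
    seam[-1] = int(np.argmin(cost[-1]))
    for r in range(h - 2, -1, -1):
        lo = max(0, seam[r + 1] - 1)
        hi = min(w, seam[r + 1] + 2)
        seam[r] = lo + int(np.argmin(cost[r, lo:hi]))
    return seam

# Demo: the two images agree perfectly along column 2, so the seam follows it
d = np.ones((5, 6))
d[:, 2] = 0.0
seam = find_seam(d)
```

A graph-cut formulation generalizes this idea to arbitrarily shaped seams by minimizing the same kind of difference cost over a pixel graph instead of a single column per row.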
4. Blending: In this step, a multi-band blending technique [29] is applied to the region where
images are overlapping. The goal of this technique is to attenuate some undesired effects that
may exist in each final perspective panorama, such as visible seams due to exposure
differences, blurring due to misregistration, ghosting due to objects moving in the scene, radial
distortion, vignetting, parallax effects, among others. A detailed description of this blending
technique is available in Section 2.3.2 B – Step 6. This step uses the image masks obtained in
the seam detection step (which is only performed for the set of central perspective images), the
top-left corners associated with the central perspective images, and the warped perspective
images needed to create a perspective 2D panorama.
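The essence of multi-band blending can be illustrated with a two-band, one-dimensional sketch: low frequencies are mixed with a smoothed mask, high frequencies with the hard seam mask. This is a simplified stand-in (a moving average replaces the Gaussian pyramid of [29]):

```python
import numpy as np

def box_blur(x, k=5):
    # 1D moving average as a stand-in for the Gaussian low-pass of a pyramid
    return np.convolve(x, np.ones(k) / k, mode='same')

def two_band_blend(a, b, mask):
    # mask is 1.0 where image 'a' wins (from seam detection) and 0.0 where
    # 'b' wins; low frequencies are mixed with a smoothed mask, high
    # frequencies with the hard mask.
    low_a, low_b = box_blur(a), box_blur(b)
    soft = box_blur(mask)
    return (soft * low_a + (1 - soft) * low_b
            + mask * (a - low_a) + (1 - mask) * (b - low_b))

# Demo: blend a bright and a dark strip with a seam in the middle
a = np.ones(20)
b = np.zeros(20)
mask = np.r_[np.ones(10), np.zeros(10)]
out = two_band_blend(a, b, mask)
```

Blending low frequencies over a wide region hides exposure differences, while keeping high frequencies on the hard seam preserves sharp detail, which is why the technique attenuates visible seams without introducing ghost edges.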
After finishing the blending process for a given perspective panoramic image, the proposed solution
starts the composition of the next perspective panorama (i.e. it goes back to Step 1 of the composition
process), which corresponds to the neighboring perspective to the right. The final outcome of this
process is a set of perspective 2D panoramic images which, all together and rearranged into a 4D array,
can be understood as a light field panorama.
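The rearrangement into a 4D array can be sketched as follows; the grid and image sizes are illustrative (a tiny resolution is used here), and the 0-based central index (7, 7) corresponds to the 1-based position (8,8) used in the text:

```python
import numpy as np

# Hypothetical sizes: a 15x15 grid of perspective panoramas of H x W pixels
S, T, H, W = 15, 15, 4, 8                  # tiny H, W for illustration only
panoramas = {(s, t): np.full((H, W), s * 100 + t)
             for s in range(S) for t in range(T)}

# Rearrange the set of 2D perspective panoramas into a 4D light field array
light_field = np.zeros((S, T, H, W))
for (s, t), img in panoramas.items():
    light_field[s, t] = img

central = light_field[7, 7]  # extract the central perspective (0-based index)
```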
Chapter 5
5. Light Field based 360º Panorama Creation:
Assessment
In this chapter, the performance of the light field based 360º panorama creation solution proposed will
be assessed. To achieve this goal, this chapter begins by introducing the test scenarios used for the
assessment of the proposed solution and the corresponding acquisition conditions, followed by the
presentation and analysis of results for a representative number of light field panorama examples; the
analysis will consider both the multi-perspective and refocus capabilities.
5.1. Test Scenarios and Acquisition Conditions
This section intends to describe the test scenarios designed to appropriately assess the performance of
the light field based 360º panorama creation solution proposed. In addition, it intends to describe the
acquisition conditions adopted for each test scenario.
5.1.1. Test Scenarios
The design of appropriate test scenarios is critical for the good assessment of the light field based 360º
panorama creation solution. Each test scenario attempts to reproduce relevant acquisition conditions of
a common user in a given real scenario. Each scenario will enable the assessment of different
capabilities, e.g. refocus capability, perspective shift/parallax, among others, of the created light field
based 360º panoramas. Table 1 summarizes the main characteristics defining the test scenarios, notably
the position of the interesting objects (i.e. the objects that may be a posteriori refocused) and the camera
refocus range used in the acquisition. The camera refocus range refers to the range of depth planes a
priori selected by the user at the acquisition moment for which a posteriori refocusing should be possible.
In Table 1, each combination of characteristics is labelled with a letter and a number combination where
the letter corresponds to the position of the interesting objects and the number to the position of the test
scenario in Table 1. Some of the combinations were not pursued because they have no practical
relevance, i.e. they have no interest from the point of view of possible real scenarios.
Table 1 – Test scenario characteristics.

Camera Refocus Range | Interesting objects close to the camera | Interesting objects close and far away from the camera | Interesting objects far away from the camera
Short | Test A.1 | - | -
Short | Test A.2 | - | -
Large | - | Test B.3 | Test C.3
In the following, the selected test scenarios defined in Table 1 are briefly discussed to highlight their
added value for the performance assessment to be made later:
A. Interesting objects close to the camera and short camera refocus range: This test case was
designed to evaluate the performance of the proposed solution when all objects are very close to
the camera, thus with large disparity between the partly overlapping light field images used in the
panorama creation process. In this context, the following two panoramas were created:
Case A.1: Room with toys 1 – As the whole scene is within the camera refocus range, it will
be possible to refocus on any region of the visual scene and observe a large disparity between
different perspectives; also the background (the last scene’s depth plane containing objects) is
rather close to the camera. Taking into account the characteristics of this test scenario, an indoor
environment was selected. Figure 38 illustrates the central perspective/view panorama
extracted from the created light field 360º panorama for the Room with toys 1 case which
belongs to test scenario A.1.
Figure 38 – Central view for the Room with toys 1 light field 360º panorama corresponding to test scenario A.1.
Case A.2: Room with toys 2 – The interesting objects in the scene are within the camera
refocus range but the scene background is not. This case should evaluate the performance of
the proposed solution when the background is outside the refocus range; in this case, in theory,
the background will always be blurred despite the light field refocusing capabilities. The fact that
the background is blurred in each individually captured light field image can compromise the
capability of the proposed solution to conveniently extract and match features between the sub-
aperture images of the partly overlapping captured light field images. Taking into account the
characteristics of this test scenario, again an indoor environment was selected. Figure 39 shows
the central perspective panorama extracted from the light field 360º panorama for the Room
with toys 2 case which belongs to test scenario A.2.
Figure 39 – Central view for the Room with toys 2 light field 360º panorama corresponding to test scenario A.2.
B. Interesting objects are close and far away from the camera and large camera refocus range:
This case was designed to evaluate the proposed solution when the interesting objects are near
and far away from the camera, thus with very different disparities (from large to small). Since there
will be objects in this test that are very far away from the camera, the background is necessarily
rather far away from the camera.
Case B.3: Sea landscape and Park landscape – As the whole scene acquired is within the
camera refocus range, the entire visual scene may be refocused after the acquisition moment
and it will be possible to notice very distinct disparities (large disparities corresponding to closer
objects and small disparities corresponding to far away objects) in the created panorama.
Additionally, the scene background is within the refocus range. Figure 40 illustrates the two
central perspective panoramas extracted from two different light field 270º panoramas (Figure
40(a) and Figure 40(b) illustrate the Sea landscape and Park landscape cases, respectively)
which belong to test scenario B.3. The examples presented in Figure 40 do not encompass the
full horizontal FOV.
(a)
(b)
Figure 40 – Light field 270º panoramas corresponding to test scenario B.3: (a) Sea landscape; and (b) Park landscape.
C. Interesting objects far away from the camera and large camera refocus range: This case was
designed to evaluate the performance of the proposed solution when the whole scene to be acquired
is within the refocus range but there are relevant objects very far away from the camera. In this
case, all the objects have very small disparities.
Case C.3: Empty park: As the whole scene acquired is within the camera refocus range, the
scene objects present small disparities, which may compromise the refocus capability. Figure 41
presents the central perspective panorama extracted from the light field 300º panorama for the
Empty park case, which belongs to test scenario C.3. The example in Figure 41 does not cover
the full horizontal FOV.
Figure 41 – Central view for the Empty Park light field 300º panorama corresponding to test scenario C.3.
5.1.2. Acquisition Conditions
This section intends to describe the acquisition conditions that were used in the test scenarios described
above. In all acquisition tests, a Lytro Illum camera [46], a Nodal Ninja 4 panoramic tripod head [47] and
a Manfrotto 190CXPRO 4 tripod [48] were used. Figure 42 presents the full acquisition system used in
all test scenarios previously described.
Figure 42 – Full acquisition system used.
For each acquisition case described above, the rotation angle around the camera’s optical center
between each acquisition and the camera zoom, focus and refocus range remained constant. In the
following, the camera settings used in the acquisition of the light field images for the defined test
scenarios are described, starting by presenting the common camera settings for all test scenarios and
then presenting the specific settings for each test:
A. Common camera settings:
Zoom ring position: minimum (capturing the largest possible horizontal and vertical field of
view).
Exposure mode: Manual Mode, where the user sets manually the ISO and shutter speed.
Exposure Value (EV) compensation: this is a measured value (ideally between -1 and +1)
in each acquisition; the measured EV compensation value for each acquisition should be
very close to the measured value for the acquisition performed immediately before in order
to avoid large differences in exposure in the final light field 360º panorama.
B. Specific camera settings:
Rotation angle between acquisitions: 15º (used in test scenarios A.1, A.2 and B.3 Sea
landscape) and 30º (used in test scenario B.3 Park landscape, and C.3).
Camera refocus range:
- Test A.1: 22cm to 6m.
- Test A.2: 21cm to 1m.
- Test B.3 (Sea landscape): 22cm to ∞.
- Test B.3 (Park landscape): 30cm to ∞.
- Test C.3: 30cm to ∞.
White balance mode: Auto White Balance mode which sets the camera white balance
automatically (used in test scenarios A.1, A.2 and B.3 Park landscape example). Sunny
mode when acquiring in a sunny environment (used in test scenarios B.3 Sea landscape
and C.3).
All the remaining camera acquisition settings used can be found in the metadata associated with each
light field image, e.g. focal length, ISO, shutter speed, among others.
5.2. Example Results and Analysis
This section intends to present some light field panorama examples created using the light field 360º
panorama creation solution proposed. Some of the panoramas capture the full horizontal FOV and are
called light field 360º, while others cover only a portion of the full horizontal FOV. As previously mentioned, the
light field 360º panorama examples presented in this section were acquired in different test scenarios,
each of which attempts to reproduce relevant acquisition conditions of a common user in a given real
scenario. Each considered test scenario makes it possible to assess different characteristics (refocus
capability, perspective shift, among others) of the light field based 360º panoramas created using the
developed solution.
5.2.1. Perspective Shift Capability Assessment
This section evaluates the perspective shift capability associated with the created light field 360º
panoramas. In the following, some created light field panoramas from the cases in Section 5.1.1 will be
used to show results.
Assessment Conditions
The light field panoramas created using the light field 360º panorama creation solution proposed are
represented (as stated in Section 4.1) by a set of 225 perspective panoramic images, where:
1) 193 are slightly different 2D perspective panoramas that result from the composition process of
corresponding perspective images of different perspective images stacks;
2) 32 are black panoramas that correspond to the corners of a sub-aperture light field image (i.e.
the perspective images located at the corners of each sub-aperture light field), which are too
dark to be used in the composition process.
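One plausible layout consistent with these counts is a circular aperture over the 15×15 perspective grid; the radius used below is an assumption (it is not stated in the text), chosen only because it happens to reproduce the stated counts of 32 dark corner views and 193 usable views:

```python
import numpy as np

# A 15x15 grid of perspectives; positions too far from the centre (outside a
# circular aperture) are the dark corner views. The radius of 8 grid units is
# an illustrative assumption reproducing the counts stated in the text.
s, t = np.mgrid[0:15, 0:15]
dist = np.hypot(s - 7, t - 7)
dark = dist > 8.0
```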
Each final light field panorama is organized into a 4D light field image format (i.e. organized as sub-
aperture images or perspective images), in the same way the input light fields are expressed after
undergoing the pre-processing procedure described in Section 4.1 – Step 2. This format makes it easy
to extract a specific 2D perspective panoramic image, which should be an important capability of the
light field 360º panorama creation.
Figure 43 illustrates a final light field panorama presented as a 2D matrix of perspective panoramic
images, i.e. the so-called sub-aperture images. Red rectangles indicate the perspectives
selected for each one of the five light field panoramas to be used to enable a convenient assessment of
the perspective shift capability after applying the proposed solution. Note that for all the test scenarios
previously referred the same five perspectives will be used. The perspectives located at the border of
the sub-aperture image, i.e. at the maximum angular distance from the central perspective, have
stronger problems related to vignetting and radial distortion, among others, as these problems
propagate from the perspective images used in their composing process (the yellow rectangle depicts
one of those perspectives).
Figure 43 – Light field panorama presented as a 2D matrix of perspective panoramic images.
The first perspective of each light field panorama to be presented is the central perspective (located
in position (8,8) of the 15x15 2D matrix of perspective panoramas) as this perspective is the one that
originates the central perspective images registration parameters used to compose the remaining
perspectives. Two of the four remaining perspectives are located five perspectives apart horizontally
(one to the left, located at the (8,3) position, and the other to the right, located at the (8,13) position) from
the central perspective. The last two perspectives are located six perspectives apart vertically (one
above, located at the (2,8) position, and the other below, located at the (14,8) position) from the central
perspective. The selection of this set of perspectives should enable a good analysis of the final light
field panoramas created in terms of the desired perspective shift capability. For each perspective that
will be assessed later in this section, some image close-ups will be presented (highlighted with red
rectangles in each corresponding perspective) when evaluating the perspective shift capability in both
the horizontal and vertical parallax directions of the perspective panoramas (two different close-ups will
be shown for each direction). The close-ups will be used to help visualize both perspective shifts. In
each different perspective that will be presented, the orange circles highlight some undesired effects
(described in the following) present in the acquired light field images when capturing a bright
environment and the yellow circles highlight noticeable 2D stitching artifacts. In each different close-up
presented, the red vertical and horizontal lines were drawn to help visualize the horizontal and vertical
perspective shifts, respectively.
Figure 44(a) shows some of the undesired effects previously mentioned for the perspectives located
at the border of the 2D matrix of perspective panoramas (or sub-aperture light field image); in this case,
the presented perspective panorama corresponds to the yellow highlighted perspective illustrated in
Figure 43, located at the (8,1) position of the 2D matrix of perspective panoramas. This perspective
panorama was extracted from the light field 270º panorama created for the evaluation of the test
scenario B.3, named Sea landscape. This light field panorama was created using 18 captured light field
images, where Figure 44(b) and Figure 44(c) illustrate the corresponding first and second perspective
images (extracted from the first and second acquired light field images) used in the compositing process
of the perspective panorama presented in Figure 44(a). As it is possible to conclude by observing Figure
44(b) and Figure 44(c), the undesired effects previously mentioned are present in all the perspective
images located at the border of each sub-aperture light field image (i.e. present in each different
perspective image stack) used to compose the final perspective panoramic image. This leads to
the presence of these effects, repeatedly, across the whole perspective panorama. This problem is
present in the perspectives at the border of each sub-aperture image of all the created light field
panoramas.
(a)
(b) (c)
Figure 44 – Extreme left perspective panorama example (position (8,1)) with undesired effects (such as vignetting and blurring): (a) perspective panorama located at the border of the perspective panoramas; (b) first and (c) second perspective images (extracted from
the first and second acquired light field images belonging to the presented light field panorama) used to compose the presented perspective panorama.
In addition, all the perspective images located at the border of each sub-aperture image are not
sharply focused (i.e. they appear blurred), as it is possible to see in Figure 44(b) and Figure 44(c). This
leads to perspective panoramas that are not sharply focused (see Figure 44(a)) at the border of each
sub-aperture light field panoramic image created.
Panorama by Panorama Perspective Shift Assessment
Test scenario A.1: Room with Toys 1
In the following, the light field 360º panorama created for the evaluation of the test scenario A.1,
named Room with toys 1, is presented. Figure 45 depicts five different perspectives that were selected
for the assessment of the horizontal and vertical perspective shift capability of the light field 360º
panorama created (according to the perspectives previously selected, see Figure 43): Figure 45(a)
presents the central perspective (8,8); Figure 45(b) the left perspective (8,3); Figure 45(c) the right
perspective (8,13); Figure 45(d) the top perspective (2,8); and Figure 45(e) the bottom perspective
(14,8). Figure 46 presents the horizontal perspective shift close-ups extracted from each perspective
panorama in Figure 45: Figure 46(a) and Figure 46(d) correspond to the two close-ups from the left
perspective (8,3); Figure 46(b) and Figure 46(e) correspond to the two close-ups from the central
perspective (8,8); lastly, Figure 46(c) and Figure 46(f) correspond to the two close-ups from the right
perspective (8,13). Figure 47 depicts the vertical perspective shift close-ups: Figure 47(a) and Figure
47(d) correspond to the two close-ups from the top perspective (2,8); Figure 47(b) and Figure 47(e)
correspond to the two close-ups from the central perspective (8,8); lastly, Figure 47(c) and Figure 47(f)
correspond to the two close-ups from the bottom perspective (14,8).
(a)
(b)
(c)
(d)
(e)
Figure 45 – Five perspectives extracted from the Room with toys 1 light field 360º panorama created for the test scenario A.1: (a) central perspective (8,8); (b) left perspective (8,3); (c) right perspective (8,13); (d) top perspective (2,8); and (e) bottom perspective (14,8).
(a) (b) (c)
(d) (e) (f)
Figure 46 – Horizontal perspective shift close-ups: (a) and (d) correspond to the two close-ups from the left perspective (8,3); (b) and (e) correspond to the two close-ups from the central perspective (8,8); lastly (c) and (f) correspond to the two close-ups from the right perspective (8,13).
(a) (b) (c)
(d) (e) (f)
Figure 47 - Vertical perspective shift close-ups: (a) and (d) correspond to the two close-ups from the top perspective (2,8); (b) and (e) correspond to the two close-ups from the central perspective (8,8); lastly (c) and (f) correspond to the two close-ups from the bottom perspective (14,8).
The undesired effects highlighted with orange circles are due to the fact that the captured light field
images and, consequently, the sub-aperture images become overexposed when shooting a bright area
of the visual scene. The noticeable 2D stitching artifacts highlighted with yellow circles are probably due
to incorrect global alignment of all light field images and associated sub-aperture images. In addition,
the blending technique used could not correct them in an imperceptible way.
As expected, it is not very easy to recognize the horizontal and vertical perspective shifts just by
looking at the five different perspectives presented in Figure 45. As can be observed from the
horizontal and vertical perspective shift close-ups depicted in Figure 46 and Figure 47, respectively,
the light field 360º panorama created for the evaluation of the test scenario A.1 (named Room with
Toys 1) presents the desired perspective shift capability. From Figure 46(a), Figure 46(b) and Figure
46(c), it is possible to notice a slight horizontal perspective shift by looking at the position of the vertical
red line drawn: in Figure 46(a) the red line is located over the table post; in Figure 46(b) the red line is
drawn against the table post limit and, in Figure 46(c), the line is slightly away from the table post limit.
From Figure 46(d), Figure 46(e) and Figure 46(f), it is possible to observe the same horizontal
perspective shift by looking at the position of the vertical red line drawn relative to the clothes hanger.
Thus, by shifting from position (8,3) of the sub-aperture image, corresponding to the close-ups of Figure
46(a) and Figure 46(d), to position (8,13), corresponding to the close-ups of Figure 46(c) and Figure
46(f), it is possible to observe a slight horizontal perspective shift to the left in the location of the objects
in the acquired scene. From Figure 47(a), Figure 47(b) and Figure 47(c), it is possible to perceive a
slight vertical perspective shift through inspection of the horizontal red line location relative to the
background scene and the Eiffel Tower toy. This slight vertical perspective shift is most noticeable in
Figure 47(d), Figure 47(e) and Figure 47(f), again by looking at the position of the red line relative to
the clothes hanger. The observed vertical perspective shift is expected since the considered shift in
perspective is done from position (2,8) of the sub-aperture image, corresponding to the close-ups of
Figure 47(a) and Figure 47(d), to position (14,8), corresponding to the close-ups of Figure 47(c) and
Figure 47(f), i.e. a downward shift in perspectives of the sub-aperture image. However, the observed
perspective shifts are rather small both in the horizontal and vertical directions. This occurs because the
light field images used to create the presented light field 360º panorama were acquired using the Lytro
Illum camera [46], which has a lenslet array on the optical path, inserted between the digital sensor and
the main lens, that is rather small. Thus, the design of this light field camera does not allow capturing
large disparities between different perspectives. This was a bigger limitation when acquiring
visual scenes where the interesting objects are relatively far away from the camera. Thus, the
light field panoramas acquired for visual scenes where the interesting objects are very close to
the camera (which is the case of the test scenario A.1, named Room with Toys 1, and the test scenario
A.2, named Room with Toys 2) will present much larger disparity between different perspectives of the
sub-aperture image.
Test scenario A.2: Room with Toys 2
In the following, the light field 360º panorama produced for the assessment of the test scenario A.2,
named Room with toys 2, is presented. As previously stated in Section 5.1, the added value of this test
is to evaluate the performance of the proposed solution when the background is outside the refocus
range. Thus, in theory, each light field image used to create the presented light field 360º panorama
should present a blurry background which can interfere in the extraction and matching of features
between the sub-aperture images of the partially overlapping light field images. However, the proposed
solution could conveniently create the desired light field 360º panorama. Figure 48(a) presents the
central perspective (8,8), where the orange circles highlight some noticeable camera undesired effects
(i.e. overexposure) and the yellow circles highlight noticeable 2D stitching artifacts. Figure 48(b) and
Figure 48(c) are close-ups to allow a better inspection of these problems.
Figure 48 - Perspective extracted from the Room with toys 2 light field 360º panorama created for the test scenario A.2: (a) central perspective (8,8); (b) and (c) two close-ups presenting camera overexposure problems and 2D stitching artifacts.
As for test A.1 with Room with toys 1, the presence of undesired camera acquisition effects in the final
light field image (observe the highlighted orange circles in Figure 48) is due to the fact that the captured
light field images and, consequently, the sub-aperture images become overexposed when acquiring a
bright area of the visual scene. This was a considerable limitation of the camera when acquiring this
type of visual scene, and every light field panorama created presents this type of problem. The 2D
stitching artifacts highlighted with yellow circles (see Figure 48(b) and Figure 48(c)) are probably due to
incorrect global alignment (i.e. the global camera parameter refinement which applies the bundle
adjustment technique) of all light field images and associated perspective images. In addition, the
blending technique used (i.e. multi-band blending) could not correct them in an imperceptible way, thus
leading to noticeable artifacts in the final light field panorama. By inspection of Figure 48(c), it is possible
to see that the background of the central perspective image is not blurred as expected. Although the
camera refocus range does not cover the background, the camera assumes that all objects in the
background scene are in a depth plane coincident with the last depth plane selected a priori by the user
(i.e. in the range of depth planes selected using the camera refocus range) at the acquisition moment,
for which a posteriori refocusing should be possible. This fact enables the proposed solution to
conveniently create the desired light field panorama, even though it could otherwise have been a limitation.
Test scenario B.3: Sea landscape
In the following, the light field 270º panorama produced for the assessment of the test scenario B.3,
named Sea landscape, is presented. Figure 49 presents the corresponding five different perspectives
highlighted in red in Figure 43. Figure 50 and Figure 51 illustrate the horizontal and vertical perspective
shift close-ups, respectively, extracted from each perspective in Figure 49.
The perspectives depicted present the same undesired effects that originate when acquiring a
bright area of the visual scene using the Lytro Illum camera [46]. The 2D stitching artifacts highlighted
with orange circles (see Figure 49) come again from the incorrect global alignment of all light field
images and associated perspective images and the inability of the blending technique used to correct
them. All these effects are replicated in the created perspective panoramas, as can be observed in
Figure 49.
As can be noticed in the horizontal and vertical perspective shift close-ups presented in Figure 50
and Figure 51, respectively, the light field 270º panorama produced for the assessment of the test
scenario B.3, named Sea landscape, presents the desired perspective shift capability. Looking at
Figure 50(a), Figure 50(b) and Figure 50(c), it is possible to see a small horizontal perspective shift by
observing, again, the red line in each close-up: in Figure 50(a), the red vertical line is over the arm of
the person at the center of the image; in Figure 50(b), the red line is against the limit of the person's
arm; and in Figure 50(c), the red line is slightly away from the limit of the person's arm. Again, looking
at Figure 50(d), Figure 50(e) and Figure 50(f), it is possible to observe the same horizontal perspective
shift. From Figure 51(a), Figure 51(b) and Figure 51(c), it is possible to see a small vertical perspective
shift by observing the location of the horizontal red line relative to the metal grid post in the background
scene. The same small vertical perspective shift can be seen in Figure 51(d), Figure 51(e) and Figure
51(f), where the change in the position of the peninsula in the scene background relative to the
horizontal red line is noticeable. The horizontal and vertical perspective shifts present in the close-ups
in Figure 50 and Figure 51 are consistent with each other and small. They also give the impression
that the light field 270º panorama created for the assessment of the test scenario B.3, named Sea
landscape, has a smaller perspective shift capability compared to the previously presented test
scenarios (Room with toys 1 and Room with toys 2). This was expected since the distance from the
camera to the majority of the interesting objects in the acquired scene (the persons in the scene) is
much larger than in the previously presented tests. Since the light field images were captured using the
Lytro Illum camera [46], whose micro-lens array is very limited due to its very small size, the amount of
disparity captured is smaller because the distance between the camera and the interesting objects is
much larger than the distance of the objects in the test scenarios previously presented.
Test scenario B.3: Park landscape
The light field 300º panorama created for the evaluation of the test scenario B.3, named Park
landscape, is not presented here since this case leads to the same conclusions in terms of perspective
shift capability as for the previous test case B.3, named Sea landscape.
Figure 49 – Five perspectives extracted from the Sea landscape light field 270º panorama created for the test scenario B.3: (a) central perspective (8,8); (b) left perspective (8,3); (c) right
perspective (8,13); (d) top perspective (2,8); and (e) bottom perspective (14,8).
Figure 50 – Horizontal perspective shift close-ups: (a) and (d) correspond to the two close-ups from the left perspective (8,3); (b) and (e) correspond to the two close-ups from the central perspective (8,8); lastly (c) and (f) correspond to the two close-ups from the right perspective (8,13).
Figure 51 - Vertical perspective shift close-ups: (a) and (d) correspond to the two close-ups from the top perspective (2,8); (b) and (e) correspond to the two close-ups from the central perspective (8,8); lastly (c) and (f) correspond to the two close-ups from the bottom perspective (14,8).
Test scenario C.3: Empty park
The perspective shift capability assessment for the light field 360º panorama created for the
evaluation of the test scenario C.3, named Empty park, is reported in Appendix A.
5.2.2. Refocus Capability Assessment
This section assesses the performance of the proposed light field 360º panorama creation solution in
terms of refocusing capability. Similarly to what was done for the evaluation of the perspective shift
capability, some of the light field panoramas created for the cases described in Section 5.1.1 will be
analyzed.
Assessment Conditions
The refocus capability is obtained using the Light Field Toolbox software [51], developed by D.
Dansereau, notably using the function LFFiltShiftSum. This function works by shifting all the available
sub-aperture images of each light field image to the same depth and then adding all the sub-aperture
images together to produce a 2D depth plane extracted from the original light field. The function takes
an input value called slope, which controls the optical focal plane and, thus, which objects appear in
focus. For each created light field panorama presented, several different focal planes are extracted and
presented, as well as some close-ups corresponding to the presented depth planes, to help visualize
the objects in focus in each considered example. In each focal plane presented, the red rectangles
highlight the close-ups that will be used to help visualize the focus in specific parts of the created light
field panorama. In each close-up presented, the red circles highlight the interesting objects in focus.
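The shift-and-sum principle behind LFFiltShiftSum can be sketched as follows. This is a minimal Python illustration under assumed conventions (a light field array of shape (U, V, H, W) and whole-pixel shifts), not the actual Toolbox implementation, which works with sub-pixel interpolation:

```python
import numpy as np

def shift_and_sum_refocus(light_field, slope):
    """Refocus a light field by shifting every sub-aperture image in
    proportion to its (u, v) offset from the central view and then
    averaging all views. The slope plays the same role as the
    LFFiltShiftSum slope parameter: it selects which depth plane ends
    up in focus. For simplicity this sketch rounds the shifts to whole
    pixels (np.roll); the Toolbox performs sub-pixel interpolation."""
    U, V, H, W = light_field.shape
    uc, vc = (U - 1) / 2.0, (V - 1) / 2.0        # central sub-aperture index
    out = np.zeros((H, W))
    for u in range(U):
        for v in range(V):
            dy = int(round(slope * (u - uc)))    # vertical shift in pixels
            dx = int(round(slope * (v - vc)))    # horizontal shift in pixels
            out += np.roll(light_field[u, v], (dy, dx), axis=(0, 1))
    return out / (U * V)                         # average -> one depth plane
```

A point whose disparity between adjacent views is d pixels is brought into focus by choosing a slope that cancels that per-view shift, while points at other depths remain spread over several pixels, i.e. blurred.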
Panorama by Panorama Refocusing Assessment
Test scenario A.1: Room with Toys 1
In the following, the light field 360º panorama created for the evaluation of the test scenario A.1,
named Room with toys 1, is presented. Figure 52 presents three different depth planes extracted from
the created light field 360º panorama and two corresponding close-ups for each selected depth plane.
Figure 52(a) was extracted with slope = -0.05, and Figure 52(d) and Figure 52(e) are the two
corresponding close-ups; Figure 52(b) was extracted with slope = 0.25, and Figure 52(f) and Figure
52(g) are the two close-ups; lastly, Figure 52(c) was extracted with slope = 0.6, and Figure 52(h) and
Figure 52(i) are the two close-ups.
Figure 52 – Three depth planes extracted from the Room with toys 1 light field 360º panorama and two corresponding close-ups for each depth plane extracted: (a) depth plane extracted with slope = -0.05, where (d) and (e) are the corresponding close-ups; (b) depth plane extracted with slope = 0.25, where (f) and (g) are the corresponding
close-ups; (c) depth plane extracted with slope = 0.6, where (h) and (i) are the corresponding close-ups.
As it can be observed from Figure 52, the light field 360º panorama created for the assessment of
the test scenario A.1, named Room with toys 1, presents the desired refocus capability. The images
presented in Figure 52 appear a little bit dark after applying the refocus processing over the sub-aperture
images of the light field 360º panorama created. Observing Figure 52(a) and the two corresponding
close-ups highlighting the objects in focus in the considered depth plane, which are in Figure 52(d) and
Figure 52(e), it is possible to see that the two toy cars (the blue toy car in Figure 52(d) and the grey one
in Figure 52(e)) are in focus and the remaining acquired scene is not (i.e. blurred). Moreover, by
observing Figure 52(b) and the two associated close-ups, which are in Figure 52(f) and Figure 52(g), it
is possible to recognize that the Eiffel Tower and the white toy cars (see Figure 52(f)) and the red car
and the lighthouse toys (see Figure 52(g)) are in focus. Lastly, Figure 52(c) and the two associated
close-ups, which are in Figure 52(h) and Figure 52(i), show the background scene in focus and the rest
of the acquired visual scene blurred. In summary, the light field 360º panorama created presents the
desired a posteriori refocusing capability, which is one of the most important user functionalities involved
in light field 360º panorama creation.
Test scenario A.2: Room with Toys 2
In the following, the light field 360º panorama created for the evaluation of the test scenario A.2,
named Room with toys 2, is presented. Figure 53 presents the last depth plane extracted from
the created light field 360º panorama and two corresponding close-ups. Figure 53(a) was extracted with
slope = 0.6, and Figure 53(b) and Figure 53(c) are the two corresponding close-ups.
Figure 53 - Last depth plane extracted from the Room with toys 2 light field 360º panorama and two corresponding close-ups: (a) depth plane extracted with slope = 0.6, where (b) and (c) are the corresponding close-ups. Red rectangles highlight the close-ups used to help visualize the focus in specific parts of the light field image.
From visual inspection of Figure 53(b) and Figure 53(c), it is possible to conclude that the background
is focused and the remaining objects present in the acquired visual scene are not focused (they are
blurred). This reinforces the fact that, despite the background not being included in the used camera
refocus range (see Section 5.1), the background can be refocused a posteriori and the proposed
solution could still conveniently create the desired light field 360º panorama. This might happen because
the refocus technique considers that all objects at a larger distance than the last depth plane considered
in the used camera refocus range are located at that last depth plane. In addition, this light field 360º
panorama can refocus the objects in the scene in a similar way to the previously presented test case,
i.e. test A.1 with Room with toys 1. Thus, it was decided not to include here again the same three depth
plane examples for this light field 360º panorama as was done for test A.1.
Test scenario B.3: Sea landscape
In the following, the light field 270º panorama created for the evaluation of the test scenario B.3,
named Sea landscape, is presented. Figure 54 presents three different depth planes extracted from the
created light field 270º panorama and one associated close-up for each depth plane. Figure 54(a) was
extracted with a slope = 0.15 and Figure 54(d) is the associated close-up; Figure 54(b) was extracted
with a slope = 0.45 where Figure 54(e) is the associated close-up; lastly, Figure 54(c) was extracted
with a slope = 0.55 and Figure 54(f) is the associated close-up.
As can be observed from Figure 54, the light field 270º panorama created for the assessment of
the test scenario B.3, named Sea landscape, presents the desired refocus capability. By observing
Figure 54(a) and the corresponding close-up where the objects in focus are highlighted (see Figure
54(d)), it is possible to conclude that the person wearing a green t-shirt is focused and the background
scene is blurred. Moreover, by observing Figure 54(b) and the associated close-up (see Figure 54(e)),
it is possible to recognize that the person at the center of the close-up is the only interesting object in
focus. Finally, Figure 54(c) and the associated close-up (see Figure 54(f)) present the background scene
in focus while the rest of the acquired visual scene is not. Thus, the light field 270º panorama created
presents the much desired a posteriori refocusing capability. However, the resolution of the sub-aperture
images created using the Light Field Toolbox [45] (as described in Section 4.2), and thus the resolution
of the final light field panorama, is a limitation when finding very different depth planes to refocus the
interesting objects, since it is not easy to accurately distinguish focus in different interesting objects
when these objects are beyond a certain distance from the camera. This was a bigger limitation for the
test cases B.3 (both Sea landscape and Park landscape) and C.3, since the majority of the objects are
much farther away from the camera than for the test cases A.1 and A.2 (named Room with toys 1 and
Room with toys 2). Furthermore, as previously stated in the assessment of the perspective shift
capability, the camera used (i.e. the Lytro Illum camera [46]) has a rather small and limited lenslet array
that cannot distinguish objects at different depth planes if these objects are at a considerable distance
from the camera.
Figure 54 - Three different depth planes extracted from the Sea landscape light field 270º panorama and one corresponding close-up for each depth plane extracted: (a) depth plane extracted with slope = 0.15, where (d) is the corresponding close-up; (b) depth plane extracted with slope = 0.45, where (e) is the corresponding close-up; (c) depth plane extracted with slope = 0.55, where (f) is the corresponding close-up.
Test scenario B.3: Park landscape
In the following, the light field 270º panorama created for the evaluation of the test scenario B.3,
named Park landscape, is presented. Figure 55 presents three different depth planes extracted from the
created light field 270º panorama and the corresponding close-ups. Figure 55(a) was extracted with
slope = 0, and Figure 55(d) and Figure 55(e) are the two corresponding close-ups; Figure 55(b) was
extracted with slope = 0.15, and Figure 55(f) and Figure 55(g) are the two close-ups; lastly, Figure
55(c) was extracted with slope = 0.25, and Figure 55(h) is the corresponding close-up.
Observing Figure 55, it may be concluded that the light field 270º panorama created for the
assessment of the test scenario B.3, named Park landscape, presents the desired refocusing capability.
Looking at Figure 55(d) and Figure 55(e), it is possible to see that the girl wearing a red top, together
with the metal hand support, is focused and the background scene is blurred. By visual inspection of
Figure 55(f) and Figure 55(g), it is possible to notice that the person at the center of the close-up is the
only interesting object in focus. Finally, Figure 55(h) presents the girl wearing a grey top and black jeans
in focus while the rest of the scene is not. Thus, the light field 270º panorama created presents the
desired a posteriori refocusing capability.
Figure 55 - Three different depth planes extracted from the Park landscape light field 270º panorama and two corresponding close-ups for each depth plane extracted: (a) depth plane extracted with slope = 0 where (d) and (e) are the corresponding close-ups; (b) depth plane extracted with slope = 0.15 where (f) and (g) are the corresponding
close-ups; (c) depth plane extracted with slope = 0.25 where (h) is the corresponding close-up.
Test scenario C.3: Empty park
The refocus capability assessment for the light field 360º panorama created for the assessment of
the test scenario C.3, named Empty park, is reported in Appendix A.
Chapter 6
6. Summary and Future Work
In this chapter, a summary of the work performed in the context of this Thesis is presented, followed by
a highlight of its main conclusions. Then, some suggestions for the future work are presented.
6.1. Summary and Conclusions
The motivation of this work is the enhancement of 360º panoramic photography with additional features
such as refocusing. The major limitations of conventional cameras are: 1) the poor imaging
representation model used, which is the conventional 2D trichromatic image; and 2) the fact that
conventional cameras only capture the total sum of the light rays that reach the same point in the lens,
using only the two dimensions available at the camera sensor, instead of capturing the amount of light
carried by each single light ray. Thus, important visual information of the acquired scene is irreversibly
lost. These limitations led to the emergence of new sensors and cameras (i.e. light field cameras)
adopting higher-dimensional representations of the visual information (more faithful and complete
imaging representation models). These new representations are reinventing the concepts and
functionalities associated with panoramic image creation solutions. Thus, the major objective of this
Thesis is the development of a light field based 360º panorama image creation solution.
This Thesis first introduces the main concepts, approaches and tools related to the development
of conventional 360º panoramic images, namely a global 360º panorama creation architecture, followed
by a description of several different types of 360º panoramas and the key conventional panorama
creation solutions available in the literature. Although there are not many light field panoramic solutions,
the first methodologies and tools associated with the creation of 360º panoramas exploiting the light
field imaging representation are described next.
The light field 360º panorama creation solution proposed in this Thesis is named multi-perspective
image stitching and is inspired by the work developed by Brown and Lowe [24]. The main concept behind
this solution is to create light field 360º panoramas from a collection of 2D perspective panoramas. Each
2D perspective panorama is created by stitching all corresponding perspective images from the different
perspective image stacks (i.e. light field sub-aperture images). The conventional 360º panorama creation
architecture was adapted to deal with light field input; thus, for the stitching to be coherent among
sub-aperture images, it was necessary to calculate the key registration and composition parameters only
for the central view, which are then applied to the other views of the collection of light field images
that compose the final light field panorama.
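The core of this adaptation, estimating the registration once on the central sub-aperture images and reusing it for every other perspective, can be sketched as below. This is a schematic under stated assumptions: `estimate_homography` and `warp` are placeholders standing in for the actual feature-based registration (features, RANSAC, bundle adjustment) and compositing stages of the pipeline, not the Thesis implementation itself.

```python
import numpy as np

def stitch_light_field(light_fields, estimate_homography, warp):
    """Stitch a list of light fields (each of shape (U, V, H, W)) into a
    multi-perspective panorama. Registration parameters are estimated
    ONCE, between the central sub-aperture images, and then reused to
    warp every other perspective, which keeps the stitching coherent
    across all views of the final light field panorama."""
    U, V = light_fields[0].shape[:2]
    uc, vc = U // 2, V // 2
    # 1) Register only the central views (abstracted behind the
    #    estimate_homography placeholder).
    homographies = [estimate_homography(light_fields[0][uc, vc], lf[uc, vc])
                    for lf in light_fields]
    # 2) Reuse the same homographies for every sub-aperture position.
    panorama = {}
    for u in range(U):
        for v in range(V):
            views = [warp(lf[u, v], H) for lf, H in zip(light_fields, homographies)]
            panorama[(u, v)] = np.mean(views, axis=0)  # stand-in for blending
    return panorama
```

Reusing one set of homographies is what guarantees that corresponding pixels in different sub-aperture panoramas stay aligned, preserving the disparity structure needed for refocusing.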
The performance assessment of the proposed multi-perspective image stitching solution is made
with relevant test scenarios proposed for the first time in this Thesis, which are critical for the adequate
assessment of the proposed solution. These test scenarios attempt to reproduce relevant acquisition
conditions of a common user in a given real scenario. Each created light field panorama presents a few
stitching artifacts. The experimental results obtained show both light field refocus and multi-perspective
capabilities in the panoramic images created with the multi-perspective solution. In this context, it is
possible to conclude that the proposed multi-perspective image stitching solution allows the creation of
light field 360º panoramas under different types of realistic scenarios. Also, both light field refocus and
multi-perspective capabilities are available in all light field panoramas created. From the perspective
shift capability assessment, it is possible to conclude that: 1) the light field panoramas acquired in visual
scenes where the objects are close to the camera present larger perspective shifts in both the horizontal
and vertical directions, which is justified by the fact that objects close to the camera present higher
disparity than objects far away from the camera; 2) the design of the light field camera used (i.e. the
Lytro Illum camera) does not allow capturing large amounts of disparity between different perspectives.
From the refocus capability assessment, it can be deduced that: 1) the light field panoramas created
can be refocused on different objects present in the acquired visual scene at the user's choice; 2) if the
objects in the acquired visual scene are very distant from the camera, they will present very small
disparities, which can compromise the light field refocus capability because, in this case, the refocus
technique considers that the depth of all scene objects is the same; 3) the resolution of the sub-aperture
images created using the Light Field Toolbox [45] (and thus the resolution of the final light field 360º
panorama) is a considerable limitation when finding very different depth planes to refocus the scene
objects, since it is not easy to accurately distinguish focus in different objects if these objects are beyond
a certain distance from the camera. Also, the captured light field images and, consequently, the
sub-aperture images become overexposed when acquiring a bright area of the visual scene. This was a
considerable limitation of the light field camera used when acquiring this type of visual scene, and every
light field panorama created presents this type of problem. Considering all the results obtained, one of
the major conclusions of this Thesis is that the creation of light field panoramas excels for visual scenes
containing objects close to the camera.
The light field 360º panorama creation solution developed is able to maintain the desired refocus and
perspective shift capabilities in the light field panoramas created. However, there are important
limitations that may be addressed to improve the proposed multi-perspective solution, as explained
next, in the future work Section.
6.2. Future Work
Since the light field imaging representation is a relatively new topic there are not many panorama
creation solutions based on light field images and thus, it is expected that new and innovative light field
360º panorama creation techniques will be proposed in the future. Regarding the proposed solution,
some improvements are possible to increase the quality of the light field 360º panoramas created. Some
suggestions aiming to improve the developed solution are listed:
Depth-based Light Field Panorama Creation: To minimize the stitching errors and properly
capture the disparity of both objects close to the camera and objects far away, it is possible to
improve the stitching process by: 1) estimating the depth of the acquired visual scene in each
light field image used; and 2) using this information in the registration process by estimating
multiple homographies for regions of the image which are in different depth planes, thus enabling
a more accurate multi-perspective stitching process [40].
Light Field Panorama Rendering: Another topic is the development of a rendering tool
appropriate for light field panoramas, giving the user the possibility to interact with the light field
360º panorama content, e.g. using the mouse to rotate the view in all directions or to navigate
through the whole acquired visual scene, making zoom-ins and zoom-outs, etc., to enjoy a more
immersive user experience. In addition to the usual interactions with conventional panoramas, the
visual scene could be rendered with a certain depth of field, and minor perspective adjustments
could be allowed. This type of rendering tool could also be relevant for the visualization of light
field panoramas while giving the user a depth impression, e.g. rendering the content in
stereoscopic or virtual reality head mounted displays.
Unrestricted Light Field Panorama Creation: Another topic that could be interesting to improve
the proposed solution is the creation of light field 360º panoramas in an unrestricted way, i.e.
moving the camera handheld and thus with some unrestricted camera rotation and translation
motion. The tripod-based scenario used here assumes that the camera undergoes a pure rotation
around its no-parallax point and is very common among professional photographers. However,
there are many solutions which do not have this constraint (e.g. using smartphone cameras) and
thus it is important to also target these cases.
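The depth-based stitching suggestion above can be sketched as follows; this is an illustrative layered-warping scheme under assumptions (the `warp` callable, the layer boundaries and the per-layer homographies are hypothetical placeholders, not a method described in the Thesis):

```python
import numpy as np

def depth_layered_warp(image, depth_map, layer_homographies, depth_edges, warp):
    """Sketch of depth-based registration: the image is split into depth
    layers and each layer is warped with its own homography, instead of
    applying a single homography to the whole image. `warp` applies a
    3x3 homography to an image and stands in for a real warper."""
    out = np.zeros_like(image, dtype=float)
    for k, H in enumerate(layer_homographies):
        # Mask of pixels whose depth falls into the k-th depth interval.
        mask = (depth_map >= depth_edges[k]) & (depth_map < depth_edges[k + 1])
        out[mask] = warp(image, H)[mask]
    return out
```

Warping each depth layer with its own homography would let close objects and far objects be registered with their different apparent motions, reducing parallax-induced stitching errors.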
Appendix A
A. Test Scenario C.3 Named Empty Park:
Perspective Shift Capability Assessment
In this chapter, the light field 300º panorama created for the evaluation of test scenario C.3, named
Empty park, is presented.
Perspective Shift Capability Assessment
For the evaluation of this test, it is not necessary to present the five different perspectives highlighted
in red in Figure 43, since it is almost impossible to recognize the difference in perspective between them.
Instead, only some close-ups for the five perspectives highlighted in red in Figure 43 are depicted. In
Section 5.1, the central perspective (8,8) of the light field panorama created for this test has been
presented. Figure 56 presents the horizontal perspective shift close-ups extracted from each
perspective panorama. Figure 57 presents the vertical perspective shift close-ups.
By observing Figure 56 and Figure 57, with the horizontal and vertical perspective shift close-ups,
respectively, it is possible to conclude that the light field 300º panorama created for the evaluation of
test scenario C.3, named Empty park, does not present noticeable horizontal and vertical perspective
shifts. From Figure 56(a), Figure 56(b) and Figure 56(c), it is practically impossible to notice a horizontal
perspective shift by looking at the red line in each close-up. The same happens when observing Figure
56(d), Figure 56(e) and Figure 56(f). For the vertical direction, the same happens, as the perspective
shifts are very small and almost imperceptible (compare the vertical perspective shift in Figure 57(a),
Figure 57(b) and Figure 57(c) for the first close-up, or Figure 57(d), Figure 57(e) and Figure 57(f) for
the second close-up). As previously stated, this occurs because the lenslet array inserted in the Lytro
Illum camera [46] is rather small and limited. Thus, the light field panoramas acquired in visual scenes
where the interesting objects are very far away from the camera (which is the case for this test scenario)
will present disparities that are almost unnoticeable, thus leading to very small perspective differences.
This fact will compromise the refocus capability, as will be seen in the next section. The farther the
objects are from the camera, the less disparity it is possible to observe and the smaller the perspective
shifts (in both the horizontal and vertical directions of the sub-aperture image) it is possible to visualize.
Figure 56 – Horizontal perspective shift close-ups: (a) and (d) correspond to the two close-ups from the left perspective (8,3); (b) and (e) correspond to the two close-ups from the central perspective (8,8); lastly (c) and (f) correspond to the two close-ups from the right perspective (8,13).
Figure 57 - Vertical perspective shift close-ups: (a) and (d) correspond to the two close-ups from the top perspective (2,8); (b) and (e) correspond to the two close-ups from the central perspective (8,8); lastly (c) and (f) correspond to the two close-ups from the bottom perspective (14,8).
Refocus Capability Assessment
Figure 58 presents two different depth planes extracted from the created light field 300º panorama and
the corresponding close-ups. Figure 58(a) was extracted with slope = 0.25, and Figure 58(c) and Figure
58(d) are the two corresponding close-ups; Figure 58(b) was extracted with slope = 0.5, and Figure
58(e) and Figure 58(f) are the two close-ups.
Figure 58 - Two different depth planes extracted from the Empty park light field 300º panorama and two corresponding close-ups for each depth plane extracted: (a) depth plane extracted with slope = 0.25, where (c) and (d) are the corresponding close-ups; (b) depth
plane extracted with slope = 0.5, where (e) and (f) are the corresponding close-ups.
As expected, by observing Figure 58, it may be concluded that the light field 300º panorama created
for the evaluation of the test scenario C.3, named Empty park, does not present the desired refocus
capability. Looking at Figure 58(c) and Figure 58(d), it is possible to see two close-ups of a depth plane
extracted from the created light field image that look sharply focused, while Figure 58(e) and Figure
58(f) present two close-ups from an example depth plane that is blurred. This occurs because all the
objects in the acquired visual scene are very distant from the camera; thus, the refocus technique
assumes that the depth of these objects is the same, so it is not possible to extract various different
depth planes from the created light field image. As previously stated in the perspective shift capability
assessment, each acquired light field captures very small disparities, thus compromising the refocus
capability of the created light field panorama. This is related to the design of the Lytro Illum camera that
was used, as previously explained for the other test cases presented.
Bibliography
[1] "Lytro web page," [Online]. Available: https://www.lytro.com/. [Accessed 28 12 2015].
[2] "Raytrix web page," [Online]. Available: http://www.raytrix.de/. [Accessed 28 12 2015].
[3] E. Adel, M. Elmogy and H. Elbakry, "Image Stitching Based on Feature Extraction Techniques: A
Survey," International Journal of Computer Applications, vol. 99(6), pp. 1-8, August 2014.
[4] K. Shashank, N. Siva Chaitanya, G. Manikanta, Ch. N. V. Balaji and V. V. S. Murthy, "A Survey and
Review Over Image Alignment and Stitching Methods," International Journal of Electronics &
Communication Technology, vol. 5, pp. 50-52, March 2014.
[5] Z. Zhang, "A Flexible New Technique for Camera Calibration," IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 22, pp. 1330-1334, November 2000.
[6] R. Szeliski, "Image Alignment and Stitching: A Tutorial," Foundations and Trends in Computer
Vision , 2006.
[7] J. R. Bergen, P. Anandan and K. J. &. H. R. Hanna, "Hierarchical Model-Based Motion Estimation,"
in Proceedings of the Second European Conference on Computer Vision, Santa Margherita
Liguere, Italy, 1992.
[8] R. Szeliski, Computer Vision: Algorithms and Applications, Springer, 2010.
[9] J. Davis , "Mosaics of Scenes with Moving Objects," in IEEE Computer Society Conference on
Computer Vision and Pattern Recognition (CVPR’1998), Santa Barbara, June 1998.
[10] D. G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of
Computer Vision, vol. 60, pp. 91-110, November 2004.
[11] R. Karthik, A. AnnisFathima and V. Vaidehi, "Panoramic View Creation using Invariant
Momentsand SURF Features," in IEEE International Conference on Recent Trends in Information
Technology (ICRTIT'2013), Chennai, India, July, 2013.
[12] M. A. Fischler and R. C. Bolles, "Random Sample Consensus: A paradigm for Model Fitting with
Applications to Image Analysis and Automated Cartography," Communications of the ACM, vol.
24(6), pp. 381-395, June 1981.
[13] P. J. Rousseeuw, "Least Median of Squares Regresssion," Journal of the American Statistical
Association, vol. 79, pp. 871-880, 1984.
[14] "Understanding Projecting Modes," Kolor, [Online]. Available: http://www.kolor.com/wiki-
en/action/view/Understanding_Projecting_Modes. [Accessed 15 11 2015].
[15] P. F. McLauchlan and A. Jaenicke, "Image Mosaicing using Sequential Bundle Adjustment," Image
and Vision Computing, vol. 20, pp. 751-759, August 2002.
[16] M. Brown and D. Lowe, "Recognizing Panoramas," in Ninth International Conference on Computer
Vision (ICCV’2003), Nince, France, October 2003.
88
[17] H.-Y. Shum and R. Szeliski, "Panoramic Image Mosaics," Microsoft Research , Redmond, WA,
USA, 1997.
[18] J. Brosz and F. Samavati, "Shape Defined Panoramas," in Shape Modeling International
Conference (SMI), Aix-en-Provence, France, 2010.
[19] "Panoramic Image Projections," [Online]. Available:
http://www.cambridgeincolour.com/tutorials/image-projections.htm. [Accessed 24 09 2015].
[20] "Panorama Projections," [Online]. Available: http://wiki.panotools.org/Projections. [Accessed 24 09
2015].
[21] "PTAssembler Projections," PTAssembler, February 2009. [Online]. Available:
http://www.tawbaware.com/projections.htm. [Accessed 15 11 2015].
[22] "Some Projections Created with Higin Software," [Online]. Available:
http://www.360facil.com/eng/360-degree-photo-other-projection-panorama-edition.php.
[Accessed 22 09 2015].
[23] H.-Y. Shum and R. Szeliski, "System and Experiment Paper: Construction of Panoramic Image
Mosaics with Global and Local Alignment," International Jornal of Computer Vision, vol. 36(2), pp.
101-130, February 2000.
[24] M. Brown and D. G. Lowe, "Automatic Panoramic Image Stitching using Invariant Features,"
International Journal of Computer Vision, vol. 74(1), pp. 59-73, 2007.
[25] J. S. Beis and D. G. Lowe, "Shape Indexing using Approximate Nearest-Neighbour Search in High-
Dimensional Spaces," in Proceedings of the Interational Conference on Computer Vision and
Pattern Recognition (CVPR'1997), San Juan, Puerto Rico, 1997.
[26] R. Harley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd edn, New York:
Cambridge University Press, 2004.
[27] W. M. P. H. R. a. F. A. Triggs, "Bundle Adjustment: A Modern Synthesis," in Vision Algorithms:
Theory and Practice, number 1883 in LNCS, Corfu, Greece, Springer-Verlag., 1999, pp. 298-373.
[28] R. K. S. B. Szeliski, "Recovering 3D Shape and Motion from Image Streams using Nonlinear Least
Squares.," Journal of Visual Communication and Image Representation 5, vol. 1, pp. 10-28, March,
1994.
[29] P. Burt and E. Adelson, "A Multiresolution Spline with Application to Image Mosaics," ACM
Transactions on Graphics, vol. 2(4), pp. 217-236, 1983.
[30] A. Eden, M. Uyttendaele and R. Szeliski, "Seamless Image Stitchin of Scenes with Large Motions
and Exposure Differences," in IEEE Computer Society Conference on Computer Vision and
Pattern Recognition (CVPR'2006), New York, NY, USA, June 2006.
[31] A. Agarwala, M. Dontcheva, M. Agrawala, S. Drucker, A. Colburn, B. Curless, D. Salesin and M.
Cohen, "Interactive Digital Photomontage," in ACM SIGGRAPH, Los Angeles, CA, USA, 2004.
[32] T. Mitsunaga and S. Nayar, "Radiometric Self Calibration," in IEEE Conference on Computer
Vision and Pattern Recognition (CVPR'1999), Fort Collins, CO, June, 1999.
89
[33] "Wikipedia on Exchangeable image file format," [Online]. Available:
https://en.wikipedia.org/wiki/Exchangeable_image_file_format. [Accessed 28 12 2015].
[34] Y. Boykov, O. Veksler and R. Zabih, "Fast Approximate Energy Minimization via Graph Cuts," in
IEEE Transactions on Pattern Analysis and Machine Intelligence, Kerkyra, Greek, 2001.
[35] V. Kolmogorov and R. Zahib, "What Energy Functions Can Be Minimized via Graph Cuts?,"
Transactions on Pattern Analysis and Machine Intelligence, vol. 26(2), pp. 147-159, 2004.
[36] J. Zaragoza, T.-J. Chin, Q.-H. Tran, M. S. Brown and D. Suter, "As-Projective-As-Possible Image
Stitching with Moving DLT," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.
36(7), pp. 1285-1298, July 2014.
[37] F. Pereira, "Efficient Plenoptic Imaging: Why Do We Need It?," in Submitted to IEEE International
Conference on Multimedia and Expo (ICME'2016), Seattle, USA, 2016.
[38] E. H. Adelson and J. R. Bergen, The Plenoptic Function and the Elements of Early Vision, M. L. a.
J. A. Movshon, Ed., Massachusetts: The MIT Press, Cambridge, Mass., 1991, pp. 3-20.
[39] M. Levoy, "Light Fields and Computational Imaging," IEEE Computer, Vols. 38, no. 8, pp. 46-55,
2006.
[40] W. Lu, W. K. Mok and J. Neiman, "3D and Image Stitching with the Lytro Light-Field Camera,"
New York, NY, 2013.
[41] B. Esfahbod, "Behnam Esfahbod's open source projects," [Online]. Available:
http://code.behnam.es/. [Accessed 30 12 2015].
[42] C. Birklbauer and O. Bimber, "Panorama Ligh-Field Imaging," Computer Graphics Forum, vol.
33(2), p. 43–52, 2014.
[43] D. G. Dansereau, O. Pizarro and S. B. Williams, "Decoding, Calibration and Rectification for
Lenselet-Based Plenoptic Cameras," in IEEE Computer Vision and Pattern Recognition
(CVPR'2013), Portland, OR, June 2013.
[44] C. Birklbauer, S. Opelt and O. Bimber, "Rendering Gigaray Light Fields," Computer Graphics
Forum, vol. 32, pp. 469-478, 2013.
[45] D. G. Dansereau, "Light Field Toolbox v0.4 for MATLAB," [Online]. Available:
http://www.mathworks.com/matlabcentral/fileexchange/49683-light-field-toolbox-v0-4. [Accessed
March 2016].
[46] "Lytro Web Page," [Online]. Available: https://www.lytro.com/. [Accessed 28 12 2015].
[47] "Nodal Ninja Web Page," [Online]. Available: http://shop.nodalninja.com/. [Accessed 01 08 2016].
[48] "Manfrotto Web Site," [Online]. Available: https://www.manfrotto.com/. [Accessed 01 08 2016].
[49] "Bayer-Pattern Filter," [Online]. Available: https://keyassets.timeincuk.net/inspirewp/live/wp-
content/uploads/sites/13/2014/12/Bayer-filter.jpg. [Accessed 17 March 2016].
[50] A. Kondoz and T. Dagiuklas, Eds., Novel 3D Media Technologies, Springer-Verlag New York,
2015.
[51] D. G. Dansereau, "Light Field Toolbox for MATLAB," February, 2015, Thecnical Report.
90
[52] "Light Field Photography," [Online]. Available: http://tdistler.com/2010/09. [Accessed 17 March
2016 ].
[53] "Bayer Demosaicing," [Online]. Available: http://www.cambridgeincolour.com/tutorials/camera-
sensors.htm. [Accessed 01 08 2016].
[54] "OpenCV 2.4.12 Stitching API," [Online]. Available:
http://docs.opencv.org/2.4.12/modules/stitching/doc/stitching.html. [Accessed 01 03 2016].
[55] M. Uyttendaele, A. Eden and R. Szeliski, "Eliminating Ghosting and Exposure Artifacts in Image
Mosaics," in Computer Vision and Pattern Recognition, 2001. CVPR'01, Kauai, HI, USA, 2001.
[56] V. Kwatra, A. Schõdl, I. Essa, G. Turk and A. Bobick, "Graphcut Textures: Image and Video
Synthesis Using Graph Cuts," in SIGGRAPH'03, San Diego, California, USA, July 2003.
[57] "Light Field Camera System," [Online]. Available:
http://photo.stackexchange.com/questions/13378/what-are-the-basic-workings-of-the-lytro-light-
field-camera. [Accessed 01 08 2016].