3D Graphical User Interfaces


    Farid BenHajji

    Erik Dybner

    Department of Computer and Systems Sciences

    Stockholm University and

    The Royal Institute of Technology

    July 1999

    ABSTRACT

Human-computer interaction currently faces the challenge of taking advantage of new technologies, which efficiently use the capabilities of the computing systems of today and more effectively match human capabilities and perception.

We investigate how 3D Graphical User Interfaces (GUI) can be useful in software applications of today and identify general areas where 3D technologies can be more useful than other technologies. This paper consists of a survey of 3D standards, GUIs, and in particular 3D GUIs. A 3D GUI prototype based on applications of ABB Network Partner AB is described.

There are several possible advantages of using 3D graphics. Interesting aspects are the possibilities of natural, intuitive navigation and interaction with computer systems. However, there are a couple of problems. The lack of commonly accepted standards for 3D interfaces, e.g. widgets and graphic formats, is one of them.¹

    Keywords

Human-Computer Interaction (HCI), Three-Dimensional (3D), Graphical User Interface (GUI)

¹ This thesis corresponds to 20 full-time working weeks per author.


Table of Contents

1 Introduction
  1.1 Background
  1.2 Employer
  1.3 Purpose
  1.4 Methodology
  1.5 Responsibilities
  1.6 Structure of the Thesis
  1.7 Acknowledgements

2 Cognitive Aspects on User Interface Design
  2.1 Navigation
    2.1.1 Cognitive Maps
    2.1.2 Spatial Ability
  2.2 Interaction
    2.2.1 Attention
    2.2.2 Affordances
    2.2.3 Abstraction

3 Introduction to User Interfaces
  3.1 Character-Based Interfaces
  3.2 2D Graphical User Interfaces
    3.2.1 Desktop Metaphor
    3.2.2 Interface Components

4 3D Graphical User Interfaces
  4.1 Perception of Depth
    4.1.1 Monocular Depth Cues
    4.1.2 Binocular Depth Cues
  4.2 Interface Components
  4.3 Input Devices
  4.4 Output Devices
  4.5 Virtual Reality

5 Application Areas for 3D Graphical User Interfaces
  5.1 Information Visualisation
  5.2 Simulation
    5.2.1 Virtual Wind Tunnel
    5.2.2 Flight Simulator
  5.3 Computer Aided Design
  5.4 Computer Supported Cooperative Work
  5.5 Entertainment
  5.6 Marketing

6 Overview of 3D Graphics Standards
  6.1 OpenGL
    6.1.1 Introduction
    6.1.2 History
    6.1.3 Design
    6.1.4 Architecture
    6.1.5 Application Programming Interface Specification
    6.1.6 Libraries on top of OpenGL
    6.1.7 Applications
  6.2 Virtual Reality Modelling Language
    6.2.1 Introduction
    6.2.2 History
    6.2.3 Design
    6.2.4 Presentation and Interaction
    6.2.5 External Authoring Interface
    6.2.6 Problems
    6.2.7 Future
  6.3 Conclusion

7 Comparison of 2D and 3D Graphical User Interfaces
  7.1 Navigation
    7.1.1 Cognitive Maps
    7.1.2 Spatial Ability
    7.1.3 Motion Sickness
  7.2 Interaction
    7.2.1 Attention
    7.2.2 Affordance
    7.2.3 Abstraction
    7.2.4 Input and Output
  7.3 Discussion

8 Analysis
  8.1 Current Situation
  8.2 Desired Situation
    8.2.1 Information Visualisation
    8.2.2 Attractiveness
  8.3 New User Interface Design
    8.3.1 Navigation
    8.3.2 Interaction
  8.4 System Architecture

9 Summary

Bibliography

Appendix A

Appendix B


    1 Introduction

This chapter gives the reader the background and purpose of the thesis and describes our methodology.

    1.1 Background

The rapid development of computer hardware and software has caused significant changes in the way users interact with computers. We have seen a development from the early character-based interfaces to the modern Graphical User Interfaces (GUI) of today.

The Windows, Icons, Menus, and Pointer (WIMP) interface paradigm dominates the computing systems of today, yet these interaction techniques were originally developed for machines that are now 10 to nearly 20 years old. Human-computer interaction (HCI) currently faces the challenge of evolving past the WIMP plateau and taking advantage of new technologies, which efficiently use the capabilities of the computing systems of today and more effectively match human capabilities and perception.

Until recently, Three-Dimensional (3D) graphics were primarily used in specialised domains such as Computer Aided Design (CAD) and flight simulators. Due to the increasing performance of new computer hardware, in combination with state-of-the-art interactive 3D computer games, e.g. the category of Doom-like games [Lampton 1994], there has been a rapid growth in interest in real-time 3D graphics [Mullet et al. 1995].

    1.2 Employer

Asea Brown Boveri (ABB) is a Swedish/Swiss industrial enterprise and a market leader in producing power stations, power transmissions, and power transport. It is also active in other areas of industry. ABB Network Partner AB is a subsidiary that, among other things, produces information management systems and protection and control terminals for power stations.

ABB Network Partner AB is, like many other companies, very interested in how 3D graphics technologies can be applied to improve its processes and products.

    1.3 Purpose

The purpose of the thesis is twofold: to conduct a survey of 3D GUIs from a cognitive perspective and to investigate how to implement 3D GUIs.

We will therefore investigate what the term 3D GUI implies and identify current application areas. Further, we will discuss how 3D GUIs can be applied to improve company processes and products.

To accomplish our second goal we will study 3D graphics standards in order to investigate what it takes to implement 3D GUIs. Finally, we intend to implement a prototype to show how ABB Network Partner can take advantage of 3D GUIs in their applications, which will be our main contribution to the field.

    1.4 Methodology

Our work consists of a survey of GUIs and in particular 3D GUIs. There are several papers in this area of research on which we have based our results. Many of these papers are only available on the Internet because this is a newly developed research area. The parts of the thesis covering 3D standards are based on studies of the reference literature of the standards and our own experience of using them. Finally, we have studied the user interfaces of different ABB Network Partner AB applications, for which we have constructed a 3D GUI prototype to show how their products can be improved by 3D.

    1.5 Responsibilities

Farid BenHajji is responsible for the contents of chapters 2, 5, 7, and 8, and Erik Dybner is responsible for the contents of chapters 3, 4, 6, and 9. We have thoroughly discussed and jointly developed all parts of the thesis.

    1.6 Structure of the Thesis

We start off in chapter 2 by describing the cognitive aspects of user interface design. In chapter 3 we give a brief introduction to user interfaces, and in chapter 4 an introduction to 3D graphical user interfaces. Chapter 5 is a survey of application areas for 3D graphical user interfaces, followed by chapter 6, which gives an overview of 3D graphics standards. Chapter 7 compares 2D and 3D graphical user interfaces based on the cognitive aspects introduced in chapter 2. Chapter 8 describes the current and desired situation for the user interfaces of the applications of ABB Network Partner, which leads to a description of a new user interface. Finally, chapter 9 sums up the thesis.

    1.7 Acknowledgements

    This thesis originated from a commission given by ABB Network Partner AB.

We would like to thank our supervisor at DSV and ABB Network Partner AB, Johan Kummeneje. We would also like to thank Jan Bystedt, who was kind enough to help proofread this thesis.


    2 Cognitive Aspects on User Interface Design

User interface design is an interdisciplinary activity in which research from the field of psychology can be of great use. An understanding of how humans perceive and interact with their environment can contribute to the design of more intuitive user interfaces. Therefore, this chapter is a short introduction to some cognitive aspects of navigation and interaction.

    2.1 Navigation

Navigation is the process by which people control their movements using environmental cues and artificial aids such as maps so that they can reach their goals without getting lost. A major problem for users in all forms of navigation is how to maintain knowledge of their location and orientation while moving through space.

Important aspects when discussing navigation in computer systems are cognitive maps and spatial abilities. [Darken and Sibert 1993] have shown that people use physical-world wayfinding strategies in large virtual worlds, i.e. cognitive maps for navigational aid. According to [Höök and Dahlbäck 1997], individual differences in spatial abilities affect users' ability to navigate in computer systems.

    2.1.1 Cognitive Maps

Birds have often proven to be superior navigators during migration, and although research has not produced definitive answers, there exists evidence that indicates the use of advanced perceptual abilities to detect and encode environmental cues [Darken and Sibert 1993].

It is believed that birds use a landmarking technique in which a cognitive map is created [Darken and Sibert 1993]. Landmarks on this map may be visual, acoustic, or olfactory. Resources such as shelter and water are encoded along with information concerning the shortest and safest paths between points. Increased altitude enables spatial relationships to be refined since more landmarks can be seen simultaneously, but it also decreases the strength of the stimuli.

Like birds, humans are also thought to form cognitive maps of their environments for use in navigation, using a set of generic components such as paths, edges, landmarks, nodes, and districts. According to [Darken and Sibert 1993] they include:

Paths: linear separators, walkways and passages.

    Edges: linear separators, walls or fences.

Landmarks: objects that are in sharp contrast to their immediate surroundings, such as a church spire.

Nodes: sections of the environment with similar characteristics. For example, a group of streets with the same type of light posts.

    Districts: logically and physically distinct sections.

According to [Chen and Czerwinski 1998], simply adding cues like borders, boundaries, and gridlines in a user interface significantly improves navigation performance.


    2.1.2 Spatial Ability

It has been shown that users' spatial ability is relevant to their ability to make use of computer systems. There exists enough knowledge of individual differences and navigation to say that these are important and significant [Höök and Dahlbäck 1997].

When discussing spatial ability, three dimensions are commonly addressed [Satalich 1995]: spatial orientation, spatial visualisation, and spatial relations. Spatial orientation involves the ability to mentally move or transform stimuli while retaining their relationships. It also involves the mental manipulation of an object using oneself as a reference. Spatial visualisation goes further, in that the person can manipulate the relationships within an object. Finally, spatial relations consists of the ability to imagine how an object will align from different perspectives.

In [Höök and Dahlbäck 1997] it is concluded that spatial ability affects users' ability to navigate large file structures. Similar results are also found concerning navigation in hypermedia, i.e. a collection of non-linear text-based nodes that are linked together. The best user was almost 20 times faster than the slowest in using such a system to solve six problems.

2.2 Interaction

Interaction deals with how humans exchange information with the environment, typically why we choose to focus on certain elements (attention), how we know what we can do with these elements (affordances), and how we filter redundant information (abstraction).

    2.2.1 Attention

The attention of humans is, among other things, attracted to differences in visual appearance, e.g. colour, shape, texture, shading, size, and location, and to movement, which we have a strong ability to register [Wiss and Carr 1998].

The location of an object also determines the amount of attention we give to it. Objects located in the centre of the visualisation tend to get more attention than peripheral objects. Also, if our attention is focused on something particular, elements close to this will get more attention than elements further away. Our ability to register movement, among other things, makes blinking objects catch our eyes, making it difficult to concentrate on other things. This ability is inherited from our ancestors, for whom it was very important to notice movement that could be either danger or potential food in their environment [Smith 1993].

    2.2.2 Affordances

Affordance refers to the perceived and actual properties of things, primarily those fundamental properties that determine just how things can possibly be used [Norman 1988]. In other words, affordances are the cues that an object gives to explain what can be done with it. A button affords pushing, a knob affords turning, and a ball affords bouncing or throwing. What the object affords to a person relates to that person's current mental model, cultural aspects, etc.

A psychology of causality is also at work as we interact with the environment, meaning that the perceived result of an action appears to be caused by that particular action. The failure and the action may very well be related only by coincidence. This kind of false causality may occur in user interfaces due to poor design [Norman 1988].


    2.2.3 Abstraction

The amount of data that flows through the human senses is so large that we are forced to filter much of it in order to focus on the elements that we consider important. The problem is that filtering removes global context and may leave us disoriented. One solution to retain global context is to organise the data that we do retain in ways that promote understanding at the expense of detail. Abstraction implies a many-to-one transformation from the possible variants of a feature to a single invariant form. This can be illustrated by the gestalt laws of perception [Smith 1993].

The gestalt laws of perception are one way to describe how the human brain relates visual symbols: by grouping symbols that are close together, it forms structure that is only suggested in the visualisation. Gestalt psychology is a school of psychology that emphasises patterns of organisation. Gestalt is a German word that translates to form, shape, pattern, or organised whole. The basic belief is that the whole is greater than the sum of its parts. The gestalt psychologists catalogued a number of laws of perceptual grouping which provide evidence for the nature of low-level visual processing, as well as a clue to the nature of the visual representation. They suggest that people group and interpret stimuli in accordance with four laws of perceptual organisation: proximity, similarity, closure, and continuity [Smith 1993].

The principle of proximity suggests that elements closer together tend to be grouped together, as opposed to those further away. The principle of similarity states that objects are grouped together based on similar properties. Another law of perceptual organisation is called closure; it describes the ability to see apparently disorganised or unrelated parts as a meaningful whole and to construct a total picture from incomplete or ambiguous material. The last gestalt principle of organisation, continuity, deals with objects that define smooth lines or curves. Good continuation is such a powerful organising principle that when the figure is restored to its initial state, you may lose the new organisation you just found as your original organisation takes over again. Proximity, similarity, and closure are illustrated in figure 2-1.

Figure 2-1 Three laws of perceptual organisation: proximity, similarity, and closure [Smith 1993].


    3 Introduction to User Interfaces

The user interface is one of the most important parts of any program because it is the only connection between the system and the user. Up until now there have been basically two different ways to communicate with the interface: either by typing commands in a character-based interface (CBI) or by pointing and clicking on objects in a GUI. CBIs are generally more difficult to learn but are often considered to be more flexible than GUIs [Preece et al. 1994].

We will give a brief introduction to user interfaces and present different types of user interfaces, starting with the CBI and the two-dimensional (2D) GUIs. The next chapter introduces 3D GUIs.

    3.1 Character-Based Interfaces

The display screen of a CBI is typically represented as an array of boxes, each of which can hold only one character. Since a CBI can only display characters, the users must interact with the system by typing commands on a keyboard (see figure 3-1).

The basic input device on most small computer systems is a keyboard. As characters are typed, they are stored in memory and copied to a basic output device, a display screen.

CBIs can be concise and unambiguous, but are often difficult for a novice to learn and remember, mainly because it is essential for the users to learn the syntax. Once it is learned, a CBI can often be flexible and powerful for advanced users. However, error rates are typically high, training is necessary, retention may be poor, and error messages can be difficult to understand [Preece et al. 1994].

Many character-based interfaces include some features of GUIs, such as menus. Other character-based interfaces include shapes for drawing simple objects but are not capable of representing more complicated objects, e.g. curves; these are sometimes called graphical character-based user interfaces to distinguish them from true GUIs [Preece et al. 1994].

    Microsoft MS-DOS Version 6.2

    Copyright Microsoft Corp 1981-1993.

    C:\>copy g:\supervisor\final_thesis.doc a:\cheater\

    C:\>

Figure 3-1 A Character-Based Interface.

    3.2 2D Graphical User Interfaces

Fundamental to GUIs is that the screen is divided into a grid of picture elements (pixels), and by turning the pixels on or off, pictures can be formed [Davis 1991].

Similar to character-based interfaces, GUIs also use a keyboard as an input device and a display as an output device. With the introduction of the desktop metaphor (see section 3.2.1), a new input device that revolutionised the interaction was introduced. The mouse is a small device that moves the cursor relative to its present position and makes it easy to interact with the computer system [Smith et al. 1982].

GUIs usually feature windows, icons, menus, and pointers. The window makes it possible to divide the screen into different areas and to use each area for different tasks. Each window can contain icons, which are small pictures of system objects representing commands, files, or windows. Menus are collections of commands and objects from which a user can select and execute a choice. The pointer is a symbol that appears on the display screen and is moved to select and execute objects and commands.

A GUI often includes well-defined standard formats for representing text and graphics, thus making it possible for different programs using a common GUI to share data.

The use of GUIs helps reduce the mental effort required to interact with the programs. Instead of remembering sequences of complex command languages, users only need to learn how to interact with the simulated world of objects, e.g. icons and menus [Preece et al. 1994].

    3.2.1 Desktop Metaphor

One of the most famous of the first graphical user interfaces is the Star user interface, designed by Xerox Corporation's Palo Alto Research Centre in the 1970s. Yet it was not until the early 1980s that the development of the Apple Lisa and the Macintosh made graphical user interfaces popular [Smith et al. 1982].

The Star interface is based on the physical office and thus designed to be more like the physical world that is already familiar to the users [Smith et al. 1982]. The idea of the interface metaphor was to create electronic counterparts to the physical objects in an office. This involved representing office objects, e.g. papers, folders, filing cabinets, and trashcans, on the screen. The overall organising metaphor presented on the screen was that of a desktop, resembling the top of an office desk.

Some of the principles that were developed with the Star computer included direct manipulation, What You See Is What You Get (WYSIWYG), and consistency of commands [Smith et al. 1982]. Direct manipulation is a communication style where objects are represented on the screen and can be manipulated by the user in ways analogous to how the user would work with the real objects. WYSIWYG is a way to represent the user's conceptual model in a form that can be displayed. For example, a document is displayed on the screen as it would look in printed form [Preece et al. 1994].

    3.2.2 Interface Components

Interface components (sometimes known as widgets) are graphical objects with behaviour that are used to control the application and its objects [Conner et al. 1992]. These objects, such as buttons, menus, sliders, texts, labels, droplists, and lists, are combined to make up the GUI. They all allow user interaction via a pointer device and/or a keyboard.


    4 3D Graphical User Interfaces

In the literature the term 3D interface is used to describe a wide variety of interfaces for displaying and interacting with 3D objects. True 3D interfaces, i.e. interfaces with all their components in a 3D environment, have not yet had any major impact outside the laboratory. The interfaces nowadays referred to as 3D GUIs are almost exclusively hybrids between 2D and 3D interfaces. In this thesis we will refer to these "hybrid" interfaces simply as 3D GUIs.

In the early days of 3D computing the only users of 3D graphics were exclusive groups of scientists and engineers (i.e. experts) with access to expensive supercomputers and workstations. Today, the growth in platform-level support for real-time 3D graphics has increased the interest in adding to the dimensionality of the GUIs [Mullet et al. 1995]. The user group has grown from an exclusive group of expert users to a diverse group of researchers and practitioners in such areas as CAD/CAM, medicine, and engineering. This imposes high requirements on the user interface, since it is now going to be used by a broad, heterogeneous user group with a great diversity in computing knowledge. One possible way to accomplish this is to design the user interface in a way that makes it possible for users to engage more natural skills in navigating and orientating themselves within applications. The concept of Virtual Reality (VR), also known as Virtual Environment (VE), is an attempt to accomplish this. It is also a field closely coupled to 3D GUIs.

    4.1 Perception of depth

An important topic in the field of 3D graphics is how to visualise objects so that the users get a good impression of their shapes as well as their relative positions. To accomplish this, it is important to consider how depth and shape are perceived by human beings.

How does one tell from a two-dimensional image where objects are located in space? In a normal situation this is nothing we have to worry about, since our brain takes care of all that for us by using the information it obtains from the retina. However, to be able to create credible 3D graphics it is necessary to have a deeper insight into these matters.

The retina receives information in two dimensions, but the brain translates these cues into three-dimensional perceptions. It does this by using both monocular cues (which require only one eye) and binocular cues (which require both eyes) [Smith 1993].

    4.1.1 Monocular Depth Cues

The monocular depth cues are relevant when discussing computer graphics since these are the cues that will have to be used to create the illusion of depth on a normal two-dimensional display. These depth effects are experienced just as powerfully with one eye closed as when both eyes are used.

With clever use of monocular cues it is possible to create images whose individual parts make sense but whose overall organisation is impossible in terms of our existing perceptual schemas. The artist Oscar Reutersvärd skilfully used this in the creation of his famous impossible figures (see figure 4-1) [Gabriel 1985].


Figure 4-1 Swedish artist Oscar Reutersvärd cleverly manipulated monocular depth cues to create his impossible figures [Gabriel 1985].

Imagine standing on a straight railroad track watching the rails recede in the distance. The two rails will seem to get closer together the further away you look. This illustrates an important monocular cue known as linear perspective.
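As a brief illustration (our own addition, not part of the original thesis), both linear perspective and decreasing size follow from the perspective projection used to form a 2D image. Assuming a simple pinhole model with the image plane at focal distance d, a point (x, y, z) in eye space projects to

\[
x' = \frac{d\,x}{z}, \qquad y' = \frac{d\,y}{z}
\]

so an object of size s at distance z appears with projected size s' = ds/z: doubling the distance halves the image size, which is why the parallel rails, with constant real separation, seem to converge in the distance.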

Other more obvious monocular cues include decreasing size, i.e. objects further away from us produce a smaller image than objects close to us, and the fact that objects further away appear in a higher horizontal plane (i.e. objects in the distance appear closer to the horizon). Texture and clarity also affect the way we perceive depth: we see objects close to us more clearly and in greater detail than objects far away, and the texture of objects appears more fine-grained in the distance.

Finally, the effective use of patterns of light and shadow can provide important depth information. All the monocular depth cues are summarised in figure 4-2.

Figure 4-2 Summary of the monocular depth cues (linear perspective, decreasing size, height in the horizontal plane, texture, clarity, light and shadow), illustrated with the computer game Quake II, which uses them to provide a realistic game world.

    4.1.2 Binocular Depth Cues

The monocular depth cues discussed earlier are not as convincing as the much richer depth effect that occurs when both eyes are used. The appearance of depth that results from the use of both eyes is called stereopsis [Smith 1993]; it arises when the brain combines the two slightly different images it receives from each eye into a single image with depth (see figure 4-3). The effect of stereopsis can be achieved artificially by using devices like Head Mounted Displays or shutter glasses, as described in section 4.4. In contrast to the monocular depth cues, 10% of people cannot interpret stereopsis, i.e. they cannot detect binocular depth cues, thus making them stereoblind [Media Lab 1998].

Figure 4-3 Stereopsis occurs when each eye sees a slightly different image [Media Lab 1998].
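The principle can be made concrete with a minimal sketch (our own illustration, not code from the thesis; the camera position and values are hypothetical): the scene is drawn twice per frame, once from each eye position, with the two positions separated by the interocular distance along the camera's right vector.

/* Minimal stereo-pair sketch (illustrative only, not code from the thesis):
 * the same scene is rendered twice, with the virtual camera displaced by half
 * the interocular distance to the left and to the right, giving each eye a
 * slightly different image and hence stereopsis. Here the two eye positions
 * are simply computed and printed. */
#include <stdio.h>

typedef struct { float x, y, z; } Vec3;

/* Displace a camera position along its "right" vector. */
static Vec3 offsetAlong(Vec3 pos, Vec3 right, float amount)
{
    Vec3 r = { pos.x + right.x * amount,
               pos.y + right.y * amount,
               pos.z + right.z * amount };
    return r;
}

int main(void)
{
    const float iod = 0.064f;                 /* typical eye separation, ~6.4 cm   */
    Vec3 camera = { 0.0f, 1.6f, 5.0f };       /* hypothetical viewer position      */
    Vec3 right  = { 1.0f, 0.0f, 0.0f };       /* camera's right vector (unit)      */

    Vec3 leftEye  = offsetAlong(camera, right, -iod / 2.0f);
    Vec3 rightEye = offsetAlong(camera, right, +iod / 2.0f);

    /* A real application would render the scene once from each of these
     * positions and send the images to the left and right displays of an HMD,
     * or to the alternating frames used by shutter glasses (section 4.4). */
    printf("left eye : (%.3f, %.3f, %.3f)\n", leftEye.x, leftEye.y, leftEye.z);
    printf("right eye: (%.3f, %.3f, %.3f)\n", rightEye.x, rightEye.y, rightEye.z);
    return 0;
}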

    4.2 Interface components

The 3D user interfaces of today often take little advantage of the opportunities of the added third dimension, using mostly plain 2D interface components that are mapped to rotation and translation of the 3D objects. A normal computer mouse is also commonly used. The mouse movement is then mapped to rotation/translation, and a selection mechanism is used for selecting which degrees of freedom the device is controlling, as can be seen in figure 4-4.

Figure 4-4 An example of 2D interface components controlling the view of a 3D world [Superscape 1998].


Various approaches have been made to develop new metaphors and interaction techniques to take advantage of the greater possibilities of 3D. One approach that is still in its infancy is the concept of 3D widgets, which involves making the interface components first-class objects in the same 3D environment as the rest of the objects in an application. There are no generally accepted metaphors for 3D interfaces analogous to the well-established metaphors for 2D interfaces discussed in section 3.2.1. The lack of metaphors means that essentially no toolkits for 3D interfaces exist [Conner et al.].

Selective Dynamic Manipulation is a paradigm for interacting with objects in visualisations [Chuah et al. 1995]. Users control the visualisations using 3D handles that may be attached to objects (see figure 4-5). To manipulate the objects, the handles are pushed or pulled. The objects move and react continuously to the manipulation of the handles. Use of interactive animation in this way can help users understand and track the changes that occur to an object set [Chuah et al. 1995].

Figure 4-5 Selective Dynamic Manipulation uses handles on objects to afford stretching, rotation, etc. [Chuah et al. 1995].

Other examples of 3D widgets include the Arcball, which is a mouse-driven 2D interface to simulate a physical trackball (see figure 4-6). The virtual object is shown on the screen, and when the user clicks and drags on the virtual object, the computer interprets these drags as tugging on the simulated trackball. The virtual object rotates correspondingly. To provide the third rotational degree of freedom, a circle is drawn around the object, and when the user clicks and drags in the area outside of the circle, rotation is constrained to be about the axis perpendicular to the computer screen [Hinckley 1996].

    Figure 4-6 Example of the Arcball [Hinckley 1996].
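The core of the Arcball mapping can be sketched as follows (our own illustration based on the description above, not code from the thesis): the mouse position, in normalised screen coordinates, is projected onto a virtual sphere, and a drag between two such points yields a rotation axis and angle.

/* Arcball sketch (illustrative only): map a mouse position, given in
 * normalised screen coordinates in [-1, 1], onto a unit sphere; a drag from
 * point p0 to point p1 then rotates the object about the (unnormalised)
 * axis p0 x p1 by the angle between the two vectors. */
#include <math.h>
#include <stdio.h>

typedef struct { double x, y, z; } Vec3;

/* Project a 2D screen point onto the arcball sphere. Points outside the
 * sphere's silhouette are clamped to its edge (z = 0), which constrains the
 * rotation to the axis perpendicular to the screen. */
static Vec3 toSphere(double x, double y)
{
    Vec3 p = { x, y, 0.0 };
    double d2 = x * x + y * y;
    if (d2 <= 1.0) {
        p.z = sqrt(1.0 - d2);            /* on the sphere                    */
    } else {
        double d = sqrt(d2);             /* outside: clamp to the silhouette */
        p.x /= d;
        p.y /= d;
    }
    return p;
}

static Vec3 cross(Vec3 a, Vec3 b)
{
    Vec3 c = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return c;
}

static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

int main(void)
{
    /* A small horizontal drag across the centre of the object. */
    Vec3 p0 = toSphere(0.00, 0.0);
    Vec3 p1 = toSphere(0.25, 0.0);

    Vec3   axis  = cross(p0, p1);                 /* rotation axis        */
    double angle = acos(fmin(1.0, dot(p0, p1)));  /* rotation angle (rad) */

    printf("axis  = (%.3f, %.3f, %.3f)\n", axis.x, axis.y, axis.z);
    printf("angle = %.3f rad\n", angle);
    return 0;
}

The clamping step is what produces the behaviour described above: drags outside the drawn circle hit the sphere's edge and therefore rotate the object only about the axis perpendicular to the screen.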


    4.3 Input Devices

Systems that use plain 2D widgets as interface objects require no special input devices besides a normal mouse and/or keyboard. The problem with using a normal mouse to manipulate 3D objects is that the user must somehow indicate which degrees of freedom the mouse movement should control. Other input devices are available to eliminate the need for multiple mouse clicks and menu selections. These can be divided into two categories: desktop devices and free-space input devices. Examples of desktop devices that provide six degrees of freedom are the Space Mouse and the Spaceball (see figure 4-7). These kinds of devices are often used as a complement to the 2D mouse to rotate objects or views. The 2D mouse is usually held in the dominant hand and is used for picking objects and selecting menu items, while the other hand controls the 3D input device. In the case of the Spaceball, input is accomplished by lightly twisting, pulling, or pushing the ball in all directions.

Figure 4-7 The Space Mouse (left) and the Spaceball are desktop devices with six degrees of freedom suitable for interaction with 3D objects.

The free-space input devices consist of camera-based devices and devices with magnetic trackers. Many different input devices using magnetic trackers are shaped as gloves, sometimes referred to as data gloves. The glove measures the movements of the wearer's fingers and transmits them to the computer. Sophisticated data gloves also measure movement of the wrist and elbow. A data glove may also contain control buttons or act as an output device, e.g. vibrating under control of the computer. An application usually generates an image of the data glove on the computer screen so that the user can grab objects in the computer-generated 3D environment. By using different gestures, such as pointing, the user can change the view or move around in the computer model.

A typical so-called data glove is the Fakespace Pinch Glove (see figure 4-8), described in [Hinckley 1996]. The pinch glove uses tabs on the fingertips to sense when the user is pinching. On the back of the hand there is a magnetic tracker used to provide hand position data to the computer.


Figure 4-8 A data glove and its representation in the computer-generated environment [Hinckley 1996].

    4.4 Output Devices

3D output devices come in two types: monoscopic, in which both eyes see exactly the same view, and stereoscopic, where the eyes see separately computed views to give the sensation of stereoscopic vision [Preece et al. 1994]. Of course, most 3D applications use monoscopic output devices, i.e. conventional computer monitors, to display the 3D world. These applications are said to provide a window on the world interface. This is the technique with the least demands for computing power and special equipment, but since no binocular depth cues are provided, the users do not get a very strong feeling of depth.

The stereoscopic output devices include Head Mounted Displays (HMD) (see figure 4-9). HMDs are commonly used in the field of VR to create an immersive experience. A typical HMD is composed of a set of goggles or a helmet with tiny monitors in front of each eye to generate images seen by the wearer as three-dimensional. Often the HMD is combined with a head tracker so that the images displayed change as the head moves.

Other, simpler forms of goggles for stereoscopic output are so-called shutter glasses. They are used in combination with a normal computer monitor. The basic technique behind shutter glasses is that they alternately block the view of each eye in synchrony with the computer's display of left- and right-eye images, providing stereoscopic images on the computer screen. This technique cannot be used successfully together with head tracking, since the computer monitor will not follow the user. Shutter glasses also risk producing a flickering image.

    Figure 4-9 The Head Mounted Display MRG 3c [Liquid Image].

A quite different approach to 3D output is the CAVE Automatic Virtual Environment from the University of Illinois. The CAVE is a surround-screen, surround-sound, projection-based system. The computer graphics are projected into a 10x10x9-foot cube composed of display screens that completely surround the viewer. The viewer explores the virtual world by moving around inside the cube and grabbing objects with a three-button, wand-like device.

Another way of providing true 3D output is the boom display described in [Hinckley 1996] (see figure 4-10).

    Figure 4-10 The Boom Display.

Although the technology for HMDs is available today, we consider it still to be in its infancy, and it is not yet commonly used. It is hard to imagine an office of today using HMDs for common tasks. HMDs would significantly complicate ordinary communication or even such a simple thing as viewing what is located on your desk.

    4.5 Virtual Reality

There are as many definitions of VR as there are people who try to define it. Although no single definition is available, a common goal of VR applications is to give the user an illusion of immersion in a computer-generated environment. The idea of VR originates from the books by William Gibson, in which he describes a computer-generated environment called the Matrix [Gibson 1984].

The term is sometimes used more generally to refer to any virtual world represented in a computer, even if it is just a text-based or graphical representation.

A narrower definition of VR is sometimes referred to as immersive VR. It includes computer hardware that replaces some or all of the human senses, creating an illusion of immersion in the virtual environment (see figure 4-11): sight is replaced by using a Head Mounted Display, hearing by using headphones, and the sense of touch or force on the body by using a special suit or gloves. For an application to fit this definition of VR, some or all of these techniques have to be included and combined.

When techniques from the area of immersive VR are available, we believe there is no need for widgets for translation and rotation of objects. If the user wants to rotate an object in such an environment, it should be possible to grab the object in some way and rotate it as if it were an actual object.

Desktop VR is a subset of VR that does not require as much special hardware. It can be accomplished at little additional cost to a typical personal computer. Desktop VR applications all have a window on the world interface: they display the 3D world as if you were watching it through a window.


Figure 4-11 An immersive VR system using a head mounted display and a free-space input device.

The negative part is that the media has brought forward unrealistic expectations of the capabilities of VR. No computer application so far has provided the realistic feedback that purists claim to be necessary for VR. We believe that the real power of VR lies in the way object properties and behaviours approximate their real-life counterparts. Such features of VR make the objects' behaviour more accessible and understandable to human users. According to [Corder 1995], immersive VR is still in an experimental state, but desktop VR and VR in fields such as flight simulation already do a lot of good.


    5 Application areas for 3D Graphical User Interfaces

This section covers examples of application areas where 3D GUIs have been applied. Examples of existing applications are given within the areas of information visualisation, simulation, computer aided design, computer supported cooperative work, entertainment, and marketing.

5.1 Information Visualisation

Systems based on the desktop metaphor use spatial metaphors but do not take advantage of them, since they really are flat environments. The advent of 3D metaphors like rooms, houses, and cities introduces the possibility of taking advantage of our spatial abilities as we are used to doing in the real world. Spatial concepts are undoubtedly very important ingredients in our daily lives, and such spatial environments could therefore be of advantage in computer interfaces. This is shown in the File System Navigator described in [Andrews 1994].

The File System Navigator (FSN) from Silicon Graphics is a prototype known from its appearance in the movie Jurassic Park. The prototype visualises a Unix file system by displaying the files as an information landscape, in an attempt to take advantage of the user's spatial capabilities (see figure 5-1).

Figure 5-1 The File System Navigator visualises a Unix file system as an information landscape.

It uses 2D interface components but visualises the file system in 3D. Directories are represented hierarchically by pedestals laid out on a plane; their height is proportional to the size of the contained files, and different colours represent their age. Smaller blocks on top of the directory pedestals represent files in the directory. The directories are connected by wires, on which it is possible to travel. Clicking on the arc to a subdirectory results in an invisible hand grabbing you and leading you through space to that subdirectory. Clicking on a file block


The Vertical Motion Simulator from NASA is one of the largest three-dimensional simulators in the world [Corder 1995]. It consists of a simulated aircraft cockpit sitting on a platform which is itself supported by a number of hydraulic pistons that move the whole cockpit around (see figure 5-3).

The simulated cockpit consists of a removable room that is used to represent different types of aircraft such as fighters, helicopters, transports, and space shuttles. The cockpits differ in the number of windows and the configuration of the controls and instruments, since they must be able to represent a wide variety of current and planned aircraft. Behind the windshields are high-resolution displays designed to approximate what the pilot would see in the corresponding real-world situation.

Figure 5-3 The simulated aircraft cockpit of the Vertical Motion Simulator.

    5.3 Computer Aided Design

Computer Aided Design systems are designed to facilitate the design process for engineers and architects when creating and manipulating 2D and 3D models. These models may be rotated, scaled to various proportions, edited, labelled, and manipulated in a variety of ways. CAD systems deal with displaying and manipulating blueprints for real objects and can therefore benefit from 3D graphics to a great extent (see figure 5-4).

Figure 5-4 The CAD system MiniCAD for the Apple Macintosh.


The 3D models created with a CAD system can be used to create architectural walk-throughs. Interactive computer programs that simulate the experience of walking through a building's interior are useful for visualisation and evaluation of building models before the buildings are constructed. With a walk-through, things can be discovered, added, or corrected before the actual construction begins, thus potentially saving a lot of money.

    5.4 Computer Supported Cooperative Work

Computer Supported Cooperative Work (CSCW) is the study of how people work together using computer technology. Typical applications include email, awareness and notification systems, videoconferencing, chat systems, multi-player games, and real-time shared applications (such as collaborative writing or drawing). The potential benefits of designing computers and their interfaces to facilitate group working appear to be huge [Preece et al. 1994]. As the following example shows, virtual environments can provide powerful and natural ways of collaborating using computers.

The Distributed Interactive Virtual Environment (DIVE) system from the Swedish Institute of Computer Science (SICS) is a toolkit for building distributed multi-user VR applications in a heterogeneous network environment [Stål and Andersson 1997]. As shown in figure 5-5, participants navigate in 3D space and observe, meet, and collaborate with other participants. A participant in the world is called an actor, and is either a human user or an automated application process. To facilitate the recognition and awareness of ongoing activities, the actor is represented by an avatar, which is an icon or a representation of a user in a shared virtual reality world or multi-user domain [Carey and Bell 1997].

    Figure 5-5 Conference in the DIVE.

The visualiser is a rendering² application through which the user sees the world; it renders the scenes from the viewpoint of the actor's eye. Changing the position of the eye, or changing the eye to another object, will change the viewpoint. A visualiser can be used in combination with a wide range of devices such as an HMD and gloves. Further, it reads the user's input devices and maps the physical actions taken by the user to logical actions, such as navigation and interaction with the objects in the DIVE system.

² Rendering is the process by which a computer creates images from models [Nieder et al. 1993].


In a typical DIVE world, a number of actors leave and enter dynamically. Additionally, any number of applications exist within a world. Such applications typically build their user interfaces by creating and introducing the necessary graphical objects. Thereafter, they listen for events in the world, so that when an event occurs, the application reacts according to some control logic.
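Schematically, the control flow of such a world application might look like the following sketch (our own illustration; all function and type names are hypothetical placeholders and do not correspond to the actual DIVE API).

/* Schematic event loop of a world application (illustrative only; these
 * functions are hypothetical placeholders, not the real DIVE API). */
#include <stdbool.h>
#include <stdio.h>

typedef enum { EVENT_NONE, EVENT_OBJECT_SELECTED, EVENT_ACTOR_ENTERED, EVENT_QUIT } EventType;

typedef struct {
    EventType type;
    int       object_id;   /* object or actor the event refers to */
} Event;

/* Placeholder stubs standing in for the toolkit's functionality. */
static void  create_button(const char *label) { printf("created button '%s'\n", label); }
static Event wait_for_event(void)             { Event e = { EVENT_QUIT, 0 }; return e; }
static void  highlight_object(int id)         { printf("highlighting object %d\n", id); }
static void  greet_actor(int id)              { printf("greeting actor %d\n", id); }

int main(void)
{
    /* 1. Build the user interface by introducing graphical objects into the world. */
    create_button("Open document");

    /* 2. Scan for events and react according to some control logic. */
    bool running = true;
    while (running) {
        Event e = wait_for_event();
        switch (e.type) {
        case EVENT_OBJECT_SELECTED: highlight_object(e.object_id); break;
        case EVENT_ACTOR_ENTERED:   greet_actor(e.object_id);      break;
        case EVENT_QUIT:            running = false;               break;
        default:                    break;
        }
    }
    return 0;
}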

    5.5 Entertainment

Whether you look at computer games or movie effects, there is no doubt that 3D modelling has had a tremendous impact on the entertainment industry over the last decade or so.

Many movies of today feature special effects in 3D and may even become blockbusters just because of them. As described in [Govil-Pai 1998], there are even movies made up entirely of computer animation, e.g. Toy Story. Most new computer games include 3D graphics and 3D user interfaces in one form or another. This implies that users prefer games with 3D graphics to their 2D counterparts. One possible reason is that new computer games must look modern to attract their audience. Many game developers have spent a lot of money and effort developing attractive, amusing, and easy-to-use 3D interfaces. In our view, this suggests that there is much to learn from studying the user interfaces of these games.

To apply techniques from games in, e.g., office applications, one must bear in mind that computer games do not necessarily have to include a user-friendly interface. A problem in controlling the user interface can be viewed simply as part of the game.

    5.6 Marketing

The computer, telephone, and television boom, as well as the merging of these technologies, has had a major impact on the way companies produce and market their products. The advent of the Internet promises companies access to millions of new customers at a fraction of the cost of printed or televised advertising [Kotler 1997]. Multimedia documents can be constructed for the WWW with relative ease. Text and pictures can provide a great deal of product information, but are static and limited to what the creator chooses to present.

It would be more useful if a potential buyer were able to interact with an animated 3D model of the product. This would engage the attention of the buyer and could provide information about the product that otherwise could only be gained from real experience of it [Burton et al. 1997]. Therefore, it is important that the technology for marketing a product is able to attract and maintain the attention of a potential customer. It must be easy to use even for inexperienced users, and it should also provide useful information about the product. Finally, it must be cost-effective to employ and easy to create and maintain.

    People do find 3D interesting, and if there is sufficient 3D interaction built into the environment, it will not only attract interest but maintain it as well [Burton et al. 1997], thus putting the users, not the computer, in control. As we have mentioned before, the main problem with this technology is the lack of developed 3D navigation techniques. Other drawbacks are the time it takes to create complex scenes and the slow rendering of these scenes.

    There are basically two different ways of using 3D graphics for marketing purposes: attractiveness, i.e. attracting users by improving the look and feel of applications, and advertising. Advertising possibilities include making impressive commercials for television and the cinema using computer graphics, but a new advertising channel is also available, namely the Internet. Using modelling software, companies can create 3D models of their products. This allows an interactive form of marketing where customers have more possibilities of examining a product before it is bought.


    6 Overview of 3D Graphics Standards

    The use of 3D graphics has been limited not only by the performance of computer systems but also by the lack of programming standards. Without standards, it has been a big undertaking for developers to include 3D interfaces in their applications.

    Standards allow developers to include 3D graphics in their applications without having to reinvent the wheel over and over again. Standards also allow hardware developers to optimise their hardware for the use of that particular standard. This leads to higher performance and hopefully also better cross-platform compatibility, making it easier to port applications to other platforms.

    This chapter focuses on two standards: Open Graphics Library (OpenGL), a low-level programming interface for 3D graphics, and Virtual Reality Modelling Language (VRML), a high-level description language for 3D worlds. These standards are well accepted at the time of writing this thesis, and they represent two different approaches to the standardisation of 3D graphics. The insights gained from writing this chapter guided our choice of tools for the implementation of the prototype described in chapter 8.

    6.1 OpenGL

    This section provides an overview of OpenGL, focusing on the considerations behind the design of the system. At the end of the section, a number of application areas for OpenGL are presented.

    6.1.1 Introduction

    OpenGL is a software interface to graphics hardware designed to be independent of operating system, window system, and hardware operations. The interface consists of commands that allow the specification of geometric objects in two or three dimensions, together with commands that control how these objects are rendered. OpenGL has no high-level commands for describing complex 3D objects; instead it includes a set of geometric primitives (points, lines, and polygons) which, when used in combination, can build up more complex models (see figure 6-1).

    6.1.2 History

    Silicon Graphics Inc. developed OpenGL as a successor to its graphics library, the IRIS Graphics Library (IRIS GL). IRIS GL is a hardware-independent graphics interface that has been implemented on numerous graphics devices of varying sophistication [Kilgard 1993].

    OpenGL has a 3D rendering philosophy similar to IRIS GL, but has removed outdated functionality or replaced it with more general functionality, making it a distinctly new interface. Experience with IRIS GL was useful in designing OpenGL, and another advantage was that programmers with experience of IRIS GL were able to change systems without great trouble.

    In 1992 OpenGL was proposed as a standard for 3D graphics, and for this reason the industry consortium known as the OpenGL Architecture Review Board (ARB) was formed. The OpenGL ARB licenses and directs further development of OpenGL. Current members are Digital Equipment, Evans & Sutherland, Hewlett-Packard, IBM, Intel, Intergraph, Microsoft, Silicon Graphics, and Sun Microsystems3.

    Figure 6-1 OpenGL Geometric Primitives.

    6.1.3 Design

    OpenGL provides a layer of abstraction between graphics hardware and an application program. To the programmer it is visible as an application programming interface (API) consisting of about 120 distinct commands. For an example see figure 6-2.

    glBegin(GL_POLYGON);
        glVertex2f(0, 0);
        glVertex2f(1, 0);
        glVertex2f(1, 1);
        glVertex2f(0, 1);
    glEnd();

    Figure 6-2 OpenGL code used to describe a rectangle.

    The specification is window-system independent, meaning it does not specify how to manipulate windows or receive input from the user. This allows OpenGL to be implemented for any window system.
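    Because OpenGL itself does not create windows or read input, a small windowing toolkit is normally used alongside it. As a minimal sketch (assuming the commonly used GLUT toolkit is available; the callback name display is our own), the window and event loop come from the toolkit while the drawing code is pure OpenGL:

    #include <GL/glut.h>

    /* Drawing callback: contains OpenGL calls only, no window-system code. */
    static void display(void)
    {
        glClear(GL_COLOR_BUFFER_BIT);
        glBegin(GL_TRIANGLES);
            glVertex2f(-0.5f, -0.5f);
            glVertex2f( 0.5f, -0.5f);
            glVertex2f( 0.0f,  0.5f);
        glEnd();
        glFlush();
    }

    int main(int argc, char **argv)
    {
        /* GLUT, not OpenGL itself, creates the window and delivers input. */
        glutInit(&argc, argv);
        glutCreateWindow("OpenGL example");
        glutDisplayFunc(display);
        glutMainLoop();
        return 0;
    }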

    3 According to the OpenGL ARB FAQ (http://www.opengl.org)


    6.1.4 Architecture

    To maintain good performance rates, the OpenGL API allows complete access to graphics operations at the lowest possible level that still provides device independence. As a result it does not provide a means for describing or modelling complex geometric objects [Segal 1994]. This means that the OpenGL commands specify how a certain result should be produced rather than what exactly that result should look like, i.e. OpenGL is procedural rather than descriptive.

    OpenGL uses immediate mode rendering. That is, when the graphics system is used to create a scene or object, each function and command has an immediate effect on the frame buffer and the result of each action is immediately visible in the scene.

    The boundary between the OpenGL implementation and an application is well defined to clearly specify how data is passed between the two. An application (the client) issues commands that are interpreted and processed by OpenGL (the server). However, the client-server model is an abstract model and does not require OpenGL to be implemented as distinct client and server processes. If it is run on a single computer, that computer is both client and server.

    6.1.5 Application Programming Interface Specification

    The designers of OpenGL present the graphics system as a state machine. Specifying the API as a state machine allows consistent, precise specification and eliminates ambiguity about what a given operation does and does not do [Kilgard 1993]. The routines that OpenGL supplies provide a means for the programmer to manipulate OpenGL's state machine to generate the desired graphics output. The programmer puts the machine in different states/modes that remain in effect until he changes them. The programmer can at any time query the system for the current value of any of the state variables. E.g. one of the variables holds the current drawing colour: after it is set, every object drawn will have that colour until the variable is changed [Neider et al. 1993].
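    As a minimal sketch of this state-machine behaviour (assuming an already-initialised rendering context; the function name is ours), the current colour is set once, affects every subsequent vertex, and can be read back with a state query:

    #include <GL/gl.h>

    void draw_two_red_points(void)
    {
        GLfloat current[4];

        glColor3f(1.0f, 0.0f, 0.0f);            /* set state: current colour = red */

        glBegin(GL_POINTS);
            glVertex2f(0.25f, 0.25f);           /* drawn red                       */
            glVertex2f(0.75f, 0.75f);           /* still red: the state persists   */
        glEnd();

        glGetFloatv(GL_CURRENT_COLOR, current); /* query the state variable        */
    }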

    To ensure that all implementations of OpenGL fully conform to the standard, all implementers must perform a suite of tests to ensure source code compatibility [Neider et al. 1993]. This allows users and programmers to be sure that all basic features and functions will be available on all platforms.

    The OpenGL API is defined in terms of the programming language C, but bindings for several other languages exist. Fortunately, the OpenGL function parameters are all simple types (Boolean, integer, floating point, arrays), so the API translates easily from C to other languages. Currently the ARB controls C, C++, Fortran, Pascal, and Ada binding specifications. However, several unofficial bindings exist for other languages, e.g. Java, Tcl/Tk, and Python [Neider et al. 1993].

    6.1.6 Libraries on top of OpenGL

    Higher-level drawing is cumbersome using only OpenGL's powerful but primitive set of drawing commands. This, and the lack of support for communication with a specific window system, have led to the development of several libraries written on top of OpenGL, e.g. the OpenGL Utility Library, the OpenGL Extension to the X Window System, and Open Inventor.


    The OpenGL Utility Library

    The OpenGL standard also defines the OpenGL Utility Library (GLU). The GLU serves both as a means to render useful geometric objects and as a model for building other libraries that use OpenGL for rendering. Among other things, the GLU includes routines for simplifying the rendering of spheres, discs, and cylinders.
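    For instance, the GLU quadric routines let a programmer render such shapes with a few calls instead of generating the vertices by hand. A short sketch (the radii and tessellation values are arbitrary example values, and a current rendering context is assumed):

    #include <GL/glu.h>

    void draw_quadrics(void)
    {
        GLUquadricObj *q = gluNewQuadric();

        gluSphere(q, 1.0, 32, 32);             /* radius, slices, stacks             */
        gluCylinder(q, 0.5, 0.5, 2.0, 32, 8);  /* base, top, height, slices, stacks  */
        gluDisk(q, 0.0, 1.0, 32, 1);           /* inner radius, outer radius, ...    */

        gluDeleteQuadric(q);
    }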

    OpenGL Extension to the X Window System

    The X Window System has become the standard for UNIX workstations [Segal 1994]. The programmer uses the window system to obtain windows on the display, on which he can display images or text. The window system also provides the programmer with means to obtain user input via keyboard or mouse. The OpenGL Extension to the X Window System makes OpenGL rendering available to the programmer just like any other extension to the window system.
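    A rough sketch of how this extension (GLX) ties an OpenGL rendering context to an X window is given below; error handling and the event loop are omitted, and the window size and attribute values are arbitrary:

    #include <GL/glx.h>
    #include <X11/Xlib.h>

    int main(void)
    {
        Display *dpy = XOpenDisplay(NULL);
        int attribs[] = { GLX_RGBA, GLX_DOUBLEBUFFER, None };
        XVisualInfo *vi = glXChooseVisual(dpy, DefaultScreen(dpy), attribs);

        XSetWindowAttributes swa;
        swa.colormap = XCreateColormap(dpy, RootWindow(dpy, vi->screen),
                                       vi->visual, AllocNone);
        swa.event_mask = ExposureMask | KeyPressMask;

        Window win = XCreateWindow(dpy, RootWindow(dpy, vi->screen),
                                   0, 0, 300, 300, 0, vi->depth, InputOutput,
                                   vi->visual, CWColormap | CWEventMask, &swa);
        XMapWindow(dpy, win);

        GLXContext ctx = glXCreateContext(dpy, vi, NULL, True);
        glXMakeCurrent(dpy, win, ctx);   /* OpenGL calls now target 'win'        */

        glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
        glClear(GL_COLOR_BUFFER_BIT);
        glXSwapBuffers(dpy, win);        /* display the rendered back buffer     */

        XFlush(dpy);
        return 0;
    }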

    Open Inventor

    Open Inventor represents an object-oriented application policy built on top of OpenGL, providing a programming model and user interface for OpenGL programs [Wernecke 1994]. Open Inventor provides pre-built objects and a built-in event model for user interaction, high-level application components for creating and editing three-dimensional scenes, and the ability to print objects and exchange data in other graphics formats. Open Inventor also defines a file format for describing 3D scenes, which was used as a basis for the design of the Virtual Reality Modelling Language (see section 6.2).

    6.1.7 Applications

    OpenGL is, according to us, a platform well suited for building libraries that handle structured geometric objects, no matter how the structures are described. This is because of its independence of particular methods for describing the objects.

    A common use of OpenGL is in the field of direct manipulation, where the graphics engine has to perform fast rendering to provide immediate feedback to the user. In the making of the movie Jurassic Park, an Open Inventor-based lighting and shading editor was used. The editor provided the graphic artists with a tool that could quickly show the effects of changing lights [Wernecke 1994]. Other applications of OpenGL include CAD/CAM applications, scientific visualisations, and games.

    6.2 Virtual Reality Modelling Language

    This section is an introduction to the history and basics of the Virtual Reality Modelling Language (VRML). The language's current problems and the future of the language are also discussed.

    6.2.1 Introduction

    VRML is a simple platform-independent language for describing different 3D shapes and interactive environments. VRML can be used in a variety of areas, e.g. engineering, scientific visualisation, multimedia presentations, entertainment, educational titles, web pages, and shared virtual worlds [Marrin et al 1997].


    6.2.2 History

    In 1989 a new project was started at Silicon Graphics, Inc., with the aim of designing and building an infrastructure for interactive 3D graphics applications. In 1992 the Iris Inventor 3D toolkit was released as the first product of these efforts. In 1994 a major revision of Inventor was released under the name Open Inventor. Open Inventor is based on OpenGL and is portable to a variety of platforms. The reference manual describing the objects and file format in Open Inventor was eventually used to write the first draft of the VRML 1.0 specification. The second release of VRML added significantly more interactive capabilities to the language [Carey and Bell 1997]. The formal processing of the current version, called VRML97, to become an International Standard began in June 1996.

    The VRML Consortium, Inc. is a non-profit corporation that was formed in 1996 with the mission of fostering and promoting VRML as the open standard for 3D multimedia and shared virtual worlds on the Internet. Today the Consortium has 52 members, including among others Apple Computer, Silicon Graphics, Fujitsu, IBM, Intel, Microsoft, Mitsubishi, Oracle, Samsung, and Sony [Marrin et al 1997].

    6.2.3 Design

    The design of the language is influenced by several general design philosophies. These philosophies have arisen from the constraints put on the design by the VRML standardisation process and the intended users of VRML [Carey and Bell 1997]. VRML is a high-level language designed to fit into the existing infrastructure of the Internet [Marrin et al 1997]. Using existing standards makes it easier for the Web developer as well as for the implementers of the VRML standard. VRML can contain references to files in many different graphical and sound standard formats as well as programming languages such as Java [Carey and Bell 1997]. VRML allows graphic designers to create objects in a 3D space and to add simple behaviour to those objects. Java can be used to allow more complex behaviour or to provide access to external data. There are basically two ways to use Java in combination with VRML: it can either be used as a scripting language inside the VRML code, or as an external Java applet that controls a VRML world (see section 6.2.5) [Carey and Bell 1997].

    VRML can be regarded as a 3D counterpart to the World Wide Web standard HyperText Markup Language (HTML) [Carey and Bell 1997]. This means that VRML serves as a simple multiplatform language for publishing 3D Web pages. One important shared property of VRML and HTML is the simple specification and the ease with which files created by different people or tools can be merged into a new document (see figure 6-3).

    #VRML V2.0 utf8
    Shape {
        appearance Appearance {
            material Material { }
        }
        geometry Sphere {
            radius 1.0
        }
    }

    Figure 6-3 Example of VRML source code to create a sphere.


    6.2.4 Presentation and Interaction

    All the interpretation, execution, and presentation of VRML files is done by a browser, which displays the scene graph [Carey and Bell 1997]. The scene is an acyclic directed graph, which contains hierarchically grouped nodes that describe the objects and their properties [Marrin et al 1997]. This presentation is known as a virtual world and is navigated by a user. The world is displayed as if experienced from a particular location. The browser may define navigation paradigms that enable the user to walk or fly through the virtual world.

    The user can also interact with the world through sensor nodes in the scene graph. Sensors respond to user interaction with the different objects in the world, the passage of time, or the movement of the user through the world.

    The visual presentation of the objects in the virtual world is designed to resemble the physical characteristics of light. A lighting model is used to describe how appearance properties and lights in the world are combined to produce the displayed colours [Carey and Bell 1997].

    The browser consists of three main components (see figure 6-4): the parser, the scene graph, and the audio and visual presentation [Marrin et al 1997]. The parser interprets the VRML file and creates a scene graph. The scene graph consists of a transform hierarchy (the nodes) and a routes graph (the connections between the nodes). There is also an execution engine in the scene graph that processes the routes graph and makes changes to the transform hierarchy. User interaction is transferred to the routes graph component and the audio and visual presentation. The audio and visual presentation performs the graphics and audio rendering of the transform hierarchy and feeds it back to the user [Marrin et al 1997].

    6.2.5 External Authoring Interface

    The External Authoring Interface (EAI) is a standard programming interface that defines the set of functionality of a VRML browser that the external environment, e.g. a Java applet, can access [Marrin 1997]. It allows users to write external programs that interact with and dynamically update a world. Such VRML interfaces could for instance be used to help organise complex data sets in real time. This allows VRML to be part of a larger multimedia web presentation.

    Several VRML browsers support the EAI, which is not yet a part of the VRML specification [Marrin et al 1997].


    Figure 6-4 A Conceptual Model of a VRML Browser [Marrin et al 1997].

    6.2.6 Problems

    The language has the same types of problems as HTML. There are browsers for most of the common platforms (Macintosh, Windows NT/95/3.1, SGI Irix, SUN), but unfortunately the browsers are consistent neither with the VRML specification nor with each other. Some of the major problems encountered, according to [Sanders and Gay 1996], are:

    Different browsers can display the same object with different colours and shading.

    Worlds written in a given programming language will only run on browsers that support that particular language.

    Most browsers let users interact with the world while it loads, which means that developers must design user interaction with caution to prevent worlds from crashing if only partially loaded.

    One of the major problems with the language is that it is new; neither browsers nor tools are completely up to the final specification.

    Other problems that we have encountered are that many complex objects with texture mapping make VRML worlds slow and sometimes almost impossible to explore. It is also difficult to create source code that is easy to read and follow, which makes the code hard to understand or debug. Complex VRML environments are therefore most likely to be created with some kind of authoring tool.

    6.2.7 Future

    VRML is a novelty that we feel has potential, but few essential applications today. It is at a relatively early stage of development and promises to be a prominent technology in the future. It has received industry-wide support as a standard to build on for the future of the Web and is unique in that it does not have much competition.

    There are several VRML extensions in progress. Many users and developers have requested a compressed binary format for VRML files, emphasising the reduction of file size for fast transmission, and also a simplification of the file structures for fast parsing [Carey and Bell 1997]. There has also been a request for a standard programming interface for VRML browsers to be defined. Finally, many feel that a very important issue is adding multi-user interaction or autonomous creatures (i.e. avatars) that can sense and react to their environment [Carey et al. 1997].

    6.3 Conclusion

    OpenGL and VRML are two different standards that are good at different things. Using OpenGL requires a lot of relatively low-level programming. VRML, on the other hand, uses a more high-level approach similar to HTML, making it simple to produce 3D models. When creating simple models that are to be published on the Internet, e.g. as part of a web site, VRML is preferable. When creating more complex applications using 3D graphics, OpenGL is the better choice.

    Other potential standards for 3D graphics have emerged but have not become as widespread as VRML or OpenGL. To show the reader that there are alternatives to the standards described earlier, this section gives a quick overview of other potential standards for 3D graphics.

    Direct3D is a Windows API developed by Microsoft, offering functionality similar to OpenGL. This API is used primarily to provide 3D acceleration in games.

    Java3D is an extension to the Java programming language. With Java3D the programmer can create scene graphs similar to those in VRML.

    QuickDraw 3D is a cross-platform API from Apple. It also includes a set of interface standards and guidelines to promote consistency between applications [Young 1996].

    Superscape Virtual Reality Toolkit (VRT) is a commercially available virtual reality authoring system from Superscape, based on the PC platform. The VRT consists of several different editors combined into a single authoring environment. A plug-in and an ActiveX control allow you to navigate within 3D worlds on the Web.


    7 Comparison of 2D and 3D Graphical User Interfaces

    We do not intend to argue that either 2D or 3D GUIs are better than the other. Each interface device or input technique will work better for some tasks and worse for others. The most appropriate user interface for an application depends on the context of the tasks to be supported and the intended users.

    The discussion in this chapter is based on the cognitive aspects of user interface design described in chapter 2.

    7.1 Navigation

    One could argue that since we live in a 3D environment in real life, this would be the more natural way to navigate computer systems. On the other hand, it has been shown that compared to 2D GUIs, 3D GUIs provide new ways to get lost. In this section we discuss navigational aspects of 2D and 3D GUIs, i.e. cognitive maps and spatial abilities. The problem of motion sickness in 3D GUIs is also discussed.

    7.1.1 Cognitive Maps

    A problem can be noticed while exploring 2D hypermedia environments like the Internet. The space is not fully known, so users must construct their own maps using search engines or browsing, which easily confuses the navigator [Jul and Furnas 1997].

    Since we live in a 3D environment in real life, it is possible that it would be more convenient for us to navigate in the same way when we use our computers. Compared to a 2D GUI, a 3D GUI more closely resembles a real-world environment. The 3D GUI can e.g. include generic components like landmarks, paths, etc., as discussed in section 2.1.1. This implies that one would be able to take more advantage of the user's natural ability to create a cognitive map, thus facilitating the user's navigation of the information space [Darken and Sibert 1993].

    7.1.2 Spatial Ability

    One of the major problems in 3D user interface design, according to [Boyd and Darken 1996], is that the increased degrees of freedom provide new and interesting ways to get lost while moving through the worlds. As suggested by [Höök and Dahlbäck 1997], users' spatial ability affects their ability to make use of computer systems. The ability to perceive and understand spatial relationships is of great importance when using a 3D GUI. This implies that Höök and Dahlbäck's results are also applicable to these GUIs.

    7.1.3 Motion Sickness

    A problem similar to motion sickness has been experienced by users during, and sometimes after, a session in a 3D environment. Early studies with both fixed-base and moving-base driving simulators implicated cue conflict as the source of the problems [Goldberg and Hiller 1995]. Cue conflict is described in [Goldberg and Hiller 1995] as occurring when there is an ambiguity between senses or within a sense. The two primary conflicts thought to be at the root of simulator sickness occur between the visual and vestibular senses. In a fixed-base simulator, the visual system senses motion while the vestibular system senses no motion; thus, according to the cue conflict theory, a conflict results. In a moving-base simulator, the visual stimuli experienced may not correspond exactly to the motion that the vestibular system registers, so the result can still be a conflict.


    This problem needs to be solved before 3D graphics can replace 2D graphics as the standard user interface in everyday work.

    7.2 Interaction

    Since the breakthrough of the desktop metaphor, most users have developed an understanding of 2D interface components. No such commonly accepted standards are available for 3D GUIs, but 3D has other advantages, e.g. its potential to take advantage of differences in visual appearance. This section compares cognitive aspects of interaction with 2D and 3D GUIs, i.e. attention, affordance, and abstraction. Aspects of input and output are also discussed.

    7.2.1 Attention

    The interface design should allow users to focus their attention with as little cognitive overload as possible [Preece et al. 1994].

    Techniques for presenting information at the interface to guide attention in 2D GUIs have included the use of spatial and temporal cues, different colours, and alerting techniques such as blinking, pop-up boxes, and auditory warnings.

    3D GUIs can take better advantage of the differences in visual appearance that attract human attention, such as colour, shape, texture, shading, size, location, and the ability to register movement [Wiss and Carr 1998]. A drawback is that, if used wrongly, size can be confusing, since size is a method to create a feeling of perspective in a 2D display. Another drawback with 3D GUIs is that objects in the interface can stand in the way of each other when viewed from certain angles.

    7.2.2 Affordance

    The direct manipulation paradigm has caused the development of a kind of visual language for affordances in applications. Starting with the Star interface, the desktop metaphor has been established as a standard throughout different platforms (e.g. Macintosh, Microsoft Windows) [Preece et al. 1994]. Therefore most users have acquired some sort of understanding of the affordances of general 2D interface components [Wiss and Carr 1998]. This is despite the fact that the GUIs of these operating systems no longer have a strong connection to the actual real-world desktop.

    The paradigms and metaphors of 3D interfaces are less developed than those for 2D interfaces, thus forcing the developers of 3D applications to develop new metaphors and paradigms or to utilise 2D interface components in a 3D interface [Conner et al. 1992]. Some 3D metaphors are natural analogies of those familiar in 2D; however, developers of 3D interfaces must find new metaphors and interaction techniques to take advantage of 3D environments. Real-world metaphors can be used to support affordances, e.g. an object that resembles a box can be opened. After all, we live in a true 3D world, and by the time we reach adulthood we have perfected many manipulative and navigational tasks to the point where we can perform them without conscious attention. This level of naturalness applied to a 3D user interface would make it intuitive and easy to learn.

    7.2.3 Abstraction

    A technique often used when implementing 3D graphics is to reduce the level of detail for objects far away from the viewpoint. The reason for this is the limitations of computer hardware, but it is also done in order to reduce cluttering. Only the nearest objects are displayed in full detail. Objects far away can e.g. be displayed as simple cubes that the user perceives as abstractions of the actual objects. This technique can for example be used to let the user get a general view of a large amount of information.

    Different levels of detail can also be used in 2D interfaces, e.g. for displaying maps with different magnifications, but the technique is even more useful when applied to 3D graphics.
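    A hedged sketch of the idea in C follows; the three-level scheme and the distance thresholds are our own illustration rather than values taken from any particular system:

    #include <math.h>

    /* Illustrative level-of-detail selection: 0 = full model, 1 = simplified mesh,
     * 2 = bounding cube.  Thresholds are arbitrary example values.               */
    int choose_level_of_detail(const float eye[3], const float object[3])
    {
        float dx = object[0] - eye[0];
        float dy = object[1] - eye[1];
        float dz = object[2] - eye[2];
        float distance = sqrtf(dx * dx + dy * dy + dz * dz);

        if (distance < 10.0f)  return 0;   /* near: render in full detail   */
        if (distance < 50.0f)  return 1;   /* medium: render a reduced mesh */
        return 2;                          /* far: render a simple cube     */
    }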

    7.2.4 Input and Output

    2D interface components often have very simple behaviour. They commonly have few degrees of freedom (DOF) and usually support only a small range of values within each DOF. Interaction in 3D space, on the other hand, deals with six degrees of freedom versus three in a 2D environment [Conner et al. 1992]. Users need to examine 3D objects from different viewpoints and navigate through scenes to gain a better understanding of the world. Therefore, viewing and interacting with 3D data demands more user interaction. In our own informal studies we have not found any advantages in displaying blocks of text in three dimensions, since it only leaves the user with the cumbersome task of trying to view the text at a straight angle from above.

    Sufficient hardware in 2D desktop environments usually consists of a display, a keyboard, and a mouse. The same kind of equipment can also be used for a 3D environment, with various modes and mappings to compensate for the lack of degrees of freedom [Hand 1997]. To really take advantage of 3D technology, input and output devices that support interaction with six DOF are to be preferred.
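    As an illustration of such mode-and-mapping schemes, the hypothetical sketch below maps ordinary 2D mouse drags onto camera rotation or translation, using a modifier key as the mode switch; all names and speed constants are ours:

    /* Hypothetical sketch: map a 2D mouse drag onto 3D camera motion.
     * Without the modifier, the drag rotates about the X and Y axes; with the
     * modifier held, it translates along X and Z, compensating for the mouse's
     * missing degrees of freedom through an explicit mode switch.              */
    typedef struct {
        float rot_x, rot_y;      /* rotation angles in degrees */
        float pos_x, pos_z;      /* translation                */
    } Camera;

    void apply_mouse_drag(Camera *cam, int dx, int dy, int modifier_held)
    {
        const float rotate_speed    = 0.5f;
        const float translate_speed = 0.01f;

        if (modifier_held) {
            cam->pos_x += dx * translate_speed;
            cam->pos_z += dy * translate_speed;
        } else {
            cam->rot_y += dx * rotate_speed;
            cam->rot_x += dy * rotate_speed;
        }
    }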

    2D graphics of today is well understood and the hardware has been optimised by the vendors. Most applications that present a 3D world still use the same kind of display as a normal 2D application, and to some extent also the interaction techniques of the desktop metaphor, disregarding the fact that this could be inappropriate for a large class of 3D applications. Almost all 3D interaction techniques must be created from scratch, since there are no established toolkits specific to 3D interfaces. This places the burden of implementing interface components on the developer.

    7.3 Discussion

    As we stated earlier, the most appropriate user interface for an application depends on the context of the tasks to be supported and the intended users. A more controversial standpoint in this matter could be to claim that 3D GUIs are like colour television. In the early days of colour television people were sceptical about it, but nowadays almost no one will settle for a television set without colour display. Since it is more natural for us to watch a television show in colour, because we can perceive colour, would it not be more natural for us to be able to rotate objects in all directions instead of just seeing them in two dimensions? One possibility is that all objects in a computer program will always have a 3D representation but that the user has the option of ignoring it. Compare it with a colour television set that can also be used to watch programmes in black and white only. Imagine e.g. a computer program with a desktop metaphor where all objects have 3D representations. The user is able to view the desktop from any angle but chooses just to watch the paper he is writing from above.


    8 Analysis

    To illustrate how 3D graphics can be applied to user interfaces, we have developed a prototype with a GUI applicable to several of the existing products of ABB Network Partner. We call it A Bumble Bee.

    In this chapter we will therefore describe the current and desired situation for the user interfaces of the applications of ABB Network Partner. This will lead to a description of a new user interface.

    8.1 Current Situation

    Most applications of ABB Network Partner run in Windows or UNIX environments with traditional GUIs. The user interfaces of these applications are described in Appendix A.

    One common element in a number of these applications is the visualisation of data in hierarchies using a tree view. These hierarchies can contain a substantial amount of information, e.g. all information about relays in a complete power company like Vattenfall4. This leads to very large hierarchies that are difficult to get an overview of.

    Products like S.P.I.D.E.R. and IDES/RDE rely largely on the use of map displays in which power stations are placed. In the case of S.P.I.D.E.R. the map is used for surveillance purposes, while in IDES/RDE it is used as a graphical interface to a database. These maps must be easy to navigate since they contain a lot of information that sometimes must be available quickly in critical situations like power failures.

    8.2 Desired Situation

    This section gives some examples of how we believe 3D interfaces can be used profitably in the future.

    The rapid changes in technology today force all companies involved in the business of information technology to keep track of the latest developments. Companies see technology as producing an endless stream of opportunities. Yet taking advantage of technology entails