Profiling user interactions for predictive 3D streaming and rendering


Profiling User Interactions of 3D Complex Meshes for Predictive Streaming & Rendering

Vani V¹, Pradeep Kumar R² and Mohan S³

¹ Department of Information Technology, Dr. N.G.P. IT, Affiliated to Anna University, Coimbatore, India.
² Department of Computer Science Engineering, Adithya IT, Affiliated to Anna University, Coimbatore, India.
³ Department of Computer Science Engineering, Dr. N.G.P. IT, Affiliated to Anna University, Coimbatore, India.


Abstract. Inspired by the cache model, a predictive agent is analytically constructed to determine user navigation based on patterns derived from user profiles. The user profiles are derived from the interactions made by a diverse set of users over different 3D models. An attempt has been made to analyze how efficiently the prediction works to stream a 3D model based on the predetermined transition path generated from the user profiles. The transition paths for various models are generated by exploiting the properties of the Markov chain model. The analytics collected from the transition paths affirm that the predictive agent lessens the rendering latency significantly. The rendering latency is lessened by streaming the required data from the server to the client well before it is requested. The streaming and rendering process, driven by user interactions from the client, streams and renders only the visible portion of the 3D models while ensuring that there is no compromise on the visual quality of the objects. This paper mainly focuses on profiling the user interactions during the navigation of 3D meshes and analyzes the various outcomes of it.

Keywords: User Profiling, Web 3D, 3D Streaming, Predictive Agent, 3D Modeling and Rendering, 3D Virtual Environment, Transition Path.

1 Introduction

3D modeling and rendering over the network has advanced in recent research as the application of 3D over the web becomes seamless. While creating a photorealistic virtual environment, the major challenge is to stream the 3D models within the available network bandwidth. At the same time, the visual quality and the delay in response to user navigation may be considerably affected. Hence a system that can promise a better virtual 3D environment over the existing network, without placing any constraint on user navigation, is the need of the hour. In this paper, an attempt is made to further reduce the waiting time of the user/client during the interaction by predicting the operation that would be performed by the user.

Mohan S. et al. (Eds.): Proceedings of the International Conference ICSIP 2012, pp.

springerlink.com Springer India 2012



2 3D Streaming: An Overview

3D streaming [3, 4 & 5] is the process of delivering 3D content in real time to users over a network. 3D streaming is carried out in such a way that the interactivity and visual quality of the content match, as closely as possible, those of content stored locally.

The main resource bottleneck here is usually assumed to be the bandwidth and not the rendering or processing power of the clients. To achieve this goal, simplification of the model and transmission of the content based on the user's view are the two dominant strategies adopted. The model simplification [6] and transmission strategies exploit the resolution of the model with respect to the user's view. When the user's view point is far from the 3D model's screen space, only the coarse (low resolution) mesh is brought to the client; when the user's view point is closer to the 3D model's screen space, the refined (high resolution) mesh is brought to the client. Therefore, multi-resolution models [7, 8] offer the possibility to manipulate representations of 3D objects at different levels of detail (LOD). It is possible to adapt to different hierarchies of LODs [9, 10 & 11] based on the application requirement. View-dependent LODs, which incrementally bring in the required quality of the 3D mesh, are the most widely used in virtual environments because they utilize the available bandwidth effectively.
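The view-dependent choice between coarse and refined meshes described above can be illustrated with a minimal sketch. The distance thresholds, the LOD names and the function `select_lod` are hypothetical and chosen only for illustration; the paper does not specify how its LOD bands are defined.

```python
from bisect import bisect_right

# Hypothetical distance thresholds (scene units) separating the LOD bands,
# ordered from nearest to farthest. These values are illustrative only.
LOD_THRESHOLDS = [2.0, 5.0, 10.0]
# One more name than thresholds: the last band covers everything farther away.
LOD_NAMES = ["high", "medium", "low", "coarse"]


def select_lod(view_distance: float) -> str:
    """Return the LOD band for a given distance between view point and model."""
    if view_distance < 0:
        raise ValueError("view_distance must be non-negative")
    # bisect_right counts how many thresholds are <= view_distance,
    # which is exactly the index of the band the distance falls into.
    return LOD_NAMES[bisect_right(LOD_THRESHOLDS, view_distance)]


if __name__ == "__main__":
    for d in (1.0, 2.0, 7.5, 50.0):
        print(d, select_lod(d))
```

A close view point maps to the refined mesh and a distant one to the coarse mesh, so only the resolution that the current view can actually resolve needs to be streamed.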

However, apart from the bandwidth, we also need to consider the rendering latency. Rendering latency can be reduced to some extent by considering the multi-resolution 3D model based on the user's view point. An attempt is made to reduce the rendering latency further by predicting the user's next move.

As most 3D streaming and rendering systems deal only with what the user is viewing (in the frustum) and not with the actual way the user interacts, the main objective of the proposed work is to analyze the user interactions and determine the relationship between the interaction elements and the streamed (and rendered) elements of the 3D model.

This determines, by prediction, the amount of data that ought to be sent to the client well before the client demands it. This approach would result in a reduction of the rendering latency.

3 Proposed Work

3.1 Predictive Model (Prm)

The proposed predictive model is based on understanding the user navigation in the virtual world. Based on the current position of the user navigation, only the visible vertices and faces of the selected triangular meshes are brought to the client machine during visualization. At the same time, based on the previous history collated from various user inputs, the next set of predicted vertices and faces is also pushed to the client with the help of the Predictive Agent (PA).

A Predictive Agent (PA) is built after a successful offline analysis carried out on user profiles collected from 55 different users (aged 18 to 22, from engineering institutions, with good visual and computer senses). This age group was chosen because it spends more time on gaming and has a relatively better understanding of the interfaces used to navigate. As part of the user analysis, the speed of key presses, the total session time spent by every user, the visual coverage of the model and the pattern of keys/buttons pressed are recorded across complex 3D models/meshes. For experimentation purposes, complex 3D models of sizes 26 MB (Armadillo) and 45 MB (Brain), with different shapes, are considered. Various shapes whose basic building block is a triangular mesh are considered in order to profile the moves when the 3D shape is oriented either horizontally or vertically. Depending on the shape, the user movement would also vary if the user wishes to check the visual appearance of the entire 3D shape.

The PA contains conventional transition probabilities [1, 2] of users when they move from one state to another. During a transition, the maximum probability from the given state to the next state is chosen for prediction and, further, for predictive streaming. This algorithm uses a greedy approach: it considers the maximum probability (compared with the transition probabilities to all other states) to move from a given state to the next. The transition probability paths generated for the various models are used to predict the user interactions at every state. This helps to optimize 3D streaming and rendering over the network by reducing the time delay between user request and response.
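The greedy choice of the most probable next state can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the state names, the probability values and the function `greedy_transition_path` are hypothetical, and the transition table is represented as a plain dictionary of rows.

```python
import math


def greedy_transition_path(transitions, start, steps):
    """Follow the most probable transition from each state for `steps` moves.

    `transitions` maps a state to a dict {next_state: probability}.
    Each row is checked to sum to 1, as required for a Markov chain.
    """
    for state, row in transitions.items():
        if not math.isclose(sum(row.values()), 1.0, abs_tol=1e-9):
            raise ValueError(f"row for {state!r} does not sum to 1")
    path = [start]
    state = start
    for _ in range(steps):
        row = transitions[state]
        # Greedy step: the locally optimal choice is the most probable next state.
        state = max(row, key=row.get)
        path.append(state)
    return path


# Hypothetical transition probabilities between three operations.
EXAMPLE = {
    "rotate_x": {"rotate_x": 0.2, "rotate_y": 0.5, "zoom_in": 0.3},
    "rotate_y": {"rotate_x": 0.1, "rotate_y": 0.3, "zoom_in": 0.6},
    "zoom_in": {"rotate_x": 0.7, "rotate_y": 0.2, "zoom_in": 0.1},
}

if __name__ == "__main__":
    print(greedy_transition_path(EXAMPLE, "rotate_x", 3))
```

With such a path available offline, the data needed for the predicted next operation can be requested before the user actually performs it.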

3.1.1 Analytical Model

The main objective of the proposed work is to develop an analytical model based on the user interaction while viewing the 3D models over the network. The central idea is to predict the user navigation and construct an analytical model for every 3D object (3D triangular mesh) using the PA. This predictive model is useful for bringing in the necessary surfaces during streaming, which reduces the rendering and response time. To construct the predictive model (Predictive Agent: PA), the following notations have been used:

Let $S_v$ be the set of mesh vertices in the server and $S_f$ be the set of corresponding mesh faces in the server for the selected 3D mesh. Let $C_v$ be the set of mesh vertices in the client, where $C_v \subseteq S_v$, and $C_f$ be the set of corresponding mesh faces in the client, where $C_f \subseteq S_f$.

On an operation $O_i$, which can be an arbitrary rotation ($x$, $y$, $z$), zoom in/zoom out ($Z_{in}$, $Z_{out}$) or translation ($T_x$, $T_y$, $T_z$), $C_v$ and $C_f$ can undergo a change, if it is a rotation operation, in terms of vertices and faces as follows: $\{V_i\}$ and $\{F_i\}$.

For $\{V_i\}$: $+\{V_i\} \subseteq S_v$, $-\{V_i\} \subseteq C_v$,

where $+\{V_i\}$ is the set of vertices chosen from $S_v$ and $-\{V_i\}$ is the set of vertices chosen out from $C_v$.

For $\{F_i\}$: $+\{F_i\} \subseteq S_f$, $-\{F_i\} \subseteq C_f$,

where $+\{F_i\}$ is the set of faces chosen from $S_f$ and $-\{F_i\}$ is the set of faces chosen out from $C_f$.

Table 1 summarizes the notations used in our model.


Table 1: Notations

Symbol        Meaning
$S_v$         Server vertex set
$S_f$         Server face set
$C_v$         Client vertex set
$C_f$         Client face set
$O_i$         i-th operation
$\{V_i\}$     Vertex changes
$\{F_i\}$     Face changes
$R$           Rotation
$x$           Rotation about x axis
$y$           Rotation about y axis
$z$           Rotation about z axis
$T_x$         Translation along x axis
$T_y$         Translation along y axis
$T_z$         Translation along z axis
$Z_{in}$      Zoom in
$Z_{out}$     Zoom out
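The vertex and face change sets can be made concrete with ordinary set operations. The sketch below assumes that the change sets are derived from the set of elements visible after the operation; the paper does not state how they are computed, so this derivation, along with the names `delta_sets` and `apply_delta`, is an assumption made for illustration. The same functions apply to face identifiers.

```python
def delta_sets(server_ids, client_ids, visible_ids):
    """Compute the change sets for one operation.

    server_ids  : all element ids held by the server (S_v or S_f)
    client_ids  : element ids currently on the client (C_v or C_f)
    visible_ids : element ids visible after the operation (assumed input)

    Returns (added, removed), where added is a subset of the server set
    and removed is a subset of the client set.
    """
    if not client_ids <= server_ids:
        raise ValueError("client set must be a subset of the server set")
    visible = visible_ids & server_ids   # ignore ids the server does not hold
    added = visible - client_ids         # +{V_i}: must be streamed to the client
    removed = client_ids - visible       # -{V_i}: no longer needed on the client
    return added, removed


def apply_delta(client_ids, added, removed):
    """Return the client set after applying one change."""
    return (client_ids - removed) | added


if __name__ == "__main__":
    S_v = {1, 2, 3, 4, 5, 6}
    C_v = {1, 2, 3}
    plus, minus = delta_sets(S_v, C_v, visible_ids={2, 3, 4, 5})
    print(plus, minus, apply_delta(C_v, plus, minus))
```

Only the added set needs to cross the network, which is what makes predicting it ahead of the user's operation worthwhile.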

3.1.2 Operation Profiling

To profile the interaction performed by the user, the rotation operation $R$ in any one of the directions +x/-x, +y/-y, +z/-z, as well as translation and scaling with fixed translation/scale factors, are considered.

For every key press/mouse move during the rotation, a fixed angle of rotation is applied to the 3D object, and the outcome of the rotation is an updated eye position and eye orientation (the eye refers to the camera position, which is the view point of the user in the 3D world). The speed of rotation is estimated from the number of keys pressed per second: the key presses determine the angle rotated per second.

Based on the rotation output, the amount of change in the vertices and faces ($+\{V_i\}$ and $+\{F_i\}$) that ought to be transmitted to the client is predicted. Only the predicted faces and vertices are transmitted to the client. The prediction hence reduces the rendering latency based on the client input.
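Because each key press applies a fixed angle, the rotation speed follows directly from the key press rate. The sketch below assumes that key press timestamps (in seconds) are logged and that the rate is measured over the intervals between presses; the function name `rotation_speed` and the 5 degree step are hypothetical.

```python
def rotation_speed(key_times, step_angle_deg):
    """Estimate rotation speed in degrees per second.

    key_times      : sorted timestamps (seconds) of rotation key presses
    step_angle_deg : fixed angle applied per key press (assumed value)
    """
    if len(key_times) < 2:
        return 0.0  # a single press gives no rate information
    duration = key_times[-1] - key_times[0]
    if duration <= 0:
        return 0.0
    # Number of intervals between presses divided by the elapsed time.
    keys_per_second = (len(key_times) - 1) / duration
    return keys_per_second * step_angle_deg


if __name__ == "__main__":
    # Five presses, half a second apart, each rotating by 5 degrees.
    print(rotation_speed([0.0, 0.5, 1.0, 1.5, 2.0], step_angle_deg=5.0))
```

A faster key rate means more of the mesh comes into view per second, so more vertices and faces have to be predicted and streamed ahead of time.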

3.1.3 User Profiling

To construct the predictive agent, an offline analysis has been carried out by considering 55 user profiles, ranging from novices to professionals interacting with the 3D virtual world. The user profiles include the rate at which a key is pressed or a mouse button is clicked with a drag, and the actual key/mouse/scroll button that is pressed per user session on various complex 3D meshes. Using the collated user profiles, operation patterns are determined by considering the transition probabilities. At each step of constructing the transition path, the transitions from the most probable state to all other possible states are extracted from the user profiles by exploiting the Markov chain model and the greedy approach. As per the greedy approach, at every transition step the locally optimal choice (the state with maximum probability) is considered and a complete transition path is constructed. Also, as per the Markov chain process, the sum of the transition probabilities from one state to all possible states should be 1. Based on the



4 Result and Analysis

We have used two different 3D models (Table 4), which contain only meshes, to experiment with the user interactions and build a rigid predictive agent. Meshes are considered so that the number of culled faces can be easily computed. The 3D meshes considered, with their number of vertices, faces, triangle strips and file size, are given in Table 5. The 3D meshes are carefully chosen so that they differ in shape and complexity (such as the total number of vertices and faces). The experimental setup was run on a system with an Intel Core2 Duo CPU P8600 @ 2.4 GHz, 4 GB RAM and a 1 GB ATI Radeon graphics card. This system was used to simulate a server environment in which the server streams the 3D meshes as requested by the user from a client machine based on the navigation. The client module was run on 55 client machines without a dedicated graphics card, each configured with an Intel Core 2 Duo CPU P6550 @ 2.3 GHz and 1 GB RAM. The profiling collected from the 55 client machines helps in rendering the 3D meshes more efficiently.

4.1 Analysis of User Profiling

We have collected the profiles of 55 different users (aged 18 to 22, from engineering institutions, with good visual and computer senses). For each 3D model, the users/clients are asked to navigate through the object with the various key strokes/mouse moves defined earlier, and a user manual is circulated among the users to give a fair idea of each key/mouse press and the corresponding operation performed. In addition, all the users are instructed on how to use the key strokes/mouse moves to navigate and visualize the 3D meshes. The number of times each key/mouse button is pressed is counted. Later, the probability of pressing each key/mouse button and the transition probability of moving from one key/mouse button to another are calculated, as in Figure 4(a) and 4(b).
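The two quantities mentioned above, the probability of each key and the transition probability between keys, can be estimated from a logged key sequence by simple counting. The sketch below is a minimal illustration; the log format (one key name per press), the key names and the function names are assumptions, not the authors' tooling.

```python
from collections import Counter, defaultdict


def key_probabilities(log):
    """Relative frequency of each key in the session log."""
    counts = Counter(log)
    total = len(log)
    return {key: count / total for key, count in counts.items()}


def transition_probabilities(log):
    """Estimate P(next key | current key) from consecutive pairs in the log."""
    pair_counts = defaultdict(Counter)
    for current, following in zip(log, log[1:]):
        pair_counts[current][following] += 1
    # Normalise each row so that its probabilities sum to 1.
    return {
        current: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for current, row in pair_counts.items()
    }


if __name__ == "__main__":
    # Hypothetical session: 'x', 'y', 'z' stand for rotation keys.
    session = ["x", "x", "y", "x", "z", "x"]
    print(key_probabilities(session))
    print(transition_probabilities(session))
```

Each row of the resulting table sums to 1, which is the Markov chain property the transition paths rely on.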

Figure 1 describes the overall visual coverage of the 3D mesh models considered. It can be inferred that, out of 55 users, more users covered the brain model (29 users, 53%) than the armadillo model (13 users, 24%). This visual coverage is estimated from the interactions with the model, the operations performed and the total number of vertices and faces covered. After applying a visibility culling algorithm, the study reveals that 40% of the meshes would be saved from rendering on a client machine for a complex mesh. It implies that only 60% of the meshes would be rendered and viewed by the maximum number of users who cover the entire model.
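The share of faces saved by culling can be measured per view point. The paper does not name the visibility culling algorithm it uses, so the sketch below substitutes plain back-face culling as an assumed stand-in; the helper names and the two-triangle example are hypothetical.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def is_back_facing(triangle, eye):
    """True if the triangle faces away from the eye.

    Assumes counter-clockwise vertex order defines the outward normal.
    """
    v0, v1, v2 = triangle
    normal = _cross(_sub(v1, v0), _sub(v2, v0))
    return _dot(normal, _sub(eye, v0)) <= 0


def culled_fraction(triangles, eye):
    """Fraction of triangles that back-face culling would skip."""
    if not triangles:
        return 0.0
    culled = sum(1 for t in triangles if is_back_facing(t, eye))
    return culled / len(triangles)


if __name__ == "__main__":
    front = ((0, 0, 0), (1, 0, 0), (0, 1, 0))   # normal points towards +z
    back = ((0, 0, 0), (0, 1, 0), (1, 0, 0))    # same triangle, reversed winding
    print(culled_fraction([front, back], eye=(0, 0, 5)))
```

Running such a count for each recorded eye position gives the kind of saved-versus-rendered percentage reported for the complex meshes.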

Figure 2(a) and (b) highlight the time spent on the model (session time) by each of the 55 users. From the results obtained, we can see that some of the users spent more than 560 seconds (around 10 minutes) on the brain model with 1446 key/mouse interactions (Figure 3(a)). On the other hand, 353 seconds were spent on the brain model with 4030 key presses. The results imply that the interactions performed by a user also reflect psychological aspects, such as how interested he/she is in the model being viewed and how long he/she takes to press the next key (think time), which can also be analyzed. Similarly, if we consider the Armadillo model, a maximum of 266 seconds is spent on the model with 1988 key/mouse interactions (Figure 3(b)) and on the other hand a maximum of 3129 key/mouse interactions is done for the duration of



Fig. 1. % of users who covered the entire model

Fig. 2(a). Users Session Time (in Sec) for Armadillo 3D mesh

Fig. 2(b). Users Session Time (in Sec) for Brain 3D mesh

Fig. 3(a). Users Interactions on Armadillo 3D mesh

Fig. 3(b). Users Interactions on Brain 3D mesh



6 References

[1] Gerald Benoit Simmons. Application of Markov chains in an interactive information retrieval system. Inf. Process. Manage. 41, 4 (July 2005), 843-857. DOI=10.1016/j.ipm.2004.06.005. (2005).

[2] Dong Hyun Jeong, Soo-Yeon Ji, William Ribarsky, Remco Chang. A state transition approach to understanding users' interactions. IEEE VAST 2011: 285-286. (2011).

[3] Soumyajit Deb and P. J. Narayanan. Design of a geometry streaming system. In Proc. ICVGIP, pages 296-30. (2004).

[4] Nien-Shien Lin, Ting-Hao Huang, and Bing-Yu Chen. 3D model streaming based on JPEG 2000. IEEE Transactions on Consumer Electronics (TCE), 53(1). (2007).

[5] William J. Schroeder, Jonathan A. Zarge, and William E. Lorensen. Decimation of triangle meshes. SIGGRAPH Comput. Graph. 26, 2 (July 1992), 65-70. DOI=10.1145/142920.134010. (1992).

[6] Hugues Hoppe. Progressive meshes. In Proc. SIGGRAPH, pages 99-108. (1996).

[7] Wei Cheng. Streaming of 3D progressive meshes. In Proceedings of the 16th ACM international conference on Multimedia (MM '08). ACM, New York, NY, USA, 1047-1050. DOI=10.1145/1459359.1459570. (2008).

[8] Wei Cheng, Wei Tsang Ooi, Sebastien Mondet, Romulus Grigoras, and Geraldine Morin. Modeling progressive mesh streaming: Does data dependency matter? ACM Trans. Multimedia Comput. Commun. Appl. 7, 2, Article 10 (March 2011), 24 pages. DOI=10.1145/1925101.1925105. (2011).

[9] Jonathan D. Cohen and Dinesh Manocha. Model Simplification for Interactive Visualization. Visualization Handbook. 13 pages. Eds. Chris Johnson and Chuck Hansen. Elsevier Butterworth-Heinemann. Chapter 20, pp. 393-410. (2005).

[10] Huimin Ma, Tiantian Huang, Yanzhi Wang. Multi-resolution recognition of 3D objects based on visual resolution limits. Pattern Recognition Letters, Volume 31, Issue 3, 1 February 2010, Pages 259-266, ISSN 0167-8655, 10.1016/j.patrec.2009.08.015.

[11] Li Xin. Research on LOD Technology in Virtual Reality. Energy Procedia, Volume 13, 2011, Pages 5144-5149, ISSN 1876-6102, 10.1016/j.egypro.2011.12.142.
