View-Casting: View-based Stream Dissemination and Control for Multi-Party 3D Tele-immersive Environments

Klara Nahrstedt, University of Illinois at Urbana-Champaign

Joint work with
  UC Berkeley: Prof. Ruzena Bajcsy, Dr. Gregorij Kurillov, Dr. Sang-Jung Hack (3D Video)
  Illinois: Dr. Zhenyu Yang, Dr. Bin Yu, Dr. Jin Liang, Dr. Yi Cui, Wanmin Wu (QoS Management)
  NAIST Japan: Dr. Morihiko Tamai, Prof. Keiichi Yasumoto (Wii Interfaces)
  UC Berkeley/Illinois: Lisa Wymore, Renata Sheppard (Dance Choreography)

Outline

Motivation and Goals
Challenges
TEEVE Solution Details
  View Concept
  View Casting: view selection, forest-casting, BW allocation, delay-sensitive streaming
Experimental Results
Conclusion

Tele-immersive Environment

[Figure: tele-immersive session joining site 1, site 2, and site 3. Photo courtesy of Prof. Ruzena Bajcsy.]

Tele-Immersive Environment (TI) Abstraction

TI Application Model

[Figure: TI application pipeline: 3D camera, 3D capturing, transmission, 3D rendering, user view]

3D Camera Characteristics and their Challenges for Internet and Multimedia Systems

3D Camera
  RGB and depth yield 5 bytes per pixel
  Frame size: 640x480 pixels
  Real-time capturing and content creation (currently 10-15 fps, desired 30-60 fps)
  Each 3D camera needs one PC
  Need FireWire bus for cameras to get high throughput
3D Multi-camera Environment
  10-20 3D cameras per room
  Camera spacing issue
    Different spacing for physical activity than for collaborative conferencing
  180, 300, or 360 degree camera coverage

Data Model

3D frame for camera i at time t: f_{i,t}
  Each pixel in the frame carries color+depth data and can be independently rendered
Stream for camera i: VS_i = { f_{i,t1}, f_{i,t2}, … }
Macro-frame: F_t = { f_{1,t}, f_{2,t}, …, f_{n,t} }
TI stream: TI_site
Camera orientation: O_i

[Figure: the streams VS_1 … VS_n of one site (e.g., TI_UIUC) laid out over time; each column of frames f_{1,t} … f_{n,t} forms one macro-frame F_t1, F_t2, F_t3, …]
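The data model above can be sketched in a few lines of Python. This is only an illustration: the class names `Frame3D` and `MacroFrame` and the helper `build_macro_frames` are hypothetical and do not come from the TEEVE code base. The sketch assumes frames from different cameras carry comparable capture timestamps.

```python
from dataclasses import dataclass, field


@dataclass
class Frame3D:
    """One 3D frame f_{i,t}: camera index i, capture time t, payload size."""
    camera: int
    time: int
    size_bytes: int


@dataclass
class MacroFrame:
    """Macro-frame F_t: the frames of all cameras captured at the same time t."""
    time: int
    frames: list = field(default_factory=list)

    def size_bytes(self) -> int:
        return sum(f.size_bytes for f in self.frames)


def build_macro_frames(streams):
    """Group per-camera streams VS_i (lists of Frame3D) into macro-frames by time."""
    by_time = {}
    for stream in streams:
        for frame in stream:
            by_time.setdefault(frame.time, MacroFrame(frame.time)).frames.append(frame)
    return [by_time[t] for t in sorted(by_time)]


# Three cameras, three capture instants, uncompressed 640x480 frames at 5 bytes/pixel.
streams = [
    [Frame3D(camera=i, time=t, size_bytes=640 * 480 * 5) for t in range(3)]
    for i in range(1, 4)
]
macro_frames = build_macro_frames(streams)
print(len(macro_frames), macro_frames[0].size_bytes())
```

Grouping by timestamp mirrors the definition of F_t as the set of frames captured at one time instance; a TI stream for a site is then simply the ordered list of macro-frames.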

Implications: 4D Content Challenge

Deal with
  Huge TI frame size F_t
    Macro-frame (e.g., 10 3D frames in one time instance)
  TI frame rate
    Applications desire 15-60 macro-frames per second
  Huge uncompressed bandwidth
    One 3D stream S_i: 640x480 (pixels/frame) x 5 (bytes/pixel) x 15 (frames/sec) = 23,040,000 bytes/sec (approx. 23 MBytes/sec) = 184 Mbps
    One TI stream TI_room (e.g., 10 3D streams): 230,400,000 bytes/sec (approx. 230 MBytes/sec = 1.84 Gbps)
    Bandwidth grows in proportion to the frame rate, so higher frame rates make it explode
  Slow 3D reconstruction, 3D modeling, 3D rendering
    71 ms per frame in 3D reconstruction, …
Need
  3D compression
  Parallel processing
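The bandwidth figures on this slide follow directly from the frame parameters. A small sketch of the arithmetic, using hypothetical helper names, makes it easy to see how the numbers scale with frame rate and camera count:

```python
WIDTH, HEIGHT = 640, 480      # pixels per frame
BYTES_PER_PIXEL = 5           # RGB plus depth


def stream_bytes_per_sec(fps, width=WIDTH, height=HEIGHT, bpp=BYTES_PER_PIXEL):
    """Uncompressed rate of one 3D stream in bytes per second."""
    return width * height * bpp * fps


def to_mbps(bytes_per_sec):
    """Convert bytes per second to megabits per second (10^6 bits)."""
    return bytes_per_sec * 8 / 1e6


one_stream = stream_bytes_per_sec(fps=15)
ti_room = 10 * one_stream     # e.g., 10 3D streams in one room

print(one_stream, to_mbps(one_stream))   # bytes/sec and Mbps for one stream
print(ti_room, to_mbps(ti_room) / 1000)  # bytes/sec and Gbps for the room
print(to_mbps(stream_bytes_per_sec(fps=30)))  # one stream at 30 fps
```

Doubling the frame rate to 30 fps doubles every figure, which is the "explosion" the slide warns about.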

Multi-Party 3D Tele-Immersive System

[Figure: multi-party deployment including UC Berkeley; legend: D = display, C = camera, G = service gateway]

Network Characteristics and their Challenges on Internet and Multimedia Systems

Heterogeneous networks/mismatch in bandwidth (LAN/WAN)
  LANs with 100 Mbps Ethernet cards and a Gigabit Ethernet switch
  WAN: Internet2 network with a 100 Mbps bottleneck link between UIUC and UC Berkeley
  Internet2 bandwidth varies with the time of day
TCP/IP protocol stack, used because of its large packet sizes and hence smaller context switching overhead
No real-time/QoS capabilities in Internet2 or in the OS running the TCP/IP stack

Implications: Timing Challenges

[Figure: macro-frame F_i sent at Tsend(i) across the Internet WAN]

Need Multi-stream Coordination
• Synchronization of multiple streams
• Synchrony at frame capturing time
• Spacing at F_j's sending time
• Jitter control at F_j's receiving time
• Skew control at rendering/display time

Deal with
• limited buffering at IP routers
• TCP/IP retransmission delays
• different end-to-end (LAN/WAN) delays

Implications: Distributed Operating System Challenges

Deal with
  Large scale of cameras/displays
    Camera/display PCs are distributed over LAN and WAN networks
    PCs are multi-core
  Real-time processing (3D reconstruction, rendering)
    Consider 5-15 3D streams with a frame rate of 10 fps, rendered into one 4D stream at 10 fps
    Run rendering station at 50-150 fps with 640x480 pixels per frame
Need
  Configuration/coordination of various 3D/4D services in a flexible and effective manner
  Software synchronization protocols with approximate hardware accuracy
  Portable and easily deployable distributed TI/OS services
  Low-maintenance OS services
  Real-time OS services on multi-core architectures

Implications: Networking Challenges

Deal with
  Large bandwidth demand
    One 3D stream: 23 MBytes/sec
    One TI (e.g., 10 3D streams): 230 MBytes/sec
  Real-time communication for interactivity
    80-100 millisecond interactive delay
Need
  Multi-stream coordination
    Due to large bursts at routers, need packet spacing
  Smart delay/bandwidth management over best-effort Internet
  Synchronization of multiple streams
    Need to send all streams within a constrained time interval so that the frame can be rendered in a timely fashion
  Dealing with scale
    Multi-party / many-to-many multicast forest protocols (camera arrays, participants, microphone arrays, …)

TEEVE Architecture

Network Model

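[Figure: sites with 3D displays (D) and 3D cameras (C), each attached to a service gateway (SG); the gateways communicate over the Internet and are coordinated by a session controller]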

TEEVE Service Middleware

Tele-immersive Application
Service Middleware
  Stream Selection
  Bandwidth Allocation/View-Casting
  Streaming Control
Networking

[Figure: 3D streams f_{1,t} … f_{N,t} are grouped into 3D macro-frames F_t1, F_t2, …, F_tm, passed through the Service Middleware Layer and the Networking Layer (Internet2), and delivered to the 3D renderer]

Data Input and Output

[Figure: data flow among Stream Selection, Bandwidth Allocation (View-Cast), Streaming Control, and the Renderer. Inputs: user view and camera view. Signals exchanged: selected streams, bandwidth allocation scheme, bandwidth estimation, and frame rate feedback from the network.]

Semantic Multi-Camera Stream Management

View-based Forest Dissemination and Control

Multi-Camera Stream Selection (1)

Determine which streams are of interest given the current user view

[Figure: the color portion of a macro-frame. Photo courtesy of Sang-Hack Jung.]

Multi-Camera Stream Selection (2)

View-based approach
  Free view selection is a key feature of 3D video
  User selects view
  Reasonable to use the user view as the major QoS control feedback
  Adaptation based on the contribution of each 3D stream to the current user view
  Metric: contribution factor based on the angle between the camera and the user view

View-aware Stream Differentiation (3)

[Figure: TI pipeline (3D camera, 3D capturing, transmission, 3D rendering, user view), separating streams contributing more to the user view from less important streams; the angle θ lies between the user viewing direction and the camera direction. Source: http://zing.ncsl.nist.gov/~gseidman/vrml/]

Stream Selection (4)

[Figure: angle θ between the user viewing direction and the camera direction. Source: http://zing.ncsl.nist.gov/~gseidman/vrml/]

View Model

View-aware Stream Differentiation

Differentiation function

df(s, v) = s.fov · v.w (dot product of unit vectors)

When s.fov and v.w are unit vectors, the dot product gives cos θ, where θ is the angle between s.fov and v.w.

Therefore, if s.fov and v.w are close, df(s, v) is close to 1. Otherwise, df(s, v) goes to -1.

Select streams with df > T (T being the threshold)

of: optimization function for the selected view across multiple parties

$\mathit{of}(S', v) = \sum_{s \in S'} \mathit{df}(s, v)$
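A minimal sketch of the differentiation function and threshold-based selection follows. The camera layout, the threshold value, and the function names (`unit`, `select_streams`) are illustrative assumptions; only `df` and `of` mirror the definitions on the slide.

```python
import math


def unit(v):
    """Normalize a vector so that the dot product equals cos(theta)."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def df(s_fov, v_w):
    """Differentiation function: dot product of the unit camera and view vectors."""
    a, b = unit(s_fov), unit(v_w)
    return sum(x * y for x, y in zip(a, b))


def select_streams(cameras, v_w, threshold):
    """Return {stream: df} for every stream whose df exceeds the threshold T."""
    scores = {name: df(fov, v_w) for name, fov in cameras.items()}
    return {name: s for name, s in scores.items() if s > threshold}


def of(selected):
    """Optimization function: sum of df over the selected stream set S'."""
    return sum(selected.values())


# Hypothetical 2D layout: cameras facing 0, 45, 90, and 180 degrees.
cameras = {
    f"cam{deg}": (math.cos(math.radians(deg)), math.sin(math.radians(deg)))
    for deg in (0, 45, 90, 180)
}
view = (1.0, 0.0)             # user looks along the x axis
selected = select_streams(cameras, view, threshold=0.5)
print(sorted(selected), of(selected))
```

With T = 0.5, only cameras within 60 degrees of the viewing direction survive, since cos 60° = 0.5.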

Differentiating Factor Metric (df)

[Figure: TI pipeline (3D camera, 3D capturing, transmission, 3D rendering) with a user/node's view request; streams S4 and S5, whose s.fov vectors are drawn, contribute more to the user view than the less important streams]

df(S4, v) = s.fov(S4) · v.w
df(S5, v) = s.fov(S5) · v.w
of(S, v) = df(S4, v) + df(S5, v)

ViewCast (VC) – Problem Formulation

Minimum Quality Problem
  Satisfy a minimum $\mathit{of}(R_i, v_i)$ for every view $v_i \in V$
  Subject to system constraints
  NP-hard
Maximum Quality Problem
  Maximize $\sum_i \mathit{of}(R_i, v_i)$
  Subject to system constraints
  NP-hard

VC – General Guideline

Minimum quality guarantee: each renderer should receive a minimum set of streams if possible
View change resilience: when a user view is changed, the impact on other affected nodes should be minimized

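Why is view change a problem? (Research: ongoing)

[Figure: a session controller and nodes with views U2.w and U3.w exchanging stream sets such as S2, S3 and S7, S8; a view change at one node can turn a downstream node into a victim]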

Bandwidth Allocation (1)

[Figure: stream selection feeds bandwidth allocation, which maps each selected stream VS_i with frame size fs to an allocated size A_i, trading off quality]

Frame size allocation
  TFS: target macro-frame size estimated by the underlying streaming control
  Given macro-frame F_t and TFS, allocate the frame size (A_i) for each frame (f_{i,t})
Priority scheme based on CF (contribution factor)
  Frame size A_i is allocated in descending order of CF
  fs: frame size of a 3D frame
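One simple way to realize a CF-ordered priority scheme is a greedy pass over the frames. The sketch below is an assumption about how such a scheme could look, not the allocation rule used in TEEVE: it gives each frame its full size fs while budget remains and truncates the frame that exhausts TFS. The function name and tuple layout are hypothetical.

```python
def allocate_frame_sizes(frames, tfs):
    """Greedy allocation in descending order of contribution factor (CF).

    frames: list of (stream_id, cf, fs) tuples, fs being the full frame size.
    tfs:    target macro-frame size (same unit as fs).
    Returns {stream_id: A_i}, where A_i <= fs and sum(A_i) <= tfs.
    """
    remaining = tfs
    allocation = {}
    for stream_id, cf, fs in sorted(frames, key=lambda f: f[1], reverse=True):
        a = min(fs, remaining)   # assumed rule: full size until the budget runs out
        allocation[stream_id] = a
        remaining -= a
    return allocation


frames = [("VS1", 0.9, 100), ("VS2", 0.4, 100), ("VS3", 0.7, 100)]
allocation = allocate_frame_sizes(frames, tfs=250)
print(allocation)
```

In this example the two streams with the highest CF keep their full size and the least important stream absorbs the whole reduction.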

Streaming Control (1)

Basic end-to-end protocol
  Sender
    receives bandwidth allocation
    generates token = ((VS_1, A_1), (VS_2, A_2), …, (VS_N, A_N))
    passes token along camera clusters
  Receiver
    maintains periodic timer for rendering at fixed interval
    monitors the deadline of each macro-frame
    when the last frame of the current macro-frame is received at time T
      calculates the difference between T and the deadline
      sends the difference back to the sender
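The sender and receiver roles can be sketched as follows. The names `make_token` and `Receiver`, the millisecond time unit, and the choice to report deadline minus T (so that a positive value means the macro-frame arrived early) are assumptions made for illustration.

```python
def make_token(allocation):
    """Sender side: token = ((VS_1, A_1), (VS_2, A_2), ..., (VS_N, A_N))."""
    return tuple(allocation.items())


class Receiver:
    """Receiver side: fixed-interval rendering deadlines and feedback."""

    def __init__(self, start_ms, interval_ms):
        self.interval_ms = interval_ms
        self.deadline_ms = start_ms + interval_ms

    def on_macro_frame_complete(self, t_ms):
        """Called when the last frame of the current macro-frame arrives at t_ms.

        Returns the difference (deadline - T) to send back to the sender,
        then advances the deadline by one rendering interval.
        """
        difference = self.deadline_ms - t_ms
        self.deadline_ms += self.interval_ms
        return difference


token = make_token({"VS1": 100, "VS3": 100, "VS2": 50})
receiver = Receiver(start_ms=0, interval_ms=100)
early = receiver.on_macro_frame_complete(80)    # finished before the 100 ms deadline
late = receiver.on_macro_frame_complete(230)    # finished after the 200 ms deadline
print(token, early, late)
```

A negative difference signals that the macro-frame missed its rendering deadline, which is the cue for the sender to shrink the next one.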

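[Figure: streaming control feedback loop. The macro-frame (after adaptation) crosses the network and completes at time T; streaming control compares T with the deadline and feeds the error back to bandwidth allocation.]

e(t) = deadline − T

$TFS_t = k_p \, e(t) + k_i \sum_{j \le t} e(j)$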
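A PI controller of this form takes only a few lines. The sketch below follows the relation TFS = k_p e(t) + k_i Σ e(j) with e(t) = deadline − T; the class name and the gain values are hypothetical and would need tuning on a real testbed.

```python
class PIController:
    """Derives the target macro-frame size (TFS) from deadline feedback."""

    def __init__(self, kp, ki):
        self.kp = kp              # proportional gain (assumed value)
        self.ki = ki              # integral gain (assumed value)
        self.error_sum = 0        # running sum of e(j) up to the current step

    def update(self, deadline, t_complete):
        """Return the new TFS given the deadline and the completion time T."""
        e = deadline - t_complete
        self.error_sum += e
        return self.kp * e + self.ki * self.error_sum


controller = PIController(kp=1000, ki=500)   # bytes per millisecond of error
targets = [
    controller.update(deadline=100, t_complete=80),    # e = 20
    controller.update(deadline=200, t_complete=190),   # e = 10
    controller.update(deadline=300, t_complete=305),   # e = -5
]
print(targets)
```

The target shrinks as the slack before the deadline disappears. A production controller would presumably also clamp TFS to a valid range, which this sketch omits.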

Experimental Examples of Tele-immersive Environments

[Figure: TI in UIUC (Prof. Nahrstedt) and TI in UC Berkeley (Prof. Bajcsy), connected over Internet2]

Experimental Setup (TEEVE)

Metrics
  Overall throughput of macro-frame F_j
  Completion time interval for macro-frame F_j
  Individual throughput of stream i
  End-to-end delay of a macro-frame F_j
Experimental parameters of remote testbed (UIUC-UCB)
  Number of sender gateways: 2
  Number of receiver gateways: 2
  Number of 3D streams: 12
Equipment
  Dell Precision 450 (dual Xeon processor with 1 GByte memory), running Fedora Core 2
  LAN 100 Mbps and Internet2 between UCB and UIUC

Timing Performance Validation

Macro-Frame Delay (End-to-end Delay – UIUC/UCB)

Macro-frame Completion Interval at the receiver side

Sender rate of each cluster camera – 5-8fps

Continuity Test

average: 70.99 ms (frame rate: 14 Hz)

Performance of 3D Reconstruction

Resolution of each 3D frame is 320x240 pixels

Bandwidth Performance between UIUC/UCB on Internet2 during Dance Experiment

average: 44.5 Mbps

Evaluation

TFS reduces

Dancer’s Perceptual Evaluation

Experiment Setup: Simulated 3DTI session

topology: mesh (connected)
# of vertices (n): 5, 6, …, 10
# of edges (CR): 25%, 50%, 75%, 100% of n(n-1)
degree bound (I_i, O_i): 12, 24, 36
streams/vertex (|S_i|): 8
max. streams/view: 4
simulation time: 200 minutes
view change interval: normal distribution (μ = 60 s, σ = 10 s)
view change pattern: (a) random walk with average degree change of 30, (b) Zipf (n = 10)
delay bound (B): 200 ms
edge delay: normal distribution (μ = 80 ms, σ = 20 ms)
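To make the workload parameters concrete, the sketch below draws view change times and edge delays from the listed distributions. It is not the simulator used for the experiments: the function names are hypothetical, negative samples are clipped to zero, and the random edge choice does not enforce the connectivity that the setup requires.

```python
import random


def sample_view_change_times(sim_seconds, mean_s=60.0, std_s=10.0, rng=None):
    """Times (in seconds) at which one user changes view during the simulation."""
    rng = rng or random.Random()
    t, times = 0.0, []
    while True:
        t += max(0.0, rng.gauss(mean_s, std_s))
        if t >= sim_seconds:
            return times
        times.append(t)


def sample_edge_delays(n, connect_ratio, mean_ms=80.0, std_ms=20.0, rng=None):
    """Pick CR * n(n-1) directed edges and assign each a delay in milliseconds."""
    rng = rng or random.Random()
    pairs = [(u, v) for u in range(n) for v in range(n) if u != v]
    edges = rng.sample(pairs, round(connect_ratio * len(pairs)))
    return {edge: max(0.0, rng.gauss(mean_ms, std_ms)) for edge in edges}


rng = random.Random(0)                       # fixed seed for repeatability
changes = sample_view_change_times(200 * 60, rng=rng)
delays = sample_edge_delays(n=5, connect_ratio=0.5, rng=rng)
print(len(changes), len(delays))
```

Over 200 minutes with a 60 s mean interval, each user changes view roughly 200 times, which gives a sense of how often the dissemination forest must be reconfigured.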

Experiment Setup

Counterpart for comparison: MC–n
  A multicast-based scheme where n streams (n = 1, 2, 3, or 4) must be retrieved to serve a view. Otherwise, the view request is rejected.
  No preemption.

[Figure: Rejection Ratio, CR = 100%, I_i (O_i) = 12; curves for MC–4, MC–3, MC–2, MC–1]

[Figure: Rejection Ratio, CR = 100%, I_i (O_i) = 24; curves for MC–4, MC–3, MC–2, MC–1]

[Figure: Streams/View, CR = 100%, I_i (O_i) = 24]

Average of 3.2 streams per view: better than MC–3 performance, and with a 22% lower rejection ratio

Conclusion

Question: Shall we dance …. across thousands of miles?

Answer: We can waltz, but not tango yet!!!


Acknowledgement

This work is being continuously supported by the National Science Foundation

Streaming Control (2)

Bandwidth estimation
  Lots of schemes proposed (e.g., work by Dovrolis)
  Bandwidth estimation based on PI-controller
    (deadline − T) is used as the feedback e(t) in the PI-controller to derive a suitable target frame size (i.e., TFS)

Program Flow

1. Initializing data structures, PTG cameras, sockets
2. Establishing connections with trigger and gateway
3. Grabbing background images
4. Main grabbing and processing loop (new code using OpenMP)
   4.1. Waiting for trigger
   4.2. Grabbing images
   4.3. Pre-processing: median filter, background subtraction (new code using OpenCV functions, e.g., cvBacSub)
   4.4. Computing depth (new code computing triangulation)
   4.5. Building data packets and sending them to the gateway
