Prediction-based Prefetching to Support VCR-like Operations in Gossip-based P2P VoD Systems
Tianyin Xu, Weiwei Wang, Baoliu Ye, Wenzhong Li, Sanglu Lu, Yang Gao
Nanjing University
Dislab, NJU CS
Outline
Background
• P2P VoD streaming; gossip-based systems; VCR-like interactive behavior
Motivation
Solutions
• System architecture; prefetching model; data scheduling; VCR-like operation support
Performance Evaluation
Conclusions
Background (1)
P2P media streaming
• Everyone can be a content producer/provider.
• Cache-and-relay mechanism: peers actively cache media content and relay it to other peers that are expecting it.
P2P live streaming is very successful!
– CoolStreaming (INFOCOM’05)
– PPLive, Joost
Background (2)
P2P VoD streaming is challenging!
• Provide free access to any segment of the video at any time through VCR-like operations.
VCR-like (Video Cassette Recorder) operations
• Random seek, pause, fast forward/backward (FF/FB)
For VCR-like operations, the “jump” process is the most important.
• Most VCR-like operations can be implemented by “jump”.
– Random seek and pause: 1 jump; FF/FB: a series of jumps
Motivation (1)
How to support the “jump”?
Optimizing the index overlay to realize fast segment relocation
• Jump => locate-and-download process
• Necessary, but far from sufficient
Prediction-based prefetching
• Expect a zero jump delay
• Proactively prefetch segments that are likely to be requested by future VCR-like operations
• Relies on prediction accuracy
Question: Is the prediction feasible?
User Access Patterns (1)
Users rarely view a movie from beginning to end.
The total playing time of a user is quite limited and tends to be short.
Some popular segments (called highlights) attract more user requests than non-popular segments.
Brampton et al., NOSSDAV-2007 Zheng et al., P2PMMS-2005
User Access Patterns (2)
Probability distribution of object and segment popularity
• Log-normal distribution
• Zipf distribution
Brampton et al., NOSSDAV-2007 Yu et al., EUROSYS-2006
User Access Patterns (3)
Fast Forward is more frequent than Fast Backward. Short Jump is more frequent than Long Jump.
Cheng et al., IPTPS-2007
Brampton et al., NOSSDAV-2007
Motivation (2)
Our Objective: Effective Prediction-based Prefetching Scheme
Effective prediction model
• Based on user access patterns
Easy to integrate into current P2P VoD systems
Practical data scheduling
System Architecture (1)
Solution 1: Let the server do prediction for each user [1]
• Pro: the server has large volumes of user viewing logs
• Con: poor scalability
Solution 2: Let the clients exchange user logs and do prediction [2]
• Pro: scalable
• Cons: 1. lack of large volumes of user logs; 2. high computing cost and training time
[1] Huang et al, “A User-Aware Prefetching Mechanism for Video Streaming”, WWW-2003
[2] He et al, “VOVO: VCR-Oriented Video-On-Demand in Large-Scale Peer-to-Peer Networks”, TPDS-2009
Our solution:
• Server side: offline pattern mining => prediction model
• Peer side: lightweight online prediction
System Architecture (2)
Take full advantage of the tracker
• The tracker has a large volume of user viewing logs.
• Every node has to contact the tracker to join the system
– to initialize its neighbor and partner lists
Prediction Approach: Overview
Frequent Sequential Pattern Mining
• PrefixSpan [1]: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth
Splitting Video Segments into Abstract States
Mapping User Logs to Abstract States
Construct Contingency Table (CCT)
Model Utilization
[1] Pei et al., “Mining Sequential Patterns by Pattern Growth: The PrefixSpan Approach”, TKDE-2004.
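The mining step can be illustrated with a minimal PrefixSpan-style sketch. It assumes each user log is a plain list of segment IDs (one item per element), and the function name `prefixspan` and the parameter `min_support` are illustrative choices, not names from the slides. It is a teaching sketch of prefix-projected pattern growth, not the authors' implementation.

```python
def prefixspan(sequences, min_support):
    """Return {pattern tuple: support} for all frequent sequential patterns.

    Assumption: every sequence is a list of single items (segment IDs).
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start offset) pairs, i.e. the
        # suffixes that remain after matching the current prefix.
        counts = {}
        for sid, start in projected:
            seen = set()
            for item in sequences[sid][start:]:
                if item not in seen:  # count each item once per sequence
                    seen.add(item)
                    counts[item] = counts.get(item, 0) + 1
        for item, support in counts.items():
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support
            # Project every suffix past the first occurrence of `item`.
            new_projected = []
            for sid, start in projected:
                seq = sequences[sid]
                for pos in range(start, len(seq)):
                    if seq[pos] == item:
                        new_projected.append((sid, pos + 1))
                        break
            mine(pattern, new_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Toy viewing logs (hypothetical data).
logs = [[1, 2, 3], [1, 3], [2, 3]]
patterns = prefixspan(logs, min_support=2)
```

With the toy logs above, `(1, 3)` and `(2, 3)` are frequent, while `(1, 2)` is not because it appears in only one log.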
Prediction Approach (1)
Frequent Sequential Patterns
Prediction Approach (2)
Sequential patterns found may overlap
• e.g. <1,2,3,4,5,6,7> and <5,6,7,8,9,10,11,12>
Splitting approach
• Filter out the sub-patterns
– e.g. <1,2,3,4>, <1,2,3,4,5>, <1,2,3,4,5,6> are sub-patterns of <1,2,3,4,5,6,7>
• Scan over the remaining sequential patterns and cut them into non-overlapping intervals
– e.g. <1,2,3,4,5,6,7> and <5,6,7,8,9,10,11,12> => [1,7], [8,12]
• Treat segments that do not appear in any mined sequential pattern as separate intervals
• Split the contiguous intervals into intervals of appropriate granularity (states)
– MIN, MAX
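The splitting steps above can be sketched as follows. Several things here are assumptions rather than details from the slides: a pattern is treated as a set of segment IDs when checking for sub-patterns, an earlier pattern keeps the segments it shares with a later one, segments are numbered from 1, and only the MAX bound is applied (merging intervals shorter than MIN is left out). All function names are hypothetical.

```python
def filter_subpatterns(patterns):
    """Drop every pattern whose segments are all contained in another pattern."""
    sets = [frozenset(p) for p in patterns]
    kept = []
    for i, s in enumerate(sets):
        dominated = any(
            s < t or (s == t and j < i)  # strict subset, or an earlier duplicate
            for j, t in enumerate(sets)
            if j != i
        )
        if not dominated:
            kept.append(patterns[i])
    return kept


def cut_into_intervals(patterns, total_segments):
    """Cut patterns into non-overlapping (lo, hi) intervals.

    Runs of segments covered by no pattern become intervals of their own.
    """
    owner = {}  # segment id -> index of the first pattern that claims it
    for idx, pattern in enumerate(patterns):
        for seg in pattern:
            owner.setdefault(seg, idx)
    intervals = []
    start = 1
    for seg in range(2, total_segments + 2):
        # Close the current run when the owner changes or the video ends.
        if seg > total_segments or owner.get(seg) != owner.get(start):
            intervals.append((start, seg - 1))
            start = seg
    return intervals


def split_by_granularity(intervals, max_len):
    """Split intervals longer than max_len into chunks of at most max_len."""
    states = []
    for lo, hi in intervals:
        while hi - lo + 1 > max_len:
            states.append((lo, lo + max_len - 1))
            lo += max_len
        states.append((lo, hi))
    return states


mined = [
    [1, 2, 3, 4],
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5, 6, 7],
    [5, 6, 7, 8, 9, 10, 11, 12],
]
remaining = filter_subpatterns(mined)
intervals = cut_into_intervals(remaining, total_segments=15)
states = split_by_granularity(intervals, max_len=5)
```

For a hypothetical 15-segment video this yields the intervals [1,7], [8,12], [13,15], and with a MAX of 5 the first interval is split into [1,5] and [6,7].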
Prediction Approach (3)
Map raw user logs into state transitions <s,s’>
• e.g. <1,2,3,4,5,6,7,8,9,10> maps to [1,6][7,13]
Transition table construction
• Simple frequency counting
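A minimal sketch of the mapping and counting steps, assuming states are contiguous intervals identified by their sorted start segments (`state_starts`) and that consecutive segments falling in the same state are collapsed into one visit. The helper names and the `predict_next` lookup are illustrative, not taken from the slides.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def state_of(segment, state_starts):
    """Index of the state whose interval contains `segment`."""
    return bisect_right(state_starts, segment) - 1


def to_state_sequence(log, state_starts):
    """Map a segment log to states, collapsing consecutive repeats."""
    seq = []
    for seg in log:
        s = state_of(seg, state_starts)
        if not seq or seq[-1] != s:
            seq.append(s)
    return seq


def build_transition_table(logs, state_starts):
    """Count how often state s is followed by state s' across all logs."""
    table = defaultdict(Counter)
    for log in logs:
        seq = to_state_sequence(log, state_starts)
        for s, s_next in zip(seq, seq[1:]):
            table[s][s_next] += 1
    return table


def predict_next(table, state, k=1):
    """Return up to k most frequent successor states of `state`."""
    return [s for s, _ in table.get(state, Counter()).most_common(k)]


# States [1,6], [7,13], [14,...] are described by their start segments.
state_starts = [1, 7, 14]
logs = [list(range(1, 11)), [2, 3, 15, 16], [4, 5, 8]]
table = build_transition_table(logs, state_starts)
```

Here the log <1,...,10> becomes the single transition from state 0 ([1,6]) to state 1 ([7,13]), and `predict_next(table, 0)` returns state 1 because it follows state 0 in two of the three logs.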
Data Scheduling
Two-stage scheduling strategy:
Stage 1: fetch urgent segments into the playback buffer
• Guarantee the continuity of normal playback
• Urgent line mechanism [1]
Stage 2: prefetch based on prediction
• Reduce jump latency
• Utilize residual bandwidth
[1] Li et al., “ContinuStreaming: Achieving High Playback Continuity of Gossip-based Peer-to-Peer Streaming”, IPDPS-2008.
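One way to picture the two stages is a per-round request planner. This sketch assumes a fixed per-round request budget standing in for available bandwidth and a simple window of segments ahead of the playback point standing in for the urgent line; it is not the urgent line mechanism of [1]. The function name and all parameters are hypothetical.

```python
def schedule_round(play_pos, urgent_window, have, predicted, budget):
    """Return the segment IDs to request this round, at most `budget` of them.

    play_pos      : segment currently being played
    urgent_window : number of segments ahead of play_pos treated as urgent
    have          : set of segment IDs already buffered locally
    predicted     : segment IDs suggested by the prediction model, best first
    """
    requests = []
    # Stage 1: missing segments just ahead of playback come first.
    for seg in range(play_pos, play_pos + urgent_window):
        if len(requests) == budget:
            return requests
        if seg not in have:
            requests.append(seg)
    # Stage 2: any residual budget goes to predicted segments.
    for seg in predicted:
        if len(requests) == budget:
            break
        if seg not in have and seg not in requests:
            requests.append(seg)
    return requests


plan = schedule_round(play_pos=10, urgent_window=4, have={10, 11},
                      predicted=[40, 41, 12], budget=3)
```

In this example segments 12 and 13 are urgent and missing, so they take two of the three slots and only one predicted segment (40) is prefetched.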
VCR-like Operation Support
The jump process caused by VCR-like operations:
• Case 1. The jump segment is already prefetched on the local peer => Just playback!!
• Case 2. The jump segment is cached on the partners’ buffer => download and playback!
• Case 3. Neither cached on the local peer nor cached by the partners => relocate, connect and download
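The three cases can be written as a small dispatch function. The data layout (a set of locally cached segment IDs and a dictionary mapping partner IDs to their buffered segments) and the returned action labels are assumptions made for illustration.

```python
def handle_jump(target, local_cache, partner_buffers):
    """Decide how to serve a jump to segment `target`.

    Returns (action, partner), where partner is set only for a download.
    """
    # Case 1: the segment was already prefetched locally.
    if target in local_cache:
        return ("play", None)
    # Case 2: some partner holds the segment in its buffer.
    for partner, buffered in partner_buffers.items():
        if target in buffered:
            return ("download", partner)
    # Case 3: nobody nearby has it; relocate in the overlay first.
    return ("relocate", None)


local_cache = {5, 6, 7, 42}
partner_buffers = {"peerA": {20, 21}, "peerB": {60, 61}}
```

Only the third case pays the full relocate, connect and download cost, which is why a higher prediction hit rate translates directly into lower jump latency.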
Simulation Settings
User log generation
• Modified GISMO [1]
– A log-normal distribution makes users tend to jump around hot scenes.
The simulation is built on a topology of 5000 peer nodes based on the transit-stub model generated by GT-ITM.
The streaming rate is S = 256 Kbps; the download bandwidth is randomly distributed in [1.5S, 5S].
The default size of the playback buffer is 30 Mbytes, i.e., each peer can cache the most recent 120 seconds of the stream (100 for playback, 20 for prefetching).
The arrival of peers follows a Poisson process with λ = 5.
[1] GISMO: A Generator of Internet Streaming Media Objects and Workloads
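The peer-level settings can be reproduced with a few lines of standard-library code. This sketch covers only peer arrivals and download bandwidth; it does not model GISMO or the GT-ITM topology. It assumes the bandwidth is drawn uniformly from [1.5S, 5S] and that λ is the arrival rate per unit of simulated time; the slides do not state the distribution or the time unit. The function name and the seed are illustrative.

```python
import random

STREAM_RATE_KBPS = 256  # S, the streaming rate from the settings above


def generate_peers(n, arrival_rate=5.0, seed=0):
    """Return a list of (arrival_time, download_bandwidth_kbps) tuples."""
    rng = random.Random(seed)
    t = 0.0
    peers = []
    for _ in range(n):
        # Inter-arrival times of a Poisson process are exponential.
        t += rng.expovariate(arrival_rate)
        # Assumption: bandwidth is uniform on [1.5S, 5S].
        bandwidth = rng.uniform(1.5, 5.0) * STREAM_RATE_KBPS
        peers.append((t, bandwidth))
    return peers


peers = generate_peers(5000)
```

Fixing the seed keeps runs repeatable, which makes it easier to compare prefetching strategies under the same arrival pattern.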
Performance Evaluation (1)
Performance Evaluation (2)
Performance Evaluation (3)
Performance Evaluation
Conclusions
A practical architecture that can be used in almost all existing P2P VoD systems
A novel and simple prediction approach
• State abstraction plays an important role
A two-stage data scheduling strategy
The End