Object Placement in Video Content Distribution Networks

Post on 22-Dec-2014

Mohammad Faraji, Kianoosh Mokhtarian

Department of Electrical and Computer Engineering, University of Toronto

December 2011

Background

8 years of video content added to YouTube every day

Terabytes a day; Petabytes a year

Trend expected to accelerate further

Higher-quality video streams (currently only 10% are HD)

Content distribution infrastructure

Several datacentres around the world

User request sent to closest datacentre (DNS/HTTP redirect)

Motivation

Store video files across datacentres (DCs)

Generously replicate all videos on DCs?

Not viable

Data volume grows much faster than storage cost falls

Good News from Measurement Studies

Popularity of video depends on geographical location

More than half of the time, only a fraction from the beginning of video is downloaded

=> Place (partial) video files in selected locations

Modeling

Input: history of user requests (video v for IP address i)

Distance of i to any of datacentres?

Use an Internet Coordinate System (ICS)

Delay(i, j) = Euclidean_distance[ ICS(i), ICS(j) ]

Make tracking of requests scalable

Cluster user IPs into regions in the Euclidean space of ICS

Popularity matrix P[region, video]

Distance matrix D[region, datacentre]
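
The modeling steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes ICS coordinates are already available as tuples, that region centres have already been chosen (for example by a clustering step not shown here), and that each request is mapped to its nearest region centre. The names `build_matrices`, `ics`, `region_centers`, and `dc_coords` are hypothetical.

```python
import math

def nearest(point, centers):
    """Index of the centre closest to `point` in the ICS space."""
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))

def build_matrices(requests, ics, region_centers, dc_coords, num_videos):
    """Build P[region][video] and D[region][datacentre].

    requests: list of (video_id, ip) pairs from the request history.
    ics: dict mapping ip -> coordinate tuple (assumed given).
    """
    P = [[0] * num_videos for _ in region_centers]
    for video, ip in requests:
        P[nearest(ics[ip], region_centers)][video] += 1
    # Delay is approximated by Euclidean distance between ICS coordinates.
    D = [[math.dist(c, dc) for dc in dc_coords] for c in region_centers]
    return P, D

# Toy data (made up for illustration only).
ics = {"10.0.0.1": (0.0, 0.0), "10.0.0.2": (0.0, 1.0), "10.0.0.3": (10.0, 10.0)}
requests = [(0, "10.0.0.1"), (1, "10.0.0.2"), (0, "10.0.0.3"), (0, "10.0.0.3")]
region_centers = [(0.0, 0.5), (10.0, 10.0)]
dc_coords = [(0.0, 0.5), (3.0, 4.5)]
P, D = build_matrices(requests, ics, region_centers, dc_coords, num_videos=2)
print(P)  # [[1, 1], [2, 0]]
```

Tracking requests per region rather than per IP keeps P at (number of regions) x (number of videos) entries regardless of how many users there are.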

Partial Video Files

First minute of video downloaded many more times

Store partial video files

More effective caching

Lower start-up delay

Partial popularity assumed independent of region

Download reports: (v, 1MB), (v, 2.3MB), (v, 0.5MB), ...

Compress into a few entries for each video (dynamic alg)

PP[v] = (0...1MB, 100 times), (1MB...end, 50 times)
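
One way to turn download reports into a few PP[v] entries is sketched below. The slides do not spell out the compression algorithm, so this sketch swaps in a simpler technique: fixed segment boundaries, with each segment counted once for every download that went past its start. The function name `compress_reports` and the boundary list are assumptions made for illustration.

```python
def compress_reports(sizes_mb, boundaries_mb):
    """Summarise prefix downloads of one video into per-segment counts.

    sizes_mb: downloaded prefix length (MB) of each report for the video.
    boundaries_mb: sorted segment start offsets, beginning with 0
                   (assumed fixed in advance; not from the slides).
    Returns a list of (start, end, count); end is None for "end of file".
    """
    entries = []
    for i, start in enumerate(boundaries_mb):
        end = boundaries_mb[i + 1] if i + 1 < len(boundaries_mb) else None
        # A report touches this segment if it downloaded past its start.
        count = sum(1 for s in sizes_mb if s > start)
        entries.append((start, end, count))
    return entries

# The three example reports above, split at 1 MB.
print(compress_reports([1.0, 2.3, 0.5], [0, 1.0]))
# [(0, 1.0, 3), (1.0, None, 1)]
```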

Problem Statement

Assign (part of) each video to one or more DCs

Minimize distance of video to user (region), given:

The distance matrix D[region, datacentre]

The expected download pattern P[region, video]

Partial popularity PP[video]

The storage limitation of each DC
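
To make the objective concrete, the sketch below evaluates one candidate placement. It is a simplification under stated assumptions: whole files only (partial popularity PP is ignored), every request is served by the closest DC holding the video, and the cost is demand-weighted distance. The names `total_delay` and `fits_storage` are hypothetical.

```python
def total_delay(P, D, placement):
    """Demand-weighted distance from regions to their nearest replica.

    P[r][v]: expected downloads of video v from region r.
    D[r][d]: distance from region r to datacentre d.
    placement[v]: set of datacentres storing video v (whole file, assumed).
    """
    total = 0.0
    for r, row in enumerate(P):
        for v, demand in enumerate(row):
            if demand == 0:
                continue
            if not placement[v]:
                raise ValueError(f"video {v} has demand but no replica")
            total += demand * min(D[r][d] for d in placement[v])
    return total

def fits_storage(placement, size, capacity):
    """Check the per-DC storage limits for a placement."""
    used = [0] * len(capacity)
    for v, dcs in enumerate(placement):
        for d in dcs:
            used[d] += size[v]
    return all(u <= c for u, c in zip(used, capacity))

P = [[1, 1], [2, 0]]
D = [[0, 5], [14, 9]]
placement = [{0, 1}, {1}]
print(total_delay(P, D, placement))                 # 23.0
print(fits_storage(placement, [2, 3], [2, 5]))      # True
```

The placement problem is then to pick, among placements that pass `fits_storage`, one with the smallest `total_delay`.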

Problem Hardness

Simpler alternatives

Store one video file on a few selected DCs

NP-Complete (min set cover, max coverage)

Store multiple video files on one DC

NP-Complete (knapsack)

Solution

Maintain a utility matrix U[v, d]

Utility of replicating "the next chunk of" video v on DC d

Auxiliary priority queues

1. Find the highest-utility video v*

2. Place the next chunk of v* on the best DC d*

3. Update row v* of U, and what the next chunk is for v*

Complexity: O[ (total video replicas) x (log[# videos] + log[# DCs] + log[max chunks/video]) ]
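
The greedy loop can be sketched as below. This is a simplified variant, not the authors' data structure: it uses a single lazily updated heap instead of a utility matrix with auxiliary queues, tracks the next chunk per (video, DC) pair, and assumes equal-size chunks. The utility definition (demand-weighted delay reduction), the `origin_delay` fallback for chunks not stored anywhere, and the `reach` fractions are all assumptions made for illustration.

```python
import heapq

def greedy_placement(P, D, reach, capacity, origin_delay):
    """Greedily place video chunks on datacentres.

    P[r][v]: demand; D[r][d]: delay; reach[v][c]: fraction of downloads of
    video v that reach chunk c; capacity[d]: chunk slots on DC d.
    Returns stored[d] = set of (video, chunk) pairs placed on DC d.
    """
    R, V, n_dc = len(P), len(P[0]), len(capacity)
    # best[v][c][r]: delay from region r to the nearest copy of chunk c of v.
    best = [[[origin_delay] * R for _ in reach[v]] for v in range(V)]
    stored = [set() for _ in range(n_dc)]
    free = list(capacity)

    def gain(v, c, d):
        # Utility: total delay saved by adding chunk c of video v to DC d.
        return sum(P[r][v] * reach[v][c] * max(0.0, best[v][c][r] - D[r][d])
                   for r in range(R))

    heap = [(-gain(v, 0, d), v, 0, d)
            for v in range(V) if reach[v] for d in range(n_dc)]
    heapq.heapify(heap)
    while heap:
        neg, v, c, d = heapq.heappop(heap)
        if free[d] == 0:
            continue                      # DC is full; drop this candidate
        g = gain(v, c, d)
        if g <= 0:
            continue                      # gains never grow back; drop
        if g < -neg:
            heapq.heappush(heap, (-g, v, c, d))   # stale entry; re-queue
            continue
        stored[d].add((v, c))             # place the chunk on DC d
        free[d] -= 1
        for r in range(R):
            best[v][c][r] = min(best[v][c][r], D[r][d])
        if c + 1 < len(reach[v]):         # the next chunk becomes a candidate
            heapq.heappush(heap, (-gain(v, c + 1, d), v, c + 1, d))
    return stored

P = [[10, 1], [0, 5]]
D = [[1, 8], [9, 2]]
reach = [[1.0, 0.5], [1.0]]
print(greedy_placement(P, D, reach, capacity=[1, 1], origin_delay=20))
# [{(0, 0)}, {(1, 0)}]
```

Lazy re-evaluation works here because a candidate's gain can only shrink as more replicas are placed, so a heap entry whose stored value is still current is guaranteed to be the best remaining move.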

Evaluation (in Progress): Data

File size and length of ~200K videos from [Cheng 2010]

Distances in Internet

Pairwise delay between 2500 nodes from [Wong 2005]

Video popularities

Global: Zipf-distributed (as repeatedly reported)

Local: synthetic

Partial video popularities

Generated according to [Qiu 2010]
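
As an illustration of the global popularity input, the sketch below draws request counts from a Zipf-like distribution over video ranks. The exponent `alpha`, the seed, and the function name `zipf_request_counts` are assumptions for illustration; the slides do not state the parameters used in the evaluation.

```python
import random

def zipf_request_counts(num_videos, total_requests, alpha=1.0, seed=0):
    """Sample per-video request counts with popularity ~ 1 / rank**alpha."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, num_videos + 1)]
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    counts = [0] * num_videos
    for v in rng.choices(range(num_videos), weights=weights, k=total_requests):
        counts[v] += 1
    return counts

counts = zipf_request_counts(num_videos=100, total_requests=10_000)
print(counts[:5], sum(counts))
```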

Evaluation (in Progress): Results

Total delay, given our placement

Delay with and without partial file storage

Comparison to simple threshold-based distributed caching

Running time

Estimated communication overhead

Take-Away

Benefits of storing partial video files on selected DCs

Future work

Several further details for a complete working system ...

Low-overhead collection of (sub-samples of) downloads

Estimate near-future download patterns

Carefully cluster users into a limited number of regions

Solving video placement by multiple nodes

Incremental algorithm; can't shuffle everything every night

Appendix: Previous Work

Cooperative web caching

Hierarchical, distributed, hybrid

CDN design (various flavors)

Video caching

On a single cache

To optimize for VCR-like functions