Evaluating Content Management Techniques for Web Proxy Caches
Internet Server
Evaluating Content Management Techniques for Web Proxy Caches
Cho Joon-ho (CA Lab, CS department, KAIST)
2001. 11. 6
Martin Arlitt, Ludmila Cherkasova, John Dilley, Rich Friedrich and Tai Jin (Hewlett-Packard Laboratories)
(in 2nd Workshop on Internet Server Performance, in conjunction with ACM SIGMETRICS 99)
Evaluating Content Management Tech for Web Proxy Caches 2 / 20
Agenda
- Problems
- Quick Tour (Summary)
- Critique
- Design & Design Rationale
  - Data Collection and Reduction
  - Key Workload Characteristics
  - Experimental Design
- Simulation Results
- Virtual Cache
Problems
- Current Web proxy caches use simple replacement policies
  - Relatively low hit rates
  - Additional delays
- So what?
  - Developing a quantitative understanding of Web traffic
    - How effective are current proxy cache replacement policies for real workloads?
    - Focus on two performance metrics
      - Hit rate
      - Byte hit rate
  - Designing new replacement policies that
    - Utilize frequency for higher performance
    - Are neither susceptible to cache pollution nor require parameterization
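The two metrics named on this slide can be illustrated with a minimal sketch. The function name `hit_rates` and the `(size_bytes, was_hit)` input shape are assumptions made for illustration only; they are not taken from the paper.

```python
def hit_rates(outcomes):
    """Return (hit rate, byte hit rate) for a sequence of request outcomes.

    `outcomes` is a hypothetical input format: one (size_bytes, was_hit)
    pair per client request.
    """
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("no requests to evaluate")
    total_requests = len(outcomes)
    total_bytes = sum(size for size, _ in outcomes)
    hits = sum(1 for _, was_hit in outcomes if was_hit)
    hit_bytes = sum(size for size, was_hit in outcomes if was_hit)
    # Hit rate counts requests; byte hit rate weights each request by its size.
    return hits / total_requests, hit_bytes / total_bytes


# Three small hits and one large miss: high hit rate, low byte hit rate.
example = [(1000, True), (1000, True), (1000, True), (97000, False)]
hit_rate, byte_hit_rate = hit_rates(example)
```

In this example the hit rate is 0.75 while the byte hit rate is only 0.03, which is why a policy can look good on one metric and poor on the other.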
Quick Tour (Summary) – 1/3
- The problems of existing studies
  - They use short-term traces of busy proxies or long-term traces of relatively inactive proxies
  - Long-term traces in busy environments are needed
- Trace-driven simulation
  - A total of 117,652,652 requests were collected over five months
  - A smaller, more compact log is used
- The points to be considered
  - Object size
  - Recency of reference
  - Frequency of reference
  - Turnover
Quick Tour (Summary) – 2/3
- Existing replacement policies
  - LRU (Least-Recently-Used)
  - Size: replaces the largest object
  - GD-Size (GreedyDual-Size): replaces the object with the lowest utility
    - K_i = C_i / S_i + L
  - LFU: replaces the least frequently used object
- New replacement policies
  - GDSF (GreedyDual-Size with Frequency): GD-Size + a frequency factor
    - K_i = F_i * C_i / S_i + L
  - LFU-DA (Least Frequently Used with Dynamic Aging): LFU-Aging + a dynamic mechanism (running age L)
    - K_i = C_i * F_i + L
- Virtual Caches
  - Logically partitions the cache into N virtual caches
Quick Tour (Summary) – 3/3
Analysis of Virtual Cache Performance; VC0 using GDSF-Hits, VC1 using LFU-DA
Comparison of Proposed Policies to Existing Replacement Policies
Critique
- Pros
  - Quantitative understanding of Web traffic
    - Long-term trace-driven simulation of busy proxy servers
  - Provides two new replacement algorithms that run efficiently
  - Provides a new cache management method, ‘Virtual Cache’
- Cons
  - Not novel
  - No consideration of dynamic data
  - No consideration of the processing overhead of these more complex algorithms
  - Performance improvements are insignificant
Data Collection and Reduction
- Data collection
  - Long-term trace-driven simulation
  - A total of 117,652,652 requests were handled over a five-month period
  - Data include
    - Client IP address, request time, response status, the time required for the proxy to complete its response, …
- Data reduction
  - Smaller, more compact log
    - Due to storage constraints
    - To ensure that analyses and simulations could be completed in a reasonable amount of time
  - Reduction by
    - Storing data in a more efficient manner
    - Removing information of little value
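One way such a reduction could work is sketched below. The slides do not describe the actual reduced format, so everything here is an assumption for illustration: the whitespace-separated input fields, the record layout `RECORD`, and the function name `reduce_log` are hypothetical.

```python
import struct

# Hypothetical fixed-width record: timestamp, client id, object id,
# size in bytes (all unsigned 32-bit), HTTP status (unsigned 16-bit).
RECORD = struct.Struct("<IIIIH")


def reduce_log(lines):
    """Pack text log lines into compact binary records.

    Each input line is assumed to look like:
        "<timestamp> <client_ip> <url> <size_bytes> <status>"
    Client addresses and URLs are replaced by small integer ids, so each
    long string is stored once in a lookup table instead of per request.
    """
    client_ids, url_ids = {}, {}
    packed = bytearray()
    for line in lines:
        timestamp, client, url, size, status = line.split()
        client_id = client_ids.setdefault(client, len(client_ids))
        url_id = url_ids.setdefault(url, len(url_ids))
        packed += RECORD.pack(int(float(timestamp)), client_id, url_id,
                              int(size), int(status))
    return bytes(packed), client_ids, url_ids


sample = [
    "1000.5 10.0.0.1 http://example.com/a.gif 4096 200",
    "1001.0 10.0.0.1 http://example.com/b.html 2048 200",
]
data, clients, urls = reduce_log(sample)
```

Each request then occupies `RECORD.size` (18) bytes regardless of URL length, which is the kind of saving that makes long traces practical to store and replay.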
Key Workload Characteristics
- Cacheable objects: most client requests are for cacheable objects (96%)
- Object set size: 389 GB in total
- Object sizes: variable; median 4 KB, maximum 148 MB (a video clip)
- Recency of reference: 1/3 of all re-references occurred within one hour
- Frequency of reference: Web referencing patterns are non-uniform
- Turnover: objects that were once popular are no longer requested
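A recency figure like the one above could be measured from a trace roughly as follows. This is a sketch under assumptions: the `(timestamp_seconds, url)` trace format and the function name are hypothetical, and the paper's own analysis may have defined re-references differently.

```python
def fraction_rereferenced_within(trace, window_seconds=3600):
    """Fraction of re-references that occur within `window_seconds`
    of the previous request for the same object.

    `trace` is assumed to be an iterable of (timestamp_seconds, url)
    pairs in time order.
    """
    last_seen = {}
    rereferences = 0
    within_window = 0
    for timestamp, url in trace:
        if url in last_seen:
            rereferences += 1
            if timestamp - last_seen[url] <= window_seconds:
                within_window += 1
        last_seen[url] = timestamp
    return within_window / rereferences if rereferences else 0.0


trace = [(0, "a"), (10, "b"), (1800, "a"), (9000, "a"), (9010, "b")]
share = fraction_rereferenced_within(trace)
```

In this toy trace there are three re-references, and only the second request for "a" falls within one hour of the previous one, so the result is one third.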
Experimental Design – 1/2
- Least-Recently-Used (LRU)
  - Replaces the object requested least recently
  - Considers only a single workload characteristic
- Size
  - Replaces the largest object
  - Tries to minimize the miss ratio by evicting one large object rather than many small ones (favors hit rate over byte hit rate)
  - Susceptible to cache pollution
- GreedyDual-Size (GD-Size)
  - GD-Size(1) for hit rate
  - GD-Size(Packets) for byte hit rate
  - K_i = C_i / S_i + L
    - C_i: the cost associated with bringing object i into the cache
    - S_i: the object size
    - L: a running age factor
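The GD-Size key K_i = C_i / S_i + L can be turned into a small simulator along the following lines. This is a minimal sketch, not the authors' implementation: the class name, the linear scan for the victim, and the `cost` callback are assumptions. A constant cost of 1 corresponds to GD-Size(1); a packet-based cost function could be passed in for GD-Size(Packets).

```python
class GDSizeCache:
    """Toy GreedyDual-Size cache; object sizes must be positive."""

    def __init__(self, capacity_bytes, cost=lambda size: 1.0):
        self.capacity = capacity_bytes
        self.cost = cost          # C_i as a function of object size
        self.L = 0.0              # running age factor
        self.used = 0
        self.entries = {}         # url -> (priority K_i, size S_i)

    def access(self, url, size):
        """Process one request; return True on a hit, False on a miss."""
        hit = url in self.entries
        if hit:
            size = self.entries[url][1]
        else:
            if size > self.capacity:
                return False      # object can never fit; do not cache it
            while self.used + size > self.capacity:
                # Evict the object with the lowest key and age the cache:
                # L becomes the key of the evicted object.
                victim = min(self.entries, key=lambda u: self.entries[u][0])
                self.L = self.entries[victim][0]
                self.used -= self.entries[victim][1]
                del self.entries[victim]
            self.used += size
        # On both insertion and hit the key is (re)computed with the current L.
        self.entries[url] = (self.L + self.cost(size) / size, size)
        return hit


cache = GDSizeCache(capacity_bytes=100)
cache.access("a", 80)   # K = 1/80
cache.access("b", 10)   # K = 1/10
cache.access("c", 20)   # does not fit: "a" has the lowest key and is evicted
```

Because L only grows, objects that have not been referenced for a long time end up with keys below those of newly inserted objects and are eventually replaced.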
Experimental Design – 2/2
- LFU
  - Replaces the least frequently used object
  - LFU-Aging = LFU + aging → avoids cache pollution
  - Parameterization problem still remains
- GreedyDual-Size with Frequency (GDSF)
  - GD-Size doesn’t take frequency into account
  - K_i = F_i * C_i / S_i + L
    - F_i: a frequency count
- Least Frequently Used with Dynamic Aging (LFU-DA)
  - LFU-Aging requires parameterization to perform well
  - LFU-DA uses an inflation factor as well as the frequency count
  - K_i = C_i * F_i + L
    - L: a running age factor
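A short worked comparison of the two keys may help. The function names and the numbers below are illustrative assumptions; only the two formulas come from the slides.

```python
def gdsf_key(freq, cost, size, age):
    """GDSF: K_i = F_i * C_i / S_i + L."""
    return freq * cost / size + age


def lfu_da_key(freq, cost, age):
    """LFU-DA: K_i = C_i * F_i + L."""
    return cost * freq + age


# A small object referenced 3 times vs. a large object referenced 10 times,
# both with cost 1 and inserted while L = 0 (hypothetical numbers).
small = {"freq": 3, "size": 4096}
large = {"freq": 10, "size": 409600}

gdsf_small = gdsf_key(small["freq"], 1, small["size"], 0.0)
gdsf_large = gdsf_key(large["freq"], 1, large["size"], 0.0)
lfu_small = lfu_da_key(small["freq"], 1, 0.0)
lfu_large = lfu_da_key(large["freq"], 1, 0.0)

# Dynamic aging: after evictions have raised L to 5, a brand-new object
# (freq 1) outranks an old object whose key was last computed at L = 0
# with freq 4.
new_object = lfu_da_key(1, 1, 5.0)
stale_object = lfu_da_key(4, 1, 0.0)
```

With these numbers GDSF ranks the small object above the large one (dividing by size favors many small objects), while LFU-DA ranks the large, more popular object higher because size does not enter its key. The last two lines show how the running age factor lets new objects displace formerly popular ones without a tuned aging parameter.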
Simulation Results – 1/2
Figure 1. Comparison of Existing Replacement Policies
Simulation Results – 2/2
Figure 2. Comparison of Proposed Policies to Existing Replacement Policies
Virtual Cache – 1/2
- An approach that can focus on both hit rate and byte hit rate simultaneously
- Mechanism
  - Logically partitions the cache into N virtual caches
  - Each virtual cache (VC) is managed with its own replacement policy
  - Steps
    - Initially, all objects are added to VC0
    - Replacements from VC_i are moved to VC_(i+1)
    - Replacements from the last virtual cache, VC_(N-1), are evicted from the cache
    - When reaccessed, objects are reinserted in VC0
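The steps above can be sketched as a toy simulator. This is a simplified illustration rather than the authors' design: the class and parameter names are hypothetical, each partition's policy is reduced to a priority function of `(size, freq)`, and the running age factor L is omitted for brevity.

```python
class VirtualCache:
    """Cache logically split into N partitions, VC0 .. VC(N-1)."""

    def __init__(self, capacities, key_fns):
        self.capacities = capacities               # bytes per partition
        self.key_fns = key_fns                     # one priority function each
        self.parts = [dict() for _ in capacities]  # url -> (size, freq)

    def _used(self, i):
        return sum(size for size, _ in self.parts[i].values())

    def _insert(self, i, url, size, freq):
        # Past the last partition, or too large to fit: the object leaves the cache.
        if i == len(self.parts) or size > self.capacities[i]:
            return
        part = self.parts[i]
        while self._used(i) + size > self.capacities[i]:
            # The lowest-priority object in VC_i is demoted to VC_(i+1).
            victim = min(part, key=lambda u: self.key_fns[i](*part[u]))
            victim_size, victim_freq = part.pop(victim)
            self._insert(i + 1, victim, victim_size, victim_freq)
        part[url] = (size, freq)

    def access(self, url, size):
        """Process one request; return True on a hit in any partition."""
        for part in self.parts:
            if url in part:
                size, freq = part.pop(url)
                self._insert(0, url, size, freq + 1)   # reinsert in VC0
                return True
        self._insert(0, url, size, 1)                  # new objects enter VC0
        return False


vc = VirtualCache(
    capacities=[100, 100],
    key_fns=[lambda size, freq: freq / size,   # size-aware, GDSF-like
             lambda size, freq: freq],         # frequency-only, LFU-like
)
vc.access("a", 60)
vc.access("b", 60)   # "a" is demoted from VC0 to VC1
vc.access("a", 60)   # hit in VC1: "a" returns to VC0, "b" is demoted
```

A hit in any partition counts as a cache hit, so the second partition acts as a second chance for objects that the first policy would have discarded.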
Virtual Cache – 2/2
Figure 3. Analysis of Virtual Cache Performance; VC0 using GDSF-Hits, VC1 using LFU-DA
Figure 4. Analysis of Virtual Cache Performance; VC0 using LFU-DA, VC1 using GDSF-Hits