Dynamic Web Application Deployment
Instructor: Dr. Zhang
Presenter: Ningfang Mi
Date: Nov. 3, 2004
Outline
Motivation
Challenge of Dynamic Content
Approaches
• ESI
• CSI
• ACDN
Discussion
Conclusion
Reference
Motivation
Caching
• Important tool to deal with the rate of requests to Internet servers
  Reduce network congestion
  Reduce page display time
• Client-centric: proxy caching
• Server-centric:
  reverse proxy caching
  content delivery network (CDN)
• Limitation: mostly oriented toward static content
Dynamic Web Pages
Static Web pages are not ENOUGH!
More and more pages contain dynamic content
• News headlines, stock information, current temperature, ...
Good news:
• a more compelling experience for the end-user
• an easier development model for the application designer
Bad news:
• Bad for caching!
[Figure label: Dynamic component]
Challenge of Dynamic Content
Web developers frequently use technologies like JavaServer Pages (JSP) and Active Server Pages (ASP) to design their applications
But when traffic on Web sites increases, the computing overhead can result in increasing delays and failures in data delivery
Challenge of Dynamic Content (2)
Dynamic content places significant strain on traditional Web site architectures
• the same infrastructure used to generate the content is used to deliver the content
(Source: www.esi.org)
Challenge of Dynamic Content (3)
Generating dynamic content typically incurs:
• network overhead as user requests are dispatched to appropriate software modules that service these requests
• processing overhead as these modules determine which data to fetch and present
• disk I/O as these modules query the back-end database
In short, building dynamic Web pages is computationally expensive
Challenge of Dynamic Content (4)
Two major issues:
• Site Experience and Effectiveness
  dynamic content, abandon rate, download speed
• Site Cost Structure
  investments to support scalability, reliability, performance, system management, etc.
One more problem:
• How to facilitate caching for dynamic Web pages?
Approaches
Caching dynamic responses, with mechanisms for timely invalidation of the cached copies
Assembling a response at the edge from static and dynamic components
• Edge Side Includes (ESI) assembling
• Client Side Includes (CSI) assembling
Application distribution networks, running complete applications at the edge of the network
• Application Content Delivery Network (ACDN)
Fragment-based Technologies
Dynamic pages are not all dynamic
• Most bytes are in a static page template
• Dynamic fragments are a small fraction of the entire page
Different portions have different properties
• Template: slow-changing content
• Fragment: fast-changing content
Example page (fragments marked in the figure as Fragment 1 and Fragment 2):
• Full page: 30731 bytes
• news headlines: 927 bytes (3%), refetched every few hours
• stock quotes: 1231 bytes (4%), refetched every minute
Reassembling Fragmented Page
Historically, the page is assembled at origin sites using ASP, JSP, or Server-Side Includes.
Edge Side Includes Language
• An XML-based mark-up language
• A mechanism to assemble a page from different components at edge servers (reverse proxies)
• Separate cache control for each component
• Independently download changed fragments
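The edge-side assembly described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Akamai's implementation: the `EdgeCache` class, the `assemble` function, the per-URL `ttls` table, and the injected `fetch` callable are all hypothetical names, and only the `<esi:include src="..."/>` tag is taken from the ESI language itself.

```python
import re
import time

# Matches ESI include tags such as <esi:include src="/frag1.html"/>.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


class EdgeCache:
    """Tiny per-component cache: every URL carries its own time-to-live."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch      # hypothetical callable: url -> body from the origin
        self._clock = clock      # injectable clock so the sketch is testable
        self._entries = {}       # url -> (expires_at, body)

    def get(self, url, ttl):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is None or entry[0] <= now:
            # Missing or expired: go back to the origin for this component only.
            entry = (now + ttl, self._fetch(url))
            self._entries[url] = entry
        return entry[1]


def assemble(template_url, cache, ttls, default_ttl=60.0):
    """Build the full page at the edge from a template and its fragments."""
    template = cache.get(template_url, ttls.get(template_url, default_ttl))

    def expand(match):
        src = match.group(1)
        return cache.get(src, ttls.get(src, default_ttl))

    return ESI_INCLUDE.sub(expand, template)
```

With a long TTL on the template and a short one on each fragment, repeated requests cause the edge server to refetch only the fragments that have expired, which is the separate cache control the slide refers to.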
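Akamai’s Approach for ESI-encoded Contents
[Figure: message flows among Browser, Edge server, and Origin server]
No ESI:
• Browser → Edge server: GET /www.att.com
• Edge server → Origin server: GET /www.att.com
• Origin server → Edge server → Browser: Full page
ESI with edge-side page assembly (template cached):
• Browser → Edge server: GET /www.att.com
• Edge server → Origin server: GET /frag1.html
• Origin server → Edge server: Frag1
• Edge server: Page Assembly
• Edge server → Browser: Full page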
ESI -- Benefits and Limitation
Key Benefits of ESI
• extends the performance and cost-saving benefits of Web caching and content delivery services
Bottleneck of the Last Mile
• A large majority of Web users still rely on dial-up connections.
  Network traffic & revenue analysis: 79% of consumer subscribers as of March 2002.
  Jupiter Media Metrix, Aug 2001: 59% of the predicted on-line households in the US in 2006.
• ESI does NOT help dial-up customers!
  The speed of the last mile dominates the page display time.
Client-Side Includes (CSI)
Key idea:
• Assemble page in the browsers
• Dramatically reduce user response time
No browser modifications or configuration needed.
Use existing technologies inside Internet Explorer
• Page parsing and assembly: JavaScript
• Retrieval of page components: ActiveX
Free to use or not use a CDN
• Without: client → origin server directly
• With: client → edge server → origin server
  scalable delivery of page template and fragments
  Traffic reduction between client and edge server
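Page Assembly Alternatives
[Figure: message flows among Browser, Edge server, and Origin server]
No ESI:
• Browser → Edge server → Origin server: GET /www.att.com
• Origin server → Edge server → Browser: Full page
ESI with edge-side assembly (template cached):
• Browser → Edge server: GET /www.att.com
• Edge server → Origin server: GET /frag1.html
• Origin server → Edge server: Frag1
• Edge server: Page Assembly
• Edge server → Browser: Full page
ESI with client-side assembly (template cached):
• Browser → Edge server → Origin server: GET /frag1.html
• Origin server → Edge server → Browser: Frag1
• Browser: Page Assembly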
ESI vs. CSI
Same markup language (ESI)
ESI assembling:
• Reduces bandwidth and server load
CSI assembling:
• Reduces connectivity costs at origin server (less load and bandwidth).
• Reduces CDN-related costs (less bandwidth from edge to clients).
• Reduces browser download times (less bandwidth at last mile).
Implementation of CSI
Implement CSI for the prevalent browser only (Microsoft Internet Explorer, MSIE)
Resort to edge-side or server-side page assembly for all other browsers
Implementation (with a CDN)
• JavaScript: assembles a Web page
• ActiveX: downloads page components
• Wrapper: invokes JavaScript and passes it the URL of the requested page
[Figure: message flow among Browser, Edge server, and Origin server]
• Browser → Edge server: GET /www.att.com returns the Wrapper (cacheable, immutable for given page)
• Browser: GET CSI JavaScript (cacheable, same for all pages); typically satisfied from client’s cache
• Browser obtains fragments using ActiveX; edge server obtains fragments using HTTP
Performance Evaluation
Synthetic pages: randomly generated contents
• Sizes: 20K, 60K, 100K
• Template (80%) + four fragments (5% each)
AT&T page: http://www.att.com
• One template, two fragments
Wall Street Journal page: http://online.wsj.com
• One template, three fragments
Display Time of Synthetic Pages
[Figure: display time of synthetic pages over dial-up links; labels: ESI processing overhead, nothing cached, CSI script cached, template cached]
Conclusion: Substantial reduction in display time across all page sizes
Bandwidth Reduction
                  AT&T Page       WSJ Page
Full Page         30731 (100%)    79608 (100%)
Page Template     28661 (93%)     56324 (71%)
Current Time      N/A             55 (0%)
News Headlines    927 (3%)        20161 (25%)
Stock Quotes      1231 (4%)       3166 (4%)
All numbers are in bytes
Conclusion: CSI can achieve significant reduction in bandwidth when the templates are cached in the browser.
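The size of that reduction can be checked with simple arithmetic on the byte counts in the table. The sketch below assumes the worst case for a cached template, namely that every fragment is refetched on each page view; the `PAGES` structure and the `fragment_share` helper are illustrative names, not part of the original study.

```python
# Byte counts copied from the Bandwidth Reduction table.
PAGES = {
    "AT&T": {
        "full": 30731,
        "template": 28661,
        "fragments": {"news headlines": 927, "stock quotes": 1231},
    },
    "WSJ": {
        "full": 79608,
        "template": 56324,
        "fragments": {"current time": 55, "news headlines": 20161, "stock quotes": 3166},
    },
}


def fragment_share(page):
    """Fraction of the full page that still crosses the last mile
    when the template is cached and all fragments are refetched."""
    return sum(page["fragments"].values()) / page["full"]


if __name__ == "__main__":
    for name, page in PAGES.items():
        print(f"{name}: {fragment_share(page):.1%} of the full page per view")
```

Under that assumption the AT&T page transfers about 7% of its full size per view and the WSJ page about 29%, so the benefit depends heavily on how large the fragments are relative to the template.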
Limitation of CSI
The need to download the wrapper increases latency on the first access to the page.
Sequentially and synchronously downloading fragments may slow down page assembly.
JavaScript can download the template and all fragments only from the same Web site.
Some pages that are well suited to ESI assembly may not be amenable to CSI.
• Accessed by very many clients
• Once per client over a long interval
Application Content Delivery Networks (ACDNs)
Currently CDNs provide access to static and streaming content
• Proxy caches can improve the delivery
Unique CDN value:
• Delivering dynamic content
• Proxy can’t cache the dynamic content
An Application CDN (ACDN)
• Deploy the application on a single computer
• Replicate or migrate the application as needed
Issues of ACDN
Application distribution framework
• Dynamically deploy a replica
• Keep consistency of replicas
Content placement algorithm
• Decide which applications to deploy where and when
Request distribution algorithm
• Decide how to distribute requests among replicas
System stability – reach a steady state
Bandwidth overhead – create replicas
Architecture Overview
[Figure: ACDN architecture]
Components shown:
• Clients
• Load-balancing DNS
• Central Replicator
• Servers, each hosting applications with their metafiles and a local replicator
• Local replicator: CGI scripts (Start-up, Load reporter, Repl target, Repl source, Updater) and a Decision process
Callouts: Standard Web server; Keep track of application replicas; Compute request distribution policy
Architecture Overview (2)
[Same figure, with callouts for parts of the local replicator]
• Start-up: invoked by the system administrator when a new ACDN server comes on-line
• Load reporter: invoked by the central replicator to report the load of the server
• Decision process: periodically examines every application to decide whether to replicate or delete it
ACDN Components
Application distribution framework
• Dynamically create and delete application replicas based on demand
• Maintain replica consistency
Content placement algorithms
Request distribution algorithms
Application Distribution Framework -- Metafile
Two parts in a metafile:
• A list of all files comprising the application along with their last-modified dates
• An initialization script (or a URL of the file with the script) run by the recipient server before accepting any request
Example metafile:
  FILE /home/applications/mapping/query_engine.cgi 1999.apr.14.08:46:12
  FILE /home/applications/mapping/map_database 2000.oct.15.13:15:59
  FILE /home/applications/mapping/user_preferences 2001.jan.30.18:00:05
  SCRIPT
  mkdir /home/applications/mapping/access_stats
  setenv ACCESS_DIRECTORY /home/applications/mapping/access_stats
  ENDSCRIPT
Callouts: one executable file and two data files; the script creates a directory and sets the environment variable
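To make the two-part structure concrete, here is a minimal Python sketch that reads a metafile like the example on this slide. It assumes one `FILE` entry per line and one script command per line between `SCRIPT` and `ENDSCRIPT`; that line layout, and the names `parse_metafile` and `parse_timestamp`, are assumptions for illustration rather than a format definition from the ACDN work.

```python
from datetime import datetime

# Month abbreviations as they appear in the slide's timestamps (e.g. "apr").
MONTHS = {
    name: number
    for number, name in enumerate(
        ["jan", "feb", "mar", "apr", "may", "jun",
         "jul", "aug", "sep", "oct", "nov", "dec"],
        start=1,
    )
}


def parse_timestamp(text):
    """Parse a stamp such as 1999.apr.14.08:46:12 into a datetime."""
    year, month, day, clock = text.split(".")
    hour, minute, second = clock.split(":")
    return datetime(int(year), MONTHS[month.lower()], int(day),
                    int(hour), int(minute), int(second))


def parse_metafile(text):
    """Return (files, script): path -> last-modified time, and script lines."""
    files = {}
    script = []
    in_script = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if in_script:
            if line == "ENDSCRIPT":
                in_script = False
            else:
                script.append(line)
        elif line == "SCRIPT":
            in_script = True
        elif line.startswith("FILE "):
            _, path, stamp = line.split()
            files[path] = parse_timestamp(stamp)
        else:
            raise ValueError(f"unrecognized metafile line: {line!r}")
    if in_script:
        raise ValueError("SCRIPT section is missing ENDSCRIPT")
    return files, script


SAMPLE = """\
FILE /home/applications/mapping/query_engine.cgi 1999.apr.14.08:46:12
FILE /home/applications/mapping/map_database 2000.oct.15.13:15:59
FILE /home/applications/mapping/user_preferences 2001.jan.30.18:00:05
SCRIPT
mkdir /home/applications/mapping/access_stats
setenv ACCESS_DIRECTORY /home/applications/mapping/access_stats
ENDSCRIPT
"""

if __name__ == "__main__":
    files, script = parse_metafile(SAMPLE)
    for path, modified in files.items():
        print(path, modified.isoformat())
    print(script)
```

A recipient server would use the file list to decide what to copy and would run the script lines before accepting requests; the sketch only parses and does not execute anything.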
Application Metafile
A metafile is treated as a static Web page with its own URL.
Using a metafile, the application distribution framework can be implemented over standard HTTP.
Operations of framework:
• Replica creation
• Replica deletion
• Replica consistency
Migration = creation + deletion
Replica Creation
Initiated by the decision process on the source server
[Figure sequence: Source Server, Target Server, Central Replicator, DNS Server]
(1) On overload, the source server queries the central replicator for the least-loaded server, and the central replicator returns the least-loaded server
(2) The repl target CGI script is invoked on the target server with the URL of the application metafile
(3) The repl source CGI script is invoked on the source server with the URL of the application metafile
(4) A tar file of the application is transferred; the target server unpacks and installs it and executes the initialization script
(5) The new replica is reported, the request distribution policy is computed, and the DNS server is updated
Replica Deletion
Initiated by the decision process on a server with the replica
[Figure: Source Server, Central Replicator, DNS Server]
(1) The server sends a query to delete to the central replicator
(2) If it is not the last replica, the request distribution policy is computed and a deletion update is sent to the DNS server, which confirms with the DNS TTL
(3) The server receives permission to delete, with the DNS TTL
(4) The server marks the replica as “deleted” and deletes it after the TTL
TTL: delay for the application requests arriving due to earlier DNS responses
Consistency Maintenance
Only deal with the developer updates
Three issues:
• Replica divergence: conflicting updates
  Only update the primary application replica
• Replica staleness and replica coherency
  Missing updates and updates not to all files
  If the cached metafile is detected to be no longer valid, download the new metafile and copy all modified objects from the primary server
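The "copy all modified objects" step can be pictured as a comparison of two file lists. The sketch below assumes each metafile has already been reduced to a mapping from file path to last-modified stamp; `files_to_refresh` and `files_to_remove` are hypothetical helper names, and the handling of files that disappeared from the new metafile is an added assumption the slides do not discuss.

```python
def files_to_refresh(cached, current):
    """Paths that are new, or whose last-modified stamp changed,
    and so must be copied again from the primary server."""
    return sorted(path for path, stamp in current.items()
                  if cached.get(path) != stamp)


def files_to_remove(cached, current):
    """Paths listed in the cached metafile but absent from the new one
    (assumption: a replica would drop these)."""
    return sorted(path for path in cached if path not in current)


if __name__ == "__main__":
    cached = {
        "/home/applications/mapping/query_engine.cgi": "1999.apr.14.08:46:12",
        "/home/applications/mapping/map_database": "2000.oct.15.13:15:59",
    }
    current = {
        "/home/applications/mapping/query_engine.cgi": "1999.apr.14.08:46:12",
        "/home/applications/mapping/map_database": "2001.feb.02.09:00:00",
        "/home/applications/mapping/user_preferences": "2001.jan.30.18:00:05",
    }
    print(files_to_refresh(cached, current))
    print(files_to_remove(cached, current))
```

Because only changed entries are copied, a replica that is mostly up to date transfers little data, and copying every changed file in one pass avoids a replica that mixes old and new files.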
ACDN Algorithms
Application distribution framework
Content placement algorithm
• Decide which applications to deploy where and when
Request distribution algorithm
Content Placement Algorithm
Executed periodically by ACDN server
• Make a local decision on deleting, replicating, migrating its applications
For each application app:
(1) If demand is below the deletion threshold, delete app unless it is the only replica;
(2) If demand from another server’s region exceeds the deletion threshold and replication benefits are likely to exceed transfer overhead, try to replicate there;
(3) If demand from another server’s region exceeds 50% of total and migration benefits are likely to exceed transfer overhead, try to migrate there;
Improve proximity of servers to client requests
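The three rules can be written as a small decision function. This is a sketch under stated assumptions: the slide does not fix a precedence between replication and migration, so the code tries migration first; `demand_by_region` is assumed to cover only other servers' regions; and `worthwhile` is a hypothetical callback standing in for the "benefits are likely to exceed transfer overhead" estimate, which the slides do not define.

```python
def placement_decision(total_demand, demand_by_region, is_only_replica,
                       deletion_threshold, worthwhile):
    """Return (action, region) for one application on one server.

    action is one of "delete", "migrate", "replicate", "keep".
    worthwhile(action, region) -> bool is a hypothetical cost/benefit check.
    """
    # Rule (1): drop a replica nobody needs, unless it is the only one.
    if total_demand < deletion_threshold and not is_only_replica:
        return ("delete", None)

    # Look at the busiest remote regions first.
    by_demand = sorted(demand_by_region.items(),
                       key=lambda item: item[1], reverse=True)
    for region, demand in by_demand:
        # Rule (3): most of the demand comes from one remote region.
        if demand > 0.5 * total_demand and worthwhile("migrate", region):
            return ("migrate", region)
        # Rule (2): a remote region generates enough demand for its own replica.
        if demand > deletion_threshold and worthwhile("replicate", region):
            return ("replicate", region)

    return ("keep", None)


if __name__ == "__main__":
    always = lambda action, region: True
    print(placement_decision(100, {"B": 60, "C": 20}, False, 10, always))
    print(placement_decision(100, {"B": 30, "C": 20}, False, 10, always))
```

Each server runs this locally for every application it hosts, so no global view is needed to move replicas toward the regions that generate the requests.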
Content Placement Algorithm (2)
If server is overloaded:
(1) Find the least-loaded server from central replicator;
(2) Replicate some applications there if the load at the least-loaded server is above the deletion threshold
(3) Otherwise, migrate some applications there if its projected load after receiving the application will remain acceptable (below LW)
Achieve load balancing among servers
ACDN Algorithms
Application distribution framework
Content placement algorithm
Request distribution algorithm
• Decide how to distribute requests among replicas
Request Distribution Algorithm
Goal: Never skip the nearest non-overloaded server and yet reduce oscillations in request distribution
iDNS: load-balancing DNS server
Request distribution policy
• (R, Prob(1), …, Prob(N))
• Prob(i) is the probability of selecting server i for a request from the region R
Request Distribution Algorithm (2)
Three phases:
• Assign the probability to each server based on its load
• Examine all servers with a replica of the application in the order of the increasing distance from the region
• Normalize the probabilities of these servers so that they sum up to one

Initial probabilities:
  Set prob(i) = 0 for all i
  Loop through the replicas in order of decreasing proximity
    if load(i) < LW
      prob(i) = 1.0
      exit
    else if LW <= load(i) < HW
      prob(i) = (HW – load(i)) / (HW – LW)

Adjustments to distance from region R:
  remainder = 1.0
  Loop through the servers with a replica of the application in order of increasing distance from region R
    prob(i) = prob(i) * remainder
    remainder = remainder – prob(i)

Final probabilities:
  if sum of all > 0
    prob(i) = prob(i) / sum of all
  else
    prob(i) = 1/n, where n is the number of replicas
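The three phases translate directly into Python. The sketch below follows the slide's pseudocode step by step; the function name `request_distribution` and the convention that `loads` is already ordered nearest-first for region R are assumptions, and LW and HW are read here as low and high load watermarks.

```python
def request_distribution(loads, lw, hw):
    """Selection probabilities for the replicas of one application.

    loads: replica loads ordered from nearest to farthest from region R.
    lw, hw: the LW and HW values from the slide (assumed watermarks).
    """
    n = len(loads)
    if n == 0:
        raise ValueError("at least one replica is required")
    probs = [0.0] * n

    # Phase 1: initial probabilities from load, nearest replica first.
    for i, load in enumerate(loads):
        if load < lw:
            probs[i] = 1.0
            break  # a lightly loaded replica absorbs everything that reaches it
        if load < hw:
            probs[i] = (hw - load) / (hw - lw)
        # load >= hw: the replica is overloaded and keeps probability 0

    # Phase 2: each replica takes its share of what nearer replicas left over.
    remainder = 1.0
    for i in range(n):
        probs[i] *= remainder
        remainder -= probs[i]

    # Phase 3: normalize, or spread evenly if every replica is overloaded.
    total = sum(probs)
    if total > 0:
        return [p / total for p in probs]
    return [1.0 / n] * n


if __name__ == "__main__":
    # Nearest replica is moderately loaded, the second one is nearly idle.
    print(request_distribution([600, 100, 50], lw=200, hw=1000))
```

With LW=200 and HW=1000, loads of 600, 100, and 50 requests/second give probabilities 0.5, 0.5, and 0.0: the nearest server keeps half of the traffic, the next one takes the rest, and the farthest is never selected.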
ACDN Performance -- Request Distribution
Three servers with decreasing proximity to all clients
• Server 1 is the closest, server 2 is the next closest, server 3 is the farthest.
HW=1000 requests/second
LW=200 requests/second
Start with 10 clients, gradually increase to over server capacity, then decrease back to 10 clients
ACDN, pure random and CDN brokering
• CDN brokering: select the closest one with load < 80% of its capacity
[Figure: request distribution under each algorithm]
• Random: perfect load balancing, but unnecessarily high latency
• ACDN: uses proximity information efficiently and avoids overloading the replicas
• CDN brokering: considers both load and proximity, but not as well as ACDN
ACDN Performance -- Content Placement
10% of servers are in “hot” regions with 90% of demand
90% of servers are in “cold” regions with 10% of demand
The set of hot regions changes every 400 seconds, to see how the system adapts.
Two other algorithms:
• Static: a replica is created when the simulation starts and is fixed through the simulation
• Ideal: can get instantaneous knowledge of hot regions and replicates or deletes applications.
[Figures: network bandwidth consumption and response latency for Static, ACDN, and Ideal]
Conclusion: Quickly adapts to the set of hot regions and significantly reduces network bandwidth and response time
ACDN Performance -- Redeployment Threshold
Low threshold → more replicas
• response latency, overhead
High threshold → fewer replicas
• response latency, overhead
Conclusion: thresholds that are either too high or too low result in increased bandwidth consumption
Discussion
Fragment-based techniques reduce bandwidth because only modified fragments need to be transferred and most of a dynamic page is still static. How about a totally dynamic page with frequently changed fragments?
ACDN only considers read-only applications. How to deal with consistency when a user updates the application data?
Conclusion
Static Web pages are not ENOUGH!
ESI
• reduces bandwidth and server load
• but does not help dial-up customers!
CSI
• reduces load and bandwidth at origin server, bandwidth from edge to clients, bandwidth consumption over the last mile, and browser download time.
• but is not good for some pages that are accessed by very many clients but once per client over a long interval
ACDN
• A middleware platform for providing scalable access to Web applications
• Unique CDN value: dynamic Web pages
Reference
www.esi.org
Michael Rabinovich, et al., “Moving Edge-Side Includes to the Real Edge—the Clients”, Proceedings of the 4th USENIX Symposium on Internet Technology, 2003.
Michael Rabinovich and Zhen Xiao, “Computing on the Edge: A Platform for Replicating Internet Applications”, Proceedings of WCW'03, 2003.
Arun Iyengar, Jim Challenger, “Improving Web Server Performance by Caching Dynamic Data”, USENIX Symposium on Internet Technologies and Systems, 1997.
Fred Douglis, Antonio Haro, Michael Rabinovich, “HPP: HTML Macro-Preprocessing to Support Dynamic Document Caching”, USENIX Symposium on Internet Technologies and Systems, 1997.