1
Challenges in Scaling E-Business Sites
Menascé and Almeida. All Rights Reserved.
Daniel A. Menascé
Department of Computer Science
George Mason University
2
Impacts of Bad Performance
Bad performance: response time above 8 seconds (eight-second rule).
$43.5 billion lost each year in e-commerce due to bad performance (Zona Research, April 1999).
Holiday Season of 1998: over 1/3 of customers gave up due to slowness, 44% turned to conventional stores, 14% moved to another site.
3
Performance Problems for E-commerce tend to get worse!
Proliferation of mobile devices
Easier-to-use interfaces (VUI, wireless and Web services on cars and airplanes, novel browsing paradigms)
Increasing load placed by agents
Impacts of authentication protocols (e.g., TLS) on e-commerce site performance.
4
Typical Questions
Is the online trading site prepared to accommodate a 75% increase in trades/day?
Do I have enough servers to handle a peak demand 10x the average?
How fast can the site architecture be scaled up? What components should be upgraded? Database servers? Web servers? Application servers? Bandwidth?
How can I design a site that will meet its business goals?
5
Outline
Scalability
A Reference Model for E-Business
Workload Characterization
Customer behavior model graphs
Client/Server Interaction Diagrams
Concluding Remarks
6
A Reference Model for Electronic Business
[Figure: layered reference model]
Business Model: characteristics of the business
Functional Model: navigational structure of the site
Customer Model: patterns of customer behavior
Resource Model: site architecture and service demands
Other figure labels: Business View, Technological View, External Metrics, Internal Metrics, Descriptors
7
External Metrics and Descriptors
Cover the Nature of Business
Metrics:
Revenue throughput (dollars/sec)
Potential lost revenue/sec
Click-to-look ratio
Look-to-basket ratio
Basket-to-buy ratio
Click-to-buy ratio
Availability
Download times
Page views/day
Unique visitors/day
8
External Metrics and Descriptors
Descriptors:
Number of registered customers
Number of potential customers
Maximum number of simultaneous customers in the store
Number of items in the catalog
Estimated operational cost
9
Workload Characterization
1. Determine the e-business functions made available by the site. Associate URLs or URL patterns to each e-business function.
2. Analyze the site’s HTTP logs to determine customer sessions.
3. Cluster customer sessions into groups of “similar” customer sessions, i.e., sessions that exhibit similar behavior.
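Steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration, not the authors' tooling: the URL patterns, the function names, the `(customer_id, timestamp, url)` record layout, and the 30-minute inactivity threshold are all assumptions made for the example. Step 3 (clustering) is not covered here.

```python
import re
from collections import defaultdict

# Hypothetical URL patterns for each e-business function (step 1).
FUNCTION_PATTERNS = [
    ("home", re.compile(r"^/$")),
    ("search", re.compile(r"^/search")),
    ("browse", re.compile(r"^/catalog/")),
    ("select", re.compile(r"^/item/")),
    ("add", re.compile(r"^/cart/add")),
    ("pay", re.compile(r"^/checkout")),
]

SESSION_TIMEOUT = 30 * 60  # seconds of inactivity that end a session (assumed)


def classify(url):
    """Map a URL to an e-business function, or None if it matches nothing."""
    for name, pattern in FUNCTION_PATTERNS:
        if pattern.search(url):
            return name
    return None


def sessions_from_log(records):
    """Group (customer_id, timestamp, url) records into sessions (step 2).

    Returns a list of sessions, each a list of e-business function names
    in the order the customer invoked them.
    """
    by_customer = defaultdict(list)
    for customer_id, timestamp, url in records:
        function = classify(url)
        if function is not None:  # drop requests for images, style sheets, etc.
            by_customer[customer_id].append((timestamp, function))

    sessions = []
    for visits in by_customer.values():
        visits.sort()
        current, last_time = [], None
        for timestamp, function in visits:
            # A long gap since the previous request starts a new session.
            if last_time is not None and timestamp - last_time > SESSION_TIMEOUT:
                sessions.append(current)
                current = []
            current.append(function)
            last_time = timestamp
        if current:
            sessions.append(current)
    return sessions
```

Each resulting session is a sequence of e-business functions, which is the form needed to count state-to-state transitions for a customer behavior model graph.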
10
[Figure: state-transition graph with states entry, home, browse, search, select, add, and pay. Edges are labeled with transition probabilities; the assignment of probabilities to edges is not recoverable from this transcript.]
Customer Behavior Model Graph
11
Metrics Derived from the CBMG
Average Number of Visits Per State
E.g., average number of searches per visit to the site.
Average Buy (or open account) to Visit Ratio – also called conversion ratio.
Average Session Length Per Visit
12
Metrics Derived from the CBMG
$$V_{entry} = 1, \qquad V_j = \sum_{k=1}^{n-1} V_k \, p_{k,j}$$
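As a rough illustration of how these visit equations can be solved, the sketch below computes the average number of visits per state by repeated substitution, taking $V_j$ as the average number of visits to state $j$ per session and $p_{k,j}$ as the probability of moving from state $k$ to state $j$. The transition probabilities in `P` are invented for the example (they are not the values from the CBMG figure), and the state names, the implicit exit, and the helper `visit_ratios` are assumptions.

```python
# Hypothetical CBMG: P[k][j] is the probability of moving from state k to j.
# Probability mass missing from a row is the chance of leaving the site.
P = {
    "entry":  {"home": 1.0},
    "home":   {"browse": 0.5, "search": 0.4},
    "browse": {"browse": 0.3, "search": 0.2, "select": 0.3},
    "search": {"browse": 0.2, "search": 0.3, "select": 0.3},
    "select": {"browse": 0.3, "search": 0.3, "add": 0.2},
    "add":    {"browse": 0.3, "pay": 0.4},
    "pay":    {},
}


def visit_ratios(P, entry="entry", tol=1e-12, max_iter=10_000):
    """Solve V_entry = 1, V_j = sum_k V_k * p[k][j] by repeated substitution."""
    V = {state: 0.0 for state in P}
    V[entry] = 1.0
    for _ in range(max_iter):
        new_V = {state: 0.0 for state in P}
        new_V[entry] = 1.0
        for k, row in P.items():
            for j, p_kj in row.items():
                new_V[j] += V[k] * p_kj
        if max(abs(new_V[s] - V[s]) for s in P) < tol:
            return new_V
        V = new_V
    raise RuntimeError("visit ratios did not converge")


V = visit_ratios(P)

# Average session length, counted here as visits to all states except entry
# (whether to count the entry state is a choice made for this sketch).
session_length = sum(v for state, v in V.items() if state != "entry")

# In this graph "pay" is visited at most once per session, so V["pay"]
# is also the buy-to-visit (conversion) ratio.
buy_to_visit = V["pay"]
```

The iteration converges because every state eventually leads out of the site; a direct linear solve would work equally well for larger graphs.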
13
Workload Characterization Methodology
[Figure: HTTP Logs → Merge and Filter → Request Log → Get Sessions → Session Log → Get CBMGs (clustering algorithm) → CBMGs]
14
Result of Clustering Analysis
Cluster              1      2      3      4      5      6
% of Sessions    44.28  28.00  10.60   9.29   6.20   1.50
BV Ratio (%)      5.70   4.50   3.70   4.00   3.50   2.00
Session Length     5.6     15     27     28     50     81
AV Ratio (%)        11     15     21     20     32     50
Vb+Vs              3.6   11.4     20     23     39     70
• Cluster 1: majority of sessions, short sessions, and highest BV ratio.
• Cluster 6: small fraction of sessions, long sessions, smallest BV ratio.
15
Buy to Visit Ratio vs. Session Length
[Figure: buy-to-visit ratio (y axis, 0 to 7) plotted against session length (x axis, 0 to 90), with a fitted trend line]
Fitted curve: $y = 0.0003x^2 - 0.07x + 5.7919$, with $R^2 = 0.931$
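To see how closely the fitted curve tracks the clusters, the sketch below evaluates the trend line at the session lengths reported in the clustering results. The function name `predicted_bv_ratio` is made up for the example; the coefficients and the cluster data are taken from the slides.

```python
def predicted_bv_ratio(session_length):
    """Buy-to-visit ratio (%) predicted by the fitted quadratic trend line."""
    x = session_length
    return 0.0003 * x**2 - 0.07 * x + 5.7919


# (session length, observed BV ratio in %) for clusters 1 to 6
clusters = [(5.6, 5.70), (15, 4.50), (27, 3.70), (28, 4.00), (50, 3.50), (81, 2.00)]

for length, observed in clusters:
    fitted = predicted_bv_ratio(length)
    print(f"length={length:>4}: observed={observed:.2f}%  fitted={fitted:.2f}%")
```

The coefficients on the slide are rounded, so the fitted values should be read as approximate.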
16
Architecture of E-Commerce Sites
[Figure: a client reaches the e-business site over the Internet. Inside the site, web and authentication servers, transaction servers, and a mainframe (legacy applications, enterprise data) are connected by LANs.]
17
Remarks
For e-commerce, the workload needs to be characterized at a higher level of abstraction: sessions rather than requests.
Workload characterization has to be customer behavior-oriented as opposed to request-oriented.
Customer Behavior Model Graphs capture customer behavior and can be mapped to resource demands.
18
Remarks (cont’d)
Capacity Planning and Performance Management:
key to EC: the competitors are just a click away!
require predictive models: avoid rules of thumb (ROTs)!
throwing more plumbing at the problem is not the solution.
models have to integrate the business, customer, and resource aspects of the problem.
19
Performance Models of E-Commerce Sites
Daniel A. Menascé
Dept. of Computer Science
George Mason University
23
Component-level Models
Each component is represented by a resource (e.g. CPU, disk, communication link) and a queue of requests waiting for the resource.
[Figure: a queue of waiting requests in front of a resource]
24
Basic Concepts
Utilization of a resource:
Fraction of time the resource is busy serving requests during a measurement interval.
Example: the CPU was busy for 40 minutes during a measurement period of one hour. Its utilization is then: 40/60 = 0.667 = 66.7%
Utilizations are dimensionless numbers.
25
Basic Concepts (cont’d)
A request may be using a resource (e.g., CPU, disk, etc.) or waiting to use it.
The time spent using the resource does not depend on the number of requests waiting to use the resource.
The time spent waiting to use the resource depends on the load, i.e., on the number of requests in the queue waiting to use the resource.
26
Performance Model Parameters
Workload Intensity
HTTP requests/sec
Transactions/sec
E-business functions/sec
Service demands for each resource and each type of request.
27
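Service Demands
[Figure: Web server with a CPU and two disks, plus incoming and outgoing links, each annotated with its service demand]
cpu: 0.003 sec
disk 1: 0.08 sec
disk 2: 0.12 sec
incoming link: 0.00107 sec
outgoing link: 0.109 sec
Service demands do not include any queuing time! They are just service time.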
28
Computing Waiting Times
[Figure: Web server diagram with per-resource service demands, repeated from the Service Demands slide]
Waiting times depend on the load (arrival rate of requests) and on the service demands.
29
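Computing Residence Times
$$R'_i = \frac{D_i}{1 - \lambda D_i}$$
$R'_i$: residence time at resource $i$
$D_i$: service demand at resource $i$
$\lambda D_i$: utilization of resource $i$ ($U_i$), where $\lambda$ is the arrival rate of requests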
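The residence time formula, service demand divided by one minus utilization, can be tried out with a short Python sketch. The function and variable names are chosen for this example only; the service demands and the arrival rate of 5 requests/sec are the values used in the worked examples on the following slides.

```python
def residence_time(service_demand, arrival_rate):
    """Residence time D / (1 - arrival_rate * D) at a single resource."""
    utilization = arrival_rate * service_demand  # U = arrival rate * demand
    if utilization >= 1.0:
        raise ValueError("resource is saturated (utilization >= 1)")
    return service_demand / (1.0 - utilization)


# Service demands in seconds, as given on the slides.
SERVICE_DEMANDS = {
    "incoming link": 0.00107,
    "outgoing link": 0.109,
    "cpu": 0.003,
    "disk 1": 0.08,
    "disk 2": 0.12,
}

ARRIVAL_RATE = 5.0  # requests/sec

for name, demand in SERVICE_DEMANDS.items():
    print(f"{name}: {residence_time(demand, ARRIVAL_RATE):.5f} sec")
```

Raising an error at saturation is a design choice for the sketch: the formula has no finite value once utilization reaches 1.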
30
Residence Time at Incoming Link
[Figure: Web server diagram with per-resource service demands; arrival rate $\lambda = 5$ req/sec]
$$R'_{IncLink} = \frac{D_{IncLink}}{1 - \lambda D_{IncLink}} = \frac{0.00107}{1 - 5 \times 0.00107} \approx 0.00108 \text{ sec}$$
31
Residence Time at Outgoing Link
[Figure: Web server diagram with per-resource service demands; arrival rate $\lambda = 5$ req/sec]
$$R'_{OutLink} = \frac{D_{OutLink}}{1 - \lambda D_{OutLink}} = \frac{0.109}{1 - 5 \times 0.109} \approx 0.239 \text{ sec}$$
32
Residence Time at the CPU
[Figure: Web server diagram with per-resource service demands; arrival rate $\lambda = 5$ req/sec]
$$R'_{cpu} = \frac{D_{cpu}}{1 - \lambda D_{cpu}} = \frac{0.003}{1 - 5 \times 0.003} \approx 0.00305 \text{ sec}$$
33
Residence Time at Disk 1
[Figure: Web server diagram with per-resource service demands; arrival rate $\lambda = 5$ req/sec]
$$R'_{disk1} = \frac{D_{disk1}}{1 - \lambda D_{disk1}} = \frac{0.08}{1 - 5 \times 0.08} \approx 0.133 \text{ sec}$$
34
Residence Time at Disk 2
[Figure: Web server diagram with per-resource service demands; arrival rate $\lambda = 5$ req/sec]
$$R'_{disk2} = \frac{D_{disk2}}{1 - \lambda D_{disk2}} = \frac{0.12}{1 - 5 \times 0.12} = 0.3 \text{ sec}$$
36
Response vs. Arrival Rate
[Figure: response time in seconds (y axis, 0 to 4.5) plotted against arrival rate in requests/sec (x axis, 1 to 8)]
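A curve of this kind can be approximated with the sketch below. It assumes that the response time is the sum of the per-resource residence times, each computed as service demand divided by one minus utilization; the transcript does not include a slide stating this, so treat it as an assumption of the example. The names `SERVICE_DEMANDS`, `response_time`, and `max_rate` are illustrative.

```python
# Service demands in seconds, as given on the slides.
SERVICE_DEMANDS = {
    "incoming link": 0.00107,
    "outgoing link": 0.109,
    "cpu": 0.003,
    "disk 1": 0.08,
    "disk 2": 0.12,
}


def response_time(arrival_rate, demands=SERVICE_DEMANDS):
    """Sum of per-resource residence times D / (1 - arrival_rate * D)."""
    total = 0.0
    for demand in demands.values():
        utilization = arrival_rate * demand
        if utilization >= 1.0:
            return float("inf")  # a resource is saturated
        total += demand / (1.0 - utilization)
    return total


for rate in range(1, 9):
    print(f"{rate} req/sec -> {response_time(rate):.3f} sec")

# The resource with the largest service demand saturates first and
# bounds the arrival rate the site can sustain.
max_rate = 1.0 / max(SERVICE_DEMANDS.values())
```

Under these assumptions the response time is about 0.68 sec at 5 requests/sec and climbs steeply as the arrival rate approaches the saturation point of disk 2, the resource with the largest service demand.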