Self learning cloud controllers
![Page 1: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/1.jpg)
Self-Learning Cloud Controllers
Pooyan Jamshidi, Amir Sharifloo, Claus Pahl, Andreas Metzger, Giovani Estrada
IC4, Dublin City University, Ireland; University of Duisburg-Essen, Germany; Intel, Ireland
Invited presentation at UFC, Brazil, Fortaleza
![Page 2: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/2.jpg)
~50% = wasted hardware
Actual traffic
Typical weekly traffic to Web-based applications (e.g., Amazon.com)
![Page 3: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/3.jpg)
Problem 1: ~75% wasted capacity
Actual demand
Problem 2: customers lost
Traffic during an unexpected burst in requests (e.g., end-of-year traffic to Amazon.com)
![Page 4: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/4.jpg)
Really like this??
Auto-scaling enables you to realize this ideal on-demand provisioning
Time
Demand
? Enacting changes to cloud resources is not real-time
![Page 5: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/5.jpg)
Capacity we can provision with Auto-Scaling
A realistic figure of dynamic provisioning
![Page 6: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/6.jpg)
![Page 7: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/7.jpg)
![Page 8: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/8.jpg)
![Page 9: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/9.jpg)
These quantitative values must be determined by the user:
⇒ requires deep knowledge of the application (CPU, memory, thresholds)
⇒ requires performance modeling expertise (when and how to scale)
⇒ requires a unified opinion among users
Examples: Amazon Auto Scaling, Microsoft Azure Watch, Microsoft Azure Auto-scaling Application Block
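To make the point about user-supplied numbers concrete, here is a minimal sketch of a threshold-based scaling rule. It is not the API of Amazon Auto Scaling or of any Azure tool; the class name, fields, and default thresholds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ThresholdRule:
    """A hand-written scaling rule; every number here must be picked by the user."""
    scale_out_cpu: float = 70.0  # hypothetical: add an instance above this average CPU (%)
    scale_in_cpu: float = 30.0   # hypothetical: remove an instance below this average CPU (%)
    min_instances: int = 1
    max_instances: int = 10

    def decide(self, avg_cpu: float, instances: int) -> int:
        """Return the instance count to use for the next control interval."""
        if avg_cpu > self.scale_out_cpu:
            return min(instances + 1, self.max_instances)
        if avg_cpu < self.scale_in_cpu:
            return max(instances - 1, self.min_instances)
        return instances


rule = ThresholdRule()
print(rule.decide(avg_cpu=85.0, instances=3))  # above the scale-out threshold
print(rule.decide(avg_cpu=10.0, instances=1))  # already at the minimum
```

Each of the four constants encodes a judgment about the application, which is why such rules depend on the expertise listed above.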
![Page 10: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/10.jpg)
![Page 11: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/11.jpg)
Naeem Esfahani and Sam Malek, “Uncertainty in Self-Adaptive Software Systems”
Pooyan Jamshidi, Aakash Ahmad, Claus Pahl, Muhammad Ali Babar, “Sources of Uncertainty in Dynamic Management of Elastic Systems”, under review
![Page 12: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/12.jpg)
Uncertainty related to enactment latency: the same scaling action (adding or removing a VM of precisely the same size) took a different amount of time to be enacted on the cloud platform (here, Microsoft Azure) at different points in time, and the difference was significant (up to a couple of minutes). The enactment latency would also differ across cloud platforms.
![Page 13: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/13.jpg)
• Offline benchmarking
• Trial-and-error
• Expert knowledge
Costly and not systematic
A. Gandhi, P. Dube, A. Karve, A. Kochut, L. Zhang, “Adaptive, Model-driven Autoscaling for Cloud Applications”, ICAC’14
[Figure: 95% response time (ms) vs. arrival rate (req/s); marked values: 400 ms and 60 req/s]
![Page 14: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/14.jpg)
RobusT2Scale
[Diagram labels: users; initial setting + elasticity rules + response-time SLA; environment monitoring; application monitoring; prediction/smoothing; fuzzy reasoning; scaling actions]
![Page 15: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/15.jpg)
![Page 16: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/16.jpg)
[Figure: possibility vs. performance index, showing a region of definite satisfaction, a region of definite dissatisfaction, and a region of uncertain satisfaction]
Words can mean different things to different people.
Different users often recommend different elasticity policies.
[Figure: possibility vs. performance index, comparing a type-2 MF with a type-1 MF]
![Page 17: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/17.jpg)
![Page 18: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/18.jpg)
Workload
Response time
[Figure: two panels of membership grade u (0 to 1) vs. x2 (0 to 100); labels: UMF, LMF, embedded, FOU, mean, sd]
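The labels UMF, LMF, and FOU refer to the upper membership function, lower membership function, and footprint of uncertainty of an interval type-2 fuzzy set. A minimal sketch follows, assuming a Gaussian shape with a fixed mean and an uncertain standard deviation; the slides do not state the shape, and the function names and numbers below are hypothetical.

```python
import math


def gauss(x: float, mean: float, sd: float) -> float:
    """Type-1 Gaussian membership grade of x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2)


def it2_membership(x: float, mean: float, sd_low: float, sd_high: float):
    """Return (lower, upper) membership grades of x.

    With a fixed mean, the narrower Gaussian (sd_low) is the LMF and the
    wider one (sd_high) is the UMF; the gap between them at x is the FOU.
    """
    return gauss(x, mean, sd_low), gauss(x, mean, sd_high)


# Hypothetical set centred at 50 on a 0 to 100 input scale.
lower, upper = it2_membership(x=40.0, mean=50.0, sd_low=8.0, sd_high=12.0)
print(f"membership interval at x=40: [{lower:.4f}, {upper:.4f}]")
```

A type-1 set would return a single grade for x; the interval is what lets the controller carry disagreement between users into the inference step.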
![Page 19: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/19.jpg)
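| Rule ($l$) | Workload | Response time | Normal (-2) | Effort (-1) | Medium Effort (0) | High Effort (+1) | Maximum Effort (+2) | $c_{avg}^{l}$ |
|---|---|---|---|---|---|---|---|---|
| 1 | Very low | Instantaneous | 7 | 2 | 1 | 0 | 0 | -1.6 |
| 2 | Very low | Fast | 5 | 4 | 1 | 0 | 0 | -1.4 |
| 3 | Very low | Medium | 0 | 2 | 6 | 2 | 0 | 0 |
| 4 | Very low | Slow | 0 | 0 | 4 | 6 | 0 | 0.6 |
| 5 | Very low | Very slow | 0 | 0 | 0 | 6 | 4 | 1.4 |
| 6 | Low | Instantaneous | 5 | 3 | 2 | 0 | 0 | -1.3 |
| 7 | Low | Fast | 2 | 7 | 1 | 0 | 0 | -1.1 |
| 8 | Low | Medium | 0 | 1 | 5 | 3 | 1 | 0.4 |
| 9 | Low | Slow | 0 | 0 | 1 | 8 | 1 | 1 |
| 10 | Low | Very slow | 0 | 0 | 0 | 4 | 6 | 1.6 |
| 11 | Medium | Instantaneous | 6 | 4 | 0 | 0 | 0 | -1.6 |
| 12 | Medium | Fast | 2 | 5 | 3 | 0 | 0 | -0.9 |
| 13 | Medium | Medium | 0 | 0 | 5 | 4 | 1 | 0.6 |
| 14 | Medium | Slow | 0 | 0 | 1 | 7 | 2 | 1.1 |
| 15 | Medium | Very slow | 0 | 0 | 1 | 3 | 6 | 1.5 |
| 16 | High | Instantaneous | 8 | 2 | 0 | 0 | 0 | -1.8 |
| 17 | High | Fast | 4 | 6 | 0 | 0 | 0 | -1.4 |
| 18 | High | Medium | 0 | 1 | 5 | 3 | 1 | 0.4 |
| 19 | High | Slow | 0 | 0 | 1 | 7 | 2 | 1.1 |
| 20 | High | Very slow | 0 | 0 | 0 | 6 | 4 | 1.4 |
| 21 | Very high | Instantaneous | 9 | 1 | 0 | 0 | 0 | -1.9 |
| 22 | Very high | Fast | 3 | 6 | 1 | 0 | 0 | -1.2 |
| 23 | Very high | Medium | 0 | 1 | 4 | 4 | 1 | 0.5 |
| 24 | Very high | Slow | 0 | 0 | 1 | 8 | 1 | 1 |
| 25 | Very high | Very slow | 0 | 0 | 0 | 4 | 6 | 1.6 |

Workload and response time are the antecedents; the five effort columns and $c_{avg}^{l}$ form the consequent.

Example, rule 12 (10 experts’ responses): workload Medium, response time Fast; responses 2, 5, 3, 0, 0 for -2, -1, 0, +1, +2; $c_{avg}^{12} = -0.9$.

$R^{l}$: IF (the workload ($x_1$) is $F_{i_1}$, AND the response-time ($x_2$) is $G_{i_2}$), THEN (add/remove $c_{avg}^{l}$ instances).

$$c_{avg}^{l} = \frac{\sum_{u=1}^{N_l} w_u^{l} \times C}{\sum_{u=1}^{N_l} w_u^{l}}$$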
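The consequent of each rule can be reproduced from the experts’ responses. The sketch below assumes that $w_u^l$ is the number of experts who chose option $u$ for rule $l$ and that $C$ is the numeric label of that option (-2 to +2); this reading matches every row of the rule table, but the variable and function names are hypothetical.

```python
# Numeric labels of the five consequent options, from Normal (-2) to Maximum Effort (+2).
LABEL_VALUES = [-2, -1, 0, 1, 2]


def c_avg(votes, values=LABEL_VALUES):
    """Weighted average of the label values, weighted by the number of expert votes."""
    total = sum(votes)
    if total == 0:
        raise ValueError("at least one expert response is required")
    return sum(w * c for w, c in zip(votes, values)) / total


rule_12_votes = [2, 5, 3, 0, 0]  # workload Medium, response time Fast
print(c_avg(rule_12_votes))      # (2*-2 + 5*-1 + 3*0) / 10 = -0.9
```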
Goal: precompute the costly calculations so that elasticity reasoning based on fuzzy inference is efficient at runtime
![Page 20: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/20.jpg)
Liang, Q., Mendel, J. M. (2000). Interval type-2 fuzzy logic systems: theory and design. IEEE Transactions on Fuzzy Systems, 8(5), 535-550.
Scaling Actions
Monitoring Data
![Page 21: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/21.jpg)
Monitoring data: workload, response time
[Figure: four membership-function panels, membership grade u (0 to 1) vs. x2 (0 to 100); marked values: 0.5954, 0.3797, 0.2212, 0.0000; label: M]
![Page 22: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/22.jpg)
[Figure: two membership-function panels (0 to 1 vs. 0 to 100); marked values: 0.5954, 0.3797, 0.9568, 0.9377]
![Page 23: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/23.jpg)
![Page 24: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/24.jpg)
Performance index
$y_l$, $y_r$
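In interval type-2 fuzzy logic systems such as those described by Liang and Mendel, the type-reduced output is an interval $[y_l, y_r]$, and a common crisp output is its midpoint. The sketch below computes both endpoints by enumerating every switch point, a simple exact alternative to the iterative Karnik-Mendel procedure; whether the controller in these slides uses this exact procedure is an assumption, and the firing intervals in the example are invented.

```python
def type_reduce(consequents, firing):
    """Return (y_l, y_r, midpoint) for rule consequents and firing intervals.

    consequents: one crisp value per rule (for example the c_avg values).
    firing: one (lower, upper) firing strength pair per rule.
    """
    rules = sorted(zip(consequents, firing))  # order rules by consequent value
    c = [r[0] for r in rules]
    lo = [r[1][0] for r in rules]
    hi = [r[1][1] for r in rules]

    def weighted_avg(weights):
        den = sum(weights)
        return sum(w * ci for w, ci in zip(weights, c)) / den if den > 0 else None

    left, right = [], []
    for k in range(len(c) + 1):
        # y_l favours small consequents: upper strengths first, then lower ones.
        left.append(weighted_avg(hi[:k] + lo[k:]))
        # y_r favours large consequents: lower strengths first, then upper ones.
        right.append(weighted_avg(lo[:k] + hi[k:]))
    y_l = min(v for v in left if v is not None)
    y_r = max(v for v in right if v is not None)
    return y_l, y_r, (y_l + y_r) / 2


# Two active rules with invented firing intervals.
y_l, y_r, y = type_reduce([-0.9, 0.6], [(0.3, 0.6), (0.1, 0.4)])
print(y_l, y_r, y)
```

Enumeration costs one weighted average per switch point, which is acceptable for a 25-rule base.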
![Page 25: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/25.jpg)
![Page 26: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/26.jpg)
![Page 27: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/27.jpg)
[Figure: number of hits vs. time (seconds, 0 to 100). Series: original data; beta=0.10, gamma=0.94 (rmse=308.1565, rrse=0.79703); beta=0.27, gamma=0.94 (rmse=209.7852, rrse=0.54504); beta=0.80, gamma=0.94 (rmse=272.6285, rrse=0.70858)]
[Figure: root relative squared error for six workload patterns: big spike, dual phase, large variations, quickly varying, slowly varying, steep tri phase]
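The beta and gamma parameters and the RMSE and RRSE scores in the figure are consistent with a two-parameter exponential smoother evaluated on one-step-ahead predictions. A minimal sketch follows, assuming Holt's double exponential smoothing with beta as the level factor and gamma as the trend factor; that mapping, the function names, and the sample series are assumptions, not taken from the slides.

```python
import math


def holt_forecast(series, beta, gamma):
    """One-step-ahead forecasts for series[1:], using level and trend smoothing."""
    level, trend = series[0], series[1] - series[0]
    forecasts = [level + trend]  # forecast for series[1]
    for x in series[1:]:
        prev_level = level
        level = beta * x + (1 - beta) * (level + trend)
        trend = gamma * (level - prev_level) + (1 - gamma) * trend
        forecasts.append(level + trend)
    return forecasts[:-1]  # drop the forecast for the unseen next point


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def rrse(actual, predicted):
    """Root relative squared error: error relative to always predicting the mean."""
    mean = sum(actual) / len(actual)
    num = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    den = sum((a - mean) ** 2 for a in actual)
    return math.sqrt(num / den)


hits = [120, 135, 160, 150, 210, 400, 380, 300, 260, 240]  # invented workload trace
pred = holt_forecast(hits, beta=0.27, gamma=0.94)
print(rmse(hits[1:], pred), rrse(hits[1:], pred))
```

An RRSE below 1 means the smoother beats a constant prediction at the series mean, which is how the three parameter settings in the figure can be compared.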
![Page 28: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/28.jpg)
| SUT | Criteria | Big spike | Dual phase | Large variations | Quickly varying | Slowly varying | Steep tri phase |
|---|---|---|---|---|---|---|---|
| RobusT2Scale | Response time | 973 ms | 537 ms | 509 ms | 451 ms | 423 ms | 498 ms |
| | Resources | 3.2 | 3.8 | 5.1 | 5.3 | 3.7 | 3.9 |
| Overprovisioning | Response time | 354 ms | 411 ms | 395 ms | 446 ms | 371 ms | 491 ms |
| | Resources | 6 | 6 | 6 | 6 | 6 | 6 |
| Under provisioning | Response time | 1465 ms | 1832 ms | 1789 ms | 1594 ms | 1898 ms | 2194 ms |
| | Resources | 2 | 2 | 2 | 2 | 2 | 2 |

SLA: $rt_{95} \le 600\,ms$ for every 10 s control interval

• RobusT2Scale is superior to under-provisioning in terms of guaranteeing the SLA, and it does not require excessive resources
• RobusT2Scale is superior to over-provisioning in terms of the resources it requires, while still guaranteeing the SLA
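The SLA check itself is simple to express. The sketch below computes a nearest-rank 95th percentile over the response times of one control interval and then compares the response times reported above against the 600 ms bound, assuming those reported figures are 95th-percentile values; the function names are hypothetical.

```python
SLA_MS = 600  # SLA bound on the 95th-percentile response time, per control interval


def percentile_95(samples):
    """Nearest-rank 95th percentile of the response times in one interval."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


def meets_sla(samples, sla_ms=SLA_MS):
    """True if the interval's 95th-percentile response time is within the SLA."""
    return percentile_95(samples) <= sla_ms


# Response times (ms) per workload pattern, as reported in the comparison above.
reported = {
    "RobusT2Scale": [973, 537, 509, 451, 423, 498],
    "Overprovisioning": [354, 411, 395, 446, 371, 491],
    "Under provisioning": [1465, 1832, 1789, 1594, 1898, 2194],
}
violations = {name: [v for v in vals if v > SLA_MS] for name, vals in reported.items()}
for name, over in violations.items():
    print(name, "patterns above 600 ms:", len(over))
```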
![Page 29: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/29.jpg)
[Figure: root mean square error (0 to 0.1) for alpha=0.1, alpha=0.5, alpha=0.9, and alpha=1.0; noise level: 10%]
![Page 30: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/30.jpg)
Self-Learning Controller
![Page 31: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/31.jpg)
Current solution (my PhD thesis + IC4 work on auto-scaling)
Future plan: updating K in MAPE-K @ runtime; design-time assistance; multi-cloud
![Page 32: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/32.jpg)
[Diagram: autonomic controller. Fuzzy logic controller: fuzzifier, inference engine, defuzzifier, rule base. Knowledge learning: fuzzy Q-learning. Monitoring and actuator connect the controller to the cloud application on the cloud platform. Signals: $rt$, $w$; $w, rt, th, vm$; $sa$. Labels: system state, system goal]
![Page 33: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/33.jpg)
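[Diagram: RobusT2Scale with learned rules from FQL (.fis); monitoring and actuator on the cloud platform; ElasticBench; load generator; WCF; REST. Signals: $w, rt$; $w, rt, th, vm$; $sa$; system state. Parameters: $\gamma, \eta, \varepsilon, r$. Other labels: LW, W, C]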
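The FQL block updates one q-value per rule and candidate action. The sketch below shows a single fuzzy Q-learning step in its commonly described form, reading $\gamma$ as the discount factor, $\eta$ as the learning rate, $\varepsilon$ as the exploration rate, and $r$ as the reward; the slides list these symbols without defining them, so this reading, the action set, and all function names are assumptions.

```python
import random

ACTIONS = [-2, -1, 0, 1, 2]  # hypothetical scaling actions: instances to add or remove


def choose_actions(q, epsilon, rng):
    """Pick one action index per rule with an epsilon-greedy policy."""
    chosen = []
    for row in q:
        if rng.random() < epsilon:
            chosen.append(rng.randrange(len(row)))  # explore
        else:
            chosen.append(max(range(len(row)), key=row.__getitem__))  # exploit
    return chosen


def scaling_action(firing, chosen):
    """Blend the per-rule actions by the normalised firing strengths."""
    return sum(f * ACTIONS[a] for f, a in zip(firing, chosen))


def fql_update(q, firing, chosen, reward, firing_next, gamma, eta):
    """Apply one temporal-difference update to the q-values of the fired rules."""
    q_sa = sum(f * q[i][chosen[i]] for i, f in enumerate(firing))
    v_next = sum(f * max(q[i]) for i, f in enumerate(firing_next))
    delta = reward + gamma * v_next - q_sa
    for i, f in enumerate(firing):
        q[i][chosen[i]] += eta * delta * f
    return delta


rng = random.Random(0)
q = [[0.0] * len(ACTIONS) for _ in range(2)]  # two rules, five actions each
firing = [0.7, 0.3]                           # normalised firing strengths in state s
chosen = choose_actions(q, epsilon=0.2, rng=rng)
sa = scaling_action(firing, chosen)
fql_update(q, firing, chosen, reward=1.0, firing_next=[0.4, 0.6], gamma=0.8, eta=0.1)
```

Over many control intervals, the action with the highest q-value in each rule can be read back as that rule's learned consequent.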
![Page 34: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/34.jpg)
[Diagram: deployment across the cloud platform (PaaS) and on-premise. Labels: L: Web Role; P: Worker Role (three instances); M: Worker Role; Cache; Queue; Results: Storage; Blackboard: Storage; LB: Load Balancer; LG: Console; Auto-scaling Logic (controller); Policy Enforcer; Actuator; Monitoring. Numbered markers 1 to 12]
![Page 35: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/35.jpg)
[Figure: series S1 to S5; y-axis 0 to 1.2, x-axis 0 to 320]
![Page 36: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/36.jpg)
[Figure: four panels plotting q(9,5), q(7,5), q(9,3), and q(3,3); x-axis 0 to 350]
![Page 37: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/37.jpg)
![Page 38: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/38.jpg)
![Page 39: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/39.jpg)
Challenge 1: ~75% wasted capacity
Actual demand
Challenge 2: customers lost
![Page 40: Self learning cloud controllers](https://reader034.fdocuments.net/reader034/viewer/2022051315/55a93cde1a28aba6758b46fc/html5/thumbnails/40.jpg)
Thank you!
More details? => http://computing.dcu.ie/~pjamshidi/PDF/SEAMS2014.pdf
Slides? => http://www.slideshare.net/pooyanjamshidi/
Code? => https://github.com/pooyanjamshidi
Aakash Ahmad, Claus Pahl, Soodeh Farokhi, Amir Sharifloo, Armin Balalaie, Hamed Jamshidi, Nabor Mendonca, Brian Carroll, Reza Teimourzadegan