LSI Seminar on Marina Zapater's PhD Thesis
Proactive and Reactive Thermal Aware Optimization Techniques to Minimize the Environmental Impact of Data Centers
Marina Zapater Sancho
Laboratorio de Sistemas Integrados (LSI)
Universidad Politécnica de Madrid
Outline
● About me
● Motivation
● Focus of this PhD Thesis
● Multi-level approach
  ➢ Server level
  ➢ Data Center level
  ➢ Application framework
● Conclusions
ENERGY OPTIMIZATION of DATA CENTERS at LSI
About me
● Telecommunication Engineering degree (Ingeniería de Telecomunicación), 2010. Electronic Engineering degree (Ingeniería Electrónica), 2010. Universitat Politècnica de Catalunya (Barcelona)
● PICATA Pre-doctoral Fellowship, CEI Campus Moncloa
  ➢ Research in collaboration with: ArTeCs Group, Facultad de Informática, UCM
● Research Stay at Performance and Energy-Aware Computing Lab.
  ➢ Boston University (BU)
  ➢ In collaboration with Oracle, Inc.
[Timeline, 2009 to 2014: PFC@LSI, PICATA, Research Stay @BU]
The energy challenge
● Energy consumption of Data Centers
  ➢ 1.3% of worldwide energy production in 2010
  ➢ USA: 2.0% production in 2011 = 1.5 × NYC
  ➢ 1 data center = 25 000 houses
  ➢ 12 GW in 2007, 24 GW in 2011, 43 GW in 2013 worldwide
  ➢ By 2015, total worldwide electricity use of 400 GWh/year
● More than 43 Million Tons of CO2 emissions per year (2% worldwide)
● More water consumption than many industries (paper, automotive, petrol, wood, or plastic)
MOTIVATION
The energy challenge
● From 30% to 50% of energy costs devoted to cooling:
  ➢ Air conditioning units
  ➢ Server fans
● PUE metric
  ➢ Average PUE = 1.92
  ➢ SoA PUE = 1.3
● CeSViMa Data Center @UPM:
  ➢ Cooling costs/year: 360k€
  ➢ IT costs/year: 240k€
[Figure: CeSViMa IT power consumption]
MOTIVATION
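As a rough worked example of the CeSViMa figures, the sketch below computes the cooling share of the yearly bill and the PUE that would follow from it. It assumes, hypothetically, that cost is proportional to energy and that cooling is the only non-IT overhead; neither assumption comes from the slides, and the function names are illustrative.

```python
def cooling_share(cooling_cost: float, it_cost: float) -> float:
    """Fraction of the total (cooling + IT) cost that goes to cooling."""
    return cooling_cost / (cooling_cost + it_cost)


def implied_pue(cooling_cost: float, it_cost: float) -> float:
    """PUE = total facility energy / IT energy.

    Assumption (ours): cost is proportional to energy and cooling is the
    only non-IT overhead, so this is an estimate, not a measured PUE.
    """
    return (cooling_cost + it_cost) / it_cost


# CeSViMa yearly figures quoted on the slide, in k€.
CESVIMA_COOLING = 360.0
CESVIMA_IT = 240.0

print(f"cooling share: {cooling_share(CESVIMA_COOLING, CESVIMA_IT):.0%}")
print(f"implied PUE:   {implied_pue(CESVIMA_COOLING, CESVIMA_IT):.2f}")
```

Under these assumptions the cooling share is 60% and the implied PUE is 2.5, which is why cooling is treated as a first-order optimization target.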
Future trends
● Cloud Computing
● The March towards the Internet of Everything
  ➢ e-Health, Smart-everything (cities, cars, offices...)
● Huge increase of computational needs
  ➢ ...Data Centers
[Figures: Global Data Center traffic growth (Cisco); Global M2M Communication Growth]
MOTIVATION
State-of-the-Art
● Industry focused on PUE
  ➢ Metric shifting to Performance Per Watt
  ➢ Costly CFD simulations of the Data Center
● Academia
  ➢ Problem faced from multiple perspectives
  ➢ Lack of a holistic approach
  ➢ Lack of scalable models
  ➢ No joint cooling + computing approaches
MOTIVATION
Our perspective
Proactive and reactive holistic approach:
● Using knowledge of the energy demand of applications and of the features of the computational and cooling resources to apply proactive optimization techniques
● Global strategy to integrate multiple information sources and coordinate decisions to reduce overall power consumption
● Energy optimization beyond PUE
MOTIVATION
Global Framework
[Diagram blocks: Datacenter, Model, Optimization]
We derive accurate and flexible models of the Data Center to be able to predict power and energy consumption
We use the models and the knowledge of computing and cooling resources to jointly optimize cooling and computational costs
We propose actuations to reduce the energy consumption
[Diagram: Data Center Energy Optimization. Blocks: Datacenter; Workload Model; Sensors; Actuators; Sensor configuration; Visualization; Power Model; Energy Model; Thermal Model; Dynamic Cooling Opt.; Resource Alloc. Opt.; Global DVFS; VM Opt.; Anomaly Detection and Reputation Systems; Communication network; Sensor network; Application framework]
FOCUS OF THE PhD THESIS
Objectives
● Develop models and propose optimizations to minimize energy.
● Leveraging heterogeneity and application-awareness
● Multi-level orthogonal optimizations
  ➢ Server
  ➢ Data Center
  ➢ Application framework → emphasis on e-Health
[Diagram: Models and Optimization at each level: Server, Data Center, Nodes (Application Framework)]
FOCUS OF THE PhD THESIS
Server modeling and optimization
● Splitting contributors to power:
  ➢ Dynamic power → workload
  ➢ Static power → leakage (exp(T))
  ➢ Fan power → (RPM)³
Goal 1: Exploiting the leakage-cooling tradeoffs at the server level
Goal 2: Energy-efficient workload allocation policy
● Joint workload and cooling management policy to minimize energy consumption at the server level
  ➢ CPU affinity
SERVER LEVEL
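The leakage-cooling tradeoff can be illustrated with a toy model: spinning the fans faster lowers the CPU temperature and therefore the exponential leakage term, but costs fan power that grows with the cube of the speed. The sketch below searches for the fan speed that minimizes the sum. Every coefficient, and the 1/RPM temperature model, is a hypothetical placeholder chosen only to make the tradeoff visible; none of them are measurements from the thesis.

```python
import math

# Hypothetical coefficients (not from the slides).
T_INLET = 25.0      # inlet air temperature (ºC)
P_DYNAMIC = 200.0   # dynamic power set by the workload (W)
K_THERMAL = 300.0   # toy thermal constant: temperature rise = K * P / RPM
LEAK_A = 5.0        # leakage scale (W)
LEAK_B = 0.03       # leakage sensitivity to temperature (1/ºC)
FAN_K = 1.5e-10     # fan power coefficient (W / RPM^3)


def cpu_temperature(rpm: float) -> float:
    """Toy steady-state temperature: faster fans give a cooler CPU."""
    return T_INLET + K_THERMAL * P_DYNAMIC / rpm


def leakage_power(temp_c: float) -> float:
    """Static power grows exponentially with temperature."""
    return LEAK_A * math.exp(LEAK_B * temp_c)


def fan_power(rpm: float) -> float:
    """Fan power grows with the cube of the fan speed."""
    return FAN_K * rpm ** 3


def total_power(rpm: float) -> float:
    return P_DYNAMIC + leakage_power(cpu_temperature(rpm)) + fan_power(rpm)


def best_fan_speed(rpm_min: int = 1800, rpm_max: int = 6000, step: int = 100) -> int:
    """Grid search for the fan speed with the lowest total power."""
    return min(range(rpm_min, rpm_max + 1, step), key=total_power)


best = best_fan_speed()
print(f"best fan speed: {best} RPM, total power: {total_power(best):.1f} W")
```

With these placeholder values the minimum falls strictly between the slowest and fastest fan settings, which is the situation a leakage-aware cooling policy is meant to exploit.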
Server modeling (I)
● Experimental set-up:
  ➢ SPARC T3 server
    ■ 32 cores, 256 hw threads
    ■ 128GB RAM
    ■ Monitoring via IPMI (SP)
  ➢ Control over cooling subsystem
● Workloads:
  ➢ Training: synthetic workloads (LoadGen, RandMem)
  ➢ Test set: SPEC Power, SPEC CPU 2006, PARSEC
SERVER LEVEL
Server modeling (II)
● Modeling contributors to power consumption:
  ➢ Leakage power
  ➢ CPU steady-state temperature
  ➢ Memory dynamic power (via performance counters)
  ➢ CPU dynamic power (via perf. counters, WIP)
[Figure: CPU thermal dynamics (training)]
[Figure: CPU Steady-State Temperature (RMSE < 2.1ºC)]
[Figure: CPU Leakage Power modeling (RMSE < 0.5W)]
[Figure legend: Sensor measurements, Models]
SERVER LEVEL
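To make the leakage modeling step concrete, here is a minimal sketch of fitting an exponential leakage model to (temperature, power) samples and scoring it with RMSE. The functional form P = a·exp(b·T) is an assumption based on the "leakage (exp(T))" note in the slides; the thesis may use a different form, and the sample data below is synthetic.

```python
import math


def fit_exponential(temps_c, powers_w):
    """Least-squares fit of ln(P) = ln(a) + b*T; returns (a, b)."""
    n = len(temps_c)
    logs = [math.log(p) for p in powers_w]
    mean_t = sum(temps_c) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in temps_c)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(temps_c, logs))
    b = sxy / sxx
    a = math.exp(mean_l - b * mean_t)
    return a, b


def rmse(predicted, observed):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


# Synthetic samples generated from known (hypothetical) parameters.
temps = [40.0, 50.0, 60.0, 70.0, 80.0]
powers = [4.0 * math.exp(0.02 * t) for t in temps]

a, b = fit_exponential(temps, powers)
predicted = [a * math.exp(b * t) for t in temps]
print(f"a = {a:.3f} W, b = {b:.4f} 1/ºC, RMSE = {rmse(predicted, powers):.2e} W")
```

Fitting in log space keeps the example dependency-free; with noisy sensor data a non-linear fit on the original scale may be preferable.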
Optimization
● Optimum cooling-management to improve energy efficiency
  ➢ Proactive fan control policy
  ➢ Tested with statistically different workloads (random power, Poisson arrival times)
  ➢ Up to 9% savings compared to server default policy
  ➢ Up to 6% savings compared to other SoA policies
SERVER LEVEL
Optimization
● Energy-efficient workload allocation policy
  ➢ Comparing allocations: energy, power, EDP, temperature
  ➢ Guided by application parameters: performance counters (Mem accesses, L1 misses, IPC…)
  ➢ Up to 13% energy savings when combining optimum allocation and cooling
SERVER LEVEL
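Comparing allocations by energy, power and EDP (energy-delay product) can lead to different winners, which is why the metric matters. The sketch below ranks three made-up allocations under each metric; the allocation names and all numbers are hypothetical and serve only to show how the metrics are computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllocationRun:
    """One measured run of a workload under a given allocation."""
    name: str
    avg_power_w: float
    exec_time_s: float

    @property
    def energy_j(self) -> float:
        # Energy = average power x execution time.
        return self.avg_power_w * self.exec_time_s

    @property
    def edp(self) -> float:
        # Energy-delay product = energy x execution time.
        return self.energy_j * self.exec_time_s


def best_by(runs, metric: str) -> AllocationRun:
    """Return the run with the lowest value of the given metric."""
    return min(runs, key=lambda run: getattr(run, metric))


# Hypothetical measurements, for illustration only.
RUNS = [
    AllocationRun("packed", avg_power_w=300.0, exec_time_s=110.0),
    AllocationRun("spread", avg_power_w=380.0, exec_time_s=90.0),
    AllocationRun("low-power", avg_power_w=250.0, exec_time_s=150.0),
]

for metric in ("avg_power_w", "energy_j", "edp"):
    print(metric, "->", best_by(RUNS, metric).name)
```

In this made-up data each metric picks a different allocation: the lowest-power run is the slowest, while EDP favors the fastest run despite its higher power draw.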
Work in progress
● Proactive workload allocation policy:
  ➢ So far we have used “qualitative” knowledge about workload behavior.
  ➢ Working on contention-aware models to develop co-assignment policies
  ➢ Predict how we should combine several workloads in the same server to minimize energy.
  ➢ Proactive joint workload and cooling management.
SERVER LEVEL
M. Zapater, O. Tuncer, J. L. Ayala, J. M. Moya, K. Vaidyanathan, K. Gross, and A. K. Coskun, “Leakage-aware cooling management for Improving Server Energy Efficiency,” submitted to TPDS (JCR Q1), under review. In collaboration with Oracle, BU, UCM
M. Zapater, J. L. Ayala, J. M. Moya, K. Vaidyanathan, K. Gross, and A. K. Coskun, “Leakage and temperature aware server control for improving energy efficiency in data centers,” in DATE 2013. In collaboration with Oracle, BU, UCM
DC Modeling and optimization
Goal: Energy-efficient assignment of computational and cooling resources of the DC to execute a workload
[Diagram: SLURM Resource Manager]
DATA CENTER LEVEL
Data Center Room modeling
● The maximum CPU temperature limits the minimum cooling of the Data Room.
  ➢ Development of fast, accurate and flexible models to predict:
    ■ Server Inlet temperature
    ■ CPU temperature
  ➢ Literature uses CFD simulation → Complex non-linear models...
  ➢ Classical regression techniques no longer valid…
● Usage of a WSN to gather environmental parameters
● Usage of Genetic Programming techniques
DATA CENTER LEVEL
Data Center Room modeling
● Genetic programming techniques:
  ➢ Find the best model to predict a time series given a set of variables and a fitness function.
  ➢ Each model is an individual with a genotype and a phenotype
  ➢ Fitness function is RMSE
  ➢ Models evolve → individuals with best fitness survive
● 1 minute ahead CPU temperature prediction:
  TS(k+1) = TS(k-6) - PS(k-8) + 6.3 + PS(k-6) - PS(k-25)/49.4
[Figure: CPU Temperature prediction in Intel Xeon server (RMSE = 2.1ºC)]
DATA CENTER LEVEL
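The sketch below illustrates only the fitness-evaluation and selection step of such a genetic programming loop: each candidate model predicts TS(k+1) from lagged samples, is scored by RMSE, and the best-scoring candidate survives. Crossover and mutation are omitted. It assumes TS and PS denote the server temperature and power time series (the slides do not define them); the candidate models and the synthetic trace are hypothetical and are not the evolved model shown on the slide.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


def fitness(model, ts, ps, max_lag):
    """RMSE of the one-step-ahead predictions of TS(k+1)."""
    ks = range(max_lag, len(ts) - 1)
    predicted = [model(ts, ps, k) for k in ks]
    observed = [ts[k + 1] for k in ks]
    return rmse(predicted, observed)


# Hypothetical individuals: each maps (ts, ps, k) to a prediction of ts[k+1].
CANDIDATES = {
    "persistence": lambda ts, ps, k: ts[k],
    "lagged_temp": lambda ts, ps, k: ts[k - 6],
    "temp_plus_power_trend": lambda ts, ps, k: ts[k] + (ps[k] - ps[k - 1]) / 50.0,
}


def select_best(candidates, ts, ps, max_lag=6):
    """Score every candidate and return the name of the fittest one."""
    scores = {
        name: fitness(model, ts, ps, max_lag) for name, model in candidates.items()
    }
    return min(scores, key=scores.get), scores


def synthetic_trace(n=200):
    """Synthetic power and temperature series (illustration only)."""
    ps = [100.0 + 20.0 * math.sin(k / 5.0) for k in range(n)]
    ts = [50.0]
    for k in range(n - 1):
        delta = (ps[k] - ps[k - 1]) / 50.0 if k >= 1 else 0.0
        ts.append(ts[k] + delta)
    return ts, ps


ts, ps = synthetic_trace()
best, scores = select_best(CANDIDATES, ts, ps)
print("fittest candidate:", best)
```

Because the synthetic trace is generated by the same rule as the third candidate, that candidate wins here; on real sensor data the evolved expressions and their errors would differ.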
Data Center Room modeling
● Work-in-Progress:
  ➢ Extending CPU temperature prediction to CeSViMa servers → Power7 architecture, blade center
    ■ 245 eServer BladeCenter PS702 servers, each with 2 CPU x 8 cores @3.3 GHz
  ➢ Running (currently evolving) models for inlet temperature at LSI servers.
  ➢ To be extended to CeSViMa
DATA CENTER LEVEL
Optimizing IT allocation (I)
● Heterogeneity-aware and application-aware resource management
  ➢ Energy profiling of tasks of the SPEC CPU 2006 benchmark in 3 servers
  ➢ Static optimization: finding the best data center setup, given a number of heterogeneous servers
  ➢ Dynamic: run-time allocation using the resource manager
● MILP algorithms to allocate tasks to servers:
  ➢ Minimize total IT energy
DATA CENTER LEVEL
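The allocation problem can be pictured as choosing one server per task so that total IT energy is minimal without exceeding each server's capacity. The sketch below solves a tiny instance by exhaustive search rather than with a MILP solver, so it is a stand-in for the technique named on the slide, not the thesis formulation. The energy profile, server names and capacities are hypothetical.

```python
from itertools import product

# Hypothetical per-task energy (J) on each heterogeneous server.
ENERGY_J = {
    "taskA": {"serverX": 900.0, "serverY": 700.0, "serverZ": 1200.0},
    "taskB": {"serverX": 400.0, "serverY": 650.0, "serverZ": 500.0},
    "taskC": {"serverX": 800.0, "serverY": 600.0, "serverZ": 550.0},
    "taskD": {"serverX": 300.0, "serverY": 250.0, "serverZ": 450.0},
}

# Hypothetical capacity: maximum number of tasks per server.
CAPACITY = {"serverX": 2, "serverY": 1, "serverZ": 2}


def allocate(energy, capacity):
    """Exhaustively search task-to-server assignments for minimum energy."""
    tasks = list(energy)
    servers = list(capacity)
    best_cost, best_assignment = float("inf"), None
    for choice in product(servers, repeat=len(tasks)):
        # Skip assignments that overload any server.
        if any(choice.count(server) > capacity[server] for server in servers):
            continue
        cost = sum(energy[task][server] for task, server in zip(tasks, choice))
        if cost < best_cost:
            best_cost, best_assignment = cost, dict(zip(tasks, choice))
    return best_assignment, best_cost


assignment, cost = allocate(ENERGY_J, CAPACITY)
print(assignment, cost)
```

Exhaustive search grows exponentially with the number of tasks, which is one reason a MILP formulation with a dedicated solver is the practical choice at data center scale.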
Optimizing IT allocation (II)
● Implemented in SLURM resource manager:
  ➢ BSC SLURM Simulator
  ➢ Random arrival distribution (light, medium, heavy load)
  ➢ Simulating around 1,000 cores
● Results show that the best solution is achieved with a heterogeneous data center:
  ➢ 5% to 22% energy savings for the static solution
  ➢ 7.5% to 24% energy savings (depending on the scenario) for the dynamic solution when compared to SLURM round-robin allocation
DATA CENTER LEVEL
M. Zapater, J. L. Ayala, and J. M. Moya, “Leveraging heterogeneity for energy minimization in data centers,” in CCGRID 2012. CORE A, in collaboration with UCM
Cooling & IT optimization
● Cooling reduction of 15% in LSI server room (Aug’13)
  ➢ Leakage and temperature-aware control
● Work in progress:
  ➢ Using the data room modeling at CeSViMa and LSI rooms, development of joint cooling and IT optimizations
    ■ MILP
    ■ GA-based
DATA CENTER LEVEL
e-Health scenarios
● Next-generation applications have higher computational demands to analyze data.
  ➢ We propose the usage of other elements in the application framework (i.e. personal servers) to offload computation from the data center.
APPLICATION FRAMEWORK
Off-loading workload
● Tasks that do not have high computational demands can be executed in intermediate nodes:
  ➢ Not all computation is performed in the Data Center
  ➢ Clustering tasks according to IPC and memory boundedness
  ➢ Each node decides whether to:
    a) execute a task, or
    b) forward it to the data center
APPLICATION FRAMEWORK
Off-loading workload
● Usage of SMT Solvers (Satisfiability Modulo Theories)
  ➢ SMT solvers determine whether a certain condition can be satisfied
  ➢ Each node runs an SMT solver: if a task satisfies certain parameters, it is executed in the node.
    ■ Lower EDP in the node than in the DC
    ■ Minimum QoS (constrains max. execution time)
    ■ Maximum amount of battery used
● Tested with Yices SMT Solver
● Different node capabilities depending on scenario:
  ➢ Hardware equivalent to a Samsung Galaxy SII Smartphone (ARM Cortex-A9, 1GB RAM)
  ➢ MIPS32 @500MHz, 256MB RAM
  ➢ Dual-core AMD PC @2GHz, 1GB RAM
APPLICATION FRAMEWORK
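The three conditions a node checks before executing a task locally can be written as a single predicate. The sketch below evaluates them directly in plain Python instead of encoding them for an SMT solver such as Yices, so it only illustrates the decision rule, not the solver-based implementation. All field names and the example numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskEstimate:
    """Estimated cost of one task on the node and in the data center."""
    node_energy_j: float
    node_time_s: float
    dc_energy_j: float
    dc_time_s: float  # assumed to include the time to forward the task


@dataclass(frozen=True)
class NodeLimits:
    max_time_s: float        # QoS bound on execution time
    battery_budget_j: float  # maximum battery energy the task may use


def edp(energy_j: float, time_s: float) -> float:
    """Energy-delay product."""
    return energy_j * time_s


def run_locally(task: TaskEstimate, limits: NodeLimits) -> bool:
    """True if the node should execute the task; False means forward it."""
    lower_edp = edp(task.node_energy_j, task.node_time_s) < edp(
        task.dc_energy_j, task.dc_time_s
    )
    meets_qos = task.node_time_s <= limits.max_time_s
    within_battery = task.node_energy_j <= limits.battery_budget_j
    return lower_edp and meets_qos and within_battery


limits = NodeLimits(max_time_s=2.0, battery_budget_j=10.0)
light = TaskEstimate(node_energy_j=2.0, node_time_s=1.0, dc_energy_j=5.0, dc_time_s=1.5)
heavy = TaskEstimate(node_energy_j=80.0, node_time_s=20.0, dc_energy_j=30.0, dc_time_s=3.0)

print("light task runs locally:", run_locally(light, limits))
print("heavy task runs locally:", run_locally(heavy, limits))
```

An SMT encoding becomes more useful than a plain predicate when the constraints involve unknowns to be solved for, rather than fixed estimates to be checked.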
Off-loading workload
● Depending on the number of nodes executing the workload and on the workload itself (light, medium, heavy), different benefits are achieved:
  ➢ 10% to 24% energy savings
  ➢ Up to 16% performance increase
M. Zapater, C. Sánchez, J. L. Ayala, J. M. Moya, and J. L. Risco-Martín, “Ubiquitous green computing techniques for high demand applications in smart environments,” Sensors, 2012. JCR Q1, in collaboration with IMDEA Software, UCM
M. Zapater, P. Arroba, J. L. Ayala, J. M. Moya, and K. Olcoz, “A novel energy-driven computing paradigm for e-Health scenarios”, Future Generation Computer Systems, 2014. JCR Q1, in collaboration with UCM
APPLICATION FRAMEWORK
The energy challenge
● Unsustainable energy costs of Data Centers
● Proposal of multi-layer holistic approaches to the energy issue → energy as a first-class requirement
● By combining the proposed approaches (server, data center and application level), we can reach high energy savings
CONCLUSIONS
Related Research
[Diagram: Data Center Energy Optimization framework. Blocks: Datacenter; Workload Model; Sensors; Actuators; Sensor configuration; Visualization; Power Model; Energy Model; Thermal Model; Dynamic Cooling Opt.; Resource Alloc. Opt.; Global DVFS; VM Opt.; Anomaly Detection and Reputation Systems; Communication network; Sensor network; Workload; Application framework. Annotated with team members: Juan Carlos Salinas, Marina Zapater, Patricia Arroba, Pedro Malagón, David Fraga, Josué Pagán, Juan-Mariano de Goyeneche]
CONCLUSIONS
Related Research
Most relevant publications
M. Zapater, P. Arroba, J. L. Ayala, J. M. Moya, and K. Olcoz, “A novel energy-driven computing paradigm for e-Health scenarios”, Future Generation Computer Systems, 2014. JCR Q1, in collaboration with UCM
J. Pagán, M. Zapater, Ó. Cubo, P. Arroba, V. Martín, and J. M. Moya, “A Cyber-Physical approach to combined HW-SW monitoring for improving energy efficiency in data centers,” in DCIS 2013. in collaboration with CeSViMa
M. Zapater, J. L. Ayala, J. M. Moya, K. Vaidyanathan, K. Gross, and A. K. Coskun, “Leakage and temperature aware server control for improving energy efficiency in data centers,” in DATE 2013. in collaboration with Oracle, BU, UCM
P. Arroba, M. Zapater, J. L. Ayala, J. M. Moya, K. Olcoz, and R. Hermida, “On the Leakage-Power modeling for optimal server operation,” in IWIA, 2014. in collaboration with UCM
M. Zapater, C. Sánchez, J. L. Ayala, J. M. Moya, and J. L. Risco-Martín, “Ubiquitous green computing techniques for high demand applications in smart environments,” Sensors, 2012. JCR Q1, in collaboration with IMDEA Software, UCM
M. Zapater, J. L. Ayala, and J. M. Moya, “GreenDisc: a HW/SW energy optimization framework in globally distributed computation,” LNCS, 2012, in collaboration with UCM
M. Zapater, J. L. Ayala, and J. M. Moya, “Leveraging heterogeneity for energy minimization in data centers,” in CCGRID 2012. CORE A, in collaboration with UCM
Know-How and skills
● Methodologies to develop models
  ➢ Data sets, tests to perform, etc.
  ➢ Extracting useful information from large data sets
● Metaheuristics
  ➢ Genetic programming
● Benchmarks
  ➢ CPU and memory intensive, disk, etc.
● Collecting data from servers:
  ➢ Sensors, performance counters
CONCLUSIONS
Questions
Marina Zapater
[email protected]
(+34) 91 549 57 00 (x-4227)
ETSI Telecomunicación, B105
Avenida Complutense, 30
Madrid, 28040 (Spain)
BACKUP SLIDES
Motivation
Energy Efficiency at Google
WSN deployment
Genetic programming
● Off-spring generation