LSI Seminar on Marina Zapater's PhD Thesis

  • Date posted: 25-Jun-2015
  • Category: Education

Slides from Marina's talk on the orientation and current status of her PhD thesis, given on March 20, 2014.

Transcript of LSI Seminar on Marina Zapater's PhD Thesis

  • 1. Proactive and Reactive Thermal Aware Optimization Techniques to Minimize the Environmental Impact of Data Centers. Marina Zapater Sancho, Laboratorio de Sistemas Integrados (LSI), Universidad Politécnica de Madrid

  • 2. Outline: About me; Motivation; Focus of this PhD Thesis; Multi-level approach; Server level; Data Center level; Application framework; Conclusions. Deck footer: ENERGY OPTIMIZATION of DATA CENTERS at LSI

  • 3. About me: Ingeniería de Telecomunicación, 2010, and Ingeniería Electrónica, 2010, Universitat Politècnica de Catalunya (Barcelona). PICATA Pre-doctoral Fellowship, CEI Campus Moncloa. Research in collaboration with the ArTeCs Group, Facultad de Informática, UCM. Research stay at the Performance and Energy-Aware Computing Lab, Boston University (BU), in collaboration with Oracle, Inc. [Timeline, 2009 to 2014: PFC @LSI; PICATA; Research Stay @BU]

  • 4. Outline (same as slide 2).

  • 5. MOTIVATION: The energy challenge. Energy consumption of Data Centers: 1.3% of worldwide energy production in 2010. USA: 2.0% of production in 2011 = 1.5 x NYC. 1 data center = 25,000 houses. 12 GW in 2007, 24 GW in 2011, 43 GW in 2013 worldwide. By 2015, total worldwide electricity use of 400 GWh/year. More than 43 million tons of CO2 emissions per year (2% worldwide). More water consumption than many industries (paper, automotive, petrol, wood, or plastic).

  • 6. MOTIVATION: The energy challenge. From 30% to 50% of energy costs are devoted to cooling: air conditioning units and server fans. PUE metric: average PUE = 1.92; SoA PUE = 1.3. CeSViMa Data Center @UPM: cooling costs/year 360k; IT costs/year 240k. [Figure: CeSViMa IT power consumption]

  • 7. MOTIVATION: Future trends. Cloud Computing. The march towards the Internet of Everything. e-Health, smart-everything (cities, cars, offices...). Huge increase of computational needs ... Data Centers. [Figures: Global Data Center traffic growth (Cisco); Global M2M Communication Growth]

  • 8. MOTIVATION: State-of-the-Art. Industry: focused on PUE; metric shifting to Performance Per Watt; costly CFD simulations of the Data Center. Academia: problem faced from multiple perspectives; lack of a holistic approach; lack of scalable models; no joint cooling + computing approaches.

  • 9. MOTIVATION: Our perspective. Proactive and reactive holistic approach: using the knowledge about the energy demand of applications and the features of the computational and cooling resources to apply proactive optimization techniques. Global strategy to integrate multiple information sources and coordinate decisions to reduce overall power consumption. Energy optimization beyond PUE.

  • 10. Outline (same as slide 2).

  • 11. FOCUS OF THE PhD THESIS: Global Framework (Datacenter, Model, Optimization). We derive accurate and flexible models of the Data Center to be able to predict power and energy consumption. We use the models and the knowledge of computing and cooling resources to jointly optimize cooling and computational costs. We propose actuations to reduce the energy consumption.

  • 12. FOCUS OF THE PhD THESIS: Data Center Energy Optimization. [Diagram labels: Datacenter; Workload; Model; Sensors; Actuators; Sensor configuration; Visualization; Power Model; Energy Model; Thermal Model; Dynamic Cooling Opt.; Resource Alloc. Opt.; Global DVFS; VM Opt.; Anomaly Detection and Reputation Systems; Communication network; Sensor network; Application framework]

  • 13. FOCUS OF THE PhD THESIS: Optimization Objectives. Develop models and propose optimizations to minimize energy, leveraging heterogeneity and application-awareness. Multi-level orthogonal optimizations: Server; Data Center; Application framework (emphasis on e-Health). [Diagram labels: Optimization; Models; Server; Data Center Nodes; Application Framework]

  • 14. Outline (same as slide 2).

  • 15. SERVER LEVEL: Server modeling and optimization. Splitting contributors to power: dynamic power (workload); static power (leakage, exp(T)); fan power (RPM). Goal 1: exploiting the leakage-cooling tradeoffs at the server level. Goal 2: energy-efficient workload allocation policy. Joint workload and cooling management policy to minimize energy consumption at the server level. CPU affinity.

  • 16. SERVER LEVEL: Server modeling (I). Experimental set-up: SPARC T3 server (32 cores, 256 hw threads, 128 GB RAM); monitoring via IPMI (SP); control over the cooling subsystem. Workloads: training with synthetic workloads (LoadGen, RandMem); test set: SPEC Power, SPEC CPU 2006, PARSEC. [Figure: CPU thermal dynamics (training)]

  • 17. SERVER LEVEL: Server modeling (II). Modeling contributors to power consumption: leakage power; CPU steady-state temperature; memory dynamic power (via performance counters); CPU dynamic power (via perf. counters, WIP). [Figures, sensor measurements vs. models: CPU Steady-State Temperature (RMSE < 2.1 °C); CPU Leakage Power modeling (RMSE < 0.5 W)]

  • 18. SERVER LEVEL: Optimization. Optimum cooling management to improve energy efficiency. Proactive fan control policy. Tested with statistically different workloads (random power, Poisson arrival times). Up to 9% savings compared to the server default policy. Up to 6% savings compared to other SoA policies.

  • 19. SERVER LEVEL: Optimization. Energy-efficient workload allocation policy. Comparing allocations: energy, power, EDP, temperature. Guided by application parameters: performance counters (memory accesses, L1 misses, IPC). Up to 13% energy savings when combining optimum allocation and cooling.

  • 20. SERVER LEVEL: Work in progress. Proactive workload allocation policy: until now we have been using qualitative knowledge about workload behavior. Working on contention-aware models to develop co-assignment policies: predict how we should combine several workloads in the same server to minimize energy. Proactive joint workload and cooling management. Publications: M. Zapater, O. Tuncer, J. L. Ayala, J. M. Moya, K. Vaidyanathan, K. Gross, and A. K. Coskun, "Leakage-aware cooling management for improving server energy efficiency," submitted to TPDS (JCR Q1), under review; in collaboration with Oracle, BU, UCM. M. Zapater, J. L. Ayala, J. M. Moya, K. Vaidyanathan, K. Gross, and A. K. Coskun, "Leakage and temperature aware server control for improving energy efficiency in data centers," in DATE 2013; in collaboration with Oracle, BU, UCM.

  • 21. Outline (same as slide 2).

  • 22. SERVER LEVEL: DC Modeling and optimization. Goal: energy-efficient assignment of computational and cooling resources of the DC to execute a workload.

  • 23. DATA CENTER LEVEL: DC Modeling and optimization. Goal: energy-efficient assignment of computational and cooling resources of the DC to execute a workload. SLURM Resource Manager.

  • 24. DATA CENTER LEVEL: Data Center Room modeling. The maximum CPU temperature limits the minimum cooling of the Data Room. Development of fast, accurate and flexible models to predict server inlet temperature and CPU temperature. Literature uses CFD simulation. Complex non-linear models: classical regression techniques are no longer valid. Usage of a WSN to gather environmental parameters. Usage of Genetic Programming techniques.

  • 25. DATA CENTER LEVEL: Data Center Room modeling. Genetic programming techniques: find the best model to predict a time series given a set of variables and a fitness function. Each model is an individual with a genotype and a phenotype. Fitness function is RMSE. Models evolve: individuals with the best fitness survive. 1-minute-ahead CPU temperature prediction: TS(k+1) = TS(k-6)-PS(k-8)+6.3+PS(k-6)-PS(k-25)/49.4. [Figure: CPU temperature prediction in Intel Xeon server (RMSE = 2.1 °C)]

  • 26. DATA CENTER LEVEL: Data Center Room modeling. Work in progress: extending CPU temperature prediction to CeSViMa servers (Power7 architecture, blade center; 245 eServer BladeCenter PS702 servers, each with 2 CPUs x 8 cores @ 3.3 GHz). Running (currently evolving) models for inlet temperature at LSI servers; going to extend to CeSViMa.

  • 27. DATA CENTER LEVEL: Optimizing IT allocation (I). Heterogeneity-aware and application-aware resource management. Energy profiling of tasks of the SPEC CPU 2006 benchmark in 3 servers. Static optimization: finding the best data center setup, given a number of heterogeneous servers. Dynamic: run-time allocation using the resource manager. MILP algorithms to allocate tasks to servers: minimize total IT energy.

  • 28. DATA CENTER LEVEL: Optimizing IT allocation (II). Implemented in the SLURM resource manager: BSC SLURM Simulator; random arrival distribution (light, medium, heavy load); simulating around 1,000 cores. Results show that the best solution is achieved with a heterogeneous data center: 5% to 22% savings for the static solution; 7.5% to 24% energy savings (depending on the scenario) for the dynamic solution when compared to SLURM round-robin allocation. Publication: M. Zapater, J. L. Ayala, and J. M. Moya, "Leveraging heterogeneity for energy minimization in data centers," in CCGRID 2012 (CORE A); in collaboration with UCM.

  • 29. DATA CENTER LEVEL: Cooling & IT optimization. Cooling reduction of 15% in the LSI server room (Aug. 2013): leakage and temperature-aware control. Work in progress: using the data room modeling at CeSViMa and LSI rooms, development of joint cooling and IT optimizations (MILP, GA-based).

  • 30. Outline: About me; Motivation; Focus of this PhD Thesis; Multi-level approach; Server level; Data Center level; Application framework
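
Editor's sketch for slide 25: the evolved one-minute-ahead model can be evaluated directly from recent temperature (TS) and power (PS) samples, and RMSE, the fitness function named on the slide, can score it against measurements. The following Python is a minimal illustration, not the author's code. It assumes the flattened formula is read with ordinary operator precedence (only PS(k-25) is divided by 49.4), that samples are equally spaced, and that the most recent sample is last in each list. The names `predict_next_temp` and `rmse` are hypothetical.

```python
import math
from typing import Sequence


def predict_next_temp(ts: Sequence[float], ps: Sequence[float]) -> float:
    """Evaluate TS(k+1) = TS(k-6) - PS(k-8) + 6.3 + PS(k-6) - PS(k-25)/49.4.

    ts, ps: temperature and power histories, most recent sample (index k) last.
    Assumption: standard operator precedence for the flattened slide formula.
    """
    if len(ts) < 7 or len(ps) < 26:
        raise ValueError("need at least 7 temperature and 26 power samples")
    # With k as the last index, sample k-n is found at position -(n+1).
    return ts[-7] - ps[-9] + 6.3 + ps[-7] - ps[-26] / 49.4


def rmse(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Root-mean-square error between two equally long series."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Made-up constant histories, only to show the call shape.
    temps = [50.0] * 26
    power = [100.0] * 26
    print(predict_next_temp(temps, power))
```

In a genetic programming loop, each candidate expression would be scored with `rmse` over a training trace, and the lowest-error individuals would survive to the next generation.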
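
Editor's sketch for slide 6: PUE is the ratio of total facility energy to IT equipment energy, so the quoted figures can be related with simple arithmetic. The Python below is a rough illustration under two loud assumptions that the slides do not state: that the CeSViMa yearly costs are proportional to energy, and that cooling is the only non-IT overhead. The function names `pue` and `overhead_share` are hypothetical.

```python
def pue(it_energy: float, overhead_energy: float) -> float:
    """Power Usage Effectiveness: total facility energy over IT energy."""
    if it_energy <= 0:
        raise ValueError("IT energy must be positive")
    return (it_energy + overhead_energy) / it_energy


def overhead_share(pue_value: float) -> float:
    """Fraction of total facility energy that is not consumed by IT."""
    if pue_value < 1.0:
        raise ValueError("PUE cannot be below 1")
    return (pue_value - 1.0) / pue_value


if __name__ == "__main__":
    # Assumption: cost is proportional to energy and cooling is the only overhead.
    cesvima = pue(it_energy=240.0, overhead_energy=360.0)
    print(f"CeSViMa cost-based PUE estimate: {cesvima:.2f}")   # 2.50
    print(f"Overhead share at PUE 1.92: {overhead_share(1.92):.1%}")  # 47.9%
    print(f"Overhead share at PUE 1.3:  {overhead_share(1.3):.1%}")   # 23.1%
```

Under these assumptions the CeSViMa figures correspond to cooling taking 60% of the yearly cost, above the 30% to 50% range quoted on the same slide.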