Analyzing the Energy Efficiency of a Database Server Hanskamal Patel SE 521.

Post on 24-Dec-2015


Analyzing the Energy Efficiency of a Database Server

Hanskamal Patel

SE 521

Article

• Analyzing the Energy Efficiency of a Database Server
– Dimitris Tsirogiannis, University of Toronto
– Stavros Harizopoulos, HP Labs
– Mehul A. Shah, HP Labs

Introduction

• Database system performance is measured in tasks per second or queries per second.
• Similarly, energy efficiency is measured in completed tasks per unit of energy, e.g. queries per Joule.
• Approaches to improving performance are either hardware/platform oriented or workload-management oriented.
• This work explores ways to improve the energy efficiency of a single-machine database server.

Test Machine Configuration

Component                                    Min (W)   Max (W)
Two Intel Xeon E5430 quad-core 2.66 GHz      48        160
Four 4 GB FB-DIMMs (RAM)                     40        40
Three 300 GB Seagate Savvio 10k.3 2.5" HDD   14        24
Four 64 GB Intel X-25E 2.5" SSD              0.2       10
System board components                      54        54
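As a rough check on the table, the minimum and maximum columns can be summed to see how much of the peak draw is fixed. The following is a minimal sketch, assuming each row gives the total for that component group; the names `COMPONENTS` and `power_budget` are illustrative and do not come from the paper.

```python
# Component power figures (watts) copied from the table above.
# Assumption: each row is the total for that component group.
COMPONENTS = {
    "CPUs (2x Xeon E5430)": (48.0, 160.0),
    "RAM (4x 4 GB FB-DIMM)": (40.0, 40.0),
    "HDDs (3x 300 GB)": (14.0, 24.0),
    "SSDs (4x 64 GB)": (0.2, 10.0),
    "System board": (54.0, 54.0),
}

def power_budget(components):
    """Return (idle_watts, peak_watts, idle_fraction) for the whole system."""
    idle = sum(lo for lo, _ in components.values())
    peak = sum(hi for _, hi in components.values())
    return idle, peak, idle / peak

idle, peak, fraction = power_budget(COMPONENTS)
print(f"idle {idle:.1f} W, peak {peak:.1f} W, idle share {fraction:.0%}")
```

Under these assumptions the idle draw comes to about 156 W of a 288 W peak, roughly 54%, which matches the "about half" observation in the power breakdown.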

Power Breakdown

• About half of the peak power is drawn by the idle system
– Two CPUs
– Fixed RAM power
– Board components
– SSD and HDD minimal power
• Left side of the chart is active power consumption
– CPU is the dominant component
– SSD and HDD draw similar power

CPU Usage vs. Power

What affects energy efficiency?

• EE = Work/Energy = Performance/Power

• Several factors affect power use and potentially energy efficiency
– CPU cycles to fetch data from disk
– Scans, record access, compression, sorting, and joining
• Energy efficiency can be improved, but it may sacrifice performance
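The identity EE = Work/Energy = Performance/Power holds because dividing both work and energy by the elapsed time leaves the ratio unchanged. A minimal sketch follows; the function names and the sample numbers are hypothetical and chosen only for illustration.

```python
def energy_efficiency(work_done, energy_joules):
    """Work per unit of energy, e.g. queries per joule."""
    return work_done / energy_joules

def energy_efficiency_from_rates(throughput_per_s, avg_power_watts):
    """Same quantity from rates: (work / s) / (J / s) = work / J."""
    return throughput_per_s / avg_power_watts

# Hypothetical run: 1,000 queries in 20 s at an average draw of 250 W.
queries, seconds, watts = 1000, 20.0, 250.0
energy = watts * seconds  # joules = watts * seconds

ee_direct = energy_efficiency(queries, energy)
ee_rates = energy_efficiency_from_rates(queries / seconds, watts)
print(ee_direct, ee_rates)  # both are 0.2 queries per joule
```

Either form can be used with measured data: the first needs a total energy reading for the run, the second needs only throughput and average power.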

Energy efficiency vs. Performance

• Experimented with five different overhead kernels
– Parallel performing, cache-conscious hash join, sorting, AlphaSort, and parallel merging
• High-performance storage engine that supports column- and row-oriented database scans
• PostgreSQL and System-X DBMS

Performance vs. Energy

Assembling data-management architectures

• Scale-up
– Shared memory and shared disk
– Choose the balance of components and power down unneeded resources
• Scale-out
– Shared nothing
– Single-node configurations connected by a scaled network
– Choose energy-efficient components for one node and performance-optimized components for another

Power Profiles of Hardware Components

• RAM
– RAM is responsible for 20% of the power consumption, and its draw stays the same throughout
– The only way to vary memory power usage is to physically remove modules from the board

Power Profiles of Hardware Components

• Disks
– Both HDD and SSD in the configuration
– Support active and idle states that consume different amounts of power: 15% in the active state
• Test Configuration
– RAID-0 configuration for both HDD and SSD
– Reading a 100 GB file with a block size of 128 KB
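The read test described above amounts to a sequential scan of a file in fixed-size blocks. The sketch below is not the authors' tool; it only shows the access pattern, assuming plain unbuffered reads through the operating system. The function name `sequential_read` and the small stand-in file are illustrative.

```python
import os
import tempfile
import time

BLOCK_SIZE = 128 * 1024  # 128 KB, as in the test configuration

def sequential_read(path, block_size=BLOCK_SIZE):
    """Read a file front to back in fixed-size blocks.

    Returns (bytes_read, elapsed_seconds). The OS page cache is not
    bypassed, so repeated runs on a small file measure memory, not disk.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            total += len(block)
    return total, time.perf_counter() - start

# Small stand-in file (1 MB) so the sketch runs anywhere; the test on the
# slide used a 100 GB file on the RAID-0 arrays.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(1024 * 1024))
    demo_path = tmp.name

nbytes, secs = sequential_read(demo_path)
os.remove(demo_path)
print(f"read {nbytes} bytes in {secs:.4f} s")
```

Dividing the bytes read by the elapsed time gives throughput, which can then be set against the measured disk power to compare the two storage types.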

Power Consumption of Disks

Power Profiles of Hardware Components

• CPU
– The two CPUs are responsible for 85% of the power increase in the system while active
– Interested in understanding:
• How CPU power is affected by database operations and the efficacy of hardware and software power management
• Developed a set of micro-benchmarks that perform three classes of database operations: hashing, sorting, and scans

Micro-benchmarks

• Custom Join Kernel
– Hash join algorithm for computing the join of two relations in parallel
• Sort Kernel
– Two in-memory parallel sorting algorithms
• Scan Kernel
– Scan uncompressed rows in memory
– Scan compressed columns on disk
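To make the join kernel concrete, the build and probe phases of a basic hash join can be sketched as follows. This is a single-threaded illustration only; the kernel described above is parallel, and none of the names or sample relations here come from the paper.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equi-join two lists of dict rows with a build phase and a probe phase."""
    # Build phase: hash the (ideally smaller) relation on its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    # Probe phase: look up each probe row and emit the merged matches.
    joined = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            joined.append({**match, **row})
    return joined

# Hypothetical relations for illustration.
customers = [{"cust_id": 1, "name": "Ada"}, {"cust_id": 2, "name": "Lin"}]
orders = [
    {"order_id": 10, "cust_id": 1},
    {"order_id": 11, "cust_id": 1},
    {"order_id": 12, "cust_id": 3},
]
result = hash_join(customers, orders, "cust_id", "cust_id")
print(result)
```

A parallel version would typically partition both inputs on the join key and run this build and probe step per partition, which is where the degree of parallelism becomes a tunable parameter.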

Analyzing Power Consumption

Memory bus utilization

Hashjoin Operator

Sort Operator

Scan Operator

Energy vs. Performance

• Parameters that have the greatest impact on energy
– Algorithm/plan selection
– Intra-operator parallelism
– Inter-query parallelism

Algorithm/Plan selection

• Access Methods

• Join Algorithms

• Complex Queries and Join Ordering

Intra-operator and Inter-query Parallelism

• Intra-operator parallelism

– Parallel hash join

– Parallel sorts
• Inter-query parallelism
– Executing multiple queries at the same time

Implications for Database Computing

• One size fits all
– Collection of nodes, where each node is optimized for a specific task
– High-parallelism, low-frequency, small-cache, simple-design CPUs
– Solid state drives
• Shared nothing, everything, or in-between
– Shared nothing and shared disk
• Controlling peak power

Conclusion

• CPU power usage by different operators can vary by up to 60%
• The best-performing system was the most energy efficient
• Future investigations:
– Improving resources across unutilized nodes to save power
– Alternative energy-efficient hardware for lower fixed-power cost
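The second conclusion above is what one would expect when fixed power is a large share of the total: energy is power multiplied by time, so a slower configuration must cut power by more than it adds runtime to come out ahead. The sketch below uses invented configurations and runtimes to illustrate the arithmetic; only the rough 156 W to 288 W power range is taken from the test machine.

```python
def energy_joules(avg_power_watts, runtime_s):
    """Energy used by a run: E = P * t."""
    return avg_power_watts * runtime_s

# Hypothetical configurations for the same query workload. The power
# values sit between the ~156 W idle and ~288 W peak of the test machine;
# the runtimes are invented for illustration.
configs = {
    "all cores, full frequency": (280.0, 10.0),
    "reduced frequency": (240.0, 14.0),
    "half the cores": (220.0, 18.0),
}

results = {name: energy_joules(p, t) for name, (p, t) in configs.items()}
fastest = min(configs, key=lambda name: configs[name][1])
most_efficient = min(results, key=results.get)

for name, joules in results.items():
    print(f"{name}: {joules:.0f} J")
print(fastest == most_efficient)
```

In this made-up example the fastest run uses 2800 J against 3360 J and 3960 J for the slower ones, so the fastest configuration is also the most energy efficient.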

Questions?