HMI 2010 EnterpriseReport Jan 2011


    How Much Information? 2010 Report on Enterprise Server Information

    James E. Short, Roger E. Bohn, Chaitanya Baru

    Date of Publication: January 2011

    Website Publication: April 2011

    Last Update: December 2010

    ACKNOWLEDGEMENTS

    EXECUTIVE SUMMARY

    1 INTRODUCTION
        1.1 Data and Information
        1.2 What is Enterprise Information?
        1.3 Measuring Transaction Work
        1.4 How Many Bytes?

    2 HOW MANY SERVERS?
        2.1 Types of Servers
        2.2 Counting Servers
        2.3 World Server Sales

    3 WORLD SERVER CAPACITY
        3.1 Server Workloads
        3.2 Core and Edge Computing Model: Server Measurement Points
        3.3 Server Workload Allocations
        3.4 World Server Capacities

    4 WORLD SERVER INFORMATION
        4.1 How Many Bytes?
        4.2 Contribution to Total Bytes
        4.3 Our Results in Context: Comparisons with Other Information Studies
        4.4 Discussion: Estimating Enterprise Information from Servers

    5 TRENDS, PERSPECTIVES AND CHALLENGES IN INFORMATION INTENSIVE COMPUTING
        5.1 Measuring Capacity and Performance in Emerging Information Architectures
        5.2 Data Intensive Computing Platforms
        5.3 Back to the Future: Data Discovery, Data Generation, Data Preservation

    APPENDIX

    ENDNOTES

    TABLES AND FIGURES

    Figure 1: Information Flows In An Enterprise
    Figure 2: The Flows We Are Interested In
    Figure 3: Example Server Workloads
    Figure 4: Modern Computer Servers
    Figure 5: Improvements In Server Performance
    Figure 6: Schematic Of Modeling Approach
    Figure 7: TPC-C Simulated Workflow
    Figure 8: SPECweb2005 Simulated Workflow
    Figure 9: VMmark Simulated Workload
    Figure 10: Three-Tier Web Server Configuration
    Figure 11: Core and Edge Computing Model
    Figure 12: Estimated Workload Percentages 2004-2008
    Figure 13: World Server Information Summary 2008
    Figure 14: Contribution to World Server Information 2008

    Table 1: World Server Information
    Table 2: Data and Information
    Table 3: World Server Information by Server Class 2008 in Zettabytes
    Table 4: Installed Base, Shipments and Retirements of Servers for the World and U.S., 2000-2005
    Table 5: World Server Sales 2004-2008
    Table 6: Performance Benchmarks by Server Class by Server Workload
    Table 7: Estimated Workload Percentages by Year
    Table 8: Server Potential Capacities 2004-2008
    Table 9: Contribution to World Server Information 2008

    ACKNOWLEDGEMENTS

    This report is the product of industry and university collaboration. We are grateful for the support of our industry sponsors and university research partners.

    Financial support for the HMI? research program and the Global Information Industry Center is gratefully acknowledged. Our foundation and corporate sponsors are:

    Alfred P. Sloan Foundation

    AT&T

    Cisco Systems

    IBM

    Intel Corporation

    LSI

    Oracle

    Seagate Technology

    Special thanks for research and technical advice is extended to the following individuals:

    Richard Clarke, AT&T

    Clod Barrera, IBM

    Jeffrey Smits and Terry Yoshii, Intel

    Dieter Gawlick, Garret Swart and Thomas Oestreich, Oracle

    Dave Anderson, Brook Hartzell and Jeff Burke, Seagate

    Bruce Herndon, VMware

    The authors bear sole responsibility for the contents and conclusions of the report.

    Questions about the report may be addressed to the Global Information Industry Center at the School of International Relations and Pacific Studies, UC San Diego:

    Roger Bohn, Director, rbohn@ucsd.edu

    Jim Short, Research Director, jshort@ucsd.edu

    Pepper Lane, Program Coordinator, pelane@ucsd.edu, 858-534-1019

    Press inquiries should be directed to Rex Graham, IR/PS Communications Director, ragraham@ucsd.edu, (858) 534-5952

    Center Website: http://hmi.ucsd.edu/howmuchinfo.php

    Report Design by Teresa Jackson, Orchard View Color: www.orchardviewcolor.com
    Executive Summary

    In 2008, the world's servers processed 9.57 zettabytes of information, almost 10 to the 22nd power bytes, or ten million million gigabytes. This was 12 gigabytes of information daily for the average worker, or about 3 terabytes of information per worker per year. The world's companies on average processed 63 terabytes of information annually. Our estimates come from an analysis of the total work capacity of the installed base of computer servers in enterprises worldwide. Information through non-computer sources (telephones or physical newspapers, for example) is not included.

    We define enterprise server information as the flows of data processed by servers as inputs plus the flows delivered by servers as outputs. A single chunk of information, such as an email message, may flow through multiple servers and be counted multiple times. Two-thirds of the world's total of 9.57 zettabytes was processed by low-end, Entry-level servers costing $25,000 or less. The remaining third was processed by Midrange and High-end servers, those costing more than $25,000. Transaction processing workloads (issuing an invoice, paying a bill, checking a stock level) amounted to approximately 44% of all the bytes processed. Web services and office applications contributed the other 56%. Servers configured as virtual machines processed about half of all the bytes in Web services and office applications.

    We also conducted a separate analysis of improvements in server performance and capital cost. Midrange servers processing Web services and business application workloads doubled their performance per dollar in 1.5 years. Raw performance for this server class doubled approximately every 2 years. High-end servers processing transaction workloads had the longest doubling times: both performance/cost and raw server performance doubled approximately every 4 years.
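    To make the doubling times above easier to compare, they can be converted into implied compound annual growth rates. The following is a minimal sketch, assuming smooth exponential improvement between doublings; the function name and dictionary labels are illustrative and do not come from the report.

```python
def annual_growth_rate(doubling_time_years: float) -> float:
    """Compound annual growth rate implied by a doubling time, in years.

    Assumes steady exponential improvement (an assumption of this sketch,
    not a statement about how the report's authors fit their data).
    """
    if doubling_time_years <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (1.0 / doubling_time_years) - 1.0


# Doubling times quoted in the Executive Summary (years).
doubling_times = {
    "midrange, performance per dollar": 1.5,
    "midrange, raw performance": 2.0,
    "high-end, performance per dollar and raw performance": 4.0,
}

for label, years in doubling_times.items():
    rate = annual_growth_rate(years)
    print(f"{label}: about {rate * 100:.1f}% per year")
```

    Under that assumption, a 1.5-year doubling time corresponds to roughly 59% improvement per year, a 2-year doubling time to roughly 41%, and a 4-year doubling time to roughly 19%.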

    This report covers how much information was processed by the installed base of computer servers in companies worldwide in 2008. It complements an earlier report on information consumption, which estimated that 3.6 zettabytes of information was consumed by American households in 2008. Later reports will cover storage systems and enterprise networks.


    1 INTRODUCTION

    Businesses today are awash with information and the data used to create it. Daily, managers are confronted with growing information volumes far greater than can possibly be consumed.[1] Where is all of this data being created? What happens to it? How much information is being processed by computer servers in companies worldwide?

    The goal of the How Much Information? Program is to create a census of the world's data and information. How much information is created and consumed annually? What types of information are created? Who consumes it? And what happens to information after it is used? For our purposes, we distinguish between information that is created and used in organizations (work information, used for productive purposes) and consumer information seen or heard by people not at work (information created and used for consumption). We expand on these definitions below. Last year we reported on consumer information for different media in and outside the home, such as watching television, playing computer games, going to the movies, listening to the radio or talking on a cellular phone.[2] Nationwide, we found that Americans spent approximately 11.8 hours viewing or listening to media on an average day. They consumed 3.6 zettabytes of information in 2008, or approximately 34 gigabytes per person per day. Our estimate is many times greater than totals from previous studies. Why? In large part the difference lies in our use of a very inclusive definition of information: we measured the flow of information, not just the fraction of information that is retained. A zettabyte is 10^21 bytes, or 1,000 billion gigabytes. See Appendix: Counting Very Large Numbers.

    Our second report conveys our findings for enterprise information. We define enterprise server information as the flows of data processed by computer servers as inputs plus the flows delivered by servers as outputs.[3] Note that our definitions of work and consumer information are different and complementary. Work information is data processed for productive use: for example, to guide an immediate action, or to use as context for a future action. Consumer information is data processed and delivered for consumptive use: to delight, to entertain, to enjoy.

    How much work information is processed by computer servers annually in companies worldwide? And why are we using computer servers to estimate the flow of data processed? Servers are the digital workhorses of the modern firm. Servers host the company's work applications, process the data flows, and manage the data traffic going in and out of the firm's storage systems. Small companies may have tens of servers of varying sizes and capacities; large enterprises may have tens of thousands of servers. We do not include in our definition information that people may see or hear while at work that is not processed by servers.

    Of course, the data processed by servers is not all of the information that exists in any company (although in most companies it is likely to be the great majority of it). There is a wealth of paper documents in every organization, and there exist many digital data storage devices, from personal storage media (DVDs, flash drives and the like) to mammoth-capacity enterprise storage systems. Data stored and archived on storage systems is defined as data at rest. Data at rest requires data processing and output to constitute what we define as information. Conversely, paper documents, records, image libraries and the like have long been defined as information in printed or image form, such as that stored in case files in a law library, or customer files archived for long-term storage in an outside storage facility. We will add these sources in the future.

    What does this report cover?

    This study covers how much information was processed and delivered by the installed base of computer servers in enterprises worldwide in 2008. Server capacity measures were derived from world sales and shipments data published by analyst firms Gartner and IDC. Server performance was estimated using industry benchmarks published by the Transaction Processing Performance Council (TPC), the Standard Performance Evaluation Corporation (SPEC), and VMware. We used industry standard benchmarks to define a consistent measure of server work performed, and converted it into its byte equivalent.

    Performance data was taken from server test results submitted to benchmark standard bodies by hardware vendors. We adjusted this data based on the date of availability of the test system and other factors.

    A few highlights from our findings (Table 1):

    - 9.57 zettabytes of information was processed by servers in companies worldwide in 2008. That amounts to:
        - 3.0 terabytes of information per worker per year, or 12 gigabytes per worker per day (based on the ILO and CIA Factbook estimate of 3.18 billion people in the world labor force in 2008).[4]
        - 63 terabytes of information per company per year (based on Dun & Bradstreet's 151 million world businesses registered with D&B's D-U-N-S system in 2008).[5]

    - Two thirds of the world total of 9.57 zettabytes of information was processed by low-end, entry-level servers costing less than $25,000 USD per machine.

    - The remaining third was processed by midrange and high-end servers, costing between $25,000 and $500,000 (for midrange servers), and over $500,000 (for high-end servers).

    Our report is divided into five sections:

    - Section 1 introduces our concepts and measurement methods.

    - Section 2 looks at different types of servers and how to count them.

    - Section 3 considers server workloads, charts server performance measured by industry benchmarks, and calculates world server capacity.

    - Section 4 summarizes total annual server information and different contributions to that total.

    - Section 5 discusses some interesting factors in enterprise information growth, server performance and data intensive computing platforms.

    1.1 Data and Information

    Data are collections of numbers, characters, images or other outputs from devices that represent physical quantities as artificial signals intended to convey meaning. The signals are artificial because data is created by machines such as sensors, barcode readers, or computer keyboards. Digital data has the desirable properties that it is easy to capture, create, communicate and store: so easy, in fact, that increasingly we are flooded by it. Information is a subset of data, data being the lowest level of abstraction from which information and knowledge are derived. In its most restrictive technical meaning, information can be thought of as an ordered sequence of signals.[6]

    Information processing refers to the capacity of computers and other information technology (IT) machinery to process data into information. Past high-level studies of enterprise information have generally measured data of only two kinds: the data that gets stored on physical storage media, and communications data that is in flow, transmitted over local-area or wide-area networks in the firm.[7]

    Unlike data, information has the further property that it must have meaning for its intended use.[8] People define that meaning, whether it is the information required for an immediate decision or the collection of background information for a judgment or action to be taken in the future. The amount of human involvement increases as we move from a focus on data to one of information: we store data on computers; we use computers to create and manage information. (Table 2)

    Table 1: World Server Information (Summary of Results)

    What is measured                  World 2008 Total    Notes
    Bytes processed plus delivered    9.57 zettabytes
    Bytes per worker per year         3.01 terabytes      3.18 billion workers in world labor force
    Bytes per worker per day          12.0 gigabytes
    Bytes per company per year        63.4 terabytes      151 million world businesses registered

    Sources: CIA Factbook, World Labor Organization, Dun & Bradstreet
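    The per-worker and per-company figures in Table 1 follow from simple division of the world total. The sketch below reproduces them, assuming decimal (SI) byte units and roughly 250 working days per year. The 250-day figure is an assumption of this sketch, chosen because it reconciles the annual and daily per-worker values; the report does not state it in this section.

```python
# Decimal (SI) byte units, as used in the report's appendix on large numbers.
GIGABYTE = 10**9
TERABYTE = 10**12
ZETTABYTE = 10**21

# Inputs taken from Table 1.
total_bytes = 9.57 * ZETTABYTE      # bytes processed plus delivered, 2008
world_workers = 3.18e9              # world labor force
world_companies = 151e6             # businesses registered with D&B

# Assumption of this sketch (not stated in the report):
WORKING_DAYS_PER_YEAR = 250

bytes_per_worker_year = total_bytes / world_workers
bytes_per_worker_day = bytes_per_worker_year / WORKING_DAYS_PER_YEAR
bytes_per_company_year = total_bytes / world_companies

print(f"Per worker per year:  {bytes_per_worker_year / TERABYTE:.2f} TB")
print(f"Per worker per day:   {bytes_per_worker_day / GIGABYTE:.1f} GB")
print(f"Per company per year: {bytes_per_company_year / TERABYTE:.1f} TB")
```

    With these inputs the results round to the values shown in Table 1: 3.01 terabytes per worker per year, 12.0 gigabytes per worker per day, and 63.4 terabytes per company per year.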


    1.2 What is Enterprise Information?

    There is no common definition of enterprise information in use today; certainly there are hundreds. For our purposes, we define enterprise information as the flows of data processed by servers as inputs, plus the flows of data delivered by servers as outputs. As every company has its own unique computing architecture, we have based our estimates on a simplified model of computing and information flows in the firm (Figure 1). The flows of data from storage devices to servers to user devices are the flows we are interested in (Figure 2). Our model assigns servers processing different types of work (for example, a database server or a web server, or a server configured to run as a virtual machine) to the appropriate industry benchmark. We use the performance data reported by the benchmark along with server hardware costs to derive measures of server capacity.

    Our byte total is based on the analysis of performance data from four standard industry workloads:

    - Online transaction processing (OLTP). The OLTP workload processes clerical data entry and retrieval processes in a real time transaction processing environment. OLTP workloads require very high server performance, are optimized for a common set of transactions, and in large firms support thousands of concurrent users.

    - Web server processing (WWW or Web). Web server workloads process documents (in the form of Web pages) to the Web clients requesting them; a typical application would be a user searching for information using a Web browser. Web server performance typically trades off the number of requests the server must process with the number of bytes that the server must transfer (disk I/Os).

    - Virtual Machine processing. Virtualization is an important software technology deployed in most companies today. The basic principle is that virtualization allows a single physical machine to run multiple virtual machines, sharing the resources of the single machine across multiple applications. Different virtual machines can run different operating systems and applications on the same physical computer. We include virtual machine processing of multiple workloads in our analysis.

    - Application processing. Our analysis includes some, but not all, application processing done on application servers. Examples of application server workloads not directly measured would include customer relationship management (CRM), human resources management (HRM), and business analytics. We will add these sources in the future.

    Table 2: Data and Information

    Data                                              Information
    Artificial signals intended to convey meaning     Data with meaning for intended use
    Easily captured by machines                       Requires human intervention to define meaning (relevance, purpose)
    Easily manipulated                                Requires consensus of meaning for action
    Easily transferred                                Can be replicated, but often hard to transfer accurately
    In form suitable for quantification               Can be stored, but often difficult to recall economically
    Easily stored

    Which Bytes?

    Our analysis estimates the amount of enterprise information by counting the number of bytes processed and delivered to end users or to applications accessed by end users. Why bytes? And what is the relationship between bytes and information? We utilize a set of benchmarks (TPC, SPEC, VMmark) representative of enterprise workloads. We compute the total number of bytes delivered by deriving how many bytes are processed or delivered by transactions and applications defined within each of the standardized benchmarks. Total bytes are each benchmark's measure of how much work the server has performed. How well this definition of information matches information in a real enterprise environment depends upon how well the selected benchmarks represent the transaction and application work performed in companies.

    Our definition of information emphasizes the flow of data processing and data outputs. We count all instances of data processing and every flow delivered as output. Our definition expands on many other definitions of enterprise information.[9] An alternative approach, for example, could go to the opposite extreme: only counting data that is stored on some media somewhere in the firm (printed material, digital images or digital video) whether that data is subsequently used or not.[10]

    1.3 Measuring Transaction Work

    Servers are the natural starting point for the analysis of enterprise information. Servers input, process and output the raw material of the firm's information resources (the company's data), transforming that data into usable information and directing it to its point of use. Measuring the work done by servers counts the great majority of the total amount of information in any enterprise.

    Our goal is to estimate how many bytes of information are being processed annually by all servers in the world. To do this, we need a common yardstick that we can apply to all servers and to all types of work being completed in firms.

    [Figure 1: Information Flows In An Enterprise. Diagram components: Storage Devices, Database Servers, Application Servers, Web Application Servers, Web Presentation Servers, Web Servers, Edge Servers, User Devices. Labeled flows: data input and processed in multi-tier server architectures; processed data delivered to edge devices as information flows; stored data flows; store data on storage devices; access applications and data at edge and core; users input data on local devices.]


    We divide this into the components:

    Total bytes per year
        = World server capacity, expressed in transactions per minute
          x Bytes per transaction
          x Annual load factor (hours of operation x fraction of full load)

    Where:

    World server capacity in transactions per minute
        = $ spent / $ per measured transactions per minute
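    The decomposition above can be expressed as a short calculation. This is a minimal sketch, not the authors' actual model: the function and parameter names are illustrative, the input values in the example are hypothetical, and the factor of 60 minutes per hour is an assumption added here to reconcile capacity in transactions per minute with a load factor stated in hours.

```python
MINUTES_PER_HOUR = 60  # assumed unit conversion, not spelled out in the report


def world_capacity_tpm(dollars_spent: float, dollars_per_tpm: float) -> float:
    """World server capacity in transactions per minute.

    capacity = $ spent / $ per measured transaction per minute
    """
    return dollars_spent / dollars_per_tpm


def total_bytes_per_year(
    capacity_tpm: float,
    bytes_per_transaction: float,
    hours_of_operation: float,
    fraction_of_full_load: float,
) -> float:
    """Total bytes per year = capacity x bytes per transaction x annual load factor."""
    minutes_at_full_load = (
        hours_of_operation * MINUTES_PER_HOUR * fraction_of_full_load
    )
    return capacity_tpm * bytes_per_transaction * minutes_at_full_load


# Hypothetical inputs, for illustration only (not figures from the report).
capacity = world_capacity_tpm(dollars_spent=1_000_000, dollars_per_tpm=2.0)
example_total = total_bytes_per_year(
    capacity_tpm=capacity,
    bytes_per_transaction=2_000,
    hours_of_operation=8_760,   # every hour of a 365-day year
    fraction_of_full_load=0.5,
)
print(f"Capacity: {capacity:,.0f} transactions per minute")
print(f"Total:    {example_total:.3e} bytes per year")
```

    Keeping capacity, bytes per transaction and load factor as separate inputs mirrors the way the report derives each from a different source: sales data, benchmark results and utilization estimates.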

    We focus on transactions as the unit of work performed by servers for both heuristic and practical reasons. Workload transactions are common to all enterprises regardless of company size, industry sector, or technology complement. All companies process orders, make payments, pick from inventory, and deliver products and services to their customers. In support of these activities, company IT departments run application and email servers, manage file and print servers, and provide Web access and Web services.

    The diverse mix of transaction work performed by servers has been classified into workloads, shorthand for application-level transactions and data processed by servers (Figure 3). The most important of these workloads have been simulated in benchmarks designed to test server performance and derive comparative price-performance measures. We use results from three of the most extensively applied industry benchmarks: TPC-C, SPECweb2005, and VMmark. Each benchmark, explained in Section 3, simulates one or more enterprise workloads and computes results for server transaction performance, which we convert into byte equivalents. All told, we analyzed price, performance and capacity data for over 250 servers tested from 2004 to 2009 using one or more of the benchmarks.

    [Figure 2: The Flows We Are Interested In. Diagram components: Storage Devices, Database Servers, Application Servers, Web Servers, Edge Servers, User Devices, with Virtual Machines, Database Machines and Web Services shown within the server tiers.]


    1.4 How Many Bytes?

    [Figure 3: Example Server Workloads. Diagram components: Database Servers, Virtual Machines, Web Application Servers, Application Servers, Web Servers, Web Presentation Servers, Edge Servers. Example workloads by server type:
        Web Servers: Financial (eBanking); Consumer (eCommerce); Web Services (Search); Email and Messaging
        Database Servers: Online Transaction Processing (OLTP); Online Analytical Processing (OLAP)
        Application Servers: Customer Relationship Management; Financial Management; Supply Chain Management
        Virtual Machines: Web Services and eCommerce; Application Processing; File and Print; Email and Messaging
        Edge Servers: Web Services (Search); Email and Messaging; Firewall and Security; File and Print]

    Our calculations for measuring enterprise information start by breaking down server types and locations where servers are deployed in firms. Analyst firms Gartner and IDC map the server market into three price ranges, according to the price (stated as factory revenue) of the manufacturer's entry-level system in each price range.[11]

    - Entry-level servers are machines priced less than $25,000. Servers sold in this category in 2008 were dual core, single or dual processor machines with a minimum of frills; typically they would be deployed in non-critical business application areas, configured for low-cost general computing. An example workload would be a file and print server.

    - Midrange servers are machines costing less than $500,000. Server systems in this price range encompass many configurations, including multi-core, multi-processor tower systems, blade servers and small mainframe servers. Midrange servers are housed in server closets, server rooms and company datacenters and run a diverse mix of workloads, including transaction processing, Web services and email, online transaction processing and virtual machines. More expensive midrange servers would be deployed in medium and critical business application areas and would be managed by professional IT staff.

    - High-end servers are machines costing over $500,000. These systems are large, complex, multi-core, multi-processor mainframe servers located in mid-tier and corporate datacenters. They almost always are deployed exclusively to business critical application workloads where very high performance and very high reliability are required. Example workloads are online transaction processing (OLTP) and online analytic processing (OLAP).
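    The three price classes described above amount to a simple threshold rule on purchase price. The following sketch illustrates it; the function name and labels are hypothetical, and the treatment of prices exactly at $25,000 or $500,000 is an assumption, since the report describes the boundaries slightly differently in different places.

```python
ENTRY_LEVEL_LIMIT_USD = 25_000    # entry-level: priced less than $25,000
MIDRANGE_LIMIT_USD = 500_000      # high-end: costing over $500,000


def server_class(price_usd: float) -> str:
    """Classify a server by price, following the Gartner/IDC ranges in the text.

    Assumption of this sketch: a price of exactly $25,000 or exactly
    $500,000 is treated as midrange.
    """
    if price_usd < 0:
        raise ValueError("price cannot be negative")
    if price_usd < ENTRY_LEVEL_LIMIT_USD:
        return "entry-level"
    if price_usd <= MIDRANGE_LIMIT_USD:
        return "midrange"
    return "high-end"


# Hypothetical prices, for illustration only.
for price in (4_000, 25_000, 120_000, 750_000):
    print(f"${price:>9,}: {server_class(price)}")
```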

    Much of our research has gone into estimating the amount of information processed by each server class for the installed base of servers worldwide in 2008. For this measure of information we used bytes: the number of bytes processed as input and bytes delivered as output.

    When measured in bytes, our results show that servers are processing an enormous quantity of information work in firms today. Entry-level servers (lower performance but far more numerous) processed 6.31 zettabytes of information in 2008, 66 percent of all enterprise information created worldwide (Table 3). Midrange servers processed 2.8 zettabytes or approximately 29 percent. High-end servers processed 451 exabytes of enterprise information, or approximately 4.7 percent of the total.
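    The percentage shares quoted above follow directly from the Table 3 figures. The short Python sketch below reproduces that arithmetic as an illustration only; the variable names are ours, and the total used is the 9.57 zettabytes reported in Table 3 rather than a recomputed sum.

```python
# Bytes processed in 2008 by server class, in zettabytes (values from Table 3).
bytes_zb = {"Entry-level": 6.31, "Midrange": 2.80, "High-end": 0.451}
reported_total_zb = 9.57  # total as reported in Table 3

# Share of the reported total, in percent, for each server class.
shares = {cls: 100 * zb / reported_total_zb for cls, zb in bytes_zb.items()}

# 1 zettabyte = 1,000 exabytes, so the high-end figure can be restated in exabytes.
high_end_eb = round(bytes_zb["High-end"] * 1000)

for cls, pct in shares.items():
    print(f"{cls}: {pct:.1f}% of {reported_total_zb} ZB")
```

    Rounded, the shares come out at about 66, 29 and 4.7 percent, matching the text.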

    Our estimate of 9.57 zettabytes is many times greater than that found in previous studies. A March 2008 study by IDC reported that the total worldwide digital universe in 2007 was approximately 281 exabytes, and would not reach one zettabyte until 2010.12 According to IDC, companies created, captured or replicated about 35% of the total digital universe, approximately 14 exabytes, coming directly from servers in corporate datacenters. Why is there such a huge discrepancy in our numbers? Potentially there are many possible factors, and IDC probably did not include throughput data (data processed and output) in its estimates.13 We comment further on how our results compare with other information studies later in this report.

    Computer Transactions

    In computer programming, a transaction is an activity or request involving a sequence of data exchange and data processing. In a database management system, transactions are sequences of operations that read or write database elements. Orders, purchases, changes, additions and deletions are typical business transactions. An example of an order-entry transaction would be a catalog merchandise order phoned in by a customer and entered into a computer by a telephone sales representative. The order transaction involves checking an inventory database, confirming that the item is available, placing the order, confirming that the order has been successfully placed, and advising the customer of the expected time of shipment. As a rule, the entire sequence is viewed as a single transaction, and all of the steps must be completed before the transaction is successful and the database is updated.
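    The all-or-nothing rule described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not something taken from the study: the table names (inventory, orders), the item name and the place_order helper are all hypothetical.

```python
import sqlite3

def place_order(conn, item, qty):
    """Check stock, decrement inventory and record the order as one transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            row = conn.execute(
                "SELECT stock FROM inventory WHERE item = ?", (item,)
            ).fetchone()
            if row is None:
                raise ValueError("unknown item")
            conn.execute(
                "UPDATE inventory SET stock = stock - ? WHERE item = ?", (qty, item)
            )
            conn.execute("INSERT INTO orders (item, qty) VALUES (?, ?)", (item, qty))
            if row[0] < qty:
                # Fail late on purpose: the two writes above must be undone.
                raise ValueError("insufficient stock")
        return True
    except ValueError:
        return False

# Hypothetical in-memory database with a single catalog item.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (item TEXT PRIMARY KEY, stock INTEGER)")
conn.execute("CREATE TABLE orders (item TEXT, qty INTEGER)")
conn.execute("INSERT INTO inventory VALUES ('catalog-lamp', 3)")
conn.commit()

ok_first = place_order(conn, "catalog-lamp", 2)   # enough stock: committed
ok_second = place_order(conn, "catalog-lamp", 5)  # not enough: rolled back
stock = conn.execute("SELECT stock FROM inventory").fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

    After the failed second order, neither the inventory change nor the order row survives, which is the sense in which the whole sequence counts as a single transaction.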

    Server Workloads

    A server workload is the amount of work that a server produces or can produce in a specified period of time. But what do we mean by work? Technically, workload refers to both the request stream presented by clients (the work) as well as the server response to the requests (the load). An example would be users submitting requests to a Web server to search and display Web pages showing product information for an eCommerce purchase. How quickly the server responds to the requests defines the load. Typical server workloads would include eCommerce transactions, database transactions, Web server and email server transactions, and file-and-print. We analyzed server performance data for ten simulated enterprise workloads.

    Table 3: World Server Information by Server Class 2008

                                                   Entry-level   Midrange   High-end   Total
    Total Bytes by Server Class (in zettabytes)    6.31          2.80       0.451      9.57


    2 HOW MANY SERVERS?

    Servers are the information workhorses of the modern firm. (Figure 4) Servers are powerful computers that serve out applications and data to networks of users. For example, a company Web server connects users to the Internet, providing users with the compute resources necessary to access Web pages and background support such as firewall protection. Servers are ubiquitous. Small firms may have a few servers; large enterprises will have tens of thousands. Google reportedly has the largest installed base of servers in the world, estimated to be over one million.14 Similar estimates have Microsoft between half a million and three-quarters of a million servers worldwide.15

    2.1 Types of Servers

    Servers are either dedicated or shared. A dedicated server performs only one task such as hosting a Web site. A shared server may be accessed by multiple users for multiple purposes. Dedicated servers are more common in larger businesses where computing resources are customized for specific needs. Dedicated servers provide faster data access, allow higher network traffic rates, and can be more closely controlled (for example, for performance, security, or backup).

    Servers are configured by the work tasks they perform. For our purposes, we are most interested in the following server types:

    Database Server: A computer server dedicated to database storage and retrieval. The database server holds the database management system and accesses the company's data storage devices.

    Application Server: Application servers are dedicated to running one or more software applications.

    Web Server: Web servers host internal or external Web sites, serving Web pages back and forth to users.

    Mail Server: Mail servers host the company's email system.

    File Server: File servers house applications that are configured to send and receive files within applications. Think of them as a superset of the server types above. A database server may be part of a file server, for example.

    Server benchmark testing is organized by server type and by simulated enterprise workload. Benchmark results are used in industry to compare the performance and price-performance of different servers and workloads. For example, the Transaction Processing Performance Council benchmarks database servers using the TPC-C benchmark. Since 2001, over 250 performance tests of database servers have been conducted. As we will explain, we make use of TPC-C and other server benchmarks to compute the amount of server work performed.

    2.2 Counting Servers

    Our analysis relies on data and estimates we have calculated from IDC and Gartner data reporting server sales, the installed base of servers and server shipments, plus measured data and estimates of server capacities for representative server models for each server class.16 Gartner and IDC data are widely used in the IT industry, but as with all data their strengths, weaknesses and comparability should be clearly understood before drawing conclusions.

    Measuring Information Value Instead of Quantity?

    We quantify enterprise information by summing bytes processed plus bytes delivered by each server. This puts equal weight on every byte. If we could look at the details of individual information flows, we could use a measure that counts some information as more valuable than others. But there are no generally accepted metrics for information value, much less a practical way to measure in detail. Similar valuation issues come up with any aggregate measure, such as total economic activity (is every dollar equally valuable?) or total miles driven (a speeding ambulance is more valuable than joyriding).

    Our measure does capture the higher implicit value of redundant flows. When an enterprise views some information as important enough to process it on redundant servers, the world's total server capacity reflects this and so does our measurement.


    We report in Table 4 data on the installed base of servers published by Koomey (2007) for the years 2000-2005.17

    The estimated total installed base of servers worldwide, including shipments and retirements, is shown in Table 4. Entry-level servers dominate the installed base, representing over 90% of the total number of servers worldwide on a unit basis. Midrange servers comprise most of the rest. High-end servers represent only a few tenths of one percent of the total number of servers on a unit basis. Depending on the server class chosen, the U.S. has about 30 to 40 percent of the servers in the world.
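    These proportions can be checked against the 2005 installed-base row of Table 4. The Python sketch below is an illustration only; the variable names are ours and the figures are the 2005 values in thousands of units.

```python
# 2005 installed base in thousands of units (values from Table 4).
world = {"Entry-level": 25_959, "Midrange": 1_264, "High-end": 59.4}
us = {"Entry-level": 9_897, "Midrange": 387, "High-end": 22.2}

world_total = sum(world.values())

# Each class as a percentage of all servers worldwide, on a unit basis.
unit_share = {cls: 100 * n / world_total for cls, n in world.items()}

# U.S. installed base as a percentage of the world figure for the same class.
us_share = {cls: 100 * us[cls] / world[cls] for cls in world}

for cls in world:
    print(f"{cls}: {unit_share[cls]:.2f}% of world units, U.S. holds {us_share[cls]:.1f}%")
```

    On the 2005 figures, entry-level servers are roughly 95 percent of units, high-end servers about 0.2 percent, and the U.S. share ranges from about 31 to 38 percent depending on class.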

    2.3 World Server Sales

    U.S. and world server sales, a barometer of the U.S. computer manufacturing sector, are closely tracked by analysts and vendors. Broadly speaking, servers are a $50+ billion market worldwide. Since 2006, annual shipments for all server classes have averaged slightly over 8 million units.

    Our analysis has relied on annual and quarterly sales data published by IDC and Gartner Group (all data in U.S. dollars). We used IDC's publicly released data for annual worldwide sales and quarterly, year-over-year percentage increases or decreases in sales for three classes of servers - entry-level, midrange, and high-end - as our baseline dataset.18 IDC reported total worldwide server sales were $53.3 billion in 2008 (Table 5). Shipments came in at just over 8.1 million units. Entry-level server sales were $29.3 billion; midrange sales were $11.7 billion, and high-end server sales were $12.3 billion. Reflecting recession effects, the market contracted 14 percent in the final quarter of 2008, to $13.5 billion.19 Worldwide server unit shipments declined 12 percent compared to the same quarter in 2007. Overall, the 2008 market declined approximately 3.3% to $53.3 billion. Unit shipments, however, grew slightly to 8.1 million units.
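    The year-over-year movement in the market can be reproduced from the annual totals in Table 5. The sketch below is illustrative; pct_change is a helper name of our own, and the totals are the Table 5 "Total" row in billions of current U.S. dollars.

```python
# Annual world server sales, all classes, in billions of current U.S. dollars (Table 5).
totals = {2004: 49.5, 2005: 51.8, 2006: 52.8, 2007: 55.1, 2008: 53.3}

def pct_change(old, new):
    """Percentage change from old to new."""
    return 100 * (new - old) / old

# Year-over-year change for every year that has a prior year in the table.
yoy = {
    year: pct_change(totals[year - 1], totals[year])
    for year in sorted(totals)
    if year - 1 in totals
}

for year, change in yoy.items():
    print(f"{year}: {change:+.1f}%")
```

    The 2008 entry comes out at about -3.3 percent, in line with the decline quoted in the text.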

    Table 5 presents annual sales in U.S. dollars for all server classes for the years 2004-2008. The

    Figure 4: Modern Computer Servers


    units reported in Table 5 are current dollars spent. Current dollars are appropriate for our purposes because price-performance ratios each year are based on current dollars. Midrange and high-end server revenue was relatively flat over our target years 2004-2008. Entry-level server revenue increased from $24.4 billion in 2004 to $30.8 billion in 2007. Midrange and entry-level server revenue fell in 2008; high-end server revenue increased slightly over the same period.

    Table 4: Installed Base, Shipments and Retirements of Servers for the World and U.S., 2000-2005 (in 000s)

                  World                                        U.S.
    Year          Entry-level  Midrange  High-end  Total       Entry-level  Midrange  High-end  Total

    Installed Base
    2000          12,240       1,808     65.6      14,114      4,927        663       23        5,613
    2001          15,596       1,890     69.1      17,555      5,907        701       22.5      6,630
    2002          16,750       1,683     59        18,492      6,768        574       23.1      7,365
    2003          18,523       1,540     62.3      20,125      7,578        530       21.4      8,130
    2004          23,441       1,238     66        24,746      8,658        432       23.3      9,113
    2005          25,959       1,264     59.4      27,282      9,897        387       22.2      10,306

    Shipments
    2000          3,926        283       13        4,223       1,659        111       4.8       1,774
    2001          3,981        206       10.4      4,198       1,492        66        3.6       1,562
    2002          4,184        204       9.4       4,397       1,714        67        3.1       1,784
    2003          5,017        211       8.8       5,237       2,069        76        2.9       2,148
    2004          6,083        184       8.6       6,275       2,517        53        2.8       2,572
    2005          6,822        187       8.5       7,017       2,721        62        2.6       2,786

    Retirements
    2000          1,631        264       10        1,905       300          116       5         420
    2001          626          125       6.9       757         513          28        4.1       545
    2002          3,030        411       19.6      3,461       853          194       2.5       1,049
    2003          3,243        355       5.5       3,603       1,259        120       4.6       1,383
    2004          1,165        485       4.9       1,655       1,437        151       0.9       1,589
    2005          4,304        161       15.1      4,481       1,482        106       3.7       1,592

    Source: Koomey (2007) and IDC. Units are in 000s.
    Notes:
    1 Installed base is measured at the end of the year (December 31).
    2 Installed base and shipments include both enterprise and scientific servers. The data does not include server upgrades.
    3 Retirements are calculated from the installed base and shipments data. 2000 Retirements calculated using the 1999 installed base and year 2000 shipments.
    4 World includes the U.S.
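    Note 3 says retirements are derived from the installed base and shipments. One reading of that note, which is our assumption rather than a formula stated in the source, is: retirements in a year = installed base at the end of the prior year + shipments during the year - installed base at the end of the year. The Python sketch below applies that reading to the world entry-level columns of Table 4; the function and variable names are ours.

```python
# World entry-level servers, thousands of units (values from Table 4).
installed = {2000: 12_240, 2001: 15_596, 2002: 16_750,
             2003: 18_523, 2004: 23_441, 2005: 25_959}
shipments = {2001: 3_981, 2002: 4_184, 2003: 5_017, 2004: 6_083, 2005: 6_822}
published = {2001: 626, 2002: 3_030, 2003: 3_243, 2004: 1_165, 2005: 4_304}

def retirements(prev_installed, shipped, curr_installed):
    """Assumed identity: units that left the installed base during the year."""
    return prev_installed + shipped - curr_installed

derived = {
    year: retirements(installed[year - 1], shipments[year], installed[year])
    for year in shipments
}

for year in derived:
    print(f"{year}: derived {derived[year]:,} vs published {published[year]:,}")
```

    Under this reading the derived values match the published retirements to within one thousand units, a gap consistent with rounding in the published figures. The year 2000 is omitted because the 1999 installed base is not shown in the table.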

    Table 5: World Server Sales 2004-2008

    Annual World Server Sales 2004-2008, Current U.S. Dollars (in billions)

    Server Class   2004     2005     2006     2007     2008     Total
    Entry-level    $24.40   $27.30   $28.50   $30.80   $29.30   $140.50
    Midrange       $12.80   $12.80   $12.20   $12.60   $11.70   $62.30
    High-end       $12.20   $11.60   $12.00   $11.60   $12.20   $59.90
    Total          $49.50   $51.80   $52.80   $55.10   $53.30   $262.70

    Source: HMI? 2010. Data compiled from IDC Quarterly Server Tracking Reports, 2004-2009.


    3 WORLD SERVER CAPACITY

    The tremendous growth in datacenter capacity required by new consumer and enterprise media is a familiar topic in today's business press. Nielsen reported that unique visitors to Twitter.com increased 1,382 percent year-over-year, from 475,000 unique visitors in February 2008 to 7 million visitors in February 2009, making it the fastest growing site in Nielsen's Member Communities category for the year 2008. Facebook grew 228% year-over-year, from 20 million unique visitors in February 2008 to 65.7 million visitors in February 2009.21 Projecting these trends forward, HP expects a 650% growth in enterprise data over the next five years, with more than three-quarters of that growth in unstructured data.22

    Server capacity and performance can be defined both technically and operationally. Technically it refers to the server's theoretical capacity: what is the server's hardware capacity to input, process and output work, measured in a maximum transaction rate or in total bytes? Operationally, capacity refers to the ability of a server configuration to meet future resource needs. A typical capacity concern of datacenter managers is whether the server, storage and network resources will be in place to handle an increasing number of requests as the number of users and transactions increase. Planning for future increases is an ongoing task in a datacenter manager's life: capacity planning.

    For our purposes, we are interested in analyzing a snapshot of the installed capacity and the utilized capacity of all world servers in 2008. We define installed capacity as the sum of the maximum performance ratings of all installed servers in

    Where are Google, Microsoft and Yahoo! Servers In These Numbers?

    The IDC and Gartner data may underestimate the number of custom servers in use at large Internet companies such as Google, Microsoft and Yahoo! These companies order components such as personal computer motherboards directly from the manufacturer, and assemble custom-designed servers themselves.20 Google, Microsoft and Yahoo! do not release internal data on their server numbers, but a cottage industry of energetic observers exists in the blogosphere, routinely publishing estimates (some of which may even be accurate). Google has been reported to have over 1 million servers; Microsoft and Yahoo! each perhaps half that. If all of these servers were custom-designed units, and all were entry-level class servers, adding 2 million servers to the entry-level server category for the world in 2005 would increase the installed base from 25.9 million to 27.9 million servers, or about 8%.

    Improvements in Server Performance: Doubling Time

    We also conducted a separate analysis of improvements in server performance and capital cost over the years 2004-2008. Measuring computing performance is controversial, since performance is context-specific and many factors are involved.23 We used a doubling time metric, defined as the number of years it takes for a given parameter to double, in order to compare our results with other studies using this metric.24 Our results were as follows: Midrange servers processing Web services and business application workloads doubled their performance per dollar in 1.5 years. Raw performance for this server class doubled approximately every 2 years. At the other end, high-end servers processing transaction workloads had the longest doubling times: both performance/cost and raw server performance doubled approximately every 4 years.25 (Figure 5)
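    A doubling time can be computed from two measurements taken some years apart. The sketch below assumes steady exponential growth between the two points, which is our simplifying assumption and not necessarily the fitting method used in the study; the function names and the sample values (100 and 400) are hypothetical.

```python
import math

def doubling_time(start_value, end_value, years):
    """Years needed to double, assuming steady exponential growth over the period."""
    if start_value <= 0 or end_value <= start_value:
        raise ValueError("values must be positive and increasing")
    return years * math.log(2) / math.log(end_value / start_value)

def growth_factor(doubling_years, years):
    """Total improvement over a period for a given doubling time."""
    return 2 ** (years / doubling_years)

# Hypothetical example: a metric that quadruples in 4 years doubles every 2 years.
example = doubling_time(100, 400, 4)

# Improvement implied over 2004-2008 (4 years) by the doubling times quoted above.
midrange_price_perf = growth_factor(1.5, 4)  # performance per dollar, midrange
high_end = growth_factor(4, 4)               # high-end servers
```

    Read this way, a 1.5-year doubling time implies roughly a 6.3-fold improvement over four years, while a 4-year doubling time implies a 2-fold improvement.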


    2008. Utilized capacity is defined as the sum of the measured performance of all installed servers that year, adjusted by server load factors and the hours the servers are available for use. Since neither number can be directly measured, we estimate both. Our capacity model expresses installed capacity as the maximum number of transactions per minute theoretically possible for all servers added together in 2008. What do we mean by this? Imagine for a moment that every server in the world was running at its maximum performance rating, and we had a way to accurately count the number of transactions that every server processed. Installed capacity would be the total number of transactions processed in a year. This sum is a theoretical maximum, not a realistic one. Defining installed capacity in this way is akin to saying that the maximum performance capacity for all automobiles in the world is the sum of their top speeds driving flat-out down the highway. It is an interesting number, but not a realistic one.

    Instead, we need to adjust our theoretical maximum by taking into account the estimated utilization of server capacity: how many hours are servers actually working? What is their average load factor? What workloads are they processing? Recalling our car example, if we think of a server as a delivery truck delivering bytes, how many hours is a delivery truck operated in a year (hours worked)? What is the truck's average speed compared to its maximum speed (load factor)? How many packages can it carry at a time (bytes)?
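    The adjustment from theoretical maximum to utilized capacity can be sketched as a product of three terms: capacity per hour, hours available per year, and load factor. The Python below is only an illustration of that idea; the function name and all numeric inputs (capacity of 1,000,000 units per hour, 6,000 hours, a 25 percent load factor) are hypothetical and are not estimates from the study.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours in a non-leap year

def utilized_capacity(installed_per_hour, hours_available, load_factor):
    """Work actually done in a year: hourly capacity x hours worked x load factor."""
    if not 0.0 <= load_factor <= 1.0:
        raise ValueError("load_factor must be between 0 and 1")
    if not 0 <= hours_available <= HOURS_PER_YEAR:
        raise ValueError("hours_available cannot exceed the hours in a year")
    return installed_per_hour * hours_available * load_factor

# Hypothetical inputs, chosen only to show the arithmetic.
installed_per_hour = 1_000_000  # maximum work units per hour
theoretical_max = installed_per_hour * HOURS_PER_YEAR
utilized = utilized_capacity(installed_per_hour, hours_available=6_000, load_factor=0.25)
utilization = utilized / theoretical_max
```

    With these made-up inputs, utilized capacity is about 17 percent of the theoretical maximum, which shows how far the realistic figure can sit below the flat-out one.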

    Figure 6 illustrates our modeling approach.

    3.1 Server Workloads

    Our analysis uses workload transaction performance measured by three benchmarks - TPC-C, SPECweb2005, and VMmark - for three classes of servers: entry-level, midrange and high-end (Table 6). The three benchmarks test different workloads, with some overlap: TPC-C tests a single workload, online transaction processing; SPECweb2005 tests three Web server workloads, Banking, eCommerce and Support; VMmark tests six concurrent virtual machines (VMs) each running a single workload: Database server, Mail server, Web server, Java server and File server. One VM workload, Standby server, is configured as part of the VMmark performance test, but runs entirely in the background as a fixed load, with no results

    [Figure 5 is a chart of doubling time in years (a shorter doubling time is better) on an axis from 0 to 5, showing Price-Performance and Performance for five series: SPEC Midrange, SPEC Entry-level, TPC-C High-end, TPC-C Midrange and TPC-C Entry-level.]

    Notes:
    1. Price-Performance calculated using HMI? adjusted server hardware costs.
    2. Performance calculated using TPC-C and SPECweb2005 benchmark results for representative servers in each server class for years 2004-2008.
    3. SPECweb2005 benchmark results are for Banking workload.

    Sources: TPC-C, SPEC, HMI? 2010

    Figure 5: Improvements In Server Performance

    A New Way to Measure Capacity

    One contribution of this research is a way to aggregate capacity over very different kinds of servers. When server prices vary from $500 to $500,000, counting server units (boxes) is not an adequate measure. Instead we implicitly assume that companies spend dollars for server capacity in the most efficient way for their specific requirements. Then we carefully measure capital cost per unit of benchmarked performance. Then we translate different benchmarks into a common unit: bytes. This is discussed further in Section 5.


    reported (this workload is intended to simulate the use of a standby server for peak load or back-up processing as would be typical in an operational environment).26

    TPC-C (Online Transaction Processing)

    The TPC-C benchmark simulates a large wholesale outlet's inventory management system. The test system is made up of a client system, which simulates users entering and receiving screen based transaction data, a database server system, which runs the database management system (DBMS), and the storage subsystem, which provides the required disk space for database and processing needs.27 The performance of the test system is measured when it is tasked with processing numerous short business transactions concurrently (Figure 7).

    The TPC-C workload involves a mix of five concurrent transactions of different types and complexity either executed on-line or queued for deferred execution:28

    New Order: a new order entered into the database (approx 45%)

    Payment: a payment recorded as received from a customer (approx 43%)

    Order Status: an inquiry as to whether an order has been processed (approx 5%)

    Stock Level: an inquiry as to what stocked items have a low inventory (approx 5%)

    Delivery: an item is removed from inventory and the order status is updated (approx 5%)

    TPC-C publishes two sets of results: raw performance, measured in transactions per minute, and price-performance, where the cost of the test system is divided by the transaction rate. We adjusted the published costs to reflect realistic hardware configurations.
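    The price-performance ratio described above is a simple division of system cost by measured transaction rate. The Python sketch below illustrates it with hypothetical numbers: the $50,000 published cost, the $30,000 adjusted cost and the 100,000 transactions per minute are made up for the example and are not benchmark results, and the size of the adjustment is not taken from the study.

```python
def price_performance(system_cost_usd, transactions_per_minute):
    """Dollars per unit of throughput: test system cost divided by transaction rate."""
    if transactions_per_minute <= 0:
        raise ValueError("transaction rate must be positive")
    return system_cost_usd / transactions_per_minute

# Hypothetical test system, for illustration only.
tpm = 100_000             # measured transactions per minute
published_cost = 50_000   # cost of the full benchmark configuration, in dollars
adjusted_cost = 30_000    # cost after a hypothetical adjustment to the hardware list

published_ratio = price_performance(published_cost, tpm)  # dollars per tpm
adjusted_ratio = price_performance(adjusted_cost, tpm)    # dollars per tpm
```

    A lower ratio means more throughput per dollar, so adjusting the cost side of the ratio changes the capacity that a given level of spending is assumed to buy.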

    [Figure 6 is a flow diagram in two numbered steps. Step labels: "By sales year 2004-2008", "By server type (Entry-level, Midrange, High-end)", "By workload W (TPC, SPEC, VMmark)" and "Summing over sales, server types and workloads". Boxes: 200x Worldwide Server Sales in current U.S. dollars; Adjusted Price Performance, 200x Reference Server ($ per benchmark transaction / hour); Bytes Per Transaction (byte equivalent for each benchmark transaction); Installed capacity of 200x servers to process workload W (bytes/hour); Server availability (hours/year) x fraction of full load; Workload Allocation (% servers each workload); Total Bytes, all types.]

    Figure 6: Schematic Of Modeling Approach

    Table 6: Performance Benchmarks by Server Class by Server Workload

    Server Workload    Entry-level    Midrange    High-end
    Database Server    TPC-C    TPC-C    TPC-C
    Web Server    SPECweb2005    SPECweb2005
    Virtual Machine (six virtual machines each running a single workload concurrently)    VMmark    VMmark


    SPECweb2005 (Web server)

    SPECweb2005 is a benchmark published by the Standard Performance Evaluation Corporation (SPEC) for measuring a system's ability to act as a Web server. The benchmark is designed around three workloads: banking, e-commerce, and support. SPECweb2005 reports a performance score for each of the three workloads, measured in the number of simultaneous user sessions the system is able to support while meeting quality of service (QOS) requirements. An overall, weighted score is also reported.29 (Figure 8)

    The three workloads are designed to simulate enterprise applications and contain the following tasks:

    - SPECweb2005_Banking: The banking load emulates a user session where the banking site transfers encrypted and non-encrypted information with simulated users. Typical user requests include log-on/log-off, bank balance inquiry, money transfers, etc.

    - SPECweb2005_Ecommerce: The e-commerce load emulates an e-commerce site where customers browse product information and place items in a shopping cart for purchase. Simulated activity includes customer scanning of product web-pages, viewing specific products, placing orders in a shopping cart and completing the purchase.

    - SPECweb2005_Support: The support workload emulates a vendor support site that provides downloads such as driver updates and documentation. The load simulates customers viewing and downloading product and support documentation.

    VMmark (virtual machine workloads)

    VMmark, published by VMware, is the first virtual machine benchmark in the industry.30 It is designed to measure the performance of virtualized servers using a collection of sub-tests derived from benchmarks developed by the Standard Performance Evaluation Corporation (SPEC). VMmark test workloads include: Database server, Mail server, Java server, Web server (using a version of SPECweb2005), File server, and a Standby (idle) server. The unit of server work measured is called a

    [Figure 7 diagram: Client System, Database Server and Storage Subsystem; Mid-Tier Transaction Servers With Queuing Application. Transaction Mix: New Order (45%), Payment (43%), Order Status (5%), Delivery (5%), Stock Level (5%).]

    Sources: Transaction Processing Performance Council (TPC); Hewlett Packard, An overview of the TPC-C benchmark on HP ProLiant servers and server blades, August 2007.

    Figure 7: TPC-C Simulated Workflow


    Tile. Each Tile represents one group of six virtual machines, each machine running one workload: (Figure 9)

    Mail server: This workload simulates a mail server in a company data center.

    Java server: This workload simulates Java performance, important in many multi-tiered enterprise applications.

    Web server: This workload simulates Web server performance. A modified version of SPECweb2005 is used.31

    Database server: The database server workload simulates an online transaction workload, similar to a light version of TPC-C.32

    File server: This workload simulates the performance of a file server. File servers are computers responsible for the central storage and management of data files so that other devices on the same network can access the files.

    Standby server: The standby workload simulates a stand-by or idle server, used in computing environments to handle new workloads, or workloads with unusual peak load behavior.

    VMmark reports a performance metric for each workload, and the total number of Tiles the system is able to run within quality of service (QOS) requirements.33

    3.2 Core and Edge Computing Model: Server Measurement Points

    Figure 10 illustrates a traditional three-tier Web server configuration. The first tier, the Web presentation server, serves content to the client devices and users accessing the system. The application, for example, could be customers accessing their online banking accounts and conducting transactions. The second server tier, comprising the Web application server, processes requests made by the presentation server and sends back the appropriate responses. The Web server

    [Figure 8 diagram of the eCommerce workload: Client System, Prime Client, Web Server, BESIM (Back End Simulator) and Storage Subsystem. Request types listed: index, search, browse, browse product line, product detail, customize1, customize2, customize3, cart, login, shipping, billing, confirm, with a Total row.]

    Source: Standard Performance Evaluation Corporation (SPEC).

    Figure 8: SPECweb2005 Simulated Workflow


    may need to access the third server tier, the database server, to process these requests. The edge server's job is to pass requests and data (content) back and forth between the Web application server, the client devices and users. Each server tier may contain multiple processors, and the entire system could be configured in a traditional one-application-per-server deployment, or in a virtualized server deployment.

    A scaled-up version of the three-tier model is illustrated in Figure 11. This figure illustrates a simplified model of a current enterprise computing environment. There are three IT system environments. A Core high performance transaction processing environment is located in the company's main datacenters. An Edge applications and Web services environment surrounds the core: edge servers process workloads connecting the company and its business processes to business partners and to customers. Said to be sitting on the outside, or edge, of the datacenter, edge servers are a mix of Web servers, firewall servers, file and print servers, storage servers and email servers; in almost all companies they comprise the largest part of a company's IT infrastructure. Client devices are in user hands and attached to the edge. Client devices include all of the digital devices used by employees: mobile phones, notebook computers, storage devices and so forth.34

    How Representative Are Industry Benchmarks?

    Much of our work has gone into estimating the number of bytes processed in workloads simulated by industry benchmarks. Determining the correct byte equivalent of server transactions for simulated workloads, however, is tricky. Benchmark workloads are imperfect representations of workloads in real life. And benchmark engineers use a variety of tricks to optimize the performance of server hardware when setting up and running benchmark tests. This includes adding high performance hardware that would be economically if not technically infeasible in most work situations. Vendors themselves advise customers to view benchmarks as informative, but no substitute for customers conducting their own configuration and performance testing. We have attempted to account for the many factors as best we can. Our assumptions are subject to change based on improved methodology and better data.

    [Figure 9 diagram: a Test Server hosting Virtual Machines. Six Virtual Machines = One VM Tile: Mail Server, File Server, Standby Server, OLTP Database, Web Server, Java Order Entry.]

    Notes: OLTP (online transaction processing). File Server and Standby Server not included in calculations.

    Source: VMware

    Figure 9: VMmark Simulated Workload


    [Three-Tier Web Server diagram labels: User Devices; Edge Servers; Web Presentation Server; Web Application Server; Database Server; Storage Device]

    Figure 11 also illustrates our server measurement points and their corresponding benchmarks. Core OLTP transactions are measured by TPC-C; Web services applications are measured by SPECweb2005; and VM servers, which can be in either environment, are measured by VMmark. We do not measure application servers directly: there is no single benchmark that addresses a representative subset of applications running on general-purpose servers in a typical company.35 While some fraction of Web-based application transactions are measured in VMmark and SPECweb2005, omitted are server transactions that support middleware, packaged software application suites such as PeopleSoft, SAP and business intelligence programs such as SAS. We would need to know how these programs scale with respect to a database transaction, or a Web eCommerce transaction, to estimate how their inclusion would affect our capacity and information calculations. We continue to research this area.

    3.3 Server Workload Allocations

    We estimate server capacity as the ratio of dollars spent over dollars per measured transactions per minute:

    World server capacity = $ spent / $ per measured transactions per minute

    Where:

    $ spent = dollars spent in a given calendar year on Entry-level, Midrange and High-end servers

    $ per measured transactions per minute = price of test server hardware divided by measured server performance

    To derive capacities, we need 1) the dollars spent for a specific server class, 2) benchmark tests that report the price-performance of the server class by year, and 3) the dollar value of servers allocated to a particular benchmark. As workload allocations by companies are not directly measurable, we rely on our own estimates, guided by expert interviews, industry data and our own judgment. Table 7 presents our workload allocations. We use these percentages in our capacity calculations. We estimate, for example, that just over a third of all server work processed in companies is made up of core database transactions. The remaining two-thirds of server work is processed on Web servers, with a quarter of that work virtualized.36 (Figure 12)

    Figure 10: Three-Tier Web Server Configuration

    Table 7: Estimated Workload Percentages by Year

    Year    TPC-C    SPECweb2005    VMmark
    2004    35%      65%            ---
    2005    35%      65%            ---
    2006    37%      63%            ---
    2007    38%      37%            25%
    2008    40%      30%            30%
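    The capacity formula and the Table 7 allocations can be combined in a few lines. The sketch below is illustrative only: the function names, the $10 billion spending figure and the $0.50 price-performance figure are hypothetical placeholders, not values from this report, and the sketch assumes that allocated capacity is simply the workload share of spending divided by price-performance.

```python
# Workload shares from Table 7, as fractions of server spending per benchmark.
WORKLOAD_SHARE = {
    2004: {"TPC-C": 0.35, "SPECweb2005": 0.65},
    2005: {"TPC-C": 0.35, "SPECweb2005": 0.65},
    2006: {"TPC-C": 0.37, "SPECweb2005": 0.63},
    2007: {"TPC-C": 0.38, "SPECweb2005": 0.37, "VMmark": 0.25},
    2008: {"TPC-C": 0.40, "SPECweb2005": 0.30, "VMmark": 0.30},
}


def capacity(dollars_spent, dollars_per_tpm):
    """Capacity = $ spent / ($ per measured transaction per minute)."""
    if dollars_per_tpm <= 0:
        raise ValueError("price-performance must be positive")
    return dollars_spent / dollars_per_tpm


def allocated_capacity(year, benchmark, dollars_spent, dollars_per_tpm):
    """Capacity for the share of spending allocated to one benchmark."""
    share = WORKLOAD_SHARE[year].get(benchmark, 0.0)
    return capacity(dollars_spent * share, dollars_per_tpm)


# Hypothetical inputs, for illustration only.
example_spend = 10e9        # dollars spent on one server class in a year
example_price_perf = 0.50   # dollars per transaction per minute
print(allocated_capacity(2008, "TPC-C", example_spend, example_price_perf))
```

    Keeping the allocation step separate from the ratio makes it easy to reproduce the unallocated case reported in Table 8, where every server in a class is assumed to run a single benchmark.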


    3.4 World Server Capacities

    Table 8 summarizes our calculations for server capacities by benchmark, by server class, by year. For example, referring to the top row of data (TPC-C, entry-level servers), the data entry for 2004, 80.8 billion transactions, can be interpreted as follows: if all entry-level servers sold in 2004 processed only TPC-C workloads, their maximum performance capacity would be 80.8 billion transactions per minute. By 2008, the corresponding TPC-C capacity was 392 billion transactions per minute, or slightly less than a five-fold increase in four years. In comparison, entry-level servers processing SPECweb2005 workloads had a core capacity of 125 billion transaction requests per minute in 2004. By 2008, the corresponding capacity was 957 billion transaction requests per minute, an eight-fold increase in four years. SPECweb2005 capacity, therefore, increased at roughly twice the rate of TPC-C capacity for entry-level servers from 2004 to 2008.

    Midrange servers show different capacity trends. If all midrange servers sold in 2004 had processed only SPECweb2005 workloads, their core capacity would have been 26.9 billion transaction requests per minute. By 2008, the corresponding capacity for SPECweb2005 workloads was 274 billion transaction requests per minute, or better than a ten-fold increase in four years. Several factors could account for the much higher growth rate of midrange server capacity compared with that of entry-level servers. Factors include: a higher proportion of multi-processor, multi-core midrange servers were sold (more processors and more cores positively affect benchmark performance); midrange server price-performance improved faster than entry-level server price-performance; or midrange server test configurations may have been able to take greater advantage of the other resources in the test system, positively affecting test performance.
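    The growth multiples quoted in this section follow directly from the 2004 and 2008 entries in Table 8. A minimal sketch of that arithmetic (the dictionary and function names are our own, chosen for illustration):

```python
# (2004 capacity, 2008 capacity) in billions per minute, taken from Table 8.
CAPACITY_BILLIONS = {
    ("TPC-C", "Entry-level"): (80.8, 392.3),
    ("SPECweb2005", "Entry-level"): (125.1, 956.7),
    ("SPECweb2005", "Midrange"): (26.9, 274.0),
}


def growth_multiple(start, end):
    """How many times larger the end value is than the start value."""
    if start <= 0:
        raise ValueError("start capacity must be positive")
    return end / start


for (benchmark, server_class), (cap_2004, cap_2008) in CAPACITY_BILLIONS.items():
    multiple = growth_multiple(cap_2004, cap_2008)
    print(f"{benchmark:12s} {server_class:12s} {multiple:5.2f}x from 2004 to 2008")
```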

    Our server capacity assumptions, methodology and calculations are complex and we will not attempt to explain them in this report. For interested readers, we have completed a background technical working paper which explains our key assumptions, describes our methodology in much greater detail, and gives sample calculations.37

    Figure 11: Core and Edge Computing Model

    [Figure 11 diagram labels: Core and Edge tiers; Database Servers, Virtual Machines, Storage Devices, Web Application Servers, Web Presentation Servers, Edge Servers, User Devices; measurement points TPC-C, SPEC and VMmark; flows marked Processed and Output.]

    4 WORLD SERVER INFORMATION

    4.1 How Many Bytes?

    World servers processed 9.57 zettabytes (9.57 × 10^21 bytes) of information in 2008. The majority of this information (6.31 zettabytes) was processed by entry-level servers. Midrange servers processed 2.8 zettabytes, and high-end servers processed 451 exabytes of information. (Figure 13)
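    As a quick consistency check, the three class totals can be summed and the grand total expressed per worker. The sketch below uses the 3.2 billion world worker count cited elsewhere in this report; the variable names are illustrative. The published class figures are rounded, so their sum differs slightly from the published 9.57 zettabyte total.

```python
EXABYTE = 10**18
ZETTABYTE = 10**21
TERABYTE = 10**12

# Bytes processed in 2008 by server class, as reported in this section.
bytes_by_class = {
    "Entry-level": 6310 * EXABYTE,   # 6.31 zettabytes
    "Midrange": 2800 * EXABYTE,      # 2.8 zettabytes
    "High-end": 451 * EXABYTE,       # 451 exabytes
}

summed_total = sum(bytes_by_class.values())
reported_total = 9570 * EXABYTE      # 9.57 zettabytes
world_workers = 3_200_000_000        # worker count cited in the report

print(f"Sum of class totals: {summed_total / ZETTABYTE:.3f} ZB")
print(f"Reported total:      {reported_total / ZETTABYTE:.2f} ZB")
print(f"Per worker per year: {reported_total / world_workers / TERABYTE:.2f} TB")
```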

    4.2 Contribution to Total Bytes

    Table 9 breaks down total bytes by server class and benchmark workload. The respective contribution of each server type and each workload is given by the row and column percentages in the table.

    Figure 12: Estimated Workload Percentages 2004-2008

    Table 8: Server Potential Capacities 2004-2008 (Capacity by Year of Purchase)

    Benchmark   Metric (billions)   Server Class    2004      2005      2006      2007      2008
    TPC-C       tpmC                Entry-level     80.80     98.10     191.10    184.00    392.30
    TPC-C       tpmC                Midrange        16.90     19.20     35.30     73.30     108.00
    TPC-C       tpmC                High-end        4.40      4.30      4.80      7.50      10.10
    SPEC        rpmSPEC             Entry-level     125.10    148.00    609.50    848.00    956.70
    SPEC        rpmSPEC             Midrange        26.90     53.80     83.90     175.20    274.00
    SPEC        rpmSPEC             High-end        ---       ---       ---       ---       ---
    VMmark      apmVM               Entry-level     ---       ---       ---       296.10    523.30
    VMmark      apmVM               Midrange        ---       ---       ---       56.50     76.70
    VMmark      apmVM               High-end        ---       ---       ---       ---       ---

    NOTES:
    Each number is transaction capacity if all servers in the class were performing a single benchmark.
    tpmC = transactions per minute C
    rpmSPEC = requests per minute SPEC
    apmVM = actions per minute VMmark
    --- = Workload not allocated to server class, or no benchmark test available

    Online transaction processing, measured by TPC-C, accounts for almost 45% of all server bytes processed in 2008. Web services and general computing, processed by entry-level and midrange servers and including bytes processed by virtual machines, account for the rest. Midrange and high-end servers process a disproportionate share of total bytes: high-end servers make up only about two-tenths of one percent of all installed servers (0.22%), but process 5% of total annual bytes. Midrange servers, far more numerous, make up approximately 5% of all installed servers, and process 29% of total annual bytes. In contrast, the ubiquitous entry-level servers make up over 94% of all the installed servers in the world, and process two-thirds of all the bytes. (Figure 14)
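    One way to make the word "disproportionate" concrete is to divide each class's share of bytes by its share of installed servers. The sketch below uses the rounded percentages quoted in the paragraph above; the ratio itself is our own illustrative measure and does not appear in the report.

```python
# (share of installed servers %, share of annual bytes %), rounded as in the text.
SHARES = {
    "Entry-level": (94.0, 66.0),
    "Midrange": (5.0, 29.0),
    "High-end": (0.22, 5.0),
}


def byte_intensity(server_pct, byte_pct):
    """Share of bytes divided by share of servers; 1.0 means proportional."""
    if server_pct <= 0:
        raise ValueError("server share must be positive")
    return byte_pct / server_pct


for server_class, (server_pct, byte_pct) in SHARES.items():
    ratio = byte_intensity(server_pct, byte_pct)
    print(f"{server_class:12s} processes {ratio:5.1f}x its share of servers")
```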

    The magnitude of OLTP transaction processing, almost half of all bytes, reflects the growing importance of workload-specific computing in recent years. The model reverses the general purpose computing model that has dominated enterprise computing for over a decade. We discuss workload-specific computing and other important computing trends in Section 5.

    Our byte totals are based on assumptions about how servers actually work in firms: how servers are allocated and what workloads are assigned to them, how they are utilized and what load factors are typical. We have based our assumptions on the best available data at the time of our study. However, there are inherent uncertainties in many of our assumptions, and they are subject to change based upon improved methodology and data. Interested readers should consult our background technical working paper for details on key assumptions, our methodology for converting benchmark transactions into bytes, and for a complete description of how we derived total 2008 server bytes.

    4.3 Our Results in Context: Comparisons with Other Information Studies

    Our total of 9.57 zettabytes is many times greater than findings from previous studies. For example, in their 2003 How Much Information? study, Peter Lyman and Hal Varian reported that the total volume of information flowing through electronic channels worldwide (telephone, radio, TV and the Internet) contained almost 18 exabytes of new information in 2002. Their total is three orders of magnitude less than our total calculated for just enterprise information. Why? The key was their focus on counting new information: Lyman and Varian did not count digital copies, nor did they attempt to count the amount of data processing that was necessary to produce new information. They counted just what they could identify as a new or unique piece of information, for example, the first airing of a radio or television show, a telephone call, or the first print copy of a book.38

    Our results also differ significantly from current industry studies. In 2007 IDC and EMC reported that the total digital universe, which they defined as information that is either created, captured, or replicated in digital form, was 281 exabytes.39 They hypothesized that by 2010 70% of the digital universe would be created by individuals, but enterprises would have responsibility for 85% of it. IDC updated their findings in 2010, estimating that by year-end the total digital universe would grow to 1.2 zettabytes, or about 5 times their 2007 total. IDC calculated a compound annual growth rate of approximately 60% between the years 2007 and 2010.40
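    The comparisons above rest on two simple calculations: a compound annual growth rate and an order-of-magnitude gap. A minimal sketch, using only the figures quoted in this section (the function names are ours):

```python
import math


def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def orders_of_magnitude(smaller, larger):
    """Base-10 logarithm of the ratio between two quantities."""
    return math.log10(larger / smaller)


# IDC digital universe: 281 exabytes (2007) to 1.2 zettabytes (2010).
idc_growth = cagr(281.0, 1200.0, 2010 - 2007)
print(f"IDC compound annual growth: {idc_growth:.1%}")

# Lyman and Varian: about 18 exabytes (2002) versus 9.57 zettabytes here (2008).
gap = orders_of_magnitude(18.0, 9570.0)
print(f"Gap between the two totals: about 10^{gap:.1f}")
```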

    How much is 9.57 zettabytes?

    There are about 2.5 megabytes in Stephen King's longest novel, so he would have to stack his novels from here to Neptune and back about 20 times to equal one year of server information. Each of the world's 3.2 billion workers would have to read through a stack of books 58 km (36 miles) long each year.

    Figure 13: World Server Information Summary 2008

    [Figure 13 values: Entry-level 6.31 zettabytes; Midrange 2.8 zettabytes; High-end 451 exabytes. Total Server Information in 2008: 9.57 zettabytes.]

    Table 9: Contribution to World Server Information 2008

    Workload Contribution by Server Class (in zettabytes)

    Workload             Entry-level   Midrange   High-end   Row Total   % by Workload
    TPC-C                2.58          1.21       0.45       4.24        44.40%
    SPEC                 1.66          0.89       ---        2.55        26.70%
    VMmark               2.07          0.69       ---        2.77        29.00%
    Column Total         6.31          2.8        0.45       9.57        100%
    % by Server Class    66.00%        29.30%     4.70%      100%

    Grand Total = 9.57 zettabytes = 9.57 × 10^21 bytes

    Why are there such large differences in our information totals from these earlier studies? There are many possible factors, including differences in definitions and in what sources of information are included and not included in the totals, among others. Most important is our use of a broader, more inclusive definition of information. We included estimates of the amount of data processed as input and delivered by servers as output. This is a different emphasis than estimating the amount of stored data (data at rest) or counting the first instance of new information being created (the first airing of a radio program, or the first release of a new television show).
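    The row and column percentages in Table 9 can be re-derived from the published zettabyte figures. The sketch below does so; because the table entries are rounded to two decimals, the recomputed percentages agree with the published ones only to within a few tenths of a percentage point. Names are illustrative.

```python
GRAND_TOTAL_ZB = 9.57

# Row and column totals in zettabytes, as published in Table 9.
ROW_TOTALS_ZB = {"TPC-C": 4.24, "SPEC": 2.55, "VMmark": 2.77}
COLUMN_TOTALS_ZB = {"Entry-level": 6.31, "Midrange": 2.8, "High-end": 0.45}


def percent_of_total(value_zb, total_zb=GRAND_TOTAL_ZB):
    """Contribution of one row or column as a percentage of the grand total."""
    return 100.0 * value_zb / total_zb


for workload, zb in ROW_TOTALS_ZB.items():
    print(f"{workload:12s} {percent_of_total(zb):5.1f}% of bytes by workload")

for server_class, zb in COLUMN_TOTALS_ZB.items():
    print(f"{server_class:12s} {percent_of_total(zb):5.1f}% of bytes by server class")
```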

    4.4 Discussion: Estimating Enterprise Information from Servers

    If there is one major surprise in our analysis, it is the sheer volume of information processed by servers. Using a more detailed methodology, we found that much greater volumes of information are in flow in enterprises than has been previously documented. Moreover, our data also point to the continuing growth in information processing capacity and in performance per thousand dollars of server cost. Capacity trends appear to track the popular interpretation of Moore's Law, namely, that the delivered price-performance of computing systems doubles roughly every 1.5 to 2 years.
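    A growth multiple over a fixed period implies a doubling time, which is one way to compare capacity trends with the 1.5 to 2 year doubling mentioned above. The sketch below applies this to the TPC-C capacities in Table 8. Note the assumption: these are capacity figures, which reflect spending as well as price-performance, so the result is only a rough indication.

```python
import math

# (2004, 2008) TPC-C capacity in billions of tpmC, from Table 8.
TPCC_CAPACITY = {
    "Entry-level": (80.8, 392.3),
    "Midrange": (16.9, 108.0),
    "High-end": (4.4, 10.1),
}


def doubling_time(start, end, years):
    """Years needed to double, assuming constant exponential growth."""
    if start <= 0 or end <= start:
        raise ValueError("end must exceed a positive start value")
    return years * math.log(2) / math.log(end / start)


for server_class, (cap_2004, cap_2008) in TPCC_CAPACITY.items():
    years_to_double = doubling_time(cap_2004, cap_2008, 2008 - 2004)
    print(f"{server_class:12s} doubles about every {years_to_double:.2f} years")
```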

    While our analysis develops a composite picture of server workload processing in firms, we only look at servers. We have yet to add other important sources such as voice communication that does not go through a server.

    5 TRENDS, PERSPECTIVES AND CHALLENGES IN INFORMATION INTENSIVE COMPUTING

    5.1 Measuring Capacity and Performance in Emerging Information Architectures

    Figure 14: Contribution to World Server Information 2008

    [Figure 14: bar chart of zettabytes (0 to 3.0) by server class (Entry-level, Midrange, High-end) and benchmark (TPC-C, SPEC, VM). Grand Total = 9.57 × 10^21 bytes.]

    Our methodology for measuring enterprise server information asks, how much work do servers actually do? We calculated the amount of work processed and reported our results in capacity per dollar rather than capacity per server. This is an important transformation, and we made it for a number of reasons. First, our analysis uses benchmark tests that are consistent across server types and over the time period covered in our analysis (2004 to 2009). The same TPC-C workload, for example, used to test an entry-level server in 2005 was used to test a midrange or high-end server in 2007. Second, we calculated server capacity per dollar in order to apply a common yardstick across all server classes, server ages and workloads. Third, our method for calculating server capacities yields a theoretical upper limit of performance: the number of bytes processed per minute or per hour calculated from the benchmark results. Benchmark engineers tweak their systems for maximum performance. Benchmark results, therefore, need to be weighted by average server load factors and available hours of use that approximate how an average server works in a company. We estimated server loads in practical terms (how hard the servers get used) based on our own data and judgment, and vetted our estimates with industry experts. Our load estimates do not reduce to a single measure of CPU, or memory, or I/O utilization. Rather, they are estimates of average utilization relative to the benchmarks and workloads used in our calculations.
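    The weighting step described above can be sketched as a single function: peak benchmark throughput is scaled by a load factor and by available hours, then converted to bytes. Every number and parameter name below is a hypothetical placeholder; the report's actual load factors and bytes-per-transaction conversions are given in its technical working paper, not here.

```python
MINUTES_PER_HOUR = 60


def annual_bytes(peak_per_minute, bytes_per_transaction, load_factor, hours_available):
    """Scale a peak benchmark rate to estimated bytes processed per year.

    peak_per_minute:       benchmark throughput at full load (transactions/minute)
    bytes_per_transaction: assumed bytes of input plus output per transaction
    load_factor:           average utilization relative to the benchmark (0 to 1)
    hours_available:       hours per year the server is available for work
    """
    if not 0.0 <= load_factor <= 1.0:
        raise ValueError("load_factor must be between 0 and 1")
    transactions = peak_per_minute * load_factor * hours_available * MINUTES_PER_HOUR
    return transactions * bytes_per_transaction


# Hypothetical example: 1 million transactions per minute at peak,
# 10,000 bytes per transaction, 25% average load, available all year.
estimate = annual_bytes(1_000_000, 10_000, 0.25, 24 * 365)
print(f"Estimated bytes per year: {estimate:.3e}")
```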

    Our capacity per dollar measure was especially helpful when confronting the practicalities of estimating total server information. Enterprise computing environments are context-specific. Computing and application workloads vary widely across firms, even among those of similar company size and industry sector. To make sense out of a complex environment, it was necessary to define a common yardstick that we could use with all servers and with all types of work being completed in firms. We will continue to research a more extensive capacity measure, one including additional IT equipment such as storage and network devices.

    5.2 Data Intensive Computing Platforms

    Our results help confirm the magnitude of big data challenges in industry today. Server capacities are roughly doubling every other year, driving similar growth rates in stored data and network capacity. The many challenges in data intensive computing are driven by increasing data volumes, the need for integration of ever increasing sources of heterogeneous data, and the need for rapid processing of data to support data-intensive decision-making. Many firms are finding it necessary to rethink their approaches to corporate IT for economies of scale and in view of new industry initiatives in cloud computing and green datacenters. Massively parallel computing systems for data intensive computing now come in many different forms, including multiple core systems, large memory systems, shared-nothing platforms, and software-optimized, custom hardware systems. Many of the same factors, from big data to cost of ownership, are putting pressure in the direction of more centralization of IT resources to address burgeoning capacity and data management issues. Increasingly, companies are confronted by a new set of technical, social, and business model issues in emerging information environments.

    Platforms for data intensive computing must be balanced in terms of their I/O capability, memory capacity, processing speeds, and the bandwidth of interconnects that link the corresponding components. While I/O bandwidth may be an important issue for some types of data intensive computing problems, in other cases the critical aspect is the I/O transaction rate that can be sustained. Three major types of architecture platforms are of interest to data intensive computing applications. We note them below and provide examples in high-performance and industry computing environments:

    Large memory systems. Very large memory systems can store very large data structures in memory and reduce or remove disk latencies during processing, and also greatly improve the performance of applications that exhibit random I/O accesses. Hardware and software vendors are already exploiting solid-state disks (SSDs) and investigating other large-scale memory technologies. For example, the National Science Foundation's (NSF) next supercomputer, called Gordon, is designed to use traditional High Performance Computing (HPC) with HPD, or High Performance Data processing. When fully configured and deployed, Gordon will feature 245 teraflops of total compute power (one teraflop or TF equals a trillion calculations per second), 64 terabytes (TB) of DRAM (dynamic random access memory), and 256 TB of flash memory, about one-quarter of a petabyte.41 Trends in large memory system deployments and processing are not addressed in our current analysis.

    Shared-nothing platforms, which consist of multiple independent nodes in parallel, have been prevalent since the mid-1980s as viable architectures for scalable, highly data parallel processing. More recently, shared-nothing systems using commodity hardware have proven effective for massive scale, data parallel applications, such as web indexing by Google. Systems of this type, with thousands of nodes, are in use at large, Internet-scale businesses. Our current analysis does not address the special cases of Google, Microsoft, and other Internet-scale businesses, either in server counts or in estimating capacities using workload analyses. While their inclusion would almost certainly not change the order of magnitude of our world analysis, an analysis of US enterprise server information would require further investigation. Also, shared-nothing platforms are a key component in cloud computing environments. There is a strong possibility that cloud computing will become an important, if not key, part of the solution for enterprise computing in the future. Therefore, incorporating shared-nothing architectures into our analysis will help address a significant component of enterprise information.

    Database Machines. Another area that is re-emerging is that of computer architectures designed for database intensive computing. The legacy in this area reaches back almost 30 years to the field of database machines and other specialized hardware designs to support specific classes of database applications. A number of factors, ranging from the need to deal efficiently with the data deluge, to the relative flexibility and improved costs of hardware design and fabrication, to the need for energy conservation, are leading towards a re-examination of architectures, with the goal of systems optimized for massive data processing. In the past, the area of database machines led to the development of hardware/software systems such as Teradata, and to parallel software systems like IBM DB2 Parallel Edition, a shared-nothing commercial database system.42 Currently, the release of Oracle's Exadata Database Machine points in the direction of specialized hardware design and optimization attuned to extreme database performance.43 In high-performance computing environments, some scientific experiments that expect to generate very large amounts of data are investigating hardware embedding of processing algorithms to deal with the continuous data rates from high-resolution instruments.44 In scientific computing and large-scale data warehouses, it may soon become necessary to think of provisioning datasets with hardware. The dataset becomes the first order object with the computing platform dependent on it, rather than the current practice, which is the reverse.
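    Of the three platform types above, the shared-nothing idea is the easiest to illustrate in a few lines: records are partitioned by key across independent nodes, each node works only on its own partition, and the partial results are merged. The sketch below is a toy, single-process illustration under our own assumptions (hash partitioning with CRC32, a sum aggregation); it does not describe any particular vendor's system, and real nodes would run in parallel on separate machines.

```python
import zlib
from collections import Counter


def node_for(key, n_nodes):
    """Pick the node that owns a key, using a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % n_nodes


def partition(records, n_nodes):
    """Split (key, value) records so that each key lives on exactly one node."""
    nodes = [[] for _ in range(n_nodes)]
    for key, value in records:
        nodes[node_for(key, n_nodes)].append((key, value))
    return nodes


def local_sum(node_records):
    """Work done independently on one node: sum the values for each key."""
    totals = Counter()
    for key, value in node_records:
        totals[key] += value
    return totals


def shared_nothing_sum(records, n_nodes=4):
    """Aggregate per key without any node reading another node's data."""
    merged = {}
    for node_records in partition(records, n_nodes):
        # Keys never overlap across nodes, so merging is a plain update.
        merged.update(local_sum(node_records))
    return merged


sample = [("orders", 3), ("returns", 1), ("orders", 4), ("refunds", 2)]
print(shared_nothing_sum(sample))
```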

    5.3 Back to the Future: Data Discovery, Data Generation, Data Preservation

    Data intensive computing is about the data, and necessarily requires a deep engagement between the business users on the one hand (sales managers, supply chain managers, financial analysts, etc.), and IT and technical experts on the other (the firm's IT professionals, technical specialists in vendor companies, etc.). This alignment does not happen without the required investments in time and resources made by senior business, line management and IT management. With the vast proliferation of available data, there is increasing need for innovative search techniques that assist users with data discovery. It should be possible, for example, for users to specify the type of data they are looking for and have a system respond with useful results as well as recommendations for guiding the next search step. Current business intelligence and general search applications do not provide this kind of capability. Another example would be the increasing need for integration of very heterogeneous data, given the need for addressing complex issues. In medicine, something as conceptually simple as a lifetime personal medical chart, or a database of all test results for a family, would be examples. Traditional methods of data integration require significant manual intervention to actually integrate the data (e.g. by creating integrated database views) and do not scale. Novel tools and techniques are needed to facilitate such integration. One approach is referred to as ad hoc data integration, an approach that allows users to control, on-the-fly, which data are to be integrated. However, this approach requires a significant semantic infrastructure to be in place, and that is very rarely the case.

    Finally, a longer-term issue in both enterprise and research environments is data archiving and digital data preservation. In research settings, preserving scientific data is generally deemed to have intrinsic value. In enterprise settings, business policy and statutory regulations require preservation of data for a number of years after use. Federal agencies including NIH and NSF have recently announced needs for major data plans to address issues of data archiving and data preservation. And long-term data archiving and data preservation is a growing challenge for business organizations, beyond current retention policies, typically seven years. There are many industries (financial services, insurance, exploration and geological sciences, engineering or entertainment) where arbitrary data age limits make little sense. We have not addressed data and information storage in this analysis, but we will do so in the future. The issues are complex, involving technical as well as policy considerations. Nonetheless, in the future digital data archiving and preservation will require as much enthusiasm in research and in industry settings as we have provided to data generation and data processing.


    APPENDIX

    Counting Very Large Numbers

    Byte (B) = 1 byte = 1 = One character of text
    Kilobyte (KB) = 10^3 bytes = 1,000 = One page of text
    Megabyte (MB) = 10^6 bytes = 1,000,000 = One small photo
    Gigabyte (GB) = 10^9 bytes = 1,000,000,000 = One hour of High-Definition video, recorded on a digital video camera at its highest quality setting, is approximately 7 Gigabytes
    Terabyte (TB) = 10^12 bytes = 1,000,000,000,000 = The largest consumer hard drive in 2008
    Petabyte (PB) = 10^15 bytes = 1,000,000,000,000,000 = AT&T carried about 18.7 Petabytes of data traffic on an average business day in 2008
    Exabyte (EB) = 10^18 bytes = 1,000,000,000,000,000,000 = Approximately all of the hard drives in home computers in Minnesota, which has a population of 5.1M
    Zettabyte (ZB) = 10^21 bytes = 1,000,000,000,000,000,000,000
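    The decimal units in this appendix can be applied mechanically. The helper below is a small illustrative sketch (the function name and two-decimal formatting are our own choices) that expresses a byte count in the largest unit that fits, using the powers of ten listed above.

```python
# Decimal (power-of-ten) units, largest first, matching the appendix.
UNITS = [
    ("ZB", 10**21),
    ("EB", 10**18),
    ("PB", 10**15),
    ("TB", 10**12),
    ("GB", 10**9),
    ("MB", 10**6),
    ("KB", 10**3),
    ("B", 1),
]


def format_bytes(n_bytes):
    """Express a non-negative byte count in the largest unit that fits."""
    if n_bytes < 0:
        raise ValueError("byte count must be non-negative")
    for symbol, size in UNITS:
        if n_bytes >= size:
            return f"{n_bytes / size:.2f} {symbol}"
    return "0 B"


print(format_bytes(957 * 10**19))   # total server information in 2008
print(format_bytes(451 * 10**18))   # high-end server share
print(format_bytes(2_500_000))      # a 2.5 megabyte novel
```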


    ENDNOTES

    1. In December of 2007, Steven Lohr of The New York Times asked whether information overload was a $650 billion drag on the US economy, citing estimates he found in analyst and industry sources. Steven Lohr, "Is Information Overload a $650 Billion Drag on the Economy?" The New York Times, Bits, December 20, 2007.

    2. Roger E. Bohn and James E. Short, How Much Information? 2009 Report on American Consumers, Global Information Industry Center (GIIC), University of California San Diego, December 2009. http://hmi.ucsd.edu/howmuchinfo.php

    3. We do not include computer, communications or disk overhead in our calculations. By overhead, we refer to the amount of processing resources used by system software, such as the operating system, transaction processing (TP) monitor or database manager. In communications, overhead is data that is not part of the user data, but is stored or transmitted with it. Examples would include data for error checking, channel separation, or addressing information.

    4. The World Factbook, U.S. Central Intelligence Agency, available at: https://www.cia.gov/library/publications/the-world-factbook/geos/xx.html; International Labor Organization, Labor Force Statistics, available at: http://laborsta.ilo.org/STP/guest

    5. Dun & Bradstreet D-U-N-S registration listed 151 million business in its Worldbase glob