Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR...
![Page 1: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/1.jpg)
Data Laden Clouds: Trends and Insights
Roger Barga, PhD
Architect, eXtreme Computing Group, MSR
barga@microsoft.com http://research.microsoft.com/barga
![Page 2: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/2.jpg)
Presentation Outline
• Technology trends
  • Riding the exponentials
• Convergent invisibility
  • NUIs and computing on behalf
• Client+Cloud experiences
  • Opportunity for Data and Analytics
• Cloud infrastructure challenges
  • Packaging, hardware, software, security
• Thoughts on the future
![Page 3: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/3.jpg)
It’s Easy To Forget That Not Very Long Ago …
• There were few or no experiences with …
  • Web sites, email, spam, phishing, computer viruses
  • e-commerce, digital photography or telephony
• Cell phones were rare and expensive
• A portable cassette player was still cool
• HiFi was more common than WiFi
• A “friend” was someone you actually knew
The future depends on vision and context …
![Page 4: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/4.jpg)
Computing Eras: Paucity To Plethora
Mainframe Era
Pre-PC Era (1980)
PC Era (1995)
Internet Era (2000)
Consumer Era (Today+)
21st century implicit and natural computing
• Increasingly natural interfaces
• Embedded intelligence in everyday objects
• Ubiquitous network access and cloud services
![Page 5: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/5.jpg)
What Has Changed?
• System on a chip designs
  • Powerful mobile devices
• Graphics processing units
  • High quality graphics
• Explosive data growth
  • Ubiquitous sensors and media
• Inexpensive embedded computing
  • Everyday smart objects, …
• Wireless spectrum pressure
  • Mobile device growth
• New software models
  • Social networks, clients+clouds
[Figure: system-on-a-chip floorplans for Server, Desktop, and Mobile parts, each tiled from LPIA x86 cores, OoO x86 cores, 1 MB caches, GPUs, DRAM controllers, PCIe controllers, a NoC, and custom acceleration]
![Page 6: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/6.jpg)
Orders of Magnitude Always Matter
• Multidisciplinary challenges are the present and future
  … and the tools must empower, not frustrate
• A computation task has four characteristic demands:
  • Networking – delivering questions and answers
  • Computation – transforming information to produce new information
  • Data access – access to information needed by the computation
  • Data storage – long term storage of information
• The ratios among these and their costs are critical
New applications and systems will arise … if we create the right environment
![Page 7: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/7.jpg)
Imagine a Future Where …
• Your car drives and navigates for you
  … and also parks the car (already a feature on some cars)
• Your sound system only plays music you love
  … because it knows about every song you’ve ever heard
• Your phone only rings when you want to answer
  … because it knows your emotional state and social context
• All your family memories are recorded automatically
  … via MEMS-based sensors and solid state storage
• Your body calls an ambulance when you’re ill
  … via implanted, biologically powered diagnostic sensors
• Your DNA sample and lifestyle determine personalized treatment
  … because genotype-phenotype models are specific
• Your office adjusts its behavior to your needs
  … because it knows what you want to do
![Page 8: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/8.jpg)
Successful Technologies Are Invisible
![Page 9: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/9.jpg)
[Timeline figure: interface eras and their signature applications]
DOS (1981-1995): spreadsheets, word processors
GUI (1985-present): desktop publishing, multimedia
INTERNET (1993-present): email, web browsers
CLIENT+CLOUD (2006-present): location-based apps, social networks
TODAY
NUI (2012-…): anticipatory, human-centric
![Page 10: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/10.jpg)
Enhanced GUI Versus NUI
[Figure labels: Single Touch, Multi-touch, Handwriting, Speech, Voice, Gestures, Expressions, Context, Tasks, Environment]
![Page 11: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/11.jpg)
![Page 12: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/12.jpg)
What Is An Application?
Microsoft Kinect
An FFT?
• No, it’s an algorithm
A rendering pipeline?
• No, it’s a software library
A feature recognition system?
• No, it’s a building block
Our notion of “application” is increasingly complex
• Many integrated and interoperating components
Our tools must enable creativity accordingly, creating experiences
![Page 13: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/13.jpg)
Working at your command
Working on your behalf
![Page 14: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/14.jpg)
Fixed Portable Specialty Mobile
Create the Experience
The Cloud
The Clients
Intelligent Objects
![Page 15: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/15.jpg)
The Future of Experiences
[Figure labels]
Client
Cloud
Hybrid Experiences
Augmented Interaction
Context Awareness
Environment Awareness
Anticipatory Processing
Adaptive Behavior
Public Data Services
Trust & Security Services
Private Data Services
Sensory Inputs
![Page 16: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/16.jpg)
What’s A Cloud?
Computer Rooms: Cloud COGS Matter
![Page 17: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/17.jpg)
Microsoft’s Data Center Evolution And Economics
[Timeline figure, 2005 through 2010]
Generation 1: Data Center Co-Location
Generation 2: Quincy and San Antonio
Generation 3: Chicago and Dublin
Generation 4: Modular Data Center
Deployment Scale Unit: Server, Rack, Containers, IT PAC, Facility PAC
Figure labels: Capacity; Density & Deployment; Scalability & Sustainability; Time to Market, Lower TCO
![Page 18: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/18.jpg)
Discovery and Innovation in 2020
“In the last two decades advances in computing technology, from processing speed to network capacity and the Internet, have revolutionized the way scientists work. From sequencing genomes to monitoring the Earth's climate, many recent scientific advances would not have been possible without a parallel increase in computing power – and with revolutionary technologies such as the quantum computer edging towards reality, what will the relationship between computing and science bring us over the next 15 years?”
http://research.microsoft.com/towards2020science
![Page 19: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/19.jpg)
Lack of Broad Access
[Figure: 70M scientists and engineers in total; segments of 1M and 14M are charted against High Performance Data-intensive Capacity with an 80% / 20% split; the remaining 55M have little to no access to high performance data-intensive capacity]
![Page 20: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/20.jpg)
New Bytes of Information in 2010
1.2 x 10^21
Source: IDC, as reported in The Economist, Feb 25, 2010
![Page 21: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/21.jpg)
Sources: The Economist, Feb ‘10; IDC
By 2016 the new Large Synoptic Survey Telescope in Chile will acquire 140 terabytes in 5 days, more than Sloan acquired in 10 years
In 2000 the Sloan Digital Sky Survey collected more data in its first week than was collected in the entire history of astronomy
The Large Hadron Collider at CERN generates 40 terabytes of data every second
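To put the quoted volumes on one scale, the short sketch below converts them into sustained byte rates. It only restates the slide's figures; the decimal definition of a terabyte (10^12 bytes) and the helper name `bytes_per_second` are assumptions made for illustration.

```python
# Convert the data volumes quoted on the slide into sustained rates.
# Assumption: 1 terabyte = 10**12 bytes (decimal convention).
TB = 10**12
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_second(total_bytes: float, seconds: float) -> float:
    """Average rate needed to move total_bytes in the given time."""
    return total_bytes / seconds


# Large Synoptic Survey Telescope: 140 terabytes in 5 days.
lsst_rate = bytes_per_second(140 * TB, 5 * SECONDS_PER_DAY)

# Large Hadron Collider: 40 terabytes every second.
lhc_rate = bytes_per_second(40 * TB, 1)

print(f"LSST: about {lsst_rate / 1e6:.0f} MB/s sustained")
print(f"LHC:  {lhc_rate / TB:.0f} TB/s")
print(f"LHC rate is about {lhc_rate / lsst_rate:,.0f} times the LSST rate")
```

Read this way, the telescope figure is a steady stream of a few hundred megabytes per second, while the collider figure is several orders of magnitude beyond that.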
![Page 22: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/22.jpg)
Economics of Storage
[Chart: Disk Storage (per gigabyte) and Web Storage (per gigabyte), 2000 through 2010; value labels $44.56, $1,250, $0.07, $0.15]
Source: Wired Magazine April 2010; Figures represented in USD
But remember, … free storage is like free puppies
![Page 23: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/23.jpg)
Social Implications of the Data Deluge
• Hypothesis-driven
  • “I have an idea, let me verify it.”
• Exploratory
  • “What correlations can I glean?”
• Different tools and techniques
  • Rapid exploration of alternatives
  • Data volume and complexity are assets
  • … and challenges
• Simplicity really matters
![Page 24: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/24.jpg)
Let Researchers Be Researchers…
Most researchers do not want to be system administrators
They don’t want to learn to use supercomputers
They want to focus on their research
They use standard tools: spreadsheets, statistical packages, desktop visualization
Programming = modifying a few parameters in a trusted scripting language
![Page 25: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/25.jpg)
Let Researchers Be Researchers…
BUT …
The data deluge means they must solve problems 10,000 times the capacity of their desktop
Research is now interdisciplinary
Sharing access to large data collections and analysis tools is the future
A paradigm shift is coming
![Page 26: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/26.jpg)
Seamless Client Plus Cloud
• Magnify client power via the cloud
  • Data analysis and computation
• Persist and share data in the cloud
  • Multidisciplinary data fusion and scale
• Leverage via client tools/metaphors
  • Analysis acceleration (Excel, toolkits, codes)
  • Remote rendering and client visualization
  • Data provenance, collaboration …
[Figure labels: Compute, Blob Storage, Table Storage, …]
![Page 27: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/27.jpg)
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• One of the most important software tools in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A typical BLAST run can take 700 to 1,000 CPU hours
For most biologists, there are two choices for running large jobs
• Build a local cluster
• Submit jobs to NCBI or EBI (long job queue times)
![Page 28: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/28.jpg)
R. palustris as a platform for H2 production
Identify key drivers for producing hydrogen, a promising alternative fuel – understand R. palustris well enough to be able to improve its H2 production;
Characterize a population of strains and use integrative genomics approaches to dissect the molecular networks of H2 production;
BLAST to query 16 strains to sort out genetic relationships
• Each strain, estimated ~5,000 proteins
• Jobs kicked off NCBI clusters before completion
• Against NCBI non-redundant proteins in ~30 min
• Against ~5,000 proteins from another strain < 30 sec
• Publishable result in one day for roughly $150.
Eric Schadt, Pac Bio and Sam Phattarasukol, Harwood Lab, UW
![Page 29: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/29.jpg)
NCBI BLAST on Windows Azure
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • split the input sequences
  • query partitions in parallel
  • merge results together when done
• Follows the general suggested application model for Windows Azure
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic Cloud
  • Large data-set management
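The query-segmentation pattern above (split, query partitions in parallel, merge) can be sketched in a few lines. This is a minimal illustration and not the AzureBLAST implementation: a local thread pool stands in for worker roles pulling from a queue, and `toy_search`, `Hit`, and the default partition size are hypothetical names and values chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

Hit = Tuple[str, int]  # hypothetical result record: (query sequence, score)


def split(sequences: Sequence[str], partition_size: int) -> List[List[str]]:
    """Split the input sequences into consecutive partitions."""
    return [list(sequences[i:i + partition_size])
            for i in range(0, len(sequences), partition_size)]


def query_partition(partition: List[str],
                    search: Callable[[str], Hit]) -> List[Hit]:
    """Run the search over one partition; partitions are independent."""
    return [search(seq) for seq in partition]


def parallel_query(sequences: Sequence[str],
                   search: Callable[[str], Hit],
                   partition_size: int = 100,
                   workers: int = 4) -> List[Hit]:
    """Split, query the partitions in parallel, then merge the results."""
    partitions = split(sequences, partition_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in partition order, so the merge is stable.
        per_partition = list(
            pool.map(lambda p: query_partition(p, search), partitions))
    # Merge step: concatenate the per-partition results.
    return [hit for hits in per_partition for hit in hits]


def toy_search(seq: str) -> Hit:
    """Stand-in for a real BLAST call: scores a sequence by its length."""
    return (seq, len(seq))


if __name__ == "__main__":
    print(parallel_query(["ACGT", "AC", "A"], toy_search,
                         partition_size=2, workers=2))
```

Because each partition is queried independently, the same structure maps onto a queue of tasks consumed by elastic workers; only the merge step needs to see all results.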
![Page 30: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/30.jpg)
AzureBLAST Task-Flow
A simple split/join pattern
Leverage the multiple cores of one instance
• argument “-num_threads” of NCBI-BLAST
Task granularity
• Large partitions: load imbalance
• Small partitions: unnecessary overheads
  • NCBI-BLAST overhead
  • Data transferring overhead
Best Practice: use test runs to profile, then set the partition size to mitigate the overhead
[Figure: a splitting task fans out to multiple BLAST tasks, whose results feed a merging task]
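The task-granularity trade-off can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not a measurement: every sequence costs one time unit, each task pays a fixed overhead (standing in for BLAST startup and data transfer), and tasks are handed greedily to whichever worker frees up first. The function name `makespan` and all numbers are hypothetical.

```python
import heapq
from typing import List


def makespan(seq_costs: List[float], partition_size: int,
             workers: int, task_overhead: float) -> float:
    """Finish time when consecutive partitions are scheduled greedily."""
    # One task per partition: fixed overhead plus the cost of its sequences.
    tasks = [task_overhead + sum(seq_costs[i:i + partition_size])
             for i in range(0, len(seq_costs), partition_size)]
    # Min-heap of the times at which each worker becomes free.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    for cost in tasks:
        start = heapq.heappop(free_at)       # earliest available worker
        heapq.heappush(free_at, start + cost)
    return max(free_at)


if __name__ == "__main__":
    costs = [1.0] * 1000   # hypothetical: 1000 sequences, unit cost each
    for size in (1, 25, 125, 500):
        t = makespan(costs, size, workers=8, task_overhead=5.0)
        print(f"partition size {size:>3}: makespan {t}")
```

With these made-up numbers, one-sequence partitions finish at 750 time units because overhead dominates, 500-sequence partitions finish at 505 because six of the eight workers sit idle, and 125-sequence partitions finish at 130. Real sequence costs vary, which is why the slide recommends profiling with test runs rather than relying on a model.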
![Page 31: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/31.jpg)
Micro-Benchmarks Inform Design
Task size vs. Performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. Performance
• Super-linear speedup with larger size worker instances
• Primarily due to the memory capability
Task Size/Instance Size vs. Cost
• Extra-large instance generated the best and the most economical throughput
• Fully utilize the resource
![Page 32: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/32.jpg)
All-Against-All Experiment
Discovering Homologs
• BLAST Uniref100, non-redundant protein sequence database
• Discover the interrelationships of known protein sequences
“All against All” query
• The database is also the input query
• The protein database is large (4.2 GB size)
  • Total of 9,865,668 sequences to be queried
  • Theoretically, 100 billion sequence comparisons!
Performance estimation
• Estimated completion, 3,216,731 minutes (6.1 years) on 8 core VM
One of the biggest BLAST jobs as far as we know
• This scale of experiment is usually infeasible for most researchers
![Page 33: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/33.jpg)
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM)
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide 10 million sequences into multiple segments
  • Each will be submitted to one deployment as one job for execution
  • 300,000 tasks on 3500 cores on Azure (70,000 bp or 35 sequences per task)
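As a back-of-envelope check, the sketch below relates the estimate of 3,216,731 minutes on one 8-core VM to the 3500 cores used in the run. It assumes ideal linear scaling with no overhead or failures, which is an assumption of this example and not a claim from the slides; the variable names are illustrative.

```python
import math

# Figures taken from the slides.
ESTIMATED_MINUTES = 3_216_731   # estimated completion on one 8-core VM
CORES_PER_VM = 8
TOTAL_SEQUENCES = 9_865_668
CORES_USED = 3500
SEQUENCES_PER_TASK = 35

MINUTES_PER_DAY = 60 * 24
MINUTES_PER_YEAR = MINUTES_PER_DAY * 365

# Single-VM estimate expressed in years.
years_on_one_vm = ESTIMATED_MINUTES / MINUTES_PER_YEAR

# Assumption: perfect linear scaling from 8 cores to CORES_USED cores.
core_minutes = ESTIMATED_MINUTES * CORES_PER_VM
ideal_days = core_minutes / CORES_USED / MINUTES_PER_DAY

# Number of tasks at 35 sequences per task.
tasks_needed = math.ceil(TOTAL_SEQUENCES / SEQUENCES_PER_TASK)

print(f"one 8-core VM: {years_on_one_vm:.1f} years")
print(f"{CORES_USED} cores, ideal scaling: {ideal_days:.1f} days")
print(f"tasks at {SEQUENCES_PER_TASK} sequences each: {tasks_needed:,}")
```

Under these assumptions the job shrinks from about 6.1 years to roughly 5 days of wall-clock time, and about 282,000 tasks are needed, in line with the slide's round figure of 300,000.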
![Page 34: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/34.jpg)
Cloud System Upgrades
North Europe Data Center, 34,256 tasks processed in total
All 62 nodes lost tasks and then came back together. This is an update domain
~30 mins
~6 nodes in one group
![Page 35: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/35.jpg)
Failures Happen
West Europe Datacenter; 30,976 tasks were completed, and the job was killed
35 nodes experienced blob writing failure at the same time
Reasonable guess: Fault Domain is working
![Page 36: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/36.jpg)
Release of BLAST on Windows Azure
• Open source release of NCBI BLAST on Windows Azure (CTP)
  • Installation guide and users guide;
  • Developers guide in preparation;
  • Support the release for the next year – feature requests, fixes, …
  • Free access to NCBI reference data sets on Windows Azure, auto update; http://research.microsoft.com/azure
• Software can be installed and used immediately, customized for your institution (logos, private database, group databases), extend source
• Releasing result data from “all-against-all” run
  • BLAST Uniref100, non-redundant protein sequence database
  • Discover the interrelationships of known protein sequences
  • Available Dec. 1st, 2010.
![Page 37: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/37.jpg)
Microsoft Client+Cloud Partnership
• Azure cloud services
  • Storage and computing
• Tier one support
  • Hardware and Azure software
• Hosted data sets
  • Multidisciplinary data analysis
• Technical engagement team
  • Community collaborations
  • Application support
One step of a worldwide program
![Page 38: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/38.jpg)
HPC and Clouds: Twins Separated At Birth
• Similar technology issues
  • Node and system architectures
  • Communication fabrics
  • Storage systems and analytics
  • Physical plant and operations
  • Programming models
  • Reliability and resilience
• Differing culture and sociology
  • Design and operations
  • Management and philosophy
![Page 39: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/39.jpg)
Cloud/HPC Hardware Comparison
Predominant differences today
• Network architecture and SAN storage
Dan Reed’s hypotheses
• Convergence is coming

| Attribute | HPC | Cloud |
| --- | --- | --- |
| Processor | High-end x86 | x86 |
| Memory/Node | 1-8 GB | 8 GB+ |
| Local Disk | Scratch only | Permanent storage |
| SAN Storage | Common | Rare |
| Tertiary Storage | Common | Rare |
| Interconnect | Infiniband or 10 GigE | 1 GigE/10 GigE |
| Network | Flat | Hierarchical |
![Page 40: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/40.jpg)
Cloud Scaling: Lessons for HPC Exascale
• Environmental responsibility
  • Managing under a 100 MW envelope
  • Adaptive systems management
• Provisioning 100,000 servers
  • Hardware: at most one week after delivery
  • Software: at most a few hours
• Resilience during a blackout/disaster
  • Data center failure
  • Service rollover for 20M customers
• Programming the entire facility
  • Power, environmentals, provisioning
  • Component tracking, resilience, …
![Page 41: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/41.jpg)
Rethinking Node Architecture
Consistency
• Weak consistency is good
Component failure
• Failure as a first class object
Systemic resilience
• Upgrade during operation
• Never go down
[Figure: Windows Azure platform diagram. Labels: Applications; Windows Azure, Live Services, .NET Services, SQL Azure; Others, Windows Mobile, Windows Vista/XP, Windows Server; within Windows Azure: Compute, Storage, Config, Application, Fabric]
![Page 42: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/42.jpg)
Rethinking LAN/WAN Networking
• Break the LAN hierarchy
  • Multiple paths, commodity components
  • High bisection bandwidth
• We build WAN islands, not continents
  • Isolated facilities with limited connectivity
• Change the landscape
  • Serious, multiple terabit WANs
  • Many lambdas entering a facility
  • Fused node/LAN/WAN infrastructure
![Page 43: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/43.jpg)
Rethinking Packaging and Cooling
• People and hardware need not mix
  • Hardware cooling standards are conservative
  • Reliable at high temperature/humidity
• Optimize for efficiency
  • Cooling is (often) unnecessary
  • Design for ambient environments
• Energy reliability is (often) unnecessary
  • Design for power outages
• Use larger building blocks
  • Accept component failures
[Figure axes: Temperature, Humidity]
![Page 44: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/44.jpg)
Rethinking Reliability: Fail In Place
• Factory sealed units (FRUs)
• Over-provisioned for failure
• Dynamic reconfiguration
• Real-time, adaptive control
[Figure: Truncated Life Models. Boxes: Performance and Failure Data; Markov Performability Models; TCO and Provisioning. Plot of Performance against Elapsed Time, marked with Utility Threshold and Desired Lifetime]
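To make "over-provisioned for failure" concrete, the sketch below sizes a sealed unit so that enough components are still expected to be alive at the end of its desired lifetime. It is a deliberately simple expected-value calculation that assumes independent failures at a constant annual rate; it is not the Markov performability model named on the slide, and the function name and example numbers are hypothetical.

```python
import math


def units_to_install(required_at_end: int,
                     annual_failure_rate: float,
                     lifetime_years: float) -> int:
    """Units to seal in so the expected survivors meet the requirement.

    Assumption: independent, exponentially distributed lifetimes, so the
    expected surviving fraction after t years is exp(-rate * t). Failed
    units are left in place and never replaced.
    """
    surviving_fraction = math.exp(-annual_failure_rate * lifetime_years)
    return math.ceil(required_at_end / surviving_fraction)


if __name__ == "__main__":
    # Hypothetical example: 1000 working servers needed after 3 years,
    # with a 5% annual failure rate.
    print(units_to_install(1000, 0.05, 3))
```

For the hypothetical inputs shown, the answer is 1162 units, roughly 16% over-provisioning in exchange for never opening the container for repairs.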
![Page 45: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/45.jpg)
Rethinking Energy Provisioning
• Power redundancy is a major cost
  • Batteries to supply up to 15 minutes
• Use multiple sites, based on energy cost and carbon footprint
  • Electrical grid, solar, wind, fuel cell, …
  • Workload dispatching based on models
• Real-time optimization and prediction
  • Workload demand
  • Weather and seasonal models
  • Auction-based energy pricing
• Infrastructure
  • UPS, optical fiber and computing
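Dispatching workload across sites by energy cost and carbon footprint can be sketched as a small selection rule. Everything in this example is hypothetical: the `Site` fields, the site names and prices, and the idea of folding carbon into cost through a single price per kilogram are assumptions for illustration, not a description of any production dispatcher.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Site:
    name: str
    price_per_kwh: float        # energy price in USD per kWh
    carbon_kg_per_kwh: float    # emissions intensity of the local supply
    spare_capacity_kw: float    # power headroom available right now


def pick_site(sites: List[Site], load_kw: float,
              carbon_price_per_kg: float) -> Optional[Site]:
    """Choose the cheapest site that can absorb the load.

    Cost per kWh = energy price + carbon price * emissions intensity.
    Returns None when no site has enough spare capacity.
    """
    candidates = [s for s in sites if s.spare_capacity_kw >= load_kw]
    if not candidates:
        return None
    return min(candidates,
               key=lambda s: s.price_per_kwh
               + carbon_price_per_kg * s.carbon_kg_per_kwh)


if __name__ == "__main__":
    sites = [
        Site("grid-site", 0.10, 0.5, 500),    # hypothetical values
        Site("wind-site", 0.12, 0.1, 500),
        Site("small-site", 0.05, 0.9, 100),
    ]
    print(pick_site(sites, load_kw=200, carbon_price_per_kg=0.0))
    print(pick_site(sites, load_kw=200, carbon_price_per_kg=0.10))
```

With no price on carbon the cheaper grid-powered site wins; at the hypothetical carbon price of $0.10 per kilogram the lower-emission site becomes the better choice. A real dispatcher would also fold in the demand, weather, and auction-pricing models listed on the slide.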
![Page 46: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/46.jpg)
Some Research/Design Thoughts
• Draw the right bounding box
  • It defines the problem you solve
• Understand your workload
  • Use only the hardware you need
• Metrics reward and punish
  • Choose carefully what you measure
• Embrace component failure
  • Hardware is cheap and readily recyclable
• Machines and people do not mix well
  • Consider sealing hardware at the factory
• Engage multidisciplinary solutions
  • Mechanical, electrical, economic, social …
• Culture shapes behavior
  • Implicit versus explicit costs
![Page 47: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/47.jpg)
END-TO-END TRUST
INTELLIGENT MANAGEMENT
GLOBAL POLICY FRAMEWORK
HOLISTIC DESIGN
NEW EXPERIENCES
End-to-End Perspective
![Page 48: Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR barga@microsoft.com .](https://reader037.fdocuments.net/reader037/viewer/2022102710/5519e35f550346443e8b508a/html5/thumbnails/48.jpg)
© 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Questions?...