21st Century Computer Architecture A community white paper 21st Century Computer Architecture.
-
Upload
mina-alday -
Category
Documents
-
view
222 -
download
0
Transcript of 21st Century Computer Architecture A community white paper 21st Century Computer Architecture.
21st CenturyComputer
Architecture A community white paper
http://cra.org/ccc/docs/init/21stcenturyarchitecturewhitepaper.pdf
Technion, Haifa Israel, June 2013• Information & Commun. Tech’s Impact• Semiconductor Technology’s Challenges• Computer Architecture’s Future• Example: Bypassing Paged Virtual Memory
2
White Paper Participants
“*” contributed prose; “**” effort coordinatorThanks of CCC, Erwin Gianchandani & Ed Lazowska for guidance and Jim Larus & Jeannette Wing for feedback
Sarita Adve, U Illinois * David H. Albonesi, Cornell UDavid Brooks, Harvard ULuis Ceze, U Washington * Sandhya Dwarkadas, U Rochester Joel Emer, Intel/MIT Babak Falsafi, EPFL Antonio Gonzalez, Intel/UPC Mark D. Hill, U Wisconsin *,** Mary Jane Irwin, Penn State U * David Kaeli, Northeastern U * Stephen W. Keckler, NVIDIA/U TexasChristos Kozyrakis, Stanford UAlvin Lebeck, Duke UMilo Martin, U Pennsylvania
José F. Martínez, Cornell UMargaret Martonosi, Princeton U * Kunle Olukotun, Stanford UMark Oskin, U Washington Li-Shiuan Peh, M.I.T. Milos Prvulovic, Georgia Tech Steven K. Reinhardt, AMDMichael Schulte, AMD/U WisconsinSimha Sethumadhavan, Columbia UGuri Sohi, U Wisconsin Daniel Sorin, Duke UJosep Torrellas, U Illinois * Thomas F. Wenisch, U Michigan * David Wood, U Wisconsin * Katherine Yelick, UC Berkeley/LBNL *
3
20th Century ICT Set Up• Information & Communication Technology (ICT)
Has Changed Our Worldo <long list omitted>
• Required innovations in algorithms, applications, programming languages, … , & system software
• Key (invisible) enablers (cost-)performance gainso Semiconductor technology (“Moore’s Law”)o Computer architecture (~80x per Danowitz et al.)
4
Enablers: Technology +
Architecture
Danowitz et al., CACM 04/2012, Figure 1
Technology
Architecture
5
21st Century Promise• ICT Promises Much More
o Data-centric personalized health careo Computation-driven scientific discoveryo Human network analysiso Much more: known & unknown
• Characterized byo Big Datao Always Onlineo Secure/Privateo …
Whither enablers of future (cost-)performance gains?
6
Technology’s Challenges 1/2
Late 20th Century The New Reality
Moore’s Law —2× transistors/chip
Transistor count still 2× BUT…
Dennard Scaling —~constant power/chip
Gone. Can’t repeatedly double power/chip
Classic CMOS Dennard Scaling: the Science behind Moore’s Law
7National Research Council (NRC) – Computer Science and Telecommunications Board (CSTB.org)
Scaling:
Oxide: tOX/a
Results:
Power Density:
Voltage: V/a
Power/ckt: 1/a2
~Constant
(Finding 2)
Source: Future of Computing Performance: Game Over or Next Level?,
National Academy Press, 2011
Power Density: ~Constant
Post-classic CMOS Dennard Scaling
8National Research Council (NRC) – Computer Science and Telecommunications Board (CSTB.org)
Scaling:
Oxide: tOX/a
Results:
Voltage: V/a V
Power/ckt: 1a2
1/a2
Post Dennard CMOS Scaling Rule TODO:Chips w/ higher power (no), smaller (),
dark silicon (), or other (?)
9
Technology’s Challenges 2/2
Late 20th Century The New Reality
Moore’s Law —2× transistors/chip
Transistor count still 2× BUT…
Dennard Scaling —~constant power/chip
Gone. Can’t repeatedly double power/chip
Modest (hidden) transistor unreliability
Increasing transistor unreliability can’t be hidden
Focus on computation over communication
Communication (energy) more expensive than computation
1-time costs amortized via mass market
One-time cost much worse &want specialized platforms
How should architects step up as technology falters?
10
21st Century Comp Architecture
20th Century
21st Century
Single-chip in generic computer
Architecture as Infrastructure: Spanning sensors to cloudsPerformance plus security, privacy, availability, programmability, …
Cross-Cutting:
Break current layers with new interfaces
Performance via invisible instr.-level parallelism
Energy First● Parallelism● Specialization● Cross-layer design
Predictable technologies: CMOS, DRAM, & disks
New technologies (non-volatile memory, near-threshold, 3D, photonics, …) Rethink: memory & storage, reliability, communication
X
X
12
What Research Exactly?• Research areas in white paper (& backup slides)
1. Architecture as Infrastructure: Spanning Sensors to Clouds2. Energy First3. Technology Impacts on Architecture4. Cross-Cutting Issues & Interfaces
• Much more research developed by future PIs!
• E.g.: Efficient Virtual Memory for Big Memory Serverso Basu, Gandhi, Chang, Hill, & Swift [ISCA 2013]o Big Memory: graph500, memcached, databaseso Self-manage most memory (e.g., bufferpool)
1310/5/12
Execution Time Overhead: TLB Misses
1. Significant waste
2. Larger memory?3. Byte-addr NVM?
Lower is better
Hardware: Direct Segment
OFFSET
BASE LIMIT VA
Conventional Paging
PA
1 2 Direct Segment
Why Direct Segment?• Matches Big Memory Workload
needs• NO Paging => NO TLB Miss
15
Execution Time Overhead: TLB Misses
10/5/12
92-100% TLB “misses” to direct segmentRequires: Both small SW + small HW
changes
21st CenturyComputer
Architecture A community white paper
http://cra.org/ccc/docs/init/21stcenturyarchitecturewhitepaper.pdf
Technion, Haifa Israel, June 2013• Information & Commun. Tech’s Impact• Semiconductor Technology’s Challenges• Computer Architecture’s Future• Example: Bypassing Paged Virtual Memory
19
Back Up Slides• Detailed research areas in white paper
1. Architecture as Infrastructure: Spanning Sensors to Clouds
2. Energy First3. Technology Impacts on Architecture4. Cross-Cutting Issues & Interfaces
http://cra.org/ccc/docs/init/21stcenturyarchitecturewhitepaper.pdf
• Findings on National Academy “Game Over” Study
• Glimpse at DARPA/ISAT Workshop “Advancing Computer Systems without Technology Progress”
20
1. Architecture as Infrastructure: Spanning
Sensors to Clouds• Beyond a chip in a generic computer• To pillar of 21st century societal infrastructure.
o Computation in context (sensor, mobile, …, data center) o Systems often large & distributedo Communication issues can dominate computationo Goals beyond performance (battery life, form factor)
• Opportunities (not exhaustive)o Reliable sensors harvesting (intermittent) energyo Smart phones to Star Trek’s medical “tricorder”o Cloud infrastructure suitable for both “Big Data” streams
& low-latency qualify-of-service with stragglerso Analysis & design tools that scale
21
2. Energy First• Beyond single-core performance computer• To (cost-)performance per watt/joule
• Energy across the layerso Circuit/technology (near-threshold CMOS, 3D stacking)o Architecture (reducing unnecessary data movement)o Software (communication-reducing algorithms)
• Parallelism to save energyo Vast (fined-grained) homogeneous & heterogeneouso Improved SW stacko Applications focus (beyond graphic processing units)
• Specialization for performance & energy efficiencyo Abstractions for specialization (reducing 1-time cost)o Energy-efficient memory hierarchieso Reconfigurable logic structures
22
3. Technology Impacts on Architecture
• Beyond CMOS, Dram, & Disks of last 3+ decades to
• Using replacement circuit technologieso Sub/near-threshold CMOS, QWFETs, TFETs, and QCAs
• Non-volatile storageo Beyond flash memory to STT-RAM, PCRAM, & memristor
• 3D die stacking & interposerso logic, cache, small main memory
• Photonic interconnectso Inter- & even intra-chip
• Design automationo from circuit-design w/ new technologies to o pre-RTL functional, performance, power, area modeling of
heterogeneous chips & systems
23
4. Cross-Cutting Issues & Interfaces
• Beyond performance w/ stable interfaces to
• New design goals (for pillar of societal infrastructure)o Verifiability (bugs kill)o Reliability (“dependability” computing base?)o Security/Privacy (w/ non-volatile memory?)o Programmability (time to correct-performant solution)
• Better Interfaceso High-level information (quality of service, provenance)o Parallelism ((in)dependence, (lack of) side-effects)o Orchestrating communication ((recursive) locality)o Security/Reliability (fine-grain protection)
Executive summary (Added to National Academy Slides)
Highlights of National Academy Findings(F1) Computer hardware has transitioned to multicore(F2) Dennard scaling of CMOS has broken down(F3) Parallelism and locality must be exploited by software(F4) Chip power will soon limit multicore scaling
Eight recommendations from algorithms to education
We know all of this at some level, BUT:Are we all acting on this knowledge or hoping for business as usual?
Thinking beyond next paper to where future value will be created?– Questions Asked but Not Answered Embedded in NA Talk– Briefly Close with Kübler-Ross Stages of Grief:
Denial … AcceptanceSource: Future of Computing Performance: Game Over or Next Level?,
National Academy Press, 2011Mark Hill talk (http://www.cs.wisc.edu/~markhill/NRCgameover_wisconsin_2011_05.pptx)
The Graph
25
Syste
m C
ap
ab
ilit
y
(log
)
80s
90s
00s
10s
20s
30s
40s
CMO
SFallow Period
New
Tech
nologyOur
Focus
50sSource: Advancing Computer Systems without Technology Progress,
ISAT Outbrief (http://www.cs.wisc.edu/~markhill/papers/isat2012_ACSWTP.pdf)
Mark D. Hill and Christos Kozyrakis, DARPA/ISAT Workshop, March 26-27, 2012.
Approved for Public Release, Distribution UnlimitedThe views expressed are those of the author and do not reflect the official policy or position of
the Department of Defense or the U.S. Government.
Surprise 1 of 2• Can Harvest in the “Fallow” Period!
• 2 decades of Moore’s Law-like perf./energy gains
• Wring out inefficiencies used to harvest Moore’s Law
HW/SW Specialization/Co-design (3-100x)
Reduce SW Bloat (2-1000x)
Approximate Computing (2-500x)
---------------------------------------------------
~1000x = 2 decades of Moore’s Law!
26