Post on 07-Apr-2018
8/6/2019 Modeling Tools for CMP Research
Simics and Friends
Modeling Tools for CMP Research
Zvika Guz, Isaskhar (Zigi) Walter
The Technion, Israel Institute of Technology
Agenda
Official Agenda
Review the most commonly used tools in CMP arch research
Simulators
Benchmarks
Unofficial Agenda
Convince you to use Simics
Because more often than not it is the best option
Because we need more (geographically adjacent) people
Not on Our Agenda
Teaching the tools
Outline
Choosing a Simulator
Simics
And friends
GEMS, Garnet & Orion, FeS2, SimFlex
OPNET - modeling CMP interconnect
Benchmarks
Summary
Technion goodies
Choosing a Simulator
[Figure labels: Performance, Design Space, Ease of Use]
What should it model?
Processor/Cache/Interconnect/etc.
What would run on it?
Benchmark types
Choosing a Simulator for CMP Research
What will it model?
Multiple cores
Memory hierarchy (caches, coherence)
Interconnect (NoC)
What will run on it?
Multi-threaded benchmarks (need an OS for that)
Commercial workloads (really need an OS for that)
Full-system simulator, capable of booting a (commercial) OS
Meet the Contenders
SimpleScalar
Uniprocessor
PIN
Not a simulator
Several in-house tools
Not relevant
M5
Simics
?
Why Simics? (the short answer)
Because everyone is using it
THE most widely used simulator in our field
1/3 of ISCA07 papers used Simics
Huge, active community
Alive and kicking forum
Because it is free
For academia
Up to Simics 4.2
Oh, and because it is really, really good!
Simics in a Nutshell
Virtual hardware
Event driven
Cycle accurate*
Complete production software
The software can't tell the difference
Runs binaries from the real target
[Figure: target software (user program, middleware, DB, Java VM, operating system, drivers, boot firmware, hardware abstraction layer) sits above the HW/SW interface; below it is the simulated (virtual) hardware: CPUs, RAM, FLASH, ROM, A/D, user interface device, PCI, I2C, bus, network, disk and disk controller]
http://www.virtutech.com/
Simics Overview (1/3)
Simics is a flexible, scalable, and high-performance full-system simulator
A software, event-driven simulator
Full-system simulator
Processor
Memory hierarchy (DRAM, Disk)
Network
Devices (DMA, Interrupt controller, PCI, etc.)
Runs unmodified binaries
OS, drivers and applications
Models the entire machine that the OS sees
Application cannot tell the difference
http://www.virtutech.com/
Simics Overview (2/3)
Simics is a flexible, scalable, and high-performance full-system simulator
Fully supported ISAs:
SPARC
x86
Alpha, Itanium, MIPS, ARM, ..
Scalable:
Single processor (uniprocessor/CMP), MPs, Racks, Clusters
Distributed systems
http://www.virtutech.com/
Simics Overview (3/3)
Simics is a flexible, scalable, and high-performance full-system simulator
Flexible
Different degrees of simulation (details)
Functionality only
Microarchitecture and timing
Configurable
Hook/unhook modules
Control their timing
Write your own (in C++)
http://www.virtutech.com/
Demo
[Screenshot: Simics consoles running several target systems: Solaris/PowerPC, RedHat 7.2/Itanium, NT/x86, RedHat 6.2/x86, RedHat 7.2/Pentium III, XP/x86-64, Solaris 8/UltraSparc II]
http://www.virtutech.com/
What Have We Seen?
[Figure: layered stack, top to bottom]
User application code
Middleware and libraries
Target operating system(s)
Virtual target hardware
Simics
Host operating system
Host hardware
http://www.virtutech.com/
Simics Provides:
Checkpoints
Save/restore state
Breakpoints
Temporal breakpoints
Break on memory/register
Graphics breakpoint
Magic instructions
Signal Simics from within your application
Access host files from the simulated machine
So much more..
Simics Timing Models
Default mode
Every instruction takes exactly 1 clock cycle
Including access to disk, access to memory, etc.
In-order mode
Timing model is invoked when a memory request occurs
Function returns the number of cycles to stall
Out-of-order mode (MAI mode)
Detailed out-of-order arch simulation
User-defined processor model
Full control on how instructions advance
Slowdown, from the fastest mode to the most detailed: 10X-100X, 1000X-10000X, 10000X-1 million X
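To get a feel for the slowdown factors quoted on this slide, the short Python sketch below converts one second of simulated (target) time into host time. The function name and the three sample factors are illustrative choices, not part of Simics.

```python
def host_time_seconds(target_seconds, slowdown):
    """Host wall-clock time needed to simulate `target_seconds` of target time."""
    return target_seconds * slowdown


# Sample factors picked from the ranges on the slide (illustrative only).
SLOWDOWNS = [100, 10_000, 1_000_000]

# Host hours needed for one second of target time at each factor.
host_hours = {s: host_time_seconds(1.0, s) / 3600 for s in SLOWDOWNS}

for factor, hours in host_hours.items():
    print(f"{factor:>9,}X slowdown: {hours:10.2f} host hours per target second")
```

One target second costs under two minutes at 100X, roughly 2.8 hours at 10000X, and about 11.6 days at 1 million X, which is why the detailed modes are normally started from a checkpoint taken in the fast mode.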
Simics Timing - default
Emulation mode
Used for fast-forwarding
Boot OS
Build workload
Fast-forward to relevant execution part
Basically, used for creating a checkpoint
Simics Timing - in order
Timing model is a C program
You can act on every memory access
Usually used for modeling:
Caches (and cache hierarchies)
Coherency protocols (directory)
Hardware/Hybrid transactional memory
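The idea of acting on every memory access and returning a stall time can be sketched in a few lines. The example below is written in Python for brevity (the slide notes that real Simics timing models are C programs) and does not use the Simics API: the class name, the direct-mapped organization and the latencies are all assumptions made for illustration.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that reports stall cycles per access."""

    def __init__(self, num_lines=256, line_size=64, hit_cycles=0, miss_cycles=100):
        self.num_lines = num_lines
        self.line_size = line_size
        self.hit_cycles = hit_cycles      # assumed extra cycles on a hit
        self.miss_cycles = miss_cycles    # assumed extra cycles on a miss
        self.tags = [None] * num_lines    # one stored tag per cache line
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Called on every memory access; returns the number of cycles to stall."""
        block = address // self.line_size
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return self.hit_cycles
        self.tags[index] = tag            # fill the line on a miss
        self.misses += 1
        return self.miss_cycles


# Tiny 4-line cache so that a conflict miss shows up in a short trace.
cache = DirectMappedCache(num_lines=4, line_size=64, hit_cycles=0, miss_cycles=100)
trace = [0x000, 0x008, 0x040, 0x000, 0x100, 0x000]
stalls = [cache.access(a) for a in trace]
print(stalls, cache.hits, cache.misses)
```

In this trace the access to 0x100 maps to the same line as 0x000 and evicts it, so the final access to 0x000 misses again. A coherence protocol or cache hierarchy would hang off the same hook, only with more state behind it.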
Simics Timing - Out Of Order Mode
Gives full control over timing
User decides when things happen
Fetch/decode/execute/commit
Simics handles how these things happen
Out-of-order execution, multi-processor, multi-threading,
branch prediction, value prediction
Used for processor arch research
Models processor internals
And whenever you need a better notion of time
Interconnect study
Simple Example - Adding Cache (1/4)
Nahalal: a new cache architecture for CMP
Architectural differentiation of cache lines at runtime
According to usage: private vs. shared
[Figure: two layouts of an eight-core CMP, CPU0-CPU7]
Simple Example - Adding Cache (2/4)
1. Writing a cache timing model
C program
Simple Example - Adding Cache (3/4)
2. Hooking the new cache into Simics
Python script
Simple Example - Adding Cache (4/4)
3. Run Simics and collect statistics
Simics in Research
"Virtual Hierarchies", M. R. Marty and M. D. Hill, Micro's Top Picks 2008
"Improving Multiple-CMP Systems Using Token Coherence", M. R. Marty, J. D. Bingham, M. D. Hill, A. J. Hu, M.K. Martin and D. A. Wood, HPCA 2005
"Nahalal: Cache Organization for Chip Multiprocessors", Z. Guz, I. Keidar, A. Kolodny, U. C. Weiser, IEEE Computer Architecture Letters, May 2007
"Memory Mapped ECC: Low-Cost Error Protection for Last Level Caches", D. H.. ,
"TokenTM: Efficient Execution of Large Transactions with Hardware Transactional Memory", J. Bobba, N. Goyal, M. D. Hill, M. M. Swift, and D. A. Wood, ISCA 2008
"Predicting the Performance of Reconfigurable Optical Interconnects in Distributed Shared-Memory Systems", W. Heirman, J. Dambre, I. Artundo, C. Debaes, H. Thienpont, D. Stroobandt, J. Van Campenhout, Photonic Network Communications, 2008
"Serializing Instructions in System-Intensive Workloads: Amdahl's Law Strikes Again", P. M. Wells, G. S. Sohi, HPCA 2008
"Predictor Virtualization", I. Burcea, S. Somogyi, A. Moshovos and B. Falsafi, ASPLOS 2008
Outline
Choosing a Simulator
Simics
And friends
GEMS, Garnet & Orion, FeS2, SimFlex
OPNET - modeling CMP interconnect
Benchmarks
Summary
Technion goodies
Add-ons for Simics
Open-source add-ons extend Simics capabilities
Some as popular as Simics itself
Garnet & Orion
SimFlex
FeS2
Multifacet GEMS
GEMS is a set of modules for Virtutech Simics that enables detailed simulation of multiprocessor systems, including CMP.
The most mature Simics add-on
Most of ISCA's Simics papers actually use GEMS
Alive and active forum
Two main components
Ruby: memory system timing simulator
Opal: timing model for OOO processor
Flexible
Can be configured/altered/hacked
Add your own models
http://www.cs.wisc.edu/gems/
GEMS Ruby
Cache hierarchy
L1, L2 (private/shared), SNUCA/DNUCA, Simple DRAM
Different coherence protocols
Snoop, Directory, Token coherence
HW transactional memory
LogTM
Sun's Rock
Interconnect
Simple
Garnet - detailed NoC interconnect
http://www.cs.wisc.edu/gems/
Outline
Choosing a Simulator
Simics
And friends
GEMS, Garnet & Orion, FeS2, SimFlex
OPNET - modeling CMP interconnect
Benchmarks
Summary
Technion goodies
CMP is More than CPUs and Memory
We need to model the interconnect too
Might have a paramount effect on performance and power
Sometimes, this is all we need!
[Figure: four CPUs, each with an L1 cache, and eight L2 cache banks]
Simulate the Interconnect? Why Bother?
Important part of the system!
Static modeling can account for static attributes
Topology, routing, link bandwidth, packet size, etc.
Run-time effects are much harder to (statically) model
Shared resource arbitration, finite buffer sizes, channel multiplexing, flow control, ...
Might be dominating factors
Driving home during rush hours
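The rush-hour point can be made concrete with a toy model. The Python sketch below is an assumption-laden illustration, not an OPNET or Simics model: a single shared link, FIFO arbitration, and a fixed per-packet service time. The static attributes are identical in both runs; only the arrival pattern changes.

```python
def link_latencies(arrival_times, service_cycles):
    """Per-packet latency over one shared link served in FIFO order."""
    link_free_at = 0
    latencies = []
    for arrival in sorted(arrival_times):
        start = max(arrival, link_free_at)      # wait while the link is busy
        link_free_at = start + service_cycles   # link is held for one packet
        latencies.append(link_free_at - arrival)
    return latencies


# Same link, same packet size: only the traffic pattern differs.
spread_out = link_latencies([0, 10, 20, 30], service_cycles=4)
bursty = link_latencies([0, 1, 2, 3], service_cycles=4)
print("spread out:", spread_out)
print("bursty:    ", bursty)
```

With spread-out arrivals every packet sees the zero-load latency of 4 cycles. With a burst, each packet queues behind the previous one and latency climbs to 13 cycles, an effect that a static calculation based on bandwidth and packet size alone would not show.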
Network vs. Full System Simulator
NoC is a network!
Use a network oriented tool with built in support for traffic modeling
Eliminate complex system simulator if not really needed
Perfect tool for optimizing the interconnect
Architecture, topology, protocols, parameter tuning, etc.
Easy programming and debugging
Fast!
Fastest discrete event simulation engine among leading industry solutions
OPNET Modeler Features
Object-oriented modeling
Hierarchical modeling environment
GUI-based debugging and analysis
Event-driven simulation engine
Coding: C/C++ & auxiliary functions
Open interface for integrating external object files, libraries, and other simulators
Asynchronous/synchronous modeling
OPNET in CMP Research
"QNoC: QoS architecture and design process for Network on Chip", E. Bolotin, I. Cidon, R. Ginosar, A. Kolodny, Special issue on Networks on Chip, The Journal of Systems Architecture, December 2003
"Network Delays and Link Capacities in Application-Specific Wormhole NoCs", Z. Guz, I. Walter, E. Bolotin, I. Cidon, R. Ginosar, and A. Kolodny, VLSI Design, vol. 2007, Article ID 90941, May 2007
"Routing Table Minimization for Irregular Mesh NoCs", E. Bolotin, I. Cidon, R. Ginosar, A. Kolodny, DATE 2007
"Access Regulation to Hot-Modules in Wormhole NoCs", I. Walter, I. Cidon, R. Ginosar, A. Kolodny, NOCS 2007
"The Power of Priority: NoC based Distributed Cache Coherency", E. Bolotin, Z. Guz, I. Cidon, R. Ginosar, A. Kolodny, NOCS 2007
"Best of Both Worlds: A Bus Enhanced NoC (BENoC)", R. Manevich, I. Walter, I. Cidon, and A. Kolodny, the ACM/IEEE Int. Symp. on Networks-on-Chip (NOCS), 2009
Bus-Enhanced Network-on-Chip
A new interconnect architecture, utilizing the best of both worlds
Use NoC for data delivery
Use bus for lightweight, latency critical meta-data
Coherency
[Figure: NoC with 16 routers (R) and 16 modules]
Gluing OPNET to Simics
Run OPNET as a trace-driven simulator
L2 access logs generated by Simics
Advantages
Fast
Simple
Disadvantages
Dependencies are lost
Does not account for latency hiding techniques (e.g. OOO)
But..
OPNET can be glued to Simics using Ruby
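The "dependencies are lost" drawback can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not OPNET or Simics code: a single core that blocks on each request (no latency hiding), a fixed network latency, and hypothetical helper names. A trace is recorded at one latency and then replayed against a slower network.

```python
def record_issue_times(think_times, network_latency):
    """Issue times a blocking core would log while running on a given network."""
    now, issued = 0, []
    for think in think_times:
        now += think               # compute before issuing the next request
        issued.append(now)
        now += network_latency     # the core waits for the reply
    return issued


def replay_trace(issue_times, network_latency):
    """Trace-driven: requests are injected at the recorded times, come what may."""
    return max(t + network_latency for t in issue_times)


def replay_dependent(think_times, network_latency):
    """Closed loop: each request is issued only after the previous reply."""
    now = 0
    for think in think_times:
        now += think + network_latency
    return now


think_times = [5, 5, 5]
trace = record_issue_times(think_times, network_latency=10)   # the logged trace
trace_finish = replay_trace(trace, network_latency=30)
dependent_finish = replay_dependent(think_times, network_latency=30)
print(trace, trace_finish, dependent_finish)
```

Replaying the fixed trace on the slower network finishes at cycle 65, while the closed-loop run, in which every request waits for the previous reply, finishes at cycle 105. The trace-driven run underestimates the slowdown because the recorded issue times no longer react to the network.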
Outline
Choosing a Simulator
Simics
And friends
GEMS, Garnet & Orion, FeS2, SimFlex
OPNET - modeling CMP interconnect
Benchmarks
Summary
Technion goodies
Meet the Contenders
CPU2006, CPU2000
OMP2001
JBB2005, JBB2000
SPLASH-2
PARSEC
Commercial workloads
Apache
Databases
?
?
Benchmark Comparison
            CPU2006   OMP2001   SPLASH-2   PARSEC   Commercial
Programs    29        11        14         13       1
Other criteria compared (per-suite marks not recoverable from the slide): Multi-Threaded, Diverse, Updated, Emerging apps, Installation ease, Simulation friendly
The PARSEC Benchmark Suite
Over 1000 downloads since release
This is what everyone will be using
http://parsec.cs.princeton.edu/
Outline
Choosing a Simulator
Simics
And friends
GEMS, Garnet & Orion, FeS2, SimFlex
OPNET - modeling CMP interconnect
Benchmarks
Summary
Technion goodies
Technion Goodies
http://www.ee.technion.ac.il/matrics/software.html
Simics workload kits
Ease installation of Simics workloads
Wisconsin GEMS provides a few others too
Constantly adding more workloads to the pool
Can you help?
OPNET models for NoC
Our entire QNoC model for OPNET
Cores, routers and links, SNUCA/DNUCA L2 caches
Routing schemes, arbitration policies, resource contention
Synthetic/trace driven simulation
Transactified version of Apache
Summary
A swift overview of simulation tools for CMP
Simics
GEMS
OPNET
Technion's two cents
Questions?
zguz@tx.technion.ac.il