Interface-Based Design - Peoplenewton/Classes/EE290sp99/... · Interface-Based Design Introduction...
Transcript of Interface-Based Design - Peoplenewton/Classes/EE290sp99/... · Interface-Based Design Introduction...
Interface-Based DesignInterface-Based DesignIntroductionIntroduction
A. Richard NewtonA. Richard NewtonDepartment of Electrical Engineering and Computer SciencesDepartment of Electrical Engineering and Computer Sciences
University of California at BerkeleyUniversity of California at Berkeley
Integrated CMOS RadioIntegrated CMOS Radio
AD
Analog RF
Timing recovery
phone
bookJava VM
ARQ
Keypad,Display
Control
FiltersAdaptiveAntennaAlgorithms
Equalizers MUD
Accelerators(bit level)
analog digital
DSP core
uC core
(ARM)
Logic
Dedicated Logicand Memory
Integrate within the same chip very diverse system functions like:wireless channel control, signal processing, codec algorithms,
radio modems, RF transceivers… and implement them using a heterogeneous architecture (J. Rabaey)
Communication versus ComputationCommunication versus Computation
◆◆ Computation cost (2004): 60Computation cost (2004): 60 pJ pJ/operation (assuming/operation (assumingcontinued scaling)continued scaling)
◆◆ Communication cost (minimum):Communication cost (minimum):▲▲ 100 m distance: 20100 m distance: 20 nJ nJ/bit @ 1.5/bit @ 1.5 GHz GHz▲▲ 10 m distance: 210 m distance: 2 pJ pJ/bit @ 1.5/bit @ 1.5 GHz GHz
◆◆ Computation versus CommunicationsComputation versus Communications▲▲ 100 m distance: 300 operations == 1bit100 m distance: 300 operations == 1bit▲▲ 10 m distance: 0.03 operation == 1bit10 m distance: 0.03 operation == 1bit
Computation/Communication requirements vary withComputation/Communication requirements vary withdistance, data type, and environmentdistance, data type, and environment
(courtesy of J. Rabaey)
Energy-efficient ProgrammableEnergy-efficient ProgrammableImplementation PlatformImplementation Platform
EmbeddedMicroprocessor/
DSP System
ProgrammableLogic
DedicatedModules
ConfigurableArithmetic and Logic
Processors
Communication ChannelProtocol Processing
Analog R
F“Software-defined Radio”
(courtesy of J. Rabaey)
The Design ObjectThe Design Object
◆◆ Assemble ComponentsAssemble Components from parameterized library from parameterized library
◆◆ Including: Including:
◆◆ Integrate using standardIntegrate using standard approach to on-chip approach to on-chip communication communication
Configurable processor core Configurable processor core
Memories (RAM, ROM) Memories (RAM, ROM)
Special-purpose standardSpecial-purpose standard blocks ( blocks (ASSPsASSPs))
Glue LogicGlue Logic
◆◆ Third-party special-purposeThird-party special-purpose logic/MEMS/MEOS logic/MEMS/MEOS
InterfaceInterface
Interfaces and ContractsInterfaces and Contracts
ComponentComponent
PreconditionsPreconditions
PostconditionsPostconditions PostconditionsPostconditions
PreconditionsPreconditions
ContractContract
Environment
InterfaceInterface
ComponentComponent
InterfaceInterface
Interfaces and ContractsInterfaces and Contracts
ComponentComponent
Interface: Levels of AbstractionInterface: Levels of AbstractionPart 1: Mechanisms (Wiring)Part 1: Mechanisms (Wiring)
◆◆ PhysicalPhysical: Geometrical arrangement of I/O locations,: Geometrical arrangement of I/O locations,how to connect, etc.how to connect, etc.
◆◆ Electrical:Electrical: Restrictions/requirements on currents, Restrictions/requirements on currents,voltages, noise, voltages, noise, risetimesrisetimes, , falltimesfalltimes, etc., etc.
◆◆ Logic (Logic (CombinationalCombinational):): Largely a Largely a transcodingtranscoding((discretizationdiscretization) of electrical limits into logic domain) of electrical limits into logic domain
◆◆ Sequential (Sequential (StatefulStateful):): form of the model: clocked form of the model: clockedsynchronous, asynchronous (what?), etc.synchronous, asynchronous (what?), etc.
High-End SystemsHigh-End Systems
Intel Pentium ProIntel Pentium ProIntel Pentium 2Intel Pentium 2
IBM/Motorola PPC 620IBM/Motorola PPC 620
High-End SystemsHigh-End Systems
IBM/Motorola PPC 620IBM/Motorola PPC 620
VSIA: Four Orthogonal ModelVSIA: Four Orthogonal ModelCharacteristicsCharacteristics
◆◆ Temporal DetailTemporal Detail◆◆ Data Value DetailData Value Detail◆◆ Functional DetailFunctional Detail◆◆ Structural DetailStructural Detail
VSIA: System-Level Data AbstractionsVSIA: System-Level Data Abstractions
Interface: Levels of AbstractionInterface: Levels of AbstractionPart 2: Policies (Semantics?)Part 2: Policies (Semantics?)
◆◆ Legal Data Types and Abstract TypesLegal Data Types and Abstract Types◆◆ “Protocols”“Protocols”
▲▲ Local (handshakes)Local (handshakes)▲▲ Global (clocked synchronous)Global (clocked synchronous)
◆◆ Transaction ModelsTransaction Models◆◆ Implications of ConcurrencyImplications of Concurrency
Interactions of ComponentsInteractions of Components◆◆ ProceduresProcedures◆◆ Synchronous logicSynchronous logic◆◆ Asynchronous logicAsynchronous logic◆◆ Bus protocolsBus protocols◆◆ Shared memoryShared memory◆◆ SemaphoresSemaphores◆◆ RendezvousRendezvous◆◆ Timed eventsTimed events◆◆ StreamsStreams◆◆ Message passingMessage passing◆◆ Communication protocols/handshakingCommunication protocols/handshaking
7
R
Z
H
U
�
R
I
�
%
D
E
H
O
�
�
L
Q
�
S
D
L
Q
W
L
Q
J
�
E
\
�
%
U
X
H
J
H
O
�
�
�
�
�
�
Source: Prof. Edward Lee
Abstracting SynchronyAbstracting Synchrony
A
C
D
B
V\QFKURQRXV�UHDFWLYH�PRGXOHV
SDUDOOHO�V\QFKURQRXV�FLUFXLWV�
VHTXHQWLDO�HPEHGGHG�VRIWZDUH�
Source: Prof. Edward Lee
Abstracting RendezvousAbstracting Rendezvous
A
C
D
B
FRPPXQLFDWLQJVHTXHQWLDOSURFHVVHV
SDUDOOHO�FLUFXLWV�ZLWK�KDQGVKDNLQJ�
VHTXHQWLDO�VRIWZDUH�ZLWK�WKUHDGV�
Source: Prof. Edward Lee
Abstracting Message PassingAbstracting Message Passing
A
C
D
B
GDWDIORZ�PRGXOHV
SDUDOOHO�DV\QFKURQRXV�FLUFXLWV�
VHTXHQWLDO�VRIWZDUH�PXOWLWDVNLQJ�
Source: Prof. Edward Lee
It’s all aboutCommunication!
Then must be all about
Concurrency...
Useful Concurrent SemanticsUseful Concurrent Semantics
◆◆ Analog computers (Analog computers (ODEsODEs))◆◆ Spatial/temporal models (Spatial/temporal models (PDEsPDEs))◆◆ Discrete time (difference equations)Discrete time (difference equations)◆◆ Discrete-event systems (DE)Discrete-event systems (DE)◆◆ Synchronous-reactive systems (SR)Synchronous-reactive systems (SR)◆◆ Sequential processes with rendezvous (CSP)Sequential processes with rendezvous (CSP)◆◆ Process networks (Kahn)Process networks (Kahn)◆◆ Dataflow Dataflow (Dennis)(Dennis)
A
C
D
B
%ORFN�GLDJUDPV�RIWHQ�SURYLGH�D�QLFHV\QWD[�IRU�FRQFXUUHQW�VHPDQWLFV
Source: Prof. Edward Lee
control dominated systems
A Complete System-on-a-ChipA Complete System-on-a-Chip
Philips 83 C552: 8 bit-8051 based microcontroller
complete system
timers, PWM for control
I²C-bus and par./ser.interfaces for communi-cation
A/D converter
watchdog (SW activitytimeout): safety
on-chip memory
interrupt controller
parallel ports 1 through 5
processor80C51
watchdog (T3)
timer0 (16 bit)
timer1 (16 bit)
15-vectorinterrupt
timer2(16 bit)
8K8 ROM(87C552 8K8
EPROM)
256x8 RAM
A/DC10-bit
PWM
UART
I²C
Source: Prof. Rolf Ernst
control dominated systems
Architectures for Higher Computation RequirementsArchitectures for Higher Computation Requirements
Example: Motorola MC 683xx - family of controllersProcessor: CPU 32
68000 - processor enhanced by most of the 68030 features
CISC processor: code density
pipelining
standard register sets (not in RAM)context switch is more expensive
virtual memoryaims at use of operating systems
supervisor and user modes
table lookup instructions for compressed tables with built-in linear interpolation data density is concern
Source: Prof. Rolf Ernst
MC68332MC68332
control dominated systems
inter module busIMB
I/0 - channel 0
I/0 - channel 15unitTPU
time processingCPU32
serial I/0
IMB control RAM
TPU
Designed for automotive applications with mixture of computation intensive tasks and complex I/0 -functions Idea: off-load CPU from frequent I/0 interactions to make use of computation performance:
Source: Prof. Rolf Ernst
control dominated systems
MC68332MC68332
IMB
Data
Control ServiceRequests
Microengine
HostInterface
TimerChannelsScheduler
DevelopmentSupportand Test
SystemConfiguration
ChannelControl
ParameterRAM
Store
ExecutionUnit
Channel 0
Channel 1
Channel 15
Pins
Control andData
Channel
ControlStore
timebase
TPU: time processing unit: peripheral coprocessor
independent programmable timer channels: single-shot "capture & compare"
channel coupling and sequence control with control processor
pin
Source: Prof. Rolf Ernst
Embedded System Design ProcessEmbedded System Design Process
reusedcomponents
support(CAD, test, ...)
requirements definition
specification
system architecture development
integration and test
customer/marketing
systemarchitect
SWdeveloper
HWdesigner
SW development• application SW
• compilers etc.• operating syst.
interface design• SW driver dev.• HW interface
synthesis
HW design• HW architecture design• HW synthesis• physical design
Embedded system design processSource: Prof. Rolf Ernst
HDL generation
Co-synthesis Design Flow - PrincipleCo-synthesis Design Flow - Principle
OS, component &
communication libraries
system function
compilation&system analysis
intermediatecode generation
object codeobject code HW modelHW model
code generation
HW/SW partitioning &scheduling
HL synthesisco-simulation,
analysis
State of the art - Optimization and co-synthesis
constraintsand userdirectives
constraintsand userdirectives
Source: Prof. Rolf Ernst
Separate Behavior from Separate Behavior from MicroarchitectureMicroarchitecture
FrontFrontEnd End 11
TransportTransportDecode Decode 22
RateRateBufferBuffer
1212
RateRateBufferBuffer
99
RateRateBufferBuffer
55
SensorSensor
SynchSynchControlControl
44
VideoVideoDecode Decode 66
AudioAudioDecode/Decode/
Output Output 1010
MemMem1111
User/SysUser/SysControlControl
33
MemMem1313
FrameFrameBufferBuffer
77
VideoVideoOutput Output 88
◆◆ System Behavior System Behavior▲▲ Functional SpecificationFunctional Specification
of System.of System.▲▲ No notion of hardware orNo notion of hardware or
software!software!
◆◆ Implementation ArchitectureImplementation Architecture▲▲ Hardware and SoftwareHardware and Software▲▲ Optimized ComputerOptimized Computer
DSP RAMDSP RAM
ExternalExternalI/OI/O
System System RAMRAM
DSPDSPProcessorProcessor
Pro
cess
or B
us
Pro
cess
or B
us
ControlControlProcessorProcessor
MPEGMPEG
PeripheralPeripheral
AudioAudioDecodeDecode
Source: Prof. Alberto Sangiovanni
IP-Based Design of ImplementationIP-Based Design of Implementation
DSP RAMDSP RAM
ExternalExternalI/OI/O
System System RAMRAM
DSPDSPProcessorProcessor
Pro
cess
or B
us
Pro
cess
or B
us
ControlControlProcessorProcessor
MPEGMPEG
PeripheralPeripheral
AudioAudioDecodeDecode
Which DSPProcessor? C50?
Can DSP be done onMicro-controller?
Which Bus? PI?AMBA?
Dedicated Bus forDSP?
WhichMicro-controller?
ARM? HC11?
Do I need a dedicated Audio Decoder?Can decode be done on Micro-controller?
How fast will myUser Interface
Software run? HowMuch can I fit on my
Micro-controller?
Can I Buyan MPEG2Processor?
Which One?
Source: Prof. Alberto Sangiovanni
Co-design using co-synthesis and design space explorationCo-design using co-synthesis and design space exploration
HDL generation
constraints anduser directives
constraints anduser directives
OS, component &
communication libraries
system function
Compilation &system analysis
intermediate codegeneration
object codeobject code HW modelHW model
code generation
HW/SW partitioning &scheduling
HL synthesisco-simulation
analysis
• specification parameter change• high level transformations
• specification parameter change• high level transformations
hardwaredesigner
softwaredevelopercustomer system
architect
cost, performance, ...
State of the art - Optimization and co-synthesis
estimations
Source: Prof. Rolf Ernst
Communication RefinementCommunication RefinementStandard interfaces constitute the backbone of an IP market: abstract form the concerns ofhardware implementation (multi-target VC), abstract from the concerns of a particular bus
(bus-independent VC)
system transaction, «ANY» data structure (e.g. video line)
hardware or software
«ANY BUS» operation (data, address...)VSI-Alliance OCB Group.Virtual Component Interface (VCI)
Physical Bus (e.g.PIBus)fixed bus-width,detailed protocol
Bus WrapperCommunication Interface (e.g.bounded FIFO)
Source: Prof. Alberto Sangiovanni
The The OrthogonalizationOrthogonalization Approach Approach
MacroShells (the Protocol Interface)Communication Channels
MicroShells (the IP Requirements)
P1
P2
P3
P4
P5
P6
P7
Pearls (the IP Processes)
Source: Prof. Alberto Sangiovanni
Communication DesignCommunication Design◆◆ Determine a protocol so that no matter what theDetermine a protocol so that no matter what the
communication topology and the nature of the IP’s thecommunication topology and the nature of the IP’s thefunctionality of the overall system is guaranteed (TCP/IPfunctionality of the overall system is guaranteed (TCP/IPlike)like)
◆◆ Given the IP set and the interconnections, automaticallyGiven the IP set and the interconnections, automaticallysynthesize protocols and macro-shellssynthesize protocols and macro-shells
◆◆ Given the IP set and a set of time-varyingGiven the IP set and a set of time-varyinginterconnections, automatically synthesize adaptiveinterconnections, automatically synthesize adaptiveprotocol and macro-shells that optimize “performance”protocol and macro-shells that optimize “performance”according to the current topologyaccording to the current topology
In collaboration with Jim RowsonIn collaboration with Jim RowsonSource: Prof. Alberto Sangiovanni
Model of ComputationModel of Computation
●● Network ofNetwork of CFSMs CFSMs▲▲ Globally asynchronous, locally synchronous (GALS)Globally asynchronous, locally synchronous (GALS)▲▲ Extend the model to loss-less communication (abstract CFSM)Extend the model to loss-less communication (abstract CFSM)▲▲ Communication refined to implementationCommunication refined to implementation
•• Refinement steps:Refinement steps: - preserve desired properties at - preserve desired properties at
each transformation each transformation- - propagate constraints to lowerpropagate constraints to lower levels of abstraction (top- levels of abstraction (top-down).down).
Source: Prof. Alberto Sangiovanni
Abstract CFSMAbstract CFSMAbstractAbstractspecificationspecification
AbstractAbstractcommunicationcommunication
No timingNo timing
Loss-lessLoss-less
••Maximally non-deterministic view of designMaximally non-deterministic view of design••Design progresses by successiveDesign progresses by successive determinization determinization
Source: Prof. Alberto Sangiovanni
CFSM RefinementCFSM Refinement
ConcreteIP module
Concretecommunication
Shell
Source: Prof. Alberto Sangiovanni
DirectionsDirections
◆◆ Energy-efficient architectures for protocolEnergy-efficient architectures for protocolprocessingprocessing▲▲ most effort and results in “data-flow” componentsmost effort and results in “data-flow” components▲▲ complex protocol processing is becoming bottleneckcomplex protocol processing is becoming bottleneck▲▲ instruction processors energy-inefficientinstruction processors energy-inefficient▲▲ CFSM-based architectures attractive from softwareCFSM-based architectures attractive from software
perspectiveperspective
◆◆ Heterogeneous Platforms and their SoftwareHeterogeneous Platforms and their SoftwareOperation EnvironmentOperation Environment
Source: Prof. Alberto Sangiovanni
Protocol DesignProtocol Design
◆◆ SpecificationSpecification▲▲ formally describing what the protocol is supposed to doformally describing what the protocol is supposed to do
◆◆ AbstractionAbstraction▲▲ consistent layering promotes re-use and verificationconsistent layering promotes re-use and verification
◆◆ VerificationVerification▲▲ is the protocol logically consistent?is the protocol logically consistent?
◆◆ Performance EstimationPerformance Estimation▲▲ is the protocol efficient?is the protocol efficient?
◆◆ ImplementationImplementation▲▲ building a system that implements the specificationbuilding a system that implements the specification
Source: Prof. Alberto Sangiovanni
Refinement-based Protocol DesignRefinement-based Protocol DesignMethodologyMethodology
RefinementCFSM modelCFSM modelSimulationSimulation
Formal VerificationFormal Verification
Hardware (VHDL) Hardware (VHDL)
Formal
SystemSystemSpecSpec
Input language
Formal
Software (C)Software (C)
Source: Prof. Alberto Sangiovanni
System-on-Chip and IP-based DesignSystem-on-Chip and IP-based Design
◆◆ Two parts to research:Two parts to research:▲▲ Glue Logic design methodology:Glue Logic design methodology:
▼ Merge Place and Route with Logic Synthesis (Constraint DrivenSynthesis)
▼ Investigate regular circuit fabrics (solve the problem byconstruction paradigm)
▲▲ Interconnect Design Methodology Interconnect Design Methodology (Interface-based(Interface-basedparadigm)paradigm)
▼ Block Encapsulation
Source: Prof. Alberto Sangiovanni
The MethodologyThe Methodology◆◆ Orthogonalize Orthogonalize computation computation and and communicationcommunication◆◆ Plug-and-PlayPlug-and-Play system design system design◆◆ Chip assembled using Chip assembled using IP coresIP cores exchanging data by means exchanging data by means
of a of a communication protocolcommunication protocol◆◆ Interface Logic Blocks (Interface Logic Blocks (the shellsthe shells) ) encapsulateencapsulate and protect and protect
the IP cores (the IP cores (the pearlsthe pearls))◆◆ Assume-Guarantee ReasoningAssume-Guarantee Reasoning is adopted to is adopted to formallyformally
verifyverify IP cores and communication protocols in separate IP cores and communication protocols in separatestepssteps
Work in collaboration with K. Work in collaboration with K. McMillanMcMillan, L. Lavagno and A. Saldanha, L. Lavagno and A. Saldanha
Source: Prof. Alberto Sangiovanni
Latency-InsensitiveLatency-Insensitive Communication CommunicationProtocolProtocol◆◆ Long channels are Long channels are segmentedsegmented by inserting simple by inserting simple
memory stages (memory stages (Relay StationsRelay Stations))◆◆ Channel latencies are considered arbitraryChannel latencies are considered arbitrary◆◆ Requirement on IP coresRequirement on IP cores : :
▲▲ they must bethey must be stallablestallable
◆◆ Micro ShellsMicro Shells : :▲▲ controls stalling mechanismcontrols stalling mechanism
◆◆ Macro ShellsMacro Shells : :▲▲ synchronize data and interface with channelssynchronize data and interface with channels
Source: Prof. Alberto Sangiovanni
The The OrthogonalizationOrthogonalization Approach Approach
MacroShells (the Protocol Interface)Short Communication ChannelsLong Communication Channels
MicroShells (the IP Requirements)
P1
P2
P3
P4
P5
P6
P7
Pearls (the IP Processes)
Source: Prof. Alberto Sangiovanni
Channel SegmentationChannel Segmentation
MacroShells (the Protocol Interface)Short Communication ChannelsLong Communication Channels
MicroShells (the IP Requirements)
P1
P2
P3
P4
P5
P6
P7
Pearls (the IP Processes)
RS
RSRS
RS
RS
RS RS
RS
Relay Stations
Source: Prof. Alberto Sangiovanni
Actual System
Automated Interface SynthesisAutomated Interface Synthesis
nXn HPCDSP
RAM
IR Imager
CODEC
uCntrl
Normalizer
SINCGARS
A/D
Crossbar
RA
MB
US
PC
I
Logic
Hi, I talkPCI and
Hippi
Hi, I talkPCI and
HippiHello, I talkMyrinetand PCI
Hello, I talkMyrinetand PCI
OK, Letstalk PCI
Source: DARPA ISAT Silicon 2010 Study, 1997(Randy Harr, Synopsys)
ConstraintsConstraints
ImplicationsImplications