Buses - Unictvcatania/.../semm_06-07/DOWNLOAD/01_noc_buses.pdfBus Arbitration Since only one bus...


Buses

Maurizio Palesi

Introduction
Buses are the simplest and most widely used interconnection networks.
A number of modules are connected via a single shared channel.

[Figure: a microcontroller, a digital signal processor, an input/output device and a memory connected to a shared bus]

Bus Properties: Serialization
Only one component can send a message at any given time.
There is a total order of messages.

[Figure: Module 1 to Module 4 attached to the bus, with messages labeled 1 and 2]

Bus Properties: Broadcast
A module can send a message to several other components without extra cost.

[Figure: Module 1 to Module 4 attached to the bus]

Bus Hardware
Principle for hardware to access the bus:
  Bus transmit: ET active
  Bus receive: ER active

Cycles, Messages and Transactions
Buses operate in units of cycles, messages and transactions.
  Cycle: a message requires a number of cycles to be sent from sender to receiver over the bus.
  Message: a logical unit of information (a read message contains an address and control signals for the read).
  Transaction: a sequence of messages that together complete one operation (a memory read requires a memory read message and a reply with the requested data).

Synchronous Bus
Includes a clock in the control lines.
Uses a fixed communication protocol that is relative to the clock.
Advantage: involves very little logic and can run very fast.
Disadvantages:
  Every device on the bus must run at the same clock rate.
  To avoid clock skew, a fast bus cannot be long.

Asynchronous Bus
It is not clocked.
It can accommodate a wide range of devices.
It can be lengthened without worrying about clock skew.
It requires a handshaking protocol:
  1. Master puts the address on the bus and asserts READ when the address is stable.
  2. Memory puts the data on the bus and asserts ACK when the data is stable.
  3. Master deasserts READ when the data has been read.
  4. Memory deasserts ACK.
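
The four handshake steps above can be traced in a small simulation. This is only a sketch: the function name, the dictionary standing in for the bus wires and the memory contents are assumptions made for illustration, and real hardware would wait on signal edges rather than run sequential statements.

```python
def handshake_read(memory, address):
    """Trace a READ/ACK handshake; returns (data, trace of signal changes)."""
    trace = []
    bus = {"addr": None, "data": None, "READ": 0, "ACK": 0}

    # 1. Master drives the address, then asserts READ once it is stable.
    bus["addr"] = address
    bus["READ"] = 1
    trace.append(("master", "READ=1"))

    # 2. Memory sees READ, drives the data, then asserts ACK.
    bus["data"] = memory[bus["addr"]]
    bus["ACK"] = 1
    trace.append(("memory", "ACK=1"))

    # 3. Master latches the data, then deasserts READ.
    data = bus["data"]
    bus["READ"] = 0
    trace.append(("master", "READ=0"))

    # 4. Memory sees READ low and deasserts ACK; the bus is idle again.
    bus["ACK"] = 0
    trace.append(("memory", "ACK=0"))
    return data, trace
```

Each step only starts after the other side has changed its signal, which is why no shared clock is needed.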

Bus Arbitration
Since only one bus master can use the bus at a given time, bus arbitration is used.
An arbiter collects the requests of all bus masters and gives only one module the right to access the bus (bus grant).

Importance of Arbiters
Arbiters are used not only in bus systems, but wherever several devices request a shared resource.
In networks-on-chip, for instance, arbitration is needed if two or more packets want to enter the same channel.

Arbiter Interfaces
This arbiter interface can be used to give a bus grant for a fixed number of cycles:
  (a) 1 cycle
  (b) 4 cycles
Inputs: r0, r1, …, rn-1. Outputs: g0, g1, …, gn-1.

Arbiter Interfaces
This arbiter allows for variable-length grants.
The grant is held as long as the "hold" line is asserted.
  In cycle 2 requester 0 gets the bus for 3 cycles.
  In cycle 5 requester 1 gets the bus for 2 cycles.
  In cycle 7 requester 1 gets the bus for one cycle.
Inputs: r0, h0, …, rn-1, hn-1. Outputs: g0, g1, …, gn-1.

Fairness
Fairness is a key property of an arbiter.
Some definitions:
  Weak fairness: every request is eventually served.
  Strong fairness: requests are served equally often.
  Weighted "strong" fairness: the number of times requester i is served is proportional to its weight wi.
  FIFO fairness: requests are served in the order in which they were made.

Local Fairness vs. Global Fairness
Even if an arbiter is locally fair, a system with several arbiters employing that arbiter may not be fair.
Though each arbiter Ai allocates 50% of its bandwidth to each of its two inputs, r0 only gets 12.5% of the total bandwidth, while r3 gets 50%.
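
The 12.5% and 50% figures can be reproduced with a short calculation. The sketch below assumes a topology that is not recoverable from the text: a chain of locally fair two-input arbiters in which r0 and r1 meet first and every later stage merges the previous output with one new requester. The function name is hypothetical.

```python
from fractions import Fraction

def chain_shares(n_requesters):
    """Bandwidth share of r0..r(n-1) in a chain of fair 2-input arbiters."""
    shares = [Fraction(1)]  # a single requester owns the whole channel
    for _ in range(n_requesters - 1):
        # Each new stage halves every upstream share and gives its
        # newly attached input the other half.
        shares = [s / 2 for s in shares] + [Fraction(1, 2)]
    return shares

shares = chain_shares(4)  # r0, r1, r2, r3
```

Under this assumption r0 and r1 each receive 1/8 of the bandwidth, r2 receives 1/4 and r3 receives 1/2.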

Fixed-Priority Arbiter
A fixed-priority arbiter can be constructed as an iterative circuit.
Each cell receives a request input ri and a carry input ci and generates a grant output gi and a carry output ci+1.
The resulting arbiter is not fair, since a continuously asserted request r0 means that none of the other requests will ever be served!
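
The iterative structure can be sketched in software, one loop iteration per cell. The cell equations used here (gi = ri AND ci, ci+1 = ci AND NOT ri, with c0 = 1) are the usual textbook form and are an assumption, since the slide's circuit diagram is not available.

```python
def fixed_priority_arbiter(requests):
    """Grant the lowest-index active request; returns one bool per requester."""
    grants = []
    carry = True  # c0 = 1: no higher-priority request has been seen yet
    for r in requests:
        grants.append(r and carry)   # g_i = r_i AND c_i
        carry = carry and not r      # c_{i+1} = c_i AND NOT r_i
    return grants
```

As long as `requests[0]` stays asserted, the carry into every later cell is false, which is exactly the starvation described above.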

Variable-Priority Arbiters
  Oblivious Arbiter
  Round-Robin Arbiter
  Grant-Hold Circuit
  Weighted Round-Robin Arbiter

Fair Arbiters
A fair arbiter can be generated by changing the priority from cycle to cycle.
Depending on the priority generation, different arbitration schemes and degrees of fairness can be achieved.

Fair Arbiters: Oblivious Arbiters
If pi is generated without knowledge of ri and gi, the result is an oblivious (unconscious) arbiter.
Examples are:
  Randomly generated pi
  Rotating priorities (by shift register)

Oblivious Arbiters
Oblivious arbiters provide weak fairness but not strong fairness (for example, when r0 and r1 are constantly asserted).
[Figure: example request and grant bit patterns]
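
The lack of strong fairness can be seen in a small simulation of rotating priorities. This is a sketch under assumptions: four requesters, a priority pointer that advances by one every cycle regardless of requests and grants, and a hypothetical function name.

```python
def rotating_priority_grants(requests, cycles):
    """Count grants per requester when the top priority rotates every cycle."""
    n = len(requests)
    counts = [0] * n
    for cycle in range(cycles):
        top = cycle % n  # oblivious: ignores who requested or was granted
        for offset in range(n):
            i = (top + offset) % n
            if requests[i]:
                counts[i] += 1
                break
    return counts

counts = rotating_priority_grants([True, True, False, False], 8)
```

With only r0 and r1 asserted, r0 wins whenever the pointer is at position 0, 2 or 3, so it receives three times as many grants as r1. Both are eventually served (weak fairness), but not equally often.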

Round-Robin Arbiter
A round-robin arbiter achieves strong fairness.
A request that was just served gets the lowest priority.
[Figure: one-bit round-robin arbiter]
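
The rule that a request just served gets the lowest priority can be sketched behaviorally as follows. The class name and the pointer-based formulation are assumptions for illustration; they model the behavior, not the slide's gate-level circuit.

```python
class RoundRobinArbiter:
    def __init__(self, n):
        self.n = n
        self.top = 0  # index that currently has the highest priority

    def arbitrate(self, requests):
        """Return the granted index, or None if nobody requests."""
        for offset in range(self.n):
            i = (self.top + offset) % self.n
            if requests[i]:
                # The winner becomes lowest priority for the next cycle.
                self.top = (i + 1) % self.n
                return i
        return None  # no grant: the priority stays unchanged

arb = RoundRobinArbiter(4)
winners = [arb.arbitrate([True, True, False, False]) for _ in range(4)]
```

In contrast to the rotating-priority case, two constantly asserted requests now alternate.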

Round-Robin Arbiter: any g is low
[Figure: circuit diagram not recovered]

Round-Robin Arbiter: one g is high
[Figure: circuit diagram not recovered]

Grant-Hold Circuit
Extends the duration of a grant.
As long as hold is asserted, further arbitration is disabled.
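
A behavioral sketch of the grant-hold idea is shown below. It assumes that the current owner keeps the bus in any cycle in which its own hold line is asserted, and it uses a simple fixed-priority choice (lowest index first) for new arbitration; both choices and the function name are assumptions.

```python
def grant_hold(requests_per_cycle, holds_per_cycle):
    """Return the granted index for each cycle; a held grant blocks arbitration."""
    granted = None
    history = []
    for requests, holds in zip(requests_per_cycle, holds_per_cycle):
        if granted is None or not holds[granted]:
            # Nothing is being held, so a new arbitration takes place.
            granted = next((i for i, r in enumerate(requests) if r), None)
        history.append(granted)
    return history

history = grant_hold(
    [[1, 1], [0, 1], [0, 1], [0, 1]],   # request lines r0, r1 per cycle
    [[0, 0], [1, 0], [1, 0], [0, 0]],   # hold lines h0, h1 per cycle
)
```

Requester 0 wins the first cycle and keeps the bus for two more cycles by asserting its hold line; requester 1 is served only after the hold is released.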

Grant-Hold Circuit: an example for all hi = 0
[Figure: circuit diagram not recovered]

Grant-Hold Circuit: an example for holding a grant
[Figure: circuit diagram not recovered]

Weighted Round-Robin Arbiter
A weighted round-robin arbiter gives some requesters a larger number of grants than others in a controlled fashion.
If three devices have the weights 1, 2 and 3, they get 1/6, 1/3 and 1/2 of the grants.
The preset line is activated periodically after N cycles (here 6 cycles) to load each counter with its weight.
If some requesters do not issue any requests during that interval, the shared resource will remain idle until the next preset cycle.
Figure annotations:
  If Count ≠ 0, the AND gate is enabled.
  The counter is decremented each time a grant is received.
  The counter is initialized periodically every W cycles, where W = Σ wi.
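
The counter mechanism can be sketched as follows. The sketch assumes that the enabled requests are passed to an ordinary round-robin arbiter and that the preset fires every W = Σ wi cycles, starting in cycle 0; the function name is hypothetical.

```python
def weighted_round_robin(weights, requests, cycles):
    """Return the granted index (or None) for each cycle."""
    n = len(weights)
    period = sum(weights)  # W = sum of all weights
    counts = [0] * n
    top = 0
    grants = []
    for cycle in range(cycles):
        if cycle % period == 0:
            counts = list(weights)  # preset: reload every counter
        winner = None
        for offset in range(n):
            i = (top + offset) % n
            # The AND gate passes the request only while the count is nonzero.
            if requests[i] and counts[i] != 0:
                winner = i
                break
        if winner is not None:
            counts[winner] -= 1  # decremented on each grant
            top = (winner + 1) % n
        grants.append(winner)
    return grants

grants = weighted_round_robin([1, 2, 3], [True, True, True], 6)
```

With weights 1, 2 and 3 and all requesters active, the six cycles are split 1, 2 and 3. If the third requester stays silent, the last three cycles of the interval go unused.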

Matrix Arbiter
Matrix W = [wij] of weights.
wij = 1 if request i takes priority over request j.
[Figure: matrix arbiter circuit with labels i, j, wij and gj]

Matrix Arbiter
wij = ¬wji ∀ i ≠ j
A requester is granted the resource if no other higher-priority requester is bidding for the same resource.
Once a requester succeeds in being granted a resource, its priority is updated and set to be the lowest among all requesters.
When request i is granted:
  [i,*] ← 0
  [*,i] ← 1
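
The update rule (clear row i, set column i) can be sketched with a full Boolean matrix. The class name and the initial priority order (lower index first) are assumptions; a hardware implementation would store only the entries with i < j.

```python
class MatrixArbiter:
    def __init__(self, n):
        self.n = n
        # w[i][j] is True when requester i has priority over requester j.
        # Assumed initial state: lower indices beat higher ones.
        self.w = [[i < j for j in range(n)] for i in range(n)]

    def arbitrate(self, requests):
        """Return the granted index, or None if nobody requests."""
        for i in range(self.n):
            if not requests[i]:
                continue
            # i wins unless another active requester j has priority over i.
            beaten = any(requests[j] and self.w[j][i]
                         for j in range(self.n) if j != i)
            if not beaten:
                for j in range(self.n):
                    if j != i:
                        self.w[i][j] = False  # [i,*] <- 0
                        self.w[j][i] = True   # [*,i] <- 1
                return i
        return None

arb = MatrixArbiter(3)
winners = [arb.arbitrate([True, True, True]) for _ in range(4)]
```

After each grant the winner loses to everyone else, so the requester served least recently always has the highest priority.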

Matrix Arbiter
A matrix arbiter implements a least-recently-served priority scheme by maintaining a triangular array of state bits wij for all i < j.
The matrix arbiter is well suited to a small number of inputs, since it is fast, easy to implement and provides strong fairness!

Queuing Arbiter
A queuing arbiter provides FIFO fairness.
It assigns each request a time stamp when it is asserted.
The request with the earliest time stamp receives the grant.
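
A minimal behavioral sketch of the time-stamp scheme follows. It assumes that a request stays asserted until it is granted and that ties between equal time stamps are broken by the lower index; the class and method names are hypothetical.

```python
class QueuingArbiter:
    def __init__(self):
        self.clock = 0
        self.stamps = {}  # requester id -> time stamp of its pending request

    def tick(self, requests):
        """Advance one cycle; `requests` is the set of requesters asserting."""
        for r in requests:
            # A request is stamped only in the cycle it is first asserted.
            self.stamps.setdefault(r, self.clock)
        self.clock += 1
        if not self.stamps:
            return None
        # The earliest time stamp wins; equal stamps fall back to the index.
        winner = min(self.stamps, key=lambda r: (self.stamps[r], r))
        del self.stamps[winner]
        return winner

arb = QueuingArbiter()
order = [arb.tick({2, 3}), arb.tick({0, 3}), arb.tick({0})]
```

Requester 3 is served before requester 0 in the second cycle because its request is older, even though requester 0 has the lower index.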

Low-Performance Bus Protocol
Without a special bus protocol the bus is not used efficiently.
In the example, module 2 requests the bus in cycle 2, but must wait until cycle 6 to receive the grant.

Bus Pipelining
A memory access consists of several cycles (including arbitration).
Since the bus is not used in all cycles, pipelining can be used to increase performance.
Only one transaction can:
  receive the grant during a given cycle
  use the bus during a given cycle

Bus Pipelining
Pipelining leads to an efficient use of the bus.
  Stalls are inserted since only one transaction can use the bus at a time.
  Sometimes (cycle 12) two transactions can overlap.
  However, this cannot be done in cycle 5 (2. Write), since otherwise RPLY and ACK would overlap in cycle 6.

Split-Transaction Bus
In a split-transaction bus a transaction is split into two transactions:
  a "request" transaction
  a "reply" transaction
Both transactions have to compete for the bus by arbitration.

Split-Transaction Bus
[Figure: bus timeline with the requests R1, W1, W2, R2, R3, R4 and the replies R1', W1', W2', R2', R3', R4']

Split-Transaction Bus
The advantages of the split-transaction bus are evident when requests have a variable delay, since in that case pipelined transactions cannot be overlapped.

Burst Messages
There is a considerable amount of overhead in a bus transaction:
  Arbitration
  Addressing
  Acknowledgement
Efficiency = Transmitted Words / Message Size = 1/3

Burst Messages
The overhead can be reduced if messages are sent as blocks (bursts).
Efficiency = Transmitted Words / Message Size = 2/3
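
The two efficiency values can be reproduced with a one-line formula. The sketch assumes a fixed overhead of two cycles per message and a four-word burst; these numbers are not stated in the text and were chosen only because they are consistent with the quoted 1/3 and 2/3.

```python
from fractions import Fraction

def bus_efficiency(data_words, overhead_cycles=2):
    """Transmitted words divided by total message size (data plus overhead)."""
    return Fraction(data_words, data_words + overhead_cycles)

single = bus_efficiency(1)   # one word per message
burst = bus_efficiency(4)    # four-word burst
```

Because the overhead is paid once per message, the efficiency approaches 1 as the burst grows, at the cost of occupying the bus for longer.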

Burst Messages
The longer the burst, the better the efficiency, BUT other bus masters have to wait, which may be unacceptable in many systems (real-time).
Possible solutions:
  Maximum length for a burst
  Interruption of long messages, followed by restart or resume

Embedded Buses
Current systems-on-chip are advanced enough to need a hierarchy of buses.
A new set of bus standards has been defined for use in SoCs, e.g.:
  ARM AMBA
  STBus
  Altera Avalon
  ...
These buses allow for higher performance than traditional tri-state buses.

AMBA Specifications
The AMBA specification defines an on-chip communications standard for designing high-performance embedded microcontrollers.
Three buses are defined:
  Advanced High-Performance Bus (AHB)
  Advanced System Bus (ASB)
  Advanced Peripheral Bus (APB)

System Based on an AMBA Bus
An AMBA system typically contains a high-speed bus (ASB or AHB) for the CPU, fast memory and DMA, and a bus for peripherals (APB), which is connected via a bridge to the high-speed bus.

AMBA Buses
AMBA AHB (new standard)
  High performance
  Pipelined operation
  Multiple bus masters
  Burst transfers
  Split transactions
AMBA ASB (older standard)
  High performance
  Pipelined operation
  Multiple bus masters
AMBA APB
  Low power
  Latched address and control
  Simple interface
  Suitable for many peripherals

AMBA AHB System
AHB Master
  A bus master is able to initiate read and write operations by providing address and control information. Only one bus master can use the bus at a time.
AHB Slave
  A bus slave responds to a read or write operation within a given address-space range. The bus slave signals back to the active bus master the success, failure or waiting of the data transfer.

AMBA AHB System
AHB Arbiter
  The bus arbiter ensures that only one bus master at a time is allowed to initiate data transfers. Even though the arbitration protocol is fixed, any arbitration algorithm, such as highest priority or fair access, can be implemented depending on the application requirements.
  An AHB includes only one arbiter.
AHB Decoder
  The AHB decoder is used to decode the address of each transfer and provide a select signal for the slave that is involved in the transfer.
  A single centralized decoder is required in all AHB implementations.
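
The decoder's job can be sketched as a lookup from transfer address to select signal. The address ranges and select names below are hypothetical, chosen only for illustration, and the fallback to a default slave for unmapped addresses is an assumption of this sketch.

```python
# Hypothetical address map: (first address, last address, select signal).
ADDRESS_MAP = [
    (0x0000_0000, 0x0000_FFFF, "HSEL_rom"),
    (0x2000_0000, 0x2000_FFFF, "HSEL_ram"),
    (0x4000_0000, 0x4000_0FFF, "HSEL_apb_bridge"),
]

def decode(haddr):
    """Return the single select signal to assert for this transfer address."""
    for first, last, select in ADDRESS_MAP:
        if first <= haddr <= last:
            return select
    return "HSEL_default"  # assumed default slave for unmapped addresses

selected = decode(0x2000_0040)
```

Because the decoder is centralized, slaves do not need to compare addresses themselves; each one only watches its own select line.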

AMBA AHB Bus Interconnection
The AHB protocol is based on a central multiplexer interconnection scheme.
All bus masters send their requests in the form of address and control signals.
The arbiter chooses one master. The address and control signals of that master are routed to all slaves.
The decoder selects the signals from the slave that is involved in the transfer with the bus master.

AMBA AHB Bus Interconnection
[Figure: diagram not recovered]

AMBA Address Decoding System
[Figure: diagram not recovered]

Basic Transfer
An AHB transfer consists of two distinct sections:
  The address phase, which lasts only a single cycle.
  The data phase, which may require several cycles. The data phase is extended using the HREADY signal.

Transfer with Wait States
[Figure: timing diagram not recovered]

Multiple Transfers
When the data phase of a transfer is extended with wait states, this has the side effect of extending the address phase of the following transfer.
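
The effect of this overlap on total duration can be estimated with a short calculation. The sketch assumes back-to-back transfers in which the address phase of each transfer coincides with the data phase of the previous one, so only the first address phase costs an extra cycle; the function name is hypothetical.

```python
def ahb_total_cycles(wait_states):
    """Cycles for back-to-back transfers; wait_states[i] extends data phase i."""
    if not wait_states:
        return 0
    # One cycle for the first address phase, then one data cycle per
    # transfer plus its wait states; later address phases are hidden.
    return 1 + sum(1 + w for w in wait_states)

no_waits = ahb_total_cycles([0, 0, 0])
one_wait = ahb_total_cycles([0, 1, 0])
```

A single wait state on the second transfer adds exactly one cycle to the whole sequence, because the third transfer's address phase is stretched along with it.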

Burst Signal Encoding
Both incrementing and wrapping bursts are supported in the protocol.
  Incrementing bursts access sequential locations.
  For wrapping bursts, if the start address of the transfer is not aligned to the total number of bytes in the burst (size x beats), then the address of the transfers in the burst will wrap when the boundary is reached.
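
The address sequence of both burst types can be computed as follows. This is a sketch: the function name and the example start address are assumptions, and the wrap boundary is taken to be size x beats bytes as described above.

```python
def burst_addresses(start, size_bytes, beats, wrapping):
    """List the address of every beat in an incrementing or wrapping burst."""
    if not wrapping:
        return [start + i * size_bytes for i in range(beats)]
    block = size_bytes * beats          # wrap boundary in bytes (size x beats)
    base = start - (start % block)      # aligned start of the block
    return [base + (start - base + i * size_bytes) % block
            for i in range(beats)]

wrap4 = burst_addresses(0x38, 4, 4, wrapping=True)
incr4 = burst_addresses(0x38, 4, 4, wrapping=False)
```

For a four-beat burst of 4-byte transfers starting at 0x38, the wrapping variant visits 0x38, 0x3C, 0x30 and 0x34, staying inside one 16-byte block, while the incrementing variant continues to 0x40 and 0x44.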

Burst Signal Encoding

HBURST[2:0]  Type    Description
000          SINGLE  Single transfer
001          INCR    Incrementing burst of unspecified length
010          WRAP4   4-beat wrapping burst
011          INCR4   4-beat incrementing burst
100          WRAP8   8-beat wrapping burst
101          INCR8   8-beat incrementing burst
110          WRAP16  16-beat wrapping burst
111          INCR16  16-beat incrementing burst
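
The encoding table translates directly into a lookup structure. The dictionary layout (name, number of beats, wrapping flag) and the function name are assumptions made for this sketch; the values are taken from the table.

```python
# (name, beats, wrapping); beats is None for a burst of unspecified length.
HBURST_ENCODING = {
    0b000: ("SINGLE", 1, False),
    0b001: ("INCR", None, False),
    0b010: ("WRAP4", 4, True),
    0b011: ("INCR4", 4, False),
    0b100: ("WRAP8", 8, True),
    0b101: ("INCR8", 8, False),
    0b110: ("WRAP16", 16, True),
    0b111: ("INCR16", 16, False),
}

def decode_hburst(bits):
    """Decode the three HBURST bits into (name, beats, wrapping)."""
    return HBURST_ENCODING[bits & 0b111]

name, beats, wrapping = decode_hburst(0b100)
```

Note the regular pattern for the fixed-length bursts: the lowest bit distinguishes incrementing (1) from wrapping (0), and the upper two bits select 4, 8 or 16 beats.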

Four-Beat Incrementing Burst
[Figure: timing diagram not recovered]

Undefined-Length Bursts
[Figure: timing diagram not recovered]

Four-Beat Wrapping Burst
[Figure: timing diagram not recovered]

Arbitration
The arbitration mechanism is used to ensure that only one master has access to the bus at any one time.
When a master has been granted the bus and is performing a fixed-length burst, it does not need to continue requesting the bus in order to complete the burst.

Granting Access with Wait States
[Figure: timing diagram not recovered]

Bus Master Grant Signals
[Figure: arbiter diagram not recovered]

Split Transfers
SPLIT transfers improve the overall utilization of the bus by separating (or splitting) the operation of the master providing the address to a slave from the operation of the slave responding with the appropriate data.

Split Transfers
When a transfer occurs, the slave can decide to issue a SPLIT response if it believes the transfer will take a long time.
This signals to the arbiter that the master which is attempting the transfer should not be granted access to the bus until the slave indicates it is ready to complete the transfer.
The arbiter is responsible for observing the response signals and internally masking any requests from masters which have been SPLIT.

Split Transfer Sequence
1. The master starts the transfer in an identical way to any other transfer and issues address and control information.
2. If the slave is able to provide data immediately, it may do so. If the slave decides that it may take a number of cycles to obtain the data, it gives a SPLIT transfer response.
3. The arbiter grants other masters use of the bus.

Split Transfer Sequence
4. When the slave is ready to complete the transfer, it asserts the appropriate bit of the HSPLITx bus to the arbiter to indicate which master should be regranted access to the bus.
5. The arbiter observes the HSPLITx signals on every cycle, and when any bit of HSPLITx is asserted the arbiter restores the priority of the appropriate master.
6. Eventually the arbiter will grant the master so it can re-attempt the transfer. This may not occur immediately if a higher-priority master is using the bus.
7. When the transfer eventually takes place, the slave finishes with an OKAY transfer response.
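
The arbiter's masking behavior during a split transfer can be sketched as follows. The class and method names are hypothetical, and a fixed-priority choice (lowest master index first) is assumed for the underlying arbitration; the protocol itself leaves the algorithm open.

```python
class SplitAwareArbiter:
    def __init__(self, n_masters):
        self.masked = [False] * n_masters

    def split_response(self, master):
        # The slave answered SPLIT: stop granting this master for now.
        self.masked[master] = True

    def hsplit(self, bits):
        # One bit per master; an asserted bit restores that master's request.
        for master, bit in enumerate(bits):
            if bit:
                self.masked[master] = False

    def grant(self, requests):
        """Return the granted master, or None (lowest index first, assumed)."""
        for master, requesting in enumerate(requests):
            if requesting and not self.masked[master]:
                return master
        return None

arb = SplitAwareArbiter(2)
first = arb.grant([True, True])     # master 0 starts its transfer
arb.split_response(0)               # the slave splits master 0
second = arb.grant([True, True])    # master 1 can use the bus meanwhile
arb.hsplit([True, False])           # the slave is ready for master 0
third = arb.grant([True, True])     # master 0 re-attempts the transfer
```

While master 0 is masked, its request is ignored even though it has the higher priority, which is what frees the bus for other masters.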

Handover after Split Transfer
[Figure: timing diagram not recovered]