ARM1156T2 S Architecture
8/8/2019 ARM1156T2 S Architecture
1/38
SUNPLUS
Technology for Easy Living
ARM1156T2-S Architecture
Introduction
Leon
Nov. 14, 2006
-
Outline
  ARM1156T2-S Components
  Pipelines
  Prefetch Unit
  Coprocessor
  Cache and TCM
  Memory Protection Unit
-
ARM1156T2-S Components
ALU / Shifter
Multiplier
-
Pipeline
  Fe1: instruction fetch; the address is issued to memory.
  Fe2: memory returns data to the core.
  Fe3: branch prediction.
-
Prefetch Unit
  Instructions are fetched from:
    Instruction Tightly-Coupled Memory (ITCM)
    Instruction Cache
    External memory
  Branch Prediction
    Branch Predictor
      If disabled, conditional execution waits until the Execute stage to determine the branch, which causes a stall.
    Pattern History Table: 256 entries
    Three-entry circular hardware Return Stack for predicted returns
-
Branch Predictor
  Uses a Global History prediction scheme.
  Two Pattern History Tables, 256 entries each.
  [Figure: the target address and the Global History register (N bits, e.g. 101010000001) feed a logic function whose output is the index into the Pattern History Table of two-bit counters.]
  The index comes from some logic function that combines the target address with the history of, say, the 12 previous branches.
  For example, when the two-bit counter reaches 10 (binary), the prediction is taken.
  History is based on the taken/not-taken outcome of ALL branches.
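The global-history scheme above can be sketched as a small software model. This is only an illustration: the slide does not say which logic function combines the address with the history, so the XOR fold in `pht_index` is an assumption, as are all the names. The model also uses a single 256-entry table rather than the two tables the slide mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PHT_ENTRIES  256u  /* entries per Pattern History Table (from the slide) */
#define HISTORY_BITS 12u   /* "say, 12 previous branches" */
#define HISTORY_MASK ((1u << HISTORY_BITS) - 1u)

typedef struct {
    uint8_t  counter[PHT_ENTRIES]; /* two-bit saturating counters, values 0..3 */
    uint16_t history;              /* global taken/not-taken bits, newest in bit 0 */
} predictor_t;

/* Hypothetical index function: XOR-fold the address and the history to 8 bits. */
static unsigned pht_index(const predictor_t *p, uint32_t addr)
{
    uint32_t h = p->history & HISTORY_MASK;
    return ((addr >> 2) ^ h ^ (h >> 8)) % PHT_ENTRIES;
}

/* Counter values 10b and 11b predict taken; 00b and 01b predict not taken. */
static bool predict_taken(const predictor_t *p, uint32_t addr)
{
    return p->counter[pht_index(p, addr)] >= 2u;
}

/* Train the selected counter with the real outcome, then shift the outcome
 * into the global history, which is shared by ALL branches. */
static void predictor_update(predictor_t *p, uint32_t addr, bool taken)
{
    uint8_t *c = &p->counter[pht_index(p, addr)];
    assert(*c <= 3u);
    if (taken && *c < 3u) {
        (*c)++;
    }
    if (!taken && *c > 0u) {
        (*c)--;
    }
    p->history = (uint16_t)(((p->history << 1) | (taken ? 1u : 0u)) & HISTORY_MASK);
}
```

A branch that is always taken first fills the history with ones; after that it keeps hitting the same counter, which then saturates toward 11b and predicts taken.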
-
Enable/Disable Branch Predictor
  Enable: the Z bit of the CP15 c1 Control Register and the DB bits of the c1 Auxiliary Control Register are set to 1.
  Disable: the Z bit of the CP15 c1 Control Register and the DB bits of the c1 Auxiliary Control Register are set to 0.
  If disabled, conditional branches are predicted not taken.
-
Branch Return Stack
  When a procedure call instruction is predicted as taken, the PFU pushes the return address onto the Return Stack.
  Procedure call instructions:
    ARM
      BL immediate (conditional)
      BLX immediate (unconditional)
    Thumb
      Unconditional BL immediate and BLX immediate
  The PFU fetches from the Return Stack when it detects an unconditional return instruction:
    ARM
      MOV pc, r14
    ARM and Thumb-2
      LDR pc
      LDM r13,{..pc..}
      BX r14
    Thumb
      POP
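The push-on-call, pop-on-return behavior can be modeled as a three-entry circular buffer. This is a minimal sketch with hypothetical names. It assumes that a push onto a full stack overwrites the oldest entry, which is one reading of "circular" that the slide does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RS_ENTRIES 3u  /* three-entry return stack (from the slide) */

typedef struct {
    uint32_t addr[RS_ENTRIES];
    unsigned top;    /* slot that receives the next push */
    unsigned count;  /* number of valid entries, at most RS_ENTRIES */
} return_stack_t;

/* Called when a BL/BLX is predicted taken: remember the return address. */
static void rs_push(return_stack_t *rs, uint32_t return_addr)
{
    assert(rs->top < RS_ENTRIES);
    rs->addr[rs->top] = return_addr;
    rs->top = (rs->top + 1u) % RS_ENTRIES;
    if (rs->count < RS_ENTRIES) {
        rs->count++;  /* when already full, the oldest entry was overwritten */
    }
}

/* Called when a return instruction is detected. Returns false when the
 * stack is empty, in which case no prediction is available. */
static bool rs_pop(return_stack_t *rs, uint32_t *predicted)
{
    if (rs->count == 0u) {
        return false;
    }
    rs->top = (rs->top + RS_ENTRIES - 1u) % RS_ENTRIES;
    *predicted = rs->addr[rs->top];
    rs->count--;
    return true;
}
```

With four nested calls, the model predicts the three innermost returns correctly and has nothing to offer for the outermost one.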
-
Coprocessor
  CP0 to CP15
    CP10: VFP control
    CP11: VFP control
    CP14: Debug
    CP15: System control
  Uses a load-store architecture to perform coprocessor internal operations and to save/load internal register data to/from memory and to/from ARM core registers
  CP15: configuration of cache, TCM, MMU, MPU
-
Unified Instruction and Data Cache
  The portions used for instructions and for data can be adjusted flexibly
-
Harvard Architecture
  Instruction fetch and data access in a single clock cycle
-
Cache Characteristics
  Set associative: one-way for 1KB, two-way for 2KB, four-way for other cache sizes
  Cache line size: 32 bytes
  Supported cache way size
    Maximum is 16KB
    Minimum is 1KB
  Unique values of cache lines within a set
-
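Cache Organization
  [Figure: 4KB, 4-way cache with 64 entries per way and 4-word lines. The address is split into Tag (bits 31:10), Set index (bits 9:4) and Data index (bits 3:0). The tag is compared (=) in all four ways and a MUX selects the HIT data.]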
-
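  [Figure: address mapping example. The address is split into Tag (bits 31:10, 22 bits), Set index (bits 9:5) and Data index (bits 4:0). Each of the four ways spans offsets 0x000 to 0x3FF. The addresses 0x00000224, 0x00000624 and 0x00000A24 all map to offset 0x224 within a way and differ only in their tags.]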
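The mapping of the three example addresses can be checked with a short calculation. This sketch assumes a 4KB, 4-way cache (1KB per way) and the 32-byte line size from the Cache Characteristics slide. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_BYTES 32u    /* cache line size (Cache Characteristics slide) */
#define WAY_BYTES  1024u  /* assumed geometry: 4KB cache / 4 ways */

typedef struct {
    uint32_t tag;   /* address bits above the way size */
    uint32_t set;   /* which line within a way */
    uint32_t byte;  /* byte offset within the line */
} cache_addr_t;

/* Split an address into tag, set index and byte offset for this geometry. */
static cache_addr_t split_address(uint32_t addr)
{
    cache_addr_t f;
    assert(WAY_BYTES % LINE_BYTES == 0u);
    f.byte = addr % LINE_BYTES;
    f.set  = (addr % WAY_BYTES) / LINE_BYTES;
    f.tag  = addr / WAY_BYTES;
    return f;
}
```

Under these assumptions 0x00000224, 0x00000624 and 0x00000A24 all select set 17 at byte offset 4, with tags 0, 1 and 2. They compete for the same set, and a 4-way cache can hold all three at once in different ways.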
-
Write Buffer
  FIFO built from fast memory
  Holds writes to external memory
  Nonblocking cache
  If a read access targets the same address as an entry in the Write Buffer, the read is blocked until the writes drain to main memory
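The read-after-write rule can be illustrated with a toy FIFO model. The depth of four entries and all names are assumptions made for this sketch. The slide does not give the real buffer depth or the granularity of the address comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WB_DEPTH 4u  /* hypothetical depth, not taken from the slide */

typedef struct {
    uint32_t addr;
    uint32_t data;
} wb_entry_t;

typedef struct {
    wb_entry_t slot[WB_DEPTH];
    size_t head;   /* index of the oldest pending write */
    size_t count;  /* number of pending writes */
} write_buffer_t;

/* Queue a write. Returns false when the buffer is full (the core would stall). */
static bool wb_push(write_buffer_t *wb, uint32_t addr, uint32_t data)
{
    if (wb->count == WB_DEPTH) {
        return false;
    }
    size_t tail = (wb->head + wb->count) % WB_DEPTH;
    wb->slot[tail].addr = addr;
    wb->slot[tail].data = data;
    wb->count++;
    return true;
}

/* A read must wait while any pending write targets the same address. */
static bool wb_read_blocked(const write_buffer_t *wb, uint32_t addr)
{
    for (size_t i = 0; i < wb->count; i++) {
        if (wb->slot[(wb->head + i) % WB_DEPTH].addr == addr) {
            return true;
        }
    }
    return false;
}

/* Drain the oldest write toward main memory, in FIFO order. */
static bool wb_drain_one(write_buffer_t *wb, wb_entry_t *out)
{
    assert(wb->head < WB_DEPTH);
    if (wb->count == 0u) {
        return false;
    }
    *out = wb->slot[wb->head];
    wb->head = (wb->head + 1u) % WB_DEPTH;
    wb->count--;
    return true;
}
```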
-
Cache Policy
  Cache line allocation policy
    Read allocation
    Read-write allocation
    ARM1156T2-S supports only read allocation
  Write policy, controlled by the memory attributes (C and B bits)
    Write-through
    Write-back: sets the dirty bit
  Cache line replacement policy, selected in the CP15 c1 Control Register; on a miss, a victim is selected
    Round-robin
    Pseudorandom
    Least recently used (not supported by ARM)
-
Invalidate and Clean
  Invalidate: clear the valid bit in the affected cache line
    Also called flush
  Clean: write the cache lines whose dirty bit is set to main memory and clear the dirty bit
  The instruction cache needs no clean operation
-
Cache lock down
  Avoids the miss penalty
  Lockable at a granularity of a cache way
  Critical code or data
    Vector interrupt ISR
    Algorithms used extensively
    Variables referenced intensively
  If the cache is flushed, the lockdown must be rerun to restore the locked contents
-
Cache miss handling
  If all ways are locked when a cache miss occurs:
    ARM architecture
      Unpredictable behavior
    ARM1156T2-S
      Evicts the cache line in Way 0 as if Way 0 were not locked
  If the cache is disabled and a read/write falls in an address range held in the cache, this unexpected hit is ignored
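Victim selection with locked ways can be sketched as follows. The model combines the round-robin replacement policy with the Way 0 fallback described above. The lock mask layout, the pointer handling and the names are assumptions for illustration, not the hardware's actual mechanism.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_WAYS 4u

/* lock_mask: bit n set means way n is locked down.
 * rr_next:   round-robin pointer, the first way to consider for eviction.
 * Returns the way whose line is evicted on a miss. */
static unsigned select_victim(uint8_t lock_mask, unsigned *rr_next)
{
    assert(*rr_next < NUM_WAYS);
    for (unsigned i = 0; i < NUM_WAYS; i++) {
        unsigned way = (*rr_next + i) % NUM_WAYS;
        if ((lock_mask & (1u << way)) == 0u) {
            *rr_next = (way + 1u) % NUM_WAYS;  /* advance past the victim */
            return way;
        }
    }
    /* All ways locked: evict from Way 0 as if it were not locked. */
    return 0u;
}
```

With way 1 locked and the pointer at way 1, the model skips to way 2. With all four ways locked it returns way 0 and leaves the pointer unchanged.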
-
Tightly-coupled memory (TCM)
  Low-latency memory
  Part of the physical memory map, as a contiguous memory space
  Holds critical routines
    Interrupt Service Routines
    Critical tasks
    Interrupt stacks
    Data intensively referenced
  Supported TCM size
    Maximum is 256KB
    Minimum is 4KB
  TCM information
    CP15 c0 TCM Status Register
  TCM and cache are independent
  ITCM and DTCM regions cannot overlap
-
TCM (Cont.)
  The TCM region overrides the memory type attributes of the MPU, and all addresses within the TCM space are treated as Normal, Non-Shared memory
  If the peripheral port region overlaps the TCM
    Treated as: Device, Non-Shared, and TCM
    Accesses to the region are routed to the TCM, not the peripheral port
  Configurable variables
    Base address
    Size
-
Access TCM vs. Cache
-
Memory Protection Unit
  Supports up to 16 regions
  Configuration options:
    region base address
    region size
    region attributes
    region access permissions
  If the MPU is disabled, no access permissions are checked
-
MPU (Cont.)
  Region base address
    Must be aligned to a region-sized boundary; otherwise the behavior is Unpredictable
  Region size
    32 bytes to 4GB
  Region attributes
    Memory type (Strongly Ordered, Device, or Normal)
    Shared/Non-Shared
    Non-Cacheable
    Write-through Cacheable
    Write-back Cacheable
  Access permission
    User and privileged mode
    Read/Write
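The base-address and size constraints can be expressed as a small check. This sketch assumes region sizes are powers of two and passes the size as its base-2 logarithm. The function names are hypothetical and are not part of any ARM API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A region covers 2^size_log2 bytes: 5 means 32 bytes, 32 means 4GB. */
static bool mpu_region_is_valid(uint32_t base, unsigned size_log2)
{
    if (size_log2 < 5u || size_log2 > 32u) {
        return false;                 /* outside the 32 bytes to 4GB range */
    }
    if (size_log2 == 32u) {
        return base == 0u;            /* a 4GB region can only start at 0 */
    }
    uint32_t mask = (UINT32_C(1) << size_log2) - 1u;
    return (base & mask) == 0u;       /* base aligned to the region size */
}

/* Last address covered by a valid region (inclusive). */
static uint32_t mpu_region_last(uint32_t base, unsigned size_log2)
{
    assert(mpu_region_is_valid(base, size_log2));
    if (size_log2 == 32u) {
        return UINT32_MAX;
    }
    return base + ((UINT32_C(1) << size_log2) - 1u);
}
```

For instance, base 0x3000 is acceptable for a 4KB region (size_log2 of 12) but not for an 8KB region, because 0x3000 is not a multiple of 0x2000.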
-
Overlap examples
  Region 1
    Base is 0x0000
    Privileged mode full access, user mode read-only
  Region 2
    Base is 0x3000
    User mode full access only
-
Overlap examples (2)
  Region 1
    Base is 0x0
    Full access by both modes
  Region 2
    Base is 0x0
    No access
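How overlapping regions resolve can be sketched with a lookup that scans from the highest region number down. The rule that a higher-numbered region takes priority is an assumption here and should be verified against the processor documentation. The struct layout and names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { ACCESS_NONE, ACCESS_READ_ONLY, ACCESS_FULL } access_t;

typedef struct {
    bool     enabled;
    uint32_t base;        /* first address covered */
    uint32_t last;        /* last address covered (inclusive) */
    access_t privileged;  /* permission in privileged mode */
    access_t user;        /* permission in user mode */
} mpu_region_t;

/* Returns the index of the matching region that wins, or -1 when no
 * region matches (the background fault case).
 * Assumption: the highest-numbered matching region has priority. */
static int mpu_lookup(const mpu_region_t *regions, size_t count, uint32_t addr)
{
    assert(regions != NULL);
    for (size_t i = count; i-- > 0; ) {
        if (regions[i].enabled &&
            addr >= regions[i].base && addr <= regions[i].last) {
            return (int)i;
        }
    }
    return -1;
}
```

Take the first example with region 1 covering 0x0000 to 0x3FFF and region 2 covering 0x3000 to 0x3FFF. These extents are assumed, since the slides give only the base addresses. An access to 0x3010 then resolves to region 2 and an access to 0x1000 to region 1. In the second example, wherever the two regions overlap, region 2's "no access" setting overrides region 1.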
-
Memory map at reset
  [Figure: memory map at reset; the surviving labels are "2 Giga" and "1 Giga"]
-
MPU Enable
  Before enabling the MPU
    Set up at least one memory region
    Clean and invalidate the data cache.
    Invalidate the instruction caches.
  Address generated by the Load Store Unit or Prefetch Unit
    No match in any configured memory region
      A background fault is generated and the Fault Status Register is filled
        Alignment fault
        Background fault
        Permission fault
    Match in one memory region
      No permission: memory abort
      The region determines whether the access is cached, uncached, or shared
      The highest-priority memory region is applied
-
MPU Disable
  Before disabling the MPU
    Set up at least one memory region
    Clean and invalidate the data cache.
    Invalidate the instruction caches.
  No access permission checks, no abort generation
    The default memory map is used
    Instruction and data prefetch operations work as normal
    Accesses to the TCM work as normal
-
Memory attributes and types
  Mutually exclusive type attributes
    Strongly Ordered
    Device
    Normal
  Shared
    Accessed by multiple processors
  Non-shared
    Accessed by a single processor
  In the c6 Region Control Register, the S (Shared) bit applies only to Normal memory, not to Device or Strongly Ordered memory
-
Strongly Ordered memory
  An access to memory marked as Strongly Ordered acts as a memory barrier to all other explicit accesses from that processor
  Addresses marked as Strongly Ordered are
    Noncacheable
    Shared
-
Memory Barriers
  A class of instructions that cause a CPU to enforce an ordering constraint on memory operations issued before and after the barrier instruction.
  Performance optimizations can result in out-of-order execution, e.g. of loads and stores
  Memory operation reordering normally goes unnoticed within a single task, but causes unpredictable behaviour in multitasking and device drivers unless carefully controlled
-
Memory Barriers (Cont.)
  CP15 c7
    Data Memory Barrier
      Ensures that all explicit memory transactions occurring in program order before this instruction are completed
    Drain Write Buffer
    Flush Prefetch Buffer
    Invalidate the I-Cache
    Clean D-Cache
-
Normal
  Cacheable write-through
  Cacheable write-back
  Noncacheable
  Shared, non-shared
    The ARM1156T2-S does not cache shareable locations
    A memory region covered by the TCM is always non-shared. If it is marked as shared, the result is Unpredictable behavior
-
Device
  A region with the Device attribute is not held in a cache
  In the ARM1156T2-S processor, the Non-Shared Device attribute is assigned to the peripheral port and the Shared Device attribute is assigned to the system bus
-
Access Permission
  The access permissions are determined by the AP[2:0] bits in the CP15 c6 Data Access Permission Registers.