Introduction to the C2000 Control Law Accelerator Part 1
Transcript of Introduction to the C2000 Control Law Accelerator Part 1
Ver 6, 08 April 2009
Slide 1
Introduction to the C2000 Control Law Accelerator
Part 1
Lori Heustess
C2000 Applications
April 8, 2009
Slide 2
Session Agenda
Introduction: What is it? Why is it?
Architecture: Floating-Point Format, Tasks, CLA Execution Flow, Time Slicing, Register Set, Program and Data Bus, Memory and Register Access
Instructions: Format, Addressing Modes, Types of Instructions, Parallel Instructions, Status Flags
Pipeline: Pipeline Stages, Effects on Instructions
CLA Compared to C28x+FPU
CLA in a Control System: Code Partitioning, "Just in Time" ADC Sampling
Code Development and Debug: Anatomy of CLA Code, Initialization, Code Debug
Slide 3
Session Agenda
Slide 4
What is the Control Law Accelerator (CLA)?
An independent 32-bit floating-point math accelerator
[Block diagram: C28x CPU, CLA, 12-bit ADC (3.3V), CMP, High-Res PWM]
- Operates independently of the C28x CPU
- Clocked at the CPU frequency (SYSCLKOUT)
- Independent register set, memory bus structure and processing unit
- Direct access to ePWM+HRPWM, ADC result and comparator registers
- Low interrupt response time (no nesting of tasks)
- Can read an ADC result "just-in-time"
- Execution of algorithms in parallel with the C28x CPU
- Executes time-critical control loops concurrently with the main CPU
Slide 5
What is the Control Law Accelerator (CLA)?
An independent 32-bit floating-point math accelerator
Fully programmable: IEEE 32-bit floating-point
- Easier to code than fixed-point
- Inherently more robust
- Removes scaling and saturation burden
- Sign inversion problems go away
- Supports instructions to convert fixed-point to float when reading in the value (ex: MUI16TOF32)
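To make the "scaling and saturation burden" concrete, the following is a minimal Python sketch (not CLA code) that contrasts a Q15 fixed-point multiply with the same product in floating point. The helper names `q15_mul` and `u16_to_float` are hypothetical, and `u16_to_float` only models the value-preserving idea behind an unsigned-integer-to-float conversion; it is not a description of the instruction's exact behaviour.

```python
Q15_MIN, Q15_MAX = -32768, 32767


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values: rescale by 2**15, then saturate to 16 bits."""
    product = (a * b) >> 15                      # manual rescale back to Q15
    return max(Q15_MIN, min(Q15_MAX, product))   # manual saturation


def u16_to_float(raw: int) -> float:
    """Model an unsigned 16-bit to float conversion: same value, float type."""
    return float(raw & 0xFFFF)


# -1.0 * -1.0 would be +1.0, which Q15 cannot represent, so the result saturates.
saturated = q15_mul(-32768, -32768)

# In floating point the same product needs no rescale and no clamp.
exact = (-1.0) * (-1.0)

# Example: a full-scale 12-bit ADC count read in as a float.
adc_counts = u16_to_float(4095)
```

In the fixed-point version the programmer owns the shift and the clamp on every multiply; in the floating-point version both disappear.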
Slide 6
Benefits of the CLA
An independent 32-bit floating-point math accelerator
Key Drivers:
- Reduced sample-to-output delay (& jitter)
- Faster system response & higher MHz control loops
- Improved support for multi-channel (phase/freq) loops
- Improved system robustness (IEC60730, SIL-3)
- Free up C28x CPU for other tasks (comms, diagnostics)
Applications:
- Digital Power
- Automotive, White-goods
- All Applications
Slide 7
PICCOLO Device With CLA (F2803x series)
[Block diagram; recoverable labels:]
- Processing: 32-bit C28x CPU, 60 MHz; 32-bit CLA, 60 MHz
- CLA memory: Prog RAM 8 KByte, Data0 RAM 2 KByte, Data1 RAM 2 KByte, Msg RAM 256 Byte
- Other memory: Flash 64/128 KByte (4/8 sectors), OTP 2 KByte, Boot ROM, M0/M1 RAM 4 KByte, L0 RAM 4 KByte; some blocks are marked Secure
- Buses: 32-bit buses, peripheral bus (Per Bus)
- Analog: 12-bit ADC, 2 S/H, 4.6 MSPS (inputs Ax, Bx); AIO mux; 3 x comparator; 3 x 10-bit DAC
- PWM and capture: EPWM1 to EPWM7 with HRPWM, ECAP, EQEP
- Communications: SCI, 2 x SPI, I2C, CAN, LIN (SCI)
- Clocking and system: OSC1 10 MHz, OSC2 10 MHz, external crystal (X1, X2), XCLKIN, PLL, WD, LPM, POR/BOR, XRSn
- Power: VREG, VREGENZ, VDD (core voltage), VDDIO digital power 3.3V +/-10%, VDDA analog power 3.3V +/-10%, VSS, VSSA
- I/O and debug: GPIO mux (GPIO0 to GPIOx), 3 external interrupts, JTAG
Slide 8
Session Agenda
Introduction: What is it? Why is it?
Architecture: Floating-Point Format, Tasks, CLA Execution Flow, Time Slicing, Register Set, Program and Data Bus, Memory and Register Access
Instructions: Format, Addressing Modes, Types of Instructions, Parallel Instructions, Floating-Point Flags
Pipeline: Pipeline Stages, Effects on Instructions
CLA Compared to C28x+FPU
CLA in a Control System: Code Partitioning, "Just in Time" ADC Sampling
Code Development and Debug: Anatomy of CLA Code, Initialization, Code Debug
Slide 9
IEEE Single-Precision Floating-Point Format
Bit fields:
- S: 1 Sign Bit (0 = Positive, 1 = Negative)
- E: 8-bit Exponent (Biased)
- M: 23-bit Mantissa (Implicit Leading Bit + Fraction Bits)

S        E            M               Value
0 or 1   0            0               Positive or Negative Zero
0 or 1   0            Non-Zero        Denormalized Number
0 or 1   1 to 254     0 to 0x7FFFFF   Positive or Negative Values*
0 or 1   255 (max)    0               Positive or Negative Infinity
0 or 1   255 (max)    Non-Zero        Not a Number (NaN)

*Normal Positive and Negative Values are Calculated as: (-1)^S x 2^(E-127) x 1.M
Range: +/- ~1.2 x 10^-38 to +/- ~3.4 x 10^+38
The normalized IEEE numbers have a hidden 1. Thus the equivalent signed integer resolution is the number of mantissa bits + sign + 1
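The field layout and the formula for normal values can be checked with a short Python sketch. It uses the standard `struct` module to get the 32-bit pattern of a value; the function names `decode_float32`, `classify` and `normal_value` are hypothetical and chosen only for this illustration.

```python
import struct


def decode_float32(value: float):
    """Return (S, E, M) for the single-precision encoding of value."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def classify(s: int, e: int, m: int) -> str:
    """Name the row of the format table that (S, E, M) falls into."""
    if e == 0:
        return "zero" if m == 0 else "denormalized"
    if e == 255:
        return "infinity" if m == 0 else "NaN"
    return "normal"


def normal_value(s: int, e: int, m: int) -> float:
    """(-1)^S x 2^(E-127) x 1.M, valid only for 1 <= E <= 254."""
    return (-1) ** s * 2.0 ** (e - 127) * (1 + m / 2 ** 23)


fields = decode_float32(-2.5)        # -1.25 x 2^1
rebuilt = normal_value(*fields)      # should reproduce -2.5
smallest_normal = normal_value(0, 1, 0)
```

Here `smallest_normal` is 2^-126, the lower end of the normal range, and `rebuilt` shows that the implicit leading 1 is added back when the value is reconstructed.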
Slide 10
IEEE Single-Precision Floating-Point Format
Most Widely Used Standard for Floating Point (IEEE 754)
- Standard number formats, special values (NaN, infinity)
- Rounding modes & floating-point operations
- Used on many CPUs including C67x, C28x+FPU
Note: The C3x used a different format.
Simplifications for the CLA (Same as C28x+FPU):
- Flags & compare operations: Negative zero is treated as positive zero
- Denormalized values are treated as zero
- Not-a-number (NaN) is treated as infinity
- Round-to-zero mode supported (truncate)
- Round-to-nearest mode supported (even)
These formats are commonly handled this way on embedded processors.
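The three simplifications can be pictured as a mapping on 32-bit patterns. The Python sketch below is a toy model only: the function name `simplify_bits` is hypothetical, and two details are assumptions the slide does not state, namely that a flushed value is returned as positive zero and that a NaN keeps its sign bit when it is mapped to infinity.

```python
SIGN_MASK = 0x80000000
EXP_MASK = 0x7F800000      # exponent field, also the bit pattern of +infinity


def simplify_bits(bits: int) -> int:
    """Map a float32 bit pattern to the value it is treated as in this model."""
    exponent = (bits >> 23) & 0xFF
    if exponent == 0:
        # Zero (either sign) or denormalized: treated as zero.
        # Assumption: the result is positive zero.
        return 0x00000000
    if exponent == 255:
        # Infinity or NaN: treated as infinity.
        # Assumption: the sign bit of the input is kept.
        return (bits & SIGN_MASK) | EXP_MASK
    return bits                # normal values pass through unchanged


negative_zero = simplify_bits(0x80000000)
tiny_denormal = simplify_bits(0x00000001)
quiet_nan = simplify_bits(0x7FC00000)
```

All three results land on a value the hardware does handle: zero for the first two and positive infinity for the NaN.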
Slide 11
[Block diagram: CLA architecture; recoverable labels:]
- Task triggers from peripheral interrupts or software: ADCINT1 to ADCINT8, EPWM1_INT to EPWM7_INT and T0INT (CPU Timer 0) drive MPERINT1 to MPERINT8; IACK #16bit comes from the C28x CPU
- CLA to PIE: CLA1_INT1 to CLA1_INT8, LVF, LUF (INT11, INT12)
- CLA Execution Registers: MR0 (32), MR1 (32), MR2 (32), MR3 (32), MAR0, MAR1, MPC, MSTF (32)
- CLA Configuration Registers: MVECT1 to MVECT8, MCTL, MPISRCSEL1, MIFR, MICLR, MIFRC, MIOVF, MICLROVF, MIER, MIRUN, MEMCFG; MEALLOW
- Memories: CLA Program Memory and Data RAMs (RAM0, RAM1), each mapped to CPU or CLA space; Message RAMs (CLA to CPU, CPU to CLA)
- Registers: ADC Result, ePWM + HRPWM, COMP
- CLA Program Bus: CLA Prog Addr Bus, CLA Prog Data Bus
- CLA Data Bus: CLA Data Read Addr Bus, CLA Data Read Data Bus, CLA Data Write Addr Bus, CLA Data Write Data Bus
- Main C28x CPU Bus: Main CPU Read/Write Data Bus, Main CPU Data Read Bus
Slide 12
What is a CLA Task?
CLA Task: CLA assembly code routine
- CLA supports 8 interrupts (Task1 to Task8)
- The start address of the task is configurable (MVECTx)
- The end address is marked with an MSTOP instruction
- Executed by the CLA in response to an interrupt event
Tasks can also be started via the main CPU's IACK instruction
- For example: IACK #0x0003 will flag Task1 and Task2
The task executed depends on the interrupt received:
- Interrupt 1 = Task1: ADCINT1 or EPWM1_INT (Highest Priority)
- Interrupt 2 = Task2: ADCINT2 or EPWM2_INT
- ...
- Interrupt 7 = Task7: ADCINT7 or EPWM7_INT
- Interrupt 8 = Task8: ADCINT8 or CPU Timer 0 (Lowest Priority)
Once a task begins it runs to completion (no task/interrupt nesting)
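The IACK example suggests a one-bit-per-task encoding of the operand. The Python sketch below decodes such a mask under that assumption (bit 0 for Task1 up to bit 7 for Task8, which is consistent with 0x0003 flagging Task1 and Task2). The function name `tasks_flagged` is hypothetical.

```python
def tasks_flagged(iack_operand: int):
    """List the task numbers (1..8) whose bits are set in the IACK operand."""
    return [task for task in range(1, 9) if iack_operand & (1 << (task - 1))]


example = tasks_flagged(0x0003)   # the operand used in the slide's example
```

With this encoding a single IACK can flag several tasks at once; the CLA then runs them in priority order.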
Slide 13
CLA Time Slicing
[Timeline diagram: CLA Task 1, CLA Task 2, ... CLA Task N on the CLA; CPU Task 1 to CPU Task 4 on the main CPU]
- The CLA performs multiple tasks using the "Time Slicing" method
- The main CPU handles other system tasks; the two work in parallel
- Communication between CPU & CLA is via shared RAM
Slide 14
CLA Register Set
CLA Execution Registers (CSM protected; main CPU has read-only access):
- Four 32-bit result registers: MR0 to MR3
- Two 16-bit auxiliary registers: MAR0, MAR1
  - Used for indirect addressing
- MPC: 12-bit program counter
  - Offset from the start of CLA program memory
  - Indicates instruction in the D2 phase
- MSTF: status register (32-bit)
  - Zero, negative, overflow, underflow
  - Rounding mode
  - RPC: return PC
  - MEALLOW
CLA Configuration Registers (CSM and EALLOW protected; main CPU has read and write access):
- Eight interrupt (task) vectors: MVECT1 to MVECT8
  - Offset from the start of CLA program memory to the beginning of the task
- Interrupt/task source selection: MPISRCSEL1
  - Task1: ADCINT1 or EPWM1_INT
  - Task2: ADCINT2 or EPWM2_INT
  - ...
  - Task7: ADCINT7 or EPWM7_INT
  - Task8: ADCINT8 or CPU Timer 0
- Interrupt/task control
  - MIFR: flag
  - MICLR: clear
  - MIFRC: force
  - MIOVF: overflow flag
  - MICLROVF: overflow clear
- MIER: interrupt enable/disable
- MIRUN: which task is running
- Configuration and control
  - MEMCFG: memory config
  - MCTL: CLA control
Slide 15
CLA Execution Flow
Task request is via software or interrupt assigned in MPISRCSEL1:
- Task1: ADCINT1 or EPWM1_INT
- Task2: ADCINT2 or EPWM2_INT
- ...
- Task7: ADCINT7 or EPWM7_INT
- Task8: ADCINT8 or CPU Timer 0
Priority: Task1 is highest ... Task8 is lowest
Flowchart, task request side:
- Task request? If yes, check whether the MIFR bit is already set
- MIFR bit set? Yes: set MIOVF bit (overflow flagged). No: set MIFR bit (task pending)
- Note: Software task requests will not set MIOVF
Flowchart, task execution side:
- Task pending (MIFR)? Task enabled (MIER)? If either answer is no, keep checking
- If both are yes: x = highest priority task both enabled and pending
- Clear MIFR.x bit, set MIRUN.x bit, MPC = MVECTx
- Run CLA until end of task (MSTOP); the task runs to completion (no task nesting)
- At MSTOP: clear MIRUN.x bit, task x interrupt to PIE
- When a task completes a task-specific interrupt is sent to the PIE
The main CPU continues code execution in parallel with the CLA
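The flow can be exercised with a small Python model. This is a sketch of the flowchart only, not of the hardware: the class name `ClaTaskModel` and its method names are hypothetical, registers are plain integers with bit 0 standing for Task1 and bit 7 for Task8 (an assumption), and timing is ignored.

```python
class ClaTaskModel:
    """Toy model of the MIFR / MIER / MIRUN / MIOVF task flow."""

    def __init__(self, mier: int = 0):
        self.mier = mier & 0xFF      # enabled tasks
        self.mifr = 0                # pending tasks
        self.miovf = 0               # overflow flags
        self.mirun = 0               # running task (at most one bit set)
        self.pie_interrupts = []     # task numbers reported to the PIE

    def request(self, task: int, software: bool = False) -> None:
        """Flag a task request from a peripheral interrupt or from software."""
        bit = 1 << (task - 1)
        if (self.mifr & bit) and not software:
            self.miovf |= bit        # request arrived while one was pending
        self.mifr |= bit

    def dispatch(self):
        """Start the highest-priority enabled and pending task, if idle."""
        if self.mirun:
            return None              # a task is running: no nesting
        ready = self.mifr & self.mier
        if ready == 0:
            return None
        lowest = ready & -ready      # lowest set bit = highest priority
        self.mifr &= ~lowest         # clear MIFR.x
        self.mirun = lowest          # set MIRUN.x
        return lowest.bit_length()   # task number x

    def mstop(self):
        """End the running task and send its interrupt to the PIE."""
        task = self.mirun.bit_length()
        self.mirun = 0
        if task:
            self.pie_interrupts.append(task)
        return task
```

A typical use is to enable a few tasks through `mier`, call `request` for each event, and alternate `dispatch` and `mstop`; a pending task that is not enabled simply stays set in `mifr`.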
Slide 16
Piccolo (2803x) CLA Memory and Register Access
CLA Program Memory:
- Mapped to CPU program and data space at reset
- CLA code must be even aligned (all instructions are 32-bits)
- 4K x 16 (2048 CLA instructions), single cycle
CLA Data Memory:
- Two blocks: RAM0 and RAM1. 1K x 16 each, single cycle
- Mapped to CPU program and data space at reset
- Each block can be independently mapped to CLA data space
Message RAMs:
- Used to pass data between the CLA and CPU
- CPU to CLA message RAM (ignores CLA writes)
- CLA to CPU message RAM (ignores CPU writes)
- Always mapped to both CPU and CLA memory space
- 128 x 16 each, single cycle
Registers the CLA can Directly Access:
- ePWM + HRPWM, Comparator and ADC Result registers
- CLA MEALLOW protects EALLOW registers from CLA writes
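The sizes on this slide are given in 16-bit words, while the device block diagram quotes KBytes. The short Python calculation below converts between the two and checks the instruction count. It is plain arithmetic on the figures quoted in this deck; the variable names are made up for the illustration.

```python
BYTES_PER_WORD = 2           # sizes on the slide are in 16-bit words
WORDS_PER_INSTRUCTION = 2    # all CLA instructions are 32 bits

program_words = 4 * 1024     # 4K x 16
data_block_words = 1 * 1024  # 1K x 16 per data RAM block
message_ram_words = 128      # 128 x 16 per message RAM

program_bytes = program_words * BYTES_PER_WORD
program_instructions = program_words // WORDS_PER_INSTRUCTION
data_block_bytes = data_block_words * BYTES_PER_WORD
message_ram_bytes = message_ram_words * BYTES_PER_WORD

# A 12-bit program counter can address 2**12 word offsets.
mpc_reach_words = 2 ** 12
```

The results line up with the other slides: 8 KByte of program RAM holding 2048 instructions, 2 KByte per data RAM, 256 Byte per message RAM, and a 12-bit MPC that spans exactly the 4K-word program memory.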
Slide 17
Session Agenda
Slide 18
CLA Instructions
Same instruction format as the C28x and C28x+FPU
Destination operand is always on the left
                  Destination    Source Operands
Fixed Point:      MPY ACC,       T, loc16
Floating Point:   MPYF32 R0H,    R1H, R2H
CLA:              MMPYF32 MR0,   MR1, MR2
Same mnemonics as the C28x+FPU but with a leading "M"
To enable support for CLA instructions, use the switch: --cla_support=cla0 (C2800 codegen tools v5.2.x or later)
Slide 19
CLA Addressing Modes
CLA has only two addressing modes:
Direct Addressing:
- Encodes the 16-bit address of the variable:
  MMOV32 MR1, @_Var1
Indirect Addressing with 16-bit Post Increment:
- Syntax: MAR0[#imm16]++, MAR1[#imm16]++
- Uses the address in MAR0 or MAR1 to access memory
- After the read or write MAR0/MAR1 is incremented by #imm16
  MMOV32 MR1, MAR0[-2]++ ; Load MR1 with what MAR0 points to
                         ; & post increment MAR0 by -2
Both modes can access the low 64k of memory which includes:
- All of the CLA data space
- Both message RAMs
- Shared peripheral registers
No stack pointer or data page pointer
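The post-increment behaviour can be sketched in Python as "read first, then adjust the pointer". This is an illustrative model, not CLA code: the function name `load_post_increment` is hypothetical, memory is a plain dictionary keyed by 16-bit address, and wrapping the pointer at 16 bits is an assumption based on MAR0/MAR1 being 16-bit registers.

```python
def load_post_increment(memory: dict, mar: int, imm16: int):
    """Read memory at the address in mar, then return (value, updated mar)."""
    value = memory[mar]                  # the access uses the old address
    updated = (mar + imm16) & 0xFFFF     # assumption: 16-bit wrap-around
    return value, updated


# Two 32-bit values stored two 16-bit addresses apart (addresses are made up).
ram = {0x1002: 1.25, 0x1004: 7.5}

mr1, mar0 = load_post_increment(ram, 0x1004, -2)    # like MAR0[-2]++
mr1_next, mar0 = load_post_increment(ram, mar0, -2)
```

Stepping by -2 walks backwards through consecutive 32-bit values, since each one occupies two 16-bit addresses.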
Slide 20
Types of CLA Instructions
Type                          Example                     Cycles
Load (Conditional)            MMOV32 MRa,mem32{,CNDF}     1
Store                         MMOV32 mem32,MRa            1
Load With Data Move           MMOVD32 MRa,mem32           1
Store/Load MSTF               MMOV32 MSTF,mem32           1
Compare, Min, Max             MCMPF32 MRa,MRb             1
Absolute, Negative Value      MABSF32 MRa,MRb             1
Unsigned Integer To Float     MUI16TOF32 MRa,mem16        1
Integer To Float              MI32TOF32 MRa,mem32         1
Float To Integer & Round      MF32TOI16R MRa,MRb          1
Float To Integer              MF32TOI32 MRa,MRb           1
Multiply, Add, Subtract       MMPYF32 MRa,MRb,MRc         1
1/X (16-bit Accurate)         MEINVF32 MRa,MRb            1
1/Sqrt(x) (16-bit Accurate)   MEISQRTF32 MRa,MRb          1
Slide 21
Types of CLA Instructions
Type                                     Example                     Cycles
Integer Load/Store                       MMOV16 MRa,mem16            1
Load/Store Auxiliary Register            MMOV16 MAR,mem16            1
Branch/Call/Return Conditional Delayed   MBCNDD 16bitdest {,CNDF}    1-7 *
Integer Bitwise AND, OR, XOR             MAND32 MRa,MRb,MRc          1
Integer Add and Subtract                 MSUB32 MRa,MRb,MRc          1
Integer Shifts                           MLSR32 MRa,#SHIFT           1
Write Protection Enable/Disable          MEALLOW                     1
Halt Code or End Task                    MSTOP                       1
No Operation                             MNOP                        1
* Number of cycles varies based on how many of the delay slots can be used up
Slide 22
Parallel Instructions
Instruction                                    Example                                       Cycles
Multiply & Parallel Add/Subtract               MMPYF32 MRa,MRb,MRc || MSUBF32 MRd,MRe,MRf    1/1
Multiply, Add, Subtract & Parallel Store       MADDF32 MRa,MRb,MRc || MMOV32 mem32,MRe       1/1
Multiply, Add, Subtract, MAC & Parallel Load   MADDF32 MRa,MRb,MRc || MMOV32 MRe,mem32       1/1
Both operations complete in a single cycle!
Example: Add + parallel store
  MADDF32 MR3, MR3, MR1
  || MMOV32 @_Var, MR3
- Parallel bars indicate a parallel instruction
- Single instruction
- Single opcode
- Performs 2 operations
Slide 23
Look for the next presentation
Introduction to the C2000 Control Law Accelerator
Part 2
Thank you!