ARM SoC exploration

10
ARM SoC exploration Using gem5 for application specific system-on-chip architecture exploration Alexandre Romaña & Abhilash Nair 1

Transcript of ARM SoC exploration

Page 1: ARM SoC exploration

ARM SoC exploration

Using gem5 for application specific

system-on-chip architecture exploration

Alexandre Romaña & Abhilash Nair

1

Page 2: ARM SoC exploration

How we do architecture exploration

2

Simulation -architectural change

-IP enhancement -re-partitioning

-algorithmic enhancement

Page 3: ARM SoC exploration

The Case for gem5

3

Cycle accurate

Fast

• Late

• Expensive

• Slow

• Complex

Emulation

Page 4: ARM SoC exploration

Architecture exploration for TCI6636K2H SoC for Small Cells (High End)

4

Tera

Net

Multicore Navigator

Security AccelerationPac

Packet AccelerationPac

+ * - <<

C66x DSP

+ * - <<

C66x DSP

System Services

Power Manager

Debug

System Monitor

EDMA

Multicore Shared Memory Controller

011100

100010

001111

DDR3/3L 64/72b

DDR3/3L 64/72b

4MB ARM Shared Memory

1MB L2 Per C66x Core

Ethernet Switch

+ * - <<

C66x DSP

+ * - <<

C66x DSP

+ * - <<

C66x DSP

+ * - <<

C66x DSP

+ * - <<

C66x DSP

+ * - <<

C66x DSP

6MB

PCIE I2C USB 3 SRIO Ethernet EMIF 16 UART SPI Hyperlink

Radio AccelerationPac

FFTC TCP3d

ARM A15

VCP2

RAC TAC2 BCP

ARM A15

ARM A15

ARM A15

• First chip architected

using gem5:

– SL2$ size

– MSMC Coherency

– Arbitration

– …

Page 5: ARM SoC exploration

System on a Chip components

• IPs in a SoC vs what gem5 supports:

– ARM

– DSP

– Caches

– Memories

– Core Interconnect

– Chip Interconnect

– Hardware accelerators

– External Interfaces

–…

• SystemC enabled full system benchmarking

5

Page 6: ARM SoC exploration

Use case example: Leveraging open source for linux fast path benchmarking

6

Radio Access

Network

Network

Co-processor SoC

Zero overhead linux

Page 7: ARM SoC exploration

The need for task partitioning

7

Antenna interface

Frequency shift

Remove CP FFT PUSCH/PUCCH de-mapping

LTE PUCCH decoding taskflow

ARM A15

+ * - <<

C66x DSP

FFTC

Automated static mapping

+ * - <<

C66x DSP

+ * - <<

C66x DSP ARM A15 ARM A15

or Runtime load balancing

1000s of tasks

Page 8: ARM SoC exploration

Leveraging checkpointing

8

gem5 Check-

pointing closure

Useful for both bare metal and linux

Page 9: ARM SoC exploration

9

Source: http://www.arm.com/images/CoreLink_CCN-504_system_large.png

4x Gem5

Gem5

Topaz

SLICC

SimpleDRAM

TLM2.0 protocol bridging

Modeling today’s architectures ARM CCN-504

Page 10: ARM SoC exploration

Thank You! Questions?

10