Network On Chip Cache Coherency

Network On Chip Cache Coherency Final report, part B Students: Zemer Tzach Kalifon Ethan Instructor: Walter Isaschar Winter 2009


Page 1: Network On Chip  Cache Coherency

Network On Chip Cache Coherency

Final report, part B

Students: Zemer Tzach Kalifon Ethan

Instructor: Walter Isaschar

Winter 2009

Page 2: Network On Chip  Cache Coherency

Agenda

General concepts.

Description of the coherency protocol.

Architecture design.

Components implementation.

Simulations.

Functionality demonstration.

Page 3: Network On Chip  Cache Coherency

General Concepts

Page 4: Network On Chip  Cache Coherency

General Background

Modern CPUs are based on CMP (Chip Multi-Processor).

Improved performance is achieved by “Distribution and Parallelism”.

Cores interact through a NoC (Network on Chip).

Page 5: Network On Chip  Cache Coherency

NoC General Diagram


Page 6: Network On Chip  Cache Coherency

NoC Characteristics

Wormhole packet routing.

Packets follow an X-Y path.

Units can communicate simultaneously.

Reduced power consumption.

Scalability.
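The X-Y path can be illustrated with a minimal sketch, assuming the usual dimension-order reading of X-Y routing (travel along X until the destination column is reached, then along Y). The `(x, y)` coordinate layout and the function name `xy_route` are illustrative assumptions, not taken from the project.

```python
from typing import List, Tuple

Coord = Tuple[int, int]  # (x, y) position of a router in the grid (assumed layout)


def xy_route(src: Coord, dst: Coord) -> List[Coord]:
    """Return the routers visited from src to dst, moving along X first, then Y."""
    x, y = src
    path = [(x, y)]
    # Phase 1: move along the X dimension until the destination column is reached.
    step_x = 1 if dst[0] > x else -1
    while x != dst[0]:
        x += step_x
        path.append((x, y))
    # Phase 2: move along the Y dimension until the destination row is reached.
    step_y = 1 if dst[1] > y else -1
    while y != dst[1]:
        y += step_y
        path.append((x, y))
    return path


if __name__ == "__main__":
    # Route across a 2x2 grid from node (0, 0) to node (1, 1).
    print(xy_route((0, 0), (1, 1)))
```

Because every packet resolves X before Y, the route between two nodes is fixed, which keeps the router logic simple.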

Page 7: Network On Chip  Cache Coherency

Cache Coherency

Cache: on-chip fast temporary storage.

Cache Coherency: CMP cores use only up-to-date data.

Traditionally, cache coherency is achieved by a central memory control unit.

Page 8: Network On Chip  Cache Coherency

Traditional Cache Coherency

[Diagram labels: Line 1000 = X, Line 1000 = X, Line 1000 = Y]

Page 9: Network On Chip  Cache Coherency

Problem Description

Prior cache coherency protocols do not apply: a NoC has no central unit.

Adding such a unit would damage both the NoC’s scalability and its parallelism.

Page 10: Network On Chip  Cache Coherency

Solution Requirements

High performance: avoid “Hot Spots” and “Bottlenecks”.

Minimize resources.

Must not affect the main NoC characteristics (e.g. scalability).

Page 11: Network On Chip  Cache Coherency

Solution Basics

Memory control distribution according to memory spaces.

Placement of control units as part of the NoC.
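A minimal sketch of distributing memory control by memory space, assuming each controller owns one contiguous, equally sized address range. The memory and line sizes are the ones quoted on the CMP Characteristics slide; the controller count and the names `home_controller` and `line_index` are hypothetical.

```python
MEMORY_BYTES = 16     # memory size quoted on the CMP Characteristics slide
LINE_BYTES = 2        # cache line size quoted on the same slide
NUM_CONTROLLERS = 2   # hypothetical: the slides do not state the controller count


def line_index(address: int) -> int:
    """Return the cache-line number that contains the given byte address."""
    return address // LINE_BYTES


def home_controller(address: int) -> int:
    """Map a byte address to the controller that owns its memory space.

    Assumes contiguous, equally sized memory spaces, one per controller.
    """
    if not 0 <= address < MEMORY_BYTES:
        raise ValueError("address outside memory")
    space_bytes = MEMORY_BYTES // NUM_CONTROLLERS
    return address // space_bytes


if __name__ == "__main__":
    for addr in (0, 7, 8, 15):
        print(addr, line_index(addr), home_controller(addr))
```

With this split, requests for different memory spaces go to different controllers, so no single node handles all coherency traffic.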


Page 12: Network On Chip  Cache Coherency

Solution Diagram


Page 13: Network On Chip  Cache Coherency

Solution General Example

Read miss on line 1000.

The CPU refers to the appropriate controller.

The controller orders the transfer of data.

The other CPU sends the cache line.

[Diagram labels: Line 1000 = ?, Line 1000 = X, Line 1000 = X]

Page 14: Network On Chip  Cache Coherency

Project Goal

Design and implement a Cache Coherency protocol for a CMP-based NoC.

Implement the NoC (part one).

Implement Cache Coherency support for the NoC (part two).

Page 15: Network On Chip  Cache Coherency

Coherency Protocol

Page 16: Network On Chip  Cache Coherency

General Description

Three types of transactions: Read, Read for Ownership and Invalidation.

A cache line’s status can be I/S/E (Invalid/Shared/Exclusive respectively).

Each cache control unit keeps a journal which determines the line’s status.

Requests are first addressed to the appropriate cache control unit.
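The transaction types and line states listed above can be written down as a small sketch. The `Journal` class is a hypothetical stand-in for the per-controller journal; treating an untracked line as Invalid is an assumption, not something the slides state.

```python
from enum import Enum
from typing import Dict


class Transaction(Enum):
    READ = "Read"
    READ_FOR_OWNERSHIP = "Read for Ownership"
    INVALIDATION = "Invalidation"


class LineStatus(Enum):
    INVALID = "I"
    SHARED = "S"
    EXCLUSIVE = "E"


class Journal:
    """Hypothetical journal kept by one cache control unit: line number -> status."""

    def __init__(self) -> None:
        self._status: Dict[int, LineStatus] = {}

    def status(self, line: int) -> LineStatus:
        # Assumption: a line that was never recorded is reported as Invalid.
        return self._status.get(line, LineStatus.INVALID)

    def record(self, line: int, status: LineStatus) -> None:
        self._status[line] = status


if __name__ == "__main__":
    journal = Journal()
    journal.record(1000, LineStatus.SHARED)
    print(journal.status(1000).value, journal.status(1002).value)
```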

Page 17: Network On Chip  Cache Coherency

Protocol’s Terminology

Requester.

Home Node.

Closest Sharer.

Owner.

Page 18: Network On Chip  Cache Coherency

Read Miss: Line is Shared

(1) Read Request

(2) Forward Request

(3) Data

(4) ACK
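The four numbered messages can be sketched as an ordered list of (sender, receiver) pairs. The message names and their order come from the slide; the endpoints are an interpretation based on the protocol terminology (Requester, Home Node, Closest Sharer), and the direction of the final ACK in particular is an assumption.

```python
from typing import List, NamedTuple


class Message(NamedTuple):
    step: int
    kind: str
    src: str
    dst: str


def read_miss_shared(requester: str, home: str, closest_sharer: str) -> List[Message]:
    """Message sequence for a read miss on a Shared line (endpoints partly assumed)."""
    return [
        Message(1, "Read Request", requester, home),
        Message(2, "Forward Request", home, closest_sharer),
        Message(3, "Data", closest_sharer, requester),
        # Assumption: the requester confirms completion to the home node.
        Message(4, "ACK", requester, home),
    ]


if __name__ == "__main__":
    # Node names follow the simulation slides; "CCC" is a placeholder home node name.
    for msg in read_miss_shared("CPU1x1", "CCC", "CPU0x0"):
        print(msg.step, msg.kind, msg.src, "->", msg.dst)
```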

Page 19: Network On Chip  Cache Coherency

Write Miss: Line is Shared

(1) Read for Ownership

(2) Forward and Invalidation Request

(3) Data

(4) ACK

(5) Invalidation

(6) Invalidation ACK

(7) Grant Ownership
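Steps (5) to (7) suggest that ownership is granted only after the invalidations have been acknowledged. A minimal sketch of that bookkeeping follows, assuming the home node tracks which sharers still owe an Invalidation ACK. The class name `PendingOwnership` and its methods are hypothetical and not part of the project's design.

```python
from typing import Iterable, Set


class PendingOwnership:
    """Tracks one Read for Ownership until every other sharer has acknowledged."""

    def __init__(self, requester: str, sharers: Iterable[str]) -> None:
        self.requester = requester
        # The requester never needs to invalidate its own copy.
        self.waiting: Set[str] = set(sharers) - {requester}

    @property
    def ready(self) -> bool:
        """True once no Invalidation ACK is outstanding."""
        return not self.waiting

    def invalidation_ack(self, cpu: str) -> bool:
        """Record an Invalidation ACK; return True when ownership can be granted."""
        self.waiting.discard(cpu)
        return self.ready


if __name__ == "__main__":
    pending = PendingOwnership("CPU1x1", ["CPU0x0", "CPU0x1"])
    print(pending.invalidation_ack("CPU0x0"))
    print(pending.invalidation_ack("CPU0x1"))
```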

Page 20: Network On Chip  Cache Coherency

Design difficulties (1st example)

(1) Read for Ownership

(2) Invalidation

(3) Invalidation ACK

(4) Forward and Invalidation Request

(5) Data

Page 21: Network On Chip  Cache Coherency

Design difficulties (2nd example)

(1) Read for Ownership

(2) Forward and Invalidation Request

(2) Invalidation

(3) Data

(4) ACK

Page 22: Network On Chip  Cache Coherency

Protocol’s Features

Parallel handling of Read requests.

Data is forwarded by the Closest Sharer.

Transparency: any CPU which uses M/E/S/I is supported.

The protocol supports strongly consistent processors.

Page 23: Network On Chip  Cache Coherency

Architecture

Page 24: Network On Chip  Cache Coherency

CMP Diagram


Page 25: Network On Chip  Cache Coherency

CPU Node Structure


Page 26: Network On Chip  Cache Coherency

NoC Interface

Functions as a gateway to the NoC.

Packs/unpacks flits into/from the NoC’s packets.

Transmits and receives data simultaneously.
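The relationship between packets and flits can be sketched as splitting a packet's bytes into fixed-width pieces and joining them again. The flit width `FLIT_BYTES` and both function names are hypothetical, since the slides do not give the actual flit format.

```python
from typing import List

FLIT_BYTES = 2  # hypothetical flit width; not stated in the slides


def to_flits(packet: bytes, flit_bytes: int = FLIT_BYTES) -> List[bytes]:
    """Split a packet into consecutive flits; the last flit may be shorter."""
    return [packet[i:i + flit_bytes] for i in range(0, len(packet), flit_bytes)]


def from_flits(flits: List[bytes]) -> bytes:
    """Reassemble a packet from its flits, in arrival order."""
    return b"".join(flits)


if __name__ == "__main__":
    packet = bytes([0x01, 0x02, 0x03])
    flits = to_flits(packet)
    print(flits, from_flits(flits) == packet)
```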

Page 27: Network On Chip  Cache Coherency

NoC Interface Structure


Page 28: Network On Chip  Cache Coherency

CPU Interface

Adapts between the NoC’s Cache Coherency Protocol and the CPU.

Translates the NoC’s packets into/from FSB transactions.

CPU transactions don’t prevent the CPU Interface from handling the Protocol’s packets.

Page 29: Network On Chip  Cache Coherency

CPU Interface Structure


Page 30: Network On Chip  Cache Coherency

Controller Node Structure


Page 31: Network On Chip  Cache Coherency

Cache Coherency Controller

Manages the Coherency Protocol.

Each CCC (Cache Coherency Controller) is responsible for a specific set of the memory lines.

The Directory Table (DT) holds the status of those lines as well as several protocol information bits.

Page 32: Network On Chip  Cache Coherency

CCC Structure


Page 33: Network On Chip  Cache Coherency

DT General Structure

The DT will contain the following data for each Line:


Page 34: Network On Chip  Cache Coherency

Architecture Features

Message length varies according to its purpose, which reduces NoC congestion.

Messages carry the transaction information (reduces HW requirements).

A transaction can be blocked only by a memory update (allows high parallelism).

Scalable.

Page 35: Network On Chip  Cache Coherency

CMP Implementation

Page 36: Network On Chip  Cache Coherency

CMP Characteristics

Size of a memory unit is 1 [Byte].

A cache line comprises 2 memory units (can be enlarged).

Size of memory is 16 [Byte].

The CPU’s actions are determined by the user.

Page 37: Network On Chip  Cache Coherency

CPU Implementation


Page 38: Network On Chip  Cache Coherency

CPU Node Implementation


Page 39: Network On Chip  Cache Coherency

CCC Node Implementation


Page 40: Network On Chip  Cache Coherency

CMP Implementation


Page 41: Network On Chip  Cache Coherency

Synthesis Parameters


Page 42: Network On Chip  Cache Coherency

System Performance

System’s clock frequency is 100 [MHz].

CPU’s hold-up (in cycles):

Event          Line’s Status   CPU Delay   Total
Invalidation   S               0           9
Invalidation   E               19          28 (M)
Read Miss      I               29          38 (M)
Read Miss      S               29          38
Read Miss      E               49          58 (M)

Page 43: Network On Chip  Cache Coherency

System Performance

Event        Line’s Status   CPU Delay   Total
Write Miss   I               29          38 (M)
Write Miss   S               29          38 (C)
Write Miss   E               29          38 (C)

M: memory penalty.

C: dependent on the number of CPUs.

Delay in all nodes is one or two cycles.

In larger systems the network factor becomes greater.
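As a worked example, the cycle counts in the two performance tables can be converted to time at the stated 100 MHz clock, where one cycle lasts 10 ns. The sketch below copies the rows from the tables; the names `ROWS` and `cycles_to_ns` are illustrative only.

```python
CLOCK_HZ = 100_000_000  # 100 MHz system clock, as stated on the slide

# (event, line status, CPU delay in cycles, total in cycles), copied from the tables
ROWS = [
    ("Invalidation", "S", 0, 9),
    ("Invalidation", "E", 19, 28),
    ("Read Miss", "I", 29, 38),
    ("Read Miss", "S", 29, 38),
    ("Read Miss", "E", 49, 58),
    ("Write Miss", "I", 29, 38),
    ("Write Miss", "S", 29, 38),
    ("Write Miss", "E", 29, 38),
]


def cycles_to_ns(cycles: int, clock_hz: int = CLOCK_HZ) -> int:
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles * 1_000_000_000 // clock_hz


if __name__ == "__main__":
    for event, status, cpu_delay, total in ROWS:
        print(f"{event:<13}{status:<3}{cycles_to_ns(cpu_delay):>5} ns{cycles_to_ns(total):>6} ns")
```

In every listed row the Total column exceeds the CPU Delay column by 9 cycles, which is 90 ns at this clock.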

Page 44: Network On Chip  Cache Coherency

CMP Simulations

Page 45: Network On Chip  Cache Coherency

Read Miss: Line is Shared (1)

CPU1x1 reads a cache line. The appropriate line is stored in CPU0x0.

[Simulation figure, markers 1 and 2]

Page 46: Network On Chip  Cache Coherency

Read Miss: Line is Shared (2)

[Simulation figure, markers 1 to 4]

Page 47: Network On Chip  Cache Coherency

Read Miss: Line is Shared (3)

[Simulation figure, markers 1, 2, 5 and 6]

Page 48: Network On Chip  Cache Coherency

Read Miss: Line is Exclusive (1)

CPU1x1 reads for ownership. The appropriate line is stored in CPU0x0.

[Simulation figure, markers 1 and 2]

Page 49: Network On Chip  Cache Coherency

Read Miss: Line is Exclusive (2)

[Simulation figure, markers 1 to 4]

Page 50: Network On Chip  Cache Coherency

Read Miss: Line is Exclusive (3)

[Simulation figure, markers 1, 2 and 5]

Page 51: Network On Chip  Cache Coherency

Read Miss: Line is Exclusive (4)

[Simulation figure, markers 1, 2, 6 and 7]

Page 52: Network On Chip  Cache Coherency

Demonstration

Page 53: Network On Chip  Cache Coherency

Demonstration Diagram


Page 54: Network On Chip  Cache Coherency

Tasks – Part A

Familiarize with design tools.

Familiarize with Virtex-II Pro FPGA (application & components).

Design & implement the NoC’s router.

Assemble the NoC (2x2 grid) using our router implementation.

Page 55: Network On Chip  Cache Coherency

Tasks – Part B

Design a Cache Coherency protocol for CMP based on faculty research.

Assemble a CMP based on our NoC.

Implement the protocol as part of the assembled CMP.

Page 56: Network On Chip  Cache Coherency

Future Work

Memory should be distributed.

Improve NoC Interface latency.

Messages carry all the transaction’s information.

Strongly consistent processors.

Page 57: Network On Chip  Cache Coherency

Conclusions (1)

All architectural goals were achieved.

Minimal HW utilization makes for a practical solution.

The most efficient possible by protocol definition.

Page 58: Network On Chip  Cache Coherency

Conclusions (2)


The generic design makes a great basis for further studies and research.

In larger systems, the project’s advantages would be even more pronounced.