CSTalks-Polymorphic heterogeneous multicore systems-17Aug

blog.nus.edu.sg/cstalks

Transcript of CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Page 1: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

blog.nus.edu.sg/cstalks

Page 2: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Polymorphic Heterogeneous

Multi-Core Systems

Mihai Pricopi

CSTalks

August 17, 2011

Page 3: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Motivation

Single-core performance (complexity) increase

Page 4: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Motivation

Instruction-level parallelism (ILP)

1: e = a + b

2: f = c + d

3: g = e * f

4: h = f * 2

Dependence graph: 1 → 3, 2 → 3, 2 → 4 (instructions 1 and 2 are independent of each other, as are 3 and 4)
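The dependences among the four instructions above can be derived mechanically. The following is a minimal Python sketch and is not part of the original slides; the tuple encoding and the helper name `ilp_levels` are assumptions made for illustration. It places each instruction at the earliest step at which all of its operands are available.

```python
# Each instruction: (number, destination, source operands).
# Encoding and names are illustrative assumptions, not from the talk.
PROGRAM = [
    (1, "e", ("a", "b")),   # e = a + b
    (2, "f", ("c", "d")),   # f = c + d
    (3, "g", ("e", "f")),   # g = e * f
    (4, "h", ("f",)),       # h = f * 2
]


def ilp_levels(program):
    """Return {instruction number: earliest step}, assuming unit latency
    and unlimited functional units (true dependences only)."""
    producer = {}   # variable -> instruction that writes it
    level = {}
    for num, dest, sources in program:
        deps = [producer[s] for s in sources if s in producer]
        level[num] = 1 + max((level[d] for d in deps), default=0)
        producer[dest] = num
    return level


levels = ilp_levels(PROGRAM)
print(levels)  # {1: 1, 2: 1, 3: 2, 4: 2}
```

Under these assumptions the four instructions finish in two steps instead of four, which is the parallelism a wider core can exploit.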

Page 5: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Motivation

[Figure: images labeled 2006 and 2007]

Page 6: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Motivation

Thread-level parallelism (TLP)

Multi-threaded applications

Multi-programmed jobs

[Figure: a single process on cores P0 and P1; Process0 and Process1 on cores P0 and P1]

Page 7: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Motivation

nVidia Tesla many-core: up to 960 simple and identical cores.

Massively exploiting the TLP.

Sequential programs suffer from limited ILP exploitation.

A gap between TLP and ILP.

Solution: heterogeneous systems to bridge the gap between TLP and ILP.
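One common way to quantify why many simple cores do not help sequential code is Amdahl's law. The slide does not name it, so it is used here only as an illustration. In the sketch below, the function name and the 90% parallel fraction are assumed values, not figures from the talk.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup over one core when only `parallel_fraction` of the
    work can be spread across `cores` identical cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical workload: 90% of the work is parallel.
for cores in (1, 16, 960):
    print(cores, round(amdahl_speedup(0.9, cores), 2))
```

In this model the speedup stays below 1 / (1 - p) however many cores are added, so the serial part has to be accelerated by other means, such as extracting more ILP.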

Page 8: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Heterogeneous Chip Multi-processors

Multi-core systems that use cores with different performance parameters.

Existing results show that heterogeneous systems are more efficient than homogeneous ones in terms of performance, power, area and delay.

Heterogeneity can be achieved by using:

◦ Asymmetric chip multi-processors (ACMPs)

◦ Multiprocessor system-on-chip (MPSoC)

◦ Architectures that dynamically reconfigure their internal structure to adapt to different software demands (polymorphic)

Page 9: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Heterogeneous Chip Multi-processors

Asymmetric chip multi-processors (ACMPs)

[Figure: example ACMP layouts with cores P0-P3 and P0-P4]

Page 10: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Heterogeneous Chip Multi-processors

Multiprocessor system-on-chip (MPSoC)

[Figure: example MPSoC with an ARM core, a DSP, a memory controller, a video accelerator and bridges]

Page 11: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Program Phase Behavior - gzip

Page 12: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Program Phase Behavior - gcc

Page 13: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Polymorphic Heterogeneous Multi-Core Systems

• General-purpose applications

• Novel architecture that can be tailored according to the software requirements

• Base system: homogeneous processor

• Reconfigurable capabilities

• Internal structure adaptation

• Core-coalition

• Memory

[Figure: sixteen cores P0-P15 and two reconfigurable fabric (RF) blocks]

Page 14: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Polymorphic Heterogeneous Multi-Core Systems – Reconfigurable Fabric

• Reconfigurable hardware shared by different processors

• RF implements custom instructions

• Dynamic reconfiguration at runtime – speedup

1: e = a + b

2: f = c + d

3: g = e * f

4: h = f * 2

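[Figure: the dependence graph of instructions 1-4 (1 → 3, 2 → 3, 2 → 4) mapped as a custom instruction onto the RF shared by P0 and P1]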
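To illustrate the idea of a custom instruction, the sketch below fuses the four operations above into a single function call that stands in for one RF operation. It is a software analogy only: `custom_instruction` is a hypothetical name, and a real custom instruction would be implemented in the reconfigurable hardware.

```python
def baseline(a, b, c, d):
    """The four instructions issued one by one on a regular core."""
    e = a + b      # 1
    f = c + d      # 2
    g = e * f      # 3
    h = f * 2      # 4
    return g, h


def custom_instruction(a, b, c, d):
    """Hypothetical fused operation: four inputs, two outputs.
    Stands in for the whole dataflow graph mapped onto the RF."""
    f = c + d
    return (a + b) * f, f * 2


# Both versions must compute the same values of g and h.
assert baseline(1, 2, 3, 4) == custom_instruction(1, 2, 3, 4)
print(custom_instruction(1, 2, 3, 4))  # (21, 14)
```

The processor then issues one instruction instead of four, which is where the speedup would come from.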

Page 15: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Polymorphic Heterogeneous Multi-Core Systems – Reconfigurable Fabric

• Challenging problems:

• The amount of RF is limited.

• Deciding when to reconfigure the RF (scheduling).

• Choosing the set of custom instructions that gives the highest speedup.

• The overhead of dynamic reconfiguration.
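Choosing custom instructions under a limited amount of RF can be viewed as a 0/1 knapsack problem. The sketch below takes that view as an assumption; the slides do not say which selection method is used. The candidate names, areas and gains are hypothetical, and the model ignores reconfiguration overhead and scheduling.

```python
def select_custom_instructions(candidates, area_budget):
    """0/1 knapsack: pick the subset with the largest total gain whose
    total area fits in `area_budget`.

    candidates: list of (name, area, gain) with integer areas.
    Returns (best_gain, sorted list of chosen names).
    """
    # best[w] = (gain, chosen names) using at most w units of area
    best = [(0, [])] * (area_budget + 1)
    for name, area, gain in candidates:
        # Go downwards so each candidate is used at most once.
        for w in range(area_budget, area - 1, -1):
            g, chosen = best[w - area]
            if g + gain > best[w][0]:
                best[w] = (g + gain, chosen + [name])
    gain, chosen = best[area_budget]
    return gain, sorted(chosen)


# Hypothetical candidates: (name, RF area units, cycles saved).
CANDIDATES = [("ci_a", 4, 40), ("ci_b", 3, 50),
              ("ci_c", 2, 15), ("ci_d", 5, 60)]
print(select_custom_instructions(CANDIDATES, 8))  # (110, ['ci_b', 'ci_d'])
```

With a budget of 8 area units, the best choice here is `ci_b` plus `ci_d`, even though `ci_a` and `ci_b` together would leave area unused.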

Page 16: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Polymorphic Heterogeneous Multi-Core Systems – Core Structure Adaptation

• Similar performance can be achieved by using smaller processor internal units.

• Instruction fetch window size, issue width, instruction window size and frequency can be changed dynamically.

• Power and thermal concerns.
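One possible adaptation policy, given here only as a sketch, is to pick the lowest-power configuration whose performance stays within a tolerance of the best one for the current program phase. The `CoreConfig` fields, the `pick_config` helper and all numbers below are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreConfig:
    issue_width: int
    window_size: int
    rel_performance: float  # relative to the largest configuration
    rel_power: float        # relative to the largest configuration


def pick_config(configs, max_slowdown):
    """Return the lowest-power configuration whose performance is
    within `max_slowdown` (e.g. 0.05 = 5%) of the best one."""
    best_perf = max(c.rel_performance for c in configs)
    good_enough = [c for c in configs
                   if c.rel_performance >= best_perf * (1.0 - max_slowdown)]
    return min(good_enough, key=lambda c: c.rel_power)


# Hypothetical numbers for one program phase, for illustration only.
PHASE_CONFIGS = [
    CoreConfig(4, 128, 1.00, 1.00),
    CoreConfig(2, 64, 0.97, 0.60),
    CoreConfig(1, 32, 0.70, 0.35),
]
print(pick_config(PHASE_CONFIGS, 0.05))
```

With a 5% tolerance the 2-wide configuration is chosen: it gives up little performance in this made-up phase while drawing much less power.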

Page 17: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Polymorphic Heterogeneous Multi-Core Systems – Core-Coalition

• Coalition helps create “stronger” cores from the existing light cores:

• accelerates serial applications by extracting more ILP (if available).

• uses a limited amount of hardware shared between cores.

• coalitions of up to 4 cores can be formed.

[Figure: 2-core coalition: P0 (2-way) and P1 (2-way) ≡ P (4-way)]
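The effect of a wider coalition on serial code can be sketched with a simple cycle-by-cycle scheduler. The model below is an assumption for illustration: every instruction takes one cycle, a coalition of n 2-way cores behaves like a single core of issue width 2n, and coalition overhead is ignored. The dependence graph `DEPS` is hypothetical.

```python
def cycles_needed(deps, issue_width):
    """Greedy cycle-by-cycle schedule of a dependence graph.

    deps: {instruction: set of instructions it depends on}.
    Each instruction takes one cycle; at most `issue_width`
    ready instructions issue per cycle.
    """
    done = set()
    cycles = 0
    while len(done) < len(deps):
        ready = [i for i in sorted(deps)
                 if i not in done and deps[i] <= done]
        if not ready:
            raise ValueError("dependence graph has a cycle")
        done.update(ready[:issue_width])
        cycles += 1
    return cycles


# Hypothetical serial code: four independent loads feeding a reduction.
DEPS = {1: set(), 2: set(), 3: set(), 4: set(),
        5: {1, 2}, 6: {3, 4}, 7: {5, 6}}

for cores in (1, 2, 4):
    width = 2 * cores  # assumed: 2-way cores combine their issue widths
    print(cores, "core(s):", cycles_needed(DEPS, width), "cycles")
```

In this example, going from one core to a 2-core coalition saves a cycle, while a 4-core coalition gains nothing more because the dependence chain is three instructions long. This matches the "if available" caveat on the slide.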

Page 18: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Polymorphic Heterogeneous Multi-Core Systems – Core-Coalition Execution Model

Mihai Pricopi, National University of Singapore

[Figure: control-flow graph (CFG) of basic blocks B0-B4, and a timeline of the blocks passing through the SF, RF, EX and CM stages on Core 0 and Core 1]

SF: Sentinel instruction fetch and global renaming

RF: Regular instruction fetch, decode and renaming

EX: Regular instruction execution

CM: Regular instruction commit

Page 19: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Experimental Results - Speedup

Page 20: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Experimental Results – Load Balance

Page 21: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Proposed directions

Next steps:

◦ Implement Coalition on FPGA.

◦ Study further the overhead and power consumption introduced by the shared resources.

◦ Implement a dynamic scheduler for Coalition.

Page 22: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

?

Page 23: CSTalks-Polymorphic heterogeneous multicore systems-17Aug

Next Week’s Talk

A Unified Framework for Recommendations in the Social Network by Chen Wei

Join us next Wednesday!

Wednesday, 31 August, 2011