REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Hiroshi Takamura, Koji Inoue, Vasily G. Moshnyaga
Dept. of Electronics Engineering and Computer Science, Fukuoka University, Japan

Page 1: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Hiroshi Takamura

Koji Inoue

Vasily G. Moshnyaga

Dept. of Electronics Engineering and Computer Science

Fukuoka University, Japan

Page 2: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Overview of the talk

Motivation of this work

The Data-Reuse approach

Experimental Results

Conclusion

Page 3: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Motivation of this work

Extending battery lifetime. Lowering cost.

Reducing the energy consumption of microprocessors is necessary.

Page 4: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Power distribution in Motorola’s M-core (source: D. Gonzales, IEEE Micro, 19(4), 1999)

Clock: 36%
Data path: 36%
Controller: 28%
Total: 100%

Register file takes 16% of the total power and 42% of the data path power!

Page 5: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Register File Energy Dissipation

Energy = ( Nread + Nwrite ) * Eacc

Nread: total number of RF reads
Nwrite: total number of RF writes
Eacc: average energy per RF access

Our goal: to lower N (Nread + Nwrite), depending on operand variation, through architectural optimizations.

Assumption: a read and a write consume equal energy.
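The energy model on this slide can be written out as a small Python function. This is only an illustrative sketch: the name `rf_energy` and the unit value chosen for `E_ACC` are assumptions, not taken from the slides. The counts 11 and 6 are the conventional-RF numbers used in the worked example later in the talk.

```python
def rf_energy(n_read: int, n_write: int, e_acc: float) -> float:
    """Register file energy = (Nread + Nwrite) * Eacc.

    Follows the slide's assumption that a read and a write cost the
    same average energy e_acc.
    """
    return (n_read + n_write) * e_acc


# Hypothetical unit energy per access; only ratios matter in this sketch.
E_ACC = 1.0
baseline = rf_energy(11, 6, E_ACC)
print(baseline)  # 17.0
```

Because Eacc is a common factor, any reduction in Nread + Nwrite translates into the same relative reduction in energy under this model.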

Page 6: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Problem of conventional RF operation

add $t0, $s1, $t1 (i)

mul $t3, $s1, $t1 (ii)

[Figure: blocks labeled Register file, Rs, Rt, and ALU. Legend: first source operand, second source operand, destination operand.]

The source values are not updated between (i) and (ii), yet the two instructions make 4 read accesses.

Therefore, there is unnecessary RF reading.

Page 7: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Problem of conventional RF operation

[Figure: two pipeline diagrams with blocks labeled Register File, ALU, Data memory, and Forwarding unit, and pipeline registers ID/EX, EX/MEM, and MEM/WB.]

Almost all results are provided to the following instructions via the forwarding units, so they are consumed before the RF write.

So there is unnecessary RF writing.

Page 8: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Register file access reduction approach (reuse of the same source operand value)

R-mode

add $t0, $s1, $t1 (i)

mul $t3, $s1, $t1 (ii)

[Figure: blocks labeled Register file, Rs, Rt, ALU, and control. Legend: first source operand, second source operand, destination operand.]
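The slides do not spell out the R-mode control logic, so the following Python sketch is only one plausible reading: it assumes that the Rs and Rt blocks each remember the register name they held for the immediately preceding instruction, and that a read is skipped when the same name is requested on the same port. The function name `reads_r_mode` is hypothetical, and the model tracks register names only (it does not check whether the register was overwritten in between).

```python
def reads_r_mode(program):
    """Count RF reads when an operand is reused from the same port.

    program: list of (rs, rt) source-register names, one tuple per
    instruction. Name-based model; an assumption, not the authors' RTL.
    """
    held = (None, None)      # names held in the Rs and Rt blocks
    reads = 0
    for sources in program:
        for port, name in enumerate(sources):
            if held[port] != name:   # different operand: read the RF
                reads += 1
        held = sources
    return reads


# add $t0, $s1, $t1 ; mul $t3, $s1, $t1
print(reads_r_mode([("$s1", "$t1"), ("$s1", "$t1")]))  # 2
```

For the two instructions on this slide the sketch counts 2 reads instead of the 4 made by a conventional register file.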

Page 9: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Register file access reduction approach (operand swapping)

S-mode

add $t0, $s1, $t1 (i)

mul $t3, $t1, $s1 (ii)

[Figure: blocks labeled Register file, Rs, Rt, two MUXes, ALU, and control. Legend: first source operand, second source operand, destination operand.]
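Operand swapping can be sketched in the same name-based style. The assumption here (not stated in the slides) is that the MUXes let either held operand feed either ALU input, so a read is skipped whenever the requested register name is held in Rs or Rt, regardless of position. `reads_s_mode` is a hypothetical name.

```python
def reads_s_mode(program):
    """Count RF reads when held operands may be swapped between ports.

    program: list of tuples of source-register names per instruction.
    Name-based model; an assumption, not the authors' implementation.
    """
    held = set()             # names currently held in Rs and Rt
    reads = 0
    for sources in program:
        reads += sum(1 for name in sources if name not in held)
        held = set(sources)
    return reads


# add $t0, $s1, $t1 ; mul $t3, $t1, $s1
print(reads_s_mode([("$s1", "$t1"), ("$t1", "$s1")]))  # 2
```

Position-wise reuse alone would need 4 reads for this pair, because the operands of (ii) appear in the opposite order.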

Page 10: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

RF access reduction approach (Delayed Operand Reuse)

J-mode

sub $t3, $s1, $t1 (i)

lw $t2, 20($s2) (ii)

sub $t4, $t2, $t1 (iii)

[Figure: blocks labeled Register file, Rs, Rt, ALU, and control. Legend: first source operand, second source operand, destination operand.]
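Delayed operand reuse can be sketched by letting a port keep its content while an intervening instruction does not use it. This is an assumed reading of J-mode: the name `reads_j_mode` is hypothetical, the model tracks register names only, and any operand not found in its port is counted as an RF read (including `$t2`, which a real pipeline would receive from the load by forwarding).

```python
def reads_j_mode(program):
    """Count RF reads when an unused port keeps its previous operand.

    program: list of tuples of source-register names per instruction
    (one name for lw, two for sub). Name-based model; an assumption.
    """
    held = [None, None]      # names kept in the Rs and Rt blocks
    reads = 0
    for sources in program:
        for port, name in enumerate(sources):
            if held[port] != name:
                reads += 1
                held[port] = name
        # A port the instruction does not use keeps its old content.
    return reads


prog = [("$s1", "$t1"),   # sub $t3, $s1, $t1
        ("$s2",),         # lw  $t2, 20($s2)
        ("$t2", "$t1")]   # sub $t4, $t2, $t1
print(reads_j_mode(prog))  # 4
```

The conventional count for these three instructions is 5; the saved access is the second read of `$t1`, which survives in the Rt block across the load.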

Page 11: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Reduction of RF writing (application of write-operation omission)

add $t1, $t1, $s1 (i)

sub $t1, $s1, $t1 (ii)

[Figure: pipeline timing of (i) and (ii) over clock cycles c.c.1 to c.c.6, with stages labeled IM, Reg, DM, Reg; the register write of (i) is marked "Useless writing access". Legend: first source operand, second source operand, destination operand.]
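A minimal Python sketch of write omission follows. It assumes that a result's RF write can be skipped when a later instruction within a small window overwrites the same destination, and that any instruction needing the value in between receives it by forwarding. The function name `count_writes` and the `window` parameter are assumptions made for illustration; the slides do not define them.

```python
def count_writes(dests, window=1):
    """Count RF writes, skipping results that are overwritten soon.

    dests: destination register name of each instruction, in order.
    window: how many following instructions are checked (assumption).
    """
    writes = 0
    for i, dest in enumerate(dests):
        overwritten_soon = dest in dests[i + 1:i + 1 + window]
        if not overwritten_soon:
            writes += 1
    return writes


# add $t1, $t1, $s1 ; sub $t1, $s1, $t1
print(count_writes(["$t1", "$t1"]))  # 1
```

Here the write of (i) is dropped because (ii) overwrites `$t1` one instruction later, leaving a single RF write.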

Page 12: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

An example: number of accesses in a conventional register file

add $t0, $s1, $t1 (i)

mul $t3, $s1, $t1 (ii)

add $t1, $t1, $s1 (iii)

sub $t1, $s1, $t1 (iv)

lw $t2, 20($s1) (v)

sub $t4, $s1, $t1 (vi)

Legend: Dest., Source1, Source2

Number of accesses

        Nread  Nwrite
CONV    11     6
R
S
RSJ
W+RSJ

Page 13: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Operand reuse between consecutive instructions

add $t0, $s1, $t1 (i)

mul $t3, $s1, $t1 (ii)

add $t1, $t1, $s1 (iii)

sub $t1, $s1, $t1 (iv)

lw $t2, 20($s1) (v)

sub $t4, $s1, $t1 (vi)

Legend: Dest., Source1, Source2

Number of accesses

        Nread  Nwrite
CONV    11     6
R       7      6
S
RSJ
W+RSJ

Page 14: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Operand swapping

add $t0, $s1, $t1 (i)

mul $t3, $s1, $t1 (ii)

add $t1, $t1, $s1 (iii)

sub $t1, $s1, $t1 (iv)

lw $t2, 20($s1) (v)

sub $t4, $s1, $t1 (vi)

Legend: Dest., Source1, Source2

Number of accesses

        Nread  Nwrite
CONV    11     6
R       7      6
S       3      6
RSJ
W+RSJ

Page 15: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Operand reuse between non-consecutive instructions

add $t0, $s1, $t1 (i)

mul $t3, $s1, $t1 (ii)

add $t1, $t1, $s1 (iii)

sub $t1, $s1, $t1 (iv)

lw $t2, 20($s1) (v)

sub $t4, $s1, $t1 (vi)

Legend: Dest., Source1, Source2

Number of accesses

        Nread  Nwrite
CONV    11     6
R       7      6
S       3      6
RSJ     2      6
W+RSJ

Page 16: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Write-operation omission

add $t0, $s1, $t1 (i)

mul $t3, $s1, $t1 (ii)

add $t1, $t1, $s1 (iii)

sub $t1, $s1, $t1 (iv)

lw $t2, 20($s1) (v)

sub $t4, $s1, $t1 (vi)

Legend: Dest., Source1, Source2

Number of accesses

        Nread  Nwrite
CONV    11     6
R       7      6
S       3      6
RSJ     2      6
W+RSJ   2      5

Page 17: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

RF accesses by the proposed technique

add $t0, $s1, $t1 (i)

mul $t3, $s1, $t1 (ii)

add $t1, $t1, $s1 (iii)

sub $t1, $s1, $t1 (iv)

lw $t2, 20($s1) (v)

sub $t4, $s1, $t1 (vi)

        Nread  Nwrite
CONV    11     6
R       7      6
S       3      6
RSJ     2      6
W+RSJ   2      5

Number of reads: reduced from 11 to 2
Number of writes: reduced from 6 to 5
Total number of accesses: reduced from 17 to 7
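The counts in the table can be reproduced with a short name-based simulation. This is a sketch under assumptions of my own, chosen because they match the slide's numbers; it is not the authors' hardware. The Rs and Rt blocks are modeled as holding register names; R reuses a name on the same port, S also accepts it on the other port, J lets an unused port keep its content, and W skips a write when the next instruction overwrites the same destination. The model does not check whether a held register was overwritten (it assumes the forwarded result refreshes the held value). All function and parameter names are hypothetical.

```python
PROGRAM = [  # (destination, sources) for instructions (i) to (vi)
    ("$t0", ("$s1", "$t1")),  # add
    ("$t3", ("$s1", "$t1")),  # mul
    ("$t1", ("$t1", "$s1")),  # add
    ("$t1", ("$s1", "$t1")),  # sub
    ("$t2", ("$s1",)),        # lw
    ("$t4", ("$s1", "$t1")),  # sub
]


def count_reads(program, reuse=False, swap=False, hold=False):
    held = [None, None]                      # names in the Rs/Rt blocks
    reads = 0
    for _dest, sources in program:
        nxt = list(held) if hold else [None, None]
        for port, name in enumerate(sources):
            hit = reuse and (held[port] == name or (swap and name in held))
            if not hit:
                reads += 1                   # operand fetched from the RF
            nxt[port] = name
        held = nxt
    return reads


def count_writes(program, omit=False):
    dests = [dest for dest, _ in program]
    # A write is skipped when the very next instruction overwrites it.
    skipped = sum(1 for i, d in enumerate(dests[:-1])
                  if omit and dests[i + 1] == d)
    return len(dests) - skipped


MODES = {
    "CONV":  (dict(), False),
    "R":     (dict(reuse=True), False),
    "S":     (dict(reuse=True, swap=True), False),
    "RSJ":   (dict(reuse=True, swap=True, hold=True), False),
    "W+RSJ": (dict(reuse=True, swap=True, hold=True), True),
}
table = {mode: (count_reads(PROGRAM, **kw), count_writes(PROGRAM, omit=w))
         for mode, (kw, w) in MODES.items()}
for mode, (n_read, n_write) in table.items():
    print(f"{mode:6s} {n_read:3d} {n_write:3d}")

conv, best = sum(table["CONV"]), sum(table["W+RSJ"])
print(f"total reduction: {100 * (conv - best) / conv:.1f}%")  # 58.8%
```

Under these assumptions the sketch yields 11/6, 7/6, 3/6, 2/6, and 2/5 for the five rows, and the drop from 17 to 7 total accesses corresponds to a reduction of about 58.8% for this six-instruction example.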

Page 18: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Experimental Evaluation

Flexible Architecture Simulation Tool

Cycle-accurate instruction simulation on a 5-stage RISC-type microprocessor (similar to MIPS)

Traces user-level instructions and records RF access information as well as the total number of operand reuses.

32-entry RF (1 write port, 2 read ports)

SPEC95 and MediaBench benchmarks: adpcm_c, adpcm_d, compress, go, mpeg_d, mpeg_e, pegwit_g, pegwit_enc, pegwit_dec

We described a simple RISC microprocessor in Verilog-HDL and synthesized it with Synopsys Design Compiler. A 0.35 μm process technology was assumed.

SUN UltraSparc-3 environment

Page 19: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Reduction rate (%) for the RF read

[Figure: chart of RF read reduction rate (%), axis from 0 to 70. Legend: R, S, J, RSJ.]

RF access reduction: 62.7% (maximum)!

Page 20: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Reduction rate (%) for the RF write

[Figure: chart of RF write reduction rate (%), axis from 0 to 70. Legend: 2inst, 1inst.]

RF access reduction: 60% (maximum)!

Page 21: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Reduction rate (%) for read&write

[Figure: chart of combined read and write reduction rate (%), axis from 0 to 70, for ade, add, com_n, com_t, com_b, go, mpd_m, mpd_t, mpd_tv, mpd_tm, mpe, pegc, pege, pegd. Legend: W+R, W+S, W+J, W+RSJ.]

RF access reduction: 61% (maximum)!

Page 22: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Area comparison

[Figure: chart with axis "The increase rate of area (%)", from 97% to 105%.]

Conventional type: 100.00%
Read Reuse: 101.70%
Read & Write Reuse: 103.23%

Hardware Overhead: +3.2% (maximum)!

Page 23: REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Conclusion

We proposed a technique to reduce the energy dissipation of the register file by operand reuse.

Energy savings vary with the application:
Read: 62% (max), 29% (aver.)
Write: 60% (2instr), 55% (1instr)
Total: 61% (max), 39% (aver.)

Hardware overhead: Read: 1.7%, Read&Write: 3.2%

Future Work

Verification at a cycle level
Evaluation based on detailed energy models
A detailed estimation of the control circuitry overhead