1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six...

12
1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six Pipelining

Transcript of 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six...

Page 1: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six Pipelining.

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1

Chapter SixPipelining

Page 2: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six Pipelining.

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-2

Pipelining: analogia com linha de produção

• Tempo de fabricação de um carro: C+M+C+P+A• Taxa de produção (carros/h): medido na saída

– throughput: depende do número de estágios e do balanceamento

• Objetivo do pipeline:– distribuir e balancear o tempo em cada estágio– otimizar o uso de HW dos estágios (taxa de ocupação)– resumo: aumentar a velocidade e diminuir o hardware

carro1 Chassi Mec Carroc. Pint. Acab.

carro2 Chassi Mec Carroc. Pint. Acab.

carro3 Chassi Mec Carroc. Pint. Acab.

tempo

Page 3: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six Pipelining.

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-3

Pipelining• Improve perfomance by increasing instruction throughput

• tempo de execução de uma instrução: 8 ns e 9 ns (com ou sem pipe)• throughput: 1 instr / 8ns (sem pipeline) ou 1 instr / 2 ns (com)• # de estágios = 5; ganho de 4:1; para obter 5:1 ??

Instructionfetch Reg ALU Data

access Reg

8 ns Instructionfetch Reg ALU Data

access Reg

8 ns Instructionfetch

8 ns

Time

lw $1, 100($0)

lw $2, 200($0)

lw $3, 300($0)

2 4 6 8 10 12 14 16 18

2 4 6 8 10 12 14

. . .

Programexecutionorder(in instructions)

Instructionfetch Reg ALU Data

access Reg

Time

lw $1, 100($0)

lw $2, 200($0)

lw $3, 300($0)

2 ns Instructionfetch Reg ALU Data

access Reg

2 ns Instructionfetch Reg ALU Data

access Reg

2 ns 2 ns 2 ns 2 ns 2 ns

Programexecutionorder(in instructions)

Page 4: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six Pipelining.

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-4

Pipelining

• What makes it easy– all instructions are the same length– just a few instruction formats– memory operands appear only in loads and stores

• What makes it hard?– structural hazards: suppose we had only one memory– control hazards: need to worry about branch instructions– data hazards: an instruction depends on a previous instruction

• We’ll build a simple pipeline and look at these issues

• We’ll talk about modern processors and what really makes it hard:– exception handling– trying to improve performance with out-of-order execution, etc.

Page 5: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six Pipelining.

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-5

Basic Idea

• What do we need to add to actually split the datapath into stages?

Instructionmemory

Address

4

32

0

Add Addresult

Shiftleft 2

Instruction

Mux

0

1

Add

PC

0Writedata

Mux

1Registers

Readdata 1

Readdata 2

Readregister 1

Readregister 2

16Sign

extend

WriteregisterWritedata

ReaddataAddress

Datamemory

1

ALUresultM

ux

ALUZero

IF: Instruction fetch ID: Instruction decode /register file read

EX: Execute/address calculation

MEM: Mem ory access WB: Write back

Page 6: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six Pipelining.

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-6

IM Reg DM RegALU

IM Reg DM RegALU

CC 1 CC 2 CC 3 CC 4 CC 5 CC 6 CC 7

Time (in clock cycles)

lw $2, 200($0)

lw $3, 300($0)

Programexecutionorder(in instructions)

lw $1, 100($0) IM Reg DM RegALU

Uma representação para o pipeline

• Hachurado representa “atividade”

• Representação alternativa: imaginando que cada instrução tenha a sua própria via de dados

• Outra representação: foto no tempo

Page 7: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six Pipelining.

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-7

Pipelined Datapath

• O que é produzido em cada estágio é armazenado em um registrador de pipeline para uso pelo próximo estágio

• Problemas com o endereço do registrador de escrita?? Caminhando para esquerda?

Instructionmemory

Address

4

32

0

Add Addresult

Shiftleft 2

Instru

ction

IF/ID EX/MEM MEM/WB

Mux

0

1

Add

PC

0Writedata

Mux

1Registers

Readdata 1

Readdata 2

Readregister 1Readregister 2

16 Signextend

WriteregisterWritedata

Readdata

1

ALUresultM

ux

ALUZero

ID/EX

Datamemory

Address

Page 8: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six Pipelining.

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-8

Corrected Datapath

Instructionmemory

Address

4

32

0

Add Addresult

Shiftleft 2

Instru

ction

IF/ID EX/MEM MEM/WB

Mux

0

1

Add

PC

0

Address

Writedata

Mux

1Registers

Readdata 1

Readdata 2

Readregister 1Readregister 2

16 Signextend

WriteregisterWritedata

Readdata

Datamemory

1

ALUresultM

ux

ALUZero

ID/EX

Page 9: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six Pipelining.

Exemplo com sub e lw (1)

Instructionmemory

Address

4

32

0

Add Addresult

Shiftleft 2

Inst

ruct

ion

IF/ID EX/MEM MEM/WB

Mux

0

1

Add

PC

0Writedata

Mux

1Registers

Readdata 1

Readdata 2

Readregister 1

Readregister 2

16Sign

extend

Writeregister

Writedata

Readdata

1

ALUresult

Mux

ALUZero

ID/EX

Instruction decode

lw $10, 20($1)

Instruction fetch

sub $11, $2, $3

Instructionmemory

Address

4

32

0

Add Addresult

Shiftleft 2

Inst

ruct

ion

IF/ID EX/MEM MEM/WB

Mux

0

1

Add

PC

0Writedata

Mux

1Registers

Readdata 1

Readdata 2

Readregister 1

Readregister 2

16Sign

extend

Writeregister

Writedata

Readdata

1

ALUresult

Mux

ALUZero

ID/EX

Instruction fetch

lw $10, 20($1)

Address

Datamemory

Address

Datamemory

Clock 1

Clock 2

Page 10: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six Pipelining.

Instructionmemory

Address

4

0

Add Addresult

Shiftleft 2

Inst

ruct

ion

IF/ID EX/MEM MEM/WB

Mux

0

1

Add

PC

0Writedata

Mux

1Registers

Readdata 1

Readdata 2

Readregister 1

Readregister 2

3216Sign

extend

Writeregister

Writedata

Memory

lw $10, 20($1)

Readdata

1

ALUresult

Mux

ALUZero

ID/EX

Execution

sub $11, $2, $3

Instructionmemory

Address

4

0

Add Addresult

Shiftleft 2

Inst

ruct

ion

IF/ID EX/MEM MEM/WB

Mux

0

1

Add

PC

0Writedata

Mux

1Registers

Readdata 1

Readdata 2

Readregister 1

Readregister 2

Writeregister

Writedata

Readdata

1

ALUresult

Mux

ALUZero

ID/EX

Execution

lw $10, 20($1)

Instruction decode

sub $11, $2, $3

3216Sign

extend

Address

Datamemory

Datamemory

Address

Clock 3

Clock 4

Exemplo com sub e lw (2)

Page 11: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six Pipelining.

Exemplo com sub e lw (3) Instruction

memory

Address

4

32

0

Add Addresult

1

ALUresult

Zero

Shiftleft 2

Inst

ruct

ion

IF/ID EX/MEMID/EX MEM/WB

Write backMux

0

1

Add

PC

0Writedata

Mux

1Registers

Readdata 1

Readdata 2

Readregister 1

Readregister 2

16Sign

extend

Mux

ALUReaddata

Writeregister

Writedata

lw $10, 20($1)

Instructionmemory

Address

4

32

0

Add Addresult

1

ALUresult

Zero

Shiftleft 2

Inst

ruct

ion

IF/ID EX/MEMID/EX MEM/WB

Write backMux

0

1

Add

PC

0Writedata

Mux

1Registers

Readdata 1

Readdata 2

Readregister 1

Readregister 2

16Sign

extend

Mux

ALUReaddata

Writeregister

Writedata

sub $11, $2, $3

Memory

sub $11, $2, $3

Address

Datamemory

Address

Datamemory

Clock 6

Clock 5

Page 12: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-1 Chapter Six Pipelining.

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch6-12

Graphically Representing Pipelines

• Can help with answering questions like:– how many cycles does it take to execute this code?– what is the ALU doing during cycle 4?– use this representation to help understand datapaths

IM Reg DM Reg

IM Reg DM Reg

CC 1 CC 2 CC 3 CC 4 CC 5 CC 6

Time (in clock cycles)

lw $10, 20($1)

Programexecutionorder(in instructions)

sub $11, $2, $3

ALU

ALU