EE560 CMP Design Aspects Simplified for EE457

EE560_CMP_Design_Aspects_Simplified_for_EE457.pdf

Transcript of EE560 CMP Design Aspects Simplified for EE457

Page 1: EE560 CMP Design Aspects Simplified for EE457


Page 2: EE560 CMP Design Aspects Simplified for EE457

Very long instruction word

Page 3: EE560 CMP Design Aspects Simplified for EE457

Let's not cover the Inverted Page Table in EE457.

Page 4: EE560 CMP Design Aspects Simplified for EE457

Intel uses a 2-level page table for its 32-bit addresses.

From mindshare.com: https://www.mindshare.com/Learn/Intel_32!64-bit_x86_Architecture
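
As a rough illustration of the 2-level scheme, the sketch below splits a 32-bit virtual address the way classic (non-PAE) x86 paging with 4 KB pages does: 10 bits index the page directory, 10 bits index a page table, and 12 bits are the page offset. The function name is hypothetical and chosen only for this example.

```python
def split_va32(va):
    """Split a 32-bit virtual address into (directory index, table index, offset).

    Assumes classic non-PAE x86 paging with 4 KB pages: 10 + 10 + 12 bits.
    """
    assert 0 <= va < 2**32
    dir_index = (va >> 22) & 0x3FF    # top 10 bits select a page-directory entry
    table_index = (va >> 12) & 0x3FF  # next 10 bits select a page-table entry
    offset = va & 0xFFF               # low 12 bits are the byte offset in the page
    return dir_index, table_index, offset


print([hex(field) for field in split_va32(0x12345678)])
```

Each level is one memory access during a table walk, which is why the TLB matters so much for performance.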

Page 5: EE560 CMP Design Aspects Simplified for EE457

Intel uses a 4-level page table for its 64-bit addresses.

From mindshare.com: https://www.mindshare.com/Learn/Intel_32!64-bit_x86_Architecture
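
For the 4-level case, a similar sketch applies, assuming 4 KB pages and 48-bit virtual addresses (with 4-level paging only the low 48 bits of the 64-bit address are translated). Each level is indexed by 9 bits, followed by a 12-bit offset. The function name is again hypothetical.

```python
def split_va48(va):
    """Return the four 9-bit table indices (outermost level first) and the 12-bit offset.

    Assumes 4-level paging with 4 KB pages: 9 + 9 + 9 + 9 + 12 = 48 bits.
    """
    assert 0 <= va < 2**48
    indices = [(va >> shift) & 0x1FF for shift in (39, 30, 21, 12)]
    offset = va & 0xFFF
    return indices, offset


# Build an address whose four indices are 1, 2, 3, 4 and whose offset is 5.
va = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
print(split_va48(va))
```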

Page 6: EE560 CMP Design Aspects Simplified for EE457

In EE457, we covered the two designs below:

Page 7: EE560 CMP Design Aspects Simplified for EE457
Page 8: EE560 CMP Design Aspects Simplified for EE457
Page 9: EE560 CMP Design Aspects Simplified for EE457
Page 10: EE560 CMP Design Aspects Simplified for EE457
Page 11: EE560 CMP Design Aspects Simplified for EE457

Register Map Table or just Map Table

Page 12: EE560 CMP Design Aspects Simplified for EE457

Physical Register File and Free Register List

add $2, $18, $19

add $2, $2, $2
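
The renaming of the two instructions above can be sketched roughly as follows. This is a minimal model, not the course's RTL: the initial mapping ($i mapped to Pi), the free registers P32 to P47, and the function name `rename` are all assumptions made up for the example.

```python
from collections import deque

# Hypothetical initial state: architectural register $i maps to physical register Pi,
# and P32..P47 are on the free register list.
map_table = {arch: arch for arch in range(32)}
free_list = deque(range(32, 48))


def rename(dest, src1, src2):
    """Rename one 'add dest, src1, src2'; return (new dest, src1, src2) physical registers."""
    # Sources are looked up first, so they see the mapping from before this instruction.
    p_src1, p_src2 = map_table[src1], map_table[src2]
    p_dest = free_list.popleft()   # take a fresh physical register from the free list
    map_table[dest] = p_dest       # later readers of dest will use the new register
    return p_dest, p_src1, p_src2


print(rename(2, 18, 19))  # add $2, $18, $19
print(rename(2, 2, 2))    # add $2, $2, $2
```

Note that the second add reads the physical register allocated by the first add and writes a different one, so the two writes to $2 no longer conflict.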

Page 13: EE560 CMP Design Aspects Simplified for EE457

From Prof. Dubois' book

Page 14: EE560 CMP Design Aspects Simplified for EE457

We want to do B.P. and S.E.B.B. (Branch Prediction and Speculative Execution Beyond Branch) with the ability to flush wrong-path instructions.

So the FRAT needs to be restored to its state just before the mispredicted branch was dispatched. This is done through check-points (snapshots) of the FRAT taken when low-confidence branches are predicted.
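
The check-point idea can be pictured with a small sketch. The FRAT contents, the branch identifier "b1", and the helper names are hypothetical; real hardware would copy the table into dedicated storage rather than a dictionary.

```python
frat = {2: 42, 3: 7}   # hypothetical FRAT: architectural register -> physical register
checkpoints = {}        # branch id -> saved copy of the FRAT


def take_checkpoint(branch_id):
    checkpoints[branch_id] = dict(frat)   # copy, so later renames do not change it


def restore_checkpoint(branch_id):
    frat.clear()
    frat.update(checkpoints.pop(branch_id))


take_checkpoint("b1")      # a low-confidence branch is dispatched
frat[2] = 50               # wrong-path instructions rename $2 and $3
frat[3] = 51
restore_checkpoint("b1")   # the branch turns out to be mispredicted
print(frat)
```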

Page 15: EE560 CMP Design Aspects Simplified for EE457
Page 16: EE560 CMP Design Aspects Simplified for EE457
Page 17: EE560 CMP Design Aspects Simplified for EE457
Page 18: EE560 CMP Design Aspects Simplified for EE457

FRAT is like the RST of IoI-OoE-OoC

Page 19: EE560 CMP Design Aspects Simplified for EE457
Page 20: EE560 CMP Design Aspects Simplified for EE457
Page 21: EE560 CMP Design Aspects Simplified for EE457

1. What does VLIW stand for? Very Long Instruction Word

Compared to a Superscalar CPU (where instruction scheduling is done in hardware), a VLIW CPU depends more (more / less) on compiler technology for instruction scheduling.

2. IPT (Inverted Page Table) is more attractive in 64-bit (32-bit / 64-bit) processors.

In such processors, a TLB miss causes an exception so that the IPT look-up is performed by the O.S. True / False? True

What is included for the Final Exam from the EE557/EE560 Preview portion?

Page 22: EE560 CMP Design Aspects Simplified for EE457

3. List two main drawbacks with the ROB-based design covered in EE457 which makes it difficult to scale the design.

A. Need to carry DATA through Instruction Queues (upstream of the functional units) and through the ROB. This problem is more acute in 64-bit CPUs than in 32-bit CPUs.

B. Need to conduct TWO expensive associative prioritized searches in the ROB, one for $Rs and another for $Rt (the two source registers of the instruction being dispatched).

The above two disadvantages make it difficult to scale the processor (to higher data widths and deeper ROBs).

4. PRF stands for Physical Register File. FRL stands for Free Register List. RAT stands for Register Alias Table. FRAT stands for Frontend RAT and is updated by the dispatch unit, whereas RRAT stands for Retirement RAT and is updated by the instruction retirement unit.

Page 23: EE560 CMP Design Aspects Simplified for EE457

FRAT (FRAT / RRAT) avoids the expensive associative prioritized search (by holding the latest mapping between each architectural register and the locations in the physical register file).

5. In case of a branch misprediction, FRAT (FRAT / RRAT) needs to be reverted to its state before the dispatch of the mispredicted branch. We take snapshots of the FRAT (FRAT / RRAT) and store them. These are called check-points. Check-points are taken for low-confidence branches. If a low-confidence branch turns out to be mispredicted, you restore the FRAT by copying the check-point back to the FRAT (FRAT / RRAT). But if a high-confidence (low-confidence / high-confidence) branch (for which we did not create a check-point) is mispredicted, we jump to the nearest check-point and then walk through the ROB, rewinding or fast-forwarding the dispatch (dispatch / graduation) process to the mispredicted branch. This is similar to a DVD (DVD / VHS tape), where we jump to the nearest chapter point and then rewind or fast-forward to the scene of interest.
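
The contrast drawn in questions 3 and 4 between the prioritized ROB search and the FRAT can be sketched as follows. The ROB contents and the FRAT values are made up, and the loop is only a software stand-in for what hardware does with parallel comparators and a priority encoder.

```python
# Hypothetical ROB contents, oldest first: (architectural destination, ROB tag).
rob = [(2, 0), (5, 1), (2, 2), (7, 3)]


def latest_producer(rob, arch_reg):
    """Prioritized search: tag of the youngest ROB entry that writes arch_reg, or None."""
    for dest, tag in reversed(rob):   # the youngest matching entry has priority
        if dest == arch_reg:
            return tag
    return None


# With a FRAT, the latest location is a single indexed read (values made up).
frat = {2: 42, 5: 41, 7: 43}   # architectural register -> physical register

print(latest_producer(rob, 2), frat[2])
```

The search has to be repeated for both source registers of every dispatched instruction and grows with the ROB depth, whereas the FRAT read does not.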

Page 24: EE560 CMP Design Aspects Simplified for EE457

The following was not covered. Hence this is not included for the EE457 Final exam.

To facilitate this "walk" through the ROB, each entry in the ROB contains information about

-- if it is register writing or not
-- if it is register writing, what is the architectural destination register name (say $2)
-- what PRF register it was mapped to (say P42)
-- what PRF register that $2 was mapped to before this mapping (say P52)

So if we were walking back in the ROB, the FRAT entry for $2 will be changed from P42 to P52. If we were walking forward in the ROB, the FRAT entry for $2 will be changed from P52 to P42.
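
A minimal sketch of such a walk is shown below, using the $2, P42, and P52 example from the text. The tuple layout of a ROB entry, the use of None for an instruction that does not write a register, and the convention that position k means "the FRAT state just before entry k was dispatched" are assumptions made for this illustration.

```python
def walk(frat, rob, start, target):
    """Move a FRAT from ROB position `start` to position `target` and return it.

    Each ROB entry is (arch register, new physical reg, previous physical reg),
    or None if the instruction does not write a register.
    """
    if target >= start:
        for entry in rob[start:target]:             # walking forward: redo the renames
            if entry is not None:
                arch, new_preg, _old_preg = entry
                frat[arch] = new_preg
    else:
        for entry in reversed(rob[target:start]):   # walking back: undo, youngest first
            if entry is not None:
                arch, _new_preg, old_preg = entry
                frat[arch] = old_preg
    return frat


rob = [("$2", "P42", "P52"), None, ("$3", "P43", "P10")]
checkpoint = {"$2": "P52", "$3": "P10"}   # FRAT just before entry 0 was dispatched

print(walk(dict(checkpoint), rob, start=0, target=1))            # $2 goes P52 -> P42
print(walk({"$2": "P42", "$3": "P10"}, rob, start=1, target=0))  # $2 goes P42 -> P52
```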

More in the EE557 and in EE560.

Page 25: EE560 CMP Design Aspects Simplified for EE457

Unlike in the T1 processor of Oracle, in our EE560 CMP we have the ID stage before the TS stage. Our stage order is IF, ID, TS, EX, MEM, WB.

We initially made the mistake of placing the register file in the ID stage, which needed an elaborate forwarding mechanism to forward data to instructions waiting in the TS stage from seniors of the same thread in the stages ahead. In recent years, we moved the register file to the TS stage, making forwarding much simpler (basically taking advantage of the RF to hold data that arrived before the STALL is over). The IFRF (Internally Forwarding Register File) helps to collect the last data, which arrives in the clock in which the STALL ends.
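
One way to picture an internally forwarding register file is a register file whose read port returns the value being written in the same clock when the read and write addresses match. The class below is a behavioral sketch under that assumption; the class name, the single read port, and the `clock` method are hypothetical, and the actual EE560 RTL may differ.

```python
class IFRF:
    """Behavioral model: one write port and one read port, evaluated once per clock."""

    def __init__(self, num_regs=32):
        self.regs = [0] * num_regs

    def clock(self, read_addr, write_en=False, write_addr=0, write_data=0):
        # Internal forwarding: a same-clock write to the register being read
        # is returned directly instead of the stale stored value.
        if write_en and write_addr == read_addr:
            read_data = write_data
        else:
            read_data = self.regs[read_addr]
        if write_en:
            self.regs[write_addr] = write_data   # the write takes effect at the clock edge
        return read_data


rf = IFRF()
print(rf.clock(read_addr=5, write_en=True, write_addr=5, write_data=99))  # forwarded
print(rf.clock(read_addr=5))                                              # stored
```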

Page 26: EE560 CMP Design Aspects Simplified for EE457


Rollback occurs when an instruction is stuck in the EX stage and cannot proceed further because its senior LW instruction has incurred a cache miss and was sent to the RMSHR (Read Miss Status Handling Register). Rollback means converting that instruction and its juniors to bubbles and setting the PC to fetch that instruction in the IF stage again.
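
The rollback described above can be sketched for a single thread as follows. The dictionary model of the pipeline, the function name, and the PC values are assumptions for illustration; the real design is multithreaded and only the juniors of the same thread would be affected.

```python
def rollback(pipeline, stuck_stage="EX"):
    """Bubble the stuck instruction and its juniors; return the PC to refetch from.

    `pipeline` maps a stage name to the PC of the instruction in it (None = bubble).
    Simplification: every stage is assumed to hold the same thread.
    """
    order = ["IF", "ID", "TS", "EX", "MEM", "WB"]        # stage order from the text
    refetch_pc = pipeline[stuck_stage]
    for stage in order[: order.index(stuck_stage) + 1]:  # stuck stage and the stages behind it
        pipeline[stage] = None
    return refetch_pc


pipeline = {"IF": 0x1C, "ID": 0x18, "TS": 0x14, "EX": 0x10, "MEM": 0x0C, "WB": 0x08}
pc = rollback(pipeline)
print(hex(pc), pipeline)
```

The seniors in MEM and WB are left untouched, since only the stuck instruction and the instructions behind it are squashed.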

Page 27: EE560 CMP Design Aspects Simplified for EE457
Page 28: EE560 CMP Design Aspects Simplified for EE457

In Summer 2018 we did a project on PCIe, which enhanced the EE560 course further! Currently we are working on a GPGPU project and hope to be done before the start of Summer 2020!

By doing EE457, you have proved that you have the potential to become a good RTL designer. But you need to take EE560 also.

Students with an “A-” grade in any of the five courses EE457, EE557, EE577a, EE577b, and EE 533, in recent semesters including the Fall 2019 and Spring 2020 semesters, are eligible to join EE560 in Summer 2020.

http://www-classes.usc.edu/engr/ee-s/457/EE560_Summer2019_invitation_to_join.pdf

Invitation to join EE560 will be sent out in March 2020. It will be similar to the one below from March

Major Projects, Labs, and Topics to be covered in EE560 in Summer 2019 http://www-classes.usc.edu/engr/ee-s/457/EE560_Summer2019_Major_Projects_Labs_Topics.pdf