MICROPROCESSORS

Lecture 2: identify the core components of a CPU

Networks and Communication Department

By: Latifa ALrashed

Outline

¨  Identify the core components of a CPU
¤  EDB
¤  Registers
¤  Codebook
¤  Clock

¨  Describe the relationship between the CPU and RAM
¨  Pipelining
¨  CPU cache

The CPU Core Components

Talking to the Man

¨  Imagine 16 lights
¤  8 on the inside and 8 on the outside
¤  When an inside light is on, the corresponding outside light is on
¤  We can switch these lights on and off
¤  This communication system is like the external data bus

Talking to the Man (Cont.)

¨  In reality, a lot of little wires flash on or off
¤  Voltage is applied or not
¤  Represented not as on, on, off, off… but as 1, 1, 0, 0…
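A minimal sketch of this idea in Python: the on/off state of a set of wires written as a string of ones and zeros. The helper name `wires_to_bits` and the sample states are assumptions made for illustration.

```python
def wires_to_bits(states):
    """Turn a list of wire states (True = voltage applied) into a bit string."""
    return "".join("1" if on else "0" for on in states)

# Eight wires: on, off, on, off, on, off, on, on
states = [True, False, True, False, True, False, True, True]
pattern = wires_to_bits(states)
print(pattern)          # 10101011
print(int(pattern, 2))  # 171, the same pattern read as a binary number
```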

[Figure: eight wires labeled On = 1, Off = 0, On = 1, Off = 0, On = 1, Off = 0, On = 1, On = 1]

¨  We need some sort of codebook that assigns meanings to the many different patterns

¨  You can see little wires sticking out of the CPU
¨  The figure shows a close-up of the underside of a CPU

External Data Bus

¨  The CPU communicates with the outside world using the external data bus (EDB)
¤  Instead of light bulbs, the EDB is made up of tiny wires
¤  The state of a wire is expressed in a binary format, with zeroes and ones
¤  This "1 and 0" or binary system is used to describe the state of these wires at any given moment
¤  Each state represents a line of code in a program

Registers

¨  Registers are tiny storage areas on the CPU, microscopic semiconductor circuits.

¨  They provide the Man in the Box with a workplace for the problems you give him.

¨  He needs at least four worktables.

Registers (Cont.)

¨  Each of these four worktables has 16 light bulbs (16 bits).

¨  All CPUs contain a large number of registers
¨  The four most commonly used ones are the general-purpose registers
¨  Intel gave them the names AX, BX, CX, and DX
¤  AX (Accumulator Register)
¤  BX (Base Register)
¤  CX (Count Register)
¤  DX (Data Register)

The codebook

¨  The man in the box needs one more tool: the codebook, or instruction set
¤  It is called the microprocessor's machine language
¤  One command is a line of code
¤  Here are some examples of real machine language for the Intel 8088

Instruction   Meaning
10111010      The next line of code is a number. Put that number into the DX register
01000001      Add 1 to the number already in the CX register
00111100      Compare the value in the AX register with the next line of code
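The codebook can be pictured as a lookup table. The sketch below holds only the three 8088 examples listed above; the names `CODEBOOK` and `look_up` are hypothetical, and a real instruction set is far larger.

```python
# The three sample commands from the slide; nothing else is defined here.
CODEBOOK = {
    "10111010": "The next line of code is a number. Put that number into the DX register",
    "01000001": "Add 1 to the number already in the CX register",
    "00111100": "Compare the value in the AX register with the next line of code",
}

def look_up(pattern):
    """Return the meaning of an 8-bit pattern, or a note if it is not in this tiny codebook."""
    return CODEBOOK.get(pattern, "not in this sample codebook")

print(look_up("01000001"))
print(hex(int("01000001", 2)))  # 0x41, the same byte written in hexadecimal
```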

The codebook (Cont.)

¨  By placing machine language commands, called lines of code, onto the external data bus one at a time, you can instruct the Man in the Box to do specific tasks.

¨  All of the machine language commands that the CPU understands make up the CPU’s instruction set.

The CPU Clock

Clock

¨  The CPU does no work until told to, even though data may be on the EDB

¨  You need a buzzer to tell the man in the box to start
¤  This is referred to as a clock
¤  A clock is actually a stream of pulses

¨  Of course, a real computer doesn’t use a buzzer. The buzzer on a real CPU is a special wire called the CLOCK wire (most diagrams label the clock wire CLK).

¨  A charge on the CLK wire tells the CPU there’s another piece of information waiting to be processed

Clock (Cont.)

¨  A clock cycle is a single charge (pulse) on the CLK wire

¨  Actually, the CPU requires at least two clock cycles to act on a command, and usually more.

¨  In fact, a CPU may require hundreds of clock cycles to process some commands

Clock (Cont.)

¨  The maximum number of clock cycles that a CPU can handle in a given period of time is referred to as its clock speed

¨  The clock speed is the fastest speed at which a CPU can operate, determined by the CPU manufacturer.

¨  Clock speed is measured in hertz (Hz): cycles (ticks) per second.

Clock (Cont.)

¨  For example, the Intel 8088 processor had a clock speed of 4.77 MHz (4.77 million cycles per second)

¨  1 hertz (1 Hz) = 1 cycle per second
¨  1 kilohertz (1 kHz) = 1 thousand cycles per second
¨  1 megahertz (1 MHz) = 1 million cycles per second
¨  1 gigahertz (1 GHz) = 1 billion cycles per second
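To get a feel for these units, the length of one clock cycle is simply one second divided by the clock speed. A small sketch (the function name `cycle_time_ns` is an assumption for illustration):

```python
def cycle_time_ns(clock_hz):
    """Duration of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz

MHZ = 1_000_000
GHZ = 1_000_000_000

print(round(cycle_time_ns(4.77 * MHZ), 1))  # about 209.6 ns per cycle on the 8088
print(cycle_time_ns(1 * GHZ))               # 1.0 ns per cycle at 1 GHz
```

Because a command needs at least two clock cycles, even the simplest command on a 4.77 MHz CPU would take over 400 ns.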

[Figure: Diagram of an Intel 8088 showing the external data bus and clock wires]

In Summary

¨  The CPU is like a man in a box

¨  The external data bus gets data in and out of the CPU

¨  Registers are used as temporary storage inside the CPU

¨  The instruction set is like a codebook

¨  The clock defines the speed of the CPU

Connection between the CPU and the RAM

How the CPU executes program code

¨  Try the following simple exercise to see how the process works.

¨  Tell the 8088 CPU to add 2 + 3.

Exercise

1.  Place 10000000 on the external data bus (EDB).
2.  Place 00000010 on the EDB.
3.  Place 10010000 on the EDB.
4.  Place 00000011 on the EDB.
5.  Place 10110000 on the EDB.
6.  Place 11000000 on the EDB.

¨  When you finish Step 6, the value on the EDB will be 00000101, the decimal number 5 written in binary.
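The six steps can be replayed on a toy CPU. This is only a sketch: the slides do not say what each command code means, so the meanings below are inferred from the exercise (load AX, load BX, add, output) and are labeled as assumptions. They are not real 8088 opcodes.

```python
# Assumed meanings for the four command codes used in the exercise.
LOAD_AX = "10000000"       # the next line is a number; put it in AX
LOAD_BX = "10010000"       # the next line is a number; put it in BX
ADD_BX_TO_AX = "10110000"  # add BX to AX and leave the result in AX
OUTPUT_AX = "11000000"     # place the value of AX on the EDB

def run(lines):
    """Feed lines of code to a toy CPU one at a time; return what it puts on the EDB."""
    registers = {"AX": 0, "BX": 0}
    edb = None
    pending = None  # register waiting for the number on the next line
    for line in lines:
        if pending is not None:
            registers[pending] = int(line, 2)
            pending = None
        elif line == LOAD_AX:
            pending = "AX"
        elif line == LOAD_BX:
            pending = "BX"
        elif line == ADD_BX_TO_AX:
            registers["AX"] += registers["BX"]
        elif line == OUTPUT_AX:
            edb = format(registers["AX"] & 0xFF, "08b")  # the EDB here is 8 bits wide
        else:
            raise ValueError(f"unknown command: {line}")
    return edb

program = ["10000000", "00000010", "10010000", "00000011", "10110000", "11000000"]
print(run(program))  # 00000101
```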

Connection between the CPU and the RAM

¨  The program itself is stored on the hard drive
¨  The hard drive is too slow to feed the CPU directly
¨  Memory takes copies of programs from the hard drive and then sends them, one line at a time, to the CPU quickly enough to keep up with its demands
¨  Memory must also store the results of the programs
¨  All of this must be done at, or at least near, the clock speed of the CPU
¨  The CPU needs a way to address each line of this memory

Memory controller chip

¨  The Memory controller chip (MCC) contains special circuitry that enables it to grab the contents of any single line of RAM and place that data or command on the external data bus.

¨  This in turn enables the CPU to act on that code

Address Bus

¨  The address bus enables the CPU to tell the MCC which line of code it needs

¨  Different CPUs have different numbers of wires in their address buses

¨  The 8088 had 20 wires in its address bus

¨  If you know the number of wires in the CPU’s address bus, you know the maximum amount of RAM that a particular CPU can handle.

Address Bus

¨  Another set of wires in addition to the external data bus
¨  Used by the CPU to tell the Northbridge (the chip containing the MCC) which line of code it wants from RAM

Address Bus (Cont.)

¨  There are two big questions here: how many patterns are there, and which pattern goes to which row?

Address Bus

¨  The number of wires in the address bus determines the maximum amount of RAM the CPU can handle

¤  An 8088 had 20 wires, which provided 2^20 combinations (1,048,576, or 1 MB)

¤  Many current CPUs use 36 wires, which provide 2^36 combinations (68,719,476,736, or 64 GB)

How many patterns?

¨  The 8088 had a 20-wire address bus

¨  You can say that the 8088 could handle a maximum of one megabyte (1 MB) of RAM. How?

¨  If you have 20 wires, you have 2^20 (or 1,048,576) combinations. Each pattern points to one line of RAM, and each line of RAM is one byte

¨  Therefore, the 8088 had an address space of 1,048,576 bytes

¨  The most RAM it could handle was 2^20, or 1,048,576, bytes
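The same arithmetic as a short sketch, assuming one byte per address pattern as stated above (the function name `address_space_bytes` is hypothetical):

```python
def address_space_bytes(wires):
    """Number of distinct patterns on an address bus, one byte per pattern."""
    return 2 ** wires

for wires in (20, 36):
    size = address_space_bytes(wires)
    print(f"{wires} wires -> {size:,} bytes")

# 1,048,576 bytes is 1 MB; 68,719,476,736 bytes is 64 GB.
```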

Which Pattern Goes to Which Row?

¨  The CPU identifies the first byte of RAM on the address bus as 00000000000000000000

¨  The CPU identifies the last RAM row with 11111111111111111111 (the 1,048,576th line of RAM)

¨  The address bus also addresses all the other rows of RAM in between

¨  So, the CPU can access any row of RAM it needs
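A quick sketch of the row-to-pattern mapping for a 20-wire bus. Rows are counted from zero here, so the 1,048,576th row is row 1,048,575; `row_to_pattern` is a hypothetical helper.

```python
ADDRESS_WIRES = 20  # the 8088's address bus width

def row_to_pattern(row):
    """Address-bus pattern for a zero-based RAM row number."""
    if not 0 <= row < 2 ** ADDRESS_WIRES:
        raise ValueError("row is outside the address space")
    return format(row, f"0{ADDRESS_WIRES}b")

print(row_to_pattern(0))                       # 00000000000000000000
print(row_to_pattern(2 ** ADDRESS_WIRES - 1))  # 11111111111111111111
print(row_to_pattern(5))                       # 00000000000000000101
```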

Traditional Pipeline Concept

¨  Laundry example
¨  Ann, Brian, Cathy, and Dave each have one load of clothes to wash, dry, and fold
¨  The washer takes 30 minutes
¨  The dryer takes 40 minutes
¨  The "folder" takes 20 minutes

Traditional Pipeline Concept

¨  Sequential laundry takes 6 hours for 4 loads

¨  If they learned pipelining, how long would laundry take?

[Figure: sequential laundry timeline from 6 PM to midnight; loads A, B, C, and D each take 30 + 40 + 20 minutes, one after another]

Traditional Pipeline Concept

¨ Pipelined laundry takes 3.5 hours for 4 loads
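Both totals (6 hours sequential, 3.5 hours pipelined) can be checked with a short sketch. It assumes a finished load can wait between machines until the next machine is free; the function names are made up for illustration.

```python
STAGES = [30, 40, 20]  # washer, dryer, folder (minutes)

def sequential_minutes(loads, stages=STAGES):
    """Each load finishes all stages before the next load starts."""
    return loads * sum(stages)

def pipelined_minutes(loads, stages=STAGES):
    """A load enters a stage once it has left the previous stage and the stage is free."""
    free_at = [0] * len(stages)  # time at which each stage becomes free
    for _ in range(loads):
        t = 0  # time at which this load is ready for its next stage
        for i, duration in enumerate(stages):
            start = max(t, free_at[i])
            t = start + duration
            free_at[i] = t
    return free_at[-1]

print(sequential_minutes(4) / 60)  # 6.0 hours
print(pipelined_minutes(4) / 60)   # 3.5 hours
```

The pipelined total is 30 + 4 × 40 + 20 = 210 minutes: after the first wash, the 40-minute dryer sets the pace.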

[Figure: pipelined laundry timeline starting at 6 PM, loads A to D in task order; stage times along the time axis are 30, 40, 40, 40, 40, and 20 minutes]

Traditional Pipeline Concept

¨  Pipelining doesn't help the latency of a single task; it helps the throughput of the entire workload

¨  The pipeline rate is limited by the slowest pipeline stage

¨  Multiple tasks operate simultaneously using different resources

Use the Idea of Pipelining in a Computer

1.  Fetch: Get the data from the EDB
2.  Decode: Figure out what type of command needs to be done
3.  Execute: Perform the calculation
4.  Write: Send the data back onto the EDB

¨  Discrete circuits inside your CPU handle each of these stages.

Pipelining

¨  In early CPUs, when a command was placed on the EDB, each stage did its job and the CPU handed back the answer before starting the next command, requiring at least four clock cycles to process a command.

¨  In every clock cycle, three of the four circuits sat idle.

Pipelining (Cont.)

¨  Today, the circuits are organized in a fashion called a pipeline.

¨  With pipelining, each stage does its job with each clock cycle pulse

¨  The CPU has multiple circuits doing multiple jobs
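A sketch of which command sits in which stage on each clock cycle, assuming an ideal four-stage pipeline in which every stage takes exactly one cycle and nothing stalls (`pipeline_schedule` is a hypothetical name):

```python
STAGES = ["Fetch", "Decode", "Execute", "Write"]

def pipeline_schedule(num_commands):
    """For each clock cycle, list which command (1-based) occupies each stage, or None."""
    total_cycles = len(STAGES) + num_commands - 1
    schedule = []
    for cycle in range(total_cycles):
        row = []
        for stage_index in range(len(STAGES)):
            command = cycle - stage_index + 1  # command that has reached this stage
            row.append(command if 1 <= command <= num_commands else None)
        schedule.append(row)
    return schedule

for cycle, row in enumerate(pipeline_schedule(4), start=1):
    cells = [f"{name}: {cmd if cmd else '-'}" for name, cmd in zip(STAGES, row)]
    print(f"cycle {cycle}: " + "  ".join(cells))
```

On cycle 4 all four circuits are busy at once, each working on a different command.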

Pipelining (Cont.)

¨  Pipelines keep every stage of the processor busy on every click of the clock, making a CPU run more efficiently without increasing the clock speed.

¨  In this model, every pipeline has at least the four stages listed earlier, but advances in caching have increased the number of stages over the years.
¨  Current CPU pipelines contain many more stages, up to 20 in some cases.

Now, it's Men in the Box (simple pipeline)

Pipelining (Cont.)

[Figure: Basic idea of instruction pipelining. (a) Sequential execution of instructions I1 to I3, each a fetch step (F) followed by an execute step (E); (b) hardware organization: instruction fetch unit, interstage buffer B1, execution unit; (c) pipelined execution over clock cycles 1 to 4 (Fetch + Execution)]

Pipelining (Cont.)

[Figure: four-stage pipeline (Fetch + Decode + Execution + Write)]

Pipeline Performance

¨  The potential increase in performance resulting from pipelining is proportional to the number of pipeline stages.

¨  However, this increase would be achieved only if all pipeline stages require the same time to complete, and there is no interruption throughout program execution.

¨  Unfortunately, these conditions rarely hold in practice.

Pipeline Performance

¨  The previous pipeline is said to have been stalled for two clock cycles.

¨  Any condition that causes a pipeline to stall is called a hazard.
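A rough sketch of how stalls eat into the ideal speedup. It assumes every stage takes one clock cycle and simply adds the stalled cycles to the total; the function names and the sample numbers are illustrative only.

```python
def sequential_cycles(commands, stages):
    """Without pipelining, every command passes through all stages alone."""
    return commands * stages

def pipelined_cycles(commands, stages, stall_cycles=0):
    """Cycles to fill the pipeline, plus one per further command, plus any stalls."""
    return stages + commands - 1 + stall_cycles

def speedup(commands, stages, stall_cycles=0):
    """How many times faster the pipelined run is than the sequential run."""
    return sequential_cycles(commands, stages) / pipelined_cycles(commands, stages, stall_cycles)

print(speedup(1000, 4))                    # close to 4, the number of stages
print(speedup(1000, 4, stall_cycles=200))  # stalls pull the speedup down
```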

Pipelining isn’t perfect

¨  Certain commands are complex and therefore harder to decode than other commands.

¨  The Pentium used two decode stages to reduce the chance of pipeline stalls due to complex decoding.

¨  Sometimes a stage hits a complex command that requires more than one clock cycle, forcing the pipeline to stop. These stops are called pipeline stalls.

Pipelining (Cont.)

¨  The inside of the CPU is composed of multiple chunks of circuitry to handle the different types of calculations your PC needs to do.

¨  For example, one part, the integer unit, handles integer math—basic math for numbers with no decimal point.

¨  The typical CPU spends more than 90 percent of its time doing integer math.

¨  But the Pentium also had special circuitry to handle numbers with decimal points (floating point numbers), called the floating point unit (FPU).

¨  With a single pipeline, only the integer unit or the floating point unit worked at any execution stage. This is the second issue.

Pipelining (Cont.)

¨  Worse yet, floating point calculations often took many clock cycles to execute, forcing the CPU to stall the pipeline until the floating point unit finished executing the complex command (as shown in the figure "Bored ALU")

Pipelining (Cont.)

¨  Intel gave the Pentium two pipelines: one main, "do everything" pipeline and one that only handled integer math.
¨  Although this didn't stop pipeline stalls, it at least provided a second pipeline that kept running when the main one stalled

Pipelining (Cont.)

¨  The two pipelines on the old Pentium were so successful that Intel and AMD added more and more pipelines to subsequent CPUs

¨  One of the biggest differences between equivalent AMD and Intel processors is the pipelines.

¨  AMD tends to go for lots of short pipelines whereas Intel tends to go with just a few long pipelines.

Cache

¨  Cache is a separate storage area used for quick access to data

¨  The CPU runs faster than RAM

¨  As a result, you'll always get pipeline stalls, called wait states, because RAM cannot keep up with the CPU

¨  To reduce wait states, the Pentium came with built-in, very high speed RAM called static RAM (SRAM)

¨  Using a faster RAM cache close to the CPU helps the CPU run without waiting

¨  There are 2 types of cache memory:
¤  L1: inside the CPU
¤  L2: on the motherboard, outside the CPU

[Figure: a RAM cache sitting between RAM and the CPU, holding copies of lines of code]

L1 and L2 Cache

¨  L1 is small (16-32 KB) and runs almost as fast as the CPU. It is now part of the CPU.

¨  L2 cache is larger (64 KB to 1 MB) and runs slower than the CPU.

¨  L2 was originally external to the CPU and is now also part of the CPU.

¨  To find the next instruction, the CPU looks first in the L1 cache, then in the L2 cache, and then in RAM.
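The lookup order can be sketched with three dictionaries. This toy model ignores cache sizes and eviction, and the choice to copy a line into both caches after a miss is an assumption made for illustration, not something the slides specify.

```python
def fetch(address, l1, l2, ram):
    """Look for an address in L1, then L2, then RAM; copy it into the caches on the way back."""
    if address in l1:
        return l1[address], "L1"
    if address in l2:
        l1[address] = l2[address]
        return l1[address], "L2"
    value = ram[address]
    l2[address] = value
    l1[address] = value
    return value, "RAM"

ram = {0: "10000000", 1: "00000010", 2: "10010000"}
l1, l2 = {}, {}

print(fetch(1, l1, l2, ram))  # ('00000010', 'RAM'): the first access goes all the way to RAM
print(fetch(1, l1, l2, ram))  # ('00000010', 'L1'): the repeat is served from L1
```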

L2 Cache

¨  L2 was originally on the motherboard
¤  Referred to as external cache
¤  Now commonly built into the CPU

Any Questions?

References

¨  Mike Meyers, "CompTIA A+ Certification, All-in-One", sixth edition, chapter 3, pp. 55-69, pp. 75-81