PA-RISC 2.0 64-Bit Processors: History, Features, and Architecture

CS-350 Section 2

Spring 2004

By: Joshua Madagan, Adam Gray, and Christie Kummers

Table of Contents

Computing in 64-bit
    The Dawn of the 64-bit chip
    Differences between 32-bit and 64-bit chips
    Benefits of switching to 64-bit

History of PA-8x00 Processors
    PA-8000
    PA-8200
    PA-8500
    PA-8600
    PA-8700
    PA-8800

Features and Detailed Architecture
    RISC
    64-Bit Computing
    Out of Order (OoO) Execution
    Branch Prediction
    4-Way Superscalar Execution
    Cache System
    Physical Architecture

Conclusion
Appendix
Works Cited

Computing in 64-bit

With the growth of today’s technology, computers are constantly becoming faster and more powerful. A desktop system bought today may use technology that is superseded within about three months. Meeting the growing demand for more power and better performance sometimes requires drastic measures. In this case, that measure is the move from 32-bit processors to 64-bit processors.

The Dawn of the 64-bit chip

64-bit computing is the next step in advancing computer technology. Hewlett-Packard entered 64-bit computing in late 1995 and early 1996 with the release of its PA-RISC 8000 (PA-8000) for large-scale servers. Currently, HP markets its 64-bit chips for servers only, and these systems run mainly HP-UX. To compete in the marketplace, however, HP debuted its original 64-bit workstation at a price that was considerably low for the company, just under $25,000. The bargain price was the result of pressure from competitors. On the strength of its committed performance, HP attracted a very promising customer, the United States Army, which signed with HP when the PA-8000 was first released (Hayes, 1996).

Differences between 32-bit and 64-bit chips

The difference between 32-bit and 64-bit chips is most noticeable in the amount of memory they can address. A 32-bit processor, such as today’s Intel and AMD chips, can address up to four gigabytes of memory. On Windows-based machines, that address space is divided between the operating system and applications, so the most memory that any one program can use is two gigabytes. A 64-bit central processing unit (CPU), on the other hand, can handle more memory and larger files. It can address around sixteen exabytes of memory, which is over sixteen billion gigabytes. This larger address space is what allows more memory to be addressed (Mainelli, 2003).
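The figures above follow directly from the address width: an n-bit address can name 2^n distinct bytes. The short Python sketch below checks the arithmetic; the function name and the use of binary gigabytes (2^30 bytes) are choices made for this illustration, not something taken from the cited sources.

```python
GIB = 2**30  # bytes in one (binary) gigabyte; an assumption of this sketch


def address_space_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2**address_bits


for bits in (32, 64):
    size = address_space_bytes(bits)
    print(f"{bits}-bit: {size:,} bytes = {size // GIB:,} GB")
```

A 32-bit address space comes to 4 GB, while a 64-bit address space comes to 2^34 GB, which is a little over seventeen billion gigabytes.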

To achieve the full 64-bit effect, you need a 64-bit operating system that uses the 64-bit addressing and arithmetic capabilities of the CPU. A 64-bit processor provides more resources to the system and its programs, and it is particularly useful when the system is called upon to perform integer arithmetic: sixty-four bits provide better performance and precision than thirty-two bits do. Even today, most system compilers support 64-bit data types on 32-bit CPUs; running the same code on a 64-bit CPU results in increased performance on these larger data types (Hewlett-Packard, 2004).
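One way to see why native 64-bit integer arithmetic helps: on a 32-bit CPU, a single 64-bit addition has to be synthesized from two 32-bit additions plus carry handling. The sketch below imitates that sequence in Python. The function name is hypothetical, and the code only illustrates the general technique; it is not output from any particular compiler.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within 32 bits


def add64_with_32bit_ops(a: int, b: int) -> int:
    """Add two 64-bit unsigned values using only 32-bit-wide steps."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    lo = a_lo + b_lo          # first 32-bit add (low halves)
    carry = lo >> 32          # carry out of the low half, 0 or 1
    lo &= MASK32

    hi = (a_hi + b_hi + carry) & MASK32  # second 32-bit add (high halves)
    return (hi << 32) | lo    # result wraps modulo 2**64


print(hex(add64_with_32bit_ops(0xFFFFFFFF, 1)))
```

A 64-bit ALU performs the same addition in a single operation, which is the kind of gain on larger data types that the paragraph above describes.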

If an application that does not require any 64-bit features is run on a 64-bit system, the program should remain a 32-bit program. On HP’s systems, 32-bit applications can run on both 32-bit and 64-bit systems, saving the user money on multiple versions of the same program. HP stated that a 32-bit application would run seventy to one hundred percent faster on its original 64-bit PA-8000 than on a 32-bit system (Foley, 1996). If the application were recompiled for 64-bit, the file would be larger than before, and system performance could decline because of the number of cache misses while the program is running in 64-bit mode (Hewlett-Packard, 2004 and Jacobs, 1998). However, if a client wanted to run their applications in 64-bit mode, recompiling the programs would be the only major issue, as long as they followed the instructions supplied by their vendor (Garvey, 1998).

Benefits of switching to 64-bit

The benefits of 64-bit computing are considerable, and they are all about speed (Foley, 1996). When 64-bit computing is adopted in any environment, it gives the user more powerful hardware and increased application performance (Jacobs, 1998). Currently, the 64-bit technology manufactured by HP is used more than anything else for storing large amounts of data, and it is common in data warehouses and similar database work (Jacobs, 1998). Some applications do not fit on 32-bit machines, forcing systems to store data in multiple files instead of one. Placing such a large application on a 64-bit machine lets the system run at a higher performance level. With more memory and address space, there is less swapping, and searches can be sped up considerably (Hewlett-Packard, 2004; see footnote 1).

Besides being well suited to database work, 64-bit processors will also be a driving force for graphics-intensive programs. A 64-bit system is more capable of handling the large files that are usually associated with graphics and movies, which makes it a good match for real-time multimedia programs on the World Wide Web (Foley, 1996). With the extra processing speed, programmers can add an incredible amount of detail to games, leading to more realistic sounds, environments, and textures. Characters would be more detailed, with more human-like features, and even computer-controlled characters would play more realistically. Until 64-bit technology is adopted at every level of the system architecture, its full benefits cannot be appreciated (Garvey, 1998). 64-bit computing will also benefit computer-aided design (CAD), three-dimensional simulation, business modeling, and semiconductor design. It could also be applied to national defense and high-end decision support systems (Phan, 2002).

History of PA-8x00 Processors

PA-8000

The PA-8000 was originally introduced in January of 1996. It was the first chip to use the 64-bit PA-RISC 2.0 architecture, meaning that all integer registers and functional units were widened to 64 bits. It also allowed for faster translation from virtual addresses to physical addresses. The PA-8000 was given an Instruction Reorder Buffer, allowing the CPU to perform its own instruction scheduling. It was also equipped with dual floating-point units and dual load/store units, and it had no on-chip caches. All caches were placed off chip so that more data could be accessed per cycle. This made the latency nearly two cycles, but with complete pipelining it could be brought closer to one cycle. The PA-8000, and every PA-8x00 after it, also performed speculative execution. This means that the processor tries to guess which instructions are coming up and prepares for them accordingly. When it is time to actually perform the instruction, the predicted outcome and the actual outcome are compared. If they do not match, the predicted outcome is thrown out. The principle is that if things go as the processor predicts, it can work through its instructions much more quickly (Weissmann, 1999-2004).

1 See the Appendix for a table summarizing the sources of increased performance and scalability related to 64-bit computing, by type of application.

PA-8200

This version of the PA-8x00 was introduced in May of 1997. It was essentially an upgraded version of the PA-8000. It offered improved performance, including 4Mb SRAMs with faster access times, which allowed for a larger cache size. The Translation Lookaside Buffer (TLB) and Branch History Table (BHT) were also enlarged to reduce “wasted cycles” (Weissmann, 1999-2004).

PA-8500

The PA-8500 was introduced in September of 1998. Once again the chip was made bigger and faster. A major change was that the L1 cache was integrated onto the CPU die. The TLB and BHT were enlarged once more. The PA-8500 was also able to handle two memory operations at the same time. This was accomplished by using the same dual-bank arrangement used for the PA-8000’s off-chip data cache. All data caches on the PA-8500 are 0.5 MB and are implemented as four 0.125 MB arrays, each supplying a double-word of data. The instruction cache is a 0.5 MB four-way set associative pipelined cache that supplies 128 bits of instructions (Weissmann, 1999-2004).

PA-8600

In January of 2000, the PA-8600 was introduced. Only minor modifications were made between the PA-8500 and PA-8600. The only real changes were a higher clock speed, modifications to the interface bus, reworked bus transactions, and the addition of a quasi-LRU (Least Recently Used) replacement policy for the instruction cache (Weissmann, 1999-2004).

PA-8700

The PA-8700 was introduced in August of 2001. Again, the PA-8700 was essentially another upgrade of the PA-8500. The on-chip L1 cache and TLB were enlarged significantly, and a new CMOS process helped boost the clock frequency (Weissmann, 1999-2004).

PA-8800

The PA-8800 was introduced in October of 2001. No modifications were made to the core between the PA-8700 and the PA-8800. Rather, the PA-8800 was simply two PA-8700 cores put together on the same chip. This allowed the chip’s core speed to run up to 1 GHz and allowed for a combined 35 MB of L1 and L2 cache (LostCircuits, 2001; see footnote 2).

2 See the Appendix for a table that lists all of the HP PA-8x00 processors and the features that they each possess.

Features and Detailed Architecture

The basic architecture of the PA-8x00 64-bit processors has not changed much since the original PA-8000 was released. The standards set in the PA-RISC 2.0 specifications are implemented in all PA-8x00 chips. These include 64-bit data and address extensions, branch prediction, the use of a 4-way superscalar system, and out-of-order (OoO) execution, among other things (Hewlett Packard, 2000). However, several key items have been changed and improved during the lifecycle of the PA-8x00 family. The location of L1 cache has been changed during the process, components have been added, and paths widened (Weissmann, 1999-2004). The PA-RISC 64-bit processors have quite a few interesting features.

RISC

The PA-8x00 series of processors is a Reduced Instruction Set Computer (RISC) design. The instruction set is implemented directly in hardware, without the use of microcode. Hardware-implemented instructions are performed in one clock cycle, while microcoded instructions can take several cycles. The PA-RISC 2.0 processors also use a fixed instruction size of 32 bits. These instructions can easily be divided into parts, allowing for easier pipelining (Kane, 1996).
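To illustrate why a fixed 32-bit instruction size is easy to divide into parts, the sketch below splits an instruction word into fields using only shifts and masks. The field layout (a 6-bit opcode, two 5-bit register numbers, and a 16-bit immediate) is a hypothetical layout chosen for illustration; it is not the actual PA-RISC 2.0 encoding, which defines many formats.

```python
def decode(word: int) -> dict:
    """Split a 32-bit instruction word into fields of a made-up format."""
    if not 0 <= word < 2**32:
        raise ValueError("instruction words are exactly 32 bits wide")
    return {
        "opcode": (word >> 26) & 0x3F,  # top 6 bits
        "r1": (word >> 21) & 0x1F,      # next 5 bits
        "r2": (word >> 16) & 0x1F,      # next 5 bits
        "imm": word & 0xFFFF,           # low 16 bits
    }


# Build a word from known field values, then decode it again.
word = (0x0D << 26) | (3 << 21) | (4 << 16) | 0x1234
print(decode(word))
```

Because every field sits at a fixed bit position, a hardware decoder can extract all of them in parallel, which is what makes the early pipeline stages simple.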

64-Bit Computing

The PA-8x00 processors are true 64-bit computers. All of the integer registers, Arithmetic Logic Units (ALUs), and shift and merge units are 64 bits wide. The address space can theoretically be up to 64 bits wide, but the PA-8000, PA-8200, PA-8500, and PA-8600 only support 40-bit addresses, while the PA-8700 and PA-8800 support 44-bit addresses. This allows for 1 TB and 16 TB of memory, respectively. Together, these features allow the PA-8x00 processors to access huge amounts of data quickly and operate on increasingly large numbers (Weissmann, 1999-2004).

Out of Order (OoO) Execution

The PA-8x00 processors have the ability to schedule their own instructions (Hunt). This allows the processor to find instructions that can be executed simultaneously, and therefore make better use of its multiple execution units. The Instruction Reorder Buffer (IRB) stores a maximum of 28 computational instructions and 28 load/store instructions and determines which instructions can be executed (Weissmann, 1999-2004). As a result, instructions are not necessarily executed in program order. Branch prediction also comes into play here, because later instructions can be executed before earlier ones have finished and may therefore have based their calculations on incorrect data. Out-of-order execution provides increased performance through the constant use of pipelining, as instructions are constantly being fed to the execution units (Hunt).
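The idea of issuing instructions as soon as their operands are ready, rather than in program order, can be sketched with a small simulation. Everything here is a simplifying assumption made for illustration: the `Instr` record, the `schedule` function, the latencies, a single pool of up to four issue slots per cycle, and the rule that each register is written at most once and producers appear before consumers in the program. It is not a model of the actual IRB hardware.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    dest: str
    srcs: tuple
    latency: int = 1  # cycles until the result is available


def schedule(program, width=4):
    """Return, for each cycle, the names of the instructions issued."""
    written = {ins.dest for ins in program}  # registers produced in-program
    ready_at = {}                            # register -> cycle result is ready
    pending = list(program)
    cycles = []
    cycle = 0
    while pending:
        issued = []
        for ins in list(pending):
            if len(issued) == width:
                break
            # A source is ready if it is an initial register or its
            # producer has already delivered its result.
            if all(s not in written or ready_at.get(s, cycle + 1) <= cycle
                   for s in ins.srcs):
                ready_at[ins.dest] = cycle + ins.latency
                issued.append(ins.name)
                pending.remove(ins)
        cycles.append(issued)
        cycle += 1
    return cycles


program = [
    Instr("load", "r1", ("r0",), latency=2),
    Instr("add", "r2", ("r1", "r0")),   # must wait for the load
    Instr("mul", "r3", ("r4", "r5")),   # independent of the load
    Instr("sub", "r6", ("r3", "r2")),   # needs both add and mul
]
print(schedule(program))
```

In this run the independent `mul` issues in the first cycle alongside `load`, ahead of the earlier `add`, which has to wait for the load result.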

Branch Prediction

The PA-8x00 implements branch prediction in order to keep its pipelines full while executing instructions that change the flow of control in programs. Conditional and looping statements create instances where control could be passed to different places, depending on the outcome of the statement (Downey, 2000). To handle this, the PA-8x00 implements both static and dynamic branch prediction (Hewlett Packard, 2000). The processor guesses the outcome of a branch based on the code itself and on the history of the branch, if any, found in the BHT. Assuming that the prediction is correct, the processor continues from the branch until it receives the true result. If the result matches the prediction, the program continues on, having saved itself some cycles. However, if the result does not match the prediction, the provisionally executed instructions are thrown out, and the IRB reverts to the branch and continues feeding instructions from there with the correct result (Hewlett Packard, 2000). Branch prediction increases performance by helping OoO execution keep instructions flowing to the multiple execution units.
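Dynamic prediction from a branch history table can be sketched as follows. The scheme shown, one 2-bit saturating counter per table entry, is a common textbook approach and is an assumption of this example; the sources cited here do not say that HP used exactly this scheme. The class name, the indexing by address, and the 256-entry default are likewise illustrative.

```python
class BranchHistoryTable:
    """Table of 2-bit saturating counters indexed by branch address."""

    def __init__(self, entries=256):
        self.entries = entries
        self.counters = [1] * entries  # 0-1 predict not taken, 2-3 taken

    def _index(self, pc):
        return (pc >> 2) % self.entries  # instructions are 4 bytes wide

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def count_mispredictions(bht, pc, outcomes):
    """Predict each outcome, then train the table with the real result."""
    misses = 0
    for taken in outcomes:
        if bht.predict(pc) != taken:
            misses += 1
        bht.update(pc, taken)
    return misses


# A loop branch: taken nine times, then falls through once.
loop_outcomes = [True] * 9 + [False]
print(count_mispredictions(BranchHistoryTable(), 0x1000, loop_outcomes))
```

For the loop pattern above, only the first iteration and the final exit are mispredicted; the other eight branches are predicted correctly from history.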

4-Way Superscalar Execution

The PA-8x00 processors implement a 4-way superscalar system. This means that the processor can execute 4 different instructions per clock cycle (Webopedia, 2001). Superscalar systems work by using multiple execution units so that long execution times do not waste cycles in the fetch, decode, and save stages (Downey, 2000). Superscalar execution requires a constant flow of data and instructions to see the benefits of simultaneous execution. This is achieved through a combination of advanced instruction scheduling algorithms, the ability to process instructions out of order, and branch prediction.

Cache System

The cache system has not remained the same throughout the PA-8x00 chips. All of the chips have used separate data and instruction L1 caches, as specified in the Harvard Architecture, but the placement of these has changed. The PA-8000 and PA-8200 both make use of off-chip L1 caches (Weissmann, 1999-2004). The PA-8x00 series of chips was designed to be high performance, and delivering this performance required a larger cache with more bandwidth than could be included on the chip in the mid-1990s (Hunt). These off-chip caches, one for data and one for instructions, could be up to 2 MB each for the PA-8200. The caches are direct-mapped and dual-ported (Weissmann, 1999-2004). Direct mapping means that each block of memory, when pulled into the cache, can be mapped to only one block in the cache. This is cheaper for HP to implement, but it also gives lower performance (Null & Lobur, 2003). A dual-ported cache can feed data to two load/store units at the same time, which increases performance.
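The performance cost of direct mapping comes from conflicts: two memory blocks that map to the same cache line keep evicting each other. The sketch below models a tiny direct-mapped cache to show this. The class name and the sizes (4 lines of 32 bytes) are illustrative assumptions and are far smaller than the real caches.

```python
class DirectMappedCache:
    """Each memory block can live in exactly one cache line."""

    def __init__(self, num_lines, line_size=32):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # tag currently stored in each line

    def access(self, address):
        """Return True on a hit, False on a miss (and fill the line)."""
        block = address // self.line_size
        index = block % self.num_lines   # the only line this block may use
        tag = block // self.num_lines    # identifies which block is there
        hit = self.tags[index] == tag
        self.tags[index] = tag
        return hit


cache = DirectMappedCache(num_lines=4, line_size=32)
# Addresses 0 and 128 are 4 lines apart, so both map to line 0.
print([cache.access(a) for a in (0, 128, 0, 128)])
```

Every access in this pattern misses, even though three of the four lines are never used.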

Starting with the PA-8500, HP moved the L1 caches onto the chip. The caches were moved on-chip because they are cheaper and use fewer I/O resources than the previously used off-chip RAM chips (Hunt et al.). These on-chip caches are 4-way set associative, meaning that the cache is divided into sets of 4 blocks each. A memory block can be placed in any of the 4 blocks of the set it maps to, which means that blocks are replaced less often than with direct mapping (Null & Lobur, 2003). These caches are only single-ported, though, as this saves space on the processor core (Weissmann, 1999-2004). The L1 caches of the PA-8500, PA-8600, PA-8700, and PA-8800 are all arranged in the same way, but increase in size with later models. This diagram shows the PA-8700 L1 cache organization (Hewlett Packard, 2000).

The L1 data cache is separated into odd and even double-word caches, each with a 64-bit pipe out. Both of these caches are divided into four arrays of equal size. This allows for the 4-way set associative nature of the cache. The tags for each line of cache are held in the tag array. The L1 instruction cache is also divided into four arrays of equal size, with 2 smaller arrays in each. These 4 arrays all have 128-bit pipes to the multiplexer, which can then transmit 4 instructions per cycle to the Instruction Fetch Unit (IFU), assuming one instruction comes from each array (Hewlett Packard, 2000).
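The benefit of 4-way set associativity over direct mapping can be sketched with a small model. The class name, the sizes, and the use of true LRU replacement are assumptions made for illustration; the text above mentions a quasi-LRU policy on later chips, which this sketch does not attempt to reproduce.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """A block may occupy any of the `ways` lines in its set."""

    def __init__(self, num_sets, ways=4, line_size=32):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        # One ordered dict per set: oldest (least recently used) tag first.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, address):
        """Return True on a hit, False on a miss (and fill a line)."""
        block = address // self.line_size
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)      # mark as most recently used
            return True
        if len(lines) == self.ways:
            lines.popitem(last=False)   # evict the least recently used
        lines[tag] = True
        return False


# Four lines in total, organized as one set of four ways.
cache = SetAssociativeCache(num_sets=1, ways=4, line_size=32)
print([cache.access(a) for a in (0, 128, 0, 128)])
```

Addresses 0 and 128 would collide in a direct-mapped cache with four 32-byte lines, but here both blocks fit in the same set, so the repeated accesses hit.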

The PA-8800 also implements a 32 MB off-chip L2 cache, which is shared by the two logic cores. This, combined with the 1.5 MB of L1 data cache and 1.5 MB of L1 instruction cache, gives the PA-8800 the ability to hold even relatively large applications entirely in cache (Lostcircuits, 2001).

Physical Architecture

Physically, the architecture of the PA-8x00 processors has not changed much since the introduction of the PA-8000. Here is a block diagram of the PA-8000 (Hunt).

The architecture of the PA-8000 is fairly simple. It contains two 64-bit integer ALUs, two floating point units (FPUs), two load/store units, two shift/merge units, and two divide/square root units (Hunt). Both of the FPUs have thirty-two 64-bit registers, and the ALUs can take advantage of thirty-two 64-bit registers, as well (Gwennap, 1994). The IFU can fetch up to 4 instructions per cycle, contains the BHT and Branch Target Address Cache (BTAC), and connects to the L1 instruction cache, the system bus/memory, and the sort unit. The BTAC stores addresses for predicted branches, and works with the BHT in branch prediction. The sort unit controls the flow of instructions into the two buffers. The 28-entry ALU and memory buffers create the 56-entry IRB, used to perform OoO execution. These feed instructions to the processing units (Hunt). The rename registers hold the results from the processing units and from L2 data cache for use in other pending instructions (Gwennap, 1994). Once all of the preceding instructions have been completed, an instruction is retired. This process clears the instruction from the IRB and moves the result from the rename registers to the architected registers (Hunt).

The architecture did not change until the PA-8700. Here is a block diagram of the PA-8700 (Johnson, 2001).

The only real difference between the architecture of the PA-8000 and that of the PA-8700 is the use of on-chip L1 cache and the inclusion of an interface to off-chip L2 cache.

The PA-8800 is the latest member of the PA-8x00 family of processors. Here is a block diagram of the PA-8800 (Lostcircuits, 2001).

The PA-8800 architecture is simply two PA-8700 processors on the same chip, linked together and linked with an off-chip 32MB L2 cache (Lostcircuits, 2001).

Conclusion

In conclusion, 64-bit computing has come a long way. It provides great performance increases over 32-bit computing. Hewlett Packard's entry into 64-bit computing, the PA-RISC 2.0 architecture, has grown quite a bit since its inception. From the PA-8000 with its off-chip L1 cache to the dual-core PA-8800, HP has developed a family of processors with competitive features and a unique architecture.

Appendix

Table 1: The following table summarizes the sources of increases in performance and scalability associated with 64-bit computing by type of application.

(Hewlett-Packard, 2004)

Example               Sources of Performance & Scalability gains

→ Large Databases     ● Larger memory allocation per user
                      ● Many more users
                      ● Large file implementations
                      ● Reduced swapping

→ Decision support    ● Direct addressing
                      ● Reduced swapping
                      ● Large file implementations

→ Technical           ● Large process data space applications
                      ● More available shared memory segments
                      ● Reduced swapping
                      ● High-precision arithmetic

Table 2: This is a table that features what each PA-8x00 processor possesses

(X = feature present, - = feature not present)

PA-8000  PA-8200  PA-8500  PA-8600  PA-8700  PA-8800  Feature
   X        X        X        X        X        X     PA-RISC V2.0 64-bit
   X        X        X        X        X        X     10 function units, 2 integer ALUs, 2 shift/merge units
   X        X        X        X        X        X     2 complete load/store pipelines, 2 FP multiply/accumulate units, 2 FP divide/square root units
   X        X        X        X        X        X     4-way superscalar
   X        X        X        X        X        X     2 address adders
   X        -        -        -        -        -     96-entry fully-associative dual-ported TLB
   X        -        -        -        -        -     TLB miss penalty of 61 cycles
   -        X        -        -        -        -     120-entry fully-associative dual-ported TLB
   -        -        X        X        -        -     160-entry fully-associative dual-ported TLB
   -        -        -        -        X        X     240-entry fully-associative dual-ported TLB
   X        -        X        X        X        X     32-entry BTAC (Branch Target Address Cache)
   -        X        -        -        -        -     42-entry BTAC (Branch Target Address Cache)
   X        -        -        -        -        -     256-entry BHT (Branch History Table)
   -        X        -        -        -        -     1024-entry BHT (Branch History Table)
   -        -        X        X        X        X     2048-entry BHT (Branch History Table)
   X        X        X        X        X        X     Dynamic and static branch prediction modes

Table 2.2: This table is a continuation of table 2.

(X = feature present, - = feature not present)

PA-8000  PA-8200  PA-8500  PA-8600  PA-8700  PA-8800  Feature
   X        -        -        -        -        -     Off-chip L1 caches up to 1MB I and 1MB D, realized in synchronous 6.7ns (150MHz) late-write 1Mb SRAMs, one cycle latency
   -        X        -        -        -        -     Off-chip L1 caches up to 2MB I and 2MB D, realized in synchronous 5ns (200MHz) late-write 4Mb SRAMs, one cycle latency
   -        -        X        X        -        -     On-chip L1 caches 0.5MB I and 1MB D, each 4-way set associative
   -        -        -        -        X        X     On-chip L1 caches 0.75MB I and 1.5MB D, each 4-way set associative, implemented in independent 0.75MB banks
   X        X        -        -        -        -     Caches are direct-mapped and dual-ported
   -        -        X        X        X        X     32 or 64 Byte cache line size
   -        -        -        -        X        X     Data cache prefetching
   -        -        X        X        -        -     Supports up to 1 TB of physically addressable memory (40-bit physical addresses)
   -        -        -        -        X        X     Supports up to 16 TB of physically addressable memory (44-bit physical addresses)
   X        X        X        X        X        X     56-entry instruction queue/reorder buffer (IRB)
   X        X        -        -        -        -     Each instruction includes five predecode bits
   -        -        -        X        -        -     Quasi LRU replacement policy for the instruction cache
   -        -        -        -        X        X     Quasi LRU replacement policy for both the instruction and data cache
   X        X        X        X        X        X     Bi-endian support
   -        -        -        -        X        X     Support for hardware lock-stepping, i.e. operating multiple chips in parallel to detect faults
   X        X        -        -        -        -     Runway system/memory bus, 120MHz, 64-bit wide, featuring split transactions and glueless multiprocessing. Max. throughput of 960MB/s

Table 2.3: This table is a continuation of table 2.

(X = feature present, - = feature not present)

PA-8000  PA-8200  PA-8500  PA-8600  PA-8700  PA-8800  Feature
   -        -        X        X        X        X     Runway system/memory bus, 125MHz, 64-bit, DDR (double data rate), ~2GB/s peak bandwidth
   X        -        -        -        -        -     Up to 180MHz frequency with 3.3V core voltage
   -        X        -        -        -        -     Up to 300MHz frequency with 3.3V core voltage
   -        -        X        -        -        -     Up to 440MHz frequency with 2.0V core voltage
   -        -        -        X        -        -     Up to ~550MHz frequency with 2.0V core voltage
   -        -        -        -        X        X     Up to 750MHz (875MHz on the PA-8700+) frequency with 1.5V core voltage
   X        X        -        -        -        -     17.7 x 19.6 mm² die, 4,500,000 FETs, 0.5 micron, 5-layer metal CMOS, packaged in a 1,085-pin flip-chip LGA package
   -        -        X        X        -        -     21.3 x 22.0 mm² die, 140,000,000 FETs, 0.25 micron, 5-layer metal CMOS, packaged in a 544-pin LGA package
   -        -        -        -        X        X     16.0 x 19.0 mm² die, 186,000,000 FETs, 0.18 micron, 7-layer SOI CMOS, packaged in a 544-pin LGA package
   X        -        -        -        -        -     SPEC95 int/fp: 11.8/20.2
   -        X        -        -        -        -     SPEC95 int/fp: 15.5/25.0
   -        -        X        -        -        -     SPEC95 int/fp: 31.8/47.2
   X        -        -        -        -        -     SPEC2000 int/fp: 125/153
   -        X        -        -        -        -     SPEC2000 int/fp: 165/189
   -        -        X        -        -        -     SPEC2000 int/fp: 338/357

Works Cited

64-bit Computing re-examined. (2002, August). Network Magazine. http://www.networkmagazineindia.com/200208/focus2.shtml

Allison, Andrew. (1995, April). High-End Computing: Hope And Reality Of 64-Bit. InformationWeek. http://www.informationweek.com/524/24uwfw.htm

Bourekas, Phil (1999, January). 64-bit features give clout to 64-bit chips. EE Times. http://www.eetimes.com/article/showArticle/jhtml?/articleId+18300800

Downey, Tim (2000). “Branch Prediction.” http://www.cs.fiu.edu/~downeyt/cop3402/prediction.html

Foley, John. (1996, January). High-Speed Processors: Plugging In 64-Bit Chips. InformationWeek. http://www.informationweek.com/560/60ht64b.htm

Garvey, Martin. (1998, April). 64-Bit Computing Takes Off. InformationWeek. http://www.informationweek.com/678/78iubit.htm

Gwennap, Linley (1994). “PA-8000 Combines Complexity and Speed.” The Insiders' Guide to Microprocessor Hardware. 14 Nov. 1994.

Hayes, Mary. (1996, June). HP 64-Bit Workstations Ready. InformationWeek. http://www.informationweek.com/582/82iubit.htm

Hewlett Packard (2000). “PA-RISC 8x00 Family Microprocessors with Focus on PA-8700.” URL: http://www.cpus.hp.com/technical_references/PA-8700wp.pdf

Hewlett-Packard (2004). What is 64-bit computing? http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,989,00.html

Hunt, Doug. “Advanced Performance Features of the 64-bit PA-8000.” http://www.cpus.hp.com/technical_references/advperf.shtml

Hunt, D., Lesartre, G. “PA-8500: The Continuing Evolution of the PA-8000 Family.” http://www.cpus.hp.com/technical_references/8500.shtml

Jacobs, April. (1998, April). 64-bit computing. http://www.computerworld.com/hardwaretopics/hardware/story10,10808,43552,00.html

Johnson, David (2001). “HP’s Mako Processor.” http://www.cpus.hp.com/technical_references/mpf_2001.pdf

Kane, Gerry (1996). “PA-RISC 2.0 Architecture.” http://h21007.www2.hp.com/dspp/files/unprotected/parisc20/PA_1_overview.pdf

Lostcircuits (2001). “HP PA-8800 RISC Processor.” http://www.lostcircuits.com/cpu/hp_pa8800/

Mainelli, Tom. (2003, July). Are You Ready for a 64-Bit PC? PCWorld. http://www.pcworld.com/news/article/0,aid,111508,00.asp

McGee, Marianne Kolbasuk & Panettieri, Joseph C. (1995, December). The Push To 64-Bit Systems: Top vendors plan move to advanced microprocessors. InformationWeek. http://www.informationweek.com/558/58/iubit.htm

Null, L., Lobur, J. (2003). Computer Organization and Architecture. Sudbury, MA: Jones and Bartlett Publishers. QZ76.9.C643 N85 2003. ISBN 0-7637-0444-X.

Webopedia (2001). “What is superscalar?” http://www.webopedia.com/TERM/S/superscalar.html

Weissmann, Paul (1999-2004). “The OpenPA Project.” http://www.openpa.net
