Pentium II: upward compatible with the I8088, …, Pentium Pro.
7-Aug-15 (1) CSC2510 - Computer Organization Lecture 6: A Historical Perspective of Pentium IA-32.
Apr 19, 2023 (1)
CSC2510 - Computer Organization
Lecture 6: A Historical Perspective of Pentium IA-32
IA-32 Intel Architecture
IA-32 processors
• 386 & 486 processors
• Pentium processors
• P6 family processors (Pentium Pro, Pentium II, Pentium III) : based on the P6 family microarchitecture
• Pentium 4 processors, Intel Xeon processors, Pentium D processors, Pentium processor Extreme Editions : based on the Intel NetBurst microarchitecture
IA-32 Intel Architecture
• A brief history of the IA-32 Architecture
• Coming from … 16-bit processors
• 8086 processors
− 16-bit registers, 16-bit external data bus
− 20-bit addressing, giving a 1 MByte address space
• 8088 processors : 8-bit external data bus
• 8086/8088 introduced ‘segmentation’ to the IA-32 architecture: four 16-bit segment registers point to memory segments of up to 64 Kbytes
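The relationship between 16-bit segment registers and the 20-bit address space can be sketched as follows. The slide does not spell out the address calculation; the sketch assumes the standard 8086 rule (segment shifted left by 4 bits, plus a 16-bit offset, wrapped to 20 bits). The function name `physical_address` is made up for illustration.

```python
ADDRESS_LINES = 20
ADDRESS_SPACE = 1 << ADDRESS_LINES   # 1 MByte reachable with 20 address lines
SEGMENT_SIZE = 1 << 16               # 64 Kbytes reachable through one 16-bit offset


def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address.

    Assumption (not stated on the slide): physical = segment * 16 + offset,
    truncated to 20 bits because the 8086 has only 20 address lines.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    return ((segment << 4) + offset) & (ADDRESS_SPACE - 1)
```

For example, `physical_address(0x1234, 0x0010)` evaluates to `0x12350`, and many different segment:offset pairs map to the same physical byte because segments overlap every 16 bytes.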
Internal architecture of 8086
Intel 8085 architecture : 8-bit data, 16-bit address
Intel 286 processor (1982)
Provides two programming modes:
1) Real mode
• Functions exactly the same as the 8086
• Uses only the 20 least significant address lines (max. 1 MB)
• Faster than the 8086 due to redesign and a higher clock rate
2) Protected mode
• 16 new instructions are added
• Supports a multi-program environment by giving each program a predetermined amount of memory (max. 16 MB)
• Programs no longer have physical addresses, but are addressed by a segment selector
• Several programs can be loaded into memory at the same time, but are protected from each other
The 8086 and 80286 microprocessors.
John Uffenbeck, The 80x86 Family: Design, Programming, and Interfacing, 3e
Copyright ©2002 by Pearson Education, Inc., Upper Saddle River, New Jersey 07458
All rights reserved.
Intel 386 processor (1985)
• First 32-bit processor in the IA-32 architecture family
• 32-bit registers used both for holding operands and addressing
• 32-bit address bus that supports up to 4 Gbytes of physical memory
• Segmented-memory model and flat memory model
• Paging (fixed 4-Kbyte page) for virtual memory management
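• 386SX (16-bit external data bus) and 386DX (32-bit external data bus); neither has an on-chip FPU (the 387 coprocessor is a separate chip)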
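A fixed 4-Kbyte page means the low 12 bits of a 32-bit linear address are the offset within the page. The sketch below also assumes the usual two-level 386 layout (10-bit page-directory index, 10-bit page-table index), which the slide does not state; `split_linear_address` is an illustrative name.

```python
PAGE_SIZE = 4 * 1024      # fixed 4-Kbyte page, as on the slide
OFFSET_BITS = 12          # log2(4096)
INDEX_BITS = 10           # assumption: two 10-bit indices (directory, table)


def split_linear_address(addr: int) -> tuple[int, int, int]:
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    if not 0 <= addr < (1 << 32):
        raise ValueError("address must fit in 32 bits")
    offset = addr & (PAGE_SIZE - 1)
    table_index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    directory_index = (addr >> (OFFSET_BITS + INDEX_BITS)) & ((1 << INDEX_BITS) - 1)
    return directory_index, table_index, offset
```

With this split, `split_linear_address(0x00403025)` yields directory index 1, table index 3, and offset `0x25`.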
Internal architecture of 80386
Internal registers of 80386
Intel 486 processor (1989)
• Increased parallel execution by using a five-stage pipeline
• 8-Kbyte on-chip first-level cache
• Integrated x87 FPU
• Power saving and system management capabilities
Intel Pentium processor (1993)
• Added a second execution pipeline to achieve superscalar performance (u and v pipelines, together executing up to two instructions per clock)
• Split on-chip caches (8-KByte code cache and 8-KByte data cache)
• Data cache uses the MESI (coherence) protocol
• Branch prediction with an on-chip branch table
• Internal data paths : 128 and 256 bits
• External data bus : 64 bits
• Later enhanced by MMX technology, which uses the SIMD execution model
FIGURE 3-28 Processor model for the Pentium. The BIU supplies instructions to the CPU via two pipelines called the u and v pipes. In addition, two separate 8K data and code caches are provided.
The U and V Pipes
• U and V pipes : dual five-stage pipelines
• Prefetcher and queue units provide paired instructions for the U and V pipes
• U pipe : executes all Pentium instructions
• V pipe : executes only simple integer instructions (data is already in the CPU registers); sorting of instructions is performed by the prefetcher
• With two pipelines and two ALUs, the Pentium can execute two instructions simultaneously (in one clock cycle)
• Condition : both instructions must be simple and must not depend on each other (no data dependency)
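The pairing condition can be illustrated with a deliberately simplified check. This is a sketch of the two conditions named on the slide only (both instructions simple, no data dependency); the real Pentium pairing rules have more cases. The `Instr` record and `can_pair` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    text: str            # assembly text, for display only
    dest: str            # register written by the instruction
    srcs: tuple          # registers read by the instruction
    simple: bool = True  # assumption: caller marks which instructions are "simple"


def can_pair(u: Instr, v: Instr) -> bool:
    """Return True if u (U pipe) and v (V pipe) may issue in the same clock."""
    if not (u.simple and v.simple):
        return False
    reads_result_of_u = u.dest in v.srcs      # read-after-write dependency
    writes_same_register = u.dest == v.dest   # write-after-write dependency
    return not (reads_result_of_u or writes_same_register)
```

Under this model, `mov eax, ebx` pairs with `add ecx, edx`, but not with `add eax, 1`, because the second instruction needs the value the first one produces.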
Superpipelined vs. Superscalar
Superpipelining : divide the instruction execution pipeline into smaller stages.
[ex] 5-stage pipeline (80486, Pentium); 12-stage pipeline (P6 processors)
Superscalar : execute two or more instructions per clock cycle by using multiple execution units (including ALUs).
[ex] Pentium executes two instructions simultaneously = 2-way superscalar
Pentium II, III & Celeron : 3-way superscalar
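A rough way to compare the two ideas is an idealized cycle count: a pipeline needs as many cycles as it has stages to fill, and then completes one group of instructions per cycle. The model below is a textbook simplification, not taken from the slides; it ignores stalls, branches, and dependencies, and it does not capture the higher clock rate that a deeper pipeline allows. `ideal_cycles` is an illustrative name.

```python
import math


def ideal_cycles(instructions: int, stages: int, width: int = 1) -> int:
    """Cycles to run `instructions` on an ideal pipeline.

    stages: pipeline depth (e.g. 5 for the 486/Pentium, 12 for P6)
    width:  instructions issued per clock (1 = scalar, 2 or 3 = superscalar)
    """
    if instructions <= 0:
        return 0
    groups = math.ceil(instructions / width)
    return stages + groups - 1
```

For 100 instructions this gives 104 cycles on a scalar 5-stage pipeline, 54 cycles on a 2-way 5-stage pipeline, and 45 cycles on a 3-way 12-stage pipeline.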
MMX (Multimedia Extension) : provides 2 architectural enhancements over the non-MMX Pentium
① 57 instructions are added for multimedia (audio, video, and graphic data) applications.
② SIMD (Single-Instruction stream, Multiple-Data stream) allows the same operation to be performed on multiple data items. Because many multimedia applications require large blocks of data to be manipulated, SIMD provides a significant performance enhancement.
For general applications, performance improves by about 10~20%. For multimedia applications, it improves by nearly 70%.
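The SIMD idea can be emulated in a few lines: one operation is applied to every lane of a packed 64-bit value. The sketch below models a wraparound packed byte add in the style of the MMX `PADDB` instruction, using a plain Python integer as a stand-in for a 64-bit MMX register; `packed_add_bytes` is an illustrative name, and real hardware performs all eight lane additions at once rather than in a loop.

```python
LANES = 8        # eight 8-bit lanes in one 64-bit register
LANE_BITS = 8
LANE_MASK = 0xFF


def packed_add_bytes(a: int, b: int) -> int:
    """Add two 64-bit values lane by lane, with wraparound inside each lane."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        x = (a >> shift) & LANE_MASK
        y = (b >> shift) & LANE_MASK
        # Masking the sum keeps a carry from spilling into the next lane.
        result |= ((x + y) & LANE_MASK) << shift
    return result
```

For instance, `packed_add_bytes(0x0102030405060708, 0x1010101010101010)` gives `0x1112131415161718`, and adding `0x01` to a lane holding `0xFF` wraps that lane to `0x00` without disturbing its neighbour.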
SIMD Execution Model
P6 family processors (1995-1999)
• Intel Pentium Pro processor
– Three-way superscalar : decodes, dispatches, and completes execution of (retires) three instructions per clock cycle on average
– Introduced dynamic execution (micro-data flow analysis, out-of-order execution, superior branch prediction, and speculative execution) in a superscalar implementation
– Enhanced by caches : two on-chip 8-Kbyte 1st-level caches and a 256-Kbyte 2nd-level cache on a second chip in the same package
– 36 address lines, giving max. 64 GB of memory
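The memory limits quoted throughout these slides all follow from the number of address lines: n lines can select 2^n bytes. The short sketch below reproduces the figures; the 24-line entry for the 286 is an assumption inferred from its 16 MB limit rather than a number given on the slides, and the function names are illustrative.

```python
def addressable_bytes(address_lines: int) -> int:
    """Number of bytes selectable with the given number of address lines."""
    return 1 << address_lines


def format_size(num_bytes: int) -> str:
    """Render a power-of-two byte count using binary KB/MB/GB units."""
    units = ["bytes", "KB", "MB", "GB", "TB"]
    index = 0
    while num_bytes >= 1024 and num_bytes % 1024 == 0 and index < len(units) - 1:
        num_bytes //= 1024
        index += 1
    return f"{num_bytes} {units[index]}"


# Address-line counts: 8086 (20), 286 (24, assumed), 386 (32), Pentium Pro (36)
for lines in (20, 24, 32, 36):
    print(lines, "lines:", format_size(addressable_bytes(lines)))
```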
FIGURE 1-14 The Pentium Pro is two chips in one. The larger die is the processor, the smaller a 256K L2 cache. (Courtesy of Intel Corporation.)
Dynamic Execution : a new approach to processing software instructions that reduces idle processor time
• Multiple Branch Prediction : the Pentium Pro can look as far as 30 instructions ahead to anticipate conditional branches, reducing wasted pipeline clocks
• Data Flow Analysis : looks at upcoming software instructions to find the optimal sequence of processing
• Speculative Execution : allows instructions to be executed in a different order from that in which they entered the processor (“out-of-order execution”). The results of these instructions are stored as speculative results until their final states can be determined
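Data flow analysis can be pictured as starting each instruction as soon as its inputs are ready, rather than in program order. The toy scheduler below is only a sketch under strong assumptions: unlimited execution units, no false dependencies (as if registers were renamed), and made-up latencies. It does not model how the P6 hardware actually tracks or retires instructions, and `dataflow_schedule` is a hypothetical name.

```python
def dataflow_schedule(program):
    """Compute (start, finish) cycles for each instruction in dataflow order.

    program: list of (dest_register, source_registers, latency_in_cycles)
    An instruction starts once every source register has been produced.
    """
    ready_at = {}   # register -> cycle at which its latest value is available
    schedule = []
    for dest, srcs, latency in program:
        start = max((ready_at.get(reg, 0) for reg in srcs), default=0)
        finish = start + latency
        schedule.append((start, finish))
        ready_at[dest] = finish
    return schedule


# Illustrative three-instruction program with assumed latencies.
program = [
    ("r1", (), 3),            # load r1 from memory
    ("r2", ("r1", "r3"), 1),  # add  r2 = r1 + r3  (must wait for the load)
    ("r4", ("r5", "r6"), 2),  # mul  r4 = r5 * r6  (independent of the load)
]
```

Here `dataflow_schedule(program)` returns `[(0, 3), (3, 4), (0, 2)]`: the third instruction finishes before the second even starts, which is the out-of-order behaviour the slide describes.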
P6 family processors (cont’d)
• Pentium II processor
– Added Intel MMX technology
– Processor core is packaged in the single edge contact cartridge (SECC)
– 1st-level (L1) caches are enlarged (16 Kbytes each)
– 2nd-level (L2) cache sizes of 256 KB, 512 KB, and 1 MB are supported
– A half-clock-speed backside bus connects the 2nd-level cache and the processor
– Multiple low-power states such as AutoHALT, Stop-Grant, Sleep, and Deep Sleep are supported to conserve power when idle
P6 family processors (cont’d)
• Pentium II Xeon processor
– Offers 4-way and 8-way multiprocessor scalability and a 2-Mbyte 2nd-level cache running on a full-clock-speed backside bus
• Intel Celeron processor
– Focused on the value PC market
– Pentium II without L2 cache
– Uses the Slot 1 connector without the plastic cover (the so-called “naked CPU”)
Celeron Board
P6 family processors (cont’d)
• Celeron A : Includes 128 KB L2 cache on the same die as the processor.
– Drawback : 66 MHz bus clock
– 370-pin PGA package (called Socket 370)
P6 family processors (cont’d)
• Pentium III processor
– Introduced Streaming SIMD Extensions (SSE) : expands the SIMD execution model by providing a new set of 128-bit registers and the ability to perform SIMD operations on packed single-precision floating-point values
• Pentium III Xeon processor
– Enhanced with a full-speed, on-die Advanced Transfer Cache
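A 128-bit register holds four 32-bit single-precision values, so one SSE operation works on four floats at a time. The sketch below emulates that layout with the standard `struct` module, in the style of the SSE `ADDPS` instruction; it is an emulation for illustration, with 16-byte strings standing in for registers, and the function names are made up.

```python
import struct

LANES = 4                 # 128 bits / 32 bits per single-precision float
PACKED_FORMAT = "<4f"     # four little-endian 32-bit floats = 16 bytes


def pack_single(values) -> bytes:
    """Pack four Python floats into a 16-byte (128-bit) value."""
    if len(values) != LANES:
        raise ValueError("exactly four values are required")
    return struct.pack(PACKED_FORMAT, *values)


def add_packed_single(a: bytes, b: bytes) -> bytes:
    """Add two packed values lane by lane, rounding each sum to single precision."""
    xs = struct.unpack(PACKED_FORMAT, a)
    ys = struct.unpack(PACKED_FORMAT, b)
    return struct.pack(PACKED_FORMAT, *(x + y for x, y in zip(xs, ys)))
```

Adding the packed values `[1.0, 2.0, 3.0, 4.0]` and `[0.5, 0.5, 0.5, 0.5]` this way produces the four lanes `1.5, 2.5, 3.5, 4.5` from a single call.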
Pentium III with integrated L2 cache (more than 22 million transistors)
2.1.7 Pentium 4 Processor Family (2000-2005)
• Based on Intel NetBurst microarchitecture
• Introduced Streaming SIMD Extensions 2 (SSE2)
• Pentium 4 processor 3.40 GHz supports Hyper-Threading Technology and Streaming SIMD Extensions 3 (SSE3)
• Pentium 4 Processor Extreme Edition supports Intel Extended Memory 64 Technology and Hyper-Threading Technology
• Pentium 4 Processor 6xx series supports Intel Extended Memory 64 Technology
Streaming SIMD Extensions 2 (SSE2)
• Horizontal Data Movement in ADDSUBPD
2.1.8 Intel Xeon Processor (2001-2005)
• Based on Intel NetBurst microarchitecture
• As a family, this group of IA-32 processors is designed for use in multiprocessor server systems and high-performance workstations
• Intel Xeon processor MP supports Hyper-Threading Technology
• 64-bit Intel Xeon processor 3.60 GHz with 800 MHz System Bus introduced Intel Extended Memory 64 Technology
2.1.9 Intel Pentium M Processor (2003-2005)
• Low-power mobile processor family
• Designed for extending battery life and seamless integration
• Its extended microarchitecture includes:
– Support for Dynamic Execution
– Low-power core with copper interconnect
– On-die, primary 32-KB instruction cache and 32-KB write-back data cache, and second-level 2 MB cache with Advanced Transfer Cache Architecture
– Advanced Branch Prediction and Data Prefetch Logic
– Support for MMX technology, Streaming SIMD Extensions (SSE), and the SSE2 instruction set
Intel Pentium Processor Extreme Edition (2005)
• Introduced dual-core technology that provides advanced H/W multi-threading support
• Based on Intel NetBurst microarchitecture
• Supports SSE, SSE2, SSE3, Hyper-Threading Technology, and Intel Extended Memory 64 Technology
The Processor War