Talk02 David Scott

  • 7/28/2019 Talk02 David Scott

    1/29

    HPC@Intel: Platforms and Technology

    CCGSC, September 10, 2006

    Dr. David Scott

    Petascale Product Line

    [email protected]

    2/29

    Legal Disclaimer

    Information in this document is provided in connection with Intel products.

    No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel's Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications.

    Intel may make changes to specifications and product descriptions at any time, without notice.

    Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.

    This document contains information on products in the design phase of development. The information here is subject to change without notice. Do not finalize a design with this information.

    Intel Xeon, Pentium 4, Itanium, Itanium 2, Prescott, Prestonia, Nocona, Jayhawk, Potomac, Tulsa, and Dempsey processors may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

    Contact your local Intel sales office or your distributor to obtain the latest specifications before placing your product order.

    Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's website at http://www.intel.com/.

    Intel, Itanium, Xeon and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

    3/29

    AGENDA

    New Processors

    New HPC focused platforms

    Technologies for the future

    4/29

    Core-Duo Processors

    Let's Take a Look Inside

    5/29

    Historical Driving Forces

    [Figure: two log-scale plots over 1970-2020. "Shrinking Geometry": feature size (um), axis 0.01 to 10. "Increased Performance via Increased Frequency": frequency (MHz), axis 1 to 100000.]

    1946: 20 Numbers in Main Memory

    1971: I4004 Processor, 2300 Transistors

    2005: 65nm, 1B+ Transistors

    6/29

    The Challenges

    [Figure, panel "Power Limitations": CPU power (W), log axis 10 to 1000, 1990-2015.]

    [Figure, panel "Diminishing Voltage Scaling": supply voltage (V), log axis 0.1 to 10, time axis starting at 1990; annotated "~30%".]

    Process generations shown: 0.7um, 0.5um, 0.35um, 0.25um, 0.18um, 0.13um, 90nm, 65nm, 45nm, 30nm

    Power = Capacitance x Voltage^2 x Frequency

    also Power ~ Voltage^3
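The slide does not spell out how the second power relation follows from the first. A common reading, assumed here rather than stated in the source, is that achievable frequency scales roughly linearly with supply voltage, so substituting frequency ~ voltage into the first relation gives power ~ voltage cubed. The Python sketch below illustrates that reasoning; the function names and all numeric values are hypothetical.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """Dynamic power in watts: capacitance x voltage^2 x frequency."""
    return capacitance_f * voltage_v ** 2 * frequency_hz


def scaled_power(base_power_w, voltage_scale):
    """Power after scaling supply voltage by voltage_scale.

    Assumption (not from the slide): frequency scales linearly with
    voltage, so power scales with the cube of the voltage factor.
    """
    return base_power_w * voltage_scale ** 3


# Hypothetical operating point: 1 nF switched capacitance, 1.2 V, 3 GHz.
base = dynamic_power(1e-9, 1.2, 3e9)

# Scale voltage and frequency together by 0.7 and compare with the cubic rule.
direct = dynamic_power(1e-9, 1.2 * 0.7, 3e9 * 0.7)
cubic = scaled_power(base, 0.7)
```

Under the linear-frequency assumption the two results agree: reducing voltage (and with it frequency) to 70% cuts dynamic power to roughly a third.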

    7/29

    Intel Core Microarchitecture

    Scalable, Low Power, High Performance

    65nm

    Merom: Mobile Optimized

    Conroe: Desktop Optimized

    Woodcrest: Server Optimized

    Intel Wide Dynamic Execution

    Intel Intelligent Power Capability

    Intel Advanced Smart Cache

    Intel Smart Memory Access

    Intel Advanced Digital Media Boost

    *Graphics not representative of actual die photo or relative size

    8/29

    Intel Wide Dynamic Execution

    [Block diagram, identical for CORE 1 and CORE 2:]

    INSTRUCTION FETCH AND PRE-DECODE

    INSTRUCTION QUEUE

    RETIREMENT UNIT (REORDER BUFFER)

    DECODE

    RENAME / ALLOC

    SCHEDULERS

    EXECUTE

    EACH CORE

    4 WIDE - DECODE TO EXECUTE

    4 WIDE - MICRO-OP EXECUTE

    MICRO and MACRO FUSION

    DEEPER BUFFERS

    EFFICIENT 14 STAGE PIPELINE

    ENHANCED ALUs

    9/29

    Intel Intelligent Power Capability

    Ultra Fine Grained to Coarse Grained:

    Aggressive Clock Gating

    Low VCC Arrays

    Blocks Controlled Via Sleep Transistors

    Enhanced Speed-Step

    Transistor: Low Leakage Transistors, Sleep Transistors

    Process: 65nm, Strained Silicon, Low-K Dielectric, More Metal Layers

    *Graphics not representative of actual die photo or relative size

    10/29

    Intel Performance Leadership for Life Sciences

    Woodcrest single thread relative performance compared to Opteron*

    Higher is better

    [Bar chart: geomeans of relative performance, axis 0.0 to 2.0. Applications: Gaussian(3,4), GAMESS(3,4), Amber(3,4), GROMACS(2,4), NAMD(2,6), HMMER(1,4), BLAST(1,5), ClustalW-MPI(2,4). Groups: Computational Chemistry, Bioinformatics. Numbers in parentheses refer to the system configurations listed below.]

    Intel outperforms AMD across all applications tested

    (1) Woodcrest: Dual-Core Intel Xeon processor, 2-socket sys., 3.0GHz, 4MB L2 cache, 4GB Memory

    (2) Woodcrest: Dual-Core Intel Xeon processor, 2-socket sys., 3.0GHz, 4MB L2 cache, 8GB Memory

    (3) Woodcrest: Dual-Core Intel Xeon processor, 2-socket sys., 3.0GHz, 4MB L2 cache, 16GB Memory

    (4) Dual-Core AMD* Opteron* processor 280, 2-socket sys. 2.4GHz, 1MB L2 cache, 16GB Memory

    (5) Dual-Core AMD* Opteron* processor 285, 2-socket sys. 2.6GHz, 1MB L2 cache, 4GB Memory

    (6) AMD* Opteron* processor 252, 2-socket sys. 2.6GHz, 1MB L2 cache, 16GB Memory

    Source: Intel Internal Measurement

    * Other brands and names may be claimed as the property of others.

    Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, reference http://www.intel.com/performance/resources/benchmark_limitations.htm or call (U.S.) 1-800-628-8686 or 1-916-356-3104.

    11/29

    Core Microarchitecture Advances With Quad Core

    Energy Efficient Performance

    Quad Core

    Kentsfield: Desktop

    Clovertown: Server

    [Chart: DP Performance Per Watt Comparison with SPECint_rate at the Platform Level. Vertical axis 1X to 4X; time axis H1 05, H2 05, H1 06, H2 06, with Clovertown at H1 07. Processors shown: Irwindale, Paxville DP, Dempsey MV, Woodcrest, Clovertown.]

    Source: Intel

    *Graphics not representative of actual die photo or relative size

    12/29

    AGENDA

    New Processors

    New HPC focused platforms

    Technologies for the future

    13/29

    Motivation

    Caretta & Port Townsend:

    Provide a higher memory BW / FLOP option than DP Xeon

    Provide a less expensive option than DP Xeon

    Atoka:

    High Density DP solution

    Metrics

    Performance

    Core: we lead

    Bus: close (depends on STREAM binaries etc.) + 2x cache size

    Performance / Watt

    We lead

    Performance / SqFt

    We match

    Performance / $

    We lead
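The "memory BW / FLOP" motivation can be made concrete with a rough calculation. The Python sketch below is a hypothetical illustration, not Intel's methodology. It assumes a 64-bit (8-byte) front-side bus data path and treats the FSB figures listed on the board feature slides (800, 1066, 1333) as millions of transfers per second. The core count, clock, and FLOPs per cycle in the usage lines are placeholder values chosen only for illustration; none of them come from the talk.

```python
FSB_BYTES_PER_TRANSFER = 8  # assumption: 64-bit FSB data path


def fsb_peak_bandwidth_gbs(fsb_mts):
    """Peak FSB bandwidth in GB/s for a transfer rate given in MT/s."""
    return fsb_mts * 1e6 * FSB_BYTES_PER_TRANSFER / 1e9


def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Peak floating-point rate in GFLOP/s for one socket."""
    return cores * clock_ghz * flops_per_cycle


def bytes_per_flop(fsb_mts, cores, clock_ghz, flops_per_cycle):
    """Peak memory bytes available per peak floating-point operation."""
    return fsb_peak_bandwidth_gbs(fsb_mts) / peak_gflops(
        cores, clock_ghz, flops_per_cycle
    )


# Placeholder socket: 2 cores at 3.0 GHz, 2 FLOPs per cycle per core.
for fsb in (800, 1066, 1333):
    ratio = bytes_per_flop(fsb, cores=2, clock_ghz=3.0, flops_per_cycle=2)
    print(f"{fsb} FSB: {fsb_peak_bandwidth_gbs(fsb):.1f} GB/s, "
          f"{ratio:.2f} bytes/FLOP")
```

The ratio rises when the same bus feeds fewer or slower floating-point units, which is one way a single-socket board could offer more bandwidth per FLOP than a dual-socket one sharing comparable bus bandwidth.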

    14/29

    Caretta Features

    HPC BOARD FEATURES

    Single Intel Pentium-D processor (Presler, Smithfield)

    Support for Pentium 4 (CedarMill)

    Chipset: Mukilteo + ICH7

    4 DIMM (max 8GB) - DDR2 533/667 with U-ECC

    800 MHz FSB

    Integrated 2 port SATA2 with RAID 0/1

    2xGbE (TekoaE + Tabor)

    2xUSB2 external

    Rear video & serial port

    Internal headers: serial, 2xUSB2, I2C

    Custom 5.95 x 13, 6 layer

    Custom power connector

    Client Management iAMT via TekoaE

    [Board layout labels: MCH, GbE, ICH, Video, CPU, Memory]

    15/29

    Port Townsend Features

    HPC BOARD FEATURES

    Single Intel PentiumD processor (Conroe, Kentsfield)

    Chipset: Mukilteo2 + ICH7

    4 DIMM (max 8GB) - DDR2 533/667 with U-ECC

    1066 FSB

    PCIex8 support for IB MemFree card & SFF GbE card

    Integrated 2 port SATA2 with RAID 0/1

    2xGbE (Tekoa + TekoaE)

    2xUSB2 external (crash cart)

    Rear video & serial port

    Internal headers: serial (3pin), 2xUSB2, I2C

    Custom 5.95 x 13, 6 layer

    Custom power connector

    Client Management iAMT via TekoaE

    [Board layout labels: PCI-Ex8, GbE, Memory, CPU, VRD, MCH, ICH]

    16/29

    AtokaV Features

    HPC BOARD FEATURES

    Dual Intel Xeon processor (WC, CTN)

    Chipset: Greencreek + ESB2

    8 FBD (max 32GB) - DDR2 533/667

    1333 FSB

    PCIex8 slot

    Mellanox IB 4x DDR single port down

    Integrated 2 port SATA2 with RAID 0/1

    2xGbE (Gilgal)

    2xUSB2 external (crash cart)

    Rear video & serial port

    Internal headers: serial (3pin), 1xUSB2, I2C

    Custom 6.5 x 16.5

    Custom power connector

    Client Management via IPMI module / GbE port

    Support for 32Mbit flash & embedded Linux

    [Board layout labels: GbE, Memory, CPU, VRD, MCH, ESB2, PCI-Ex8, CPU, IB]

    17/29

    Pics

    Port Townsend 1U side-by-side reference chassis

    18/29

    Pics

    Port Townsend 4U Blade Can

    Port Townsend AC Blade

    19/29

    AGENDA

    New Processors

    New HPC focused platforms

    Technologies for the future

    20/29

    Today's Packaging Technology

    Multi-Chip Package

    Wire-Bonded Stacked Die

    [Diagram labels: CPU, DRAM; DRAM, Flash, CPU]

    21/29

    3D Stacking Research

    Wafer Stacking

    [Diagram labels: CPU, DRAM, Top Thin Wafer, Bottom Wafer, Thru-Silicon Via, Metal lines on backside of thin wafer, Bonding Interface, Bonding Structures]

    Source: Intel

    22/29

    3D Stacking Research

    Die Stacking

    [Diagram labels: Analog, Flash, CPU, DRAM, DRAM; Die 1 through Die 7; Pkg. Substrate, Metal Pad, Via]

    Source: Intel

    23/29

    Chip-to-Chip Signaling Challenge

    24/29

    The Opportunity of Silicon Photonics

    Enormous ($ billions) CMOS infrastructure, process learning, and capacity

    Draft continued investment in Moore's law

    Potential to integrate multiple optical devices

    Micromachining could provide smart packaging

    Potential to converge computing & communications

    To benefit from this, optical wafers must run alongside existing product.

    25/29

    Intel's Silicon Photonics Research

    First: Innovate to prove silicon is a viable optical material

    First Continuous Silicon Laser (Nature 2/17/05)

    1GHz (Nature 04)

    10 Gb/s (05)

    26/29

    Silicon Photonics

    Laser

    Filter

    Modulator

    Passive Alignment

    CMOS Circuitry

    Photodetector

    27/29

    Silicon Photonics Future Vision

    Data Center Fabrics

    Backplane and Display Interconnects

    Chemical Analysis

    Medical Lasers

    Chip-to-Chip Interconnects
