L15: VLSI Integration and Performance Transformations. 6.111 Spring 2008, Introductory Digital Systems Laboratory


  • L15: 6.111 Spring 2008, Introductory Digital Systems Laboratory

    L15: VLSI Integration and Performance Transformations

    Acknowledgements:
    • Lecture prepared by Professor Anantha Chandrakasan
    • Lecture material adapted from J. Rabaey, A. Chandrakasan, B. Nikolic, “Digital Integrated Circuits: A Design Perspective”, Copyright 2003 Prentice Hall/Pearson
    • Curt Schurgers

    Outline:
    • Moore’s Law and Integration
    • IC Design: ASIC Design, Clocks, Test
    • Performance Transformations
    • Trends

  • Cost of Transistor

    [Figure: average cost of one transistor in dollars, log scale from 0.0000001 to 10, plotted for 1968 through 2002. Source: Gordon Moore, Keynote Presentation at ISSCC 2003.]

  • Moore’s Law

    In 1965, Gordon Moore was preparing a speech and made a memorable observation. When he started to graph data about the growth in memory chip performance, he realized there was a striking trend. Each new chip contained roughly twice as much capacity as its predecessor, and each chip was released within 18-24 months of the previous chip. If this trend continued, he reasoned, computing power would rise exponentially over relatively brief periods of time.

    [Figure: log2 of the number of components per integrated function (0 to 16), plotted for 1959 through 1975. Source: Electronics, April 19, 1965.]

    [Figure: transistors shipped per year, log scale from 10^9 to 10^18, plotted for 1968 through 2002. Source: Dataquest/Intel, 12/02.]

    “…E.O. Wilson, the famous Harvard biologist who is an expert on ants, estimates that there are between 10 to the 16th and 10 to the 17th ants on earth. But if you look at this curve, this year we’re making one transistor for every ant.” – Gordon Moore, “An Update on Moore’s Law”

    Gordon Moore, Keynote Presentation at ISSCC 2003

    [Figure: transistors per chip in millions (MT), log scale from 0.001 to 1000, versus year, 1970 to 2010. Data points: 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® proc, P6. Annotation: 2X growth in 1.96 years! Courtesy of S. Borkar (Intel).]

  • Evolution of Transistor Integration

    Moore’s Law: transistor density doubles every 1.5 - 2 years
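    The doubling claim is easy to turn into numbers. The sketch below is illustrative only: the function names are made up for this example, and it assumes a clean exponential trend, which real data only approximates.

```python
import math

def growth_factor(years, doubling_years):
    """Multiplicative growth after `years` if the count doubles every `doubling_years`."""
    return 2.0 ** (years / doubling_years)

def doubling_time(year0, count0, year1, count1):
    """Doubling period implied by two (year, count) points on an exponential trend."""
    return (year1 - year0) / math.log2(count1 / count0)

if __name__ == "__main__":
    # Compare the 1.5-year and 2-year bounds with the 1.96-year figure quoted earlier.
    for period in (1.5, 1.96, 2.0):
        factor = growth_factor(10, period)
        print(f"doubling every {period} years -> x{factor:.1f} per decade")
```

    Small changes in the doubling period compound quickly: over a decade, the 1.5-year and 2-year bounds differ by roughly a factor of three in predicted density.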

    [Figure: two trends versus year, 1970 to 2006. Left axis: DRAM cost per bit in 1995 dollars (cost/function), log scale from 10^-1 down to 10^-9. Right axis: transistors per chip, log scale from 10^2 to 10^9. DRAM generations: 1K, 4K, 16K, 64K, 256K, 1M, 4M, 16M, 64M, 128M, 256M, 1G. Microprocessors: 4004, 8080, 8086, 80286, 80386, 80486, Pentium, Pentium Pro, Pentium II, Pentium III, Pentium IV. The curves are marked #1 and #2.]

  • Layout 101

    [Figure: N-channel and P-channel transistors shown three ways: 3-D cross-section, circuit representation, and layout. Labels: VDD, GND, source (S), gate (G), drain (D), length L and width W, n-type and p-type regions, n+ and p+ diffusion, metal, poly, contact from metal to ndiff, metal/pdiff contact.]

    Follow simple design rules (contract between process and circuit designers)

  • Custom Design/Layout

    [Figure: die photograph of the Itanium integer datapath, organized as bit slices 0 through 63. Labeled regions: adder stages 1, 2, and 3 with wiring between them, sum select, shifter, multiplexers, and loopback buses. Inputs arrive from the register files / cache / bypass; outputs go to the register files / cache.]

    Bit-slice Design Methodology

    • Hand crafting the layout to achieve maximum clock rates (> 1 GHz)
    • Exploits regularity in datapath structure to optimize interconnects

    [Figure: circuit of one integer execution unit, about 1000 um across. Blocks: 9-1, 9-1, 5-1, and 2-1 muxes, CARRYGEN, SUMGEN + LU (LU: Logical Unit), SUMSEL, REG. Signals: a, b, ck1, s0, s1, g64, node1, sum, sumb, and an output to cache.]

    Itanium has 6 integer execution units like this

  • The ASIC Approach

    [Figure: ASIC design flow, from design capture to tape-out.]

    • Behavioral (design capture): Verilog (or VHDL)
    • Structural: Logic Synthesis
    • Physical: Floorplanning, Placement, Routing
    • Tape-out
    • Checks alongside the flow: Pre-Layout Simulation, Circuit Extraction, Post-Layout Simulation
    • Design iteration loops feed results back to earlier steps

    Most common design approach for designs up to 500 MHz clock rates

  • Standard Cell Example

    3-input NAND cell (from ST Microelectronics), with its delay table. Delays are in ns, given as a function of:
    • C = load capacitance
    • T = input rise/fall time

    [Figure: cell layout showing the power supply line (VDD) and the ground supply line (GND).]

    Each library cell (FF, NAND, NOR, INV, etc.), in each of its size variations (drive strength of the gate), is fully characterized across temperature, loading, etc.
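    One common way to use such a characterization is a two-dimensional lookup table indexed by load capacitance and input rise/fall time, interpolated between grid points. The sketch below assumes that scheme; every number in the table is invented for illustration and does not come from the ST Microelectronics cell, and the names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical characterization grid for one cell; all values are made up.
LOAD_PF = [0.01, 0.05, 0.10]   # C: load capacitance (pF)
SLEW_NS = [0.05, 0.20, 0.50]   # T: input rise/fall time (ns)
DELAY_NS = [                   # DELAY_NS[i][j] is the delay at SLEW_NS[i], LOAD_PF[j]
    [0.10, 0.20, 0.30],
    [0.15, 0.25, 0.35],
    [0.25, 0.35, 0.45],
]

def _segment(axis, x):
    """Index i of the grid segment [axis[i], axis[i+1]] nearest to x, clamped to the grid."""
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))

def cell_delay_ns(load_pf, slew_ns):
    """Bilinear interpolation of the delay table (extrapolates outside the grid)."""
    i = _segment(SLEW_NS, slew_ns)
    j = _segment(LOAD_PF, load_pf)
    u = (slew_ns - SLEW_NS[i]) / (SLEW_NS[i + 1] - SLEW_NS[i])
    v = (load_pf - LOAD_PF[j]) / (LOAD_PF[j + 1] - LOAD_PF[j])
    d00, d01 = DELAY_NS[i][j], DELAY_NS[i][j + 1]
    d10, d11 = DELAY_NS[i + 1][j], DELAY_NS[i + 1][j + 1]
    return ((1 - u) * (1 - v) * d00 + (1 - u) * v * d01
            + u * (1 - v) * d10 + u * v * d11)

if __name__ == "__main__":
    print(cell_delay_ns(load_pf=0.03, slew_ns=0.125))
```

    A timing tool would hold one such table per cell, per input pin, and per operating corner, which is why each drive-strength variant has to be characterized separately.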

  • Standard Cell Layout Methodology

    [Figures: a layout in 2-level metal technology, and a layout in current-day technology with the cell structure hidden under interconnect layers.]

    • With limited interconnect layers, dedicated routing channels between rows of standard cells are needed
    • Width of the cell allowed to vary to accommodate complexity
    • Interconnect plays a significant role in speed of a digital circuit

  • Verilog to ASIC Layout (the push button approach)

    module adder64 (a, b, sum);
      input [63:0] a, b;
      output [63:0] sum;

      assign sum = a + b;
    endmodule

    [Figures: the adder64 layout after synthesis, after placement, and after routing.]
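    A small software reference model can help when checking what the synthesized adder64 should compute. The sketch below is an assumption-laden illustration, not part of the lecture: it models the 64-bit `sum` output by discarding the carry out of bit 63, and compares it with a ripple-carry chain of one-bit full adders. Ripple carry is used here only because it is the simplest structure; a synthesis tool is free to pick a different adder architecture.

```python
WIDTH = 64
MASK = (1 << WIDTH) - 1

def adder64_reference(a, b):
    """Value of a 64-bit `sum = a + b`: the carry out of bit 63 is dropped."""
    return (a + b) & MASK

def adder64_ripple(a, b):
    """Same sum built from 64 one-bit full adders, one per bit position."""
    carry, total = 0, 0
    for i in range(WIDTH):
        abit, bbit = (a >> i) & 1, (b >> i) & 1
        s = abit ^ bbit ^ carry                          # sum bit of a full adder
        carry = (abit & bbit) | (carry & (abit ^ bbit))  # carry into bit i + 1
        total |= s << i
    return total

if __name__ == "__main__":
    a, b = 0xFFFF_FFFF_FFFF_FFFF, 1
    print(hex(adder64_reference(a, b)), hex(adder64_ripple(a, b)))
```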

  • Macro Modules

    [Figure: 256×32 (or 8192 bit) SRAM generated by a hard-macro module generator.]

    • Generate highly regular structures (entire memories, multipliers, etc.) with a few lines of code
    • Verilog models for memories automatically generated based on size
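    The size parameters of a memory macro fix its port widths. As a rough illustration (the function name and the returned fields are hypothetical, not the interface of any real module generator), the 256×32 example works out as follows:

```python
def sram_geometry(words, bits_per_word):
    """Address width and capacity implied by a words x bits_per_word memory."""
    # Number of address bits needed to index `words` locations.
    address_bits = max(1, (words - 1).bit_length())
    return {
        "address_bits": address_bits,
        "data_bits": bits_per_word,
        "total_bits": words * bits_per_word,
    }

if __name__ == "__main__":
    print(sram_geometry(256, 32))
```

    For 256 words of 32 bits this gives an 8-bit address, a 32-bit data port, and 8192 bits of storage, matching the figure caption.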

  • Clock Distribution

    For a 1 GHz clock, the skew budget is 100 ps. Variations along different paths arise from:
    • Device: VT, W/L, etc.
    • Environment: VDD, °C
    • Interconnect: dielectric thickness variation

    [Figures: two D flip-flops (D, Q) clocked through different paths; clock skew, courtesy Alpha; IBM clock routing.]
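    To put the skew budget in proportion, it helps to compare it with the clock period. The helper names below are invented for this sketch.

```python
def clock_period_ps(freq_hz):
    """Clock period in picoseconds for a clock frequency in hertz."""
    return 1e12 / freq_hz

def skew_fraction(freq_hz, skew_ps):
    """Share of the clock period consumed by the skew budget."""
    return skew_ps / clock_period_ps(freq_hz)

if __name__ == "__main__":
    period = clock_period_ps(1e9)
    share = skew_fraction(1e9, 100)
    print(f"period = {period:.0f} ps, skew budget = {share:.0%} of the period")
```

    At 1 GHz the period is 1000 ps, so a 100 ps budget is a tenth of the cycle, which leaves little margin for the device, environment, and interconnect variations listed above.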

  • Analog Circuits: Clock Frequency Multiplication (Phase Locked Loop)

    Courtesy M. Perrott

    • VCO produces high frequency square wave
    • Divider divides down VCO frequency
    • PFD compares phase of ref and div
    • Loop filter extracts phase error information

    Used widely in digital systems for clock synthesis (a standard IP block in most ASIC flows)

    [Figures: PLL block diagram with up and down signals; Intel 486, 50 MHz, with the PLL labeled.]
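    The four blocks above form a feedback loop that settles when the divided VCO output matches the reference, so the VCO runs at the divide ratio times the reference frequency. The sketch below is a phase-domain toy model under strong assumptions: the loop filter is reduced to a plain gain, the VCO is linear, and all parameter values are invented. It is not the loop from the lecture figure.

```python
import math

def locked_vco_frequency(f_ref_hz, divide_by):
    """VCO frequency once the loop is locked (div frequency equals ref frequency)."""
    return f_ref_hz * divide_by

def simulate_pll(f_ref_hz, divide_by, f_free_hz, gain_hz_per_rad,
                 dt_s=1e-8, steps=2000):
    """Step a first-order loop forward in time; return the final VCO frequency in Hz."""
    err_rad = 0.0           # PFD output: phase(ref) minus phase(div)
    f_vco_hz = f_free_hz    # VCO starts at its free-running frequency
    for _ in range(steps):
        # Loop filter modeled as a pure gain (an assumption made for brevity).
        f_vco_hz = f_free_hz + gain_hz_per_rad * err_rad
        f_div_hz = f_vco_hz / divide_by   # divider output frequency
        # The phase error grows while ref runs faster than div and shrinks otherwise.
        err_rad += 2.0 * math.pi * (f_ref_hz - f_div_hz) * dt_s
    return f_vco_hz

if __name__ == "__main__":
    final = simulate_pll(f_ref_hz=10e6, divide_by=5, f_free_hz=40e6,
                         gain_hz_per_rad=1e6)
    print(f"simulated: {final / 1e6:.3f} MHz, "
          f"expected: {locked_vco_frequency(10e6, 5) / 1e6:.3f} MHz")
```

    With these made-up parameters the loop pulls the VCO from 40 MHz toward 5 × 10 MHz. A practical loop filter adds integration and extra poles, which this sketch leaves out.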

  • L15: 6.111 Spring 2008 14Introductory Digital Systems La