
  • IEICE TRANS. ELECTRON., VOL.E84–C, NO.12 DECEMBER 2001, p.1709

    INVITED PAPER Special Issue on Integrated Systems with New Concepts

    Superconnect Technology

    Takayasu SAKURAI†a), Regular Member

    SUMMARY Future electronic systems cannot be built with System-on-a-Chip (SoC) alone, since many SoC issues have become evident. Relatively low yield due to the larger die size and the huge investment needed to develop a process that embeds different kinds of technologies are some of these issues. Instead, superconnect technology is becoming more important as a viable solution for building electronic systems. The superconnect connects separately built and tested chips not through a printed circuit board but directly, to construct high-performance yet low-cost electronic systems, and may use design rules at around the 10 micron level. System-in-a-Package and stacked chips using interposers are some realizations of the superconnect. The superconnect will also be used to mitigate IR-drop problems and RC delay problems in global on-chip interconnect.
    key words: system-on-a-chip, system-in-a-package, superconnect, RC delay, interconnect

    1. Introduction

    The device count on a chip keeps increasing through the use of scaled-down transistors, as shown in Fig. 1, which helps to achieve high-performance VLSIs. Recently, the value-added System-on-a-Chip (SoC) has been attracting attention as a way to utilize the huge number of transistors available on one chip. An SoC is a VLSI that integrates complex system functions on a single chip, as shown in Fig. 2. An example of an SoC is a chip integrating a large-capacity DRAM and a processor. By combining several chips into one, the SoC can improve the power consumption and performance of an electronic system severalfold. Thus, the SoC has been pursued extensively over the past several years. In realizing SoCs, designs of functional blocks called IP (Intellectual Property) are shared and re-used among groups and companies. Exchange markets for IP are in operation.

    After extensive pursuit of the SoC, several fundamental issues have been recognized. Some of these issues are discussed below.

    (1) Undistributed IPs: Certain IPs such as CPUs and DSPs are sometimes not distributed in the form of IP and thus cannot be used by other semiconductor manufacturers in building SoCs.

    (2) Relatively low yield due to large die size: An SoC tends to use a larger die to integrate more functions on a chip, which lowers the yield.

    (3) Manufacturing cost increase: An SoC usually requires more masks and a higher manufacturing cost in order to combine different processes, such as a DRAM process and a FeRAM process, in manufacturing a die.

    (4) Upfront IP test cost: In order to make an IP really usable in SoC environments, the SoC manufacturer has to prove the complete functionality and test the timing and other margins of the IP before using it in a real target SoC. This gives rise to high upfront financial and engineering cost, which makes the SoC business risky.

    (5) IPs difficult to share and integrate: Some IPs such as DRAMs are process sensitive, and it is difficult to transfer their design data from one factory to another. Some IPs such as high-precision analog IPs are difficult to embed in noisy logic environments. These kinds of IPs hinder the one-chip implementation of an electronic system.

    (6) Complicated process: The extra mask count needed to mix different devices on a single chip is summarized in Fig. 3. As is clearly seen, if multiple memory types such as DRAM and FeRAM are to be embedded on one chip, the increase in mask count is prohibitively large. The financial investment and engineering cost of developing a process that combines different kinds of technologies, and of advancing that process every year, are prohibitively high, since several technologies will have to be used in building an electronic system in the future.

    Manuscript received March 1, 2001.
    †The author is with Center for Collaborative Research, and Institute of Industrial Science, University of Tokyo, Tokyo, 106-8558 Japan.
    a) E-mail: [email protected]

    Fig. 1 Technology trend.

    Fig. 2 System-on-a-Chip where complexity crisis for VLSI design is to be solved by re-use and sharing of designs as a form of IP and by designing in higher abstraction.

    Fig. 3 Mask count increase by combining different technologies on a single chip.

    2. Superconnect

    As summarized in the previous section, there are several fundamental issues associated with the SoC. On the other hand, building an electronic system only from packaged LSIs and printed circuit boards (PCBs) is not ideal either, since this conventional approach can only realize relatively low-performance, large-area, and power-consuming systems.

    Recently, however, a new high-density approach called 'superconnect' has been attracting attention [1]–[5], [7] as a solution to some of the SoC problems. The superconnect connects separately built and tested chips not through the PCB but directly, to construct high-performance yet low-cost electronic systems, and may use design rules at around the 10 micron level [1]. Sometimes LSIs in the superconnect are connected in a three-dimensional fashion to achieve higher performance and smaller geometry. The SoC is still an important approach to building electronic systems and will be pursued further, but the superconnect may complement it and may open up a new way of building electronic systems.

    There is a large gap between on-chip and off-chip interconnects in terms of power, density, performance, cost and turn-around time (TAT), as shown in Fig. 4. For example, in order to achieve a bandwidth of 1 Gbyte/s, about 1 W of power consumption is needed if it is implemented with off-chip interconnects, that is, inter-chip interconnects over a PCB, since analog high-performance I/O techniques have to be used. On the other hand, less than 10 mW is sufficient if it is realized with on-chip interconnects, since thousands of interconnects can be used in parallel and each on-chip interconnect has very low capacitance. As seen from the figure, a gap of two to three orders of magnitude exists between on-chip and off-chip interconnects in terms of PDAT (power, delay, area and TAT).

    Fig. 4 Gap between on-chip and off-chip interconnects in terms of power, density, performance, cost and turn-around-time.

    Fig. 5 Technology vacuum in terms of design rule and Superconnect technology that fills the vacuum.
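The power figures quoted above can be turned into energy per transferred bit. The following is a minimal arithmetic sketch, assuming that 1 Gbyte/s means 10^9 bytes per second and taking the on-chip figure at its 10 mW upper bound; the function name `energy_per_bit_pj` is introduced here for illustration only.

```python
def energy_per_bit_pj(power_w: float, bandwidth_bytes_per_s: float) -> float:
    """Energy spent per transferred bit, in picojoules."""
    bits_per_s = bandwidth_bytes_per_s * 8
    return power_w / bits_per_s * 1e12


BANDWIDTH = 1e9  # 1 Gbyte/s, assumed to mean 1e9 bytes per second

off_chip = energy_per_bit_pj(1.0, BANDWIDTH)    # about 1 W over the PCB
on_chip = energy_per_bit_pj(10e-3, BANDWIDTH)   # upper bound: less than 10 mW on chip

print(f"off-chip: {off_chip:.1f} pJ/bit")
print(f"on-chip : {on_chip:.2f} pJ/bit (upper bound)")
print(f"ratio   : at least {off_chip / on_chip:.0f}x")
```

With these numbers the power gap alone is at least two orders of magnitude, which is the low end of the range stated in the text.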

    Basically, the large gap comes from the big difference between the design rules of on-chip and off-chip interconnects, as shown in Fig. 5 [1]. It can be said that there is at present a technology vacuum between 1 µm level on-chip interconnect and 100 µm level off-chip interconnect. The superconnect will fill the gap between on-chip and off-chip interconnects, making use of 10 µm level design rules.

    The superconnect is a form of advanced assembly technology, and System-in-a-Package (SiP), shown in Fig. 6, is one realization of the superconnect. Other implementations include stacked chips in a package connected by bonding wires, and System-in-a-Cube, where epoxy-molded layers of different chips are stacked to make a cube. It should be noted that a primitive form of superconnect, where stacked chips are connected by wire bonding and/or normal face-down bump technology, is already commercialized and offers low cost and short TAT. The number of vertically stacked chips in these primitive technologies, however, is limited to two, while more advanced technologies allow three or more chips to be stacked vertically.

    Fig. 6 System-on-a-Chip vs. System-in-a-Package.

    Fig. 7 Superconnect example based on three-dimensional assembly through interposers [4].

    One superconnect implementation uses a thin base material called an interposer to connect chips in a highly dense manner, as shown in Fig. 7 [4]. It is claimed that this technology makes it possible to stack and connect 8 memory chips in a cavity 1 mm in height. Thus, it is possible to realize a memory card with 8 times the present capacity without advancing the LSI process technology, which would need huge investment and time.

    Chips can be placed in either a 2D or a 3D arrangement. With three-dimensional assembly, the number of devices within a given Manhattan distance increases, as shown in Fig. 8. In other words, shorter connections among devices, and thus higher performance, are expected from the three-dimensional placement of the chips.

    Fig. 8 More devices exist in closer space in 3-D assembly, making communication among devices faster.
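The benefit of 3D placement can be illustrated by counting how many device sites lie within a given Manhattan distance of a reference device. The sketch below is an idealized model, not taken from the paper: it assumes devices sit on a unit grid and that the vertical pitch between stacked layers equals the lateral pitch. The function names are hypothetical.

```python
from itertools import product


def sites_within(distance: int, dims: int) -> int:
    """Count integer grid points whose Manhattan distance from the origin is <= distance."""
    axis = range(-distance, distance + 1)
    return sum(1 for point in product(axis, repeat=dims)
               if sum(abs(c) for c in point) <= distance)


def closed_form_2d(d: int) -> int:
    return 2 * d * d + 2 * d + 1


def closed_form_3d(d: int) -> int:
    return (4 * d ** 3 + 6 * d ** 2 + 8 * d + 3) // 3


for d in (1, 2, 5, 10):
    n2, n3 = sites_within(d, 2), sites_within(d, 3)
    print(f"distance {d:2d}: 2D sites = {n2:4d}, 3D sites = {n3:5d}, gain = {n3 / n2:.1f}x")
```

In this model the reachable count grows with the square of the distance in 2D and with the cube in 3D, so the advantage of stacking widens as the wiring budget grows. A real stack with only a few layers would see a smaller gain.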

    There are still issues with System-in-a-Package and/or the superconnect, which are summarized below.

    (1) Special design tools for placement and routing for co-design of LSIs and assembly: CAD environments should now provide ways not only to build chips but also to build electronic systems. Co-design of LSIs and assembly is an emerging field that needs new tools.

    (2) High-density reliable substrate and metallization technology: The technologies used to manufacture 10 µm level interconnects and vertical connections have not been settled yet and are still being sought, although several promising proposals have been made.

    (3) Known good die (KGD): Good dies have to be picked before they are assembled using the superconnect. At-speed testing of bare dies has been difficult with wafers and probing needles due to high parasitic inductance and capacitance.

    The KGD problem described above has been a difficult technical issue, and this is one of the reasons why the Multi-Chip Module (MCM) did not fulfill its promise. Recently, however, it has been shown that one way of solving the KGD problem is to use an interposer.

    In stacked-chip systems, heat removal is one of the key issues. It is difficult to remove heat from the middle of the stacked chips. One remedy is to insert metal layers placed specifically to remove the heat. Another remedy is to reduce the generation of heat. Thus, low-power design of LSIs is said to be important in superconnect environments.

    Various implementations are possible for the superconnect, ranging from primitive stacked chips connected by bonding wires and mounted in a package to the more advanced 3D integration [3], [4]. Thus, the key technologies needed to build the superconnect depend heavily on its specific form. Some forms need vertical vias through the Si substrate [3], some need interposer technology [4] and some need 10 µm-level metal bumps [9]. At the moment, it is still too early to compare the various forms of superconnect or to predict which will win the technology race.

    Fig. 9 Interconnect determines power, delay, area and turn-around-time.

    Some point technologies for building the superconnect already exist. What is important in the course of research and development of the superconnect is a vision of the final superconnect system and the amalgamation of point technologies to achieve the final goal. In any case, 10 µm-level superconnect technology still needs more research effort, although the primitive forms of superconnect are already available.

    3. Interconnect Issues in Giga-Scale Integration in Relation to Superconnect

    The interconnect crisis is depicted in Fig. 9. Not transistors but interconnects will determine the cost, delay, power, reliability and turn-around time (TAT) of future VLSIs. Some of the design issues of deep submicron interconnect are as follows. The higher current gives rise to static and dynamic IR voltage drop problems and to reliability degradation due to electromigration. The smaller geometry and denser patterns lead to an increase in RC delay and to signal integrity problems such as high crosstalk noise and large delay fluctuation due to capacitive coupling among adjacent lines. The higher speed causes inductance-related issues and electromagnetic interference (EMI) problems.

    A high operating current will be required in the future for high-performance VLSIs, as shown in Fig. 10. This kind of high current develops a voltage drop on the power supply lines due to the resistance of the lines. Even a chip with moderate power consumption needs very thick metal, around 10 µm, to keep the IR drop within an acceptable level, as shown in Fig. 11. This kind of thick metal can be implemented in a package or on the interposer. This is exactly what the superconnect can provide. In the future, area pads and co-design of a VLSI and its package/interposer will become necessary.
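The dependence of IR drop on metal thickness follows from Ohm's law and the resistance of a uniform rectangular conductor, R = ρL/(W·t). The sketch below is illustrative only: the copper resistivity is an approximate textbook value, and the current, rail length and rail width are hypothetical numbers chosen for the example, not values from the paper. A real power grid is a mesh with many parallel paths, so this single-rail model only shows the scaling.

```python
RHO_CU = 1.7e-8  # approximate resistivity of copper in ohm*m (assumed metal)


def ir_drop(current_a, length_m, width_m, thickness_m, rho=RHO_CU):
    """Voltage drop across a uniform rectangular power line carrying current_a."""
    resistance = rho * length_m / (width_m * thickness_m)
    return current_a * resistance


# Hypothetical rail: 1 A flowing over 10 mm through a 100 um wide line.
for t_um in (1.0, 10.0):
    v = ir_drop(1.0, 10e-3, 100e-6, t_um * 1e-6)
    print(f"thickness {t_um:4.1f} um -> IR drop {v * 1e3:.0f} mV")
```

Under these assumptions, going from 1 µm to 10 µm of metal cuts the drop by a factor of ten, which is why a thick layer in the package or interposer is attractive for power distribution.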

    Efforts are being made to lower the resistance and capacitance of interconnects. Still, the interconnect delay is a big headache in designing an interconnect system with deep submicron wires. If we use an interconnect with the minimum cross-section, the signal cannot propagate over a distance of 1 mm in one clock cycle, as shown in Fig. 12. The RC delay problem can be mitigated by the buffer insertion technique described below, where it is also shown that buffer insertion increases the power of the interconnect system. After that, it is shown that the superconnect provides another way to reduce the RC delay without increasing power consumption.

    Fig. 10 High current is to be carried through power supply lines in the future.

    Fig. 11 Thick metal layer is required to mitigate IR drop problems.

    Fig. 12 RC delay is increasing. For example, the interconnect delay of a 1 mm wire would exceed the clock period in the near future.

    Fig. 13 Delay optimization through buffer insertion.

    Here, an analysis of the RC delay of a buffered interconnect system is discussed. Although buffer insertion theory has been studied extensively, the emphasis here is on power from a theoretical standpoint, which is new and closely related to the superconnect. The effect of junction capacitance is also considered. The analysis is also helpful in understanding how the delay behaves under technology scaling. It is known that the delay of a long interconnect can be reduced by inserting buffers (sometimes called repeaters). The delay of an unbuffered interconnect can be approximately expressed as below.

    \[ t_{0.5} \approx 0.377\, R_{INT} C_{INT} + 0.693\, (R_T C_T + R_T C_J + R_T C_{INT} + R_{INT} C_T), \]

    where $C_{INT}$ is the capacitance of the interconnect, $R_{INT}$ is the resistance of the interconnect, $C_T$ is the gate capacitance of the load, $R_T$ is the effective drain resistance of the driving transistor and $C_J$ is the drain junction capacitance of the driving transistor (see Fig. 13). If the interconnect is divided into $k$ sections and $(k-1)$ buffers are inserted, the total delay of the buffered interconnect system is expressed as follows.
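The unbuffered delay expression can be evaluated directly. The following sketch implements it as written; the per-millimetre wire values and the driver and load values are hypothetical numbers chosen only to show how the wire term grows with the square of the length.

```python
def unbuffered_delay(r_int, c_int, r_t, c_t, c_j=0.0):
    """50% delay of a driver, a distributed RC line and a capacitive load."""
    return 0.377 * r_int * c_int + 0.693 * (
        r_t * c_t + r_t * c_j + r_t * c_int + r_int * c_t)


# Hypothetical technology values, for illustration only.
R_PER_MM = 100.0      # wire resistance in ohm per mm
C_PER_MM = 0.2e-12    # wire capacitance in farad per mm
R_T, C_T, C_J = 1e3, 2e-15, 2e-15   # driver resistance, load and junction capacitance

for length_mm in (1, 2, 4, 8):
    t = unbuffered_delay(R_PER_MM * length_mm, C_PER_MM * length_mm, R_T, C_T, C_J)
    wire_only = 0.377 * (R_PER_MM * length_mm) * (C_PER_MM * length_mm)
    print(f"L = {length_mm} mm: total {t * 1e12:7.1f} ps, wire RC term {wire_only * 1e12:6.1f} ps")
```

Since both the wire resistance and the wire capacitance scale with length, the first term quadruples every time the length doubles, which is the motivation for splitting a long wire into buffered sections.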

    \[ \mathrm{Delay} \approx k \left[ p_1 \frac{R_{INT0} L}{k} \frac{C_{INT0} L}{k} + p_2 \left( \frac{R_0}{h} h C_0 + \frac{R_0}{h} h C_{J0} + \frac{R_0}{h} \frac{C_{INT0} L}{k} + \frac{R_{INT0} L}{k} h C_0 \right) \right] \]

    where $h$ denotes the gate size of the inserted buffer. $C_0$ is the gate capacitance of the minimum-width transistor, $R_0$ is the gate effective resistance of the minimum-width transistor and $C_{J0}$ is its drain junction capacitance. $R_{INT0}$ is the interconnect resistance and $C_{INT0}$ is the interconnect capacitance per unit length. What should be optimized here are $k$ and $h$, so as to minimize the delay expressed in the above formula. By differentiating with respect to $h$ and $k$ and setting the derivatives equal to zero, it is easy to obtain the optimum $h$, $h_{OPT}$, and the optimum $k$, $k_{OPT}$, as follows.

    \[ \frac{\partial\, \mathrm{Delay}}{\partial h} = 0 \;\rightarrow\; h_{OPT} = \sqrt{\frac{C_{INT0} R_0}{R_{INT0} C_0}} \]

    \[ \frac{\partial\, \mathrm{Delay}}{\partial k} = 0 \;\rightarrow\; k_{OPT} = L \sqrt{\frac{p_1}{p_2}} \sqrt{\frac{R_{INT0} C_{INT0}}{R_0 (C_0 + C_{J0})}} \]

    \[ l_{OPT} = \frac{L}{k_{OPT}} = \sqrt{\frac{p_2}{p_1}} \sqrt{\frac{R_0 (C_0 + C_{J0})}{R_{INT0} C_{INT0}}} = \sqrt{\frac{p_2}{p_1}} \sqrt{\frac{\tau_{MOS}}{\tau_{INT0}}} \]

    Here, $\tau_{INT0}$ ($= R_{INT0} C_{INT0}$) is a time constant of the interconnect per unit length and $\tau_{MOS}$ ($= R_0 (C_0 + C_{J0})$) is a time constant of the inserted buffer, which is proportional to the logic gate delay at a given technology node. $L$ is the total length of the interconnect and $l_{OPT}$ is the optimum length of a section of the interconnect. It is interesting to note that the optimum buffer size, $h_{OPT}$, and $l_{OPT}$ are not functions of the interconnect length $L$ and depend only on the technology. This means that, if the technology is fixed, there is an optimum buffer section that is independent of the total length of the interconnect, and the optimum buffer insertion is a cascade connection of this optimum buffer section.
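The closed-form optimum can be checked numerically against the buffered delay expression. The sketch below treats k as a continuous variable, as the differentiation does, and uses hypothetical technology values (per-millimetre wire parameters and minimum-transistor parameters) that are not taken from the paper.

```python
import math

P1, P2 = 0.377, 0.693  # coefficients for the 50% delay


def buffered_delay(k, h, L, r_int0, c_int0, r0, c0, cj0):
    """Delay of a wire of length L split into k sections driven by buffers of size h."""
    r_sec = r_int0 * L / k   # wire resistance of one section
    c_sec = c_int0 * L / k   # wire capacitance of one section
    r_buf = r0 / h           # output resistance of a buffer of size h
    per_section = P1 * r_sec * c_sec + P2 * (
        r_buf * h * c0 + r_buf * h * cj0 + r_buf * c_sec + r_sec * h * c0)
    return k * per_section


def optimum(L, r_int0, c_int0, r0, c0, cj0):
    """Closed-form h_OPT, k_OPT and l_OPT from the expressions above."""
    h_opt = math.sqrt(c_int0 * r0 / (r_int0 * c0))
    l_opt = math.sqrt(P2 / P1) * math.sqrt(r0 * (c0 + cj0) / (r_int0 * c_int0))
    return h_opt, L / l_opt, l_opt


# Hypothetical values: ohm/mm, F/mm, ohm, F, F. For illustration only.
TECH = dict(r_int0=100.0, c_int0=0.2e-12, r0=10e3, c0=1e-15, cj0=1e-15)

for length_mm in (5.0, 10.0, 20.0):
    h_opt, k_opt, l_opt = optimum(length_mm, **TECH)
    d = buffered_delay(k_opt, h_opt, length_mm, **TECH)
    print(f"L={length_mm:5.1f} mm  h_OPT={h_opt:6.1f}  k_OPT={k_opt:5.2f}  "
          f"l_OPT={l_opt:.3f} mm  delay={d * 1e12:.0f} ps")
```

The printed h_OPT and l_OPT stay the same for every length while k_OPT and the delay grow linearly with L, which matches the observation that the optimum section depends only on the technology. In practice k must be rounded to an integer.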

    Then the optimized delay of the buffered system is expressed as below.

    \[ \mathrm{Delay}_{OPT} = 2 \left( \sqrt{p_1 p_2} + p_2 \sqrt{\frac{C_0}{C_0 + C_{J0}}} \right) L \sqrt{R_{INT0} C_{INT0} R_0 (C_0 + C_{J0})} \]

    \[ \approx 2.4\, L \sqrt{\tau_{INT0} \tau_{MOS}} \quad (\text{when } C_{J0} = 0) \]

    \[ \approx 2.0\, L \sqrt{\tau_{INT0} \tau_{MOS}} \quad (\text{when } C_{J0} = C_0) \]

    $p_1$ is 0.377 and $p_2$ is 0.693 for the delay from zero to half of $V_{DD}$. With other values of $p_1$ and $p_2$, the same optimization is possible for the delay from zero to $0.9 V_{DD}$ or to other intermediate values. In this sense, the formula is quite general. The above expression is interesting in that the delay of the buffered interconnect system is proportional to the geometric mean of the interconnect time constant and the logic gate delay. Since the interconnect delay stays almost constant under scaling while the logic gate delay is expected to improve very rapidly as technology advances, the delay of the buffered interconnect system is expected to improve slowly, thanks to the speed improvement of the logic gates.
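The two numerical prefactors follow directly from the values of p1 and p2. A short check, assuming only the coefficient of the optimized delay expression above (the function name is introduced here for illustration):

```python
import math

P1, P2 = 0.377, 0.693


def delay_coefficient(cj0_over_c0: float) -> float:
    """Prefactor of L*sqrt(tau_INT0*tau_MOS) in the optimized buffered delay."""
    return 2.0 * (math.sqrt(P1 * P2) + P2 * math.sqrt(1.0 / (1.0 + cj0_over_c0)))


for ratio in (0.0, 1.0):
    print(f"CJ0 = {ratio:.0f} x C0 -> coefficient {delay_coefficient(ratio):.2f}")
```

The coefficient drops as the junction capacitance grows only because tau_MOS itself already contains C_J0; the absolute delay still increases with C_J0.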

    In the optimally buffered interconnect, the capacitance of the system increases due to the inserted buffers. The total capacitance of the buffers is expressed as follows.

    \[ \Delta p = k_{OPT}\, h_{OPT}\, (C_0 + C_{J0}) = \sqrt{\frac{p_1}{p_2}} \sqrt{\frac{C_0 + C_{J0}}{C_0}}\; C_{INT} \]

    \[ = 0.73\, C_{INT} \quad (\text{when } C_{J0} = 0) \]

    \[ = 1.04\, C_{INT} \quad (\text{when } C_{J0} = C_0) \]

    This means that the total capacitance is increased by 73–104% compared with the system without buffers. The increase in capacitance in turn increases power consumption.
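The capacitance overhead can be reproduced from the same two coefficients. The sketch below evaluates the ratio of added buffer capacitance to wire capacitance; the function name is hypothetical.

```python
import math

P1, P2 = 0.377, 0.693


def buffer_capacitance_ratio(cj0_over_c0: float) -> float:
    """Added buffer capacitance divided by the wire capacitance C_INT at the optimum."""
    return math.sqrt(P1 / P2) * math.sqrt(1.0 + cj0_over_c0)


for ratio in (0.0, 1.0):
    extra = buffer_capacitance_ratio(ratio)
    print(f"CJ0 = {ratio:.0f} x C0 -> added capacitance = {extra:.3f} x C_INT, "
          f"total = {1.0 + extra:.3f} x C_INT")
```

The computed ratios are close to the quoted 0.73 and 1.04, so optimal buffering roughly doubles the switched capacitance, and with it the dynamic power, of a long wire.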

    There is also a power-conscious buffer insertion scheme. If we fix the following quantity,

    \[ \eta = \frac{P_{TOTAL}}{P_{INT}} = \frac{C_{INT} + k h (C_0 + C_{J0})}{C_{INT}}, \]

    the buffer size, the number of sections and the delay are expressed as follows. $P_{TOTAL}$ is the total power of the buffered interconnect system, which is proportional to the total capacitance, and $P_{INT}$ is the power consumed by the bare interconnect without the buffers, which is proportional to $C_{INT}$.

    \[ \frac{h}{h_{OPT}} = \sqrt{\frac{\eta (\eta - 1)\, p_2 C_0}{C_p}} \]

    \[ \frac{k}{k_{OPT}} = \sqrt{\frac{(\eta - 1)\, C_p}{\eta\, p_1 (C_0 + C_{J0})}} \]

    \[ \frac{\mathrm{Delay}}{\mathrm{Delay}_{OPT}} = \frac{1}{\sqrt{p_1 (C_0 + C_{J0}) / C_0} + \sqrt{p_2}} \sqrt{\frac{\eta\, C_p}{(\eta - 1)\, C_0}} \]

    where

    \[ C_p = p_1 (C_0 + C_{J0}) + p_2 C_0 (\eta - 1). \]

    In Fig. 13, the results are plotted. It is possible to reduce the power increase caused by the inserted buffers at the expense of delay, but the power increases anyway.

    Fig. 14 Interconnect delay in buffered interconnect system.

    Fig. 15 RC delay of global interconnect.

    The delay can be reduced by the buffer insertion technique as shown in Fig. 14, but the power increases due to the inserted buffers. Another way to decrease the interconnect delay, without increasing power, is to use a thicker and wider metal layer provided by superconnect technology. If a thick metal layer is available, which could be a layer in a package or on an interposer, then an interconnect with a 6 µm × 2 µm high-aspect-ratio cross-section reduces the RC delay to the point where the signal can propagate across a chip within one clock cycle, as shown in Fig. 15. In contrast to the buffer insertion approach, this approach does not increase capacitance and hence power. The drawback is the lower wiring density.
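The effect of a larger cross-section on the distributed RC delay can be estimated from the wire term 0.377·R·C used earlier. The sketch below is a rough illustration under stated assumptions: copper resistivity of about 1.7e-8 Ω·m, the same capacitance per unit length (0.2 pF/mm) for both wires, a 20 mm route, and a hypothetical 0.4 µm × 0.2 µm minimum on-chip wire. Only the 6 µm × 2 µm cross-section comes from the text.

```python
RHO_CU = 1.7e-8        # approximate copper resistivity in ohm*m (assumed metal)
C_PER_M = 0.2e-9       # assumed wire capacitance: 0.2 pF/mm expressed in F/m


def wire_rc_delay(length_m, width_m, thickness_m, c_per_m=C_PER_M, rho=RHO_CU):
    """Distributed RC delay (50% point) of an unbuffered wire, ignoring driver and load."""
    resistance = rho * length_m / (width_m * thickness_m)
    capacitance = c_per_m * length_m
    return 0.377 * resistance * capacitance


LENGTH = 20e-3  # hypothetical cross-chip route of 20 mm

thin = wire_rc_delay(LENGTH, 0.4e-6, 0.2e-6)   # hypothetical minimum on-chip wire
thick = wire_rc_delay(LENGTH, 6e-6, 2e-6)      # 6 um x 2 um superconnect-level wire

print(f"thin wire : {thin * 1e9:.2f} ns")
print(f"thick wire: {thick * 1e12:.1f} ps")
print(f"speed-up  : {thin / thick:.0f}x")
```

With equal capacitance per unit length, the delay scales inversely with the cross-sectional area, so the gain equals the area ratio. No buffers are added, so the switched capacitance stays that of the wire itself. The price, as noted above, is that each such wire occupies far more routing area.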

    4. Conclusion

    In this paper, the need for new assembly schemes is summarized and a new trend called superconnect is described. The superconnect connects separately built and tested chips not through the PCB but directly, to construct high-performance yet low-cost electronic systems, and may use design rules at around the 10 µm level. System-in-a-Package and stacked chips using interposers are some realizations of the superconnect. The superconnect not only opens up ways to build higher-performance electronic systems but can also be used to solve LSI interconnect problems such as IR drop and RC delay.

    The SoC will still be an important approach, and with the help of the superconnect, the SoC can realize more valuable electronic systems.

    At the moment, it is still too early to compare the various forms of superconnect or to predict which will win the technology race, but there will be a promising future in this new system-level integration.

    To close the paper, I would like to add an excerpt from the International Technology Roadmap for Semiconductors [6]: “There is an increased awareness in the industry that assembly and packaging is becoming a differentiator in product development.”

    Acknowledgement

    The author would like to thank Prof. Koyanagi for the opportunity to write this article, and Prof. Suga and Dr. Matsuzawa for the discussions.

    References

    [1] M. Kimura, “Superconnect: 21st Century LSI Production and Design Method,” Nikkei Microdevices, no.180, pp.62–79, June 2000.

    [2] J. Burns, L. McIlrath, C. Keast, C. Lewis, A. Loomis, K. Warner, and P. Wyatt, “Three-dimensional integrated circuits for low-power, high-bandwidth systems on a chip,” ISSCC Digest of Tech. Papers, pp.268–269, Feb. 2001.

    [3] M. Koyanagi, Y. Nakagawa, K.W. Lee, T. Nakamura, Y. Yamada, K. Inamura, K. Park, and H. Kurino, “Neuromorphic vision chip fabricated using three-dimensional integration technology,” ISSCC Digest of Tech. Papers, pp.270–271, Feb. 2001.

    [4] K. Ohsawa, H. Odaira, M. Ohsawa, S. Hirade, T. Iijima, and S.G. Pierce, “3-D assembly interposer technology for next-generation integrated systems,” ISSCC Digest of Tech. Papers, pp.272–273, Feb. 2001.

    [5] A. Naeemi, C.S. Patel, M.S. Bakir, Z. Ha, K.P. Martin, and J.D. Meindl, “Sea of leads: A disruptive paradigm for a system-on-a-chip (SoC),” ISSCC Digest of Tech. Papers, pp.280–281, Feb. 2001.

    [6] International Technology Roadmap for Semiconductors, ITRS’99, p.213, 1999.

    [7] H. Goldstein, “Packages Go Vertical,” IEEE Spectrum, pp.46–51, Aug. 2001.

    [8] A. Matsuzawa, Private communication, 2001.

    Takayasu Sakurai received the B.S., M.S. and Ph.D. degrees in EE from University of Tokyo, Japan, in 1976, 1978, and 1981, respectively. In 1981 he joined Toshiba Corporation, where he designed CMOS DRAM, SRAM and BiCMOS ASICs. He also worked on interconnect delay and capacitance modeling, known as the Sakurai model, and on the alpha power-law MOS model. From 1988 through 1990, he was a visiting researcher at Univ. of Calif., Berkeley, doing research in the field of VLSI CAD. Back at Toshiba from 1990, he managed RISC, media processor and MPEG2 LSI designs. Since 1996, he has been a professor at the Institute of Industrial Science, University of Tokyo, working on low-power and high-performance system LSI designs. Prof. Sakurai has served as a conference chair for the Symposium on VLSI Circuits, and as a program committee member for ISSCC, CICC, DAC, ICCAD, FPGA workshop, ISLPED, VLSI-TSA, ICVC, ASPDAC and other international conferences.