# ECE 5745 Complex Digital ASIC Design Topic 5: Automated Design Methodologies

#### **Christopher Batten**

School of Electrical and Computer Engineering Cornell University

http://www.csl.cornell.edu/courses/ece5745


## Part 1: ASIC Design Overview





## **Design Principles in Automated Methodologies**

- Modularity: Use modularity to enable mixing different custom and automated methodologies
- Hierarchy: Use hierarchy to more efficiently handle automatically transforming large designs
- Encapsulation: Automated methodologies place a significantly higher emphasis on encapsulation in all domains (behavioral, structural, physical) at all levels of abstraction: architecture, register-transfer, and gate level
- Regularity: Use regularity to create automated tools to generate structures like datapaths and memories
- Extensibility: Automated methodologies enable more highly parameterized and flexible implementations improving extensibility

## **Agenda**

- Standard-Cell-Based Design
- System-on-Chip Platform-Based Design
- Gate-Array-Based Design
- Programmable Logic-Based Design
- Reprogrammable Logic-Based Design
- Comparison and Hybrids














Adapted from [Rabaey'02]



Adapted from [Weste'11,Rabaey'02]


## **Standard-Cell Libraries**

| Gate Type                            | Variations                                                                                        | Options                                                                          |
|--------------------------------------|---------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
| Inverter/Buffer/Tristate Buffers |  | Wide range of power options: 1x, 2x, 4x, 8x, 16x, 32x, 64x minimum-size inverter |
| NAND/AND                             | 2–8 inputs                                                                                        | High, normal, low power                                                          |
| NOR/OR                               | 2–8 inputs                                                                                        | High, normal, low power                                                          |
| XOR/XNOR                             |                                                                                                   | High, normal, low power                                                          |
| AOI/OAI                              | 21, 22                                                                                            | High, normal, low power                                                          |
| Multiplexers                         | Inverting/noninverting                                                                            | High, normal, low power                                                          |
| Adder/Half Adder                     |                                                                                                   | High, normal, low power                                                          |
| Latches                              |                                                                                                   | High, normal, low power                                                          |
| Flip-Flops                           | D, with and without synch/asynch set and reset, scan                                              | High, normal, low power                                                          |
| I/O Pads | Input, output, tristate, bidirectional, boundary scan, slew rate limited, crystal oscillator | Various drive levels (1–16 mA) and logic levels |

Adapted from [Weste'11]





### **Standard-Cell Electrical Characterization**

- Characterization computes cell parameters (e.g., delay, output current) as functions of input variables (output load, input slew, etc.)
- Characterization is performed for various combinations of operating conditions: process, voltage, temperature (also called PVT corners)




#### **ST Microelectronics – 3-Input NAND Gate**



| Path            | 1.2V - 125°C             | 1.6V - 40°C              |
|-----------------|--------------------------|--------------------------|
| In1, $t_{pLH}$  | $0.073 + 7.98C + 0.317T$ | $0.020 + 2.73C + 0.253T$ |
| In1, $t_{pHL}$  | $0.069 + 8.43C + 0.364T$ | $0.018 + 2.14C + 0.292T$ |
| In2, $t_{pLH}$  | $0.101 + 7.97C + 0.318T$ | $0.026 + 2.38C + 0.255T$ |
| In2, $t_{pHL}$  | $0.097 + 8.42C + 0.325T$ | $0.023 + 2.14C + 0.269T$ |
| In3, $t_{pLH}$  | $0.120 + 8.00C + 0.318T$ | $0.031 + 2.37C + 0.258T$ |
| In3, $t_{pHL}$  | $0.110 + 8.41C + 0.280T$ | $0.027 + 2.15C + 0.223T$ |

3-input NAND cell (from ST Microelectronics): C = load capacitance, T = input rise/fall time

Adapted from [Rabaey'02]
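Each table entry is a linear model of the form delay = a + b·C + c·T. The sketch below evaluates a few of those entries at two corners to show how strongly the corner shifts the delay. The function name, the dictionary layout, and the sample values of C and T are illustrative assumptions; the excerpt does not state units, so none are assumed here.

```python
# Coefficients (a, b, c) of delay = a + b*C + c*T, copied from the table above.
# Only the In1 rows are included to keep the sketch short.
NAND3_DELAY = {
    ("1.2V-125C", "In1", "tpLH"): (0.073, 7.98, 0.317),
    ("1.2V-125C", "In1", "tpHL"): (0.069, 8.43, 0.364),
    ("1.6V-40C", "In1", "tpLH"): (0.020, 2.73, 0.253),
    ("1.6V-40C", "In1", "tpHL"): (0.018, 2.14, 0.292),
}


def nand3_delay(corner, pin, edge, load_cap, input_transition):
    """Evaluate the linear delay model for one path at one corner."""
    a, b, c = NAND3_DELAY[(corner, pin, edge)]
    return a + b * load_cap + c * input_transition


# Hypothetical operating point (same units as the table, whatever they are).
slow = nand3_delay("1.2V-125C", "In1", "tpLH", load_cap=0.1, input_transition=0.2)
fast = nand3_delay("1.6V-40C", "In1", "tpLH", load_cap=0.1, input_transition=0.2)
print(f"slow corner: {slow:.4f}, fast corner: {fast:.4f}")
```

With the same load and input transition, the low-voltage, high-temperature corner gives the larger delay, which is why timing is checked across corners rather than at a single operating point.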

#### **Standard-Cell Electrical Characterization (.lib)**

```
/* Characterization for a 3-input NAND gate */
cell (NAND3X0) {
  /* Overall characterization */
  cell_footprint : "nand3x0";
  area : 7.3728;
  cell_leakage_power : 9.151417e+04;
  /* Characterization for input pin IN1 */
  pin (IN1) {
    direction : "input";
    /* Fixed input capacitance */
    capacitance : 2.190745;
    /* Transient capacitance values */
    fall_capacitance : 2.212771;
    rise_capacitance : 2.168719;
    /* ... remainder of pin IN1 elided ... */
  }
  /* ... remainder of cell elided ... */
}
```


The `.lib` characterization of `cell (NAND3X0)`, `pin (IN1)` continues with an `internal_power` group. It gives the internal switching power when IN2 and IN3 are zero, expressed as a function of input slew, in two tables: `rise_power ("power_inputs_1")` and `fall_power ("power_inputs_1")`. Only fragments of the table entries survive in this excerpt (0.0160000, 0.0320000, -1.2594251, 1.9791286).

The `.lib` characterization of `cell (NAND3X0)` also describes the output `pin (QN)`:

- `direction : "output"`
- `function : "(IN3*I…"`: the logic expression of the output as a function of the inputs (truncated in this excerpt)
- `timing ()` group with `related_pin : "IN1"`, containing a `cell_rise` table
  - `index_1`: input slew
  - `index_2`: load capacitance
  - `values`: delay as a function of input signal slew and load capacitance (only fragments survive, e.g. 0.0374970 and 0.0414275)
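A `cell_rise` table indexed by input slew and load capacitance only stores delays at discrete grid points. One common way to use such a table is to interpolate between the surrounding entries. The sketch below shows bilinear interpolation over a small table. The function name and all table numbers are hypothetical, not values from the NAND3X0 library, and real timing tools may interpolate or extrapolate differently.

```python
from bisect import bisect_right


def interp_2d(index_1, index_2, values, slew, load):
    """Bilinear interpolation in a table indexed by input slew and load cap."""

    def bracket(axis, x):
        # Pick the grid segment [i, i+1] that contains x. Points outside the
        # axis range reuse the nearest end segment (linear extrapolation).
        i = bisect_right(axis, x) - 1
        i = max(0, min(i, len(axis) - 2))
        frac = (x - axis[i]) / (axis[i + 1] - axis[i])
        return i, frac

    i, fs = bracket(index_1, slew)
    j, fl = bracket(index_2, load)
    # Interpolate along the load axis on both slew rows, then along slew.
    low = values[i][j] * (1 - fl) + values[i][j + 1] * fl
    high = values[i + 1][j] * (1 - fl) + values[i + 1][j + 1] * fl
    return low * (1 - fs) + high * fs


# HYPOTHETICAL 2x2 table: rows follow SLEW, columns follow LOAD.
SLEW = [0.0, 1.0]
LOAD = [0.0, 2.0]
DELAY = [[0.10, 0.30],
         [0.20, 0.40]]

mid_delay = interp_2d(SLEW, LOAD, DELAY, slew=0.5, load=1.0)
print(f"{mid_delay:.3f}")
```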



Figure panels: Layout View, Abstract Physical View

Adapted from [Melikyan]


#### SAED 90 nm Library – NAND Gate Data Sheet



Figure 10.6. Logic Symbol of NAND

#### Table 10.11. NAND Truth Table (n=2,3,4)

| IN1 | IN2 | … | INn | QN |
|-----|-----|---|-----|----|
| 0   | X   | … | X   | 1  |
| X   | 0   | … | X   | 1  |
| …   | …   | … | …   | 1  |
| X   | X   | … | 0   | 1  |
| 1   | 1   | 1 | 1   | 0  |

Operating conditions: Temp = 25 °C. Csl = capacitive standard load.

| Cell Name | Cload     | Prop Delay (Avg) (ps) | Leakage Power (nW; VDD = 1.2 V DC, Temp = 25 °C) | Dynamic Power (nW/MHz) | Area (µm<sup>2</sup>) |
|-----------|-----------|-----------------------|--------------------------------------------------|------------------------|-----------------------|
| NAND2X0   | 0.5 x Csl | 140                   | 38                                               | 3583                   | 5.5296                |
| NAND2X1   | 1 x Csl   | 132                   | 78                                               | 5208                   | 5.5296                |
| NAND2X2   | 2 x Csl   | 126                   | 157                                              | 9191                   | 9.2160                |
| NAND2X4   | 4 x Csl   | 125                   | 314                                              | 17902                  | 14.7456               |
| NAND3X0   | 0.5 x Csl | 128                   | 91                                               | 5331                   | 7.3728                |
| NAND3X1   | 1 x Csl   | 192                   | 102                                              | 12200                  | 11.9808               |
| NAND3X2   | 2 x Csl   | 212                   | 155                                              | 19526                  | 12.9024               |
| NAND3X4   | 4 x Csl   | 241                   | 260                                              | 44937                  | 15.6672               |
| NAND4X0   | 0.5 x Csl | 147                   | 106                                              | 5357                   | 8.2944                |
| NAND4X1   | 1 x Csl   | 178                   | 161                                              | 15214                  | 12.9024               |

Adapted from [SAED'11]


#### System-on-Chip (SoC)

- Brings together: standard cell blocks, custom analog blocks, processor cores, memory blocks
- Standardized on-chip buses (or hierarchical interconnect) permit "easy" integration of many blocks.
  - Ex: AMBA, Sonics, ...
- "IP Block" business model: Hard- or soft-cores available from third party designers.
- ARM, Inc. is the shining example: hard and "synthesizable" RISC processors.
- ARM and other companies provide Ethernet and USB controllers, analog functions, memory blocks, ...



Pre-verified block designs and standard bus interfaces (or adapters) ease integration, lowering NRE costs and shortening time to market (TTM).

Adapted from [Asanovic'11]





# **SoC Platform-Based Design**

"Only the consumer gets freedom of choice; designers need freedom *from* choice." – Orfali

- A platform is a restriction on the space of possible implementation choices, providing a well-defined abstraction of the underlying technology for the application developer
- New platforms are defined at the architecture/microarchitecture boundary
- Key to such approaches is the representation of communication in the platform model



## **Gate-Array-Based Design**



- Can cut mask costs by prefabricating arrays of transistors on wafers
- Only customize metal layer for each design



- Fixed-size unit transistors
- Metal connections personalize design

Two kinds:

- Channeled Gate Arrays
  - Leave space between rows of transistors for routing
- Sea-of-Gates
  - Route over the top of unused transistors

Adapted from [Terman'02]






#### **Gate-Array Example – 3-input NAND Gate**



Adapted from [Weste'11]








#### **Programming a PROM**



$f_0 = x_0 x_1 + x_2$

$f_1 = x_0 x_1 x_2 + \overline{x_2} + \overline{x_0} x_1$

Adapted from [Rabaey'02]
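A PROM implements logic functions by storing their full truth table: the inputs form the address and each stored word holds the outputs. As a minimal sketch, the code below "programs" an 8-word table with the two functions exactly as written above and then reads it back. The bit ordering (x0 as the least-significant address bit, f0 as the least-significant data bit) and all names are assumptions made for illustration.

```python
def f0(x0, x1, x2):
    # f0 = x0 x1 + x2, as written above
    return (x0 & x1) | x2


def f1(x0, x1, x2):
    # f1 = x0 x1 x2 + NOT(x2) + NOT(x0) x1
    return (x0 & x1 & x2) | (1 - x2) | ((1 - x0) & x1)


# "Program" the PROM: one 2-bit word per input combination.
PROM = []
for addr in range(8):
    x0, x1, x2 = addr & 1, (addr >> 1) & 1, (addr >> 2) & 1
    PROM.append((f1(x0, x1, x2) << 1) | f0(x0, x1, x2))


def prom_read(x0, x1, x2):
    """Look up (f0, f1) by using the inputs as the PROM address."""
    word = PROM[x0 | (x1 << 1) | (x2 << 2)]
    return word & 1, (word >> 1) & 1


print(prom_read(1, 1, 0))
```

Changing either function only changes the stored words; the lookup hardware, modeled here by `prom_read`, stays the same.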





Adapted from [Weste'11]

### Modern Complex Programmable Logic Devices



Xilinx CoolRunner-II CPLD:

- 2–32 PLAs integrated on a single chip
- PLAs support 40 inputs, 56 product terms, and 16 output terms
- PLAs chained together through a programmable "Advanced Interconnect Matrix"

Adapted from [Xilinx'08]

#### **CPLD Macro Cell**

Figure: CPLD macrocell. Surviving labels: Direct Input from I/O Block, PTA, PTC, CTS, CTR, GSR, GND, D/T, CE, CK, Latch, Feedback to AIM, To I/O Block.




## **Field-Programmable Gate-Arrays**

- Two-dimensional array of simple logic and interconnection blocks.
- Typical architecture: LUTs implement any function of n inputs (n = 3 in this case).
- Optional Flip-flop with each LUT.

Figure labels: configuration memory, flip-flop, interconnect, logic.

- Fuses, EPROM, or Static RAM cells are used to store the "configuration".
  - Here, it determines function implemented by LUT, selection of Flip-flop, and interconnection points.
- Many FPGAs include special circuits to accelerate adder carry-chain and many special cores: RAMs, MAC, Enet, PCI, SERDES, ...
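The role of the configuration memory can be sketched in a few lines: a 3-input LUT is an 8-entry table whose contents decide which function it computes, and one more configuration bit selects whether the output comes from the optional flip-flop. The class and method names below are hypothetical, and the model ignores interconnect entirely.

```python
class LutCell:
    """3-input LUT with an optional output flip-flop (illustrative model)."""

    def __init__(self, config_bits, use_flipflop=False):
        if len(config_bits) != 8:
            raise ValueError("a 3-input LUT needs 2**3 = 8 configuration bits")
        self.config_bits = list(config_bits)  # the LUT's "configuration"
        self.use_flipflop = use_flipflop      # flip-flop selection bit
        self.state = 0                        # flip-flop contents

    def lut(self, a, b, c):
        # The inputs simply address the configuration memory.
        return self.config_bits[a | (b << 1) | (c << 2)]

    def clock(self, a, b, c):
        """Clock edge: capture the current LUT output in the flip-flop."""
        self.state = self.lut(a, b, c)

    def output(self, a, b, c):
        return self.state if self.use_flipflop else self.lut(a, b, c)


# Same cell type, two different configurations.
xor3 = LutCell([(i & 1) ^ ((i >> 1) & 1) ^ ((i >> 2) & 1) for i in range(8)])
maj3_reg = LutCell([1 if bin(i).count("1") >= 2 else 0 for i in range(8)],
                   use_flipflop=True)

print(xor3.output(1, 0, 0))
```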





Adapted from [Weste'11]



#### **Operation Binding Time**

|                    | Full Custom | Standard Cell | SoC Platform | Gate Array  | Prog. Logic  | Reprog. Logic | µproc/DSP    |
|--------------------|-------------|---------------|--------------|-------------|--------------|---------------|--------------|
| Operation bound at | First Mask  | First Mask    | First Mask   | Metal Masks | Fuse Program | Load Config   | Load Program |

The spectrum runs from "Hardware" on the left to "Software" on the right, with binding time getting later from left to right.

- The earlier an operation is bound, the less area, delay, and energy the implementation requires
- The later an operation is bound, the more flexible the device

Adapted from [Asanovic'11]

## **Comparison of Specific Design Methodologies**

| Design Method | NRE | Unit Cost | Power Dissipation | Implementation Complexity | Time to Market | Performance | Flexibility |
|------------------|-------|--------------|---------------|---------------|-------------------|-------|-------|
| Full Custom      | VHigh | Low          | Low           | High          | High              | VHigh | Low   |
| Standard Cell    | High  | Low          | Low           | High          | High              | High  | Low   |
| SoC Platform     | High  | Low          | Low           | Med           | High              | High  | Med   |
| Gate Array       | Med   | Med          | Low           | Med           | Med               | Med   | Med   |
| Prog Logic       | Low   | High         | Med           | Low           | Low               | Med   | Med   |
| Reprog Logic     | Low   | High         | Med           | Med           | Low               | High  | High  |
| µProc/DSP        | Low   | High         | High          | Low           | Low               | Low   | VHigh |

#### **ASIC vs. FPGA**

- Argument
  - ASIC: High NRE (\$2M for …), low marginal cost, best efficiency
  - FPGA: Low NRE, high marginal cost, lower efficiency
  - Cross-over point: around 10,000
- Current Trends
  - ASIC: Increasing NRE (\$40M for 90 nm chip) due to design costs, verification costs, mask costs, etc.
  - FPGA: Better able to track Moore's law, integrating fixed-function blocks
  - Cross-over point: around 100,000

Figure: Total Cost curves for ASIC and FPGA.
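The cross-over point follows from a simple cost model: total cost = NRE + marginal cost × volume, so the two curves meet where the NRE difference equals the accumulated marginal-cost difference. The sketch below works this out. Only the \$2M ASIC NRE comes from the slide; the FPGA NRE of zero and both per-unit costs are HYPOTHETICAL values chosen so the result lands at the quoted cross-over of roughly 10,000 units.

```python
def total_cost(nre, unit_cost, volume):
    """Total cost of producing `volume` parts: fixed NRE plus marginal cost."""
    return nre + unit_cost * volume


def crossover_volume(asic_nre, asic_unit, fpga_nre, fpga_unit):
    """Volume at which ASIC and FPGA total costs are equal."""
    if fpga_unit <= asic_unit:
        raise ValueError("no cross-over unless FPGA marginal cost exceeds ASIC's")
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)


ASIC_NRE = 2_000_000   # from the slide (\$2M)
ASIC_UNIT = 10         # HYPOTHETICAL marginal cost per ASIC
FPGA_NRE = 0           # HYPOTHETICAL: FPGA NRE treated as negligible
FPGA_UNIT = 210        # HYPOTHETICAL marginal cost per FPGA

v = crossover_volume(ASIC_NRE, ASIC_UNIT, FPGA_NRE, FPGA_UNIT)
print(v)
```

Below that volume the FPGA has the lower total cost, and above it the ASIC does. Rising ASIC NRE pushes the cross-over volume up, which matches the trend the slide describes.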



# Scale – ASIC with Pre-Placement & SRAMs


#### T0 – Full Custom w/ Standard Cells



Adapted from [Terman'02]

## Application-Specific IC – Full Custom w/ Standard Cell

Standard cell: predefined gates, automatically placed and routed. In 0.5 µm → 10K FETs/mm<sup>2</sup>



Full custom: custom "cells" meant to be stacked in columns to create an N-bit-wide datapath. Signals between columns are routed across cells. In 0.5 µm → 25K FETs/mm<sup>2</sup>



RAM generator: one cell iterated many times, perhaps surrounded by driver/sensing logic. The basic structure stays the same; only the dimensions change. In 0.5 µm → 45K FETs/mm<sup>2</sup> for a multiport regfile



Adapted from [Terman'02]



### Altera HardCopy – FPGA to Gate-Array-Like Tapeout



Adapted from [Mansur'08]


#### Acknowledgments

- [Asanovic'11] K. Asanović, J. Wawrzynek, and J. Lazzaro, UC Berkeley CS 250 VLSI Systems Design, Lecture Slides, 2011.
- [ASV'01] A.L. Sangiovanni-Vincentelli, "Platform-Based Design", UC Berkeley White Paper, 2001.
- [Mansur'08] D. Mansur, "Stratix IV FPGA and HardCopy IV ASIC @ 40 nm," Hot Chips, Aug. 2008.
- [Melikyan] V. Melikyan, "IC Design Introduction", Lecture Slides, Synopsys Curriculum Materials.
- [Rabaey'02] J. Rabaey et al., Companion Slides for "Digital Integrated Circuits: A Design Perspective," 2nd ed, Prentice Hall, 2002.


- [SAED'11] "Standard Cell Lib SAED\_EDK90\_CORE Databook", Synopsys, 2011.
- [Terman'02] C. Terman and K. Asanović, MIT 6.371 Introduction to VLSI Systems, Lecture Slides, 2002.
- [Weste'11] N. Weste and D. Harris, "CMOS VLSI Design: A Circuits and Systems Perspective," 4th ed, Addison Wesley, 2011.
- [Xilinx'08] "CoolRunner-II CPLD Family," Xilinx DataSheet, 2008. http://www.xilinx.com/support/documentation/data\_sheets/ds090.pdf