## Worst-Case Performance Prediction Under Supply Voltage and Temperature Noise

Chung-Kuan Cheng<sup>†</sup>, **Andrew B. Kahng<sup>†‡</sup>**, Kambiz Samadi<sup>‡</sup> and Amirali Shayan<sup>†</sup>

June 13, 2010

CSE<sup>†</sup> and ECE<sup>‡</sup> Departments, University of California, San Diego

## **Motivation**

- Power distribution network (PDN) is a major consumer (30+%) of interconnect resources
  - → Seek efficient early-stage PDN optimization

- Correct optimization of PDN requires understanding the implications on delay
- Our proposed models attempt to accurately and efficiently provide such implications



## **Existing Models**

- Gate delay models under supply voltage noise can be classified as (1) static or (2) dynamic
- Replace supply voltage noise with equivalent P/G voltage (cf. Hashimoto et al. ICCAD'04)
  - Fails to capture the dynamic behavior of the noise waveform (time-invariant)
- Probabilistic approaches to estimate supply voltage noise bound given a performance criterion (cf. *Martorell et al. CDTiSNE'07*)
  - Assumes equal supply voltage across all the gates in a path
- Discretize the noise waveform → assign an equivalent DC voltage value to each interval (cf. Weng et al. ICCD'08)
- Recently, *Okumura et al.* proposed a dynamic gate delay model (cf. *ASPDAC'10*)
  - Does not account for simultaneous change in all the relevant cell and noise parameters



## **Implementation Flow and Tools**

- Configurable SPICE netlist
- Use range of supply voltage noise, temperature and cell parameters to capture design space
- Use nonparametric regression modeling to capture impact of supply voltage noise and temperature on cell delay
- From basic gate delay model, compute delay of arbitrary k-stage critical path
- → Accurately detect worst-case supply noise waveform / performance





## **Scope of Study**

- Parameterizable SPICE netlist for a given cell
- Generic critical path with arbitrary number of stages
- 65nm Foundry SPICE (typical corner, NVT devices)
- Tool Chain: Synopsys HSPICE and Salford MARS 3.0
- Experimental axes:
  - Technology node: {65nm}
  - Cell parameters: {slew<sub>in</sub>, output<sub>load</sub>, cell<sub>size</sub>}
    - input slew, output load, cell size
  - Supply noise parameters: {amp<sub>noise</sub>, slew<sub>noise</sub>, offset<sub>noise</sub>}
    - noise amplitude, noise slew, noise offset
  - Temperature





## **Modeling Problem**

- Accurately predict y given vector of parameters  $\vec{x}$
- Difficulties: (1) which variables x to use, and (2) how different variables combine to generate y

$y = f(\vec{x}) + \text{noise}$

- Parametric regression: requires a functional form
- Nonparametric regression: learns about the best model from the data itself
  - → Decouple the modeling task from understanding the complex relationships between dynamic supply voltage noise / temperature and cell delay
- This work: exploration of nonparametric regression to model delay and output slew of a given cell



### **Multivariate Adaptive Regression Splines (MARS)**

- MARS is a nonparametric regression technique
- MARS builds models of form:

$$\hat{\mathbf{f}}(\vec{\mathbf{x}}) = \mathbf{C}_0 + \sum_{i=1}^{\kappa} \mathbf{C}_i \mathbf{B}_i(\vec{\mathbf{x}})$$

- Each basis function  $B_i(\vec{x})$  can be:
  - a constant
  - a "hinge" function max(0, x − c) or max(0, c − x), where c is a constant (the knot)
  - a product of two or more hinge functions
- Two modeling steps:
  - (1) forward pass: obtains model with defined maximum number of terms
  - (2) backward pass: improves generality by avoiding an overfit model
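
The model form above can be illustrated with a minimal sketch. It only evaluates a MARS-style expression; it does not perform the forward or backward pass. The basis functions, knots, and coefficients below are hypothetical placeholders chosen for illustration, not values from the fitted models.

```python
from typing import Callable, Dict, List, Tuple

Point = Dict[str, float]
Basis = Callable[[Point], float]


def hinge_pos(x: float, c: float) -> float:
    """Hinge that is zero below the knot c: max(0, x - c)."""
    return max(0.0, x - c)


def hinge_neg(x: float, c: float) -> float:
    """Mirrored hinge that is zero above the knot c: max(0, c - x)."""
    return max(0.0, c - x)


def mars_predict(c0: float, terms: List[Tuple[float, Basis]], x: Point) -> float:
    """Evaluate f(x) = c0 + sum_i c_i * B_i(x)."""
    return c0 + sum(coef * basis(x) for coef, basis in terms)


# Hypothetical basis functions and coefficients (NOT the fitted model).
def b1(x: Point) -> float:
    return hinge_pos(x["load_out"], 0.02)


def b2(x: Point) -> float:
    return hinge_neg(x["load_out"], 0.02)


def b3(x: Point) -> float:
    # A product of two hinges captures an interaction between two parameters.
    return hinge_pos(x["amp_noise"], 0.1) * b1(x)


terms = [(2.0, b1), (-3.0, b2), (5.0, b3)]
prediction = mars_predict(1.0, terms, {"load_out": 0.03, "amp_noise": 0.2})
print(prediction)
```

Because each basis function is a hinge or a product of hinges, the fitted model is piecewise polynomial in the inputs and can be evaluated without any simulator in the loop.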





## **Example MARS Output Models**

#### **Delay Model**

$$\begin{split} \mathsf{B}_1 &= \max(0, \textit{load}_{out} - 0.021); \ \mathsf{B}_2 = \max(0, 0.021 - \textit{load}_{out}); \ \dots \\ \mathsf{B}_{98} &= \max(0, \textit{offset}_{noise} + 2.4e\text{-}12) \times \mathsf{B}_{92}; \\ \mathsf{B}_{100} &= \max(0, \textit{offset}_{noise} + 2.4e\text{-}12) \times \mathsf{B}_{37}; \end{split}$$

$d_{cell} = 1.02e-11 + 7.35e-10 \times B_1 - 5.89e-10 \times B_2 - 2.17e-11 \times B_3 + ... - 1.71e-7 \times B_{96} + 2.43e-7 \times B_{98} - 3.03e-8 \times B_{100}$

#### **Output Slew Model**

$$\begin{split} &\mathsf{B}_1 = \max(0, \textit{load}_{out} - 0.0009); \, \mathsf{B}_2 = \max(0, \textit{cell}_{size} - 4) \times \mathsf{B}_1; \, \dots \\ &\mathsf{B}_{99} = \max(0, \, 0.05 - \textit{slew}_{noise}) \times \mathsf{B}_{55}; \\ &\mathsf{B}_{100} = \max(0, \textit{offset}_{noise} + 0.15) \times \mathsf{B}_{94}; \end{split}$$

 $slew_{out} = 1.23e-11 + 1.53e-10 \times B_1 - 2.05e-10 \times B_2 + 2.05e-9 \times B_3 + ... - 1.08e-8 \times B_{98} - 4.33e-9 \times B_{99} - 7.42e-9 \times B_{100}$ 

- Closed-form expressions with respect to cell and supply voltage noise parameters
- Suitable to drive early-stage PDN design exploration



### **Accurate Cell Delay Modeling**

- Noise characteristics need to be considered
- Noise slew affects cell delay only when it is comparable to the input slew
- Noise offset affects the impact of supply noise on cell delay
- → CMOS gate delay modeling is a nontrivial task with nonobvious implications







## **Worst-Case Performance Model**

- GOAL: find set of seven parameters (7-tuple) where the path delay is maximum
- Mapping from set of all 7-tuples to cell delay and output slew values
- In a single stage pick the 7-tuple with maximum delay
- In a multi-stage path:
  - Output slew of the previous stage becomes the input slew to the current stage
  - Noise offset must be adjusted according to delay and output slew values of the previous stages



Worst-case configuration is always an element of |slew<sub>in</sub>|×|load<sub>out</sub>|×|cell<sub>size</sub>|×|amp<sub>noise</sub>|×|slew<sub>noise</sub>|×|offset<sub>noise</sub>|×|temp|

## **Experimental Setup and Results**

#### Scripting to generate SPICE decks for 30720 configurations

| Parameter               | Values                                              |
|-------------------------|-----------------------------------------------------|
| slew<sub>in</sub>       | {0.00056, 0.00112, 0.0392, 0.1728, 0.56, 0.7088} ns |
| load<sub>out</sub>      | {0.0009, 0.0049, 0.0208, 0.0842} pF                 |
| cell<sub>size</sub>     | INV: {1, 4, 8, 20}<br>2-input NAND: {1, 2, 4, 8}    |
| amp<sub>noise</sub>     | {0, 0.054, 0.144, 0.27} V                           |
| slew<sub>noise</sub>    | {0.01, 0.04, 0.07, 0.09} ns                         |
| offset<sub>noise</sub>  | {-0.15, -0.05, 0, 0.05, 0.15} ns                    |
| temp                    | {-40, 25, 80, 125} °C                               |
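
The configuration count follows from the table: for one cell type, the set sizes multiply to 6 × 4 × 4 × 4 × 4 × 5 × 4 = 30720. A minimal sketch of the enumeration is shown below, using the inverter sizes. The dictionary layout and the `enumerate_configs` helper are hypothetical, and generating the actual SPICE decks is left out.

```python
from itertools import product
from typing import Dict, Iterator, List

# Values copied from the table above (inverter cell sizes).
PARAMS: Dict[str, List[float]] = {
    "slew_in": [0.00056, 0.00112, 0.0392, 0.1728, 0.56, 0.7088],  # ns
    "load_out": [0.0009, 0.0049, 0.0208, 0.0842],                 # pF
    "cell_size": [1, 4, 8, 20],                                   # INV sizes
    "amp_noise": [0, 0.054, 0.144, 0.27],                         # V
    "slew_noise": [0.01, 0.04, 0.07, 0.09],                       # ns
    "offset_noise": [-0.15, -0.05, 0, 0.05, 0.15],                # ns
    "temp": [-40, 25, 80, 125],                                   # deg C
}


def enumerate_configs(params: Dict[str, List[float]]) -> Iterator[Dict[str, float]]:
    """Yield every 7-tuple in the Cartesian product as a name -> value dict."""
    keys = list(params)
    for values in product(*(params[k] for k in keys)):
        yield dict(zip(keys, values))


configs = list(enumerate_configs(PARAMS))
print(len(configs))
```

Each dictionary could then be substituted into a parameterized netlist template to produce one simulation deck.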

- Three different paths with different numbers of stages: (1) only inverter, (2) only 2-input NAND, and (3) a mix of inverter and 2-input NAND
- Models are insensitive to random selection of training data set
- Cell delay model within 6% of SPICE (on average)
- Our multi-stage path delay within 4.3% of SPICE simulation
- Worst-case predictions are in the top 3 w.r.t. the list (out of 30720 configurations)

## **Extensibility of Approach**

- Have used same methodology to develop models for interconnect wirelength (WL) and fanout (FO)
- Wirelength model
  - On average, within 3.4% of layout data
  - 91% reduction of avg error vs. existing models (cf. Christie et al. '00)
- Fanout model
  - On average, within 0.8% of the layout data
  - 96% reduction of avg error vs. existing models (cf. Zarkesh-Ha et al. '00)





## **Conclusions**

- Generally applicable gate delay modeling methodology
  - Leverage supply voltage and temperature variations
- Achieved accurate cell delay and output slew models
- Validated our models against 30720 configurations
- Proposed cell delay model is within 6% of SPICE (on average)
- Proposed path delay model is within 4.3% of SPICE (on average)
- Proposed models accurately detect worst-case supply noise waveform / performance

