# Logic Synthesis and Verification

Jie-Hong Roland Jiang 江介宏

Department of Electrical Engineering National Taiwan University



Fall 2013

# Sequential Synthesis

part of the following slides are by courtesy of Andreas Kuehlmann

2

#### Motivation

□ Pure combinational optimization can be suboptimal since relations across register boundaries are disregarded

# Overview of Circuit Optimization

Optimization Space
Distance from Physical Implementation
Clock
Combin

**System-Level Optimization** 

**Architectural Restructuring** 

Retiming

**Clock Skew Scheduling** 

**Combinational Optimization** 

Verification Challenge Necessity of Integrated Solution

3

### Sequential Optimization Techniques

- Clock skew scheduling
  - balance path delays by adjusting the relative clocking schedule of individual registers
- Retiming
  - balance path delays by moving registers within circuit topology
  - can be interleaved with combinational optimization techniques
- Architectural restructuring
  - add sequential redundancy
    - ☐ fixed: does not change input/output behavior
    - ☐ flexible: change input output behavior
- System-level optimization

#### Integration in Design Flow

- Optimization space
  - significantly more optimization freedom at a higher level for improving performance, power, area, etc.
- □ Distance from physical implementation
  - difficult to accurately model impacts on final implementation
  - difficult to mathematically characterize optimization space
- Verification challenge
  - departure from combinational comparison model would impede formal equivalence checking
  - different simulation behaviors cause acceptance problems
- Necessity of tight tool integration!

### Sequential Timing Constraints

- Minimum clock period
  - t<sub>clk</sub>(min) = max{t<sub>p</sub>, t<sub>x</sub>} + t<sub>c</sub> + t<sub>su</sub>, where t<sub>x</sub> is the time after the active clock edge at which the X inputs are stable





#### Sequential Timing Constraints



# Sequential Timing Constraints

- Hold-time constraint
  - $\blacksquare$   $t_p + t_c^{min} \ge t_h$



Clock Skew Scheduling



# Clock Skew Scheduling

- By controlling clock delays on registers, clock frequency may be increased
  - Do not change transition and output functions (not the case in retiming)
    - □ Good for functional verification
  - May require sophisticated timing verification
- □ Clock skew: clock signal arrives at different registers at different times
  - Positive skew: the sending register gets the clock earlier than the receiving register
  - Negative skew: the receiving register gets the clock earlier than the sending register

# Clock Skew Scheduling

- Pros
  - Inexpensive "post synthesis" technique to further reduce clock period
  - Combinational design model is preserved
- Cons
  - Setup **and** hold time constraints must be obeyed □including hold time constraints from scan chain
  - Interleaving with combinational optimizations impossible
  - Replication of clocking tree required

#### Retiming



#### Retiming

- Optimize sequential circuits by repositioning registers
  - Move registers so that clock cycle decreases or register count decreases
  - Input-output behavior is preserved; however, transition and output functions are changed due to the register movement



14

# Retiming

#### Pros

- Only setup time constraint (0 clock skew)
- Simple integration with other logical (e.g. combinational) or physical optimizations
  - E.g., iterative retiming and resynthesis
- Easy combination with clock skew scheduling to obtain global optimum

#### Cons

- Change combinational model of designSevere impact on verification methodology
- Inaccurate delay model
- Computation of equivalent reset state required

# Architectural Retiming



## Architectural Retiming

- Pros
  - Smooth extension of regular retiming
  - Potential to alleviate global performance bottlenecks by adding sequential redundancy and pipelining
- Cons
  - Significant change of design structure

    □ substantial impact on verification methodology
  - Flexible architectural restructuring changes I/O behavior □ existing RTL specification methods not always applicable

17

Verification Issues

- □ Timing verification unchanged
- □ Functional verification affected
  - Except for clock skew scheduling, sequential optimization does change register (transition) functions
  - Traditional combinational equivalence checking not applicable
  - Simulation runs not recognizable by designers acceptance problems
  - Solution:
    - □ preserve retime function (mapping function) from synthesis for:
      - reducing sequential EC problem back to combinational case
         no false positives possible!
      - modifying simulation model to reproduce original simulation output

18

## Retiming Circuits

- □ Objectives:
  - Reduce clock cycle time
  - Reduce register count (area)
  - Reduce power, etc.
- □ Input: A netlist of gates and registers



# **Retiming Circuits**

□ Circuit represented as retiming graph G(V, E)

[Leiserson and Saxe 1983, 1991]

- V: vertex set representing logic gates
- E: edge set representing connections
- $d(v) = delay of gate/vertex v, (d(v) \ge 0)$
- w(e) = number of registers on edge e,  $(w(e) \ge 0)$

#### **Retiming Circuits**

- Example
  - Synchronous circuit assumption: every cycle of a circuit has at least one register, i.e., no combinational loop



The host node represents the environment that interacts with the circuit via the primary inputs and outputs

| Operation | delay |  |  |
|-----------|-------|--|--|
| δ         | 3     |  |  |
| +         | 7     |  |  |

**Retiming Circuits** 

 $\square$  For a path  $p: v_0 \rightarrow v_1 \rightarrow \cdots v_{k-1} \rightarrow v_k$ 

■ Path delay  $d(p) = \sum_{i=1}^{n} d(v_i)$ 

(includes endpoints)

- Path weight  $w(p) = \sum w(e_i)$
- □ Minimum clock cycle

$$c = \max_{p: w(p)=0} \{d(p)\}$$



22

## **Retiming Circuits**

- ■Atomic operation
  - Move registers across a gate in a forward or backward direction

Retime by -1 (delayed by -1 cycle) Retime by 1 (delayed by 1 cycle)

Does not affect gate functionality, but timing

# **Retiming Circuits**

- Retiming can be formalized with a retime function r:  $V \rightarrow Z$ , where Z is the set of integers
  - I.e., a retime function performs integer labeling on
- Weight update after retiming with *r* 
  - $\mathbf{w}_{r}(e) = w(e) + r(v) r(u)$ , for edge e = (u, v)
  - $\mathbf{w}_r(p) = \mathbf{w}(p) + \mathbf{r}(t) \mathbf{r}(s)$ , for path p from s to t
- □ A retiming with some r is legal if  $w_r(e) \ge 0$ ,  $\forall e \in E$









#### Min-Cycle Retiming

□ Problem Statement: (minimum cycle retiming) Given G(V, E) with delay function d and weight function w, find a legal retiming r so that

$$c = \max_{p: w_r(p)=0} \{d(p)\}$$

is minimized

■ Retiming: two important matrices

Register weight matrix

$$W(u,v) = \min_{p} \{ w(p) : u \xrightarrow{p} v \}$$

Delay matrix

$$D(u,v) = \max_{p} \{d(p) : u \xrightarrow{p} v, w(p) = W(u,v)\}$$

#### Min-Cycle Retiming

#### ■ Example



W D V0 V1 V2 V3 V0 V1 V2 V3

| 0 |   | 0   | 0                     | 13<br>10 | 3<br>                           | 6<br>3                                | 13<br>10                                             | V1<br>V2                                              |
|---|---|-----|-----------------------|----------|---------------------------------|---------------------------------------|------------------------------------------------------|-------------------------------------------------------|
| U | Τ | Τ   | 0                     | - 1      | _                               |                                       | 1                                                    | V3                                                    |
|   | 0 | 0 U | 0 0 0<br>0 <u>1</u> 0 | 0 0 0 0  | 0 0 0 0 13<br>0 <u>L</u> 0 0 10 | 0 0 0 0 13 3<br>0 \( \preceq\$ 0 0 \) | $\begin{array}{cccccccccccccccccccccccccccccccccccc$ | $ \begin{array}{cccccccccccccccccccccccccccccccccccc$ |

For some constant  $\alpha$ , minimum clock cycle  $c \le \alpha \Leftrightarrow \forall p$ , if  $d(p) > \alpha$  then  $w(p) \ge 1$ 

W = register path weight matrix (minimum # registers on all paths between u and v) D = path delay matrix (maximum delay on the paths between u and v with w(p)=W(u,v))

Don't count paths passing through the host!

26

#### Min-Cycle Retiming

- $\blacksquare$  Assume that we are asked to check if a retiming exists for a clock cycle  $\alpha$
- Legal retiming:  $w_r(e) \ge 0$  for all e. Hence  $w_r(e) = w(e) + r(v) r(u) \ge 0$ , or  $r(u) r(v) \le w(e)$
- □ For all paths p:  $u \rightarrow v$  such that  $d(p) \ge \alpha$ , we require  $w_r(p) \ge 1$ . Thus

$$1 \le w_r(p) = \sum_{i=0}^{k-1} w_r(e_i)$$

$$= \sum_{i=0}^{k-1} [w(e_i) + r(v_{i+1}) - r(v_i)]$$

$$= w(p) + r(v_k) - r(v_0)$$

$$= w(p) + r(v) - r(u)$$

- □ Take the least w(p) (tightest constraint)  $r(u)-r(v) \le W(u,v)-1$ 
  - Note: This is independent of the path from u to v, so we just need to apply it to u, v such that D(u,v) > α

# Min-Cycle Retiming

#### Example

Assume  $\alpha = 7$ 

Legality: r(u)-r(v)≤w(e)

$$\frac{r(u)-r(v) \le w(e)}{r(v_0)-r(v_1) \le 2}$$

$$\begin{vmatrix} r(v_1) - r(v_2) \le 0 \\ r(v_1) - r(v_3) \le 0 \end{vmatrix}$$

$$r(v_2) - r(v_3) \le 0$$

$$r(v_3) - r(v_0) \le 0$$

 $r(u)-r(v) \le W(u,v)-1$  $r(v_0)-r(v_2) \le 1$ 

D>7:

$$r(v_0) - r(v_3) \le 1$$

$$r(v_1) - r(v_0) \le -1$$

$$r(v_1) - r(v_3) \le -1$$

$$r(v_2) - r(v_0) \le -1$$

$$r(v_2) - r(v_3) \le -1$$

All constraints are in the difference-of-2-variable form and closely related to shortest path problem





# Min-Cycle Retiming

#### Example

Legality: r(u)-r(v)≤w(e)

$$r(v_0) - r(v_1) \le 2$$
  
 $r(v_1) - r(v_2) \le 0$ 

$$r(v_1) - r(v_3) \le 0$$

$$r(v_2) - r(v_3) \le 0$$
  
 $r(v_3) - r(v_0) \le 0$ 

D>7:  $r(u)-r(v) \le W(u,v)-1$   $r(v_0)-r(v_3) \le 1$   $r(v_1)-r(v_0) \le -1$  $r(v_1)-r(v_3) \le -1$ 

$$r(v_2) - r(v_0) \le -1$$
  
 $r(v_2) - r(v_3) \le -1$ 

Search shortest path on constraint graph: Bellman-Ford algorithm O(|V||E|) or  $O(|V|^3)$ 

A solution exists if and only if there exists **no** negative weighted cycle



A solution is  $r(v_0) = r(v_3) = 0$ ,  $r(v_1) = r(v_2) = -1$ 

29

## Min-Cycle Retiming

☐ To find the minimum cycle time, do a binary search among the entries of the D matrix O(|V||E||log|V|)



30

# Min-Cycle Retiming

- □ Theorem: r is a legal retiming on G such that the clock cycle  $c \le α$  for some constant α if and only if
  - 1.  $r(v_h) = 0$
  - 2.  $r(u)-r(v) \le w(e)$  for every edge e(u,v)
  - 3.  $r(u)-r(v) \le W(u,v)-1$  (i.e. register count > 1) for every (u, v) with  $D(u,v) > \alpha$
- □ Solve the integer linear programming problem
  - Bellman-Ford method O(|V|³)

# Min-Cycle Retiming

- □Algorithm of optimal retiming:
  - 1. Compute W and D
  - 2. Binary search the minimum achievable clock period by applying Bellman-Ford algorithm to check the satisfication of the prior Theorem
  - 3. Derive r(v) under the minimum achievable clock period found in Step 2
- □ Complexity  $O(|V|^3 |g|V|)$

# Min-Cycle Retiming

- Two more algorithms:
- 1. Relaxation based:
  - Repeatedly find critical path
  - Retime vertex at end of path by +1 (O(|V||E|log|V|))



2. Also, Mixed Integer Linear Program formulation

#### Min-Area Retiming

Goal: minimize number of registers used  $\min N_r = \sum w_r(e)$ 

$$= \sum_{e: u \to v} (w(e) + r(v) - r(u))$$

$$=\sum_{e\in E}w(e)+\sum_{e:u\to v}(r(v)-r(u))$$

$$= N + \sum_{u \to v} (r(v) - r(u))$$

$$= N + \sum_{v \in V} \left[ r(v)(\# fanin(v) - \# fanout(v)) \right]$$

$$= N + \sum_{v \in V} a_V r(v)$$

where  $a_{\nu}$  is a constant

34

# Min-Area Retiming

■ Minimize:

$$\sum_{v \in V} a_v r(v)$$

■Subject to:

$$w_r(e) = w(e) + r(v) - r(u) \ge 0$$

■ Note: It is reducible to a flow problem

# Retiming Issues

- □ Computation of equivalent initial states
  - Equivalent initial states may not always exist



- General solution requires replication of logic for initialization
- □ Timing models
  - Too far away from actual implementation

## Retiming + Clock Scheduling

- Mathematical formulation
  - s:  $E \rightarrow R$ , a real edge labeling
  - s(e) denotes the clock signal delay of all registers of e
- ☐ In addition to the register weight matrix and delay matrix for the maximum delay, we also need the minimum paths delays

$$W(u,v) = \min_{p} \{ w(p) : u \xrightarrow{p} v \}$$

$$D(u,v) = \max_{p} \{ d(p) : u \xrightarrow{p} v, w(p) = w(u,v) \}$$

$$D_{\min}(u,v) = \min_{p} \{ d(p) : u \xrightarrow{p} v, w(p) = w(u,v) \}$$

Retiming + Clock Scheduling

□ A valid retiming and clock skew schedule is an assignment to r and s such that:

- (1)  $w_r \ge 0$
- (2)  $\forall (u',u),(v,v')$ :

$$w(u',u) > 0 \land w(v,v') > 0 \land W(u,v) = 0 \Rightarrow$$

$$D_{\min}(u,v) + s(u',u) - s(v,v') \ge T_{hold} \land$$

$$D(u,v) + s(u',u) - s(v,v') \le T_{clock} - T_{setup}$$

□ Solution Mixed Integer Linear Program (MILP)

37

### Retiming & Resynthesis

- Combine retiming and combinational optimization
  - Retime registers such that the circuit has a large combinational logic block for optimization
  - Resynthesize the combinational logic block with combinational logic minimization techniques
  - Retiming and resynthesis can be iteratedCan achieve any state re-encoding

# Retiming & Resynthesis

Example



--