# Logic Synthesis and Verification

Jie-Hong Roland Jiang 江介宏

Department of Electrical Engineering National Taiwan University

Fall 2012

# Sequential Synthesis

Some of the following slides are courtesy of Andreas Kuehlmann

#### Motivation

**Distance from Physical Implementation** 

**Optimization Space** 

Pure combinational optimization can be suboptimal since relations across register boundaries are disregarded

#### Overview of Circuit Optimization

**System-Level Optimization** 

**Architectural Restructuring** 

Retiming

**Clock Skew Scheduling** 

**Combinational Optimization** 

**Verification Challenge** 

**Necessity of Integrated Solution** 

# Sequential Optimization Techniques

#### Clock skew scheduling

balance path delays by adjusting the relative clocking schedule of individual registers

#### Retiming

- balance path delays by moving registers within circuit topology
- can be interleaved with combinational optimization techniques

#### Architectural restructuring

- add sequential redundancy
  - fixed: does not change input/output behavior
  - flexible: changes input/output behavior

#### System-level optimization

#### Integration in Design Flow

- Optimization space
  - significantly more optimization freedom at a higher level for improving performance, power, area, etc.
- Distance from physical implementation
  - difficult to accurately model impacts on final implementation
  - difficult to mathematically characterize optimization space
- Verification challenge
  - departure from combinational comparison model would impede formal equivalence checking
  - different simulation behaviors cause acceptance problems
- Necessity of tight tool integration!



#### Sequential Timing Constraints





#### **Clock Skew Scheduling**

- By controlling clock delays on registers, clock frequency may be increased
- Does not change transition and output functions (not the case in retiming)
  - Good for functional verification
  - May require sophisticated timing verification
- Clock skew: clock signal arrives at different registers at different times
  - Positive skew: the sending register gets the clock earlier than the receiving register
  - Negative skew: the receiving register gets the clock earlier than the sending register

# Clock Skew Scheduling

#### Pros

- Inexpensive "post synthesis" technique to further reduce clock period
- Combinational design model is preserved

#### Cons

- Setup and hold time constraints must be obeyed
  - including hold time constraints from scan chain
- Interleaving with combinational optimizations impossible
- Replication of clocking tree required


# Retiming

- Optimize sequential circuits by repositioning registers
  - Move registers so that clock cycle decreases or register count decreases
  - Input-output behavior is preserved; however, transition and output functions are changed due to the register movement



# Retiming

#### Pros

- Only setup time constraint (0 clock skew)
- Simple integration with other logical (e.g. combinational) or physical optimizations
  - E.g., iterative retiming and resynthesis
- Easy combination with clock skew scheduling to obtain global optimum

#### Cons

- Changes the combinational model of the design
  - Severe impact on verification methodology
- Inaccurate delay model
- Computation of equivalent reset state required


# Architectural Retiming

#### Pros

- Smooth extension of regular retiming
- Potential to alleviate global performance bottlenecks by adding sequential redundancy and pipelining

#### Cons

- Significant change of design structure
  - substantial impact on verification methodology
- Flexible architectural restructuring changes I/O behavior
  - existing RTL specification methods not always applicable

#### Verification Issues

- Timing verification unchanged
- Functional verification affected
  - Except for clock skew scheduling, sequential optimization does change register (transition) functions
  - Traditional combinational equivalence checking not applicable
  - Simulation runs not recognizable by designers, which causes acceptance problems
  - Solution: preserve retime function (mapping function) from synthesis for
    - reducing sequential EC problem back to combinational case (no false positives possible!)
    - modifying simulation model to reproduce original simulation output

# **Retiming Circuits**

#### Objectives:

- Reduce clock cycle time
- Reduce register count (area)
- Reduce power, etc.

Input: A netlist of gates and registers




# **Retiming Circuits**

- Circuit represented as retiming graph G(V, E) [Leiserson and Saxe 1983, 1991]
  - V: vertex set representing logic gates
  - E: edge set representing connections
  - d(v) = delay of gate/vertex v,  $(d(v) \ge 0)$
  - w(e) = number of registers on edge e,  $(w(e) \ge 0)$

# **Retiming Circuits**

#### Example

Synchronous circuit assumption: every cycle of a circuit has at least one register, i.e., no combinational loop





# **Retiming Circuits**

#### Atomic operation

Move registers across a gate in a forward or backward direction



Does not affect gate functionality, only timing

# **Retiming Circuits**

- Retiming can be formalized with a retime function  $r: V \rightarrow Z$ , where Z is the set of integers
  - I.e., a retime function performs integer labeling on vertices
- Weight update after retiming with r
  - $w_r(e) = w(e) + r(v) - r(u)$ , for edge e = (u,v)
  - $w_r(p) = w(p) + r(t) - r(s)$ , for path p from s to t
- A retiming with some *r* is legal if  $w_r(e) \ge 0$ ,  $\forall e \in E$
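The weight update and legality check can be sketched in a few lines of Python. The function names and the three-gate ring are hypothetical, chosen only for illustration, and the dictionary keyed by (u, v) assumes there are no parallel edges.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]

def retime_weights(w: Dict[Edge, int], r: Dict[str, int]) -> Dict[Edge, int]:
    """Return w_r(e) = w(e) + r(v) - r(u) for every edge e = (u, v)."""
    return {(u, v): regs + r[v] - r[u] for (u, v), regs in w.items()}

def is_legal(w_r: Dict[Edge, int]) -> bool:
    """A retiming is legal if no edge ends up with a negative register count."""
    return all(regs >= 0 for regs in w_r.values())

# Hypothetical ring a -> b -> c -> a with two registers on edge (c, a).
w = {("a", "b"): 0, ("b", "c"): 0, ("c", "a"): 2}

r_good = {"a": 0, "b": 1, "c": 1}    # moves one register from (c, a) to (a, b)
w_good = retime_weights(w, r_good)

r_bad = {"a": 0, "b": -1, "c": 0}    # would need a register that (a, b) does not have
w_bad = retime_weights(w, r_bad)

print(w_good, is_legal(w_good))
print(w_bad, is_legal(w_bad))
```

Because the r terms telescope around a cycle, the total register count on the ring is the same before and after any retiming.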








#### Min-Cycle Retiming



For some constant  $\alpha$ , minimum clock cycle  $c \leq \alpha \Leftrightarrow \forall p$ , if  $d(p) > \alpha$  then  $w(p) \geq 1$ 

- W = register path weight matrix: W(u,v) is the minimum number of registers over all paths from u to v
- D = path delay matrix: D(u,v) is the maximum delay over the paths from u to v with w(p) = W(u,v)

Don't count paths passing through the host!
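One possible way to compute W and D is an all-pairs shortest-path pass over pairs (w(e), -d(u)) compared lexicographically, so that fewer registers win first and larger delay breaks ties. The sketch below makes several assumptions: `compute_W_D` and the ring example are hypothetical, the circuit is synchronous (no register-free cycle), and the host vertex is not treated specially.

```python
import math
from itertools import product

def compute_W_D(vertices, d, w):
    """Floyd-Warshall over (register count, -delay) pairs, compared lexicographically."""
    unreachable = (math.inf, 0)
    dist = {(u, v): unreachable for u in vertices for v in vertices}
    for v in vertices:
        dist[v, v] = (0, 0)                      # empty path: no registers
    for (u, v), regs in w.items():
        dist[u, v] = min(dist[u, v], (regs, -d[u]))
    for k, i, j in product(vertices, repeat=3):  # k is the outermost loop
        (w1, x1), (w2, x2) = dist[i, k], dist[k, j]
        if w1 + w2 < math.inf:
            dist[i, j] = min(dist[i, j], (w1 + w2, x1 + x2))
    W = {uv: regs for uv, (regs, _) in dist.items() if regs < math.inf}
    # The pair sums -d over every vertex except the last, so add d(v) back.
    D = {(u, v): d[v] - x for (u, v), (regs, x) in dist.items() if regs < math.inf}
    return W, D

# Hypothetical ring a -> b -> c -> a with gate delays 1, 2, 3.
vertices = ["a", "b", "c"]
d = {"a": 1, "b": 2, "c": 3}
w = {("a", "b"): 0, ("b", "c"): 0, ("c", "a"): 2}
W, D = compute_W_D(vertices, d, w)
print(W[("a", "c")], D[("a", "c")])
```

The cubic loop is the simplest choice for a sketch; it is not meant to be the fastest way to build the two matrices.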

- Assume that we are asked to check if a retiming exists for a clock cycle α
- Legal retiming:  $w_r(e) \ge 0$  for all e. Hence  $w_r(e) = w(e) + r(v) - r(u) \ge 0$ , or  $r(u) - r(v) \le w(e)$
- For all paths p:  $u \rightarrow v$  such that  $d(p) > \alpha$ , we require  $w_r(p) \ge 1$ . Thus, writing p as  $u = v_0 \rightarrow v_1 \rightarrow \cdots \rightarrow v_k = v$  with edges  $e_i = (v_i, v_{i+1})$ :

$$1 \le w_r(p) = \sum_{i=0}^{k-1} w_r(e_i)$$
$$= \sum_{i=0}^{k-1} [w(e_i) + r(v_{i+1}) - r(v_i)]$$
$$= w(p) + r(v_k) - r(v_0)$$
$$= w(p) + r(v) - r(u)$$

**Take the least w(p) (tightest constraint):**  $r(u)-r(v) \le W(u,v)-1$

Note: This is independent of the path from u to v, so we just need to apply it to u, v such that  $D(u,v) > \alpha$ 

#### Min-Cycle Retiming

#### Example


Assume  $\alpha = 7$ 

Legality:  $r(u)-r(v) \le w(e)$

- $r(v_0) - r(v_1) \le 2$
- $r(v_1) - r(v_2) \le 0$
- $r(v_1) - r(v_3) \le 1$
- $r(v_2) - r(v_3) \le 0$
- $r(v_3) - r(v_0) \le 0$

D > 7:  $r(u)-r(v) \le W(u,v)-1$

- $r(v_1) - r(v_0) \le -1$
- $r(v_1) - r(v_3) \le -1$
- $r(v_2) - r(v_3) \le -1$

All constraints are in the difference-of-two-variables form and are closely related to the shortest path problem
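The link to shortest paths can be made concrete: each constraint x - y ≤ c becomes an edge y → x of weight c, and Bellman-Ford from a virtual source either yields a solution or detects a negative cycle (infeasibility). The sketch below is only an illustration; `solve_difference_constraints` is a hypothetical helper, and the system fed to it consists of the eight distinct inequalities that appear in the α = 7 example.

```python
def solve_difference_constraints(variables, constraints):
    """constraints: list of (x, y, c) meaning x - y <= c.

    Returns one satisfying assignment as a dict, or None if none exists.
    """
    # A virtual source with a zero-weight edge to every variable gives
    # the initial distance 0 everywhere.
    dist = {v: 0 for v in variables}
    for _ in range(len(variables)):
        changed = False
        for x, y, c in constraints:
            if dist[y] + c < dist[x]:      # relax edge y -> x
                dist[x] = dist[y] + c
                changed = True
        if not changed:
            return dist
    return None  # still relaxing after |V| passes: negative cycle

variables = ["v0", "v1", "v2", "v3"]
constraints = [
    ("v0", "v1", 2), ("v1", "v2", 0), ("v1", "v3", 1),
    ("v2", "v3", 0), ("v3", "v0", 0),
    ("v1", "v0", -1), ("v1", "v3", -1), ("v2", "v3", -1),
]
r = solve_difference_constraints(variables, constraints)
print(r)
```

Any solution can be shifted by a constant without violating a difference constraint, which is how a normalization such as fixing one vertex's label to 0 can be imposed afterwards.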






# Min-Cycle Retiming

To find the minimum cycle time, do a binary search among the entries of the D matrix O(|V||E| log |V|)



- Theorem: *r* is a legal retiming on *G* such that the clock cycle  $c \le \alpha$  for some constant  $\alpha$  if and only if
  1.  $r(v_h) = 0$ , where  $v_h$  is the host vertex
  2.  $r(u)-r(v) \le w(e)$  for every edge e = (u, v)
  3.  $r(u)-r(v) \le W(u, v)-1$  (i.e., register count  $\ge 1$  after retiming) for every (u, v) with  $D(u, v) > \alpha$
- Solve the integer linear programming problem
  - Bellman-Ford method  $O(|V|^3)$

#### Min-Cycle Retiming

Algorithm of optimal retiming:

1. Compute W and D
2. Binary search for the minimum achievable clock period, applying the Bellman-Ford algorithm to check whether the constraints of the prior Theorem can be satisfied
3. Derive r(v) under the minimum achievable clock period found in Step 2

Complexity  $O(|V|^3 \lg |V|)$ 

**Two more algorithms:**

1. Relaxation based:
   - Repeatedly find critical path
   - Retime vertex at end of path by +1 ( $O(|V||E|\log|V|)$ )
2. Mixed Integer Linear Program formulation
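The relaxation-based idea can be sketched as a feasibility test for one target period c, in the spirit of the FEAS procedure from the Leiserson and Saxe retiming paper. The names `arrival_times` and `feas` and the ring example are hypothetical, and the sketch assumes a synchronous circuit with no special host handling.

```python
def arrival_times(vertices, d, w):
    """Largest delay of a register-free path ending at each vertex."""
    zero_in = {v: [] for v in vertices}
    for (u, v), regs in w.items():
        if regs == 0:
            zero_in[v].append(u)
    delta = {}

    def visit(v):
        # Memoized recursion; terminates because register-free edges are acyclic.
        if v not in delta:
            delta[v] = d[v] + max((visit(u) for u in zero_in[v]), default=0)
        return delta[v]

    for v in vertices:
        visit(v)
    return delta

def feas(vertices, d, w, c):
    """Return a retiming r with clock period <= c, or None if the test fails."""
    def retimed(r):
        return {(u, v): regs + r[v] - r[u] for (u, v), regs in w.items()}

    r = {v: 0 for v in vertices}
    for _ in range(len(vertices) - 1):
        delta = arrival_times(vertices, d, retimed(r))
        for v in vertices:
            if delta[v] > c:          # v ends a too-long path: retime it by +1
                r[v] += 1
    if max(arrival_times(vertices, d, retimed(r)).values()) > c:
        return None
    return r

# Hypothetical ring a -> b -> c -> a with gate delays 1, 2, 3.
vertices = ["a", "b", "c"]
d = {"a": 1, "b": 2, "c": 3}
w = {("a", "b"): 0, ("b", "c"): 0, ("c", "a"): 2}
print(feas(vertices, d, w, 3))
print(feas(vertices, d, w, 2))
```

Wrapping `feas` in a binary search over candidate periods would give a complete minimum-period search; that outer loop is left out to keep the sketch short.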

#### Min-Area Retiming



where  $a_v$  is a constant

#### Min-Area Retiming

Minimize:

$$\sum_{v\in V}a_v r(v)$$

Subject to:

$$w_r(e) = w(e) + r(v) - r(u) \ge 0 \quad \text{for every edge } e = (u, v)$$

Note: It is reducible to a flow problem
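A short derivation shows why the objective is linear in r. It assumes that registers are counted per edge, with no sharing across fanout branches; FI(v) and FO(v), the sets of edges entering and leaving v, are notation introduced here for illustration.

$$\sum_{e \in E} w_r(e) = \sum_{e=(u,v) \in E} [w(e) + r(v) - r(u)] = \sum_{e \in E} w(e) + \sum_{v \in V} \big(|FI(v)| - |FO(v)|\big)\, r(v)$$

The first sum on the right is fixed by the original circuit, so under this assumption minimizing the register count amounts to minimizing  $\sum_{v \in V} a_v r(v)$  with  $a_v = |FI(v)| - |FO(v)|$ .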

#### **Retiming Issues**

- Computation of equivalent initial states
  - Equivalent initial states may not always exist



General solution requires replication of logic for initialization

#### Timing models

Too far away from actual implementation

# Retiming + Clock Scheduling

#### Mathematical formulation

- s: E→R, a real edge labeling
- s(e) denotes the clock signal delay of all registers of e

In addition to the register weight matrix and the delay matrix for the maximum delay, we also need the minimum path delays

$$W(u,v) = \min_{p} \{w(p) : u \xrightarrow{p} v\}$$
$$D(u,v) = \max_{p} \{d(p) : u \xrightarrow{p} v,\ w(p) = W(u,v)\}$$
$$D_{\min}(u,v) = \min_{p} \{d(p) : u \xrightarrow{p} v,\ w(p) = W(u,v)\}$$


# Retiming + Clock Scheduling

- A valid retiming and clock skew schedule is an assignment to r and s such that:
  - (1)  $w_r \ge 0$
  - (2)  $\forall (u',u), (v,v')$ :

$$w(u',u) > 0 \land w(v,v') > 0 \land W(u,v) = 0 \Longrightarrow$$
$$D_{\min}(u,v) + s(u',u) - s(v,v') \ge T_{hold} \;\land\; D(u,v) + s(u',u) - s(v,v') \le T_{clock} - T_{setup}$$

- Solution: Mixed Integer Linear Program (MILP)
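For a single sending/receiving register pair, condition (2) can be checked directly. The sketch below is only an illustration: `check_register_pair`, its parameter names, and all numeric values are hypothetical, with `s_send` standing for s(u',u) and `s_recv` for s(v,v').

```python
def check_register_pair(d_max, d_min, s_send, s_recv, t_clock, t_setup, t_hold):
    """Return (setup_ok, hold_ok) for one register-to-register path."""
    hold_ok = d_min + s_send - s_recv >= t_hold
    setup_ok = d_max + s_send - s_recv <= t_clock - t_setup
    return setup_ok, hold_ok

# Hypothetical path: longest delay 9, shortest delay 4, target period 8.
params = dict(d_max=9, d_min=4, t_clock=8, t_setup=1, t_hold=1)

no_skew = check_register_pair(s_send=0, s_recv=0, **params)
some_skew = check_register_pair(s_send=0, s_recv=2, **params)   # receiver clocked 2 later
too_much = check_register_pair(s_send=0, s_recv=4, **params)    # receiver clocked 4 later

print(no_skew, some_skew, too_much)
```

Delaying the receiving register's clock relaxes the setup condition and tightens the hold condition by the same amount, which is why both bounds appear together in the formulation.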

# Retiming & Resynthesis

- Combine retiming and combinational optimization
  - Retime registers such that the circuit has a large combinational logic block for optimization
  - Resynthesize the combinational logic block with combinational logic minimization techniques
- Retiming and resynthesis can be iterated
  - Can achieve any state re-encoding


