Parallel Per-Critical-Clock (PPCC) Logic Synthesis and Netlist Fusion For Best PPA and Convergence
Qualcomm
Bangalore, India
https://www.qualcomm.com
ABSTRACT
For complex designs with multiple high-frequency clocks, today's logic synthesis tools are unable
to achieve the best possible PPA (Performance-Power-Area), convergence and runtime because of
limitations in the methodology. This paper discusses these limitations in the logic synthesis
methodologies currently available. As a solution, we present a logic synthesis framework
called PPCC (parallel per-critical-clock) synthesis, which breaks the problem into pieces and
introduces parallelism: we launch multiple PPCC synthesis runs, each achieving the best PPA for
one critical clock. At the end, the resulting netlists are fused together to create the final
synthesized netlist.
SNUG 2019
Table of Contents
1. Introduction – How Logic Optimization happens today
2. Problem in the present methodology
2.1 Design exploration quits without achieving the best PPA
2.2 Design convergence challenge for complex design
2.3 High optimization runtime
3. PPCC solution introduced
3.1 Identification of critical clock candidates
3.2 Identification of corresponding submodule scopes
3.3 Initiate multiple PPCC optimization
3.4 Netlist fusion
4. Superiority of the proposed solution
4.1 Better Timing
4.2 Better Area
4.3 Reduced optimization runtime
4.4 Predictability of solution and faster design convergence
5. Experiments and Results
6. Conclusions
7. References
Table of Figures
Figure 1. PPCC flow chart
Table of Tables
Table 1. Frequency Targets
Table 2. Timing and runtime results for default synthesis
Table 3. Timing and runtime results for CLKa PPCC
Table 4. Timing and runtime results for CLKb PPCC
Table 5. Timing and runtime results for CLKc PPCC
clock domains problem, individual PPCC runtime improves significantly with respect to the
multiple-clock synthesis runtime. Thus we eventually achieve a significant improvement in runtime.
Here CLKa, CLKb and CLKc are asynchronous to each other, and the design has distinct hierarchies
SubDESa, SubDESb and SubDESc, which operate under CLKa, CLKb and CLKc respectively. Three PPCC
synthesis runs were invoked.
1) The first PPCC synthesis run targeted the CLKa clock and the SubDESa submodule scope. CLKa
was maintained at its target frequency, and CLKb and CLKc were lowered to 100 MHz.
Boundary optimization across SubDESa was turned off and its interfaces were kept intact.
2) The second PPCC synthesis run targeted the CLKb clock and the SubDESb submodule scope. CLKb
was maintained at its target frequency, and CLKa and CLKc were lowered to 100 MHz.
Boundary optimization across SubDESb was turned off and its interfaces were kept intact.
3) The third PPCC synthesis run targeted the CLKc clock and the SubDESc submodule scope. CLKc
was maintained at its target frequency, and CLKa and CLKb were lowered to 100 MHz.
Boundary optimization across SubDESc was turned off and its interfaces were kept intact.
4) At the end, the top netlist was created by extracting SubDESa from the first netlist,
SubDESb from the second netlist and SubDESc from the third netlist.
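The four steps above can be summarized as a small planning sketch. This is not the flow used in the experiment, only an illustration of the bookkeeping: the class `PpccRun`, the functions `plan_ppcc_runs` and `fuse`, the `netlist_<clock>` labels and the target frequencies are all hypothetical names and placeholder values introduced here. Only the clock names, the submodule names, the 100 MHz relaxed frequency and the disabled boundary optimization come from the text.

```python
from dataclasses import dataclass

RELAXED_MHZ = 100.0  # frequency applied to the non-critical clocks in each run (from the text)


@dataclass
class PpccRun:
    """Settings for one per-critical-clock synthesis run (illustrative only)."""
    critical_clock: str
    scope: str                           # submodule whose netlist is kept from this run
    clock_mhz: dict                      # clock name -> frequency constraint in MHz
    boundary_optimization: bool = False  # turned off across the scope, interfaces kept intact


def plan_ppcc_runs(targets_mhz, scopes):
    """Build one run per critical clock: keep its target, relax every other clock."""
    runs = []
    for clock, scope in scopes.items():
        freqs = {c: (f if c == clock else RELAXED_MHZ) for c, f in targets_mhz.items()}
        runs.append(PpccRun(clock, scope, freqs))
    return runs


def fuse(runs):
    """Netlist fusion bookkeeping: map each submodule to the run that supplies it."""
    return {run.scope: f"netlist_{run.critical_clock}" for run in runs}


# Placeholder target frequencies, NOT the values used in the paper.
targets = {"CLKa": 1000.0, "CLKb": 800.0, "CLKc": 600.0}
scopes = {"CLKa": "SubDESa", "CLKb": "SubDESb", "CLKc": "SubDESc"}

runs = plan_ppcc_runs(targets, scopes)
fusion_map = fuse(runs)

for run in runs:
    print(run.critical_clock, run.scope, run.clock_mhz)
print(fusion_map)
```

In a real flow each `PpccRun` would be translated into the constraint and compile scripts of the synthesis tool, and the fusion map would drive which submodule netlist is read into the top level.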
We present tables and graphs that compare the timing results, logic area and compile runtime
of the PPCC netlist against normal/default synthesis.
The overall runtime of PPCC synthesis is the maximum of the individual PPCC runtimes, since the
runs execute in parallel. In this example the CLKb PPCC run has the longest compile time
(465 minutes), so the overall PPCC runtime can be taken as 465 minutes.
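Written as a formula, and assuming the three runs are launched concurrently and the time spent on netlist fusion is negligible (an assumption made here, not a measured figure), the overall runtime is as follows. The symbols T with a clock-name subscript are notation introduced for this sketch and denote the compile runtime of the corresponding PPCC run.

```latex
T_{\mathrm{PPCC}}
  = \max\left(T_{\mathrm{CLKa}},\; T_{\mathrm{CLKb}},\; T_{\mathrm{CLKc}}\right)
  = T_{\mathrm{CLKb}}
  = 465\ \text{minutes}
```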
6. Conclusions
PPCC was also tried on a couple of other cores, and the methodology responded in a similar way
everywhere. The PPCC netlist achieved better timing and area as well as better compile runtime.
In addition, design convergence and the predictability of convergence improve significantly with
PPCC. This benefit will be amplified as the complexity of the design grows.
7. References
[1] Solvnet
[2] VLSI forum