Building Better IP With RTL Architect NoC IP Physical Exploration by Arteris


Building Better IP with RTL Architect

NoC IP Physical Exploration by Arteris


Shivakumar Musini – Principal Physical Design Hardware Engineer
Frank Schirrmeister – VP Solutions & Business Development
Arteris IP

SNUG SILICON VALLEY 2023 1


About Arteris

• Silicon-proven IP used in 3 billion+ SoCs shipped to date
• 200+ customers and 600+ SoC design starts to date
• 70-80% market share of automotive ADAS SoC market¹
• 54 patents and 75 patent applications
• Broad support: any processors, IP, EDA, foundry
• Innovative technology coupled with expert support results in a 97% annual average customer retention rate
• Global reach: offices in 8 locations

¹Management estimates



Network-On-Chip (NoC)
Interconnect IP is Critical to Every SoC to Enable Next-Generation Technologies
• Smaller Die Area
• Lower Power Consumption
• Faster Frequency, Lower Latency
• Rapid Timing Closure Estimation
• Automated Verification
• Easy Configuration
• Shorter, More Predictable Schedules

[Figure: example SoC block diagram built around Arteris interconnect IP. Subsystem labels: CPU Subsystem, Domain-specific Subsystems, Accelerator Subsystem(s), Machine Learning Subsystem, Safety Island / Safety-Critical Subsystem, Application-specific IP Subsystem, Memory Subsystem, High Speed Wired Peripherals, Wireless Subsystem, Security Subsystem, I/O Peripherals. Interconnect labels: Arteris Ncore, Arteris FlexNoC, Arteris FlexNoC AI Package, Arteris FlexNoC Resilience (Safety) Package, Arteris CodaCache Last Level Cache, Memory Scheduler, CHI & ACE Protocols, Chiplet Links]

Legend:
• Arteris Ncore® cache coherent interconnect IP
• Arteris FlexNoC® non-coherent interconnect IP
• Arteris CodaCache® last level cache

Arteris Technology



Agenda
• Challenges with Traditional Flow
• Physical Exploration with RTL Architect
• Conclusions



Challenges
Challenges with Traditional Flow



Drivers for Network-on-Chip
Physical Awareness
Computing sophistication and complexity continue to skyrocket
– Exponential transistor count growth curve → increased data traffic
– Number of logical cores → quadratic/exponential growth in interconnect connectivity
– Overall SoC complexity → more sub-systems requiring their own connectivity
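
To make the "quadratic" growth in connectivity concrete: if every core were wired directly to every other core, the number of dedicated links would be n(n - 1)/2. The sketch below is illustrative only. The function name and the full point-to-point assumption are ours, not the presenters'; a NoC shares links precisely to avoid this growth.

```python
def point_to_point_links(cores: int) -> int:
    """Dedicated links needed to connect every pair of cores directly.

    Hypothetical helper for illustration; assumes one bidirectional
    link per pair of cores and no shared interconnect.
    """
    if cores < 0:
        raise ValueError("cores must be non-negative")
    return cores * (cores - 1) // 2


if __name__ == "__main__":
    # Doubling the core count roughly quadruples the link count.
    for n in (4, 8, 16, 32, 64):
        print(f"{n:>3} cores -> {point_to_point_links(n):>5} links")
```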

Physical effects are pronounced at 16nm and below
– SoCs contain multiple Networks-on-Chip (NoCs), accounting for 10-12% of silicon
– Power, Performance, and Area (PPA) factors impacting NoC and SoC iterations
– Impacts exacerbated by more advanced nodes: 7nm, 5nm, 3nm, etc.
Source: “Microprocessor Trend Data – 50 Years”, Karl Rupp, Feb 2022

Market pressures put a squeeze on project realities
– Shrinking project schedules despite increasing project complexity
– Economics of using silicon-proven NoC IP versus in-house development
– Shortage of trained system IP engineers → driving the shift to flexible IP automation
Source: Techspot, A Brief History of the Multi-Core Desktop CPU (2021)


The EDA Flow & NoC Design

• Customer uses Arteris tools to design the Network on Chip (NoC)
• Customer uses Synopsys tools to implement from RTL to GDSII
• Floorplan determines NoC area

[Figure: EDA flow diagram. Labels: Whiteboard Interconnect (Compute & Peripherals), Arteris NoC Automation, RTL, Digital Implementation of IP Blocks, Digital Implementation of Networks on Chip, Implementation]

EDA flow diagram source: Andrew B. Kahng, et al., “VLSI Physical Design: From Graph Partitioning to Timing Closure,” Springer (2011)

The EDA Flow & NoC Design
• Customer uses Arteris tools to design the Network on Chip (NoC)
• Customer uses Synopsys tools to implement from RTL to GDSII
• Floorplan determines NoC area

[Figure: EDA flow diagram with NoC detail. Labels: SNPS Platform Architect, Specification (# Initiators, # Targets, # Clock Domains, Critical Paths), NoC Topology, NoC Architecture, Arteris NoC Automation, RTL, Digital Implementation of Networks on Chip, Implementation]

EDA flow diagram source: Andrew B. Kahng, et al., “VLSI Physical Design: From Graph Partitioning to Timing Closure,” Springer (2011)

The EDA Flow & NoC Design

Time per iteration
– Hours: RTL simulation
– Days: RTL to gate-level netlist
– Weeks: Netlist to GDSII

Failure to close timing results in weeks-long delays

EDA flow diagram source: Andrew B. Kahng, et al., “VLSI Physical Design: From Graph Partitioning to Timing Closure,” Springer (2011)

Interconnect Timing Design Challenges
• No standard methodology for timing closure for on-chip IP communications
• Process node advances add to RC delays for long, cross-chip distances traveled
• Interconnects that connect different IP blocks span long distances and hence suffer from RC delays

[Figure: Interconnect RC Trend. Resistance (Ω/mm) and capacitance (pF/mm) versus technology node (32 nm, 22 nm, 15 nm, 11 nm, 7 nm). Annotation: Interface-Layer Effects and Grain Scattering Impact Resistance Scaling]

Source: Serkan Kincal, et al., “RC Performance Evaluation of Interconnect Architecture Options Beyond the 10‐nm Logic Node,” IEEE (2014)
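
Why long wires hurt: for an unbuffered wire with uniform resistance r and capacitance c per unit length, the first-order Elmore estimate of delay is 0.5 · r · c · L², so delay grows with the square of the distance. The sketch below is a minimal illustration under that assumption. The function name and the r and c values are hypothetical and are not read from the chart; real interconnect is buffered and modeled far more carefully.

```python
def elmore_wire_delay_ps(r_ohm_per_mm: float, c_pf_per_mm: float,
                         length_mm: float) -> float:
    """First-order Elmore delay of an unbuffered distributed RC wire.

    Units: ohm * pF = ps, so the result is in picoseconds.
    Driver resistance and load capacitance are ignored (assumption).
    """
    return 0.5 * r_ohm_per_mm * c_pf_per_mm * length_mm ** 2


if __name__ == "__main__":
    R_OHM_PER_MM = 1000.0  # illustrative value only
    C_PF_PER_MM = 0.2      # illustrative value only
    for length in (1.0, 2.0, 4.0):
        delay = elmore_wire_delay_ps(R_OHM_PER_MM, C_PF_PER_MM, length)
        print(f"{length:.0f} mm -> {delay:.0f} ps")
```

Doubling the wire length quadruples the estimated delay, which is why cross-chip NoC links need buffering or pipelining.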



SoC Timing Issues
Can’t Cross Advanced Node SoCs in One Clock Cycle
Physical distance impacts the number of pipeline stages

[Figure: NoC paths across an SoC floorplan, annotated with clock cycles (0 to 4). Legend: Endpoint (NIU), Single NoC cycle path, Pipeline]

#ToughToDoManually #AutomationNeeded

Transport delay = F(foundry, routing stack, type of driving cell, process voltage, temperature, …)
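
How distance turns into pipeline stages: the sketch below collapses the transport-delay function into a single delay-per-millimetre figure, which is a deliberate simplification. The function name, the 150 ps/mm figure, and the clock rates are hypothetical; setup margin, clock uncertainty, and logic delay are ignored.

```python
import math


def pipeline_stages(distance_mm: float, delay_ps_per_mm: float,
                    clock_ghz: float) -> int:
    """Estimate pipeline registers needed along a NoC link.

    Assumes transport delay scales linearly with distance (buffered
    wire) and that the whole clock period is available to the wire.
    """
    period_ps = 1000.0 / clock_ghz
    transport_delay_ps = distance_mm * delay_ps_per_mm
    # A path always takes at least one cycle; each extra cycle
    # requires one additional pipeline register.
    cycles = max(1, math.ceil(transport_delay_ps / period_ps))
    return cycles - 1


if __name__ == "__main__":
    for distance, clock in ((6.0, 1.0), (6.0, 2.0), (10.0, 2.0)):
        stages = pipeline_stages(distance, 150.0, clock)
        print(f"{distance} mm at {clock} GHz -> {stages} pipeline stage(s)")
```

The same link can need zero stages at one clock rate and several at another, which is why the stage count has to be recomputed after every floorplan or frequency change.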
The Customer PPA Struggle
NoC RTL is different for every refinement

• Customer uses Arteris tools to design the Network on Chip (NoC)
• Customer uses Synopsys tools to implement from RTL to GDSII
• Arteris AE Team supports customer’s implementation efforts

[Figure: iteration loop. Labels: NoC Architecture Development, RTL generation & export, RTL, Synthesis, Timing is not met, Change requirements for RTL]

EDA flow diagram source: Andrew B. Kahng, et al., “VLSI Physical Design: From Graph Partitioning to Timing Closure,” Springer (2011)

A Customer Example
Manual Flow With Some Layout Awareness & Guidance
[Figure: manual NoC design flow. Labels: Specification (# Initiators, # Targets, # Clock Domains, Critical Paths), NoC Topology, NoC Architecture (Manually Co-optimized), Software, RTL + Constraints (Verilog), Pipeline Insertion, Synthesis, P&R, Layout (Gate-Level). Annotations: Guidance to P&R given manually; Manual update of constraints for P&R]

Manual: Timing 14 – 35 days, Layout 70 Days (Customer Example)



Automation Opportunities
Abstract estimations
[Figure: manual NoC design flow. Labels: Specification (# Initiators, # Targets, # Clock Domains, Critical Paths), NoC Topology, NoC Architecture (Manually Co-optimized), Software, RTL + Constraints (Verilog), Pipeline Insertion, Synthesis, P&R, Layout (Gate-Level). Annotations: Guidance to P&R given manually; Manual update of constraints for P&R; Abstract to RTL based estimations]

Manual: Timing 14 – 35 days, Layout 70 Days (Customer Example)



What If?
What would earlier, accurate timing prediction achieve?
[Figure: manual NoC design flow with earlier timing estimation. Labels: Specification (# Initiators, # Targets, # Clock Domains, Critical Paths), NoC Topology, NoC Architecture (Manually Co-optimized), Software, RTL + Constraints (Verilog), Pipeline Insertion, Synthesis, P&R, Layout (Gate-Level). Annotations: Guidance to P&R given manually; Manual update of constraints for P&R; Earlier Timing Results with RTL Architect]

Timing: Earlier Estimation from RTL, 14 – 35 days Savings

Physical Exploration with RTL Architect
Limitations of the Existing Flow



Traditional Flow
Slow to converge with many customer iterations
[Figure: traditional flow. Labels: Whiteboard Interconnect, Software, Arteris NoC IP Technology, Specification (# Initiators, # Targets, # Clock Domains, Critical Paths), NoC Topology, NoC Architecture, NoC Configurations, RTL + Constraints (Verilog), Synthesis + margins, Customer PPA Estimates (Gate-Level), P&R PPA Report (Layout)]

• Problem: The margins don’t predict all the physical effects during implementation
• Customers iterate with Arteris to improve RTL
• Customers provide gate-centric reports that don’t pinpoint the problem with the RTL
• It takes multiple iterations to converge



Traditional Flow
Slow to converge with many customer iterations
[Figure: traditional flow. Labels: Whiteboard Interconnect, Software, Arteris NoC IP Technology, Specification (# Initiators, # Targets, # Clock Domains, Critical Paths), NoC Topology, NoC Architecture, NoC Configurations, RTL + Constraints (Verilog), Synthesis + margins, PPA Estimates (Gate-Level), P&R PPA Report (Layout)]

• Timing, logic levels, flop count/area numbers determine whether a design can be implemented
• DC/DC Topo synthesis to confirm timing with Arteris configurable NoC RTL IP
• Design sizes and DC runtimes increase!
• Some customers experience timing closure issues even later, using Fusion Compiler



RTL Architect Exploration
Converge faster with fewer customer iterations

[Figure: RTL Architect exploration flow. Labels: Whiteboard Interconnect, Software, Arteris NoC IP Technology, Specification (# Initiators, # Targets, # Clock Domains, Critical Paths), NoC Topology, NoC Architecture, NoC Configurations, RTL + Constraints (Verilog), RTL Architect + aligned margins, PPA Estimates (Gate-Level), Layout]

• Timing, logic levels, flop count/area numbers determine whether a design can be implemented
• RTLA synthesis to confirm timing with Arteris configurable NoC RTL IP: better PPA and faster runtimes
• RTL Architect predicts implementation PPA more accurately
• Customers arrive at implementable configurations faster



PPA Review with RTL Architect

• RTL Architect
– Provides reliable timing
– Allows assessments early
– … to fix high logic levels

• Designers review violations
– Possible timing bug
– Possible recoding
– Allowing ULVT
– Adding a pipe stage
– Choosing faster memory

[Figure: Example Results]

Source: Internal Reference Design, 5 nm



PPA Review
Long path identified early
• RTL Architect allows early review of placement issues
• Issues can be addressed well before actual layout runs are done
– Refining floorplan
– Exploring with placement bounds
– Exploring RTL recoding

Source: Internal Reference Design, 5 nm


PPA Review

• Using RTL Architect in early congestion analysis with PG
– Review Cell Density map & Hot spots
– Review Utilization
– Make floorplan changes if any

Source: Internal Reference Design, 5 nm

Early Power Estimation
• RTL Architect provides reliable power estimates

[Figure: power estimates compared, Fusion Compiler vs. RTL Architect]

Source: Internal Test Design, 16nm



NoC Implementation with RTL Architect
• RTL Architect runtime 3X+ faster than Fusion Compiler & 6X+ faster than DCNXT
• RTL Architect QoR results highly correlated to Fusion Compiler
– Area within 2%
– Timing within 10%
– Power within 5%
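
One way to check a correlation claim like this on your own design is to compare each metric from the early estimate against the implementation result and test it against the stated band. The sketch below assumes the percentages are relative differences; the function names, dictionary keys, and sample numbers are hypothetical and are not taken from any tool report format.

```python
# Tolerance bands as quoted on the slide: area 2%, timing 10%, power 5%.
TOLERANCE = {"area": 0.02, "timing": 0.10, "power": 0.05}


def relative_gap(estimate: float, reference: float) -> float:
    """Relative difference of an early estimate against the reference."""
    return abs(estimate - reference) / abs(reference)


def correlation_report(estimate: dict, reference: dict) -> dict:
    """Map each metric to True if the estimate is inside its band."""
    return {
        metric: relative_gap(estimate[metric], reference[metric]) <= tol
        for metric, tol in TOLERANCE.items()
    }


if __name__ == "__main__":
    # Made-up numbers for illustration (area in um^2, timing in ns, power in mW).
    rtl_estimate = {"area": 101_500.0, "timing": 0.92, "power": 48.0}
    implementation = {"area": 100_000.0, "timing": 1.00, "power": 50.0}
    print(correlation_report(rtl_estimate, implementation))
```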

[Figure: Fusion Compiler vs. RTL Architect. Callouts: High Metrics Correlation, 3X+ Faster Runtime]

Improved RTL quality with RTL Architect
Better RTL delivering better PPA

Conclusions



Conclusion

• Fast RTL Exploration Solution
– Built on implementation and signoff engines for improved convergence
– Will benefit our customers and make interaction easier
• Collaboration and Next Steps
– Arteris IP to update physical constraints generation to drive floorplan refinement based on RTL Architect’s timing and power estimation
– Expand support of abutted vs. channel-based implementation

Synopsys and Arteris are collaborating to improve NoC implementation flows for tough floorplans

Further Automation Opportunities
Apply further abstraction
[Figure: manual NoC design flow. Labels: Specification (# Initiators, # Targets, # Clock Domains, Critical Paths), NoC Topology, NoC Architecture (Manually Co-optimized), Software, RTL + Constraints (Verilog), Pipeline Insertion, Synthesis, P&R, Layout (Gate-Level). Annotations: Guidance to P&R given manually today; Manual update of constraints for P&R]

• Automate insertion of pipelines using NoC IP knowledge
• Take actual layout information into consideration earlier

Manual: Timing 14 – 35 days, Layout 70 Days (Customer Example)



FlexNoC 5
Announced February 2023
[Figure: FlexNoC 5 flow. Labels: Specification (# Initiators, # Targets, # Clock Domains, Critical Paths), NoC Topology, NoC Architecture (Co-optimized), Software, Automated Constraints for P&R, Layout .def import from P&R Tools (also: Vizio/Photo), Generated RTL with Constraints (Verilog), Automated Pipeline Insertion, Co-Optimize NoC IP with Digital Implementation, P&R, Layout (Gate-Level)]

FlexNoC 5: Timing 10 – 25 days, Layout 23 days. Up to 5x Faster Physical Closure



THANK YOU

YOUR INNOVATION
YOUR COMMUNITY
