Snug
Gated Clock & DesignWare Handling on FPGA Prototype Platforms
Einav Shmaryh
Texas Instruments
Agenda
• Introduction
• TI Overview
• TI Design Challenges
• Summary
Introduction
• Texas Instruments WCS designs connectivity solutions for the cellular market: Bluetooth, WLAN, GPS, GNSS, NFC and FM
• Example devices:
– Nokia C7: GPS, WLAN, Bluetooth®, FM
– RIM PlayBook: GPS, WLAN, Bluetooth®, FM
– LGE Optimus 3D: WLAN/Bluetooth®
– Motorola Droid X: GPS, WLAN, Bluetooth®, FM
FPGA Prototype – Targets
– At-speed RF connection to the FPGA: the FPGA prototype designs run at speed (ARM Cortex-M3 at 80 MHz). The FPGA platform is connected to an RF device that performs:
• GPS location fix from a satellite in real time
• WLAN AP link or Bluetooth wireless link
– Real-time FW development and debug before tape-out
– Real-time FW/HW integration before tape-out
– Ability to demonstrate the ASIC chip 2-3 weeks after chip arrival (with a ROM-based solution), with mature FW
TI FPGA Design Complexity
• Includes 7 different CPUs (ARM7, Cortex-M3)
• Multiple paths with more than 100 logic levels between FFs
• Multiple clock dividers (dynamic & static) => these need to be synchronized
• Multiple interfaces, including RF (BT => wireless, GPS => to the satellite)
• Use of multiple DesignWare components
• Dynamic power switching
• Real time: timing must close at up to 80 MHz
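To give a feel for why paths with more than 100 logic levels are hard to close at 80 MHz, the sketch below works out the clock period and the average delay available per logic level. The 80 MHz and 100-level figures come from the slide; treating the whole period as available to logic (no setup time, clock-to-Q delay or skew margin) is a simplifying assumption, and the function names are illustrative.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def budget_per_level_ps(freq_mhz: float, logic_levels: int) -> float:
    """Average delay budget per logic level, in picoseconds.

    Assumption: the full period is available to combinational logic.
    A real budget is smaller once FF and routing overheads are removed.
    """
    return period_ns(freq_mhz) * 1000.0 / logic_levels


if __name__ == "__main__":
    print(period_ns(80.0))                 # 12.5 ns period at 80 MHz
    print(budget_per_level_ps(80.0, 100))  # 125.0 ps per level for 100 levels
```

A budget on the order of 125 ps per level, including routing, is the kind of figure that motivates extra pipeline stages on the FPGA.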
Design Challenges Moving Toward FPGA
• Our design encountered 3 main challenges:
– FPGA vs. ASIC - Clock Tree
– DesignWare Implementation in FPGA
– FPGA Flow
FPGA vs. ASIC - Clock Tree
• The TI ASIC design includes clock dividers, some with a constant divider value and some with a dynamic divider value. Each divider is followed by a new clock tree.
• To reach a higher frequency and to eliminate clock skew, the FPGA needs a minimum number of clocks.
• For this reason we used the “fix gated clock” option.
• Synplify with the “fix gated clock” option can solve only the constant divider case: the tool “knows” the divider value and maps it to the FF data or CE input.
• Fixing the dynamic divider case requires dedicated RTL in addition to the “fix gated clock” option (no tool can guess the divider value).
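The idea behind mapping a constant divider onto a clock enable can be illustrated with a small cycle-level model. This is only a behavioral sketch in Python, not what Synplify does internally: it assumes the divided clock rises every `divide_by` base-clock cycles starting at cycle 0, and all function names are made up for the illustration.

```python
def ff_on_divided_clock(data, divide_by):
    """ASIC style: a flip-flop clocked by its own divided clock.

    The divided clock is modeled as high for the first part of every
    divide_by-cycle window; the FF samples on its rising edge.
    """
    if divide_by < 2:
        raise ValueError("divide_by must be at least 2")
    high_cycles = max(1, divide_by // 2)
    q, prev_clk, trace = 0, 0, []
    for cycle, d in enumerate(data):
        div_clk = 1 if cycle % divide_by < high_cycles else 0
        if div_clk == 1 and prev_clk == 0:  # rising edge of the divided clock
            q = d
        prev_clk = div_clk
        trace.append(q)
    return trace


def ff_with_clock_enable(data, divide_by):
    """FPGA style: the same flip-flop on the base clock, gated by a clock enable."""
    q, trace = 0, []
    for cycle, d in enumerate(data):
        clock_enable = (cycle % divide_by == 0)  # one base-clock cycle wide
        if clock_enable:
            q = d
        trace.append(q)
    return trace


if __name__ == "__main__":
    samples = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
    for n in (2, 3, 4):
        # Both styles produce the same register contents on every base cycle.
        assert ff_on_divided_clock(samples, n) == ff_with_clock_enable(samples, n)
    print(ff_with_clock_enable(samples, 4))
```

Because the divider value is a constant, the enable pattern is fully known at synthesis time, which is why a tool can perform this conversion on its own.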
FPGA vs. ASIC - Clock Tree
The next figure shows RTL with a constant divider value, implemented with the Synplify “fix gated clock” option.
FPGA vs. ASIC - Clock Tree
With a dynamic divider value, the Synplify implementation breaks the clock tree and the clocks are no longer aligned (no synthesis tool can “guess” the divider value).
FPGA vs. ASIC - Clock Tree
Dedicated RTL, combined with the “fix gated clock” tool option, solves the clock tree for the dynamic divider value.
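The slides do not show the dedicated RTL itself, so the following is only a hypothetical model of what such logic could look like: a counter on the base clock that reloads from the current divider value and emits a one-cycle clock enable. The class name, the reload policy (a new divider value takes effect at the next enable) and the cycle-level timing are all assumptions made for this sketch.

```python
class DynamicClockEnable:
    """Hypothetical clock-enable generator for a runtime-programmable divider."""

    def __init__(self):
        self.count = 0  # base-clock cycles remaining until the next enable

    def step(self, divide_by):
        """Advance one base-clock cycle; return 1 when gated FFs should update."""
        if divide_by < 1:
            raise ValueError("divide_by must be at least 1")
        if self.count == 0:
            # Emit the enable and reload with whatever value is programmed now.
            self.count = divide_by - 1
            return 1
        self.count -= 1
        return 0


if __name__ == "__main__":
    gen = DynamicClockEnable()
    # Divider value changes from 2 to 4 while the design is running.
    programmed = [2, 2, 2, 2, 4, 4, 4, 4, 4, 4, 4, 4]
    print([gen.step(n) for n in programmed])
```

Every flip-flop stays on the single base clock; only the enable pattern follows the divider value, which is information a synthesis tool cannot derive by itself when the value is set at run time.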
FPGA vs. ASIC - Clock Tree
Synplify implementation: the clock tree is fixed.
FPGA vs. ASIC - Clock Tree
Advantages:
– With one clock the tool can close timing at a higher frequency
– Eliminates clock skew
– Better turnaround time
– Simpler constraints
– Fewer RTL changes (all the “swallow” RTL is in the ASIC RTL)
Disadvantage:
1. The clock duty cycle changes, which might create a timing path if the falling edge is used => these clocks need special code (fall detect)
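The duty-cycle point can be made concrete with a small model. A divide-by-4 clock is high half the time, while the clock enable that replaces it is high for only one base-clock cycle in four, and it has no falling edge in the original position. One possible reading of "fall detect", assumed here and not confirmed by the slides, is a second enable asserted in the cycle where the divided clock would have fallen. The sketch assumes an even divider value and uses made-up function names.

```python
def divided_clock(divide_by, cycles):
    """Original divided clock: 50% duty cycle (even divide_by assumed)."""
    return [1 if c % divide_by < divide_by // 2 else 0 for c in range(cycles)]


def rise_enable(divide_by, cycles):
    """Clock enable replacing the rising edge: one base cycle per period."""
    return [1 if c % divide_by == 0 else 0 for c in range(cycles)]


def fall_enable(divide_by, cycles):
    """Assumed 'fall detect' enable: marks the cycle of the old falling edge."""
    return [1 if c % divide_by == divide_by // 2 else 0 for c in range(cycles)]


def duty_cycle(wave):
    """Fraction of cycles in which the signal is high."""
    return sum(wave) / len(wave)


if __name__ == "__main__":
    n, cycles = 4, 8
    print(divided_clock(n, cycles), duty_cycle(divided_clock(n, cycles)))
    print(rise_enable(n, cycles), duty_cycle(rise_enable(n, cycles)))
    print(fall_enable(n, cycles))
```

Logic that used the falling edge of the divided clock would be driven by the second enable in this model, still on the single base clock.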
DesignWare
[Figure: DW02_mult instance “mult” with inputs a[7:0], b[7:0], tc = 0, output product[15:0], and an 8-bit register c[7:0] with clk and reset]
• Synplify Premier recognizes DesignWare automatically
– Equivalent RTL is substituted from a built-in library
– Mapped to the FPGA just like other RTL
• True DesignWare components are used if available (& licensed)
– Exact same IP as the SoC
[Figure: schematic of the RTL substituted for DW02_mult, with inputs a[7:0], b[7:0] and tc and output product[15:0]]
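For readers unfamiliar with the component in the figure, the sketch below is a behavioral model of an 8x8 multiplier with a `tc` control, on the understanding that `tc = 0` selects unsigned operands and `tc = 1` selects two's-complement operands. It is written from that general description, not from the DesignWare source, and the function name and default widths are chosen for illustration.

```python
def dw02_mult_model(a, b, tc, a_width=8, b_width=8):
    """Behavioral model of a combinational multiplier with a tc input.

    tc = 0: operands are treated as unsigned.
    tc = 1: operands are treated as two's-complement signed.
    Returns the product as an unsigned bit pattern of a_width + b_width bits.
    """
    def to_signed(value, width):
        # Reinterpret an unsigned bit pattern as a two's-complement number.
        return value - (1 << width) if value & (1 << (width - 1)) else value

    a &= (1 << a_width) - 1
    b &= (1 << b_width) - 1
    if tc:
        product = to_signed(a, a_width) * to_signed(b, b_width)
    else:
        product = a * b
    return product & ((1 << (a_width + b_width)) - 1)


if __name__ == "__main__":
    print(hex(dw02_mult_model(0xFF, 0xFF, tc=0)))  # 255 * 255 = 0xfe01
    print(hex(dw02_mult_model(0xFF, 0xFF, tc=1)))  # (-1) * (-1) = 0x1
    print(hex(dw02_mult_model(0x80, 0x02, tc=1)))  # -128 * 2 = 0xff00
```

A model like this can serve as a reference when checking that the substituted RTL and the original component agree on signed and unsigned corner cases.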
DesignWare
• The TI ASIC design uses DW (DesignWare) IP from Synopsys, such as PCIe and USB HSIC. These DWs are integrated in the TI design.
• The TI FPGA prototype, which uses Synopsys DW, encountered two issues:
1. FPGA DW implementation. There are two options for synthesizing the DW into the FPGA:
A. Synthesize the DW with the Synplify Premier tool. The tool synthesizes the DW as a black box using the Synopsys library.
B. Use the ASIC netlist of the DW instead of the Synopsys DW IP, and use the Synplify Pro tool.
Using Synplify Premier is more FPGA friendly.
Using the Synplify Premier tool achieved more than 60% better timing closure.
DesignWare
2. Fix Gated Clock with DW
Using the Fix Gated Clock option with DW “breaks” the clock tree (adds a BUFG) at the DW clock start point.
[Figure: RTL view of the same example (one FF changed to DW)]
[Figure: netlist view]
FPGA Flow
To achieve the best timing closure and a fast turnaround time:
• Minimize FPGA-specific changes to the original RTL code
• Educate RTL designers to write “FPGA friendly” RTL (e.g. add “ifdefs” in the RTL with an extra pipeline stage or changed clocks …)
• Participate in the ASIC architecture from the start to understand how the FPGA can emulate the design better
• Use a script-based working flow to avoid editing bugs and get a faster turnaround time
Summary
1. For constant clock dividers, the Synplify tool (with the “fix gated clock” option) reduces the clock tree to one clock plus clock enables
2. For dynamic clock dividers, a special RTL “hook” is needed (together with the “fix gated clock” option)
3. For best timing closure, the DW must be synthesized with the Synplify Premier tool
4. DW with Synplify Premier is more FPGA “friendly”
5. FPGA flow
Q&A
THANK YOU