FPGA Design Timing Closure
Wang Wei (王巍)
13820779613
wangweibit@163.com
2007 Xilinx Joint Laboratory Directors' Meeting
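Main contents
Concepts of timing constraints
Timing closure flow
Timing closure flow - coding style
Timing closure flow - synthesis techniques
Timing closure flow - pin constraints
Timing closure flow - timing constraints
Timing closure flow - static timing analysis
Timing closure flow - implementation techniques
Timing closure flow - Floorplanner and PACE

Basic roles of constraints
Increase the design's operating frequency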
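Constraints control the synthesis, mapping, placement, and routing of the logic so as to reduce logic and routing delay, which raises the operating frequency.
Obtain correct timing analysis reports
The FPGA design platform includes a static timing analysis tool that produces timing reports after mapping or after place and route, so the performance of the design can be evaluated.
The static timing analysis tool uses the constraints as the standard for judging whether timing meets the design requirements.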
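Specify FPGA pin locations and electrical standards
Because the FPGA is programmable, board design and fabrication can proceed in parallel with the FPGA design, without waiting until the FPGA pin locations are fully settled, which shortens system development time.
Constraints can also specify the interface standard and other electrical characteristics supported by each I/O pin.

Period constraints
A period (PERIOD) constraint covers the paths between synchronous elements clocked by the reference net, including flip-flops, latches, and synchronous RAMs.
A period constraint does not optimize the following paths:
Paths from input pins to output pins (purely combinational logic)
Paths from input pins to synchronous elements
Paths from synchronous elements to output pins
The period constraint is a basic timing and synthesis constraint. It is attached to a clock net, and the timing analysis tool uses it to check whether the delay of every path connected to synchronous timing ports (ports with setup and hold requirements) meets the requirement (paths from PADs to registers are not included).
Period is the simplest and most important notion in timing. Many other timing concepts differ slightly between tool vendors, but the notion of period is universal and is the foundation of FPGA/ASIC timing definitions. The other timing constraints discussed later are built on the period constraint, and many other timing formulas can be derived from the period formula.
Before applying a period constraint, estimate the clock period the circuit can achieve; do not constrain blindly. If the constraint is too loose, performance falls short of the requirement; if it is too tight, place-and-route time increases sharply and the result may even get worse.

Calculating the period constraint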
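The maximum operating frequency of the internal circuit depends on the setup and hold times of the synchronous elements themselves and on the logic and routing delays between them.
The minimum clock period is:
Tperiod = Tcko + Tlogic + Tnet + Tsetup - Tclk_skew
Tclk_skew = Tcd1 - Tcd2
where Tcko is the clock-to-output time, Tlogic is the combinational logic delay between synchronous elements, Tnet is the net delay, Tsetup is the setup time of the synchronous element, and Tclk_skew is the clock skew.

An example of a period constraint:
NET SYS_CLK PERIOD=10ns HIGH 4ns
This constraint is applied to all synchronous elements driven by SYS_CLK.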
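The minimum-period formula above can be turned into a quick budget check before choosing a PERIOD value. The following is a minimal Python sketch; the function names and all delay values are hypothetical and chosen only for illustration, not taken from any device data sheet.

```python
def min_clock_period_ns(t_cko, t_logic, t_net, t_setup, t_clk_skew=0.0):
    """Tperiod = Tcko + Tlogic + Tnet + Tsetup - Tclk_skew, all values in ns."""
    return t_cko + t_logic + t_net + t_setup - t_clk_skew


def max_frequency_mhz(period_ns):
    """Convert a clock period in ns to a frequency in MHz."""
    return 1000.0 / period_ns


# Hypothetical delays for one register-to-register path (assumed values).
period = min_clock_period_ns(t_cko=0.5, t_logic=4.0, t_net=3.0, t_setup=0.5)
fmax = max_frequency_mhz(period)

# Compare against the 10 ns PERIOD used in the example constraint.
meets_10ns = period <= 10.0
print(f"minimum period {period} ns, fmax {fmax} MHz, meets 10 ns: {meets_10ns}")
```

With these assumed numbers the path needs 8 ns, so a 10 ns PERIOD would be a realistic constraint rather than a blind one.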
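The PERIOD constraint automatically handles inverted register clocks: if adjacent synchronous elements are clocked on opposite phases, the delay between them is limited by default to half of the PERIOD value.

Offset constraints
An offset constraint relates data to clock. It specifies the timing relationship between the external clock and the data input and output pins. It applies only to signals connected to PADs and cannot be used on internal signals.
Offset constraints optimize the following delay paths:
From input pins to synchronous elements: OFFSET IN
From synchronous elements to output pins: OFFSET OUT
To ensure reliable data sampling in the chip and correct data exchange with downstream chips, the timing relationship between the external clock and the data input and output pins must be constrained. An offset constraint tells the synthesis and routing tools when input data arrives or when output data becomes stable, which guarantees the timing relationship with the next-stage circuit.
OFFSET_IN_BEFORE
States how long before the active clock edge the input data is ready. The delay of the combinational logic inside the chip that is connected to the input pin must therefore not exceed this time (an upper bound), or sampling errors will occur.
OFFSET_IN_AFTER
States how long after the active clock edge the input data arrives at the chip's input pin. This also yields an upper bound on the chip's internal delay.

Input arrival time calculation: timing description
OFFSET_IN_AFTER means that the input data arrives at time Tarrival after the active clock edge, that is:
Tarrival = Tcko + Toutput + Tlogic
The synthesis and implementation tools will try to make the input-side delay Tinput satisfy:
Tarrival + Tinput + Tsetup < Tperiod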
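The input-side budget can be sketched numerically. This Python example assumes that arrival time, input delay, and setup time together must fit within one clock period; the function names and the delay values are hypothetical.

```python
def arrival_time_ns(t_cko, t_output, t_logic):
    """Tarrival = Tcko + Toutput + Tlogic of the upstream device, in ns."""
    return t_cko + t_output + t_logic


def max_input_delay_ns(t_period, t_arrival, t_setup):
    """Largest Tinput that still leaves room for Tarrival and Tsetup in one period."""
    budget = t_period - t_arrival - t_setup
    if budget <= 0:
        raise ValueError("no time left for the input path; the period is too short")
    return budget


# Assumed upstream delays and a 10 ns clock period.
t_arrival = arrival_time_ns(t_cko=1.0, t_output=2.0, t_logic=1.5)
t_input_max = max_input_delay_ns(t_period=10.0, t_arrival=t_arrival, t_setup=0.5)
print(f"Tarrival = {t_arrival} ns, Tinput must stay below {t_input_max} ns")
```

The later the data arrives after the clock edge, the less time remains for the pad-to-register path inside the FPGA.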
Answer:
The OFFSET_OUT_BEFORE offset constraint is:
NET DATA_OUT OFFSET=OUT 13ns BEFORE CLK
The OFFSET_OUT_AFTER constraint is:
NET DATA_OUT OFFSET=OUT 7ns AFTER CLK
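An output offset given BEFORE the next clock edge and one given AFTER the current clock edge describe the same deadline when they add up to the clock period. The Python sketch below assumes a 20 ns period, which the surviving text does not state; it is chosen only because it makes 13 ns BEFORE and 7 ns AFTER line up. The function name is hypothetical.

```python
def offset_after_from_before(period_ns, before_ns):
    """Convert 'valid before_ns before the next edge' into 'valid after_ns after this edge'."""
    if not 0 <= before_ns <= period_ns:
        raise ValueError("offset must lie within one clock period")
    return period_ns - before_ns


ASSUMED_PERIOD_NS = 20.0  # assumption, not given in the slides

after_ns = offset_after_from_before(ASSUMED_PERIOD_NS, 13.0)
print(f"OFFSET=OUT 13ns BEFORE CLK corresponds to {after_ns} ns AFTER CLK")
```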
Offset constraints
Given the system diagram below, what values would you put in the Constraints Editor so that the system will run at 100 MHz? (Assume no clock skew between devices.)

Path-Specific Timing Constraints
Using global timing constraints (PERIOD, OFFSET, and PAD-TO-PAD) will constrain your entire design
Using only global constraints often leads to over-constrained designs
Constraints are too tight
Increases compile time and can prevent timing objectives from being met
Review performance estimates provided by your synthesis tool or the Post-Map Static Timing Report
Path-specific constraints override the global constraints on specified paths
This allows you to loosen the timing requirements on specific paths
Areas of your design that can benefit from path-specific constraints
Multi-cycle paths
Paths that cross between clock domains
Bidirectional buses
I/O timing
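Path-specific timing constraints should be used to define your performance objectives and should not be indiscriminately placed
Suppose we need a 32-bit high-speed counter. Because the speed of a counter is limited by the carry delay from the least significant bit to the most significant bit, a prescaled counter structure is used to raise the speed: the counter is split into a small counter and a large counter, as shown in the figure. The small counter is 2 bits wide and the large counter is 30 bits wide, and both are driven by the same clock. The enable input EN of the large counter is driven by the carry of the small counter. The small counter carries once every 4 CLK cycles and holds EN active for one CLK period; on the active clock edge that arrives during that period, the large counter increments by 1.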
The registers of the small counter may therefore toggle on every CLK, so data from a lower-order register must reach the input of a higher-order register within 1 CLK; the maximum register-to-register delay is 1 CLK. The registers inside the large counter can toggle only once every 4 clock cycles, so data from a lower-order register only needs to reach the input of a higher-order register within 4 CLKs; the maximum register-to-register delay is 4 CLKs. This relaxes the timing requirement of the counter and makes a large high-speed counter feasible.
Figure: prescaled counter
Figure: constraint file

Path-pin offset Timing Constraints
Use the Pad to Setup and Clock to Pad columns to specify OFFSETs for all I/O paths on each clock domain
Easiest way to constrain most I/O paths
However, this can lead to an over-constrained design
Use the Pad to Setup and Clock to Pad columns to specify OFFSETs for each I/O pin
Use this type of constraint when only a few I/O pins need different timing
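Returning to the prescaled counter described earlier, a short behavioral model shows why the large counter qualifies as a multi-cycle path. This Python sketch is only an illustration of the described structure (2-bit small counter, 30-bit large counter, EN driven by the small counter's carry); the function name and the cycle count are hypothetical.

```python
def simulate_prescaled_counter(cycles):
    """Clock the prescaled counter and record when the large counter changes."""
    small = 0  # 2-bit counter, changes on every clock
    large = 0  # 30-bit counter, changes only when EN is active
    large_change_cycles = []
    for cycle in range(cycles):
        en = small == 3  # carry of the small counter, active for one clock
        if en:
            large = (large + 1) & ((1 << 30) - 1)
            large_change_cycles.append(cycle)
        small = (small + 1) & 0x3
    return small, large, large_change_cycles


small, large, changes = simulate_prescaled_counter(16)
gaps = [b - a for a, b in zip(changes, changes[1:])]
print("large counter changed at cycles", changes, "gaps", gaps)
print("combined 32-bit value:", (large << 2) | small)
```

The large counter's registers change only every fourth clock, so paths inside it can be given a budget of four clock periods, while paths in the small counter keep the single-period budget.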
False paths Constraints
If a PERIOD constraint were placed on this design, what delay paths would be constrained?
If the goal is to optimize the input and output times without constraining the paths between registers, what constraints are needed?
Assume that a global PERIOD constraint is already defined

Timing Constraint Priority
False paths
Must be allowed to override any timing constraint
FROM THRU TO
FROM TO
Pin-specific OFFSETs
Group OFFSETs
Groups of pads or registers
Global PERIOD and OFFSETs
Lowest priority constraints
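The priority order above can be read as a lookup: when several constraints cover the same path, the one highest in the list applies. The Python sketch below illustrates that reading only; the string labels and the function name are hypothetical and are not tool keywords.

```python
# Highest priority first, following the order on the slide.
PRIORITY = [
    "FALSE_PATH",
    "FROM_THRU_TO",
    "FROM_TO",
    "PIN_OFFSET",
    "GROUP_OFFSET",
    "GLOBAL",
]


def winning_constraint(applicable):
    """Return the highest-priority constraint type among those covering a path."""
    for kind in PRIORITY:
        if kind in applicable:
            return kind
    return None


# A path covered by a global PERIOD and a FROM TO constraint.
print(winning_constraint({"GLOBAL", "FROM_TO"}))
# A path that has also been declared a false path.
print(winning_constraint({"GLOBAL", "FROM_TO", "FALSE_PATH"}))
```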
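Timing closure flow
How do you judge whether a completed design is successful?
Does the design meet the area requirement, that is, can it be implemented in the selected device?
Does the design meet the performance requirement, that is, can it reach the required operating frequency?
Do the pin definitions meet the requirements: signal names, locations, I/O standards, data flow direction, and so on?
How do you judge whether the design fits the selected device?
Does the selected device have enough resources to hold more logic? If so, how much?
If the design fits the selected device, can it be routed completely?
Method: check the Map Report or the Place & Route Report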
时序收敛流程时序收敛流程Project Navigator 产生两种时序报告:
Post-Map Static Timing Report
Post-Place & Route Static Timing Report
时序报告包含没有满足时序要求的详细路径的描述,用于分析判断时序要求没有得到满足的原因。
Timing Analyzer用于建立和阅读时序报告。时序收敛流程时序收敛流程合理的性能约束的依据
Post-Map Static Timing Report
包括:实际的逻辑延迟和(block delays)和0.1 ns网络延迟( net delays)
合理的时序性能约束的原则:60/40 原则
If less than 60 percent of the timing budget is used for logic delays, the Place & Route tools should be able to meet the constraint easily.
Between 60 to 80 percent, the software run time will increase.
Greater than 80 percent, the tools may have trouble meeting your goals.

Three steps to a performance breakthrough:
1. Make full use of embedded (dedicated) resources
DSP48, PowerPC processor, EMAC, MGT, FIFO, block RAM, ISERDES, OSERDES, and so on
2. Pursue good coding style
Use synchronous design methodology
Ensure the code is written optimally for critical paths
Pipeline (Xilinx FPGAs have abundant registers)
3. Make full use of synthesis and Place & Route tool options
Try different optimization techniques
Add critical timing constraints in synthesis
Preserve hierarchy
Apply full and correct constraints
Use High effort

Use embedded blocks

Simple Coding Steps Yield 3x Performance
Use pipeline stages: more bandwidth
Use synchronous reset: better system control
Use Finite State Machine optimizations
Use inferable resources
Multiplexer
Shift Register LUT (SRL)
Block RAM, LUT RAM
Cascade DSP
Avoid high-level constructs (loops, for example) in code
Many synthesis tools produce slow implementations

Synthesis guidelines
Use timing constraints
Define tight but realistic individual clock constraints
Put unrelated clocks into different clock groups
Use proper options and attributes
Turn off resource sharing
Move flip-flops from IOBs closer to logic
Turn on FSM optimization
Use the retiming option
Impact of Constraints

Place & Route Guidelines
Timing constraints
Use tight, realistic constraints
Recommended options
High-effort Place & Route
By default, effort is set to Standard
Timing-driven MAP
Multi-Pass Place & Route (MPPR)
Tools to help meet timing
Floorplanning (use the PACE and PlanAhead software tools)
Physical synthesis tools
Other available options:
Incremental design
Modular design flows

Impact of Constraints in Tools

Coding style
Use synchronous design techniques
Use Xilinx-specific code
Use cores provided by Xilinx
Use hierarchical design
Use the static timing analysis reports generated by ISE to find the timing-critical paths and optimize them

Synthesis techniques
Use the options offered by the synthesis tool, especially constraint-driven techniques, to optimize the design netlist and improve system performance
When critical paths are identified for the synthesis tool, it can raise its effort level and apply more thorough algorithms to reduce the critical-path delay
Synthesis tools offer many optimization options for reaching the desired system performance and area targets
See the F1 help or the XST User Guide
Register Duplication
Timing-Driven Synthesis
Timing Constraint Editor
FSM Extraction
Retiming
Hierarchy Management
Schematic Viewer
Error Navigation
Cross-Probing
Physical Optimization

Duplicating Flip-Flops
High-fanout nets can be slow and hard to route
Duplicating flip-flops can fix both problems
Reduced fanout shortens net delays
Each flip-flop can fanout to a different physical region of the chip to reduce routing congestion
Design trade-offs
Gain routability and performance
Increase design area
Increase fanout of other nets

Timing-Driven Synthesis
Synplify, Precision, and XST software
Timing-driven synthesis uses performance objectives to drive the optimization of the design
Based on your performance objectives, the tools will try several algorithms to attempt to meet performance while keeping the amount of resources in mind
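Performance objectives are provided to the synthesis tool via timing constraints
Apply period constraints and input/output constraints (.xcf file)
Typically, over-constrain by 1.5X to 2X relative to the desired performance target; the synthesis tool then raises its effort level, which makes the timing target easier to meet during implementation
Remember: if you over-constrain, do not pass these constraints on to the implementation tools
Use multi-cycle and false path constraints
Use critical path constraints to optimize the critical paths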
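The 1.5X to 2X over-constraint for synthesis can be worked out as follows. This Python sketch assumes the factor is applied to the clock frequency, which the slide does not spell out; the function name and the 100 MHz target are hypothetical.

```python
def synthesis_target_range_mhz(target_mhz, low_factor=1.5, high_factor=2.0):
    """Frequency range to request from synthesis for a given implementation target."""
    return target_mhz * low_factor, target_mhz * high_factor


target_mhz = 100.0  # assumed implementation goal
synth_low, synth_high = synthesis_target_range_mhz(target_mhz)

print(f"implementation constraint: {target_mhz} MHz ({1000.0 / target_mhz} ns)")
print(f"synthesis-only constraint: {synth_low} to {synth_high} MHz")
```

Only the implementation value belongs in the constraints passed to Map and PAR; the tighter range stays in the synthesis constraint file.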
Retiming
Synplify, Precision, and XST software
Retiming: The synthesis tool automatically tries to move register stages to balance combinatorial delay on each side of the registers
Figure: before retiming and after retiming

Hierarchy Management
Synplify, Precision, and XST software
The basic settings are:
Flatten the design: Allows total combinatorial optimization across all boundaries
Maintain hierarchy: Preserves hierarchy without allowing optimization of combinatorial logic across boundaries
If you have followed the synchronous design guidelines, use the setting -maintain hierarchy
If you have not followed the synchronous design guidelines, use the setting -flatten the design
Your synthesis tool may have additional settings
Refer to your synthesis documentation for details on these settings

Hierarchy Preservation Benefits
Easily locate problems in the code based on the hierarchical instance names contained within static timing analysis reports
Enables floorplanning and incremental design flow
The primary advantage of flattening is to optimize combinatorial logic across hierarchical boundaries
If the outputs of leaf-level blocks are registered, there is no need to flatten
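Pin constraints
Pin constraints are usually fixed early in the design so that board design can proceed in parallel
For high-speed designs, complex designs, and designs with many I/O pins, Xilinx recommends assigning pins manually
The implementation tools can place logic and pins automatically, but the result is generally not optimal
Pin constraints guide the internal data flow; a poor pin layout can easily degrade system performance
A good pin layout requires detailed knowledge of the system being designed and of the Xilinx device architecture, for example I/O banks and I/O electrical standards
Clocks (single-ended or differential) must be constrained to dedicated clock pins
Note: the number of clock resources is limited
Use dual-purpose pins (such as configuration and DCI pins) last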
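Using data flow to guide pin constraints
Place I/Os for control signals at the top or bottom of the device
Control signals run vertically
Place I/Os for data buses on the left and right sides of the device
Data flows horizontally
This layout makes full use of the way resources are arranged in Xilinx devices:
Carry chain orientation
Block RAM and multiplier locations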
Using PACE for pin constraints

Timing constraints
If the performance targets are met after implementation, the design is finished
Otherwise, apply path-specific timing constraints
Apply multi-cycle, false path, and critical-path constraints; the implementation tools give priority to these path-specific constraints

Static timing analysis
Post-map: after Map, use the Post-Map Static Timing Report to determine the logic delay of the critical paths
Post-PAR: after PAR, use the Post-Place & Route Static Timing Report to check whether the timing constraints are met
Logic delay vs. routing delay: the 60%/40% rule
Timing Analyzer can read the timing reports, locate the critical paths, and work together with Floorplanner to resolve timing problems
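The 60/40 rule can be applied directly to the logic delay reported after Map. The Python sketch below uses the 60 and 80 percent thresholds quoted earlier in the deck; the function name, the returned wording, and the example delays are hypothetical.

```python
def classify_logic_share(logic_delay_ns, period_ns):
    """Classify a path by the share of the timing budget used by logic delay."""
    share = logic_delay_ns / period_ns
    if share < 0.60:
        return "should meet timing easily"
    if share <= 0.80:
        return "expect longer runtime"
    return "tools may have trouble"


# Assumed critical-path logic delays against a 10 ns period constraint.
for logic_delay in (5.0, 7.0, 9.0):
    print(logic_delay, "ns ->", classify_logic_share(logic_delay, 10.0))
```

A path in the last category usually calls for a netlist change (fewer logic levels or a pipeline stage) rather than more place-and-route effort.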
Report Example

Analyzing Post-Place & Route Timing
There are many factors that contribute to timing errors, including
Neglecting synchronous design rules or using incorrect HDL coding style
Poor synthesis results (too many logic levels in the path)
Inaccurate or incomplete timing constraints
Poor logic mapping or placement
Each root cause has a different solution
Rewrite HDL code
Add timing constraints
Resynthesize or re-implement with different software options
Correct interpretation of timing reports can reveal the most likely cause and, therefore, the most likely solution

Case 1

Poor Placement: Solutions
Increase Placement effort level (or Overall effort level)
Timing-driven packing, if the poor placement is caused by packing unrelated logic together
Cross-probe to the Floorplanner to see what has been packed together
This option is covered in the "Advanced Implementation Options" module
PAR extra effort or MPPR options
Covered in the "Advanced Implementation Options" module
Floorplanning or Relative Location Constraints (RLOCs) if you have the skill

Case 2

High Fanout: Solutions
Most likely solution is to duplicate the source of the high-fanout net
If the net is the output of a flip-flop, the solution is to duplicate the flip-flop
Use manual duplication (recommended) or synthesis options
If the net is driven by combinatorial logic, locating the source of the net in the HDL code may be more difficult
Use synthesis options to duplicate the source
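The effect of duplicating the source of a high-fanout net can be estimated with simple arithmetic. This Python sketch is an illustration only; the function names and the load counts are hypothetical, and it assumes the loads are split evenly between the copies.

```python
def loads_per_copy(total_loads, copies):
    """Fanout seen by each copy when the loads are split evenly (rounded up)."""
    if copies < 1:
        raise ValueError("at least one copy of the source is required")
    return -(-total_loads // copies)  # ceiling division


def extra_upstream_loads(copies):
    """Additional loads on each net that feeds the duplicated flip-flop."""
    return copies - 1


# Assumed example: one flip-flop driving 100 loads, duplicated into 4 copies.
print("fanout per copy:", loads_per_copy(100, 4))
print("extra loads on the D, clock, and control nets:", extra_upstream_loads(4))
```

The second number is the trade-off noted on the Duplicating Flip-Flops slide: the fanout of the nets driving the copies goes up.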
Case 3

Too Many Logic Levels: Solutions
The implementation tools cannot do much to improve performance
The netlist must be altered to reduce the amount of logic between flip-flops
Possible solutions
Check whether the path is a multicycle path
If yes, add a multicycle path constraint
Use the retiming option during synthesis to distribute logic more evenly between flip-flops
Confirm that good coding techniques were used to build this logic (no nested if or case statements)
Add a pipeline stage
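The benefit of adding a pipeline stage can be estimated with a deliberately simple delay model. This Python sketch assumes every logic level costs the same fixed delay (logic plus routing), which real designs only approximate; the function names and all delay values are hypothetical.

```python
def path_delay_ns(levels, t_cko=0.5, t_level=1.0, t_setup=0.5):
    """Register-to-register delay for a path with the given number of logic levels."""
    return t_cko + levels * t_level + t_setup


def levels_per_stage(total_levels, stages):
    """Logic levels left in the longest stage after splitting evenly (rounded up)."""
    return -(-total_levels // stages)


before = path_delay_ns(8)                      # 8 levels between two registers
after = path_delay_ns(levels_per_stage(8, 2))  # one extra register in the middle

print(f"before pipelining: {before} ns, after: {after} ns")
```

The extra stage adds one clock of latency, so it only helps where the surrounding logic can tolerate the delay.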
P&R options: Effort Level
A higher Effort Level can improve timing performance without other measures (such as applying more advanced timing constraints, using advanced tools, or changing the code)
Xilinx recommendation: for the first implementation run, use global timing constraints and the default implementation options. If timing is not met:
Try modifying the code, for example by using an appropriate coding style or adding pipeline stages
Change synthesis options such as Optimization Effort, Use Synthesis Constraints File, Keep Hierarchy, Register Duplication, and Register Balancing
Increase the PAR Effort Level
Apply path-specific timing constraints for synthesis and implementation

Implementation techniques
As with PAR, the timing-driven Map options can be used to target critical paths. For example, the "Timing-Driven Packing and Placement" option gives critical paths priority under the timing constraints. User constraints are passed from the User Constraints File (UCF) into the design by the Translate process.

Timing-Driven Packing
Timing constraints are used to optimize which pieces of logic are packed into each slice
Normal (standard) packing is performed
PAR is run through the placement phase
Timing analysis analyzes the amount of slack in constrained paths
If necessary, packing changes are made to allow better placement
The output of MAP contains both mapping and placement information
The Post-Map Static Timing Report contains more realistic net delays
Place & Route runtime is reduced because some placement is already done

Example
Originally, the flip-flops were packed together into a slice.
After placement and timing analysis, the flip-flops are packed into different slices to allow independent movement
Trade-Offs
Typical performance improvement: five to eight percent
Density improvements are also seen
Has the greatest effect on high-density designs when unrelated packing has occurred
Look in the Map Report, Design Summary section
Number of slices containing unrelated logic
If no unrelated packing has occurred, performance improvement will be minimal
Runtime for the MAP process always increases
Up to 200 percent
But you recover some of this increased runtime by saving runtime during Place & Route

MPPR and PAR Extra Effort
MPPR: run PAR on the same design multiple times to find the result most likely to meet the design requirements, and keep it as the final result
When the highest PAR Effort Level is selected, PAR Extra Effort offers three choices: None, Normal, and Continue on Impossible
Typically this improves performance by about 4%
PAR usually takes more time (an increase of 200% or more)

Floorplanning and PACE
Use Floorplanning and PACE to guide logic placement
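Performance may get worse!
If timing improves but still does not meet the requirement, use MPPR
Timing-driven Map and Floorplanning do not work well together
Prefer the timing closure flow described earlier over this tool, unless you:
Know the design very well
Know the Xilinx device architecture very well
Know how to use the Xilinx software tools very well
Benefits of using Floorplanner (if you have enough skill with it):
In large designs, Floorplanner can give the implementation tools placement guidance for the design
It helps reduce implementation runtime and improve system performance
Floorplanner is required in incremental design and modular design flows

Area Constraints
Area constraints are the easiest and most effective use of Floorplanner
Floorplanner is the preferred placement tool for large designs
During synthesis, select the "Keep Hierarchy" option to prevent individual component names from being changed
Each part of the design can be constrained to a particular region
A more advanced tool is PlanAhead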