VHDL Place and route path analysis


Problem description

My problem is that when I implement my design using Xilinx ISE 14.7 + XPS, I often get a very different number of analyzed paths in the static timing analysis, even though the .vhd files differ very little. In particular, the only file that I change (or that I think I change...) is something like:

entity my_entity is
    port (
        ...
        data_in : in std_logic_vector(N*B-1 downto 0);
        ...
    );
end entity my_entity;

architecture bhv of my_entity is
    signal data : std_logic_vector(B-1 downto 0);
    signal idx_vect : std_logic_vector(log2(N)-1 downto 0);
    signal idx : integer range 0 to N-1;
    ...
begin
    process(clk)
    begin
        if(rising_edge(clk))then
            -- numeric_std has no "+" for std_logic_vector, so convert explicitly
            idx_vect <= std_logic_vector(unsigned(idx_vect) + 1);
        end if;
    end process;

    idx <= to_integer(unsigned(idx_vect));

    data <= data_in((idx+1)*B-1 downto idx*B);

end architecture bhv;

I'm not sure the problem comes from here, but I can't find any other possible cause for a fivefold decrease in the number of analyzed paths. Is there any syntax that one must avoid in order to obtain a correct implementation? Is it possible that indexing an array using an integer (as in the example code) somehow breaks up the paths, so that they are not analyzed?

The changed code is something like:

process(shift_reg, data_in)
begin
    -- note: data has no default assignment here, so a latch is inferred
    for i in 0 to N-1 loop
        if(shift_reg(i) = '1')then
            data <= data_in((i+1)*B-1 downto i*B);
        end if;
    end loop;
end process;

Here, instead of incrementing idx_vect, I have a circular one-hot shift register of N bits. Thanks in advance.

Recommended answer

The coding style of the multiplexer in this line

data <= data_in((idx+1)*B-1 downto idx*B);

can heavily influence the logic synthesis. This results in a very different number of paths to analyze for timing.

I first checked the synthesis of the above line using this small example:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity mux1 is
    generic (
        B : positive := 32;
        M : positive := 7); -- M := ceil(log_2 N)
    port (
        d : in  STD_LOGIC_VECTOR ((2**M)*B-1 downto 0); -- input data
        s : in  STD_LOGIC_VECTOR (M-1 downto 0);        -- selector
        y : out  STD_LOGIC_VECTOR(B-1 downto 0));       -- result
end mux1;

architecture Behavioral of mux1 is
    constant N : positive := 2**M;
    signal idx : integer range 0 to N-1;
begin
    idx <= to_integer(unsigned(s));
    y <= d((idx+1)*B-1 downto idx*B);
end Behavioral;

If one synthesizes this for a Spartan-6, XST reports the following (excerpt):

Macro Statistics
# Adders/Subtractors                                   : 2
 13-bit subtractor                                     : 1
 8-bit adder                                           : 1
...
 Number of Slice LUTs:                 1516  out of   5720    26%  
...
Timing constraint: Default path analysis
  Total number of paths / destination ports: 139264 / 32

Thus, no multiplexer was detected (the macro statistics list only an adder and a subtractor, presumably for the index arithmetic), and the timing analyzer has to analyze a huge number of paths. The logic utilization is OK.

The same multiplexing can be achieved with: (EDIT: bugfix and simplification)

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity mux2 is
    generic (
        B : positive := 32;
        M : positive := 7); -- M := ceil(log_2 N)
    port (
        d : in  STD_LOGIC_VECTOR ((2**M)*B-1 downto 0);
        s : in  STD_LOGIC_VECTOR (M-1 downto 0);
        y : out  STD_LOGIC_VECTOR(B-1 downto 0));
end mux2;

-- !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-- !! The entire architecture has been FIXED and simplified. !!
-- !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
architecture Behavioral of mux2 is
    constant N : positive := 2**M;
    type matrix is array (N-1 downto 0) of std_logic_vector(B-1 downto 0);
    signal dd : matrix;
begin
    -- reinterpret 1D vector 'd' as 2D matrix, i.e.
    -- row 0 holds d(B-1 downto 0) which is selected in case s = 0
    row_loop: for row in 0 to N-1 generate
        dd(row) <= d((row+1)*B-1 downto row*B);
    end generate;

    -- select the requested row
    y <= dd(to_integer(unsigned(s)));
end Behavioral;

Now, the XST report looks much better:

Macro Statistics
# Multiplexers                                         : 1
 32-bit 128-to-1 multiplexer                           : 1
...
 Number of Slice LUTs:                 1344  out of   5720    23%  
...
Timing constraint: Default path analysis
  Total number of paths / destination ports: 6816 / 32

XST now detects that a 128-to-1 multiplexer is required for each output bit. The optimized synthesis of such a wide multiplexer is built into the synthesis tool. The number of LUTs is reduced only slightly (1516 to 1344), but the number of paths to be processed by the timing analyzer drops dramatically, by a factor of about 20 (139264 to 6816)!
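Since mux2 is meant to perform the same multiplexing as mux1, a short simulation can confirm that both select the same row before comparing synthesis reports. The following is only a sketch: the testbench name mux_equiv_tb, the reduced sizes (B = 4, M = 2) and the stimulus values are assumptions made for illustration, and mux1 and mux2 are assumed to be compiled into library work.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical testbench, not part of the original answer.
entity mux_equiv_tb is
end mux_equiv_tb;

architecture sim of mux_equiv_tb is
    constant B : positive := 4;   -- small sizes keep the simulation short
    constant M : positive := 2;
    constant N : positive := 2**M;
    signal d  : std_logic_vector(N*B-1 downto 0);
    signal s  : std_logic_vector(M-1 downto 0) := (others => '0');
    signal y1 : std_logic_vector(B-1 downto 0);
    signal y2 : std_logic_vector(B-1 downto 0);
begin
    dut1 : entity work.mux1
        generic map (B => B, M => M)
        port map (d => d, s => s, y => y1);

    dut2 : entity work.mux2
        generic map (B => B, M => M)
        port map (d => d, s => s, y => y2);

    stimulus : process
    begin
        -- give every row a distinct value: row i holds i+1
        for i in 0 to N-1 loop
            d((i+1)*B-1 downto i*B) <= std_logic_vector(to_unsigned(i+1, B));
        end loop;
        -- step through all selector values and compare both outputs
        for i in 0 to N-1 loop
            s <= std_logic_vector(to_unsigned(i, M));
            wait for 10 ns;
            assert y1 = std_logic_vector(to_unsigned(i+1, B))
                report "mux1 selected the wrong row" severity error;
            assert y2 = y1
                report "mux1 and mux2 disagree" severity error;
        end loop;
        report "mux equivalence check finished";
        wait;
    end process;
end sim;
```

If neither assertion fires, the two descriptions agree for every selector value at this size; this says nothing about timing or resource usage, which still have to be read from the synthesis report.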

The above examples use a binary-encoded selector signal. I also checked the variant with a one-hot encoded selector:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity mux3 is
    generic (
        B : positive := 32;
        N : positive := 128);
    port ( d : in  STD_LOGIC_VECTOR (N*B-1 downto 0);
           s : in  STD_LOGIC_VECTOR (N-1 downto 0);
           y : out  STD_LOGIC_VECTOR(B-1 downto 0));
end mux3;

architecture Behavioral of mux3 is

begin
    process(d, s)
    begin
        y <= (others => '0'); -- avoid latch!
        for i in 0 to N-1 loop
            if s(i) = '1' then
                y <= d((i+1)*B-1 downto i*B);
            end if;
        end loop;
    end process;

end Behavioral;

Now, the XST report is different again:

Macro Statistics
# Multiplexers                                         : 128
 32-bit 2-to-1 multiplexer                             : 128
...
Number of Slice LUTs:                 2070  out of   5720    36%  
...
Timing constraint: Default path analysis
  Total number of paths / destination ports: 13376 / 32

2-to-1 multiplexers are detected, because the code describes a priority multiplexer analogous to this scheme:

if s(127) = '1' then
  y <= d(128*B-1 downto 127*B);
else
  if s(126) = '1' then
    y <= d(127*B-1 downto 126*B);
  else
    ...
                             if s(0) = '1' then
                               y <= d(B-1 downto 0);
                             else
                               y <= (others => '0');
                             end if;
  end if; -- s(126)
end if; -- s(127)

I have not used elsif here for didactic reasons. Each if-else stage is a 32-bit wide 2-to-1 multiplexer. The problem here is that the synthesis tool does not know that s is a one-hot encoded signal. Thus, more logic is required than in my optimized implementation (2070 versus 1344 LUTs).
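One way to avoid describing a priority chain is to gate each row with its select bit and OR the results together. The sketch below is not from the original answer and has not been synthesized here; the entity name mux3_andor is hypothetical, and the description is only equivalent to mux3 under the assumption that s really is one-hot (if several bits are set, the selected rows are ORed together instead of the highest index winning).

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical AND-OR variant of mux3, assuming s is one-hot.
entity mux3_andor is
    generic (
        B : positive := 32;
        N : positive := 128);
    port (
        d : in  STD_LOGIC_VECTOR(N*B-1 downto 0);
        s : in  STD_LOGIC_VECTOR(N-1 downto 0);  -- assumed one-hot
        y : out STD_LOGIC_VECTOR(B-1 downto 0));
end mux3_andor;

architecture Behavioral of mux3_andor is
begin
    process(d, s)
        variable acc : std_logic_vector(B-1 downto 0);
    begin
        acc := (others => '0');  -- result when no select bit is set
        for i in 0 to N-1 loop
            -- row i contributes only when its select bit is set;
            -- there is no ordering between the rows
            if s(i) = '1' then
                acc := acc or d((i+1)*B-1 downto i*B);
            end if;
        end loop;
        y <= acc;
    end process;
end Behavioral;
```

Whether this actually lowers the LUT count or the number of timing paths depends on the tool and the target device, so the XST report should be checked in the same way as for the three variants above.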

The number of paths to analyze for timing changes significantly again. At 13376, it is about 10 times lower than in the original implementation (139264), but about 2 times higher than in my optimized one (6816).
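Applied to the design from the question, the mux2 style could look roughly like the sketch below. The entity name my_entity_mux, the ports clk and data_out, and the generic M are assumptions made for illustration; in particular, the sketch assumes the number of rows is a power of two (N = 2**M), which the question does not state.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical rework of the question's my_entity in the style of mux2.
entity my_entity_mux is
    generic (
        B : positive := 32;
        M : positive := 7);  -- N = 2**M rows (assumption)
    port (
        clk      : in  std_logic;
        data_in  : in  std_logic_vector((2**M)*B-1 downto 0);
        data_out : out std_logic_vector(B-1 downto 0));
end entity my_entity_mux;

architecture bhv of my_entity_mux is
    constant N : positive := 2**M;
    type matrix is array (N-1 downto 0) of std_logic_vector(B-1 downto 0);
    signal rows : matrix;
    signal idx  : unsigned(M-1 downto 0) := (others => '0');
begin
    -- free-running row counter; an M-bit unsigned wraps at 2**M
    process(clk)
    begin
        if rising_edge(clk) then
            idx <= idx + 1;
        end if;
    end process;

    -- reinterpret the flat input vector as N rows of B bits
    row_loop : for row in 0 to N-1 generate
        rows(row) <= data_in((row+1)*B-1 downto row*B);
    end generate;

    -- plain array indexing, as in mux2
    data_out <= rows(to_integer(idx));
end architecture bhv;
```

Declaring the counter as unsigned keeps the increment and the to_integer conversion within numeric_std, and the index arithmetic on the flat vector is confined to the generate loop, where it is evaluated with constant row values.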
