Floating Point Divider Hardware Implementation Details

Problem description

I am trying to implement a 32-bit floating-point divider in hardware, and I am wondering whether I can get any suggestions about the tradeoffs between different algorithms?

My floating-point unit currently supports multiplication and addition/subtraction, but I am not going to switch it to a fused multiply-add (FMA) architecture, since this is an embedded platform where I am trying to minimize area usage.

Solution

A very long time ago I came across this neat and easy-to-implement floating/fixed-point division algorithm, used in military FPUs of that period:

  1. The inputs must be unsigned and shifted so that x < y and both are in the range [0.5, 1].

    Do not forget to store the difference of the shifts sh = shx - shy and the original signs.

  2. Find f (by iterating) so that y*f -> 1. After that, x*f -> x/y, which is the division result.

  3. Shift x*f back by sh and restore the sign of the result (sig = sigx*sigy).

    x*f can be computed easily like this:

    z = 1-y
    x*f = x/y = x*(1+z)*(1+z^2)*(1+z^4)*(1+z^8)*(1+z^16)*...*(1+z^(2^n))
    

    where

    n = log2(number of fractional bits for fixed point, or mantissa bit size for floating point)
    

    You can also stop when z^(2^n) becomes zero on fixed bit width data types.
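Why the product converges can be seen from a telescoping identity. The derivation below is a short sketch using the same x, y, z and n as in the steps above; nothing else is assumed.

```latex
\begin{align*}
(1-z)\prod_{k=0}^{n}\left(1+z^{2^{k}}\right) &= 1 - z^{2^{n+1}}, \qquad z = 1-y \\
x\prod_{k=0}^{n}\left(1+z^{2^{k}}\right) &= \frac{x}{y}\left(1 - z^{2^{n+1}}\right)
\end{align*}
```

So the relative error after the factor (1+z^(2^n)) is z^(2^(n+1)). When y is in [0.5, 1], z is at most 0.5, and every extra factor roughly doubles the number of correct bits.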

[Edit2] I had a bit of time and was in the mood for this, so here is a 32-bit IEEE 754 C++ implementation

I removed the old (bignum) examples to avoid confusing future readers (they are still accessible in the edit history if needed).

//---------------------------------------------------------------------------
#include <cstdint>
typedef uint32_t DWORD;                 // 32 bit unsigned int (omit if <windows.h> already defines DWORD)
//---------------------------------------------------------------------------
// IEEE 754 single masks
const DWORD _f32_sig    =0x80000000;    // sign
const DWORD _f32_exp    =0x7F800000;    // exponent
const DWORD _f32_exp_sig=0x40000000;    // exponent sign
const DWORD _f32_exp_bia=0x3F800000;    // exponent bias
const DWORD _f32_exp_lsb=0x00800000;    // exponent LSB
const DWORD _f32_exp_pos=        23;    // exponent LSB bit position
const DWORD _f32_man    =0x007FFFFF;    // mantissa
const DWORD _f32_man_msb=0x00400000;    // mantissa MSB
const DWORD _f32_man_bits=       23;    // mantissa bits
//---------------------------------------------------------------------------
float f32_div(float x,float y)
    {
    union _f32          // float bits access
        {
        float f;        // 32bit floating point
        DWORD u;        // 32 bit uint
        };
    _f32 xx,yy,zz; int sh; DWORD zsig; float z;
    //      result signum        abs value
    xx.f=x; zsig =xx.u&_f32_sig; xx.u&=(0xFFFFFFFF^_f32_sig);
    yy.f=y; zsig^=yy.u&_f32_sig; yy.u&=(0xFFFFFFFF^_f32_sig);
    // initial exponent difference sh and normalize exponents to speed up shift in range
    sh =0;
    sh-=((xx.u&_f32_exp)>>_f32_exp_pos)-(_f32_exp_bia>>_f32_exp_pos); xx.u&=(0xFFFFFFFF^_f32_exp); xx.u|=_f32_exp_bia;
    sh+=((yy.u&_f32_exp)>>_f32_exp_pos)-(_f32_exp_bia>>_f32_exp_pos); yy.u&=(0xFFFFFFFF^_f32_exp); yy.u|=_f32_exp_bia;
    // shift input in range
    while (xx.f> 1.0f) { xx.f*=0.5f; sh--; }
    while (xx.f< 0.5f) { xx.f*=2.0f; sh++; }
    while (yy.f> 1.0f) { yy.f*=0.5f; sh++; }
    while (yy.f< 0.5f) { yy.f*=2.0f; sh--; }
    while (xx.f<=yy.f) { yy.f*=0.5f; sh++; }
    // divider block
    z=(1.0f-yy.f);
    zz.f=xx.f*(1.0f+z);
    for (;;)
        {
        z*=z; if (z==0.0f) break;
        zz.f*=(1.0f+z);
        }
    // shift result back
    for (;sh>0;) { sh--; zz.f*=0.5f; }
    for (;sh<0;) { sh++; zz.f*=2.0f; }
    // set signum
    zz.u&=(0xFFFFFFFF^_f32_sig);
    zz.u|=zsig;
    return zz.f;
    }
//---------------------------------------------------------------------------
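To check a divider of this kind against the built-in operator /, a small self-contained harness may help. The sketch below is not the f32_div code above: it uses std::frexp and std::ldexp for the range reduction (my substitution, to keep the example short), and the names series_div and max_rel_error are hypothetical. It assumes positive, finite, normal inputs.

```cpp
#include <cassert>
#include <cmath>
#include <initializer_list>

// Hypothetical helper (not the answer's f32_div): divides two positive,
// finite, normal floats using only +, -, * in the core loop.
float series_div(float x, float y)
{
    assert(x > 0.0f && y > 0.0f);    // signs and special values are out of scope here
    int ex, ey;
    float mx = std::frexp(x, &ex);   // x = mx * 2^ex, mx in [0.5, 1)
    float my = std::frexp(y, &ey);   // y = my * 2^ey, my in [0.5, 1)
    float z = 1.0f - my;             // z in (0, 0.5]
    float q = mx * (1.0f + z);
    for (;;)
    {
        z *= z;                      // z^2, z^4, z^8, ...
        if (z == 0.0f) break;        // same stop criterion as in the code above
        q *= (1.0f + z);
    }
    return std::ldexp(q, ex - ey);   // undo the range reduction
}

// Largest relative error of series_div against operator / over a few samples.
float max_rel_error()
{
    float worst = 0.0f;
    for (float x : {1.0f, 3.0f, 7.5f, 100.0f, 0.001f})
        for (float y : {1.0f, 3.0f, 0.7f, 12345.0f, 0.015625f})
        {
            float ref = x / y;
            float err = std::fabs(series_div(x, y) - ref) / ref;
            if (err > worst) worst = err;
        }
    return worst;
}
```

Printing max_rel_error() from a test program gives a quick feel for how many bits are lost to the repeated 32-bit multiplications; the sample values are arbitrary and can be replaced by a sweep over the operand range you care about.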

I wanted to keep it simple, so it is not optimized yet. For example, you can replace all the *=0.5 and *=2.0 operations with exponent increments/decrements. If you compare the results with the FPU's float operator /, this will be a bit less precise, because some FPUs (x87, for example) compute in an 80-bit internal format, while this implementation works only on 32 bits and rounds after every multiplication.
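Replacing the *=0.5 and *=2.0 loops with an exponent adjustment could look roughly like the sketch below. The helper name scale_pow2 is hypothetical, and the function assumes a normal, nonzero input whose result stays in the normal range; overflow and underflow are only asserted, not corrected.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: multiplies v by 2^sh by adding sh to the exponent field.
// Assumes v is normal and nonzero, and that the result is still a normal float.
float scale_pow2(float v, int sh)
{
    std::uint32_t u;
    std::memcpy(&u, &v, sizeof u);            // bit access without a union
    int e = int((u >> 23) & 0xFF);            // biased exponent
    assert(e != 0 && e != 255);               // no zero, denormal, inf or NaN
    e += sh;
    assert(e > 0 && e < 255);                 // overflow/underflow not handled here
    u = (u & 0x807FFFFFu) | (std::uint32_t(e) << 23);
    std::memcpy(&v, &u, sizeof v);
    return v;
}
```

With a helper like this, the two "shift result back" loops in f32_div would reduce to a single call such as scale_pow2(zz.f, -sh). In hardware this corresponds to one small adder on the exponent instead of repeated multiplier passes.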

As you can see, I use only +, -, and * from the FPU. This can be sped up by using fast squaring (sqr) algorithms, especially if you want to use big bit widths.

Do not forget to implement normalization and/or overflow/underflow correction.
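One part of that correction is catching operands the series cannot handle before it runs. The sketch below is an assumption about how such a pre-check might look, with IEEE 754 results for NaN, infinity and zero; the names div_special_case and div_special_case_selfcheck are hypothetical. Denormal inputs and overflow/underflow of the final exponent are not covered and would still need handling in the exponent step.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical pre-check: returns true and sets q when x/y is decided by
// special operands alone; returns false when the series divider should run.
bool div_special_case(float x, float y, float &q)
{
    const float inf = std::numeric_limits<float>::infinity();
    const float nan = std::numeric_limits<float>::quiet_NaN();
    // The sign of the quotient is the XOR of the operand signs.
    const float sign = (std::signbit(x) != std::signbit(y)) ? -1.0f : 1.0f;

    if (std::isnan(x) || std::isnan(y)) { q = nan;          return true; }
    if (std::isinf(x) && std::isinf(y)) { q = nan;          return true; } // inf/inf
    if (x == 0.0f && y == 0.0f)         { q = nan;          return true; } // 0/0
    if (std::isinf(x) || y == 0.0f)     { q = sign * inf;   return true; }
    if (x == 0.0f || std::isinf(y))     { q = sign * 0.0f;  return true; }
    return false;                                            // ordinary operands
}

// Small usage check of the cases listed above.
void div_special_case_selfcheck()
{
    float q = 0.0f;
    assert(div_special_case(1.0f, 0.0f, q) && std::isinf(q) && q > 0.0f);
    assert(div_special_case(0.0f, 0.0f, q) && std::isnan(q));
    assert(!div_special_case(3.0f, 2.0f, q));
}
```

In a hardware divider the same decisions are usually a handful of comparators on the exponent and mantissa fields, evaluated in parallel with the normal datapath.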
