Floating Point Divider Hardware Implementation Details


Question

I am trying to implement a 32-bit floating point divider in hardware, and I am wondering if I can get any suggestions on the tradeoffs between different algorithms.

My floating point unit currently supports multiplication and addition/subtraction, but I am not going to switch it to a fused multiply-add (FMA) architecture, since this is an embedded platform where I am trying to minimize area usage.

Recommended answer

A very long time ago I came across this neat and easy-to-implement float/fixed-point division algorithm, used in military FPUs of that time period:

  1. The inputs must be unsigned and shifted so that x < y and both are in the range < 0.5 ; 1 >. Don't forget to store the difference of the shifts, sh = shx - shy, and the original signums.

  2. Find f (by iterating) so that y*f -> 1. After that, x*f -> x/y, which is the division result.

  3. Shift x*f back by sh and restore the result signum (sig = sigx*sigy).

The product x*f can be computed easily like this:

z=1-y
(x*f)=(x/y)=x*(1+z)*(1+z^2)*(1+z^4)*(1+z^8)*(1+z^16)...(1+z^(2^n))

where n = log2(number of fractional bits for fixed point, or mantissa bit size for floating point)
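The series works because (1-z)*(1+z)*(1+z^2)*...*(1+z^(2^n)) = 1 - z^(2^(n+1)), and y = 1-z, so the product of the (1+z^(2^k)) terms approaches 1/y as the z power underflows. The three steps can be sketched for ordinary doubles as below. This is only an illustration, not the answerer's code: the name `series_divide` is hypothetical, and `std::frexp`/`std::ldexp` stand in for the shift bookkeeping (shx, shy, sh). The sketch does not enforce x < y, since a double can hold the intermediate quotient in (0.5, 2).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not from the original answer).
// Divides x by y using only multiply, add and subtract on the mantissas.
// Assumes x and y are finite and y is non-zero.
double series_divide(double x, double y)
{
    assert(y != 0.0);
    if (x == 0.0) return 0.0;

    // Step 1: strip the signs and shift both operands into [0.5, 1).
    int shx = 0, shy = 0;
    const double mx = std::frexp(std::fabs(x), &shx);
    const double my = std::frexp(std::fabs(y), &shy);
    const bool negative = std::signbit(x) != std::signbit(y);

    // Step 2: z = 1 - y, then c = x*(1+z)*(1+z^2)*(1+z^4)*...
    double z = 1.0 - my;            // 0 < z <= 0.5
    double c = mx * (1.0 + z);
    for (int i = 0; i < 6; ++i) {   // last factor is (1+z^64); the remaining
        z *= z;                     // error term z^128 <= 2^-128 is far
        c *= 1.0 + z;               // below double precision
    }

    // Step 3: shift back by sh = shx - shy and restore the sign.
    const double r = std::ldexp(c, shx - shy);
    return negative ? -r : r;
}
```

Calling `series_divide(1.0, 3.0)` should give a value close to 1/3, although the result is not guaranteed to be correctly rounded because every multiplication rounds on its own.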

I am using this division in my bignum arithmetic. The C++ implementation of the high-level division looks like this:

fixnum fixnum::operator / (const fixnum &x) // return = this/x
    {
    fixnum u,v,w;
    int k=0,s;
    s=sig*x.sig;                // compute result signum
    u=this[0]; u.sig=+1;
    v=x; v.sig=+1;
    w.one();
    while (geq(v,w)) { v=v>>1; k++; }   // shift input in range
    w=w>>1;
    while (geq(w,v)==1) { v=v<<1; k--; }
    w.div(u,v);             // use divider block
    w=w>>k;                 // shift result back
    w.sig=s;                // set signum
    return w;
    }

This was developed at a time when every transistor counted, so you should be able to implement it compactly using your + and * units. Hope it helps.
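To illustrate how the loop maps onto a separate multiplier and adder (no FMA), here is a minimal fixed-point sketch. It assumes an unsigned Q0.32 format (value = raw / 2^32) and operands already prepared as in step 1; the name `series_divide_q32` and the format are assumptions made for this example, not part of the original answer. Each factor (1+z^k) is applied as c + c*z^k, so one pass needs two multiplications and one addition.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical Q0.32 sketch: raw value r represents r / 2^32.
// Preconditions (step 1): 0.5 <= y < 1 and x < y.
std::uint32_t series_divide_q32(std::uint32_t x, std::uint32_t y)
{
    assert(y >= 0x80000000u && x < y);

    const std::uint64_t one = 1ull << 32;
    std::uint64_t z = one - y;                       // z = 1 - y, 0 < z <= 0.5
    std::uint64_t c = x + ((std::uint64_t{x} * z) >> 32);   // c = x*(1+z)

    // n = log2(32) = 5 squarings: factors (1+z^2) ... (1+z^32).
    for (int i = 0; i < 5; ++i) {
        z = (z * z) >> 32;          // z <= 2^31, so z*z fits in 64 bits
        c = c + ((c * z) >> 32);    // c < 2^32 and z <= 2^30 here: no overflow
    }
    // Every step truncates, so the result is at most a few units
    // in the last place below the exact quotient.
    return static_cast<std::uint32_t>(c);
}
```

For example, 0.25 / 0.5 (`0x40000000` and `0x80000000`) comes out one unit in the last place below 0.5, which is the kind of `?.1111...b` pattern that a final rounding step has to clean up.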

Here is my floating point implementation:

void arbnum::div(const arbnum &x,const arbnum &y,int acc)
    {
    // O(log(N)*(sqr+mul+inc)) ~ O(1.5*log(N)*(N^2))
    // x<y  = < 0.5 ; 1 >
    // x*f -> x/y , y*f -> 1
    int i,nz;
    arbnum c,z,q;
    c=x;
    z.one(); z.sub(z,y);    // z=1-y
    q=z;
    q.inci();
    c.mul(c,q);             // (x/y)'=x*(1+z)
    c._normalize();
    nz=z.nfbits();
    if (acc<=0) acc=(nz+c.nfbits())<<1;
    for (i=int_log2(acc);i>=0;i--)
        {
//      z.mul(z,z);
        z.sqr(z);
        nz<<=1; if (nz>acc) nz=acc; z._normalize(nz);
        q=z;
        q.inci();
        c.mul(c,q);         // (x/y)'=x*(1+z)*(1+z^2)*(1+z^4)*(1+z^8)*(1+z^16)...
        if (i) c._normalize(acc+nz);
        }
    c._normalize(acc);
    overflow();
    c.sig=sig;
    *this=c;
    }

The number representation is:

DWORD *dat; int siz,exp,sig,bits;

dat[siz]: mantissa, MSW = dat[0]
exp: base-2 exponent of the MSB of the mantissa
sig: signum of the mantissa
bits: used bits of the mantissa, kept to speed up some operations
a.inci(): a++
a.zero(): a=0
a.one(): a=1
a.geq(x,y): compare |x|,|y|; returns 0 for |x|<|y|, 1 for |x|>|y|, 2 for |x|==|y|
a.add(x,y): a=x+y
a.sub(x,y): a=x-y
a.mul(x,y): a=x*y
a.sqr(x): a=x*x
a.nfbits(): returns the number of used fractional bits of the number (00000100.00011100b -> 6)
a._normalize(): normalize the number (MSB of the mantissa = 1)
a.overflow(): if the number is found to be ?.111111111111111111111111111111111111111111111b, it is rounded up to ?+1.0b
acc: desired mantissa precision in bits (my arbnum has unlimited mantissa precision)
