How to subtract IEEE 754 numbers?

Question

How do I subtract IEEE 754 numbers?

For example: 0.546875 - 32.875...

-> 0.546875 is 0 01111110 00011000000000000000000 in IEEE-754

-> -32.875 is 1 10000100 00000111000000000000000 in IEEE-754

So how do I do the subtraction? I know I have to make both exponents equal, but what do I do after that? Take the two's complement of the -32.875 mantissa and add it to the 0.546875 mantissa?

Recommended answer

It is really not any different from how you do it with pencil and paper. Okay, a little different.

123400 - 5432 = 1.234*10^5 - 5.432*10^3

The bigger number dominates: shift the smaller number's mantissa off into the bit bucket until the exponents match.

1.234*10^5 - 0.05432*10^5

Then do the subtraction with the mantissas

1.234 - 0.05432 = 1.17968
1.17968 * 10^5

Then normalize (which in this case it already is).

That was with base 10 numbers.

In IEEE float, single precision:

123400 = 0x1E208 = 0b11110001000001000
11110001000001000.000...

To normalize that, we have to shift the binary point 16 places to the left, so

1.1110001000001000 * 2^16

The exponent is biased, so we add 127 to 16 and get 143 = 0x8F. It is a positive number, so the sign bit is 0. As we start to build the IEEE floating point number, the leading 1 before the binary point is implied and not stored in single precision, so we get rid of it and keep the fraction.

sign bit, exponent, mantissa

0 10001111 1110001000001000...
0100011111110001000001000...
0100 0111 1111 0001 0000 0100 0...
0x47F10400

And if you write a program to see what a computer thinks 123400 is, you get the same thing:

0x47F10400 123400.000000
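The integer-to-float steps above can be sketched in C. This is only an illustration under stated assumptions: the helper name `int_to_float_bits` is hypothetical, and it handles only positive integers that fit in 24 bits, so no rounding is ever needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the single-precision bit pattern of a
 * positive integer that fits in 24 bits (so nothing has to be rounded). */
static uint32_t int_to_float_bits(uint32_t n)
{
    /* Treat n as a 24-bit mantissa whose binary point sits 23 places
     * from the top bit, which corresponds to an exponent of 23. */
    int exponent = 23;

    assert(n > 0 && n <= 0xFFFFFFu);

    /* Normalize: move the most significant 1 up to bit 23,
     * lowering the exponent once per shift. */
    while (!(n & 0x800000u)) {
        n <<= 1;
        exponent--;
    }

    /* Bias the exponent by 127 and drop the implied leading 1. */
    return ((uint32_t)(exponent + 127) << 23) | (n & 0x7FFFFFu);
}
```

Calling `int_to_float_bits(123400)` should reproduce the 0x47F10400 pattern worked out by hand above.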

So we know the exponent and mantissa for the first operand.

Now the second operand

5432 = 0x1538 = 0b0001010100111000

Normalize: shift the binary point 12 bits left

1010100111000.000
1.010100111000000 * 2^12

The exponent is biased: add 127 and get 139 = 0x8B = 0b10001011

Put it together

0 10001011 010100111000000
010001011010100111000000
0100 0101 1010 1001 1100 0000...
0x45A9C000

And a computer program/compiler gives the same:

0x45A9C000 5432.000000

Now to answer your question. Using the component parts of the floating point numbers (I have restored the implied 1 here because we need it):

0 10001111 111100010000010000000000 -  0 10001011 101010011100000000000000
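Pulling a float apart into these component parts might look like the following sketch. The names `struct fields`, `split_float`, and `with_implied_one` are hypothetical, and the code assumes a 32-bit IEEE 754 `float` and a normal (not zero, subnormal, infinite, or NaN) value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three fields of a single-precision value. */
struct fields {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, biased by 127 */
    uint32_t fraction;  /* 23 bits, implied leading 1 not stored */
};

/* Split a float into sign, biased exponent, and fraction. */
static struct fields split_float(float f)
{
    uint32_t bits;
    struct fields out;

    assert(sizeof f == sizeof bits);   /* assumption: 32-bit float */
    memcpy(&bits, &f, sizeof bits);    /* copy the raw bit pattern */

    out.sign     = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.fraction = bits & 0x7FFFFFu;
    return out;
}

/* Restore the implied leading 1 of a normal number: 23 bits become 24. */
static uint32_t with_implied_one(uint32_t fraction)
{
    return fraction | 0x800000u;
}
```

Using `memcpy` rather than a union or pointer cast is a design choice here: it copies the bytes without relying on any type-punning rules.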

We have to line up our binary points, just like decimal points in grade school, before we can subtract. In this context you shift the mantissa of the number with the smaller exponent to the right, tossing mantissa bits off the end and incrementing its exponent, until the exponents match.

0 10001111 111100010000010000000000 -  0 10001011 101010011100000000000000
0 10001111 111100010000010000000000 -  0 10001100 010101001110000000000000
0 10001111 111100010000010000000000 -  0 10001101 001010100111000000000000
0 10001111 111100010000010000000000 -  0 10001110 000101010011100000000000
0 10001111 111100010000010000000000 -  0 10001111 000010101001110000000000
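The one-bit-at-a-time shifting shown above collapses into a single shift by the exponent difference. A minimal sketch, assuming 24-bit mantissas with the implied 1 already restored; `align_mantissa` is a hypothetical name, and the bits shifted out are simply discarded, as in the hand calculation.

```c
#include <assert.h>
#include <stdint.h>

/* Shift the mantissa of the smaller-exponent operand right until its
 * exponent matches the larger one. Bits that fall off the end are lost. */
static uint32_t align_mantissa(uint32_t mantissa,
                               uint32_t exp_small, uint32_t exp_large)
{
    uint32_t shift;

    assert(exp_small <= exp_large);
    shift = exp_large - exp_small;

    /* A shift of 24 or more pushes every mantissa bit off the end. */
    return shift >= 24 ? 0 : mantissa >> shift;
}
```

With the operands above, `align_mantissa(0xA9C000, 0x8B, 0x8F)` shifts right by 4 and gives 0x0A9C00, the last row of the table.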

Now we can subtract the mantissas. If the sign bits match, then we actually subtract; if they don't match, then we add. Here they match, so this will be a subtraction.

Computers perform a subtraction using addition logic, inverting the second operand on the way into the adder and asserting the carry-in bit, like this:

                         1
  111100010000010000000000
+ 111101010110001111111111
==========================

And now, just like with paper and pencil, let's perform the add (the top row shows the carries):

 1111000100000111111111111
  111100010000010000000000
+ 111101010110001111111111
==========================
  111001100110100000000000 

Or do it in hex on your calculator:

111100010000010000000000 = 1111 0001 0000 0100 0000 0000 = 0xF10400
111101010110001111111111 = 1111 0101 0110 0011 1111 1111 = 0xF563FF
0xF10400 + 0xF563FF + 1 = 0x1E66800
1111001100110100000000000 = 1 1110 0110 0110 1000 0000 0000 = 0x1E66800

A little bit about how the hardware works: since this was really a subtract done with the adder, the carry-out bit is inverted to form the borrow (some computers leave it as is). So a carry-out of 1 is a good thing: it means there was no borrow, and we basically discard it. Had the carry-out been 0, the result would have been negative and we would have needed more work. There is no borrow here, so our answer is really 0xE66800.
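The invert-and-carry-in trick can be imitated in C on 24-bit values. This is a sketch of the idea rather than a model of any particular adder; `sub24_via_adder` and `MANT_MASK` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define MANT_MASK 0xFFFFFFu   /* 24 mantissa bits */

/* Compute a - b on 24-bit values the way an adder does:
 * a + (~b) + 1. The 25th bit of the sum is the carry-out. */
static uint32_t sub24_via_adder(uint32_t a, uint32_t b, int *borrow)
{
    uint32_t sum;

    assert(a <= MANT_MASK && b <= MANT_MASK);

    sum = a + (~b & MANT_MASK) + 1u;   /* may be 25 bits wide */

    /* Carry-out of 1 means no borrow; carry-out of 0 means b > a. */
    *borrow = !((sum >> 24) & 1u);

    return sum & MANT_MASK;            /* discard the carry-out */
}
```

For the operands above, 0xF10400 and 0x0A9C00, the sum is 0x1E66800: the carry-out is 1, `*borrow` is 0, and the returned value is 0xE66800.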

Very quickly, let's see that another way: instead of inverting and adding one, just use a calculator.

111100010000010000000000 -  000010101001110000000000 = 
0xF10400 - 0x0A9C00 = 
0xE66800

By trying to visualize it I perhaps made it worse. The result of the mantissa subtraction is 111001100110100000000000 (0xE66800). The most significant bit did not move: we end up with a 24-bit number whose msbit is 1, so no normalization is needed in this case. To normalize in general, you shift the mantissa left or right until the most significant 1 sits in the leftmost of the 24 bit positions, adjusting the exponent for each bit shifted.
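A normalization loop along these lines might look like the sketch below. `normalize_mantissa` is a hypothetical name; the code assumes a nonzero mantissa of at most 25 bits and simply drops any bit shifted out on the right, with no rounding.

```c
#include <assert.h>
#include <stdint.h>

/* Shift until the most significant 1 sits in bit 23, adjusting the
 * exponent once per shift. Returns the normalized 24-bit mantissa. */
static uint32_t normalize_mantissa(uint32_t mantissa, int *exponent)
{
    assert(mantissa != 0);   /* zero has no leading 1 to line up */

    /* Too wide (an add carried into bit 24): shift right, exponent up. */
    while (mantissa > 0xFFFFFFu) {
        mantissa >>= 1;
        (*exponent)++;
    }

    /* Leading zeros (a subtract cancelled high bits): shift left, exponent down. */
    while (!(mantissa & 0x800000u)) {
        mantissa <<= 1;
        (*exponent)--;
    }

    return mantissa;
}
```

For 0xE66800 with exponent 0x8F neither loop runs, which matches the observation above that this particular result is already normalized.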

Now, stripping the leading 1. bit off the answer, we put the parts together:

0 10001111 11001100110100000000000
01000111111001100110100000000000
0100 0111 1110 0110 0110 1000 0000 0000
0x47E66800
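Putting the parts together is a matter of shifting and OR-ing. A minimal sketch, assuming a 32-bit IEEE 754 `float`; `pack_float_bits` and `bits_to_float` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assemble sign, biased exponent, and a normalized 24-bit mantissa
 * into a 32-bit pattern. Masking with 0x7FFFFF strips the implied 1. */
static uint32_t pack_float_bits(uint32_t sign, uint32_t exponent,
                                uint32_t mantissa24)
{
    return (sign << 31) | ((exponent & 0xFFu) << 23) | (mantissa24 & 0x7FFFFFu);
}

/* Reinterpret a 32-bit pattern as a float by copying its bytes. */
static float bits_to_float(uint32_t bits)
{
    float f;

    assert(sizeof f == sizeof bits);   /* assumption: 32-bit float */
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

`pack_float_bits(0, 0x8F, 0xE66800)` gives 0x47E66800, and passing that to `bits_to_float` yields 117968.0f on a platform that meets the stated assumption.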

If you have been following along by writing a program to do this, I did as well. This program type-puns through a union, which is not portable: it assumes float and unsigned int are both 32 bits wide, and the same trick is undefined behavior in C++. I got away with it with my compiler on my computer; don't expect it to work everywhere.

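#include <stdio.h>

/* Reinterprets a float's bytes as an unsigned int; assumes both are 32 bits. */
union
{
    float f;
    unsigned int u;
} myun;


int main ( void )
{
    float a,b,c;

    a=123400;
    b=  5432;

    c=a-b;

    /* print the raw bit pattern and the value of each operand and of the result */
    myun.f=a; printf("0x%08X %f\n",myun.u,myun.f);
    myun.f=b; printf("0x%08X %f\n",myun.u,myun.f);
    myun.f=c; printf("0x%08X %f\n",myun.u,myun.f);

    return(0);
}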

And our result matches the output of the above program; we got 0x47E66800 doing it by hand:

0x47F10400 123400.000000
0x45A9C000 5432.000000
0x47E66800 117968.000000

If you are writing a program to synthesize the floating point math, your program can perform the subtract directly; you don't have to do the invert-and-add-one thing, which overcomplicates it, as we saw above. If you get a negative result, though, you need to flip the sign bit, negate your result, and then normalize.

So:

1) Extract the parts: sign, exponent, mantissa.

2) Align your binary points by sacrificing mantissa bits from the number with the smaller exponent: shift that mantissa to the right until the exponents match.

3) This being a subtract operation, if the sign bits are the same you perform a subtract of the mantissas; if the sign bits are different you perform an add of the mantissas.

4) If the result is zero then your answer is zero: encode the IEEE value for zero as the result. Otherwise:

5) Normalize the number by shifting the answer right or left until you have a 24-bit number with the most significant 1 left justified. (A 24-bit add/subtract can produce a 25-bit answer, and the shift needed to normalize can be dramatic: at most one bit to the right, but possibly many bits to the left.) 24 bits is for single precision float. The more correct way to define normalizing is to shift left or right until the number resembles 1.something. If you had 0.001 you would shift left 3; if you had 11.10 you would shift right 1. A shift left decreases your exponent; a shift right increases it. No different than when we converted from integer to float above.

6) For single precision, remove the leading 1. from the mantissa. If the exponent has overflowed, the result is an infinity (exponent all ones, fraction zero). If the sign bits were different and you performed an add, then you have to deal with figuring out the result sign bit. If, as above, everything is fine, you just place the sign bit, exponent, and mantissa in the result.
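The six steps can be strung together into one rough sketch. Everything here is an illustration under assumptions, not a complete IEEE 754 implementation: `to_bits` and `soft_sub` are hypothetical names, the operands must be normal finite numbers, bits shifted out during alignment are truncated rather than rounded (so results can differ from hardware in the last bit), and underflow is flushed to zero. The sketch treats a - b as a + (-b) by flipping the sign of b first, which is equivalent to the same-sign/different-sign rule in step 3.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a float (assumes a 32-bit IEEE 754 float). */
static uint32_t to_bits(float f)
{
    uint32_t u;
    assert(sizeof f == sizeof u);
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Sketch of a - b on bit patterns of normal, finite operands. */
static uint32_t soft_sub(uint32_t a, uint32_t b)
{
    /* 1) extract the parts; flipping b's sign turns a - b into a + (-b) */
    uint32_t sa = a >> 31, sb = (b >> 31) ^ 1u;
    int ea = (int)((a >> 23) & 0xFFu), eb = (int)((b >> 23) & 0xFFu);
    uint32_t ma = (a & 0x7FFFFFu) | 0x800000u;   /* restore implied 1 */
    uint32_t mb = (b & 0x7FFFFFu) | 0x800000u;
    uint32_t sign, m;
    int e;

    /* 2) align: shift the smaller-exponent mantissa right */
    if (ea >= eb) { int d = ea - eb; mb = d >= 24 ? 0 : mb >> d; e = ea; }
    else          { int d = eb - ea; ma = d >= 24 ? 0 : ma >> d; e = eb; }

    /* 3) same effective sign: add magnitudes; otherwise subtract the
     *    smaller magnitude from the larger and keep the larger's sign */
    if (sa == sb)      { m = ma + mb; sign = sa; }
    else if (ma >= mb) { m = ma - mb; sign = sa; }
    else               { m = mb - ma; sign = sb; }

    /* 4) a zero result is encoded as positive zero */
    if (m == 0) return 0;

    /* 5) normalize: right shift raises the exponent, left shift lowers it */
    while (m > 0xFFFFFFu)     { m >>= 1; e++; }
    while (!(m & 0x800000u))  { m <<= 1; e--; }

    /* 6) pack, stripping the implied 1 */
    if (e >= 255) return (sign << 31) | 0x7F800000u;  /* overflow: infinity */
    if (e <= 0)   return sign << 31;                  /* underflow: zero here */
    return (sign << 31) | ((uint32_t)e << 23) | (m & 0x7FFFFFu);
}
```

Calling `soft_sub(to_bits(123400.0f), to_bits(5432.0f))` should give 0x47E66800, the same pattern as the hand calculation, because no nonzero bits are lost during alignment in that example.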

Multiply and divide are different; you asked about subtract, so that is all I covered.
