Floating point conversion for 8-bit floating point numbers


Problem description




Consider the following 8-bit (yes, 8-bit, not 8-byte) floating point representation based on the IEEE floating point format.

  1. Format A:
    There is one sign bit.
    There are k=3 exponent bits.
    There are n=4 fraction bits.

  2. Format B:
    There is one sign bit.
    There are k=4 exponent bits.
    There are n=3 fraction bits.

Below, you are given some bit patterns in Format A. Your task is to find the values of the numbers given in Format A and to convert them to the closest value in Format B.

Format A                       Format B
  Bits             Value          Bits 
  1 010 1000 
  1 110 0000 
  0 101 1010 
  0 000 1001

This is homework... I don't want the assignment done for me. I just want to learn how to convert. Floating point gets me extremely confused.

Can someone just please make up a "Format A" and show me how to get the value/convert step-by-step?

Solution

The question is missing many details that are important for defining a floating point format. I'm going to try to answer the first part of the question filling in the missing information by assuming that everything unspecified follows the common rules for binary interchange formats in IEEE Std 754-2008 IEEE Standard for Floating-Point Arithmetic.

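The given parameters for Format A, in terms of Table 3.3 in the standard, are k=8 (the total storage width in bits) and p=5 (the precision: the 4 stored fraction bits plus the implicit leading bit). Here k and p are the standard's parameters, not the k and n used in the question.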

From that, and the formula in the standard, bias = emax = 2**(k - p - 1) - 1 = 3.
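The bias formula can be checked with a few lines of Python. This is only a sketch: the function name `ieee_bias` is made up for illustration, and it assumes the format follows the IEEE 754-2008 binary interchange layout (one sign bit, k - p exponent bits, p - 1 stored fraction bits).

```python
def ieee_bias(k: int, p: int) -> int:
    """Exponent bias for an IEEE 754-2008 style binary format.

    k is the total storage width in bits; p is the precision
    (stored fraction bits plus the implicit leading bit).
    Hypothetical helper, not something named in the standard.
    """
    exponent_bits = k - p  # 1 sign bit + exponent bits + (p - 1) fraction bits = k
    return 2 ** (exponent_bits - 1) - 1


print(ieee_bias(8, 5))  # Format A: 3 exponent bits
print(ieee_bias(8, 4))  # Format B: 4 exponent bits
```

Under the same assumption, Format B has k=8 and p=4, so it gets a different bias from Format A. That difference is what makes the stored exponent field change when a value moves from one format to the other.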

Take the example bits 0 001 0011:

The fraction is, in binary, 0011/10000, decimal 3/16 = 0.1875. The exponent bits are non-zero so it is a normal value, with a non-stored leading one bit, so the significand is 1.1875.

The exponent is the stored exponent field minus the bias: in binary, 001 - 011; in decimal, 1 - 3 = -2.

Multiply the significand by 2**(-2) = 1/4, giving absolute value 0.296875. Since the sign bit is zero, the absolute value is the final value.
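The same steps can be written as a short Python sketch. The function name `decode_format_a` is hypothetical, and the handling of an all-zero exponent field (subnormal values) and an all-one exponent field (infinity or NaN) is an assumption taken from the usual IEEE 754 rules, since the question does not spell those cases out. `Fraction` is used so the arithmetic stays exact.

```python
from fractions import Fraction

BIAS = 3  # Format A bias, from 2**(k - p - 1) - 1 with k=8, p=5


def decode_format_a(bits: str) -> Fraction:
    """Decode a Format A pattern such as "0 001 0011" into an exact value."""
    bits = bits.replace(" ", "")
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 8 binary digits")

    sign = int(bits[0])               # 1 sign bit
    exp_field = int(bits[1:4], 2)     # 3 exponent bits
    frac_field = int(bits[4:], 2)     # 4 fraction bits
    fraction = Fraction(frac_field, 16)

    if exp_field == 0b111:
        # Assumption: all-ones exponent means infinity or NaN, as in IEEE 754.
        raise ValueError("pattern encodes infinity or NaN, not a finite value")
    if exp_field == 0:
        # Assumption: subnormal, so no implicit leading one
        # and the exponent is fixed at 1 - bias.
        significand = fraction
        exponent = 1 - BIAS
    else:
        # Normal value: implicit leading one, exponent is field minus bias.
        significand = 1 + fraction
        exponent = exp_field - BIAS

    magnitude = significand * Fraction(2) ** exponent
    return -magnitude if sign else magnitude


value = decode_format_a("0 001 0011")
print(value, float(value))  # 19/64 0.296875
```

Keeping the result as a `Fraction` makes it easy to compare against candidate Format B values exactly when looking for the closest one.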
