Floating point conversion for 8-bit floating point numbers

Views: 50 | Published: 2021/11/26 14:30:28 | Tags: floating-point, 32-bit

Question

Consider the following 8-bit (yes, 8-bit, not 8-byte) floating point representation based on the IEEE floating point format.

Format A:
There is one sign bit.
There are k=3 exponent bits.
There are n=4 fraction bits.

Format B:
There is one sign bit.
There are k=4 exponent bits.
There are n=3 fraction bits.

Below, you are given some bit patterns in format A. Your task is to find the values these patterns represent in format A and to convert them to the closest value in format B.

Format A                       Format B
  Bits             Value          Bits 
  1 010 1000 
  1 110 0000 
  0 101 1010 
  0 000 1001

This is homework... I don't want the assignment done for me. I just want to learn how to convert. Floating point gets me extremely confused.

Can someone just please make up a "Format A" and show me how to get the value/convert step-by-step?

Recommended answer

The question is missing many details that are important for defining a floating point format. I'm going to try to answer the first part of the question filling in the missing information by assuming that everything unspecified follows the common rules for binary interchange formats in IEEE Std 754-2008 IEEE Standard for Floating-Point Arithmetic.

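The given parameters for Format A, in terms of Table 3.3 in the standard, are k=8 and p=5. Here k and p are the standard's parameters (total storage width in bits and precision in bits, counting the implicit leading bit), not the k and n used in the question.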

From that, and the formula in the standard, bias = emax = 2**(k - p - 1) - 1 = 3.
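As a rough sketch of that calculation, the snippet below derives the standard's parameters from the field widths given in the question. The helper name `ieee_params` and its argument names are made up for illustration, and it assumes the usual IEEE 754 mapping in which the precision is the number of stored fraction bits plus one implicit leading bit.

```python
def ieee_params(exp_bits: int, frac_bits: int) -> dict:
    """Derive IEEE 754-style parameters from field widths (hypothetical helper)."""
    k = 1 + exp_bits + frac_bits  # total storage width: sign + exponent + fraction
    p = frac_bits + 1             # precision: stored fraction bits plus the implicit leading bit
    w = k - p                     # exponent field width
    bias = 2 ** (w - 1) - 1       # bias = emax = 2**(k - p - 1) - 1
    return {"k": k, "p": p, "w": w, "bias": bias}


format_a = ieee_params(exp_bits=3, frac_bits=4)
format_b = ieee_params(exp_bits=4, frac_bits=3)
print(format_a["bias"], format_b["bias"])  # 3 7
```

Under the same assumption, Format B works out to k=8, p=4 and a bias of 7.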

Take the example bits 0 001 0011:

The fraction is, in binary, 0011/10000, decimal 3/16 = 0.1875. The exponent bits are non-zero so it is a normal value, with a non-stored leading one bit, so the significand is 1.1875.

The exponent is the stored exponent field minus the bias: in binary, 001 - 011; in decimal, 1 - 3 = -2.

Multiply the significand by 2**(-2) = 1/4, giving absolute value 0.296875. Since the sign bit is zero, the absolute value is the final value.
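The steps above can be sketched in Python as follows. This is only an illustration under the same assumptions as the answer (unspecified details follow IEEE 754 binary interchange rules); the function name `decode_format_a` is hypothetical. The subnormal and infinity/NaN branches are not worked through above, so they are filled in from the general IEEE 754 rules.

```python
def decode_format_a(bits: str) -> float:
    """Decode a pattern such as '0 001 0011' (1 sign, 3 exponent, 4 fraction bits)."""
    b = bits.replace(" ", "")
    if len(b) != 8 or set(b) - {"0", "1"}:
        raise ValueError("expected 8 binary digits")

    sign = -1.0 if b[0] == "1" else 1.0
    exp_field = int(b[1:4], 2)     # stored (biased) exponent
    fraction = int(b[4:], 2) / 16  # 4 fraction bits, so divide by 2**4
    bias = 3

    if exp_field == 0b111:
        # All-ones exponent: infinity or NaN (assumed, per IEEE 754 rules).
        return sign * float("inf") if fraction == 0 else float("nan")
    if exp_field == 0:
        # Subnormal: no implicit leading one, exponent fixed at 1 - bias.
        return sign * fraction * 2.0 ** (1 - bias)
    # Normal value: implicit leading one, exponent is the field minus the bias.
    return sign * (1 + fraction) * 2.0 ** (exp_field - bias)


print(decode_format_a("0 001 0011"))  # 0.296875
```

Every value in this format is exactly representable as a Python float, so the printed result matches the hand calculation exactly.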
