Is transmuting bytes to a float safe or might it produce undefined behavior?

Question

Are there byte sequences that, when transmuted into either f32 or f64, produce undefined behavior in Rust? I'm counting non-finite values, such as NaN and Infinity, as valid floating-point values.

The comments to this answer hint that there may be some problem with converting raw bytes to a float.

Recommended answer

The Rust reference provides a good list of situations where undefined behavior occurs. Of those, the one that most closely relates to the question is the following:

Invalid values in primitive types, even in private fields/locals:

  • Dangling/null references or boxes
  • A value other than false (0) or true (1) in a bool
  • A discriminant in an enum not included in the type definition
  • A value in a char which is a surrogate or above char::MAX
  • Non-UTF-8 byte sequences in a str

Floating-point types, however, are not on that list. This is because any bit sequence (32 bits for f32; 64 bits for f64) is a valid state for a floating-point value, in accordance with the IEEE 754-2008 binary32 and binary64 formats. The value might not be normal (the other classes are zero, subnormal, infinite, and not a number), but it is valid nonetheless.
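To make the point about value classes concrete, here is a minimal sketch that reinterprets a handful of hand-picked bit patterns with the safe f32::from_bits and prints the class each one falls into. The helper name classify_bits and the sample patterns are chosen for illustration only.

```rust
use std::num::FpCategory;

/// Reinterprets a raw 32-bit pattern as an `f32` and reports its IEEE 754 class.
/// (`classify_bits` is an illustrative helper, not a standard library function.)
fn classify_bits(bits: u32) -> FpCategory {
    f32::from_bits(bits).classify()
}

fn main() {
    let samples: [u32; 6] = [
        0x0000_0000, // +0.0
        0x0000_0001, // smallest positive subnormal
        0x3F80_0000, // 1.0
        0x7F80_0000, // +infinity
        0xFF80_0000, // -infinity
        0x7FC0_0000, // a quiet NaN
    ];
    for bits in samples {
        println!("{:#010x} -> {:?}", bits, classify_bits(bits));
    }
}
```

Every pattern maps to one of the five FpCategory variants; there is no "invalid" outcome.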

In the end though, there should always be another way around transmute. In particular, the byteorder crate provides a safe and intuitive way to read numbers from a stream of bytes.

use byteorder::{ByteOrder, LittleEndian}; // or NativeEndian

fn main() {
    // Little-endian bytes of 0x7F80_0000, the bit pattern of +infinity.
    let bytes = [0x00u8, 0x00, 0x80, 0x7F];
    let number = LittleEndian::read_f32(&bytes);
    println!("{}", number); // prints "inf"
}
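If pulling in an external crate is undesirable, the standard library offers a comparable safe route through f32::from_le_bytes (with from_be_bytes and from_ne_bytes for other byte orders). The sketch below is an alternative to the byteorder approach, not something the original answer used; the wrapper name f32_from_le is made up for the example.

```rust
/// Decodes four little-endian bytes into an `f32` without any `unsafe` code.
/// (`f32_from_le` is an illustrative wrapper around the std method.)
fn f32_from_le(bytes: [u8; 4]) -> f32 {
    f32::from_le_bytes(bytes)
}

fn main() {
    // Same bytes as before: the little-endian encoding of +infinity.
    let bytes = [0x00u8, 0x00, 0x80, 0x7F];
    let number = f32_from_le(bytes);
    println!("{}", number);
}
```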


There is, however, one very peculiar edge case: transmuting bits to a float can produce a signalling NaN, which on some CPU architectures and configurations will trigger a low-level exception. See the discussion in rust#39271 for details. It is currently understood that materializing a signalling NaN is not undefined behavior, and that unless floating-point exceptions are enabled (they are not by default), this is unlikely to be a problem.
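As a rough illustration of that edge case, the sketch below builds a NaN whose quiet bit is clear, which makes it signalling under the IEEE 754-2008 convention used by x86 and ARM. The constant QUIET_BIT and the helper is_signalling_nan_2008 are names invented for this example; the check inspects the integer bits directly rather than relying on any platform behavior.

```rust
/// Bit 22 is the most significant mantissa bit of an `f32`. Under IEEE 754-2008
/// a NaN with this bit set is quiet, and one with it clear is signalling.
const QUIET_BIT: u32 = 0x0040_0000;

/// Returns true if `bits` encodes a NaN that the 2008 convention treats as signalling.
/// (Illustrative helper; the standard library has no stable equivalent.)
fn is_signalling_nan_2008(bits: u32) -> bool {
    f32::from_bits(bits).is_nan() && (bits & QUIET_BIT) == 0
}

fn main() {
    // Exponent all ones, quiet bit clear, nonzero payload.
    let bits = 0x7FA0_0000u32;
    let value = f32::from_bits(bits); // safe: no `unsafe` block required
    println!("is_nan = {}", value.is_nan());
    println!("signalling (2008 convention) = {}", is_signalling_nan_2008(bits));
}
```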

The decision already implemented by the Rust library team is that transmuting to a float is safe, even without any kind of masking. The reasoning is well described in the documentation for f32::from_bits:

This is currently identical to transmute::<u32, f32>(v) on all platforms. It turns out this is incredibly portable, for two reasons:

  • Floats and ints have the same endianness on all supported platforms.
  • IEEE-754 very precisely specifies the bit layout of floats.
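The equivalence between the two routes can be checked directly. The following is a minimal sketch that compares the bits produced by transmute with those produced by f32::from_bits for a few arbitrary sample patterns; via_transmute is a helper written for this example.

```rust
use std::mem::transmute;

/// Reinterprets `bits` as an `f32` using `transmute`.
/// (`via_transmute` is an illustrative helper.)
fn via_transmute(bits: u32) -> f32 {
    // SAFETY: u32 and f32 have the same size, and every 32-bit pattern is a valid f32.
    unsafe { transmute::<u32, f32>(bits) }
}

fn main() {
    for bits in [0x0000_0000u32, 0x3F80_0000, 0x7F80_0000, 0x7FC0_0000] {
        let a = via_transmute(bits);
        let b = f32::from_bits(bits);
        // Compare bit patterns rather than values, because NaN != NaN.
        println!("{:#010x}: same bits = {}", bits, a.to_bits() == b.to_bits());
    }
}
```

In practice the safe f32::from_bits is preferable, since it needs no unsafe block.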

However there is one caveat: prior to the 2008 version of IEEE-754, how to interpret the NaN signaling bit wasn't actually specified. Most platforms (notably x86 and ARM) picked the interpretation that was ultimately standardized in 2008, but some didn't (notably MIPS). As a result, all signaling NaNs on MIPS are quiet NaNs on x86, and vice-versa.

Rather than trying to preserve signaling-ness cross-platform, this implementation favours preserving the exact bits. This means that any payloads encoded in NaNs will be preserved even if the result of this method is sent over the network from an x86 machine to a MIPS one.
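A small sketch of what preserving the exact bits looks like in practice: a NaN carrying a payload is serialized to big-endian bytes and decoded again, and the payload comes back intact. The functions encode and decode and the payload value 0x1234 are invented for the example, and the round trip here happens on a single machine, so it only imitates the cross-platform transfer described above.

```rust
/// Serializes a float as big-endian bytes, as one might before sending it over a network.
/// (`encode` and `decode` are illustrative helpers.)
fn encode(value: f32) -> [u8; 4] {
    value.to_bits().to_be_bytes()
}

/// Rebuilds the float from its big-endian bytes, keeping every bit as received.
fn decode(wire: [u8; 4]) -> f32 {
    f32::from_bits(u32::from_be_bytes(wire))
}

fn main() {
    // A quiet NaN with an arbitrary payload in the low mantissa bits.
    let original = f32::from_bits(0x7FC0_1234);
    let received = decode(encode(original));
    println!("received bits: {:#010x}", received.to_bits());
}
```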

If the results of this method are only manipulated by the same architecture that produced them, then there is no portability concern.

If the input isn't NaN, then there is no portability concern.

If you don't care about signalingness (very likely), then there is no portability concern.

Some parsing/encoding libraries may still convert every kind of NaN to an assuredly quiet NaN, as this matter was uncertain for a while in Rust's history.
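The kind of normalization such libraries perform might look roughly like the sketch below. It is not taken from any particular library: canonicalize_nan is a hypothetical helper, and it assumes that the standard library's f32::NAN constant is an acceptable quiet NaN on the target in question.

```rust
/// Hypothetical helper: replaces any NaN (quiet or signalling, with any payload)
/// by the standard library's `f32::NAN`, and passes every other value through.
fn canonicalize_nan(value: f32) -> f32 {
    if value.is_nan() {
        f32::NAN
    } else {
        value
    }
}

fn main() {
    let signalling = f32::from_bits(0x7FA0_0000);
    println!("{:#010x}", canonicalize_nan(signalling).to_bits());
    println!("{}", canonicalize_nan(2.5));
}
```

The trade-off is that any NaN payload is discarded, which is exactly what f32::from_bits avoids by keeping the bits untouched.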
