Copying big endian float data directly into a vector<float> and byte swapping in place. Is it safe?

Question

I'd like to be able to copy big endian float arrays directly from an unaligned network buffer into a std::vector<float> and perform the byte swapping back to host order "in place", without involving an intermediate std::vector<uint32_t>. Is this even safe? I'm worried that the big endian float data may accidentally be interpreted as NaNs and trigger unexpected behavior. Is this a valid concern?

For the purposes of this question, assume that the host machine receiving the data is little endian.

Here's some code that demonstrates what I'm trying to do:

std::vector<float> source{1.0f, 2.0f, 3.0f, 4.0f};
std::size_t number_count = source.size();

// Simulate big-endian float values being received from network and stored
// in byte buffer. A temporary uint32_t vector is used to transform the
// source data to network byte order (big endian) before being copied
// to a byte buffer.
std::vector<uint32_t> temp(number_count, 0);
std::size_t byte_length = number_count * sizeof(float);
std::memcpy(temp.data(), source.data(), byte_length);
for (uint32_t& datum: temp)
    datum = ::htonl(datum);
std::vector<uint8_t> buffer(byte_length, 0);
std::memcpy(buffer.data(), temp.data(), byte_length);
// buffer now contains the big endian float data, and is not aligned at word boundaries

// Copy the received network buffer data directly into the destination float vector
std::vector<float> numbers(number_count, 0.0f);
std::memcpy(numbers.data(), buffer.data(), byte_length); // IS THIS SAFE??

// Perform the byte swap back to host order (little endian) in place,
// to avoid needing to allocate an intermediate uint32_t vector.
auto ptr = reinterpret_cast<uint8_t*>(numbers.data());
for (size_t i=0; i<number_count; ++i)
{
    // IS THIS SAFE??
    uint32_t datum;
    std::memcpy(&datum, ptr, sizeof(datum));
    datum = ::ntohl(datum); // datum is a plain uint32_t, not a pointer
    std::memcpy(ptr, &datum, sizeof(datum));
    ptr += sizeof(datum);
}

assert(numbers == source);

Note the two "IS THIS SAFE??" comments above.

Motivation: I'm writing a CBOR serialization library with support for typed arrays. CBOR allows typed arrays to be transmitted as either big endian or little endian.

EDIT: Replaced illegal reinterpret_cast<uint32_t*> type punning in endian swap loop with memcpy.

Answer

After the edit:

Regarding auto datum = reinterpret_cast<uint32_t*>(numbers.data()); this is not allowed in C++, because reading a float object through a uint32_t lvalue violates the strict aliasing rule. One can safely type-pun only through uint8_t, and only if CHAR_BIT == 8. More precisely, the aliasing exception holds only for the char types (char, unsigned char and, since C++17, std::byte).

Old answer: The following applies to the question as it stood before the edit (the version that used bit_cast).

This is safe, provided that sizeof(float) == sizeof(uint32_t).

Don't worry about signaling NaNs. Floating-point exceptions are usually disabled, and even when they are enabled, they are raised only when a signaling NaN is used as an operand of a floating-point operation. Move instructions do not generate exceptions.
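To make the point concrete, here is a minimal sketch, assuming a 32-bit float, that parks NaN bit patterns in a float object using only memcpy and reads them back. The helper names store_and_reload and check_nan_patterns_survive are hypothetical and exist only for this illustration. No floating-point arithmetic touches the value, so the bytes come back unchanged.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper: store an arbitrary 32-bit pattern in a float object
// and read it back, using only byte copies (no floating-point operations).
std::uint32_t store_and_reload(std::uint32_t bits)
{
    float slot;
    std::memcpy(&slot, &bits, sizeof(slot));
    std::uint32_t reloaded;
    std::memcpy(&reloaded, &slot, sizeof(reloaded));
    return reloaded;
}

// Hypothetical self-check over patterns that IEEE 754 binary32 classifies
// as NaNs (the first one is a signaling NaN on common platforms).
void check_nan_patterns_survive()
{
    const std::uint32_t patterns[] = {0x7F800001u, 0x7FC00000u, 0xFFFFFFFFu};
    for (std::uint32_t p : patterns)
        assert(store_and_reload(p) == p);
}
```

The float object is never read as a float here, which is exactly the situation in the question between the initial memcpy and the in-place byte swap.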

Accessing the vector elements through the data() pointer is supported for both reading and writing, and vector is guaranteed to have contiguous storage.

But why not do everything in a single loop, without the temporary buffers?

You only need the float vector (input or output) and the data buffer (a uint8_t vector). For sending, iterate over the float input vector and, for each element, perform the byte swap and write its 4 bytes to the data buffer, one element at a time. That way you do not need any intermediate buffers, and it will probably not be slower. For receiving, do the reverse.
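The single-loop idea could look roughly like the sketch below. It assumes a 32-bit float, and the function names encode_big_endian and decode_big_endian are hypothetical. Instead of htonl/ntohl it uses shifts to pick out the bytes, which is a substitution made here so that the code does not depend on the host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper: write each float to the byte buffer, most
// significant byte first, without any intermediate uint32_t vector.
std::vector<std::uint8_t> encode_big_endian(const std::vector<float>& values)
{
    std::vector<std::uint8_t> out;
    out.reserve(values.size() * sizeof(float));
    for (const float& v : values) {
        std::uint32_t bits;
        std::memcpy(&bits, &v, sizeof(bits));  // legal way to get the bit pattern
        out.push_back(static_cast<std::uint8_t>(bits >> 24));
        out.push_back(static_cast<std::uint8_t>(bits >> 16));
        out.push_back(static_cast<std::uint8_t>(bits >> 8));
        out.push_back(static_cast<std::uint8_t>(bits));
    }
    return out;
}

// Hypothetical helper: the reverse direction. The byte buffer may be
// unaligned, because it is only ever read one byte at a time.
std::vector<float> decode_big_endian(const std::vector<std::uint8_t>& in)
{
    assert(in.size() % sizeof(float) == 0);
    std::vector<float> values;
    values.reserve(in.size() / sizeof(float));
    for (std::size_t i = 0; i < in.size(); i += sizeof(float)) {
        const std::uint32_t bits = (std::uint32_t{in[i]} << 24) |
                                   (std::uint32_t{in[i + 1]} << 16) |
                                   (std::uint32_t{in[i + 2]} << 8) |
                                   std::uint32_t{in[i + 3]};
        float v;
        std::memcpy(&v, &bits, sizeof(v));
        values.push_back(v);
    }
    return values;
}
```

With this shape, the float vector only ever holds values that are already in host order, so the question of byte-swapped data sitting in float objects does not arise.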

Use std::bit_cast to convert a float to and from std::array<uint8_t, 4>. This would be the "correct" way in C++20 (you can't use C arrays directly with bit_cast). With this approach you do not need to call ntohl; just copy the bytes in the correct order from or to the buffer.
