Portable way to serialize float as 32-bit integer


Question

I have been struggling with finding a portable way to serialize 32-bit float variables in C and C++ to be sent to and from microcontrollers. I want the format to be well-defined enough so that serialization/de-serialization can be done from other languages as well without too much effort. Related questions are:

Portability of binary serialization of double/float type in C++

Serialize double and float with C

c++ portable conversion of long to double

I know that in most cases a typecast, a union, or memcpy will work just fine because the float representation is the same, but I would prefer to have a bit more control and peace of mind. What I came up with so far is the following:

void serialize_float32(uint8_t* buffer, float number, int32_t *index) {
    int e = 0;
    float sig = frexpf(number, &e);
    float sig_abs = fabsf(sig);
    uint32_t sig_i = 0;

    if (sig_abs >= 0.5) {
        sig_i = (uint32_t)((sig_abs - 0.5f) * 2.0f * 8388608.0f);
        e += 126;
    }

    uint32_t res = ((uint32_t)(e & 0xFF) << 23) | (sig_i & 0x7FFFFF);
    if (sig < 0) {
        res |= (uint32_t)1 << 31;
    }

    buffer[(*index)++] = (res >> 24) & 0xFF;
    buffer[(*index)++] = (res >> 16) & 0xFF;
    buffer[(*index)++] = (res >> 8) & 0xFF;
    buffer[(*index)++] = res & 0xFF;
}

float deserialize_float32(const uint8_t *buffer, int32_t *index) {
    uint32_t res = ((uint32_t) buffer[*index]) << 24 |
                ((uint32_t) buffer[*index + 1]) << 16 |
                ((uint32_t) buffer[*index + 2]) << 8 |
                ((uint32_t) buffer[*index + 3]);
    *index += 4;

    int e = (res >> 23) & 0xFF;
    uint32_t sig_i = res & 0x7FFFFF;
    bool neg = (res & ((uint32_t)1 << 31)) != 0;

    float sig = 0.0;
    if (e != 0 || sig_i != 0) {
        sig = (float)sig_i / (8388608.0 * 2.0) + 0.5;
        e -= 126;
    }

    if (neg) {
        sig = -sig;
    }

    return ldexpf(sig, e);
}

The frexp and ldexp functions seem to be made for this purpose, but in case they aren't available I tried to implement them manually as well using functions that are common:

float frexpf_slow(float f, int *e) {
    if (f == 0.0) {
        *e = 0;
        return 0.0;
    }

    *e = ceil(log2f(fabsf(f)));
    float res = f / powf(2.0, (float)*e);

    // Make sure that the magnitude stays below 1 so that no overflow occurs
    // during serialization. ceil() leaves res at exactly +/-1 when f is a
    // power of two, so halve the fraction and bump the exponent to compensate.
    if (res >= 1.0f || res <= -1.0f) {
        res *= 0.5f;
        *e += 1;
    }

    return res;
}

float ldexpf_slow(float f, int e) {
    return f * powf(2.0, (float)e);
}
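
If log2f and powf are also unavailable or too slow on the target, the same split can be done with comparisons and multiplications only. The following is a minimal sketch, not part of the original question: frexpf_loop is a hypothetical name, and it assumes a binary floating-point float, where multiplying by 2 or 0.5 is exact.

```c
#include <float.h>

/* Hypothetical helper (not from the question): splits f into a fraction with
 * magnitude in [0.5, 1) and a power-of-two exponent, using only comparisons
 * and multiplications by 0.5 or 2. Assumes FLT_RADIX == 2, so those
 * multiplications do not round. */
float frexpf_loop(float f, int *e) {
    float a = f < 0.0f ? -f : f;
    *e = 0;
    /* Zero, NaN and infinity are returned unchanged with exponent 0;
     * without this guard an infinity would loop forever below. */
    if (!(a > 0.0f) || a > FLT_MAX) {
        return f;
    }
    while (a >= 1.0f) { a *= 0.5f; ++*e; }   /* scale large values down */
    while (a < 0.5f)  { a *= 2.0f; --*e; }   /* scale small values up   */
    return f < 0.0f ? -a : a;
}
```

The loops run at most a few hundred iterations for any finite float, which may or may not be acceptable depending on the microcontroller.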

One thing I have been considering is whether to use 8388608 (2^23) or 8388607 (2^23 - 1) as the multiplier. The documentation says that frexp returns values that are less than 1 in magnitude, and after some experimentation it seems that 8388608 gives results that are bit-accurate with actual floats and I could not find any corner case where this overflows. That might not be true with a different compiler/system though. If this can become a problem a smaller multiplier which reduces the accuracy a bit is fine with me as well. I know that this does not handle Inf or NaN, but for now that is not a requirement.
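
The observation that 2^23 never overflows can be checked on paper. The sketch below assumes that float is IEEE 754 binary32 (24-bit significand) and that frexpf returns a magnitude s in [0.5, 1) for every finite, non-zero input; the symbols s and m are introduced here for illustration.

```latex
\begin{align*}
s &= m \cdot 2^{-24}, \qquad m \in \mathbb{Z}, \quad 2^{23} \le m \le 2^{24} - 1 \\
\left(s - \tfrac{1}{2}\right) \cdot 2 \cdot 2^{23} &= m - 2^{23} \in \{0, 1, \dots, 2^{23} - 1\}
\end{align*}
```

Under that assumption the scaled value is an exact integer that fits in 23 bits, so the cast truncates nothing and the & 0x7FFFFF mask never discards a set bit. With 2^23 - 1 as the multiplier the product is generally not an integer, so the cast would round most significands down and the round trip would no longer be bit-exact.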

So, finally, my question is: Does this look like a reasonable approach, or am I just making a complicated solution that still has portability issues?

Answer

You seem to have a bug in serialize_float32: the last 4 lines should read:

buffer[(*index)++] = (res >> 24) & 0xFF;
buffer[(*index)++] = (res >> 16) & 0xFF;
buffer[(*index)++] = (res >> 8) & 0xFF;
buffer[(*index)++] = res & 0xFF;

Your method might not work correctly for infinities and/or NaNs because of the offset by 126 instead of 128. Note that you can validate it by extensive testing: there are only 4 billion values, trying all possibilities should not take very long.
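
One possible shape for such an exhaustive test is sketched below. It is an illustration, not code from the answer: count_mismatches, ref_encode and ref_decode are hypothetical names, the function-pointer signatures are chosen to match the question's serialize_float32 and deserialize_float32, and the host is assumed to store float as IEEE 754 binary32 so that its own bit pattern can serve as the expected wire image. Inf and NaN patterns are skipped because the question excludes them.

```c
#include <stdint.h>
#include <string.h>

/* Signatures match the question's serialize_float32 / deserialize_float32. */
typedef void  (*encode_fn)(uint8_t *buffer, float number, int32_t *index);
typedef float (*decode_fn)(const uint8_t *buffer, int32_t *index);

/* Hypothetical harness: walks bit patterns first, first + step, ... up to
 * last (step must be non-zero), skips Inf/NaN (exponent field 0xFF), and
 * counts the patterns that do not survive encode -> bytes -> decode. */
uint64_t count_mismatches(encode_fn enc, decode_fn dec,
                          uint32_t first, uint32_t last, uint32_t step) {
    uint64_t bad = 0;
    for (uint64_t p = first; p <= last; p += step) {
        uint32_t bits = (uint32_t)p, back;
        if (((bits >> 23) & 0xFF) == 0xFF) continue;
        float in, out;
        uint8_t buf[4];
        int32_t wi = 0, ri = 0;
        memcpy(&in, &bits, sizeof in);        /* bit pattern -> float */
        enc(buf, in, &wi);
        uint32_t wire = (uint32_t)buf[0] << 24 | (uint32_t)buf[1] << 16 |
                        (uint32_t)buf[2] << 8  | (uint32_t)buf[3];
        out = dec(buf, &ri);
        memcpy(&back, &out, sizeof back);     /* float -> bit pattern */
        if (wire != bits || back != bits || wi != 4 || ri != 4) ++bad;
    }
    return bad;
}

/* Stand-in codec (assumption: host float is binary32) that copies the bits
 * with memcpy and writes them big-endian; used here only to exercise the
 * harness. */
void ref_encode(uint8_t *buffer, float number, int32_t *index) {
    uint32_t bits;
    memcpy(&bits, &number, sizeof bits);
    buffer[(*index)++] = (bits >> 24) & 0xFF;
    buffer[(*index)++] = (bits >> 16) & 0xFF;
    buffer[(*index)++] = (bits >> 8) & 0xFF;
    buffer[(*index)++] = bits & 0xFF;
}

float ref_decode(const uint8_t *buffer, int32_t *index) {
    uint32_t bits = (uint32_t)buffer[*index] << 24 |
                    (uint32_t)buffer[*index + 1] << 16 |
                    (uint32_t)buffer[*index + 2] << 8 |
                    (uint32_t)buffer[*index + 3];
    float f;
    *index += 4;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Passing the question's serialize_float32 and deserialize_float32 with first = 0, last = 0xFFFFFFFF and step = 1 would cover every pattern. Zeros and subnormals are part of the walk, so any mismatch there shows up in the count.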

The actual representation in memory of float values may differ on different architectures, but IEEE 754 (or more precisely IEC 60559) is by far the most common today. You can verify whether your particular targets are compliant by checking if __STDC_IEC_559__ is defined. Note however that even if you can assume IEEE 754, you must handle potentially different endianness between the systems. You cannot assume the endianness of floats to be the same as that of integers for the same platform.
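
These checks can be expressed in a few lines of C. The sketch below is illustrative only: the three function names are hypothetical, the <float.h> test only checks that float has the binary32 parameters rather than full IEC 60559 conformance, and the byte-order probe assumes binary32, where 1.0f has the bit pattern 0x3F800000.

```c
#include <float.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if float has the binary32 parameters (radix 2, 24-bit
 * significand, exponent range of binary32, 32 bits of storage). */
int float_has_binary32_shape(void) {
    return FLT_RADIX == 2 && FLT_MANT_DIG == 24 &&
           FLT_MAX_EXP == 128 && FLT_MIN_EXP == -125 &&
           sizeof(float) * CHAR_BIT == 32;
}

/* Returns 1 if the implementation defines __STDC_IEC_559__. */
int claims_iec_559(void) {
#ifdef __STDC_IEC_559__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical probe: returns 1 if a float is laid out in memory in the
 * same byte order as a uint32_t holding the same bit pattern. */
int float_byte_order_matches_integer(void) {
    const float one = 1.0f;
    const uint32_t pattern = UINT32_C(0x3F800000);
    return memcmp(&one, &pattern, sizeof one) == 0;
}
```

A target where the last function returns 0 would need the float bytes reordered separately from any integer byte swapping.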

Note also that the simple cast would be incorrect: uint32_t res = *(uint32_t *)&number; violates the strict aliasing rule. You should either use a union (valid in C, but undefined behavior in C++) or use memcpy(&res, &number, sizeof(res));
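
The two safe alternatives can be written as small helpers. This is a minimal sketch with hypothetical function names, assuming sizeof(float) == sizeof(uint32_t); the union version is meant for C only.

```c
#include <stdint.h>
#include <string.h>

/* Copies the object representation of the float into an integer.
 * Well-defined in both C and C++. Assumes sizeof(float) == 4. */
uint32_t float_bits_memcpy(float number) {
    uint32_t res;
    memcpy(&res, &number, sizeof res);
    return res;
}

/* Reads the float's bytes back through a union member of another type.
 * Permitted in C (C99 and later); not valid in C++. */
uint32_t float_bits_union(float number) {
    union { float f; uint32_t u; } pun;
    pun.f = number;
    return pun.u;
}
```

Compilers typically reduce the memcpy call to a single register move, so the portable form does not have to cost anything at run time.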
