在 C 中读取/写入浮点类型时如何处理字节顺序差异? [英] How do I handle byte order differences when reading/writing floating-point types in C?
问题描述
我正在为我的应用程序设计一种文件格式,我显然希望它能够在大端和小端系统上工作.我已经找到了使用 htonl
和 ntohl
管理整数类型的有效解决方案,但是在尝试使用 float
和 double
值.
I'm devising a file format for my application, and I'd obviously like for it to work on both big-endian and little-endian systems. I've already found working solutions for managing integral types using htonl
and ntohl
, but I'm a bit stuck when trying to do the same with float
and double
values.
鉴于浮点表示的工作方式,我假设标准字节顺序函数不适用于这些值.同样,我什至不完全确定传统意义上的字节序是否决定了这些类型的字节顺序.
Given the nature of how floating-point representations work, I would assume that the standard byte-order functions won't work on these values. Likewise, I'm not even entirely sure if endianness in the traditional sense is what governs the byte order of these types.
我需要的只是一致性.一种写出 double
的方法,并确保我在读回它时得到相同的值.我如何在 C 中做到这一点?
All I need is consistency. A way to write a double
out, and ensure I get that same value when I read it back in. How can I do this in C?
推荐答案
另一种选择是使用 double frexp(double value, int *exp);
from <math.h>
(C99) 将浮点值分解为归一化分数(在 [0.5, 1) 范围内)和 2 的整数幂.然后您可以将分数乘以 FLT_RADIX
DBL_MANT_DIG
获取范围 [FLT_RADIX
DBL_MANT_DIG
/2 内的整数, FLT_RADIX
DBL_MANT_DIG
).然后你保存两个整数 big-endian 或 little-endian,无论你选择哪种格式.
Another option could be to use double frexp(double value, int *exp);
from <math.h>
(C99) to break down the floating-point value into a normalized fraction (in the range [0.5, 1)) and an integral power of 2. You can then multiply the fraction by FLT_RADIX
DBL_MANT_DIG
to get an integer in the range [FLT_RADIX
DBL_MANT_DIG
/2, FLT_RADIX
DBL_MANT_DIG
). Then you save both integers big- or little-endian, whichever you choose in your format.
当你加载一个保存的数字时,你做相反的操作并使用 double ldexp(double x, int exp);
将重构的分数乘以 2 的幂.
When you load a saved number, you do the reverse operation and use double ldexp(double x, int exp);
to multiply the reconstructed fraction by the power of 2.
当 FLT_RADIX
=2(我想几乎是所有系统?)和 DBL_MANT_DIG
<=64 时,这将最有效.
This will work best when FLT_RADIX
=2 (virtually all systems, I suppose?) and DBL_MANT_DIG
<=64.
必须小心避免溢出.
doubles
的示例代码:
#include <limits.h>
#include <float.h>
#include <math.h>
#include <string.h>
#include <stdio.h>
#if CHAR_BIT != 8
#error currently supported only CHAR_BIT = 8
#endif
#if FLT_RADIX != 2
#error currently supported only FLT_RADIX = 2
#endif
#ifndef M_PI
#define M_PI 3.14159265358979324
#endif
typedef unsigned char uint8;
/*
10-byte little-endian serialized format for double:
- normalized mantissa stored as 64-bit (8-byte) signed integer:
negative range: (-2^53, -2^52]
zero: 0
positive range: [+2^52, +2^53)
- 16-bit (2-byte) signed exponent:
range: [-0x7FFE, +0x7FFE]
Represented value = mantissa * 2^(exponent - 53)
Special cases:
- +infinity: mantissa = 0x7FFFFFFFFFFFFFFF, exp = 0x7FFF
- -infinity: mantissa = 0x8000000000000000, exp = 0x7FFF
- NaN: mantissa = 0x0000000000000000, exp = 0x7FFF
- +/-0: only one zero supported
*/
void Double2Bytes(uint8 buf[10], double x)
{
double m;
long long im; // at least 64 bits
int ie;
int i;
if (isnan(x))
{
// NaN
memcpy(buf, "x00x00x00x00x00x00x00x00" "xFFx7F", 10);
return;
}
else if (isinf(x))
{
if (signbit(x))
// -inf
memcpy(buf, "x00x00x00x00x00x00x00x80" "xFFx7F", 10);
else
// +inf
memcpy(buf, "xFFxFFxFFxFFxFFxFFxFFx7F" "xFFx7F", 10);
return;
}
// Split double into normalized mantissa (range: (-1, -0.5], 0, [+0.5, +1))
// and base-2 exponent
m = frexp(x, &ie); // x = m * 2^ie exactly for FLT_RADIX=2
// frexp() can't fail
// Extract most significant 53 bits of mantissa as integer
m = ldexp(m, 53); // can't overflow because
// DBL_MAX_10_EXP >= 37 equivalent to DBL_MAX_2_EXP >= 122
im = trunc(m); // exact unless DBL_MANT_DIG > 53
// If the exponent is too small or too big, reduce the number to 0 or
// +/- infinity
if (ie > 0x7FFE)
{
if (im < 0)
// -inf
memcpy(buf, "x00x00x00x00x00x00x00x80" "xFFx7F", 10);
else
// +inf
memcpy(buf, "xFFxFFxFFxFFxFFxFFxFFx7F" "xFFx7F", 10);
return;
}
else if (ie < -0x7FFE)
{
// 0
memcpy(buf, "x00x00x00x00x00x00x00x00" "x00x00", 10);
return;
}
// Store im as signed 64-bit little-endian integer
for (i = 0; i < 8; i++, im >>= 8)
buf[i] = (uint8)im;
// Store ie as signed 16-bit little-endian integer
for (i = 8; i < 10; i++, ie >>= 8)
buf[i] = (uint8)ie;
}
void Bytes2Double(double* x, const uint8 buf[10])
{
unsigned long long uim; // at least 64 bits
long long im; // ditto
unsigned uie;
int ie;
double m;
int i;
int negative = 0;
int maxe;
if (!memcmp(buf, "x00x00x00x00x00x00x00x00" "xFFx7F", 10))
{
#ifdef NAN
*x = NAN;
#else
*x = 0; // NaN is not supported, use 0 instead (we could return an error)
#endif
return;
}
if (!memcmp(buf, "x00x00x00x00x00x00x00x80" "xFFx7F", 10))
{
*x = -INFINITY;
return;
}
else if (!memcmp(buf, "xFFxFFxFFxFFxFFxFFxFFx7F" "xFFx7F", 10))
{
*x = INFINITY;
return;
}
// Load im as signed 64-bit little-endian integer
uim = 0;
for (i = 0; i < 8; i++)
{
uim >>= 8;
uim |= (unsigned long long)buf[i] << (64 - 8);
}
if (uim <= 0x7FFFFFFFFFFFFFFFLL)
im = uim;
else
im = (long long)(uim - 0x7FFFFFFFFFFFFFFFLL - 1) - 0x7FFFFFFFFFFFFFFFLL - 1;
// Obtain the absolute value of the mantissa, make sure it's
// normalized and fits into 53 bits, else the input is invalid
if (im > 0)
{
if (im < (1LL << 52) || im >= (1LL << 53))
{
#ifdef NAN
*x = NAN;
#else
*x = 0; // NaN is not supported, use 0 instead (we could return an error)
#endif
return;
}
}
else if (im < 0)
{
if (im > -(1LL << 52) || im <= -(1LL << 53))
{
#ifdef NAN
*x = NAN;
#else
*x = 0; // NaN is not supported, use 0 instead (we could return an error)
#endif
return;
}
negative = 1;
im = -im;
}
// Load ie as signed 16-bit little-endian integer
uie = 0;
for (i = 8; i < 10; i++)
{
uie >>= 8;
uie |= (unsigned)buf[i] << (16 - 8);
}
if (uie <= 0x7FFF)
ie = uie;
else
ie = (int)(uie - 0x7FFF - 1) - 0x7FFF - 1;
// If DBL_MANT_DIG < 53, truncate the mantissa
im >>= (53 > DBL_MANT_DIG) ? (53 - DBL_MANT_DIG) : 0;
m = im;
m = ldexp(m, (53 > DBL_MANT_DIG) ? -DBL_MANT_DIG : -53); // can't overflow
// because DBL_MAX_10_EXP >= 37 equivalent to DBL_MAX_2_EXP >= 122
// Find out the maximum base-2 exponent and
// if ours is greater, return +/- infinity
frexp(DBL_MAX, &maxe);
if (ie > maxe)
m = INFINITY;
else
m = ldexp(m, ie); // underflow may cause a floating-point exception
*x = negative ? -m : m;
}
int test(double x, const char* name)
{
uint8 buf[10], buf2[10];
double x2;
int error1, error2;
Double2Bytes(buf, x);
Bytes2Double(&x2, buf);
Double2Bytes(buf2, x2);
printf("%+.15E '%s' -> %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X
",
x,
name,
buf[0],buf[1],buf[2],buf[3],buf[4],buf[5],buf[6],buf[7],buf[8],buf[9]);
if ((error1 = memcmp(&x, &x2, sizeof(x))) != 0)
puts("Bytes2Double(Double2Bytes(x)) != x");
if ((error2 = memcmp(buf, buf2, sizeof(buf))) != 0)
puts("Double2Bytes(Bytes2Double(Double2Bytes(x))) != Double2Bytes(x)");
puts("");
return error1 || error2;
}
int testInf(void)
{
uint8 buf[10];
double x, x2;
int error;
x = DBL_MAX;
Double2Bytes(buf, x);
if (!++buf[8])
++buf[9]; // increment the exponent beyond the maximum
Bytes2Double(&x2, buf);
printf("%02X %02X %02X %02X %02X %02X %02X %02X %02X %02X -> %+.15E
",
buf[0],buf[1],buf[2],buf[3],buf[4],buf[5],buf[6],buf[7],buf[8],buf[9],
x2);
if ((error = !isinf(x2)) != 0)
puts("Bytes2Double(Double2Bytes(DBL_MAX) * 2) != INF");
puts("");
return error;
}
#define VALUE_AND_NAME(V) { V, #V }
const struct
{
double value;
const char* name;
} testData[] =
{
#ifdef NAN
VALUE_AND_NAME(NAN),
#endif
VALUE_AND_NAME(0.0),
VALUE_AND_NAME(+DBL_MIN),
VALUE_AND_NAME(-DBL_MIN),
VALUE_AND_NAME(+1.0),
VALUE_AND_NAME(-1.0),
VALUE_AND_NAME(+M_PI),
VALUE_AND_NAME(-M_PI),
VALUE_AND_NAME(+DBL_MAX),
VALUE_AND_NAME(-DBL_MAX),
VALUE_AND_NAME(+INFINITY),
VALUE_AND_NAME(-INFINITY),
};
int main(void)
{
unsigned i;
int errors = 0;
for (i = 0; i < sizeof(testData) / sizeof(testData[0]); i++)
errors += test(testData[i].value, testData[i].name);
errors += testInf();
// Test subnormal values. A floating-point exception may be raised.
errors += test(+DBL_MIN / 2, "+DBL_MIN / 2");
errors += test(-DBL_MIN / 2, "-DBL_MIN / 2");
printf("%d error(s)
", errors);
return 0;
}
输出(ideone):
+NAN 'NAN' -> 00 00 00 00 00 00 00 00 FF 7F
+0.000000000000000E+00 '0.0' -> 00 00 00 00 00 00 00 00 00 00
+2.225073858507201E-308 '+DBL_MIN' -> 00 00 00 00 00 00 10 00 03 FC
-2.225073858507201E-308 '-DBL_MIN' -> 00 00 00 00 00 00 F0 FF 03 FC
+1.000000000000000E+00 '+1.0' -> 00 00 00 00 00 00 10 00 01 00
-1.000000000000000E+00 '-1.0' -> 00 00 00 00 00 00 F0 FF 01 00
+3.141592653589793E+00 '+M_PI' -> 18 2D 44 54 FB 21 19 00 02 00
-3.141592653589793E+00 '-M_PI' -> E8 D2 BB AB 04 DE E6 FF 02 00
+1.797693134862316E+308 '+DBL_MAX' -> FF FF FF FF FF FF 1F 00 00 04
-1.797693134862316E+308 '-DBL_MAX' -> 01 00 00 00 00 00 E0 FF 00 04
+INF '+INFINITY' -> FF FF FF FF FF FF FF 7F FF 7F
-INF '-INFINITY' -> 00 00 00 00 00 00 00 80 FF 7F
FF FF FF FF FF FF 1F 00 01 04 -> +INF
+1.112536929253601E-308 '+DBL_MIN / 2' -> 00 00 00 00 00 00 10 00 02 FC
-1.112536929253601E-308 '-DBL_MIN / 2' -> 00 00 00 00 00 00 F0 FF 02 FC
0 error(s)
这篇关于在 C 中读取/写入浮点类型时如何处理字节顺序差异?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!