Safely punning char* to double in C

Problem description

In an Open Source program I wrote, I'm reading binary data (written by another program) from a file and outputting ints, doubles, and other assorted data types. One of the challenges is that it needs to run on 32-bit and 64-bit machines of both endiannesses, which means that I end up having to do quite a bit of low-level bit-twiddling. I know a (very) little bit about type punning and strict aliasing and want to make sure I'm doing things the right way.

Basically, it's easy to convert from a char* to an int of various sizes:

int64_t snativeint64_t(const char *buf) 
{
    /* Interpret the first 8 bytes of buf as a 64-bit int */
    return *(int64_t *) buf;
}

and I have a cast of support functions to swap byte orders as needed, such as:

int64_t swappedint64_t(const int64_t wrongend)
{
    /* Change the endianness of a 64-bit integer. The shifts are done on an
       unsigned copy so that no bit is ever shifted into the sign bit. */
    const uint64_t u = (uint64_t) wrongend;
    return (int64_t) (((u & 0xff00000000000000ULL) >> 56) |
                      ((u & 0x00ff000000000000ULL) >> 40) |
                      ((u & 0x0000ff0000000000ULL) >> 24) |
                      ((u & 0x000000ff00000000ULL) >> 8)  |
                      ((u & 0x00000000ff000000ULL) << 8)  |
                      ((u & 0x0000000000ff0000ULL) << 24) |
                      ((u & 0x000000000000ff00ULL) << 40) |
                      ((u & 0x00000000000000ffULL) << 56));
}

At runtime, the program detects the endianness of the machine and assigns one of the above to a function pointer:

int64_t (*slittleint64_t)(const char *);
if(littleendian) {
    slittleint64_t = snativeint64_t;
} else {
    slittleint64_t = sswappedint64_t;
}
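
The question does not show how `littleendian` is computed. One common way to detect it at runtime, given here only as a sketch, is to store a known integer and look at its first byte in memory. The name `host_is_little_endian` is hypothetical and not taken from the original program:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the original post): returns 1 when the
   least significant byte of an integer is stored first in memory, and 0
   otherwise. */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first_byte;

    /* Copy the first byte of the object representation of probe. */
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}
```

The result could then be stored once at startup, for example `littleendian = host_is_little_endian();`, before the function pointers are assigned.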

Now, the tricky part comes when I'm trying to cast a char* to a double. I'd like to re-use the endian-swapping code like so:

union 
{
    double  d;
    int64_t i;
} int64todouble;

int64todouble.i = slittleint64_t(bufoffset);
printf("%lf", int64todouble.d);

However, some compilers could optimize away the "int64todouble.i" assignment and break the program. Is there a safer way to do this, while considering that this program must stay optimized for performance, and also that I'd prefer not to write a parallel set of transformations to cast char* to double directly? If the union method of punning is safe, should I be re-writing my functions like snativeint64_t to use it?

I ended up using Steve Jessop's answer because the conversion functions, re-written to use memcpy like so:

int64_t snativeint64_t(const char *buf) 
{
    /* Interpret the first 8 bytes of buf as a 64-bit int */
    int64_t output;
    memcpy(&output, buf, 8);
    return output;
}

compiled into the exact same assembler as my original code:

snativeint64_t:
        movq    (%rdi), %rax
        ret

Of the two, the memcpy version more explicitly expresses what I'm trying to do and should work on even the most naive compilers.

Adam, your answer was also wonderful and I learned a lot from it. Thanks for posting!

Recommended answer

Since you seem to know enough about your implementation to be sure that int64_t and double are the same size, and have suitable storage representations, you might hazard a memcpy. Then you don't even have to think about aliasing.

Since you're using a function pointer for a function that might easily be inlined if you were willing to release multiple binaries, performance must not be a huge issue anyway. Still, you might like to know that some compilers can be quite fiendish at optimising memcpy: for small integer sizes a set of loads and stores can be inlined, and you might even find the variables are optimised away entirely and the compiler does the "copy" simply by reassigning the stack slots it's using for the variables, just like a union.

int64_t i = slittleint64_t(bufoffset);
double d;
memcpy(&d,&i,8); /* might emit no code if you're lucky */
printf("%lf", d);
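
The same idea can be wrapped in a small function so the size assumption is checked in one place. This is only a sketch: `double_from_int64_bits` is a hypothetical name, and it assumes that `double` on the host is 64 bits wide and uses the same representation as the file (typically IEEE 754 binary64).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the original post): reinterpret the bits
   of a 64-bit integer as a double by copying the object representation.
   Assumption: double is 64 bits wide and matches the file's format. */
static double double_from_int64_bits(int64_t bits)
{
    double d;

    /* Fail loudly on a platform where the size assumption does not hold. */
    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

With this in place, the printing code in the question could become `printf("%f", double_from_int64_bits(slittleint64_t(bufoffset)));`, reusing the existing integer byte-swapping functions unchanged.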

Examine the resulting code, or just profile it. Chances are even in the worst case it will not be slow.

In general, though, doing anything too clever with byteswapping results in portability issues. There exist ABIs with middle-endian doubles, where each word is little-endian, but the big word comes first.
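
For the integer part of the problem, one alternative that is not used in the original post is to build the value from individual bytes with shifts. The sketch below assumes the file stores the integer least significant byte first; `load_le_u64` is a hypothetical name. It gives the same result on any host byte order, so it needs no runtime endianness check, but it does nothing about hosts whose in-memory double layout differs from the file's.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original post): assemble a 64-bit
   value from 8 bytes stored least significant byte first. Only the byte
   order of the file matters here, not the byte order of the host. */
static uint64_t load_le_u64(const unsigned char *buf)
{
    uint64_t value = 0;
    int i;

    /* Start with the most significant byte (the last one in the buffer)
       and shift it up as each lower byte is appended. */
    for (i = 7; i >= 0; i--) {
        value = (value << 8) | buf[i];
    }
    return value;
}
```

Whether a compiler turns this loop into a single load is worth checking in the generated code, as suggested above for memcpy.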

Normally you could consider storing your doubles using sprintf and sscanf, but for your project the file formats aren't under your control. But if your application is just shovelling IEEE doubles from an input file in one format to an output file in another format (not sure if it is, since I don't know the database formats in question, but if so), then perhaps you can forget about the fact that it's a double, since you aren't using it for arithmetic anyway. Just treat it as an opaque char[8], requiring byteswapping only if the file formats differ.
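
Treating the field as opaque bytes might look like the following sketch, where `reverse_bytes` is a hypothetical helper and the decision about whether the two file formats differ is left to the caller:

```c
#include <stddef.h>

/* Hypothetical helper (not from the original post): reverse a field of
   len bytes in place, without ever interpreting it as a double. */
static void reverse_bytes(unsigned char *buf, size_t len)
{
    size_t i;

    /* Swap the outermost pair of bytes and work inwards. */
    for (i = 0; i < len / 2; i++) {
        unsigned char tmp = buf[i];
        buf[i] = buf[len - 1 - i];
        buf[len - 1 - i] = tmp;
    }
}
```

A caller would copy the 8 bytes of the field from the input, call `reverse_bytes(field, 8)` only when the input and output formats use different byte orders, and write the bytes out again.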
