Padding bits in unsigned integers and bitwise operations in C89

Question

I have a lot of code that performs bitwise operations on unsigned integers. I wrote my code with the assumption that those operations were on integers of fixed width without any padding bits. For example, an array of 32-bit unsigned integers in which all 32 bits of each integer are available.

I'm looking to make my code more portable and I'm focused on making sure I'm C89 compliant (in this case). One of the issues that I've come across is possible padded integers. Take this extreme example, taken from the GMP manual:

However on Cray vector systems it may be noted that short and int are always stored in 8 bytes (and with sizeof indicating that) but use only 32 or 46 bits. The nails feature can account for this, by passing for instance 8*sizeof(int)-INT_BIT.

I've also read about this type of padding in other places. I actually read a post on SO last night (forgive me, I don't have the link and I'm going to cite something similar from memory) where if you have, say, a double with 60 usable bits, the other 4 could be used for padding, and those padding bits could serve some internal purpose so they cannot be modified.

So let's say for example my code is compiled on a platform where an unsigned int type is sized at 4 bytes, each byte being 8 bits, however the most significant 2 bits are padding bits. Would UINT_MAX in that case be 0x3FFFFFFF (1073741823)?

#include <stdio.h>
#include <stdlib.h>

/* padding bits represented by underscores */
int main( int argc, char **argv )
{
    unsigned int a = 0x2AAAAAAA; /* __101010101010101010101010101010 */
    unsigned int b = 0x15555555; /* __010101010101010101010101010101 */
    unsigned int c = a ^ b; /* ?? __111111111111111111111111111111 */
    unsigned int d = c << 5; /* ?? __111111111111111111111111100000 */
    unsigned int e = d >> 5; /* ?? __000001111111111111111111111111 */

    printf( "a: %X\nb: %X\nc: %X\nd: %X\ne: %X\n", a, b, c, d, e );
    return 0;
}

  • Is it safe to XOR two integers that have padding bits?
  • Wouldn't I be XORing whatever the padding bits are?

    I can't find this behavior covered in C89.

    Furthermore is the c variable guaranteed to be 0x3FFFFFFF or if for example the two padding bits were both on in a or b would c be 0xFFFFFFFF?

    Same question with d and e. Am I manipulating the padding bits by shifting? I would expect to see this below, assuming 32 bits with the 2 most significant bits used for padding, but I want to know if something like this is guaranteed:

    a: 2AAAAAAA
    b: 15555555
    c: 3FFFFFFF
    d: 3FFFFFE0
    e: 01FFFFFF
    

    Also are padding bits always the most significant bits or could they be the least significant bits?

    EDIT 12/19/2010 5PM EST: Christoph has answered my question. Thanks!
    I had also asked (above) whether padding bits are always the most significant bits. This is cited in the rationale for the C99 standard, and the answer is no. I am playing it safe and assuming the same for C89. Here is specifically what the C99 rationale says for §6.2.6.2 (Representation of Integer Types):

    Padding bits are user-accessible in an unsigned integer type. For example, suppose a machine uses a pair of 16-bit shorts (each with its own sign bit) to make up a 32-bit int and the sign bit of the lower short is ignored when used in this 32-bit int. Then, as a 32-bit signed int, there is a padding bit (in the middle of the 32 bits) that is ignored in determining the value of the 32-bit signed int. But, if this 32-bit item is treated as a 32-bit unsigned int, then that padding bit is visible to the user’s program. The C committee was told that there is a machine that works this way, and that is one reason that padding bits were added to C99.

    Footnotes 44 and 45 mention that parity bits might be padding bits. The committee does not know of any machines with user-accessible parity bits within an integer. Therefore, the committee is not aware of any machines that treat parity bits as padding bits.


    EDIT 12/28/2010 3PM EST: I found an interesting discussion on comp.lang.c from a few months ago.

    • Bitwise Operator Effects on Padding Bits (VelocityReviews reader)
    • Bitwise Operator Effects on Padding Bits (Google Groups alternate link)

    One point made by Dietmar which I found interesting:

    Let's note that padding bits are not necessary for the existence of trap representations; combinations of value bits which do not represent a value of the object type would also do.

    Recommended answer

    Bitwise operations (like arithmetic operations) operate on values and ignore padding. The implementation may or may not modify padding bits (or use them internally, e.g. as parity bits), but portable C code will never be able to detect this. No value (including UINT_MAX) includes the padding.

    Integer padding might lead to problems if you use things like sizeof (int) * CHAR_BIT and then try to use shifts to access all of these bits. If you want to be portable, either use only (unsigned) char or fixed-size integers (a C99 addition), or determine the number of value bits programmatically. This can be done at compile time with the preprocessor by comparing UINT_MAX against powers of 2, or at runtime by using bit operations.

    C90 does not mention integer padding at all, but as far as I can tell, 'invisible' preceding or trailing integer padding bits shouldn't violate the standard (I didn't go through all relevant sections to make sure this is really the case, though); there probably are problems with mixed padding and value bits as mentioned in the C99 rationale because otherwise, the standard would not have needed to be changed.

    As to the meaning of user-accessible: Padding bits are accessible insofar as you can always get at any bit of foo (including padding) by using bit operations on ((unsigned char *)&foo)[…]. Be careful when modifying the padding bits, though: the result won't change the value of the integer, but might nevertheless create a trap representation. In case of C90, this is implicitly unspecified (as in not mentioned at all); in case of C99, it's implementation-defined.

    This was not what the rationale quotation was about, though: the cited architecture represents 32-bit integers via two 16-bit integers. In case of unsigned types, the resulting integer has 32 value bits and a precision of 32; in case of signed integers, it only has 30 value bits (a precision of 30 and, counting the sign bit, a width of 31): one of the sign bits of the 16-bit integers is used as the sign bit of the 32-bit integer, the other one is ignored, thus creating a padding bit surrounded by value bits. Now, if you access a 32-bit signed integer as an unsigned integer (which is explicitly allowed and does not violate the C99 aliasing rules), the padding bit becomes a (user-accessible) value bit.
