Portable and Tight Bit Packing


Question

Suppose I have four unsigned ints, {a, b, c, d}, which I want to pack with non-standard lengths, {9, 5, 7, 11} bits respectively. I wish to make a network packet (unsigned char pkt[4]) that I can pack these values into, and unpack them reliably on another machine using the same header file, regardless of endianness.

Everything I have read about packed structs suggests that the bit ordering is not predictable, so that is out of the question. That leaves me with bit-set and bit-clear operations, but I'm not confident about how to ensure that endianness will not cause problems. Is the following sufficient, or will I run into problems with the endianness of a and d separately?

void pack_pkt(uint16_t a, uint8_t b, uint8_t c, uint16_t d, uint8_t *pkt){
    uint32_t pkt_h = ((uint32_t)a & 0x1FF)      // 9 bits
                 | (((uint32_t)b & 0x1F) << 9)  // 5 bits
                 | (((uint32_t)c & 0x3F) << 14) // 7 bits
                 | (((uint32_t)d & 0x7FF) << 21); //11 bits
    *pkt = htonl(pkt_h);
}

void unpack_pkt(uint16_t *a, uint8_t *b, uint8_t *c, uint16_t *d, uint8_t *pkt){
    uint32_t pkt_h = ntohl(*pkt);
    (*a) = pkt_h & 0x1FF;
    (*b) = (pkt_h >> 9) & 0x1F;
    (*c) = (pkt_h >> 14) & 0x3F;
    (*d) = (pkt_h >> 21) & 0x7FF;
}

If so, what other measures can I take to ensure portability?

Recommended answer

Structs with bitfields are indeed essentially useless for this purpose, as their field order and even their padding rules are not consistent across compilers and platforms.

"Shall I run into problems with the endianness of a and d separately?"

The endianness of a and d doesn't matter, because their byte order is never used. a and d are not reinterpreted as raw bytes; only their integer values are read or assigned, and in those cases endianness does not enter the picture.
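To illustrate the point, here is a minimal sketch of a value-level field extraction. The helper name field_bits is hypothetical and not part of the original code. Shifts and masks operate on the integer value rather than on its layout in memory, so the result is the same on any host:

```c
#include <stdint.h>

/* Hypothetical helper: extract `width` bits starting at bit `shift`.
   It works on the integer value, so the result does not depend on
   whether the host is little-endian or big-endian.
   Assumes 1 <= width <= 31 and shift + width <= 32. */
static uint32_t field_bits(uint32_t word, unsigned shift, unsigned width)
{
    return (word >> shift) & ((UINT32_C(1) << width) - 1u);
}
```

For the layout in the question, field_bits(pkt_h, 9, 5) would yield b and field_bits(pkt_h, 21, 11) would yield d.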

There is another problem, though: uint8_t *pkt in combination with *pkt = htonl(pkt_h); means that only the least significant byte of the htonl result is stored. This happens regardless of whether the code runs on a little-endian or a big-endian machine, because the assignment is not a reinterpretation of memory but an implicit conversion to uint8_t. uint8_t *pkt is fine by itself, but then the resulting group of 4 bytes must be copied into the buffer it points to; it cannot be assigned all in one go. uint32_t *pkt would let such a single assignment work without losing data, but that makes the function less convenient to use.
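The narrowing can be seen in isolation with a small sketch. The function name store_through_u8 is hypothetical, and a fixed constant stands in for the htonl result so that the outcome does not depend on the host:

```c
#include <stdint.h>

/* Hypothetical demo: storing a 32-bit value through a uint8_t pointer
   converts the value to uint8_t (reduces it modulo 256) and writes one
   byte. The explicit cast gives the same result as the implicit
   conversion in `*pkt = htonl(pkt_h);`. */
static void store_through_u8(uint8_t *dst, uint32_t value)
{
    *dst = (uint8_t)value;  /* only value & 0xFF is stored */
}
```

Calling store_through_u8(buf, 0x12345678) writes 0x78 to buf[0] and leaves buf[1] through buf[3] untouched.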

Similarly, unpack_pkt currently reads only one byte of data.

With those issues fixed, it should work:

#include <stdint.h>
#include <string.h>     // memcpy
#include <arpa/inet.h>  // htonl, ntohl (POSIX)

void pack_pkt(uint16_t a, uint8_t b, uint8_t c, uint16_t d, uint8_t *buffer){
    uint32_t pkt_h = ((uint32_t)a & 0x1FF)      // 9 bits
                 | (((uint32_t)b & 0x1F) << 9)  // 5 bits
                 | (((uint32_t)c & 0x7F) << 14) // 7 bits (0x7F; 0x3F would keep only 6)
                 | (((uint32_t)d & 0x7FF) << 21); //11 bits
    uint32_t pkt = htonl(pkt_h);            // host order -> network (big-endian) order
    memcpy(buffer, &pkt, sizeof(uint32_t)); // copy all 4 bytes into the packet
}

void unpack_pkt(uint16_t *a, uint8_t *b, uint8_t *c, uint16_t *d, const uint8_t *buffer){
    uint32_t pkt;
    memcpy(&pkt, buffer, sizeof(uint32_t)); // read all 4 bytes from the packet
    uint32_t pkt_h = ntohl(pkt);            // network order -> host order
    (*a) = pkt_h & 0x1FF;
    (*b) = (pkt_h >> 9) & 0x1F;
    (*c) = (pkt_h >> 14) & 0x7F;
    (*d) = (pkt_h >> 21) & 0x7FF;
}

An alternative that works without worrying about endianness at any point is to deconstruct the uint32_t manually, rather than conditionally byte-swapping it with htonl and then reinterpreting it as raw bytes. For example:

void pack_pkt(uint16_t a, uint8_t b, uint8_t c, uint16_t d, uint8_t *pkt){
    uint32_t pkt_h = ((uint32_t)a & 0x1FF)      // 9 bits
                 | (((uint32_t)b & 0x1F) << 9)  // 5 bits
                 | (((uint32_t)c & 0x7F) << 14) // 7 bits (0x7F; 0x3F would keep only 6)
                 | (((uint32_t)d & 0x7FF) << 21); //11 bits
    // example serializing the bytes in big endian order, regardless of host endianness
    pkt[0] = (uint8_t)(pkt_h >> 24); // most significant byte first
    pkt[1] = (uint8_t)(pkt_h >> 16);
    pkt[2] = (uint8_t)(pkt_h >> 8);
    pkt[3] = (uint8_t)pkt_h;         // least significant byte last
}
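The answer only shows the packing side of this approach. A matching unpacker could look like the following sketch. The name unpack_pkt_bytes is hypothetical, and the sketch assumes a 7-bit mask (0x7F) for c, since 0x3F would cover only six bits:

```c
#include <stdint.h>

/* Sketch of a matching unpacker (hypothetical name): rebuild the 32-bit
   word from four bytes stored in big-endian order, then extract the
   fields with shifts and masks. No byte swapping or memcpy is needed. */
static void unpack_pkt_bytes(uint16_t *a, uint8_t *b, uint8_t *c, uint16_t *d,
                             const uint8_t *pkt)
{
    uint32_t pkt_h = ((uint32_t)pkt[0] << 24)
                   | ((uint32_t)pkt[1] << 16)
                   | ((uint32_t)pkt[2] << 8)
                   |  (uint32_t)pkt[3];
    *a = (uint16_t)(pkt_h & 0x1FF);          /*  9 bits */
    *b = (uint8_t)((pkt_h >> 9) & 0x1F);     /*  5 bits */
    *c = (uint8_t)((pkt_h >> 14) & 0x7F);    /*  7 bits */
    *d = (uint16_t)((pkt_h >> 21) & 0x7FF);  /* 11 bits */
}
```

Because every byte is widened to uint32_t before shifting, the reconstruction is well defined on any host, mirroring the byte-by-byte stores in the packing function.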

The original approach isn't bad; this is just an alternative to consider. Since nothing is ever reinterpreted, endianness does not matter at all, which may increase confidence in the correctness of the code. The downside is that it takes more code to get the same thing done. Incidentally, even though manually deconstructing the uint32_t and storing 4 separate bytes looks like a lot of work, GCC can compile it efficiently into a bswap and a 32-bit store. When this answer was written, Clang missed that opportunity, and other compilers may miss it as well, so the approach is not without drawbacks.
