Reading integers in different endianness from binary file in C++

Question

I'm reading an ESRI Shapefile, and to my dismay it uses big endian and little endian at different points (see, for instance, the table at page 4, plus the tables from page 5 to 8).

So I created two functions in C++, one for each endianness.

uint32_t readBig(ifstream& f) {
    uint32_t num;
    uint8_t buf[4];
    f.read((char*)buf,4);
    num = buf[3] | buf[2]<<8 | buf[1]<<16 | buf[0]<<24;
    return num;
}

uint32_t readLittle(ifstream& f) {
    uint32_t num;
    f.read(reinterpret_cast<char *>(&num),4);
    //f.read((char*)&num,4);
    return num;
}

But I'm not sure this is the most efficient way to do it. Can this code be improved? Keep in mind it will run thousands, maybe millions, of times for a single shapefile, so even having one of the functions call the other seems worse than having two separate functions. Is there a difference in performance between using reinterpret_cast and an explicit type conversion (char*)? Should I use the same one in both functions?

Recommended answer

  1. Casting between pointer types does not affect performance -- In this case, it's just a technicality to make the compiler happy.
  2. If you're really making a separate call to read for every 32-bit value, the time taken by the byte-swapping operation will likely be in the noise. For speed, you probably should have your own buffering layer so that your inner loop doesn't make any function calls.
  3. It's nice if the swap compiles down to a single opcode (like bswap), but whether or not that is possible, or the fastest option, is processor-specific.
  4. If you're really interested in maximizing speed, consider using SIMD intrinsics.
