Reading IBM floating-point in C++


Question

I have a binary file format with a bunch of headers and floating-point data. I am working on code that parses the binary file. Reading the headers was not hard, but when I tried to read the data I ran into some difficulties.

I opened the file and read the headers as follows:

ifs.open(fileName, std::ifstream::in | std::ifstream::binary);
char textHeader[3200];
BinaryHeader binaryHeader;
ifs.read(textHeader,sizeof(textHeader));
ifs.read(reinterpret_cast<char *>(&binaryHeader), sizeof(binaryHeader));

The documentation says the data is stored as 4-byte IBM floating-point, so I tried something similar:

vector<float> readData(int sampleSize){
    float tmp;
    std::vector<float> tmpVector;
    for (int i = 0; i<sampleSize; i++){
        ifs.read(reinterpret_cast<char *>(&tmp), sizeof(tmp));
        std::cout << tmp << std::endl;
        tmpVector.push_back(tmp);
    }
    return tmpVector;
}

Sadly, the result does not seem correct. What am I doing wrong?

Forgot to mention: the binary data is big-endian, but if I print the tmp values, the data does not seem correct either way.

Conclusion: 4-byte IBM floating-point is not the same as float.

Recommended answer

There are a few things to consider:

  • The first one: I'm not 100% sure whether this would make a difference, but you are using an array of char for your header, char textHeader[3200];. Maybe you could try changing this to an array of unsigned char instead...

  • The second one, which has more to do with performance than with correctness, is your readData function itself. You create a local std::vector of floats inside the function and return it by value. That is safe: the vector is moved (or the copy is elided) into the caller, so nothing is left dangling, and it is not what makes your values wrong. If you would rather fill a buffer owned by the caller, for example to reuse its allocation across calls, you could change the declaration and definition of this function.

I would change it from what you currently have:

vector<float> readData(int sampleSize)

to this:

void readData( int sampleSizes, std::vector<float>& data )

  • The third, which is probably the most important of the three, was raised as a question in your comments by user RetiredNinja while I was writing this: the endianness of the stored data. This can also be a major factor. I think the actual representation of the data as it is physically stored is the biggest concern here.
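To make the endianness point concrete, here is a minimal sketch of assembling a 32-bit word from four big-endian bytes. The name loadBigEndian32 is made up for this example. Because it only uses shifts, it should give the same result whether the host is little- or big-endian.

```cpp
#include <cstdint>

// Assemble a 32-bit unsigned value from four bytes stored most significant
// byte first (big-endian), independent of the host's byte order.
std::uint32_t loadBigEndian32(const unsigned char* bytes) {
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8)  |
            static_cast<std::uint32_t>(bytes[3]);
}
```

Note that this only fixes the byte order. The resulting bit pattern still has to be interpreted according to the file's floating-point format.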

Since your documentation states that the data is stored as a 4-byte IBM floating-point type and that it is big-endian, I have found this specification by IBM that may be of help to you.
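For reference, a sketch of decoding such a value, assuming the usual IBM single-precision hexadecimal layout: 1 sign bit, a 7-bit exponent that is a power of 16 with a bias of 64, and a 24-bit fraction. The name ibm32ToDouble is my own, and word is assumed to be already assembled from the four big-endian bytes. It returns a double because the IBM range is wider than that of an IEEE float.

```cpp
#include <cmath>
#include <cstdint>

// Convert a 32-bit IBM hexadecimal floating-point word (in host order)
// to a double. value = (-1)^sign * (fraction / 2^24) * 16^(exponent - 64)
double ibm32ToDouble(std::uint32_t word) {
    const std::uint32_t fraction = word & 0x00FFFFFFu;  // low 24 bits
    if (fraction == 0) return 0.0;  // zero fraction means zero

    // 7-bit exponent, excess-64, counting powers of 16.
    const int exponent = static_cast<int>((word >> 24) & 0x7Fu) - 64;

    // 16^exponent == 2^(4 * exponent); dividing by 2^24 scales the fraction.
    const double magnitude =
        std::ldexp(static_cast<double>(fraction), 4 * exponent - 24);

    return (word & 0x80000000u) ? -magnitude : magnitude;
}
```

As a check, the word 0xC276A000 decodes to -118.625 and 0x41100000 decodes to 1.0. Narrowing the result to float is then up to you, keeping in mind that very large or very small IBM values may not fit.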
