什么把一个二进制数据文件头 [英] What to put in a binary data file's header

查看:276
本文介绍了什么把一个二进制数据文件头的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个模拟,读取我们创建(10S到GB的100S)大型二进制数据文件。我们使用二进制的速度的原因。这些文件是依赖于系统,每个我们的系统运行文本文件转换的,所以我不担心可移植性。该文件目前是一个POD结构的多个实例,以书面FWRITE

I have a simulation that reads large binary data files that we create (10s to 100s of GB). We use binary for speed reasons. These files are system dependent, converted from text files on each system that we run, so I'm not concerned about portability. The files currently are many instances of a POD struct, written with fwrite.

我需要改变结构,所以我想补充一点,有一个文件的版本号,这将是随时递增结构变化的标题。因为我这样做,我想添加一些其他信息,以及。我想了结构的大小,字节顺序,也许创建二进制文件的code的SVN版本号。还有什么,将是加有用吗?

I need to change the struct, so I want to add a header that has a file version number in it, which will be incremented anytime the struct changes. Since I'm doing this, I want to add some other information as well. I'm thinking of the size of the struct, byte order, and maybe the svn version number of the code that created the binary file. Is there anything else that would be useful to add?

推荐答案

在我的经验,第二猜测你需要的数据总是浪费时间。最重要的是组织你的的的方式,具有可扩展性。对于XML文件,这是简单的,但二进制文件需要一些更多的思考。

In my experience, second-guessing the data you'll need is invariably wasted time. What's important is to structure your metadata in a way that is extensible. For XML files, that's straightforward, but binary files require a bit more thought.

我倾向于在文件,而不是一开始的END的结构来存储元数据。这有两个好处:

I tend to store metadata in a structure at the END of the file, not the beginning. This has two advantages:


  • 截断/未终止的文件
    很容易检测。

  • 元数据页脚往往可以
    附加到现有的文件,而不
    影响他们的阅读code。

我用最简单的元数据页脚看起来是这样的:

The simplest metadata footer I use looks something like this:

struct MetadataFooter{
  char[40] creatorVersion;
  char[40] creatorApplication;
  .. or whatever
} 

struct FileFooter
{
  int64 metadataFooterSize;  // = sizeof(MetadataFooter)
  char[10] magicString;   // a unique identifier for the format: maybe "MYFILEFMT"
};

原始数据,元数据页脚,然后将该文件后注脚写。

After the raw data, the metadata footer and THEN the file footer are written.

在读取文件,寻求结束 - 的sizeof(FileFooter)。阅读页脚,并验证magicString。然后,寻求回根据metadataFooterSize和读取的元数据。根据该文件中包含页脚大小,你可以使用默认值缺失的字段。

When reading the file, seek to the end - sizeof(FileFooter). Read the footer, and verify the magicString. Then, seek back according to metadataFooterSize and read the metadata. Depending on the footer size contained in the file, you can use default values for missing fields.

由于 KeithB 所指出的,你甚至可以使用这种技术来存储元数据作为一个XML字符串,给人优势的两个完全可扩展的元数据,以二进制数据的紧凑性和速度。

As KeithB points out, you could even use this technique to store the metadata as an XML string, giving the advantages of both totally extensible metadata, with the compactness and speed of binary data.

这篇关于什么把一个二进制数据文件头的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆