Best way to read/parse an untyped binary file in Delphi

Question

I would like to know the best way to parse an untyped binary file, for example an EBML file (http://ebml.sourceforge.net/). EBML is basically a binary XML format. It can store almost anything, but its predominant use right now is MKV video files (Matroska).

I want to read an EBML file at the byte level: read the header to make sure it really is an EBML file, then retrieve information about the file. MKV files can be huge, 1-30 GB in size.

The binary file could be anything: JPEG, BMP, AVI, etc. I just want to learn how to read them.

Recommended answer

Basically, you do it like this:

const
  MAGIC_WORD = $535B;

type
  TMyFileTypeHeader = packed record
    MagicWord: word; // = MAGIC_WORD
    Size: cardinal;
    Version: cardinal;
    Width: cardinal;
    Height: cardinal;
    ColorDepth: cardinal;
    Title: array[0..31] of AnsiChar; // AnsiChar is 1 byte; Char is 2 bytes in Delphi 2009 and later
  end;

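// Requires SysUtils in the uses clause (Exception, fmOpenRead).
procedure ReadFile(const FileName: string);
var
  f: file;
  amt: integer;
  FileHeader: TMyFileTypeHeader;
begin
  FileMode := fmOpenRead; // open read-only (FileMode is a global variable)
  AssignFile(f, FileName);
  Reset(f, 1); // record size of 1 byte; done before "try" so CloseFile only runs on an open file
  try
    // amt receives the number of bytes actually read
    BlockRead(f, FileHeader, SizeOf(TMyFileTypeHeader), amt);

    // A short read or a wrong magic word means this is not our format
    if (amt <> SizeOf(TMyFileTypeHeader)) or (FileHeader.MagicWord <> MAGIC_WORD) then
      raise Exception.CreateFmt('File "%s" is not a valid XXX file.', [FileName]);

    // Read, parse, and do something

  finally
    CloseFile(f);
  end;
end;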

For instance, a bitmap file begins with a BITMAPFILEHEADER structure, followed (in version 3) by a BITMAPINFOHEADER, then an optional array of palette entries, and finally the uncompressed RGB pixel data (in the simplest case, here in 24-bit format): BBGGRRBBGGRRBBGGRR...
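
As a rough sketch of that layout, the two headers can be declared as packed records and read from a stream. The record and procedure names here (TBmpFileHeader, TBmpInfoHeader, ReadBmpHeaders) are made up for illustration; the field names follow the Windows structures, and Delphi's Windows unit already declares equivalent types. It assumes a little-endian machine and only handles the 40-byte version 3 info header or larger.

```pascal
uses
  Classes, SysUtils;

type
  // Mirrors BITMAPFILEHEADER: 14 bytes, no padding
  TBmpFileHeader = packed record
    bfType: Word;          // 'BM', which reads as $4D42 on little-endian
    bfSize: Cardinal;      // total file size in bytes
    bfReserved1: Word;
    bfReserved2: Word;
    bfOffBits: Cardinal;   // offset from start of file to the pixel data
  end;

  // Mirrors BITMAPINFOHEADER: 40 bytes
  TBmpInfoHeader = packed record
    biSize: Cardinal;          // size of this header (40 for version 3)
    biWidth: Integer;
    biHeight: Integer;         // negative means rows are stored top-down
    biPlanes: Word;
    biBitCount: Word;          // bits per pixel, e.g. 24
    biCompression: Cardinal;   // 0 = BI_RGB (uncompressed)
    biSizeImage: Cardinal;
    biXPelsPerMeter: Integer;
    biYPelsPerMeter: Integer;
    biClrUsed: Cardinal;
    biClrImportant: Cardinal;
  end;

// Hypothetical helper: reads both headers from the current stream position.
procedure ReadBmpHeaders(Stream: TStream; out FileHdr: TBmpFileHeader;
  out InfoHdr: TBmpInfoHeader);
begin
  // ReadBuffer raises EReadError if fewer bytes are available
  Stream.ReadBuffer(FileHdr, SizeOf(FileHdr));
  if FileHdr.bfType <> $4D42 then
    raise Exception.Create('Not a BMP file (missing "BM" signature).');

  Stream.ReadBuffer(InfoHdr, SizeOf(InfoHdr));
  if InfoHdr.biSize < SizeOf(InfoHdr) then
    raise Exception.Create('Unsupported BMP header version.');
end;
```

To use it on a file, open it with TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite), call ReadBmpHeaders, then set the stream's Position to bfOffBits before reading pixel rows.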

Reading a JPG, on the other hand, is very complicated, because the JPG data is compressed in a way that requires a lot of advanced mathematics to even understand (I think -- I have actually never really dug into the JPG specs). At least, this is true for a lot of modern image file formats. BMP, by contrast, is trivial -- the "worst" thing that can happen is that the image is RLE compressed.

The "details" of parsing a file depend entirely on the file format. The file format specification tells the developer how the data is stored in binary form (above, the two bitmap structures are part of the Windows bitmap specification). It is like a contract, signed (not literally) by all encoders/decoders of such files. In the case of EBML, the specification appears to be available here.
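
For the EBML case in the question, a minimal sketch might look like the following. It relies on two details as I understand them from the EBML specification: a document starts with the EBML header element, whose ID is the four bytes 1A 45 DF A3, and element sizes are variable-length big-endian integers whose first set bit marks the length. The function names are hypothetical, and this only checks the signature and decodes one size; it is not a full parser.

```pascal
uses
  Classes, SysUtils;

const
  // ID of the EBML header element: the first four bytes of the file
  EBML_MAGIC: array[0..3] of Byte = ($1A, $45, $DF, $A3);

// Hypothetical helper: True if the next four bytes are the EBML header ID.
function HasEbmlSignature(Stream: TStream): Boolean;
var
  Buf: array[0..3] of Byte;
begin
  Result := (Stream.Read(Buf, SizeOf(Buf)) = SizeOf(Buf)) and
            CompareMem(@Buf, @EBML_MAGIC, SizeOf(Buf));
end;

// Hypothetical helper: decodes an EBML data size of 1 to 8 bytes.
// A value whose data bits are all 1 is reserved in EBML for "unknown size";
// this sketch returns it as-is and leaves the interpretation to the caller.
function ReadEbmlSize(Stream: TStream): Int64;
var
  First, B, Mask: Byte;
  Len, i: Integer;
begin
  Stream.ReadBuffer(First, 1);
  if First = 0 then
    raise Exception.Create('EBML sizes longer than 8 bytes are not handled.');

  // The position of the first set bit gives the total length in bytes
  Len := 1;
  Mask := $80;
  while (First and Mask) = 0 do
  begin
    Inc(Len);
    Mask := Mask shr 1;
  end;

  // Strip the length marker bit, then append the remaining bytes big-endian
  Result := First and (Mask - 1);
  for i := 2 to Len do
  begin
    Stream.ReadBuffer(B, 1);
    Result := (Result shl 8) or B;
  end;
end;
```

Called in that order on a TFileStream opened on an MKV file, HasEbmlSignature confirms the file type and ReadEbmlSize then gives the length of the EBML header's payload. Because only a few bytes are read at a time and TStream positions are Int64, the same approach should cope with multi-gigabyte files.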
