Can I use C# serialization to read a binary file in a custom format?
Question
I have a custom binary file which I want to read into my C# program.
There are several different formats, some MSB first, some LSB first and some with the variables in different orders.
Currently, I have a class which reads the right number of bytes, one at a time.
It is very slow and so I am looking to improve performance any way I can.
Is serialization likely to perform better? If so, is it possible in the scenario I have described? Is it possible to customise BinaryFormatter for big-endian or little-endian formats?
Thanks.
Recommended answer
You can't do that with BinaryFormatter - it will expect additional metadata/padding around the object. You would have to read manually, either from a Stream or similarly via a binary reader.
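As a minimal sketch of reading manually through a binary reader: BinaryReader always decodes multi-byte integers as little-endian, so a big-endian field has to be read as raw bytes and decoded explicitly. The layout below (one big-endian Int32 followed by one little-endian Int32) and the names ManualReadExample and ReadPair are assumptions made up for illustration, not part of the original answer.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;
using System.Text;

public static class ManualReadExample
{
    // Hypothetical layout: a big-endian Int32 followed by a little-endian Int32.
    public static (int First, int Second) ReadPair(Stream stream)
    {
        // leaveOpen: true so that disposing the reader does not close the caller's stream.
        using (var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true))
        {
            // Read the raw bytes and decode them explicitly as big-endian.
            byte[] raw = reader.ReadBytes(4);
            if (raw.Length < 4) throw new EndOfStreamException();
            int first = BinaryPrimitives.ReadInt32BigEndian(raw);

            // BinaryReader.ReadInt32 decodes little-endian regardless of platform.
            int second = reader.ReadInt32();
            return (first, second);
        }
    }
}
```

This is convenient but still makes one reader call per field, so it may not address the performance concern on its own.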
Having written some very similar code, I would write my own reader that sits on top of a stream, with methods like ReadInt32LittleEndian, ReadInt32BigEndian (etc. for everything you need), and use shifts (<< / >>) to assemble the bytes. But importantly, I would use a backing buffer to reduce the number of calls to the underlying stream (even with a buffer, this can be unacceptably slow).
Let me refer you to some code from protobuf-net that does this, in particular ProtoReader. Taking an example:
/// <summary>
/// Reads an unsigned 32-bit integer from the stream; supported wire-types: Variant, Fixed32, Fixed64
/// </summary>
public uint ReadUInt32()
{
    switch (wireType)
    {
        case WireType.Variant:
            return ReadUInt32Variant(false);
        case WireType.Fixed32:
            if (available < 4) Ensure(4, true);
            position += 4;
            available -= 4;
            return ((uint)ioBuffer[ioIndex++])
                | (((uint)ioBuffer[ioIndex++]) << 8)
                | (((uint)ioBuffer[ioIndex++]) << 16)
                | (((uint)ioBuffer[ioIndex++]) << 24);
        case WireType.Fixed64:
            ulong val = ReadUInt64();
            checked { return (uint)val; }
        default:
            throw CreateException();
    }
}
(here wireType broadly acts as an indicator of endianness etc., but that isn't important)
Looking at the Fixed32 implementation:
- The Ensure call makes sure that we have at least 4 more bytes in our backing buffer (fetching more if needed)
- we increment some counters so we can track our position in the logical buffer
- we read the data from the buffer
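The same pattern (ensure enough buffered bytes, assemble the value with shifts, advance the counters) can be sketched as a standalone reader. This is not protobuf-net's implementation; the class name EndianReader, its fields, and the default buffer size are assumptions chosen for illustration, and only the Int32 methods are shown.

```csharp
using System;
using System.IO;

// Hypothetical reader; all names here are illustrative, not from protobuf-net.
public sealed class EndianReader
{
    private readonly Stream source;
    private readonly byte[] buffer;
    private int index;      // next unread byte in the buffer
    private int available;  // unread bytes remaining in the buffer

    public EndianReader(Stream source, int bufferSize = 4096)
    {
        if (bufferSize < 8) throw new ArgumentOutOfRangeException(nameof(bufferSize));
        this.source = source ?? throw new ArgumentNullException(nameof(source));
        buffer = new byte[bufferSize];
    }

    // Logical position: total bytes consumed so far.
    public long Position { get; private set; }

    // Makes sure at least 'count' unread bytes are buffered, refilling if needed.
    private void Ensure(int count)
    {
        if (available >= count) return;
        // Move the unread tail to the front so there is room to refill.
        if (available > 0) Buffer.BlockCopy(buffer, index, buffer, 0, available);
        index = 0;
        while (available < count)
        {
            int read = source.Read(buffer, available, buffer.Length - available);
            if (read <= 0) throw new EndOfStreamException();
            available += read;
        }
    }

    private void Advance(int count)
    {
        index += count;
        available -= count;
        Position += count;
    }

    public int ReadInt32LittleEndian()
    {
        Ensure(4);
        int value = buffer[index]
            | (buffer[index + 1] << 8)
            | (buffer[index + 2] << 16)
            | (buffer[index + 3] << 24);
        Advance(4);
        return value;
    }

    public int ReadInt32BigEndian()
    {
        Ensure(4);
        int value = (buffer[index] << 24)
            | (buffer[index + 1] << 16)
            | (buffer[index + 2] << 8)
            | buffer[index + 3];
        Advance(4);
        return value;
    }
}
```

With a few kilobytes buffered, most reads touch only the array, and the underlying stream is called once per refill rather than once per value.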
Once you have a reader for your format, deserialization should be much easier.
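To illustrate how deserialization might then handle formats that differ in byte order and field order, here is a rough sketch. The record (a 4-byte Id and a 2-byte Flags), the two boolean format switches, and every name below are assumptions invented for the example; it decodes from an in-memory span with the framework's BinaryPrimitives helpers rather than a hand-written reader, purely to keep the sketch short.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical record type, for illustration only.
public sealed class Sample
{
    public int Id;
    public ushort Flags;
}

public static class SampleDeserializer
{
    public const int RecordSize = 6; // 4-byte Id + 2-byte Flags

    // bigEndian: byte order of both fields; flagsFirst: true when Flags precedes Id.
    public static Sample Read(ReadOnlySpan<byte> data, bool bigEndian, bool flagsFirst)
    {
        if (data.Length < RecordSize)
            throw new ArgumentException("Record is truncated.", nameof(data));

        // The field order decides where each field's bytes start.
        ReadOnlySpan<byte> idBytes = data.Slice(flagsFirst ? 2 : 0, 4);
        ReadOnlySpan<byte> flagBytes = data.Slice(flagsFirst ? 0 : 4, 2);

        return new Sample
        {
            Id = bigEndian
                ? BinaryPrimitives.ReadInt32BigEndian(idBytes)
                : BinaryPrimitives.ReadInt32LittleEndian(idBytes),
            Flags = bigEndian
                ? BinaryPrimitives.ReadUInt16BigEndian(flagBytes)
                : BinaryPrimitives.ReadUInt16LittleEndian(flagBytes)
        };
    }
}
```

The per-format differences are confined to two parameters here; a real format with more variations would probably want a descriptor object instead.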