C# binary files finding strings

Views: 80 Published: 2019/6/18 8:05:14 C#


Problem description


Hello All,

I have some data files (~2 MB in size) which contain data encoded with mixed little- and big-endian byte ordering. Data chunks are delimited by strings (UTF-8) which act as labels for data fields. Since .NET's binary reader doesn't support such mixed data streams, I have tried to implement a custom reader. To find the offset of a particular data field I have tried something like the following:

byte[] byteBuffer = File.ReadAllBytes("SomeFilePath");
string byteBufferAsString = System.Text.Encoding.UTF8.GetString(byteBuffer);
Int32 offset1 = byteBufferAsString.IndexOf("StringToFind");


However, this seems to give variable results. Sometimes the offset value points exactly to the start position of the StringToFind text in the buffer, and other times it points two bytes before the actual start position, i.e. at an Int16 which indicates the byte length of the string immediately following.

Has anyone had a similar experience? Otherwise, does anyone have any advice for working with binary files and searching for string positions?

cheers

Recommended answer


I think this step

Quote:

string byteBufferAsString = System.Text.Encoding.UTF8.GetString(byteBuffer)


is weak.

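You should instead do the opposite: get the array of bytes representing the search string and search for it inside the data buffer. Decoding the whole buffer as UTF-8 collapses each valid multi-byte sequence into one or two chars, so an index into the decoded string does not have to equal a byte offset in the file.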
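To make the mismatch concrete, here is a minimal sketch. The six-byte buffer is a made-up example, not taken from the poster's files: it starts with two byte pairs that happen to form valid 2-byte UTF-8 sequences, which raw numeric data can do by chance, followed by the ASCII label "AB".

```csharp
using System;
using System.Text;

// Hypothetical buffer: two 2-byte UTF-8 sequences (0xC3 0xA9 twice), then "AB".
byte[] data = { 0xC3, 0xA9, 0xC3, 0xA9, 0x41, 0x42 };

// Decode the whole buffer, as the original attempt does.
string decoded = Encoding.UTF8.GetString(data);
int charIndex = decoded.IndexOf("AB", StringComparison.Ordinal);

// The label really starts at byte offset 4, but each 2-byte sequence
// was decoded into a single char, so the string index is smaller.
Console.WriteLine(charIndex); // prints 2
```

The size of the shift depends on how many multi-byte sequences happen to precede the label, which would explain why the result looks right for some fields and wrong for others.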


You need to search the UTF-8 string as binary. Something like this (not tested):

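using System;
using System.IO;
using System.Text;

byte[] ByteBuffer = File.ReadAllBytes("SomeFilePath");
byte[] StringBytes = Encoding.UTF8.GetBytes("StringToFind");
for (int i = 0; i <= (ByteBuffer.Length - StringBytes.Length); i++)
{
    // Cheap first-byte check before comparing the rest of the pattern.
    if (ByteBuffer[i] == StringBytes[0])
    {
        // j stops at the first mismatching byte, or at the end of the pattern.
        int j;
        for (j = 1; j < StringBytes.Length && ByteBuffer[i + j] == StringBytes[j]; j++) ;
        if (j == StringBytes.Length)
            Console.WriteLine("String was found at offset {0}", i);
    }
}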


Please note that this is a case-sensitive search!
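The same byte-wise search can also be sketched with the span-based `IndexOf` extension from `System.MemoryExtensions`, which should be available on .NET Core 2.1 and later. The `ByteSearch` class, the `FindAll` method and the sample buffer below are hypothetical names and data chosen for illustration; the big-endian Int16 length prefix in the sample is only an assumption modelled on what the question describes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sample: an Int16 length (5, big-endian assumed), the label, one payload byte.
byte[] buffer = { 0x00, 0x05, (byte)'L', (byte)'a', (byte)'b', (byte)'e', (byte)'l', 0x7F };
byte[] label = Encoding.UTF8.GetBytes("Label");

foreach (int offset in ByteSearch.FindAll(buffer, label))
    Console.WriteLine("Found at byte offset {0}", offset); // prints "Found at byte offset 2"

static class ByteSearch
{
    // Returns every byte offset at which `pattern` occurs in `buffer`.
    public static List<int> FindAll(ReadOnlySpan<byte> buffer, ReadOnlySpan<byte> pattern)
    {
        var offsets = new List<int>();
        if (pattern.Length == 0) return offsets;

        int start = 0;
        while (start <= buffer.Length - pattern.Length)
        {
            // IndexOf returns a position relative to the slice, or -1 if absent.
            int hit = buffer.Slice(start).IndexOf(pattern);
            if (hit < 0) break;
            offsets.Add(start + hit);
            start += hit + 1; // advance by one byte so overlapping matches are reported too
        }
        return offsets;
    }
}
```

Because the comparison is done on raw bytes, the offsets returned are positions in the file itself, and the search stays case-sensitive, as noted above.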


Well, I don't have such experience, simply because I thoroughly avoid dealing with extreme stupidities, so my only advice would be: give up; throw out all software dealing with data structured in this weird way and make sure you prevent such fallacies in future; write brand new software that uses some reasonable persistence; and save a huge amount of time and nerves. If this advice does not seem suitable for you, you are very welcome to ram the "problem" on your own.

I seriously doubt you can get better advice from anyone who "had similar experience". Some experiences are not really helpful.

—SA
