Replace sequence of bytes in binary file


Question

What is the best way to replace a sequence of bytes in a binary file with another sequence of the same length? The binary files will be fairly large, about 50 MB, and should not be loaded into memory all at once.

Update: I do not know the location of the bytes that need to be replaced; I need to find them first.

Recommended answer

Assuming you're trying to replace a known section of the file:

  • Open a FileStream with read/write access
  • Seek to the right place
  • Overwrite the existing data

Sample code:

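using System.IO;

public static void ReplaceData(string filename, long position, byte[] data)
{
    // FileMode.Open keeps the existing contents; ReadWrite allows in-place edits.
    using (Stream stream = File.Open(filename, FileMode.Open, FileAccess.ReadWrite))
    {
        stream.Position = position;          // seek to the target offset
        stream.Write(data, 0, data.Length);  // overwrite the bytes in place
    }
}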

If you're effectively trying to do a binary version of string.Replace (e.g. "always replace bytes { 51, 20, 34 } with { 20, 35, 15 }"), then it's rather harder. As a quick description of what you'd do:

  • Allocate a buffer of at least the size of the data you're interested in
  • Repeatedly read into the buffer, scanning for the data
  • If you find a match, seek back to the right place (e.g. stream.Position -= buffer.Length - indexWithinBuffer;) and overwrite the data

Sounds simple so far... but the tricky bit is when the data starts near the end of the buffer. You need to remember all potential matches and how far you've matched so far, so that if the match completes when you read the next buffer's worth, you can detect it.
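One way to "remember how far you've matched so far" is to carry a Knuth-Morris-Pratt match counter across reads. This is a swapped-in technique rather than something the answer prescribes, so treat the following as a minimal sketch. The class name StreamingByteReplacer, the method ReplaceAll, and the bufferSize parameter are all hypothetical names chosen for illustration. It replaces non-overlapping matches found in the original data and assumes both byte arrays have the same non-zero length.

```csharp
using System;
using System.IO;

public static class StreamingByteReplacer
{
    // Hypothetical helper: overwrites every non-overlapping occurrence of
    // `search` with `replacement` and returns the number of replacements.
    public static int ReplaceAll(string path, byte[] search, byte[] replacement,
                                 int bufferSize = 64 * 1024)
    {
        if (search.Length == 0 || search.Length != replacement.Length)
            throw new ArgumentException("Patterns must be non-empty and equal in length.");

        int[] fail = BuildFailureTable(search);
        var buffer = new byte[bufferSize];
        int count = 0;

        using (var stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            int matched = 0;        // bytes of `search` matched so far; survives across reads
            long bufferOffset = 0;  // file offset of buffer[0]
            int n;
            while ((n = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < n; i++)
                {
                    // On a mismatch, fall back to the longest prefix that still matches.
                    while (matched > 0 && buffer[i] != search[matched])
                        matched = fail[matched - 1];
                    if (buffer[i] == search[matched])
                        matched++;

                    if (matched == search.Length)
                    {
                        // The match may have started in an earlier read.
                        long matchOffset = bufferOffset + i + 1 - search.Length;
                        long resume = stream.Position;
                        stream.Position = matchOffset;
                        stream.Write(replacement, 0, replacement.Length);
                        stream.Position = resume;  // continue reading where we left off
                        count++;
                        matched = 0;               // non-overlapping matches only
                    }
                }
                bufferOffset += n;
            }
        }
        return count;
    }

    // Standard KMP failure table: fail[i] is the length of the longest proper
    // prefix of pattern[0..i] that is also a suffix of it.
    private static int[] BuildFailureTable(byte[] pattern)
    {
        var fail = new int[pattern.Length];
        for (int i = 1, k = 0; i < pattern.Length; i++)
        {
            while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
            if (pattern[i] == pattern[k]) k++;
            fail[i] = k;
        }
        return fail;
    }
}
```

Because the only state carried between reads is the matched counter, the buffer can be any size, even smaller than the pattern.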

There are probably ways of avoiding this trickiness, but I wouldn't like to try to come up with them offhand :)

Okay, I've got an idea which might help...

  • Keep a buffer which is at least twice as long as the sequence you're searching for
  • Repeatedly:
    • Copy the second half of the buffer into the first half
    • Fill the second half of the buffer from the file
    • Search the whole buffer for the data you're looking for

That way, if the data is present, it will at some point lie completely within the buffer.

You'd need to keep track of where the stream is in order to seek back to the right place, but I think this should work. It would be trickier if you were trying to find all matches, but at least the first match should be reasonably simple...
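The two-halves idea above could be sketched roughly as follows for the first match only. The names BinaryPatcher, ReplaceFirst, and the half parameter are assumptions made for illustration, and the sketch assumes the search and replacement arrays have the same non-zero length. It tracks the file offset of the second half of the buffer so it can seek back to the match.

```csharp
using System;
using System.IO;

public static class BinaryPatcher
{
    // Hypothetical helper: overwrites the first occurrence of `search` and
    // returns its file offset, or -1 if the sequence was not found.
    public static long ReplaceFirst(string path, byte[] search, byte[] replacement,
                                    int half = 64 * 1024)
    {
        if (search.Length == 0 || search.Length != replacement.Length)
            throw new ArgumentException("Patterns must be non-empty and equal in length.");

        half = Math.Max(half, search.Length);  // each half must hold a whole pattern
        var buffer = new byte[2 * half];

        using (var stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            int prevLen = 0;       // valid bytes in the first half (0 on the first pass)
            long chunkOffset = 0;  // file offset of buffer[half]
            while (true)
            {
                int n = FillSecondHalf(stream, buffer, half);
                if (n == 0) return -1;

                // Search the previous chunk plus the new one as one contiguous run.
                int index = IndexOf(buffer, half - prevLen, half + n, search);
                if (index >= 0)
                {
                    long matchOffset = chunkOffset + (index - half);
                    stream.Position = matchOffset;
                    stream.Write(replacement, 0, replacement.Length);
                    return matchOffset;
                }
                if (n < half) return -1;  // short read means end of file

                // Slide: the second half becomes the first half for the next pass.
                Buffer.BlockCopy(buffer, half, buffer, 0, half);
                prevLen = half;
                chunkOffset += half;
            }
        }
    }

    // Stream.Read may return fewer bytes than requested, so loop until the
    // second half is full or the end of the file is reached.
    private static int FillSecondHalf(Stream stream, byte[] buffer, int half)
    {
        int total = 0;
        while (total < half)
        {
            int read = stream.Read(buffer, half + total, half - total);
            if (read == 0) break;
            total += read;
        }
        return total;
    }

    // Naive scan of buffer[start..end) for `pattern`; returns the index or -1.
    private static int IndexOf(byte[] buffer, int start, int end, byte[] pattern)
    {
        for (int i = start; i + pattern.Length <= end; i++)
        {
            int j = 0;
            while (j < pattern.Length && buffer[i + j] == pattern[j]) j++;
            if (j == pattern.Length) return i;
        }
        return -1;
    }
}
```

A call such as BinaryPatcher.ReplaceFirst("data.bin", new byte[] { 51, 20, 34 }, new byte[] { 20, 35, 15 }) would patch the first occurrence in place; "data.bin" is a placeholder file name.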
