Diffing Binary Files In Python

Views: 499 Published: 2020/10/22 0:26:24 python diff

Question

I've got two binary files. They look something like this, but the data is more random:

File A:

FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF ...

File B:

41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37 ...

What I'd like is to call something like:

>>> someDiffLib.diff(file_a_data, file_b_data)

And receive:

[Match(pos=4, length=4)]

This indicates that in both files the 4 bytes starting at position 4 are the same. The sequence 44 43 42 41 would not match because it is not at the same position in each file.

Is there a library that will do the diff for me? Or should I just write the loops to do the comparison?

Recommended Answer

You can use itertools.groupby() for this. Here is an example:

from itertools import groupby

# this just sets up some byte strings to use, Python 2.x version is below
# instead of this you would use f1 = open('some_file', 'rb').read()
f1 = bytes(int(b, 16) for b in 'FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF'.split())
f2 = bytes(int(b, 16) for b in '41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37'.split())

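matches = []
# group consecutive indices by whether the bytes at that index are equal
for k, g in groupby(range(min(len(f1), len(f2))), key=lambda i: f1[i] == f2[i]):
    if k:  # this group is a run of matching bytes
        pos = next(g)  # first index of the run
        length = len(list(g)) + 1  # remaining indices plus the one consumed above
        matches.append((pos, length))
# for the sample data above, matches == [(4, 4)]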

Or the same thing as above using a list comprehension:

matches = [(next(g), len(list(g))+1)
           for k, g in groupby(range(min(len(f1), len(f2))), key=lambda i: f1[i] == f2[i])
               if k]
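To get results in the Match(pos=..., length=...) shape shown in the question, the same groupby approach could be wrapped in a small function. This is only a sketch for Python 3: `Match` and `diff_positions` are names made up here for illustration, not part of any library, and positions past the end of the shorter input are simply ignored.

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical result type mirroring the shape shown in the question.
Match = namedtuple("Match", ["pos", "length"])


def diff_positions(a, b):
    """Return runs of positions where bytes objects a and b hold equal bytes."""
    limit = min(len(a), len(b))  # positions past the shorter input are ignored
    matches = []
    for equal, run in groupby(range(limit), key=lambda i: a[i] == b[i]):
        if equal:
            indices = list(run)  # all indices in this run of equal bytes
            matches.append(Match(pos=indices[0], length=len(indices)))
    return matches


file_a_data = bytes.fromhex("FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF")
file_b_data = bytes.fromhex("41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37")
print(diff_positions(file_a_data, file_b_data))  # [Match(pos=4, length=4)]
```

For real files, the two arguments would be the results of open(path, 'rb').read() rather than the sample byte strings.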

Here is the setup for the example if you are using Python 2.x:

f1 = ''.join(chr(int(b, 16)) for b in 'FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF'.split())
f2 = ''.join(chr(int(b, 16)) for b in '41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37'.split())
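The question also asks whether to just write the loops. For comparison, a plain-loop version might look like the sketch below. It assumes Python 3 bytes objects, `matching_runs` is a hypothetical name chosen here, and it returns the same (pos, length) tuples as the groupby version.

```python
def matching_runs(a, b):
    """Find (pos, length) runs of equal bytes at equal positions with a plain loop."""
    limit = min(len(a), len(b))  # only compare positions present in both inputs
    runs = []
    start = None  # start index of the run currently being extended, if any
    for i in range(limit):
        if a[i] == b[i]:
            if start is None:
                start = i  # a new run of matching bytes begins here
        elif start is not None:
            runs.append((start, i - start))  # the run ended just before i
            start = None
    if start is not None:  # close a run that reaches the end of the shorter input
        runs.append((start, limit - start))
    return runs


f1 = bytes.fromhex("FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF")
f2 = bytes.fromhex("41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37")
print(matching_runs(f1, f2))  # [(4, 4)]
```

The loop is longer than the groupby version, but it makes the run bookkeeping explicit and avoids building an intermediate list for each run.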
