Handling and working with binary data (hex) in Python

Views: 243 | Published: 2020/9/24 18:42:30 | python file byte


Problem description

I'm trying to compare some byte values. Source A comes from a file that is being read:

f = open(fname, "rb")
f_data = f.read()
f.close()

These files can be anywhere from a few KB to a few MB in size.

Source B is a dictionary of known patterns:

eof_markers = {
    'jpg':b'\xff\xd9',
    'pdf':b'\x25\x25\x45\x4f\x46',
    }

(This list will be extended once the basic process works.)

Essentially, I'm trying to read the file (source A) and then incrementally inspect its last bytes for matches against the pattern list, using testString = f_data[-counter:]. If no match is found, counter should increase by 1 and the match against the list should be tried again.
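
The procedure described above can be sketched roughly as follows, assuming Python 3, where both the file data and the patterns are bytes objects. The function name find_eof_marker and the max_tail limit are hypothetical and only serve the illustration; they are not part of the original question.

```python
def find_eof_marker(data, markers, max_tail=1024):
    """Grow a window from the end of data, one byte at a time,
    until the window starts with one of the known markers.

    Returns (name, offset) for the first match, or None.
    max_tail is a hypothetical safety limit on the window size.
    """
    limit = min(len(data), max_tail)
    for counter in range(1, limit + 1):
        tail = data[-counter:]  # the last `counter` bytes
        for name, marker in markers.items():
            # The window grows leftwards, so a marker first shows up
            # at the very start of the window.
            if tail.startswith(marker):
                return name, len(data) - counter
    return None


eof_markers = {
    'jpg': b'\xff\xd9',
    'pdf': b'\x25\x25\x45\x4f\x46',
}
sample = b'\x12\x43\xff\xd9\x00\x23'
result = find_eof_marker(sample, eof_markers)  # ('jpg', 2)
```

Because everything stays as bytes, no encoding or decoding step is involved in the comparison.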

I've tried a number of different ways to get this working. I can get testString to grow correctly, but I keep running into encoding issues where the various approaches want to ASCII-ify the bytes before doing the comparison.

I'm a bit lost, and not for the first time: I've been wandering around the code changing int to u to b without getting past issues like d9 being a reserved value, which means I can't use the ASCII-style comparison tools. For example, if format_type in testString: results in UnicodeDecodeError: 'ascii' codec can't decode byte a9.

I tried to convert everything to an integer, but that threw ValueError: invalid literal for int() with base 2: '.' or ValueError: invalid literal for int() with base 10: '.'. I also tried to convert testString to hex bytes, but kept getting TypeError: hex() argument can't be converted to hex (I'm sure this is more my lack of understanding than anything else).
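
For reference, here is a minimal sketch of how bytes can be compared and converted without any decoding, assuming Python 3. The variable names are illustrative only. It shows that bytes compare directly against bytes, that hex() works on a single byte value (an int) rather than on a whole bytes object, and that int.from_bytes is one way to get an integer from raw bytes.

```python
import binascii

data = b'\x12\x43\xff\xd9\x00\x23'
marker = b'\xff\xd9'

# bytes compare directly against bytes; nothing is decoded.
found = marker in data              # True
same_slice = data[2:4] == marker    # True

# Indexing a bytes object yields an int, which hex() accepts.
first_byte = marker[0]              # 255
first_hex = hex(first_byte)         # '0xff'

# binascii.hexlify returns the hex digits of the whole object as bytes.
hex_text = binascii.hexlify(marker).decode('ascii')  # 'ffd9'

# int.from_bytes interprets raw bytes as one integer.
as_int = int.from_bytes(marker, byteorder='big')     # 65497

# Mixing str and bytes in a containment test raises TypeError in Python 3.
try:
    'jpg' in data
    mixed_error = None
except TypeError as exc:
    mixed_error = type(exc).__name__
```

Note that int() on a text or byte string only parses digit characters, which is one way an "invalid literal" ValueError can arise when it is given arbitrary binary data.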

I've found a number of resources that talk about encoding and hex comparisons (e.g. stackoverflow.com/questions/10561923/unicodedecodeerror-ascii-codec-cant-decode-byte-0xef-in-position-1), but I haven't found anything that I can fully understand or that points me down the right path.

I've been stuck on this for a while, so any pointers are gratefully received.

Recommended answer

I'm not sure exactly what you're trying to do, but I ran this code in Python 3.2.3.

#f = open(fname, "rb")
#f_data = f.read()
#f.close()
f_data = b'\x12\x43\xff\xd9\x00\x23'
eof_markers = {
    'jpg':b'\xff\xd9',
    'pdf':b'\x25\x25\x45\x4f\x46',
    }

for counter in range(-4, 0):
  for name, marker in eof_markers.items():
    print(counter, ('' if marker in f_data[counter:] else '!') + name)

I'm using a hardcoded f_data, but you can undo that by uncommenting lines 1-3 and commenting out line 4.
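
When reading from a real file, it may not be necessary to load all of it, since only the end is inspected. The following is a sketch of one possible way to read just the last bytes, assuming Python 3; the helper name read_tail and its size parameter are hypothetical and not part of the original answer.

```python
import os


def read_tail(path, size=1024):
    """Return up to the last `size` bytes of the file at `path`."""
    with open(path, "rb") as f:       # the file is closed automatically
        f.seek(0, os.SEEK_END)        # jump to the end to learn the length
        length = f.tell()
        f.seek(max(0, length - size))  # step back, but not before the start
        return f.read()
```

The returned bytes could then be used in place of the hardcoded f_data, for example f_data = read_tail(fname).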

Here is the output:

-4 !pdf
-4 jpg
-3 !pdf
-3 !jpg
-2 !pdf
-2 !jpg
-1 !pdf
-1 !jpg

Is there something you need that this isn't doing?
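
If the goal is to locate where the last marker sits rather than to test growing slices, bytes.rfind is an alternative worth considering. This is a sketch under the assumption of Python 3, and it swaps the incremental loop for a single search per pattern; the function name last_marker is hypothetical.

```python
def last_marker(data, markers):
    """Return (name, offset) of the marker occurring closest to the
    end of data, or None if no marker occurs at all."""
    best = None
    for name, marker in markers.items():
        pos = data.rfind(marker)  # -1 when the marker is absent
        if pos != -1 and (best is None or pos > best[1]):
            best = (name, pos)
    return best


eof_markers = {
    'jpg': b'\xff\xd9',
    'pdf': b'\x25\x25\x45\x4f\x46',
}
f_data = b'\x12\x43\xff\xd9\x00\x23'
hit = last_marker(f_data, eof_markers)  # ('jpg', 2)
```

Any bytes after the reported offset plus the marker length are trailing data that follows the end-of-file marker.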
