Handling and working with binary data HEX with python

Question

I'm trying to do a comparison of some byte values - source A comes from a file that is being 'read':

f = open(fname, "rb")
f_data = f.read()
f.close()

These files can be anything from a few KB to a few MB in size.

Source B is a dictionary of known patterns:

eof_markers = {
    'jpg':b'\xff\xd9',
    'pdf':b'\x25\x25\x45\x4f\x46',
    }

(This list will be extended once the basic process works)

Essentially I'm trying to read the file (source A) and then incrementally inspect its trailing bytes for matches against the pattern list, using testString = f_data[-counter:]. If no match is found, it should increase counter by 1 and try to match against the list again.
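The loop described above can be sketched roughly as follows, assuming Python 3, where both the file contents and the markers are bytes objects. The function name find_eof_marker and the sample data are hypothetical, chosen only for illustration.

```python
def find_eof_marker(data, markers):
    """Grow the tail one byte at a time and return (name, counter)
    for the first marker found inside the tail, or (None, None)."""
    for counter in range(1, len(data) + 1):
        tail = data[-counter:]  # the last `counter` bytes
        for name, marker in markers.items():
            if marker in tail:  # bytes-in-bytes test, no decoding
                return name, counter
    return None, None


eof_markers = {
    'jpg': b'\xff\xd9',
    'pdf': b'\x25\x25\x45\x4f\x46',
}

# Hypothetical sample standing in for the contents of a real file
sample = b'\x12\x43\xff\xd9\x00\x23'
result = find_eof_marker(sample, eof_markers)
print(result)
```

Each pass rescans a slightly longer slice, so on files of several MB this approach may be slow; it is meant to mirror the description, not to be efficient.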

I've tried a number of different ways to get this working. I can get testString to grow correctly, but I keep running into encoding issues where various approaches want to ASCII-ify the bytes before comparing them.

I'm a bit lost, and not for the first time: I've been wandering around the code changing int to u to b without getting past issues like d9 being a reserved value, and therefore not being able to use the ASCII-type comparison tools, e.g. if format_type in testString: (results in a UnicodeDecodeError: 'ascii' codec can't decode byte a9).
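One likely cause of that error, assuming the code ran under Python 2, is a unicode string being compared with a byte string, which makes Python 2 try to decode the bytes as ASCII. The sketch below shows the Python 3 behaviour instead: a bytes-in-bytes test compares raw values, and mixing str with bytes is rejected outright. The sample data is hypothetical.

```python
data = b'\x12\x43\xff\xd9\x00\x23'

# Both operands are bytes, so raw values are compared and nothing is decoded
found = b'\xff\xd9' in data

# A str operand is refused in Python 3 rather than implicitly decoded
try:
    'jpg' in data
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(found, mixed_ok)
```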

I tried to convert everything to an integer, but that threw one of these errors: ValueError: invalid literal for int() with base 2: '.' or ValueError: invalid literal for int() with base 10: '.'. I tried to convert testString to hex bytes, but kept getting TypeError: hex() argument can't be converted to hex (this is more my lack of understanding than anything else, I'm sure!).
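For reference, hex() accepts a single integer rather than a byte string, and int() parses text digits rather than raw bytes, which would explain errors of this kind. A minimal sketch of ways to view bytes as hex or as an integer, assuming Python 3.2 or later (the variable names are illustrative only):

```python
import binascii

test_string = b'\xff\xd9'

# hex() takes one integer, so apply it to each byte value in turn
per_byte = [hex(b) for b in bytearray(test_string)]

# hexlify returns the hex digits of the whole sequence as a bytes object
hex_digits = binascii.hexlify(test_string)

# int.from_bytes interprets the whole sequence as one big-endian integer
as_int = int.from_bytes(test_string, 'big')

print(per_byte, hex_digits, as_int)
```

None of these conversions is needed for the comparison itself; bytes objects can be compared directly.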

I've found a number of resources that talk about encoding / hex comparisons (e.g. stackoverflow.com/questions/10561923/unicodedecodeerror-ascii-codec-cant-decode-byte-0xef-in-position-1), but I've not found anything that I can either fully understand or that points me down the right path.

I've been stuck on this for a while, so any pointers are gratefully received.

Recommended answer

I'm not sure exactly what you're trying to do, but I ran this code in Python 3.2.3.

#f = open(fname, "rb")
#f_data = f.read()
#f.close()
f_data = b'\x12\x43\xff\xd9\x00\x23'
eof_markers = {
    'jpg':b'\xff\xd9',
    'pdf':b'\x25\x25\x45\x4f\x46',
    }

for counter in range(-4, 0):
  for name, marker in eof_markers.items():
    print(counter, ('' if marker in f_data[counter:] else '!') + name)

I'm using a hardcoded f_data, but you can undo that by uncommenting lines 1-3 and commenting out line 4.

Here is the output:

-4 !pdf
-4 jpg
-3 !pdf
-3 !jpg
-2 !pdf
-2 !jpg
-1 !pdf
-1 !jpg

Is there something this isn't doing that you need to do?
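If the goal is to locate where each marker last occurs rather than to grow a tail byte by byte, bytes.rfind may be a simpler route. This is an alternative sketch, not part of the answer above; the helper name last_marker_offsets is hypothetical.

```python
def last_marker_offsets(data, markers):
    """Map each marker name to the offset of its last occurrence
    in data, or -1 when the marker does not occur at all."""
    return {name: data.rfind(marker) for name, marker in markers.items()}


eof_markers = {
    'jpg': b'\xff\xd9',
    'pdf': b'\x25\x25\x45\x4f\x46',
}

f_data = b'\x12\x43\xff\xd9\x00\x23'
offsets = last_marker_offsets(f_data, eof_markers)
print(offsets)
```

With the hardcoded f_data, the jpg marker is reported at offset 2 and the pdf marker as -1 (not found).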
