What config file format to use for user-friendly strings of arbitrary bytes?
Question
So I made a short Python script to launch files in Windows with ambiguous extensions by examining their magic number/file signature first:
- https://superuser.com/a/317927/13889
- https://gist.github.com/1119561
I'd like to compile it to a .exe to make association easier (either using bbfreeze or rewriting in C), but I need some kind of user-friendly config file to specify the matching byte strings and program paths. Basically I want to put this information into a plain text file somehow:
magic_numbers = {
    # TINA
    'OBSS': r'%PROGRAMFILES(X86)%\DesignSoft\Tina 9 - TI\TINA.EXE',

    # PSpice
    '*version': r'%PROGRAMFILES(X86)%\Orcad\Capture\Capture.exe',
    'x100\x88\xce\xcf\xcfOrCAD ': '',  # PSpice?

    # Protel
    'DProtel': r'%PROGRAMFILES(X86)%\Altium Designer S09 Viewer\dxp.exe',

    # Eagle
    '\x10\x80': r'%PROGRAMFILES(X86)%\EAGLE-5.11.0\bin\eagle.exe',
    '\x10\x00': r'%PROGRAMFILES(X86)%\EAGLE-5.11.0\bin\eagle.exe',
    '<?xml version="1.0" encoding="utf-8"?>\n<!DOCTYPE eagle ': r'%PROGRAMFILES(X86)%\EAGLE-5.11.0\bin\eagle.exe',

    # PADS Logic
    '\x00\xFE': r'C:\MentorGraphics\9.3PADS\SDD_HOME\Programs\powerlogic.exe',
}
(The hex bytes are just arbitrary bytes, not Unicode characters.)
I guess a .py file in this format works, but I have to leave it uncompiled and somehow still import it into the compiled file, and there's still a bunch of extraneous content like { and , to be confused by/screw up.
I looked at YAML, and it would be great except that it requires base64-encoding binary stuff first, which isn't really what I want. I'd prefer the config file to contain hex representations of the bytes. But also ASCII representations, if that's all the file signature is. And maybe also regexes. :D (In case the XML-based format can be written with different amounts of whitespace, for instance)
Any ideas?
Recommended answer
You've already got your answer: YAML.
The data you posted up above is storing text representations of binary data; that will be fine for YAML, you just need to parse it properly. Usually you'd use something from the binascii module; in this case, likely the binascii.a2b_qp function.
import binascii

# Python 2.7, where a plain str literal is a byte string
magic_id_str = 'x100\x88\xce\xcf\xcfOrCAD '
magic_id = binascii.a2b_qp(magic_id_str)
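For anyone on Python 3, here is a minimal sketch of the same quoted-printable round trip. The helper names decode_signature and encode_signature are made up for illustration, and it assumes the config stores each signature as ASCII text with non-printable bytes written as =XX:

```python
import binascii


def decode_signature(text):
    """Decode quoted-printable config text (hypothetical helper) to raw bytes."""
    # a2b_qp accepts an ASCII-only str or a bytes-like object and returns bytes
    return binascii.a2b_qp(text)


def encode_signature(raw):
    """Encode raw bytes back to the text form a user would type in the config."""
    return binascii.b2a_qp(raw).decode('ascii')


print(decode_signature('=CE=A6'))
print(decode_signature('DProtel'))
print(encode_signature(b'\x10\x80'))
```

Plain ASCII signatures pass through unchanged, so a user only needs the =XX form for bytes that are not printable. One caveat of this format: a literal = in a signature is safest written as =3D.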
To elucidate, I will use a Unicode character as an easy way to paste binary data into the REPL (Python 2.7):
>>> import binascii, yaml
>>> a = 'Φ'
>>> a
'\xce\xa6'
>>> binascii.b2a_qp(a)
'=CE=A6'
>>> magic_text = yaml.load("""
... magic_string: '=CE=A6'
... """)
>>> magic_text
{'magic_string': '=CE=A6'}
>>> binascii.a2b_qp(magic_text['magic_string'])
'\xce\xa6'
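Putting it together, a rough sketch (Python 3) of how the decoded signatures might be matched against the first bytes of a file. It assumes the config has already been parsed into a dict of quoted-printable text to program path, as yaml.load would give you, and that a signature always sits at the very start of the file. The function names and the paths under C:\Tools are placeholders, not anything from the original script:

```python
import binascii


def build_table(config):
    """Turn {qp_text: program_path} into a list of (signature_bytes, program_path)."""
    table = [(binascii.a2b_qp(qp_text), program)
             for qp_text, program in config.items()]
    # Try longer signatures first so a short prefix cannot shadow a longer one
    table.sort(key=lambda entry: len(entry[0]), reverse=True)
    return table


def find_program(header, table):
    """Return the program whose signature starts the header bytes, or None."""
    for signature, program in table:
        if header.startswith(signature):
            return program
    return None


# Stand-in for the dict a YAML parser would return (placeholder paths)
config = {
    'OBSS': r'C:\Tools\tina.exe',
    '=10=80': r'C:\Tools\eagle.exe',
    '=00=FE': r'C:\Tools\powerlogic.exe',
}

table = build_table(config)
print(find_program(b'\x10\x80rest of file', table))
```

In real use the header would come from reading the first few hundred bytes of the file in binary mode, and the table would be built once at startup.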