Any Python module for a customized BNF parser?


Question

I have a 'make'-like file that needs to be parsed. The grammar is something like:

samtools=/path/to/samtools
picard=/path/to/picard

task1: 
    des: description
    path: /path/to/task1
    para: [$global.samtools,
           $args.input,
           $path
          ]

task2: task1

Here $global contains the variables defined in the global scope, $path is a 'local' variable, and $args contains the key/value pairs passed in by the user.

I would like to parse this file with a Python library, preferably one that returns a parse tree and reports any errors it finds. I found CodeTalker and yeanpypa. Can they be used in this case? Are there any other recommendations?

Recommended answer

I had to guess what your makefile structure allows based on your example, but this should get you close:

from pyparsing import *
# elements of the makefile are delimited by line, so we must
# define skippable whitespace to include just spaces and tabs
ParserElement.setDefaultWhitespaceChars(' \t')
NL = LineEnd().suppress()

EQ,COLON,LBRACK,RBRACK = map(Suppress, "=:[]")
identifier = Word(alphas+'_', alphanums)

symbol_assignment = Group(identifier("name") + EQ + empty + 
                          restOfLine("value"))("symbol_assignment")
symbol_ref = Word("$",alphanums+"_.")

def only_column_one(s,l,t):
    if col(l,s) != 1:
        raise ParseException(s,l,"not in column 1")
# task identifiers have to start in column 1
task_identifier = identifier.copy().setParseAction(only_column_one)

task_description = "des:" + empty + restOfLine("des")
task_path = "path:" + empty + restOfLine("path")
task_para_body = delimitedList(symbol_ref)
task_para = "para:" + LBRACK + task_para_body("para") + RBRACK
task_para.ignore(NL)
task_definition = Group(task_identifier("target") + COLON + 
        Optional(delimitedList(identifier))("deps") + NL +
        (
        Optional(task_description + NL) & 
        Optional(task_path + NL) & 
        Optional(task_para + NL)
        )
    )("task_definition")

makefile_parser = ZeroOrMore(
    symbol_assignment |
    task_definition |
    NL
    )


if __name__ == "__main__":
    test = """\
samtools=/path/to/samtools
picard=/path/to/picard

task1:  
    des: description 
    path: /path/to/task1 
    para: [$global.samtools, 
           $args.input, 
           $path 
          ] 

task2: task1 
"""

    # dump out what we parsed, including results names
    for element in makefile_parser.parseString(test):
        print(element.getName())
        print(element.dump())
        print()

Prints:

symbol_assignment
['samtools', '/path/to/samtools']
- name: samtools
- value: /path/to/samtools

symbol_assignment
['picard', '/path/to/picard']
- name: picard
- value: /path/to/picard

task_definition
['task1', 'des:', 'description ', 'path:', '/path/to/task1 ', 'para:', 
 '$global.samtools', '$args.input', '$path']
- des: description 
- para: ['$global.samtools', '$args.input', '$path']
- path: /path/to/task1 
- target: task1

task_definition
['task2', 'task1']
- deps: ['task1']
- target: task2

The dump() output shows which names you can use to get at the fields within the parsed elements, or to distinguish what kind of element you have. dump() is a handy, generic tool for printing whatever pyparsing has parsed. Here is some code that is more specific to your parser, showing how to use the field names either as dotted object references (element.target, element.deps, element.name, etc.) or as dict-style references (element[key]):

for element in makefile_parser.parseString(test):
    if element.getName() == 'task_definition':
        # print the target, then its dependencies (if any) on the same line
        print("TASK:", element.target, end="")
        if element.deps:
            print(" DEPS:(" + ','.join(element.deps) + ")")
        else:
            print()
        # the task fields are optional, so test for each before using it
        for key in ('des', 'path', 'para'):
            if key in element:
                print(" ", key.upper()+":", element[key])

    elif element.getName() == 'symbol_assignment':
        print("SYM:", element.name, "->", element.value)

Prints:

SYM: samtools -> /path/to/samtools
SYM: picard -> /path/to/picard
TASK: task1
  DES: description 
  PATH: /path/to/task1 
  PARA: ['$global.samtools', '$args.input', '$path']
TASK: task2 DEPS:(task1)
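
The question describes three kinds of references: $global.NAME for globally defined variables, $args.NAME for user-supplied values, and a bare $NAME such as $path for a task-local field. Once the symbol assignments and task fields have been collected from the parse results, resolving those references can be plain dictionary lookup. The following is only a minimal sketch under that assumption: the resolve_ref helper, the plain dicts standing in for the parsed results, and the sample.bam value are all hypothetical and are not part of pyparsing or of the parser above.

```python
def resolve_ref(ref, global_syms, args, local_syms):
    """Resolve one '$...' reference to a concrete value.

    '$global.NAME' is looked up in global_syms, '$args.NAME' in args,
    and a bare '$NAME' in the task's own fields (local_syms).
    This helper is hypothetical, written only for illustration.
    """
    if not ref.startswith("$"):
        raise ValueError("not a symbol reference: %r" % ref)
    scope, dot, name = ref[1:].partition(".")
    if not dot:
        # no scope prefix, e.g. '$path': treat it as a task-local field
        table, key = local_syms, scope
    elif scope == "global":
        table, key = global_syms, name
    elif scope == "args":
        table, key = args, name
    else:
        raise KeyError("unknown scope in %r" % ref)
    if key not in table:
        raise KeyError("undefined symbol: %r" % ref)
    return table[key]


# Plain dicts standing in for what was parsed from the example file
global_syms = {"samtools": "/path/to/samtools", "picard": "/path/to/picard"}
args = {"input": "sample.bam"}  # assumed user-supplied value
task1 = {"path": "/path/to/task1",
         "para": ["$global.samtools", "$args.input", "$path"]}

resolved = [resolve_ref(r, global_syms, args, task1) for r in task1["para"]]
print(resolved)
# ['/path/to/samtools', 'sample.bam', '/path/to/task1']
```

Raising KeyError for an undefined symbol or unknown scope is one way to surface mistakes in the file early; collecting the errors into a list and reporting them all at once would work just as well.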
