pyparsing - How to parse string with comparison operators?


Problem description

So, I have a NumericStringParser class (extracted from here), defined as below:

from __future__ import division
from pyparsing import Literal, CaselessLiteral, Word, Combine, Group, Optional, ZeroOrMore, Forward, nums, alphas, oneOf, ParseException
import math
import operator

class NumericStringParser(object):

    def __push_first__(self, strg, loc, toks):
        self.exprStack.append(toks[0])

    def __push_minus__(self, strg, loc, toks):
        if toks and toks[0] == "-":
            self.exprStack.append("unary -")

    def __init__(self):
        point = Literal(".")
        e = CaselessLiteral("E")
        fnumber = Combine(Word("+-" + nums, nums) +
                          Optional(point + Optional(Word(nums))) +
                          Optional(e + Word("+-" + nums, nums)))
        ident = Word(alphas, alphas + nums + "_$")
        plus = Literal("+")
        minus = Literal("-")
        mult = Literal("*")
        floordiv = Literal("//")
        div = Literal("/")
        mod = Literal("%")
        lpar = Literal("(").suppress()
        rpar = Literal(")").suppress()
        addop = plus | minus
        multop = mult | floordiv | div | mod
        expop = Literal("^")
        pi = CaselessLiteral("PI")
        tau = CaselessLiteral("TAU")
        expr = Forward()
        atom = ((Optional(oneOf("- +")) +
                 (ident + lpar + expr + rpar | pi | e | tau | fnumber).setParseAction(self.__push_first__))
                | Optional(oneOf("- +")) + Group(lpar + expr + rpar)
                ).setParseAction(self.__push_minus__)

        factor = Forward()
        factor << atom + \
            ZeroOrMore((expop + factor).setParseAction(self.__push_first__))
        term = factor + \
            ZeroOrMore((multop + factor).setParseAction(self.__push_first__))
        expr << term + \
            ZeroOrMore((addop + term).setParseAction(self.__push_first__))

        self.bnf = expr

        self.opn = {
            "+": operator.add,
            "-": operator.sub,
            "*": operator.mul,
            "/": operator.truediv,
            "//": operator.floordiv,
            "%": operator.mod,
            "^": operator.pow,
            "=": operator.eq,
            "!=": operator.ne,
            "<=": operator.le,
            ">=": operator.ge,
            "<": operator.lt,
            ">": operator.gt
            }

        self.fn = {
            "sin": math.sin,
            "cos": math.cos,
            "tan": math.tan,
            "asin": math.asin,
            "acos": math.acos,
            "atan": math.atan,
            "exp": math.exp,
            "abs": abs,
            "sqrt": math.sqrt,
            "floor": math.floor,
            "ceil": math.ceil,
            "trunc": math.trunc,
            "round": round,
            "fact": math.factorial,
            "gamma": math.gamma
            }

    def __evaluate_stack__(self, s):
        op = s.pop()
        if op == "unary -":
            return -self.__evaluate_stack__(s)
        if op in ("+", "-", "*", "//", "/", "^", "%", "!=", "<=", ">=", "<", ">", "="):
            op2 = self.__evaluate_stack__(s)
            op1 = self.__evaluate_stack__(s)
            return self.opn[op](op1, op2)
        if op == "PI":
            return math.pi
        if op == "E":
            return math.e
        if op == "PHI":
            return (1 + math.sqrt(5)) / 2  # assumes PHI means the golden ratio
        if op == "TAU":
            return math.tau
        if op in self.fn:
            return self.fn[op](self.__evaluate_stack__(s))
        if op[0].isalpha():
            raise NameError(f"{op} is not defined.")
        return float(op)

I have an evaluate() function, defined as below:

def evaluate(expression, parse_all=True):
    nsp = NumericStringParser()
    nsp.exprStack = []
    try:
        nsp.bnf.parseString(expression, parse_all)
    except ParseException as error:
        raise SyntaxError(error)
    return nsp.__evaluate_stack__(nsp.exprStack[:])

evaluate() is a function that will parse a string to calculate a mathematical operation, for example:

>>> evaluate("5+5")
10.0

>>> evaluate("5^2+1")
26.0

The problem is that it cannot compute comparison operators (=, !=, <, >, <=, >=), and when I try: evaluate("5=5"), it throws SyntaxError: Expected end of text (at char 1), (line:1, col:2) instead of returning True. How can the function compute those six comparison operators?

Solution

As pointed out by @rici, you have added the evaluation part, but not the parsing part.

The parser is defined in these lines:

    factor = Forward()
    factor <<= atom + \
        ZeroOrMore((expop + factor).setParseAction(self.__push_first__))
    term = factor + \
        ZeroOrMore((multop + factor).setParseAction(self.__push_first__))
    expr <<= term + \
        ZeroOrMore((addop + term).setParseAction(self.__push_first__))

The order of these statements is important, because it is what makes the parser follow the precedence of operations you learned in high school math: exponentiation binds tightest, then multiplication and division, then addition and subtraction.

You'll need to insert your relational operators into this parser definition following the same pattern. After addition, the convention from C language operator precedence (I found this reference - https://www.tutorialspoint.com/cprogramming/c_operators_precedence.htm) is:

relational operations - <=, >=, >, <
equality operations - ==, !=

In your case, you chose to use '=' instead of '==', and that should be okay in this setting. I suggest you use pyparsing's oneOf helper to define these operator groups, as it will take care of the case where a short string might mask a longer string (as when '/' masked '//' in your earlier post).
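As a minimal sketch of that suggestion, the two operator groups could be defined as below. The names relational_op and equality_op are my own, not from the question; the point is only that oneOf tries the longer alternative first even though the shorter one is listed first.

```python
from pyparsing import oneOf

# Hypothetical names for the two new operator groups.
# The short operators are deliberately listed first: oneOf reorders
# the alternatives so that "<=" is tried before its prefix "<".
relational_op = oneOf("< > <= >=")
equality_op = oneOf("= !=")

print(relational_op.parseString("<= 3")[0])  # <=
print(relational_op.parseString("< 3")[0])   # <
print(equality_op.parseString("!= 3")[0])    # !=
```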

Note that, by mixing these operations all into one expression parser, you will be able to parse things like 5 + 2 > 3. Since '>' has lower precedence, 5 + 2 will be evaluated first giving 7, then 7 > 3 will be evaluated, and operator.gt will return True or False (which compare equal to 1 and 0).
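To make that evaluation order concrete, here is a small standard-library-only sketch of how a postfix stack like the question's exprStack would be evaluated once the parser pushes '>' after both operands. The function name evaluate_postfix and the reduced OPS table are illustrative, not part of the original class.

```python
import operator

# Reduced operator table, just enough for this illustration.
OPS = {"+": operator.add, ">": operator.gt}

def evaluate_postfix(stack):
    """Evaluate a postfix token stack shaped like the parser's exprStack."""
    token = stack.pop()
    if token in OPS:
        right = evaluate_postfix(stack)  # operands come off in reverse order
        left = evaluate_postfix(stack)
        return OPS[token](left, right)
    return float(token)

# "5 + 2 > 3" would be pushed as: 5, 2, +, 3, >
result = evaluate_postfix(["5", "2", "+", "3", ">"])
print(result)       # True
print(result == 1)  # True, since bool is a subclass of int
```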

The difficulty in extending this example to other operators was what caused me to write the infixNotation helper method in pyparsing. You may want to give that a look.
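A rough sketch of what the infixNotation approach might look like for these operators follows. It uses pyparsing_common.number as the operand, which is a simplification compared with the question's atom (no functions or constants), and it only builds the nested parse result rather than evaluating it. In pyparsing 3 the same helper is also available under the name infix_notation.

```python
from pyparsing import infixNotation, opAssoc, oneOf, pyparsing_common

# Simplified operand: plain numbers only, converted to int/float.
operand = pyparsing_common.number

# Precedence levels are listed from highest to lowest.
comparison_expr = infixNotation(operand, [
    ("^", 2, opAssoc.RIGHT),
    (oneOf("* // / %"), 2, opAssoc.LEFT),
    (oneOf("+ -"), 2, opAssoc.LEFT),
    (oneOf("<= >= < >"), 2, opAssoc.LEFT),
    (oneOf("= !="), 2, opAssoc.LEFT),
])

result = comparison_expr.parseString("5 + 2 > 3", parseAll=True)
print(result.asList())  # [[[5, '+', 2], '>', 3]]
```

The nesting shows that 5 + 2 was grouped before the comparison, without any hand-written factor/term/expr levels.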

EDIT:

You asked about using Literal('<=') | Literal('>=') | etc., and as you wrote it, that will work just fine. You just have to be careful to look for the longer operators ahead of the shorter ones. If you write Literal('>') | Literal('>=') | ... then matching '>=' would fail because the first match would match the '>' and then you would be left with '='. Using oneOf takes care of this for you.
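The masking effect can be checked with a short experiment. The variable names short_first and long_first are only for illustration.

```python
from pyparsing import Literal, ParseException

short_first = Literal(">") | Literal(">=")   # ">" masks ">="
long_first = Literal(">=") | Literal(">")    # longer operator tried first

# Without parseAll, the short-first version stops after ">".
print(short_first.parseString(">=")[0])                 # >
print(long_first.parseString(">=", parseAll=True)[0])   # >=

# With parseAll, the leftover "=" becomes a parse error.
try:
    short_first.parseString(">=", parseAll=True)
    masked = False
except ParseException:
    masked = True
print(masked)  # True
```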

To add the additional parser steps, you only want to do the expr <<= ... step for the last level. Look at the pattern of statements again. Change expr <<= term + etc. to arith_expr = term + etc., follow it to add levels for relation_expr and equality_expr, and then finish with expr <<= equality_expr.

The pattern for this is based on:

factor := atom (^ factor)...
term := factor (mult_op factor)...
arith_expr := term (add_op term)...
relation_expr := arith_expr (relation_op arith_expr)...
equality_expr := relation_expr (equality_op relation_expr)...

Try doing that conversion to Python/pyparsing on your own.
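For comparison with your own attempt, here is one possible standalone sketch of that layering. It is deliberately cut down and makes several assumptions that differ from the question's class: numbers are unsigned integers only, there are no functions or constants, the stack is a module-level list, and the names push_first, evaluate_stack and evaluate are mine.

```python
import operator
from pyparsing import Forward, Literal, Word, ZeroOrMore, nums, oneOf

expr_stack = []  # module-level stack, a simplification for this sketch

def push_first(toks):
    expr_stack.append(toks[0])

number = Word(nums).setParseAction(push_first)  # unsigned integers only
lpar = Literal("(").suppress()
rpar = Literal(")").suppress()

expr = Forward()
atom = number | (lpar + expr + rpar)

# One level per precedence tier, highest precedence first.
factor = Forward()
factor <<= atom + ZeroOrMore((Literal("^") + factor).setParseAction(push_first))
term = factor + ZeroOrMore((oneOf("* // / %") + factor).setParseAction(push_first))
arith_expr = term + ZeroOrMore((oneOf("+ -") + term).setParseAction(push_first))
relation_expr = arith_expr + ZeroOrMore(
    (oneOf("<= >= < >") + arith_expr).setParseAction(push_first))
equality_expr = relation_expr + ZeroOrMore(
    (oneOf("= !=") + relation_expr).setParseAction(push_first))
expr <<= equality_expr  # only the last level is assigned into the Forward

OPS = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "/": operator.truediv, "//": operator.floordiv, "%": operator.mod,
    "^": operator.pow, "=": operator.eq, "!=": operator.ne,
    "<=": operator.le, ">=": operator.ge, "<": operator.lt, ">": operator.gt,
}

def evaluate_stack(stack):
    op = stack.pop()
    if op in OPS:
        right = evaluate_stack(stack)
        left = evaluate_stack(stack)
        return OPS[op](left, right)
    return float(op)

def evaluate(text):
    del expr_stack[:]  # reset between calls
    expr.parseString(text, parseAll=True)
    return evaluate_stack(expr_stack[:])

print(evaluate("5=5"))        # True
print(evaluate("5 + 2 > 3"))  # True
print(evaluate("2*3 != 6"))   # False
```

The same five statements, with self.__push_first__ as the parse action and the question's fuller atom, should slot into NumericStringParser.__init__ in place of the existing factor/term/expr lines.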
