在调车场处理额外的操作员 [英] Handling extra operators in Shunting-yard

查看:81
本文介绍了在调车场处理额外的操作员的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

给出如下输入:3+4+ 算法将其转换为3 4 + +

Given an input like this: 3+4+ Algorithm turns it in to 3 4 + +

我应该在执行后缀表达式时发现错误. 但是,是否有可能在转换过程中发现这一点?

I can find the error when it's time to execute the postfix expression. But, is it possible to spot this during the conversion?

(我读过的维基百科文章和互联网文章都无法解决这种情况)

(Wikipedia article and internet articles I've read do not handle this case)

谢谢

推荐答案

除括号不匹配外,可用正则表达式验证有效表达式. (不匹配的括号将被维基百科页面上指示的调车码算法捕获,因此我将忽略它们.)

Valid expressions can be validated with a regular expression, aside from parenthesis mismatching. (Mismatched parentheses will be caught by the shunting-yard algorithm as indicated in the wikipedia page, so I'm ignoring those.)

正则表达式如下:

PRE* OP POST* (INF PRE* OP POST*)*

其中:

PRE  is a prefix operator or (
POST is a postfix operator or )
INF  is an infix operator
OP   is an operand (a literal or a variable name)

您链接的Wikipedia页面包含函数调用,但不包含数组下标;如果您想处理这些情况,则:

The Wikipedia page you linked includes function calls but not array subscripting; if you want to handle these cases, then:

PRE  is a prefix operator or (
POST is a postfix operator or ) or ]
INF  is an infix operator or ( or ,
OP   is an operand, including function names

上面要注意的一件事是PREINF处于不相交的上下文中;也就是说,如果PRE有效,则INF无效,反之亦然.这意味着对于前缀运算符和中缀运算符使用相同的符号是明确的.此外,PREPOST处于不相交的上下文中,因此您可以对前缀和后缀运算符使用相同的符号.但是,后缀和中缀运算符不能使用相同的符号.作为示例,请考虑C/C ++运算符:

One thing to note in the above is that PRE and INF are in disjoint contexts; that is, if PRE is valid, then INF is not, and vice versa. This implies that using the same symbol for both a prefix operator and an infix operator is unambiguous. Also, PRE and POST are in disjoint contexts, so you can use the same symbol for prefix and postfix operators. However, you cannot use the same symbol for postfix and infix operators. As examples, consider the C/C++ operators:

-  prefix or infix
+  prefix or infix
-- prefix or postfix
++ prefix or postfix

我在上面隐式使用此功能,以允许(用作表达式分组程序(有效前缀)和函数调用(中缀,因为它位于函数名和参数列表之间;运算符为呼叫".)

I implicitly used this feature above to allow ( to be used either as an expression grouper (effectively prefix) and as a function call (infix, because it comes between the function name and the argument list; the operator is "call".)

最常见的是将正则表达式实现为状态机,因为只有两种状态:

It's most common to implement that regular expression as a state machine, because there are only two states:

 +-----+            +-----+
 |State|-----OP---->|State|
 |  1  |<----INF----|  2  |
 |     |---+        |     |---+
 +-----+   |        +-----+   |
    ^     PRE          ^     POST
    |      |           |      |
    +------+           +------+

我们可以将状态1称为想要的操作数",将状态2称为具有操作数".一个简单的实现只是将Wikipedia中介绍的调车场算法分成两个循环,就像这样(如果您不喜欢goto,可以将其消除,但这实际上是呈现状态机的最清晰方法)

We could call State 1 "want operand" and State 2 "have operand". A simple implementation would just split the shunting yard algorithm as presented in wikipedia into two loops, something like this (if you don't like goto, it can be eliminated, but it really is the clearest way to present a state machine).

want_operand:
  read a token. If there are no more tokens, announce an error.
  if the token is an prefix operator or an '(':
    mark it as prefix and push it onto the operator stack
    goto want_operand
  if the token is an operand (identifier or variable):
    add it to the output queue
    goto have_operand
  if the token is anything else, announce an error and stop. (**)

have_operand:
  read a token
  if there are no more tokens:
    pop all operators off the stack, adding each one to the output queue.
    if a `(` is found on the stack, announce an error and stop.
  if the token is a postfix operator:
    mark it as postfix and add it to the output queue
    goto have_operand.
  if the token is a ')':
    while the top of the stack is not '(':
      pop an operator off the stack and add it to the output queue
    if the stack becomes empty, announce an error and stop.
    if the '(' is marked infix, add a "call" operator to the output queue (*)
    pop the '(' off the top of the stack
    goto have_operand
  if the token is a ',':
    while the top of the stack is not '(':
      pop an operator off the stack and add it to the output queue
    if the stack becomes empty, announce an error
    goto want_operand
  if the token is an infix operator:
    (see the wikipeda entry for precedence handling)
    (normally, all prefix operators are considered to have higher precedence
    than infix operators.)
    got to want_operand
  otherwise, token is an operand. Announce an error

(*) The procedure as described above does not deal gracefully with parameter lists;
    when the postfix expression is being evaluated and a "call" operator is found, it's
    not clear how many arguments need to be evaluated. It might be that function names
    are clearly identifiable, so that the evaluator can just attach arguments to the
    "call" until a function name is found. But a cleaner solution is for the "," handler
    to increment the argument count of the "call" operator (that is, the open
    parenthesis marked as "infix"). The ")" handler also needs to increment the
    argument count.

(**) The state machine as presented does not correctly handle function calls with 0
    arguments. This call will show up as a ")" read in the want_operand state and with
    a "call" operator on top of the stack. If the "call" operator is marked with an
    argument count, as above, then the argument count must be 0 when the ")" is read,
    and in this case, unlike the have_operand case, the argument count must not be
    incremented.

这篇关于在调车场处理额外的操作员的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆