Parsing strings of user input using the cparse library from git

Question

I am new to C++ programming and wish to use the cparse library found at https://github.com/cparse/cparse in my project. I want to interpret strings of user input like "a*(b+3)" (for variables a and b) and use them as a function, repeatedly, on different sets of input.

For example, taking a text file as input with two double numbers per line, my code will write a new file with the result of "a*(b+3)" on each line (assuming a is the first number and b is the second).

My problem arises when I try to include the cparse library from git. I followed the setup instructions naively (being new to git):

$ cd cparse
$ make release

But I cannot use the make command as I am using Windows. I tried downloading the zip file, copying the .cpp and .h files into the project directly, and using the "include existing" option in Visual Studio, but I get a huge number of compiler errors and can't make the code work myself.

Am I missing the point somehow? How do I get it to work?

Failing that, is there another way to parse strings of user input and use them as functions?

Recommended answer

Failing that, is there another way to parse strings of user input and use them as functions?

I would like to answer this part of your question because I have the feeling that a full-featured C parser might be a bit too heavy for what you intend. (By the way, once you have the C parser running, how would you process its output? Dynamic linking?)

Instead, I want to show you how to build a simple calculator (with a recursive-descent parser) on your own. For documentation of the techniques I will use, I warmly recommend Compilers: Principles, Techniques, and Tools by Aho, Lam, Sethi, and Ullman (better known as the "Dragon Book"), especially Chapter 4.

In the following, I describe my sample solution part by part.

Before starting to write a compiler or interpreter, it is reasonable to define the language that shall be accepted. I want to use a very limited subset of C: expressions consisting of

  • C-like floating point numbers (constants)
  • C-like identifiers (variables)
  • unary operators + and -
  • binary operators +, -, *, and /
  • parentheses ()
  • semicolons ; (to mark the end of an expression, mandatory).

Whitespace (including line breaks) is simply ignored but may be used to separate things as well as to improve human readability. I did not consider C or C++ style comments (or a lot of other sugar), in order to keep the source code as minimal as possible. (Even so, I ended up with nearly 500 lines.)

The specific example of the OP fits into this language once a semicolon is added:

a*(b+3);

Only one type is supported: double. Thus, I need neither types nor declarations, which makes things easier.

Before I started to sketch the grammar of this language, I thought about the "target" of compiling and decided to make classes for...

First, a class to store variables:

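// standard headers used by the code in this answer
#include <cctype>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// storage of variables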

class Var {
  private:
    double _value;
  public:
    Var(): _value() { }
    ~Var() = default;
    double get() const { return _value; }
    void set(double value) { _value = value; }
};

A variable provides storage for a value but not for its identifier. The latter is stored separately, as it is not needed for the actual usage of a variable but only to find it by name:

typedef std::map<std::string, Var> VarTable;

Using a std::map automates the creation of variables. As known from many high-level languages, a variable starts to exist when it is accessed for the first time.
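The following is a minimal sketch of the two std::map properties this design leans on: operator[] default-constructs a missing entry, and pointers to elements of a std::map stay valid when further entries are inserted (which matters as soon as some other object keeps a Var* around). The helper name lookUp is hypothetical and not part of the answer's code.

```cpp
#include <map>
#include <string>

// same shape as the Var class above, repeated to keep the sketch self-contained
class Var {
  private:
    double _value;
  public:
    Var(): _value() { }
    double get() const { return _value; }
    void set(double value) { _value = value; }
};

typedef std::map<std::string, Var> VarTable;

// hypothetical helper: returns the address of the variable with the given
// name, creating it (with value 0.0) if it does not exist yet
Var* lookUp(VarTable &varTable, const std::string &name)
{
  return &varTable[name]; // operator[] inserts a default-constructed Var
}
```

Calling lookUp(varTable, "a") twice yields the same address, even if other variables have been created in between.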

An abstract syntax tree (AST) is

a tree representation of the abstract syntactic structure of source code written in a programming language. Each node of the tree denotes a construct occurring in the source code.

I took this text from the Wikipedia article on abstract syntax trees; I couldn't have said it more concisely. Here are my classes for the AST:

// abstract syntax tree -> storage of "executable"

namespace AST {

class Expr {
  protected:
    Expr() = default;
  public:
    virtual ~Expr() = default;
  public:
    virtual double solve() const = 0;
};

class ExprConst: public Expr {
  private:
    double _value;
  public:
    ExprConst(double value): Expr(), _value(value) { }
    virtual ~ExprConst() = default;
    virtual double solve() const { return _value; }
};

class ExprVar: public Expr {
  private:
    Var *_pVar;
  public:
    ExprVar(Var *pVar): Expr(), _pVar(pVar) { }
    virtual ~ExprVar() = default;
    virtual double solve() const { return _pVar->get(); }
};

class ExprUnOp: public Expr {
  protected:
    Expr *_pArg1;
  protected:
    ExprUnOp(Expr *pArg1): Expr(), _pArg1(pArg1) { }
    virtual ~ExprUnOp() { delete _pArg1; }
};

class ExprUnOpNeg: public ExprUnOp {
  public:
    ExprUnOpNeg(Expr *pArg1): ExprUnOp(pArg1) { }
    virtual ~ExprUnOpNeg() = default;
    virtual double solve() const
    {
      return -_pArg1->solve();
    }
};

class ExprBinOp: public Expr {
  protected:
    Expr *_pArg1, *_pArg2;
  protected:
    ExprBinOp(Expr *pArg1, Expr *pArg2):
      Expr(), _pArg1(pArg1), _pArg2(pArg2)
    { }
    virtual ~ExprBinOp() { delete _pArg1; delete _pArg2; }
};

class ExprBinOpAdd: public ExprBinOp {
  public:
    ExprBinOpAdd(Expr *pArg1, Expr *pArg2): ExprBinOp(pArg1, pArg2) { }
    virtual ~ExprBinOpAdd() = default;
    virtual double solve() const
    {
      return _pArg1->solve() + _pArg2->solve();
    }
};

class ExprBinOpSub: public ExprBinOp {
  public:
    ExprBinOpSub(Expr *pArg1, Expr *pArg2): ExprBinOp(pArg1, pArg2) { }
    virtual ~ExprBinOpSub() = default;
    virtual double solve() const
    {
      return _pArg1->solve() - _pArg2->solve();
    }
};

class ExprBinOpMul: public ExprBinOp {
  public:
    ExprBinOpMul(Expr *pArg1, Expr *pArg2): ExprBinOp(pArg1, pArg2) { }
    virtual ~ExprBinOpMul() = default;
    virtual double solve() const
    {
      return _pArg1->solve() * _pArg2->solve();
    }
};

class ExprBinOpDiv: public ExprBinOp {
  public:
    ExprBinOpDiv(Expr *pArg1, Expr *pArg2): ExprBinOp(pArg1, pArg2) { }
    virtual ~ExprBinOpDiv() = default;
    virtual double solve() const
    {
      return _pArg1->solve() / _pArg2->solve();
    }
};

} // namespace AST

Thus, using the AST classes, the representation of the sample a*(b+3) would be

VarTable varTable;
AST::Expr *pExpr
= new AST::ExprBinOpMul(
    new AST::ExprVar(&varTable["a"]),
    new AST::ExprBinOpAdd(
      new AST::ExprVar(&varTable["b"]),
      new AST::ExprConst(3)));

Note:

There is no class derived from Expr to represent the parentheses () because this is simply not necessary. Parentheses are taken into account when the tree itself is built. Normally, operators with higher precedence become children of operators with lower precedence. As a result, the former are computed before the latter. In the above example, the instance of ExprBinOpAdd is a child of the instance of ExprBinOpMul (although the precedence of multiplication is higher than that of addition), which results from the proper consideration of the parentheses.

Besides storing a parsed expression, this tree can be used to compute the expression by calling the Expr::solve() method of the root node:

double result = pExpr->solve();
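Because the tree only keeps pointers to the variable storage, the same tree can be evaluated again after the variables have changed, which is the "use it as a function repeatedly" part of the question. The sketch below shows this with a trimmed-down set of node types. It is not the AST code from above: the names Node, Const, Ref, Add, Mul, and makeSample are hypothetical, and std::unique_ptr is swapped in for the raw owning pointers so that no explicit delete is needed.

```cpp
#include <memory>
#include <utility>

// hypothetical, trimmed-down node types (not the AST classes above)
struct Node {
  virtual ~Node() = default;
  virtual double solve() const = 0;
};

struct Const: Node {
  double value;
  explicit Const(double value): value(value) { }
  double solve() const override { return value; }
};

struct Ref: Node {
  const double *pValue; // storage of the variable; not owned by the node
  explicit Ref(const double *pValue): pValue(pValue) { }
  double solve() const override { return *pValue; }
};

struct Add: Node {
  std::unique_ptr<Node> arg1, arg2; // children are owned by the node
  Add(std::unique_ptr<Node> arg1, std::unique_ptr<Node> arg2):
    arg1(std::move(arg1)), arg2(std::move(arg2)) { }
  double solve() const override { return arg1->solve() + arg2->solve(); }
};

struct Mul: Node {
  std::unique_ptr<Node> arg1, arg2;
  Mul(std::unique_ptr<Node> arg1, std::unique_ptr<Node> arg2):
    arg1(std::move(arg1)), arg2(std::move(arg2)) { }
  double solve() const override { return arg1->solve() * arg2->solve(); }
};

// builds the tree for a*(b+3): Add is a child of Mul because of the parentheses
std::unique_ptr<Node> makeSample(const double *pA, const double *pB)
{
  return std::unique_ptr<Node>(new Mul(
    std::unique_ptr<Node>(new Ref(pA)),
    std::unique_ptr<Node>(new Add(
      std::unique_ptr<Node>(new Ref(pB)),
      std::unique_ptr<Node>(new Const(3.0))))));
}
```

With double a = 4, b = 2 the tree returned by makeSample(&a, &b) evaluates to 20; after assigning a = 2 and b = 4, calling solve() on the same tree gives 14.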

With a back end for our tiny calculator in place, the front end comes next.

A formal language is best described by a grammar.

program
  : expr Semicolon program
  | <empty>
  ;

expr
  : addExpr
  ;

addExpr
  : mulExpr addExprRest
  ;

addExprRest
  : addOp mulExpr addExprRest
  | <empty>
  ;

addOp
  : Plus | Minus
  ;

mulExpr
  : unaryExpr mulExprRest
  ;

mulExprRest
  : mulOp unaryExpr mulExprRest
  | <empty>
  ;

mulOp
  : Star | Slash
  ;

unaryExpr
  : unOp unaryExpr
  | primExpr
  ;

unOp
  : Plus | Minus
  ;

primExpr
  : Number
  | Id
  | LParen expr RParen
  ;

with the start symbol program.

The rules contain

  • terminal symbols (starting with an uppercase letter)
  • non-terminal symbols (starting with a lowercase letter)
  • a colon (:) to separate the left side from the right side (the non-terminal on the left side may expand to the symbols on the right side)
  • vertical bars (|) to separate alternatives
  • an <empty> symbol for expanding to nothing (used to terminate recursion).

From the terminal symbols, I will derive the tokens for the scanner.

The non-terminal symbols will be transformed into the parser functions.

The separation of addExpr and mulExpr is intentional. This way, the precedence of multiplicative operators over additive operators is "burnt into" the grammar itself. Obviously, floating point constants, variable identifiers, and expressions in parentheses (accepted in primExpr) have the highest precedence.

The rules contain only right recursion. This is a requirement for recursive-descent parsers (as I learnt in theory from the Dragon Book, and from practical experience in the debugger, until I fully understood the reason why). Implementing a left-recursive rule in a recursive-descent parser results in non-terminating recursion which, in turn, ends in a stack overflow.
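One consequence of right-recursive rules deserves a small illustration: if the "rest" rule is translated naively, so that the recursion computes the right-hand part first, a-b-c is grouped as a-(b-c). Carrying the left operand along in a loop gives the usual left-associative grouping (a-b)-c. The sketch below shows both for a tiny, hypothetical grammar with numbers and '-' only; the function names are made up for this illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>

// grammar:  expr : Number rest ;   rest : '-' Number rest | <empty> ;

// naive recursion: the right-hand part is evaluated first,
// so a-b-c is computed as a-(b-c)
double rightRec(std::istream &in)
{
  double value = 0.0;
  in >> value;
  char op;
  if (in >> op) {
    if (op == '-') return value - rightRec(in);
    in.unget(); // not ours: leave the character in the stream
  }
  return value;
}

// iteration with an accumulator: a-b-c is computed as (a-b)-c
double iterative(std::istream &in)
{
  double value = 0.0;
  in >> value;
  char op;
  while (in >> op) {
    if (op != '-') { in.unget(); break; }
    double rhs = 0.0;
    in >> rhs;
    value -= rhs;
  }
  return value;
}

// convenience wrappers which evaluate a string
double evalRightRec(const std::string &text)
{
  std::istringstream in(text);
  return rightRec(in);
}

double evalIterative(const std::string &text)
{
  std::istringstream in(text);
  return iterative(in);
}
```

For the input "10-4-3", evalIterative yields 3 while evalRightRec yields 9.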

It is usual to split the compiler into a scanner and a parser.

The scanner processes the input character stream and groups characters into tokens. The tokens are used as terminal symbols in the parser.

For the output of tokens, I made a class. In my professional projects, it additionally stores the exact file position to denote its origin. This is convenient for tagging created objects with source code references, as well as for any output of error messages and debugging info. (This is left out here to keep the sample as minimal as possible.)

// token class - produced in scanner, consumed in parser
struct Token {
  // tokens
  enum Tk {
    Plus, Minus, Star, Slash, LParen, RParen, Semicolon,
    Number, Id,
    EOT, Error
  };
  // token number
  Tk tk;
  // lexem as floating point number
  double number;
  // lexem as identifier
  std::string id;

  // constructors.
  explicit Token(Tk tk): tk(tk), number() { }
  explicit Token(double number): tk(Number), number(number) { }
  explicit Token(const std::string &id): tk(Id), number(), id(id) { }
};

There are two enumerators for special tokens:

  • EOT ... end of text (marks the end of input)
  • Error ... generated for any character which does not fit into any other token.

The tokens are used as the output of the actual scanner:

// the scanner - groups characters to tokens
class Scanner {
  private:
    std::istream &_in;
  public:
    // constructor.
    Scanner(std::istream &in): _in(in) { }
    /* groups characters to next token until the first character
     * which does not match (or end-of-file is reached).
     */
    Token scan()
    {
      char c;
      // skip white space
      do {
        if (!_in.get(c)) return Token(Token::EOT); // get() does not skip white space itself
      } while (isspace(c));
      // classify character and build token
      switch (c) {
        case '+': return Token(Token::Plus);
        case '-': return Token(Token::Minus);
        case '*': return Token(Token::Star);
        case '/': return Token(Token::Slash);
        case '(': return Token(Token::LParen);
        case ')': return Token(Token::RParen);
        case ';': return Token(Token::Semicolon);
        default:
          if (isdigit(c)) {
            _in.unget(); double value; _in >> value;
            return Token(value);
          } else if (isalpha(c) || c == '_') {
            std::string id(1, c);
            while (_in.get(c)) { // get() keeps white space, so it ends the identifier
              if (isalnum(c) || c == '_') id += c;
              else { _in.unget(); break; }
            }
            return Token(id);
          } else {
            _in.unget();
            return Token(Token::Error);
          }
      }
    }
};
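One detail of std::istream matters for a character-level scanner: the formatted extraction in >> c skips leading white space by default (the skipws flag is set), while the unformatted in.get(c) returns every character, white space included. The sketch below compares the two when collecting an identifier; the function names are hypothetical.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// true for characters which may appear in a C-like identifier
static bool isIdChar(char c)
{
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// collects identifier characters with formatted input:
// operator>> skips white space before each character
std::string readWordFormatted(const std::string &text)
{
  std::istringstream in(text);
  std::string word;
  char c;
  while (in >> c) {
    if (!isIdChar(c)) break;
    word += c;
  }
  return word;
}

// collects identifier characters with unformatted input:
// get() delivers white space as well, so it ends the word
std::string readWordUnformatted(const std::string &text)
{
  std::istringstream in(text);
  std::string word;
  char c;
  while (in.get(c)) {
    if (!isIdChar(c)) break;
    word += c;
  }
  return word;
}
```

For the input "a b", readWordFormatted returns "ab" (the blank is silently dropped) whereas readWordUnformatted returns "a".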

The scanner is used in the parser:

class Parser {
  private:
    Scanner _scanner;
    VarTable &_varTable;
    Token _lookAhead;

  private:
    // constructor.
    Parser(std::istream &in, VarTable &varTable):
      _scanner(in), _varTable(varTable), _lookAhead(Token::EOT)
    {
      scan(); // load look ahead initially
    }

    // calls the scanner to read the next look ahead token.
    void scan() { _lookAhead = _scanner.scan(); }

    // consumes a specific token.
    bool match(Token::Tk tk)
    {
      if (_lookAhead.tk != tk) {
        std::cerr << "SYNTAX ERROR! Unexpected token!" << std::endl;
        return false;
      }
      scan();
      return true;
    }

    // the rules:

    std::vector<AST::Expr*> parseProgram()
    {
      // right recursive rule
      // -> can be done as iteration
      std::vector<AST::Expr*> pExprs;
      for (;;) {
        if (AST::Expr *pExpr = parseExpr()) {
          pExprs.push_back(pExpr);
        } else break;
        // special error checking for missing ';' (usual error)
        if (_lookAhead.tk != Token::Semicolon) {
          std::cerr << "SYNTAX ERROR: Semicolon expected!" << std::endl;
          break;
        }
        scan(); // consume semicolon
        if (_lookAhead.tk == Token::EOT) return pExprs;
      }
      // error handling
      for (AST::Expr *pExpr : pExprs) delete pExpr;
      pExprs.clear();
      return pExprs;
    }

    AST::Expr* parseExpr()
    {
      return parseAddExpr();
    }

    AST::Expr* parseAddExpr()
    {
      if (AST::Expr *pExpr1 = parseMulExpr()) {
        return parseAddExprRest(pExpr1);
      } else return nullptr; // ERROR!
    }

    AST::Expr* parseAddExprRest(AST::Expr *pExpr1)
    {
      // right recursive rule for left associative operators
      // -> can be done as iteration
      for (;;) {
        switch (_lookAhead.tk) {
          case Token::Plus:
            scan(); // consume token
            if (AST::Expr *pExpr2 = parseMulExpr()) {
              pExpr1 = new AST::ExprBinOpAdd(pExpr1, pExpr2);
            } else {
              delete pExpr1;
              return nullptr; // ERROR!
            }
            break;
          case Token::Minus:
            scan(); // consume token
            if (AST::Expr *pExpr2 = parseMulExpr()) {
              pExpr1 = new AST::ExprBinOpSub(pExpr1, pExpr2);
            } else {
              delete pExpr1;
              return nullptr; // ERROR!
            }
            break;
          case Token::Error:
            std::cerr << "SYNTAX ERROR: Unexpected character!" << std::endl;
            delete pExpr1;
            return nullptr;
          default: return pExpr1;
        }
      }
    }

    AST::Expr* parseMulExpr()
    {
      if (AST::Expr *pExpr1 = parseUnExpr()) {
        return parseMulExprRest(pExpr1);
      } else return nullptr; // ERROR!
    }

    AST::Expr* parseMulExprRest(AST::Expr *pExpr1)
    {
      // right recursive rule for left associative operators
      // -> can be done as iteration
      for (;;) {
        switch (_lookAhead.tk) {
          case Token::Star:
            scan(); // consume token
            if (AST::Expr *pExpr2 = parseUnExpr()) {
              pExpr1 = new AST::ExprBinOpMul(pExpr1, pExpr2);
            } else {
              delete pExpr1;
              return nullptr; // ERROR!
            }
            break;
          case Token::Slash:
            scan(); // consume token
            if (AST::Expr *pExpr2 = parseUnExpr()) {
              pExpr1 = new AST::ExprBinOpDiv(pExpr1, pExpr2);
            } else {
              delete pExpr1;
              return nullptr; // ERROR!
            }
            break;
          case Token::Error:
            std::cerr << "SYNTAX ERROR: Unexpected character!" << std::endl;
            delete pExpr1;
            return nullptr;
          default: return pExpr1;
        }
      }
    }

    AST::Expr* parseUnExpr()
    {
      // right recursive rule for right associative operators
      // -> must be done as recursion
      switch (_lookAhead.tk) {
        case Token::Plus:
          scan(); // consume token
          // as a unary plus has no effect it is simply ignored
          return parseUnExpr();
        case Token::Minus:
          scan();
          if (AST::Expr *pExpr = parseUnExpr()) {
            return new AST::ExprUnOpNeg(pExpr);
          } else return nullptr; // ERROR!
        default:
          return parsePrimExpr();
      }
    }

    AST::Expr* parsePrimExpr()
    {
      AST::Expr *pExpr = nullptr;
      switch (_lookAhead.tk) {
        case Token::Number:
          pExpr = new AST::ExprConst(_lookAhead.number);
          scan(); // consume token
          break;
        case Token::Id: {
          Var &var = _varTable[_lookAhead.id]; // find or create
          pExpr = new AST::ExprVar(&var);
          scan(); // consume token
        } break;
        case Token::LParen:
          scan(); // consume token
          if (!(pExpr = parseExpr())) return nullptr; // ERROR!
          if (!match(Token::RParen)) {
            delete pExpr; return nullptr; // ERROR!
          }
          break;
        case Token::EOT:
          std::cerr << "SYNTAX ERROR: Premature EOF!" << std::endl;
          break;
        case Token::Error:
          std::cerr << "SYNTAX ERROR: Unexpected character!" << std::endl;
          break;
        default:
          std::cerr << "SYNTAX ERROR: Unexpected token!" << std::endl;
      }
      return pExpr;
    }

  public:

    // the parser function
    static std::vector<AST::Expr*> parse(
      std::istream &in, VarTable &varTable)
    {
      Parser parser(in, varTable);
      return parser.parseProgram();
    }
};

The parser essentially consists of a bunch of rule functions (one per grammar rule) which call each other. The class around the rule functions is responsible for managing some global parser context. Hence, the only public method of class Parser is

static std::vector<AST::Expr*> Parser::parse();

which constructs an instance (with the private constructor) and calls the function Parser::parseProgram() corresponding to the start symbol program.

Internally, the parser calls the Scanner::scan() method to fill its look-ahead token.

This is done in Parser::scan(), which is called whenever a token has to be consumed.

Looking closer, a pattern becomes visible in how the rules have been translated into the parser functions:

  • Each non-terminal on the left side becomes a parse function. (Looking closer at the source code, you will notice that I didn't do this exactly: some of the rules have been "inlined". I had inserted some extra rules to simplify the grammar which I never intended to turn into functions. Sorry.)

  • Alternatives (|) are implemented as switch (_lookAhead.tk). Each case label corresponds to the first terminal(s) (token(s)) to which the leftmost symbol may expand. (I believe that's why it is called a "look-ahead parser": decisions about which rule to apply are always based on the look-ahead token.) The Dragon Book has a section about FIRST and FOLLOW sets which explains this in more detail.

  • For terminal symbols, Parser::scan() is called. In special cases, it is replaced by Parser::match() if precisely one terminal (token) is expected.

  • For non-terminal symbols, the corresponding function is called.

  • Symbol sequences are simply implemented as sequences of the above calls.
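To make the point about the first terminals a bit more concrete, here is a small sketch of the FIRST sets of two rules of the grammar above, written as predicates over the token kind. The enumerators mirror Token::Tk; the function names are hypothetical and do not appear in the parser.

```cpp
// token kinds as in the Token struct above (repeated to stay self-contained)
enum Tk {
  Plus, Minus, Star, Slash, LParen, RParen, Semicolon,
  Number, Id,
  EOT, Error
};

// FIRST(primExpr) = { Number, Id, LParen }
// (one entry per alternative of the rule primExpr)
bool inFirstOfPrimExpr(Tk tk)
{
  switch (tk) {
    case Number: case Id: case LParen: return true;
    default: return false;
  }
}

// FIRST(unaryExpr) = { Plus, Minus } united with FIRST(primExpr)
// (unOp expands to Plus or Minus; the other alternative is primExpr)
bool inFirstOfUnaryExpr(Tk tk)
{
  return tk == Plus || tk == Minus || inFirstOfPrimExpr(tk);
}
```

As mulExpr, addExpr, and expr all start with a unaryExpr, they share its FIRST set. The case labels in parseUnExpr() and parsePrimExpr() correspond to exactly these tokens; everything else ends up in an error branch.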

The error handling of this parser is the simplest I have ever done. It could and should be much more supportive (at the cost of more effort, i.e. additional lines of code), but here I tried to keep it minimal.
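As one possible direction for friendlier diagnostics, the sketch below turns a character offset into a line and column and puts both into the error message. Everything in it (SourcePos, locate, formatError) is hypothetical and not part of the parser above; where the offset comes from is left open (for a stream-based scanner, std::istream::tellg() would be one candidate).

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// hypothetical position type: 1-based line and column
struct SourcePos {
  int line;
  int column;
};

// computes the position of the character at the given offset
SourcePos locate(const std::string &text, std::size_t offset)
{
  SourcePos pos = { 1, 1 };
  for (std::size_t i = 0; i < offset && i < text.size(); ++i) {
    if (text[i] == '\n') { ++pos.line; pos.column = 1; }
    else ++pos.column;
  }
  return pos;
}

// builds an error message which tells where the problem was found
std::string formatError(
  const std::string &text, std::size_t offset, const std::string &message)
{
  const SourcePos pos = locate(text, offset);
  std::ostringstream out;
  out << "SYNTAX ERROR at line " << pos.line
    << ", column " << pos.column << ": " << message;
  return out.str();
}
```

For the text "a + b;\na $ b;\n" and offset 9 (the '$'), formatError reports line 2, column 3.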

For testing and demonstration, I prepared a main() function with some built-in samples (the source code of the program and the data to process):

// a sample application

using namespace std;

int main()
{
  // the program:
  const char *sourceCode =
    "1 + 2 * 3 / 4 - 5;\n"
    "a + b;\n"
    "a - b;\n"
    "a * b;\n"
    "a / b;\n"
    "a * (b + 3);\n";
  // the variables
  const char *vars[] = { "a", "b" };
  enum { nVars = sizeof vars / sizeof *vars };
  // the data
  const double data[][nVars] = {
    { 4.0, 2.0 },
    { 2.0, 4.0 },
    { 10.0, 5.0 },
    { 42, 6 * 7 }
  };
  // compile program
  stringstream in(sourceCode);
  VarTable varTable;
  vector<AST::Expr*> program = Parser::parse(in, varTable);
  if (program.empty()) {
    cerr << "ERROR: Compile failed!" << endl;
    string line;
    if (getline(in, line)) {
      cerr << "Text at error: '" << line << "'" << endl;
    }
    return 1;
  }
  // apply program to the data
  enum { nDataSets = sizeof data / sizeof *data };
  for (size_t i = 0; i < nDataSets; ++i) {
    const char *sep = "";
    cout << "Data Set:" << endl;
    for (size_t j = 0; j < nVars; ++j, sep = ", ") {
      cout << sep << vars[j] << ": " << data[i][j];
    }
    cout << endl;
    // load data
    for (size_t j = 0; j < nVars; ++j) varTable[vars[j]].set(data[i][j]);
    // perform program
    cout << "Compute:" << endl;
    istringstream in(sourceCode);
    for (const AST::Expr *pExpr : program) {
      string line; getline(in, line);
      cout << line << ": " << pExpr->solve() << endl;
    }
    cout << endl;
  }
  // clear the program
  for (AST::Expr *pExpr : program) delete pExpr;
  program.clear();
  // done
  return 0;  
}

I compiled and tested with VS2013 (Windows 10) and got:

Data Set:
a: 4, b: 2
Compute:
1 + 2 * 3 / 4 - 5;: -2.5
a + b;: 6
a - b;: 2
a * b;: 8
a / b;: 2
a * (b + 3);: 20

Data Set:
a: 2, b: 4
Compute:
1 + 2 * 3 / 4 - 5;: -2.5
a + b;: 6
a - b;: -2
a * b;: 8
a / b;: 0.5
a * (b + 3);: 14

Data Set:
a: 10, b: 5
Compute:
1 + 2 * 3 / 4 - 5;: -2.5
a + b;: 15
a - b;: 5
a * b;: 50
a / b;: 2
a * (b + 3);: 80

Data Set:
a: 42, b: 42
Compute:
1 + 2 * 3 / 4 - 5;: -2.5
a + b;: 84
a - b;: 0
a * b;: 1764
a / b;: 1
a * (b + 3);: 1890

Please note that the parser itself ignores any spaces and line breaks. However, to keep the formatting of the sample output simple, I have to use one semicolon-terminated expression per line. Otherwise, it would be difficult to associate the source code lines with the corresponding compiled expressions. (Remember my note above about the Token, to which a source code reference (aka file position) might be added.)

The source code and a test run can be found on ideone.
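Coming back to the scenario of the question (a text file with two double numbers per line, one result per line in the output), the remaining glue could look like the sketch below. The function processLines is hypothetical, and the compiled expression is represented by a std::function stand-in so that the sketch stays self-contained.

```cpp
#include <cstddef>
#include <functional>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// reads lines of the form "a b" from in, writes eval(a, b) to out
// (one result per line), and returns the number of processed lines
std::size_t processLines(
  std::istream &in, std::ostream &out,
  const std::function<double(double, double)> &eval)
{
  std::size_t nLines = 0;
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream values(line);
    double a = 0.0, b = 0.0;
    if (!(values >> a >> b)) continue; // skip empty or malformed lines
    out << eval(a, b) << '\n';
    ++nLines;
  }
  return nLines;
}
```

For real files, in and out would be a std::ifstream and a std::ofstream. With the classes of this answer, the stand-in would set varTable["a"] and varTable["b"] and then return pExpr->solve() for the expression parsed once up front.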
