Treetop basic parsing and regular expression usage

Question

I'm developing a script using the Ruby Treetop library and am having issues with its syntax for regexes. First off, many regular expressions that work in other settings don't work the same way in Treetop.

This is my grammar: (myline.treetop)


grammar MyLine
    rule line
        string whitespace condition
    end
    rule string
        [\S]*
    end
    rule whitespace
        [\s]*
    end
    rule condition
        "new" / "old" / "used"
    end
end

This is my usage: (usage.rb)


require 'rubygems'
require 'treetop'
require 'polyglot'
require 'myline'

parser = MyLineParser.new
p parser.parse("randomstring new")

This should find the word "new" for sure, and it does! Now I want to extend it so that it can find "new" if the input string becomes "randomstring anotherstring new yetanother andanother", possibly with any number of strings followed by whitespace (tabs included) before and after the match for rule condition. In other words, if I pass it any sentence containing the word "new" (or "old" or "used"), it should be able to match it.

So let's say I change my grammar to:


rule line
    string whitespace condition whitespace string
end

Then, it should be able to find a match for:

p parser.parse("randomstring new anotherstring")

So, what do I have to do to allow the string whitespace pair to be repeated before and after condition? If I try to write this:


rule line
    (string whitespace)* condition (whitespace string)*
end

then it goes into an infinite loop. If I replace the above () with [], it returns nil. In general, ordinary regexes return a match when I use the above, but Treetop regexes don't. Does anyone have any tips or pointers on how to go about this? Also, since there isn't much documentation for Treetop and the examples are either too trivial or too complex, does anyone know of a more thorough documentation or guide for Treetop?

Recommended answer

It looks like you don't even need a grammar to do what you're asking. A simple regex is sufficient in this case:

line.match(/(.*)\s(new|old|used)\s(.*)/)

(Example: http://rubular.com/r/Kl8rUifxeu )
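
As a quick check of what that pattern captures, here is a minimal sketch run against the sample sentence from the question. The constant name CONDITION_PATTERN and the variable names are made up for illustration and are not part of the answer. Note that the pattern requires whitespace on both sides of the condition, so the question's original two-word input "randomstring new" does not match it.

```ruby
# Hypothetical constant name; the pattern itself is the one from the answer.
CONDITION_PATTERN = /(.*)\s(new|old|used)\s(.*)/

line = "randomstring anotherstring new yetanother andanother"
match = line.match(CONDITION_PATTERN)

before    = match[1] # text before the condition
condition = match[2] # one of "new", "old" or "used"
after     = match[3] # text after the condition

puts before
puts condition
puts after

# There is no whitespace after "new" here, so match returns nil.
no_match = "randomstring new".match(CONDITION_PATTERN)
p no_match
```

If inputs that end with the condition word also need to match, the pattern would have to be relaxed; that is a separate design decision and is not covered by the answer.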

You can get an array containing the stuff before and after the condition with:

Regexp.last_match(1).split + Regexp.last_match(3).split
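
A small sketch of building that array, assuming the sample sentence from the question as input. Both captures are strings, so each one is split into words before the two arrays are concatenated; the variable names are illustrative only.

```ruby
line = "randomstring anotherstring new yetanother andanother"

# String#match sets the last-match data that Regexp.last_match reads.
line.match(/(.*)\s(new|old|used)\s(.*)/)

# Split each capture on whitespace, then join the two word lists.
words = Regexp.last_match(1).split + Regexp.last_match(3).split
p words
```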

And test the condition with:

return "Sweet, it's new!" if Regexp.last_match(2) == "new"
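
Since return only makes sense inside a method, the pieces above could be wrapped up roughly as follows. This is a sketch under assumptions: the method name check_condition and the fallback message are hypothetical and do not come from the answer.

```ruby
# Hypothetical helper wrapping the answer's regex and condition test.
def check_condition(line)
  # Bail out early when the line contains no condition word.
  return nil unless line.match(/(.*)\s(new|old|used)\s(.*)/)
  return "Sweet, it's new!" if Regexp.last_match(2) == "new"

  # Illustrative fallback for "old" and "used".
  "Condition is #{Regexp.last_match(2)}"
end

puts check_condition("randomstring new anotherstring")
```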
