How do the anchors \z and \G work in Ruby?

Question

I am using Ruby 1.9.3 and am new to the platform.

From the docs I have just learned about two anchors, \z and \G. I experimented a little with \z to see how it works, because its definition ("End" or "End of String") confused me: I can't tell what "End" refers to. So I tried the short snippets below, but I still don't get it.

Code

irb(main):011:0> str = "Hit him on the head me 2\n" + "Hit him on the head wit>
=> "Hit him on the head me 2\nHit him on the head with a 24\n"
irb(main):012:0> str =~ /\d\z/
=> nil

irb(main):013:0> str = "Hit him on the head me 24 2\n" + "Hit him on the head >
=> "Hit him on the head me 24 2\nHit him on the head with a 24\n"
irb(main):014:0> str =~ /\d\z/
=> nil

irb(main):018:0> str = "Hit1 him on the head me 24 2\n" + "Hit him on the head>
=> "Hit1 him on the head me 24 2\nHit him on the head with a11 11 24\n"
irb(main):019:0> str =~ /\d\z/
=> nil
irb(main):020:0>

Every time I get nil as the output. So how is \z evaluated? What does "End" mean? I think I have misunderstood the word "End" in the docs. Could anyone help me understand why I get this output?

I also couldn't find any example of the \G anchor. Could anyone give an example that shows how \G is used in real-world programming?

Edit

irb(main):029:0>
irb(main):030:0*  ("{123}{45}{6789}").scan(/\G(?!^)\{\d+\}/)
=> []
irb(main):031:0>  ('{123}{45}{6789}').scan(/\G(?!^)\{\d+\}/)
=> []
irb(main):032:0>

Thanks

Recommended answer

\z matches only at the very end of the input. You are trying to find a match where the 4 occurs at the end of the input. The problem is that your input ends with a newline, so the last character is "\n" rather than a digit, and there is no match. \Z matches either at the end of the input or just before a newline that ends the input.

So:

/\d\z/

matches the "4" in:

"24"

and:

/\d\Z/

matches the "4" in the example above and also in:

"24\n"
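
To make the difference concrete, here is a minimal sketch in Ruby. The two helper names are hypothetical (they do not come from the original post); each one just wraps String#=~ around one of the patterns above.

```ruby
# Hypothetical helpers for illustration; the names are not from the original post.

# Index of a digit sitting at the absolute end of the string, or nil.
def digit_at_strict_end(str)
  str =~ /\d\z/
end

# Index of a digit at the end of the string, ignoring one trailing newline.
def digit_before_optional_newline(str)
  str =~ /\d\Z/
end

p digit_at_strict_end("24")                  # => 1
p digit_at_strict_end("24\n")                # => nil (last character is "\n")
p digit_before_optional_newline("24\n")      # => 1
p digit_at_strict_end("24\n".chomp)          # => 1 (newline removed first)
```

If the trailing newline is just an artifact of how the string was built, calling chomp before matching with \z is one way to get the result the question expected.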

For an example of using \G, check out this question:
Examples of regex matcher \G (The end of the previous match) in Java would be nice
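
As a small Ruby illustration of the same idea, the sketch below uses \G with String#scan so that every match has to start exactly where the previous one ended. The helper name is hypothetical. The last line repeats the pattern from the question's edit to show why it returned an empty array.

```ruby
# Hypothetical helper for illustration: collect "{digits}" groups that form an
# unbroken run from the start of the string.
def leading_brace_groups(str)
  # \G anchors each attempt to where the previous match ended
  # (or to the start of the string for the first attempt).
  str.scan(/\G\{\d+\}/)
end

p leading_brace_groups("{123}{45}{6789}")      # => ["{123}", "{45}", "{6789}"]
p leading_brace_groups("{123}{45} x {6789}")   # => ["{123}", "{45}"]

# The pattern from the question's edit: (?!^) rejects position 0, which is the
# only place \G can match on the first attempt, so nothing is ever matched.
p "{123}{45}{6789}".scan(/\G(?!^)\{\d+\}/)     # => []
```

In the second call the run is broken by " x ", so scanning stops there even though another "{6789}" group appears later in the string.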

Update: \G

I came up with a more real-world example. Say you have a list of words separated by arbitrary characters that cannot be predicted well (or there are too many possibilities to list). You'd like to match these words, each word being its own match, up until a particular word, after which you don't want to match any more words. For example:

foo,bar.baz:buz'fuzz*hoo-har/haz|fil^bil!bak

You want to match each word until 'har'. You don't want to match 'har' or any of the words that follow. You can do this relatively easily using the following pattern:

/(?<=^|\G\W)\w+\b(?<!har)/

The first attempt matches at the beginning of the input (the ^ alternative inside the lookbehind), then three word characters ('foo') followed by a word boundary. Finally, a negative lookbehind ensures that the text just matched does not end in 'har', so the word 'har' itself is rejected.

On the second attempt, matching picks up at the end of the previous match. One non-word character is matched (',', though it is not part of the match because the lookbehind is a zero-width assertion), followed by three word characters ('bar').

This continues until 'har' is reached, at which point the negative lookbehind fails and so does the match. Because every match must be "attached" to the end of the previous successful match, no further words are matched.

The result is:

foo
bar
baz
buz
fuzz
hoo
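
As a cross-check of that list, here is a sketch that reaches the same result without \G at all. It is a different technique from the pattern above (plain \w+ scanning plus Enumerable#take_while), and the helper name is hypothetical. It assumes, as in this input, that the words are simply runs of \w characters.

```ruby
# Hypothetical helper for illustration; it does NOT use the \G pattern above.
# Splits the input into words and keeps them until the stop word appears.
def words_before(str, stop_word)
  str.scan(/\w+/).take_while { |word| word != stop_word }
end

input = "foo,bar.baz:buz'fuzz*hoo-har/haz|fil^bil!bak"

p words_before(input, "har")
# => ["foo", "bar", "baz", "buz", "fuzz", "hoo"]
```

The regex version does the whole job in a single pattern, which matters when only a regex can be supplied. When ordinary Ruby code is available, this form may be easier to read.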

If you want to reverse it and get all the words after 'har' (but, again, not including 'har'), you can use an expression like this:

/(?!^)(?<=har\W|\G\W)\w+\b/

This matches a word that is immediately preceded either by 'har' plus one non-word character, or by the end of the previous match plus one non-word character (the (?!^) guard is there to make sure we do not match at the beginning of the input). The list of matches is:

haz
fil
bil
bak

If you do want to match 'har' and all the words that follow, you could use this:

/\bhar\b|(?!^)(?<=\G\W)\w+\b/

This produces the following matches:

har
haz
fil
bil
bak
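
The two "after 'har'" lists can be cross-checked the same way, again without \G. This sketch swaps in Enumerable#drop_while for the regex approach. Both helper names are hypothetical, and it assumes the words are plain runs of \w characters, as they are in this input.

```ruby
# Hypothetical helpers for illustration; they do NOT use the \G patterns above.

# The stop word and every word after it.
def words_from(str, stop_word)
  str.scan(/\w+/).drop_while { |word| word != stop_word }
end

# Only the words after the stop word.
def words_after(str, stop_word)
  words_from(str, stop_word).drop(1)
end

input = "foo,bar.baz:buz'fuzz*hoo-har/haz|fil^bil!bak"

p words_after(input, "har")   # => ["haz", "fil", "bil", "bak"]
p words_from(input, "har")    # => ["har", "haz", "fil", "bil", "bak"]
```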
