What's the difference between $/ and $¢ in regex?

Question

As the title indicates, what is the difference between $/ and $¢? They appear to always have the same value:

my $text = "Hello world";

$text ~~ /(\w+) { say $/.raku } (\w+)/;
$text ~~ /(\w+) { say $¢.raku } (\w+)/;

Both result in Match objects with the same values. What's the logic in using one over the other?

Recommended answer

The variable $/ refers to the most recent match, while the variable $¢ refers to the most recent outermost match. In most basic regexes like the ones above, the two may be one and the same. But as can be seen from the output of the .raku method, Match objects can contain other Match objects (that's what you get when you use $<foo> or $1 for captures).

Suppose instead we had the following regex with a quantified capture:

/ ab (cd { say $¢.from, " ", $¢.to } ) + /

If we ran it against "abcdcdcd", we would see the following output:

0 2
0 4
0 6

But if we change from using $¢ to $/, we get a different result:

2 2
4 4
6 6

(The reason .to seems to be a bit off is that it, like .pos, is not updated until the end of the capture block.)

In other words, $¢ will always refer to what will be your final match object (i.e., $final = $text ~~ $regex), so you can traverse a complex capture tree inside the regex exactly as you would after finishing the full match. So in the above example, you could just do $¢[0] to refer to the first match, $¢[1] for the second, and so on.

Inside a regex code block, $/ will refer to the most immediate match. In the above case, that's the match for the inside of the ( ), which won't know about the other matches, nor about the original start of the matching: just the start of the ( ) block. So given a more complex regex:

/ a $<foo>=(b $<bar>=(c)+ )+ d /

At any point, we can use $¢ to access all of the foo tokens by saying $¢<foo>. We can access the bar tokens of a given foo by using $¢<foo>[0]<bar>. If we insert a code block inside foo's capture, it will be able to access the bar tokens by using $<bar> or $/<bar>, but it won't be able to access the other foos.
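
For comparison, the same capture tree can be walked after the match has finished, using the Match object the regex returns. This is the shape the answer says $¢ exposes while matching is still in progress. A minimal sketch, assuming the sample input "abccbcd" and the variable name $match (neither comes from the original answer):

```raku
# Hypothetical input, chosen so that foo is captured twice ("bcc" and "bc").
my $match = "abccbcd" ~~ / a $<foo>=(b $<bar>=(c)+ )+ d /;

say $match<foo>.elems;          # how many foo captures there are
say $match<foo>[0].Str;         # the text of the first foo
say $match<foo>[0]<bar>.elems;  # how many bar captures sit inside the first foo
say $match<foo>[1]<bar>.elems;  # how many bar captures sit inside the second foo
```

Because foo and bar are both quantified, each is stored as a list of Match objects, which is why they are indexed with [0] and [1] before drilling further down.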
