How to embed regular expressions in other regular expressions in Ruby


Question

I have a string:

'A Foo'

and want to find "Foo" in it.

I have a regular expression:

/foo/

that I'm embedding into another, case-insensitive regular expression, so I can build the pattern in steps:

foo_regex = /foo/
pattern = /A #{ foo_regex }/i

But it doesn't match correctly:

'A Foo' =~ pattern # => nil

If I embed the text directly into the pattern, it works:

'A Foo' =~ /A foo/i # => 0

What's wrong?

Recommended answer

On the surface it seems that embedding one pattern inside another would simply work, but that rests on a bad assumption about how patterns work in Ruby: that they're simply strings. Using:

foo_regex = /foo/

creates a Regexp object:

/foo/.class # => Regexp

As such, it has knowledge of the optional flags used to create it:

( /foo/    ).options # => 0
( /foo/i   ).options # => 1
( /foo/x   ).options # => 2
( /foo/ix  ).options # => 3
( /foo/m   ).options # => 4
( /foo/im  ).options # => 5
( /foo/mx  ).options # => 6
( /foo/imx ).options # => 7

or, if you like binary:

'%04b' % ( /foo/    ).options # => "0000"
'%04b' % ( /foo/i   ).options # => "0001"
'%04b' % ( /foo/x   ).options # => "0010"
'%04b' % ( /foo/xi  ).options # => "0011"
'%04b' % ( /foo/m   ).options # => "0100"
'%04b' % ( /foo/mi  ).options # => "0101"
'%04b' % ( /foo/mx  ).options # => "0110"
'%04b' % ( /foo/mxi ).options # => "0111"
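Those numbers are bit flags, which is why the combinations simply add up. As a minimal sketch, the flags can be decoded with the Regexp::IGNORECASE, Regexp::EXTENDED and Regexp::MULTILINE constants. The FLAG_BITS table and the flag_names helper are made-up names for illustration, not part of Ruby:

```ruby
# Hypothetical lookup table: flag letter => bit value from Regexp's constants.
FLAG_BITS = {
  i: Regexp::IGNORECASE, # 1
  x: Regexp::EXTENDED,   # 2
  m: Regexp::MULTILINE   # 4
}.freeze

# Hypothetical helper: return the flag letters that are set on a Regexp.
def flag_names(regex)
  FLAG_BITS.select { |_letter, bit| (regex.options & bit) != 0 }.keys
end

flag_names(/foo/)   # => []
flag_names(/foo/i)  # => [:i]
flag_names(/foo/mx) # => [:x, :m]
```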

and it remembers those whenever the Regexp is used, whether as a standalone pattern or embedded in another.

You can see this in action if we look at what the pattern looks like after embedding:

/#{ /foo/  }/ # => /(?-mix:foo)/
/#{ /foo/i }/ # => /(?i-mx:foo)/

?-mix: and ?i-mx: are how those options are represented in an embedded pattern: flags before the hyphen are turned on, flags after it are turned off.
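String and Regexp interpolation call to_s on the embedded object, and Regexp#to_s is what produces that flag-carrying group. A short sketch comparing it with source; the variable names plain and loose are only for illustration:

```ruby
plain = /foo/
loose = /foo/i

# to_s wraps the pattern in a group that pins the flags.
plain.to_s   # => "(?-mix:foo)"
loose.to_s   # => "(?i-mx:foo)"

# source returns only the pattern text, without flags.
plain.source # => "foo"
loose.source # => "foo"

# Interpolation uses to_s, so the flags come along.
"#{ loose }" == loose.to_s # => true
```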

According to the Regexp documentation for Options:

i, m, and x can also be applied on the subexpression level with the (?on-off) construct, which enables options on, and disables options off for the expression enclosed by the parentheses.
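As a small illustration of that construct written by hand, the following sketch makes only the "foo" part case-insensitive while the leading "A " stays case-sensitive:

```ruby
# Only the group (?i:foo) ignores case; "A " must match exactly.
pattern = /A (?i:foo)/

'A Foo' =~ pattern # => 0
'A FOO' =~ pattern # => 0
'a Foo' =~ pattern # => nil
```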

So the Regexp remembers those options even inside the outer pattern, and the embedded group's own flags override the outer /i for that group, causing the overall pattern to fail the match:

pattern = /A #{ foo_regex }/i # => /A (?-mix:foo)/i
'A Foo' =~ pattern # => nil
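To see that only the embedded part stays case-sensitive, here is a quick check against a few inputs. The test strings are chosen for illustration:

```ruby
foo_regex = /foo/
pattern = /A #{ foo_regex }/i # the outer /i applies to "A " only

'a foo' =~ pattern # => 0   (outer part ignores case, "foo" matches exactly)
'A Foo' =~ pattern # => nil (embedded group has i switched off)
'A FOO' =~ pattern # => nil
```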

It's possible to make sure every sub-expression carries the same flags as its surrounding pattern, but that can quickly become convoluted or messy:

foo_regex = /foo/i
pattern = /A #{ foo_regex }/i # => /A (?i-mx:foo)/i
'A Foo' =~ pattern # => 0

Instead, we have the source method, which returns the text of a pattern without its flags:

/#{ /foo/.source  }/ # => /foo/
/#{ /foo/i.source }/ # => /foo/

The problem with the embedded pattern remembering its options also appears when using other Regexp methods, such as union:

/#{ Regexp.union(%w[a b]) }/ # => /(?-mix:a|b)/

and again, source can help:

/#{ Regexp.union(%w[a b]).source }/ # => /a|b/
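One caveat worth sketching: when union is given Regexp objects rather than strings, each alternative keeps its own flag group inside the result, so calling source on the union does not strip them. Joining the individual sources is one possible workaround. The names mixed, outer and flat are illustrative only:

```ruby
mixed = Regexp.union(/dogs/, /cats/i)
mixed.source    # => "(?-mix:dogs)|(?i-mx:cats)"
'CATS' =~ mixed # => 0
'DOGS' =~ mixed # => nil

# The inner groups survive source, so the outer /i still cannot reach "dogs".
outer = /#{ mixed.source }/i
'DOGS' =~ outer # => nil

# Joining the plain sources gives one pattern governed by the outer flags.
flat = /#{ [/dogs/, /cats/i].map(&:source).join('|') }/i
'DOGS' =~ flat  # => 0
```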

Knowing all that:

foo_regex = /foo/
pattern = /#{ foo_regex.source }/i # => /foo/i
'A Foo' =~ pattern # => 2
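For building patterns in steps, the same idea can be wrapped in a small helper. This is only a sketch under stated assumptions: build_pattern is a hypothetical name, Regexp parts contribute their source, and any other part is treated as literal text and escaped with Regexp.escape:

```ruby
# Hypothetical helper: join parts into one Regexp governed by a single set of options.
def build_pattern(*parts, options: 0)
  text = parts.map do |part|
    # Regexp parts lose their own flags; other parts are matched literally.
    part.is_a?(Regexp) ? part.source : Regexp.escape(part.to_s)
  end.join
  Regexp.new(text, options)
end

pattern = build_pattern('A ', /foo/, options: Regexp::IGNORECASE)

'A Foo' =~ pattern # => 0
'a FOO' =~ pattern # => 0
```

Dropping each part's flags is a deliberate choice here: the caller decides the flags once, for the whole pattern, instead of tracking them per fragment.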
