How to use Regexp.union to match a character at the beginning of my string

Question

I'm using Ruby 2.4. I want to match an optional "a" or "b" character, followed by an arbitrary amount of white space, and then one or more digits, but my regexes fail to match any of these:

2.4.0 :017 > MY_TOKENS = ["a", "b"]
 => ["a", "b"]
2.4.0 :018 > str = "40"
 => "40"
2.4.0 :019 > str =~ Regexp.new("^[#{Regexp.union(MY_TOKENS)}]?[[:space:]]*\d+[^a-z^0-9]*$")
 => nil
2.4.0 :020 > str =~ Regexp.new("^#{Regexp.union(MY_TOKENS)}?[[:space:]]*\d+[^a-z^0-9]*$")
 => nil
2.4.0 :021 > str =~ Regexp.new("^#{Regexp.union(MY_TOKENS)}?[[:space:]]*\d+$")
 => nil

I'm stumped as to what I'm doing wrong.

Recommended answer

I believe you want to match a string that may start with any of the alternatives defined in MY_TOKENS, followed by zero or more whitespace characters and then one or more digits up to the end of the string.

In that case, use either of the following:

Regexp.new("\\A#{Regexp.union(MY_TOKENS)}?[[:space:]]*\\d+\\z").match?(s)

/\A#{Regexp.union(MY_TOKENS)}?[[:space:]]*\d+\z/.match?(s)
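As a quick check of the pattern above, here is a minimal sketch that runs it against a handful of inputs. The sample strings and the constant names TOKEN_NUMBER, SAMPLES and RESULTS are made up for illustration and are not part of the original answer.

```ruby
MY_TOKENS = ["a", "b"].freeze

# Interpolating a Regexp calls #to_s on it, which wraps the alternation in
# a group: (?-mix:a|b). The trailing ? therefore applies to the whole group.
TOKEN_NUMBER = /\A#{Regexp.union(MY_TOKENS)}?[[:space:]]*\d+\z/

# Hypothetical sample inputs: the first three should match, the rest should not.
SAMPLES = ["40", "a40", "b  40", "c40", "40x", "ab40"].freeze

RESULTS = SAMPLES.map { |s| [s, TOKEN_NUMBER.match?(s)] }.to_h

RESULTS.each { |s, matched| puts "#{s.inspect} => #{matched}" }
```

Only one token is allowed before the digits, which is why "ab40" is rejected.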

When you use Regexp.new with a double-quoted string, remember to double the backslashes so that the pattern receives a literal backslash (e.g. "\\d" yields the digit-matching pattern \d, whereas "\d" is just the letter d). In regex literal notation, a single backslash is enough (/\d/).
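The escaping issue is easy to see by inspecting the source of each regex. The sketch below is only an illustration: the constant names are invented here, and the single-quoted variant is an extra option that the answer itself does not mention.

```ruby
# In a double-quoted string, "\d" is not a recognised escape sequence, so
# Ruby drops the backslash and the regex engine only sees a literal "d".
UNESCAPED = Regexp.new("^\d+$")

# Doubling the backslash keeps it in the string, so the engine sees \d.
ESCAPED = Regexp.new("^\\d+$")

# Single-quoted strings leave \d untouched, so no doubling is needed there.
SINGLE_QUOTED = Regexp.new('^\d+$')

puts UNESCAPED.source      # the backslash is gone
puts ESCAPED.source        # the backslash survives
puts UNESCAPED.match?("40")
puts ESCAPED.match?("40")
```

This is why every attempt in the question returned nil for "40": each pattern was looking for the letter d rather than a digit.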

Do not forget to anchor the pattern with \A (start of string) and \z (end of string). In Ruby, ^ and $ match at the start and end of each line, not of the whole string.
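The difference between the two kinds of anchors shows up as soon as the input contains a line break. A small sketch, with an invented multi-line input and invented constant names:

```ruby
# Hypothetical input: a valid-looking line buried between two other lines.
MULTILINE_INPUT = "junk\n40\nmore junk"

LINE_ANCHORED   = /^\d+$/     # ^ and $ match at every line boundary
STRING_ANCHORED = /\A\d+\z/   # \A and \z match only at the string boundaries

puts LINE_ANCHORED.match?(MULTILINE_INPUT)    # the middle line satisfies it
puts STRING_ANCHORED.match?(MULTILINE_INPUT)  # the string as a whole does not
puts STRING_ANCHORED.match?("40")
```

When the goal is to validate an entire string, the \A and \z pair is the safer choice.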

Note that [...] creates a character class that matches any single character listed inside it: [ab] matches an a or a b, and [program] matches one character, either p, r, o, g, a, or m. If MY_TOKENS contains multicharacter sequences, you need to remove [...] from the pattern.
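To illustrate what goes wrong when the union is placed inside brackets, the sketch below compares the two forms. The names UNION, IN_CLASS and AS_GROUP and the test inputs are assumptions made for this example.

```ruby
MY_TOKENS = ["a", "b"].freeze
UNION = Regexp.union(MY_TOKENS)

# Inside [...], the interpolated text "(?-mix:a|b)" is read character by
# character, so punctuation and flag letters become members of the class.
IN_CLASS = Regexp.new("\\A[#{UNION}]?\\d+\\z")

# Without the brackets, the interpolated group acts as a real alternation.
AS_GROUP = Regexp.new("\\A#{UNION}?\\d+\\z")

puts UNION.to_s               # the text that actually gets interpolated
puts IN_CLASS.match?("x40")   # "x" is accepted because it is a flag letter
puts AS_GROUP.match?("x40")
puts AS_GROUP.match?("a40")
```

The bracketed form accepts characters that were never listed in MY_TOKENS, and it can only ever consume a single character.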

To make the regex case insensitive, pass a case-insensitive modifier to the pattern, and interpolate the .source of the regex created by Regexp.union rather than the regex itself, so that its flags are dropped (thanks, Eric). Because .source returns the bare alternation a|b, wrap it in a non-capturing group so that the ? applies to the whole alternation:

Regexp.new("(?i)\\A(?:#{Regexp.union(MY_TOKENS).source})?[[:space:]]*\\d+\\z")

/\A(?:#{Regexp.union(MY_TOKENS).source})?[[:space:]]*\d+\z/i

The regex created is equivalent to (?i-mx:\A(?:a|b)?[[:space:]]*\d+\z), where (?i-mx) means that case-insensitive mode is on and that multiline (dot matches line breaks) and verbose modes are off.
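The following sketch compares three ways of interpolating the union into a case-insensitive pattern. It shows why .source is preferred over the default to_s interpolation, and why the bare source is best wrapped in a group. The constant names and test inputs are invented for this illustration.

```ruby
MY_TOKENS = ["a", "b"].freeze
UNION = Regexp.union(MY_TOKENS)

# to_s pins the flags: (?-mix:a|b) switches case-insensitivity off inside
# the group, so the outer /i never reaches the tokens.
WITH_TO_S = /\A#{UNION}?[[:space:]]*\d+\z/i

# source is the bare "a|b"; the (?:...) keeps the alternation contained.
WITH_SOURCE = /\A(?:#{UNION.source})?[[:space:]]*\d+\z/i

# Without the group, the pattern splits into "\Aa" OR "b?[[:space:]]*\d+\z".
UNGROUPED = /\A#{UNION.source}?[[:space:]]*\d+\z/i

puts WITH_TO_S.match?("A 40")
puts WITH_SOURCE.match?("A 40")
puts UNGROUPED.match?("apple")    # accepted through the "\Aa" branch
puts WITH_SOURCE.match?("apple")
```

Only the grouped .source form both honours the /i flag and rejects inputs such as "apple".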
