Capturing groups don't work as expected with Ruby scan method

Problem description

I need to get an array of floats (both positive and negative) from a multiline string, e.g. -45.124, 1124.325.

This is what I tried:

text.scan(/(\+|\-)?\d+(\.\d+)?/)

Although it works fine on regex101 (capturing group 0 matches everything I need), it doesn't work in Ruby code.

Any ideas why this is happening and how I can fix it?

Recommended answer

See the scan documentation:

If the pattern contains no groups, each individual result consists of the matched string, $&. If the pattern contains groups, each individual result is itself an array containing one entry per group.
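
The behaviour described in the documentation can be seen with a small sketch. It reuses the sample string from the answer below; the first pattern has no groups and is written here only for comparison, and the variable names are illustrative.

```ruby
text = " -45.124, 1124.325"

# No groups in the pattern: each result is the matched string itself.
without_groups = text.scan(/[+-]?\d+\.\d+/)
p without_groups  # => ["-45.124", "1124.325"]

# Two capturing groups (the pattern from the question): each result is an
# array with one entry per group. The overall match, which regex101 shows
# as group 0, is not part of the result.
with_groups = text.scan(/(\+|\-)?\d+(\.\d+)?/)
p with_groups     # => [["-", ".124"], [nil, ".325"]]
```

An optional group that did not take part in a match shows up as nil, which is why the second result starts with nil.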

You should remove capturing groups (if they are redundant), make them non-capturing (if you only need to group a sequence of patterns so that it can be quantified), or use extra code or an extra group when a capturing group cannot be avoided.

  1. In this scenario, the capturing group is only used to quantify a pattern sequence, so all you need to do is convert it into a non-capturing group by replacing each unescaped ( with (?:. The optional sign group (\+|\-) can be written as the character class [+-], which leaves only one group to convert:

text = " -45.124, 1124.325"
puts text.scan(/[+-]?\d+(?:\.\d+)?/)

Demo output:

-45.124
1124.325
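
Since the question asks for an array of floats rather than strings, the matches still need converting. A minimal sketch, assuming a made-up multiline input (the x/y/z labels are hypothetical) and using String#to_f for the conversion:

```ruby
# Hypothetical multiline input; the labels are made up for illustration.
text = <<~DATA
  x: -45.124
  y: 1124.325
  z: +7
DATA

# scan returns the matched strings; to_f turns each one into a Float.
floats = text.scan(/[+-]?\d+(?:\.\d+)?/).map(&:to_f)
p floats  # => [-45.124, 1124.325, 7.0]
```

Integers such as +7 are matched as well because the fractional part is optional.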

If you also need to match floats like .04, you can use [+-]?\d*\.?\d+. See another demo.
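
To illustrate the difference between the two patterns, here is a short sketch on a made-up input string that contains a float without a leading digit; the variable names are illustrative only.

```ruby
text = "0.5, .04, -3, +2.75"

# The first pattern requires a digit before the dot, so ".04" loses its dot.
strict = text.scan(/[+-]?\d+(?:\.\d+)?/)
p strict   # => ["0.5", "04", "-3", "+2.75"]

# The alternative pattern makes the leading digits optional.
lenient = text.scan(/[+-]?\d*\.?\d+/)
p lenient  # => ["0.5", ".04", "-3", "+2.75"]
```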

  2. There are cases when you cannot get rid of a capturing group, e.g. when the regex contains a backreference to it. In that case, you may a) declare a variable to store the matches and collect them inside a scan block, b) enclose the whole pattern in another capturing group and map the results to the first item of each match, or c) call gsub with just the regex as its single argument, which returns an Enumerator, and use .to_a to get the array of matches:

text = "11234566666678"
# Variant a:
results = []
text.scan(/(\d)\1+/) { results << Regexp.last_match(0) }
p results                              # => ["11", "666666"]
# Variant b:
p text.scan(/((\d)\2+)/).map(&:first)  # => ["11", "666666"]
# Variant c:
p text.gsub(/(\d)\1+/).to_a  # => ["11", "666666"]

See this Ruby demo.
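
For comparison, the sketch below shows what a plain scan returns for the backreference pattern, and wraps variant a in a small helper. The helper name full_matches is hypothetical and not part of Ruby; it is only one way to package the idea.

```ruby
text = "11234566666678"

# Plain scan returns only the captured digit, not the whole repeated run.
p text.scan(/(\d)\1+/)  # => [["1"], ["6"]]

# Hypothetical helper (not part of Ruby): collects the full text of every
# match, regardless of how many capturing groups the pattern contains.
def full_matches(text, pattern)
  matches = []
  text.scan(pattern) { matches << Regexp.last_match(0) }
  matches
end

p full_matches(text, /(\d)\1+/)  # => ["11", "666666"]
```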
