How to backreference in a Ruby regular expression (regex) with gsub when I use grouping?
Problem description
I would like to patch some text data extracted from web pages. Sample:
t="First sentence. Second sentence.Third sentence."
There is no space after the period at the end of the second sentence. This tells me that the third sentence was on a separate line (after a br tag) in the original document.
I want to use a regexp to insert a "\n" character in the proper places and patch my text. My regex:
t2=t.gsub(/([.\!?])([A-Z1-9])/,$1+"\n"+$2)
Unfortunately it doesn't work; it raises "NoMethodError: undefined method `+' for nil:NilClass". How can I properly backreference the matched groups? It was so easy in Microsoft Word, where I just had to use the \1 and \2 symbols.
Recommended answer
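You can backreference in the substitution string with \1 (to match capture group 1). Inside a double-quoted string the backslash must itself be escaped, hence "\\1". The original call fails because $1+"\n"+$2 is evaluated before gsub runs, when no match has happened yet and $1 is still nil.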
t = "First sentence. Second sentence.Third sentence!Fourth sentence?Fifth sentence."
t.gsub(/([.!?])([A-Z1-9])/, "\\1\n\\2") # => "First sentence. Second sentence.\nThird sentence!\nFourth sentence?\nFifth sentence."
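A few equivalent spellings may be easier to read than the doubled backslashes. The sketch below reuses the sample string from the question; the variable names (pattern, single_quoted, block_form, named) and the group names punct and start are only illustrative, not part of the original answer.

```ruby
t = "First sentence. Second sentence.Third sentence."
pattern = /([.!?])([A-Z1-9])/

# Single-quoted pieces: '\1' keeps its backslash, so no doubling is needed.
# The newline still has to come from a double-quoted string.
single_quoted = t.gsub(pattern, '\1' + "\n" + '\2')

# Block form: the block runs once per match, after the match is made,
# so $1 and $2 are already set when they are read.
block_form = t.gsub(pattern) { "#{$1}\n#{$2}" }

# Named groups: referenced as \k<name> in the replacement string.
named = t.gsub(/(?<punct>[.!?])(?<start>[A-Z1-9])/, "\\k<punct>\n\\k<start>")

puts single_quoted
```

The block form is the closest to what the question attempted with $1 and $2, and it is also the one to reach for when the replacement needs real Ruby logic rather than a fixed template.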