How to extract all regex matches in a file using Vim?
Question
Consider the following example:
case Foo:
...
break;
case Bar:
...
break;
case More: case Complex:
...
break;
...
Say, we would like to retrieve all the regex matches (the whole matching
text, or even better, the part between `\(` and `\)`) of the regex
`case \([^:]*\):`, which should give us (preferably in a new buffer)
something like:
Foo
Bar
More
Complex
...
Another use case example would be the extraction of some parts, for instance, URLs of images from an HTML file.
Is there a simple way to grab all regex matches and put them in a buffer in Vim?
Note: It’s similar to the question "How to extract text matching a regex using Vim?". However, unlike the setting in that question, I’m also interested in removing the lines that don’t match, preferably without a huge or complex regex.
Recommended answer
There is a general way of collecting pattern matches throughout a piece
of text. The technique takes advantage of the substitute-with-an-expression
feature of the `:substitute` command (see `:help sub-replace-\=`). The key
idea is to use a substitution that enumerates all of the pattern matches
and, for each one, evaluates an expression that stores the match, without
replacing anything.
First, let us consider saving the matches. To keep a sequence of matching
text fragments, it is convenient to use a list (see `:help List`). However,
it is not possible to modify a list straightforwardly with the `:let`
command, since there is no way to run Ex commands in expressions
(including `\=` substitute expressions). Yet, we can call one of the
functions that modify a list in place. For example, the `add()` function
appends a given item to the specified list (see `:help add()`).
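As a minimal sketch of that point (the variable names `t` and `r` are only illustrative), `add()` can be called from any expression, changes the list in place, and hands back the very same list:

```vim
" Start with an empty list.
let t = []
" add() appends the item to t itself and returns that same list.
let r = add(t, 'Foo')
call add(t, 'Bar')
" Prints ['Foo', 'Bar']
echo t
" Prints 1, because r and t refer to one and the same list.
echo r is t
```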
Another problem is how to avoid modifying the text while running
a substitution. One approach is to make the pattern always have
a zero-width match by prepending `\ze` or by appending `\zs` to it
(see `:help /\zs`, `:help /\ze`). A pattern modified in this way matches
an empty string preceding or following an occurrence of the original
pattern in the text (such matches are called zero-width matches in Vim;
see `:help /zero-width`). Then, if the replacement text is also empty,
the substitution effectively changes nothing: it just replaces
a zero-width match with an empty string.
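The effect of a trailing `\zs` can be checked on a plain string. This is only a sketch: the sample string and the names `sample` and `pat` are made up for illustration.

```vim
let sample = 'case More: case Complex:'
" The original pattern with \zs appended, so the reported match is empty.
let pat = 'case \w\+:\zs'
" The first match starts and ends at the same position.
echo match(sample, pat) == matchend(sample, pat)
" Replacing every empty match with an empty string gives back the same text.
echo substitute(sample, pat, '', 'g') ==# sample
```

Both `:echo` lines should print 1.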
Since the `add()` function, like most of the list-modifying functions,
returns a reference to the changed list, for our technique to work we
need to somehow get an empty string from it. The simplest way is to
extract a sublist of zero length from it by specifying a range of indices
in which the starting index is greater than the ending one.
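A small sketch of the slicing trick, again with an illustrative list `t`. The `[1:0]` range is empty no matter how long the list has grown. As far as I recall from `:help sub-replace-expression`, a list returned by a `\=` expression is joined with line breaks, so an empty list gives an empty replacement.

```vim
let t = []
" add() returns the whole list, but the [1:0] slice of it is empty.
" Prints []
echo add(t, 'Foo')[1:0]
" Prints [] again, although t now holds two items.
echo add(t, 'Bar')[1:0]
" The items themselves were stored. Prints ['Foo', 'Bar']
echo t
```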
Combining the aforementioned ideas, we obtain the following Ex command:
:let t=[] | %s/\<case\s\+\(\w\+\):\zs/\=add(t,submatch(1))[1:0]/g
After its execution, all matches of the first subgroup are accumulated
in the list referenced by the variable `t`, and can be used as is or
processed in some way. For instance, to paste the contents of the list
one by one on separate lines in Insert mode, type
Ctrl+R =t Enter
To do the same in Normal mode, simply use the `:put` command:
:pu=t
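Since the question asks for the matches in a new buffer, here is one possible variation, offered as a sketch rather than as the method above. It applies the same expression trick through the `substitute()` function to the joined buffer text, so the current buffer is never touched. `CollectMatches` is a hypothetical helper name, not something Vim provides.

```vim
" Hypothetical helper: return the first submatch of every match of pat in text.
function! CollectMatches(text, pat) abort
  let matches = []
  " Only the side effect on matches is wanted; the result string is discarded.
  call substitute(a:text, a:pat, '\=add(matches, submatch(1))[1:0]', 'g')
  return matches
endfunction

" Gather the names from the current buffer before leaving it.
let names = CollectMatches(join(getline(1, '$'), "\n"), '\<case\s\+\(\w\+\):')
" Open a new window with an empty buffer and fill it, one name per line.
new
call setline(1, names)
```

For the sample in the question, the new buffer would then hold Foo, Bar, More and Complex on separate lines.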