How to extract all regex matches in a file using Vim?

Question

Consider the following example:

case Foo:
    ...
    break;
case Bar:
    ...
    break;
case More: case Complex:
    ...
    break;
...

Say we would like to retrieve all matches of the regex case ([^:]*): (the whole matching text or, even better, the part between ( and )), which should give us (preferably in a new buffer) something like:

Foo
Bar
More
Complex
...

Another use case example would be the extraction of some parts, for instance, URLs of images from an HTML file.

Is there a simple way to grab all regex matches and put them in a buffer in Vim?

Note: It’s similar to the question "How to extract text matching a regex using Vim?". However, unlike the setting in that question, I’m also interested in removing the lines that don’t match, preferably without a huge or complex regex.

Recommended answer

There is a general way of collecting pattern matches throughout a piece of text. The technique takes advantage of the substitute-with-an-expression feature of the :substitute command (see :help sub-replace-\=). The key idea is to run a substitution that enumerates all of the pattern matches and, for each one, evaluates an expression that stores it, without actually replacing anything.

First, let us consider saving the matches. To keep a sequence of matching text fragments, it is convenient to use a list (see :help List). However, it is not possible to modify a list straightforwardly, using the :let command, since there is no way to run Ex commands in expressions (including \= substitute expressions). Yet, we can call one of the functions that modify a list in place. For example, the add() function appends a given item to the specified list (see :help add()).

Another problem is how to avoid text modifications while running a substitution. One approach is to make the pattern always have a zero-width match by prepending the \ze atom or by appending the \zs atom to it (see :help /\zs, :help /\ze). A pattern modified in this way captures an empty string preceding or succeeding an occurrence of the original pattern in the text (such matches are called zero-width matches in Vim; see :help /zero-width). Then, if the replacement text is also empty, the substitution effectively changes nothing: it just replaces a zero-width match with an empty string.

Since the add() function, as well as most of the list-modifying functions, returns a reference to the changed list, for our technique to work we need to somehow get an empty string from it. The simplest way is to extract a sublist of zero length from it by specifying a range of indices whose starting index is greater than the ending one.
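The sublist trick can be tried on its own before using it in a substitution. The following is only a minimal sketch, meant to be typed at the command line or sourced from a scratch file; the variable name t and the sample items are chosen for illustration.

```vim
" Start with an empty list.
let t = []

" add() appends the item and returns the modified list itself.
echo add(t, 'Foo')

" The range [1:0] starts after it ends, so the result is an empty
" sublist, while the item is still appended to t.
echo add(t, 'Bar')[1:0]

" Both items have been stored in t.
echo t
```

The first :echo shows the list after the append, the second one shows an empty list, and the last one confirms that both items were stored.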

Combining the aforementioned ideas, we obtain the following Ex command:

:let t=[] | %s/\<case\s\+\(\w\+\):\zs/\=add(t,submatch(1))[1:0]/g

After its execution, all matches of the first subgroup are accumulated in the list referenced by the variable t, and can be used as is or processed in some way. For instance, to paste contents of the list one by one on separate lines in Insert mode, type

Ctrl+R=t Enter

To do the same in Normal mode, simply use the :put command:

:pu=t
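Since the question prefers to have the matches in a new buffer, the steps above could be combined roughly as follows. This is a sketch rather than part of the original answer: it assumes the file with the case labels is the current buffer, and the use of :new together with setline() is just one possible way of filling a fresh buffer.

```vim
" Collect the first subgroup of every match into the list t, using the
" zero-width pattern and the empty-sublist replacement described above.
let t = []
%s/\<case\s\+\(\w\+\):\zs/\=add(t, submatch(1))[1:0]/g

" Open a new window with an empty buffer and write one item per line.
new
call setline(1, t)
```

If the pattern does not match anywhere, the :s command reports an error; adding the e flag to the substitution suppresses it.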
