
Can a Regex Return the Number of the Line where the Match is Found?

Views: 471. Published: 2017/11/9 20:56:40. Tags: regex, replace, find, editor


Question


In a text editor, I want to replace a given word with the number of the line on which this word is found. Is this possible with regex?

Solution
Recursion, Self-Referencing Group (Qtax trick), Reverse Qtax or Balancing Groups

Introduction

The idea of adding a list of integers to the bottom of the input is similar to a famous database hack (nothing to do with regex) where one joins to a table of integers. My original answer used the @Qtax trick. The current answers use either Recursion, the Qtax trick (straight or in a reversed variation), or Balancing Groups.

Yes, it is possible... With some caveats and regex trickery.

The solutions in this answer are meant as a vehicle to demonstrate some regex syntax more than practical answers to be implemented.

At the end of your file, we will paste a list of numbers preceded by a unique delimiter. For this experiment, the appended string is `:1:2:3:4:5:6:7`. This is the same idea as the database hack mentioned above, which joins to a table of integers.
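As a rough illustration of that preparation step, the appended string could be generated rather than typed. This is only a sketch: the helper names `number_list` and `append_number_list`, and the choice of two spare entries, are assumptions made for the example and are not part of the original answer.

```python
def number_list(count: int, delimiter: str = ":") -> str:
    """Return delimiter-prefixed integers 1..count, e.g. ':1:2:3'."""
    return "".join(f"{delimiter}{n}" for n in range(1, count + 1))


def append_number_list(text: str, spare: int = 2) -> str:
    """Append the number list on its own line, with a few spare entries (assumed)."""
    line_count = text.count("\n") + 1
    return text + "\n" + number_list(line_count + spare)


sample = "my cat\ndog\nmy pig\nmy cow\nmy mouse"
augmented = append_number_list(sample)
print(augmented.splitlines()[-1])  # the appended number list
```

The list needs at least as many entries as there are lines before the last line that may contain the word, so generating a few extra entries is harmless.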

For the first two solutions, we need an editor that uses a regex flavor that allows recursion (solution 1) or self-referencing capture groups (solution 2 and its reversed variation). Two come to mind: Notepad++ and EditPad Pro. For the third solution, we need an editor that supports balancing groups. That probably limits us to EditPad Pro or Visual Studio 2013+.

Input file:

Let's say we are searching for `pig` and want to replace it with the line number.

We'll use this as input:

    my cat
    dog
    my pig
    my cow
    my mouse
    :1:2:3:4:5:6:7

First Solution: Recursion

Supported languages: Apart from the text editors mentioned above (Notepad++ and EditPad Pro), this solution should work in languages that use PCRE (PHP, R, Delphi), in Perl, and in Python using Matthew Barnett's `regex` module (untested).

The recursive structure lives in a lookahead, and is optional. Its job is to balance lines that don't contain `pig`, on the left, with numbers, on the right: think of it as balancing a nested construct like `{{{ }}}`... Except that on the left we have the no-match lines, and on the right we have the numbers. The point is that when we exit the lookahead, we know how many lines were skipped.
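The balancing idea can be modelled outside of regex. The sketch below is not the regex itself, only a plain Python analogy under stated assumptions: the function names `balance` and `line_number` are made up for the example, the input is already split into lines, and the number list has at least as many entries as there are lines. Each recursive call consumes one non-matching line on the way in and one number on the way out, much like closing a nested `{{{ }}}`.

```python
from typing import List, Optional


def balance(lines: List[str], numbers: List[int], word: str) -> List[int]:
    """Pair each leading line without `word` with one number; return the numbers left over."""
    if not lines or word in lines[0]:
        return numbers                      # innermost level: nothing more to skip
    remaining = balance(lines[1:], numbers, word)  # skip this line, recurse on the rest
    return remaining[1:]                    # on the way out, consume one number


def line_number(lines: List[str], numbers: List[int], word: str = "pig") -> Optional[int]:
    """Return the number that follows the consumed ones, or None if `word` is absent."""
    if not any(word in line for line in lines):
        return None                         # mirrors the early-failure lookahead
    return balance(lines, numbers, word)[0]


lines = ["my cat", "dog", "my pig", "my cow", "my mouse"]
print(line_number(lines, [1, 2, 3, 4, 5, 6, 7]))
```

Two lines are skipped, so two numbers are consumed and the next number in the list is the line number of the first `pig`.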

Search:
`(?sm)(?=.*?pig)(?=((?:^(?:(?!pig)[^\r\n])*(?:\r?\n))(?:(?1)|[^:]+)(:\d+))?).*?\Kpig(?=.*?(?(2)\2):(\d+))`
Free-Spacing Version with Comments:

    (?xsm)                     # free-spacing mode, multi-line
    (?=.*?pig)                 # fail right away if pig isn't there
    (?=                        # The Recursive Structure Lives In This Lookahead
      (                        # Group 1
        (?:                    # skip one line
          ^
          (?:(?!pig)[^\r\n])*  # zero or more chars not followed by pig
          (?:\r?\n)            # newline chars
        )
        (?:(?1)|[^:]+)         # recurse Group 1 OR match all chars that are not a :
        (:\d+)                 # match digits
      )?                       # End Group
    )                          # End lookahead.
    .*?\Kpig                   # get to pig
    (?=.*?(?(2)\2):(\d+))      # Lookahead: capture the next digits

Replace: `\3`

In the demo, see the substitutions at the bottom. You can play with the letters on the first two lines (delete a space to make `pig`) to move the first occurrence of `pig` to a different line, and see how that affects the results.

Second Solution: Group that Refers to Itself ("Qtax Trick")

Supported languages: Apart from the text editors mentioned above (Notepad++ and EditPad Pro), this solution should work in languages that use PCRE (PHP, R, Delphi), in Perl, and in Python using Matthew Barnett's `regex` module (untested). The solution is easy to adapt to .NET by converting the `\K` to a lookbehind and the possessive quantifier to an atomic group (see the .NET version a few lines below).

Search:
`(?sm)(?=.*?pig)(?:(?:^(?:(?!pig)[^\r\n])*(?:\r?\n))(?=[^:]+((?(1)\1):\d+)))*+.*?\Kpig(?=[^:]+(?(1)\1):(\d+))`
.NET version: Back to the Future

.NET does not have `\K`. In its place, we use a "back to the future" lookbehind (a lookbehind that contains a lookahead that skips ahead of the match). Also, we need to use an atomic group instead of a possessive quantifier.
`(?sm)(?<=(?=.*?pig)(?=(?>(?:^(?:(?!pig)[^\r\n])*(?:\r?\n))(?=[^:]+((?(1)\1):\d+)))*).*)pig(?=[^:]+(?(1)\1):(\d+))`
Free-Spacing Version with Comments (Perl / PCRE Version):

    (?xsm)                       # free-spacing mode, multi-line
    (?=.*?pig)                   # lookahead: if pig is not there, fail right away to save the effort
    (?:                          # start counter-line-skipper (lines that don't include pig)
      (?:                        # skip one line
        ^                        # beginning of line
        (?:(?!pig)[^\r\n])*      # zero or more chars not followed by pig
        (?:\r?\n)                # newline chars
      )
      # for each line skipped, let Group 1 match an ever increasing
      # portion of the numbers string at the bottom
      (?=                        # lookahead
        [^:]+                    # skip all chars that are not colons
        (                        # start Group 1
          (?(1)\1)               # match Group 1 if set
          :\d+                   # match a colon and some digits
        )                        # end Group 1
      )                          # end lookahead
    )*+                          # end counter-line-skipper: zero or more times
    .*?                          # lazily match chars before pig
    \K                           # drop everything we've matched so far
    pig                          # match pig (this is the match!)
    (?=[^:]+(?(1)\1):(\d+))      # capture the next number to Group 2

Replace: `\2`

Output:

    my cat
    dog
    my 3
    my cow
    my mouse
    :1:2:3:4:5:6:7

In the demo, see the substitutions at the bottom. You can play with the letters on the first two lines (delete a space to make `pig`) to move the first occurrence of `pig` to a different line, and see how that affects the results.
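The expected result can be cross-checked without the appended list. The following is a minimal sketch of the practical route the introduction alludes to, assuming the text is available as a Python string; the function name `replace_with_line_number` is made up for the example. It counts the newlines that precede each match instead of encoding the count in the regex.

```python
import re


def replace_with_line_number(text: str, word: str) -> str:
    """Replace every occurrence of `word` with the 1-based number of its line."""
    def line_of(match: re.Match) -> str:
        # Newlines before the match start tell us how many lines were skipped.
        return str(text.count("\n", 0, match.start()) + 1)

    return re.sub(re.escape(word), line_of, text)


sample = "my cat\ndog\nmy pig\nmy cow\nmy mouse"
print(replace_with_line_number(sample, "pig"))
```

Unlike the regex-only solutions, this handles every occurrence of the word and needs no list of numbers at the bottom of the input.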

Choice of Delimiter for Digits

In our example, the delimiter `:` for the string of digits is rather common, and could happen elsewhere. We can invent a `UNIQUE_DELIMITER` and tweak the expression slightly. But the following optimization is even more efficient and lets us keep the `:`.

Optimization on Second Solution: Reverse String of Digits

Instead of pasting our digits in order, it may be to our benefit to use them in the reverse order: `:7:6:5:4:3:2:1`

In our lookaheads, this allows us to get down to the bottom of the input with a simple `.*`, and to start backtracking from there. Since we know we're at the end of the string, we don't have to worry about the `:digits` being part of another section of the string. Here's how to do it.
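To see why the reversed list still yields the right number, here is a small Python model of the bookkeeping. It is an analogy rather than the regex: the function name `next_number_reversed` is an assumption for the example, and `skipped` stands for the number of lines the skipper consumed. Group 1 ends up holding the last `skipped` entries of the list, and the number captured is the entry just before them.

```python
def next_number_reversed(suffix: str, skipped: int) -> int:
    """Given a reversed list such as ':7:6:5:4:3:2:1', return the entry
    that sits immediately before the last `skipped` entries."""
    entries = suffix.strip(":").split(":")
    if skipped >= len(entries):
        raise ValueError("the number list is too short for this many skipped lines")
    cut = len(entries) - skipped
    # entries[cut:] plays the role of Group 1: one trailing entry per skipped line.
    return int(entries[cut - 1])


print(next_number_reversed(":7:6:5:4:3:2:1", 2))
```

With two skipped lines, the trailing `:2:1` is set aside and the entry before it is `3`, which matches the line on which `pig` first appears in the input shown in this section.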

Input:

    my cat pi g
    dog p ig
    my pig
    my cow
    my mouse
    :7:6:5:4:3:2:1

Search:

    (?xsm)                       # free-spacing mode, multi-line
    (?=.*?pig)                   # lookahead: if pig is not there, fail right away to save the effort
    (?:                          # start counter-line-skipper (lines that don't include pig)
      (?:                        # skip one line that doesn't have pig
        ^                        # beginning of line
        (?:(?!pig)[^\r\n])*      # zero or more chars not followed by pig
        (?:\r?\n)                # newline chars
      )
      # Group 1 matches increasing portion of the numbers string at the bottom
      (?=                        # lookahead
        .*                       # get to the end of the input
        (                        # start Group 1
          :\d+                   # match a colon and some digits
          (?(1)\1)               # match Group 1 if set
        )                        # end Group 1
      )                          # end lookahead
    )*+                          # end counter-line-skipper: zero or more times
    .*?                          # lazily match chars before pig
    \K                           # drop match so far
    pig                          # match pig (this is the match!)
    (?=.*(\d+)(?(1)\1))          # capture the next number to Group 2

Replace: `\2`

See the substitutions in the demo.

Third Solution: Balancing Groups

This solution is specific to .NET.

Search:
`(?m)(?<=\A(?<c>^(?:(?!pig)[^\r\n])*(?:\r?\n))*.*?)pig(?=[^:]+(?(c)(?<-c>:\d+)*):(\d+))`
Free-Spacing Version with Comments:

    (?xm)                        # free-spacing, multi-line
    (?<=                         # lookbehind
      \A                         # beginning of input
      (?<c>                      # skip one line that doesn't have pig
                                 # The length of Group c Captures will serve as a counter
        ^                        # beginning of line
        (?:(?!pig)[^\r\n])*      # zero or more chars not followed by pig
        (?:\r?\n)                # newline chars
      )                          # end skipper
      *                          # repeat skipper
      .*?                        # we're on the pig line: lazily match chars before pig
    )                            # end lookbehind
    pig                          # match pig: this is the match
    (?=                          # lookahead
      [^:]+                      # get to the digits
      (?(c)                      # if Group c has been set
        (?<-c>:\d+)              # decrement c while we match a group of digits
        *                        # repeat: this will only repeat as long as the length of Group c captures > 0
      )                          # end if Group c has been set
      :(\d+)                     # Match the next digit group, capture the digits
    )                            # end lookahead

Replace: `$1`
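The counter behaviour of Group c can be pictured as a stack. The sketch below is a Python analogy, not .NET code: the function name `line_number_via_stack` is made up for the example, and it assumes the number list has more entries than there are skipped lines. Each skipped line pushes a capture, each consumed number pops one, and the number read once the stack is empty is the line number.

```python
from typing import Iterable, List, Optional


def line_number_via_stack(lines: List[str], numbers: Iterable[int],
                          word: str = "pig") -> Optional[int]:
    """Push one capture per skipped line, pop one per number, then read the next number."""
    stack: List[str] = []
    for line in lines:
        if word in line:
            break
        stack.append(line)       # like (?<c>...) adding a capture to Group c
    else:
        return None              # the word never appears, so there is no match

    remaining = iter(numbers)
    while stack:
        stack.pop()              # like (?<-c>:\d+) removing a capture...
        next(remaining)          # ...while consuming one number from the list
    return next(remaining)       # like :(\d+) capturing the next number


lines = ["my cat", "dog", "my pig", "my cow", "my mouse"]
print(line_number_via_stack(lines, [1, 2, 3, 4, 5, 6, 7]))
```

The stack never holds anything the result depends on; only its depth matters, which is what makes the captures of Group c usable as a counter.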

Reference

Qtax trick

On Which Line Number Was the Regex Match Found?
