On Which Line Number Was the Regex Match Found?
Question
I would like to search a .java file using regular expressions, and I wonder whether there is a way to detect on which lines of the file the matches are found.
For example, if I look for the match hello with Java regular expressions, will some method tell me that the matches were found on lines 9, 15, and 30?
Recommended Answer
Possible... with truly devilish regex!
Disclaimer: This is not meant to be a practical solution, but an illustration of a way to use an extension of a terrific regex hack. Moreover, it only works on regex engines that allow capture groups to refer to themselves. For instance, you could use it in Notepad++, as it uses the PCRE engine, but not in Java.
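Since the hack described in this answer does not run in Java, here is, for comparison, a minimal sketch of the conventional way to get line numbers in Java: test each line separately and record the indices of the lines that match. The class name LineMatcher and the method matchingLines are hypothetical names chosen for illustration, and the sketch assumes matches never span more than one line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class LineMatcher {
    // Hypothetical helper: returns the 1-based numbers of the lines
    // in which the regex is found at least once.
    static List<Integer> matchingLines(String text, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<Integer> hits = new ArrayList<>();
        // \R matches any line break sequence (Java 8 and later).
        String[] lines = text.split("\\R", -1);
        for (int i = 0; i < lines.length; i++) {
            if (pattern.matcher(lines[i]).find()) {
                hits.add(i + 1); // line numbers are 1-based
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String sample = "some code\nmore code\nhey, hello!\nmore code\n";
        System.out.println(matchingLines(sample, "hello")); // prints [3]
    }
}
```

For a real .java file, the lines could come from java.nio.file.Files.readAllLines instead of splitting a string.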
Suppose your file is:
some code
more code
hey, hello!
more code
At the bottom of the file, paste :1:2:3:4:5:6:7, where : is a delimiter not found in the rest of the code, and where the numbers go at least as high as the number of lines.
Then, to get the line of the first hello, you can use:
(?m)(?:(?:^(?:(?!hello).)*(?:\r?\n))(?=[^:]+((?(1)\1):\d+)))*.*hello(?=[^:]+(?:(?(1)\1)+:(\d+)))
The line number of the first line containing hello will be captured by Group 2.
- In the demo, see the Group 2 capture in the right pane.
- The hack relies on a group referring to itself. In the classic @Qtax trick, this is done with (?>\1?). For variety, I used a conditional instead.
Explanation
- The first part of the regex is a line skipper, which captures an increasing portion of the line counter at the bottom into Group 1.
- The second part of the regex matches hello and captures the line number into Group 2.
- Inside the line skipper, (?:^(?:(?!hello).)*(?:\r?\n)) matches a line that doesn't contain hello.
- Still inside the line skipper, the lookahead (?=[^:]+((?(1)\1):\d+)) gets us to the first : with [^:]+. The outer parentheses in ((?(1)\1):\d+) then capture to Group 1: if Group 1 is already set, (?(1)\1) matches its current content; then, regardless, a colon and some digits follow. This ensures that each time the line skipper matches a line, Group 1 expands to a longer portion of :1:2:3:4:5:6:7.
- The * matches the line skipper zero or more times.
- .*hello matches the line with hello.
- The final lookahead (?=[^:]+(?:(?(1)\1)+:(\d+))) works like the one in the line skipper, except that this time only the digits are captured, into Group 2, by (\d+).
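The mechanism explained above can also be traced procedurally. The following is only a rough Java analogue, not the regex itself (Java cannot run it): the variable group1 plays the role of Group 1, growing by one :n entry of the counter for each skipped line, and the returned number is the counter entry that follows it. The class and method names are hypothetical, and the word is tested with a plain substring check rather than a regex.

```java
public class CounterTrickSimulation {
    // Hypothetical procedural mirror of the regex: returns the line number of
    // the first line containing `word`, read off the counter string, or -1.
    static int firstLineContaining(String text, String word, String counter) {
        String group1 = ""; // an unset Group 1 behaves like an empty prefix
        for (String line : text.split("\\R")) {
            if (line.contains(word)) {
                // Final lookahead: skip what Group 1 holds, then read ":digits".
                String rest = counter.substring(group1.length());
                if (rest.isEmpty()) {
                    return -1; // counter too short: the regex would not match
                }
                int end = rest.indexOf(':', 1);
                String digits = (end < 0) ? rest.substring(1) : rest.substring(1, end);
                return Integer.parseInt(digits);
            }
            // Line skipper: extend Group 1 by the next ":digits" entry.
            int next = counter.indexOf(':', group1.length() + 1);
            group1 = (next < 0) ? counter : counter.substring(0, next);
        }
        return -1; // no line contains the word
    }

    public static void main(String[] args) {
        String sample = "some code\nmore code\nhey, hello!\nmore code\n";
        // group1 grows "" -> ":1" -> ":1:2"; the next counter entry is ":3".
        System.out.println(firstLineContaining(sample, "hello", ":1:2:3:4:5:6:7")); // prints 3
    }
}
```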
References
- Qtax trick (recently awarded an additional bounty by @AmalMurali)
- Replace a word with the number of the line on which it is found