On Which Line Number Was the Regex Match Found?


Question

I would like to search a .java file using regular expressions, and I wonder whether there is a way to detect on which lines of the file the matches are found.

For example, if I look for the match hello with Java regular expressions, will some method tell me that the matches were found on lines 9, 15, and 30?

Recommended answer

Possible... with some regex devilry!

Disclaimer: This is not meant to be a practical solution, but an illustration of a way to use an extension of a terrific regex hack. Moreover, it only works on regex engines that allow capture groups to refer to themselves. For instance, you could use it in Notepad++, as it uses the PCRE engine, but not in Java.

Suppose your file is:

some code
more code
hey, hello!
more code

At the bottom of the file, paste :1:2:3:4:5:6:7, where : is a delimiter not found in the rest of the code, and where the numbers go at least as high as the number of lines.
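The counter line does not have to be typed by hand. As a minimal sketch (the class name `CounterLine` and both method names are hypothetical, chosen only for illustration), a few lines of Java can build the `:1:2:...:N` string and append it to some text, refusing to do so if the delimiter already occurs there:

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper, not part of the original answer.
class CounterLine {
    // Builds the counter, e.g. ":1:2:3" for n = 3 and delimiter ':'.
    static String build(int n, char delimiter) {
        return IntStream.rangeClosed(1, n)
                .mapToObj(i -> delimiter + Integer.toString(i))
                .collect(Collectors.joining());
    }

    // Appends a counter that goes at least as high as the number of lines.
    static String appendCounter(String text, char delimiter) {
        if (text.indexOf(delimiter) >= 0) {
            // The trick requires a delimiter that appears nowhere else.
            throw new IllegalArgumentException("delimiter already occurs in the text");
        }
        // With limit -1 a trailing newline yields one extra (empty) element,
        // so the count may overshoot by one, which the trick tolerates.
        int lines = text.split("\r?\n", -1).length;
        String separator = text.endsWith("\n") ? "" : "\n";
        return text + separator + build(lines, delimiter);
    }

    public static void main(String[] args) {
        String sample = "some code\nmore code\nhey, hello!\nmore code\n";
        System.out.println(appendCounter(sample, ':'));
    }
}
```

Real Java source usually contains colons (ternaries, enhanced for loops, labels), so a different delimiter character would be needed in practice, and the pattern would have to be adjusted to match.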

Then, to get the line of the first hello, you can use:

(?m)(?:(?:^(?:(?!hello).)*(?:\r?\n))(?=[^:]+((?(1)\1):\d+)))*.*hello(?=[^:]+((?(1)\1)+:(\d+)))

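The line number of the first line containing hello is captured by the inner (\d+) of the final lookahead. As the pattern is written, that is Group 3; the enclosing Group 2 holds the counter prefix ending in that number (for the sample file, :1:2:3).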


  • In the demo, see the Group 2 capture in the right pane.
  • The hack relies on a group referring to itself. In the classic @Qtax trick, this is done with (?>\1?). For variety, I used a conditional instead.

Explanation


  • The first part of the regex is a line skipper, which captures an increasing portion of the line counter at the bottom of the file to Group 1.
  • The second part of the regex matches hello and captures the line number in its lookahead.
  • Inside the line skipper, (?:^(?:(?!hello).)*(?:\r?\n)) matches a line that doesn't contain hello.
  • Still inside the line skipper, the (?=[^:]+((?(1)\1):\d+)) lookahead gets us to the first : with [^:]+. The outer parentheses in ((?(1)\1):\d+) then capture to Group 1: first, if Group 1 is already set, the conditional (?(1)\1) matches the current content of Group 1; then, regardless, a colon and some digits. This ensures that each time the line skipper matches a line, Group 1 expands to a longer portion of :1:2:3:4:5:6:7.
  • The * matches the line skipper zero or more times.
  • .*hello matches the line with hello.
  • The lookahead (?=[^:]+((?(1)\1)+:(\d+))) is nearly identical to the one in the line skipper, except that this time the digits are also captured on their own by the inner (\d+). As the pattern is written, that inner group is Group 3, and the enclosing Group 2 holds the counter prefix ending in that number.
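Stripped of the regex machinery, the line skipper and the counter amount to counting the line breaks that precede a match. The answer notes that its pattern does not work in Java, so the sketch below is not the answer's method: it does the same bookkeeping in plain Java, outside the regex, with `java.util.regex.Matcher`. The class name `MatchLines` and the method `linesOf` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class MatchLines {
    // Returns the 1-based line number of every match of regex in text.
    static List<Integer> linesOf(String text, String regex) {
        List<Integer> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        int line = 1;     // line number of the position 'scanned'
        int scanned = 0;  // everything before this index has been counted
        while (m.find()) {
            // Count the line breaks between the previous match and this one.
            for (int i = scanned; i < m.start(); i++) {
                if (text.charAt(i) == '\n') {
                    line++;
                }
            }
            scanned = m.start();
            result.add(line);
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "some code\nmore code\nhey, hello!\nmore code\n";
        System.out.println(linesOf(sample, "hello")); // prints [3]
    }
}
```

Because the text is scanned only once, this reports every match rather than just the first, and a line with several matches appears once per match.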

References

  • Qtax trick (recently awarded an additional bounty by @AmalMurali)
  • Replace a word with the number of the line on which it is found
