Why does my regex work on RegexPlanet and regex101 but not in my code?

Problem description

Given the string #100=SAMPLE('Test','Test', I want to extract 100 and Test. I created the regular expression ^#(\d+)=SAMPLE\('([\w-]+)'.* for this purpose.

I tested the regex on RegexPlanet and regex101. Both tools give me the expected results, but when I try to use it in my code I don't get matches. I used the following snippet for testing the regex:

final String line = "#100=SAMPLE('Test','Test',";
final Pattern pattern = Pattern.compile("^#(\\d+)=SAMPLE\\('([\\w-]+)'.*");
final Matcher matcher = pattern.matcher(line);

System.out.println(matcher.matches());
System.out.println(matcher.find());
System.out.println(matcher.group(1));
System.out.println(matcher.group(2));

The output is:

true
false
Exception in thread "main" java.lang.IllegalStateException: No match found
    at java.util.regex.Matcher.group(Matcher.java:536)
    at java.util.regex.Matcher.group(Matcher.java:496)
    at Test.main(Test.java:15)

I used Java 8 for compiling and running the program. Why does the regex work with the online tools but not in my program?

Recommended answer

A Matcher object allows you to query it several times, so that you can find the expression, get the groups, find the expression again, get the groups, and so on.

This means that it keeps state after each call - both for the groups that resulted from a successful match, and the position where to continue searching.

When you run two matching/finding methods consecutively, what you have is:

  1. matches() - tries to match the entire string, starting at its beginning, and sets the groups.
  2. find() - tries to find the next occurrence of the pattern after the previously matched/found occurrence, and sets the groups.

But of course, in your case, the text doesn't contain two occurrences of the pattern, only one. So although matches() was successful and set proper groups, the find() then fails to find another match, and the groups are invalid (the groups are not accessible after a failed match/find).

And that's why you get the error message.

Now, if you're just playing around with this, to see the difference between matches and find, then there is nothing wrong with having both of them in the program. But you need to use reset() between them, which will cause find() not to try to continue from where matches() stopped (which will always fail if matches() succeeded). Instead, it will start scanning from the start, as if you had a fresh Matcher. And it will succeed and give you groups.
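As a rough illustration of the reset() point above, here is a minimal sketch using the pattern and input from the question. The class name ResetDemo and the helper matchThenFind are hypothetical names introduced only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResetDemo {
    // Pattern taken from the question.
    static final Pattern PATTERN =
            Pattern.compile("^#(\\d+)=SAMPLE\\('([\\w-]+)'.*");

    // Hypothetical helper: runs matches() and then find() on one Matcher,
    // optionally calling reset() in between.
    // Returns {result of matches(), result of find()}.
    static boolean[] matchThenFind(String line, boolean resetBetween) {
        Matcher matcher = PATTERN.matcher(line);
        boolean matched = matcher.matches();
        if (resetBetween) {
            // Discards the match state so find() scans from the start again.
            matcher.reset();
        }
        boolean found = matcher.find();
        return new boolean[] {matched, found};
    }

    public static void main(String[] args) {
        String line = "#100=SAMPLE('Test','Test',";

        boolean[] withoutReset = matchThenFind(line, false);
        System.out.println(withoutReset[0] + " " + withoutReset[1]); // true false

        boolean[] withReset = matchThenFind(line, true);
        System.out.println(withReset[0] + " " + withReset[1]); // true true
    }
}
```

Without reset(), find() resumes where matches() stopped and finds nothing, which is the behavior seen in the question. With reset(), the same Matcher behaves as if it were freshly created.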

But as other answers here hinted, if you're not trying to compare the results of matches and find, and just want to match your pattern and get the results, then you should choose only one of them.

  • matches() will try to match the entire string. For this reason, if it succeeds, running find() after it will never succeed, because find() then starts searching at the end of the string. If you use matches(), you don't need anchors like ^ and $ at the beginning and the end of your pattern.
  • find() will try to match anywhere in the string. It will start scanning from the left, but doesn't require that the actual match start there. It is also possible to use it more than once.
  • lookingAt() will try to match at the beginning of the string, but will not necessarily match the complete string. It's like having a ^ anchor at the beginning of your pattern.
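The difference between the three methods may be easier to see on a small example. The sketch below uses a made-up pattern (\d+) and made-up inputs, not the ones from the question, and creates a fresh Matcher for every call so that no state carries over. The class and helper names are hypothetical.

```java
import java.util.regex.Pattern;

public class MatcherModes {
    // Illustrative pattern only: one or more digits, no anchors.
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // Each helper builds a new Matcher, so the calls are independent.
    static boolean matchesWhole(String s) {
        return DIGITS.matcher(s).matches();   // entire string must match
    }

    static boolean matchesPrefix(String s) {
        return DIGITS.matcher(s).lookingAt(); // match must start at index 0
    }

    static boolean matchesAnywhere(String s) {
        return DIGITS.matcher(s).find();      // match may start anywhere
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123", "123abc", "abc123"}) {
            System.out.println(s
                    + " matches=" + matchesWhole(s)
                    + " lookingAt=" + matchesPrefix(s)
                    + " find=" + matchesAnywhere(s));
        }
        // 123    matches=true  lookingAt=true  find=true
        // 123abc matches=false lookingAt=true  find=true
        // abc123 matches=false lookingAt=false find=true
    }
}
```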

So you choose which one of these is appropriate for you, and use it, and then you can use the groups. Always test that the match succeeded before attempting to use the groups!
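One possible way to apply this to the question's input is sketched below. It assumes matches() is the method chosen, so the ^ anchor from the original pattern is dropped, and it only reads the groups inside the success branch. The class name GroupExtractor and the method extract are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupExtractor {
    // The question's pattern without the leading ^, since matches()
    // already requires the whole string to match.
    static final Pattern PATTERN =
            Pattern.compile("#(\\d+)=SAMPLE\\('([\\w-]+)'.*");

    // Returns {number, first argument} when the line matches, else empty.
    static Optional<String[]> extract(String line) {
        Matcher matcher = PATTERN.matcher(line);
        if (matcher.matches()) {
            // Groups are only valid after a successful match.
            return Optional.of(new String[] {matcher.group(1), matcher.group(2)});
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        extract("#100=SAMPLE('Test','Test',")
                .ifPresent(g -> System.out.println(g[0] + " " + g[1])); // 100 Test

        // A non-matching line yields an empty Optional instead of an exception.
        System.out.println(extract("no match here").isPresent()); // false
    }
}
```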
