Working regex fails when using Scala pattern matching


Question

In the following code, the same pattern matches when the Java API is used, but not when Scala pattern matching is used.

import java.util.regex.Pattern

object Main extends App {
  val text = "/oAuth.html?state=abcde&code=hfjksdhfrufhjjfkdjfkds"

  val statePatternString = """\/.*\?.*state=([^&\?]*)"""
  val statePattern = statePatternString.r
  val statePatternJ = Pattern.compile(statePatternString)

  val sj = statePatternJ.matcher(text)
  val sjMatch = if (sj.find()) sj.group(1) else ""
  println(s"Java match $sjMatch")

  val ss = statePattern.unapplySeq(text)
  println(s"Scala unapplySeq $ss")
  val sm = statePattern.findFirstIn(text)
  println(s"Scala findFirstIn $sm")

  text match {
    case statePattern(s) =>
      println(s"Scala matching $s")
    case _ =>
      println("Scala not matching")
  }

}

The application output is:

Java match abcde
Scala unapplySeq None
Scala findFirstIn Some(/oAuth.html?state=abcde)
Scala not matching

When using the extractor syntax val statePattern(se) = text, the result is a scala.MatchError.

What is causing the Scala regex unapplySeq to fail?

Recommended answer

When you define a Scala pattern, it is anchored by default (that is, its extractor requires a full string match), while your Java sj.find() looks for a match anywhere inside the string. Add .unanchored to the Scala regex so that it also allows partial matches:

val statePattern = statePatternString.r.unanchored
                                       ^^^^^^^^^^^

See the IDEONE demo.
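As a minimal sketch of the difference, the snippet below runs the question's pattern through the extractor twice, once anchored and once unanchored. The object name UnanchoredDemo and the helper extractState are hypothetical names introduced only for this illustration.

```scala
import scala.util.matching.Regex

object UnanchoredDemo {
  val text = "/oAuth.html?state=abcde&code=hfjksdhfrufhjjfkdjfkds"
  val statePatternString = """\/.*\?.*state=([^&\?]*)"""

  // Default Regex: the extractor requires the whole input to match.
  val anchored: Regex = statePatternString.r
  // Unanchored Regex: the extractor accepts a match anywhere in the input.
  val unanchored: Regex = statePatternString.r.unanchored

  // Hypothetical helper: returns the captured state value, if the extractor matches.
  def extractState(regex: Regex, input: String): Option[String] =
    input match {
      case regex(state) => Some(state)
      case _            => None
    }

  def main(args: Array[String]): Unit = {
    println(extractState(anchored, text))   // None
    println(extractState(unanchored, text)) // Some(abcde)
  }
}
```

Returning an Option from a match expression also avoids the scala.MatchError that the val statePattern(se) = text form throws when the extractor does not match.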

From the UnanchoredRegex reference:

def unanchored: UnanchoredRegex

Create a new Regex with the same pattern, but no requirement that the entire String matches in extractor patterns.

Normally, matching on date (the example Regex value used in the Scaladoc) behaves as though the pattern were enclosed in anchors, ^pattern$.

The unanchored Regex behaves as though those anchors were removed.

Note that this method does not actually strip any matchers from the pattern.
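The same distinction exists in the Java API itself, which is why the Java code in the question appeared to work: it called find() rather than matches(). The sketch below shows both calls on the question's pattern; the object name MatchesVsFindDemo is a hypothetical name used only here.

```scala
import java.util.regex.Pattern

object MatchesVsFindDemo {
  val text = "/oAuth.html?state=abcde&code=hfjksdhfrufhjjfkdjfkds"
  val pattern: Pattern = Pattern.compile("""\/.*\?.*state=([^&\?]*)""")

  // matches() succeeds only if the pattern covers the entire input,
  // which is how an anchored Scala Regex extractor behaves.
  val fullMatch: Boolean = pattern.matcher(text).matches()

  // find() succeeds if the pattern matches some substring of the input,
  // which is how an unanchored Scala Regex extractor behaves.
  val partialMatch: Boolean = pattern.matcher(text).find()

  def main(args: Array[String]): Unit = {
    println(s"matches: $fullMatch") // matches: false
    println(s"find: $partialMatch") // find: true
  }
}
```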

An alternative solution is to append .* to the end of the pattern so that the anchored match can consume the rest of the string, but remember that a dot does not match a newline by default. If the solution should be generic, specify the (?s) DOTALL modifier at the beginning of the pattern to make sure the whole string is matched even when it contains line breaks.
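A rough sketch of this alternative follows. The shortened input strings, the object name TrailingWildcardDemo and the helper extractState are assumptions made for the example and do not come from the original question.

```scala
import scala.util.matching.Regex

object TrailingWildcardDemo {
  // Hypothetical inputs: one single-line, one containing a line break.
  val singleLine = "/oAuth.html?state=abcde&code=xyz"
  val multiLine  = "/oAuth.html?state=abcde&code=xyz\nsecond line"

  // Trailing .* lets the anchored extractor consume the rest of a single line.
  val withWildcard: Regex = """\/.*\?.*state=([^&\?]*).*""".r
  // (?s) turns on DOTALL, so the dot also matches line terminators.
  val withDotAll: Regex = """(?s)\/.*\?.*state=([^&\?]*).*""".r

  // Hypothetical helper: returns the captured state value, if the extractor matches.
  def extractState(regex: Regex, input: String): Option[String] =
    input match {
      case regex(state) => Some(state)
      case _            => None
    }

  def main(args: Array[String]): Unit = {
    println(extractState(withWildcard, singleLine)) // Some(abcde)
    println(extractState(withWildcard, multiLine))  // None
    println(extractState(withDotAll, multiLine))    // Some(abcde)
  }
}
```

Both regexes stay anchored here, so the pattern still has to account for every character of the input; .unanchored avoids that requirement altogether.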
