Scala Regular Expressions (string delimited by double quotes)

Views: 274 Published: 2020/10/26 0:19:45 Tags: regex scala double-quotes


Problem description

I am new to Scala. I am trying to match a string delimited by double quotes, and I am a bit puzzled by the following behavior:

If I do the following:

val stringRegex = """"([^"]*)"(.*$)"""
val regex = stringRegex.r
val tidyTokens = Array[String]("1", "\"test\"", "'c'", "-23.3")
tidyTokens.foreach {
    token => if (token.matches (stringRegex)) println (token + " matches!")
}

I get:

"test" matches!

Otherwise, if I do the following:

tidyTokens.foreach {
    token => token match {
        case regex(token) => println (token + " matches!")
        case _ => println ("No match for token " + token)
    }
}

I get:

No match for token 1
No match for token "test"
No match for token 'c'
No match for token -23.3

Why doesn't "test" match in the second case?

Recommended answer

Take your regular expression:

 "([^"]*)"(.*$)

When compiled with .r, this string yields a regex object which, if it matches its input string, must yield 2 captured strings: one for the ([^"]*) and the other for the (.*$). Your code
  case regex(token) => ...

ought to reflect this, so maybe you want
  case regex(token, otherStuff) => ...

or just
  case regex(token, _) => ...
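
As a minimal, self-contained sketch of the two-binding pattern, the question's loop could be rewritten as below. The object name TwoGroupMatch and the helper describe are illustrative names, not part of the original post; describe returns the message instead of printing it so the result is easy to check.

```scala
object TwoGroupMatch {
  // Same pattern as in the question: two capturing groups.
  val regex = """"([^"]*)"(.*$)""".r

  // Hypothetical helper: one binding per capturing group,
  // with the second group ignored via the wildcard.
  def describe(token: String): String = token match {
    case regex(inner, _) => inner + " matches!"
    case _               => "No match for token " + token
  }

  def main(args: Array[String]): Unit = {
    val tidyTokens = Array("1", "\"test\"", "'c'", "-23.3")
    tidyTokens.foreach(t => println(describe(t)))
  }
}
```

Note that inner is bound to the first captured group, so it holds the text between the quotes rather than the whole token.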

Why? The case regex(matchedCaptures...) syntax works because regex is an object with an unapplySeq method. case regex(token) => ... translates (roughly) to:
 case List(token) => ...

 where List(token) is what regex.unapplySeq(inputString) returns:
 regex.unapplySeq("\"test\"") // Returns Some(List("test", ""))
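
The arity mismatch can be checked directly by calling unapplySeq and matching on its result by hand. This is only a rough sketch of what the pattern match does; the object name UnapplySeqArity and the value names are illustrative.

```scala
object UnapplySeqArity {
  // Same pattern as in the question: two capturing groups.
  val regex = """"([^"]*)"(.*$)""".r

  // One list element per capturing group, or None when the input does not match.
  val captured: Option[List[String]]   = regex.unapplySeq("\"test\"")
  val notMatched: Option[List[String]] = regex.unapplySeq("1")

  // Roughly what `case regex(token)` asks for: exactly one captured string.
  val fitsOneBinding: Boolean = captured match {
    case Some(List(_)) => true
    case _             => false
  }

  // Roughly what `case regex(token, other)` asks for: exactly two captured strings.
  val fitsTwoBindings: Boolean = captured match {
    case Some(List(_, _)) => true
    case _                => false
  }

  def main(args: Array[String]): Unit = {
    println(captured)
    println(notMatched)
    println(fitsOneBinding)
    println(fitsTwoBindings)
  }
}
```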

Your regex does match the string "test", but in the case statement the regex extractor's unapplySeq method returns a list of 2 strings, because that is what the regex says it captures. That's unfortunate, but the compiler can't help you here because regular expressions are compiled from strings at runtime.

One alternative would be to use a non-capturing group:
 val stringRegex = """"([^"]*)"(?:.*$)"""
 //                             ^^

Then your code would work, because regex will now be an extractor object whose unapplySeq method returns only a single captured group:
 tidyTokens foreach { 
    case regex(token) => println (token + " matches!")
    case t => println ("No match for token " + t)
 }
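
Since the compiler cannot check the number of bindings, one way to see how many a pattern needs is to ask the underlying java.util.regex.Pattern for its group count. The sketch below compares the two variants of the pattern; the object name NonCapturingMatch and the helpers groupCount and extract are illustrative names, not from the original post.

```scala
import scala.util.matching.Regex

object NonCapturingMatch {
  val capturing: Regex    = """"([^"]*)"(.*$)""".r
  val nonCapturing: Regex = """"([^"]*)"(?:.*$)""".r

  // Number of capturing groups, i.e. the number of bindings `case r(...)` needs.
  def groupCount(r: Regex): Int = r.pattern.matcher("").groupCount()

  // A single binding is enough once the second group is non-capturing.
  def extract(token: String): Option[String] = token match {
    case nonCapturing(inner) => Some(inner)
    case _                   => None
  }

  def main(args: Array[String]): Unit = {
    println(groupCount(capturing))
    println(groupCount(nonCapturing))
    Array("1", "\"test\"", "'c'", "-23.3").foreach(t => println(t + " -> " + extract(t)))
  }
}
```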

Have a look at the tutorial on Extractor Objects for a better understanding of how apply / unapply / unapplySeq work.
