Regex matching in Kotlin


Question

How do I match secret_code_data in this string:

xeno://soundcloud/?code=secret_code_data#

I tried:

val regex = Regex("""xeno://soundcloud/?code=(.*?)#""")
field = regex.find(url)?.value ?: ""

without luck. I suspect the ? before code might be the problem. Should I escape it somehow? Can you help?

Recommended answer

Here are three options. The first provides a good Regex that does what you want. The other two parse the URL with an alternative to Regex that handles URL component encoding/decoding correctly.

NOTE: The Regex method is unsafe in most use cases because it does not properly parse the URL into components and then decode each component separately. Normally you cannot decode the whole URL into one string and then parse it safely, because some decoded characters might confuse the Regex later. This is similar to parsing XHTML using regex. See the alternatives to Regex below.
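To make the note concrete, here is a minimal sketch of how decoding the whole URL first can change what a regex sees. The pattern RAW_CODE, both function names, and the sample value a%26b%23c are assumptions made up for this illustration; they are not part of the original answer.

```kotlin
import java.net.URLDecoder

// Hypothetical pattern for this illustration only: captures the raw value of "code".
private val RAW_CODE = """[?&]code=([^#&]+)""".toRegex()

// Unsafe order: decode the whole URL, then search it.
// An encoded '&' or '#' inside the value turns into a real delimiter before matching.
fun codeAfterDecodingWholeUrl(url: String): String? =
    RAW_CODE.find(URLDecoder.decode(url, "UTF-8"))?.groupValues?.get(1)

// Safer order: find the still-encoded value, then decode only that component.
fun codeDecodedAfterExtraction(url: String): String? =
    RAW_CODE.find(url)?.groupValues?.get(1)?.let { URLDecoder.decode(it, "UTF-8") }
```

With the input xeno://soundcloud/?code=a%26b%23c, the first function sees the decoded text a&b#c and stops at the &, while the second keeps the full value. Note that URLDecoder follows form-encoding rules (for example, it turns + into a space), which is one more reason the library-based options below are preferable.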

Here is a cleaned-up regex that handles more URLs safely. At the end of this post is a unit test you can use with each method.

private val SECRET_CODE_REGEX = """xeno://soundcloud[/]?.*[\?&]code=([^#&]+).*""".toRegex()
fun findSecretCode(withinUrl: String): String? =
        SECRET_CODE_REGEX.matchEntire(withinUrl)?.groups?.get(1)?.value

This regex handles these cases:

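  • with and without a trailing / in the path
  • with and without a fragment
  • the parameter as the first, middle, or last in the parameter list
  • the parameter as the only parameter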
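As a quick check of the cases listed above, the following sketch repeats the regex and function from this answer so that it is self-contained. The sample URLs used afterwards are illustrative values chosen here, not taken from the original.

```kotlin
// Same pattern as above, repeated so this sketch stands alone.
private val SECRET_CODE_REGEX = """xeno://soundcloud[/]?.*[\?&]code=([^#&]+).*""".toRegex()

// Returns the raw (still URL-encoded) value of the "code" parameter, or null if
// the whole URL does not match the expected shape.
fun findSecretCode(withinUrl: String): String? =
    SECRET_CODE_REGEX.matchEntire(withinUrl)?.groups?.get(1)?.value
```

Calling findSecretCode("xeno://soundcloud?a=1&code=abc&b=2") or findSecretCode("xeno://soundcloud/?code=abc#frag") should yield abc in both cases, while a URL such as xeno://soundcloud?hiddencode=abc yields null, because code must directly follow ? or &.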

Note that the idiomatic way to make a regex in Kotlin is someString.toRegex(). It and other extension methods can be found in the Kotlin API Reference.
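To tie this back to the attempt in the question, here is a small sketch using toRegex(). In the original pattern the unescaped ? is a quantifier that makes the preceding / optional, so the literal ? in the URL is never matched. In addition, MatchResult.value is the whole match, not the captured group. The names unescaped, escaped, and firstGroup are hypothetical helpers for this illustration.

```kotlin
// The pattern from the question: '?' is a quantifier here, so this means
// "soundcloud", an optional '/', then "code=" with no literal '?' in between.
val unescaped = """xeno://soundcloud/?code=(.*?)#""".toRegex()

// Escaping the '?' makes it match the literal character in the URL.
val escaped = """xeno://soundcloud/\?code=(.*?)#""".toRegex()

// groupValues[0] (and .value) is the entire match; groupValues[1] is the first capture group.
fun firstGroup(regex: Regex, input: String): String? =
    regex.find(input)?.groupValues?.get(1)
```

For the URL in the question, firstGroup(unescaped, url) finds nothing, while firstGroup(escaped, url) returns the captured code. This escaped pattern still only covers the exact URL shape from the question, which is why the more tolerant regex above is suggested instead.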

Here is an example using the UriBuilder from the Klutter library for Kotlin. This version handles encoding/decoding, including more modern JavaScript unicode encodings not handled by the Java standard URI class (which has many issues). This is safe and easy, and you don't need to worry about any special cases.

Implementation:

fun findSecretCode(withinUrl: String): String? {
    fun isValidUri(uri: UriBuilder): Boolean = uri.scheme == "xeno"
                    && uri.host == "soundcloud"
                    && (uri.encodedPath == "/" || uri.encodedPath.isNullOrBlank())
    val parsed = buildUri(withinUrl)
    return if (isValidUri(parsed)) parsed.decodedQueryDeduped?.get("code") else null
}

The Klutter uy.klutter:klutter-core-jdk6:$klutter_version artifact is small and includes some other extensions, including the modernized URL encoding/decoding. (For $klutter_version, use the most current release.)

This version, which uses the Java standard URI class, is a little longer. It shows that you need to parse the raw query string yourself, decode after parsing, and then find the query parameter:

import java.net.URI
import java.net.URLDecoder

fun findSecretCode(withinUrl: String): String? {
    fun isValidUri(uri: URI): Boolean = uri.scheme == "xeno"
            && uri.host == "soundcloud"
            && (uri.rawPath == "/" || uri.rawPath.isNullOrBlank())

    val parsed = URI(withinUrl)
    return if (isValidUri(parsed)) {
        // rawQuery is null when the URL has no query string, so guard before splitting
        parsed.rawQuery?.split('&')?.map {
            // split each raw "name=value" pair first, then decode the two halves separately
            val parts = it.split('=')
            val name = parts.firstOrNull() ?: ""
            val value = parts.drop(1).firstOrNull() ?: ""
            URLDecoder.decode(name, Charsets.UTF_8.name()) to URLDecoder.decode(value, Charsets.UTF_8.name())
        }?.firstOrNull { it.first == "code" }?.second
    } else null
}

This could be written as an extension on the URI class itself:

fun URI.findSecretCode(): String? { ... }

In the body, remove the parsed variable and use this instead, since the receiver already is the URI. Then call it using:

val secretCode = URI(myTestUrl).findSecretCode()
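For illustration, one way the extension might look once written out is sketched below. The body is an assumption based on the standard-URI version above, not the original author's code; it also guards against a missing query string and splits each pair on the first = only.

```kotlin
import java.net.URI
import java.net.URLDecoder

// Sketch of the extension form: the receiver (this) replaces the parsed variable.
fun URI.findSecretCode(): String? {
    val validUri = scheme == "xeno" &&
            host == "soundcloud" &&
            (rawPath == "/" || rawPath.isNullOrBlank())
    if (!validUri) return null

    // rawQuery is null when there is no query string at all.
    val query = rawQuery ?: return null

    return query.split('&')
        .map { pair ->
            // Split the raw pair first, then decode name and value separately.
            val parts = pair.split('=', limit = 2)
            val name = URLDecoder.decode(parts[0], "UTF-8")
            val value = URLDecoder.decode(parts.getOrElse(1) { "" }, "UTF-8")
            name to value
        }
        .firstOrNull { it.first == "code" }
        ?.second
}
```

Keeping the validity check and the query parsing inside the extension means callers only need a URI instance. The URI constructor still throws URISyntaxException for malformed input, so callers may want to handle that.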

Unit Tests

Given any of the functions above, run this test to prove it works:

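// assumes the kotlin-test library (with a JUnit binding) is on the test classpath
import kotlin.test.Test
import kotlin.test.assertEquals
import kotlin.test.assertNotEquals

class TestSo34594605 {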
    @Test fun testUriBuilderFindsCode() {
        // positive test cases

        val testUrls = listOf("xeno://soundcloud/?code=secret_code_data#",
                "xeno://soundcloud?code=secret_code_data#",
                "xeno://soundcloud/?code=secret_code_data",
                "xeno://soundcloud?code=secret_code_data",
                "xeno://soundcloud?code=secret_code_data&other=fish",
                "xeno://soundcloud?cat=hairless&code=secret_code_data&other=fish",
                "xeno://soundcloud/?cat=hairless&code=secret_code_data&other=fish",
                "xeno://soundcloud/?cat=hairless&code=secret_code_data",
                "xeno://soundcloud/?cat=hairless&code=secret_code_data&other=fish#fragment"
        )

        testUrls.forEach { test ->
            assertEquals("secret_code_data", findSecretCode(test), "source URL: $test")
        }

        // negative test cases: make sure nothing matches by accident

        val badUrls = listOf("xeno://soundcloud/code/secret_code_data#",
                "xeno://soundcloud?hiddencode=secret_code_data#",
                "http://www.soundcloud.com/?code=secret_code_data")

        badUrls.forEach { test ->
            assertNotEquals("secret_code_data", findSecretCode(test), "source URL: $test")
        }
    }
}
