How to find a string within another, ignoring some characters?


Problem description

Suppose you want to find a partial text within a formatted phone number, and you want to mark (highlight) the match.

For example, if you have the phone number "+972 50-123-4567" and you search for 2501, you should be able to mark the matching text within it, which is "2 50-1".

More examples, as a hashmap of queries and their expected results, when the text to search in is "+972 50-123-45678" and the allowed characters are "01234567890+*#":

    val tests = hashMapOf(
            "" to Pair(0, 0),
            "9" to Pair(1, 2),
            "97" to Pair(1, 3),
            "250" to Pair(3, 7),
            "250123" to Pair(3, 11),
            "250118" to null,
            "++" to null,
            "8" to Pair(16, 17),
            "+" to Pair(0, 1),
            "+8" to null,
            "78" to Pair(15, 17),
            "5678" to Pair(13, 17),
            "788" to null,
            "+ " to Pair(0, 1),
            "  " to Pair(0, 0),
            "+ 5" to null,
            "+ 9" to Pair(0, 2)
    )
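To make the expected pairs easier to read: each one is a half-open range over the original text, so the first index is included and the second is excluded. A minimal sketch, where `slice` is a hypothetical helper that is not part of the original post:

```kotlin
// Hypothetical helper (not from the original post):
// cuts the half-open range [first, second) out of the text.
fun slice(text: String, range: Pair<Int, Int>): String =
    text.substring(range.first, range.second)

fun main() {
    val text = "+972 50-123-45678"
    println(slice(text, Pair(3, 7)))   // match for "250"    -> 2 50
    println(slice(text, Pair(3, 11)))  // match for "250123" -> 2 50-123
    println(slice(text, Pair(0, 2)))   // match for "+ 9"    -> +9
    println(slice(text, Pair(13, 17))) // match for "5678"   -> 5678
}
```

Note that the returned range includes the ignored characters that sit between the matched ones, which is what makes it usable for highlighting.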

The problem

You might think: why not just use "indexOf", or clean the string and then find the occurrence?

But that doesn't work, because I want to mark the occurrence in the original text, ignoring some characters along the way.
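To illustrate why the naive approach falls short, here is a small sketch. `naiveFind` is a hypothetical name used only for this illustration. It strips the ignored characters and calls `indexOf`, so the indexes it returns refer to the cleaned string, not to the original one:

```kotlin
// Naive approach (for illustration only): strip ignored characters, then use indexOf.
// The returned indexes are positions in the CLEANED text.
fun naiveFind(text: String, query: String, allowed: Set<Char>): Pair<Int, Int>? {
    val cleanedText = text.filter { it in allowed }
    val cleanedQuery = query.filter { it in allowed }
    val start = cleanedText.indexOf(cleanedQuery)
    return if (start < 0) null else Pair(start, start + cleanedQuery.length)
}

fun main() {
    val text = "+972 50-123-4567"
    val allowed = "0123456789+*#".toSet()
    val range = naiveFind(text, "2501", allowed)!!
    println(range) // (3, 7)
    // Applying that range to the original text marks the wrong part:
    println(text.substring(range.first, range.second)) // "2 50", but "2 50-1" was wanted
}
```

The match is found, but the range is too short for the original text, because the space and the dash were never counted.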

I actually have an answer, after working on it for quite some time. I just wanted to share it, and optionally see whether anyone can write nicer or shorter code that produces the same behavior.

I had a solution before that was much shorter, but it assumed that the query contains only allowed characters.

So there is no real question this time, because I've found an answer myself.

Still, if you can think of a more elegant and/or shorter solution that is as efficient as what I wrote, please let me know.

I'm pretty sure regular expressions could be a solution here, but they tend to be unreadable, and I expect them to be less efficient than hand-written code. It would still be nice to know how they would handle this kind of problem. Maybe I could run a small benchmark on it too.
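For reference, one way a regular expression could handle this is sketched below. This is not the solution from the post, and it has not been benchmarked. The names `escapeChar` and `findWithRegex` are made up for this sketch. The idea is to keep only the allowed characters of the query and join them with a pattern that matches any run of ignored characters:

```kotlin
// Escapes a character so it is a literal both inside and outside a character class.
// Letters and digits are left alone, because a backslash in front of them has a special meaning.
fun escapeChar(c: Char): String = if (c.isLetterOrDigit()) c.toString() else "\\$c"

// Sketch only: same contract as the post describes
// (Pair(0, 0) for an empty or fully ignored query, null if not found).
fun findWithRegex(text: String, query: String, allowed: Set<Char>): Pair<Int, Int>? {
    val queryChars = query.filter { it in allowed }
    if (queryChars.isEmpty()) return Pair(0, 0)
    // Matches any run of characters that are NOT allowed, e.g. [^0123456789\+\*\#]*
    val ignoredRun = allowed.joinToString(separator = "", prefix = "[^", postfix = "]*") { escapeChar(it) }
    // For the query "250" this becomes: 2<ignoredRun>5<ignoredRun>0
    val pattern = queryChars.map { escapeChar(it) }.joinToString(ignoredRun)
    val match = Regex(pattern).find(text) ?: return null
    return Pair(match.range.first, match.range.last + 1)
}

fun main() {
    val allowed = "0123456789+*#".toSet()
    println(findWithRegex("+972 50-123-45678", "250", allowed)) // (3, 7)
    println(findWithRegex("+972 50-123-45678", "+ 5", allowed)) // null
}
```

The pattern is rebuilt on every call here; if the same query is searched in many texts, compiling the `Regex` once and reusing it would likely be cheaper.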

Recommended answer

OK, so here's my solution, including a sample to test it:

TextSearchUtil.kt

object TextSearchUtil {
    /**@return where the query was found. First integer is the start. The second is the last, excluding.
     * Special cases: Pair(0,0) if query is empty or ignored, null if not found.
     * @param text the text to search within. Only allowed characters are searched for. Rest are ignored
     * @param query what to search for. Only allowed characters are searched for. Rest are ignored
     * @param allowedCharactersSet the only characters we should be allowed to check. Rest are ignored*/
    fun findOccurrenceWhileIgnoringCharacters(text: String, query: String, allowedCharactersSet: HashSet<Char>): Pair<Int, Int>? {
        //get index of first char to search for
        var searchIndexStart = -1
        for ((index, c) in query.withIndex())
            if (allowedCharactersSet.contains(c)) {
                searchIndexStart = index
                break
            }
        if (searchIndexStart == -1) {
            //query contains only ignored characters, so it's like an empty one
            return Pair(0, 0)
        }
        //got index of first character to search for
        if (text.isEmpty())
        //need to search for a character, but the text is empty, so not found
            return null
        var mainIndex = 0
        while (mainIndex < text.length) {
            var searchIndex = searchIndexStart
            var isFirstCharToSearchFor = true
            var secondaryIndex = mainIndex
            var charToSearch = query[searchIndex]
            secondaryLoop@ while (secondaryIndex < text.length) {
                //skip ignored characters on query
                if (!isFirstCharToSearchFor)
                    while (!allowedCharactersSet.contains(charToSearch)) {
                        ++searchIndex
                        if (searchIndex >= query.length) {
                            //reached end of search while all characters were fine, so found the match
                            return Pair(mainIndex, secondaryIndex)
                        }
                        charToSearch = query[searchIndex]
                    }
                //skip ignored characters on text
                var c: Char? = null
                while (secondaryIndex < text.length) {
                    c = text[secondaryIndex]
                    if (allowedCharactersSet.contains(c))
                        break
                    else {
                        if (isFirstCharToSearchFor)
                            break@secondaryLoop
                        ++secondaryIndex
                    }
                }
                //reached end of text
                if (secondaryIndex == text.length) {
                    if (isFirstCharToSearchFor)
                    //couldn't find the first character anywhere, so failed to find the query
                        return null
                    break@secondaryLoop
                }
                //time to compare
                if (c != charToSearch)
                    break@secondaryLoop
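                ++searchIndex
                isFirstCharToSearchFor = false
                //skip ignored characters that follow in the query, so trailing ones can't block a match at the end of the text
                while (searchIndex < query.length && !allowedCharactersSet.contains(query[searchIndex]))
                    ++searchIndex
                if (searchIndex >= query.length) {
                    //reached end of search while all characters were fine, so found the match
                    return Pair(mainIndex, secondaryIndex + 1)
                }
                charToSearch = query[searchIndex]
                ++secondaryIndex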
            }
            ++mainIndex
        }
        return null
    }
}
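As a possible shorter alternative, here is a sketch that is not the code above and has not been benchmarked against it. It cleans the text while remembering where each kept character came from, runs a plain `indexOf` on the cleaned strings, and then maps the result back to the original indexes. The name `findByIndexMapping` is made up for this sketch:

```kotlin
// Sketch of an alternative: clean the text, but remember the original index of every kept character.
// Same contract as above: Pair(0, 0) for an empty or fully ignored query, null if not found.
fun findByIndexMapping(text: String, query: String, allowed: Set<Char>): Pair<Int, Int>? {
    val cleanedQuery = query.filter { it in allowed }
    if (cleanedQuery.isEmpty()) return Pair(0, 0)
    // positions[i] is the index, in the original text, of the i-th allowed character
    val positions = text.indices.filter { text[it] in allowed }
    val cleanedText = buildString { for (p in positions) append(text[p]) }
    val start = cleanedText.indexOf(cleanedQuery)
    if (start < 0) return null
    // Map the first and the last matched characters back to the original text (end is exclusive)
    return Pair(positions[start], positions[start + cleanedQuery.length - 1] + 1)
}

fun main() {
    val allowed = "01234567890+*#".toSet()
    println(findByIndexMapping("+972 50-123-45678", "250123", allowed)) // (3, 11)
    println(findByIndexMapping("+972 50-123-45678", "788", allowed))    // null
}
```

The trade-off is extra memory: it allocates a cleaned copy of the text plus one integer per kept character, which the hand-written loop above avoids.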

Testing it with the examples:

MainActivity.kt

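// assumes AndroidX; with the older support library, AppCompatActivity lives in android.support.v7.app
import android.os.Bundle
import android.util.Log
import androidx.appcompat.app.AppCompatActivity

class MainActivity : AppCompatActivity() {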

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        //
        val text = "+972 50-123-45678"
        val allowedCharacters = "01234567890+*#"
        val allowedPhoneCharactersSet = HashSet<Char>(allowedCharacters.length)
        for (c in allowedCharacters)
            allowedPhoneCharactersSet.add(c)
        //
        val tests = hashMapOf(
                "" to Pair(0, 0),
                "9" to Pair(1, 2),
                "97" to Pair(1, 3),
                "250" to Pair(3, 7),
                "250123" to Pair(3, 11),
                "250118" to null,
                "++" to null,
                "8" to Pair(16, 17),
                "+" to Pair(0, 1),
                "+8" to null,
                "78" to Pair(15, 17),
                "5678" to Pair(13, 17),
                "788" to null,
                "+ " to Pair(0, 1),
                "  " to Pair(0, 0),
                "+ 5" to null,
                "+ 9" to Pair(0, 2)
        )
        for (test in tests) {
            val result = TextSearchUtil.findOccurrenceWhileIgnoringCharacters(text, test.key, allowedPhoneCharactersSet)
            val isResultCorrect = result == test.value
            val foundStr = if (result == null) null else text.substring(result.first, result.second)
            when {
                !isResultCorrect -> Log.e("AppLog", "checking query of \"${test.key}\" inside \"$text\" . Succeeded?$isResultCorrect Result: $result found String: \"$foundStr\"")
                foundStr == null -> Log.d("AppLog", "checking query of \"${test.key}\" inside \"$text\" . Succeeded?$isResultCorrect Result: $result")
                else -> Log.d("AppLog", "checking query of \"${test.key}\" inside \"$text\" . Succeeded?$isResultCorrect Result: $result found String: \"$foundStr\"")

            }
        }
        //
        Log.d("AppLog", "special cases:")
        Log.d("AppLog", "${TextSearchUtil.findOccurrenceWhileIgnoringCharacters("a", "c", allowedPhoneCharactersSet) == Pair(0, 0)}")
        Log.d("AppLog", "${TextSearchUtil.findOccurrenceWhileIgnoringCharacters("ab", "c", allowedPhoneCharactersSet) == Pair(0, 0)}")
        Log.d("AppLog", "${TextSearchUtil.findOccurrenceWhileIgnoringCharacters("ab", "cd", allowedPhoneCharactersSet) == Pair(0, 0)}")
        Log.d("AppLog", "${TextSearchUtil.findOccurrenceWhileIgnoringCharacters("a", "cd", allowedPhoneCharactersSet) == Pair(0, 0)}")
    }

}

If I want to highlight the result, I can use something like this:

    val pair = TextSearchUtil.findOccurrenceWhileIgnoringCharacters(text, "2501", allowedPhoneCharactersSet)
    if (pair == null)
        textView.text = text
    else {
        val wordToSpan = SpannableString(text)
        wordToSpan.setSpan(BackgroundColorSpan(0xFFFFFF00.toInt()), pair.first, pair.second, Spannable.SPAN_EXCLUSIVE_EXCLUSIVE)
        textView.setText(wordToSpan, TextView.BufferType.SPANNABLE)
    }
