Case-insensitive string matching in Rust

Question

Is there a simple way to use str::matches case-insensitively?

Recommended answer

You can always convert both strings to the same casing. This will work for some cases:

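fn main() {
    let needle = "μτς";
    let haystack = "ΜΤΣ";

    // Normalize both strings to lowercase so the comparison ignores case.
    let needle = needle.to_lowercase();
    let haystack = haystack.to_lowercase();

    // `matches` yields each non-overlapping occurrence of the needle.
    for i in haystack.matches(&needle) {
        println!("{:?}", i);
    }
}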

See also str::to_ascii_lowercase for ASCII-only variants.
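
As a rough illustration of the ASCII-only route, the sketch below folds both strings with str::to_ascii_lowercase before calling str::matches. The helper name count_ascii_ci is hypothetical and exists only for this example; the standard-library calls (to_ascii_lowercase, matches, eq_ignore_ascii_case) are real. Note that non-ASCII letters are left untouched, so the Greek pair from above no longer matches.

```rust
/// Counts non-overlapping occurrences of `needle` in `haystack`,
/// ignoring case for ASCII letters only.
/// `count_ascii_ci` is a hypothetical helper name used for illustration.
fn count_ascii_ci(haystack: &str, needle: &str) -> usize {
    // Only the bytes for 'A'..='Z' are changed; everything else is copied as-is.
    let haystack = haystack.to_ascii_lowercase();
    let needle = needle.to_ascii_lowercase();
    haystack.matches(needle.as_str()).count()
}

fn main() {
    // ASCII letters are folded, so all three spellings match.
    println!("{}", count_ascii_ci("Rust, RUST, rust", "rust"));

    // Greek letters are not ASCII, so they are not folded and nothing matches.
    println!("{}", count_ascii_ci("ΜΤΣ", "μτς"));

    // For a plain equality check, no allocation is needed.
    println!("{}", "Rust".eq_ignore_ascii_case("rUST"));
}
```

This approach is probably only appropriate when the input is known to be ASCII, for example protocol keywords or file extensions.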

In other cases, the regex crate might do enough case-folding (potentially Unicode) for you:

use regex::RegexBuilder; // 1.4.3

fn main() {
    let needle = "μτς";
    let haystack = "ΜΤΣ";

    let needle = RegexBuilder::new(needle)
        .case_insensitive(true)
        .build()
        .expect("Invalid Regex");

    for i in needle.find_iter(haystack) {
        println!("{:?}", i);
    }
}

However, remember that ultimately Rust's strings are UTF-8. Yes, you need to deal with all of UTF-8. This means that picking upper- or lower-case might change your results. Likewise, the only correct way to change text casing requires that you know the language of the text; it's not an inherent property of the bytes. Yes, you can have strings which contain emoji and other exciting things beyond the Basic Multilingual Plane.
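
To make the point about upper- versus lower-casing concrete, here is a minimal sketch using the German "ß", which the standard library uppercases to "SS". The helper names eq_via_lowercase and eq_via_uppercase are hypothetical and only serve this example. The same pair of strings compares as different when lowercased and as equal when uppercased.

```rust
/// Compares two strings after lowercasing both (hypothetical helper).
fn eq_via_lowercase(a: &str, b: &str) -> bool {
    a.to_lowercase() == b.to_lowercase()
}

/// Compares two strings after uppercasing both (hypothetical helper).
fn eq_via_uppercase(a: &str, b: &str) -> bool {
    a.to_uppercase() == b.to_uppercase()
}

fn main() {
    let a = "straße";
    let b = "STRASSE";

    // Lowercasing keeps "ß" as it is, so "straße" != "strasse".
    println!("lowercase equal: {}", eq_via_lowercase(a, b));

    // Uppercasing expands "ß" to "SS", so both become "STRASSE".
    println!("uppercase equal: {}", eq_via_uppercase(a, b));
}
```

Neither result is wrong in itself; which one is wanted depends on the text's language and the application, which is why simple case conversion is not a general substitute for proper case folding.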
