Could string comparisons really differ based on culture when the string is guaranteed not to change?

Question

I'm reading encrypted credentials/connection strings from a config file. ReSharper tells me "String.IndexOf(string) is culture-specific here" on this line:

if (line.Contains("host=")) {
    _host = line.Substring(line.IndexOf(
        "host=") + "host=".Length, line.Length - "host=".Length);
}

...and so wants to change it to:

if (line.Contains("host=")) {
    _host = line.Substring(line.IndexOf("host=", System.StringComparison.Ordinal) + "host=".Length, line.Length - "host=".Length);
}

The value I'm reading will always be "host=" regardless of where the app may be deployed. Is it really sensible to add this "System.StringComparison.Ordinal" bit?

More importantly, could it hurt anything (to use it)?

Recommended answer

Absolutely. Per MSDN (http://msdn.microsoft.com/en-us/library/d93tkzah.aspx),


This method performs a word (case-sensitive and culture-sensitive) search using the current culture.

So you may get different results if you run it under a different culture (via regional and language settings in Control Panel).

In this particular case, you probably won't have a problem, but throw an i in the search string and run it in Turkey and it will probably ruin your day.
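For the parsing in the question, a culture-independent version might look like the sketch below. `ConfigLine` and `ExtractHost` are hypothetical names invented for illustration, and the sketch assumes everything after "host=" up to the end of the line is the value. It also uses the one-argument `Substring`, which avoids the length arithmetic in the original snippet.

```csharp
using System;

static class ConfigLine
{
    // Hypothetical helper (not from the original post): returns the text
    // after "host=", or null when the key does not occur in the line.
    public static string ExtractHost(string line)
    {
        const string key = "host=";

        // Ordinal compares raw UTF-16 code units, so the result does not
        // depend on the current thread culture.
        int start = line.IndexOf(key, StringComparison.Ordinal);
        if (start < 0)
        {
            return null;
        }

        // Substring(startIndex) runs to the end of the string.
        return line.Substring(start + key.Length);
    }
}
```

Calling `ConfigLine.ExtractHost("host=db01")` yields `"db01"` under any culture, and a line without the key yields `null` instead of throwing.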

See MSDN: http://msdn.microsoft.com/en-us/library/ms973919.aspx


These new recommendations and APIs exist to alleviate misguided assumptions about the behavior of default string APIs. The canonical example of bugs emerging where non-linguistic string data is interpreted linguistically is the "Turkish-I" problem.

For nearly all Latin alphabets, including U.S. English, the character i (\u0069) is the lowercase version of the character I (\u0049). This casing rule quickly becomes the default for someone programming in such a culture. However, in Turkish ("tr-TR"), there exists a capital "i with a dot," character (\u0130), which is the capital version of i. Similarly, in Turkish, there is a lowercase "i without a dot," or (\u0131), which capitalizes to I. This behavior occurs in the Azeri culture ("az") as well.
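The casing rules just described can be observed directly. This is a minimal sketch, assuming the runtime has Turkish culture data available (which may not hold in invariant-globalization mode); `TurkishCasing` and its methods are illustrative names, not part of the quoted article.

```csharp
using System;
using System.Globalization;

static class TurkishCasing
{
    private static readonly CultureInfo Turkish = new CultureInfo("tr-TR");

    // Under tr-TR, "i" uppercases to dotted capital I (U+0130).
    public static string UpperTurkish(string s) => s.ToUpper(Turkish);

    // Under tr-TR, "I" lowercases to dotless small i (U+0131).
    public static string LowerTurkish(string s) => s.ToLower(Turkish);

    // The invariant culture keeps the familiar i <-> I mapping.
    public static string UpperInvariant(string s) => s.ToUpperInvariant();
}
```

With Turkish culture data present, `UpperTurkish("i")` returns `"\u0130"` and `LowerTurkish("I")` returns `"\u0131"`, while `UpperInvariant("i")` returns `"I"`.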

Therefore, assumptions normally made about capitalizing i or lowercasing I are not valid among all cultures. If the default overloads for string comparison routines are used, they will be subject to variance between cultures. For non-linguistic data, as in the following example, this can produce undesired results:



// Requires: using System; using System.Globalization; using System.Threading;
Thread.CurrentThread.CurrentCulture = new CultureInfo("en-US");
Console.WriteLine("Culture = {0}",
   Thread.CurrentThread.CurrentCulture.DisplayName);
// String.Compare(a, b, true) ignores case using the current culture's rules.
Console.WriteLine("(file == FILE) = {0}",
   (String.Compare("file", "FILE", true) == 0));

Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");
Console.WriteLine("Culture = {0}",
   Thread.CurrentThread.CurrentCulture.DisplayName);
Console.WriteLine("(file == FILE) = {0}",
   (String.Compare("file", "FILE", true) == 0));




Because of the difference of the comparison of I, results of the comparisons change when the thread culture is changed. This is the output:



Culture = English (United States)
(file == FILE) = True
Culture = Turkish (Turkey)
(file == FILE) = False
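For non-linguistic data such as file names or config keys, passing an explicit ordinal comparison should give the same answer under either culture. The sketch below is one way to check that; `OrdinalDemo` and `EqualsIgnoreCaseUnder` are hypothetical names, and it assumes a runtime where the `CultureInfo.CurrentCulture` setter is available (.NET Framework 4.6 or later, or .NET Core).

```csharp
using System;
using System.Globalization;

static class OrdinalDemo
{
    // Hypothetical helper: compares two strings with OrdinalIgnoreCase
    // while the named culture is temporarily the current culture.
    public static bool EqualsIgnoreCaseUnder(string cultureName, string a, string b)
    {
        CultureInfo saved = CultureInfo.CurrentCulture;
        try
        {
            CultureInfo.CurrentCulture = new CultureInfo(cultureName);

            // OrdinalIgnoreCase does not consult the current culture,
            // so the Turkish casing rules never come into play.
            return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
        }
        finally
        {
            // Restore the caller's culture even if the comparison throws.
            CultureInfo.CurrentCulture = saved;
        }
    }
}
```

Both `EqualsIgnoreCaseUnder("en-US", "file", "FILE")` and `EqualsIgnoreCaseUnder("tr-TR", "file", "FILE")` return `true`, unlike the culture-sensitive `String.Compare` overload above.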

Here is an example without case:

var s1 = "\u00E9";  // é as one precomposed character (U+00E9, ALT+0233)
var s2 = "e\u0301"; // 'e' plus combining acute accent U+0301 (two characters)

Console.WriteLine(s1.IndexOf(s2, StringComparison.Ordinal)); //-1
Console.WriteLine(s1.IndexOf(s2, StringComparison.InvariantCulture)); //0
Console.WriteLine(s1.IndexOf(s2, StringComparison.CurrentCulture)); //0
