C# - Comparing strings of different encodings


Problem description

Using C#, I fetch a TextBox.Text value from an .ascx page. When I compare that value for equality with a regular string object inside a LINQ query, the comparison always returns false.

I have come to the conclusion that the two strings are encoded differently, but so far I have had no luck converting or comparing them.

docname = "Testdoc 1.docx"; //regular string created in C#
fetchedVal = ((TextBox)e.Item.FindControl("txtSelectedDocs")).Text; //UTF-8

The two strings above look identical when written as literals, but comparing their byte[] representations shows that they differ, apparently because of the encoding.

I've tried a lot of different things, such as:

System.Text.Encoding.Default.GetString(utf8.GetBytes(fetchedVal));



But that returns the value TestdocÂ1.docx

If I instead use

System.Text.Encoding.Default.GetString(System.Text.Encoding.Default.GetBytes(fetchedVal));



it returns "Testdoc 1.docx", but an Equals() check still returns false.

I have also tried the following, which seems to be the recommended approach, but with no luck:

byte[] utf8Bytes = Encoding.UTF8.GetBytes(fetchedVal);
byte[] unicodeBytes = Encoding.Convert(Encoding.UTF8, Encoding.Unicode, utf8Bytes);
string fetchedValConverted = Encoding.Unicode.GetString(unicodeBytes);



The culprit appears to be the whitespace: when examining the byte sequences, it is always the seventh byte that differs.

How do you properly convert from UTF-8 to the default string encoding in C#?

Recommended answer

Strings don't have encodings or byte arrays. Encodings only come into play when you convert a string into a byte array, and you can only do that by specifying which encoding should be used to pick the bytes.
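A minimal sketch of why the conversion attempts in the question cannot help: encoding a string to bytes and decoding those bytes with the matching encoding gives back exactly the same string, whatever characters it contains. The class name `EncodingRoundTrip` and the sample value containing U+00A0 (NO-BREAK SPACE) are assumptions made for illustration; the question does not say which character is actually in the TextBox value.

```csharp
using System;
using System.Text;

public static class EncodingRoundTrip
{
    // Encodes the text as UTF-8, converts those bytes to UTF-16,
    // and decodes them again (the same steps tried in the question).
    public static string RoundTrip(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        byte[] utf16Bytes = Encoding.Convert(Encoding.UTF8, Encoding.Unicode, utf8Bytes);
        return Encoding.Unicode.GetString(utf16Bytes);
    }

    public static void Main()
    {
        string plain = "Testdoc 1.docx";
        string withNbsp = "Testdoc\u00A01.docx"; // hypothetical value containing U+00A0

        // The round trip changes nothing, so the strings stay different.
        Console.WriteLine(RoundTrip(withNbsp) == withNbsp); // True
        Console.WriteLine(RoundTrip(withNbsp) == plain);    // False

        // The byte arrays differ because the characters differ.
        Console.WriteLine(Encoding.UTF8.GetBytes(plain).Length);    // 14
        Console.WriteLine(Encoding.UTF8.GetBytes(withNbsp).Length); // 15
    }
}
```

The differing bytes are a symptom of differing characters, not of a differing "string encoding".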

It sounds like you simply have different characters in your strings. You might have an invisible character in one of them, or they might contain different characters that look the same.
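As an illustration of two characters that look the same, the sketch below compares an ordinary space (U+0020) with a NO-BREAK SPACE (U+00A0). That the TextBox value contains U+00A0 is only an assumption here, and `NormalizeSpaces` is a hypothetical helper, not part of any library.

```csharp
using System;

public static class LookalikeDemo
{
    // Hypothetical helper: maps NO-BREAK SPACE (U+00A0) to an ordinary space (U+0020).
    public static string NormalizeSpaces(string text)
    {
        return text.Replace('\u00A0', ' ');
    }

    public static void Main()
    {
        string docname = "Testdoc 1.docx";
        string fetchedVal = "Testdoc\u00A01.docx"; // assumed stand-in for the TextBox value

        // The strings look alike on screen but contain different characters.
        Console.WriteLine(docname.Equals(fetchedVal));                  // False
        Console.WriteLine(docname.Equals(NormalizeSpaces(fetchedVal))); // True
    }
}
```

Whether replacing the character is appropriate depends on where the value comes from; it is worth confirming which character is actually present before normalizing anything.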

To find out, look at the Unicode code point value of each character in each string (e.g., (int)str[0]).
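One way to do that inspection, sketched under assumptions: `CodePointDump`, `Describe` and `FirstDifference` are hypothetical names, and the second sample string is an assumed stand-in for the TextBox value. Note that `(int)text[i]` yields a UTF-16 code unit, which equals the code point only for characters in the Basic Multilingual Plane.

```csharp
using System;

public static class CodePointDump
{
    // Returns each UTF-16 code unit of the text as "U+XXXX", separated by spaces.
    public static string Describe(string text)
    {
        var parts = new string[text.Length];
        for (int i = 0; i < text.Length; i++)
        {
            parts[i] = "U+" + ((int)text[i]).ToString("X4");
        }
        return string.Join(" ", parts);
    }

    // Returns the index of the first position at which the strings differ,
    // or -1 if they are identical.
    public static int FirstDifference(string a, string b)
    {
        int shared = Math.Min(a.Length, b.Length);
        for (int i = 0; i < shared; i++)
        {
            if (a[i] != b[i]) return i;
        }
        return a.Length == b.Length ? -1 : shared;
    }

    public static void Main()
    {
        string docname = "Testdoc 1.docx";
        string fetchedVal = "Testdoc\u00A01.docx"; // assumed stand-in for the TextBox value

        Console.WriteLine(Describe(docname));
        Console.WriteLine(Describe(fetchedVal));
        Console.WriteLine(FirstDifference(docname, fetchedVal)); // 7
    }
}
```

Printing both dumps side by side shows exactly which character differs, and that in turn tells you what to fix at the source or how to normalize before comparing.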
