UTF8 string variable in C#

Question

I am using PostgreSQL to power a C# desktop application. When I use the pgAdmin query analyzer to update a text column with a special character (such as the copyright symbol), it works perfectly:

update table1 set column1='value with special character ©' where column2=1

When I run the same query from my C# application, it throws an error:


invalid byte sequence for encoding

After researching this issue, I understand that .NET strings use the UTF-16 Unicode encoding.

Consider:

string sourcetext = "value with special character ©";
// Convert a string to utf-8 bytes.
byte[] utf8Bytes = System.Text.Encoding.UTF8.GetBytes(sourcetext);

// Convert utf-8 bytes to a string. 
string desttext = System.Text.Encoding.UTF8.GetString(utf8Bytes);

The problem here is that both sourcetext and desttext are .NET strings, which are UTF-16 internally, so the round trip through UTF-8 bytes changes nothing. When I pass desttext, I still get the exception.
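A small sketch may make the round trip concrete. It shows that encoding to UTF-8 and decoding again yields a string equal to the original, and that only the byte array is actually UTF-8. The class and method names (Utf8RoundTrip, ToUtf8Hex) are illustrative and not from the original post.

```csharp
using System;
using System.Text;

public static class Utf8RoundTrip
{
    // Returns the UTF-8 bytes of a string as dash-separated hex.
    public static string ToUtf8Hex(string text)
    {
        return BitConverter.ToString(Encoding.UTF8.GetBytes(text));
    }

    public static void Main()
    {
        // \u00A9 is the copyright sign; the escape avoids source-file encoding issues.
        string sourcetext = "value with special character \u00A9";

        // The byte array holds the UTF-8 form of the text.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(sourcetext);

        // Decoding produces an ordinary .NET string again (UTF-16 in memory).
        string desttext = Encoding.UTF8.GetString(utf8Bytes);

        Console.WriteLine(sourcetext == desttext);   // True
        Console.WriteLine(sourcetext.Length);        // 30 UTF-16 code units
        Console.WriteLine(utf8Bytes.Length);         // 31 bytes: the sign takes two
        Console.WriteLine(ToUtf8Hex("\u00A9"));      // C2-A9
    }
}
```

Since desttext is equal to sourcetext, passing it to the database driver cannot behave differently from passing the original string.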

I've also tried the following without success:

Encoder.GetString, BitConverter.GetString

Edit: I even tried this, and it doesn't help:

unsafe
{
  String utfeightstring = null;
  string sourcetext = "value with special character ©";
  Console.WriteLine(sourcetext);
  // Convert a string to utf-8 bytes. 
  sbyte[] utf8Chars = (sbyte[]) (Array) System.Text.Encoding.UTF8.GetBytes(sourcetext); 
  UTF8Encoding encoding = new UTF8Encoding(true, true);

  // Instruct the Garbage Collector not to move the memory
  fixed (sbyte* pUtf8Chars = utf8Chars)
  {
    utfeightstring = new String(pUtf8Chars, 0, utf8Chars.Length, encoding);
  }
  Console.WriteLine("The UTF8 String is " + utfeightstring); 
}

Is there a data type in .NET that supports storing a UTF-8 encoded string? Are there alternative ways to handle this situation?

Recommended answer

According to the Mono project's PostgreSQL page, if you get errors with UTF-8 strings you can set the encoding to Unicode in the connection string (if you are using the Npgsql driver):


Encoding: Encoding to be used. Possible values: ASCII(default) and UNICODE. Use UNICODE if you are getting problems with UTF-8 values: Encoding=UNICODE

I also looked in the official Npgsql docs (NpgsqlConnection.ConnectionString), and this option isn't mentioned there.
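As a rough sketch, the option quoted above could be appended to a connection string like this. The host, database, and credential values are placeholders, and the helper name WithUnicodeEncoding is hypothetical. Whether a given Npgsql version honors Encoding=UNICODE is an assumption taken from the Mono page, not something confirmed by the Npgsql docs.

```csharp
using System;

public static class ConnectionStringSketch
{
    // Appends the Encoding=UNICODE option quoted from the Mono page
    // to an existing connection string (hypothetical helper).
    public static string WithUnicodeEncoding(string baseConnectionString)
    {
        // Drop any trailing separator so the result has no empty segment.
        string trimmed = baseConnectionString.TrimEnd(';', ' ');
        return trimmed + ";Encoding=UNICODE";
    }

    public static void Main()
    {
        // Placeholder values, not from the original question.
        string baseCs = "Server=localhost;Port=5432;Database=mydb;User Id=myuser;Password=secret";
        string cs = WithUnicodeEncoding(baseCs);
        Console.WriteLine(cs);

        // With the Npgsql package referenced, this string would be passed to
        // the NpgsqlConnection constructor; that step is omitted here so the
        // sketch has no external dependency.
    }
}
```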
