Convert UTF-8 string to Unicode - VB.NET

Question

Hello good people,

As the title says, I have a non-English string that I got from the web (via URLDownloadToFile()), and I am trying to convert it to readable, MySQL-friendly Unicode. This is the code I found on MSDN, but somehow it fails to do the job for me (strLine is the input string). Could anybody please tell me what is wrong? Thanks a lot.

Dim utf8 As Encoding = Encoding.UTF8
Dim unicode As Encoding = Encoding.Unicode
Dim utf8Bytes As Byte() = utf8.GetBytes(strLine)
Dim unicodeBytes As Byte() = Encoding.Convert(utf8, unicode, utf8Bytes)
Dim unicodeChars(unicode.GetCharCount(unicodeBytes, 0, unicodeBytes.Length) - 1) As Char
unicode.GetChars(unicodeBytes, 0, unicodeBytes.Length, unicodeChars, 0)
Dim unicodeString As New String(unicodeChars)

Recommended answer

You have a downloaded text file in UTF-8 format. You must have read the file to get that string. So why not set the encoding when you read it and let the reader do the conversion?

' set monospaced font
TextBox1.Font = New System.Drawing.Font("DejaVu Sans Mono", 10, _
                                        System.Drawing.FontStyle.Regular, _
                                        System.Drawing.GraphicsUnit.Point)

Dim fs As IO.FileStream = IO.File.OpenRead(Utf8_FilePath)
Dim sr As New IO.StreamReader(fs, System.Text.Encoding.UTF8)

While sr.Peek <> -1
   TextBox1.AppendText(sr.ReadLine() & vbCrLf)
End While
sr.Close()

Even less code:

' set monospaced font
TextBox1.Font = New System.Drawing.Font("DejaVu Sans Mono", 10, _
                                        System.Drawing.FontStyle.Regular, _
                                        System.Drawing.GraphicsUnit.Point)


' ref: http://msdn.microsoft.com/en-us/library/system.io.file.opentext.aspx
' Opens an existing UTF-8 encoded text file for reading.
Dim sr As IO.StreamReader = IO.File.OpenText(Utf8_FilePath)

While sr.Peek <> -1
   TextBox1.AppendText(sr.ReadLine() & vbCrLf)
End While
sr.Close()



The code was tested against this file: http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt
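
If the whole file fits comfortably in memory, the same idea (let the reader decode UTF-8 for you) can probably be written in a single call. The following is only a sketch: the module name, the `filePath` value, and the console output are assumptions made for illustration, not part of the answer above.

```vbnet
Imports System.IO
Imports System.Text

Module ReadUtf8Sketch
    Sub Main()
        ' Hypothetical path; replace with the file saved by URLDownloadToFile().
        Dim filePath As String = "UTF-8-demo.txt"

        ' File.ReadAllText decodes the bytes with the given encoding and
        ' returns an ordinary .NET string (UTF-16 internally).
        Dim content As String = File.ReadAllText(filePath, Encoding.UTF8)

        Console.WriteLine("Characters read: " & content.Length)
    End Sub
End Module
```

The resulting string needs no further "UTF-8 to Unicode" step; it can be assigned to `TextBox1.Text` or passed to a database parameter directly.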


Take a look at this post. The poster was originally doing pretty much what you are trying, with poor results, but his answer seems to have solved his problem. His code is in C#, but you should be able to convert it easily enough.

Here's the code from the post:
private byte[] GetRawBytes(string str)
{
  int charcount = str.Length;
  byte[] byttemp = new byte[charcount];
  
  for (int i = 0; i < charcount; i++)
  {
    byttemp[i] = (byte)str[i];
  }

  return byttemp;
}

private string UTF8toUnicode(string str)
{
  byte[] bytUTF8;
  byte[] bytUnicode;
  string strUnicode = String.Empty;

  bytUTF8 = GetRawBytes(str);
  bytUnicode = Encoding.Convert(Encoding.UTF8, Encoding.Unicode, bytUTF8);
  strUnicode = Encoding.Unicode.GetString(bytUnicode);
  return strUnicode;
}
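
For reference, a rough VB.NET translation of the C# above might look like the sketch below. The module name `Utf8Repair` and the parameter name `source` are my own choices. It assumes the input string was produced by reading each UTF-8 byte as one character, so every character has a code point in the range 0-255. The low-byte mask mirrors what the unchecked `(byte)` cast does in C#. Without it, `CByte` would throw an `OverflowException` for larger code points.

```vbnet
Imports System.Text

Module Utf8Repair
    ' Turns each character back into the single raw byte it was read from.
    ' Assumption: every character's code point is in the range 0-255.
    Private Function GetRawBytes(ByVal source As String) As Byte()
        Dim bytes(source.Length - 1) As Byte
        For i As Integer = 0 To source.Length - 1
            ' Keep only the low 8 bits, like the unchecked (byte) cast in C#.
            bytes(i) = CByte(AscW(source(i)) And &HFF)
        Next
        Return bytes
    End Function

    Public Function UTF8toUnicode(ByVal source As String) As String
        ' Decode the recovered bytes as UTF-8. This gives the same result as
        ' Encoding.Convert(UTF8, Unicode, ...) followed by Unicode.GetString.
        Return Encoding.UTF8.GetString(GetRawBytes(source))
    End Function
End Module
```

This also hints at why the code in the question changes nothing: `utf8.GetBytes(strLine)` encodes the characters already in the string, so converting those bytes back simply reproduces `strLine`. The repair only works when the bytes are recovered one per character, as `GetRawBytes` does.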

