WebClient DownloadString UTF-8 not displaying international characters
Question
I am trying to save the HTML of a website in a string. The website has international characters (ę, ś, ć, ...), and they are not saved to the string correctly even though I set the encoding to UTF-8, which corresponds to the website's charset.
Here is my code:
using (WebClient client = new WebClient())
{
    client.Encoding = Encoding.UTF8;
    string htmlCode = client.DownloadString("http://www.filmweb.pl/Mroczne.Widmo");
}
When I print "htmlCode" to the console, the international characters are not shown correctly even though in the original HTML they are shown correctly.
Any help is appreciated.
Recommended answer
I had the same problem. It seems that client.DownloadString doesn't decode the response as UTF-8. Using client.DownloadData and decoding the returned bytes with Encoding.UTF8.GetString solves the problem.
using (WebClient client = new WebClient())
{
    var htmlData = client.DownloadData("http://www.filmweb.pl/Mroczne.Widmo");
    var htmlCode = Encoding.UTF8.GetString(htmlData);
}
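To see why decoding the raw bytes explicitly can matter, here is a minimal offline sketch. It rests on some assumptions that are not in the original post: the sample string, the choice of ISO-8859-1 as the "wrong" charset, and the helper names DecodeUtf8 and DecodeLatin1 are all illustrative. It also sets Console.OutputEncoding, on the assumption that the console itself may not be rendering UTF-8, which could be a separate cause of garbled output.

```csharp
using System;
using System.Text;

static class DecodingDemo
{
    // Hypothetical helper: decodes raw response bytes explicitly as UTF-8.
    public static string DecodeUtf8(byte[] raw) => Encoding.UTF8.GetString(raw);

    // Hypothetical helper: decodes the same bytes as ISO-8859-1
    // to imitate a wrong charset guess.
    public static string DecodeLatin1(byte[] raw) =>
        Encoding.GetEncoding("ISO-8859-1").GetString(raw);

    static void Main()
    {
        // Stand-in for the byte[] that client.DownloadData would return.
        // The escapes spell "ę ś ć" so the source file encoding does not matter.
        byte[] raw = Encoding.UTF8.GetBytes("\u0119 \u015B \u0107");

        // Assumption: the console may need UTF-8 output to render these characters.
        Console.OutputEncoding = Encoding.UTF8;

        Console.WriteLine(DecodeLatin1(raw)); // garbled: bytes read with the wrong charset
        Console.WriteLine(DecodeUtf8(raw));   // the original characters
    }
}
```

If the second line still looks wrong in your terminal, the string itself is probably fine and the console font or code page is the more likely culprit.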