WebClient DownloadString UTF-8 not displaying international characters


Question

I am trying to save the HTML of a website in a string. The website has international characters (ę, ś, ć, ...), and they are not saved to the string correctly even though I set the encoding to UTF-8, which matches the website's charset.

Here is my code:

using (WebClient client = new WebClient())
{
    // Ask WebClient to use UTF-8 when downloading strings.
    client.Encoding = Encoding.UTF8;
    string htmlCode = client.DownloadString("http://www.filmweb.pl/Mroczne.Widmo");
}

When I print "htmlCode" to the console, the international characters are not shown correctly, even though they appear correctly in the original HTML.

Any help is appreciated.

Recommended answer

I had the same problem. It seems that client.DownloadString does not decode the response as UTF-8 in this case. Downloading the raw bytes with client.DownloadData and decoding them with Encoding.UTF8.GetString solves the problem.

using (WebClient client = new WebClient())
{
     var htmlData = client.DownloadData("http://www.filmweb.pl/Mroczne.Widmo");
     var htmlCode = Encoding.UTF8.GetString(htmlData);
}
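
To see why decoding the raw bytes explicitly matters, here is a minimal offline sketch. It is not part of the original answer: the class name Utf8DecodeDemo, the helper DecodeUtf8, and the sample byte array (a stand-in for what client.DownloadData would return) are all hypothetical. It also sets Console.OutputEncoding, on the assumption that the console's own encoding may be another reason the characters look wrong when printed.

```csharp
using System;
using System.Text;

public static class Utf8DecodeDemo
{
    // Decodes raw bytes as UTF-8, the same call the answer applies
    // to the result of client.DownloadData.
    public static string DecodeUtf8(byte[] data)
    {
        return Encoding.UTF8.GetString(data);
    }

    public static void Main()
    {
        // Hypothetical stand-in for downloaded bytes: "<p>ę ś ć</p>" encoded as UTF-8.
        byte[] htmlData = Encoding.UTF8.GetBytes("<p>\u0119 \u015B \u0107</p>");

        // Correct: the bytes are interpreted as UTF-8.
        string correct = DecodeUtf8(htmlData);

        // Incorrect: the same bytes interpreted with a single-byte encoding,
        // which turns each multi-byte character into several wrong ones.
        string garbled = Encoding.GetEncoding("ISO-8859-1").GetString(htmlData);

        // Assumption: the console may need UTF-8 output to render these characters.
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(correct);
        Console.WriteLine(garbled);
    }
}
```

If the string decodes correctly but still prints incorrectly, the console font or output encoding is the more likely culprit than the download itself.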
