WebClient.DownloadString results in mangled characters due to encoding issues, but the browser is OK
Question
The following code:
var text = (new WebClient()).DownloadString("http://export.arxiv.org/api/query?search_query=au:Freidel_L*&start=0&max_results=20");
results in a variable text that contains, among many other things, the string
"$Îº$-Minkowski space, scalar field, and the issue of Lorentz invariance"
However, when I visit that URL in Firefox, I get
$κ$-Minkowski space, scalar field, and the issue of Lorentz invariance
which is actually correct. I also tried
var data = (new WebClient()).DownloadData("http://export.arxiv.org/api/query?search_query=au:Freidel_L*&start=0&max_results=20");
var text = System.Text.UTF8Encoding.Default.GetString(data);
but this gave the same problem.
I'm not sure where the fault lies here. Is the feed lying about being UTF-8 encoded, and the browser is smart enough to figure that out but WebClient is not? Or is the feed properly UTF-8 encoded, but WebClient is failing in some other way? What can I do to mitigate this?
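Recommended answer
It's not lying. You should set the WebClient's Encoding property before calling DownloadString. DownloadString decodes the response with that property, which defaults to Encoding.Default (the system's ANSI code page on .NET Framework) rather than UTF-8.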
using (WebClient webClient = new WebClient())
{
    // Decode the response as UTF-8 instead of the system default encoding.
    webClient.Encoding = Encoding.UTF8;
    string s = webClient.DownloadString("http://export.arxiv.org/api/query?search_query=au:Freidel_L*&start=0&max_results=20");
}
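As for why your alternative isn't working, it's because the usage is incorrect. UTF8Encoding.Default is just the static Encoding.Default property reached through the derived class, so it returns the system default encoding rather than UTF-8. It should be:
var text = System.Text.Encoding.UTF8.GetString(data);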
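To see where the mangled characters come from without touching the network, here is a minimal sketch that decodes the same UTF-8 bytes twice. The class name EncodingDemo and the helper Decode are hypothetical, and ISO-8859-1 is only an assumed stand-in for a Western single-byte default encoding; the actual default depends on the machine.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical helper: decode raw bytes with whichever encoding the caller picks.
    public static string Decode(byte[] bytes, Encoding encoding)
    {
        return encoding.GetString(bytes);
    }

    public static void Main()
    {
        // U+03BA (Greek kappa) becomes the two bytes 0xCE 0xBA in UTF-8.
        byte[] data = Encoding.UTF8.GetBytes("$\u03BA$-Minkowski space");

        // Assumption: ISO-8859-1 stands in for a single-byte system default.
        Encoding singleByte = Encoding.GetEncoding("ISO-8859-1");

        Console.OutputEncoding = Encoding.UTF8;

        // Each byte is read as its own character, so kappa turns into two symbols.
        Console.WriteLine(Decode(data, singleByte));

        // The two bytes are read together as one character, so kappa survives.
        Console.WriteLine(Decode(data, Encoding.UTF8));
    }
}
```

The first line printed shows the two-character garbage in place of kappa and the second shows the intended title, which suggests the feed's bytes are fine and only the decoding step differs.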