URL decoding confusion
Question
I've got a database that refers to the following URL:
http://en.wikipedia.org/wiki/Herbert_Gr%F6nemeyer
However, this seems to be a bad URL encoding, which causes problems with both HttpUtility.UrlDecode (gives me garbage) and Uri.UnescapeDataString (throws a UriFormatException). My browser passes the path on to Wikipedia unaltered (so I assume the %F6 gets encoded by the browser), as follows:
GET /wiki/Herbert_Gr%F6nemeyer HTTP/1.1
Wikipedia recognizes it and responds with a 301 redirect to:
Location: http://en.wikipedia.org/wiki/Herbert_Gr%C3%B6nemeyer
What's going on here? Does Wikipedia have an additional proprietary encoding?
Edit: I've got a local copy of Wikipedia which I am attempting to cross-reference against this URL. The articles are indexed by title, which in this case would be "Herbert Grönemeyer". Can anyone suggest how I would get from "Herbert_Gr%F6nemeyer" to "Herbert Grönemeyer" in code? Obviously the underscore is not the problem here.
Recommended answer
%C3%B6 is the proper UTF-8 encoding of ö (o-umlaut). I would assume that %F6 is a byte-for-byte copy of the same character's byte value in some local single-byte encoding (e.g. code page 1252, where ö is the single byte 0xF6).
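If that assumption holds, one way to get from the escaped path to the article title is to percent-decode to raw bytes, try UTF-8 first, and fall back to a single-byte code page only when the bytes are not valid UTF-8. The following is a minimal sketch in Python rather than .NET; the function name `decode_title` and the choice of cp1252 as the fallback are assumptions for illustration, not something the database is known to use. In .NET, the `HttpUtility.UrlDecode` overload that takes an `Encoding` should allow a similar approach.

```python
from urllib.parse import unquote_to_bytes


def decode_title(escaped: str, fallback: str = "cp1252") -> str:
    """Percent-decode a Wikipedia path segment into an article title.

    Hypothetical helper: tries UTF-8 first and falls back to a legacy
    single-byte encoding (cp1252 is an assumption) if that fails.
    """
    # Turn %XX escapes into raw bytes without assuming any charset yet.
    raw = unquote_to_bytes(escaped)
    try:
        # %C3%B6 style escapes are valid UTF-8 and decode here.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # A lone 0xF6 is not valid UTF-8, so reinterpret the bytes
        # using the fallback code page instead. Note that cp1252 leaves
        # a few byte values undefined, so this can still raise.
        text = raw.decode(fallback)
    # Wikipedia URLs use underscores where titles have spaces.
    return text.replace("_", " ")


print(decode_title("Herbert_Gr%F6nemeyer"))
print(decode_title("Herbert_Gr%C3%B6nemeyer"))
```

Both calls should yield the same title, which is consistent with Wikipedia redirecting the %F6 form to the %C3%B6 form. Trying UTF-8 first matters because a valid UTF-8 sequence would otherwise be misread as two separate cp1252 characters.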