URL decoding confusion

Question

I've got a DB that refers to the following URL:

http://en.wikipedia.org/wiki/Herbert_Gr%F6nemeyer

However, this seems to be a badly encoded URL: it causes problems with both HttpUtility.UrlDecode (which gives me garbage) and Uri.UnescapeDataString (which throws a UriFormatException). My browser passes the path on to Wikipedia unaltered (so I assume the %F6 gets encoded by the browser), as follows:

GET /wiki/Herbert_Gr%F6nemeyer HTTP/1.1

Wikipedia recognizes this and responds with a 301 redirect to:

Location: http://en.wikipedia.org/wiki/Herbert_Gr%C3%B6nemeyer

What's going on here? Does Wikipedia have an additional proprietary encoding?

Edit: I've got a local copy of Wikipedia which I am attempting to cross-reference against this URL. The articles are indexed by title, which in this case would be "Herbert Grönemeyer". Can anyone suggest how I would get from "Herbert_Gr%F6nemeyer" to "Herbert Grönemeyer" in code? Obviously the underscore is not the problem here.

Recommended answer

%C3%B6 is the proper UTF-8 encoding for ö (o-umlaut). I would assume that %F6 is a byte-for-byte copy of the byte value of the same character in some local single-byte encoding (e.g. code page 1252).
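
To illustrate the idea, here is a minimal sketch in Python (not the asker's .NET stack). It tries strict UTF-8 first and falls back to code page 1252 only when the bytes are not valid UTF-8. The helper name `decode_title` and the choice of cp1252 as the fallback are assumptions made for illustration. The fallback is a heuristic, since a legacy-encoded byte sequence can occasionally also be valid UTF-8.

```python
from urllib.parse import unquote


def decode_title(escaped: str, fallback: str = "cp1252") -> str:
    """Percent-decode a URL path segment into an article title.

    Hypothetical helper: tries UTF-8 first, then a legacy single-byte
    encoding (cp1252 is an assumption, taken from the answer's guess).
    """
    try:
        # Strict UTF-8: a lone %F6 byte is not valid UTF-8, so this raises.
        text = unquote(escaped, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        # Reinterpret the same bytes in the fallback encoding.
        # This can still raise for bytes the code page leaves undefined.
        text = unquote(escaped, encoding=fallback, errors="strict")
    # Wikipedia URLs use underscores where titles have spaces.
    return text.replace("_", " ")


if __name__ == "__main__":
    print(decode_title("Herbert_Gr%F6nemeyer"))
    print(decode_title("Herbert_Gr%C3%B6nemeyer"))
```

In .NET, the comparable step would likely be the `HttpUtility.UrlDecode(string, Encoding)` overload with a Windows-1252 or ISO-8859-1 `Encoding` instead of the default UTF-8, though that variant is untested here.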
