HttpUtility.HtmlDecode fails to decode &#39
Problem description
I am using .NET 4.5, and HttpUtility.HtmlDecode fails to decode &#39, which is the single-quote character. Any idea why?
This is a C# .NET 4.5 WPF application on Windows 8.1.
Here is the text that fails:
Apple 13&#39&#39 Z0RA30256 MacBook Pro Retina
Here is the framework version:
#region Assembly System.Web.dll, v4.0.0.0
// C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.5\System.Web.dll
#endregion
Recommended answer
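There is no way to handle this with the built-in HtmlDecode method; you will have to find/replace it or otherwise work around it.
Here is the source code for HtmlDecode. You can see from the comment that your scenario was explicitly considered but is not supported: HTML entities must be terminated with a ;, otherwise they are simply not HTML entities. Browsers are forgiving of bad markup and compensate accordingly.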
// We found a '&'. Now look for the next ';' or '&'. The idea is that
// if we find another '&' before finding a ';', then this is not an entity,
// and the next '&' might start a real entity (VSWhidbey 275184)
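The find/replace workaround could be sketched roughly as follows: append the missing ; to unterminated numeric references, then hand the result to the built-in decoder. The class name LenientHtmlDecoder, the regular expression, and the sample string are illustrative assumptions, not framework code, and the sketch calls WebUtility.HtmlDecode (System.Net) so that it needs no System.Web reference.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the framework.
public static class LenientHtmlDecoder
{
    // A numeric character reference (decimal or hex) that is not followed by ';'.
    // The atomic group (?>...) stops the regex from giving back digits, so a
    // well-formed "&#39;" is never split into "&#3" + "9;".
    private static readonly Regex UnterminatedNumericEntity =
        new Regex(@"&#(?>[0-9]+|[xX][0-9a-fA-F]+)(?!;)");

    public static string Decode(string value)
    {
        if (value == null)
        {
            return null;
        }

        // Step 1: add the missing terminator to each unterminated reference.
        string repaired = UnterminatedNumericEntity.Replace(value, m => m.Value + ";");

        // Step 2: let the built-in decoder do the actual decoding.
        return WebUtility.HtmlDecode(repaired);
    }
}

// Usage (sample input assumed to resemble the text in the question):
//   LenientHtmlDecoder.Decode("Apple 13&#39&#39 Z0RA30256 MacBook Pro Retina")
//   returns "Apple 13'' Z0RA30256 MacBook Pro Retina"
```

One limitation of this approach: if an unterminated reference is immediately followed by literal digits (for example &#3913), there is no way to tell where the reference ends, so the whole digit run is treated as the code point.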
Here is the full source of HtmlDecode from the .NET HttpUtility, should you wish to adjust the behavior.
http://referencesource.microsoft.com/#System/net/System/Net/WebUtility.cs,44d08941e6aeb00d
public static void HtmlDecode(string value, TextWriter output)
{
    if (value == null)
    {
        return;
    }
    if (output == null)
    {
        throw new ArgumentNullException("output");
    }
    if (value.IndexOf('&') < 0)
    {
        output.Write(value); // good as is
        return;
    }
    int l = value.Length;
    for (int i = 0; i < l; i++)
    {
        char ch = value[i];
        if (ch == '&')
        {
            // We found a '&'. Now look for the next ';' or '&'. The idea is that
            // if we find another '&' before finding a ';', then this is not an entity,
            // and the next '&' might start a real entity (VSWhidbey 275184)
            int index = value.IndexOfAny(_htmlEntityEndingChars, i + 1);
            if (index > 0 && value[index] == ';')
            {
                string entity = value.Substring(i + 1, index - i - 1);
                if (entity.Length > 1 && entity[0] == '#')
                {
                    // The # syntax can be in decimal or hex, e.g.
                    // &#229; --> decimal
                    // &#xE5; --> same char in hex
                    // See http://www.w3.org/TR/REC-html40/charset.html#entities
                    ushort parsed;
                    if (entity[1] == 'x' || entity[1] == 'X')
                    {
                        UInt16.TryParse(entity.Substring(2), NumberStyles.AllowHexSpecifier, NumberFormatInfo.InvariantInfo, out parsed);
                    }
                    else
                    {
                        UInt16.TryParse(entity.Substring(1), NumberStyles.Integer, NumberFormatInfo.InvariantInfo, out parsed);
                    }
                    if (parsed != 0)
                    {
                        ch = (char)parsed;
                        i = index; // already looked at everything until semicolon
                    }
                }
                else
                {
                    i = index; // already looked at everything until semicolon
                    char entityChar = HtmlEntities.Lookup(entity);
                    if (entityChar != (char)0)
                    {
                        ch = entityChar;
                    }
                    else
                    {
                        output.Write('&');
                        output.Write(entity);
                        output.Write(';');
                        continue;
                    }
                }
            }
        }
        output.Write(ch);
    }
}
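To see the terminator rule from the listing in action, a small sketch like the one below may help. It assumes WebUtility.HtmlDecode (System.Net) applies the same "next ; or &" scan as the code above; the class name EntityTerminatorDemo and the sample strings are made up for illustration.

```csharp
using System;
using System.Net;

// Hypothetical demo class, not part of the framework.
public static class EntityTerminatorDemo
{
    public static string[] Run()
    {
        return new[]
        {
            // Terminated reference: decoded to a single quote.
            WebUtility.HtmlDecode("13&#39;"),      // "13'"

            // No ';' anywhere after the '&': left untouched.
            WebUtility.HtmlDecode("13&#39"),       // "13&#39"

            // The first '&' hits another '&' before any ';', so only the
            // second reference counts as an entity.
            WebUtility.HtmlDecode("&#39&#39;"),    // "&#39'"

            // Terminated but unknown named entity: written back unchanged.
            WebUtility.HtmlDecode("&bogus;"),      // "&bogus;"
        };
    }
}
```

The third case is the one that matters for the question: a run of unterminated references is passed through as literal text, which is why a pre-processing step or a modified copy of the loop is needed.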