Unicode issue with an HTML title: question mark (&#65533;)

Views: 63 | Published: 2021/12/28 16:58:03 | Tags: java html unicode utf-8

I'm trying to parse the title from the following webpage: http://kid37.blogger.de/stories/1670573/

When I use the Apache Commons Lang StringEscapeUtils.escapeHtml method on the title element, I get the following:

Das hermetische Caf&#65533;: Rock &amp; Wrestling 2010

However, when I display that in my webpage with UTF-8 encoding, it just shows a question mark.

I'm using the following code:

String title = StringEscapeUtils.escapeHtml(myTitle);

If I run the title through this website: http://tools.devshed.com/?option=com_mechtools&tool=27 I get the following output, which seems correct:

Title:

<title>Das hermetische Café: Rock &amp; Wrestling 2010</title>

BECOMES (which I was expecting the escapeHtml method to do):

<title>Das hermetische Caf&eacute;: Rock &amp; Wrestling 2010</title>

Any ideas? Thanks.
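
For reference, 65533 is the decimal value of U+FFFD, the Unicode replacement character. Its presence suggests the é was already lost when the page bytes were decoded, before escapeHtml ever saw the string. The sketch below reproduces that symptom under one assumption the question does not confirm: that the page is served as ISO-8859-1 and is being read as UTF-8. The class and method names (ReplacementCharDemo, latin1Bytes, decode) are hypothetical and exist only for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {
    // Title text taken from the question.
    public static final String TITLE = "Das hermetische Caf\u00e9: Rock & Wrestling 2010";

    // Simulates the bytes a server would send IF the page were ISO-8859-1 (assumption).
    public static byte[] latin1Bytes() {
        return TITLE.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Decodes raw bytes with the given charset. With UTF-8, malformed byte
    // sequences are silently replaced by U+FFFD.
    public static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        byte[] raw = latin1Bytes();
        String wrong = decode(raw, StandardCharsets.UTF_8);      // mismatched charset
        String right = decode(raw, StandardCharsets.ISO_8859_1); // matching charset
        int i = TITLE.indexOf('\u00e9');                         // position of the e-acute

        System.out.println((int) wrong.charAt(i)); // 65533 (U+FFFD)
        System.out.println((int) right.charAt(i)); // 233   (U+00E9)
    }
}
```

If this is what is happening, the fix belongs at the point where the page is read: decode the response with the charset declared in its Content-Type header or meta tag rather than a hard-coded or platform default one. Once the character has become U+FFFD, no escaping step can recover it.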