HTML encoding issues - "Â" character showing up instead of "&nbsp;"

Views: 26 | Published: 2021/12/6 9:54:17 | Tags: html, vb.net, encoding, utf-8, iso-8859-1

Problem description

I've got a legacy app just starting to misbehave, for whatever reason I'm not sure. It generates a bunch of HTML that gets turned into PDF reports by ActivePDF.

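The process goes like this:

1. Pull an HTML template from a DB with tokens in it to be replaced (e.g. "~CompanyName~", "~CustomerName~", etc.)
2. Replace the tokens with real data
3. Tidy the HTML with a simple regex function that properly formats HTML tag attribute values (ensures quotation marks, etc., since ActivePDF's rendering engine hates anything but single quotes around attribute values)
4. Send the HTML off to a web service that creates the PDF.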
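
Step 2 can be sketched roughly as below. This is only an illustration of the token scheme described above, not the application's actual code: `FillTemplate`, the dictionary of values, and the sample template are all hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic

Module TokenReplaceSketch
    ' Hypothetical helper: swaps each ~Key~ token in the template for its value.
    Function FillTemplate(ByVal template As String,
                          ByVal values As IDictionary(Of String, String)) As String
        Dim result As String = template
        For Each pair As KeyValuePair(Of String, String) In values
            result = result.Replace("~" & pair.Key & "~", pair.Value)
        Next
        Return result
    End Function

    Sub Main()
        ' Sample data, invented for illustration only.
        Dim values As New Dictionary(Of String, String)
        values.Add("CompanyName", "Acme")
        values.Add("CustomerName", "Jane Doe")

        Dim template As String = "<p>~CompanyName~&nbsp;report for ~CustomerName~</p>"
        Console.WriteLine(FillTemplate(template, values))
    End Sub
End Module
```

Note that nothing in this step touches encoding: the strings stay as .NET strings throughout, so a mismatch is more likely to arise where the template is read from the DB or where the HTML is serialized for the web service.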

Somewhere in that mess, the non-breaking spaces from the HTML template (the &nbsp;s) are encoding as ISO-8859-1 so that they show up incorrectly as an "Â" character when viewing the document in a browser (Firefox). ActivePDF pukes on these non-UTF8 characters.
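
The symptom can be reproduced in isolation. The sketch below assumes the cause is the common one for this pattern: a no-break space (U+00A0) written out as UTF-8, which is the two bytes C2 A0, and then read back as ISO-8859-1, where byte C2 is "Â". Where that happens in this particular pipeline is not known; `MisdecodeUtf8AsLatin1` is a hypothetical name used only for the demonstration.

```vbnet
Imports System
Imports System.Text

Module NbspMojibakeDemo
    ' Encodes text as UTF-8, then decodes those bytes as ISO-8859-1,
    ' reproducing the suspected mismatch.
    Function MisdecodeUtf8AsLatin1(ByVal text As String) As String
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(text)
        Return latin1.GetString(utf8Bytes)
    End Function

    Sub Main()
        Dim nbsp As String = ChrW(&HA0)          ' U+00A0, the no-break space
        Dim original As String = "a" & nbsp & "b"
        Dim garbled As String = MisdecodeUtf8AsLatin1(original)

        ' The UTF-8 form of U+00A0 is two bytes.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(nbsp))) ' C2-A0
        ' One character became two: "Â" (U+00C2) followed by U+00A0.
        Console.WriteLine(garbled.Length)                                      ' 4
        Console.WriteLine(garbled.IndexOf(ChrW(&HC2)) >= 0)                    ' True
    End Sub
End Module
```

The stray "Â" is therefore followed by a real no-break space, which is why the page still appears to have spacing next to the bad character.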

My question: since I don't know where the problem stems from and don't have time to investigate it, is there an easy way to re-encode or find-and-replace the bad characters? I've tried sending it through this little function I threw together, but it ~~turns it all into gobbledegook~~ doesn't change anything.

Private Shared Function ConvertToUTF8(ByVal html As String) As String
    Dim isoEncoding As Encoding = Encoding.GetEncoding("iso-8859-1")
    Dim source As Byte() = isoEncoding.GetBytes(html)
    Return Encoding.UTF8.GetString(Encoding.Convert(isoEncoding, Encoding.UTF8, source))
End Function

Any ideas?

I'm getting by with this for now, though it hardly seems like a good solution:

Private Shared Function ReplaceNonASCIIChars(ByVal html As String) As String
    Return Regex.Replace(html, "[^\u0000-\u007F]", "&nbsp;")
End Function
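
The catch-all regex also rewrites any legitimate non-ASCII character (accented names, currency symbols) as a space. A narrower sketch follows, under the assumption that the damage is exactly the UTF-8 bytes C2 A0 having been read as ISO-8859-1. Both helper names are hypothetical. For comparison, ConvertToUTF8 above encodes the string to ISO-8859-1 bytes, converts those bytes to UTF-8, and decodes them as UTF-8 again, so it round-trips to the same string; that would be consistent with it not changing anything.

```vbnet
Imports System
Imports System.Text

Module NbspRepairSketch
    ' Hypothetical helper: replaces only the "Â" + U+00A0 pair produced when
    ' the UTF-8 bytes C2 A0 are read as ISO-8859-1; other characters are untouched.
    Function ReplaceGarbledNbsp(ByVal html As String) As String
        Dim garbledNbsp As String = ChrW(&HC2) & ChrW(&HA0)
        Return html.Replace(garbledNbsp, "&nbsp;")
    End Function

    ' Hypothetical helper: undoes the mis-decoding for the whole string.
    ' Assumption: every character came from reading UTF-8 bytes as ISO-8859-1.
    ' Text that was decoded correctly would be damaged by this.
    Function RedecodeLatin1AsUtf8(ByVal html As String) As String
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")
        Return Encoding.UTF8.GetString(latin1.GetBytes(html))
    End Function

    Sub Main()
        Dim garbled As String = "Total:" & ChrW(&HC2) & ChrW(&HA0) & "42"
        Console.WriteLine(ReplaceGarbledNbsp(garbled))           ' Total:&nbsp;42
        Console.WriteLine(RedecodeLatin1AsUtf8(garbled).Length)  ' 9
    End Sub
End Module
```

Either helper only masks the symptom. The mismatch itself would still need to be found wherever bytes are turned into text with the wrong encoding.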
