Stopping Pandoc from escaping single quotes when converting from HTML to Markdown
Question
If I convert a single quote ' from HTML to Markdown, it is automatically escaped:
% echo "'" | pandoc -f html -t markdown
\'
I'd like it to output the quote without the backslash, since the escaping makes text with contractions much harder to read.
I thought this might be due to the all_symbols_escapable extension, but it still happens even when I turn that off:
% echo "'" | pandoc -f html -t markdown-all_symbols_escapable
\'
It isn't a problem, however, with markdown_strict:
% echo "'" | pandoc -f html -t markdown_strict
'
Any suggestions? I'd like to use the default Pandoc Markdown with the options tweaked, or report this as a bug if it's not what others expect.
Answer
The escaping is related to pandoc's smart extension. When reading Markdown, this extension converts straight single quotes to the typographically correct opening/closing single quote or apostrophe where appropriate. This becomes clearest when looking at HTML output that uses only ASCII characters:
% echo "'hello'" | pandoc -f markdown -t html --ascii
<p>&lsquo;hello&rsquo;</p>
% echo "let's" | pandoc -f markdown -t html --ascii
<p>let&rsquo;s</p>
This smart treatment of quotes can be disabled on a per-case basis by escaping the character:
% echo "let\'s" | pandoc -f markdown -t html --ascii
<p>let's</p>
or by disabling the smart extension for the Markdown input:
% echo "let's" | pandoc -f markdown-smart -t html --ascii
<p>let's</p>
So whenever pandoc sees a ' character in HTML, it assumes that the character was chosen intentionally over the typographically correct single quote, and it escapes the character to ensure that it won't be treated in a "smart" way when the Markdown is read back.
The solution is thus to tell pandoc to ignore these details and write Markdown as if it would not be subjected to the smart treatment of quotes:
% echo "'" | pandoc -f html -t markdown-smart
'
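For repeated use, the same invocation can be wrapped in a small shell function. This is only a minimal sketch: the name html_to_md is made up for illustration and is not part of pandoc, and it assumes a pandoc version that accepts the markdown-smart format string (2.0 or later).

```shell
# html_to_md is a hypothetical helper name, not part of pandoc.
# It reads HTML on stdin and writes Markdown on stdout with the
# smart extension disabled, so straight quotes are left unescaped.
html_to_md() {
  pandoc -f html -t markdown-smart
}

# Example: convert a paragraph that contains a contraction.
printf '%s\n' "<p>let's go</p>" | html_to_md
```

Any other extensions you want to toggle can be appended to the same format string, for example markdown-smart-all_symbols_escapable.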
The smart extension is already disabled when using markdown_strict, which is why you got the desired behavior in that case.
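To check which formats enable smart by default, one option is to ask pandoc for its extension list. The sketch below assumes a pandoc version that supports --list-extensions with a format argument (2.0 or later); smart_state is a made-up helper name. In that listing a leading + marks an extension that is on by default and a leading - marks one that is off, so smart should appear with + for markdown and with - for markdown_strict.

```shell
# smart_state is a hypothetical helper name, not part of pandoc.
# It prints the default state of the smart extension for the format
# given as the first argument, as reported by pandoc itself.
smart_state() {
  pandoc --list-extensions="$1" | grep -E '^[+-]smart$'
}

smart_state markdown
smart_state markdown_strict
```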