Convert HTML with MathJax to Markdown with pandoc
Question
I have some HTML files that include MathJax commands. I would like to convert them to PHP Markdown Extra using pandoc.
The problem is that pandoc adds "\" before all math commands. For example: \begin{equation}, \$, x\^2, etc.
Do you know how to avoid that with pandoc? I think a related question is this one: How to convert HTML with mathjax into latex using pandoc?
Recommended answer
You can write a short Haskell program, unescape.hs:
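{-# LANGUAGE OverloadedStrings #-}
-- Disable backslash escaping of special characters when writing strings to markdown.
-- In recent pandoc-types releases the filter helper is toJSONFilter in Text.Pandoc.JSON.
import Text.Pandoc.JSON

main :: IO ()
main = toJSONFilter unescape
  where
    unescape (Str xs) = RawInline (Format "markdown") xs
    unescape x = x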
Now compile it with ghc --make unescape.hs and use it with:
pandoc -f html -t json | ./unescape | pandoc -f json -t markdown
This disables escaping of special characters (such as $) in the markdown output.
A simpler approach might be to pipe pandoc's normal markdown output through sed:
pandoc -f html -t markdown | sed -e 's/\\\([$^_*]\)/\1/g'
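To see what the sed expression does without running pandoc, here is a minimal sketch that applies the same substitution to a hand-written sample line. The function name unescape_math and the sample input are hypothetical and serve only as illustration; the input is assumed to resemble pandoc's escaped output.

```shell
#!/bin/sh
# Hypothetical helper: wraps the sed expression from the answer.
# It removes a backslash only when it directly precedes $, ^, _ or *.
unescape_math() {
  sed -e 's/\\\([$^_*]\)/\1/g'
}

# Assumed sample of escaped markdown, not real pandoc output.
printf '%s\n' '\$x\^2 + y\_1\$' | unescape_math
```

For this sample the result is $x^2 + y_1$. Note that the pattern touches only the four characters in the bracket expression, so any other escaped character (a doubled backslash, for instance) would pass through unchanged and would need to be added to the pattern.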