How to use Python filter with Pandoc to convert md with tikz to html on Windows 8.1

Question

I am trying to use a Pandoc filter to convert a markdown file with a tikz picture to html. I am on Win 8.1 (and I have all the dependencies -- pdflatex, Python 2.7, ImageMagick, and the pandocfilters Python package). I am using the tikz.py script that John MacFarlane provides on github.

I found a similar question on the Pandoc Google Group and John MacFarlane suggests wrapping the filter in a Windows batch script (the filter must be an executable). Here is my command line input (I'll provide the file contents below).

pandoc -o temp.html --filter .\tikz.bat -s temp.md

But I keep getting the following error.

pandoc: Failed reading: satisfyElem

The script generates the "tikz-images" subfolder, but it is empty, as is the resulting output file temp.html.

How can I get this to work? FWIW, the bigger goal is for the input files to be R Markdown, but I want to understand the Pandoc Markdown to HTML process first.

Here are the file contents.

tikz.bat

python tikz.py %*

temp.md

\begin{tikzpicture}

\draw [<->](-3,0)--(3,0);
\draw (-2,-.2)--(-2,.2);
\draw (-1,-.2)--(-1,.2);
\draw(0,-.2)--(0,.2);
\draw (1,-.2)--(1,.2);
\draw (2,-.2)--(2,.2);
\node[align=left,below] at (-4.5,-0.2) {Cash flow};
\node[align=left,above] at (-4.5,0.2) {Time period};
\node[align=left,above] at (-2,0.2) {-2};
\node[align=left,above] at (-1,0.2) {-1};
\node[align=left,above] at (0,0.2) {0};
\node[align=left,above] at (1,0.2) {+1};
\node[align=left,above] at (2,0.2) {+2};
\node[align=left,below] at (1,-0.2) {\$100};
\node[align=left,below] at (2,-0.2) {\$100};

\end{tikzpicture}

Can this work?

tikz.py

#!/usr/bin/env python

"""
Pandoc filter to process raw latex tikz environments into images.
Assumes that pdflatex is in the path, and that the standalone
package is available.  Also assumes that ImageMagick's convert
is in the path. Images are put in the tikz-images directory.
"""

import hashlib
import re
import os
import sys
import shutil
from pandocfilters import toJSONFilter, Para, Image
from subprocess import Popen, PIPE, call
from tempfile import mkdtemp

imagedir = "tikz-images"


def sha1(x):
    return hashlib.sha1(x.encode(sys.getfilesystemencoding())).hexdigest()


def tikz2image(tikz, filetype, outfile):
    tmpdir = mkdtemp()
    olddir = os.getcwd()
    os.chdir(tmpdir)
    f = open('tikz.tex', 'w')
    f.write("""\\documentclass{standalone}
             \\usepackage{tikz}
             \\begin{document}
             """)
    f.write(tikz)
    f.write("\n\\end{document}\n")
    f.close()
    p = call(["pdflatex", 'tikz.tex'], stdout=sys.stderr)
    os.chdir(olddir)
    if filetype == 'pdf':
        shutil.copyfile(tmpdir + '/tikz.pdf', outfile + '.pdf')
    else:
        call(["convert", tmpdir + '/tikz.pdf', outfile + '.' + filetype])
    shutil.rmtree(tmpdir)


def tikz(key, value, format, meta):
    if key == 'RawBlock':
        [fmt, code] = value
        if fmt == "latex" and re.match("\\\\begin{tikzpicture}", code):
            outfile = imagedir + '/' + sha1(code)
            if format == "html":
                filetype = "png"
            elif format == "latex":
                filetype = "pdf"
            else:
                filetype = "png"
            src = outfile + '.' + filetype
            if not os.path.isfile(src):
                try:
                    os.mkdir(imagedir)
                    sys.stderr.write('Created directory ' + imagedir + '\n')
                except OSError:
                    pass
                tikz2image(code, filetype, outfile)
                sys.stderr.write('Created image ' + src + '\n')
            return Para([Image([], [src, ""])])

if __name__ == "__main__":
    toJSONFilter(tikz)


Update: I mentioned in the comments that the caps.py filter also fails with the same symptoms. I should also add the symptoms from python caps.py temp.md, which invokes the filter outside of pandoc. My understanding is that this should print the caps.py file to the screen in all caps.

However, when I run python caps.py temp.md from the Windows command prompt it hangs. I kill the command with CTRL-C, then I get the following.

C:\Users\Richard\Desktop\temp>python caps.py temp.md
Traceback (most recent call last):
  File "caps.py", line 15, in <module>
    toJSONFilter(caps)

The same occurs with python tikz.py temp.md. A hang, followed by:

C:\Users\Richard\Desktop\temp>python tikz.py temp.md
Traceback (most recent call last):
  File "tikz.py", line 70, in <module>
    toJSONFilter(tikz)


Update 2: I tried to run the Windows debugger on the command prompt, but I'm not sure that it worked. Sometimes the command prompt would hang, and it seems the debugger hangs, too. Here is the output from the debugger.

*** wait with pending attach
Symbol search path is: *** Invalid ***
****************************************************************************
* Symbol loading may be unreliable without a symbol search path.           *
* Use .symfix to have the debugger choose a symbol path.                   *
* After setting your symbol path, use .reload to refresh symbol locations. *
****************************************************************************
Executable search path is: 
ModLoad: 00007ff7`0d920000 00007ff7`0d97d000   C:\windows\system32\cmd.exe
ModLoad: 00007fff`b7c20000 00007fff`b7dcc000   C:\windows\SYSTEM32\ntdll.dll
ModLoad: 00007fff`b5c90000 00007fff`b5dce000   C:\windows\system32\KERNEL32.DLL
ModLoad: 00007fff`b4e40000 00007fff`b4f55000   C:\windows\system32\KERNELBASE.dll
ModLoad: 00007fff`b7b70000 00007fff`b7c1a000   C:\windows\system32\msvcrt.dll
ModLoad: 00007fff`b3070000 00007fff`b307e000   C:\windows\SYSTEM32\winbrand.dll
(1c7c.29a0): Break instruction exception - code 80000003 (first chance)
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\windows\SYSTEM32\ntdll.dll - 
ntdll!DbgBreakPoint:
00007fff`b7cb2cf0 cc              int     3


Update 3: Here are the files in a Dropbox folder. This folder has the same files that I pasted above, plus the caps.py file, which is taken directly from the Pandoc filters GitHub repo.

Recommended answer

The -t option takes a format name, not a file name with an extension. For example, pandoc -f json -t markdown outputs Markdown, -t html outputs HTML, and so on. The output goes to the console; to capture it in a file, use the redirection operator: > file.some_extension. So the correct syntax is literally pandoc -f json -t markdown.

This is how it works.

                 source format = input_file.html
                      ↓
                   (pandoc) = pandoc -t json input_file.html
                      ↓
              JSON-formatted AST 
                      ↓
                   (filter)    = python $HOME/Downloads/pandocfilters-1.2.4/examples/caps.py
                      ↓
              JSON-formatted AST
                      ↓
                   (pandoc)    =  pandoc -f json -t markdown
                      ↓
                target format = output_file.md
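
The filter stage in this diagram reads the JSON-formatted AST on standard input and writes the modified AST to standard output, which is why a filter appears to hang when it is started without anything piped into it. As a rough illustration only, here is a sketch of that middle stage written without the pandocfilters package. The helper names walk, caps, and run_filter are made up for this example, and the sample AST is a hand-written fragment in the same shape as the JSON shown in this answer, not real pandoc output.

```python
import json


def walk(node, action):
    """Visit every element of the AST depth first, replacing nodes where action says so."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            # Elements look like {"t": <type>, "c": <contents>}
            replacement = action(node["t"], node.get("c"))
            if replacement is not None:
                return replacement
        return {key: walk(value, action) for key, value in node.items()}
    return node


def caps(key, value):
    """Upper-case the text of Str elements; leave everything else alone."""
    if key == "Str":
        return {"t": "Str", "c": value.upper()}
    return None


def run_filter(json_text):
    """JSON text in, filtered JSON text out (the filter box in the diagram)."""
    return json.dumps(walk(json.loads(json_text), caps))


# Hand-written sample in the [metadata, blocks] shape shown in this answer
sample = ('[{"unMeta":{}},[{"t":"Para","c":[{"t":"Str","c":"Hello"},'
          '{"t":"Space","c":[]},{"t":"Str","c":"world!"}]}]]')
result = json.loads(run_filter(sample))
print(result)
```

To use such a script as an actual pipeline stage, the last lines would instead read sys.stdin and write run_filter's result to sys.stdout, which is roughly what toJSONFilter does for you.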

Separate the commands to examine output and use a pipe | to redirect output:

 pandoc -t json ~/testing/testing.html | python examples/caps.py | pandoc -f json -t markdown > output_file.md
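
If the shell pipe itself is in doubt, the same chain can be driven from Python. This is only a sketch: run_pipeline is a hypothetical helper, it assumes Python 3.5 or later (for subprocess.run), and the pandoc_stages list assumes that pandoc and python are on the PATH and that the file paths exist. The demo at the end uses two throwaway Python one-liners as stages so that it runs without pandoc.

```python
import subprocess
import sys


def run_pipeline(stages, input_bytes=b""):
    """Feed input_bytes to the first command and each command's stdout to the next one."""
    data = input_bytes
    for command in stages:
        completed = subprocess.run(
            command, input=data, stdout=subprocess.PIPE, check=True
        )
        data = completed.stdout
    return data


# Hypothetical stage list mirroring the shell pipeline above (not run here);
# assumes pandoc and python are on PATH and that these paths exist
pandoc_stages = [
    ["pandoc", "-t", "json", "testing.html"],
    ["python", "examples/caps.py"],
    ["pandoc", "-f", "json", "-t", "markdown"],
]

# Self-contained demo that needs only the Python interpreter:
# the first stage upper-cases its input, the second reverses it
demo_stages = [
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"],
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read()[::-1])"],
]
demo_output = run_pipeline(demo_stages, b"hello")
print(demo_output)
```

Because every stage's output is held in a variable, it is easy to print or save the intermediate JSON and see which stage fails.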

There is no need to install pandocfilters: download the tar file, extract it with tar -xvf file.x.y.z (or any other application of your choice), and call the examples as python dir/to/script.py. Then pipe the output to pandoc again and redirect it to a file in the desired format. Here it is step by step:

 $pandoc -t json ~/testing/testing.html
[{"unMeta":{"viewport":{"t":"MetaInlines","c":[{"t":"Str","c":"width=device-width,"},{"t":"Space","c":[]},{"t":"Str","c":"initial-scale=1"}]},"title":{"t":"MetaInlines","c":[]},"description":{"t":"MetaInlines","c":[]}}},[{"t":"Para","c":[{"t":"Str","c":"Hello"},{"t":"Space","c":[]},{"t":"Str","c":"world!"},{"t":"Space","c":[]},{"t":"Str","c":"This"},{"t":"Space","c":[]},{"t":"Str","c":"is"},{"t":"Space","c":[]},{"t":"Str","c":"HTML5"},{"t":"Space","c":[]},{"t":"Str","c":"Boilerplate."}]},{"t":"Para","c":[{"t":"Str","c":"l"}]}]]

Then:

$pandoc -t json ~/testing/testing.html | python examples/caps.py 
[{"unMeta": {"description": {"c": [], "t": "MetaInlines"}, "viewport": {"c": [{"c": "WIDTH=DEVICE-WIDTH,", "t": "Str"}, {"c": [], "t": "Space"}, {"c": "INITIAL-SCALE=1", "t": "Str"}], "t": "MetaInlines"}, "title": {"c": [], "t": "MetaInlines"}}}, [{"c": [{"c": "HELLO", "t": "Str"}, {"c": [], "t": "Space"}, {"c": "WORLD!", "t": "Str"}, {"c": [], "t": "Space"}, {"c": "THIS", "t": "Str"}, {"c": [], "t": "Space"}, {"c": "IS", "t": "Str"}, {"c": [], "t": "Space"}, {"c": "HTML5", "t": "Str"}, {"c": [], "t": "Space"}, {"c": "BOILERPLATE.", "t": "Str"}], "t": "Para"}, {"c": [{"c": "L", "t": "Str"}], "t": "Para"}]]

Finally:

pandoc -t json ~/testing/testing.html | python examples/caps.py | pandoc -f json -t markdown
HELLO WORLD! THIS IS HTML5 BOILERPLATE.

Note:

diff -y pandoc_json.txt caps_json.txt
[{"unMeta":{"viewport":{"t":"MetaInlines","c":[{"t":"Str","c" / [{"unMeta": {"description": {"c": [], "t": "MetaInlines"}, "v
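
The diff output above differs from the very first characters because the filter re-serializes the JSON with different key order and spacing, so a textual diff does not isolate what the filter actually changed. One possible way to compare the two stages by content is to parse both sides first. This is a minimal sketch; same_json is a hypothetical helper and the three strings are hand-written stand-ins for single AST elements.

```python
import json


def same_json(text_a, text_b):
    """Compare two JSON documents by parsed content, ignoring key order and whitespace."""
    return json.loads(text_a) == json.loads(text_b)


before = '{"t":"Str","c":"Hello"}'        # compact form, as pandoc prints it above
reordered = '{"c": "Hello", "t": "Str"}'  # same content, different key order and spacing
filtered = '{"c": "HELLO", "t": "Str"}'   # content changed by the caps filter

print(same_json(before, reordered))
print(same_json(before, filtered))
```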
