Use compact lists when converting from docx to markdown


Question

I'm using pandoc on Windows to convert a .docx file to a .md file.

These are the flags I'm using:

pandoc --wrap none --to markdown_github --output fms.md "FMS.docx"

In the output Markdown file, a blank line separates each list item. The documentation calls this a loose list, like the one below.

- one

- two

- three

I want the output to use a compact list instead, like the one below.

- one
- two
- three

Is there a flag that makes pandoc output a compact list?

If not, how can I use a filter to achieve the desired output?

Recommended answer

There is no flag to achieve this, but there is a simple solution using pandoc's filter functionality. Internally, each list item is represented as a list of blocks, and a list is compact if all of its items consist only of Plain blocks. If every item consists of a single paragraph, it is enough to change the type of that block from Para (for paragraph) to Plain.

The Lua program below does just that. Save it and use it as a Lua filter: pandoc -t markdown --lua-filter the-filter.lua your-document.docx (requires pandoc 2.1 or later):

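local List = require 'pandoc.List'

-- Turn an item that holds exactly one Para into one that holds a Plain block.
-- Items with several blocks, or with other block types, are returned unchanged.
function compactifyItem (blocks)
  return (#blocks == 1 and blocks[1].t == 'Para')
    and {pandoc.Plain(blocks[1].content)}
    or blocks
end

-- Apply compactifyItem to every item of a bullet or ordered list.
function compactifyList (l)
  l.content = List.map(l.content, compactifyItem)
  return l
end

-- Register the same handler for both list types.
return {{
    BulletList = compactifyList,
    OrderedList = compactifyList
}}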

If you prefer Haskell over Lua, you can use the filter below instead, with pandoc -t markdown --filter the-filter.hs your-document.docx:

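import Text.Pandoc.JSON

-- toJSONFilter applies compactifyList to every Block in the document.
main :: IO ()
main = toJSONFilter compactifyList

-- Rewrite the items of bullet and ordered lists; leave other blocks alone.
compactifyList :: Block -> Block
compactifyList blk = case blk of
  (BulletList items)         -> BulletList $ map compactifyItem items
  (OrderedList attrbs items) -> OrderedList attrbs $ map compactifyItem items
  _                          -> blk

-- An item that is a single Para becomes a single Plain; anything else is kept.
compactifyItem :: [Block] -> [Block]
compactifyItem [Para bs] = [Plain bs]
compactifyItem item      = item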

If neither Lua nor Haskell is an option, the same can be done with a Python filter. See pandoc's filters page for details.
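For the Python route, here is a minimal sketch that is not part of the original answer. It assumes pandoc's JSON AST layout, where each element is an object with a "t" tag and a "c" payload, and it uses only the standard library rather than a helper package. The file name the-filter.py and the function names are hypothetical. It mirrors the Lua and Haskell filters above: an item holding a single Para is rewritten to hold a Plain.

```python
#!/usr/bin/env python3
"""Sketch of a pandoc JSON filter that makes single-paragraph list items compact."""
import json
import sys


def compactify_item(blocks):
    """Replace a lone Para block with a Plain block; keep other items as they are."""
    if len(blocks) == 1 and blocks[0].get("t") == "Para":
        return [{"t": "Plain", "c": blocks[0]["c"]}]
    return blocks


def walk(node):
    """Recursively rewrite every BulletList and OrderedList in the JSON AST."""
    if isinstance(node, list):
        return [walk(child) for child in node]
    if isinstance(node, dict):
        # Process children first so nested lists are handled as well.
        node = {key: walk(value) for key, value in node.items()}
        if node.get("t") == "BulletList":
            # Payload is a list of items, each item a list of blocks.
            node["c"] = [compactify_item(item) for item in node["c"]]
        elif node.get("t") == "OrderedList":
            # Payload is [list attributes, items].
            attrs, items = node["c"]
            node["c"] = [attrs, [compactify_item(item) for item in items]]
        return node
    return node


def main():
    """Read the document as JSON on stdin and write the rewritten JSON to stdout."""
    json.dump(walk(json.load(sys.stdin)), sys.stdout)


# pandoc invokes JSON filters with the output format as the first argument.
# Guarding on that keeps the module importable without touching stdin; drop
# the argument check if you pipe JSON through the script by hand.
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

Assuming a Python interpreter is on the PATH, it should be usable in the same way as the Haskell version: pandoc -t markdown --filter the-filter.py your-document.docx. Items with more than one block are left untouched, so lists containing them stay loose.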
