Headless LibreOffice very slow to export to PDF on Windows (6 times slower than on Linux)

Question

I often need to export many (> 1000) .docx documents to PDF with LibreOffice. Here is a sample document: test.docx. The following code works but it's quite slow on Windows (3.3 seconds on average for each PDF document):

import subprocess, docx, time   # first do: pip install python-docx 
for i in range(10):
    doc = docx.Document('test.docx')
    for paragraph in doc.paragraphs:
        paragraph.text = paragraph.text.replace('{{num}}', str(i))
    doc.save('test%i.docx' % i)   # these 4 previous lines are super fast - a few ms
    t0 = time.time()
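    # pass the command as a list so the path containing a space needs no manual quoting
    subprocess.call([r'C:\Program Files\LibreOffice\program\soffice.exe', '--headless', '--convert-to', 'pdf', 'test%i.docx' % i, '--outdir', '.', '--nocrashreport', '--nodefault', '--nofirststartwizard', '--nolockcheck', '--nologo', '--norestore'])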
    print('PDF generated in %.1f sec' % (time.time()-t0))

    # for linux:
    # (0.54 seconds on average, so it's 6 times better than on Windows!)
    # subprocess.call(['/usr/bin/soffice', '--headless', '--convert-to', 'pdf', '--outdir', '/home/user', 'test%i.docx' % i])  

How can I speed up the PDF export on Windows?

I suspect that a lot of time is wasted in repeating "start LibreOffice/Writer, (do the job), close LibreOffice" once for every document.
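One rough way to check this suspicion is to time a call that converts one document and a call that converts two, then split the difference. The sketch below assumes a simple linear model (time = startup + n × per-document cost); the helper names `average_seconds` and `split_startup_cost` are made up for illustration and are not part of LibreOffice or python-docx.

```python
import time

def average_seconds(action, repeats=3):
    """Run action() `repeats` times and return the mean wall-clock time in seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        action()
    return (time.perf_counter() - start) / repeats

def split_startup_cost(t_one_doc, t_two_docs):
    """Estimate (startup, per_document) seconds from two timings.

    Assumption: time = startup + n * per_document, which is only a rough model.
    """
    per_document = t_two_docs - t_one_doc
    startup = t_one_doc - per_document
    return startup, per_document

# Hypothetical usage on Windows (path and file names as in the question):
# import subprocess
# soffice = r'C:\Program Files\LibreOffice\program\soffice.exe'
# base = [soffice, '--headless', '--convert-to', 'pdf', '--outdir', '.']
# t1 = average_seconds(lambda: subprocess.call(base + ['test0.docx']))
# t2 = average_seconds(lambda: subprocess.call(base + ['test0.docx', 'test1.docx']))
# print(split_startup_cost(t1, t2))
```

If the two timings come out nearly equal, almost all of the time is startup and shutdown rather than conversion.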

Notes:

  • As a comparison, the export time reported in https://bugs.documentfoundation.org/show_bug.cgi?id=92274 is either 90 ms or 810 ms.

  • soffice.exe replaced by swriter.exe: same problem, 3.3 seconds on average:

subprocess.call([r'C:\Program Files\LibreOffice\program\swriter.exe', '--headless', '--convert-to', 'pdf', 'test%i.docx' % i, '--outdir', '.'])

Recommended answer

Indeed, all the time is wasted in starting/quitting LibreOffice. We can instead pass many docx documents in one call of soffice.exe:

import subprocess, docx
for i in range(1000):
    doc = docx.Document('test.docx')
    for paragraph in doc.paragraphs:
        paragraph.text = paragraph.text.replace('{{num}}', str(i))
    doc.save('test%i.docx' % i)

# all PDFs in one pass:
subprocess.call([r'C:\Program Files\LibreOffice\program\swriter.exe',  # raw string: backslashes are not escape sequences
    '--headless', '--convert-to', 'pdf', '--outdir', '.'] + ['test%i.docx' % i for i in range(1000)])

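107 seconds in total for 1000 documents, i.e. about 107 ms per PDF on average, roughly 30 times faster than the 3.3 seconds per document measured with one process per file.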
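With many more files or longer file names, a single call could run into the Windows limit on command-line length (32,767 characters for CreateProcess). A minimal sketch of one way around that is to convert in batches, so LibreOffice still starts only once per batch. The helper names and the batch size of 200 are assumptions chosen for illustration, not values measured in this thread.

```python
import subprocess

# Path taken from the question; adjust to the local installation.
SOFFICE = r'C:\Program Files\LibreOffice\program\soffice.exe'

def chunks(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def build_command(docx_files, outdir='.'):
    """Build one soffice command line that converts every file in docx_files."""
    return [SOFFICE, '--headless', '--convert-to', 'pdf', '--outdir', outdir] + list(docx_files)

def convert_in_batches(docx_files, batch_size=200, outdir='.'):
    """Convert all files, starting LibreOffice once per batch (batch_size is arbitrary)."""
    for batch in chunks(docx_files, batch_size):
        subprocess.call(build_command(batch, outdir))

# Hypothetical usage, with the file names generated in the answer:
# convert_in_batches(['test%i.docx' % i for i in range(1000)])
```

Larger batches pay the startup cost less often; smaller batches keep each command line short and limit how much work is lost if one call fails.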
