LibreOffice convert .docx to .pdf in parallel not working well

Problem description

I have a lot of docx files to convert to pdf. Converting them one by one takes a long time, so I wrote a Python script to convert them in parallel:

from subprocess import Popen
import time
import os

os.chdir(os.path.dirname(__file__))

output_dir = './outputs'
source_file_format = './docs/example_{}.docx'

po_list = [Popen(
    f"/Applications/LibreOffice.app/Contents/MacOS/soffice --invisible --convert-to pdf --outdir {output_dir} {source_file_format.format(i)}",
    shell=True)
    for i in range(0, 7, 1)]

while po_list:
    time.sleep(0.01)
    # Iterate over a copy so finished processes can be removed safely
    for p in list(po_list):
        status = p.poll()
        if status is None:
            continue  # still running
        elif status == 0:
            print('Succeed: [{}] {} -> {}'.format(p.returncode, p.stderr, p.args))
        else:
            print('Failed: {} : {}'.format(p.args, status))
        po_list.remove(p)

But each time I run this script, only some of the docx files are converted successfully. The remaining conversion processes do not even report an error.

Solution

We were also stuck on the same issue for some time.

Multiple instances of LibreOffice share the same user profile, the UserInstallation directory, so converting in parallel caused the problem here (the concurrently running processes seem to get mixed up).

Using a different directory for each LibreOffice instance solved the issue. You can do this with the UserInstallation variable, which is passed to soffice as a command-line argument: "-env:UserInstallation=file:///d:/tmp/p0/"

You can automate this by appending your loop variable, or any other unique identifier, to the directory path.
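As a minimal sketch of that idea, the script from the question could give each process its own profile directory. The helper names build_command and convert_in_parallel are made up for illustration, and the temporary profile location is an assumption; the soffice path and the other flags are taken from the question.

```python
import subprocess
import tempfile
from pathlib import Path

# Path from the question; adjust for your platform.
SOFFICE = "/Applications/LibreOffice.app/Contents/MacOS/soffice"


def build_command(source, output_dir, profile_dir, soffice=SOFFICE):
    """Hypothetical helper: argument list for one conversion with its own profile."""
    # UserInstallation expects a file:// URI, so make the path absolute first.
    profile_uri = Path(profile_dir).resolve().as_uri()
    return [
        soffice,
        f"-env:UserInstallation={profile_uri}",
        "--invisible",
        "--convert-to", "pdf",
        "--outdir", str(output_dir),
        str(source),
    ]


def convert_in_parallel(sources, output_dir):
    """Hypothetical helper: one soffice process per file, each with a private profile."""
    with tempfile.TemporaryDirectory() as tmp:
        procs = [
            # The loop index makes every profile directory unique.
            subprocess.Popen(build_command(src, output_dir, Path(tmp) / f"profile_{i}"))
            for i, src in enumerate(sources)
        ]
        # wait() blocks until the process exits and returns its exit code.
        return [p.wait() for p in procs]


if __name__ == "__main__":
    files = [f"./docs/example_{i}.docx" for i in range(7)]
    for src, code in zip(files, convert_in_parallel(files, "./outputs")):
        print("Succeeded" if code == 0 else f"Failed ({code})", src)
```

Passing the arguments as a list avoids shell=True, so file names containing spaces need no quoting. The temporary profiles are deleted once all processes have finished.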

Reference: https://ask.libreoffice.org/en/question/42975/how-can-i-run-multiple-instances-of-sofficebin-at-a-time/
