Issue when calling LibreOffice from Python to generate a PDF from a docx with charts


Problem description

Using Debian 9.5, Python 3.5, LibreOffice 5.2, x86_64 arch.

I have a Word file (docx) of 22 pages that contains several charts.

When run from a terminal using bash, the following command works correctly, i.e. it generates a PDF file of 22 pages:

/usr/bin/libreoffice --headless --convert-to pdf --outdir /tmp/docx5/ /tmp/docx5/output.docx

Output:

convert /tmp/docx5/output.docx -> /tmp/docx5//output.pdf using filter : writer_pdf_Export

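The problem is the following: the same external command, executed from Python with subprocess.run, generates a PDF file of only one page instead of 22, with no error message.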

No other instances of LibreOffice are running.

import subprocess

cmd = '/usr/bin/libreoffice --headless --convert-to pdf --outdir /tmp/docx5/ /tmp/docx5/output.docx'

print(subprocess.run(cmd, shell=True, check=True))

This is the output of this Python script:

convert /tmp/docx5/output.docx -> /tmp/docx5//output.pdf using filter : writer_pdf_Export

CompletedProcess(args='/usr/bin/libreoffice --headless --convert-to pdf --outdir /tmp/docx5/ /tmp/docx5/output.docx', returncode=0)

Apparently, PDF generation was successful, but only the first page of the docx file was converted.

It seems that PDF generation terminates when LibreOffice, started from Python, encounters the first chart.

Does LibreOffice require the Java runtime to generate a PDF?

Could there be an issue with headless operation of LibreOffice?

Any hints?

Update:

I added the '-env:UserInstallation' option and ran the modified script from Python:

cmd = '/usr/bin/libreoffice -env:UserInstallation=file:///home/marco/  --headless --convert-to pdf --outdir /tmp/docx5/ /tmp/docx5/output.docx'

print(subprocess.run(cmd, shell=True, check=True))

The output is the following; it now contains a warning about not finding a Java runtime environment:

javaldx: Could not find a Java Runtime Environment!

Warning: failed to read path from javaldx

convert /tmp/docx5/output.docx -> /tmp/docx5//output.pdf using filter : writer_pdf_Export

CompletedProcess(args='/usr/bin/libreoffice -env:UserInstallation=file:///home/marco/ --headless --convert-to pdf --outdir /tmp/docx5/ /tmp/docx5/output.docx', returncode=0)

Any idea how to specify, through command-line parameters, where LibreOffice can find the Java runtime environment it needs?

Recommended answer

I found a solution, although the technical reason is not clear to me:

This WORKS (complete PDF generation by LibreOffice from a docx file with charts):

PATH=/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/usr/lib/jvm/java-10-oracle/bin:/usr/lib/jvm/java-10-oracle/db/bin /usr/bin/libreoffice -env:UserInstallation=file:///tmp/docx5/ --headless --convert-to pdf --outdir /tmp/docx5/ /tmp/docx5/output.docx

This DOES NOT WORK (partial PDF generation by LibreOffice from a docx file with charts):

PATH=/home/marco/venv/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/usr/lib/jvm/java-10-oracle/bin:/usr/lib/jvm/java-10-oracle/db/bin /usr/bin/libreoffice -env:UserInstallation=file:///tmp/docx5/ --headless --convert-to pdf --outdir /tmp/docx5/ /tmp/docx5/output.docx

It seems that the Python virtualenv causes some sort of conflict with LibreOffice. I used strace but found nothing useful.
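To check that the two invocations above differ only in the virtualenv entry, the two PATH values can be compared directly. This is a minimal sketch; the helper name path_entries_only_in is hypothetical, and the PATH strings are copied from the two commands above.

```python
def path_entries_only_in(first, second):
    """Return the entries of the PATH string `first` that are absent from `second`."""
    second_entries = set(second.split(":"))
    return [entry for entry in first.split(":") if entry not in second_entries]


# PATH from the command that works.
working = (
    "/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:"
    "/usr/lib/jvm/java-10-oracle/bin:/usr/lib/jvm/java-10-oracle/db/bin"
)
# PATH from the command that does not work: same entries, venv first.
failing = "/home/marco/venv/bin:" + working

print(path_entries_only_in(failing, working))
```

The only entry it reports is /home/marco/venv/bin, which is consistent with the virtualenv directory being the trigger, although it does not explain the underlying cause.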

So the solution in my case is to remove the virtualenv path from the PATH environment variable when calling LibreOffice from Python. This can be achieved by deactivating the virtualenv:

marco@pc:~$ source venv/bin/activate
...
(venv) marco@pc:~$ deactivate && /usr/bin/libreoffice -env:UserInstallation=file:///tmp/docx5/ --headless --convert-to pdf --outdir /tmp/docx5/ /tmp/docx5/output.docx
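When the script itself has to keep running inside the virtualenv, the same effect can probably be obtained by passing a cleaned environment to subprocess.run. The following is a sketch under assumptions, not a verified fix: the helper names path_without_venv and convert_to_pdf are hypothetical, the virtualenv location is taken from the VIRTUAL_ENV variable that the activate script normally sets, and the LibreOffice options are the ones used above.

```python
import os
import subprocess


def path_without_venv(path, venv_dir):
    """Return the PATH string with every entry inside venv_dir removed."""
    venv_dir = os.path.normpath(venv_dir)
    kept = []
    for entry in path.split(os.pathsep):
        normalized = os.path.normpath(entry) if entry else entry
        if normalized == venv_dir or normalized.startswith(venv_dir + os.sep):
            continue  # drop e.g. /home/marco/venv/bin
        kept.append(entry)
    return os.pathsep.join(kept)


def convert_to_pdf(docx_path, outdir, venv_dir=None):
    """Run the LibreOffice conversion with the virtualenv stripped from PATH."""
    env = os.environ.copy()
    # Assumption: an active virtualenv exports VIRTUAL_ENV.
    venv_dir = venv_dir or env.get("VIRTUAL_ENV")
    if venv_dir:
        env["PATH"] = path_without_venv(env.get("PATH", ""), venv_dir)
        env.pop("VIRTUAL_ENV", None)
    cmd = [
        "/usr/bin/libreoffice",
        "-env:UserInstallation=file:///tmp/docx5/",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", outdir,
        docx_path,
    ]
    # A list of arguments avoids shell=True; env replaces the inherited environment.
    return subprocess.run(cmd, env=env, check=True)
```

It would be called as convert_to_pdf('/tmp/docx5/output.docx', '/tmp/docx5/'). Only PATH and VIRTUAL_ENV are changed; if the conflict comes from another variable set by the virtualenv, this sketch will not help.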
