Convert Microsoft Office docs to plain text on Linux using Python
Question
Any recommendations on a method to convert .doc, .ppt, and .xls files to plain text on Linux using Python? Really, any method of conversion would be useful. I have already looked at using Open Office, but I would like a solution that does not require installing Open Office.
Recommended answer
I'd go for the command-line solution, and then use the Python subprocess module to run the tools from Python.
Converters for MS Word (catdoc), Excel (xls2csv), and PowerPoint (catppt) can be found (in source form) here: http://vitus.wagner.pp.ru/software/catdoc/.
I can't really comment on the usefulness of catppt, but catdoc and xls2csv work great!
But be sure to search your distribution's repositories first. On Ubuntu, for example, catdoc is just one quick apt-get away.
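As a rough illustration of the subprocess approach, here is a minimal sketch that picks a converter by file extension and captures its output. The names `CONVERTERS`, `build_command`, and `office_to_text` are made up for this example, and it assumes the tools are already installed, are called with just the file name as their only argument, and write text that can be decoded as UTF-8 (the actual output encoding depends on your locale and catdoc configuration).

```python
import shutil
import subprocess
from pathlib import Path

# Map each file extension to the catdoc-suite tool named in the answer.
CONVERTERS = {
    ".doc": "catdoc",
    ".xls": "xls2csv",
    ".ppt": "catppt",
}


def build_command(path):
    """Return the argument list for converting `path`, or raise ValueError."""
    suffix = Path(path).suffix.lower()
    try:
        tool = CONVERTERS[suffix]
    except KeyError:
        raise ValueError(f"unsupported file type: {suffix!r}") from None
    return [tool, str(path)]


def office_to_text(path):
    """Run the matching converter and return its standard output as text."""
    command = build_command(path)
    # Fail early with a clear message if the tool is not on PATH.
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} is not installed or not on PATH")
    # check=True raises CalledProcessError if the converter exits non-zero.
    result = subprocess.run(command, capture_output=True, check=True)
    # Assumption: output is UTF-8; undecodable bytes are replaced, not fatal.
    return result.stdout.decode("utf-8", errors="replace")
```

Calling `office_to_text("example.doc")` (a hypothetical file name) would then return whatever catdoc prints for that document. Passing the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.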