Convert pdf to docx format in python
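Question
Please tell me how to convert a pdf to docx. I tried using pdfminer to convert the pdf to html in order to extract the text, but the result still does not look good enough.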
pdf2docx
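Recommended answer
Install the pdf2docx package.
Installation
Either install pdf2docx from PyPI, or clone or download the pdf2docx source and install it into your environment:
    pip install pdf2docx
    # or, from the downloaded package directory:
    python setup.py install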
Option 1
    from pdf2docx import Converter

    pdf_file = r'C:\Users\ABCD\Desktop\XYZ/Document1.pdf'   # source file
    docx_file = r'C:\Users\ABCD\Desktop\XYZ/sample.docx'    # destination file

    # convert pdf to docx
    cv = Converter(pdf_file)
    cv.convert(docx_file, start=0, end=None)
    cv.close()
Output:
    Parsing Page 53: 53/53...
    Creating Page 53: 53/53...
    --------------------------------------------------
    Terminated in 6.258919400000195s.
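If the conversion raises partway through, the `cv.close()` call in Option 1 is never reached. A minimal sketch of one way to guard against that is to wrap the same three calls in `try`/`finally`. The helper name `convert_pdf` and the `converter_cls` parameter are hypothetical and exist only so the function can be exercised with a stand-in class; by default it uses pdf2docx's `Converter` exactly as above.

```python
def convert_pdf(pdf_file, docx_file, start=0, end=None, converter_cls=None):
    """Convert pdf_file to docx_file and always close the converter.

    `convert_pdf` and `converter_cls` are illustrative names, not part of
    pdf2docx. `start` and `end` are passed through to Converter.convert().
    """
    if converter_cls is None:
        # Imported lazily so the default path needs pdf2docx only when used.
        from pdf2docx import Converter as converter_cls

    cv = converter_cls(pdf_file)
    try:
        cv.convert(docx_file, start=start, end=end)
    finally:
        # Runs whether or not convert() raised.
        cv.close()
```

Usage would look like `convert_pdf(pdf_file, docx_file)` with the same paths as in Option 1.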
Option 2
    from pdf2docx import parse

    pdf_file = r'C:\Users\ABCD\Desktop\XYZ/Document2.pdf'    # source file
    docx_file = r'C:\Users\ABCD\Desktop\XYZ/sample_2.docx'   # destination file

    # convert pdf to docx
    parse(pdf_file, docx_file, start=0, end=None)
Output:
    Parsing Page 53: 53/53...
    Creating Page 53: 53/53...
    --------------------------------------------------
    Terminated in 5.883666100000482s.
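When there are several PDFs to convert, the `parse` call from Option 2 can be applied in a loop. The following is only a sketch under stated assumptions: the helper name `convert_folder` and its `convert` parameter are hypothetical (the parameter lets a stand-in function replace pdf2docx for a dry run), and the output files are simply named after the input files.

```python
from pathlib import Path


def convert_folder(src_dir, dst_dir, convert=None):
    """Convert every *.pdf in src_dir to a same-named .docx in dst_dir.

    `convert_folder` and `convert` are illustrative names, not part of
    pdf2docx. Returns the list of .docx paths that were requested.
    """
    if convert is None:
        # Imported lazily; the default uses parse() as in Option 2.
        from pdf2docx import parse

        def convert(pdf_file, docx_file):
            parse(pdf_file, docx_file, start=0, end=None)

    out_dir = Path(dst_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for pdf_path in sorted(Path(src_dir).glob("*.pdf")):
        docx_path = out_dir / (pdf_path.stem + ".docx")
        convert(str(pdf_path), str(docx_path))
        written.append(docx_path)
    return written
```

Calling `convert_folder(src, dst)` with two directory paths would run the default pdf2docx conversion on each file in turn; a failure on one file stops the loop, so wrap the `convert` call in error handling if partial results are acceptable.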