编辑文档 Python-docx 标题中的内容 [英] Edit content in header of document Python-docx
问题描述
我正在尝试查找和替换文档标题中的文本框中的文本.但是搜索了一段时间后,似乎无法通过 python-docx 访问标题或浮动"文本框中的内容(我阅读了 issue 这里)
I am trying to find and replace text in a Textbox in Header of document. But after searching for awhile, it seems there is no way to access the content in the Header or "float" text boxes via python-docx (I read issue here)
所以,这意味着我们必须直接在文档的xml格式上查找和替换.你知道这样做吗?
So, it means we have to find and replace directly on the xml format of document. Do you know anyway to do that?
推荐答案
我找到了解决这个问题的方法.例如,我有一个 template.docx
文件,我想更改 标题中的文本框 中的文本,如上所述.下面的流程步骤,我解决了我的问题:
I found a way to solve this problem. For instance, I have a template.docx
file, and I want to change text in a Textbox in Header as described above. Flow steps below, I solved my problems:
- 将文件
template.docx
重命名为template.zip
- 将
template.zip
解压到template
文件夹 - 在
/template/word/
文件夹中的header
文件之一中查找并替换我想要更改的文本..xml - 将
/template
文件夹中的所有文件压缩回template.zip
- 将
template.zip
重命名回template.docx
- Rename file
template.docx
totemplate.zip
- Unzip
template.zip
totemplate
folder - Find and replace text I want to change in one of the
header<number>.xml
files in/template/word/
folder. - Zip all files in
/template
folder back totemplate.zip
- Rename
template.zip
back totemplate.docx
我用Python来操作这些
I used Python to manipulate these
import os
import shutil
import zipfile
WORKING_DIR = os.getcwd()
TEMP_DOCX = os.path.join(WORKING_DIR, "template.docx")
TEMP_ZIP = os.path.join(WORKING_DIR, "template.zip")
TEMP_FOLDER = os.path.join(WORKING_DIR, "template")
# remove old zip file or folder template
if os.path.exists(TEMP_ZIP):
os.remove(TEMP_ZIP)
if os.path.exists(TEMP_FOLDER):
shutil.rmtree(TEMP_FOLDER)
# reformat template.docx's extension
os.rename(TEMP_DOCX, TEMP_ZIP)
# unzip file zip to specific folder
with zipfile.ZipFile(TEMP_ZIP, 'r') as z:
z.extractall(TEMP_FOLDER)
# change header xml file
header_xml = os.path.join(TEMP_FOLDER, "word", "header1.xml")
xmlstring = open(header_xml, 'r', encoding='utf-8').read()
xmlstring = xmlstring.replace("#TXTB1", "Hello World!")
with open(header_xml, "wb") as f:
f.write(xmlstring.encode("UTF-8"))
# zip temp folder to zip file
os.remove(TEMP_ZIP)
shutil.make_archive(TEMP_ZIP.replace(".zip", ""), 'zip', TEMP_FOLDER)
# rename zip file to docx
os.rename(TEMP_ZIP, TEMP_DOCX)
shutil.rmtree(TEMP_FOLDER)
这篇关于编辑文档 Python-docx 标题中的内容的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!