编辑文档 Python-docx 标题中的内容 [英] Edit content in header of document Python-docx

查看:46
本文介绍了编辑文档 Python-docx 标题中的内容的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试查找和替换文档标题中的文本框中的文本.但是搜索了一段时间后,似乎无法通过 python-docx 访问标题或浮动"文本框中的内容(我阅读了 issue 这里)

I am trying to find and replace text in a Textbox in Header of document. But after searching for awhile, it seems there is no way to access the content in the Header or "float" text boxes via python-docx (I read issue here)

所以,这意味着我们必须直接在文档的xml格式上查找和替换.你知道这样做吗?

So, it means we have to find and replace directly on the xml format of document. Do you know anyway to do that?

推荐答案

我找到了解决这个问题的方法.例如,我有一个 template.docx 文件,我想更改 标题中的文本框 中的文本,如上所述.下面的流程步骤,我解决了我的问题:

I found a way to solve this problem. For instance, I have a template.docx file, and I want to change text in a Textbox in Header as described above. Flow steps below, I solved my problems:

  1. 将文件template.docx 重命名为template.zip
  2. template.zip解压到template文件夹
  3. /template/word/ 文件夹中的 header.xml 文件之一中查找并替换我想要更改的文本.
  4. /template 文件夹中的所有文件压缩回 template.zip
  5. template.zip 重命名回 template.docx
  1. Rename file template.docx to template.zip
  2. Unzip template.zip to template folder
  3. Find and replace text I want to change in one of the header<number>.xml files in /template/word/ folder.
  4. Zip all files in /template folder back to template.zip
  5. Rename template.zip back to template.docx

我用Python来操作这些

I used Python to manipulate these

import os
import shutil
import zipfile

WORKING_DIR = os.getcwd()
TEMP_DOCX = os.path.join(WORKING_DIR, "template.docx")
TEMP_ZIP = os.path.join(WORKING_DIR, "template.zip")
TEMP_FOLDER = os.path.join(WORKING_DIR, "template")

# remove old zip file or folder template
if os.path.exists(TEMP_ZIP):
    os.remove(TEMP_ZIP)
if os.path.exists(TEMP_FOLDER):
    shutil.rmtree(TEMP_FOLDER)

# reformat template.docx's extension
os.rename(TEMP_DOCX, TEMP_ZIP)

# unzip file zip to specific folder
with zipfile.ZipFile(TEMP_ZIP, 'r') as z:
    z.extractall(TEMP_FOLDER)

# change header xml file
header_xml = os.path.join(TEMP_FOLDER, "word", "header1.xml")
xmlstring = open(header_xml, 'r', encoding='utf-8').read()
xmlstring = xmlstring.replace("#TXTB1", "Hello World!")
with open(header_xml, "wb") as f:
    f.write(xmlstring.encode("UTF-8"))

# zip temp folder to zip file
os.remove(TEMP_ZIP)
shutil.make_archive(TEMP_ZIP.replace(".zip", ""), 'zip', TEMP_FOLDER)

# rename zip file to docx
os.rename(TEMP_ZIP, TEMP_DOCX)
shutil.rmtree(TEMP_FOLDER)

这篇关于编辑文档 Python-docx 标题中的内容的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆