在.docx文件中查找和替换文本-Python [英] Find and replace text in .docx file - Python

查看：158 发布时间：2021/5/2 20:06:08 python text replace docx zipfile

本文介绍了在.docx文件中查找和替换文本-Python的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我一直在努力寻找一种方法来查找和替换docx文件中的文本，但运气不佳.我尝试了docx模块，但无法正常工作.最终，我使用zipfile模块制定了下面描述的方法，并替换了docx存档中的document.xml文件.为此，您需要一个模板文档(docx)，其文本要替换为唯一的字符串，该字符串不能与文档中任何其他现有或将来的文本匹配(例如，与XXXMEETDATEXXX上的XXXCLIENTNAMEXXX的会议进行得非常顺利.).

I've been doing a lot of searching for a method to find and replace text in a docx file with little luck. I've tried the docx module and could not get that to work. Eventually I worked out the method described below using the zipfile module and replacing the document.xml file in the docx archive. For this to work you need a template document (docx) with the text you want to replace as unique strings that could not possibly match any other existing or future text in the document (eg. "The meeting with XXXCLIENTNAMEXXX on XXXMEETDATEXXX went very well.").

import zipfile

replaceText = {"XXXCLIENTNAMEXXX" : "Joe Bob", "XXXMEETDATEXXX" : "May 31, 2013"}
templateDocx = zipfile.ZipFile("C:/Template.docx")
newDocx = zipfile.ZipFile("C:/NewDocument.docx", "a")

with open(templateDocx.extract("word/document.xml", "C:/")) as tempXmlFile:
    tempXmlStr = tempXmlFile.read()

for key in replaceText.keys():
    tempXmlStr = tempXmlStr.replace(str(key), str(replaceText.get(key)))

with open("C:/temp.xml", "w+") as tempXmlFile:
    tempXmlFile.write(tempXmlStr)

for file in templateDocx.filelist:
    if not file.filename == "word/document.xml":
        newDocx.writestr(file.filename, templateDocx.read(file))

newDocx.write("C:/temp.xml", "word/document.xml")

templateDocx.close()
newDocx.close()

我的问题是这种方法有什么问题?我对这些东西还很陌生，所以我觉得应该已经有人解决了.这使我相信这种方法存在很大的问题.但这有效！我在这里想念什么?

My question is what's wrong with this method? I'm pretty new to this stuff, so I feel someone else should have figured this out already. Which leads me to believe there is something very wrong with this approach. But it works! What am I missing here?

以下是我所有其他尝试学习此知识的人的思考过程:

Here is a walkthrough of my thought process for everyone else trying to learn this stuff:

步骤1)准备一个Python字典，将要替换的文本字符串作为键，将新文本作为项(例如{"XXXCLIENTNAMEXXX":"Joe Bob"，"XXXMEETDATEXXX":"2013年5月31日")}).

Step 1) Prepare a Python dictionary of the text strings you want to replace as keys and the new text as items (eg. {"XXXCLIENTNAMEXXX" : "Joe Bob", "XXXMEETDATEXXX" : "May 31, 2013"}).

第2步)使用zipfile模块打开模板docx文件.

Step 2) Open the template docx file using the zipfile module.

第3步)使用附加访问模式打开一个新的新docx文件.

Step 3) Open a new new docx file with the append access mode.

第4步)从模板docx文件中提取document.xml(所有文本都保存在其中)，并将xml读取为文本字符串变量.

Step 4) Extract the document.xml (where all the text lives) from the template docx file and read the xml to a text string variable.

第5步)使用for循环将xml文本字符串中字典中定义的所有文本替换为新文本.

Step 5) Use a for loop to replace all of the text defined in your dictionary in the xml text string with your new text.

步骤6)将xml文本字符串写入新的临时xml文件.

Step 6) Write the xml text string to a new temporary xml file.

第7步)使用for循环和zipfile模块将模板docx归档文件中的所有文件复制到新的docx归档文件中(不包括word/document.xml文件).

Step 7) Use a for loop and the zipfile module to copy all of the files in the template docx archive to a new docx archive EXCEPT the word/document.xml file.

第8步)，将带有替换文本的临时xml文件作为新的word/document.xml文件写入新的docx存档中.

Step 8) Write the temporary xml file with the replaced text to the new docx archive as a new word/document.xml file.

第9步)关闭模板和新的docx存档.

Step 9) Close your template and new docx archives.

第10步)打开新的docx文档，并享受替换后的文本！

Step 10) Open your new docx document and enjoy your replaced text!

-编辑-在第7和11行上缺少右括号')'

--Edit-- Missing closing parentheses ')' on lines 7 and 11

在.docx文件中查找和替换文本-Python [英] Find and replace text in .docx file - Python

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

在.docx文件中查找和替换文本-Python [英] Find and replace text in .docx file - Python

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭