如何在Word 2007 .docx文件中搜索单词? [英] How can I search a word in a Word 2007 .docx file?
问题描述
我想在Word 2007文件(.docx)中搜索文本字符串,例如可以从Word中的搜索中找到的某些特殊短语".
I'd like to search a Word 2007 file (.docx) for a text string, e.g., "some special phrase" that could/would be found from a search within Word.
Python是否有办法查看文本?我对格式没有兴趣-我只想将文档归类为具有或不具有某些特殊短语".
Is there a way from Python to see the text? I have no interest in formatting - I just want to classify documents as having or not having "some special phrase".
推荐答案
更确切地说,.docx文档是OpenXML格式的Zip存档:您必须首先将其解压缩.
我下载了一个示例(Google:一些搜索词文件类型:docx ),解压缩后找到了一些文件夹. word 文件夹包含文档本身,位于文件 document.xml 中.
More exactly, a .docx document is a Zip archive in OpenXML format: you have first to uncompress it.
I downloaded a sample (Google: some search term filetype:docx) and after unzipping I found some folders. The word folder contains the document itself, in file document.xml.
这篇关于如何在Word 2007 .docx文件中搜索单词?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!