Searching Docx files in Java
Problem description
I am writing an application that searches the content of documents. I have already written the code for searching documents that can be edited in Notepad.
I would like to do the same for docx files. After some research I have come up with these two options:
- http://www.infoq.com/articles/cracking-office-2007-with-java : this method requires me to extract the docx file and then search the XML files inside it. However, the extraction step adds overhead, and frankly I don't know how to process an XML file (discarding attribute content, etc.).
- http://www.javadocx.com/download : this method lets me import a JAR library into my project, and supposedly I can create docx files with it. What I don't understand is how to open docx files with it.
Can anyone recommend an alternative method to do the same thing, or help with the two methods mentioned above?
Recommended answer
Try Apache Tika (http://tika.apache.org/), docx4j, or Apache POI.
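If adding a library is not an option, the first approach from the question can also be done with the JDK alone, without extracting anything to disk. A docx file is a ZIP archive whose main text lives in the entry word/document.xml, inside w:t elements. The sketch below is only an illustration under some assumptions: the class name DocxSearch and the methods extractText and contains are made up for this example, it reads only the main document part (no headers, footers, or footnotes), and it ignores tabs and manual line breaks. Attributes never show up in the result because a DOM walk over child nodes does not visit them.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration; not part of Tika, docx4j, or POI.
final class DocxSearch {
    // Namespace of the WordprocessingML elements (w:p, w:r, w:t, ...).
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the text of word/document.xml, one line per paragraph. */
    static String extractText(Path docx) throws IOException {
        // ZipFile reads entries in place, so nothing is unpacked to disk.
        try (ZipFile zip = new ZipFile(docx.toFile())) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                throw new IOException("No word/document.xml in " + docx);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                // Refuse DOCTYPE declarations to avoid XXE on untrusted files.
                factory.setFeature(
                        "http://apache.org/xml/features/disallow-doctype-decl", true);
                Document doc = factory.newDocumentBuilder().parse(in);
                StringBuilder text = new StringBuilder();
                collect(doc, text);
                return text.toString();
            } catch (ParserConfigurationException | SAXException e) {
                throw new IOException("Could not parse word/document.xml", e);
            }
        }
    }

    /** Case-insensitive substring search over the extracted text. */
    static boolean contains(Path docx, String query) throws IOException {
        return extractText(docx).toLowerCase(Locale.ROOT)
                .contains(query.toLowerCase(Locale.ROOT));
    }

    // Depth-first walk: append the content of every w:t element and
    // end each w:p (paragraph) with a newline. Attributes are never visited.
    private static void collect(Node node, StringBuilder out) {
        boolean inWordNs = W_NS.equals(node.getNamespaceURI());
        if (inWordNs && "t".equals(node.getLocalName())) {
            out.append(node.getTextContent());
            return;
        }
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            collect(child, out);
        }
        if (inWordNs && "p".equals(node.getLocalName())) {
            out.append('\n');
        }
    }
}
```

A call such as DocxSearch.contains(Path.of("report.docx"), "budget") (the file name is a placeholder) then tells you whether the document mentions the term. Joining the w:t pieces before searching matters because Word often splits a single word or phrase across several runs, so a plain text search over the raw XML can miss matches. The libraries named above deal with the parts this sketch skips.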