Extract only words from an MS Word document
Question
I am reading an MS Word document using C#. It reads every character, including tabs, white space, empty strings, special characters, and numbers. I want only the actual words. Can anyone provide an optimal solution? Answers with code would be most helpful. Thanks in advance.
Recommended answer
Use the Words member of the Document object (Microsoft.Office.Interop.Word.Document).
Each item in the Words collection is a Range object that represents one word. There is no Word object.
Steps:
1) Create an instance of MS Word.
2) Open the existing document.
3) Go through the Words collection (use a foreach loop).
4) Close the document.
5) Close the instance of MS Word.
That's all!
See also: Word tasks
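The five steps above might be sketched roughly as follows. This is only a sketch, assuming a project reference to Microsoft.Office.Interop.Word and a local Word installation. The names WordExtractor, IsWord and ExtractWords are made up for illustration, and the rule for what counts as a "word" (letters, optionally with apostrophes or hyphens) is an assumption to adjust as needed. The filter is there because items in the Words collection typically carry trailing spaces, and punctuation and paragraph marks show up as items of their own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

static class WordExtractor
{
    // Hypothetical filter: at least one letter, and nothing other than
    // letters, apostrophes, or hyphens. Numbers and punctuation are rejected.
    static bool IsWord(string text)
    {
        return text.Any(char.IsLetter)
            && text.All(c => char.IsLetter(c) || c == '\'' || c == '-');
    }

    static List<string> ExtractWords(string path)
    {
        var words = new List<string>();
        var app = new Word.Application();                    // 1) create an instance of MS Word
        Word.Document doc = null;
        try
        {
            doc = app.Documents.Open(path, ReadOnly: true);  // 2) open the existing document
            foreach (Word.Range range in doc.Words)          // 3) go through the Words collection
            {
                // Drop the trailing whitespace that Word attaches to each item.
                string text = (range.Text ?? string.Empty).Trim();
                if (IsWord(text))
                    words.Add(text);
            }
        }
        finally
        {
            // The casts avoid the ambiguity between the Close/Quit methods
            // and the events of the same name.
            if (doc != null)
                ((Word._Document)doc).Close(Word.WdSaveOptions.wdDoNotSaveChanges); // 4) close the document
            ((Word._Application)app).Quit();                 // 5) close the instance of MS Word
            Marshal.ReleaseComObject(app);
        }
        return words;
    }

    static void Main(string[] args)
    {
        // args[0] is expected to be the full path to the document.
        foreach (string word in ExtractWords(args[0]))
            Console.WriteLine(word);
    }
}
```

Closing the document and quitting Word inside finally keeps a stray WINWORD process from lingering if reading fails partway through.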
Alternatively, try the article "Using DocxToText to Extract Text from DOCX Files".
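For .docx files, text can also be pulled out without starting Word at all. The sketch below is not the DocxToText class from that article. It only illustrates the same general idea with the standard library, relying on the fact that a .docx file is a ZIP archive whose main text sits in w:t elements inside word/document.xml. The class and method names are hypothetical, the word pattern is an assumption, and headers, footers and footnotes (stored in other parts of the archive) are ignored.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class DocxWords
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Reads the main document part out of the .docx ZIP archive.
    public static string ReadDocumentXml(string docxPath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found");
            using (var reader = new StreamReader(entry.Open()))
                return reader.ReadToEnd();
        }
    }

    // Collects the text runs, then keeps only sequences of letters
    // (optionally joined by an apostrophe or hyphen).
    public static List<string> ExtractWords(string documentXml)
    {
        var text = new StringBuilder();
        foreach (XElement e in XDocument.Parse(documentXml).Descendants())
        {
            if (e.Name == W + "t")
                text.Append(e.Value);
            else if (e.Name == W + "p" || e.Name == W + "tab" || e.Name == W + "br")
                text.Append(' '); // paragraph starts, tabs and breaks separate words
        }
        return Regex.Matches(text.ToString(), @"\p{L}+(?:['-]\p{L}+)*")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }
}
```

A call such as DocxWords.ExtractWords(DocxWords.ReadDocumentXml(path)) would then return the list of words. On .NET Framework, ZipFile additionally needs a reference to System.IO.Compression.FileSystem.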