How to read text from a Word file excluding table data?
Problem description
Hi,
In my application I have to read the content of a Word (DOCX) file line by line, excluding shapes and other objects such as tables and charts. With the code below I can read the content, but it also includes the text from tables.
private void GetParaDetail(Word.Document doc)
{
    foreach (Word.Paragraph para in doc.Paragraphs)
    {
        string temp = para.Range.Text.Trim();
    }
}
I uploaded a sample file to this location (https://1drv.ms/w/s!Ah-Jh2Ok5SuHcCKzdzlY6etFDv8). Running the code above on that file gives me the following paragraphs, in order:
1111111111111
2222222222222
3333333333333
4444444444444
5555555555555
.
.
.
.
kkkkkkkkkkk
But I need only the text below. I searched a lot but didn't find any helpful information; everything I found refers only to the code above.
1111111111111
2222222222222
kkkkkkkkkkk
Solution
The simplest method might be to delete all Shapes, InlineShapes and Tables from the document. Instead of deleting tables, though, you might consider converting them to text. Once you've deleted or converted the content, you can read the whole document in one pass. In VBA that could be as simple as:
Sub Demo()
    With ActiveDocument
        Do While .InlineShapes.Count > 0
            .InlineShapes(1).Delete
        Loop
        Do While .Shapes.Count > 0
            .Shapes(1).Delete
        Loop
        Do While .Tables.Count > 0
            .Tables(1).Delete
        Loop
    End With
End Sub
I'll leave it to you to do the C# implementation.
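For reference, a C# port of the VBA above might look like the sketch below. It is not tested against the sample file. It assumes the Microsoft.Office.Interop.Word reference that the question's code already relies on, and C# 4 or later for the doc.Shapes[1] indexer syntax. The names WordTextReader and ReadBodyText are hypothetical, chosen only for illustration.

```csharp
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

static class WordTextReader
{
    // Hypothetical helper: removes inline shapes, floating shapes and tables
    // from the open document, then returns the remaining non-empty paragraphs.
    public static List<string> ReadBodyText(Word.Document doc)
    {
        // Word collections are 1-based; deleting item 1 repeatedly
        // empties the collection, exactly as in the VBA loops.
        while (doc.InlineShapes.Count > 0)
            doc.InlineShapes[1].Delete();

        while (doc.Shapes.Count > 0)
            doc.Shapes[1].Delete();

        while (doc.Tables.Count > 0)
            doc.Tables[1].Delete();

        // With the tables gone, the original paragraph loop
        // only sees body text.
        var lines = new List<string>();
        foreach (Word.Paragraph para in doc.Paragraphs)
        {
            string text = para.Range.Text.Trim();
            if (text.Length > 0)
                lines.Add(text);
        }
        return lines;
    }
}
```

The deletions change the open document, not the file on disk, so closing the document without saving afterwards should leave the original file intact.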