How to split pages of a Word document into separate files in C#
Question
I have an OCR program that converts images to Word documents. The resulting Word document contains the text of all the images, and I want to split it into separate files, one per page.
Is there any way to do this in C#?
Thanks.
Recommended answer
Same as the other answer, but written as an iterator that returns IEnumerable<Range> and is exposed as an extension method on Document.
using System.Collections.Generic;
using Microsoft.Office.Interop.Word;

static class PagesExtension {
    // Lazily yields one Range per page of the document.
    public static IEnumerable<Range> Pages(this Document doc) {
        int pageCount = (int)doc.Range().Information[WdInformation.wdNumberOfPagesInDocument];
        int pageStart = 0;
        for (int currentPageIndex = 1; currentPageIndex <= pageCount; currentPageIndex++) {
            // Start at the current page; the end is narrowed below.
            var page = doc.Range(pageStart);
            if (currentPageIndex < pageCount) {
                // page.GoTo returns a new Range object, leaving the page object unaffected
                page.End = page.GoTo(
                    What: WdGoToItem.wdGoToPage,
                    Which: WdGoToDirection.wdGoToAbsolute,
                    Count: currentPageIndex + 1
                ).Start - 1;
            } else {
                // The last page runs to the end of the document.
                page.End = doc.Range().End;
            }
            pageStart = page.End + 1;
            yield return page;
        }
    }
}
The main code looks like this:
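// Needs the same using directive for Microsoft.Office.Interop.Word as PagesExtension.
static class Program {
    static void Main(string[] args) {
        var app = new Application();
        app.Visible = true;
        var doc = app.Documents.Open(@"path\to\source\document");
        foreach (var page in doc.Pages()) {
            // Copy each page through the clipboard into a new, unsaved document.
            page.Copy();
            var doc2 = app.Documents.Add();
            doc2.Range().Paste();
        }
    }
}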
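The main code above leaves each page in a new, unsaved document. Since the question asks for separate files, here is a minimal sketch of one way to save them. It is not part of the original answer: the SplitAndSave class, the SavePages helper, the outputDir parameter and the page-N.docx naming are all hypothetical. It assumes the Pages() extension method shown earlier and a local Word installation, and it has not been run against every Word version.

```csharp
using System.IO;
using Microsoft.Office.Interop.Word;

static class SplitAndSave {
    // Hypothetical helper (not from the original answer): copies each page
    // range into a new document and saves it as page-N.docx under outputDir.
    public static void SavePages(Application app, Document source, string outputDir) {
        // Word expects a full path, so resolve the directory first.
        string fullDir = Path.GetFullPath(outputDir);
        Directory.CreateDirectory(fullDir);

        int pageNumber = 1;
        foreach (Range page in source.Pages()) {   // Pages() is the extension method from the answer
            page.Copy();
            Document target = app.Documents.Add();
            target.Range().Paste();

            // The file name pattern is an assumption chosen for illustration.
            string path = Path.Combine(fullDir, "page-" + pageNumber + ".docx");
            target.SaveAs2(path, WdSaveFormat.wdFormatXMLDocument);

            // Cast to _Document to avoid the method/event ambiguity warning on Close.
            ((_Document)target).Close(WdSaveOptions.wdDoNotSaveChanges);
            pageNumber++;
        }
    }
}
```

From Main, this could be called as SplitAndSave.SavePages(app, doc, @"path\to\output\folder") in place of the foreach loop, where the output folder is a placeholder. Because the copy goes through the clipboard, an empty page may cause Copy to fail, so a real program would probably want to guard against that.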