How can I query a Word docx in an ASP.NET app?

Question

I would like to upload a Word 2007 or later docx file to my web server and convert its table of contents to a simple XML structure. Doing this on the desktop with traditional VBA seems like it would have been easy, but the WordprocessingML XML that makes up the docx file is confusing to read. Is there a way (without COM) to navigate the document in a more object-oriented fashion?

Recommended answer

I highly recommend looking into the Open XML SDK 2.0 (http://www.microsoft.com/downloads/details.aspx?FamilyID=c6e744e5-36e9-45f5-8d8c-331df206e0d0&DisplayLang=en). It's a CTP, but I've found it extremely useful for manipulating docx files without having to deal with COM at all. The documentation is a bit sketchy, but the key thing to look for is the DocumentFormat.OpenXml.Packaging.WordprocessingDocument class. You can also pick apart a .docx document by renaming its extension to .zip and digging into the XML files inside. From doing that, it looks like a table of contents is contained in a "Structured Document" tag (SdtBlock in the SDK), and each heading entry sits inside a hyperlink within it. Putzing around with it a bit, I found that something like this should work (or at least give you a starting point).

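using System.Collections.Generic; using System.Linq;
using DocumentFormat.OpenXml.Packaging; using DocumentFormat.OpenXml.Wordprocessing;

List<string> contentList = new List<string>();
// Open read-only (second argument false); the using block releases the file when done.
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(Filename, false))
{
    // Assumes the table of contents is the first structured document tag in the body.
    SdtBlock contents = wordDoc.MainDocumentPart.Document.Descendants<SdtBlock>().First();
    foreach (Hyperlink section in contents.Descendants<Hyperlink>())
    {
        // Takes the first text run of each entry, which is normally the heading text.
        contentList.Add(section.Descendants<Text>().First().Text);
    }
}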
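
Since the goal in the question is a simple XML structure, here is a minimal sketch of turning a list of heading strings like contentList into XML with LINQ to XML (System.Xml.Linq). The TocXml class, the ToXml helper, and the toc/entry/index names are all made up for illustration; they are not part of the Open XML SDK, and the sample headings in Main are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TocXml
{
    // Hypothetical helper: wraps each heading in an <entry> element
    // under a single <toc> root, numbering entries from 1.
    public static XElement ToXml(IEnumerable<string> headings)
    {
        return new XElement("toc",
            headings.Select((heading, i) =>
                new XElement("entry",
                    new XAttribute("index", i + 1),
                    heading)));
    }

    public static void Main()
    {
        // Placeholder data standing in for the contentList built from the document.
        var contentList = new List<string> { "Introduction", "Getting Started" };
        Console.WriteLine(ToXml(contentList));
    }
}
```

XElement escapes characters such as & and < in the heading text on its own, so the headings can be passed through unmodified. This produces a flat list; if you need nesting by heading level, you would have to read the level from the document as well.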
