Add HTML String to OpenXML (*.docx) Document
Question
I am trying to use Microsoft's OpenXML 2.5 library to create an OpenXML document. Everything works until I try to insert an HTML string into the document. I have scoured the web, and here is what I have come up with so far (trimmed to just the portion I am having trouble with):
    Paragraph paragraph = new Paragraph();
    Run run = new Run();
    string altChunkId = "id1";
    AlternativeFormatImportPart chunk =
        document.MainDocumentPart.AddAlternativeFormatImportPart(
            AlternativeFormatImportPartType.Html, altChunkId);
    chunk.FeedData(new MemoryStream(Encoding.UTF8.GetBytes(ioi.Text)));
    AltChunk altChunk = new AltChunk { Id = altChunkId };
    run.AppendChild(new Break());
    paragraph.AppendChild(run);
    body.AppendChild(paragraph);
Obviously, I haven't actually added the altChunk in this example, but I have tried appending it everywhere: to the run, the paragraph, the body, etc. In every case, I am unable to open the docx file in Word 2010.
This is making me a little nutty because it seems like it should be straightforward (I will admit that I'm not fully understanding the AltChunk "thing"). Would appreciate any help.
Side Note: One thing I did find that was interesting, and I don't know if it's actually a problem or not, is this response which says AltChunk corrupts the file when working from a MemoryStream. Can anybody confirm that this is/isn't true?
Recommended answer
I can reproduce the error "... there is a problem with the content" by using an incomplete HTML document as the content of the alternative format import part. For example, if you use the HTML snippet <h1>HELLO</h1> on its own, MS Word is unable to open the document.
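One way to guard against this is to wrap bare fragments before feeding them to the import part. The following is only a minimal sketch: `HtmlChunkHelper` and `EnsureFullHtmlDocument` are hypothetical names, not part of the Open XML SDK, and the check for an existing `<html` tag is a simple substring test rather than real HTML parsing.

```csharp
using System;

static class HtmlChunkHelper
{
    // Hypothetical helper (not an Open XML SDK API): returns the input
    // unchanged if it already contains an <html tag, otherwise wraps the
    // fragment in a minimal <html><head></head><body>...</body></html> skeleton.
    public static string EnsureFullHtmlDocument(string html)
    {
        if (html == null) throw new ArgumentNullException(nameof(html));

        // Crude substring check; a real HTML parser would be more robust.
        if (html.IndexOf("<html", StringComparison.OrdinalIgnoreCase) >= 0)
            return html;

        return "<html><head></head><body>" + html + "</body></html>";
    }
}
```

With this in place, the string from the question could be passed as `HtmlChunkHelper.EnsureFullHtmlDocument(ioi.Text)` before it is encoded and handed to `FeedData`.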
The code below shows how to add an AlternativeFormatImportPart to a Word document. (I've tested the code with MS Word 2013.)
    using (WordprocessingDocument doc = WordprocessingDocument.Open(@"test.docx", true))
    {
        string altChunkId = "myId";
        MainDocumentPart mainDocPart = doc.MainDocumentPart;

        var run = new Run(new Text("test"));
        var p = new Paragraph(
            new ParagraphProperties(
                new Justification() { Val = JustificationValues.Center }),
            run);

        var body = mainDocPart.Document.Body;
        body.Append(p);

        MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes("<html><head></head><body><h1>HELLO</h1></body></html>"));

        // Uncomment the following line to create an invalid word document.
        // MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes("<h1>HELLO</h1>"));

        // Create alternative format import part.
        AlternativeFormatImportPart formatImportPart =
            mainDocPart.AddAlternativeFormatImportPart(
                AlternativeFormatImportPartType.Html, altChunkId);
        //ms.Seek(0, SeekOrigin.Begin);

        // Feed HTML data into format import part (chunk).
        formatImportPart.FeedData(ms);

        AltChunk altChunk = new AltChunk();
        altChunk.Id = altChunkId;

        mainDocPart.Document.Body.Append(altChunk);
    }
According to the Office OpenXML specification, the valid parent elements for the w:altChunk element are body, comment, docPartBody, endnote, footnote, ftr, hdr and tc. So I've added the w:altChunk to the body element.
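Because a run or a paragraph is not in that list, the altChunk has to sit next to a paragraph rather than inside it. As a rough sketch of placing the imported HTML at a specific position instead of at the end of the body, the helper below inserts the altChunk as the next sibling of an existing paragraph. `AltChunkPlacement` and `InsertHtmlAfter` are hypothetical names, and the sketch assumes that `anchor` is a direct child of the body and that `fullHtml` is already a complete HTML document.

```csharp
using System.IO;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class AltChunkPlacement
{
    // Hypothetical helper: imports fullHtml as an altChunk placed directly
    // after the given paragraph. Assumes anchor is a direct child of w:body
    // and altChunkId is not already used as a relationship id in mainPart.
    public static void InsertHtmlAfter(
        MainDocumentPart mainPart, Paragraph anchor, string fullHtml, string altChunkId)
    {
        AlternativeFormatImportPart part = mainPart.AddAlternativeFormatImportPart(
            AlternativeFormatImportPartType.Html, altChunkId);

        // FeedData copies the stream contents into the part.
        using (var ms = new MemoryStream(Encoding.UTF8.GetBytes(fullHtml)))
        {
            part.FeedData(ms);
        }

        // The altChunk becomes a sibling of the paragraph, so its parent
        // is the same element that contains the paragraph (here, the body).
        anchor.InsertAfterSelf(new AltChunk { Id = altChunkId });
    }
}
```

Appending the altChunk to the run or to the paragraph itself, as attempted in the question, would put it under a parent that the specification does not allow.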
For more information on the w:altChunk element, see this MSDN link.