Adding images to an OpenXML doc created from altChunk

Question

I need an automated process for creating docx files from xhtml source. The xhtml files contain images (<img> elements) whose "src" attributes point to an external reference. But the docx files need to be readable without a network connection, so I need to find a way to embed the images directly into the docx package (namely, in the /media folder).

So far I've used the altChunk method (as described by Eric White) to create the .docx file. I had hoped to use the OpenXML SDK to insert the image parts into the package. But to do that I need to insert paragraphs (<p> nodes) into the document. Unfortunately the document part contains nothing but a reference to the altChunk (stored separately in the docx package). Of course, once the docx is opened, edited and saved in Word, the altChunk part is removed and its contents are embedded properly in document.xml. But I don't know of any way to do that programmatically, so that doesn't help.
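For readers unfamiliar with the altChunk layout described above, here is a minimal sketch in Python (standard library only, not the OpenXML SDK the question uses). The function name `build_altchunk_docx`, the relationship id `htmlChunk1`, and the part name `chunk1.xhtml` are illustrative assumptions. It shows why the main document part holds nothing but a reference: the XHTML is stored as its own part and only pointed at from document.xml.

```python
import io
import zipfile

CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Default Extension="xhtml" ContentType="application/xhtml+xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""

PACKAGE_RELS = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
</Relationships>"""

# The body contains only the altChunk reference, no <w:p> paragraphs.
DOCUMENT_XML = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
            xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <w:body>
    <w:altChunk r:id="htmlChunk1"/>
  </w:body>
</w:document>"""

# The aFChunk relationship links the reference to the stored XHTML part.
DOCUMENT_RELS = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="htmlChunk1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/aFChunk" Target="chunk1.xhtml"/>
</Relationships>"""


def build_altchunk_docx(xhtml: str) -> bytes:
    """Return the bytes of a .docx whose body is a single altChunk reference."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", PACKAGE_RELS)
        package.writestr("word/document.xml", DOCUMENT_XML)
        package.writestr("word/_rels/document.xml.rels", DOCUMENT_RELS)
        package.writestr("word/chunk1.xhtml", xhtml.encode("utf-8"))
    return buffer.getvalue()
```

Word imports the chunk when it opens the file, which is why the real paragraphs only appear in document.xml after a save from Word.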

Other options I've considered:

  1. Partitioning the xhtml into segments, split at each image, then adding each altChunk one at a time, with the appropriate image reference between each one. (Tedious, but seems possible.)
  2. Inserting the images into the media folder, and then finding a way to embed WordprocessingML directly into the xhtml so that the <img> references the packaged image file. (Questionable at best.)

Can anyone think of a better approach?
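As a rough illustration of option 1, the partitioning step might look like the Python sketch below. The helper name `partition_at_images` is hypothetical, and the regex-based split is a simplification: if an <img> sits inside another element (for example a <p>), the resulting fragments will not be well-formed on their own and would need repairing before being used as altChunks.

```python
import re

# Matches a whole <img ...> or <img ... /> tag.
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
# Pulls the value out of a quoted src attribute.
SRC_ATTR = re.compile(r"""\bsrc\s*=\s*["']([^"']*)["']""", re.IGNORECASE)


def partition_at_images(xhtml):
    """Split xhtml into ("html", text) and ("image", src) items in document order."""
    parts = []
    position = 0
    for match in IMG_TAG.finditer(xhtml):
        if match.start() > position:
            parts.append(("html", xhtml[position:match.start()]))
        src = SRC_ATTR.search(match.group(0))
        parts.append(("image", src.group(1) if src else None))
        position = match.end()
    if position < len(xhtml):
        parts.append(("html", xhtml[position:]))
    return parts
```

Each "html" item would become its own altChunk and each "image" item an image part placed between them, which is where the tedium of this option comes from.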

Recommended answer

Well, I sorta solved my own problem: I decided to convert the document to mHtml (which can contain images embedded directly in the file) and then use the altchunk to create the final docx file. However, I still wanted to do some post-processing on the file (to insert endnotes in the Word document), but as mentioned above, this is not possible until after the altchunk has been transformed into docx, which cannot be done programmatically.

So it dawned on me that I could bypass the altchunk path altogether and simply use mHtml as the "gateway" from xHtml to docx. I just transformed the xHtml into mHtml, complete with embedded images and endnotes, then renamed the file with a .doc extension. The resulting document can be opened directly by Word (and will be converted more properly on subsequent save). So far it works great (albeit with some bugs in the Mac version of Word and in Word 2003).
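The answer does not show how the mHtml file was produced. One possible way, sketched here in Python with the standard `email` package, is to build a multipart/related MIME message in which each image part carries a Content-Location header equal to the URL used in the matching <img src>. The function name `build_mhtml`, the `file:///document.html` location, and the shape of the `images` mapping are assumptions made for illustration, and fetching the image bytes is left out.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mhtml(xhtml, images):
    """Build an MHTML document as bytes.

    images maps each src URL used in the xhtml to a (raw_bytes, subtype)
    pair, for example {"http://example.com/a.png": (data, "png")}.
    """
    message = MIMEMultipart("related", type="text/html")

    # The root part is the (X)HTML itself.
    html_part = MIMEText(xhtml, "html", "utf-8")
    html_part["Content-Location"] = "file:///document.html"
    message.attach(html_part)

    # Each image is embedded and keyed by the URL the markup refers to,
    # so the <img src> values can stay unchanged.
    for url, (data, subtype) in images.items():
        image_part = MIMEImage(data, _subtype=subtype)
        image_part["Content-Location"] = url
        message.attach(image_part)

    return message.as_bytes()
```

Writing the returned bytes to a file with a .doc extension would mirror the renaming step described above. How well Word accepts the exact header layout that Python emits is not something this sketch verifies.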
