How to convert a .doc or .docx file to .html


Question

How can I convert a .doc or .docx file to .html?

Recommended answer

First and foremost, you should understand that the problem is ambiguous: there is no single universal one-to-one correspondence between the two formats, their data, and their rendering models. So if you develop such a function, it should accept not one parameter (the document itself) but two. It should also receive a set of mapping rules, which can differ and therefore produce different results.
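To make the "two parameters" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the document is modeled as a plain list of (style name, text) pairs, and `to_html`, `semantic_rules` and `flat_rules` are hypothetical names, not part of any Word API.

```python
from html import escape


def to_html(document, rules, default_tag="p"):
    """Render each paragraph with the HTML tag that `rules` assigns to its style.

    `document` is a hypothetical model: a list of (style_name, text) pairs.
    `rules` maps a style name to an HTML tag name.
    """
    parts = []
    for style, text in document:
        tag = rules.get(style, default_tag)  # unmapped styles fall back to <p>
        parts.append(f"<{tag}>{escape(text)}</{tag}>")
    return "\n".join(parts)


# The same document rendered under two different rule sets.
doc = [("Heading 1", "Report"), ("Normal", "Costs & benefits")]
semantic_rules = {"Heading 1": "h1", "Normal": "p"}
flat_rules = {"Heading 1": "div", "Normal": "div"}

semantic_html = to_html(doc, semantic_rules)
flat_html = to_html(doc, flat_rules)
```

Both results are legitimate conversions of the same input, which is why the rules have to be an explicit input rather than something baked into the converter.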

To read or parse Word documents, you can use Microsoft Office Interop for Word. Its assembly is installed in the GAC along with Office, so you can reference it from the ".NET" tab of the "Add Reference" window. Please see:
http://en.wikipedia.org/wiki/Visual_Studio_Tools_for_Office
http://msdn.microsoft.com/en-us/library/ff601860.aspx
http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word.aspx

This article may also be useful:
http://www.dotnetperls.com/word

If you want to work with Word formats without installing Office, you can still do so. After all, OpenOffice, LibreOffice and other products support all versions of the format. Please see:
http://en.wikipedia.org/wiki/OpenOffice.org
http://en.wikipedia.org/wiki/LibreOffice
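As a rough illustration of the "no Office installed" route, LibreOffice can be driven from the command line in headless mode. The sketch below wraps that call in Python; it assumes the `soffice` executable is on the PATH, and the helper names `build_convert_command` and `convert_to_html` are made up for this example. The result is LibreOffice's own rendering of the document, which may differ from what Word itself would export.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, out_dir, binary="soffice"):
    """Build the LibreOffice command line for a headless conversion to HTML."""
    return [
        binary,
        "--headless",            # run without a GUI
        "--convert-to", "html",  # target format
        "--outdir", str(out_dir),
        str(source),
    ]


def convert_to_html(source, out_dir, binary="soffice"):
    """Run the conversion and return the path where the .html file is expected.

    Assumes LibreOffice names the output after the input file's base name.
    """
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} was not found on PATH")
    subprocess.run(build_convert_command(source, out_dir, binary), check=True)
    return Path(out_dir) / (Path(source).stem + ".html")
```

A call such as `convert_to_html("report.docx", "out")` would then be expected to leave `out/report.html` behind; the same command accepts .doc input as well.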

These products are open-source, so you can always download the source code and see the code behind the conversion.

If you only need to support the newer Office Open XML format, its specification is publicly available and standardized as ECMA-376 and ISO/IEC 29500:2008:
http://en.wikipedia.org/wiki/Office_Open_XML
http://en.wikipedia.org/wiki/Office_Open_XML_software
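Because a .docx file is a ZIP package whose main body lives in `word/document.xml`, the text can be reached with nothing but the Python standard library. The following is only a sketch under strong simplifying assumptions: it reads paragraph text (`w:p` / `w:t` elements) and wraps each paragraph in `<p>`, ignoring styles, tables, images, lists and nested content such as text boxes. The function names are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET
from html import escape

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(path_or_file):
    """Return the plain text of each w:p paragraph in a .docx package."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join all w:t pieces.
        text = "".join(t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs


def docx_to_plain_html(path_or_file):
    """Wrap every non-empty paragraph in <p>, escaping HTML special characters."""
    return "\n".join(
        f"<p>{escape(text)}</p>" for text in docx_paragraphs(path_or_file) if text
    )
```

This only works for .docx; the older binary .doc format is not a ZIP package and needs one of the other approaches.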

Please see the comparison chart of Office Open XML software:
http://en.wikipedia.org/wiki/Comparison_of_Office_Open_XML_software

Since some of this software is open source, you can study and reuse its code.

—SA


http://stackoverflow.com/questions/8135901/converting-docx-to-html

http://janewdaisy.wordpress.com/2012/04/06/how-to-convert-word-document-to-html-with-cvb-net/

