How to convert a DOCX file to HTML using Open XML while keeping the formatting


Question

I know there are many questions with the same title, but none of them has shown me the right way to solve my problem.

I am using the Open XML SDK 2.5 together with PowerTools for Open XML to convert a .docx file to an .html file; the conversion is done by the HtmlConverter class.

I can convert the DOCX file to an HTML file successfully, but the HTML file does not retain the original formatting of the document. For example, font size, color, underline, and bold are not reflected in the HTML output.

Here is my existing code:

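using System.IO;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

public void ConvertDocxToHtml(string fileName)
{
    // Copy the file into a resizable in-memory stream so the document
    // can be opened for editing without touching the file on disk.
    byte[] byteArray = File.ReadAllBytes(fileName);
    using (MemoryStream memoryStream = new MemoryStream())
    {
        memoryStream.Write(byteArray, 0, byteArray.Length);
        using (WordprocessingDocument doc = WordprocessingDocument.Open(memoryStream, true))
        {
            HtmlConverterSettings settings = new HtmlConverterSettings()
            {
                PageTitle = "My Page Title"
            };

            // Convert the document to an XHTML element tree and write it out.
            XElement html = HtmlConverter.ConvertToHtml(doc, settings);
            File.WriteAllText(@"E:\Test.html", html.ToStringNewLineOnAttributes());
        }
    }
}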



So I would like to know whether there is any way to retain the formatting in the converted HTML file.

I know of some third-party APIs that do this, but I would prefer a solution based on Open XML or another open-source library.

Recommended answer

PowerTools for Open XML has just released a new HtmlConverter module. It now contains a free, open-source implementation of DOCX-to-HTML conversion that formats the output with CSS. The module, HtmlConverter.cs, supports all paragraph, character, and table styles, fonts and text formatting, numbered and bulleted lists, images, and more. See http://bit.ly/1bclyg9
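As a rough illustration of how the newer module might be called, here is a minimal sketch modeled on the sample code that ships with PowerTools for Open XML. The settings properties FabricateCssClasses, CssClassPrefix, and AdditionalCss are taken from that sample and may differ between PowerTools versions. The class name DocxToHtml, the method name ConvertWithFormatting, and the CSS string are illustrative assumptions, not part of the original answer. Image handling is left out for brevity.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

// Hypothetical helper class; the name is an assumption for illustration.
public static class DocxToHtml
{
    public static void ConvertWithFormatting(string docxPath, string htmlPath)
    {
        // Load the file into a resizable stream so the document can be opened
        // for editing while the file on disk stays unchanged.
        byte[] bytes = File.ReadAllBytes(docxPath);
        using (var stream = new MemoryStream())
        {
            stream.Write(bytes, 0, bytes.Length);
            using (WordprocessingDocument doc = WordprocessingDocument.Open(stream, true))
            {
                // Property names follow the PowerTools sample code (assumption:
                // a PowerTools release that includes the CSS-based converter).
                var settings = new HtmlConverterSettings
                {
                    PageTitle = Path.GetFileNameWithoutExtension(docxPath),
                    FabricateCssClasses = true,   // emit CSS classes for the formatting
                    CssClassPrefix = "pt-",       // prefix for the generated class names
                    AdditionalCss = "body { margin: 1cm auto; max-width: 20cm; padding: 0; }"
                };

                XElement htmlElement = HtmlConverter.ConvertToHtml(doc, settings);

                // Wrap the element in a document with an HTML5 doctype.
                var html = new XDocument(
                    new XDocumentType("html", null, null, null),
                    htmlElement);

                File.WriteAllText(htmlPath, html.ToString(SaveOptions.DisableFormatting), Encoding.UTF8);
            }
        }
    }
}
```

A call such as DocxToHtml.ConvertWithFormatting(@"E:\Test.docx", @"E:\Test.html") would then write the styled page; the input path here is hypothetical.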
