From HTML <figure> and <figcaption> to Microsoft Word


Problem description


I have an HTML file with figure, img and figcaption tags, and I would like to convert it to a Microsoft Word document.

The image referenced by img should be inserted into the Word document, and the figcaption should become its caption (keeping the figure number).

I tried opening the HTML in Word 2013, but the figcaption is not converted into a figure caption; it is just plain text below the image.

Is there a minimal working sample for this? I had a look at https://en.wikipedia.org/wiki/Microsoft_Office_XML_formats#Word_XML_Format_example but it is too verbose to extract a simple Hello World sample from.

figure .image {
    width: 100%;
}

figure {
    text-align: center;
    display: table;
    max-width: 30%; /* demo; set some amount (px or %) if you can */
    margin: 10px auto; /* not needed unless you want centered */
}
article {
  counter-reset: figures;
}

figure {
  counter-increment: figures;
}

figcaption:before {
  content: "Fig. " counter(figures) " - "; /* For I18n support; use data-counter-string. */
}

<figure>
<p><img class="image" src="https://upload.wikimedia.org/wikipedia/commons/c/ca/Matterhorn002.jpg"></p>
<figcaption>Il monte Cervino.</figcaption>
</figure>

<figure>
<p><img class="image" src="https://upload.wikimedia.org/wikipedia/commons/2/26/Banner_clouds.jpg"></p>
<figcaption>La nuvola che spesso è vicino alla vetta.</figcaption>
</figure>

I tried pandoc on Windows:

pandoc -f html -t docx -o hello.docx hello.html

but with no luck: the "Fig. 1" and "Fig. 2" prefixes are missing from the resulting document.
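A likely reason the prefixes disappear: the "Fig. N - " text is produced by the CSS counter in figcaption:before, so it exists only when a browser renders the page and is never part of the HTML that a converter reads. One workaround is to write the numbers into the markup before converting. The following is a minimal sketch in plain JavaScript (Node.js); the function name numberFigcaptions and the sample input are made up for illustration, and it assumes simple markup where a regular expression is good enough (a real HTML parser would be more robust). Unlike the CSS above, it numbers figures across the whole input rather than per article.

```javascript
// Sketch: write the "Fig. N - " prefix into each <figcaption> as real text,
// because CSS counters only exist at render time.
// Assumes simple markup; numbering runs over the whole string.
function numberFigcaptions(html, label = "Fig.") {
  let n = 0;
  return html.replace(/<figcaption(\s[^>]*)?>/gi, (openTag) => {
    n += 1;
    // Keep the original opening tag and put the prefix right after it.
    return `${openTag}${label} ${n} - `;
  });
}

// Hypothetical sample input, shortened from the markup in the question.
const input =
  '<figure><img src="a.jpg"><figcaption>Il monte Cervino.</figcaption></figure>' +
  '<figure><img src="b.jpg"><figcaption>La nuvola.</figcaption></figure>';

console.log(numberFigcaptions(input));
```

The result is static text, so Word will show "Fig. 1 - ..." but will not treat it as a caption field that renumbers automatically.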

My pandoc is:

c:\temp>.\pandoc.exe -v
pandoc.exe 1.19.2.1
Compiled with pandoc-types 1.17.0.4, texmath 0.9, skylighting 0.1.1.4
Default user data directory: C:\Users\ale\AppData\Roaming\pandoc
Copyright (C) 2006-2016 John MacFarlane
Web:  http://pandoc.org
This is free software; see the source for copying conditions.
There is no warranty, not even for merchantability or fitness
for a particular purpose.

Edit 1

Using some C# would also be fine. Maybe I could transform the HTML into one of Word's XML formats with a C# program.
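For reference, when Word itself inserts a caption, it writes an ordinary paragraph with the Caption style that contains a SEQ field, and that field is what keeps the figure numbers in sequence. The sketch below builds such a paragraph as a WordprocessingML string. It is written in JavaScript only to keep the illustration short; the helper names escapeXml and captionParagraph are hypothetical, and it assumes the target document declares the w: namespace, defines a style with the ID "Caption", and uses a single-word label.

```javascript
// Escape the characters that are special in XML text and attribute values.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build one WordprocessingML caption paragraph (hypothetical helper).
// `shownNumber` is only the cached field result; Word recomputes the
// number when the fields are updated.
function captionParagraph(captionText, shownNumber, label = "Figure") {
  return (
    "<w:p>" +
    '<w:pPr><w:pStyle w:val="Caption"/></w:pPr>' +
    `<w:r><w:t xml:space="preserve">${escapeXml(label)} </w:t></w:r>` +
    `<w:fldSimple w:instr=" SEQ ${escapeXml(label)} \\* ARABIC ">` +
    `<w:r><w:t>${shownNumber}</w:t></w:r>` +
    "</w:fldSimple>" +
    `<w:r><w:t xml:space="preserve"> - ${escapeXml(captionText)}</w:t></w:r>` +
    "</w:p>"
  );
}

console.log(captionParagraph("Il monte Cervino.", 1));
```

A C# program could emit the same XML. The paragraph would go into the document body directly after the paragraph that holds the image.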

Solution

This may be more roundabout than you would like, but you can save the file as a PDF and then export that PDF to Word. (I went into Adobe and created a PDF from an HTML file containing figure/figcaption, but you could obviously do that step programmatically.) It is perhaps one intermediate step too many, but it does work!

Hope this is of some assistance (perhaps a PDF would do?)

EDIT 1: I just found a jQuery plugin by Mark Windsoll which converts HTML to Word. I made a CodePen that includes figure/figcaption. When you press the button, it exports the page content as a Word file. (I suppose you could save it too, but in his original CodePen, clicking the link that said "export to doc" didn't actually do anything.. sigh..)

// Bind the export handler once the DOM is ready.
jQuery(document).ready(function ($) {
  $(".word-export").click(function (event) {
    // wordExport() is provided by jquery.wordexport.js
    $("#page-content").wordExport();
  });
});

img { width: 300px; height: auto; }
figcaption { width: 350px; text-align: center; }
h1 { margin-top: 10px; }
h1, h2 { margin-left: 35px; }
p { width: 95%; padding-top: 20px; margin: 0px auto; }
button { margin: 15px 30px; padding: 5px; }

<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
<script src="https://www.jqueryscript.net/demo/Export-Html-To-Word-Document-With-Images-Using-jQuery-Word-Export-Plugin/FileSaver.js"></script>
<script src="https://www.jqueryscript.net/demo/Export-Html-To-Word-Document-With-Images-Using-jQuery-Word-Export-Plugin/jquery.wordexport.js"></script>

<link href="https://www.jqueryscript.net/css/jquerysctipttop.css" rel="stylesheet"/>

<h1>jQuery Word Export Plugin Demo</h1>
<div id="page-content">
<h2>Lovely Trees</h2>
<figure>
  <img src="http://www.rachelgallen.com/images/autumntrees.jpg">
  <figcaption>Autumn Trees</figcaption>
</figure>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec vehicula bibendum lacinia. Pellentesque placerat interdum nisl non semper. Integer ornare, nunc non varius mattis, nulla neque venenatis nibh, vitae cursus risus quam ut nulla. Aliquam erat volutpat. Aliquam erat volutpat. </p>
  <p>And some more text here, but that's quite enough lorem ipsum rubbish!</p>
</div>
<button class="word-export">Export as .doc</button>

EDIT 2: To convert HTML to Word using C#, you can use GemBox.Document, which is free unless you buy the professional version (you can use it for free for a while to evaluate it).

The C# code is:

using GemBox.Document;
ComponentInfo.SetLicense("FREE-LIMITED-KEY"); // free mode; set before any other GemBox call
// Convert HTML to Word (DOCX) document.
DocumentModel.Load("Document.html").Save("Document.docx");

Rachel
