From HTML <figure> and <figcaption> to Microsoft Word


I have an HTML document with figure, img and figcaption tags, and I would like to convert it to a Microsoft Word document.

The image referred to by img should be inserted into the Word document, and the figcaption should be converted to its caption (keeping the figure number as well).

I have tried opening the HTML with Word 2013, but the figcaption is not converted to a figure caption; it is just plain text below the image.

Is there a minimal working sample to get this done? I had a look at https://en.wikipedia.org/wiki/Microsoft_Office_XML_formats#Word_XML_Format_example, but it is too verbose to extract even a Hello world sample from it.

figure .image {
    width: 100%;
}

figure {
    text-align: center;
    display: table;
    max-width: 30%; /* demo; set some amount (px or %) if you can */
    margin: 10px auto; /* not needed unless you want centered */
}
article {
  counter-reset: figures;
}

figure {
  counter-increment: figures;
}

figcaption:before {
  content: "Fig. " counter(figures) " - "; /* For I18n support; use data-counter-string. */
}

<figure>
<p><img class="image" src="https://upload.wikimedia.org/wikipedia/commons/c/ca/Matterhorn002.jpg"></p>
<figcaption>Il monte Cervino.</figcaption>
</figure>

<figure>
<p><img class="image" src="https://upload.wikimedia.org/wikipedia/commons/2/26/Banner_clouds.jpg"></p>
<figcaption>La nuvola che spesso è vicino alla vetta.</figcaption>
</figure>

I tried pandoc on Windows:

pandoc -f html -t docx -o hello.docx hello.html

but with no luck: the "Fig. 1" and "Fig. 2" prefixes are missing from the resulting document.

My pandoc is:

c:\temp>.\pandoc.exe -v
pandoc.exe 1.19.2.1
Compiled with pandoc-types 1.17.0.4, texmath 0.9, skylighting 0.1.1.4
Default user data directory: C:\Users\ale\AppData\Roaming\pandoc
Copyright (C) 2006-2016 John MacFarlane
Web:  http://pandoc.org
This is free software; see the source for copying conditions.
There is no warranty, not even for merchantability or fitness
for a particular purpose.
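A likely reason the "Fig. N" prefixes are missing is that they exist only as CSS generated content (the figcaption:before rule above), which is not part of the HTML text a converter reads. A minimal sketch of one workaround, assuming Node.js and assuming every figure has exactly one figcaption, is to write the prefix into the markup before running the converter. The helper name numberFigcaptions is hypothetical and not part of pandoc:

```javascript
// Sketch: bake the CSS-generated "Fig. N - " prefix into each <figcaption>
// so the numbering survives tools that ignore CSS counters.
// numberFigcaptions is a hypothetical helper, not a pandoc feature.
function numberFigcaptions(html, label = "Fig. ") {
  let count = 0;
  // Match each opening <figcaption ...> tag and append the prefix after it.
  // Assumption: one figcaption per figure, numbered across the whole input.
  return html.replace(/<figcaption\b[^>]*>/gi, (openTag) => {
    count += 1;
    return `${openTag}${label}${count} - `;
  });
}

const input = [
  '<figure><p><img src="a.jpg"></p><figcaption>Il monte Cervino.</figcaption></figure>',
  '<figure><p><img src="b.jpg"></p><figcaption>La nuvola.</figcaption></figure>',
].join("\n");

console.log(numberFigcaptions(input));
```

The result is plain text numbering, not a Word caption field, so Word will not renumber it automatically if figures are moved.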

Edit 1

It is also fine to use some C# to get it done. Maybe I can transform the HTML into a Word XML format with a C# program.
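For reference, a real Word caption in the Word XML format is an ordinary paragraph that uses the Caption style and contains a SEQ field, which is what lets Word keep the figure numbers up to date. The sketch below, in JavaScript to match the script code elsewhere in this thread, builds such a paragraph as a string. The helper names escapeXml and captionParagraph are hypothetical, and the fragment assumes it is placed inside the body of word/document.xml where the w: namespace is declared and a style with the id Caption exists:

```javascript
// Escape the characters that are special in XML text and attribute values.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build one WordprocessingML caption paragraph (hypothetical helper).
// `number` is only the cached display value; Word recomputes SEQ fields
// when the fields in the document are updated.
function captionParagraph(number, text, label = "Figure") {
  return [
    "<w:p>",
    '<w:pPr><w:pStyle w:val="Caption"/></w:pPr>',
    `<w:r><w:t xml:space="preserve">${escapeXml(label)} </w:t></w:r>`,
    `<w:fldSimple w:instr=" SEQ ${escapeXml(label)} \\* ARABIC ">`,
    `<w:r><w:t>${number}</w:t></w:r>`,
    "</w:fldSimple>",
    `<w:r><w:t xml:space="preserve"> - ${escapeXml(text)}</w:t></w:r>`,
    "</w:p>",
  ].join("");
}

console.log(captionParagraph(1, "Il monte Cervino."));
```

This covers only the caption paragraph; embedding the image itself needs additional parts in the .docx package and is not shown here.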

Solution

This may be more roundabout than you would like, but if you save the file as a PDF (I used Adobe to create a PDF from an HTML file containing figure/figcaption, but you could obviously do that programmatically) and then export that PDF to Word, you end up with a Word document. Perhaps one intermediate step too many, but it does work!

Hope this is of some assistance (perhaps a PDF would do?).

EDIT 1: I just found a jQuery plugin by Mark Windsoll that converts HTML to Word. I made a CodePen that includes figure/figcaption. When you press the button, it exports the content as a Word document. (I suppose you could save it too, but his original CodePen didn't actually do anything when I clicked the link that said "export to doc".)

// Export the #page-content element as a Word document on button click.
jQuery(document).ready(function ($) {
    $(".word-export").click(function (event) {
        $("#page-content").wordExport();
    });
});

img { width: 300px; height: auto; }
figcaption { width: 350px; text-align: center; }
h1 { margin-top: 10px; }
h1, h2 { margin-left: 35px; }
p { width: 95%; padding-top: 20px; margin: 0 auto; }
button { margin: 15px 30px; padding: 5px; }

<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
<script src="https://www.jqueryscript.net/demo/Export-Html-To-Word-Document-With-Images-Using-jQuery-Word-Export-Plugin/FileSaver.js"></script>
<script src="https://www.jqueryscript.net/demo/Export-Html-To-Word-Document-With-Images-Using-jQuery-Word-Export-Plugin/jquery.wordexport.js"></script>

<link href="https://www.jqueryscript.net/css/jquerysctipttop.css" rel="stylesheet"/>

<h1>jQuery Word Export Plugin Demo</h1>
<div id="page-content">
<h2>Lovely Trees</h2>
<figure>
  <img src="http://www.rachelgallen.com/images/autumntrees.jpg">
  <figcaption>Autumn Trees</figcaption>
</figure>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec vehicula bibendum lacinia. Pellentesque placerat interdum nisl non semper. Integer ornare, nunc non varius mattis, nulla neque venenatis nibh, vitae cursus risus quam ut nulla. Aliquam erat volutpat. Aliquam erat volutpat. </p>
  <p>And some more text here, but that's quite enough lorem ipsum rubbish!</p>
</div>
<button class="word-export">Export as .doc</button>

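EDIT 2: To convert HTML to Word using C# you can use GemBox.Document, which is free unless you buy the professional version (you can also use it free for a while to evaluate it).

The C# code is

using GemBox.Document;
ComponentInfo.SetLicense("FREE-LIMITED-KEY"); // free-mode key; replace with your own license key if you have one
// Convert HTML to Word (DOCX) document.
DocumentModel.Load("Document.html").Save("Document.docx");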

Rachel
