pandoc-generated docx misses italic variables in equations


Problem description

I have the following segment of Markdown with embedded LaTeX equations:

# Fisher's linear discriminant

\newcommand{\cov}{\mathrm{cov}}
\newcommand{\A}{\mathrm{A}}
\renewcommand{\B}{\mathrm{B}}
\renewcommand{\T}{^\top}

The first method to find an optimal linear discriminant was proposed by Fisher
(1936), using the ratio of the between-class variance to the within-class variance
of the projected data, $d(\vec x)$, as a criterion. Expressed in terms of the
sample properties, the $p$-dimensional centroids $\bar {\vec x}_\A$ and
$\bar {\vec x}_\B$ and the $p \times p$ covariance matrices
$S_A = \cov_i ( \vec x_{\A i} )$ and $S_B = \cov_i ( \vec x_{\B i} )$, the
optimal direction is given by 
$$
\vec w = \left ( \frac{ S_A + S_B }{2} \right ) ^{-1}
~ ( \bar {\vec x}_\B - \bar {\vec x}_\A ).
$$

When I convert it with pandoc to LaTeX and compile it with xelatex, I get the expected text with nicely rendered math. When I convert it with pandoc to MS Word using

pandoc test.text -o test.docx

and open it in MS Office Word 2007, I get the following:

Only those parts of the equations that are symbols or upright text get rendered correctly, while variable names in italics are replaced by a question mark in a box.

How can I make this work?

Recommended answer

In Word 2007, I see a result similar to yours, except that here, I don't see the "question marks in boxes" characters, just space.

If I then take one of the expressions, and use your trick of going to linear display and back, the characters reappear for that expression.

If I save and re-open, the other expressions still do not display correctly, but if I save and look at the XML, I notice that

  1. the Math font has been changed to Cambria Math
  2. additional run properties (w:rPr) XML specifying the Cambria Math font has been inserted in many of the math runs (m:r) inside the oMath elements, even in the oMath expressions that do not display correctly. However, in the oMath expression that now displays correctly, this extra XML has been applied to every run. In the others, it has only been applied to some runs (I think I can see the pattern, but I'm running out of time here right now...)
  3. If I manually add the XML to the other runs and re-open the document, the expressions appear correctly. Or at least, they do in the one case I have tried.
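
The observation in item 2 can be checked without opening Word. The following is a minimal sketch in Python using only the standard library. It assumes the equations live in word/document.xml inside the .docx archive and that the file is named test.docx as in the question; the function names `count_math_runs` and `count_in_docx` are hypothetical helpers, not part of pandoc or Word.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by Office Math (m:) and WordprocessingML (w:)
M = "{http://schemas.openxmlformats.org/officeDocument/2006/math}"
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def count_math_runs(document_xml):
    """Return (runs with a w:rFonts setting, runs without one) for all m:r elements."""
    root = ET.fromstring(document_xml)
    with_font = without_font = 0
    for run in root.iter(M + "r"):
        if run.find(W + "rPr/" + W + "rFonts") is not None:
            with_font += 1
        else:
            without_font += 1
    return with_font, without_font


def count_in_docx(path):
    """Read word/document.xml from a .docx (a zip archive) and count its math runs."""
    with zipfile.ZipFile(path) as archive:
        return count_math_runs(archive.read("word/document.xml"))


if __name__ == "__main__":
    import os
    if os.path.exists("test.docx"):  # file name taken from the question
        print(count_in_docx("test.docx"))
```

A non-zero second number would mean some math runs still lack an explicit font, which matches the expressions that fail to display in Word 2007.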

Since Word 2010 displays the results correctly, I can only assume that it does not rely on these explicit font settings, whereas Word 2007 does. This doesn't really help you yet, because altering all those m:r elements would be even harder than what you are already doing. But it is possible that a default style/font needs to be set, either somewhere higher in the XML hierarchy, or perhaps elsewhere in the .zip (perhaps in fontTable.xml or styles.xml). I'm not familiar enough with Word's XML structures to guess what, if anything, might be missing, but I may be able to have a look tomorrow.

I suppose another possibility is that you just have to have all these extra rPr elements for this to work in Word 2007, which would suggest that pandoc may have been written for Word 2010, not 2007. (I don't know anything about the tool).

For example, you have

<m:r>
  <m:t>(</m:t>
</m:r>

What you need is

<m:r>
  <w:rPr>
    <w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math" />
  </w:rPr>
  <m:t>(</m:t>
</m:r>
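
Editing every run by hand is impractical, so the change could be scripted. The following is a rough sketch in Python (standard library only), not something pandoc or Word provides: `add_math_font` and `fix_docx` are hypothetical names, and the file names in the usage note are placeholders. It assumes the math is in word/document.xml, and it places w:rPr after an existing m:rPr, which is the order the Office Math schema expects as far as I know. Re-serializing with ElementTree drops unused namespace declarations, so check the result in Word before relying on it.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def add_math_font(document_xml, font="Cambria Math"):
    """Give every m:r that lacks a w:rPr an explicit w:rFonts setting."""
    # Register the document's own prefixes (w:, m:, ...) so they survive re-serialization.
    for _, (prefix, uri) in ET.iterparse(io.BytesIO(document_xml), events=["start-ns"]):
        ET.register_namespace(prefix, uri)
    root = ET.fromstring(document_xml)
    for run in list(root.iter("{%s}r" % M_NS)):
        if run.find("{%s}rPr" % W_NS) is not None:
            continue  # this run already carries run properties; leave it alone
        rpr = ET.Element("{%s}rPr" % W_NS)
        ET.SubElement(rpr, "{%s}rFonts" % W_NS, {
            "{%s}ascii" % W_NS: font,
            "{%s}hAnsi" % W_NS: font,
        })
        # Insert after m:rPr if the run starts with one, otherwise as the first child.
        has_math_props = len(run) > 0 and run[0].tag == "{%s}rPr" % M_NS
        run.insert(1 if has_math_props else 0, rpr)
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


def fix_docx(src, dst):
    """Copy a .docx archive to dst, patching only word/document.xml."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "word/document.xml":
                data = add_math_font(data)
            zout.writestr(item, data)
```

Calling `fix_docx("test.docx", "test-fixed.docx")` would write a patched copy and leave the original untouched. Whether this is enough for Word 2007 in every case is untested here; it only automates the manual edit described above.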
