How to accept revisions / track changes (ins/del) in a docx?


Question

In MS Word 2010 there is an option under File -> Info to check the document for problems before sharing it. It makes it possible to resolve all tracked changes (keeping the newest version) and to remove all comments and annotations from the document at once.

Is this also available in docx4j, or do I need to investigate the corresponding JAXB objects and write a traversal finder? Doing that manually could be a lot of work, since I would have to move the runs (w:r) inside each RunIns (w:ins) up into the parent and remove every RunDel (w:del). I have also seen a w:del inside a w:ins once. I don't know whether the reverse nesting, or deeper nesting, can occur as well.

Further research turned up this XSLT: https://github.com/plutext/docx4all/blob/master/docx4all/src/main/java/org/docx4all/util/ApplyRemoteChanges.xslt I was not able to run it within docx4j, so I unzipped the docx manually and extracted document.xml. After applying the XSLT to the plain document.xml, I wrapped it in the docx container again and opened it with MS Word. The result was not the same as accepting the revisions in MS Word itself. More concretely: the XSLT removed the text marked as deleted (in a table), but not the list bullet in front of that text. This case appears quite often in my document.
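The manual round trip described above (unzip, transform document.xml, re-zip) can be sketched in plain Java with only JDK classes. This is not docx4j API: the class and method names are hypothetical, the stylesheet is passed in as a string, and the sketch assumes the main document part sits at word/document.xml (the usual location, though a package may in principle place it elsewhere).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.StringReader;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of docx4j.
class DocxXsltRoundTrip {

    /** Copies a docx given as bytes, running the stylesheet over word/document.xml only. */
    static byte[] transformMainPart(byte[] docx, String xslt)
            throws IOException, TransformerException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(docx));
             ZipOutputStream out = new ZipOutputStream(result)) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                // readAllBytes() stops at the end of the current zip entry.
                byte[] data = in.readAllBytes();
                if (entry.getName().equals("word/document.xml")) {
                    data = applyXslt(data, xslt);
                }
                // A fresh ZipEntry is used, so timestamps and other metadata are not preserved.
                out.putNextEntry(new ZipEntry(entry.getName()));
                out.write(data);
                out.closeEntry();
            }
        }
        // The zip is complete only after the output stream has been closed.
        return result.toByteArray();
    }

    /** Applies an XSLT 1.0 stylesheet to one XML part. */
    static byte[] applyXslt(byte[] xml, String xslt) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new StreamSource(new ByteArrayInputStream(xml)),
                new StreamResult(out));
        return out.toByteArray();
    }
}
```

With files on disk, the bytes could be read with java.nio.file.Files.readAllBytes and the result written back with Files.write. Headers, footers, footnotes and comments live in other parts of the package and are left untouched by this sketch.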

If this cannot be solved easily, I will relax the constraints. It is sufficient for me to have a method that returns all text of a ContentAccessor as a String. The ContentAccessor could be a P or a Tc. The text to collect sits either inside an R directly, or inside a RunIns (with the R inside that). For this I have a partial solution below. The interesting part starts at the line else if (child instanceof RunIns) {. But as mentioned above, I'm not sure how nested del/ins elements might appear and whether this code handles them well. The results are also still not the same as if I had prepared the document with MS Word beforehand.

// Similar to:
// http://www.docx4java.org/forums/docx-java-f6/how-to-get-all-text-element-of-a-paragraph-with-docx4j-t2028.html
import java.util.List;
import org.docx4j.XmlUtils;
import org.docx4j.wml.Br;
import org.docx4j.wml.ContentAccessor;
import org.docx4j.wml.R;
import org.docx4j.wml.RunIns;
import org.docx4j.wml.Text;

// Collects the text of plain runs and of runs directly inside w:ins.
// RunDel (w:del) children are skipped because no branch handles them.
private String getAllTextfromParagraph(ContentAccessor ca) {
    StringBuilder result = new StringBuilder();
    List<Object> children = ca.getContent();
    for (Object child : children) {
        child = XmlUtils.unwrap(child);
        if (child instanceof Text) {
            result.append(((Text) child).getValue());
        } else if (child instanceof R) {
            result.append(getTextFromRun((R) child));
        } else if (child instanceof RunIns) {
            // Inserted runs count as accepted text. Only one level is
            // inspected here; anything nested deeper is ignored.
            RunIns ins = (RunIns) child;
            for (Object obj : ins.getCustomXmlOrSmartTagOrSdt()) {
                obj = XmlUtils.unwrap(obj);
                if (obj instanceof R) {
                    result.append(getTextFromRun((R) obj));
                }
            }
        }
    }
    return result.toString().trim();
}

// Maps the content of a single run to plain text.
private String getTextFromRun(R run) {
    StringBuilder result = new StringBuilder();
    for (Object o : run.getContent()) {
        o = XmlUtils.unwrap(o);
        if (o instanceof R.Tab) {
            result.append('\t');
        } else if (o instanceof R.SoftHyphen) {
            result.append('\u00AD');
        } else if (o instanceof Br) {
            result.append(' ');
        } else if (o instanceof Text) {
            result.append(((Text) o).getValue());
        }
    }
    return result.toString();
}
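On the nesting question: a recursive walk sidesteps the need to know every possible ins/del combination in advance, because a w:del subtree is skipped as a whole and everything else is descended into at any depth. The sketch below illustrates that idea on the raw XML with the JDK's DOM parser rather than on docx4j's JAXB objects, which is a deliberate substitution to keep it self-contained. The class and method names are hypothetical, and skipping w:pPr and w:rPr is an added assumption so that tab stop definitions are not mistaken for tab characters.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of docx4j.
class AcceptedText {

    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the text of a WordprocessingML fragment as if insertions and deletions were accepted. */
    static String acceptedText(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Node root = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        StringBuilder sb = new StringBuilder();
        collect(root, sb);
        return sb.toString();
    }

    private static void collect(Node node, StringBuilder sb) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return;
        }
        if (W.equals(node.getNamespaceURI())) {
            switch (node.getLocalName()) {
                case "del":   // deleted content, at any nesting depth
                case "pPr":   // paragraph properties (may hold w:tab stop definitions)
                case "rPr":   // run properties
                    return;
                case "t":
                    sb.append(node.getTextContent());
                    return;
                case "tab":
                    sb.append('\t');
                    return;
                case "br":
                    sb.append(' ');
                    return;
                default:
                    break;
            }
        }
        // w:ins, w:r, w:p, w:tc and any other wrapper: descend into the children.
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collect(child, sb);
        }
    }
}
```

Because the walk treats w:ins like any other wrapper, a w:del nested inside a w:ins is dropped and a w:ins nested inside other containers is still reached. Moved text (w:moveFrom / w:moveTo) is not special-cased here.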

Recommended answer

https://github.com/plutext/docx4j/commit/309a8e4008553452ebe675e81def30aab97542a2?w=1 adds a method for transforming just one Part, along with sample code that uses it to accept changes.

The XSLT is just the one you found (relicensed as Apache 2):

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
  xmlns:o="urn:schemas-microsoft-com:office:office"
  xmlns:v="urn:schemas-microsoft-com:vml"
  xmlns:WX="http://schemas.microsoft.com/office/word/2003/auxHint"
  xmlns:aml="http://schemas.microsoft.com/aml/2001/core"
  xmlns:w10="urn:schemas-microsoft-com:office:word"
  xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage"
  xmlns:msxsl="urn:schemas-microsoft-com:xslt"
  xmlns:ext="http://www.xmllab.net/wordml2html/ext"
  xmlns:java="http://xml.apache.org/xalan/java"
  xmlns:xml="http://www.w3.org/XML/1998/namespace"
  version="1.0"
  exclude-result-prefixes="java msxsl ext o v WX aml w10">


  <xsl:output method="xml" encoding="utf-8" omit-xml-declaration="no" indent="yes" />


  <xsl:template match="/ | @*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="w:del" />

  <xsl:template match="w:ins" >
    <xsl:apply-templates select="*"/>
  </xsl:template>

</xsl:stylesheet>

You'll need to add support for the other elements identified in the MSDN link. If you do that, I'd be happy to get a pull request.
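As a rough illustration of what such additional support might look like, the sketch below extends the two rules above with templates for a few other revision elements of WordprocessingML: moved text, formatting change records, deleted table rows, and paragraphs whose paragraph mark is deleted and which hold no surviving content (possibly what leaves a list bullet behind, though that is a guess). Which elements the MSDN link actually lists is not known here, so the selection is an assumption, the class name is hypothetical, and the result has not been compared against what Word produces.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper; the stylesheet is an untested-against-Word extension of the answer's XSLT.
class AcceptMoreChanges {

    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>"
        // Identity transform: copy everything that no other rule matches.
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        // The two rules from the answer: drop deletions, unwrap insertions.
        + "<xsl:template match='w:del'/>"
        + "<xsl:template match='w:ins'><xsl:apply-templates select='*'/></xsl:template>"
        // Moved text: drop the old location, unwrap the new one, drop range markers.
        + "<xsl:template match='w:moveFrom'/>"
        + "<xsl:template match='w:moveTo'><xsl:apply-templates select='*'/></xsl:template>"
        + "<xsl:template match='w:moveFromRangeStart|w:moveFromRangeEnd"
        + "|w:moveToRangeStart|w:moveToRangeEnd'/>"
        // Formatting changes: keep the current properties, drop the record of the old ones.
        + "<xsl:template match='w:rPrChange|w:pPrChange|w:sectPrChange"
        + "|w:tblPrChange|w:tcPrChange|w:trPrChange|w:tblGridChange'/>"
        // Table rows marked as deleted.
        + "<xsl:template match='w:tr[w:trPr/w:del]'/>"
        // Simple case only: the paragraph mark is deleted and nothing but
        // properties and deleted or moved-away content is left in the paragraph.
        + "<xsl:template match='w:p[w:pPr/w:rPr/w:del]"
        + "[not(*[not(self::w:pPr or self::w:del or self::w:moveFrom)])]'/>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet above to the XML of one part, e.g. document.xml. */
    static String accept(String partXml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(partXml)),
                new StreamResult(out));
        return out.toString();
    }
}
```

A paragraph with a deleted mark that still has live content would need to be merged with the following paragraph, which this sketch does not attempt. Cell-level revisions (w:cellIns, w:cellDel, w:cellMerge) are also left out.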
