In git, how do I diff Microsoft Word documents?

Problem description

I've been following this guide here on how to diff Microsoft Word documents, but I ran into this error:

Usage:  /usr/bin/docx2txt.pl [infile.docx|-|-h] [outfile.txt|-]
        /usr/bin/docx2txt.pl < infile.docx
        /usr/bin/docx2txt.pl < infile.docx > outfile.txt

        In second usage, output is dumped on STDOUT.

        Use '-h' as the first argument to get this usage information.

        Use '-' as the infile name to read the docx file from STDIN.

        Use '-' as the outfile name to dump the text on STDOUT.
        Output is saved in infile.txt if second argument is omitted.

Note:   infile.docx can also be a directory name holding the unzipped content
        of concerned .docx file.

fatal: unable to read files to diff

To explain how I came to that error: I created a .gitattributes in the repository I want to diff from. .gitattributes looks like this:

*.docx diff=word
*.docx difftool=word

I've installed docx2txt. I'm on Linux. I've created a file called docx2txt which contains this:

#!/bin/bash
docx2txt.pl $1 -

I ran chmod a+x docx2txt and put docx2txt in /usr/bin/.

I did:

$ git config diff.word.textconv docx2txt

Then I tried to diff two Microsoft Word documents. That's when I got the error I mentioned above.

What am I missing? How do I resolve this error?

PS: I don't know if my shell can find docx2txt because when I do this:

$ docx2txt

my terminal freezes, processing something, but doesn't output anything, and when I do these commands this happens:

$ man docx2txt
No manual entry for docx2txt
$ docx2txt --help
Can't read docx file <--help>!

UPDATE on progress: I changed docx2txt to

#!/bin/bash
docx2txt.pl "$1" -

as pmod suggested, and now git diff <commit> works from the command line! Yay! However, when I try

$ git difftool <commit>

git launches kdiff3, and I get this pop-up error:

Some input characters could not be converted to valid unicode.
You might be using the wrong codec. (e.g. UTF-8 for non UTF-8 files).
Don't save the result if unsure. Continue at your own risk.
Affected input files are in A, B.

...and all of the characters in the files are mumbo jumbo. The command line displays the diff text correctly, but kdiff3 does not display the text from the diff correctly for some reason.

How do I display the diff text correctly in kdiff3 or another GUI tool? Should I switch from kdiff3 to another tool?

Extra: My shell doesn't seem to be able to find docx2txt, because of these commands:

$ which doctxt
which: no doctxt in (/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl)

$ which docx2txt
/usr/bin/docx2txt

Solution

According to its usage text, docx2txt.pl takes at most two arguments: an input (a filename or "-") and an optional output (a filename or "-"). So your wrapper script looks correct except for the case where the filename passed as the first argument contains at least one space. In that case the unquoted $1 is split on the spaces after expansion, the pieces of the filename are passed as separate arguments, and the tool prints its usage info because it receives more than two arguments.

Try using quotes to avoid filename splitting:

#!/bin/bash
docx2txt.pl "$1" -
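
To see the splitting in isolation, here is a minimal sketch that only counts arguments. The helper count_args and the filename "my report.docx" are made up for illustration and are not part of docx2txt.

```shell
#!/bin/bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# Hypothetical filename containing a space.
file="my report.docx"

# Unquoted: the shell splits $file into "my" and "report.docx",
# so together with "-" the helper receives three arguments.
unquoted=$(count_args $file -)

# Quoted: "$file" stays a single argument, so the helper receives two.
quoted=$(count_args "$file" -)

echo "unquoted: $unquoted arguments"
echo "quoted:   $quoted arguments"
```

With the default IFS this prints 3 for the unquoted call and 2 for the quoted one, which matches the "more than two arguments" situation described above.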

PS: I don't know if my shell can find docx2txt

You can check this with

$ which docx2txt

If you see a path, then the tool (a binary or a runnable script) can be found via the PATH environment variable.
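
If you would rather do the same check from inside a script, the shell builtin command -v works much like which. A small sketch, where on_path is a hypothetical helper name:

```shell
#!/bin/bash
# on_path is a hypothetical helper: it succeeds if the shell can resolve the
# given name (an executable on PATH, a builtin, a function, or an alias).
on_path() { command -v "$1" >/dev/null 2>&1; }

if on_path docx2txt; then
    echo "docx2txt found at: $(command -v docx2txt)"
else
    echo "docx2txt is not on PATH"
fi
```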

because when I do this:

$ docx2txt

my terminal freezes, processing something, but doesn't output anything

Without arguments, your script executes docx2txt.pl -, which according to the tool's usage reads the input file from STDIN, i.e. from whatever you type. So it looks like it is hanging and processing something, but it is actually just waiting for your input.
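
If the silent wait on STDIN is confusing, one option is to make the wrapper refuse to run unless it gets exactly one argument. This is only a sketch under assumptions, not something docx2txt requires: docx_to_text and the CONVERTER variable are hypothetical names, and CONVERTER exists only so the function can be tried without docx2txt.pl installed.

```shell
#!/bin/bash
# CONVERTER is a hypothetical override; by default the real tool is used.
CONVERTER=${CONVERTER:-docx2txt.pl}

# docx_to_text is a hypothetical stricter wrapper: it insists on exactly one
# argument instead of silently falling back to reading STDIN.
docx_to_text() {
    if [ "$#" -ne 1 ]; then
        echo "usage: docx_to_text infile.docx" >&2
        return 2
    fi
    # Quote "$1" so a filename with spaces stays a single argument;
    # "-" asks the converter to write the text to STDOUT.
    "$CONVERTER" "$1" -
}
```

In a standalone wrapper file you would end the script with docx_to_text "$@" so that git's textconv call is passed through.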
