What should I do if I put an MS Office (e.g. .docx) or OpenOffice (e.g. .odt) document into a Git repository?


Problem description

I put several .docx, .txt and .pdf files into a Git repository. I can open, edit and save the local .docx file; however, when I push it to GitHub and download it back to my computer, Word complains that it cannot open it.

Are there any essential steps I should take in the Git settings in order to store .docx files on GitHub?

Solution

Make a .gitattributes file in your working directory and add the following line to it:

*.docx    binary
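
If you would rather script this step, the following is a minimal sketch, not part of the original answer. The helper name `ensure_rule` and its parameters are hypothetical; it only appends the line to `.gitattributes` when an equivalent line is not already present.

```python
import tempfile
from pathlib import Path


def ensure_rule(repo_dir: str, rule: str = "*.docx binary") -> bool:
    """Append `rule` to <repo_dir>/.gitattributes unless it is already there.

    Hypothetical helper for illustration. Returns True if the file was changed.
    """
    path = Path(repo_dir) / ".gitattributes"
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    # Compare whitespace-separated tokens so "*.docx    binary" matches "*.docx binary".
    wanted = rule.split()
    if any(line.split() == wanted for line in lines):
        return False
    lines.append(rule)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return True


if __name__ == "__main__":
    # Demo in a throwaway directory standing in for a working directory.
    with tempfile.TemporaryDirectory() as demo_dir:
        print(ensure_rule(demo_dir))  # first call adds the rule
        print(ensure_rule(demo_dir))  # second call finds it and changes nothing
```

The new attribute only affects files as Git adds or checks them out from then on; a .docx that was already committed in a damaged state has to be committed again from an intact copy.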

Why not just set core.autocrlf=false?

This is useful too. But configuring .docx as a binary format solves not only this problem, but also potential merge issues: the binary attribute also tells Git not to attempt textual diffs or merges on these files.

What is the origin of this problem?

From http://git-scm.com/docs/gitattributes , section "Marking files as binary". Note the italicized section.

Git usually guesses correctly whether a blob contains text or binary data by examining the beginning of the contents. However, sometimes you may want to override its decision, either because a blob contains binary data later in the file, or because the content, while technically composed of text characters, is opaque to a human reader.

The .docx format is a zip archive containing XML and binary data, such as images.

Git treated your .docx as a text (not binary) file and replaced end-of-line characters. As a Microsoft-developed format, .docx probably uses CRLF, which might have been replaced with LF in the remote repository. When you downloaded the file directly from the remote, it still had LFs.

For a binary file, Git never replaces end-of-line characters, so even the files in the remote repository will keep their proper CRLFs.
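
To see why such a replacement is fatal for a zipped format, here is a small simulation. It is only a sketch: it does not call Git, the function names are hypothetical, and `simulate_crlf_to_lf` merely imitates an end-of-line conversion applied to a file that was treated as text. The sample archive stands in for a real .docx.

```python
import io
import zipfile


def make_sample_archive() -> bytes:
    """Build a tiny zip in memory whose stored member contains CRLF bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("word/document.xml", b"<doc>\r\nline one\r\nline two\r\n</doc>")
    return buf.getvalue()


def simulate_crlf_to_lf(data: bytes) -> bytes:
    """Imitate a CRLF -> LF conversion applied blindly to every byte of the file."""
    return data.replace(b"\r\n", b"\n")


def is_readable_zip(data: bytes) -> bool:
    """Return True only if the archive opens and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            return zf.testzip() is None
    except Exception:
        # Which exception is raised depends on which part of the archive was damaged.
        return False


original = make_sample_archive()
converted = simulate_crlf_to_lf(original)

if __name__ == "__main__":
    print("original readable: ", is_readable_zip(original))
    print("converted readable:", is_readable_zip(converted))
    print("bytes lost:", len(original) - len(converted))
```

Removing even a few bytes shifts the offsets recorded inside the archive and invalidates the stored checksums, which is consistent with Word refusing to open the downloaded file.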

Applicable formats

This is applicable to any file format which is a zipped package with text and binary data. This includes:
