What is the difference between autocrlf and eol?


Question

I'm reading the Git documentation about .gitattributes to fix my problems with mixed line endings, and I found that there are two similar settings.

AUTOCRLF:

End-of-line conversion

While Git normally leaves file contents alone, it can be configured to normalize line endings to LF in the repository and, optionally, to convert them to CRLF when files are checked out.

If you simply want to have CRLF line endings in your working directory regardless of the repository you are working with, you can set the config variable "core.autocrlf" without using any attributes.

    [core]
        autocrlf = true

This does not force normalization of text files, but does ensure that text files that you introduce to the repository have their line endings normalized to LF when they are added, and that files that are already normalized in the repository stay normalized.
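To see what the quoted passage describes, here is a minimal sketch in a throwaway repository. The temporary directory and the file name notes.txt are made up for illustration, and it assumes Git 2.8 or later for `git ls-files --eol`.

```shell
set -e
repo=$(mktemp -d)               # throwaway repository; the path is arbitrary
cd "$repo"
git init -q .
git config core.autocrlf true   # repository-local equivalent of the [core] setting

# Hypothetical text file written with CRLF line endings.
printf 'one\r\ntwo\r\n' > notes.txt
git add notes.txt

# Shows the index side (i/) and the work-tree side (w/) line endings.
git ls-files --eol notes.txt

# Size of the staged blob: 8 bytes means both CRs were stripped on add.
index_bytes=$(git cat-file -p :notes.txt | wc -c | tr -d ' ')
echo "staged bytes: $index_bytes"
```

The work-tree file keeps its CRLF endings; only the copy that went into the index was normalized to LF.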

And EOL:

This attribute sets a specific line-ending style to be used in the working directory. It enables end-of-line conversion without any content checks, effectively setting the text attribute.

Set to string value "crlf": This setting forces Git to normalize line endings for this file on checkin and convert them to CRLF when the file is checked out.

Set to string value "lf": This setting forces Git to normalize line endings to LF on checkin and prevents conversion to CRLF when the file is checked out.
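As a rough illustration of the eol attribute, the sketch below writes a small .gitattributes file and asks Git which attributes it would apply. The patterns and the file names build.bat and build.sh are assumptions chosen for the example; the paths do not need to exist for `git check-attr` to answer.

```shell
set -e
repo=$(mktemp -d)        # throwaway repository; the path is arbitrary
cd "$repo"
git init -q .

# Hypothetical rules: batch files get CRLF in the work-tree, shell scripts LF.
cat > .gitattributes <<'EOF'
*.bat text eol=crlf
*.sh  text eol=lf
EOF

# Ask Git which values it resolves for each path.
git check-attr text eol -- build.bat build.sh

bat_eol=$(git check-attr eol -- build.bat)
sh_eol=$(git check-attr eol -- build.sh)
```

Unlike core.autocrlf, these rules live in the repository itself, so everyone who clones it gets the same conversion for the matching paths.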

Backwards compatibility with crlf attribute

For backwards compatibility, the crlf attribute is interpreted as follows:

    crlf          text
    -crlf         -text
    crlf=input    eol=lf

It seems that both do the same thing, but there is something about compatibility. Does this mean that autocrlf is deprecated and eol is the new flavor, or something like that? I currently have a repository with multiple corrupted files which I want to convert into crlf representation. As you can see, the documentation confuses us instead of clarifying things.

What should I apply in this situation?

Solution

Rather than directly answering the question itself—see VonC's answer to the linked question for that—let's concentrate on this:

I currently have a repository with multiple corrupted files which I want to convert into crlf representation.

First, let's note that none of these options can change any existing commit. This is a fundamental Git property: once made, no existing commit can be altered. What you can do is make new commits. That's usually not too big a deal since usually we just want new stuff to be correct (but see git filter-branch, which copies commits after applying filters to their contents, and can be used to re-copy an entire repository: the new repo is no longer compatible with the old one, but you can "fix history" this way).

Next, I think this is the key to understanding all of these end of line / CRLF attribute options: transformations are applied to files when they move into or out of the index.

Remember that Git's index is where you build the next commit. The contents of the index are initially the same as whatever commit is current: you run git checkout master, for instance, and Git resolves the name master to a commit-ID and copies that particular commit to your work-tree—but the copy goes through the index.

In other words, Git first finds that file foo.txt is in the commit (and needs to be extracted). So Git moves that version of foo.txt to the index. The index's version exactly matches the HEAD commit's version. Git does not apply any filters to the index version, nor change any line endings.

Once the index version is updated, Git copies that version of the file from the index to the work-tree.[1] Some transformations take place now, during this extraction process. If there is a smudge filter, Git applies it now. If there are line-ending conversions to make, Git applies those now.
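The index-to-work-tree step can be observed without making a commit, since `git checkout -- <path>` re-extracts a file from the index. This is only a sketch: the temporary repository, the `*.txt text eol=crlf` rule and the file foo.txt contents are assumptions made for the demonstration.

```shell
set -e
repo=$(mktemp -d)        # throwaway repository; the path is arbitrary
cd "$repo"
git init -q .

# Hypothetical rule: .txt files are text and use CRLF in the work-tree.
printf '*.txt text eol=crlf\n' > .gitattributes

# Stage an LF-only file; Git may warn on stderr that LF will become CRLF.
printf 'one\ntwo\n' > foo.txt
git add foo.txt

# Remove the work-tree copy and extract it again from the index.
rm foo.txt
git checkout -- foo.txt

index_bytes=$(git cat-file -p :foo.txt | wc -c | tr -d ' ')
worktree_bytes=$(wc -c < foo.txt | tr -d ' ')
echo "index: $index_bytes bytes, work-tree: $worktree_bytes bytes"
```

The blob in the index stays LF-only, while the extracted file is two bytes larger because each line now ends in CRLF.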

The work-tree file may, during this process, become different from the index version. Now Git has a problem, because now the file is "dirty" (modified in the work-tree). This is where things get particularly confusing, although most of the time, the details here are invisible.

Eventually, after working with your work-tree, you may run git add on some file path-name (or use git add -A or whatever to add many files). This copies the file from the work-tree into the index.[2] More transformations happen now, during this copy: if there is a clean filter, Git applies it now. If there are line-ending conversions to make, Git applies them now.

In other words, after git add-ing these files, the index version may not match the work-tree version. However, Git marks the index version as "matching" anyway. A git status will skip right over the work-tree version, because Git now claims that the index version matches the work-tree version. It sort of does, because the index version matches what would be added if you ran git add again.

The actual implementation uses time stamps, usually with one-second resolution. Git will continue to believe that the index version matches the work-tree version unless and until the OS touches the time-stamp on the work-tree version of the file. This is true even if you change the set of filters and/or line-ending conversions to apply. Git doesn't realize that you have changed the way the line endings should work, or changed the "clean" filter to do something different: it just sees that the index's "cache" aspect says "I match work-tree version time-stamp T". As long as the work-tree version's time-stamp is still T, the file must be "clean".

Hence, to update these things after changing any text-conversion settings, you need to make Git realize that the file is not clean. You can touch <path> to set a new time-stamp of "now", which won't match the older time stamp in the index. Now git add -A (or whatever) will scan as usual, but since the time stamps don't match, it will find the file this time, and will re-filter it to add it to the index.

Again, these transformations occur when you git add the file.
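The touch-then-add sequence described above might look like the following sketch. It assumes a throwaway repository, a hypothetical foo.txt that was staged with CRLF endings before any conversion rule existed, and a `*.txt text` rule added afterwards.

```shell
set -e
repo=$(mktemp -d)        # throwaway repository; the path is arbitrary
cd "$repo"
git init -q .
git config core.autocrlf false   # start with no line-ending conversion at all

# Stage a CRLF file as-is: the index blob keeps its CRs.
printf 'one\r\ntwo\r\n' > foo.txt
git add foo.txt
before=$(git cat-file -p :foo.txt | wc -c | tr -d ' ')

# Change the conversion settings after the fact.
printf '*.txt text\n' > .gitattributes

# Give the file a fresh time-stamp, then add it again so Git re-filters it.
touch foo.txt
git add foo.txt
after=$(git cat-file -p :foo.txt | wc -c | tr -d ' ')

echo "before=$before after=$after"
```

Git 2.16 and later also provide `git add --renormalize <path>`, which re-applies the clean conversion to tracked files without relying on time-stamps.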


Normally, on a Windows-like system, your goal here will be to take LF-only repository-format files and turn them into CR-LF files for Windows to deal with. That transformation occurs on the way out of the index, to the work-tree: i.e., during git checkout. Then you would want to transform these CR-LF work-tree files into LF-only format during the git add process, so that the in-repository form is the way Linux (and Linus Torvalds and hence Git :-) ) prefer them. But you can store them inside the repository in CR-LF format, if you really want to annoy all the Unix/Linux folks. It's all a matter of which transforms, if any, you apply at which steps: git checkout time, and git add time.

The .gitattributes file specifies which transforms to apply to which files. The core.autocrlf and core.eol settings don't: Git must make its best guess about which files get which transformations at which step.


[1] Technically, all that's in the index is the hash ID of the file. The file itself is stored as a Git blob object in the repository database. Just as with commit objects, these blob objects are immutable. That's why it cannot be changed in the index: it's really just a hash ID.

[2] The git add process simply writes a new blob, with the new blob written after any filtering. If the new blob exactly matches some existing blob, bit-for-bit, the new blob re-uses the existing blob's database entry and hash ID, and is not actually saved; the existing blob suffices. If not, the blob's data gets stored as a new file, with a new ID. It's the new hash ID that goes into the index.
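The blob re-use described in the second footnote can be checked by staging two files whose contents become identical after conversion. This is a sketch under assumptions: the temporary repository, the `*.txt text` rule and the file names lf.txt and crlf.txt are all invented for the example.

```shell
set -e
repo=$(mktemp -d)        # throwaway repository; the path is arbitrary
cd "$repo"
git init -q .

# Hypothetical rule: normalize .txt files to LF when they are added.
printf '*.txt text\n' > .gitattributes

# Same text, different line endings in the work-tree.
printf 'one\ntwo\n'     > lf.txt
printf 'one\r\ntwo\r\n' > crlf.txt
git add lf.txt crlf.txt

# Hash IDs of the blobs recorded in the index for each path.
id_lf=$(git rev-parse :lf.txt)
id_crlf=$(git rev-parse :crlf.txt)
echo "$id_lf"
echo "$id_crlf"
```

Both index entries point at the same hash ID, because the filtered content is bit-for-bit identical and only one blob needs to exist.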
