Git merging within a line

Problem description

Preamble

I'm using git as a version control system for a paper that my lab is writing, in LaTeX. There are several people collaborating.

I'm running into git being stubborn about how it merges. Let's say two people have made single-word changes to a line, and then attempt to merge them. Though git diff --word-diff seems capable of SHOWING the difference between the branches word-by-word, git merge seems unable to perform the merge word-by-word, and instead requires a manual merge.

With a LaTeX document this is particularly annoying, as the common habit when writing LaTeX is to write a full paragraph per line and let your text editor handle word wrapping for display. We are working around this for now by adding a newline after each sentence, so that git can at least merge changes to different sentences within a paragraph. But it still gets confused by multiple changes within a sentence, and of course the text no longer wraps nicely.

Question

Is there a way to git merge two files "word by word" rather than "line by line"?

Recommended answer

Here's a solution in the same vein as sehe's, with a few changes which hopefully will address your comments:

  • This solution merges by sentence rather than by word, as you had been doing by hand. The difference is that now the user sees a single line per paragraph, while git sees paragraphs broken into sentences. This seems more logical, because adding or removing a sentence from a paragraph may be compatible with other changes in the paragraph, but a manual merge is probably preferable when two commits edit the same sentence. It also has the benefit that the "clean" snapshots stay somewhat human-readable (and LaTeX-compilable!).
  • The filters are one-line commands, which should make it easier to port this to collaborators.

As in sehe's solution, create (or append to) .gitattributes:

    *.tex filter=sentencebreak
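
To check that the attribute is picked up, git check-attr can be asked which filter applies to a path. The following is a minimal sketch in a throwaway repository; the temporary directory and the file name test.tex are only there for illustration.

```shell
#!/bin/sh
# Sketch: confirm that *.tex files are assigned the sentencebreak filter.
# The scratch repository and the path test.tex are illustrative only.
demo=$(mktemp -d)
git init -q "$demo"
printf '*.tex filter=sentencebreak\n' > "$demo/.gitattributes"

# check-attr works on path names, so test.tex does not need to exist yet.
attr=$(git -C "$demo" check-attr filter -- test.tex)
echo "$attr"
```

If the attribute is in effect, the line printed should name sentencebreak as the filter for test.tex.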

Now to implement the clean and smudge filters:

    git config filter.sentencebreak.clean "perl -pe \"s/[.]*?(\\?|\\!|\\.|'') /\$&%NL%\\n/g unless m/%/||m/^[\\ *\\\\\\]/\""
    git config filter.sentencebreak.smudge "perl -pe \"s/%NL%\\n//gm\""

I've created a test file with the following contents, notice the one-line paragraph.

    \chapter{Tumbling Tumbleweeds. Intro}
    A way out west there was a fella, fella I want to tell you about, fella by the name of Jeff Lebowski.  At least, that was the handle his lovin' parents gave him, but he never had much use for it himself. This Lebowski, he called himself the Dude. Now, Dude, that's a name no one would self-apply where I come from.  But then, there was a lot about the Dude that didn't make a whole lot of sense to me.  And a lot about where he lived, like- wise.  But then again, maybe that's why I found the place s'durned innarestin'.

    This line has two sentences. But it also ends with a comment. % here

After we commit it to the local repo, we can see the raw contents.

    $ git show HEAD:test.tex

    \chapter{Tumbling Tumbleweeds. Intro}
    A way out west there was a fella, fella I want to tell you about, fella by the name of Jeff Lebowski. %NL%
     At least, that was the handle his lovin' parents gave him, but he never had much use for it himself. %NL%
    This Lebowski, he called himself the Dude. %NL%
    Now, Dude, that's a name no one would self-apply where I come from. %NL%
     But then, there was a lot about the Dude that didn't make a whole lot of sense to me. %NL%
     And a lot about where he lived, like- wise. %NL%
     But then again, maybe that's why I found the place s'durned innarestin'.

    This line has two sentences. But it also ends with a comment. % here

So the rule of the clean filter is: whenever it finds a string of text that ends with . or ? or ! or '' (that's the LaTeX way to do double quotes) followed by a space, it adds %NL% and a newline character. But it ignores lines that start with \ (LaTeX commands) or contain a comment anywhere (so that comments cannot become part of the main text).

The smudge filter removes %NL% and the newline.

Diffing and merging is done on the 'clean' files so changes to paragraphs are merged sentence by sentence. This is the desired behavior.

The nice thing is that the latex file should compile in either the clean or smudged state, so there is some hope for collaborators to not need to do anything. Finally, you could put the git config commands in a shell script that is part of the repo so a collaborator would just have to run it in the root of the repo to get configured.

    #!/bin/bash

    # Register the clean and smudge filters in this clone's config.
    git config filter.sentencebreak.clean "perl -pe \"s/[.]*?(\\?|\\!|\\.|'') /\$&%NL%\\n/g unless m/%/||m/^[\\ *\\\\\\]/\""
    git config filter.sentencebreak.smudge "perl -pe \"s/%NL%\\n//gm\""

    # Smudge the .tex files that are already checked out in clean form.
    fileArray=($(find . -iname "*.tex"))

    for (( i=0; i<${#fileArray[@]}; i++ ));
    do
        perl -pe 's/%NL%\n//gm' < "${fileArray[$i]}" > temp
        mv temp "${fileArray[$i]}"
    done

That last little bit is a hack because when this script is first run, the branch is already checked out (in the clean form) and it doesn't get smudged automatically.

You can add this script and the .gitattributes file to the repo, then new users just need to clone, then run the script in the root of the repo.

I think this script even runs on Git for Windows if it is run from Git Bash.

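Drawbacks:

  • This doesn't handle lines with comments cleverly; it just ignores them.
  • %NL% is kind of ugly.
  • The filters might mangle some equations (not sure about this).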
