Make sure files are converted from CRLF to LF in an update hook: is there a performance hit?


Question


      There has been a lot of discussion about the core.autocrlf and core.safecrlf features in the current release and the next release. My question relates to an environment where developers clone from a bare repository.

      The autocrlf setting is enabled when the repository is cloned. But since the developers have full control over their clones, they can remove the autocrlf setting and carry on.

      1. We can specify which files are not binary in the .gitattributes file, but is there any other way for Git to determine automatically whether a file is text or binary?

      2. Is there a way, such as an update hook (a commit hook is not an option because developers can still remove it), to make sure that files with CRLF pushed from a Windows environment to a UNIX machine hosting the bare repo are converted to UNIX EOL format (LF)?

      3. Would an update hook that scans each file for CRLF affect the performance of a push operation?

      Thanks

      Answer

      • 1/ Git itself has a heuristic to determine whether a file is binary or text (similar to istext).

      • 2/ The gergap weblog recently (May 2010) had the same idea.
        See his update hook here (reproduced at the end of this answer), but the trick is:
        Rather than trying to convert, the hook simply rejects the push if it detects a (supposedly) non-binary file with an improper EOL style.

      Git converts LF->CRLF when checking out on Windows.
      If the file already contains CRLF, Git is clever enough to detect that and does not expand it to CRCRLF, which would be wrong. It keeps the CRLF, which means the file was implicitly changed locally during the checkout, because when it is committed again, the wrong CRLF will be corrected to LF. That's why Git must mark these files as modified.

      It's good to understand the problem, but we need a solution that prevents wrong line endings from being pushed to the central repo.
      The solution is to install an update hook on the central server.

      • 3/ There will be a small cost, but unless you push every 30 seconds, this shouldn't be an issue.
        Plus, no actual conversion takes place: if a file is not correct, the push gets rejected.
        That puts the conversion issue back where it belongs: on the developer side.
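On point 1, the heuristic can be approximated in a few lines of shell. Git's diff machinery treats content as binary when a NUL byte appears within roughly the first 8000 bytes. The end-of-line conversion code applies a related but stricter set of statistics, so the sketch below is only an approximation. The function name `looks_binary` is hypothetical and not part of Git.

```shell
#!/bin/sh
# Hypothetical helper (not a Git command): approximate the
# "NUL byte in the first 8000 bytes" test for the file given as $1.
looks_binary() {
    # Size of the leading chunk, and its size once NUL bytes are removed.
    # The $(( )) wrapper strips any padding that wc may print.
    total=$(( $(head -c 8000 "$1" | wc -c) ))
    without_nul=$(( $(head -c 8000 "$1" | tr -d '\000' | wc -c) ))
    # Any difference means at least one NUL byte was present.
    [ "$total" -ne "$without_nul" ]
}
```

A hook could call such a function on the extracted blob instead of relying on a fixed list of file extensions, at the cost of reading the start of every file.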

      #!/bin/sh
      #
      # Author: Gerhard Gappmeier, ascolab GmbH
      # This script is based on the update.sample in git/contrib/hooks.
      # You are free to use this script for whatever you want.
      #
      # To enable this hook, rename this file to "update".
      #
      
      # --- Command line
      refname="$1"
      oldrev="$2"
      newrev="$3"
      #echo "COMMANDLINE: $*"
      
      # --- Safety check
      if [ -z "$GIT_DIR" ]; then
          echo "Don't run this script from the command line." >&2
          echo " (if you want, you could supply GIT_DIR then run" >&2
          echo "  $0 <ref> <oldrev> <newrev>)" >&2
          exit 1
      fi
      
      if [ -z "$refname" -o -z "$oldrev" -o -z "$newrev" ]; then
          echo "Usage: $0 <ref> <oldrev> <newrev>" >&2
          exit 1
      fi
      
      BINARY_EXT="pdb dll exe png gif jpg"

      # returns 1 if the given filename has a binary file extension
      IsBinary()
      {
          result=0
          for ext in $BINARY_EXT; do
              # ${1##*.} strips everything up to the last dot, leaving the extension
              if [ "$ext" = "${1##*.}" ]; then
                  result=1
                  break
              fi
          done

          return $result
      }
      
      # make temp paths and remove them when the hook exits
      tmp=$(mktemp /tmp/git.update.XXXXXX)
      tree=$(mktemp /tmp/git.diff-tree.XXXXXX)
      trap 'rm -f "$tmp" "$tree"' EXIT
      ret=0
      
      git diff-tree -r "$oldrev" "$newrev" > $tree
      #echo
      #echo diff-tree:
      #cat $tree
      
      # read $tree using the file descriptors
      exec 3<&0
      exec 0<$tree
      while read old_mode new_mode old_sha1 new_sha1 status name
      do
          # debug output
          #echo "old_mode=$old_mode new_mode=$new_mode old_sha1=$old_sha1 new_sha1=$new_sha1 status=$status name=$name"
          # skip lines showing parent commit
          test -z "$new_sha1" && continue
          # skip deletions
          [ "$new_sha1" = "0000000000000000000000000000000000000000" ] && continue
      
          # don't do a CRLF check for binary files (decided by file extension)
          IsBinary "$name"
          if [ $? -eq 1 ]; then
              continue # skip binary files
          fi
      
          # check for CRLF: look for a carriage return at the end of any line
          git cat-file blob "$new_sha1" > "$tmp"
          if grep -q "$(printf '\r')\$" "$tmp"; then
              echo "###################################################################################################"
              echo "# '$name' contains CRLF! Dear Windows developer, please activate the GIT core.autocrlf feature,"
              echo "# or change the line endings to LF before trying to push."
              echo "# Use 'git config core.autocrlf true' to activate CRLF conversion."
              echo "# OR use 'git reset HEAD~1' to undo your last commit and fix the line endings."
              echo "###################################################################################################"
              ret=1
          fi
      done
      exec 0<&3
      # --- Finished
      exit $ret
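On the performance question, most of the per-file cost is one blob read plus one scan. A minimal sketch of that scan as a stream filter follows, assuming a hypothetical helper named `has_crlf` that is not part of the original hook. It reads from standard input, so no temporary file is needed, and it avoids the non-portable `grep -P` option.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any line on standard input ends with a
# carriage return, which is what a CRLF line ending looks like to grep.
has_crlf() {
    cr=$(printf '\r')
    # grep -q stops at the first matching line, so a clean file costs one
    # full read and an offending file usually much less.
    grep -q "${cr}\$"
}
```

Inside the hook loop this could be used as `git cat-file blob "$new_sha1" | has_crlf`, with the exit status driving the rejection message. That is one possible arrangement, not the author's original design.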
      

