Git Blame Commit Statistics
Question
How can I "abuse" blame (or some better-suited function, possibly in conjunction with shell commands) to get a statistic of how many lines (of code) currently in the repository originate from each committer?
Example Output:
Committer 1: 8046 Lines
Committer 2: 4378 Lines
Update
git ls-tree -r -z --name-only HEAD -- */*.c | xargs -0 -n1 git blame \
--line-porcelain HEAD |grep "^author "|sort|uniq -c|sort -nr
I updated some things on the way.
For convenience, you can also put this into its own command:
#!/bin/bash
# save as e.g. git-authors and set the executable flag
git ls-tree -r -z --name-only HEAD -- $1 | xargs -0 -n1 git blame \
--line-porcelain HEAD |grep "^author "|sort|uniq -c|sort -nr
Store this somewhere in your PATH, or modify your PATH, and use it like this:
git authors '*/*.c' # look for all files recursively ending in .c
git authors '*/*.[ch]' # look for all files recursively ending in .c or .h
git authors 'Makefile' # just count lines of authors in the Makefile
Original Answer
While the accepted answer does the job, it's very slow.
$ git ls-tree --name-only -z -r HEAD|grep -z -E '\.(cc|h|cpp|hpp|c|txt)$' \
|xargs -0 -n1 git blame --line-porcelain|grep "^author "|sort|uniq -c|sort -nr
is almost instantaneous.
To get a list of files currently tracked you can use
git ls-tree --name-only -r HEAD
This solution avoids calling file to determine the file type and instead uses grep to match the wanted extensions, for performance reasons. If all files should be included, just remove the grep stage from the pipeline.
grep -E '\.(cc|h|cpp|hpp|c)$' # for C/C++ files
grep -E '\.py$' # for Python files
If the file names can contain spaces, which are bad for shells, you can use:
git ls-tree -z --name-only -r HEAD | grep -z -E '\.py$'|xargs -0 ... # separates file names with '\0' instead of newlines
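The effect of NUL separation can be tried without a repository. The following is a minimal sketch, assuming GNU grep (the -z option is a GNU extension); the function name show_py_files and the sample file names are made up for illustration.

```shell
# Hypothetical helper: read NUL-separated paths on stdin, keep the .py ones,
# and hand each one to a command as a single argument.
show_py_files() {
    grep -z -E '\.py$' | xargs -0 -n1 printf '[%s]\n'
}

# Invented sample names; "my script.py" contains a space.
printf '%s\0' 'my script.py' 'notes.txt' 'tool.py' | show_py_files
```

The name with the space arrives as one argument, so it is printed inside a single pair of brackets instead of being split in two.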
Given a list of files (through a pipe), one can use xargs to call a command and distribute the arguments. For commands that can process multiple files at once, omit the -n1. In this case we call git blame --line-porcelain, and for every call we use exactly one argument.
xargs -n1 git blame --line-porcelain
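To make the difference visible, here is a small sketch that replaces git blame with a stand-in command reporting how many arguments each invocation received. The names report, one_per_call and all_at_once, as well as the file names, are assumptions for this example only.

```shell
# Stand-in for git blame: print how many file arguments one invocation received.
report='echo "$# file(s): $*"'

one_per_call() { xargs -0 -n1 sh -c "$report" _; }
all_at_once()  { xargs -0     sh -c "$report" _; }

printf '%s\0' a.c b.c c.c | one_per_call   # three invocations, one file each
printf '%s\0' a.c b.c c.c | all_at_once    # one invocation with all three files
```

Since git blame works on one file per invocation, the one_per_call form is the one that matches the pipeline above.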
We then filter the output for occurrences of "author ", sort the list, and count duplicate lines with:
grep "^author "|sort|uniq -c|sort -nr
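If the output should look like the example in the question ("Committer 1: 8046 Lines"), the counts can be reformatted in one more stage. This is only a sketch: count_authors is a hypothetical helper name, and the input below is invented sample text standing in for real git blame --line-porcelain output.

```shell
# Hypothetical helper: turn "author NAME" lines into "NAME: N Lines", largest first.
count_authors() {
    grep '^author ' | sed 's/^author //' | sort | uniq -c | sort -nr |
        awk '{ n = $1; sub(/^ *[0-9]+ /, ""); print $0 ": " n " Lines" }'
}

# Invented sample input; lines such as author-mail are ignored by the grep.
printf '%s\n' 'author Alice' 'author-mail <a@example.com>' 'author Bob' \
    'author Alice' 'summary fix' 'author Alice' | count_authors
```

In a repository, the same function could be placed at the end of the pipeline instead of grep "^author "|sort|uniq -c|sort -nr.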
Note
Other answers actually filter out lines that contain only whitespace.
grep -Pzo "author [^\n]*\n([^\n]*\n){10}[\w]*[^\w]"|grep "author "
The command above prints the authors of lines containing at least one non-whitespace character. You can also match \w*[^\w#] instead, which additionally excludes lines whose first non-whitespace character is a # (a comment in many scripting languages).
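As an alternative to the grep -P pattern, the same idea can be sketched with awk. This is not the approach from the answer above; it relies on the fact that git blame --line-porcelain prefixes each content line with a TAB, and it does not depend on the number of header lines. The helper name count_nonblank_authors and the sample input are assumptions for illustration.

```shell
# Hypothetical helper: count only blamed lines whose content is not blank.
count_nonblank_authors() {
    awk '
        /^author / { author = substr($0, 8) }        # remember the current author
        /^\t/ && /[^[:space:]]/ { lines[author]++ }   # content line with visible text
        END { for (a in lines) print lines[a], a }
    ' | sort -nr
}

# Invented sample: Alice has one real line and one blank line, Bob two real lines and one blank.
printf 'author Alice\n\tint x;\nauthor Bob\n\t\nauthor Alice\n\t  \nauthor Bob\n\treturn 0;\nauthor Bob\n\t}\n' |
    count_nonblank_authors
```

Changing the second pattern to /^\t[[:space:]]*[^[:space:]#]/ would additionally skip lines whose first visible character is a #.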