Git blame: statistics
Problem description
How can I "abuse" blame (or some better-suited function, and/or in conjunction with shell commands) to get a statistic of how many lines (of code) currently in the repository originate from each committer?
Example Output:
Committer 1: 8046 Lines
Committer 2: 4378 Lines
Update
git ls-tree -r -z --name-only HEAD -- */*.c | xargs -0 -n1 git blame \
--line-porcelain HEAD |grep "^author "|sort|uniq -c|sort -nr
I updated some things on the way.
For the lazy, you can also put this into its own command:
#!/bin/bash
# save as e.g. git-authors and set the executable flag
git ls-tree -r -z --name-only HEAD -- $1 | xargs -0 -n1 git blame \
--line-porcelain HEAD |grep "^author "|sort|uniq -c|sort -nr
Store this somewhere in your path, or modify your path, and use it like this:
git authors '*/*.c' # look for all files recursively ending in .c
git authors '*/*.[ch]' # look for all files recursively ending in .c or .h
git authors 'Makefile' # just count lines of authors in the Makefile
Original Answer
While the accepted answer does the job, it's very slow.
$ git ls-tree --name-only -z -r HEAD|grep -z -Z -E '\.(cc|h|cpp|hpp|c|txt)$' \
|xargs -0 -n1 git blame --line-porcelain|grep "^author "|sort|uniq -c|sort -nr
is almost instantaneous.
To get a list of files currently tracked you can use
git ls-tree --name-only -r HEAD
This solution avoids calling file to determine the file type and, for performance reasons, uses grep to match the wanted extensions instead. If all files should be included, just remove the grep stage from the pipeline.
grep -E '\.(cc|h|cpp|hpp|c)$' # for C/C++ files
grep -E '\.py$' # for Python files
If file names can contain spaces, which shells handle badly, you can use:
git ls-tree -z --name-only -r HEAD | grep -E -Z -z '\.py'|xargs -0 ... # separates file names with '\0' instead of newlines
Given a list of files (through a pipe), one can use xargs to call a command and distribute the arguments. Commands that accept multiple files can omit the -n1. In this case we call git blame --line-porcelain, and every call receives exactly one argument.
xargs -n1 git blame --line-porcelain
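To see what -n1 does without touching a repository, here is a minimal sketch that feeds two NUL-terminated names to xargs and runs echo once per name. The function name list_one_per_call, the "blame:" label, and the two file names are made up for illustration; echo merely stands in for git blame, which expects a single file per invocation.

```shell
#!/bin/bash
# Hypothetical demo: each NUL-terminated name becomes exactly one
# invocation of the command, so names with spaces stay intact.
list_one_per_call() {
  # Two sample names, one of them containing a space (illustrative only).
  printf 'a.c\0dir/b c.h\0' | xargs -0 -n1 echo "blame:"
}

list_one_per_call
```

Without -n1, xargs would pass both names to a single echo call and print them on one line.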
We then filter the output for occurrences of "author ", sort the list, and count duplicate lines with:
grep "^author "|sort|uniq -c|sort -nr
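If you want output shaped like the example in the question, the counting stage can be wrapped in a small function. This is only a sketch: the name count_authors and the sed/awk post-processing are additions for illustration, not part of the original answer.

```shell
#!/bin/bash
# Hypothetical helper: reads `git blame --line-porcelain` output on stdin
# and prints "<author>: <n> Lines", largest count first.
count_authors() {
  grep "^author " |          # keep only the author header lines
    sed 's/^author //' |     # drop the "author " prefix
    sort | uniq -c |         # count identical author names
    sort -nr |               # order by count, descending
    awk '{ n = $1; $1 = ""; sub(/^ /, ""); print $0 ": " n " Lines" }'
}

# Usage inside a repository (the path is a placeholder):
# git blame --line-porcelain HEAD -- some/file.c | count_authors
```

Note that the awk step rebuilds the line, so runs of spaces inside an author name are collapsed to a single space.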
Note
Other answers actually filter out lines that contain only whitespace.
grep -Pzo "author [^\n]*\n([^\n]*\n){10}[\w]*[^\w]"|grep "author "
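The command above is meant to print the authors of lines containing at least one non-whitespace character. You can also match \w*[^\w#] instead, which additionally excludes lines whose first non-whitespace character is a # (a comment in many scripting languages). Note that in PCRE \w matches word characters, whereas \s is the whitespace class, so \s may be what is intended here.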
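As an alternative that does not depend on the number of header lines per entry, the same filtering can be sketched with awk. It relies on the porcelain format prefixing each content line with a TAB; the function name count_nonblank_authors is made up for this example, and this is not the original answer's method.

```shell
#!/bin/bash
# Hypothetical helper: reads `git blame --line-porcelain` output on stdin
# and counts, per author, only lines with at least one non-whitespace character.
count_nonblank_authors() {
  awk '
    # Header line: remember the author of the current blame entry.
    /^author / { author = substr($0, 8); next }
    # Content lines are prefixed with a TAB in porcelain output.
    /^\t/ {
      if (substr($0, 2) ~ /[^ \t\r]/) count[author]++
    }
    END { for (a in count) print count[a], a }
  ' | sort -nr
}

# Usage inside a repository (the path is a placeholder):
# git blame --line-porcelain HEAD -- some/file.py | count_nonblank_authors
```

To also skip comment lines, the condition could be tightened to require that the first non-whitespace character is not a #.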