Determine current code distribution by author
Problem description
I thought it would be neat if it were possible to take a Git repository, run some script, and have it generate the number of lines in the code base and the proportion contributed by each author.
Basically, because I am kind of a competitive coder, I would like a personal metric to see if the number of lines that I've written (in the current HEAD) is greater than my partners'. It would be a fun statistic to say "I wrote X% of the current codebase".
Has anyone ever thought to do this? I've looked for a way, but my shell scripting is not the best, so I couldn't do it alone.
Solution
You could try to parse the output of git blame. For each line of a file, this command reports the person who last edited it.
This example is not exactly what you want but I think it gives you the idea:
git blame -e the/file | awk -F '<|>' '{print $2}' | sort | uniq -c
This prints each author's e-mail address together with the number of lines in the file that they were the last to modify, for example:
47 foo@bar.com
34712 blah@baz.com
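If splitting on angle brackets feels fragile, one alternative is to read the machine-oriented output of git blame --line-porcelain, which repeats an author-mail header for every blamed line. The sketch below is only one way to do it; the function name count_by_author is made up for illustration, and it reads blame output from stdin.

```shell
# Count lines per author e-mail from `git blame --line-porcelain` output on stdin.
# count_by_author is a hypothetical helper name, not a git command.
count_by_author() {
    # Every blamed line carries one "author-mail <addr>" header. Source lines
    # are prefixed with a tab in this format, so they cannot match the pattern.
    sed -n 's/^author-mail <\(.*\)>$/\1/p' | sort | uniq -c | sort -rn
}

# Usage (the/file is the placeholder path from the example above):
#   git blame --line-porcelain the/file | count_by_author
```

The trailing sort -rn just puts the biggest contributor first; drop it if you prefer alphabetical order.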
To make it run on the whole repository, you can do something like this:
git ls-files | while IFS= read -r f; do git blame -e -- "$f"; done | awk -F '<|>' '{print $2}' | sort | uniq -c
The idea here is to first generate the list of tracked files with git ls-files, and then run the above snippet on each of those files. Quoting the file name keeps paths that contain spaces intact. If you're running this on a large codebase, you may want to store intermediate results in temporary files rather than use pipes.
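Since the question asks for a proportion rather than raw counts, the "count e-mail" lines produced by uniq -c can be turned into percentages with a little more awk. This is a rough sketch rather than a polished tool: author_shares is a hypothetical name, it assumes e-mail addresses contain no spaces, and binary files tracked in the repository will still be counted unless you filter them out of the file list.

```shell
# Turn "count email" lines (as printed by `uniq -c`) into shares of the total.
# author_shares is a hypothetical helper name used only for this sketch.
author_shares() {
    awk '{ count[$2] += $1; total += $1 }
         END {
             if (total == 0) exit          # nothing to report for empty input
             for (a in count)
                 printf "%d %.1f%% %s\n", count[a], 100 * count[a] / total, a
         }' | sort -rn
}

# Usage, assuming it is run from inside a Git work tree:
#   git ls-files | while IFS= read -r f; do git blame -e -- "$f"; done |
#       awk -F '<|>' '{print $2}' | sort | uniq -c | author_shares
```

Fed the two sample counts shown earlier, it prints:

34712 99.9% blah@baz.com
47 0.1% foo@bar.com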