Efficient retrieval of releases that contain a commit


Problem description


In the command line, if I type

git tag --contains {commit}

to obtain the list of releases that contain a given commit, it takes around 11 to 20 seconds per commit. Since the target code base has more than 300,000 commits, retrieving this information for all of them would take far too long (over a month at that rate).

However, gitk apparently retrieves this data quickly. From what I have read, it uses a cache for that purpose.

I have two questions:

  1. How can I interpret that cache format?
  2. Is there a way to obtain a dump from the git command line tool to generate that same information?

Solution

You can get this almost directly from git rev-list.

latest.awk:

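# Input: "commit <sha> <child>..." lines (from --children), each optionally
# followed by a %d decoration line. With --date-order, children are listed
# before their parents, so line[] is already set for every child.
BEGIN { thiscommit=""; }
$1 == "commit" {
    # Flush the previous commit before starting the next one.
    if ( thiscommit != "" )
        print thiscommit, tags[thiscommit]
    thiscommit=$2
    line[$2]=NR
    latest = 0;
    # Inherit the refs of the child that appeared latest in the output.
    for ( i = 3 ; i <= NF ; ++i ) if ( line[$i] > latest ) {
        latest = line[$i];
        tags[$2] = tags[$i];
    }
    next;
}
# Any other line is this commit's own decoration; it replaces the inherited refs.
$1 != "commit"  { tags[thiscommit] = $0; }
END { if ( thiscommit != "" ) print thiscommit, tags[thiscommit]; }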

A sample command:

git rev-list --date-order --children --format=%d --all | awk -f latest.awk

You can also use --topo-order, and you'll probably have to weed out unwanted refs in the $1!="commit" logic.
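
One way to do that weeding is a small filter placed in front of latest.awk. The sketch below is only an illustration: keep_tags is a hypothetical helper name, not something from git or from the answer, and it assumes the decoration lines look like " (HEAD -> master, tag: v1.0, origin/master)", which is how %d normally prints them. It passes commit lines through unchanged, keeps only the "tag: NAME" entries of each decoration line, and drops decoration lines that contain no tag at all.

```shell
# keep_tags: hypothetical helper, not part of git or of the original answer.
# Reads `git rev-list --children --format=%d` output on stdin and rewrites
# each decoration line so that only the tag names survive.
keep_tags() {
    awk '
    $1 == "commit" { print; next }      # pass commit lines through untouched
    {
        deco = $0
        sub(/^ *\(/, "", deco)          # drop the leading " ("
        sub(/\) *$/, "", deco)          # drop the trailing ")"
        n = split(deco, refs, ", ")
        out = ""
        for (i = 1; i <= n; ++i)
            if (refs[i] ~ /^tag: /) {
                name = refs[i]
                sub(/^tag: /, "", name) # keep the bare tag name
                out = out (out == "" ? "" : " ") name
            }
        if (out != "") print out        # lines without any tag are dropped
    }'
}
```

It would be used like this:

git rev-list --date-order --children --format=%d --all | keep_tags | awk -f latest.awk

Because a decoration line with no tags is dropped rather than printed empty, latest.awk keeps the tags inherited from the child for that commit. The filter is meant for latest.awk only; all.awk expects the raw parenthesised line.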

Depending on what kind of transitivity you want and how explicit the listing has to be, accumulating the tags might need a dictionary. Here's one that gets an explicit listing of all refs for all commits:

all.awk:

BEGIN {
    thiscommit="";
}
$1 == "commit" {
    # Flush the previous commit, then start collecting refs for this one.
    if ( thiscommit != "" )
        print thiscommit, tags[thiscommit]
    thiscommit=$2
    line[$2]=NR
    split("",seen);    # portable way to empty the seen[] array
    # Merge the ref lists of every child, skipping names already added.
    for ( i = 3 ; i <= NF ; ++i ) {
        nnew=split(tags[$i],new);
        for ( n = 1 ; n <= nnew ; ++n ) {
            if ( !seen[new[n]] ) {
                tags[$2]= tags[$2]" "new[n]
                seen[new[n]] = 1
            }
        }
    }
    next;
}
$1 != "commit"  {
    # Decoration line " (ref1, ref2, ...)": strip the parentheses, append each ref.
    nnew=split($0,new,", ");
    new[1]=substr(new[1],3);
    new[nnew]=substr(new[nnew],1,length(new[nnew])-1);
    for ( n = 1; n <= nnew ; ++n )
        tags[thiscommit] = tags[thiscommit]" "new[n]

}
END { if ( thiscommit != "" ) print thiscommit, tags[thiscommit]; }

all.awk took a few minutes to process the 322K commits of the Linux kernel repo, roughly a thousand commits a second (there are lots of duplicate strings and redundant processing), so you'd probably want to rewrite it in C++ if you're really after the complete cross-product. But I don't think gitk shows that, only the nearest neighbors, right?
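
Since the listing costs minutes to build, it makes sense to build it once, save it, and answer per-commit queries from the saved file instead of calling git each time. A minimal sketch, assuming the output of either script was redirected to a file with one "<sha> <refs...>" line per commit; refs_for and the file name refs.txt are hypothetical names used only for illustration:

```shell
# refs_for: hypothetical helper; prints the refs recorded for every commit
# whose hash starts with the given prefix.
# usage: refs_for LISTING PREFIX
refs_for() {
    awk -v prefix="$2" '
    index($1, prefix) == 1 {
        sha = $1
        $1 = ""                 # blank the hash field, leaving only the refs
        sub(/^ +/, "")          # trim the separator left behind
        print sha ": " $0
    }' "$1"
}
```

For example:

git rev-list --date-order --children --format=%d --all | awk -f all.awk > refs.txt
refs_for refs.txt 1a2b3c

Each lookup is a single pass over the file, so it stays well below the 11 to 20 seconds quoted in the question; the listing has to be regenerated whenever new commits or tags arrive.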
