Performing vector intersection in C++


Problem description


I have a vector of vectors of unsigned. I need to find the intersection of all of these vectors, so I wrote the following code:

int func()
{
   vector<vector<unsigned> > t;
   vector<unsigned> intersectedValues;
   bool firstIntersection=true;
   for(int i=0;i<(t).size();i++)
   {
       if(firstIntersection)
       {
           intersectedValues=t[0];
           firstIntersection=false;
       }else{
           vector<unsigned> tempIntersectedSubjects;                                                              
           set_intersection(t[i].begin(),
                  t[i].end(), intersectedValues.begin(),
                  intersectedValues.end(),
                  std::inserter(tempIntersectedSubjects, tempIntersectedSubjects.begin()));
           intersectedValues=tempIntersectedSubjects;
       }         
       if(intersectedValues.size()==0)
           break;
   }               
}

Each individual vector has 9000 elements, and there are many such vectors in "t". When I profiled my code, I found that set_intersection takes the most time, which makes the code slow when func() is invoked many times. Can someone please suggest how I can make the code more efficient?

I am using: gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)

EDIT: Individual vectors in vector "t" are sorted.

Solution

I don't have a framework to profile the operations, but I'd certainly change the code to reuse the already allocated vector. In addition, I'd hoist the initial intersection out of the loop. Also, std::back_inserter() should make sure that elements are appended at the end rather than inserted at the beginning:

std::vector<unsigned> func()
{
    vector<vector<unsigned> > t = some_initialization();
    if (t.empty()) {
        return vector<unsigned>();  // nothing to intersect
    }
    vector<unsigned> intersectedValues(t[0]);
    vector<unsigned> tempIntersectedSubjects;
    for (std::vector<std::vector<unsigned>>::size_type i(1u);
         i < t.size() && !intersectedValues.empty(); ++i) {
        std::set_intersection(t[i].begin(), t[i].end(),
                              intersectedValues.begin(), intersectedValues.end(),
                              std::back_inserter(tempIntersectedSubjects));
        // swapping lets both vectors keep their allocated storage across iterations
        std::swap(intersectedValues, tempIntersectedSubjects);
        tempIntersectedSubjects.clear();
    }
    return intersectedValues;
}

I think this code has a fair chance of being faster. It may also be reasonable to intersect the sets differently: instead of keeping one running set and intersecting each vector with it, you could intersect pairs of adjacent sets to produce a new, shorter list of sets, and then repeat the process on that list until a single set remains:

// intersect(a, b) is assumed to return the intersection of two sorted vectors.
std::vector<std::vector<unsigned>> intersections(
    std::vector<std::vector<unsigned>> const& t) {
    std::vector<std::vector<unsigned>> r;
    std::vector<std::vector<unsigned>>::size_type i(0);
    for (; i + 1 < t.size(); i += 2) {
        r.push_back(intersect(t[i], t[i + 1]));
    }
    if (i < t.size()) {
        r.push_back(t[i]);  // an unpaired last vector is carried over unchanged
    }
    return r;
}

std::vector<unsigned> func(std::vector<std::vector<unsigned>> const& t) {
    // one way to deal with t being empty; without an early exit the recursion would not end
    if (t.empty()) { return std::vector<unsigned>(); }
    std::vector<std::vector<unsigned>> r(intersections(t));
    return r.size() == 1? r[0]: func(r);
}
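
The pairwise code above calls intersect() without defining it. A minimal sketch of such a helper, assuming both inputs are sorted as stated in the question (the name and signature are taken from the call site; the body is only one plausible implementation):

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: intersection of two sorted vectors.
// The result can never be longer than the shorter input, so reserving
// that much up front avoids reallocation while appending.
std::vector<unsigned> intersect(std::vector<unsigned> const& a,
                                std::vector<unsigned> const& b) {
    std::vector<unsigned> out;
    out.reserve(std::min(a.size(), b.size()));
    std::set_intersection(a.begin(), a.end(),
                          b.begin(), b.end(),
                          std::back_inserter(out));
    return out;
}
```

If either input is not sorted, std::set_intersection has undefined behavior, so the helper relies on the caller to keep that guarantee.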

Of course, you wouldn't really implement it like this: you'd use Stepanov's binary counter to keep the intermediate sets. This approach assumes that the result is most likely non-empty. If the expectation is that the result will be empty, that may not be an improvement.
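
For illustration, here is a rough sketch of what a binary-counter-style reduction might look like. It is an interpretation of the idea, not code from the answer: the names Set, intersect_sorted and intersect_all are made up for this example, and slot k is assumed to hold the intersection of 2^k input vectors, with a "carry" propagating upward the way it does in binary addition.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

typedef std::vector<unsigned> Set;  // hypothetical alias for a sorted vector

// Hypothetical helper: intersection of two sorted vectors.
Set intersect_sorted(Set const& a, Set const& b) {
    Set out;
    out.reserve(std::min(a.size(), b.size()));
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(out));
    return out;
}

// Intersect all sorted vectors in t. slots[k] is either unused or holds
// the intersection of 2^k inputs; used[k] records which case applies,
// because an empty vector is a legitimate intermediate result.
Set intersect_all(std::vector<Set> const& t) {
    std::vector<Set> slots;
    std::vector<bool> used;
    for (std::size_t i = 0; i < t.size(); ++i) {
        Set carry = t[i];
        std::size_t k = 0;
        // Merge with occupied slots, like carrying in binary addition.
        while (k < slots.size() && used[k]) {
            carry = intersect_sorted(slots[k], carry);
            slots[k].clear();
            used[k] = false;
            ++k;
        }
        if (k == slots.size()) {
            slots.push_back(Set());
            used.push_back(false);
        }
        slots[k].swap(carry);
        used[k] = true;
    }
    // Fold whatever is left in the counter into one result.
    Set result;
    bool have_result = false;
    for (std::size_t k = 0; k < slots.size(); ++k) {
        if (!used[k]) continue;
        if (have_result) {
            result = intersect_sorted(slots[k], result);
        } else {
            result = slots[k];
            have_result = true;
        }
    }
    return result;  // empty when t is empty
}
```

Unlike the running-intersection loop, this sketch does not stop early when an intermediate result becomes empty, which matches the caveat that the approach pays off mainly when the final result is expected to be non-empty.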
