我如何写一个无分支std :: vector扫描? [英] How do I write a branchless std::vector scan?

查看:156
本文介绍了我如何写一个无分支std :: vector扫描?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想在数组上写一个简单的扫描。我有一个 std :: vector< int> data ,我想查找元素小于9的所有数组索引,并将它们添加到结果向量。我可以写一个分支:

I want to write a simple scan over an array. I have a std::vector<int> data and I want to find all array indices at which the elements are less than 9 and add them to a result vector. I can write this using a branch:

for (int i = 0; i < data.size(); ++i)
    if (data[i] < 9)
        r.push_back(i);

这给出了正确答案,但我想将其与无分支版本进行比较。

This gives the correct answer but I would like to compare it to a branchless version.

使用原始数组 - 并假设数据是一个int数组, length 是其中的元素数量, r 是一个具有足够空间的结果数组 - 我可以写如下:

Using raw arrays - and assuming that data is an int array, length is the number of elements in it, and r is a result array with plenty of room - I can write something like:

int current_write_point = 0;
for (int i = 0; i < length; ++i){
    r[current_write_point] = i;
    current_write_point += (data[i] < 9);
}

我如何使用数据

推荐答案

让我们看看实际的编译器输出

auto scan_branch(const std::vector<int>& v)
{
  std::vector<int> res;
  int insert_index = 0;
  for(int i = 0; i < v.size(); ++i)
  {
    if (v[i] < 9)
    {
       res.push_back(i);
    } 
  }
  return res;
}

这段代码显然在反汇编。如果它大于或等于9,它只是继续下一个元素,但是在小于9的情况下,一些可怕的代码执行的push_back,我们继续。没有什么意想不到的。

This code clearly has a branch at 26th line of disassembly. If it's greater than or equal to 9, it just continues with the next element, however in the event of lesser than 9, some horrible amount of code executes for the push_back and we continue. Nothing unexpected.

auto scan_nobranch(const std::vector<int>& v)
{
  std::vector<int> res;
  res.resize(v.size());

  int insert_index = 0;
  for(int i = 0; i < v.size(); ++i)
  {
    res[insert_index] = i;
    insert_index += v[i] < 9;
  }

  res.resize(insert_index);
  return res;
}

但是,这只有一个条件移动,你可以在反汇编的第190行。看起来我们有一个赢家。因为条件移动不能导致流水线停顿,所以在这一个没有分支(除了条件检查)。

This one, however, only has a conditional move, which you can see in the 190th line of the disassembly. It looks like we have a winner. Since conditional move cannot result in pipeline stalls, there are no branches in this one (except the for condition check).

这篇关于我如何写一个无分支std :: vector扫描?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆