Is there a penalty for using static variables in C++11?

Question

In C++11, this:

const std::vector<int>& f() {
    static const std::vector<int> x { 1, 2, 3 };
    return x;
}

is thread-safe. However, because of this extra thread-safety guarantee, is there an added penalty for calling this function after the first time (i.e. after the variable has been initialized)? I am wondering whether the function will be slower than one that uses a global variable, because it has to acquire a mutex on every call to check whether another thread is initializing the variable, or something similar.

Recommended answer

The best intuition to ever have is "I should measure this." So let's find out:

#include <atomic>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <numeric>
#include <type_traits>
#include <vector>

namespace {
class timer {
    using hrc = std::chrono::high_resolution_clock;
    hrc::time_point start;

    static hrc::time_point now() {
      // Prevent memory operations from reordering across the
      // time measurement. This is likely overkill, needs more
      // research to determine the correct fencing.
      std::atomic_thread_fence(std::memory_order_seq_cst);
      auto t = hrc::now();
      std::atomic_thread_fence(std::memory_order_seq_cst);
      return t;
    }

public:
    timer() : start(now()) {}

    hrc::duration elapsed() const {
      return now() - start;
    }

    template <typename Duration>
    typename Duration::rep elapsed() const {
      return std::chrono::duration_cast<Duration>(elapsed()).count();
    }

    template <typename Rep, typename Period>
    Rep elapsed() const {
      return elapsed<std::chrono::duration<Rep,Period>>();
    }
};

const std::vector<int>& f() {
    static const auto x = std::vector<int>{ 1, 2, 3 };
    return x;
}

static const auto y = std::vector<int>{ 1, 2, 3 };
const std::vector<int>& g() {
    return y;
}

const unsigned long long n_iterations = 500000000;

template <typename F>
void test_one(const char* name, F f) {
  f(); // First call outside the timer.

  using value_type = typename std::decay<decltype(f()[0])>::type;
  std::cout << name << ": " << std::flush;

  auto t = timer{};
  auto sum = uint64_t{};
  for (auto i = n_iterations; i > 0; --i) {
    const auto& vec = f();
    sum += std::accumulate(begin(vec), end(vec), value_type{});
  }
  const auto elapsed = t.elapsed<std::chrono::milliseconds>();
  std::cout << elapsed << " ms (" << sum << ")\n";
}
} // anonymous namespace

int main() {
  test_one("local static", f);
  test_one("global static", g);
}

Running at Coliru, the local version does 5e8 iterations in 4618 ms, the global version in 4392 ms. So yes, the local version is slower by approximately 0.452 nanoseconds per iteration. Although there's a measurable difference, it's too small to impact observed performance in most situations.


An interesting counterpoint: switching from clang++ to g++ reverses the ordering of the results. The g++-compiled binary runs in 4418 ms (global) vs. 4181 ms (local), so the local version is faster by 474 picoseconds per iteration. This nonetheless reaffirms the conclusion that the difference between the two methods is small.

EDIT 2: After examining the generated assembly, I decided to switch from function pointers to function objects for better inlining. Timing indirect calls through function pointers isn't really characteristic of the code in the OP. So I used this program:

#include <atomic>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <numeric>
#include <type_traits>
#include <vector>

namespace {
class timer {
    using hrc = std::chrono::high_resolution_clock;
    hrc::time_point start;

    static hrc::time_point now() {
      // Prevent memory operations from reordering across the
      // time measurement. This is likely overkill.
      std::atomic_thread_fence(std::memory_order_seq_cst);
      auto t = hrc::now();
      std::atomic_thread_fence(std::memory_order_seq_cst);
      return t;
    }

public:
    timer() : start(now()) {}

    hrc::duration elapsed() const {
      return now() - start;
    }

    template <typename Duration>
    typename Duration::rep elapsed() const {
      return std::chrono::duration_cast<Duration>(elapsed()).count();
    }

    template <typename Rep, typename Period>
    Rep elapsed() const {
      return elapsed<std::chrono::duration<Rep,Period>>();
    }
};

class f {
public:
    const std::vector<int>& operator()() {
        static const auto x = std::vector<int>{ 1, 2, 3 };
        return x;
    }
};

class g {
    static const std::vector<int> x;
public:
    const std::vector<int>& operator()() {
        return x;
    }
};

const std::vector<int> g::x{ 1, 2, 3 };

const unsigned long long n_iterations = 500000000;

template <typename F>
void test_one(const char* name, F f) {
  f(); // First call outside the timer.

  using value_type = typename std::decay<decltype(f()[0])>::type;
  std::cout << name << ": " << std::flush;

  auto t = timer{};
  auto sum = uint64_t{};
  for (auto i = n_iterations; i > 0; --i) {
    const auto& vec = f();
    sum += std::accumulate(begin(vec), end(vec), value_type{});
  }
  const auto elapsed = t.elapsed<std::chrono::milliseconds>();
  std::cout << elapsed << " ms (" << sum << ")\n";
}
} // anonymous namespace

int main() {
  test_one("local static", f());
  test_one("global static", g());
}

Not surprisingly, runtimes were faster under both g++ (3803ms local, 2323ms global) and clang (4183ms local, 3253ms global). The results affirm our intuition that the global technique should be faster than the local, with deltas of 2.96 nanoseconds (g++) and 1.86 nanoseconds (clang) per iteration.
