Cost of thread-safe local static variable initialization in C++11?
Question
We know that local static variable initialization is thread-safe in C++11, and modern compilers fully support this. (Is local static variable initialization thread-safe in C++11?)
What is the cost of making it thread-safe? I understand that this could very well depend on the compiler implementation.
Context: I have a multi-threaded application (10 threads) that accesses a singleton object pool instance via the following function at very high rates, and I'm concerned about the performance implications.
template <class T>
ObjectPool<T>* ObjectPool<T>::GetInst()
{
    static ObjectPool<T> instance;
    return &instance;
}
Recommended answer
Looking at the generated assembler code helps.
#include <vector>
std::vector<int> &get(){
    static std::vector<int> v;
    return v;
}
int main(){
    return get().size();
}
Assembler output:
std::vector<int, std::allocator<int> >::~vector():
movq (%rdi), %rdi
testq %rdi, %rdi
je .L1
jmp operator delete(void*)
.L1:
rep ret
get():
movzbl guard variable for get()::v(%rip), %eax
testb %al, %al
je .L15
movl get()::v, %eax
ret
.L15:
subq $8, %rsp
movl guard variable for get()::v, %edi
call __cxa_guard_acquire
testl %eax, %eax
je .L6
movl guard variable for get()::v, %edi
movq $0, get()::v(%rip)
movq $0, get()::v+8(%rip)
movq $0, get()::v+16(%rip)
call __cxa_guard_release
movl $__dso_handle, %edx
movl get()::v, %esi
movl std::vector<int, std::allocator<int> >::~vector(), %edi
call __cxa_atexit
.L6:
movl get()::v, %eax
addq $8, %rsp
ret
main:
subq $8, %rsp
call get()
movq 8(%rax), %rdx
subq (%rax), %rdx
addq $8, %rsp
movq %rdx, %rax
sarq $2, %rax
ret
versus
#include <vector>
static std::vector<int> v;
std::vector<int> &get(){
    return v;
}
int main(){
    return get().size();
}
Assembler output:
std::vector<int, std::allocator<int> >::~vector():
movq (%rdi), %rdi
testq %rdi, %rdi
je .L1
jmp operator delete(void*)
.L1:
rep ret
get():
movl v, %eax
ret
main:
movq v+8(%rip), %rax
subq v(%rip), %rax
sarq $2, %rax
ret
movl $__dso_handle, %edx
movl v, %esi
movl std::vector<int, std::allocator<int> >::~vector(), %edi
movq $0, v(%rip)
movq $0, v+8(%rip)
movq $0, v+16(%rip)
jmp __cxa_atexit
I'm not that great with assembler, but I can see that in the first version v has a lock around it and get is not inlined, whereas in the second version get is essentially gone. In the listing, the lock is only taken on the slow path: once the guard variable is set, each call to get loads the guard byte, tests it, and returns the address of v; __cxa_guard_acquire and __cxa_guard_release are reached only while the guard is still zero.
You can play around with various compilers and optimization flags, but it seems no compiler is able to inline or optimize out the locks, even though the program is obviously single-threaded.
You can add static to get, which makes gcc inline get while preserving the lock.
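To make the guard logic in the first listing easier to follow, here is a rough hand-written C++ sketch of the same pattern: a cheap check on every call, and a lock taken only while the object has not yet been constructed. This is an illustration, not what the compiler or the C++ runtime literally does. The names sketch_get, storage, instance and init_mutex are hypothetical, and the sketch deliberately skips the destructor registration that the listing performs through __cxa_atexit.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <new>
#include <vector>

namespace {
// Raw storage for the object, so that nothing is constructed before first use.
alignas(std::vector<int>) unsigned char storage[sizeof(std::vector<int>)];
// Plays the role of the guard variable: null means "not yet initialized".
std::atomic<std::vector<int>*> instance{nullptr};
// Serializes the one-time construction (the slow path).
std::mutex init_mutex;
}

// Hypothetical stand-in for get() from the first example.
std::vector<int>& sketch_get() {
    // Fast path: one atomic load and a branch, taken on every call.
    std::vector<int>* p = instance.load(std::memory_order_acquire);
    if (p == nullptr) {
        // Slow path: only reached until the first construction completes.
        std::lock_guard<std::mutex> lock(init_mutex);
        p = instance.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = ::new (static_cast<void*>(storage)) std::vector<int>();
            instance.store(p, std::memory_order_release);
        }
    }
    assert(p != nullptr);
    // Note: unlike a real local static, the destructor is never run here.
    return *p;
}
```

After the first call, every later call to sketch_get only pays for the load and the branch, which corresponds to the movzbl/testb/je sequence at the top of get() in the listing.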
To know how much these locks and additional instructions cost for your compiler, flags, platform and surrounding code, you would need to write a proper benchmark.
I would expect the locks to have some overhead and to be significantly slower than the inlined code, a difference that becomes insignificant once you actually do work with the vector, but you can never be sure without measuring.
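As a starting point for such a measurement, a minimal timing harness might look like the sketch below. It is only a rough micro-benchmark under simplifying assumptions (single thread, no warm-up, no statistical treatment), and the names get_local, get_global, g_v and time_calls are made up for this illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Function-local static: each call goes through the guard-variable check.
std::vector<int>& get_local() {
    static std::vector<int> v;
    return v;
}

// Namespace-scope static: initialized before main, no guard check on access.
static std::vector<int> g_v;
std::vector<int>& get_global() { return g_v; }

// Calls f() `iterations` times and returns the elapsed time in nanoseconds.
// The volatile sink keeps the compiler from discarding the loop entirely.
template <class F>
std::int64_t time_calls(F f, std::int64_t iterations) {
    assert(iterations > 0);
    volatile std::size_t sink = 0;
    auto start = std::chrono::steady_clock::now();
    for (std::int64_t i = 0; i < iterations; ++i) {
        sink = sink + f().size();
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling time_calls(get_local, n) and time_calls(get_global, n) from main with a large n, built with the same compiler and flags as the real application, gives a first comparison. For the 10-thread scenario in the question, the same loop would have to run concurrently on several threads, since contention and cache effects do not show up in a single-threaded run.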