C++0x atomic implementation in C++98: question about __sync_synchronize()

Problem description

I have written the following atomic template with a view to mimicking the atomic operations which will be available in the upcoming C++0x standard.

However, I am not sure that the __sync_synchronize() call I have around the returning of the underlying value is necessary.

From my understanding, __sync_synchronize() is a full memory barrier and I'm not sure I need such a costly call when returning the object value.

I'm pretty sure it'll be needed around the setting of the value, but I could also implement this with the following assembly:

__asm__ __volatile__ ( "rep;nop": : :"memory" );

Does anyone know whether I definitely need the __sync_synchronize() on return of the object?

M.

template < typename T >
struct atomic
{
private:
    volatile T obj;

public:
    atomic( const T & t ) :
        obj( t )
    {
    }

    inline operator T()
    {
        __sync_synchronize();   // Not sure this is overkill
        return obj;
    }

    inline atomic< T > & operator=( T val )
    {
        __sync_synchronize();   // Not sure if this is overkill
        obj = val;
        return *this;
    }

    inline T operator++()
    {
        return __sync_add_and_fetch( &obj, (T)1 );
    }

    inline T operator++( int )
    {
        return __sync_fetch_and_add( &obj, (T)1 );
    }

    inline T operator+=( T val )
    {
        return __sync_add_and_fetch( &obj, val );
    }

    inline T operator--()
    {
        return __sync_sub_and_fetch( &obj, (T)1 );
    }

    inline T operator--( int )
    {
        return __sync_fetch_and_sub( &obj, (T)1 );
    }

    inline T operator-=( T val )
    {
        return __sync_sub_and_fetch( &obj, val );
    }

    // Perform an atomic CAS operation
    // returning the value before the operation
    inline T exchange( T oldVal, T newVal )
    {
        return __sync_val_compare_and_swap( &obj, oldVal, newVal );
    }

};

Update: I want to make sure that the operations are consistent in the face of read/write re-ordering due to compiler optimisations.

Solution

First, some petty remarks:

volatile T obj;

volatile is useless here, all the more so because you place all the barriers yourself.

inline T operator++( int )

inline is unneeded, since it is implied when the method is defined inside the class.

Getters and setters:

inline operator T()
{
    __sync_synchronize();   // (I)
    T tmp=obj;
    __sync_synchronize();   // (II)
    return tmp;
}

inline atomic< T > & operator=( T val )
{
    __sync_synchronize();   // (III)
    obj = val;
    __sync_synchronize();   // (IV)
    return *this;
}

To ensure total ordering of the memory accesses on read and write, you need two barriers on each access (like this). I would be happy with only barriers (II) and (III), as they suffice for some uses I came up with (e.g. a pointer/boolean saying the data is there, or a spinlock), but, unless specified otherwise, I would not omit the others, because someone might need them (it would be nice if someone showed that you can omit some of the barriers without restricting possible uses, but I don't think it's possible).

Of course, this would be unnecessarily complicated and slow.
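For illustration, the reduced variant mentioned above, keeping only barriers (II) and (III), might look like the sketch below. The names flag_atomic, payload, ready, produce and try_consume are hypothetical, and the sketch assumes a GCC-compatible compiler and a T that the hardware reads and writes in one piece (such as bool or a pointer). It covers the "boolean saying the data is there" use only; it is not a general replacement for the four-barrier version.

```cpp
#include <cassert>

// Hypothetical cut-down variant of the question's template: one barrier
// after the load (II) and one barrier before the store (III).
// Assumption: T is small enough to be loaded and stored without tearing.
template <typename T>
struct flag_atomic
{
    explicit flag_atomic(const T& t) : obj(t) {}

    T load()
    {
        T tmp = obj;
        __sync_synchronize();   // (II): later accesses stay after the load
        return tmp;
    }

    void store(T val)
    {
        __sync_synchronize();   // (III): earlier accesses stay before the store
        obj = val;
    }

private:
    T obj;
};

static int payload = 0;                  // ordinary, non-atomic data
static flag_atomic<bool> ready(false);   // "the data is there" flag

// Intended to run on the producing thread.
void produce(int value)
{
    payload = value;      // write the data first
    ready.store(true);    // (III) keeps the payload write ahead of the flag write
}

// Intended to run on the consuming thread; returns false if nothing is ready.
bool try_consume(int& out)
{
    if (!ready.load())    // (II) keeps the payload read behind the flag read
        return false;
    out = payload;
    return true;
}
```

Calling try_consume before produce returns false and leaves the output untouched; after produce(42) it returns true and yields 42. In real use the two functions would run on different threads.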

That said, I would just drop the barriers, and even the idea of putting barriers anywhere in a template like this. Note that:

  • the ordering semantics of that interface are entirely defined by you; if you decide the interface has barriers here or there, they must be here or there, period. If you don't fix them in the interface, you can come up with a more efficient design, because a particular problem might not need all the barriers, or might not need full barriers at all.
  • usually, you use atomics when you have a lock-free algorithm that could give you a performance advantage; this means an interface that prematurely pessimizes the accesses will probably be unusable as a building block for it, as it hampers the very performance you are after.
  • lock-free algorithms typically contain communication that cannot be encapsulated by one atomic data type, so you need to know what's happening in the algorithm to place the barriers precisely where they belong (e.g. when implementing a lock, you need one barrier after you've acquired it and another before you release it, and acquiring and releasing are both writes, at least in principle).
  • if you don't want to run into problems and are not sure about placing the barriers explicitly in the algorithm, just use lock-based algorithms. There's nothing bad about that.
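The lock example from the list above can be sketched with two other GCC builtins from the same family as those used in the question: __sync_lock_test_and_set, which GCC documents as an acquire barrier, and __sync_lock_release, which it documents as a release barrier. The class name spinlock and the helper increment_counter are hypothetical; this is a minimal sketch assuming a GCC-compatible compiler, not a production-quality lock (it has no back-off and no fairness).

```cpp
#include <cassert>

// Hypothetical minimal spinlock: the barriers sit where the algorithm
// needs them (after acquiring, before releasing), not inside a generic
// atomic wrapper.
class spinlock
{
public:
    spinlock() : locked(0) {}

    void lock()
    {
        // Atomically writes 1 and returns the previous value.
        // Acts as an acquire barrier: accesses in the critical section
        // cannot move above this point.
        while (__sync_lock_test_and_set(&locked, 1))
        {
            // Previous value was 1: someone else holds the lock, keep spinning.
        }
    }

    void unlock()
    {
        // Writes 0. Acts as a release barrier: accesses in the critical
        // section cannot move below this point.
        __sync_lock_release(&locked);
    }

private:
    int locked;
    spinlock(const spinlock&);             // non-copyable, C++98 style
    spinlock& operator=(const spinlock&);
};

static spinlock counter_lock;
static long counter = 0;   // protected by counter_lock

// Increments the shared counter under the lock and returns the new value.
long increment_counter()
{
    counter_lock.lock();
    long result = ++counter;
    counter_lock.unlock();
    return result;
}
```

Note that both lock() and unlock() write to the lock word, yet they need different ordering guarantees, which is why a single fixed barrier policy inside operator= would fit neither of them well.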

BTW, the C++0x interface allows you to specify precise memory ordering constraints on each operation.
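As a rough illustration of what that looks like, here is the same "flag says the data is there" pattern written against std::atomic from the final C++11 standard, with the ordering spelled out per operation. The names message, message_ready, hits, publish and try_read are hypothetical, and a C++11 compiler is assumed.

```cpp
#include <atomic>
#include <cassert>

static int message = 0;                          // ordinary, non-atomic data
static std::atomic<bool> message_ready(false);   // publication flag
static std::atomic<long> hits(0);                // statistics counter

// Intended to run on the producing thread.
void publish(int value)
{
    message = value;
    // Release: the write to message is visible to any thread that
    // observes message_ready == true with an acquire load.
    message_ready.store(true, std::memory_order_release);
}

// Intended to run on the consuming thread; returns false if nothing is ready.
bool try_read(int& out)
{
    // Relaxed: the counter only needs atomicity, not ordering.
    hits.fetch_add(1, std::memory_order_relaxed);

    // Acquire: pairs with the release store in publish().
    if (!message_ready.load(std::memory_order_acquire))
        return false;
    out = message;
    return true;
}
```

Here neither side pays for a full barrier on every access: the store uses release ordering, the load uses acquire ordering, and the counter uses relaxed ordering because nothing else depends on it.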
