CMPXCHG16B是否正确? [英] CMPXCHG16B correct?

查看:333
本文介绍了CMPXCHG16B是否正确?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这并不完全正确,虽然我不确定为什么。
建议是伟大的,因为CMPXCHG16B的文档是非常少的(我没有任何英特尔手册...)

This doesn't exactly seem to be right although I am unsure why. Advice would be great as the documentation for CMPXCHG16B is pretty minimal (I don't own any intel manuals...)

template<>
inline bool cas(volatile types::uint128_t *src, types::uint128_t cmp, types::uint128_t with)
{
    /*
    Description:
     The CMPXCHG16B instruction compares the 128-bit value in the RDX:RAX and RCX:RBX registers 
     with a 128-bit memory location. If the values are equal, the zero flag (ZF) is set, 
     and the RCX:RBX value is copied to the memory location. 
     Otherwise, the ZF flag is cleared, and the memory value is copied to RDX:RAX.
     */
    uint64_t * cmpP = (uint64_t*)&cmp;
    uint64_t * withP = (uint64_t*)&with;
    unsigned char result = 0;
    __asm__ __volatile__ (
    "LOCK; CMPXCHG16B %1\n\t"
    "SETZ %b0\n\t"
    : "=q"(result)  /* output */ 
    : "m"(*src), /* input */
      //what to compare against
      "rax"( ((uint64_t) (cmpP[1])) ), //lower bits
      "rdx"( ((uint64_t) (cmpP[0])) ),//upper bits
      //what to replace it with if it was equal
      "rbx"( ((uint64_t) (withP[1])) ), //lower bits
      "rcx"( ((uint64_t) (withP[0]) ) )//upper bits
    : "memory", "cc", "rax", "rdx", "rbx","rcx" /* clobbered items */
    );
    return result;
}

当运行一个例子时,我应该得到0。

When running with an example I am getting 0 when it should be 1. Any ideas?

(1)

推荐答案

主要的问题是约束,rax不做它的样子,而是第一个字符r让gcc使用任何寄存器。

(1) The main problem is the constraints, "rax" doesn't do what it looks like, rather the first character "r" lets gcc use any register.

不知道你的存储类型:: uint128_t,但假设x86平台的标准小端,那么高和低双字也会交换。

(2) Not sure how your storing types::uint128_t, but assuming the standard little endian for x86 platforms, then the high and low dwords are also swapped around.

(3)地址的东西,并将其转换为别的可以打破别名规则。取决于你的类型:: uint128_t是如何定义的,这是一个问题(如果它是一个两个uint64_t的结构)。 GCC与-O2将优化假设别名规则不违反。

(3) Taking the address of something and casting it to something else can break aliasing rules. Depends on how your types::uint128_t is defined as to wether or not this is an issue (fine if it is a struct of two uint64_t's). GCC with -O2 will optimize assuming aliasing rules are not violated.

(4)* src应该真正标记为输出,而不是指定内存clobber。但这真的更多的是性能而不是正确性问题。类似地,rbx和rcx不需要指定为破坏。

(4) *src should really be marked as an output, rather than specifying memory clobber. but this is really more of a performance rather than correctness issue. similarly rbx and rcx do not need to specified as clobbered.

这是一个工作的版本,

#include <stdint.h>

namespace types
{
    struct uint128_t
    {
        uint64_t lo;
        uint64_t hi;
    }
    __attribute__ (( __aligned__( 16 ) ));
}

template< class T > inline bool cas( volatile T * src, T cmp, T with );

template<> inline bool cas( volatile types::uint128_t * src, types::uint128_t cmp, types::uint128_t with )
{
    bool result;
    __asm__ __volatile__
    (
        "lock cmpxchg16b oword ptr %1\n\t"
        "setz %0"
        : "=q" ( result )
        , "+m" ( *src )
        , "+d" ( cmp.hi )
        , "+a" ( cmp.lo )
        : "c" ( with.hi )
        , "b" ( with.lo )
        : "cc"
    );
    return result;
}

int main()
{
    using namespace types;
    uint128_t test = { 0xdecafbad, 0xfeedbeef };
    uint128_t cmp = test;
    uint128_t with = { 0x55555555, 0xaaaaaaaa };
    return ! cas( & test, cmp, with );
}




  • Luke

  • 这篇关于CMPXCHG16B是否正确?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆