Impossible constraint with cmpxchg16b in extended assembly


Problem description

I am trying to write inline assembly in my C code to perform a compare-and-swap operation. My code is:

typedef struct node {
    int data;
    struct node * next;
    struct node * backlink;
    int flag;
    int mark;
} node_lf;

typedef struct searchfrom {
    node_lf * current;
    node_lf * next;
} return_sf;

typedef struct csArg {
    node_lf * node;
    int mark;
    int flag;
} cs_arg;

typedef struct return_tryFlag {
    node_lf * node;
    int result;
} return_tf;

static inline node_lf cs(node_lf * address, cs_arg *old_val, cs_arg *new_val)
{
    node_lf value = *address;
    __asm__ __volatile__("lock; cmpxchg16b %0; setz %1;"
                         :"=m"(*(volatile node_lf *)address),
                          "=q"(value)
                         :"m"(*(volatile node_lf *)address),
                          "a"(old_val->mark), "d"(old_val->flag),
                          "b"(new_val->mark), "c"(new_val->flag)
                         :"memory");
    return value;
}

GCC gives this error when the code is compiled:

linkedlist.c: In function 'cs': linkedlist.c:45:3: error: impossible constraint in 'asm' __asm__ __volatile__("lock; cmpxchg16b %0; setz %1;":"=m"(*(volatile node_lf

What is wrong with my code, and how can I fix it?

I am trying to implement the equivalent of this code:

node_lf cs (node_lf * address, cs_arg *old_val, cs_arg *new_val ) { 
    node_lf value = *address; 
    if (value.next == old_val->node && value.mark == old_val->mark && 
        value.flag == old_val->flag) { 
        address->next = new_val->node; 
        address->mark = new_val->mark; 
        address->flag = new_val->flag; 
    } 
    return value; 
}

Solution

So, let's take a crack at this.

A few points before we get started:

  1. Using inline asm is a bad idea. It is hard to write, it is hard to write correctly, it is hard to maintain, it isn't portable to other compilers or platforms, etc. Unless this is an assignment requirement, don't do it.
  2. When performing cmpxchg operations, the fields to be compared/exchanged must be contiguous. So if you want to operate on next, flag and mark in a single operation, they must be next to each other in the structure.
  3. When performing cmpxchg operations, the fields must be aligned on an appropriately sized boundary. For example, if you plan to operate on 16 bytes, the data must be aligned on a 16-byte boundary. gcc provides a variety of ways to do this, from the aligned attribute to _mm_malloc.
  4. When using __sync_bool_compare_and_swap (a better choice than inline asm), you must cast the data types to an appropriately-sized integer.
  5. I'm assuming your platform is x64 (x86-64); cmpxchg16b is only available in 64-bit mode.

Points 2 and 3 required some changes to the field order of your structures. Note that I did not try to change searchfrom or return_tryFlag, since I'm not sure what they are used for.
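The layout requirements in points 2 and 3 can be checked at compile time. This is only a minimal sketch, assuming GCC or Clang on x86-64 with C11 support. The `aligned(16)` attribute and the helper `is_cas16_aligned` are illustrative additions, not part of the original code:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* uintptr_t */

typedef struct node {
    struct node *next;      /* bytes 0-7:   first half of the CAS target */
    int mark;               /* bytes 8-11 */
    int flag;               /* bytes 12-15: end of the 16-byte CAS target */
    struct node *backlink;
    int data;
} __attribute__((aligned(16))) node_lf;

typedef struct csArg {
    node_lf *node;
    int mark;
    int flag;
} cs_arg;

/* Point 5: the sketch assumes 64-bit pointers. */
static_assert(sizeof(void *) == 8, "expected a 64-bit platform");

/* Point 2: the compared fields are contiguous and mirror cs_arg exactly. */
static_assert(offsetof(node_lf, next) == offsetof(cs_arg, node), "next/node mismatch");
static_assert(offsetof(node_lf, mark) == offsetof(cs_arg, mark), "mark mismatch");
static_assert(offsetof(node_lf, flag) == offsetof(cs_arg, flag), "flag mismatch");
static_assert(sizeof(cs_arg) == 16, "cs_arg must be exactly 16 bytes");

/* Point 3: every node_lf object starts on a 16-byte boundary. */
static_assert(_Alignof(node_lf) == 16, "node_lf must be 16-byte aligned");

/* Hypothetical runtime check for pointers that come from an allocator. */
static int is_cas16_aligned(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}
```

If any of the assertions fails, the build stops before a misaligned or non-contiguous CAS can be attempted at run time.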

So, with those things in mind, here's what I came up with:

#include <stdbool.h>   /* bool */
#include <stdio.h>     /* printf */
#include <string.h>    /* memset, memcpy */

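typedef struct node {
    /* next, mark and flag form the contiguous 16 bytes that the CAS operates on */
    struct node * next;
    int mark;
    int flag;

    struct node * backlink;
    int data;
} __attribute__((aligned(16))) node_lf;   /* cmpxchg16b needs a 16-byte-aligned target */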

typedef struct csArg {
    node_lf * node;
    int mark;
    int flag;
} cs_arg;

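/* Build with gcc -mcx16 so the 16-byte builtin can be expanded to cmpxchg16b. */
bool cs3(node_lf * address, cs_arg *old_val, cs_arg *new_val) {
    /* Treat the first 16 bytes of *address (next, mark, flag) as one integer. */
    return __sync_bool_compare_and_swap((unsigned __int128 *)address,
                                        *(unsigned __int128 *)old_val,
                                        *(unsigned __int128 *)new_val);
}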

void ShowIt(void *v)
{
   /* Print the two 64-bit halves of a 16-byte value as zero-padded hex. */
   unsigned long long *ull = (unsigned long long *)v;
   printf("%016llx:%016llx", *ull, *(ull + 1));
}

int main(void)
{
   cs_arg oldval, newval;
   node_lf n;

   memset(&oldval, 0, sizeof(oldval));
   memset(&newval, 0, sizeof(newval));
   memset(&n, 0, sizeof(n));

   n.mark = 3;
   newval.mark = 4;

   bool b;

   do {
      printf("If "); ShowIt(&n); printf(" is "); ShowIt(&oldval); printf(" change to "); ShowIt(&newval);
      b = cs3(&n, &oldval, &newval);
      printf(". Result %d\n", b);

      if (b)
         break;
      memcpy(&oldval, &n, sizeof(cs_arg));
   } while (1);  
}

When you exit the loop, oldval holds what was there before (it has to, or the CAS would have failed and we would have looped again), and newval is what actually got written. Note that if this were truly multi-threaded, there is no guarantee that n still holds newval afterwards, since another thread could already have come along and changed it again.

For output we get:

If 0000000000000000:0000000000000003 is 0000000000000000:0000000000000000 change to 0000000000000000:0000000000000004. Result 0
If 0000000000000000:0000000000000003 is 0000000000000000:0000000000000003 change to 0000000000000000:0000000000000004. Result 1

Note that the CAS (correctly!) fails on the first attempt, since the 'old' value doesn't match the 'current' value.
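The fail-then-retry pattern seen in the output can be sketched on a single 64-bit word, which needs no special compiler flags. Note that this uses GCC's `__atomic_compare_exchange_n` builtin rather than the `__sync` builtin from the answer, because it refreshes the expected value automatically on failure. The helper names `fetch_or_with_cas` and `fetch_or_demo` are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: atomically OR `bits` into *word and return the value
 * that was stored there immediately before our update took effect. */
static uint64_t fetch_or_with_cas(uint64_t *word, uint64_t bits)
{
    uint64_t expected = __atomic_load_n(word, __ATOMIC_SEQ_CST);
    uint64_t desired;

    do {
        desired = expected | bits;
        /* On failure the builtin copies the current contents of *word into
         * `expected`, so the next iteration retries with fresh data. */
    } while (!__atomic_compare_exchange_n(word, &expected, desired,
                                          false, __ATOMIC_SEQ_CST,
                                          __ATOMIC_SEQ_CST));
    return expected;
}

/* Small single-threaded self-check of the helper. */
static void fetch_or_demo(void)
{
    uint64_t word = 3;
    uint64_t before = fetch_or_with_cas(&word, 4);
    assert(before == 3);
    assert(word == 7);
}
```

The loop plays the same role as the memcpy into oldval in the program above: after a failed attempt, the caller's idea of the old value is replaced by what is really in memory, and the swap is tried again.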

While using assembler may be able to save you an instruction or two, the win from the builtin in terms of readability, maintainability, portability, etc. is almost certainly worth the cost.

If for some reason you must use inline asm, you will still need to re-order your structs, and the point about alignment still stands. You can also look at http://stackoverflow.com/a/37825052/2189500. It only uses 8 bytes, but the concepts are the same.
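If inline asm really is required, a sketch along these lines may serve as a starting point. It assumes GCC or Clang on x86-64. The type `pair128` and the functions `cas16` and `cas16_demo` are hypothetical names, not part of the original code. The "impossible constraint" error in the question most likely comes from `"=q"(value)`, which asks for a byte register to hold an entire node_lf struct. Here the `q` output holds only the one-byte success flag, while rdx:rax carry the expected value and rcx:rbx the replacement, as cmpxchg16b requires:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 16-byte value: two 64-bit halves, 16-byte aligned. */
typedef struct {
    uint64_t lo;   /* e.g. the bits of the `next` pointer */
    uint64_t hi;   /* e.g. mark in the low 32 bits, flag in the high 32 bits */
} __attribute__((aligned(16))) pair128;

static_assert(sizeof(pair128) == 16, "pair128 must be exactly 16 bytes");

/* Returns true if *dst equalled *expected and was replaced by desired.
 * On failure, *expected is overwritten with the current contents of *dst. */
static inline bool cas16(pair128 *dst, pair128 *expected, pair128 desired)
{
    bool ok;
    __asm__ __volatile__(
        "lock cmpxchg16b %[mem]\n\t"   /* compare rdx:rax with the memory operand */
        "sete %[ok]"                   /* ZF is set when the exchange happened */
        : [mem] "+m"(*dst),
          [ok] "=q"(ok),
          "+a"(expected->lo),          /* rax: low half of the expected value */
          "+d"(expected->hi)           /* rdx: high half of the expected value */
        : "b"(desired.lo),             /* rbx: low half of the new value */
          "c"(desired.hi)              /* rcx: high half of the new value */
        : "cc", "memory");
    return ok;
}

/* Small single-threaded self-check. */
static void cas16_demo(void)
{
    pair128 target = { 1, 2 };
    pair128 expected = { 0, 0 };       /* deliberately stale */
    pair128 desired = { 10, 20 };

    bool first = cas16(&target, &expected, desired);
    assert(!first);                                   /* mismatch: nothing swapped */
    assert(expected.lo == 1 && expected.hi == 2);     /* refreshed by the instruction */

    bool second = cas16(&target, &expected, desired);
    assert(second);
    assert(target.lo == 10 && target.hi == 20);
}
```

Because the old value comes back through `expected`, a wrapper around `cas16` could return the previous contents the way the cs function in the question intends.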
