OpenMP lock vs. critical

Question

I'm experimenting with locks and critical sections to make a loop thread-safe. Here is the code:

#pragma omp parallel for num_threads(4) private(k, f_part_k, len, len_3, mg, fact)
for (k = part+1; k < n; k++) {
  /* Compute force on part due to k */
  f_part_k[X] = curr[part].s[X] - curr[k].s[X];
  f_part_k[Y] = curr[part].s[Y] - curr[k].s[Y];
  len = sqrt(f_part_k[X]*f_part_k[X] + f_part_k[Y]*f_part_k[Y]);
  len_3 = len*len*len;
  mg = -G*curr[part].m*curr[k].m;
  fact = mg/len_3;
  f_part_k[X] *= fact;
  f_part_k[Y] *= fact;

  /* Add force in to total forces */
  omp_set_lock(&(locks[k]));
  //#pragma omp critical
  {
      forces[part][X] += f_part_k[X];
      forces[part][Y] += f_part_k[Y];
      forces[k][X] -= f_part_k[X];
      forces[k][Y] -= f_part_k[Y];
  }
  omp_unset_lock(&(locks[k]));
}

for (i = 0; i < n; i++)
    omp_destroy_lock(&(locks[i]));
} 

When I use only the commented-out critical directive, the results are fine, i.e. they match those of the sequential version. However, if I use locks as shown in the code, the results are way off. I must have misunderstood the concept of locks, because I expected this lock approach to make write access to the forces array safe. Could you point me in the right direction?

Recommended answer

I think the problem with your code is a race condition on:

omp_set_lock(&(locks[k]));
{
    forces[part][X] += f_part_k[X]; // Race condition for different k
    forces[part][Y] += f_part_k[Y]; // Race condition for different k
    forces[k][X] -= f_part_k[X]; 
    forces[k][Y] -= f_part_k[Y]; 
}
omp_unset_lock(&(locks[k]));

In fact, for different values of k, multiple threads try to write to forces[part][X] and forces[part][Y]. Because each of them holds a different lock (locks[k]), nothing serializes those writes. Furthermore, I think there is no need to explicitly synchronize access to forces[k][X] and forces[k][Y], as each thread updates its own k.
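To make the point about lock indexing concrete, here is a minimal sketch of a lock-based variant in which every write to forces[i] is made while holding locks[i]. The helper name add_forces_locked, the precomputed per-pair array f, and the serial stand-ins for the lock routines are assumptions made for illustration; none of them appear in the question's code.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption) so the sketch also builds without OpenMP. */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_set_lock(omp_lock_t *l)     { *l = 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

enum { X, Y, DIM };

/* Hypothetical helper: adds f[k] to forces[part] and subtracts it from
   forces[k] for every k > part. Each write to forces[i] happens while
   holding locks[i], the lock that belongs to the row being written.
   The caller initializes and destroys the n locks. */
static void add_forces_locked(int part, int n, double (*f)[DIM],
                              double (*forces)[DIM], omp_lock_t *locks)
{
    int k;
    assert(part >= 0 && part < n);
#pragma omp parallel for
    for (k = part + 1; k < n; k++) {
        omp_set_lock(&locks[part]);   /* same lock in every iteration */
        forces[part][X] += f[k][X];
        forces[part][Y] += f[k][Y];
        omp_unset_lock(&locks[part]);

        omp_set_lock(&locks[k]);      /* uncontended within this loop */
        forces[k][X] -= f[k][X];
        forces[k][Y] -= f[k][Y];
        omp_unset_lock(&locks[k]);
    }
}
```

In this particular loop every iteration contends for locks[part], so the lock serializes the shared update much as a critical section would. The locks[k] acquisition is uncontended here and is kept only to show the indexing rule.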

If you want to experiment with different synchronization constructs that give the right semantics, you could try:

Atomic synchronization

#pragma omp atomic
forces[part][X] += f_part_k[X];
#pragma omp atomic
forces[part][Y] += f_part_k[Y];

forces[k][X] -= f_part_k[X]; 
forces[k][Y] -= f_part_k[Y]; 

Explicit locking

omp_set_lock(&lock);
{
  forces[part][X] += f_part_k[X];
  forces[part][Y] += f_part_k[Y];
}
omp_unset_lock(&lock);

forces[k][X] -= f_part_k[X]; 
forces[k][Y] -= f_part_k[Y];

Named critical section

#pragma omp critical(PART)
{
  forces[part][X] += f_part_k[X];
  forces[part][Y] += f_part_k[Y];
}
forces[k][X] -= f_part_k[X]; 
forces[k][Y] -= f_part_k[Y]; 

I suggest you read the definitions of the critical and atomic constructs here (sections 2.8.2 and 2.8.5), and have a look at examples A.19.1c, A.22.* and A.45.1c.

That said, in the case you presented I would try the following:

float fredx = 0.0f;
float fredy = 0.0f;
#pragma omp parallel for private(k, f_part_k, len, len_3, mg, fact) reduction(+:fredx,fredy)
for (k = part+1; k < n; k++) {
  /* Compute force on part due to k */
  f_part_k[X] = curr[part].s[X] - curr[k].s[X];
  f_part_k[Y] = curr[part].s[Y] - curr[k].s[Y];
  len = sqrt(f_part_k[X]*f_part_k[X] + f_part_k[Y]*f_part_k[Y]);
  len_3 = len*len*len;
  mg = -G*curr[part].m*curr[k].m;
  fact = mg/len_3;
  f_part_k[X] *= fact;
  f_part_k[Y] *= fact;

  /* Add force in to total forces */
  fredx += f_part_k[X];
  fredy += f_part_k[Y];

  forces[k][X] -= f_part_k[X];
  forces[k][Y] -= f_part_k[Y];            
}

forces[part][X] += fredx;
forces[part][Y] += fredy;

This avoids any explicit synchronization.
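One way to check such a reduction against the sequential version is sketched below. The function names, the precomputed per-pair array f, and the tolerance-based comparison are assumptions for illustration. A tolerance is used because a parallel reduction may add the terms in a different order than the serial loop, so floating-point results can differ in the last bits.

```c
#include <assert.h>

enum { X, Y, DIM };

/* Reduction variant: each thread sums into private copies of sum_x and
   sum_y, which OpenMP combines when the loop ends. */
static void add_forces_reduction(int part, int n, double (*f)[DIM],
                                 double (*forces)[DIM])
{
    double sum_x = 0.0, sum_y = 0.0;
    int k;
    assert(part >= 0 && part < n);
#pragma omp parallel for reduction(+:sum_x, sum_y)
    for (k = part + 1; k < n; k++) {
        sum_x += f[k][X];
        sum_y += f[k][Y];
        forces[k][X] -= f[k][X];   /* row k is written by one iteration only */
        forces[k][Y] -= f[k][Y];
    }
    forces[part][X] += sum_x;
    forces[part][Y] += sum_y;
}

/* Serial reference used for comparison. */
static void add_forces_serial(int part, int n, double (*f)[DIM],
                              double (*forces)[DIM])
{
    for (int k = part + 1; k < n; k++) {
        forces[part][X] += f[k][X];
        forces[part][Y] += f[k][Y];
        forces[k][X] -= f[k][X];
        forces[k][Y] -= f[k][Y];
    }
}

/* Returns 1 when every component of a and b differs by at most tol. */
static int forces_close(int n, double (*a)[DIM], double (*b)[DIM], double tol)
{
    for (int i = 0; i < n; i++) {
        for (int d = 0; d < DIM; d++) {
            double diff = a[i][d] - b[i][d];
            if (diff < 0.0) diff = -diff;
            if (diff > tol) return 0;
        }
    }
    return 1;
}
```

Running both functions on the same input and passing the two results to forces_close gives a quick sanity check; the tolerance should be chosen relative to the magnitude of the forces involved.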
