Which of these implementations of seqlock are correct?

Question

I am studying how seqlocks are implemented. However, every source I have found implements them differently.

The Linux kernel implements it like this:

static inline unsigned __read_seqcount_begin(const seqcount_t *s)
{
    unsigned ret;

repeat:
    ret = READ_ONCE(s->sequence);
    if (unlikely(ret & 1)) {
        cpu_relax();
        goto repeat;
    }
    return ret;
}

static inline unsigned raw_read_seqcount_begin(const seqcount_t *s)
{
    unsigned ret = __read_seqcount_begin(s);
    smp_rmb();
    return ret;
}

Basically, it uses a volatile read plus a read barrier, which gives acquire semantics on the reader side.

When it is used, the subsequent reads are unprotected:

struct Data {
    u64 a, b;
};

// ...
read_seqcount_begin(&seq);
int v1 = d.a, v2 = d.b;
// ...

rigtorp/Seqlock

RIGTORP_SEQLOCK_NOINLINE T load() const noexcept {
  T copy;
  std::size_t seq0, seq1;
  do {
    seq0 = seq_.load(std::memory_order_acquire);
    std::atomic_signal_fence(std::memory_order_acq_rel);
    copy = value_;
    std::atomic_signal_fence(std::memory_order_acq_rel);
    seq1 = seq_.load(std::memory_order_acquire);
  } while (seq0 != seq1 || seq0 & 1);
  return copy;
}

The data is still loaded without an atomic operation or other protection. However, an atomic_signal_fence with acquire-release semantics is added before the read, in contrast to the rmb with acquire semantics in the kernel.

pub fn read(&self) -> T {
    loop {
        // Load the first sequence number. The acquire ordering ensures that
        // this is done before reading the data.
        let seq1 = self.seq.load(Ordering::Acquire);

        // If the sequence number is odd then it means a writer is currently
        // modifying the value.
        if seq1 & 1 != 0 {
            // Yield to give the writer a chance to finish. Writing is
            // expected to be relatively rare anyways so this isn't too
            // performance critical.
            thread::yield_now();
            continue;
        }

        // We need to use a volatile read here because the data may be
        // concurrently modified by a writer.
        let result = unsafe { ptr::read_volatile(self.data.get()) };

        // Make sure the seq2 read occurs after reading the data. What we
        // ideally want is a load(Release), but the Release ordering is not
        // available on loads.
        fence(Ordering::Acquire);

        // If the sequence number is the same then the data wasn't modified
        // while we were reading it, and can be returned.
        let seq2 = self.seq.load(Ordering::Relaxed);
        if seq1 == seq2 {
            return result;
        }
    }
}

There is no memory barrier between loading seq and loading data; instead, a volatile read is used here.

T reader() {
  int r1, r2;
  unsigned seq0, seq1;
  do {
    seq0 = seq.load(m_o_acquire);
    r1 = data1.load(m_o_relaxed);
    r2 = data2.load(m_o_relaxed);
    atomic_thread_fence(m_o_acquire);
    seq1 = seq.load(m_o_relaxed);
  } while (seq0 != seq1 || seq0 & 1);
  // do something with r1 and r2;
}

This is similar to the Rust implementation, but it uses atomic operations on the data instead of volatile_read.

This paper claims:

In the general case, there are good semantic reasons to require that all data accesses inside such a seqlock "critical section" must be atomic. If we read a pointer p as part of reading the data, and then read *p as well, the code inside the critical section may read from a bad address if the read of p happened to see a half-updated pointer value. In such cases, there is probably no way to avoid reading the pointer with a conventional atomic load, and that's exactly what's desired.

However, in many cases, particularly in the multiple process case, seqlock data consists of a single trivially copyable object, and the seqlock "critical section" consists of a simple copy operation. Under normal circumstances, this could have been written using memcpy. But that's unacceptable here, since memcpy does not generate atomic accesses, and is (according to our specification anyway) susceptible to data races.

Currently to write such code correctly, we need to basically decompose such data into many small lock-free atomic subobjects, and copy them a piece at a time. Treating the data as a single large atomic object would defeat the purpose of the seqlock, since the atomic copy operation would acquire a conventional lock. Our proposal essentially adds a convenient library facility to automate this decomposition into small objects.
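To make the "many small lock-free atomic subobjects" idea concrete, here is a minimal sketch of a seqlock that stores a trivially copyable object as an array of 64-bit atomic words and copies it one word at a time. The names WordwiseSeqlock and Sample are hypothetical, and the sketch assumes a single writer, a default-constructible T, and that std::atomic<std::uint64_t> is lock-free on the target. It is not the library facility the paper proposes, only an illustration of the manual decomposition the quote describes.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical wrapper: keeps a trivially copyable T as 64-bit atomic words.
template <typename T>
class WordwiseSeqlock {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    static constexpr std::size_t kWords = (sizeof(T) + 7) / 8;

    std::atomic<unsigned> seq_{0};
    std::array<std::atomic<std::uint64_t>, kWords> words_{};

public:
    // Assumes exactly one writer thread. The counter is odd while the
    // words are being updated and even again once the update is complete.
    void store(const T& value) {
        std::uint64_t buf[kWords] = {};
        std::memcpy(buf, &value, sizeof(T));
        unsigned s = seq_.load(std::memory_order_relaxed);
        seq_.store(s + 1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_release);
        for (std::size_t i = 0; i < kWords; ++i)
            words_[i].store(buf[i], std::memory_order_relaxed);
        seq_.store(s + 2, std::memory_order_release);
    }

    // Copies the words with relaxed atomic loads, so a concurrent store()
    // is not a data race; a torn snapshot is detected and retried.
    T load() const {
        std::uint64_t buf[kWords];
        unsigned s0, s1;
        do {
            s0 = seq_.load(std::memory_order_acquire);
            for (std::size_t i = 0; i < kWords; ++i)
                buf[i] = words_[i].load(std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_acquire);
            s1 = seq_.load(std::memory_order_relaxed);
        } while (s0 != s1 || (s0 & 1u));
        T out;
        std::memcpy(&out, buf, sizeof(T));
        return out;
    }
};

// Example payload, mirroring the two-field struct used in the question.
struct Sample {
    std::uint64_t a;
    std::uint64_t b;
};
```

The reader follows the same acquire load, relaxed data loads, acquire fence, relaxed load shape as the reader() example above; the only change is that the data loads are generated by a loop over words rather than written out per field.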

My questions

  1. Which of the above implementations are correct? Which are correct but inefficient?
  2. Can the volatile_read be reordered before the acquire-read of the seqlock?

Recommended answer

Your quotes from Linux seem wrong.

According to https://www.kernel.org/doc/html/latest/locking/seqlock.html, the read path is:

do {
        seq = read_seqcount_begin(&foo_seqcount);

        /* ... [[read-side critical section]] ... */

} while (read_seqcount_retry(&foo_seqcount, seq));

If you look at the GitHub link posted in the question, you'll find a comment that describes nearly the same process.

It seems that you are looking at only one part of the read process. The linked file implements the building blocks you need to write readers and writers, but not the readers and writers themselves.
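As a rough illustration of how the two halves fit together, here is a C++ sketch of a begin/retry pair and a reader loop built on them. SeqCount, Data, read_begin, read_retry, write_pair and read_sum are hypothetical stand-ins, not the kernel's types or functions; the acquire fences only approximate smp_rmb(), and a single writer is assumed.

```cpp
#include <atomic>

// Hypothetical stand-ins; these are not the kernel's definitions.
struct SeqCount {
    std::atomic<unsigned> sequence{0};
};

struct Data {
    std::atomic<int> a{0};
    std::atomic<int> b{0};
};

// First half: wait for an even counter, then order the data reads after it.
inline unsigned read_begin(const SeqCount& s) {
    unsigned seq;
    while ((seq = s.sequence.load(std::memory_order_relaxed)) & 1u) {
        // A writer is active; keep spinning until it finishes.
    }
    std::atomic_thread_fence(std::memory_order_acquire);
    return seq;
}

// Second half: order the data reads before the re-check, then report
// whether the counter moved, meaning a writer may have interfered.
inline bool read_retry(const SeqCount& s, unsigned start) {
    std::atomic_thread_fence(std::memory_order_acquire);
    return s.sequence.load(std::memory_order_relaxed) != start;
}

// Single-writer update: odd counter during the write, even afterwards.
inline void write_pair(SeqCount& s, Data& d, int a, int b) {
    unsigned seq = s.sequence.load(std::memory_order_relaxed);
    s.sequence.store(seq + 1, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_release);
    d.a.store(a, std::memory_order_relaxed);
    d.b.store(b, std::memory_order_relaxed);
    s.sequence.store(seq + 2, std::memory_order_release);
}

// The reader itself: the do/while shape from the kernel documentation.
inline int read_sum(const SeqCount& s, const Data& d) {
    int a, b;
    unsigned seq;
    do {
        seq = read_begin(s);
        a = d.a.load(std::memory_order_relaxed);
        b = d.b.load(std::memory_order_relaxed);
    } while (read_retry(s, seq));
    return a + b;
}
```

The begin function alone, as quoted in the question, only guarantees that no writer was active when the read started. It is the retry check at the end of the loop that detects a writer that arrived while the data was being read.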

Also note this comment from the top of the file:

* The seqlock seqcount_t interface does not prescribe a precise sequence of
* read begin/retry/end. For readers, typically there is a call to
* read_seqcount_begin() and read_seqcount_retry(), however, there are more
* esoteric cases which do not follow this pattern.
