MongoDB findAndModify. Is it really atomic? Help writing a closed update solution


Question

I have Event documents, consisting of embedded Snapshots.

I want to add a Snapshot (A) to an existing Event if:

  • The event started within 5 minutes of Snapshot A
  • The event's latest snapshot is no older than one minute before Snapshot A.

Otherwise, create a new Event.

Here is my findAndModify query, which might make more sense:

// FIVE_MIN and ONE_MIN stand for the two durations, in the timestamp's unit.
Event.findAndModify({
  query: {
    start_timestamp: { $gte: newSnapshot.timestamp - FIVE_MIN },
    last_snapshot_timestamp: { $gte: newSnapshot.timestamp - ONE_MIN }
  },
  update: {
    // store the snapshot under snapshots.<timestamp>
    $set: { ["snapshots." + newSnapshot.timestamp]: newSnapshot },
    $max: { last_snapshot_timestamp: newSnapshot.timestamp },
    $min: { start_timestamp: newSnapshot.timestamp },
    $setOnInsert: { /* ALL OUR NEW EVENT FIELDS */ }
  },
  upsert: true
})

Unfortunately, I cannot create a unique index on start_timestamp. Snapshots come in with different timestamps, and I want to group them into an event. For example, Snapshot A comes in at 12:00:00 and Snapshot B comes in at 12:00:59. They should be in the same event, but they could be written to the DB at different times, because the workers writing them run concurrently. If another snapshot comes in at 12:00:30, it should be written to the same event as the two above. Finally, a snapshot at 12:02:00 should be written to a new event.

My question is: will this work correctly in a concurrent environment? Is findAndModify atomic? Is it possible that I create two events when I should have created one and added the snapshot to it?

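So, as @chainh kindly pointed out, the approach above does not guarantee that two events won't be created.

So I tried a new lock-based approach. Do you think this would work?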

var acquireLock = function() {
  var query = { "locked": false}
  var update = { $set: { "locked": true } }
  return Lock.findAndModify({
    query: query, 
    update: update,
    upsert: true
  })
};

var releaseLock = function() {
  var query = { "locked": true }
  var update = { $set: { "locked": false } }
  return Lock.findAndModify({
    query: query, 
    update: update
  })
};

var insertSnapshot = function(newSnapshot, upsert) {
  // FIVE_MIN and ONE_MIN stand for the two durations, in the timestamp's unit.
  return Event.findAndModify({
    query: {
      start_timestamp: { $gte: newSnapshot.timestamp - FIVE_MIN },
      last_snapshot_timestamp: { $gte: newSnapshot.timestamp - ONE_MIN }
    },
    update: {
      // store the snapshot under snapshots.<timestamp>
      $set: { ["snapshots." + newSnapshot.timestamp]: newSnapshot },
      $max: { last_snapshot_timestamp: newSnapshot.timestamp },
      $min: { start_timestamp: newSnapshot.timestamp },
      $setOnInsert: { /* ALL OUR NEW EVENT FIELDS */ }
    },
    upsert: upsert
  })
};

var safelyInsertEvent = function(snapshot) {
  // First try to update an existing event without upserting.
  return insertSnapshot(snapshot, false)
  .then(function(modifyRes) {
    if (modifyRes.succeeded) {
      return modifyRes   // an existing event took the snapshot, so no lock is needed
    }
    return acquireLock()
    .then(function(lockRes) {
      if (!lockRes.succeeded) {
        throw new AcquiringLockError("Didn't acquire lock. Try again")
      }
      return insertSnapshot(snapshot, true)
    })
    .then(function() {
      return releaseLock()
    })
  })
  .catch(AcquiringLockError, function(err) {
    return safelyInsertEvent(snapshot)
  })
};

The lock document would contain a single field (locked). The code above first tries to find an existing event and update it. If that works, great, we can bail out. If nothing was updated, we know there is no existing event to put the snapshot in. We then acquire the lock atomically and, if that succeeds, we can safely upsert a new event. If acquiring the lock fails, we simply retry the whole process, hoping that by then an existing event is available to take the snapshot.

Recommended answer

According to your code:

// FIVE_MIN and ONE_MIN stand for the two durations, in the timestamp's unit.
Event.findAndModify({
  query: {
    start_timestamp: { $gte: newSnapshot.timestamp - FIVE_MIN },
    last_snapshot_timestamp: { $gte: newSnapshot.timestamp - ONE_MIN }
  },
  update: {
    // store the snapshot under snapshots.<timestamp>
    $set: { ["snapshots." + newSnapshot.timestamp]: newSnapshot },
    $max: { last_snapshot_timestamp: newSnapshot.timestamp },
    $min: { start_timestamp: newSnapshot.timestamp },
    $setOnInsert: { /* ALL OUR NEW EVENT FIELDS */ }
  },
  upsert: true
})

When the first Event document is successfully inserted into the database, its fields satisfy:
start_timestamp == last_snapshot_timestamp

After subsequent updates, the relationship becomes:
start_timestamp < last_snapshot_timestamp < last_snapshot_timestamp + 1min < start_timestamp + 5min
OR
start_timestamp < last_snapshot_timestamp < start_timestamp + 5min < last_snapshot_timestamp + 1min

So, for a new snapshot to keep being inserted into this Event document, its timestamp must satisfy:
newSnapshot.timestamp <= Math.min(last_snapshot_timestamp + 1min, start_timestamp + 5min)
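
As a minimal sketch of that condition, the two query clauses can be rewritten as a single upper bound on the snapshot timestamp. Because the query uses $gte, the bound in this sketch is inclusive. The names GAP5, GAP1, matchesQuery and latestAccepted are illustrative, and millisecond timestamps are an assumption.

```javascript
// Assumed durations in milliseconds (the question only says "5min" and "1min").
const GAP5 = 5 * 60 * 1000;
const GAP1 = 1 * 60 * 1000;

// Mirrors the query:
//   start_timestamp >= ts - 5min AND last_snapshot_timestamp >= ts - 1min
function matchesQuery(ev, ts) {
  return ev.start_timestamp >= ts - GAP5 &&
         ev.last_snapshot_timestamp >= ts - GAP1;
}

// Latest snapshot timestamp that the event still accepts.
function latestAccepted(ev) {
  return Math.min(ev.last_snapshot_timestamp + GAP1, ev.start_timestamp + GAP5);
}

const sampleEvent = { start_timestamp: 0, last_snapshot_timestamp: 30 * 1000 };
console.log(latestAccepted(sampleEvent));              // 90000, limited by the one-minute rule
console.log(matchesQuery(sampleEvent, 90 * 1000));     // true
console.log(matchesQuery(sampleEvent, 90 * 1000 + 1)); // false
```

For any timestamp ts, matchesQuery(ev, ts) is true exactly when ts <= latestAccepted(ev).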

Suppose that, over time, there are two Event documents in the database:
Event1 (start_timestamp1, last_snapshot_timestamp1),
Event2 (start_timestamp2, last_snapshot_timestamp2)
Generally, start_timestamp2 > last_snapshot_timestamp1

Now, if a new snapshot arrives whose timestamp is less than start_timestamp1 (suppose this can happen through latency or forgery), it can be inserted into either Event document. So I wonder whether you need another condition in the query to make sure the distance between last_snapshot_timestamp and start_timestamp always stays below a certain value (e.g. 5min). For example, I would change the query to:

  query: { 
        start_timestamp: { $gte: newSnapshot.timestamp - 5min },
        last_snapshot_timestamp: { $gte: newSnapshot.timestamp - 1min , $lte : newSnapshot.timestamp + 5}
      }
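
To make the concern concrete, here is a small in-memory sketch with two events and one late snapshot. It only models the query predicates, not MongoDB itself. The sample timestamps are made up, and the "+ 5" in the changed query is read as five minutes, which is an assumption.

```javascript
const MIN = 60 * 1000; // assumed: timestamps are in milliseconds

// The query from the question.
function originalQuery(ev, ts) {
  return ev.start_timestamp >= ts - 5 * MIN &&
         ev.last_snapshot_timestamp >= ts - 1 * MIN;
}

// The same query plus the upper bound on last_snapshot_timestamp.
function boundedQuery(ev, ts) {
  return originalQuery(ev, ts) &&
         ev.last_snapshot_timestamp <= ts + 5 * MIN;
}

// Two hypothetical events, with start_timestamp2 > last_snapshot_timestamp1.
const events = [
  { name: "Event1", start_timestamp: 10 * MIN, last_snapshot_timestamp: 11 * MIN },
  { name: "Event2", start_timestamp: 13 * MIN, last_snapshot_timestamp: 14 * MIN }
];

const lateTs = 8.5 * MIN; // a delayed snapshot, older than both events

const matchedOriginal = events.filter(e => originalQuery(e, lateTs)).map(e => e.name);
const matchedBounded = events.filter(e => boundedQuery(e, lateTs)).map(e => e.name);

console.log(matchedOriginal); // [ 'Event1', 'Event2' ]
console.log(matchedBounded);  // [ 'Event1' ]
```

With the original query the late snapshot matches both events. With the extra bound, an event is only matched if its last snapshot lies within five minutes after the new timestamp, so the event's span cannot grow past that value once $min moves start_timestamp back.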

OK, let's continue.
If I were solving this problem, I would still try to build a unique index on the start_timestamp field. According to the MongoDB manual, both findAndModify and update do their work atomically on a single document. The headache is how to handle a duplicate value, because newSnapshot.timestamp is outside our control and may modify start_timestamp through the $min operator.

The approach is:

  1. Several threads try to create (upsert) a new Event document because no document satisfies the query condition;
  2. One thread succeeds in creating the new Event document with a certain newSnapshot.timestamp value, and the others fail because of the unique index on the start_timestamp field;
  3. The other threads retry (now as an update instead of an upsert) and succeed in updating the existing Event document;
  4. If an update (not an upsert) makes the $min operator modify start_timestamp and, by coincidence, newSnapshot.timestamp equals the start_timestamp of another existing Event document, the update fails because of the unique index. But we get the error message, so we know an Event document exists whose start_timestamp is exactly newSnapshot.timestamp. We can then simply insert newSnapshot into that Event document, because it definitely satisfies the condition.

Since the Event document does not need to be returned, I use update instead of findAndModify: both are atomic operations, and update is simpler to write in this case.
I use plain JavaScript (run in the mongo shell) to express the steps (I'm not familiar with the code syntax you used. :D), and I think you can follow it easily.

var gap5 = 5 * 60 * 1000;   // 5 minutes in ms; change accordingly if your timestamps use another unit
var gap1 = 1 * 60 * 1000;   // 1 minute in ms
var initialFields = {};     // ALL OUR NEW EVENT FIELDS

// Add the snapshot to a matching Event document, or upsert a new one.
function insertSnapshotIfStartTimeStampNotExisted(newSnapshot) {
    var query = {
        start_timestamp: { $gte: newSnapshot.timestamp - gap5 },
        last_snapshot_timestamp: { $gte: newSnapshot.timestamp - gap1 }
    };
    var update = {
        $push: { snapshots: newSnapshot }, // suppose snapshots is an array
        $max: { last_snapshot_timestamp: newSnapshot.timestamp },
        $min: { start_timestamp: newSnapshot.timestamp },
        $setOnInsert: initialFields
    };

    var result = db.Event.update(query, update, { upsert: true });
    if (result.nUpserted == 0 && result.nModified == 0) {
        // Rejected by the unique index: an Event document already has that start_timestamp.
        insertSnapshotIfStartTimeStampExisted(newSnapshot);
    }
}

// Push the snapshot into the Event document whose start_timestamp equals its timestamp.
function insertSnapshotIfStartTimeStampExisted(newSnapshot) {
    var query = {
        start_timestamp: newSnapshot.timestamp
    };
    var update = {
        $push: { snapshots: newSnapshot }
    };

    var result = db.Event.update(query, update, { upsert: false });
    if (result.nModified == 0) {
        // start_timestamp may just have been modified by another writer; it's possible.
        insertSnapshotIfStartTimeStampNotExisted(newSnapshot);
    }
}

// entry (newSnapshot is the incoming snapshot document)
db.Event.createIndex({ start_timestamp: 1 }, { unique: true });
insertSnapshotIfStartTimeStampNotExisted(newSnapshot);
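
The two functions above call each other without a limit. One way to bound the retries is sketched below. It assumes the caller wraps the two updates in callbacks and that a unique-index rejection surfaces as a thrown error with code 11000 (MongoDB's duplicate key error code), as the Node.js driver does. insertWithRetry, tryUpsert and tryExactMatch are hypothetical names, not part of any MongoDB API.

```javascript
const DUPLICATE_KEY = 11000; // MongoDB duplicate key error code

// tryUpsert:     performs the upsert; returns true if it wrote the snapshot,
//                throws an error with code 11000 if the unique index rejects it.
// tryExactMatch: pushes into the event whose start_timestamp equals the
//                snapshot timestamp; returns true if a document was modified.
function insertWithRetry(tryUpsert, tryExactMatch, maxAttempts) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      if (tryUpsert()) return { ok: true, path: "upsert", attempt: attempt };
    } catch (err) {
      if (err.code !== DUPLICATE_KEY) throw err; // unrelated failure, do not retry
      if (tryExactMatch()) return { ok: true, path: "exact", attempt: attempt };
      // start_timestamp moved in the meantime: loop and try the upsert again
    }
  }
  return { ok: false, path: null, attempt: maxAttempts };
}

// Usage with stand-in callbacks: the first upsert hits a duplicate key, the
// exact-match update misses, and the second upsert succeeds.
let calls = 0;
const result = insertWithRetry(
  function () {
    calls++;
    if (calls === 1) {
      const err = new Error("E11000 duplicate key error");
      err.code = DUPLICATE_KEY;
      throw err;
    }
    return true;
  },
  function () { return false; },
  5
);
console.log(result);
```

Returning a result object instead of recursing lets the caller decide what to do when the attempts run out, for example log the snapshot and retry later.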
