Duplicate entries in High Replication Datastore


Problem description


We still see rare cases of duplicate entries when this POST method is called. I previously asked for advice on Stack Overflow and was given a solution: use the parent/child (ancestor) approach so that queries are strongly consistent.

I migrated all data to that structure and let it run for another 3 months. However, the problem was never solved.

The problem lies in the conditional if recordsdb.count() == 1: it should be true so that the existing entry is updated, but the HRD does not always find the latest entry, so a new entry is created instead.

As you can see, we are writing/reading from the Record via Parent/Child methodology as recommended:

new_record = FeelTrackerRecord(parent=user.key,...)

And yet, upon retrieval, the HRD still doesn't always fetch the latest entry:

recordsdb = FeelTrackerRecord.query(ancestor = user.key).filter(FeelTrackerRecord.record_date == ... )

So we are quite stuck on this and don't know how to solve it.

    @requires_auth
    def post(self, ios_sync_timestamp):
        user = User.query(User.email == request.authorization.username).fetch(1)[0]
        if user:
            json_records = request.json['records']
            for json_record in json_records:
                recordsdb = FeelTrackerRecord.query(ancestor = user.key).filter(FeelTrackerRecord.record_date == date_parser.parse(json_record['record_date']))
                if recordsdb.count() == 1:
                    rec = recordsdb.fetch(1)[0]
                    if 'timestamp' in json_record:
                        if rec.timestamp < json_record['timestamp']:
                            rec.rating = json_record['rating']
                            rec.notes = json_record['notes']
                            rec.timestamp = json_record['timestamp']
                            rec.is_deleted = json_record['is_deleted']
                            rec.put()
                elif recordsdb.count() == 0:
                    new_record = FeelTrackerRecord(parent=user.key,
                                        user=user.key, 
                                        record_date = date_parser.parse(json_record['record_date']), 
                                        rating = json_record['rating'], 
                                        notes = json_record['notes'], 
                                        timestamp = json_record['timestamp'])
                    new_record.put()
                else:
                    raise Exception('Got more than one record for the same record date - during REST post')
            user.last_sync_timestamp = create_timestamp(datetime.datetime.today())
            user.put()
            return '', 201
        else:
            return '', 401

Possible Solution:

The last idea I have for solving this is to step away from the parent/child strategy and use user.key plus the date string as part of the key.

Saving:

new_record = FeelTrackerRecord(id=str(user.key) + json_record['record_date'], ...)
new_record.put()

Loading:

key = ndb.Key(FeelTrackerRecord, str(user.key) +  json_record['record_date'])
record = key.get()

Now I could check whether record is None: if it is, I create a new entry; otherwise I update the existing one. Hopefully the HRD then has no reason to miss the record. What do you think, is this a guaranteed solution?
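One detail of the key-based idea can be sketched in plain Python: the id is only useful if every request for the same record produces exactly the same string. The helper below is hypothetical (record_id is not part of NDB) and assumes the client sends record_date in ISO 8601 form; it normalises the date before concatenating, so two spellings of the same moment cannot end up under two different ids.

```python
from datetime import datetime


def record_id(user_id, record_date):
    """Build a deterministic id from a user id and a record date string.

    Hypothetical helper: user_id stands for whatever stable string
    identifies the user, record_date is assumed to be ISO 8601.
    """
    # Parse and re-serialise so that "2013-05-01" and
    # "2013-05-01 00:00:00" collapse to one canonical form.
    canonical = datetime.fromisoformat(record_date).isoformat()
    return "{}|{}".format(user_id, canonical)


same_a = record_id("user-42", "2013-05-01")
same_b = record_id("user-42", "2013-05-01 00:00:00")
other = record_id("user-42", "2013-05-02")
```

A lookup by key avoids the query, but a check followed by a put is still two separate steps unless both run inside one transaction.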

Solution

The Possible Solution appears to have the same problem as the original code. Imagine the race condition if two servers execute the same instructions practically simultaneously. With Google's overprovisioning, that is sure to happen once in a while.
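The race described here can be replayed deterministically with a toy in-memory store. This is only an illustration: FakeStore and its methods are made up for the sketch and are not the App Engine API. Both requests run their existence check before either one writes, so both take the "create" branch.

```python
class FakeStore:
    """In-memory stand-in for a datastore (illustrative only)."""

    def __init__(self):
        self.records = []

    def count(self, user, date):
        # Number of stored records for this user and date.
        return sum(1 for r in self.records if r == (user, date))

    def insert(self, user, date):
        self.records.append((user, date))


store = FakeStore()

# Step 1: two concurrent requests both check before either has written.
seen_by_a = store.count("alice", "2013-05-01")
seen_by_b = store.count("alice", "2013-05-01")

# Step 2: each request acts on its own, now stale, answer.
if seen_by_a == 0:
    store.insert("alice", "2013-05-01")
if seen_by_b == 0:
    store.insert("alice", "2013-05-01")

duplicates = store.count("alice", "2013-05-01")
```

Both checks were correct at the time they ran, which is why stronger read consistency alone does not remove the duplicate.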

A more robust solution should use transactions, with a rollback for when concurrency causes a consistency violation. The User entity should be the root of its own entity group. Increment a records counter field in the User entity within a transaction. Create the new FeelTrackerRecord only if the transaction completes successfully. Therefore the FeelTrackerRecord entities must have a User as parent.

Edit: In the case of your code, the following lines (written here in the style of the Java low-level API) would go before user = User.query(... :

Transaction txn = datastore.beginTransaction();
try {

and the following lines would go after user.put() :

    txn.commit();
} finally {
    if (txn.isActive()) {
        txn.rollback();
    }
}

That may overlook some flow-control nesting details; it is the concept that this answer is trying to describe.

With an active transaction, if multiple processes modify the same entity group concurrently (for example, multiple servers executing the same POST at the same time because of overprovisioning), the first process will succeed with its put and commit, while the second process will throw the documented ConcurrentModificationException.

Edit 2: The transaction that increments the counter (and may throw an exception) must also create the new record. That way if the exception is thrown, the new record is not created.
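The behaviour described in the edits above can be modelled in a few lines of plain Python. This is a toy model of optimistic concurrency, not Datastore code: EntityGroup, Transaction and ConcurrentModification are names invented for the sketch. In NDB the real counterpart would be to run the check, the counter update and the put together under ndb.transaction or the @ndb.transactional decorator.

```python
class ConcurrentModification(Exception):
    """Raised when the entity group changed after the transaction began."""


class EntityGroup:
    """Toy model of one user's entity group: a counter plus its records."""

    def __init__(self):
        self.version = 0
        self.record_count = 0
        self.records = {}


class Transaction:
    def __init__(self, group):
        self.group = group
        self.start_version = group.version  # snapshot taken at begin
        self.pending = {}

    def get(self, record_date):
        return self.group.records.get(record_date)

    def put(self, record_date, rating):
        self.pending[record_date] = rating  # buffered until commit

    def commit(self):
        # Reject the whole transaction if someone else committed first.
        if self.group.version != self.start_version:
            raise ConcurrentModification("entity group changed, retry")
        created = [d for d in self.pending if d not in self.group.records]
        # Counter and records are applied together, or not at all.
        self.group.records.update(self.pending)
        self.group.record_count += len(created)
        self.group.version += 1


group = EntityGroup()
txn_a, txn_b = Transaction(group), Transaction(group)
for txn in (txn_a, txn_b):
    if txn.get("2013-05-01") is None:  # both see "no record yet"
        txn.put("2013-05-01", 3)

txn_a.commit()  # first committer wins
try:
    txn_b.commit()
    second_failed = False
except ConcurrentModification:
    second_failed = True

# The losing request retries and now takes the update path.
retry = Transaction(group)
if retry.get("2013-05-01") is not None:
    retry.put("2013-05-01", 4)
retry.commit()
```

Because the losing transaction writes nothing, the retry finds the record created by the winner and updates it instead of inserting a second one.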
