如何在DynamoDB中有条件地执行SET操作 [英] How to conditionally execute a SET operation in DynamoDB

查看:178
本文介绍了如何在DynamoDB中有条件地执行SET操作的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在DynamoDb中有一个聚合表,其中包含以下列:idsumcountmaxminhash.我总是要更新sumcount,但是仅当我的值大于/小于数据库中已有的值时,我才想更新minmax.另外,我只希望此操作在存储的hash与我发送的内容不同时才能成功,以防止重新处理相同的数据.

I have an aggregations table in DynamoDb with the following columns: id, sum, count, max, min, and hash. I will ALWAYS want to update sum and count but will want to update min and max only when I have values greater than/lesser than the values already in the database. Also, I only want this operation to succeed when the stored hash is different from what I am sending, to prevent reprocessing the same data.

我目前有这些:

UpdateExpression:ADD sum :sum ADD count :count SET hash :hash
UpdateCondition:attribute_not_exists(hash) OR hash <> :hash

UpdateExpression: ADD sum :sum ADD count :count SET hash :hash
UpdateCondition: attribute_not_exists(hash) OR hash <> :hash

问题是对于minmax,我需要这样的东西: SET min :min IF :min < min以及类似的最大值当然,这目前不起作用.我找不到合适的更新功能来在DynamoDb中执行此比较.实现此目的的正确方法是什么?

The thing is that I need something like this for min and max: SET min :min IF :min < min and something alike for max. Of course, this doesn't currently work. I could not find a suitable update function that would perform this comparision in DynamoDb. What is the proper way to achieve this.

PS .:已经建议我对dynamodb进行多次请求,并将max/min设置为UpdateCondition s,但是出于数据一致性的原因,我想避免使用这些多次请求方法.

PS.: I already was suggested doing multiple requests to dynamodb and place the max/min as UpdateConditions, but I want to avoid these multiple requests approach for data consistency reasons.

PS2:另一种以JavaScript-sh方式表达我想要的内容的方式是SET :min < min ? :min : min

PS2.: Another way to express what I want in a JavaScript-sh way would be something like SET :min < min ? :min : min

推荐答案

我意识到自己无法实现自己想要的解决方案.整个更新必须只有一个条件,并且因为没有诸如SET min = minimum(:min, min)这样的事情,所以我不得不接受命运并向DynamoDB发出多个UpdateItem请求.

I got to a solution to this problem by realizing that what I wanted was just not possible. There must be just one condition to the entire update and since there is no such thing as SET min = minimum(:min, min) I had to accept my fate and make more than one UpdateItem request to DynamoDB.

令人高兴的是,这些更新的执行顺序无关紧要.这里的难点是确保每个更新仅执行一次.因为我们要发出很多请求(并最终达到峰值),所以很有可能由于ProvisionedThroughputExceededException或AWS的某些速率限制而导致某些更新失败.

The nice thing is that the order of execution of these updates doesn't matter. The hard thing here is to make sure that each update is executed exactly once. Because we are firing a lot of requests (and having peaks eventually) there is a real chance of some failing updates due to ProvisionedThroughputExceededException or maybe just some rate limiting from AWS.

这是我的最终解决方案;

So here is my final solution;

  1. Lambda函数接收具有数百个数据点的有效负载.
  2. Lambda函数将这些数据点聚集在内存中,并产生形式为{id, sum, count, min, max}的中间聚集对象.
  3. Lambda函数为每个聚合对象生成3个更新对象,其形式如下(这些更新引用同一记录):
  1. Lambda function receives payload with hundreds of data points.
  2. Lambda function aggregates this data points in memory and produces an intermediary aggregation object of the form {id, sum, count, min, max}.
  3. Lambda function generates 3 update objects per aggregation object, of the forms (these updates are referring to the same record):
  1. {UpdateExpression: 'ADD #SUM :sum, #COUNT :count'}
  2. {ConditionExpression: '#MAX < :max OR attribute_not_exists(#MAX)', UpdateExpression: 'SET #MAX = :max'}
  3. {ConditionExpression: '#MIN > :min OR attribute_not_exists(#MIN)', UpdateExpression: 'SET #MIN = :min'}
  1. {UpdateExpression: 'ADD #SUM :sum, #COUNT :count'}
  2. {ConditionExpression: '#MAX < :max OR attribute_not_exists(#MAX)', UpdateExpression: 'SET #MAX = :max'}
  3. {ConditionExpression: '#MIN > :min OR attribute_not_exists(#MIN)', UpdateExpression: 'SET #MIN = :min'}

  • 因为我们需要100%确保将始终成功处理这些更新,所以lambda函数会将其发送到 FIFO SQS队列(作为3条单独的消息).我不在这里使用FIFO队列,因为我希望保留订单,但是因为我希望

  • Because we need to be 100% sure that these updates will always be processed with success, then the lambda function sends them to a FIFO SQS queue (as 3 separate messages). I am not using a FIFO queue here because I want the order to be preserved but because I want the guarantee of exactly once delivery.
  • A consumer keeps pooling the queue and whenever there are messages it just shoots them to DynamoDB as the parameter of .updateItem.
  • 在此过程结束时,我能够对数千条记录进行实时汇总:)

    At the end of this process, I was able to do real-time aggregations for thousands of records :)

    PS .:摆脱了哈希列

    这篇关于如何在DynamoDB中有条件地执行SET操作的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆