Celery SQS + Duplication of tasks + SQS visibility timeout

Question

Most of my Celery tasks have an ETA longer than the maximum visibility timeout defined by Amazon SQS.

The Celery documentation says:

This causes problems with ETA/countdown/retry tasks where the time to execute exceeds the visibility timeout; in fact if that happens it will be executed again, and again in a loop.

So you have to increase the visibility timeout to match the time of the longest ETA you’re planning to use.

It also says:

The maximum visibility timeout supported by AWS as of this writing is 12 hours (43200 seconds):
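
For reference, the visibility timeout the documentation talks about is set through Celery's `broker_transport_options` setting. The following is a minimal sketch of a configuration module, not a complete setup: the module name `celeryconfig.py` and the constant name `MAX_SQS_VISIBILITY_TIMEOUT` are illustrative, and the bare `sqs://` URL assumes AWS credentials are supplied by the environment.

```python
# celeryconfig.py (hypothetical module name)

# Bare SQS broker URL; assumes AWS credentials come from the environment.
broker_url = "sqs://"

# 12 hours, the maximum the quoted documentation says AWS supports.
MAX_SQS_VISIBILITY_TIMEOUT = 12 * 60 * 60  # 43200 seconds

# Any ETA/countdown longer than this value risks repeated redelivery.
broker_transport_options = {
    "visibility_timeout": MAX_SQS_VISIBILITY_TIMEOUT,
}
```

Raising the value to the maximum only moves the problem: tasks with an ETA beyond 12 hours are still affected.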

What should I do to avoid multiple executions of tasks in my workers if I am using SQS?

Recommended answer

Generally it's not a good idea to have tasks with very long ETAs.

First of all, there is the "visibility_timeout" issue. You probably don't want a very long visibility timeout either: if a worker crashes 1 minute before the task is due to run, the queue will still wait for the visibility timeout to expire before sending the task to another worker, and I guess you don't want that wait to be another month.

From the Celery docs:

Note that Celery will redeliver messages at worker shutdown, so having a long visibility timeout will only delay the redelivery of ‘lost’ tasks in the event of a power failure or forcefully terminated workers.

Also, SQS allows only a limited number of tasks to be waiting for acknowledgement at any one time.

SQS calls these tasks "in-flight messages". From http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-visibility-timeout.html:

A message is considered to be in flight after it's received from a queue by a consumer, but not yet deleted from the queue.

For standard queues, there can be a maximum of 120,000 inflight messages per queue. If you reach this limit, Amazon SQS returns the OverLimit error message. To avoid reaching the limit, you should delete messages from the queue after they're processed. You can also increase the number of queues you use to process your messages.

For FIFO queues, there can be a maximum of 20,000 inflight messages per queue. If you reach this limit, Amazon SQS returns no error messages.
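
A task that a worker has received and is holding until its ETA has not been deleted from the queue yet, so it presumably counts toward this in-flight limit. As a rough sketch, the current in-flight count can be read from the queue attribute `ApproximateNumberOfMessagesNotVisible`. The helper names below are made up for illustration, and `sqs_client` is assumed to be a boto3 SQS client (or anything exposing the same `get_queue_attributes` call).

```python
# Limit for standard queues, taken from the AWS text quoted above.
STANDARD_QUEUE_IN_FLIGHT_LIMIT = 120_000


def in_flight_count(sqs_client, queue_url):
    """Return the approximate number of in-flight messages for a queue."""
    response = sqs_client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["ApproximateNumberOfMessagesNotVisible"],
    )
    # SQS returns attribute values as strings.
    return int(response["Attributes"]["ApproximateNumberOfMessagesNotVisible"])


def in_flight_headroom(sqs_client, queue_url,
                       limit=STANDARD_QUEUE_IN_FLIGHT_LIMIT):
    """Return how many more messages can be in flight before the limit."""
    return limit - in_flight_count(sqs_client, queue_url)
```

With boto3 installed, a client could be created with `boto3.client("sqs")` and passed in together with the queue URL. The count is approximate, so it is best treated as a monitoring signal rather than an exact gauge.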

I see two possible solutions: you can either use RabbitMQ instead, which doesn't rely on visibility timeouts (there are "RabbitMQ as a service" offerings if you don't want to manage your own), or change your code to use really small ETAs (best practice).
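
The answer does not say how to get to small ETAs. One possible approach, shown here only as a sketch, is to have a task re-publish itself in short hops until its real due time arrives, so that no single message is held longer than the visibility timeout. The names `next_hop` and `MAX_HOP`, and the 30-minute hop length, are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed hop length; it should stay well below the queue's visibility timeout.
MAX_HOP = timedelta(minutes=30)


def next_hop(run_at, now, max_hop=MAX_HOP):
    """Return the countdown in seconds for the next hop, or 0 if the task is due."""
    remaining = run_at - now
    if remaining <= timedelta(0):
        return 0
    # Never wait longer than one hop, even if the due time is far away.
    return int(min(remaining, max_hop).total_seconds())


# How a bound Celery task might use it (sketch only, not executed here;
# `app` and `do_the_work` are hypothetical):
#
# @app.task(bind=True)
# def deferred(self, run_at_iso, payload):
#     run_at = datetime.fromisoformat(run_at_iso)
#     wait = next_hop(run_at, datetime.now(timezone.utc))
#     if wait > 0:
#         # Not due yet: re-publish with a short countdown and return,
#         # so the current message is acknowledged quickly.
#         self.apply_async(args=[run_at_iso, payload], countdown=wait)
#         return
#     do_the_work(payload)
```

The trade-off is extra messages: a task due in 5 hours passes through the queue about ten times with 30-minute hops. Storing due times in a database and dispatching from a periodic task would be another way to get the same effect.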

These are my 2 cents; maybe @asksol can provide some extra insights.
