Distributed task queues (e.g. Celery) vs crontab scripts

Question

I'm having trouble understanding the purpose of 'distributed task queues', for example Python's Celery library.

I know that in Celery you can schedule functions to run at set times. However, that can also be done easily with a Linux crontab entry pointed at a Python script.

As far as I know, and as my own django-celery web apps show, Celery consumes much more RAM than a plain crontab setup: a difference of a few hundred MB for a relatively small app.

Can someone please help me understand this distinction? A high-level explanation of how task queues and crontabs work in general would also be nice.

Thank you.

Answer

It depends on what you want your tasks to do, whether you need to distribute them, and how you want to manage them.

A crontab entry runs a script at a fixed interval. The script runs and then returns, so essentially you get a single execution per interval. You could simply have cron run a Django management command and get access to the entire Django environment, so Celery doesn't really help you there.
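To make the cron model concrete, here is a minimal sketch of a one-shot script of the kind cron would start. The script path in the docstring and the `fetch_pending_jobs` and `handle` helpers are hypothetical placeholders, not part of any library.

```python
"""One-shot job meant to be started by cron, for example with a crontab entry like

    * * * * * /usr/bin/python3 /path/to/run_jobs.py

(the path is hypothetical). Cron starts the process once per minute;
the script does a single pass over the work and then exits.
"""


def fetch_pending_jobs():
    # Placeholder: a real script would query a database or read a spool directory.
    return ["job-1", "job-2", "job-3"]


def handle(job):
    # Placeholder for the actual work done per job.
    return f"done:{job}"


def run_once():
    """Process everything that is pending right now and return the results."""
    return [handle(job) for job in fetch_pending_jobs()]


if __name__ == "__main__":
    results = run_once()
    print(f"processed {len(results)} jobs")
```

Nothing runs between invocations: work that arrives just after a pass has to wait for the next scheduled run.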

What Celery brings to the table, with the help of a message queue, is distributed tasks. Many servers can join the pool of workers, and each one receives work items without fear of double handling. It's also possible to execute a task as soon as it is ready, whereas with cron the shortest interval is one minute.

As an example, imagine you've just launched a new web application and you're receiving hundreds of sign-ups, each of which requires an email to be sent to the user. Sending an email can take a comparatively long time, so you decide to handle activation emails via tasks.

If you were using cron, you'd need to ensure that each one-minute run can process all of the emails that need to be sent. If you have several servers, you also need to make sure you aren't sending multiple activation emails to the same user, so you need some kind of synchronization.
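One possible form of that synchronization, sketched under assumptions: each cron run first claims a batch of pending rows in a shared table, then sends only what it claimed. The `outbox` table, its columns, and both function names are invented for illustration. In a real multi-server setup each server would open its own connection to a shared database server; the single in-memory SQLite connection here only keeps the sketch runnable.

```python
import sqlite3


def make_outbox(emails):
    """Create an in-memory outbox table holding one pending row per email."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE outbox ("
        " id INTEGER PRIMARY KEY,"
        " user_email TEXT NOT NULL,"
        " claimed_by TEXT,"
        " sent INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO outbox (user_email) VALUES (?)", [(e,) for e in emails]
    )
    conn.commit()
    return conn


def claim_pending(conn, worker_id, limit=100):
    """Mark up to `limit` unclaimed rows as owned by `worker_id` and return them."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            """
            UPDATE outbox
               SET claimed_by = ?
             WHERE claimed_by IS NULL
               AND id IN (SELECT id FROM outbox
                           WHERE claimed_by IS NULL
                           ORDER BY id LIMIT ?)
            """,
            (worker_id, limit),
        )
    rows = conn.execute(
        "SELECT user_email FROM outbox"
        " WHERE claimed_by = ? AND sent = 0 ORDER BY id",
        (worker_id,),
    )
    return [row[0] for row in rows]


conn = make_outbox([f"user{i}@example.com" for i in range(5)])
first = claim_pending(conn, "server-a", limit=3)
second = claim_pending(conn, "server-b", limit=3)
print(first)   # rows claimed by server-a
print(second)  # remaining rows, claimed by server-b
```

Because a row is only claimed while `claimed_by` is still NULL, a row taken by one server is skipped by the next. This is the bookkeeping you end up writing and maintaining yourself in the cron approach.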

With Celery, you add a task to the queue. You may have several workers per server, so you've already scaled beyond a single cron job. You may also have several servers, allowing you to scale even further. Synchronization is handled by the queue itself.
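The queue-and-workers idea can be illustrated without Celery. In the sketch below, the standard library's `queue.Queue` stands in for the message broker and threads stand in for worker processes; this is not Celery's API, and `run_workers` is a name made up for the example. The point is only that each queued item is handed to exactly one worker, and that a worker picks it up as soon as it is free.

```python
import queue
import threading


def run_workers(user_ids, num_workers=4):
    """Hand each queued task to exactly one of `num_workers` worker threads."""
    tasks = queue.Queue()
    handled = []             # (worker_name, user_id) pairs
    lock = threading.Lock()  # protects `handled`

    def worker(name):
        while True:
            user_id = tasks.get()  # blocks until a task is available
            if user_id is None:    # sentinel: no more work for this worker
                tasks.task_done()
                return
            # Placeholder for the slow work, e.g. sending an activation email.
            with lock:
                handled.append((name, user_id))
            tasks.task_done()

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for user_id in user_ids:
        tasks.put(user_id)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    tasks.join()         # wait until every item has been marked done
    for t in threads:
        t.join()
    return handled


if __name__ == "__main__":
    results = run_workers(range(10), num_workers=3)
    print(sorted(user_id for _, user_id in results))
```

No claim table or locking on the work items is needed here: taking an item off the queue is what assigns it to a worker.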

You can use Celery as a cron replacement, but that's not really its primary use. It is used for farming out asynchronous tasks across a distributed cluster.

And of course, Celery has a long list of features that cron does not.
