django-celery infrastructure over multiple servers, broker is Redis


Question

Currently we have everything set up on a single cloud server, which includes:

  • Database server
  • Apache
  • Celery
  • Redis, acting as the broker for Celery and used for some other tasks

We are now thinking of splitting the main components onto separate servers, e.g. a separate database server, separate storage for media files, and web servers behind load balancers. The goal is to stop paying for one heavy server and instead use load balancers to create servers on demand, reducing cost and improving overall speed.

I am only really confused about Celery. Has anyone used Celery on multiple production servers behind load balancers? Any guidance would be appreciated.

Consider one small use case, as it is currently done on a single server (my confusion is how it can be done with multiple servers):

  • User uploads a file abc.pptx -> a reference is stored in the database -> the file is stored on the server disk
  • A task (convert document to PDF) is created and put in the Redis (broker) queue
  • Celery, running on the same server, picks the task from the queue
    • Reads the file and converts it to PDF using a tool called docsplit
    • Creates a folder on the server disk (later served as static content) and puts the PDF, its thumbnail, the plain text, and the original file in it

Considering the above use case, how can you set up multiple web servers that perform the same functionality?

Recommended answer

What will greatly simplify your processing is shared storage that is accessible from all cooperating servers. With such a design, you can distribute the work among more servers without worrying about which server will perform the next processing step.
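As a rough illustration of distributing the work, every web server and every Celery worker can point at the same Redis broker, so a task queued by any web server can be picked up by any worker. Below is a minimal settings sketch. The `REDIS_HOST` and `REDIS_PORT` environment variables and the default host name are assumptions made for this example; the setting is named `BROKER_URL` in older Celery/django-celery setups and `CELERY_BROKER_URL` when Django settings are loaded with the `CELERY` namespace.

```python
# settings.py fragment (sketch). Host name and env variables are assumptions.
import os

# All web servers and all workers must resolve to the SAME Redis instance.
REDIS_HOST = os.environ.get("REDIS_HOST", "redis.internal.example")
REDIS_PORT = int(os.environ.get("REDIS_PORT", "6379"))

# Redis database 0 holds the task queue in this sketch.
BROKER_URL = f"redis://{REDIS_HOST}:{REDIS_PORT}/0"

# Newer Celery versions configured with namespace="CELERY" read this name instead.
CELERY_BROKER_URL = BROKER_URL
```

Reading the host from the environment lets the same code run unchanged on servers that the load balancer creates on demand.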

If you can use cloud storage such as AWS S3, use that.

If your servers also run on AWS, you do not pay for traffic within the same region, and transfers are quite fast.

The main advantage is that your data is available from all servers under the same bucket/key name, so you do not have to care about which server processes which file: they all share the same storage on S3.
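To make this concrete, here is a minimal sketch of the conversion step written against shared storage instead of the local disk. The bucket name, the key layout, and the `convert` callable are hypothetical. `download_file` and `upload_file` are the boto3 S3 client methods; in a real project `process_upload` would presumably be wrapped in a Celery task, and `convert` would shell out to docsplit.

```python
import os
import tempfile
from typing import Callable, List

BUCKET = "my-app-documents"  # hypothetical bucket name


def result_key(doc_id: int, filename: str) -> str:
    """Key layout (an assumption): every server derives the same key for a document."""
    return f"documents/{doc_id}/{filename}"


def process_upload(s3, doc_id: int, source_name: str,
                   convert: Callable[[str, str], List[str]]) -> List[str]:
    """Fetch the upload from shared storage, convert it, and push the results back.

    `s3` is expected to behave like boto3.client("s3").
    `convert(src_path, out_dir)` must return the paths of the files it produced.
    """
    uploaded = []
    with tempfile.TemporaryDirectory() as workdir:
        src_path = os.path.join(workdir, source_name)
        # Any worker can fetch the file, whichever web server received the upload.
        s3.download_file(BUCKET, result_key(doc_id, source_name), src_path)

        out_dir = os.path.join(workdir, "out")
        os.makedirs(out_dir)

        # Store each produced file next to the original, under the same prefix.
        for path in convert(src_path, out_dir):
            key = result_key(doc_id, os.path.basename(path))
            s3.upload_file(path, BUCKET, key)
            uploaded.append(key)
    return uploaded
```

With boto3 installed, this could be called as `process_upload(boto3.client("s3"), 42, "abc.pptx", convert)`. The task message then only needs to carry the document id and file name, never a path on a particular server's disk.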

Note: if you need to get rid of old files, you can set up a policy on the given bucket, e.g. to delete files older than 1 day or 1 week.
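On S3 such a policy is expressed as a bucket lifecycle rule. The sketch below builds one for a hypothetical `tmp/` key prefix; the dictionary follows the shape accepted by boto3's `put_bucket_lifecycle_configuration`, and the helper name and rule ID format are made up for this example.

```python
def expiration_rule(prefix: str, days: int) -> dict:
    """Build a lifecycle configuration that expires objects under `prefix` after `days` days."""
    if days < 1:
        raise ValueError("expiration is expressed in whole days, minimum 1")
    return {
        "Rules": [
            {
                "ID": f"expire-{prefix.strip('/') or 'all'}-after-{days}d",
                "Filter": {"Prefix": prefix},  # only objects under this prefix
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


config = expiration_rule("tmp/", 7)

# With boto3 installed and credentials configured, it could be applied like this
# (the bucket name is hypothetical):
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-app-documents", LifecycleConfiguration=config)
```

Limiting the rule to a prefix keeps converted documents that must stay available out of the cleanup.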

There are more alternatives:

  • Samba
  • Central file server
  • FTP
  • Google Storage (very similar to AWS S3)
  • Swift (from OpenStack)

For small files you could even use Redis, but such solutions are rare, for good reasons.
