Using Celery for Realtime, Synchronous External API Querying with Gevent


Question

I'm working on a web application that will receive a request from a user and have to hit a number of external APIs to compose the answer to that request. This could be done directly from the main web thread using something like gevent to fan out the request.

Alternatively, I was thinking, I could put incoming requests into a queue and use workers to distribute the load. The idea would be to try to keep it real time, while splitting up the requests amongst several workers. Each of these workers would be querying only one of the many external APIs. The response they receive would then go through a series of transformations, be saved into a DB, be transformed to a common schema and saved in a common DB, to finally be composed into one big response that would be returned through the web request. The web request is most likely going to be blocking all this time, with a user waiting, so keeping the queueing and dequeueing as fast as possible is important.

The external API calls can easily be turned into individual tasks. I think the linking from one API task to a transformation to a DB saving task could be done using a chain, etc., and the final result combining all results could be returned to the web thread using a chord.
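
For illustration, here is a minimal sketch of that idea using celery primitives; the app, the broker URLs, and the task names (fetch_api_a, transform, save_to_db, compose_response) are placeholders I've made up, not anything prescribed in the question:

    from celery import Celery, chain, chord

    # Hypothetical app; broker/backend URLs are placeholders.
    app = Celery('proj', broker='redis://localhost:6379/0',
                 backend='redis://localhost:6379/1')

    @app.task
    def fetch_api_a(request_id):
        # Hit one external API (placeholder body).
        return {'source': 'a', 'request_id': request_id, 'payload': '...'}

    @app.task
    def transform(raw):
        # Normalize the raw API response to the common schema (placeholder).
        return raw

    @app.task
    def save_to_db(doc):
        # Persist the normalized document and pass it along (placeholder).
        return doc

    @app.task
    def compose_response(results):
        # Chord callback: combine all per-API results into one response.
        return {'combined': results}

    # One fetch -> transform -> save chain per external API, joined by a
    # chord whose callback composes the final answer for the web request.
    pipelines = [chain(fetch_api_a.s(42), transform.s(), save_to_db.s())]
    final = chord(pipelines)(compose_response.s()).get(timeout=30)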

Some questions:

  • Can this (and should this) be done using celery?
  • I'm using django. Should I try to use django-celery over plain celery?
  • Each one of those tasks might spawn off other tasks - such as logging what just happened or other types of branching off. Is this possible?
  • Could tasks be returning the data they get - i.e. potentially Kb of data through celery (redis as underlying in this case) or should they write to the DB, and just pass pointers to that data around?
  • Each task is mostly I/O bound, and was initially just going to use gevent from the web thread to fan out the requests and skip the whole queuing design, but it turns out that it would be reused for a different component. Trying to keep the whole round trip through the Qs real time will probably require many workers making sure the queues are mostly empty. Or is it? Would running the gevent worker pool help with this?
  • Do I have to write gevent specific tasks or will using the gevent pool deal with network IO automagically?
  • Is it possible to assign priority to certain tasks?
  • What about keeping them in order?
  • Should I skip celery and just use kombu?
  • It seems like celery is geared more towards "tasks" that can be deferred and are not time sensitive. Am I nuts for trying to keep this real time?
  • What other technologies should I look at?

Update: Trying to hash this out a bit more. I did some reading on Kombu and it seems to be able to do what I'm thinking of, although at a much lower level than celery. Here is a diagram of what I had in mind.

What seems to be possible with raw queues as accessible with Kombu is the ability for a number of workers to subscribe to a broadcast message. The type and number of workers do not need to be known by the publisher if using a queue. Can something similar be achieved using Celery? It seems like if you want to make a chord, you need to know at runtime which tasks are going to be involved in the chord, whereas in this scenario you can simply add listeners to the broadcast, and make sure they announce they are in the running to add responses to the final queue.
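
As a rough sketch of that broadcast idea with raw kombu (the broker URL, exchange name, and queue name are placeholders; kombu's fanout exchanges deliver a copy of each message to every bound queue):

    from kombu import Connection, Exchange, Queue

    # A fanout exchange copies every published message to all bound queues.
    broadcast = Exchange('api_broadcast', type='fanout')

    def handle(body, message):
        # Each subscribed worker receives its own copy of the message.
        print('received:', body)
        message.ack()

    with Connection('redis://localhost:6379/0') as conn:
        # Worker side: bind a per-worker queue to the fanout exchange.
        with conn.Consumer(Queue('worker_a', exchange=broadcast),
                           callbacks=[handle]):
            # Publisher side: needs no knowledge of how many workers exist.
            conn.Producer().publish({'request_id': 42},
                                    exchange=broadcast, declare=[broadcast])
            conn.drain_events(timeout=5)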

Update 2: I see there is the ability to broadcast. Can you combine this with a chord? In general, can you combine celery with raw kombu? This is starting to sound like a question about smoothies.

Answer

I will try to answer as many of the questions as possible.


Can this (and should this) be done using celery?

Yes, you can.


I'm using django. Should I try to use django-celery over plain celery?

Django has good support for celery and it would make life much easier during development.


Each one of those tasks might spawn off other tasks - such as logging what just happened or other types of branching off. Is this possible?

You can start subtasks from within a task with ignore_result=True for side-effect-only tasks.
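
A minimal sketch of that pattern; the log_event and fetch_api_a task names are hypothetical:

    from celery import shared_task

    @shared_task(ignore_result=True)
    def log_event(message):
        # Side effect only: nothing waits on or stores this result.
        print(message)

    @shared_task
    def fetch_api_a(request_id):
        payload = {'request_id': request_id, 'payload': '...'}
        # Spawn the logging subtask from within this task, fire-and-forget.
        log_event.delay('fetched request %s' % request_id)
        return payload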


Could tasks be returning the data they get - i.e. potentially Kb of data through celery (redis as underlying in this case) or should they write to the DB, and just pass pointers to that data around?

I would suggest putting the results in the DB and then passing the id around; that would make your broker and workers happy. Less data transfer/pickling, etc.
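
A minimal sketch of the id-passing approach; save_response and load_response are hypothetical DB helpers, not celery APIs:

    from celery import shared_task

    def save_response(payload):
        # Hypothetical helper: INSERT the payload into your DB, return its id.
        return 123

    def load_response(row_id):
        # Hypothetical helper: SELECT the payload back by id.
        return {'payload': '...'}

    @shared_task
    def fetch_and_store(request_id):
        payload = {'big': 'api response, potentially many Kb'}
        return save_response(payload)  # only a small id crosses the broker

    @shared_task
    def transform_stored(row_id):
        doc = load_response(row_id)
        # ...transform to the common schema, save again, pass the new id on.
        return save_response(doc)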


Each task is mostly I/O bound, and was initially just going to use gevent from the web thread to fan out the requests and skip the whole queuing design, but it turns out that it would be reused for a different component. Trying to keep the whole round trip through the Qs real time will probably require many workers making sure the queues are mostly empty. Or is it? Would running the gevent worker pool help with this?

Since the process is IO bound, gevent will definitely help here. However, how high the concurrency should be for a gevent pool'd worker is something that I'm looking for an answer to as well.
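
For reference, a worker can be started on the gevent pool from the command line; the app name and concurrency value below are placeholders to tune against your queue depth:

    celery -A proj worker -P gevent --concurrency=100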


Do I have to write gevent specific tasks or will using the gevent pool deal with network IO automagically?

Gevent does the monkey patching automatically when you use it in the pool. But the libraries that you use should play well with gevent. Otherwise, if you're parsing some data with simplejson (which is written in C) then that would block other gevent greenlets.
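
If you need the patching to happen explicitly, e.g. in code that also runs outside the worker's pool, a minimal sketch (urllib here is just a stand-in for whatever HTTP client you use):

    from gevent import monkey
    monkey.patch_all()  # must run before modules that import socket/ssl

    import urllib.request

    def fetch(url):
        # With patched sockets this blocks only the current greenlet,
        # so other greenlets keep running while the request is in flight.
        return urllib.request.urlopen(url, timeout=10).read()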


Is it possible to assign priority to certain tasks?

You cannot assign specific priorities to certain tasks, but you can route them to different queues and then have those queues listened to by varying numbers of workers. The more workers a particular queue has, the higher the effective priority of the tasks on that queue.
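
A minimal sketch of that routing approach, using Celery 4+ setting names and hypothetical task paths, with the app object from the earlier sketch:

    # Route tasks onto dedicated queues; workers then subscribe per queue.
    app.conf.task_routes = {
        'proj.tasks.fetch_api_a': {'queue': 'realtime'},
        'proj.tasks.log_event': {'queue': 'background'},
    }

You would then run something like celery -A proj worker -Q realtime -P gevent --concurrency=50 next to celery -A proj worker -Q background --concurrency=2, so the realtime queue drains faster.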


What about keeping them in order?

Chain is one way to maintain order. Chord is a good way to summarize. Celery takes care of it, so you don't have to worry about it. Even when using the gevent pool, it is in the end possible to reason about the order of task execution.


Should I skip celery and just use kombu?

You can, if your use case will not change to something more complex over time and also if you are willing to manage your processes through celeryd + supervisord by yourself. Also, if you don't care about the task monitoring that comes with tools such as celerymon, flower, etc.


It seems like celery is geared more towards "tasks" that can be deferred and are not time sensitive.

Celery supports scheduled tasks as well, if that is what you meant by that statement.
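
A minimal sketch of a periodic task via celery beat (Celery 4+ setting names; the task path and the 30-second cadence are placeholders):

    from celery import Celery

    app = Celery('proj', broker='redis://localhost:6379/0')

    # celery beat will enqueue this task every 30 seconds.
    app.conf.beat_schedule = {
        'refresh-api-data': {
            'task': 'proj.tasks.fetch_api_a',
            'schedule': 30.0,  # seconds between runs
            'args': (42,),
        },
    }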


Am I nuts for trying to keep this real time?

I don't think so. As long as your consumers are fast enough, it will be as good as real time.


What other technologies should I look at?

Pertaining to celery, you should choose the result store wisely. My suggestion would be to use cassandra. It is good for realtime data (both write- and query-wise). You can also use redis or mongodb; they come with their own set of problems as result stores. But then a little tweaking in configuration can go a long way.
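
Selecting a backend is a configuration choice; a sketch with placeholder URLs, assuming the app object from the earlier sketches (the cassandra lines use Celery 4+ setting names):

    app.conf.result_backend = 'redis://localhost:6379/1'

    # Cassandra instead:
    # app.conf.result_backend = 'cassandra://'
    # app.conf.cassandra_servers = ['127.0.0.1']
    # app.conf.cassandra_keyspace = 'celery'
    # app.conf.cassandra_table = 'results'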

If you mean something completely different from celery, then you can look into asyncio (python3.5) and zeromq for achieving the same. I can't comment more on that though.
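
For completeness, a minimal asyncio sketch of the same fan-out idea (Python 3.5+; fetch_one is a placeholder for a real non-blocking HTTP call):

    import asyncio

    async def fetch_one(name):
        await asyncio.sleep(0.1)  # stands in for a non-blocking API call
        return {'source': name}

    async def compose(request_id):
        # Fan out all API calls concurrently and gather their results.
        results = await asyncio.gather(*(fetch_one(n) for n in 'abc'))
        return {'request_id': request_id, 'results': results}

    print(asyncio.get_event_loop().run_until_complete(compose(42)))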
