Calling an API concurrently in Python


Problem description

I need to talk to an API to get information about teams. Each team has a unique ID. I call the API with that ID and get back the list of players on that team (a list of dicts). One of the keys for each player is another ID that I can use to get more information about that player. I can bundle these player IDs together and make one API call to get the additional information for every player at once.

My question is this: I expect the number of teams to grow, and it could become quite large. The number of players on each team could also grow large.

What is the best way to make these API calls concurrently? I could use ThreadPool from multiprocessing.dummy, and I have also seen gevent used for something like this.

The calls to the API take some time to return a value (1-2 seconds for each bulk API call).

Right now, what I do is this:

for each team:
    get the list of players
    store the player_ids in a list
    get the player information for all the players (passing the list of player_ids)
assemble and process the information
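As a baseline, the sequential steps above could be sketched like this (a minimal sketch: `fetch_team` and `fetch_player_stats` are hypothetical stand-ins for the two API calls described in the question, with fake data so the sketch is self-contained):

```python
def fetch_team(team_id):
    # hypothetical stand-in for the per-team API call:
    # returns the team's players as a list of dicts
    return [{'id': team_id * 10 + n, 'name': f'player{n}'} for n in range(3)]

def fetch_player_stats(player_ids):
    # hypothetical stand-in for the bulk API call:
    # returns stats keyed by player ID
    return {pid: {'goals': pid % 5} for pid in player_ids}

all_stats = {}
for team_id in [1, 2, 3]:
    players = fetch_team(team_id)                      # one API call per team
    player_ids = [p['id'] for p in players]            # collect the IDs
    all_stats.update(fetch_player_stats(player_ids))   # one bulk call per team
```

Each loop iteration blocks on two round trips, which is what makes the sequential version slow once the number of teams grows.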

If I use ThreadPool, I can do the following:

from multiprocessing.dummy import Pool as ThreadPool

pool = ThreadPool(x)  # a ThreadPool of size x
results = pool.map(function_to_get_team_info, list_of_teams)
pool.close()
pool.join()
# process results

def function_to_get_team_info(team_id):
    players = api.call(team_id)
    return get_players_information(players)

def get_players_information(players):
    player_ids = [player['id'] for player in players]
    return get_all_player_stats(player_ids)

def get_all_player_stats(player_ids):
    return api.call(player_ids)

This processes each team concurrently and assembles all the information back in the ThreadPool results.

To make this completely concurrent, I think I would need to make my ThreadPool the size of the number of teams, but I don't think that scales well. So I was wondering whether using gevent to process this information would be a better approach.
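Note that a pool does not have to be as large as the number of teams to give useful concurrency: because the work is I/O-bound, a small fixed pool still overlaps most of the waiting. A minimal stdlib sketch (using concurrent.futures.ThreadPoolExecutor rather than gevent; the `fake_team_call` sleep is an assumption standing in for network latency):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_team_call(team_id):
    time.sleep(0.1)  # stand-in for the 1-2 s bulk API latency
    return team_id

team_ids = list(range(20))

start = time.time()
# pool of 5 workers, far smaller than the 20 "teams"
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(fake_team_call, team_ids))
elapsed = time.time() - start
# 20 tasks at 0.1 s each, 5 at a time -> roughly 0.4 s, not 2.0 s
```

Because the threads spend almost all their time waiting on I/O, 5 workers finish 20 tasks in about a fifth of the sequential time; gevent achieves a similar overlap with greenlets instead of OS threads.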

Any suggestions would be very welcome.

Recommended answer

One solution is to:


  • prepare a list of tasks to be processed, in your case a list of team IDs,

  • create a fixed pool of n worker threads,

  • have each worker thread pop a task from the list and process it (download the data for a team); when finished, it pops the next task,

  • when the task list is empty, the worker thread stops.
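The steps above can be sketched with a queue.Queue and plain threads (a sketch; `process_team` is a hypothetical stand-in for the team download):

```python
import queue
import threading

def run_workers(task_ids, process_team, n_workers=4):
    """Fixed pool of n_workers threads popping team IDs from a shared queue."""
    tasks = queue.Queue()
    for team_id in task_ids:
        tasks.put(team_id)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                team_id = tasks.get_nowait()  # pop a task; stop when the list is empty
            except queue.Empty:
                return
            info = process_team(team_id)      # download the data for this team
            with lock:
                results[team_id] = info

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

For example, `run_workers([1, 2, 3], some_download_function, n_workers=2)` processes the three teams with only two threads; a slow team occupies one worker while the other keeps draining the queue.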

This approach shields you from the case where processing one particular team takes, say, 100 time units while the other teams are processed in 1 time unit (on average).

You can tune the number of worker threads depending on the number of teams, the average team processing time, the number of CPU cores, etc.

Extended answer

This can be achieved with the Python multiprocessing.Pool:

from multiprocessing import Pool

def api_call(id):
    pass # call API for given id

if __name__ == '__main__':
    p = Pool(5)
    p.map(api_call, [1, 2, 3])
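Since the work here is network I/O rather than CPU-bound computation, the thread-backed multiprocessing.dummy.Pool (the ThreadPool the question already mentions) exposes the same interface and may be the lighter choice: it avoids spawning processes and pickling arguments. A sketch with the same shape as the example above (the body of `api_call` is a stand-in):

```python
from multiprocessing.dummy import Pool  # same interface as multiprocessing.Pool, but threads

def api_call(team_id):
    # stand-in for the real API request
    return {'team': team_id}

pool = Pool(5)
results = pool.map(api_call, [1, 2, 3])
pool.close()
pool.join()
```

`pool.map` preserves input order, so `results` lines up with the list of team IDs passed in.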

