Managing Tweepy API Search

Question

Please forgive me if this repeats a question already answered elsewhere, but I am lost on how to use the tweepy API search function. Is there any documentation on how to search for tweets using the api.search() function?

Is there any way I can control features such as the number of tweets returned, the result type, etc.?

The results seem to max out at 100 for some reason.

The code snippet I am using is as follows:

searched_tweets = self.api.search(q=query,rpp=100,count=1000)

Recommended answer

I originally worked out a solution based on Yuva Raj's suggestion to use additional parameters in GET search/tweets: the max_id parameter, set from the id of the last tweet returned in each iteration of a loop that also checks for the occurrence of a TweepError.

However, I discovered there is a far simpler way to solve the problem using a tweepy.Cursor (see the tweepy Cursor tutorial for more on using Cursor).

The following code fetches the most recent 1000 mentions of 'python'.

import tweepy
# assuming twitter_authentication.py contains each of the 4 oauth elements (1 per line)
from twitter_authentication import API_KEY, API_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET

auth = tweepy.OAuthHandler(API_KEY, API_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)

api = tweepy.API(auth)

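query = 'python'
max_tweets = 1000
# Cursor requests successive pages behind the scenes; items() stops after max_tweets results
# (note: in Tweepy 4.x, API.search was renamed to API.search_tweets)
searched_tweets = [status for status in tweepy.Cursor(api.search, q=query).items(max_tweets)]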
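
The question also asks about controlling the number of tweets per request and the result type. As a rough sketch, assuming the standard GET search/tweets parameters (count, result_type, lang), the keyword arguments could be assembled as below. build_search_kwargs is a hypothetical helper written for illustration, not part of tweepy. The rpp parameter in the question's snippet comes from the older search API; the v1.1 endpoint uses count, which is capped at 100 per request and would explain the limit of 100 results.

```python
# Hypothetical helper (not part of tweepy): builds keyword arguments
# for api.search from the standard GET search/tweets parameters.
VALID_RESULT_TYPES = {"mixed", "recent", "popular"}
MAX_PER_REQUEST = 100  # per-request cap on the standard search endpoint


def build_search_kwargs(query, result_type="recent", per_request=100, lang=None):
    """Return a dict suitable for api.search(**kwargs) or tweepy.Cursor(api.search, **kwargs)."""
    if result_type not in VALID_RESULT_TYPES:
        raise ValueError("result_type must be one of %s" % sorted(VALID_RESULT_TYPES))
    kwargs = {
        "q": query,
        "result_type": result_type,
        # clamp to the range the endpoint accepts
        "count": max(1, min(per_request, MAX_PER_REQUEST)),
    }
    if lang is not None:
        kwargs["lang"] = lang  # e.g. "en"
    return kwargs
```

With the api object defined above, this could be used as tweepy.Cursor(api.search, **build_search_kwargs('python', result_type='popular')).items(max_tweets). The total number of tweets is still controlled by items(), while count only sets the page size of each request.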

Update: in response to Andre Petre's comment about potential memory consumption issues with tweepy.Cursor, I'll include my original solution, replacing the single-statement list comprehension used above to compute searched_tweets with the following:

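searched_tweets = []
last_id = None  # id of the oldest tweet collected so far
while len(searched_tweets) < max_tweets:
    # the search endpoint returns at most 100 tweets per request
    params = {'q': query, 'count': min(100, max_tweets - len(searched_tweets))}
    if last_id is not None:
        # max_id is inclusive, so subtract 1 to skip the tweet already collected
        params['max_id'] = str(last_id - 1)
    try:
        new_tweets = api.search(**params)
        if not new_tweets:
            break  # no older tweets are available
        searched_tweets.extend(new_tweets)
        last_id = new_tweets[-1].id
    except tweepy.TweepError as e:
        # depending on TweepError.code, one may want to retry or wait
        # to keep things simple, we will give up on an error
        break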
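
The comment in the loop above mentions that one may want to retry or wait instead of giving up. A minimal sketch of that idea, assuming a simple exponential backoff is acceptable, is shown below. call_with_retries is a hypothetical helper, not a tweepy function, and the delays are illustrative. For rate limits specifically, tweepy.API also accepts wait_on_rate_limit=True, which may be enough on its own.

```python
import time


def call_with_retries(func, retriable=(Exception,), retries=3,
                      base_delay=1.0, sleep=time.sleep):
    """Call func(); on a retriable exception, wait and try again.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... seconds between
    attempts and re-raises the last exception once all retries are used up.
    """
    for attempt in range(retries + 1):
        try:
            return func()
        except retriable:
            if attempt == retries:
                raise  # out of retries: let the caller decide what to do
            sleep(base_delay * 2 ** attempt)
```

Inside the loop, the request could then be wrapped as call_with_retries(lambda: api.search(q=query, count=100), retriable=(tweepy.TweepError,)), keeping the surrounding try/except as a last resort.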
