Managing Tweepy API Search
Question
Please forgive me if this is a gross repeat of a question answered elsewhere, but I am lost on how to use the tweepy API search function. Is there any documentation on how to search for tweets using the api.search() function?
Is there any way to control features such as the number of tweets returned, the result type, etc.?
The results seem to max out at 100 for some reason.
The code snippet I use is as follows:
searched_tweets = self.api.search(q=query,rpp=100,count=1000)
Recommended answer
I originally worked out a solution based on Yuva Raj's suggestion to use additional parameters in GET search/tweets: the max_id parameter, set from the id of the last tweet returned in each iteration of a loop that also checks for the occurrence of a TweepError.
However, I discovered there is a far simpler way to solve the problem using a tweepy.Cursor (see the tweepy Cursor tutorial for more on using Cursor).
The following code fetches the most recent 1000 mentions of 'python'.
import tweepy
# assuming twitter_authentication.py contains each of the 4 oauth elements (1 per line)
from twitter_authentication import API_KEY, API_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET
auth = tweepy.OAuthHandler(API_KEY, API_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)
query = 'python'
max_tweets = 1000
searched_tweets = [status for status in tweepy.Cursor(api.search, q=query).items(max_tweets)]
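If holding every result in one list is a concern, one option is to consume the statuses in fixed-size batches instead. The sketch below is only an illustration: `batched` and `fake_statuses` are hypothetical names, and the stand-in generator replaces a live Cursor so the example runs without credentials. Since `tweepy.Cursor(...).items()` returns an iterator, it should be usable in place of the stand-in.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most `size` items from any iterable (hypothetical helper)."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def fake_statuses(n):
    """Stand-in for tweepy.Cursor(...).items(n); yields placeholder strings."""
    for i in range(n):
        yield f"tweet {i}"


# Only one batch is held in memory at a time.
batch_sizes = [len(batch) for batch in batched(fake_statuses(1000), 300)]
```

Each batch can be written to disk or a database before the next one is requested, so the full result set never has to sit in memory at once.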
Update: in response to Andre Petre's comment about potential memory consumption issues with tweepy.Cursor, I'll include my original solution, replacing the single-statement list comprehension used above to compute searched_tweets with the following:
searched_tweets = []
last_id = -1
while len(searched_tweets) < max_tweets:
    # a single request returns at most 100 tweets, so keep paging until enough are collected
    count = max_tweets - len(searched_tweets)
    try:
        # max_id is inclusive, so subtract 1 to avoid refetching the last tweet already seen
        new_tweets = api.search(q=query, count=count, max_id=str(last_id - 1))
        if not new_tweets:
            break
        searched_tweets.extend(new_tweets)
        last_id = new_tweets[-1].id
    except tweepy.TweepError as e:
        # depending on TweepError.code, one may want to retry or wait
        # to keep things simple, we will give up on an error
        break
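The same max_id paging idea can be tried without network access. The following is a minimal sketch, not the answer's own code: `collect_tweets` and `fake_search` are hypothetical names, tweets are plain dicts rather than tweepy Status objects, and the 100-per-request cap is mimicked by the assumed `page_size` argument. Unlike the loop above, it sends max_id only once a previous page has been seen.

```python
def collect_tweets(search, query, max_tweets, page_size=100):
    """Collect up to max_tweets results by walking backwards with max_id."""
    collected = []
    last_id = None  # no upper bound on the first request
    while len(collected) < max_tweets:
        count = min(page_size, max_tweets - len(collected))
        kwargs = {"q": query, "count": count}
        if last_id is not None:
            # max_id is inclusive, so subtract 1 to skip the tweet already seen
            kwargs["max_id"] = last_id - 1
        page = search(**kwargs)
        if not page:
            break  # no older results left
        collected.extend(page)
        last_id = page[-1]["id"]
    return collected


def fake_search(q, count, max_id=None):
    """Hypothetical stand-in: 250 tweets with ids 250..1, newest first."""
    ids = list(range(250, 0, -1))
    if max_id is not None:
        ids = [i for i in ids if i <= max_id]
    return [{"id": i, "text": f"{q} #{i}"} for i in ids[:count]]


tweets = collect_tweets(fake_search, "python", max_tweets=120)
```

With a real client, a small wrapper around api.search that returns the statuses could be passed as `search`, with `page[-1]["id"]` changed to `page[-1].id`.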