Tweepy tracking terms and following users

Problem description

I'm trying to build an app that tracks certain terms from specific users using the Twitter streaming API.

I made a working Python script using tweepy for the streaming API, based on this tutorial. However, it only works if I track tweets by terms or by user IDs, not by both. When I try to filter using both, the API returns tweets from any user. My code is here:

import tweepy

# Authenticate against the Twitter API with the keys
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token_key, access_token_secret)

# Create the API client with tweepy
api = tweepy.API(auth)

# Open the stream and pass what to look for on Twitter.
# CustomStreamListener is the asker's listener class (not shown here).
sapi = tweepy.streaming.Stream(auth, CustomStreamListener())
list_users = ['11111', '22222']   # Some ids
list_terms = ['term1', 'term2']   # Some terms
sapi.filter(follow=list_users, track=list_terms)

These two variables (list_users, list_terms) are a list of user IDs and a list of terms, respectively.

How can I filter the tweet stream by users AND by terms? Is there any way to do it with the tweepy filter? Or should I do a verification after retrieving each tweet?

Recommended answer

The Twitter streaming API combines the different conditions with OR logic; that is, it returns the union of tweets that contain the terms and tweets from the users. So you have to implement a custom on_data function in order to filter with AND.
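As a rough illustration of that AND check, here is a minimal sketch of a helper that a custom on_data could call on each raw message. The function name matches_user_and_terms is hypothetical, and the sketch assumes the payload is a JSON tweet object with a "text" field and a "user" object carrying "id_str". The term test is a plain case-insensitive substring match, which is a simplification and not necessarily how Twitter itself matches track terms.

```python
import json


def matches_user_and_terms(raw_data, user_ids, terms):
    """Return True only if the tweet is from one of user_ids AND mentions a term.

    Hypothetical helper; assumes the tweet JSON has "text" and "user.id_str".
    """
    try:
        tweet = json.loads(raw_data)
    except ValueError:
        return False  # not valid JSON, so nothing to keep
    if not isinstance(tweet, dict):
        return False

    user = tweet.get("user")
    text = tweet.get("text")
    if not isinstance(user, dict) or not isinstance(text, str):
        return False  # some other stream message, not a tweet

    # Condition 1: the author must be one of the followed users.
    if user.get("id_str") not in user_ids:
        return False

    # Condition 2: at least one term must appear in the text (simplified match).
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)


list_users = {"11111", "22222"}   # Some ids
list_terms = ["term1", "term2"]   # Some terms

# Made-up sample payloads, only to exercise the helper.
from_followed_user = json.dumps({"user": {"id_str": "11111"}, "text": "Trying out Term1 today"})
from_other_user = json.dumps({"user": {"id_str": "99999"}, "text": "term1 from someone else"})

print(matches_user_and_terms(from_followed_user, list_users, list_terms))
print(matches_user_and_terms(from_other_user, list_users, list_terms))
```

Inside the listener, on_data would call this helper and only process the tweet when it returns True. The stream connection itself is left out so the block runs without credentials.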

Note that you can condition on at most 5,000 users and 400 terms. Since the rate limit may be an issue, supply the API with whichever condition yields the lower tweet volume, and filter the incoming data against the remaining conditions in post-processing.

You can track up to 5,000 users and 400 keywords. Rate limiting takes effect at 1% of the Firehose, so if at any moment the tweet volume from the union of your keywords and users rises above 1% of all tweets happening in "real time" on the Firehose, you'll get up to 1% of the tweets along with a rate limit notice informing you of how many tweets you missed.
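To notice when that happens, the same on_data hook can check for the rate limit notice before treating a message as a tweet. The sketch below assumes the notice arrives as a JSON object of the form {"limit": {"track": N}}, where N is the count of undelivered tweets; that shape is an assumption to verify against the streaming API documentation, and missed_tweet_count is a hypothetical helper name.

```python
import json


def missed_tweet_count(raw_data):
    """Return the missed-tweet count from a limit notice, or None otherwise.

    Assumes the notice looks like {"limit": {"track": N}} (unverified shape).
    """
    try:
        message = json.loads(raw_data)
    except ValueError:
        return None  # not valid JSON
    if not isinstance(message, dict):
        return None

    limit = message.get("limit")
    if isinstance(limit, dict) and isinstance(limit.get("track"), int):
        return limit["track"]
    return None  # an ordinary tweet or some other message type


# Made-up sample messages, only to exercise the helper.
print(missed_tweet_count('{"limit": {"track": 1234}}'))
print(missed_tweet_count('{"text": "term1", "user": {"id_str": "11111"}}'))
```

A non-None result is a signal that the filter sent to the API is too broad and that more of the filtering should move to the narrower condition.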
