Tweepy Streaming - Stop collecting tweets at x amount

Problem description

I'm looking to have the Tweepy Streaming API stop pulling in tweets after I have stored x # of tweets in MongoDB.

I have tried IF and WHILE statements with counters inside the class definition, but cannot get it to stop at a certain X amount. This is a real head-banger for me. I found this thread: https://groups.google.com/forum/#!topic/tweepy/5IGlu2Qiug4 but my efforts to replicate it have failed. It always tells me that __init__ needs an additional argument. I believe our Tweepy auth is set up differently, so it is not an apples-to-apples comparison.

Any ideas?

import tweepy
from tweepy import Stream
from tweepy.streaming import StreamListener

# CONSUMER_KEY, CONSUMER_SECRET, OAUTH_TOKEN and OAUTH_TOKEN_SECRET are the
# Twitter API credentials; `collection` is a MongoDB collection object.
# Both are assumed to be defined elsewhere.
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(OAUTH_TOKEN, OAUTH_TOKEN_SECRET)

class StdOutListener(StreamListener):

    def on_status(self, status):
        text = status.text
        created = status.created_at
        record = {'Text': text, 'Created At': created}
        print(record)  # See Tweepy documentation to learn how to access other fields
        collection.insert(record)

    def on_error(self, status):
        print('Error on status: %s' % status)

    def on_limit(self, status):
        print('Limit threshold exceeded: %s' % status)

    def on_timeout(self):
        print('Stream disconnected; continuing...')


stream = Stream(auth, StdOutListener())
stream.filter(track=['tv'])

Recommended answer

You need to add a counter to your class in __init__ and increment it inside on_status. A record is inserted into the collection only while the counter is below 20; once the limit is reached, on_status returns False, which tells Tweepy to disconnect the stream. This could be done as shown below:

class StdOutListener(StreamListener):

    def __init__(self, api=None):
        super(StdOutListener, self).__init__(api)
        self.num_tweets = 0  # number of tweets stored so far

    def on_status(self, status):
        record = {'Text': status.text, 'Created At': status.created_at}
        print(record)  # See Tweepy documentation to learn how to access other fields
        collection.insert(record)
        self.num_tweets += 1
        # Returning False stops the stream once 20 tweets have been stored
        return self.num_tweets < 20
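
To see the stopping logic in isolation, here is a minimal sketch that runs without Twitter or MongoDB. Everything in it is hypothetical and for illustration only: CountingListener stands in for the StreamListener subclass, fake_stream imitates a stream loop that stops when on_status returns False, and a plain list replaces the collection. It is not Tweepy's actual implementation, just the same counter pattern with the limit passed in as an argument.

```python
class CountingListener(object):
    """Hypothetical stand-in for a StreamListener subclass (no network needed)."""

    def __init__(self, limit, store):
        self.limit = limit      # maximum number of tweets to keep
        self.store = store      # plain list standing in for the MongoDB collection
        self.num_tweets = 0     # tweets stored so far

    def on_status(self, status):
        self.store.append({'Text': status})
        self.num_tweets += 1
        # False asks the caller to stop delivering statuses
        return self.num_tweets < self.limit


def fake_stream(listener, statuses):
    """Hypothetical stream loop: stops as soon as the listener returns False."""
    for status in statuses:
        if listener.on_status(status) is False:
            break


stored = []
fake_stream(CountingListener(limit=3, store=stored), ['a', 'b', 'c', 'd', 'e'])
print(len(stored))  # 3: the last two statuses are never delivered
```

Taking the limit as a constructor argument avoids hard-coding 20. In real Tweepy code the constructor would still need to call the base-class __init__, as in the answer above.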
