How can I consume tweets from Twitter's streaming API and store them in MongoDB?

Question

I need to develop an app that lets me track tweets and save them in MongoDB for a research project (as you might gather, I am a noob, so please bear with me). I have found this piece of code, which streams tweets to my terminal window:

import sys
import tweepy

consumer_key=""
consumer_secret=""
access_key = ""
access_secret = "" 


auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)

class CustomStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        # Print the text of each incoming tweet
        print(status.text)

    def on_error(self, status_code):
        print('Encountered error with status code:', status_code, file=sys.stderr)
        return True  # Don't kill the stream

    def on_timeout(self):
        print('Timeout...', file=sys.stderr)
        return True  # Don't kill the stream

sapi = tweepy.streaming.Stream(auth, CustomStreamListener())
sapi.filter(track=['Gandolfini'])

Is there a way to modify this code so that, instead of streaming across my screen, the tweets are sent to my MongoDB database?

Thanks

Recommended answer

Here is an example:

import json
import pymongo
import tweepy

consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)


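class CustomStreamListener(tweepy.StreamListener):
    def __init__(self, api):
        # Pass the API object to the parent class, which stores it as self.api
        super(CustomStreamListener, self).__init__(api)
        # Connect to MongoDB on localhost:27017 and select the "test" database
        self.db = pymongo.MongoClient().test

    def on_data(self, tweet):
        # Each message arrives as a raw JSON string; parse it and store it
        self.db.tweets.insert_one(json.loads(tweet))
        return True  # Keep the stream running

    def on_error(self, status_code):
        return True  # Don't kill the stream

    def on_timeout(self):
        return True  # Don't kill the stream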


sapi = tweepy.streaming.Stream(auth, CustomStreamListener(api))
sapi.filter(track=['Gandolfini'])

This writes the tweets to the tweets collection of the MongoDB test database.

Hope that helps.
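
One thing to watch for with the example above: on_data receives every message the stream sends, and that can include control messages such as delete or limit notices, not only tweets. Below is a minimal sketch of a filter you could put in front of the insert. The names store_if_tweet and FakeCollection are hypothetical and only for illustration, and the check for a "text" field is an assumption about what distinguishes a tweet from a control message. The helper expects any object with an insert_one method, which pymongo collections provide in version 3.0 and later.

```python
import json


def store_if_tweet(collection, raw_message):
    """Insert one raw stream message into `collection` if it looks like a tweet.

    Returns True when a document was inserted, False when it was skipped.
    """
    message = json.loads(raw_message)
    # Assumption: real tweets carry a "text" field, while control messages
    # such as {"delete": ...} or {"limit": ...} do not.
    if not isinstance(message, dict) or "text" not in message:
        return False
    collection.insert_one(message)
    return True


class FakeCollection(object):
    """Hypothetical in-memory stand-in for a pymongo collection (demo only)."""

    def __init__(self):
        self.docs = []

    def insert_one(self, doc):
        # Mimic the pymongo method name; just keep the document in a list
        self.docs.append(doc)
```

Inside the listener you would then call store_if_tweet(self.db.tweets, tweet) from on_data instead of inserting directly. The in-memory FakeCollection is only there so the helper can be tried out without a running MongoDB server.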
