Getting historical data from Twitter

Question

For a research project I would like to get the last three months' worth of Twitter messages. Technical challenges aside, is this possible, for example by using some sort of slow polling mechanism to keep the rate limiter at bay?

The Twitter API documentation states: "Clients may request up to 3,200 statuses via the page and count parameters for timeline REST API". Is that limit per hour? Per day? Or... ever?

Any suggestions? Would it even be theoretically possible? Has anyone done something similar before?

Thanks! Marco

Recommended answer

Twitter notoriously does not make tweets older than three weeks "available". In some cases you can only get one week. You're better off collecting and storing tweets yourself over the next three months. Many rightly doubt whether older tweets are even persisted by Twitter.

Are you looking for just any tweets? If so, check out the Streaming API's statuses/sample method. The Streaming API uses persistent HTTP connections, which can be a pain to program against, but it is quite elegant once you get it working. I'd recommend setting up a small script that dumps tweets from statuses/sample into a database. You should have a ton of data after just a few days.
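
As a rough illustration of the "small script" idea, here is a minimal Python sketch of the storage half only. It assumes the stream delivers one JSON-encoded status per line and that each status carries id, text and (optionally) created_at fields. The function name store_tweets and the tweets table layout are hypothetical choices for this example, and opening the authenticated streaming connection is deliberately left out.

```python
import json
import sqlite3
from typing import Iterable


def store_tweets(lines: Iterable[str], conn: sqlite3.Connection) -> int:
    """Insert newline-delimited JSON statuses into SQLite.

    Returns the number of rows that were newly added.
    """
    # Hypothetical schema: the raw JSON is kept so nothing is lost.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id TEXT PRIMARY KEY, created_at TEXT, text TEXT, raw TEXT)"
    )
    before = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]

    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines (assumed to be keep-alives)
        try:
            status = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not valid JSON
        if not isinstance(status, dict):
            continue
        if "id" not in status or "text" not in status:
            continue  # skip messages that do not look like a tweet
        # INSERT OR IGNORE makes re-delivered statuses harmless.
        conn.execute(
            "INSERT OR IGNORE INTO tweets (id, created_at, text, raw) "
            "VALUES (?, ?, ?, ?)",
            (str(status["id"]), status.get("created_at"), status["text"], line),
        )

    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
    return after - before
```

In practice you would pass in the line iterator of whatever streaming HTTP client you use, together with a connection such as sqlite3.connect("tweets.db"), and restart the loop whenever the connection drops. Keeping the raw JSON alongside a few extracted columns means you can decide later which fields the research actually needs.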
