How to search Twitter for keywords

Question

I am trying to build a service that continuously monitors Twitter and performs keyword searches on behalf of multiple users. There seem to be 5 different ways to accomplish this, all with their own drawbacks. I have gone through the Twitter and twitter4j documentation and cannot find any other approaches.

  1. Use the Twitter REST API to perform searches (https://dev.twitter.com/docs/api/1/get/search). This API is limited: ask for too much and you will be throttled. I have to keep track of the last tweet read so I don't duplicate results, and a timer is needed to poll the API. If there are multiple search terms, it is simple to make multiple calls.

  2. Search the public stream (https://dev.twitter.com/docs/streaming-apis/streams/public). While this is great for constant searching, Twitter only allows one connection per account, and there are limits on how many terms can be passed to Twitter. Definitely impossible for my use case.

  3. Use User Streams for filtering. I tried this but found it difficult to quickly determine whether a tweet came from the search or from the user stream. Also, Twitter states that they will limit the number of user streams per IP address, so this approach does not scale. (Twitter has been talking up something called SiteStreams, but it is a very limited beta without any documentation, so it is not something I can consider.)

  4. Go to a third party that purchases the entire firehose from Twitter (e.g. Datasift) and search the Twitter stream there. This gets expensive: $3K/month for the base plan, and searching for a single word 24/7 costs ~$45/month.

My question for the community is: have I exhausted all possibilities? If yes, then it appears to me that #1, using the REST API with a timer and tracking the last tweet found, is the right approach. Does anyone disagree? If so, can you point me to the documentation (or library) that would help me resolve this issue?
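
For illustration, approach #1 could be sketched roughly as follows. This is only a minimal sketch, not Twitter's or twitter4j's API: `search_fn`, `poll_once`, `monitor`, and `fake_search` are hypothetical names, and `fake_search` is an in-memory stand-in for a real search call. It assumes the search backend can be asked for tweets newer than a given ID (the Twitter search endpoint exposes a `since_id` parameter for this), and it keeps one "last seen" ID per search term so repeated polls do not return duplicates.

```python
import time
from typing import Any, Callable, Dict, Iterable, Iterator, List

# A tweet is modelled as a dict with at least "id" (int) and "text" (str).
Tweet = Dict[str, Any]
# Hypothetical signature: search_fn(term, since_id) -> list of matching tweets.
SearchFn = Callable[[str, int], List[Tweet]]


def poll_once(search_fn: SearchFn, terms: Iterable[str],
              last_seen: Dict[str, int]) -> Dict[str, List[Tweet]]:
    """Run one search per term and return only tweets not seen before.

    last_seen maps each term to the highest tweet ID already processed
    and is updated in place.
    """
    fresh: Dict[str, List[Tweet]] = {}
    for term in terms:
        since_id = last_seen.get(term, 0)
        tweets = search_fn(term, since_id)
        # Filter again locally in case the backend ignores since_id.
        new = sorted((t for t in tweets if t["id"] > since_id),
                     key=lambda t: t["id"])
        if new:
            last_seen[term] = new[-1]["id"]
        fresh[term] = new
    return fresh


def monitor(search_fn: SearchFn, terms: List[str], interval_seconds: float,
            rounds: int,
            sleep: Callable[[float], None] = time.sleep
            ) -> Iterator[Dict[str, List[Tweet]]]:
    """Timer loop: poll every interval_seconds for a fixed number of rounds."""
    last_seen: Dict[str, int] = {}
    for i in range(rounds):
        yield poll_once(search_fn, terms, last_seen)
        if i < rounds - 1:
            sleep(interval_seconds)


def fake_search(term: str, since_id: int) -> List[Tweet]:
    """In-memory stand-in for a real search call (assumption, demo only)."""
    corpus = [
        {"id": 1, "text": "java tips"},
        {"id": 2, "text": "python tips"},
        {"id": 3, "text": "more java"},
    ]
    return [t for t in corpus if term in t["text"] and t["id"] > since_id]
```

In a real service, `search_fn` would wrap the actual search call and `interval_seconds` would have to be chosen so that the number of terms times the polling frequency stays under the API's limits.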

Thanks, everyone.

Recommended answer

The response from Twitter was to use #4: purchase access from a vendor such as Datasift.
