Avoid Twitter API limitation with Tweepy
Question
I saw in some questions on Stack Exchange that the limit can be a function of the number of requests per 15 minutes and can also depend on the complexity of the algorithm, but mine is not a complex one.
So I use this code:
import tweepy
import sqlite3
import time
db = sqlite3.connect('data/MyDB.db')
# Get a cursor object
cursor = db.cursor()
cursor.execute('''CREATE TABLE IF NOT EXISTS MyTable(id INTEGER PRIMARY KEY, name TEXT, geo TEXT, image TEXT, source TEXT, timestamp TEXT, text TEXT, rt INTEGER)''')
db.commit()
consumer_key = ""
consumer_secret = ""
key = ""
secret = ""
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(key, secret)
api = tweepy.API(auth)
search = "#MyHashtag"
for tweet in tweepy.Cursor(api.search,
                           q=search,
                           include_entities=True).items():
    while True:
        try:
            cursor.execute('''INSERT INTO MyTable(name, geo, image, source, timestamp, text, rt) VALUES(?,?,?,?,?,?,?)''',(tweet.user.screen_name, str(tweet.geo), tweet.user.profile_image_url, tweet.source, tweet.created_at, tweet.text, tweet.retweet_count))
        except tweepy.TweepError:
            time.sleep(60 * 15)
            continue
        break
db.commit()
db.close()
I always get the Twitter rate limit error:
Traceback (most recent call last):
  File "stream.py", line 25, in <module>
    include_entities=True).items():
  File "/usr/local/lib/python2.7/dist-packages/tweepy/cursor.py", line 153, in next
    self.current_page = self.page_iterator.next()
  File "/usr/local/lib/python2.7/dist-packages/tweepy/cursor.py", line 98, in next
    data = self.method(max_id = max_id, *self.args, **self.kargs)
  File "/usr/local/lib/python2.7/dist-packages/tweepy/binder.py", line 200, in _call
    return method.execute()
  File "/usr/local/lib/python2.7/dist-packages/tweepy/binder.py", line 176, in execute
    raise TweepError(error_msg, resp)
tweepy.error.TweepError: [{'message': 'Rate limit exceeded', 'code': 88}]
Recommended answer
The problem is that your try: except: block is in the wrong place. Inserting data into the database will never raise a TweepError; it is iterating over Cursor.items() that will. I would suggest refactoring your code to call the next method of Cursor.items() in an infinite loop. That call should be placed in the try: except: block, as it can raise an error.
Here's (roughly) what the code should look like:
# above omitted for brevity
c = tweepy.Cursor(api.search,
                  q=search,
                  include_entities=True).items()
while True:
    try:
        tweet = c.next()
        # Insert into db
    except tweepy.TweepError:
        time.sleep(60 * 15)
        continue
    except StopIteration:
        break
This works because when Tweepy raises a TweepError, it hasn't updated any of the cursor data. The next time it makes the request, it will use the same parameters as the request that triggered the rate limit, effectively repeating it until it goes through.
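To make the retry behaviour concrete without a Twitter account, here is a minimal sketch of the same pattern using only the standard library. RateLimitError, FlakyPages, and iterate_with_retry are hypothetical names invented for this illustration and are not part of Tweepy; FlakyPages only assumes, as described above, that a failed request leaves the iterator's position unchanged.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for tweepy.TweepError on 'Rate limit exceeded'."""


class FlakyPages:
    """Hypothetical iterator that fails once at chosen positions.

    A failure does not advance the position, which mimics a cursor whose
    state is left untouched when a request is rejected.
    """

    def __init__(self, items, fail_before):
        self._items = list(items)
        self._pos = 0
        self._fail_before = set(fail_before)

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos in self._fail_before:
            # Fail only once per position, then let the retry succeed.
            self._fail_before.discard(self._pos)
            raise RateLimitError("Rate limit exceeded")
        if self._pos >= len(self._items):
            raise StopIteration
        item = self._items[self._pos]
        self._pos += 1
        return item


def iterate_with_retry(iterator, retry_on, wait_seconds=15 * 60, sleep=time.sleep):
    """Yield items from iterator; on retry_on, wait and repeat the same call."""
    while True:
        try:
            # The call that can fail sits inside the try block.
            item = next(iterator)
        except retry_on:
            sleep(wait_seconds)
            continue
        except StopIteration:
            return
        yield item


# Record the waits instead of really sleeping for 15 minutes.
waits = []
source = FlakyPages(["a", "b", "c"], fail_before={1})
for item in iterate_with_retry(source, RateLimitError, sleep=waits.append):
    print(item)
print(waits)
```

With a real Tweepy cursor, the same helper would presumably be called with Cursor(...).items() as the iterator and tweepy.TweepError as retry_on, keeping the database insert inside the for loop body. Note that catching every TweepError also retries on errors that are not rate limits, so checking the error code first may be worthwhile.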