How to halt, kill, stop or close a PycURL request on a stream example given using Twitter Stream
Problem description
I'm currently cURLing the Twitter API stream (http://stream.twitter.com/1/statuses/sample.json), so I am constantly receiving data. I want to stop cURLing the stream once I have retrieved X objects from it (in the example I use 10 as an arbitrary number).
You can see how I have attempted to close the connection in the code below. The code after curling.perform() never executes, because the stream of data is continuous. So I attempted to close the stream in body_callback; however, because perform() is still running, I cannot invoke close().
Any help would be appreciated.
Code:
# Imports
import pycurl  # Used for doing the cURL request
import base64  # Used to encode username and API key
import json    # Used to break down the JSON objects

# Settings to access stream and API
userName = 'twitter_username'  # My username
password = 'twitter_password'  # My API key
apiURL = 'http://stream.twitter.com/1/statuses/sample.json'  # The Twitter API
tweets = []  # An array of Tweets

# Methods to do with the tweets array
def how_many_tweets():
    print 'Collected: ', len(tweets)
    return len(tweets)

class Tweet:
    def __init__(self):
        self.raw = ''
        self.id = ''
        self.content = ''

    def decode_json(self):
        return True

    def set_id(self):
        return True

    def set_content(self):
        return True

    def set_raw(self, data):
        self.raw = data

# Class to print out the stream as it comes from the API
class Stream:
    def __init__(self):
        self.tweetBeingRead = ''

    def body_callback(self, buf):
        # This gets whole Tweets, and adds them to an array called tweets
        if buf.startswith('{"in_reply_to_status_id_str"'):  # This is the start of a tweet
            # Add the finished Tweet to the global array tweets
            print 'Added:'  # Printing output to console
            print self.tweetBeingRead  # Printing output to console
            theTweetBeingProcessed = Tweet()  # Create a new Tweet object
            theTweetBeingProcessed.set_raw(self.tweetBeingRead)  # Set its raw value to tweetBeingRead
            tweets.append(theTweetBeingProcessed)  # Add it to the global array of tweets
            # Start processing a new tweet
            self.tweetBeingRead = buf  # Start a new tweet from scratch
        else:
            self.tweetBeingRead = self.tweetBeingRead + buf
        if how_many_tweets() > 10:
            try:
                curling.close()  # This is where the problem lies. I want to close the stream
            except Exception as CurlError:
                print ' Tried closing stream: ', CurlError

# Used to initiate the cURLing of the Data Sift streams
datastream = Stream()
curling = pycurl.Curl()
curling.setopt(curling.URL, apiURL)
curling.setopt(curling.HTTPHEADER, ['Authorization: Basic ' + base64.b64encode(userName + ":" + password)])
curling.setopt(curling.WRITEFUNCTION, datastream.body_callback)
curling.perform()  # This is where cURLing starts
print 'I cant reach here.'
curling.close()  # This never gets called. :(
You can abort the transfer from inside the write callback by returning a number that differs from the number of bytes that was passed in to it. (By default, returning None is treated the same as returning the number of bytes that was passed in.)
When you abort this way, the entire transfer is considered done and control comes back from your perform() call, so the code after it can run, including close().
The transfer is reported as an error because it was aborted; in PycURL this surfaces as a pycurl.error exception raised by perform(), which you can catch.
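As a rough illustration of the advice above, here is a minimal sketch in Python 3 (the question's code is Python 2). The names TweetCollector and collect_tweets are hypothetical, authentication is left out, and the sketch assumes that each tweet arrives terminated by a line break, which may not match what the stream actually sends. It returns -1 from the callback as one example of a value that differs from the number of bytes received.

```python
class TweetCollector:
    """Buffers streamed bytes and aborts the transfer after `limit` tweets."""

    def __init__(self, limit=10):
        self.limit = limit   # hypothetical cut-off, like the 10 in the question
        self.buffer = b""    # bytes received but not yet split into tweets
        self.tweets = []     # raw JSON text of each complete tweet

    def body_callback(self, buf):
        # PycURL on Python 3 hands the response body to the callback as bytes.
        self.buffer += buf
        # Assumption: every tweet is terminated by a line break.
        while b"\n" in self.buffer:
            line, self.buffer = self.buffer.split(b"\n", 1)
            line = line.strip()
            if not line:
                continue  # ignore blank lines between tweets
            self.tweets.append(line.decode("utf-8"))
            if len(self.tweets) >= self.limit:
                # A return value different from len(buf) aborts the transfer.
                return -1
        # None means "all bytes consumed", so the transfer continues.
        return None


def collect_tweets(url, limit=10):
    # Imported here so the collector class can be tried without PycURL installed.
    import pycurl

    collector = TweetCollector(limit)
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, url)
    curl.setopt(pycurl.WRITEFUNCTION, collector.body_callback)
    try:
        curl.perform()
    except pycurl.error as exc:
        # The deliberate abort shows up as a write error; re-raise anything else.
        if exc.args[0] != pycurl.E_WRITE_ERROR:
            raise
    finally:
        # Closing is fine here because perform() is no longer running.
        curl.close()
    return collector.tweets
```

Calling collect_tweets(apiURL, 10) would then return once ten tweets have been gathered, and the handle is closed outside the callback rather than inside it.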