How to halt, kill, stop or close a PycURL request on a stream example given using Twitter Stream


Question


I'm currently cURLing the Twitter API stream (http://stream.twitter.com/1/statuses/sample.json), so I am constantly receiving data. I want to stop cURLing the stream once I have retrieved X objects from it (in the example I use 10 as an arbitrary number).

You can see how I have attempted to close the connection in the code below. The code after curling.perform() never executes, because the stream of data is continuous. So I attempted to close the stream in body_callback, but because perform() is still running I cannot invoke close().

Any help would be appreciated.

Code:

# Imports
import pycurl # Used for doing cURL request
import base64 # Used to encode username and API Key
import json # Used to break down the json objects

# Settings to access stream and API
userName = 'twitter_username' # My username
password = 'twitter_password' # My API Key
apiURL = 'http://stream.twitter.com/1/statuses/sample.json' # the twitter api
tweets = [] # An array of Tweets

# Methods to do with the tweets array
def how_many_tweets():
    print 'Collected: ',len(tweets)
    return len(tweets)

class Tweet:
    def __init__(self):
        self.raw = ''
        self.id = ''
        self.content = ''

    def decode_json(self):
        return True

    def set_id(self):
        return True

    def set_content(self):
        return True

    def set_raw(self, data):
        self.raw = data

# Class to print out the stream as it comes from the API
class Stream:
    def __init__(self):
        self.tweetBeingRead =''

    def body_callback(self, buf):
        # This gets whole Tweets, and adds them to an array called tweets
        if(buf.startswith('{"in_reply_to_status_id_str"')): # This is the start of a tweet
            # Added Tweet to Global Array Tweets
            print 'Added:' # Printing output to console
            print self.tweetBeingRead # Printing output to console
            theTweetBeingProcessed = Tweet() # Create a new Tweet Object
            theTweetBeingProcessed.set_raw(self.tweetBeingRead) # Set its raw value to tweetBeingRead
            tweets.append(theTweetBeingProcessed) # Add it to the global array of tweets
            # Start processing a new tweet
            self.tweetBeingRead = buf # Start a new tweet from scratch
        else:
            self.tweetBeingRead = self.tweetBeingRead+buf
        if(how_many_tweets()>10):
            try:
                curling.close() # This is where the problem lies. I want to close the stream
            except Exception as CurlError:
                print ' Tried closing stream: ',CurlError

# Used to initiate the cURLing of the Data Sift streams
datastream = Stream()
curling = pycurl.Curl()
curling.setopt(curling.URL, apiURL)
curling.setopt(curling.HTTPHEADER, ['Authorization: Basic '+base64.b64encode(userName+":"+password)])
curling.setopt(curling.WRITEFUNCTION, datastream.body_callback)
curling.perform() # This is where the cURLing starts
print 'I cant reach here.'
curling.close() # This never gets called. :(

Solution

You can abort the transfer from the write callback by returning a number that differs from the number of bytes passed in to it. (By default, PycURL treats returning None the same as returning the number of bytes that was passed in.)

When you abort it, the entire transfer is considered done and your perform() call stops blocking.

The transfer is then reported as an error, because it was aborted. In PycURL, perform() raises pycurl.error in that case, so catch it when the abort is intentional.
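
The approach above could look roughly like the following. This is a minimal sketch in Python 3, not the asker's exact code: the TweetCollector class, its limit parameter, and the stream_until_limit helper are hypothetical names, and it assumes the stream delivers one JSON object per "\r\n"-terminated line. Authentication is left out. The callback returns -1 once enough items are collected, which can never equal the number of bytes received, and the helper treats the resulting write error as the expected outcome.

```python
class TweetCollector:
    """Hypothetical helper: buffers stream data and aborts after `limit` items."""

    def __init__(self, limit=10):
        self.limit = limit
        self.tweets = []
        self._pending = b""

    def body_callback(self, buf):
        # libcurl delivers arbitrary byte chunks, so buffer until a full line arrives.
        # Assumption: each item ends with "\r\n".
        self._pending += buf
        *lines, self._pending = self._pending.split(b"\r\n")
        for line in lines:
            if line.strip() and len(self.tweets) < self.limit:
                self.tweets.append(line)
        if len(self.tweets) >= self.limit:
            # Any value other than len(buf) makes libcurl abort the transfer;
            # -1 can never equal a byte count.
            return -1
        # PycURL treats None as "all bytes were consumed", so the transfer continues.
        return None


def stream_until_limit(url, collector):
    """Hypothetical wiring: run the transfer until the collector aborts it."""
    # Imported here so the collector itself can be used without PycURL installed.
    import pycurl

    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, url)
    curl.setopt(pycurl.WRITEFUNCTION, collector.body_callback)
    try:
        curl.perform()
    except pycurl.error as exc:
        # A write error is what the deliberate abort produces; re-raise anything else.
        if exc.args[0] != pycurl.E_WRITE_ERROR:
            raise
    finally:
        # Safe to close here because perform() is no longer running.
        curl.close()
    return collector.tweets
```

Keeping the counting logic in the callback and the close() call after perform() avoids the original problem: the handle is never closed while a transfer is still in progress.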
