Twitter Streaming API - urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead


Problem description


I'm running a Python script using tweepy that streams a random sample of English tweets (via the Twitter Streaming API) for a minute, then alternates to searching (via the Twitter Search API) for a minute, and then returns to streaming. The issue I've found is that after roughly 40+ seconds the streaming crashes with the following error:

Full error:


urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))


The number of bytes read can vary from 0 to well into the thousands.


The first time this happens, the streaming cuts out prematurely and the search function starts early; after the search function finishes, the script returns to the stream once again, and on the second occurrence of this error the code crashes.

The code I'm running is:

# Handles date/time calculation
def calculateTweetDateTime(tweet):
    tweetDateTime = str(tweet.created_at)

    tweetDateTime = ciso8601.parse_datetime(tweetDateTime)
    time.mktime(tweetDateTime.timetuple())  # result is unused; this line can likely be dropped
    return tweetDateTime

# Checks whether the permitted time (60 s) has passed.
# Note: time.clock() was removed in Python 3.8; use time.perf_counter() there.
def hasTimeThresholdPast():
    global startTime
    return time.clock() - startTime > 60

# Override tweepy.StreamListener to add logic to on_status
# (subclassing the imported StreamListener and reusing its name works,
#  but a distinct name such as MyStreamListener would be clearer)
class StreamListener(StreamListener):

    def on_status(self, tweet):
        if hasTimeThresholdPast():
            return False

        if hasattr(tweet, 'lang'):
            if tweet.lang == 'en':

                try:
                    tweetText = tweet.extended_tweet["full_text"]
                except AttributeError:
                    tweetText = tweet.text

                tweetDateTime = calculateTweetDateTime(tweet)

                entityList = DataProcessing.identifyEntities(True, tweetText)
                DataStorage.storeHotTerm(entityList, tweetDateTime)
                DataStorage.storeTweet(tweet)


    def on_error(self, status_code):
        if status_code == 420:
            # Returning False in on_data disconnects the stream
            return False


def startTwitterStream():

    searchTerms = []

    myStreamListener = StreamListener()
    twitterStream = Stream(auth=api.auth, listener=myStreamListener)
    global geoGatheringTag
    # Note: tweepy renamed the `async` argument to `is_async` in v3.7,
    # since `async` became a reserved word in Python 3.7.
    if not geoGatheringTag:
        twitterStream.filter(track=['the', 'this', 'is', 'their', 'though', 'a', 'an'],
                             is_async=True, stall_warnings=True)
    else:
        twitterStream.filter(track=['the', 'this', 'is', 'their', 'though', 'a', 'an', "they're"],
                             is_async=False, locations=[-4.5091, 55.7562, -3.9814, 55.9563],
                             stall_warnings=True)



# ----------------------- Twitter API Functions ------------------------
# XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
# --------------------------- Main Function ----------------------------

startTime = 0


def main():
    global startTime
    userInput = ""
    while userInput != "-1":
        userInput = input("Type ACTiVATE to activate the Crawler, or DATABASE to access data analytic option (-1 to exit): \n")
        if userInput.lower() == 'activate':
            while(True):
                startTime = time.clock()

                startTwitterStream()

                startTime = time.clock()
                startTwitterSearchAPI()

if __name__ == '__main__':
    main() 


I've trimmed out the search function and the database-handling aspects, given that they're separate, to avoid cluttering up the code.


If anyone has any idea why this is happening and how I might solve it, please let me know; I'd be curious about any insight.

Solutions I've tried:

A Try/Except block with http.client.IncompleteRead:
As per Error-while-fetching-tweets-with-tweepy

Setting stall_warnings = True:
As per Incompleteread-error-when-retrieving-twitter-data-using-python

Removing the English language filter.
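For reference, the Try/Except approach listed above can be sketched roughly as follows. This is a minimal illustration, not the asker's exact code: `run_stream_with_retry` is a hypothetical helper, and in practice the IncompleteRead usually surfaces wrapped in urllib3's ProtocolError, so a handler should catch both.

```python
# Sketch of the attempted Try/Except approach: restart the stream when the
# connection drops mid-read. run_stream_with_retry is a hypothetical helper.
import http.client

try:
    # urllib3 ships alongside requests; stub it out if unavailable
    from urllib3.exceptions import ProtocolError
except ImportError:
    class ProtocolError(Exception):
        pass

def run_stream_with_retry(start_stream, max_retries=3):
    """Call start_stream(); retry if the connection breaks mid-read."""
    for attempt in range(1, max_retries + 1):
        try:
            start_stream()   # e.g. twitterStream.filter(track=[...])
            return True      # stream ended cleanly
        except (http.client.IncompleteRead, ProtocolError) as err:
            print(f"Stream dropped (attempt {attempt}): {err!r}")
    return False             # gave up after max_retries drops
```

As the asker notes below, this only papers over the symptom; the stream still drops if the underlying cause (a processing backlog) remains.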

Recommended answer

Solved.


To those curious or experiencing a similar issue: after some experimentation I discovered that a backlog of incoming tweets was the issue. Every time the system received a tweet, it ran an entity-identification and storage process that cost a small amount of time; over the course of gathering several hundred to a thousand tweets, this backlog grew larger and larger until the API couldn't handle it and threw that error.


Solution: strip your "on_status/on_data/on_success" function down to the bare essentials and handle any computation, i.e. storing or entity identification, separately after the streaming session has closed. Alternatively, you could make your computation much more efficient so the time gap becomes insubstantial; up to you.
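A minimal sketch of that fix, assuming illustrative names: `BufferingListener` and `process_buffered_tweets` below are hypothetical stand-ins for the question's StreamListener, DataProcessing, and DataStorage, not real tweepy or project API.

```python
# Sketch of the accepted fix: the listener only buffers raw tweets, and the
# expensive work (entity identification, storage) runs after the streaming
# window closes, so no backlog builds up during the stream.
import time

class BufferingListener:
    """Collects raw tweets; does no per-tweet computation."""
    def __init__(self, window_seconds=60):
        self.buffer = []
        self.deadline = time.monotonic() + window_seconds

    def on_status(self, tweet):
        if time.monotonic() > self.deadline:
            return False            # returning False disconnects the stream
        self.buffer.append(tweet)   # cheap: just store, process later

def process_buffered_tweets(buffer):
    # Placeholder for the entity identification + storage done AFTER streaming;
    # here it just keeps the English tweets, mirroring the question's filter.
    return [t for t in buffer if getattr(t, "lang", None) == "en"]
```

After `on_status` returns False and the stream disconnects, the main loop would call `process_buffered_tweets` on the collected buffer before switching to the search phase.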
