How do I decode tweets from the Twitter API?


Question

I am running the following code, which gives me the tweets that contain the word "cat"; however, at some point I get an error.

The code is:

import tweepy
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import json
import sentmod as s

#consumer key, consumer secret, access token, access secret.
ckey= "xxxxx"
csecret="xxxx"
atoken="xxxxx"
asecret="xxxxx"

class listener(StreamListener):
    
    def on_data(self, data):
        all_data = json.loads(data)
        tweet = all_data["text"]       
        sentiment_value, confidence = s.sentiment(tweet)
        tweet.encode('utf-8', 'ignore')
        if "RT" in tweet:
            pass
        else:
            tweets=open("tweets.txt","a",encoding="utf-8")
            tweets.write(tweet)
            tweets.write('\n')
            tweets.write(str(sentiment_value))
            tweets.write('\n')
            tweets.write(str(confidence))
            tweets.write('\n\n\n')
            tweets.close()
            print(tweet, sentiment_value, confidence)
            if confidence*100 >= 60:
                output = open("twitter-out.txt","a")
                output.write(sentiment_value)
                output.write('\n')
                output.close()
                return True


    def on_error(self, status):
        print(status)

auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)

twitterStream = Stream(auth, listener())
twitterStream.filter(track=['Cat'],languages=['en'])  #locations=[]











I get the following error after a few tweets:



print(tweet, sentiment_value, confidence)
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 44-44: Non-BMP character not supported in Tk

What I have tried:

I tried decoding and encoding with utf-8 and utf-16, but that did not work.

Recommended answer

I suggest a Google search would be your best route. Given that I am not about to run your code to see the tweet dump it creates, the highly likely issue is that the data you are getting contains emoji (which I read about in the first 5 minutes of googling), and these are not supported by the UCS-2 codec.

python - 'UCS-2' codec can't encode characters in position 1050-1050 - Stack Overflow

"Non-BMP character" Unicode error · Issue #16 · geduldig/TwitterAPI · GitHub

Can't encode Non-BMP character · Issue #624 · tweepy/tweepy · GitHub

UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 44-44: Non-BMP character not supported in Tk - Google Search
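
One common workaround for this kind of error is to replace every character outside the Basic Multilingual Plane (code points above U+FFFF, which includes most emoji) before printing. The following is only a minimal sketch of that idea, not code from the question; the helper name `to_bmp` and the sample string are made up for illustration.

```python
import re

# Matches any code point above U+FFFF, i.e. outside the Basic Multilingual Plane.
_NON_BMP = re.compile("[\U00010000-\U0010FFFF]")


def to_bmp(text, replacement="\ufffd"):
    """Return text with every non-BMP character replaced (hypothetical helper)."""
    return _NON_BMP.sub(replacement, text)


sample = "I love my cat \U0001F63A"  # ends with a non-BMP emoji
safe = to_bmp(sample)

# ascii() keeps this demo printable on any console; in the listener you
# would pass the cleaned string itself to print().
print(ascii(safe))
```

In the listener from the question, this would mean printing `to_bmp(tweet)` instead of `tweet`, while still writing the unmodified tweet to the UTF-8 file.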


My problem was that I was encoding the tweets using utf-8; however, the correct encoding is utf-16be, which I took from this link:
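
As a rough illustration of what encoding with utf-16be does to a tweet containing an emoji, the sketch below encodes a sample string and shows that the emoji becomes a surrogate pair of two 16-bit units. The function name `encode_utf16be` and the sample text are assumptions for this example; the result is a `bytes` object, so printing it shows an escaped representation rather than the emoji itself.

```python
def encode_utf16be(text):
    """Encode text as big-endian UTF-16 without a byte order mark."""
    return text.encode("utf-16be")


tweet = "cat \U0001F63A"  # four BMP characters plus one non-BMP emoji
encoded = encode_utf16be(tweet)

# Each BMP character takes 2 bytes; the emoji takes 4 (a surrogate pair).
print(len(encoded))
print(encoded)

# The encoding is lossless: decoding gives back the original string.
decoded = encoded.decode("utf-16be")
```

Because the printed value is the escaped form of the bytes, it contains only characters that a Tk-based console can display.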

