How can I fix encoding in tweepy?


Question


I'm working with the tweepy module and tried the code below to collect tweets written in Turkish. Turkish uses some characters that are not in ASCII, such as ğ, ş, ö, ç, İ and ı, and I need the whole text to come through with those characters intact.

from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import time

ckey    = "****"
csecret  = "****"
atoken  = "****"
asecret = "****"


class listener(StreamListener):

    def on_data(self,data):
        tweet    = data.split(',"text":"')[1].split('","source')[0]
        print tweet
        saveThis = str(time.time()) + '::' + tweet
        saveData = open("archive.csv","a")
        saveData.write(saveThis)
        saveData.write("\n")
        saveData.close()
        return True

    def on_error(self,status):
        print status

auth = OAuthHandler(ckey,csecret)
auth.set_access_token(atoken,asecret)

twitterStream = Stream(auth,listener() )
twitterStream.filter(track = ["galatasaray"]) 

It gives me output like this:

1445282560.38::RT @EndlesGALA: T\u00fcrkiyedir #Galatasaray \nKim daha b\u00fcy\u00fck tart\u0131smayal\u0131m isterseniz http:\/\/t.co\/0K6jLC0CHd
1445282563.02::RT @Gkhanutkan1907: Galatasaray'\u0131 Her Sene Yenmek Bizim \u0130\u00e7in Ba\u015far\u0131 De\u011fil Genlerimizde Olan Bir GELENEKT\u0130R!
1445282563.26::22:22 GALATASARAY
1445282564.84::RT @mthnzncrkrn: Karanl\u0131k Elbet Kavu\u015fur Ayd\u0131nl\u0131\u011fa. Allah yard\u0131mc\u0131n olsun Kadir BABA .\nGALATASARAY taraftar\u0131 hep seninle! #KadirAkta\u015fSu\u00e7suzd\u2026
1445282569.29::RT @EndlesGALA: T\u00fcrkiyedir #Galatasaray \nKim daha b\u00fcy\u00fck tart\u0131smayal\u0131m isterseniz http:\/\/t.co\/0K6jLC0CHd
1445282570.29::Fenerbah\u00e7e - Galatasaray derbisinin biletlerine yo\u011fun ilgi:  https:\/\/t.co\/VZ2whsiZNo
1445282571.2::RT @EndlesGALA: T\u00fcrkiyedir #Galatasaray \nKim daha b\u00fcy\u00fck tart\u0131smayal\u0131m isterseniz http:\/\/t.co\/0K6jLC0CHd
1445282571.95::Kom\u015fularla s\u0131f\u0131r sorundan s\u0131f\u0131r kom\u015fuya!! #galatasaray #NTV #yakinda #Bug\u00fcnTV #Ak\u015fam #Takvim #D\u00fcnyaKahveG\u00fcn\u00fc #menzil https:\/\/t.co\/bcHNoG4UMN

How can I fix this?

Solution

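tweepy gives you JSON-encoded data, like almost every web service in the world. The \uXXXX sequences and the \/ in your output are JSON string escapes, not corrupted characters. They survive because the raw JSON text is being cut apart by hand: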

tweet    = data.split(',"text":"')[1].split('","source')[0]

That's not the right way to deal with JSON. Do this instead:

import json (at top of script)

then:

class listener(StreamListener):
    def on_data(self,data):
        tweet = json.loads(data)
        print tweet

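tweet will be an ordinary Python object, here a dict. The tweet body is then available as tweet['text'], with the \uXXXX escapes already decoded into real characters. Note that printing the whole dict shows the repr of each string, so on Python 2 you will still see escapes there; print tweet['text'] to see the characters themselves.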
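To carry this through to the archive file from the question, here is a minimal sketch. The helper names `extract_text` and `save_tweet` are hypothetical, the `sample` string is a hand-made payload rather than real stream output, and replacing newlines with spaces is an assumption made here so that each tweet stays on one line of the file. The code is written to run on both Python 2 and Python 3.

```python
import io
import json
import time


def extract_text(raw):
    """Parse one raw JSON message and return its text, or None if it has none."""
    message = json.loads(raw)
    # Not every stream message is a tweet (delete notices, for example),
    # so the "text" key may be missing.
    return message.get("text")


def save_tweet(raw, path="archive.csv"):
    """Append 'timestamp::text' to path as UTF-8; return True if a line was written."""
    text = extract_text(raw)
    if text is None:
        return False
    # Assumption: flatten real newlines so one tweet occupies one line.
    line = u"{0}::{1}\n".format(time.time(), text.replace(u"\n", u" "))
    # io.open with an explicit encoding behaves the same on Python 2 and 3.
    with io.open(path, "a", encoding="utf-8") as handle:
        handle.write(line)
    return True


# Hand-made sample in the same shape as the raw data shown in the question.
sample = '{"text": "T\\u00fcrkiyedir #Galatasaray", "source": "web"}'
print(extract_text(sample))
```

Inside `on_data` you would then call `save_tweet(data)` and return True, in place of the split-based lines. Writing through `io.open(..., encoding="utf-8")` avoids the `str()` and implicit ASCII conversions that tend to fail on characters such as ş or ğ under Python 2.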
