Tweepy Streaming API returning "None" for coordinates on geo-enabled tweets


Problem description

I am using Tweepy to access the streaming API. I am able to get results with the code below, but for tweets where the Geo Enabled value is "True", the Coordinates value comes back as "None". How can this be? Do I need to decode the JSON object returned for status.coordinates?

from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
import random
import time
import MySQLdb
import json

consumer_key="XXX"
consumer_secret="XXX"

access_token="XXX"
access_token_secret="XXX"

db=MySQLdb.connect(host='localhost', user='XXX', passwd='XXX', db='twitter')
db.set_character_set('utf8')

Coords = dict()
Place = dict()
PlaceCoords = dict()
XY = []
curr=db.cursor()

class StdOutListener(StreamListener):
    """ A listener that handles tweets received from the stream.
    This is a basic listener that inserts tweets into MySQLdb.
    """
    def on_status(self, status):
        print "Tweet Text: ",status.text
        text = status.text
        print "Time Stamp: ",status.created_at
        print "Time Stamp: ",status.created_at
        print "Source: ",status.source
        source = status.source
        print "Author: ",status.user.screen_name
        author = status.user.screen_name
        print "Name: ",status.user.name
        name = status.user.name
        print "Time Zone: ",status.user.time_zone
        time_zone = status.user.time_zone
        print "User Language: ",status.user.lang
        user_language = status.user.lang
        print "Followers: ",status.user.followers_count
        followers = status.user.followers_count
        print "User Description: ",status.user.description
        user_description = status.user.description
        print "Geo Enabled: ",status.user.geo_enabled
        geo_enabled = status.user.geo_enabled
        print "Friends: ",status.user.friends_count
        friends = status.user.friends_count
        print "Retweets: ",status.retweet_count
        retweets = status.retweet_count
        print "Location: ",status.user.location
        location = status.user.location
        print "ID: ",status.user.id_str
        user_id = status.user.id_str
        print "Coordinates: ",status.coordinates
        coordinates = status.coordinates
        print "Place: ",status.place
        place = status.place

Here is a sample result output:

Tweet Text: @aranone aran tu eres el mejor soy tu fanatico 1 me gusta tu musica.hey pana sique asi q vay bn te deseo lo mejor bro)
Time Stamp: 2013-05-30 23:36:38
Time Stamp: 2013-05-30 23:36:38
Source: web
Author: juandvd_96
Name: juan David Romero
Time Zone: Atlantic Time (Canada)
User Language: es
Followers: 365
User Description: hola soy juan david... soy una chico muy enamorado... y soy muy fekiz...
Geo Enabled: True
Friends: 1857
Retweets: 0
Location: veezuela maracaibo
ID: 481513551
Coordinates: None
Place: None

Cheers, BD

Thanks for clarifying. I was checking the listener just now and noticed a tweet where coordinates were populated, but as a JSON object. I am writing tweets to a MySQL database as they are streamed, and it seems the tweet with the coordinates info was not inserted. I am not sure whether the errors around the SQL statement are for the first or the second tweet; both columns where the error occurred are set to 'varchar'. Here is the streaming result:

Tweet Text: Vi 10 minutos y no pude ver mas. Soy super cagona, dios. Vay a ver otra.
Time Stamp: 2013-06-04 01:08:57
Time Stamp: 2013-06-04 01:08:57
Source: web
Author: ailenvalli
Name: Λili
Time Zone: Santiago
User Language: es
Followers: 384
User Description: Create your reality or it will be created for you
http://instagram.com/ailenvalli
Geo Enabled: True
Friends: 338
Retweets: 0
Location: 704 East Broadway ▲ 1966
ID: 200264965
Coordinates: None
Place: None

firehose_geo.py:87: Warning: Incorrect string value: '\xCE\x9Bili' for column 'Name' at row 1
(text,status.created_at,status.created_at,source,author,name,time_zone,user_language,followers,user_description,geo_enabled,friends,retweets,location,user_id,coordinates,geo))
firehose_geo.py:87: Warning: Incorrect string value: '\xE2\x96\xB2 19...' for column 'Location' at row 1
(text,status.created_at,status.created_at,source,author,name,time_zone,user_language,followers,user_description,geo_enabled,friends,retweets,location,user_id,coordinates,geo))

Tweet Text: I have a feeling WalMart is fixing to take a chunk out of my wallet. Healthy food is so expensive.
Time Stamp: 2013-06-04 01:42:00
Time Stamp: 2013-06-04 01:42:00
Source: Twitter for Android
Author: KaylaRenae21
Name: †Kayla Renae'
Time Zone: Central Time (US & Canada)
User Language: en
Followers: 300
User Description: The things I like to do cannot be found in the city. Hand me a fishing pole & I'll be gone all day.
Geo Enabled: True
Friends: 437
Retweets: 0
Location: Oklahoma
ID: 282414509
Coordinates: {'type': 'Point', 'coordinates': [-96.6623549, 34.7918959]}
Place: {'type': 'Point', 'coordinates': [34.7918959, -96.6623549]}
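The last tweet in the output shows status.coordinates arriving as a dict rather than a string, with the point in GeoJSON order (longitude first), while the value printed on the "Place" line has the two numbers the other way round. Below is a minimal Python 3 sketch of flattening such a dict into values a varchar or numeric column can hold. The helper name coordinates_to_row is hypothetical, and whether this explains the missing row is an assumption, since the INSERT statement itself is not shown in the question.

```python
import json


def coordinates_to_row(coordinates):
    """Flatten a GeoJSON-style point into (json_text, longitude, latitude).

    Returns (None, None, None) when the tweet carries no coordinates.
    """
    if coordinates is None:
        return None, None, None
    # GeoJSON order: longitude first, then latitude.
    lon, lat = coordinates["coordinates"]
    # A dict cannot be bound to a varchar column directly, so serialize it.
    return json.dumps(coordinates, sort_keys=True), lon, lat


# Sample value copied from the "Coordinates:" line in the output above.
sample = {"type": "Point", "coordinates": [-96.6623549, 34.7918959]}
json_text, lon, lat = coordinates_to_row(sample)
print(json_text)
print(lon, lat)
```

Storing longitude and latitude in separate numeric columns alongside the JSON text keeps the row queryable without parsing the string again.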

Recommended answer

The problem is not related to tweepy itself.

For example, see this tweet (https://api.twitter.com/1/statuses/show.json?id=341458303064354817&include_entities=true): it has geo_enabled set to true, while geo, coordinates and place are all null.

According to the Twitter docs:

geo_enabled: When true, indicates that the user has enabled the possibility of geotagging their Tweets.

So, it's not a strict rule that the tweet data will contain location info when geo_enabled is true. Just check in your listener whether status.geo or status.coordinates is not None.
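The check described above could be sketched as follows in Python 3. The helper name extract_lon_lat is hypothetical, and SimpleNamespace objects stand in for tweepy status objects so that the snippet runs without a live stream. It assumes coordinates is either None or a dict shaped like the one in the question's output.

```python
from types import SimpleNamespace


def extract_lon_lat(status):
    """Return (longitude, latitude) for a geotagged status, else None.

    Assumes status.coordinates is either None or a GeoJSON-style dict
    such as {'type': 'Point', 'coordinates': [lon, lat]}.
    """
    point = getattr(status, "coordinates", None)
    if not point:
        # geo_enabled may still be True on the user; this tweet has no tag.
        return None
    lon, lat = point["coordinates"]
    return lon, lat


# Stand-ins for status objects: one untagged tweet, one tagged tweet.
untagged = SimpleNamespace(coordinates=None)
tagged = SimpleNamespace(
    coordinates={"type": "Point", "coordinates": [-96.6623549, 34.7918959]}
)

print(extract_lon_lat(untagged))  # None
print(extract_lon_lat(tagged))    # (-96.6623549, 34.7918959)
```

Inside a listener, the same function could be called at the top of on_status to decide whether to store a location at all.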

Hope that helps.
