Tweepy - Exclude Retweets

Question

The ultimate goal is to use the tweepy API search to focus on a topic (e.g. docker) and to EXCLUDE retweets. I have looked at other threads that mention excluding retweets, but they were not completely applicable. I have tried to incorporate what I've learned into the code below, but I believe the "if not" piece of code is in the wrong place. Any help is greatly appreciated.

#!/usr/bin/python
import tweepy
import csv #Import csv
import os

# Consumer keys and access tokens, used for OAuth
consumer_key = 'MINE'
consumer_secret = 'MINE'
access_token = 'MINE'
access_token_secret = 'MINE'

# OAuth process, using the keys and tokens
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)


api = tweepy.API(auth)
# Open/Create a file to append data
csvFile = open('docker1.csv', 'a')
#Use csv Writer
csvWriter = csv.writer(csvFile)


ids = set()
for tweet in tweepy.Cursor(api.search, 
                    q="docker", 
                    since="2016-08-09",
                    #until="2014-02-15", 
                    lang="en").items(5000000):
if not tweet['retweeted'] and 'RT @' not in tweet['text']:
    #Write a row to the csv file/ I use encode utf-8
    csvWriter.writerow([tweet.created_at, tweet.text.encode('utf-8'), tweet.favorite_count, tweet.retweet_count, tweet.id, tweet.user.screen_name])
    #print "...%s tweets downloaded so far" % (len(tweet.id))
    ids.add(tweet.id) # add new id
    print("number of unique ids seen so far: {}".format(len(ids)))
csvFile.close()

Recommended answer

Filtering at the API level:

q='your_search -filter:retweets'

Read more about this here.
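
As a minimal sketch of the API-level approach, the query string can be built once and then passed as q to the search call. The helper name build_query is hypothetical and not part of tweepy; the only thing taken from the answer is the -filter:retweets operator appended to the search term.

```python
def build_query(topic, exclude_retweets=True):
    """Return a search query string, optionally excluding retweets.

    build_query is an illustrative helper, not a tweepy function.
    """
    query = topic.strip()
    if exclude_retweets:
        # Search operator from the answer: drops retweets on the server side
        query += " -filter:retweets"
    return query


query = build_query("docker")
print(query)

# Assumed usage with the question's existing api object:
# for tweet in tweepy.Cursor(api.search, q=query, lang="en").items(100):
#     ...
```

With this in place the loop body no longer needs its own retweet check, since the retweets should not be returned at all.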

The naive approach is to filter in your code:

tweet is an object, not JSON or a dict, so you should not access it like tweet['retweeted'] or tweet['text'].

Use this line instead:

if not tweet.retweeted:

Or, for your use case:

if (not tweet.retweeted) and ('RT @' not in tweet.text):
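
To show where that check belongs, here is a rough sketch of the loop with the condition inside the loop body, so it runs once per tweet. It uses stand-in objects built with SimpleNamespace instead of real API results, and the names is_original, write_original_tweets and fake_tweet are hypothetical helpers, not part of tweepy. The CSV columns are trimmed to three for brevity.

```python
import csv
import io
from types import SimpleNamespace


def is_original(tweet):
    """Apply the answer's condition using attribute access."""
    return (not tweet.retweeted) and ("RT @" not in tweet.text)


def write_original_tweets(tweets, out_file):
    """Write one CSV row per non-retweet and return the set of ids written."""
    writer = csv.writer(out_file)
    ids = set()
    for tweet in tweets:
        # The check sits inside the for loop, so every tweet is tested
        if is_original(tweet):
            writer.writerow([tweet.id, tweet.user.screen_name, tweet.text])
            ids.add(tweet.id)
    return ids


def fake_tweet(tweet_id, text, retweeted=False):
    """Build a stand-in for a tweet object (illustrative only)."""
    user = SimpleNamespace(screen_name="example_user")
    return SimpleNamespace(id=tweet_id, text=text, retweeted=retweeted, user=user)


sample = [
    fake_tweet(1, "docker 1.12 is out"),
    fake_tweet(2, "RT @someone: docker 1.12 is out"),
]
buffer = io.StringIO()
seen = write_original_tweets(sample, buffer)
print("number of unique ids seen so far: {}".format(len(seen)))
```

In the real script, the tweepy.Cursor(...).items(...) iterator would take the place of sample and the opened CSV file would take the place of buffer. Note that, as far as I know, the retweeted field reflects whether the authenticated user has retweeted the tweet, so the 'RT @' text check is the part doing most of the work here.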
