Transforming Tweepy data from text file into dataframe
Problem description
I am trying to create a dataframe from Tweepy data I pulled into a text file.
However, when I try to create the dataframe with the columns I want, nothing is generated. The code runs, but there is no output.
Here is the code:
#writing text file
with open("jsontweet3.txt", "a") as txtfile:
    txtfile.write('tweet_id retweet_count favorite_count \n')
#pulling tweet info
for tweet_id in fdf.tweet_id:
    try:
        twitinfo = tweetapi.get_status(str(tweet_id), tweet_mode='extended')
    except:
        # Not able to get tweet --> add to failed_tweets list
        failed_tweets.append(tweet_id)
    else:
        # only gets executed if the try clause did not fail
        retweets = twitinfo.retweet_count
        favorites = twitinfo.favorite_count
        txtfile.write(str(twitinfo)+' '+str(retweets)+' '+str(favorites)+'\n')
tdf = pd.DataFrame(columns=['tweet_id','retweet_count','favorite_count'])
with open('jsontweet3.txt','r') as file:
    for line in file:
        twitinfo,retweets,favorites= line[:-1].split(' ')
        tdf = tdf.append({'tweet_id':twitinfo,'retweet_count':retweets,'favorite_count':favorites},ignore_index=True)
tdf
Any help is much appreciated!
Recommended answer
Aside from my comments on the indentation of the for loop and the .readlines(), I'd suggest either:
1) writing the tweepy data as a CSV (separated with commas instead of spaces), and then pd.read_csv() will generate the dataframe
2) creating the dataframe at the same time you are creating the text file. Simply create tdf before the first for loop, and then put your tdf.append() line next to the txtfile.write() call.
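A minimal sketch of both suggestions, under stated assumptions: the `fetched` list is a hypothetical stand-in for the values read from each Tweepy status (the numbers are made up), and the CSV is written to a temporary directory rather than next to the script. For option 2 the rows are collected in a list and the dataframe is built once at the end, because DataFrame.append() was removed in pandas 2.0.

```python
import csv
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for (tweet id, retweet_count, favorite_count)
# taken from each Tweepy status; these values are invented for the example.
fetched = [
    (1001, 12, 34),
    (1002, 0, 5),
    (1003, 7, 21),
]

columns = ["tweet_id", "retweet_count", "favorite_count"]

# Option 1: write comma-separated rows, then let pandas parse the file.
csv_path = os.path.join(tempfile.mkdtemp(), "jsontweet3.csv")
with open(csv_path, "w", newline="") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(columns)  # header row
    for tweet_id, retweets, favorites in fetched:
        writer.writerow([tweet_id, retweets, favorites])

tdf_from_csv = pd.read_csv(csv_path)

# Option 2: collect one dict per tweet inside the loop and build the
# dataframe once afterwards (replaces the per-row tdf.append() call).
rows = []
for tweet_id, retweets, favorites in fetched:
    rows.append(
        {
            "tweet_id": tweet_id,
            "retweet_count": retweets,
            "favorite_count": favorites,
        }
    )

tdf_from_rows = pd.DataFrame(rows, columns=columns)

print(tdf_from_csv)
print(tdf_from_rows)
```

In the real loop, the tuple values would come from the status object (for example its retweet_count and favorite_count attributes), and the write and the rows.append() would sit in the same else branch so that failed tweets are skipped in both places.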