Encoding error using df.to_csv()
Question
I am trying to save information from tweets (screen_name, created_at and text) into a pandas DataFrame and then save the DataFrame as a CSV file.
I am getting an encoding error.
import tweepy
from tweepy import OAuthHandler
consumer_key = 'bla'
consumer_secret = 'bla'
access_token = 'bla'
access_secret = 'bla'
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)
import pandas as pd
import numpy as np
import datetime
import sys
encoding = sys.stdout.encoding or 'utf-8'
columns = ['Screen_Name', 'Time_Stamp', 'Tweet']
todays_date = datetime.datetime.now().date()
tweetDF = pd.DataFrame(columns=columns)
for tweet in tweepy.Cursor(api.search, q="manhattan", lang="en").items(10):
    lenDF = len(tweetDF)
    tweetDF.loc[lenDF] = [tweet.user.screen_name, tweet.created_at, tweet.text]
tweetDF.to_csv("C:/tweetDF")
Here is the error:
UnicodeEncodeError Traceback (most recent call last)
<ipython-input-11-c0aa5e7ee620> in <module>()
---> 34 tweetDF.to_csv("C:/tweetDF")
C:\Anaconda\lib\site-packages\pandas\core\frame.pyc in to_csv(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, quoting, quotechar, line_terminator, chunksize, tupleize_cols, date_format, doublequote, escapechar, decimal, **kwds)
1187 escapechar=escapechar,
1188 decimal=decimal)
-> 1189 formatter.save()
1190
1191 if path_or_buf is None:
C:\Anaconda\lib\site-packages\pandas\core\format.pyc in save(self)
1465
1466 else:
-> 1467 self._save()
1468
1469 finally:
C:\Anaconda\lib\site-packages\pandas\core\format.pyc in _save(self)
1565 break
1566
-> 1567 self._save_chunk(start_i, end_i)
1568
1569 def _save_chunk(self, start_i, end_i):
C:\Anaconda\lib\site-packages\pandas\core\format.pyc in _save_chunk(self, start_i, end_i)
1592 quoting=self.quoting)
1593
-> 1594 lib.write_csv_rows(self.data, ix, self.nlevels, self.cols, self.writer)
1595
1596 # from collections import namedtuple
pandas\lib.pyx in pandas.lib.write_csv_rows (pandas\lib.c:17992)()
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2666' in position 7: ordinal not in range(128)
I tried various encoding fixes, but without success.
Recommended answer
I found a way to fix it and would like to share:
tweetDF.to_csv("C:/tweetDF", sep='\t', encoding = 'utf-8')
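The traceback points at the 'ascii' codec, which older pandas releases used as the default to_csv encoding on Python 2. The underlying failure can be reproduced without pandas or tweepy. The following is a minimal sketch using only the standard library; the sample row (screen name, timestamp, text) and the file name tweets.tsv are made up for illustration and are not from the original post.

```python
import csv
import io
import os
import tempfile

# Hypothetical sample row containing U+2666, the character named in the traceback.
row = [u"some_user", u"2015-01-01 12:00:00", u"I love \u2666 Manhattan"]

# ASCII only covers code points 0-127, so encoding U+2666 raises UnicodeEncodeError.
try:
    row[2].encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 can represent every Unicode code point, so the same text encodes cleanly.
utf8_bytes = row[2].encode("utf-8")

# Write a tab-separated file with an explicit UTF-8 encoding, then read it back.
path = os.path.join(tempfile.mkdtemp(), "tweets.tsv")
with io.open(path, "w", encoding="utf-8", newline="") as f:
    csv.writer(f, delimiter="\t").writerow(row)

with io.open(path, "r", encoding="utf-8", newline="") as f:
    round_trip = next(csv.reader(f, delimiter="\t"))

print(ascii_ok)
print(round_trip == row)
```

As far as the error message goes, encoding='utf-8' is the argument that addresses it; sep='\t' only changes the delimiter and is probably optional here.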