UnicodeDecodeError: ('utf-8' codec) while reading a csv file
Problem description
I am reading a CSV file into a DataFrame, changing one column, writing the changed values back to the same CSV with to_csv, and then reading that CSV again into another DataFrame. The second read fails with this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 7: invalid continuation byte
My code is:
import pandas as pd
df = pd.read_csv("D:ss.csv")
df.columns  # output: Index(['CUSTOMER_MAILID', 'False', 'True'], dtype='object')
df['True'] = df['True'] + 2  # change one column of type float
df.to_csv("D:ss.csv")  # write the changes back to the same .csv
df1 = pd.read_csv("D:ss.csv")  # read that csv again; this line raises the error
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 7: invalid continuation byte
Please suggest how I can avoid the error and read that CSV into a DataFrame again.
I know I am missing an "encode = some codec type" or "decode = some type" argument somewhere when reading and writing the CSV, but I don't know exactly what should be changed.
Recommended answer
Known encoding
If you know the encoding of the file you want to read in, you can use
pd.read_csv('filename.txt', encoding='encoding')
The possible encodings are listed here: https://docs.python.org/3/library/codecs.html#standard-encodings
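As a rough illustration of the known-encoding case applied to the question's read, modify, write, read cycle, here is a minimal sketch. It assumes the original file is Latin-1 encoded (where byte 0xe7 is "ç"); that is only a guess based on the error message, and the sample row and temporary path are hypothetical. The idea is to name the encoding on the first read and to write back explicitly as UTF-8, so that a later read with the default codec works.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, encoded as Latin-1 (an assumption).
# "\xe7" is "ç", stored as the single byte 0xe7 named in the error message.
raw = "CUSTOMER_MAILID,False,True\nfran\xe7ois@example.com,1.0,2.0\n".encode("latin-1")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ss.csv")
    with open(path, "wb") as f:
        f.write(raw)

    # Reading with the default UTF-8 codec fails on the 0xe7 byte.
    try:
        pd.read_csv(path)
        default_read_failed = False
    except UnicodeDecodeError:
        default_read_failed = True

    # Naming the actual encoding lets the read succeed.
    df = pd.read_csv(path, encoding="latin-1")
    df["True"] = df["True"] + 2

    # Write back as UTF-8 (without the index column) so that
    # a later read with the default codec works.
    df.to_csv(path, index=False, encoding="utf-8")
    df1 = pd.read_csv(path)
```

Passing index=False is a separate choice: without it, to_csv adds an unnamed index column on every round trip.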
Unknown encoding
If you do not know the encoding, you can try chardet. This is not guaranteed to work, though; it is closer to an educated guess.
import chardet
import pandas as pd

with open('filename.csv', 'rb') as f:
    result = chardet.detect(f.read())  # or readline if the file is large
df = pd.read_csv('filename.csv', encoding=result['encoding'])
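If chardet is not available, a different and cruder technique is to try a short list of candidate encodings and keep the first one that decodes the bytes without error. The sketch below uses only the standard library; the function name first_working_encoding, the candidate list, and the sample bytes are all hypothetical. A successful decode does not prove the encoding is correct: latin-1 accepts every byte value, so it acts as a catch-all and should come last.

```python
def first_working_encoding(data, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate that decodes `data` without error, else None."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # this codec rejects the bytes; try the next one
        return name
    return None


# Hypothetical sample: byte 0xe7 followed by "o" is not valid UTF-8.
sample = b"CUSTOMER_MAILID\nfran\xe7ois@example.com\n"
encoding = first_working_encoding(sample)
print(encoding)
```

The returned name can then be passed as the encoding argument of pd.read_csv, in the same way as the chardet result above.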