UnicodeDecodeError in Python 3 when importing a CSV file


Problem description

I'm trying to import a CSV, using this code:

    import csv
    import sys

    def load_csv(filename):
        # Open file for reading
        file = open(filename, 'r')

        # Read in file
        return csv.reader(file, delimiter=',', quotechar='\n')

    def main(argv):
        csv_file = load_csv("myfile.csv")

        for item in csv_file:
            print(item)

    if __name__ == "__main__":
        main(sys.argv[1:])

Here's a sample of my csv file:

    foo,bar,test,1,2
    this,wont,work,because,α

And the error:

    Traceback (most recent call last):
      File "test.py", line 22, in <module>
        main(sys.argv[1:])
      File "test.py", line 18, in main
        for item in csv_file:
      File "/usr/lib/python3.2/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 40: ordinal not in range(128)

Obviously, it's hitting the character at the end of the CSV and throwing that error, but I'm at a loss as to how to fix this. Any help?

This is:

    Python 3.2.3 (default, Apr 23 2012, 23:35:30)
    [GCC 4.7.0 20120414 (prerelease)] on linux2

Solution

It seems your problem boils down to:

    print("α")

You could fix it by specifying PYTHONIOENCODING:

    $ PYTHONIOENCODING=utf-8 python3 test.py > output.txt

Note:

    $ python3 test.py

should work as is if your terminal configuration supports it, where test.py is:

    import csv

    with open('myfile.csv', newline='', encoding='utf-8') as file:
        for row in csv.reader(file):
            print(row)

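If the open() call above has no encoding parameter, you'll get a UnicodeDecodeError with LC_ALL=C: open() then falls back to the locale's preferred encoding, which is ASCII in the C locale on the Python 3.2 shown in the question.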
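
To make the decoding side concrete, here is a minimal sketch that reads the same data twice, once as UTF-8 and once as ASCII. The sample text, the temporary file, and the `read_rows` helper are assumptions made for illustration; passing `encoding='ascii'` only imitates what open() would pick up from a C locale, it is not the asker's actual setup.

```python
import csv
import os
import tempfile

# Hypothetical sample mirroring the question's file; "α" is b"\xce\xb1" in UTF-8.
SAMPLE = "foo,bar,test,1,2\nthis,wont,work,because,α\n"


def read_rows(path, encoding):
    """Return all CSV rows from path, decoding the file with the given encoding."""
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.reader(f))


# Write the sample as UTF-8 bytes to a temporary file.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(SAMPLE.encode("utf-8"))
    path = tmp.name

try:
    # Explicit UTF-8 decoding works regardless of the locale.
    utf8_rows = read_rows(path, "utf-8")

    # Forcing ASCII stands in for the encoding a C locale would select.
    try:
        read_rows(path, "ascii")
        decode_error = None
    except UnicodeDecodeError as exc:
        decode_error = exc
finally:
    os.remove(path)

print(utf8_rows[1])
print(decode_error)
```

The ASCII read fails on the first byte of "α" (0xce), the same byte the traceback in the question complains about.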

Also, with LC_ALL=C you'll get a UnicodeEncodeError even when there is no redirection, so PYTHONIOENCODING is necessary in that case.
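
The output side can be sketched the same way. The example below prints "α" through an in-memory text stream instead of the real sys.stdout, so the stream encoding can be chosen explicitly; the `render` helper is hypothetical, and the ASCII stream is only an assumed stand-in for what standard output looks like under LC_ALL=C without PYTHONIOENCODING.

```python
import io


def render(text, encoding):
    """Print text through a text layer with the given encoding; return the raw bytes."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="\n")
    print(text, file=stream)
    stream.flush()
    data = raw.getvalue()
    stream.detach()  # leave the underlying buffer alone when the wrapper is discarded
    return data


# A UTF-8 stream (what PYTHONIOENCODING=utf-8 asks for) encodes the character fine.
utf8_bytes = render("α", "utf-8")

# An ASCII stream cannot represent "α", so print() raises UnicodeEncodeError.
try:
    render("α", "ascii")
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc

print(utf8_bytes)
print(encode_error)
```

To see which encoding the real standard output uses in a given environment, inspect `sys.stdout.encoding`.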
