pandas: OSError with accent/special character in file path and file name


Problem description

I am trying to use pandas.read_csv to get data from some .csv files. This works fine as long as there is no accented character (e.g. ä, é, ü) in the file name or file path. As soon as I use a file name such as düm1.csv, I get the following error: OSError: Initializing from file failed. My code is:

dum1 = pd.read_csv(r"C:\Users\MyName\Desktop\dumm12\düm1.csv", sep = ";", decimal = ",", encoding = "utf-8")

I am using pandas 0.20.1 and Python 3.6.0. I found that this was an issue in previous versions, but I thought it had been resolved. Any ideas on how to fix this? I also found this: https://github.com/pandas-dev/pandas/issues/15086

Output of pd.show_versions():

INSTALLED VERSIONS
commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None

pandas: 0.20.1
pytest: 3.0.5
pip: 9.0.1
setuptools: 27.2.0
Cython: None
numpy: 1.11.3
scipy: 0.18.1
xarray: None
IPython: 5.2.2
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.0
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.5.3
html5lib: 0.999
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: 2.9.5
s3fs: None
pandas_gbq: None
pandas_datareader: None

Recommended answer

I had a similar problem. It looks like the problem occurs with pandas.read_csv under Python 3.6 on Windows.

Python 3.6 changed the Windows filesystem encoding from "mbcs" to "UTF-8". See PEP 529. You can call sys.getfilesystemencoding() to get the current filesystem encoding.
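To check which encoding a given interpreter actually uses, a minimal sketch along these lines may help. The helper name describe_fs_encoding is made up for illustration; only sys.getfilesystemencoding() and sys.platform come from the standard library.

```python
import sys


def describe_fs_encoding():
    """Return the encoding Python uses for file names, plus the platform name.

    Hypothetical helper for illustration; it only wraps standard-library calls.
    """
    return sys.getfilesystemencoding(), sys.platform


encoding, platform = describe_fs_encoding()
print(f"platform={platform}, filesystem encoding={encoding}")
```

On Windows with Python 3.6 or later this should typically report "utf-8", and "mbcs" if the legacy mode is active.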

I have two workarounds:

1. Use this code to make the whole application work with the encoding used by Python <= 3.5 ("mbcs"):

import sys
sys._enablelegacywindowsfsencoding()
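Since sys._enablelegacywindowsfsencoding() exists only on Windows builds of Python, a script that also has to run on other platforms may want to guard the call. The following is a sketch under that assumption; the wrapper name enable_legacy_fs_encoding_if_available is hypothetical.

```python
import sys


def enable_legacy_fs_encoding_if_available():
    """Switch to the legacy "mbcs" filesystem encoding where supported.

    Hypothetical wrapper: returns True if the switch was made, False if the
    interpreter does not provide the function (for example, on Linux or macOS).
    """
    switch = getattr(sys, "_enablelegacywindowsfsencoding", None)
    if switch is None:
        return False
    switch()
    return True


switched = enable_legacy_fs_encoding_if_available()
print("legacy encoding enabled:", switched)
```

Call it once at startup, before any file is opened. PEP 529 also describes the PYTHONLEGACYWINDOWSFSENCODING environment variable, which has the same effect without a code change.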

2. Pass a file pointer to pandas.read_csv:

import pandas as pd
# Raw string keeps the backslashes literal; open() decodes the file as UTF-8
with open(r"C:\Users\MyName\Desktop\dumm12\düm1.csv", "r", encoding="utf-8") as fp:
    dum1 = pd.read_csv(fp, sep=";", decimal=",")
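The second workaround can be wrapped in a small helper so that every read goes through Python's own open(). This is only a sketch: the function name read_csv_via_handle and the demo data are invented for illustration, and the demo writes a temporary düm1.csv so that it can run anywhere.

```python
import os
import tempfile

import pandas as pd


def read_csv_via_handle(path, encoding="utf-8", **read_csv_kwargs):
    """Open the file with open() and let pandas parse the handle.

    Hypothetical helper: Python resolves the (possibly accented) path,
    so pandas never has to open the file by name.
    """
    with open(path, "r", encoding=encoding, newline="") as fp:
        return pd.read_csv(fp, **read_csv_kwargs)


# Demo with made-up data in a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "düm1.csv")
    with open(demo_path, "w", encoding="utf-8", newline="") as out:
        out.write("name;value\nä;1,5\né;2,25\n")
    demo_df = read_csv_via_handle(demo_path, sep=";", decimal=",")

print(demo_df)
```

Because the file contents are decoded by open(), the encoding is given there rather than to read_csv.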

See also this post: pandas.read_csv cannot import a file with an accent mark in its path
