python pandas read_excel returns UnicodeDecodeError on describe()


Problem description

I love pandas, but I am having real problems with Unicode errors. Calling describe() on a DataFrame loaded with read_excel() raises the dreaded UnicodeDecodeError:

import pandas as pd
df = pd.read_excel('tmp.xlsx', encoding='utf-8')
df.describe()

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
...
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 259: ordinal not in range(128)

I figured out that the original Excel file had a non-breaking space (U+00A0, which UTF-8 encodes as the bytes 0xC2 0xA0) at the end of many cells, probably to avoid conversion of long digit strings to float.
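
To illustrate why a trailing non-breaking space is consistent with the 0xc2 byte in the traceback, here is a minimal sketch. The sample string is made up, and this is not the code path pandas itself runs; it only shows the encoding behaviour.

```python
# A non-breaking space is the code point U+00A0.
nbsp = u"\u00a0"

# Encoded as UTF-8 it becomes two bytes, the first of which is 0xC2.
encoded = (u"12345" + nbsp).encode("utf-8")

# Decoding those bytes with the ASCII codec fails at the 0xC2 byte.
try:
    encoded.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(repr(encoded))
print(error)
```

Any implicit ASCII decode of such a cell value would fail in the same way.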

One way around this is to strip the cells, but there must be something better.

for col in df.columns:
    df[col]=df[col].str.strip()
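
A slightly more defensive variant of the loop above might look like the sketch below. It assumes Python 3, where str.strip() also removes U+00A0. The helper name strip_string_cells and the sample data are hypothetical; the idea is to strip only values that are actually strings, so numeric columns and NaN cells are left alone.

```python
import pandas as pd


def strip_string_cells(df):
    """Return a copy of df with whitespace stripped from string cells only."""
    out = df.copy()
    for col in out.columns:
        # Non-string values (numbers, NaN) pass through unchanged.
        out[col] = out[col].map(
            lambda v: v.strip() if isinstance(v, str) else v
        )
    return out


# Hypothetical sample: digit strings padded with a non-breaking space.
sample = pd.DataFrame({"id": ["00123\u00a0", "00456\u00a0"], "n": [1, 2]})
cleaned = strip_string_cells(sample)
print(cleaned)
```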

I am using Anaconda 2.2.0 (win64) with pandas 0.16.

Solution

Hope this helps someone.

I had this error:

UnicodeDecodeError: 'ascii' codec can't decode byte ....

It appeared after reading an Excel file with df = pd.read_excel... and then trying to assign a new column to the DataFrame like this: df['new_col'] = 'foo bar'

On closer inspection, the cause was that the DataFrame contained some all-NaN columns, created because some columns in the sheet had no header. After dropping those columns with the following code, everything else was OK.

df = df.dropna(axis=1,how='all')
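
As a small sketch of what that call does (the column names and values here are invented for illustration), how='all' drops only the columns in which every value is NaN, so partially filled columns survive:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["a", "b"],
    "Unnamed: 1": [np.nan, np.nan],  # hypothetical empty, header-less column
    "value": [1.0, np.nan],          # partially filled, so it is kept
})

# Drop columns (axis=1) whose values are all NaN.
df = df.dropna(axis=1, how="all")

# Assigning a new column afterwards, as in the answer.
df["new_col"] = "foo bar"
print(df)
```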
