str error when replacing values in pandas dataframe


Problem description

My code scrapes information from a website and puts it into a dataframe, but I'm not sure why the order of the statements gives rise to this error: AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

Basically, the scraped data has over 20 rows and 10 columns.

  • Some values are in parentheses, e.g. (2,333), and I want to change them to -2333.
  • Some values contain the text n.a., and I want to change them to numpy.nan.
  • Some values are -, and I want to change them to numpy.nan too.
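The three rules above can be sketched for a single cell in plain Python. This is only an illustration: `parse_cell` is a hypothetical helper, not something from the question, and it assumes every other cell is a plain number written as text. It also returns a float rather than the string -2333.

```python
import math


def parse_cell(text):
    """Convert one scraped cell to a float, following the three rules above."""
    text = text.strip()
    # Rules 2 and 3: placeholders for missing data become NaN.
    if text in ("-", "n.a.", "n.a"):
        return math.nan
    # Rule 1: a value wrapped in parentheses is negative.
    negative = text.startswith("(") and text.endswith(")")
    digits = text.strip("()").replace(",", "")
    value = float(digits)
    return -value if negative else value


print(parse_cell("(2,333)"))  # -2333.0
print(parse_cell("2.34"))     # 2.34
print(parse_cell("n.a."))     # nan
```

Because the placeholder check runs before the parentheses check, a lone - is never mistaken for a minus sign.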

Does not work

for final_df, engine_name in zip((df_foo, df_bar, df_far), ['engine_foo', 'engine_bar', 'engine_far']):

    # Replacing necessary items for final clean up
    final_df.replace('-', numpy.nan, inplace=True)
    final_df.replace('n.a.', numpy.nan, inplace=True)

    for i in final_df.columns:
        final_df[i] = final_df[i].str.replace(')', '')
        final_df[i] = final_df[i].str.replace(',', '')
        final_df[i] = final_df[i].str.replace('(', '-')

    # Appending Code to dataframe
    final_df = final_df.T
    final_df.insert(loc=0, column='Code', value=some_code)

# This produces the error - AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

Works

for final_df, engine_name in zip((df_foo, df_bar, df_far), ['engine_foo', 'engine_bar', 'engine_far']):

    # Replacing necessary items for final clean up
    for i in final_df.columns:
        final_df[i] = final_df[i].str.replace(')', '')
        final_df[i] = final_df[i].str.replace(',', '')
        final_df[i] = final_df[i].str.replace('(', '-')

    final_df.replace('-', numpy.nan, inplace=True)
    final_df.replace('n.a.', numpy.nan, inplace=True)

    # Appending Code to dataframe
    final_df = final_df.T
    final_df.insert(loc=0, column='Code', value=some_code)

# This doesn't give me any errors and returns what I want.

Any thoughts on why this happens?

Recommended answer

For me, a double replace works: the first call uses regex=True to replace substrings, and the second replaces whole cell values:

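import numpy as np
import pandas as pd

np.random.seed(23)
# Mixing strings with 2.34 makes NumPy store every choice as a string, so 2.34 becomes '2.34'
df = pd.DataFrame(np.random.choice(['(2,333)','n.a.','-',2.34], size=(3,3)),
                  columns=list('ABC'))
print(df)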
      A     B        C
0  2.34     -  (2,333)
1  n.a.     -  (2,333)
2  2.34  n.a.  (2,333)

df1 = df.replace([r'\(', r'\)', ','], ['-', '', ''], regex=True).replace(['-', 'n.a.'], np.nan)
print(df1)
      A   B      C
0  2.34 NaN  -2333
1   NaN NaN  -2333
2  2.34 NaN  -2333

df1 = df.replace(['-', 'n.a.'], np.nan).replace([r'\(', r'\)', ','], ['-', '', ''], regex=True)
print(df1)
      A   B      C
0  2.34 NaN  -2333
1   NaN NaN  -2333
2  2.34 NaN  -2333
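After either variant the remaining values are still strings such as '-2333'. If numbers are wanted, one possible follow-up is `pd.to_numeric`. This step is an addition and not part of the original answer. The frame below is hand-written to mirror the example rather than generated randomly.

```python
import numpy as np
import pandas as pd

# Hand-written stand-in for the scraped data (an assumption for illustration).
df = pd.DataFrame({
    "A": ["2.34", "n.a.", "2.34"],
    "B": ["-", "-", "n.a."],
    "C": ["(2,333)", "(2,333)", "(2,333)"],
})

# Step 1: substring replacement with regex=True, applied to string cells only.
# Step 2: whole-cell replacement of the placeholders with NaN.
cleaned = (
    df.replace([r"\(", r"\)", ","], ["-", "", ""], regex=True)
      .replace(["-", "n.a."], np.nan)
)

# Step 3 (not in the answer): convert each column to a numeric dtype.
# errors="coerce" turns anything unparseable into NaN instead of raising.
numeric = cleaned.apply(pd.to_numeric, errors="coerce")
print(numeric)
```

The order of steps 1 and 2 does not matter here, because `DataFrame.replace` with regex=True leaves non-string cells alone instead of raising.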

Your error means you are calling str.replace on a non-string column (for example, column B, where every value is NaN):

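# regex=False matches the literal characters, whatever the pandas default for regex is
df1 = df.apply(lambda x: x.str.replace('(', '-', regex=False)
                          .str.replace(')', '', regex=False)
                          .str.replace(',', '', regex=False)).replace(['-','n.a.'], np.nan)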
print(df1)
      A   B      C
0  2.34 NaN  -2333
1   NaN NaN  -2333
2  2.34 NaN  -2333 


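# Replacing the placeholders first leaves column B without any strings, so .str fails on it
df1 = (df.replace(['-','n.a.'], np.nan)
         .apply(lambda x: x.str.replace('(', '-', regex=False)
                           .str.replace(')', '', regex=False)
                           .str.replace(',', '', regex=False)))
print(df1)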

AttributeError: ('Can only use .str accessor with string values, which use np.object_ dtype in pandas', 'occurred at index B')
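The failure can be reproduced in isolation with two small Series. This is a minimal sketch with made-up values, and the exact wording of the message may differ between pandas versions.

```python
import numpy as np
import pandas as pd

strings = pd.Series(["(2,333)", "n.a."])  # string values: .str is available
floats = pd.Series([np.nan, np.nan])      # all-NaN column: dtype float64

# Works: every value is a string.
print(strings.str.replace(",", "", regex=False).tolist())

# Fails: a float64 Series has no .str accessor.
try:
    floats.str.replace(",", "", regex=False)
except AttributeError as exc:
    error_message = str(exc)
    print("AttributeError:", error_message)
```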

The error occurs because the dtype of column B is float64:

df1 = df.replace(['-','n.a.'], np.nan)
print(df1)
      A   B        C
0  2.34 NaN  (2,333)
1   NaN NaN  (2,333)
2  2.34 NaN  (2,333)

print (df1.dtypes)
A     object
B    float64
C     object
dtype: object
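If the placeholders have to be replaced first, one way to avoid the error is to skip numeric columns when applying the string methods. The sketch below is an assumption about how that could look. `strip_accounting_format` is a hypothetical helper name, and the frame is written by hand so that column B is float64 from the start.

```python
import numpy as np
import pandas as pd

# Hand-written frame: B holds only NaN, so pandas stores it as float64.
df1 = pd.DataFrame({
    "A": ["2.34", np.nan, "2.34"],
    "B": [np.nan, np.nan, np.nan],
    "C": ["(2,333)", "(2,333)", "(2,333)"],
})


def strip_accounting_format(column):
    """Rewrite '(2,333)' as '-2333' in text columns; return numeric columns unchanged."""
    if pd.api.types.is_numeric_dtype(column):
        return column  # no strings here, and .str would raise AttributeError
    return (
        column.str.replace("(", "-", regex=False)
              .str.replace(")", "", regex=False)
              .str.replace(",", "", regex=False)
    )


cleaned = df1.apply(strip_accounting_format)
print(cleaned)
```

Missing values inside a text column, such as row 1 of column A, are not a problem: the .str methods pass NaN through unchanged.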
