Pandas strip function removes numeric values as well

Question

I have a dataframe that can be generated with the code below:

import pandas as pd

data_file = pd.DataFrame({
    'studyid': [1, 2, 3],
    'age_interview': [' 56', '57 ', '55'],
    'ethnicity': ['Chinese', 'Indian', 'European'],
    'Marital_status': ['Single', 'Married', 'Widowed'],
    'Smoke_status': ['Yes', 'No', 'No'],
})

Once the dataframe is created, I melt it and apply the strip function:

obs = data_file.melt('studyid', value_name='valuestring').sort_values('studyid')
obs['valuestring'].str.strip()

This works fine on the sample data, but on my real data it removes the numeric values as well. I use the same code as above; only the data is different.

Please find screenshots of the output before and after the strip function.

Output before obs['valuestring'].str.strip()

Output after obs['valuestring'].str.strip()

How can I prevent the numeric values from being removed?

Recommended answer

It looks like your column contains a mix of numbers and strings. Here's a reproducible example:

import numpy as np
import pandas as pd

s = pd.Series([1, np.nan, 'abc ', 2.0, '  def '])
s.str.strip()

0    NaN
1    NaN
2    abc
3    NaN
4    def
dtype: object

The .str methods only operate on string elements; any value that is not a string is implicitly returned as NaN.

The solution is to convert the column, and therefore all of its values, to strings before calling strip.

s.astype(str).str.strip()

0      1
1    nan
2    abc
3    2.0
4    def
dtype: object

In your case, that would be:

obs['valuestring'] = obs['valuestring'].astype(str).str.strip()
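
The real data is not shown in the question, so the following is only a sketch of how the problem could arise after melting. It assumes a made-up frame (the name `raw` and its values are hypothetical) in which `age_interview` holds actual integers instead of strings. After `melt`, the value column mixes integers and strings, so `.str.strip()` turns the integers into NaN, while casting to `str` first keeps them.

```python
import pandas as pd

# Hypothetical data: age_interview holds real integers, not strings.
raw = pd.DataFrame({
    'studyid': [1, 2, 3],
    'age_interview': [56, 57, 55],
    'ethnicity': [' Chinese', 'Indian ', 'European'],
})

# Melting stacks the integer and string columns into one object column.
obs = raw.melt('studyid', value_name='valuestring').sort_values('studyid')

# The .str accessor returns NaN for the non-string (integer) entries.
stripped = obs['valuestring'].str.strip()
n_lost = stripped.isna().sum()

# Casting to str first means every entry is stripped and none is lost.
fixed = obs['valuestring'].astype(str).str.strip()
print(n_lost)
print(fixed.tolist())
```

Note that after `astype(str)` the ages are strings such as `'56'`, not numbers.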


Note that if you want to preserve the NaN values, use mask at the end:

s.astype(str).str.strip().mask(s.isna())

0      1
1    NaN
2    abc
3    2.0
4    def
dtype: object
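
One side effect of `astype(str)` is that the numbers become strings (`2.0` turns into `'2.0'`). If the original types should be kept, one possible alternative, not part of the answer above, is to strip only the elements that are already strings. The helper name `strip_if_str` below is made up for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, np.nan, 'abc ', 2.0, '  def '])

def strip_if_str(value):
    """Strip whitespace from strings; return any other value unchanged."""
    return value.strip() if isinstance(value, str) else value

# Numbers and NaN pass through untouched; only the strings are stripped.
cleaned = s.map(strip_if_str)
print(cleaned.tolist())
```

This applies a Python function to each element, so it may be slower than the vectorized `.str` methods on large columns.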
