Pandas dataframe: Extracting float values from string in a column

Question

I'm trying to extract a floating-point value from the strings in a particular column.

Original output:

DATE        strCondition
4/3/2018    2.9
4/3/2018    3.1, text
4/3/2018    2.6 text
4/3/2018    text, 2.7 

and other variations. I've also tried regex, but my knowledge here is limited. This is what I've come up with:

clean = df['strCondition'].str.contains(r'\d+km')
df['strCondition'] = df['strCondition'].str.extract(r'(\d+)', expand=False).astype(float)

The output ends up looking like this, showing only the leading integer of each value:

DATE        strCondition
4/3/2018    2.0
4/3/2018    3.0
4/3/2018    2.0
4/3/2018    2.0 

My desired output would be along the lines of:

DATE        strCondition
4/3/2018    2.9
4/3/2018    3.1
4/3/2018    2.6
4/3/2018    2.7 

Thanks for your time and input!

I forgot to mention that in my original dataframe there are strCondition entries similar to:

2.9(1.0) #where I would like both numbers to get returned
11/11/2018 #where this date as a string object can be discarded 

Sorry for the inconvenience!

Recommended answer

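Try matching digits, a literal dot, then digits. The dot has to be escaped as \. because a bare . matches any character, which would also match part of a date string such as 11/11/2018:

df['float'] = df['strCondition'].str.extract(r'(\d+\.\d+)', expand=False).astype('float')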

Output:

       DATE strCondition  float
0  4/3/2018          2.9    2.9
1  4/3/2018    3.1, text    3.1
2  4/3/2018     2.6 text    2.6
3  4/3/2018    text, 2.7    2.7
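
The follow-up cases from the question (2.9(1.0), where both numbers are wanted, and 11/11/2018, which should be discarded) are not covered by the output above. Here is a minimal sketch of one way to handle them, assuming every value of interest has the form digits-dot-digits. The sample frame and the column names floats, float and int_only are made up for illustration. It uses str.findall to collect every match per row and also shows why the original (\d+) pattern only returned the leading integer.

```python
import pandas as pd

# Sample data rebuilt from the question, plus the two follow-up cases.
df = pd.DataFrame({
    "DATE": ["4/3/2018"] * 6,
    "strCondition": [
        "2.9", "3.1, text", "2.6 text", "text, 2.7", "2.9(1.0)", "11/11/2018",
    ],
})

# The question's pattern: \d+ stops at the first non-digit, so "2.9" yields "2".
df["int_only"] = df["strCondition"].str.extract(r"(\d+)", expand=False).astype(float)

# Digits, an escaped (literal) dot, digits. A date like 11/11/2018 contains
# no dot, so it produces no match at all.
FLOAT_PATTERN = r"\d+\.\d+"

# First match per row as a float column (NaN where nothing matched).
df["float"] = df["strCondition"].str.extract(f"({FLOAT_PATTERN})", expand=False).astype(float)

# Every match per row as a list of floats (empty list where nothing matched).
df["floats"] = (
    df["strCondition"]
    .str.findall(FLOAT_PATTERN)
    .apply(lambda found: [float(x) for x in found])
)

print(df[["strCondition", "int_only", "float", "floats"]])
```

If one row per number is preferred over a list column, df.explode("floats") is one option. Values without a decimal point, such as a bare 3, would not be matched by this pattern, so it would need adjusting if those occur in the data.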
