Pandas Dataframe: Replace characters conditionally
Question
I have a dataframe with a column named "Size". This column holds values giving the sizes of a list of Android applications.
Size
8.7M
68M
2M
I need to replace these values with:
Size:
8700000
68000000
...
I thought of a function that checks whether the string contains a dot ('.'). If it does, replace the M with five zeros (00000); if not, replace the M with six zeros (000000). Could you help me with this?
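For reference, the string-only idea described above could be sketched roughly as follows. This is only an illustration: `expand_mega` is a hypothetical helper name, and it assumes every value ends in `M` and has at most six decimal digits. Instead of hard-coding five zeros, it pads the fractional part to six places, so a value such as `8.75M` is also handled.

```python
import pandas as pd

# sample data from the question
df = pd.DataFrame({"Size": ["8.7M", "68M", "2M"]})

def expand_mega(size: str) -> int:
    """Hypothetical helper: turn '8.7M' into 8700000 using string operations only."""
    number = size.rstrip("M")  # assumes the value ends with 'M'
    if "." in number:
        whole, frac = number.split(".")
        # pad the fractional digits to six places, e.g. '7' -> '700000'
        return int(whole + frac.ljust(6, "0"))
    # no dot: simply append six zeros
    return int(number + "000000")

df["Size"] = df["Size"].map(expand_mega)
print(df)
```

Working on strings avoids floating-point arithmetic entirely, but it only covers a single suffix; a numeric approach scales better once several units are involved.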
Recommended answer
A general solution that handles multiple unit suffixes:
import pandas as pd

# sample data from the question
df = pd.DataFrame({'Size': ['8.7M', '68M', '2M']})

# multiplier for each unit suffix
_prefix = {'k': 1e3,  # kilo
           'M': 1e6,  # mega
           'B': 1e9,  # billion (10^9)
           }
# all keys of the dict joined by | (regex "or")
k = '|'.join(_prefix.keys())
# extract the number and the optional suffix into a new DataFrame
df1 = df['Size'].str.extract(r'(?P<a>[0-9.]*)(?P<b>' + k + r')?', expand=True)
# convert the numeric column to float
df1['a'] = df1['a'].astype(float)
# map suffixes to multipliers; NaN (no suffix) becomes 1
df1['b'] = df1['b'].map(_prefix).fillna(1)
# multiply the two columns together
df['Size'] = df1['a'].mul(df1['b']).astype(int)
print(df)
Size
0 8700000
1 68000000
2 2000000
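The same steps can be wrapped in a reusable function. The following is a minimal sketch, not part of the original answer: the names `UNITS` and `sizes_to_int` are made up for illustration, the suffix set is an assumption, and the pattern is anchored so that only values of the form "number plus optional suffix" match. It also rounds before converting to integers, because `astype(int)` truncates and a float product can land slightly below the intended whole number.

```python
import pandas as pd

# assumed suffix set; extend as needed
UNITS = {"k": 1e3, "M": 1e6, "B": 1e9}

def sizes_to_int(sizes: pd.Series, units=UNITS) -> pd.Series:
    """Hypothetical helper: convert strings such as '8.7M' to integers."""
    # number, then an optional unit suffix, anchored to the whole string
    pattern = r"^(?P<number>\d*\.?\d+)(?P<unit>" + "|".join(units) + r")?$"
    parts = sizes.str.extract(pattern)
    # values without a suffix get a multiplier of 1
    factor = parts["unit"].map(units).fillna(1)
    # round before the integer cast to avoid truncating e.g. 1099999.9999
    return parts["number"].astype(float).mul(factor).round().astype("int64")

sample = pd.Series(["8.7M", "512k", "1B", "42"])
result = sizes_to_int(sample)
print(result)
```

Values that do not match the pattern come back as NaN from `str.extract`, so the final integer cast will raise an error; such rows would need to be filtered or filled before calling the function.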
If you only need to replace M, the solution can be simplified:
df['Size'] = df['Size'].str.replace('M', '').astype(float).mul(1e6).astype(int)
print (df)
Size
0 8700000
1 68000000
2 2000000