Use values in a dictionary to replace values in a column
Question
import pandas as pd

df = pd.DataFrame({
    'Data': ['Hey this is 123456 Jonny B Good',
             'This is Jonny B Good at 511-233-1137',
             'Wow that is Alice N Wonderland A999b',
             'Yes hi: Mick E Mouse 1A25629Q88 or ',
             'Bye Mick E Mouse A13B ok was seen on '],
    'E_ID': ['E11', 'E11', 'E22', 'E33', 'E33'],
    'N_ID': ['111', '112', '211', '311', '312'],
    'Name': ['JONNY B GOOD', 'JONNY B GOOD',
             'ALICE N WONDERLAND',
             'MICK E MOUSE', 'MICK E MOUSE'],
})
df
   Data                                   E_ID  N_ID  Name
0  Hey this is 123456 Jonny B Good        E11   111   JONNY B GOOD
1  This is Jonny B Good at 511-233-1137   E11   112   JONNY B GOOD
2  Wow that is Alice N Wonderland A999b   E22   211   ALICE N WONDERLAND
3  Yes hi: Mick E Mouse 1A25629Q88 or     E33   311   MICK E MOUSE
4  Bye Mick E Mouse A13B ok was seen on   E33   312   MICK E MOUSE
I have a sample df, as seen above. I also have a sample dictionary d, as seen below:
d = {'E11': ['Jonny', 'B', 'Good',
             'Jonny', 'B', 'Good',
             '123456', '511-233-1137'],
     'E22': ['Alice', 'N', 'Wonderland', 'A999b'],
     'E33': ['Mick', 'E', 'Mouse',
             'Mick', 'E', 'Mouse',
             '1A25629Q88', 'A13B']}
I would like to use the values from d (e.g. Jonny) to change the corresponding values in Data. For example, Jonny in row 0 should become @@@.
To do so, I have looked at "Remap values in pandas column with a dict" and "how to replace column values with dictionary keys in pandas", but they aren't much help. I think I need to use something like this:
df['New'] = df['Data'].str.replace(d[value], '@@@')
I would like my output to look like this:
Data E_ID N_ID Name New
0 Hey this is @@@ @@@ @@@ @@@
1 This is @@@ @@@ @@@ at @@@
2 Wow that is @@@ @@@ @@@ @@@
3 Yes hi: @@@ @@@ @@@ @@@ or
4 Bye @@@ @@@ @@@ @@@ ok was seen on
What do I need to do to get this output?
Recommended answer
You could generate and use regular expressions, like this:
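import re

df['New'] = df['Data']
for key, values in d.items():
    # One alternation pattern per E_ID; re.escape keeps metacharacters in the values literal
    regex = '|'.join(re.escape(v) for v in values)
    mask = df['E_ID'] == key
    # regex=True is required on pandas 2.0+, where str.replace defaults to literal matching
    df.loc[mask, 'New'] = df.loc[mask, 'New'].str.replace(regex, '@@@', regex=True)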
The result looks like this:
Out[115]:
   Data                                   E_ID  N_ID  Name                New
0  Hey this is 123456 Jonny B Good        E11   111   JONNY B GOOD        Hey this is @@@ @@@ @@@ @@@
1  This is Jonny B Good at 511-233-1137   E11   112   JONNY B GOOD        This is @@@ @@@ @@@ at @@@
2  Wow that is Alice N Wonderland A999b   E22   211   ALICE N WONDERLAND  Wow that is @@@ @@@ @@@ @@@
3  Yes hi: Mick E Mouse 1A25629Q88 or     E33   311   MICK E MOUSE        Yes hi: @@@ @@@ @@@ @@@ or
4  Bye Mick E Mouse A13B ok was seen on   E33   312   MICK E MOUSE        Bye @@@ @@@ @@@ @@@ ok was seen on
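One caveat of a plain alternation pattern is that a short token such as B also matches inside longer words (for example the B in "Bye"). The following is a minimal sketch of a stricter variant, assuming tokens should only be masked when they stand alone as whole words. The helper name mask_tokens and the sample rows are hypothetical and not part of the original answer; plain Python lists stand in for the Data and E_ID columns so the sketch needs only the standard library.

```python
import re

def mask_tokens(text, tokens, placeholder="@@@"):
    """Replace stand-alone occurrences of any entry in `tokens` with `placeholder`."""
    # Deduplicate and try longer tokens first so a short token cannot pre-empt a longer one.
    unique = sorted(set(tokens), key=len, reverse=True)
    if not unique:
        return text  # nothing to mask
    # (?<!\w) and (?!\w) require that no word character touches the match.
    # Unlike \b, they also work when a token begins or ends with a non-word character.
    pattern = r"(?<!\w)(?:" + "|".join(re.escape(t) for t in unique) + r")(?!\w)"
    return re.sub(pattern, placeholder, text)

# Hypothetical sample rows: (Data, E_ID) pairs.
rows = [
    ("Bye Jonny B Good 511-233-1137", "E11"),
    ("Wow that is Alice N Wonderland A999b", "E22"),
]
d = {
    "E11": ["Jonny", "B", "Good", "123456", "511-233-1137"],
    "E22": ["Alice", "N", "Wonderland", "A999b"],
}

# Look up each row's token list by its E_ID; unknown IDs leave the text unchanged.
new_column = [mask_tokens(text, d.get(key, [])) for text, key in rows]
print(new_column)
```

With the question's DataFrame, the same helper could be applied as df['New'] = [mask_tokens(t, d.get(k, [])) for t, k in zip(df['Data'], df['E_ID'])]. This goes row by row in Python, so on large frames it will likely be slower than the vectorized str.replace loop.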