If/else on a pandas DataFrame, row-wise, based on string contents
Question
This is an easy one. The task is to check whether the string in one column contains all the words stored in another string, and then do something based on the result. Here is a simple example:
import pandas as pd
df = pd.DataFrame({'Strings': ["The brown", "fox smoked 6", "cigarettes per day", "in his cave"],
                   'Set': ["Alpha", "Beta", "Gamma", "Delta"]})
>>> df
     Set             Strings
0  Alpha           The brown
1   Beta        fox smoked 6
2  Gamma  cigarettes per day
3  Delta         in his cave
Now I want to check, for each row of df["Strings"], whether it contains both the word "smoked" and the number "6" (which is true only for the second row here, index 1). If so, the new column df["Result"] should equal df["Set"] with the words "health damaging" appended. If not, it should simply copy the value of df["Set"]. The output should look like this:
>>> df_final
     Set             Strings                Result
0  Alpha           The brown                 Alpha
1   Beta        fox smoked 6  Beta health damaging
2  Gamma  cigarettes per day                 Gamma
3  Delta         in his cave                 Delta
Recommended answer
You can construct a boolean mask from your two conditions and pass it to np.where:
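In [20]:
import numpy as np
# True only where the string contains both '6' and 'smoked'
mask = (df['Strings'].str.contains('6')) & (df['Strings'].str.contains('smoked'))
In [23]:
# Where mask is True append the text, otherwise copy 'Set' unchanged
df['Result'] = np.where(mask, df['Set'] + ' health damaging', df['Set'])
df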
Out[23]:
     Set             Strings                Result
0  Alpha           The brown                 Alpha
1   Beta        fox smoked 6  Beta health damaging
2  Gamma  cigarettes per day                 Gamma
3  Delta         in his cave                 Delta
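Here each condition tests for the presence of one of your substrings using .str.contains, and the two conditions are combined with & (element-wise AND) to make the mask. np.where then picks the first value where the mask is True and the second value where it is False.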
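The same idea can be extended to an arbitrary list of words. The following is a minimal sketch, not part of the original answer: contains_all is a hypothetical helper name, and the choices regex=False (treat each word as a literal substring) and na=False (treat missing values as no match) are assumptions about the desired behavior.

```python
from functools import reduce
import operator

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Strings": ["The brown", "fox smoked 6", "cigarettes per day", "in his cave"],
    "Set": ["Alpha", "Beta", "Gamma", "Delta"],
})


def contains_all(series, words):
    """Hypothetical helper: True where a string contains every word in `words`."""
    # One boolean mask per word; regex=False matches literal substrings,
    # na=False makes missing values count as "no match".
    masks = [series.str.contains(w, regex=False, na=False) for w in words]
    # AND the masks element-wise, starting from an all-True mask so that
    # an empty word list does not raise.
    all_true = pd.Series(True, index=series.index)
    return reduce(operator.and_, masks, all_true)


mask = contains_all(df["Strings"], ["smoked", "6"])
# Append the text where the mask is True, otherwise copy 'Set' unchanged.
df["Result"] = np.where(mask, df["Set"] + " health damaging", df["Set"])
print(df)
```

Note that .str.contains matches substrings, so "6" would also match "16" or "60". If whole-word matching is needed, a regular expression with word boundaries would be one option.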