Ifelse on pandas data frame based on strings row wise


Question

This is an easy one. The task is to check whether a string in one column contains all the words stored in another string, and then do something based on the result. Here is a simple example:

import pandas as pd

df = pd.DataFrame({'Strings':["The brown","fox smoked 6", "cigarettes per day", "in his cave"], 
'Set': ["Alpha", "Beta", "Gamma", "Delta"]})

>>> df
     Set             Strings
0  Alpha           The brown
1   Beta        fox smoked 6
2  Gamma  cigarettes per day
3  Delta         in his cave

Now I want to check, for each row of df["Strings"], whether it contains the word "smoked" and the number "6" (which is true for the second row here, index 1). If so, the new column df["Result"] should equal df["Set"] with the words "health damaging" appended. If not, it should just copy what is in df["Set"]. The output should look like this:

>>> df_final
     Set             Strings                Result
0  Alpha           The brown                 Alpha
1   Beta        fox smoked 6  Beta health damaging
2  Gamma  cigarettes per day                 Gamma
3  Delta         in his cave                 Delta


Recommended answer

You can construct a boolean mask from your two conditions and pass it to np.where:

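In [20]:

import numpy as np

# True only for rows whose string contains both '6' and 'smoked'
mask = (df['Strings'].str.contains('6')) & (df['Strings'].str.contains('smoked'))
In [23]:

# Where the mask is True, append the extra words; otherwise copy 'Set' unchanged
df['Result'] = np.where(mask, df['Set'] + ' health damaging', df['Set'])
df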
Out[23]:
     Set             Strings                Result
0  Alpha           The brown                 Alpha
1   Beta        fox smoked 6  Beta health damaging
2  Gamma  cigarettes per day                 Gamma
3  Delta         in his cave                 Delta

Here each condition tests for the presence of one of your strings using .str.contains, and the two conditions are combined with & (element-wise AND) to make the mask.
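The question is phrased more generally: check whether a string contains all the words stored in another string. A minimal sketch of that general case follows, assuming the required words arrive as one space-separated string. The variable names `required` and `conditions` are hypothetical and not part of the original answer.

```python
from functools import reduce
import operator

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Strings': ["The brown", "fox smoked 6", "cigarettes per day", "in his cave"],
    'Set': ["Alpha", "Beta", "Gamma", "Delta"],
})

# Hypothetical input: the required words held in a single string.
required = "smoked 6"

# One boolean Series per word. regex=False treats each word as a literal
# substring, and na=False counts missing values as "no match".
conditions = [
    df['Strings'].str.contains(word, regex=False, na=False)
    for word in required.split()
]

# AND all per-word conditions together into a single mask.
mask = reduce(operator.and_, conditions)

df['Result'] = np.where(mask, df['Set'] + ' health damaging', df['Set'])
```

Note that .str.contains matches substrings, so "6" would also match a string containing "16". If whole-word matching matters, the condition would need to be changed, for example to a regular expression with word boundaries.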
