Comparing strings in same series (row) but different columns
Problem description
I ran into this problem when comparing strings between two columns. What I want to do is: for each row, check whether the string in column A is contained in column B and, if so, write a new string 'Yes' in column C.
Column A contains NaN values (blank cells in the csv I imported).
I tried:
df['C']=df['B'].str.contains(df.loc['A'])
df.loc[df['A'].isin(df['B']), 'C']='Yes'
Neither of them worked, because I couldn't find the right way to compare the strings.
Recommended answer
This uses a list comprehension, so it may not be the fastest solution, but it works and is concise.
df['C'] = pd.Series(['Yes' if a in b else 'No' for a,b in zip(df['A'],df['B'])])
If you want to keep the existing values in C instead of overwriting them with 'No', you can do it like this:
df['C'] = pd.Series(['Yes' if a in b else c for a,b,c in zip(df['A'],df['B'], df['C'])])
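Since the question mentions that column A contains NaN values, note that `a in b` raises a TypeError when `a` is a float NaN rather than a string. The following is a minimal sketch of a NaN-tolerant variant of the same list-comprehension idea, not the answerer's own code. The sample data and the helper name `contained` are hypothetical, and treating a NaN cell as "no match" is an assumption. The mask is built on `df.index` so that it still lines up with the rows when the frame does not have a default integer index.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the NaN in A mimics a blank CSV cell.
df = pd.DataFrame({
    "A": ["app", np.nan, "cat", "dog"],
    "B": ["apple", "banana", "dog", "hotdog"],
})

def contained(a, b):
    """Return True only when both cells are strings and a is a substring of b."""
    return isinstance(a, str) and isinstance(b, str) and a in b

# Row-wise boolean mask, aligned to the frame's own index.
mask = pd.Series(
    [contained(a, b) for a, b in zip(df["A"], df["B"])],
    index=df.index,
)

# Assumption: rows with NaN or no match get 'No'.
df["C"] = np.where(mask, "Yes", "No")
```

If C already holds values that should be preserved, `df.loc[mask, 'C'] = 'Yes'` would assign only to the matching rows and leave the rest untouched.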