Comparing strings in the same row but different columns


Problem description

I ran into a problem comparing strings between two columns. For each row, I want to check whether the string in column A is contained in column B and, if so, write the string 'Yes' to column C.

Column A contains NaN values (blank cells in the CSV I imported).

I tried:

df['C']=df['B'].str.contains(df.loc['A'])
df.loc[df['A'].isin(df['B']), 'C']='Yes'

Neither of them worked, as I couldn't find the right way to compare the strings.

Recommended answer

This uses a list comprehension, so it may not be the fastest solution, but it works and is concise.

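# isinstance(a, str) skips the NaN values in A, which would otherwise raise a TypeError with `in`
df['C'] = ['Yes' if isinstance(a, str) and a in b else 'No' for a, b in zip(df['A'], df['B'])]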
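As a minimal sketch of the row-wise check above, here it is on a small made-up DataFrame. The sample values are invented for illustration, and the `isinstance` guards are an added assumption: they treat any non-string cell (such as NaN from a blank CSV cell) as "no match" instead of letting `in` raise a TypeError.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column A has a NaN, like a blank CSV cell.
df = pd.DataFrame({
    "A": ["apple", np.nan, "kiwi"],
    "B": ["apple pie", "banana bread", "orange juice"],
})

# Row-wise substring test: pair up A and B with zip and check `a in b`.
# Non-string values (e.g. NaN) are treated as 'No' rather than raising.
df["C"] = [
    "Yes" if isinstance(a, str) and isinstance(b, str) and a in b else "No"
    for a, b in zip(df["A"], df["B"])
]

print(df)
```

Assigning the plain list (rather than wrapping it in `pd.Series`) places the values by position, so the result does not depend on the DataFrame having a default 0..n-1 index.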

If you want to keep the existing values in C instead of overwriting them with 'No', you can do it like this:

# Rows without a match (including NaN in A) keep their current value of C
df['C'] = ['Yes' if isinstance(a, str) and a in b else c for a, b, c in zip(df['A'], df['B'], df['C'])]
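The value-preserving variant can be sketched the same way. This assumes column C already exists; the sample data and the placeholder values in C are hypothetical, and the `isinstance` guards are again an added assumption for handling NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with an existing column C.
df = pd.DataFrame({
    "A": ["apple", np.nan, "kiwi"],
    "B": ["apple pie", "banana bread", "orange juice"],
    "C": ["old1", "old2", "old3"],
})

# Write 'Yes' only where A is a substring of B; otherwise keep the old C value.
df["C"] = [
    "Yes" if isinstance(a, str) and isinstance(b, str) and a in b else c
    for a, b, c in zip(df["A"], df["B"], df["C"])
]

print(df)
```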
