Python pandas: match values between two columns when one column is not null
Question
I couldn't find a similar question here.
Please see the table below:
A B C D
0 pen nan dfds 1238
1 Apple pen fsd 324
2 Peach nan kd 878
3 grape peach jil 9kj
4 laptop nan lks 873p
5 light grape kje 7623d
6 nan grape 3r43 kj23
7 nan grape 3fdf 8734d
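- If column B is not null, compare the value in B with the values in A and try to find a match. For example, "pen" in the first row of column A equals "pen" in the second row of column B.
- If a matching value is identified, find its index in column A. "pen" is a matching value, and the index of "pen" in column A is 0.
My expected output is: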
A B C D
0 pen nan dfds 1238
2 Peach nan kd 878
3 grape peach jil 9kj
The original index numbers should be kept, as in the output example.
I know how to match A against B. My code is:
df2=df[df[['A','B']].nunique(axis=1)==1]
But I don't know how to add the condition that column B is not null. I also want to avoid loop iterations, since the dataset is very large.
Thanks a lot!
Recommended answer
I think that in point 1 of your question you may mean that column "C" is not null? Either way, I'll demonstrate using column "B".
For this, first create a new DataFrame that contains only the rows where B is not null.
df_not_null = df.dropna(subset=['B'])
Then you can run whatever comparison you need on that DataFrame.
df2 = df_not_null[df_not_null[['A','B']].nunique(axis=1)==1]
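Note that nunique(axis=1) compares A and B within the same row, while the expected output in the question matches values across rows. A minimal sketch of one possible way to get that output without a loop is to test A against the non-null values of B with isin. It assumes the "nan" entries in the table are real missing values (NaN) rather than the string "nan", and that matching should ignore case, which is inferred from "Peach" in A matching "peach" in B. The names b_values, mask and result are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data from the question; "nan" is assumed to mean a missing value.
df = pd.DataFrame({
    "A": ["pen", "Apple", "Peach", "grape", "laptop", "light", np.nan, np.nan],
    "B": [np.nan, "pen", np.nan, "peach", np.nan, "grape", "grape", "grape"],
    "C": ["dfds", "fsd", "kd", "jil", "lks", "kje", "3r43", "3fdf"],
    "D": ["1238", "324", "878", "9kj", "873p", "7623d", "kj23", "8734d"],
})

# Non-null values of B, lower-cased so "Peach" in A can match "peach" in B
# (case-insensitive matching is an assumption based on the expected output).
b_values = df["B"].dropna().str.lower()

# True where the value in A appears anywhere in B; missing values in A never match.
mask = df["A"].str.lower().isin(b_values)

# Boolean indexing keeps the original index labels.
result = df[mask]
print(result)
```

Because isin is vectorized, this avoids explicit row iteration. If the match should be case-sensitive, the two .str.lower() calls can be dropped.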