反联接 pandas [英] Anti-Join Pandas

查看：115 发布时间：2020/5/9 0:22:06 python pandas dataframe merge anti-join

本文介绍了反联接 pandas 的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我有两个表，我想追加它们，以便仅保留表A中的所有数据，并且仅在其键是唯一的情况下才添加表B中的数据(但是键值在表A和B中是唯一的).在某些情况下，表A和表B中都会出现一个键).

I have two tables and I would like to append them so that only all the data in table A is retained and data from table B is only added if its key is unique (Key values are unique in table A and B however in some cases a Key will occur in both table A and B).

我认为，这样做的方法将涉及某种过滤联接(反联接)，以获取表B中未在表A中出现的值，然后追加两个表.

I think the way to do this will involve some sort of filtering join (anti-join) to get values in table B that do not occur in table A then append the two tables.

我熟悉R，这是我将在R中执行此操作的代码.

I am familiar with R and this is the code I would use to do this in R.

library("dplyr")

## Filtering join to remove values already in "TableA" from "TableB"
FilteredTableB <- anti_join(TableB,TableA, by = "Key")

## Append "FilteredTableB" to "TableA"
CombinedTable <- bind_rows(TableA,FilteredTableB)

我如何在python中实现呢?

How would I achieve this in python?

推荐答案

请考虑以下数据框

TableA = pd.DataFrame(np.random.rand(4, 3),
                      pd.Index(list('abcd'), name='Key'),
                      ['A', 'B', 'C']).reset_index()
TableB = pd.DataFrame(np.random.rand(4, 3),
                      pd.Index(list('aecf'), name='Key'),
                      ['A', 'B', 'C']).reset_index()

TableA

TableB

这是做自己想要的事情的一种方式

This is one way to do what you want

# Identify what values are in TableB and not in TableA
key_diff = set(TableB.Key).difference(TableA.Key)
where_diff = TableB.Key.isin(key_diff)

# Slice TableB accordingly and append to TableA
TableA.append(TableB[where_diff], ignore_index=True)

rows = []
for i, row in TableB.iterrows():
    if row.Key not in TableA.Key.values:
        rows.append(row)

pd.concat([TableA.T] + rows, axis=1).T

计时

4行，其中2行重叠

Timing

4 rows with 2 overlap

方法1更快

10,000行5,000重叠

循环不好

这篇关于反联接 pandas 的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

反联接 pandas [英] Anti-Join Pandas

问题描述

推荐答案

计时

Timing

相关文章

Python最新文章

热门教程

热门工具

登录关闭

反联接 pandas [英] Anti-Join Pandas

问题描述

推荐答案

计时

Timing

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭