How to remove a subset of a data frame in Python?

Question

My dataframe df is 3020x4. I'd like to remove a subset df1 (20x4) from the original. In other words, I just want the difference, whose shape should be 3000x4. I tried the following, but it did not work: it returned exactly df. Could you please help? Thanks.

new_df = df.drop(df1)

Recommended answer

As you seem to be unable to post a representative example, I will demonstrate one approach using merge with the parameter indicator=True.

First, generate some data:

In [116]:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5,3), columns=list('abc'))
df

Out[116]:
          a         b         c
0 -0.134933 -0.664799 -1.611790
1  1.457741  0.652709 -1.154430
2  0.534560 -0.781352  1.978084
3  0.844243 -0.234208 -2.415347
4 -0.118761 -0.287092  1.179237

Take a subset:

In [118]:
df_subset=df.iloc[2:3]
df_subset

Out[118]:
         a         b         c
2  0.53456 -0.781352  1.978084

Now perform a left merge with the parameter indicator=True. Because no on argument is given, merge matches rows on all columns the two frames share. The indicator adds a _merge column that marks each row as left_only, both, or right_only (the latter won't appear in this example). We then filter the merged df to keep only the left_only rows:

In [121]:
df_new = df.merge(df_subset, how='left', indicator=True)
df_new = df_new[df_new['_merge'] == 'left_only']
df_new

Out[121]:
          a         b         c     _merge
0 -0.134933 -0.664799 -1.611790  left_only
1  1.457741  0.652709 -1.154430  left_only
3  0.844243 -0.234208 -2.415347  left_only
4 -0.118761 -0.287092  1.179237  left_only
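
The merge-and-filter steps above can be wrapped into a small helper. The following is only a sketch under stated assumptions: the function name remove_subset is hypothetical (it is not part of pandas), the two frames are assumed to share the same columns, and rows are assumed to match by exact value. It also drops the _merge column so the result has the same columns as the input.

```python
import numpy as np
import pandas as pd


def remove_subset(df, subset):
    """Hypothetical helper: return the rows of df with no exact match in subset."""
    # Left merge on all shared columns; indicator=True adds the _merge column.
    merged = df.merge(subset.drop_duplicates(), how="left", indicator=True)
    # Keep rows found only in df, then drop the helper column.
    kept = merged[merged["_merge"] == "left_only"]
    return kept.drop(columns="_merge").reset_index(drop=True)


# Example data, generated the same way as in the answer (seeded for repeatability).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((5, 3)), columns=list("abc"))
df_subset = df.iloc[2:3]

result = remove_subset(df, df_subset)
print(result.shape)  # (4, 3)
```

Note that merge builds a new index, so the original row labels are not preserved; the sketch resets the index explicitly to make that visible.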

For reference, here is the merged df before filtering:

In [122]:
df.merge(df_subset, how='left', indicator=True)

Out[122]:
          a         b         c     _merge
0 -0.134933 -0.664799 -1.611790  left_only
1  1.457741  0.652709 -1.154430  left_only
2  0.534560 -0.781352  1.978084       both
3  0.844243 -0.234208 -2.415347  left_only
4 -0.118761 -0.287092  1.179237  left_only
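
As an alternative to the merge approach, an index-based removal may be enough in some cases. This is a sketch that assumes df1 was sliced from df and still carries the original index labels, and that those labels are unique; neither is confirmed by the question. DataFrame.drop expects index or column labels rather than a DataFrame, so the subset's index is passed instead of the subset itself.

```python
import numpy as np
import pandas as pd

# Example data; df1 is a slice of df, so it keeps df's index labels (assumption).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((5, 3)), columns=list("abc"))
df1 = df.iloc[2:3]

# Drop the rows whose index labels appear in the subset.
new_df = df.drop(df1.index)

# Equivalent boolean-mask form.
same = df[~df.index.isin(df1.index)]

print(new_df.shape)  # (4, 3)
```

If the subset was loaded separately or had its index reset, the labels no longer line up with df and the value-based merge shown in the answer is the safer choice.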
