Filtering a dataframe showing only duplicates

Published: 2021/12/23 15:51:23 Tags: r, filter, duplicates

I need some help to filter a dataframe.

The df has several columns and I want to split it into two dataframes:

1- One including only the rows in which the first column is a duplicate (including all of the replicas).

2- The rest of the rows, which are not duplicates.

Here is an example: This would be the original.

          V1  V2 
    [1,] "A" "1"
    [2,] "B" "1"
    [3,] "A" "1"
    [4,] "C" "2"
    [5,] "D" "3"
    [6,] "D" "4"

I would like to turn it into this:

         V1  V2 
   [1,] "A" "1"
   [2,] "A" "1"
   [3,] "D" "3"
   [4,] "D" "4"

And this:

        V1  V2 
  [1,] "B" "1"
  [2,] "C" "2"

Is there a way to do that? I have tried exporting to Excel, but the dataset was too large to make that viable.

Thanks

Considering df as your input, you can use dplyr and try:

library(dplyr)
df %>% group_by(V1) %>% filter(n() > 1)

for the duplicates, and

df %>% group_by(V1) %>% filter(n() == 1)

for the unique entries.
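If you would rather avoid a package dependency, the same split can probably be done in base R with duplicated(). The sketch below rebuilds the example from the question as a data frame, which is an assumption: the printed output in the question looks like a character matrix, and in that case it would need as.data.frame() first. The names is_dup, dups and uniques are only illustrative.

```r
# Example data from the question, rebuilt as a data frame (assumed structure).
df <- data.frame(
  V1 = c("A", "B", "A", "C", "D", "D"),
  V2 = c("1", "1", "1", "2", "3", "4"),
  stringsAsFactors = FALSE
)

# duplicated() alone leaves the first occurrence unmarked, so scan from
# both ends to flag every row whose V1 value appears more than once.
is_dup <- duplicated(df$V1) | duplicated(df$V1, fromLast = TRUE)

dups    <- df[is_dup, ]   # all copies of the repeated V1 values (A, A, D, D)
uniques <- df[!is_dup, ]  # rows whose V1 value appears exactly once (B, C)
```

Note that both data frames keep their original row names (1, 3, 5, 6 and 2, 4); setting rownames(dups) <- NULL renumbers them as in the desired output.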
