R - find all unique values among subsets of a data frame

Views: 163 | Published: 2017/7/21 0:03:58 | Tags: r, duplicates, unique

Problem description

I have a data frame with two columns. The first column defines subsets of the data. I want to find all values in the second column that only appear in one subset in the first column.

For example, from:

df=data.frame(
  data_subsets=rep(LETTERS[1:2],each=5),
  data_values=c(1,2,3,4,5,2,3,4,6,7))

data_subsets data_values
      A           1
      A           2
      A           3
      A           4
      A           5
      B           2
      B           3
      B           4
      B           6
      B           7

I would want to extract the following data frame.

data_subsets   data_values
    A              1
    A              5
    B              6
    B              7

I have been playing around with duplicated, but I just can't seem to make it work. Any help is appreciated. There are a number of topics tackling similar problems; I hope I didn't overlook the answer in my searches!

EDIT

I modified the approach from @Matthew Lundberg of counting the number of elements and extracting from the data frame. For some reason his approach was not working with the data frame I had, so I came up with this, which is less elegant but gets the job done:

counts=rowSums(do.call("rbind",tapply(df$data_subsets,df$data_values,FUN=table)))
extract=names(counts)[counts==1]
df[match(extract,df$data_values),]
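
The same idea of grouping subsets by value with tapply can also be sketched so that it counts, for each value, how many distinct subsets it appears in. This is only an illustration on the example data from the question; the names n_subsets, keep and result are made up here. It assumes the values convert cleanly to character labels (true for the whole numbers in the example), and it does not depend on whether data_subsets is stored as a factor or as character strings.

```r
# Example data from the question
df <- data.frame(
  data_subsets = rep(LETTERS[1:2], each = 5),
  data_values  = c(1, 2, 3, 4, 5, 2, 3, 4, 6, 7)
)

# For each value, count how many distinct subsets it appears in.
# tapply() names the result after the values ("1", "2", ...).
n_subsets <- tapply(df$data_subsets, df$data_values,
                    function(s) length(unique(s)))

# Values that belong to exactly one subset (as character labels)
keep <- names(n_subsets)[n_subsets == 1]

# %in% keeps every matching row, not only the first match
result <- df[as.character(df$data_values) %in% keep, ]
result
```

Because it counts distinct subsets rather than total occurrences, a value repeated several times inside a single subset would still be kept.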

Solution

First, find the count of each element in df$data_values:

 x <- sapply(df$data_values, function(x) sum(as.numeric(df$data_values == x)))

> x
 [1] 1 2 2 2 1 2 2 2 1 1
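
As a possible alternative for the counting step, base R's ave() can compute the same per-row counts without comparing every element against the whole column. This is a sketch rather than part of the original answer; it reuses the example data and the name x from above.

```r
# Example data from the question
df <- data.frame(
  data_subsets = rep(LETTERS[1:2], each = 5),
  data_values  = c(1, 2, 3, 4, 5, 2, 3, 4, 6, 7)
)

# ave() applies FUN within each group of equal data_values and
# returns a vector as long as its input, so each row receives the
# size of the group its value belongs to
x <- ave(seq_along(df$data_values), df$data_values, FUN = length)
x
# [1] 1 2 2 2 1 2 2 2 1 1
```

The result can be used exactly like the sapply version, for example in df[x == 1, ].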

Now extract the rows:

> df[x==1,]
   data_subsets data_values
1             A           1
5             A           5
9             B           6
10            B           7

Note that you missed "A 5" above. There is no "B 5".
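
Since the question mentions duplicated, here is a minimal sketch of how it could produce the same rows. duplicated() on its own flags only the second and later occurrences of a value, which may be why it seemed not to work; scanning from both ends flags every occurrence. Like the counting approach above, this assumes a value never occurs more than once within the same subset. The names v, repeated and unique_rows are illustrative.

```r
# Example data from the question
df <- data.frame(
  data_subsets = rep(LETTERS[1:2], each = 5),
  data_values  = c(1, 2, 3, 4, 5, 2, 3, 4, 6, 7)
)

v <- df$data_values

# Flag a value if it repeats when scanning forwards or backwards,
# so that the first occurrence is flagged as well
repeated <- duplicated(v) | duplicated(v, fromLast = TRUE)

# Keep only the rows whose value occurs exactly once overall
unique_rows <- df[!repeated, ]
unique_rows
```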
