Find variables that occur only in one cluster in data.frame in R

Views: 68  Published: 2020/4/27 5:12:48  Tags: r list function dataframe lapply

Question

Using base R, I wonder how to answer the following question:

Is there any value of X or Y (the variables of interest, whose names are stored in mods) that occurs in only one element of m (i.e., one cluster) and in none of the others? If so, produce my desired output below.

For example: here X == 3 occurs only in element m[[3]], not in m[[1]] or m[[2]]. Likewise, Y == 99 occurs only in m[[1]] and nowhere else.

Note: the following is a toy example, so a functional answer is appreciated. Also, X and Y may or may not be numeric (e.g., they may be strings).

f <- data.frame(id = c(rep("AA",4), rep("BB",2), rep("CC",2)), X = c(1,1,1,1,1,1,3,3), 
            Y = c(99,99,99,99,6,6,6,6))

m <- split(f, f$id) # Here is `m`

mods <- names(f)[-1] # variables of interest names

Desired output:

list(AA = c(Y = 99), CC = c(X = 3))

# $AA
# Y 
# 99 

# $CC
# X 
# 3
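
To see why these are the expected values, one option is to cross-tabulate each variable against the cluster id and count how many clusters each value occurs in. This is only a quick check on the toy data, not part of the original question; the names occurs_x, occurs_y, n_clusters_x and n_clusters_y are made up for illustration.

```r
# Toy data from the question
f <- data.frame(id = c(rep("AA", 4), rep("BB", 2), rep("CC", 2)),
                X  = c(1, 1, 1, 1, 1, 1, 3, 3),
                Y  = c(99, 99, 99, 99, 6, 6, 6, 6))

# Rows are the distinct values, columns are the clusters;
# an entry is TRUE when that value occurs in that cluster
occurs_x <- table(f$X, f$id) > 0
occurs_y <- table(f$Y, f$id) > 0

# Number of clusters in which each value occurs
n_clusters_x <- rowSums(occurs_x)
n_clusters_y <- rowSums(occurs_y)

# Values that occur in exactly one cluster
names(n_clusters_x)[n_clusters_x == 1]
names(n_clusters_y)[n_clusters_y == 1]
```

A value whose count is 1 is confined to a single cluster, which is the condition the desired output encodes.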

Recommended answer

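# Stack the variables of interest into long form:
# one row per distinct (id, value) pair, tagged with the variable name
tmp = do.call(rbind, lapply(names(f)[-1], function(x){
    d = unique(f[c("id", x)])
    names(d) = c("id", "val")
    transform(d, nm = x)
}))

# Keep the (variable, value) pairs that occur in exactly one cluster.
# Grouping by nm as well as val prevents equal values of different
# variables from being counted together.
tmp = tmp[ave(seq_along(tmp$val), tmp$val, tmp$nm, FUN = length) == 1,]

# One named vector per cluster: names are the variables, elements the values
lapply(split(tmp, tmp$id), function(a){
    setNames(a$val, a$nm)
})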
#$AA
# Y 
#99 

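#$BB
#named numeric(0)
# (empty element: it appears only when id is a factor, the data.frame()
#  default before R 4.0.0; with a character id, split() omits BB)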

#$CC
#X 
#3
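
Since the question asks for a functional answer that works on m and mods directly, here is a minimal sketch of one way to wrap the same idea in a function. It is not the answerer's method: the helper name unique_to_cluster and its arguments are hypothetical, and it assumes every element of m contains all columns listed in vars. If the variables have different types (say X numeric and Y character), unlist() will coerce the result for a cluster to a single type.

```r
# Toy data and setup from the question
f <- data.frame(id = c(rep("AA", 4), rep("BB", 2), rep("CC", 2)),
                X  = c(1, 1, 1, 1, 1, 1, 3, 3),
                Y  = c(99, 99, 99, 99, 6, 6, 6, 6))
m    <- split(f, f$id)
mods <- names(f)[-1]

# Hypothetical helper: for each cluster in `m`, collect the values of each
# variable in `vars` that occur in that cluster and in no other cluster.
unique_to_cluster <- function(m, vars) {
  res <- lapply(names(m), function(cl) {
    per_var <- lapply(vars, function(v) {
      here   <- unique(m[[cl]][[v]])                       # values in this cluster
      others <- unlist(lapply(m[names(m) != cl], `[[`, v)) # values in all other clusters
      vals   <- here[!here %in% others]                    # values seen only here
      setNames(vals, rep(v, length(vals)))                 # label each value by its variable
    })
    unlist(per_var)
  })
  names(res) <- names(m)
  Filter(length, res)  # drop clusters that have no exclusive value
}

result <- unique_to_cluster(m, mods)
result
```

On the toy data this should return a list with AA holding Y = 99 and CC holding X = 3; BB is dropped by the final Filter() call because it has no exclusive value. Comparing each cluster against all the others is quadratic in the number of clusters, which is fine for small lists but may be slow for very many clusters.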
