Find the names of constant columns in an R data.frame
Question
This is a follow-up to an earlier question. In the data.frame DATA, some columns hold a constant value within each unique value of the first column, study.name. For example, the columns setting, prof and random are constant across all rows of Shin.Ellis, constant across all rows of Trus.Hsu, and so on. Including Shin.Ellis and Trus.Hsu, there are 10 unique study.name values.
How can I find the names of such constant columns?
A solution is given below (see NAMES), but why does NAMES include "error", which is not constant within every study?
DATA <- read.csv("https://raw.githubusercontent.com/izeh/m/master/cc.csv")
DATA <- setNames(DATA, sub("\\.\\d+$", "", names(DATA)))
is_constant <- function(x) length(unique(x)) == 1L
(NAMES <- names(Filter(all, aggregate(.~study.name, DATA, is_constant)[-1])) )
# [1] "setting" "prof" "error" "random"
# "error" is NOT constant within every study, so why is it returned here?
# Desired output:
# [1] "setting" "prof" "random"
Recommended answer
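The formula method of aggregate uses na.action = na.omit by default, so any row with an NA in any column is dropped entirely before the groups are formed. With those rows gone, a column such as error can look constant within a study even though it is not. Passing na.action = na.pass keeps the rows that contain NA elements: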
names(Filter(all, aggregate(.~study.name, DATA, is_constant,
na.action = na.pass)[-1]))
#[1] "setting" "prof" "random"
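To make the row-dropping visible, here is a minimal sketch on a small made-up data.frame. The object toy and its column other are hypothetical and are not taken from cc.csv; is_constant is the same helper as in the question. It only illustrates how an NA in one column can hide the variation in another column under the default na.action.

```r
# Hypothetical toy data (not the asker's cc.csv): within each study,
# `setting` is constant, `error` varies in study A, and `other` has an NA.
toy <- data.frame(
  study.name = c("A", "A", "B", "B"),
  setting    = c(1, 1, 2, 2),
  error      = c(0.1, 0.2, 0.3, 0.3),
  other      = c(5, NA, 6, 7)
)

is_constant <- function(x) length(unique(x)) == 1L

# Default na.action = na.omit: row 2 is dropped because `other` is NA,
# so study A is left with a single row and `error` looks constant.
with_omit <- names(Filter(all, aggregate(. ~ study.name, toy, is_constant)[-1]))
with_omit
# [1] "setting" "error"

# na.pass keeps every row, so the varying `error` in study A is detected.
with_pass <- names(Filter(all, aggregate(. ~ study.name, toy, is_constant,
                                         na.action = na.pass)[-1]))
with_pass
# [1] "setting"
```

One thing to keep in mind with na.pass: unique() treats NA as a value of its own, so a column that is constant apart from a few NA entries is reported as non-constant. Whether that is the desired behaviour depends on the data.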