Find the names of constant columns in an R data.frame


Question

This is a follow-up to a previous question. In the data.frame DATA, some columns are constant within each unique value of the first column, study.name. For example, the columns setting, prof and random are constant across all rows of Shin.Ellis, constant across all rows of Trus.Hsu, and so on. Including Shin.Ellis and Trus.Hsu, there are 10 unique study.name values.

How can I find the names of such constant columns?

A solution is given below (see NAMES), but why does NAMES include "error", which is not constant throughout?

DATA <- read.csv("https://raw.githubusercontent.com/izeh/m/master/cc.csv")
DATA <- setNames(DATA, sub("\\.\\d+$", "", names(DATA)))

is_constant <- function(x) length(unique(x)) == 1L 

(NAMES <- names(Filter(all, aggregate(.~study.name, DATA, is_constant)[-1])) )

# > [1] "setting" "prof"   "error"   "random"   ## "error" is NOT a constant variable,
                                                ## so why is it returned here?

# Desired output: 
# [1] "setting" "prof" "random"

Recommended answer

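We need to pass na.action = na.pass so that NA values are kept. By default, the formula method of aggregate uses na.action = na.omit, which removes every row that contains an NA in any column before the data are grouped. If the rows where "error" varies within a study are among those removed, "error" looks constant in what remains.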

names(Filter(all, aggregate(.~study.name, DATA, is_constant, 
            na.action = na.pass)[-1]))
#[1] "setting" "prof"    "random" 
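
The effect of na.action can be reproduced on a small made-up data set. The sketch below is only an illustration: the data frame toy and its column other are hypothetical and are not taken from cc.csv. It places an NA on the one row where error differs, so the default na.omit hides that variation.

```r
# Hypothetical toy data (not the original cc.csv)
toy <- data.frame(
  study.name = c("A", "A", "A", "B", "B"),
  setting    = c(1, 1, 1, 2, 2),   # constant within each study
  error      = c(0, 1, 1, 3, 3),   # NOT constant within study A
  other      = c(NA, 5, 6, 7, 8)   # NA sits on the row where error differs
)

is_constant <- function(x) length(unique(x)) == 1L

# Default (na.action = na.omit): row 1 is dropped before grouping,
# so error appears constant within study A
default_names <- names(Filter(all,
  aggregate(. ~ study.name, toy, is_constant)[-1]))

# na.pass keeps every row, so the varying error value is seen
pass_names <- names(Filter(all,
  aggregate(. ~ study.name, toy, is_constant, na.action = na.pass)[-1]))

print(default_names)  # "setting" "error"
print(pass_names)     # "setting"
```

One thing to keep in mind: with na.pass, unique() treats NA as a value of its own, so a column that is constant apart from some NAs within a study is reported as non-constant by is_constant as written.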
