Add extra level to factors in dataframe


Question

I have a data frame with numeric and ordered factor columns. It contains a lot of NA values, which have no level assigned to them. I want to change NA to "No Answer", but the levels of the factor columns don't contain that level. Here is how I started, but I don't know how to finish it in an elegant way:

addNoAnswer = function(df) {
   factorOrNot = sapply(df, is.factor)
   levelsList = lapply(df[, factorOrNot], levels)
   levelsList = lapply(levelsList, function(x) c(x, "No Answer"))
   ...

Is there a way to apply the new levels directly to the factor columns, for example something like this:

df[, factorOrNot] = lapply(df[, factorOrNot], factor, levelsList)

Of course, this doesn't work correctly.

I want the order of the levels preserved and the "No Answer" level added in the last position.

Recommended answer

You could define a function that adds the level to a factor but returns anything else unchanged:

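addNoAnswer <- function(x){
  # For factors, rebuild with the existing levels plus "No Answer" appended last
  if(is.factor(x)) return(factor(x, levels=c(levels(x), "No Answer")))
  # Leave non-factor columns (e.g. numeric) untouched
  return(x)
}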
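To see what the function does to a single column, here is a minimal sketch on a made-up ordered factor (the `rating` vector and its levels are assumptions for illustration, not data from the question). It checks that the new level lands in last place, that the factor stays ordered, and that a non-factor passes through unchanged.

```r
addNoAnswer <- function(x) {
  if (is.factor(x)) return(factor(x, levels = c(levels(x), "No Answer")))
  return(x)
}

# Hypothetical ordered factor with one missing value
rating <- factor(c("low", "high", NA, "medium"),
                 levels = c("low", "medium", "high"), ordered = TRUE)

extended <- addNoAnswer(rating)

levels(extended)      # existing levels in their original order, then "No Answer"
is.ordered(extended)  # factor() keeps the ordered flag of its input by default
addNoAnswer(1:3)      # a non-factor vector is returned as-is
```

Note that this only extends the set of levels; the NA entries themselves are still NA afterwards.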

Then you just lapply this function over your columns:

df <- as.data.frame(lapply(df, addNoAnswer))

That should return what you want.
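As a rough end-to-end sketch, the example below applies the function to a small made-up data frame (the `score` and `answer` columns are hypothetical) and then fills the NA entries of the factor columns with the new level. It uses `df[] <- lapply(...)` rather than `as.data.frame(lapply(...))`; this is a variation on the answer's line, chosen here because it assigns into the existing data frame and so keeps its column names and row names as they are.

```r
addNoAnswer <- function(x) {
  if (is.factor(x)) return(factor(x, levels = c(levels(x), "No Answer")))
  return(x)
}

# Hypothetical data: one numeric column and one ordered factor column
df <- data.frame(
  score  = c(1.5, NA, 3.0),
  answer = factor(c("agree", NA, "disagree"),
                  levels = c("disagree", "agree"), ordered = TRUE)
)

# Extend the levels of every factor column; numeric columns are untouched
df[] <- lapply(df, addNoAnswer)

# With the level in place, NA values in factor columns can be replaced
isFactor <- sapply(df, is.factor)
df[isFactor] <- lapply(df[isFactor], function(x) {
  x[is.na(x)] <- "No Answer"
  x
})

levels(df$answer)
df
```

The order of the two steps matters: assigning "No Answer" to a factor before that level exists produces NA with a warning, which is the problem described in the question.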
