Replace contents of factor column in R dataframe

Question

I need to replace the levels of a factor column in a data frame. Using the iris dataset as an example, how would I replace any cells that contain virginica with setosa in the Species column?

I expected the following to work, but it generates a warning message and simply inserts NAs:

iris$Species[iris$Species == 'virginica'] <- 'setosa'

Recommended answer

I bet the problem arises when you try to replace values with a new one that is not currently among the existing factor's levels:

levels(iris$Species)
# [1] "setosa"     "versicolor" "virginica" 

Your example does not reproduce the problem. This works, because setosa is already one of the levels:

iris$Species[iris$Species == 'virginica'] <- 'setosa'

This is more likely what creates the problem you were seeing with your own data:

iris$Species[iris$Species == 'virginica'] <- 'new.species'
# Warning message:
# In `[<-.factor`(`*tmp*`, iris$Species == "virginica", value = c(1L,  :
#   invalid factor level, NAs generated
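To see what the warning actually does to the data, here is a minimal sketch that repeats the failing assignment on a copy of iris. The name df is only an illustrative copy, and suppressWarnings() is used here solely to silence the expected warning.

```r
# Work on a copy (df is an illustrative name) so iris itself stays intact.
df <- iris

# "new.species" is not an existing level, so the matching rows become NA.
# suppressWarnings() only hides the expected "invalid factor level" warning.
suppressWarnings(df$Species[df$Species == "virginica"] <- "new.species")

sum(is.na(df$Species))  # number of rows that were turned into NA
levels(df$Species)      # the set of levels is unchanged
```

The levels themselves are untouched; only the rows that matched the condition are lost.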

It will work if you first add the new value to the factor's levels:

levels(iris$Species) <- c(levels(iris$Species), "new.species")
iris$Species[iris$Species == 'virginica'] <- 'new.species'
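Note that if the failing assignment was already run in the same session, the virginica rows are NA by then and there is nothing left to replace. A minimal sketch of the same two steps on a fresh copy (df is an illustrative name), with an optional droplevels() call to discard the now-unused level:

```r
# Start from a fresh copy so no rows have been turned into NA yet.
df <- iris

# Step 1: extend the set of levels; existing values are unaffected.
levels(df$Species) <- c(levels(df$Species), "new.species")

# Step 2: the assignment now succeeds without a warning.
df$Species[df$Species == "virginica"] <- "new.species"

# Optional: "virginica" is still a level with zero rows; drop unused levels.
df$Species <- droplevels(df$Species)

table(df$Species)
```

Whether to drop the empty level depends on the analysis; some summaries and plots will otherwise still list virginica with a count of zero.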


If you want to replace every "species A" with "species B", you'd be better off renaming the level itself:

levels(iris$Species)[match("oldspecies",levels(iris$Species))] <- "newspecies"
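Applied to the original question, the level-renaming approach might look like the sketch below. It works on copies (df and df2 are illustrative names) and uses a logical comparison instead of match(), which selects the same level here. Renaming a level to a name that already exists merges the two levels.

```r
# Rename a level: every "virginica" row becomes "new.species" in one step,
# with no need to add the level beforehand.
df <- iris
levels(df$Species)[levels(df$Species) == "virginica"] <- "new.species"
levels(df$Species)

# Merge into an existing level: "virginica" rows become "setosa",
# and the factor is left with two levels.
df2 <- iris
levels(df2$Species)[levels(df2$Species) == "virginica"] <- "setosa"
table(df2$Species)
```

Because this changes the level rather than individual cells, it affects all rows with that level at once, which is what the question asks for.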
