How does one change the levels of a factor column in a data.table
Question
What is the correct way to change the levels of a factor column in a data.table (note: not a data frame)?
library(data.table)
mydt <- data.table(id=1:6, value=as.factor(c("A", "A", "B", "B", "B", "C")), key="id")
mydt[, levels(value)]
[1] "A" "B" "C"
I am looking for something like:
mydt[, levels(value) <- c("X", "Y", "Z")]
But of course, the above line does not work.
# Actual          # Expected result
> mydt            > mydt
   id value          id value
1:  1     A       1:  1     X
2:  2     A       2:  2     X
3:  3     B       3:  3     Y
4:  4     B       4:  4     Y
5:  5     B       5:  5     Y
6:  6     C       6:  6     Z
Recommended answer
You can still set them the traditional way:
levels(mydt$value) <- c(...)
This should be plenty fast unless mydt is very large, since that traditional syntax copies the entire object. You could also play the un-factoring and re-factoring game, but no one likes that game anyway.
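As a concrete sketch of the traditional approach, here it is applied to the mydt example from the question. The replacement levels X, Y, Z are taken from the expected result shown there; the answer itself leaves them elided as c(...).

```r
library(data.table)

# Same table as in the question
mydt <- data.table(id = 1:6,
                   value = as.factor(c("A", "A", "B", "B", "B", "C")),
                   key = "id")

# Levels are replaced positionally: A -> X, B -> Y, C -> Z.
# Per the answer above, this form copies the object.
levels(mydt$value) <- c("X", "Y", "Z")

mydt[, levels(value)]
# [1] "X" "Y" "Z"
```

The underlying integer codes are untouched; only the labels attached to them change, so every row that held A now reads X.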
To change the levels by reference, with no copy of mydt:
setattr(mydt$value,"levels",c(...))
but be sure to assign a valid levels vector (type character, of sufficient length), otherwise you'll end up with an invalid factor (levels<- does some checking as well as copying).
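A minimal sketch of the by-reference variant on the same example data follows. The new_levels name and the manual stopifnot() check are illustrative additions, not part of the original answer; the check stands in for the validation that setattr() skips.

```r
library(data.table)

mydt <- data.table(id = 1:6,
                   value = as.factor(c("A", "A", "B", "B", "B", "C")),
                   key = "id")

# Hypothetical replacement levels, matching the question's expected result
new_levels <- c("X", "Y", "Z")

# setattr() does no validation, so check type and length by hand
stopifnot(is.character(new_levels),
          length(new_levels) >= nlevels(mydt$value))

# Overwrite the "levels" attribute of the column in place
setattr(mydt$value, "levels", new_levels)

mydt[, levels(value)]
# [1] "X" "Y" "Z"
```

If new_levels were shorter than the number of existing levels, some integer codes would point past the end of the levels vector, which is the invalid factor the answer warns about.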