How does one change the levels of a factor column in a data.table
Question
What is the correct way to change the levels of a factor column in a data.table (note: not a data frame)?
library(data.table)
mydt <- data.table(id=1:6, value=as.factor(c("A", "A", "B", "B", "B", "C")), key="id")
mydt[, levels(value)]
[1] "A" "B" "C"
I am looking for something like:
mydt[, levels(value) <- c("X", "Y", "Z")]
But of course, the above line does not work.
# Actual          # Expected result
> mydt            > mydt
   id value          id value
1:  1     A       1:  1     X
2:  2     A       2:  2     X
3:  3     B       3:  3     Y
4:  4     B       4:  4     Y
5:  5     B       5:  5     Y
6:  6     C       6:  6     Z
Recommended answer
You can still set them the traditional way:
levels(mydt$value) <- c(...)
This should be plenty fast unless mydt is very large, since the traditional syntax copies the entire object. You could also play the un-factoring and refactoring game, but no one likes that game anyway.
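As a minimal sketch of the two approaches just mentioned, using the question's mydt and its X/Y/Z target levels: the first call uses the traditional replacement function, and the second rebuilds the factor with := (one way to play the "un-factoring and refactoring game"). The name mydt2 is introduced here only for illustration.

```r
library(data.table)

mydt <- data.table(id = 1:6,
                   value = as.factor(c("A", "A", "B", "B", "B", "C")),
                   key = "id")
mydt2 <- copy(mydt)  # illustrative second table for the alternative below

# Traditional way: new levels replace the old ones by position (A->X, B->Y, C->Z).
levels(mydt$value) <- c("X", "Y", "Z")
mydt[, levels(value)]
# [1] "X" "Y" "Z"

# Un-factor and re-factor: convert to character, then rebuild with new labels.
mydt2[, value := factor(as.character(value),
                        levels = c("A", "B", "C"),
                        labels = c("X", "Y", "Z"))]
as.character(mydt2$value)
# [1] "X" "X" "Y" "Y" "Y" "Z"
```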
To change the levels by reference with no copy of mydt:
setattr(mydt$value,"levels",c(...))
but be sure to assign a valid levels vector (type character, of sufficient length), otherwise you'll end up with an invalid factor (levels<- does some checking as well as copying).
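Because setattr skips the checks that levels<- performs, one option is to validate the new vector yourself before assigning. The helper below, set_levels_by_ref, is a hypothetical name and not part of data.table; its checks are an assumption about what "valid" should mean here (character type, at least as many entries as existing levels, no NA, no duplicates).

```r
library(data.table)

mydt <- data.table(id = 1:6,
                   value = as.factor(c("A", "A", "B", "B", "B", "C")),
                   key = "id")

# Hypothetical helper (not part of data.table): check the new levels,
# then set them by reference with setattr().
set_levels_by_ref <- function(f, new_levels) {
  stopifnot(is.factor(f),
            is.character(new_levels),
            length(new_levels) >= nlevels(f),  # "sufficient length"
            !anyNA(new_levels),
            !anyDuplicated(new_levels))
  setattr(f, "levels", new_levels)
  invisible(f)
}

set_levels_by_ref(mydt$value, c("X", "Y", "Z"))
mydt[, levels(value)]
# [1] "X" "Y" "Z"
```

A call such as set_levels_by_ref(mydt$value, c("X", "Y")) stops with an error instead of leaving mydt holding an invalid factor.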