How does one change the levels of a factor column in a data.table?


Question

What is the correct way to change the levels of a factor column in a data.table (note: not a data frame)?

  library(data.table)
  mydt <- data.table(id=1:6, value=as.factor(c("A", "A", "B", "B", "B", "C")), key="id")

  mydt[, levels(value)]
  [1] "A" "B" "C"

I am looking for something like:

mydt[, levels(value) <- c("X", "Y", "Z")]

But of course, the above line does not work.

    # Actual               # Expected result
    > mydt                  > mydt
       id value                id value
    1:  1     A             1:  1     X
    2:  2     A             2:  2     X
    3:  3     B             3:  3     Y
    4:  4     B             4:  4     Y
    5:  5     B             5:  5     Y
    6:  6     C             6:  6     Z


Recommended answer

You can still set them the traditional way:

levels(mydt$value) <- c(...)

This should be plenty fast unless mydt is very large, since the traditional syntax copies the entire object. You could also play the un-factoring and refactoring game (convert to character, then back to factor), but no one likes that game anyway.
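
As a concrete instance of the traditional syntax, here is a minimal sketch that reuses the mydt table from the question. The replacement levels "X", "Y", "Z" come from the question's expected result; they are not part of the original answer, which leaves the vector elided.

```r
library(data.table)

# Same table as in the question
mydt <- data.table(id = 1:6,
                   value = as.factor(c("A", "A", "B", "B", "B", "C")),
                   key = "id")

# Traditional replacement form. Levels are renamed by position:
# first level A -> X, second level B -> Y, third level C -> Z.
levels(mydt$value) <- c("X", "Y", "Z")

mydt[, levels(value)]
as.character(mydt$value)
```

Because the renaming is positional, the order of the new vector must match the order reported by levels(value) before the change.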

To change the levels by reference, with no copy of mydt:

setattr(mydt$value,"levels",c(...))

But be sure to assign a valid levels vector (type character, of sufficient length); otherwise you will end up with an invalid factor (levels<- does some checking as well as copying).
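
Since setattr skips the checks that levels<- performs, one option is to do a few of them by hand before assigning. The sketch below is only an illustration under assumptions: new_levels is a hypothetical name, the values "X", "Y", "Z" are taken from the question's expected result, and the three checks shown are a reasonable minimum rather than a full copy of what levels<- verifies.

```r
library(data.table)

# Same table as in the question
mydt <- data.table(id = 1:6,
                   value = as.factor(c("A", "A", "B", "B", "B", "C")),
                   key = "id")

# Hypothetical replacement levels (from the question's expected result)
new_levels <- c("X", "Y", "Z")

# Manual sanity checks, because setattr() does not validate the vector:
# it must be character, cover every existing level, and have no duplicates.
stopifnot(is.character(new_levels),
          length(new_levels) >= nlevels(mydt$value),
          !anyDuplicated(new_levels))

# Set the "levels" attribute of the column in place, without copying mydt
setattr(mydt$value, "levels", new_levels)

mydt[, levels(value)]
```

If any check fails, stopifnot raises an error before the column is touched, so the factor is never left in an invalid state.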
