Changing factor levels on a column with setattr is sensitive to how the column was selected
Problem description
I want to change the factor levels of a column using setattr. However, when the column is selected the standard data.table way (dt[ , col]), the levels are not updated. On the other hand, when selecting the column in an unorthodox way in a data.table setting, namely using $, it works.
library(data.table)
# Some data
d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
d
# x y
# 1: b 1
# 2: a 2
# 3: a 3
# 4: b 4
# We want to change levels of 'x' using setattr
# New desired levels
lev <- c("a_new", "b_new")
# Select column in the standard data.table way
setattr(x = d[ , x], name = "levels", value = lev)
# Levels are not updated
d
# x y
# 1: b 1
# 2: a 2
# 3: a 3
# 4: b 4
# Select column in a non-standard data.table way using $
setattr(x = d$x, name = "levels", value = lev)
# Levels are updated
d
# x y
# 1: b_new 1
# 2: a_new 2
# 3: a_new 3
# 4: b_new 4
# Just check if d[ , x] really is the same as d$x
d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
identical(d[ , x], d$x)
# [1] TRUE
# Yes, it seems so
It feels like I'm missing some data.table (R?) basics here. Can anyone explain what's going on?
I found two other posts about setattr and levels; one of them is:
setattr on levels preserving unwanted duplicates (R data.table)
Both of them used $ to select the column. Neither of them mentioned the [ , col] way.
Recommended answer
It might help to look at the address returned by each expression:
address(d$x)
# [1] "0x10e4ac4d8"
address(d$x)
# [1] "0x10e4ac4d8"
address(d[,x])
# [1] "0x105e0b520"
address(d[,x])
# [1] "0x105e0a600"
Note that the address of the first expression (d$x) stays the same when you call it multiple times, while the address of the second (d[ , x]) changes on every call. This indicates that d[ , x] makes a copy of the column, so calling setattr on it modifies only that temporary copy and has no effect on the original data.table.
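To make the difference concrete, here is a minimal sketch that reuses the data from the question. The variable names (x_copy, same_value, same_object, and so on) are only illustrative, and the behaviour shown assumes a data.table version in which d[ , x] returns a copy, as the addresses above suggest. It also shows why the identical() check in the question was not enough: identical() compares values, not memory locations.

```r
library(data.table)

d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
lev <- c("a_new", "b_new")

# identical() compares values only; address() shows whether two
# expressions refer to the same object in memory
same_value  <- identical(d[ , x], d$x)            # TRUE
same_object <- address(d[ , x]) == address(d$x)   # FALSE: d[ , x] is a copy

# Setting the attribute on the copy changes the copy, not the column in d
x_copy <- d[ , x]
setattr(x_copy, "levels", lev)
levels_copy   <- levels(x_copy)   # "a_new" "b_new"
levels_before <- levels(d$x)      # "a" "b"

# d$x refers to the column itself, so this updates d by reference
setattr(d$x, "levels", lev)
levels_after <- levels(d$x)       # "a_new" "b_new"
```

So if the goal is to relabel the levels in place, selecting the column with $ (as in the question's second attempt) is the form that reaches the actual column.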