Changing factor levels on a column with setattr is sensitive to how the column was selected


Problem description

I want to change the factor levels of a column using setattr. However, when the column is selected the standard data.table way (dt[ , col]), the levels are not updated. On the other hand, when the column is selected with $, which is unorthodox in a data.table setting, it works.

library(data.table)

# Some data 
d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
d
#    x y
# 1: b 1
# 2: a 2
# 3: a 3
# 4: b 4

# We want to change levels of 'x' using setattr
# New desired levels
lev <- c("a_new", "b_new")

# Select column in the standard data.table way 
setattr(x = d[ , x], name = "levels", value = lev)

# Levels are not updated
d
#    x y
# 1: b 1
# 2: a 2
# 3: a 3
# 4: b 4

# Select column in a non-standard data.table way using $
setattr(x = d$x, name = "levels", value = lev)

# Levels are updated
d
#        x y
# 1: b_new 1
# 2: a_new 2
# 3: a_new 3
# 4: b_new 4

# Just check if d[ , x] really is the same as d$x
d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
identical(d[ , x], d$x)
# [1] TRUE
# Yes, it seems so

It feels like I'm missing some data.table (R?) basics here. Can anyone explain what's going on?

I looked at these related questions about setattr and levels:

setattr on levels preserving unwanted duplicates (R data.table)

How to change the levels of a factor column in a data.table

Both of them used $ to select the column. Neither of them mentioned the [ , col] way.

Recommended answer

It may help to compare the memory address returned by each expression:

address(d$x)
# [1] "0x10e4ac4d8"
address(d$x)
# [1] "0x10e4ac4d8"


address(d[,x])
# [1] "0x105e0b520"
address(d[,x])
# [1] "0x105e0a600"

Note that the address returned for d$x stays the same across calls, while the address returned for d[ , x] changes on every call. That indicates d[ , x] returns a new copy of the column each time, so setattr on that copy has no effect on the original data.table.
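To tie this back to the question, here is a minimal sketch that reuses the question's data. It assumes, as the addresses above suggest, that d[ , x] hands back a copy while d$x hands back the column stored in d. The helper variable names (same_value, same_address, x_copy, levels_after_copy, levels_after_ref) are made up for illustration.

```r
library(data.table)

d <- data.table(x = factor(c("b", "a", "a", "b")), y = 1:4)
lev <- c("a_new", "b_new")

# identical() compares values, not memory locations, so it cannot
# tell a copy apart from the original column
same_value   <- identical(d[ , x], d$x)           # TRUE, as in the question
same_address <- address(d[ , x]) == address(d$x)  # FALSE when d[ , x] is a copy

# Setting levels on the copy changes only the copy
x_copy <- d[ , x]
setattr(x_copy, "levels", lev)
levels_after_copy <- levels(d$x)   # d still has the old levels

# Setting levels on d$x changes the column inside d by reference
setattr(d$x, "levels", lev)
levels_after_ref <- levels(d$x)    # d now has the new levels
```

So the identical() check in the question was not misleading about the values; it just says nothing about whether the two expressions point at the same object, which is what matters for a by-reference function like setattr.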
