How to reference column names that start with a number, in data.table


Question

If the column names in a data.table are of the form number + character, for example 4PCS, 5Y, etc., how can they be referenced as j in x[i,j] so that they are interpreted as unquoted column names?

I assume this would solve my original problem. I wanted to add several columns in a 'data.table' whose names have the form number + character.

> M <- data.table('4PCS'=1:4,'5Y'=4:1,X5Y=2:5)
> M[,4PCS+5Y]
Error: unexpected symbol in "M[,4PCS"

The new column should be the sum of 4PCS and 5Y.

Is there a way to refer to them in data.table in unquoted form? If these columns are referred to in data.table with the quoted "logic" of data.frame:

> M[,'5Y',with=FALSE]
     5Y
[1,]  4
[2,]  3
[3,]  2
[4,]  1

then the functionality of such a reference is limited. Addition does not work, just as it does not work in data.frame:

> M[,'4PCS'+'5Y',with=FALSE]  
Error in "4PCS" + "5Y" : non-numeric argument to binary operator

The data.table functionality allows operating on columns directly. I would like to find a solution within the data.table logic, so that I can use its ability to transform columns by column name referencing.

The question is:
How do I quote a column name that starts with a number so that data.table understands it is a column name?

Solution

I think this is what you're looking for, though I'm not sure. data.table is different from data.frame. Please have a look at the quick introduction, and then the FAQ (and also the reference manual if necessary).

require(data.table)
dt <- data.table("4PCS" = 1:3, y=3:1)
#    4PCS y
# 1:    1 3
# 2:    2 2
# 3:    3 1

# access column 4PCS
dt[, "4PCS"]

# returns a data.table
#    4PCS
# 1:    1
# 2:    2
# 3:    3

# to access multiple columns by name
dt[, c("4PCS", "y")]
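When the column names are held in a variable rather than typed inline, one option is to pass with = FALSE so that j is treated as a vector of names. This is only a sketch: `cols` and `sub` are hypothetical names introduced for illustration.

```r
library(data.table)

dt <- data.table("4PCS" = 1:3, y = 3:1)

# `cols` is a hypothetical helper variable holding the names to select
cols <- c("4PCS", "y")

# with = FALSE makes data.table read j as column names, not as an expression
sub <- dt[, cols, with = FALSE]
names(sub)
# [1] "4PCS" "y"
```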

Alternatively, if you need the column as a vector rather than a data.table, you can access it using the $ notation:

dt$`4PCS` # notice the ` because the variable begins with a number
# [1] 1 2 3

# alternatively, as mnel mentioned under comments:
dt[, `4PCS`] 
# [1] 1 2 3

Or, if you know the column number, you can access it using [[ ]] as follows:

dt[[1]] # 4PCS is the first column here
# [1] 1 2 3
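The same [[ ]] operator also accepts a column name as a string, which may be convenient when the name is not a valid R identifier. A minimal sketch, where `total` is a hypothetical variable name:

```r
library(data.table)

dt <- data.table("4PCS" = 1:3, y = 3:1)

# [[ with a quoted name returns the column as a plain vector,
# so ordinary arithmetic works on the result
total <- dt[["4PCS"]] + dt[["y"]]
total
# [1] 4 4 4
```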


Edit:

Thanks @joran. I think you're looking for this:

dt[, `4PCS` + y]
# [1] 4 4 4

Fundamentally, the issue is that 4PCS is not a valid variable name in R (try 4PCS <- 1, and you'll get the same "unexpected symbol" error). So to refer to it, we have to use backticks (compare `4PCS` <- 1).
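Applied to the table from the question, the backtick approach might look like the sketch below. The column name `total` and the replacement names `PCS4` and `Y5` are hypothetical choices for illustration. Renaming with setnames is shown only as one possible alternative for anyone who prefers to avoid backticks.

```r
library(data.table)

M <- data.table("4PCS" = 1:4, "5Y" = 4:1, X5Y = 2:5)

# add the sum as a new column by reference, quoting each name with backticks
M[, total := `4PCS` + `5Y`]
M$total
# [1] 5 5 5 5

# alternative: rename the columns to syntactically valid names first
setnames(M, c("4PCS", "5Y"), c("PCS4", "Y5"))
renamed_sum <- M[, PCS4 + Y5]
```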
