Operations (+, -, /, *) on unequal-sized data.table
Problem description
1) Is it possible to do operations (multiplication, division, addition, subtraction) between unequal-sized data.tables using data.table, or will it have to be done with data.frame?

The following example is a simplified version of my original posting. In my actual data set, it would be A1:A12, B1:B12, C1:C12, E1:E12, F1:F12, etc. I've added columns J and K to get close to my original data set and to show that I cannot do the following in a matrix.
# Sample Data
library(data.table)

input1a <- data.table(ID = c(37, 45, 900),
                      A1 = c(1, 2, 3),
                      A2 = c(43, 320, 390),
                      B1 = c(-0.94, 2.2, -1.223),
                      B2 = c(2.32, 4.54, 7.21),
                      C1 = c(1, 2, 3),
                      C2 = c(-0.94, 2.2, -1.223),
                      D = c(43, 320, 390),
                      J = paste0("measurement_1", 1:3),
                      K = paste0("type_1", 1:3))
setkey(input1a, ID)
input1a
#     ID A1  A2     B1   B2 C1     C2   D              J       K
# 1:  37  1  43 -0.940 2.32  1 -0.940  43 measurement_11 type_11
# 2:  45  2 320  2.200 4.54  2  2.200 320 measurement_12 type_12
# 3: 900  3 390 -1.223 7.21  3 -1.223 390 measurement_13 type_13

input2a <- data.table(ID = c(37, 45, 900),
                      E1 = c(23, -0.2, 12),
                      E2 = c(-0.33, -0.012, -1.342))
setkey(input2a, ID)
input2a
#     ID   E1     E2
# 1:  37 23.0 -0.330
# 2:  45 -0.2 -0.012
# 3: 900 12.0 -1.342
outputa <- 0.00066 * input1a[, c(4:5), with = FALSE] *
  input1a[, 8, with = FALSE] * input2a[, c(2:3), with = FALSE] # no keys, but would
# like to keep the keys
# outputa <- 0.00066 * B1:B2 * D * A1:A2 / referring back to the column names
setnames(outputa, 2:3, c("F1", "F2"))

Result using outputa

outputa # using existing code and gives a result with no keys
#            F1          F2
# 1: -0.6135756 -0.02172773
# 2: -0.0929280 -0.01150618
# 3: -3.7776024 -2.49055607
In the following code I took outputa, which did not keep the keys, and rewrote outputa as outputause. I would like to have the following question answered so that I can perform the needed operations on the data set while keeping the keys intact.
2) How can the following code be rewritten with x defined for each group of columns? This question stems from Weighted sum of variables by groups with data.table and my trouble trying to replicate any of the answers with my data set.
Each group of columns is defined below:
- A1:A2 (input1a[, 2:3]),
- B1:B2 (input1a[, 4:5]), and
- D (input1a[, 8])
In outputause, if input1a[, c(4:5), with = FALSE] was the only group from input1a, then alone it would be x. What about when you have more than one group from a single data.table, as is shown below?

outputause <- input1a[, lapply(.SD, function(x) {
  0.00066 * input1a[, c(4:5), with = FALSE] * input1a[, 8, with = FALSE] *
    input2a[, c(2, 3), with = FALSE]
}), by = key(input1a)] # keeping keys intact
setnames(outputause, 2:3, c("F1", "F2"))
Result using outputause
outputause # using revised code and result includes the keys
#     ID         F1          F2
# 1:  37 -0.6135756 -0.02172773
# 2:  45 -0.0929280 -0.01150618
# 3: 900 -3.7776024 -2.49055607
UPDATE
input2at <- data.table(t(input2a))
inputs <- data.table(input1a, input2at)
I have transposed input2a and combined it with input1a in the data.table inputs. In this simple example I had 3 rows, but in my actual data set I'll have close to 1300 rows. This is the reason why I've asked question 2).

Thank you.
Solution
I am answering my own question based on an answer provided to me in "R data.table operations with multiple groups in single data.table and outside function with lapply".
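# Combine both tables column-wise, then duplicate D so that every
# group has an indexed column (B1/B2, D1/D2, E1/E2).
outputa <- data.table(input1a, input2a)
setnames(outputa, 8, "D1")
outputa[, D2 := D1]

fun <- function(B, D, E) 0.00066 * B * D * E

# For i = 1, 2 look up B<i>, D<i> and E<i> by name and apply fun().
outputa[, lapply(1:2, function(i) fun(get(paste0('B', i)),
                                      get(paste0('D', i)),
                                      get(paste0('E', i)))),
        by = ID]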
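For comparison, here is a minimal sketch of another way to get the same F1 and F2 values while keeping ID as the key. It is not part of the answer above. It assumes the two tables can be matched on ID with a keyed join instead of being combined by position. The name joined, the loop, and the trimmed-down tables (only the columns used in the formula) are my own choices for illustration.

```r
library(data.table)

# Trimmed copies of the sample data: only the columns the formula uses.
input1a <- data.table(ID = c(37, 45, 900),
                      B1 = c(-0.94, 2.2, -1.223),
                      B2 = c(2.32, 4.54, 7.21),
                      D  = c(43, 320, 390))
input2a <- data.table(ID = c(37, 45, 900),
                      E1 = c(23, -0.2, 12),
                      E2 = c(-0.33, -0.012, -1.342))
setkey(input1a, ID)
setkey(input2a, ID)

# Keyed join: rows are matched by ID rather than by row position.
joined <- input1a[input2a]

# F<i> = 0.00066 * B<i> * D * E<i>; the single column D is reused for each i.
for (i in 1:2) {
  set(joined, j = paste0("F", i),
      value = 0.00066 * joined[[paste0("B", i)]] * joined[["D"]] *
        joined[[paste0("E", i)]])
}

# Keep the key column plus the new columns, and (re)set the key explicitly.
outputause <- joined[, c("ID", "F1", "F2"), with = FALSE]
setkey(outputause, ID)
outputause
```

With 12 column pairs, as in the real data set described in the question, the loop range would change from 1:2 to 1:12, provided the columns follow the same B<i>/E<i> naming.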