Matrix operations and component-wise addition using data.table


Question

What is the best way to do component-wise matrix addition if the number of matrices to be summed is not known in advance? More generally, is there a good way to perform matrix (or multi-dimensional array) operations in the context of data.table? I use data.table for its efficiency at sorting and grouping data by several fixed variables, or categories, each comprising a different number of observations.

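For example:

  1. Compute a matrix (here, an outer product) from each observation (row) of data, returning one matrix per row.

  2. Sum those matrices component-wise over all rows in each data category.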

Here illustrated with 2x2 matrices and only one category:

library(data.table)

# example data, number of rows differs by category t
N <- 5
dt <- data.table(t = rep(c("a", "b"), each = 3, len = N), 
                 x1 = rep(1:2, len = N), x2 = rep(3:5, len = N),
                 y1 = rep(1:3, len = N), y2 = rep(2:5, len = N))
setkey(dt, t)
> dt
   t x1 x2 y1 y2
1: a  1  3  1  2
2: a  2  4  2  3
3: a  1  5  3  4
4: b  2  3  1  5
5: b  1  4  2  2

I attempted a function to compute the matrix sum of the outer product, %o%:

mat_sum <- function(x1, x2, y1, y2){
  x <- c(x1, x2) # x vector
  y <- c(y1, y2) # y vector
  xy <- x %o% y # outer product (i.e. 2x2 matrix)
  sum(xy)  # <<< THIS RETURNS A SINGLE VALUE, NOT WHAT I WANT.
  }

which, of course, does not work because sum adds up all the elements across the arrays.

I saw this answer using Reduce('+', .list) but that seems to require already having a list of all the matrices to be added. I haven't figured out how to do that within data.table, so instead I've got a cumbersome work-around:

# extract each outer product component first...
mat_comps <- function(x1, x2, y1, y2){
  x <- c(x1, x2) # x vector
  y <- c(y1, y2) # y vector
  xy <- x %o% y # outer product (i.e. 2x2 matrix)
  xy11 <- xy[1,1]
  xy21 <- xy[2,1]
  xy12 <- xy[1,2]
  xy22 <- xy[2,2]
  return(c(xy11, xy21, xy12, xy22))
}

# ...then running this function on dt, 
# taking extra step (making column 'n') to apply it row-by-row...
dt[, n := 1:nrow(dt)]
dt[, c("xy11", "xy21", "xy12", "xy22") := as.list(mat_comps(x1, x2, y1, y2)), 
   by = n]

# ...then sum them individually, now grouping by t
s <- dt[, list(s11 = sum(xy11),
               s21 = sum(xy21),
               s12 = sum(xy12),
               s22 = sum(xy22)),
        by = key(dt)]
> s
   t s11 s21 s12 s22
1: a   8  26  12  38
2: b   4  11  12  23


Recommended answer

In general, data.table is designed to work with columns. The more you transform your problem to col-wise operations, the more you can get out of data.table.

Here's an attempt at accomplishing this operation col-wise. Probably there are better ways. This is intended more as a template, to provide an idea on approaching the problem (even though I understand it may not be possible in all cases).

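# run this on the original dt, before any other columns starting with "x" or "y" are added
xcols <- grep("^x", names(dt)) # positions of the x columns
ycols <- grep("^y", names(dt)) # positions of the y columns
combs <- CJ(ycols, xcols)      # every (y, x) pair of column positions
len <- seq_len(nrow(combs))
cols = paste("V", len, sep="")
for (i in len) {
    # index combs by position, so this works whether CJ() names its
    # columns V1, V2 or ycols, xcols
    c1 = combs[[2]][i]         # x column
    c2 = combs[[1]][i]         # y column
    set(dt, i=NULL, j=cols[i], value = dt[[c1]] * dt[[c2]])
}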

#    t x1 x2 y1 y2 V1 V2 V3 V4
# 1: a  1  3  1  2  1  3  2  6
# 2: a  2  4  2  3  4  8  6 12
# 3: a  1  5  3  4  3 15  4 20
# 4: b  2  3  1  5  2  3 10 15
# 5: b  1  4  2  2  2  8  2  8

This basically applies the outer product col-wise. Now it's just a matter of aggregating it.

dt[, lapply(.SD, sum), by=t, .SDcols=cols]

#    t V1 V2 V3 V4
# 1: a  8 26 12 38
# 2: b  4 11 12 23
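
If keeping each group's result as an actual matrix is preferable, the Reduce('+', .list) idea mentioned in the question can be applied inside j by building the list of per-row matrices on the fly. The following is only a sketch under that assumption, not part of the original answer; the names by_reduce, by_crossprod and the list column m are illustrative. It also shows crossprod() as a shortcut, relying on the identity that the sum of row-wise outer products equals t(X) %*% Y.

```r
library(data.table)

# Same example data as in the question
N <- 5
dt <- data.table(t = rep(c("a", "b"), each = 3, len = N),
                 x1 = rep(1:2, len = N), x2 = rep(3:5, len = N),
                 y1 = rep(1:3, len = N), y2 = rep(2:5, len = N))

# (a) One outer product per row of the group, then added with Reduce().
#     The summed matrix is wrapped in list() so that it occupies a
#     single cell of a list column.
by_reduce <- dt[, {
  mats <- lapply(seq_len(.N),
                 function(i) c(x1[i], x2[i]) %o% c(y1[i], y2[i]))
  list(m = list(Reduce(`+`, mats)))
}, by = t]

# (b) The same sums without materialising the per-row matrices:
#     crossprod(X, Y) computes t(X) %*% Y for the group's x and y columns.
by_crossprod <- dt[, list(m = list(crossprod(cbind(x1, x2), cbind(y1, y2)))),
                   by = t]

by_reduce$m[[1]]     # summed 2x2 matrix for t == "a"
by_crossprod$m[[2]]  # summed 2x2 matrix for t == "b"
```

The entries of each matrix should agree with the V1..V4 (or s11..s22) sums above, for example 8, 26, 12, 38 in column-major order for category a. The Reduce() version generalises to any per-row matrix function, while crossprod() only covers the outer-product case but avoids the row-by-row loop.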

HTH

Edit: tweaked cols, c1 and c2 a bit to get V2 and V3 right.
