积在许多列的SQL表 [英] Dot product in an SQL table with many columns

查看:156
本文介绍了积在许多列的SQL表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想通过单一的指定行乘以每一行,再总结为每行该产物(点积)。我SQL表是名称和相关的高维向量的列表。该表有1串+ 1000的数值列。通常有几百万行。在矢量/阵列的每个浮子处于一个新列:

I would like to multiply each row by a single specified row, and then sum that product for each row (a dot product.) My SQL table is a list of names and associated high-dimensional vectors. The table has 1 string + 1000 numerical columns. There are usually a few million rows. Each float in the vectors/arrays is in a new column:

+--------+------+-------+------+---+--------+
|  NAME  | COL0 | COL1  | COL2 | … | COL999 |
+--------+------+-------+------+---+--------+
| TOPAZ  | 0.73 | 0.77  | 0.15 |   | 0.55   |
| GARDEN | 0.41 | 0.57  | 0.61 |   | 0.00   |
| HAVE   | 0.40 | 0.32  | 0.23 |   | 0.52   |
+--------+------+-------+------+---+--------+

在阵列数学,这将仅仅是总和(表*行,轴= 1)。这是一个大阵,并需要几秒钟时,它用C写的,生活在纯净的内存来执行,所以速度是非常重要的。但它变得太大,是在本地内存,现在,数据需要生活在一个SQL服务器。如何计算表* GARDEN一种有效的方式,这样我可以得到每一行的点积带花园(下面的例子输出)?

In array math, this would simply be sum(table * row, axis=1). It's a large array, and takes a few seconds to execute when it's written in C and lives in pure memory, so speed is very important. But it's gotten too big to be in local memory now, and the data needs to live in a SQL server. How do I calculate table * GARDEN in an efficient way such that I can get the dot product of every row with GARDEN (example output below)?

+--------+------+-------+------+---+--------+------+
|  NAME  | COL0 | COL1  | COL2 | … | COL999 | DOT  |
+--------+------+-------+------+---+--------+------+
| TOPAZ  | 0.57 | 0.24  | 0.34 |   | 0.21   | 0.46 |
| GARDEN | 0.42 | 0.43  | 0.12 |   | 0.66   | 0.48 |
| HAVE   | 0.31 | 0.37  | 0.30 |   | 0.47   | 0.40 |
+--------+------+-------+------+---+--------+------+

非常感谢!

推荐答案

的结构是:

select t.*,
       (t.col0 * garden.col0 +
        t.col1 * garden.col1 + . . .
        t.col999 * garden.col999
       ) as DOT
from t cross join
     (select t.*
      from t
      where name = 'GARDEN'
     ) garden;

这仍然不会特别快。在C#几秒钟可能是许多分钟,除非你有并行的硬件和可以利用它一个很好的SQL数据库。

This still won't be particularly fast. "A few seconds" in C# might be many minutes, unless you have parallel hardware and a good SQL database that can take advantage of it.

这篇关于积在许多列的SQL表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆