Build adjacency matrix from list of weighted edges in BigQuery
Question
Related question:
How to create a list of dummy variables for thousands of categories in Google BigQuery
I have a table of weighted edges, which is a list of user-item ratings. It looks like this:
| userId | itemId | rating
| 001 | 001 | 5.0
| 001 | 002 | 4.0
| 002 | 001 | 4.5
| 002 | 002 | 3.0
I want to convert this weighted edge list into an adjacency matrix:
| userId | item001 | item002
| 001 | 5.0 | 4.0
| 002 | 4.5 | 3.0
According to this post, we can do it in two steps: the first step extracts the distinct values that become the matrix columns and uses them to generate a query, and the second step runs the query generated in the first step.
But my question is how to extract the rating value and use it in the IF() statement. My intuition is to put a nested query inside the IF() statement, such as:
IF(itemId = blah,
(select rating
from mytable
where
userId = blahblah
and itemId = blah),
0)
But this query looks too expensive. Can someone give me an example?
Thanks
Recommended answer
Unless I am missing something, it is quite similar to the post you referenced. No nested query is needed: the rating column can be used directly inside IF() and aggregated with SUM().
Step 1 - generate query
SELECT 'SELECT userID, ' +
GROUP_CONCAT_UNQUOTED(
'SUM(IF(itemId = "' + STRING(itemId) + '", rating, 0)) AS item' + STRING(itemId)
)
+ ' FROM YourTable GROUP BY userId'
FROM (
SELECT itemId
FROM YourTable
GROUP BY itemId
)
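If the list of item IDs is already available on the client side, the same query text can also be assembled outside BigQuery. The following is a minimal Python sketch of that idea and is not part of the original answer. The helper name `build_pivot_query` and its `table` parameter are made up for illustration, and the sketch assumes the item IDs are trusted values that are safe to splice into SQL text and into column aliases.

```python
def build_pivot_query(item_ids, table="YourTable"):
    """Build a pivot query with one SUM(IF(...)) column per item ID.

    Hypothetical helper: item IDs are interpolated as-is, so they must be
    trusted and must form valid column aliases when prefixed with "item".
    """
    columns = [
        f'SUM(IF(itemId = "{item_id}", rating, 0)) AS item{item_id}'
        for item_id in item_ids
    ]
    return "SELECT userId, " + ", ".join(columns) + f" FROM {table} GROUP BY userId"


# Example with the two item IDs from the question.
query = build_pivot_query(["001", "002"])
print(query)
```

The printed text is the query that Step 2 runs, written on a single line.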
Step 2 - run generated query
SELECT
userID,
SUM(IF(itemId = "001", rating, 0)) AS item001,
SUM(IF(itemId = "002", rating, 0)) AS item002
FROM YourTable
GROUP BY userId
Expected result
userID item001 item002
001 5.0 4.0
002 4.5 3.0
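The conditional-aggregation pattern can be checked locally on the sample data from the question. The sketch below is a stand-in under stated assumptions, not BigQuery itself: it uses Python's built-in `sqlite3` module with an in-memory table, and it writes the condition as `CASE WHEN` instead of `IF()`, because `IF()` is not portable across SQL dialects.

```python
import sqlite3

# Sample edge list from the question: (userId, itemId, rating).
rows = [
    ("001", "001", 5.0),
    ("001", "002", 4.0),
    ("002", "001", 4.5),
    ("002", "002", 3.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (userId TEXT, itemId TEXT, rating REAL)")
conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?)", rows)

# One pass over the table: each output column sums only the ratings
# whose itemId matches that column, and contributes 0 otherwise.
pivot_sql = """
SELECT
  userId,
  SUM(CASE WHEN itemId = '001' THEN rating ELSE 0 END) AS item001,
  SUM(CASE WHEN itemId = '002' THEN rating ELSE 0 END) AS item002
FROM YourTable
GROUP BY userId
ORDER BY userId
"""
matrix = conn.execute(pivot_sql).fetchall()
conn.close()

for row in matrix:
    print(row)
```

Each user's row is built in a single scan with no per-cell subquery, which is why this form avoids the cost the question was worried about.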