Structuring BigQuery on trigrams with multiple inputs

Views: 135 Published: 2018/5/7 17:47:43 Tag: google-bigquery


Problem description

Presently, thanks to help from the answerer of this question, I am able to successfully query a word and get a list of the most popular follow-on words. For example, using the word "great", I am able to get a list of up to 10 words with the following query:

  SELECT second, SUM(cell.page_count) total
  FROM [publicdata:samples.trigrams]
  WHERE first = "great"
  GROUP BY 1
  ORDER BY 2 DESC
  LIMIT 10

With the output:

  second      total
  ------------------
  deal        3048832
  and         1689911
  ,           1576341
  a           1019511
  number      984993
  many        875974
  importance  805215
  part        739409
  .           700694
  as          628978

What I am currently having trouble figuring out is how to run this query for multiple words automatically (as opposed to calling a query on a separate word each time), so that I could have an output such as:

  "great"  total    "new_word_1"           new_total_1  ...  "new_word_N"    new_total_N
  ---------------------------------------------------------------------------------------
  deal     3048832  "new_follow_on_word1"  123456       ...  "follow_on_N1"  234567
  and      1689911  "new_follow_on_word2"  12345        ...  "follow_on_N2"  123456

Essentially, I would like to pass N words in a single query (for example, new_word_1 is a totally different word like "baseball", with no relation to "great") and get the total counts related to each word in a separate column.

Additionally, after learning about BigQuery's pricing, I am also having trouble figuring out how to limit the total data queried as much as possible. I can think of using only the latest data (say, 2010 onwards) and 2 alphanumeric outputs per word, but I may be missing more obvious limiters.

Any help on this is much appreciated - thanks!
Solution

You can put multiple first words in the same query, but the top 10 following words have to be computed separately for each first word, and the results then joined together. Here is an example for "great" and "baseball":

  SELECT word1, total1, word2, total2 FROM
    (SELECT ROW_NUMBER() OVER() rowid1, word1, total1 FROM (
       SELECT second AS word1, SUM(cell.page_count) total1
       FROM [publicdata:samples.trigrams]
       WHERE first = "great"
       GROUP BY 1 ORDER BY 2 DESC LIMIT 10)) a1
  JOIN
    (SELECT ROW_NUMBER() OVER() rowid2, word2, total2 FROM (
       SELECT second AS word2, SUM(cell.page_count) total2
       FROM [publicdata:samples.trigrams]
       WHERE first = "baseball"
       GROUP BY 1 ORDER BY 2 DESC LIMIT 10)) a2
  ON a1.rowid1 = a2.rowid2
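
The question asks for N words rather than two, so the same pattern can be generated instead of written by hand. The following is a minimal sketch, not part of the original answer: `build_followon_query` is a hypothetical helper name, it only produces the query text (it does not call BigQuery), and it assumes that legacy SQL accepts the extra subqueries as a chain of JOINs against the first one. Because the joins are inner joins on the row number, a first word with fewer than `limit` follow-on words would shorten the whole result.

```python
def build_followon_query(first_words, limit=10,
                         table="[publicdata:samples.trigrams]"):
    """Build a legacy-SQL query with one (wordI, totalI) column pair per first word.

    Hypothetical helper: it repeats the ranked subquery from the answer once
    per input word and joins every subquery to the first one on its row number.
    """
    if not first_words:
        raise ValueError("need at least one first word")
    limit = int(limit)

    subqueries = []
    for i, word in enumerate(first_words, start=1):
        # Refuse characters that would break out of the quoted SQL literal.
        if '"' in word or "\\" in word:
            raise ValueError(f"unsupported character in {word!r}")
        subqueries.append(
            f"(SELECT ROW_NUMBER() OVER() rowid{i}, word{i}, total{i} FROM (\n"
            f"   SELECT second AS word{i}, SUM(cell.page_count) total{i}\n"
            f"   FROM {table}\n"
            f'   WHERE first = "{word}"\n'
            f"   GROUP BY 1 ORDER BY 2 DESC LIMIT {limit})) a{i}"
        )

    count = len(first_words)
    columns = ", ".join(f"word{i}, total{i}" for i in range(1, count + 1))

    # The first subquery is the anchor; every later one is joined to it by rank.
    joined = subqueries[0]
    for i in range(2, count + 1):
        joined += f"\nJOIN {subqueries[i - 1]}\nON a1.rowid1 = a{i}.rowid{i}"

    return f"SELECT {columns} FROM\n{joined}"
```

Calling `build_followon_query(["great", "baseball"])` yields a query with the same structure as the one above; adding a third word appends `word3, total3` to the select list and one more JOIN. Each subquery still scans the table on its own, so this changes the output shape, not the amount of data queried.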
