BigQuery: When is GHTorrent refreshed and how to get up to date information?


Question

The ghtorrent-bq data is great for getting a snapshot of GitHub. However, it is not clear when it is updated or how I could get more up-to-date data.

Answer

(related to https://stackoverflow.com/a/42930963/132438)

GHTorrent publishes only periodic snapshots of its data on BigQuery, while GitHub Archive is updated daily (or even hourly - let me check that).
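One way to see how fresh a given snapshot is would be to look at the dataset's table metadata. The query below is only a sketch: it assumes the `ghtorrent-bq.ght_2017_01_19` dataset used in this answer is still readable and that your account is allowed to query its `__TABLES__` metadata view, which stores timestamps as milliseconds since the epoch.

```sql
#standardSQL
-- Sketch: list each table in the snapshot dataset with its creation and
-- last-modified times. Assumes the dataset name from this answer and
-- read access to the __TABLES__ metadata view.
SELECT
  table_id,
  TIMESTAMP_MILLIS(creation_time) AS created,
  TIMESTAMP_MILLIS(last_modified_time) AS last_modified,
  row_count
FROM `ghtorrent-bq.ght_2017_01_19.__TABLES__`
ORDER BY last_modified DESC
```

If the newest `last_modified` value is well in the past, the snapshot is stale and the more recent activity has to come from GitHub Archive.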

It would be great to have more frequent GHTorrent snapshots (maybe https://twitter.com/gousiosg can help), but in the meantime you can merge both datasets: take the watchers from the GHTorrent snapshot, then add the latest stars from GitHub Archive:

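#standardSQL
-- Count the distinct users who starred angular/angular: the GHTorrent snapshot
-- plus the WatchEvents that GitHub Archive recorded during 2017.
-- COUNT(DISTINCT ...) drops users that appear in both sources.
SELECT COUNT(DISTINCT login) c
FROM (
  SELECT login
  FROM (
    -- Stargazers as of the 2017-01-19 GHTorrent snapshot
    SELECT c.login
    FROM `ghtorrent-bq.ght_2017_01_19.watchers` a
    JOIN `ghtorrent-bq.ght_2017_01_19.projects` b
    ON a.repo_id = b.id
    JOIN `ghtorrent-bq.ght_2017_01_19.users` c
    ON a.user_id = c.id
    WHERE b.url = 'https://api.github.com/repos/angular/angular'
  )
  UNION ALL (
    -- Stars recorded in 2017 (GitHub emits a WatchEvent when a user stars a repo)
    SELECT actor.login
    FROM `githubarchive.month.2017*`
    WHERE repo.name = 'angular/angular'
    AND type = "WatchEvent"
  )
)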
