BigQuery GitHub data: How to handle repo name changes?


Problem description

My goal is to track the total number of stars of my repo. However, its repo.name has changed over time. How can I achieve this with the githubarchive dataset?

Solution

(related to https://stackoverflow.com/a/42930963/132438)

GitHub project names change over time, so it is safer to query by id than by name. You could look up the project id in a separate query, or do it all in a single query like this:

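-- Resolve the repo id from its name in one monthly table, then count
-- star (WatchEvent) events for that id across all monthly tables.
SELECT
  COUNT(*) AS naive_count,
  COUNT(DISTINCT actor.id) AS unique_by_actor_id,
  COUNT(DISTINCT actor.login) AS unique_by_actor_login
FROM `githubarchive.month.*`
WHERE repo.id = (
    -- Any month in which the repo used this name works for the lookup.
    SELECT repo.id
    FROM `githubarchive.month.201702`
    WHERE repo.name = 'bazelbuild/bazel'
    LIMIT 1)
  AND type = 'WatchEvent'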
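
The "separate query" alternative mentioned above could be sketched as a small BigQuery script: look the id up once, keep it in a variable, then reuse it. This is an illustrative sketch rather than part of the original answer. The variable name target_repo_id and the output column aliases are made up for the example, and it assumes repo.id is an INT64 column that is populated in the months being scanned.

```sql
-- Step 1: resolve the name to the numeric id once.
-- target_repo_id is a hypothetical variable name chosen for this sketch.
DECLARE target_repo_id INT64 DEFAULT (
  SELECT repo.id
  FROM `githubarchive.month.201702`
  WHERE repo.name = 'bazelbuild/bazel'
  LIMIT 1
);

-- Step 2: list every name recorded for that id, with the time range
-- in which each name appears, to see when the rename happened.
SELECT
  repo.name AS repo_name,
  MIN(created_at) AS first_seen,
  MAX(created_at) AS last_seen,
  COUNT(*) AS events
FROM `githubarchive.month.*`
WHERE repo.id = target_repo_id
GROUP BY repo_name
ORDER BY first_seen;
```

Once the id is known, the star count above can filter on the variable (or on the literal id) instead of repeating the name lookup, and the name history makes it easy to confirm that events logged under the old name are included.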
