Selecting all related rows with BigQuery (reading logs from GAE)

Question

My Google App Engine logs are being exported to BigQuery via the standard streaming export tool. I'd like to run a query that amounts to "show me all log lines for requests in which any log line contains a given string".

This query gives me the request IDs I'm interested in:

SELECT protoPayload.requestId AS reqId
  FROM TABLE_QUERY(logs, 'true')
  WHERE protoPayload.line.logMessage CONTAINS 'INTERNAL_SERVICE_ERROR'

...and this lets me query for the related lines:

SELECT
  metadata.timestamp AS Time,
  protoPayload.host AS Host,
  protoPayload.status AS Status,
  protoPayload.resource AS Path,
  protoPayload.line.logMessage
FROM
  TABLE_QUERY(logs, 'true')
WHERE
  protoPayload.requestId IN ("requestid1", "requestid2", "etc")
ORDER BY Time

However, I'm having trouble combining the two into a single query. BQ doesn't seem to allow subselects in the WHERE clause, and I get confusing error messages when I try a traditional self-join with named tables. What's the secret?

Recommended answer

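To select the records in which at least one logMessage contains the given string, you can use the OMIT RECORD IF construct (BigQuery legacy SQL). The condition is evaluated once per record, that is, once per request: EVERY(NOT (...)) is true only when none of the request's log lines match, so those requests are dropped and every request with at least one matching line is kept together with all of its lines.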

SELECT
  metadata.timestamp AS Time,
  protoPayload.host AS Host,
  protoPayload.status AS Status,
  protoPayload.resource AS Path,
  protoPayload.line.logMessage
FROM
  TABLE_QUERY(logs, 'true')
OMIT RECORD IF
  EVERY(NOT (protoPayload.line.logMessage CONTAINS 'INTERNAL_SERVICE_ERROR'))
ORDER BY Time
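
The filtering logic of that query can be sketched outside BigQuery. The Python below is only an illustration of the semantics, not BigQuery code: the `records` list, the `requestId` and `lines` keys, and the helper names `omit_record` and `related_lines` are all hypothetical stand-ins, with each dictionary playing the role of one request record and its repeated logMessage values.

```python
# Hypothetical sample data: one dict per request, with its log messages.
# Invented for illustration; not taken from any real export.
records = [
    {"requestId": "req-1",
     "lines": ["starting", "INTERNAL_SERVICE_ERROR: backend down"]},
    {"requestId": "req-2",
     "lines": ["starting", "finished ok"]},
]

NEEDLE = "INTERNAL_SERVICE_ERROR"


def omit_record(record):
    """Plays the role of EVERY(NOT (logMessage CONTAINS NEEDLE)).

    True when none of the record's log messages contain NEEDLE.
    """
    return all(NEEDLE not in message for message in record["lines"])


def related_lines(records):
    """Return (requestId, logMessage) for every line of each kept record."""
    return [
        (record["requestId"], message)
        for record in records
        if not omit_record(record)      # keep records with at least one match
        for message in record["lines"]  # then emit all lines, matching or not
    ]


kept = related_lines(records)
for request_id, message in kept:
    print(request_id, message)
```

The point of the sketch is that the test is applied to the request as a whole, so the non-matching lines of a matching request are returned as well, which is what the original question asks for.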
