Google BigQuery: Slow streaming inserts performance


Question

We are using BigQuery as an event logging platform.

The problem we are facing is very slow insertAll POST requests (https://cloud.google.com/bigquery/docs/reference/v2/tabledata/insertAll). It does not matter where they are fired from, server side or client side.

The minimum is 900 ms and the average is 1500 ms, of which nearly 1000 ms is connection time. This holds even at 1 request per second (so no throttling is involved).

We also use the Google Analytics Measurement Protocol, and its timings from the same machines are 50-150 ms.

The solution described in "BigQuery streaming 'insertAll' performance with PHP" suggested using queues, but that seems to be overkill because we send no more than 10 requests per second.

The question is whether 1500 ms is normal for streaming inserts and, if not, how to make them faster.

Additional information: if we send malformed JSON, the response arrives in 50-100 ms.

Recommended answer

In my experience, any request to BigQuery takes a long time. We tried using it as a database for performance data but are now moving away from it because of slow response times. As far as I can see, BQ is built for handling big requests within a 1-10 second response time; these are the requests BQ categorizes as interactive. BQ doesn't get faster by doing less. We stream quite a lot of records to BQ, but we always batch them up (per table) and run all requests asynchronously (or, if you have to, in another thread).
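As a rough illustration of "batch per table and send from another thread", here is a minimal Python sketch. It is not the answerer's code: `BatchingStreamer`, the `send(table, rows)` callable and the `batch_size` default are all hypothetical names chosen for this example. In real use, `send` would wrap a single insertAll request.

```python
import queue
import threading
from collections import defaultdict
from typing import Callable, Dict, List

Row = dict
# Hypothetical sender signature: send(table_name, rows). In real use this
# would wrap one insertAll request; here it is any callable the caller supplies.
Sender = Callable[[str, List[Row]], None]


class BatchingStreamer:
    """Buffers rows per table and sends full batches from a worker thread."""

    def __init__(self, send: Sender, batch_size: int = 100) -> None:
        self._send = send
        self._batch_size = batch_size  # assumed value, tune for your workload
        self._buffers: Dict[str, List[Row]] = defaultdict(list)
        self._lock = threading.Lock()
        self._outbox: queue.Queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def add(self, table: str, row: Row) -> None:
        """Called on the request path; never waits for the network."""
        with self._lock:
            buf = self._buffers[table]
            buf.append(row)
            if len(buf) >= self._batch_size:
                # Hand the full batch to the worker and start a fresh buffer.
                self._outbox.put((table, buf))
                self._buffers[table] = []

    def flush(self) -> None:
        """Queue whatever is buffered, then wait until the worker has sent it."""
        with self._lock:
            for table, buf in self._buffers.items():
                if buf:
                    self._outbox.put((table, buf))
            self._buffers.clear()
        self._outbox.join()

    def _run(self) -> None:
        while True:
            table, rows = self._outbox.get()
            try:
                self._send(table, rows)
            except Exception as exc:
                # Keep the worker alive; a retry policy is out of scope here.
                print(f"send to {table} failed: {exc}")
            finally:
                self._outbox.task_done()
```

With this shape, the code that records an event only pays for an in-memory append, and the slow request happens off the request path. Calling `flush()` before shutdown avoids losing a partially filled batch.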

PS. I can confirm what Pentium10 says about failures in BQ. Make sure you retry whatever fails, and if it fails again, log it to a file so you can retry it later.
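The retry-then-log advice could be sketched roughly as follows. This is only an illustration under assumptions: `send_with_retry`, the `send(table, rows)` callable, the spool file path and the JSON-lines format are hypothetical choices for this example, not part of any BigQuery client library.

```python
import json
import time
from typing import Callable, List


def send_with_retry(
    send: Callable[[str, List[dict]], None],  # hypothetical sender callable
    table: str,
    rows: List[dict],
    spool_path: str,
    attempts: int = 2,     # first try plus one retry, as the answer suggests
    delay_s: float = 0.5,  # assumed pause between attempts
) -> bool:
    """Try to send a batch; if every attempt fails, append it to a spool file.

    Returns True if the batch was sent, False if it was written to the file.
    """
    for attempt in range(attempts):
        try:
            send(table, rows)
            return True
        except Exception:
            # Wait before the next attempt, but not after the last one.
            if attempt + 1 < attempts:
                time.sleep(delay_s)

    # All attempts failed: keep the batch as one JSON line for a later retry.
    with open(spool_path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"table": table, "rows": rows}) + "\n")
    return False
```

A separate job can later read the spool file line by line and pass each stored batch through the same function.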
