Google BigQuery: Slow streaming inserts performance


Problem description


We are using BigQuery as an event logging platform.

The problem we face is very slow insertAll POST requests (https://cloud.google.com/bigquery/docs/reference/v2/tabledata/insertAll). It does not matter where they are fired from, server or client side.

The minimum is 900 ms and the average is 1500 ms, of which nearly 1000 ms is connection time. This happens even at 1 request per second (so no throttling here).

We also use the Google Analytics Measurement Protocol, and its timings from the same machines are 50-150 ms.

The solution described in "BigQuery streaming 'insertAll' performance with PHP" suggests using queues, but that seems to be overkill because we send no more than 10 requests per second.

The question is whether 1500 ms is normal for streaming inserts and, if not, how to make them faster.

Additional information: If we send malformed JSON, the response arrives in 50-100 ms.

Solution

In my experience, any request to BigQuery takes a long time. We tried using it as a database for performance data but are eventually moving away from it because of the slow response times. As far as I can see, BQ is built for handling big requests within a 1-10 second response time; these are the requests BQ categorizes as interactive. BQ doesn't get faster by doing less. We stream quite a lot of records to BQ, but we always make sure we batch them up (per table) and run all requests asynchronously (or, if you have to, in another thread).
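To make the batching idea concrete, here is a minimal sketch in Python, not the code we actually run. It buffers rows per table and flushes them from a background thread, so the caller never waits on the slow request. `TableBatcher`, `send_batch`, `max_rows` and `max_wait_s` are names made up for this example; `send_batch(table, rows)` stands in for whatever function performs your real insertAll call.

```python
import queue
import threading
import time
from collections import defaultdict

_STOP = object()  # sentinel that tells the worker thread to flush and exit


class TableBatcher:
    """Buffers rows per table and sends them in batches from a worker thread."""

    def __init__(self, send_batch, max_rows=500, max_wait_s=1.0):
        # send_batch is a hypothetical callable(table, rows) that performs
        # the actual streaming insert; it is an assumption of this sketch.
        self._send_batch = send_batch
        self._max_rows = max_rows
        self._max_wait_s = max_wait_s
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def add(self, table, row):
        """Called by request handlers; returns immediately."""
        self._queue.put((table, row))

    def close(self):
        """Flush whatever is still buffered and stop the worker."""
        self._queue.put(_STOP)
        self._worker.join()

    def _run(self):
        pending = defaultdict(list)
        deadline = time.monotonic() + self._max_wait_s
        while True:
            timeout = max(0.0, deadline - time.monotonic())
            try:
                item = self._queue.get(timeout=timeout)
            except queue.Empty:
                item = None  # nothing arrived before the deadline
            if item is not None and item is not _STOP:
                table, row = item
                pending[table].append(row)
                # A full batch for one table is sent right away.
                if len(pending[table]) >= self._max_rows:
                    self._send_batch(table, pending.pop(table))
            # Flush everything on shutdown or once the deadline has passed.
            if item is _STOP or time.monotonic() >= deadline:
                for table in list(pending):
                    rows = pending.pop(table)
                    if rows:
                        self._send_batch(table, rows)
                deadline = time.monotonic() + self._max_wait_s
            if item is _STOP:
                return
```

With something like this, a request handler only calls `batcher.add("events", row)`, and the 1-2 second insert latency is paid on the worker thread, once per batch instead of once per event.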

PS. I can confirm what Pentium10 says about failures in BQ. Make sure you retry whatever fails, and if it fails again, log it to a file so you can retry it later.
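A rough sketch of that retry-then-log pattern, again in Python and again only an illustration. `send_with_retry`, `send_batch` and `fallback_path` are hypothetical names, and the sketch assumes `send_batch` raises an exception on failure. If your client reports failed rows in the response body instead, you would need to check that and raise yourself.

```python
import json
import time


def send_with_retry(send_batch, table, rows, fallback_path,
                    attempts=2, backoff_s=0.5, sleep=time.sleep):
    """Try send_batch up to `attempts` times; on final failure, append the
    rows to a JSON-lines file so they can be replayed later.

    send_batch is a hypothetical callable(table, rows) assumed to raise
    on failure. Returns True if the rows were sent, False if they were
    written to the fallback file instead.
    """
    for attempt in range(attempts):
        try:
            send_batch(table, rows)
            return True
        except Exception:
            # Wait a little longer before each retry, but not after the last one.
            if attempt + 1 < attempts:
                sleep(backoff_s * (2 ** attempt))
    # Every attempt failed: keep the rows on disk, one JSON object per line.
    with open(fallback_path, "a", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps({"table": table, "row": row}) + "\n")
    return False
```

The default of two attempts mirrors "retry, and if it fails again, log it". A separate job can later read the file line by line and resend the rows.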
