Elasticsearch with Google BigQuery

Views: 219 Published: 2020/10/28 2:00:22 Tags: elasticsearch google-bigquery


Problem description

I have event logs loaded into Elasticsearch and I visualise them using Kibana. The event logs are actually stored in a Google BigQuery table. Currently I dump them as JSON files to a Google Cloud Storage bucket and download them to a local drive. Then, using Logstash, I move the JSON files from the local drive into Elasticsearch.

Now I am trying to automate the process by establishing a connection between Google BigQuery and Elasticsearch. From what I have read, there is an output connector that sends data from Elasticsearch to Google BigQuery, but not vice versa. I am wondering whether I should upload the JSON files to a Kubernetes cluster and then establish a connection between the cluster and Elasticsearch.

Any help in this regard would be appreciated.

Recommended answer

Apache Beam has connectors for both BigQuery and Elasticsearch. I would definitely do this with Dataflow, so you don't need to implement a complex ETL process with staging storage. You can read the data from BigQuery using BigQueryIO.Read.from (if performance is important, take a look at "BigQueryIO Read vs fromQuery") and load it into Elasticsearch using ElasticsearchIO.write().
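One step the read/write pair leaves implicit is the conversion in between: as far as I know, ElasticsearchIO.write() expects each element to be a JSON document string, so every row read from BigQuery has to be serialized first. Below is a rough sketch of that conversion only, using plain JDK classes rather than Beam itself. The names RowToJson, toJson and escape, and the sample fields, are made up for illustration; it assumes flat rows (no nested or repeated fields), and in a real pipeline a proper JSON library would be the safer choice.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Apache Beam: turns one flat row into a JSON document.
class RowToJson {

    // Escapes the characters that JSON requires to be escaped inside a string.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // Serializes a row (column name -> value) as a JSON object.
    // Numbers and booleans are written bare, null as null, anything else as a string.
    static String toJson(Map<String, Object> row) {
        StringJoiner fields = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> e : row.entrySet()) {
            Object v = e.getValue();
            String value;
            if (v == null) {
                value = "null";
            } else if (v instanceof Number || v instanceof Boolean) {
                value = v.toString();
            } else {
                value = "\"" + escape(v.toString()) + "\"";
            }
            fields.add("\"" + escape(e.getKey()) + "\":" + value);
        }
        return fields.toString();
    }

    public static void main(String[] args) {
        // Sample event-log row; the field names are invented for this example.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("event", "login");
        row.put("user_id", 42);
        row.put("ok", true);
        System.out.println(toJson(row)); // prints {"event":"login","user_id":42,"ok":true}
    }
}
```

In a Beam pipeline, logic like toJson would typically live in a per-element transform placed between the BigQuery read and the Elasticsearch write.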

For an example of how to read data from BigQuery in Dataflow, see:

https://github.com/GoogleCloudPlatform/professional-services/blob/master/examples/dataflow-bigquery-transpose/src/main/java/com/google/cloud/pso/pipeline/Pivot.java

For indexing into Elasticsearch, see:

https://github.com/GoogleCloudPlatform/professional-services/tree/master/examples/dataflow-elasticsearch-indexer

UPDATED 2019-06-24

Earlier this year the BigQuery Storage API was released. It improves the parallelism of extracting data from BigQuery and is natively supported by Dataflow. Refer to https://beam.apache.org/documentation/io/built-in/google-bigquery/#storage-api for more details.

From the documentation:

The BigQuery Storage API allows you to directly access tables in BigQuery storage. As a result, your pipeline can read from BigQuery storage faster than previously possible.
