Loading text files (.txt) from Cloud Storage into a BigQuery table


Question

I have a set of text files that are uploaded to Google Cloud Storage every 5 minutes. I want to load them into BigQuery at the same 5-minute cadence (since that is how often the files arrive in Cloud Storage). I know text files can't be uploaded to BigQuery directly. What is the best approach for this?

Sample of the text files

Thanks.

Answer

Here is an alternative approach, which uses an event-based Cloud Function to load the data into BigQuery. Create a Cloud Function with its "Trigger Type" set to Cloud Storage. As soon as a file is uploaded into the Cloud Storage bucket, the Cloud Function is invoked and the data from Cloud Storage is loaded into BigQuery.
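As a sketch, a function like the one below could be deployed with a Cloud Storage trigger using `gcloud` (the function name, bucket, and runtime here are placeholders, not values from the original answer):

```shell
# Deploy the function with a Cloud Storage trigger (names are placeholders).
# google.storage.object.finalize fires when a new object finishes uploading.
gcloud functions deploy bqDataLoad \
    --runtime python310 \
    --trigger-resource my-upload-bucket \
    --trigger-event google.storage.object.finalize
```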

import pandas as pd
from google.cloud import bigquery

def bqDataLoad(event, context):
    """Triggered by a Cloud Storage upload; loads the new file into BigQuery."""
    bucketName = event['bucket']
    blobName = event['name']
    fileName = "gs://" + bucketName + "/" + blobName

    bigqueryClient = bigquery.Client()
    # Replace with your own dataset and table names.
    tableId = "bq-dataset-name.bq-table-name"

    # Reading a gs:// path with pandas requires the gcsfs package.
    dataFrame = pd.read_csv(fileName)

    # Load the DataFrame into the table and wait for the job to finish.
    bigqueryJob = bigqueryClient.load_table_from_dataframe(dataFrame, tableId)
    bigqueryJob.result()
