How to skip rows of csv file in BIGQUERY load API


Problem description

I am trying to load CSV data from a Cloud Storage bucket into a BigQuery table using the BigQuery API. My code is:

def load_data_from_gcs(dataset_name, table_name, source):
    bigquery_client = bigquery.Client()
    dataset = bigquery_client.dataset(dataset_name)
    table = dataset.table(table_name)
    job_name = str(uuid.uuid4())

    job = bigquery_client.load_table_from_storage(
        job_name, table, source)
    job.sourceFormat = 'CSV'
    job.fieldDelimiter = ','
    job.skipLeadingRows = 2

    job.begin()
    job.result()  # Wait for job to complete

    print('Loaded {} rows into {}:{}.'.format(
        job.output_rows, dataset_name, table_name))

    wait_for_job(job)

It gives me the error:

400 CSV table encountered too many errors, giving up. Rows: 1; errors: 1.

This error occurs because the first two rows of my CSV file are header information and are not supposed to be loaded. I have set job.skipLeadingRows = 2, but it is not skipping the first 2 rows. Is there any other syntax to skip rows?

Please help me resolve this.

Answer

You're spelling the properties wrong (using camel case instead of underscores). It's skip_leading_rows, not skipLeadingRows. The same goes for field_delimiter and source_format.

Check out the Python sources here.
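For reference, here is a sketch of the question's function with the snake_case property names applied. It assumes the same older google-cloud-bigquery client that still exposes load_table_from_storage, as used in the question; newer versions of the library configure these options differently.

import uuid

from google.cloud import bigquery


def load_data_from_gcs(dataset_name, table_name, source):
    bigquery_client = bigquery.Client()
    dataset = bigquery_client.dataset(dataset_name)
    table = dataset.table(table_name)
    job_name = str(uuid.uuid4())

    job = bigquery_client.load_table_from_storage(
        job_name, table, source)
    # Snake_case attribute names, as defined in the client's Python sources:
    job.source_format = 'CSV'
    job.field_delimiter = ','
    job.skip_leading_rows = 2  # skip the two header rows

    job.begin()    # start the load job
    job.result()   # wait for the job to complete

    print('Loaded {} rows into {}:{}.'.format(
        job.output_rows, dataset_name, table_name))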

