How to skip rows of csv file in BIGQUERY load API
Question
I am trying to load CSV data from a Cloud Storage bucket into a BigQuery table using the BigQuery API. My code is:
def load_data_from_gcs(dataset_name, table_name, source):
    bigquery_client = bigquery.Client()
    dataset = bigquery_client.dataset(dataset_name)
    table = dataset.table(table_name)
    job_name = str(uuid.uuid4())

    job = bigquery_client.load_table_from_storage(
        job_name, table, source)
    job.sourceFormat = 'CSV'
    job.fieldDelimiter = ','
    job.skipLeadingRows = 2

    job.begin()
    job.result()  # Wait for job to complete

    print('Loaded {} rows into {}:{}.'.format(
        job.output_rows, dataset_name, table_name))
    wait_for_job(job)
It is giving me this error:
400 CSV table encountered too many errors, giving up. Rows: 1; errors: 1.
This error occurs because my CSV file contains two header rows that are not supposed to be loaded. I have set job.skipLeadingRows = 2, but it is not skipping the first two rows. Is there any other syntax to set skipped rows?
Please help on this.
Answer
You're spelling it wrong (using camelCase instead of underscores). It's skip_leading_rows, not skipLeadingRows. Same for field_delimiter and source_format.
Check out the Python sources here.
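The reason the typo produced no error message is likely that Python objects accept assignment to arbitrary attribute names: job.skipLeadingRows = 2 simply creates a new, unused attribute instead of updating the real skip_leading_rows setting. A minimal sketch of the failure mode, using a hypothetical LoadJob stand-in for the client's job object:

```python
class LoadJob:
    """Hypothetical stand-in for the BigQuery client's load-job object."""
    def __init__(self):
        # Real snake_case settings the load job actually reads.
        self.source_format = None
        self.field_delimiter = None
        self.skip_leading_rows = None

job = LoadJob()

# The typo from the question: no exception is raised, but this just
# creates a brand-new attribute the loader never looks at.
job.skipLeadingRows = 2
print(job.skip_leading_rows)  # None -- the real setting is unchanged

# The corrected snake_case assignment updates the real setting.
job.skip_leading_rows = 2
print(job.skip_leading_rows)  # 2
```

This is why the job still tried to parse the header rows as data and failed with the "too many errors" message.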