Error while reading data, error message: CSV table references column position 15, but line starting at position:0 contains only 1 columns
Problem description
I am new to BigQuery. I am trying to load data into a GCP BigQuery table that I created manually. I have a bash file that contains this bq load command:
bq load --source_format=CSV --field_delimiter=$(printf '\u0001') dataset_name.table_name gs://bucket-name/sample_file.csv
My CSV file contains multiple rows with 16 columns each. A sample row:
100563^3b9888^Buckname^https://www.settttt.ff/setlllll/buckkkkk-73d58581.html^Buckcherry^null^null^2019-12-14^23d74444^Reverb^Reading^Pennsylvania^United States^US^40.3356483^-75.9268747
Table schema -
When I execute the bash script from Cloud Shell, I get the following error:
Waiting on bqjob_r10e3855fc60c6e88_0000016f42380943_1 ... (0s) Current status: DONE
BigQuery error in load operation: Error processing job 'project-name-
staging:bqjob_r10e3855fc60c6e88_0000ug00004521': Error while reading data, error message: CSV
table
encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection
for more details.
Failure details:
- gs://bucket-name/sample_file.csv: Error while
reading data, error message: CSV table references column position
15, but line starting at position:0 contains only 1 columns.
What would be the solution? Thanks in advance.
Recommended answer
You are trying to insert values that do not match the schema you provided for the table.
Based on the table schema and your data example, I ran this command:
./bq load --source_format=CSV --field_delimiter=$(printf '^') mydataset.testLoad /Users/tamirklein/data2.csv
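Note that this command passes ^ as the field delimiter, whereas the command in the question passed \u0001. A "contains only 1 columns" error is what one would expect when the configured delimiter never occurs in a line, so it can help to count fields locally before loading. The following is a minimal sketch, not part of the original answer; count_fields is a hypothetical helper name, and it assumes the file is available locally (for example, after copying it from the bucket).

```shell
#!/usr/bin/env bash
# count_fields DELIM FILE
# Hypothetical helper: prints, for each line of FILE, how many fields the
# line has when split on DELIM. DELIM is matched literally via index(),
# so characters such as '^' are not interpreted as regex syntax.
count_fields() {
  awk -v d="$1" '{
    n = 1
    s = $0
    # Each occurrence of the delimiter adds one more field.
    while ((i = index(s, d)) > 0) {
      n++
      s = substr(s, i + length(d))
    }
    print n
  }' "$2"
}
```

For the sample row in the question, count_fields '^' sample_file.csv should report 16, while count_fields "$(printf '\001')" sample_file.csv should report 1 (printf '\001' emits the same byte as \u0001), which matches the column count in the error message.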
First error
Failure details: - Error while reading data, error message: Could not parse '39b888' as int for field Field2 (position 1) starting at location 0
At this point, I manually removed the b from 39b888, and then I got this:
Second error
Failure details: - Error while reading data, error message: Could not parse '14/12/2019' as date for field Field8 (position 7) starting at location 0
At this point, I changed 14/12/2019 to 2019-12-14, which is the BigQuery date format, and now everything is OK.
Upload complete. Waiting on bqjob_r9cb3e4ef5ad596e_0000016f42abd4f6_1 ... (0s) Current status: DONE
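The manual date fix described above could be applied to a whole file rather than by hand. This is a rough sketch under stated assumptions: the file is ^-delimited, the dates are in DD/MM/YYYY form, and fix_date_column is a hypothetical helper name that is not part of bq.

```shell
#!/usr/bin/env bash
# fix_date_column COL FILE
# Hypothetical helper: rewrites DD/MM/YYYY values in column COL (1-based)
# of a '^'-delimited FILE as YYYY-MM-DD and writes the result to stdout.
# Values in that column that do not look like DD/MM/YYYY are left unchanged.
fix_date_column() {
  awk -F'^' -v OFS='^' -v col="$1" '
    $col ~ /^[0-9][0-9]\/[0-9][0-9]\/[0-9][0-9][0-9][0-9]$/ {
      split($col, p, "/")               # p[1]=day, p[2]=month, p[3]=year
      $col = p[3] "-" p[2] "-" p[1]     # reassemble as YYYY-MM-DD
    }
    { print }
  ' "$2"
}
```

For example, fix_date_column 8 data2.csv > data2_fixed.csv would rewrite the eighth column (position 7 in the error message, which counts from 0) and leave the other columns as they are.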
You will need to clean your data before uploading, or load a data sample with more lines using the --max_bad_records flag (some of the lines will be OK and some will not, depending on your data quality).
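One way to approach the cleaning step is to list the suspicious rows before running bq load. The sketch below is only illustrative: find_bad_rows is a hypothetical helper, and the checks (16 columns, an integer in column 2, a YYYY-MM-DD date in column 8) are assumptions inferred from the question and the two error messages above, not from the actual table schema.

```shell
#!/usr/bin/env bash
# find_bad_rows FILE
# Hypothetical helper: prints "LINE: reason" for each row of a
# '^'-delimited FILE that fails one of the assumed checks.
# Only the first failing check is reported per row.
find_bad_rows() {
  awk -F'^' '
    NF != 16 {
      print NR ": expected 16 columns, found " NF
      next
    }
    $2 !~ /^-?[0-9]+$/ {
      print NR ": column 2 is not an integer: " $2
      next
    }
    $8 !~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ {
      print NR ": column 8 is not YYYY-MM-DD: " $8
      next
    }
  ' "$1"
}
```

The number of lines this prints gives a rough idea of how many rows would be rejected, which can inform whether to fix the file or pick a value for --max_bad_records.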
Note: unfortunately, there is no way to control the date format during the upload; see this answer as a reference.