How to have Google BigQuery properly detect header names?

Question

I successfully created a new table from data I uploaded to Google Cloud Platform's Storage, but the header field names are always wrong when I use the "Automatically detect" setting and set "Header rows to skip" to 1. I just get generic names such as "string_field_0".

I know I can add field names manually under Schema, but that is not feasible for tables with many fields. Is there a way to fix the header names? It doesn't seem like it should be a big thing; Pandas does this automatically all the time.

Thanks!

CSV file:

Recommended answer

The problem is that you only have String types in your file, so BigQuery can't differentiate between the header and the actual data rows. If you had another column with something other than a String, e.g. an Integer, it would detect the column names. For example:

column1,column2,column3
foo,bar,1
cat,dog,2
fizz,buzz,3

This file loads correctly, with the first row used as the column names, because the data contains something other than just Strings.
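To make the reasoning concrete, here is a minimal Python sketch of that kind of check. It is a simplified illustration, not BigQuery's actual implementation: the function names `looks_like_int` and `header_is_detectable` are made up for this example, and only integers are treated as a non-String type.

```python
import csv
import io


def looks_like_int(value: str) -> bool:
    """Return True if the cell parses as an integer."""
    try:
        int(value)
        return True
    except ValueError:
        return False


def header_is_detectable(csv_text: str) -> bool:
    """Simplified stand-in for header detection (an assumption, not
    BigQuery's real code): the first row can be told apart from the data
    only if it is all strings and at least one column of the remaining
    rows is entirely non-string (here: integer)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if len(rows) < 2:
        return False
    first, rest = rows[0], rows[1:]
    if any(looks_like_int(cell) for cell in first):
        return False  # the first row already looks like data
    # A column counts as non-string if every data cell in it is an integer.
    return any(
        all(i < len(row) and looks_like_int(row[i]) for row in rest)
        for i in range(len(first))
    )


mixed = "column1,column2,column3\nfoo,bar,1\ncat,dog,2\nfizz,buzz,3\n"
strings_only = "column1,column2\nfoo,bar\ncat,dog\n"

print(header_is_detectable(mixed))         # True
print(header_is_detectable(strings_only))  # False
```

With the all-String file, nothing distinguishes the first row from the rows below it, which matches the generic "string_field_0" names described in the question.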

So, either you need to have something other than just Strings, or you need to explicitly specify the schema yourself.

Hint: you don't have to use the UI and click a load of buttons to define the schema. You can do it programmatically using the API or the bq command-line tool that ships with the gcloud CLI.
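For a table with many fields, one possible approach is to generate the schema from the CSV header row instead of typing it by hand. The sketch below uses only the Python standard library and assumes every column should be a nullable STRING; the function name `schema_from_header` and the sample column names are hypothetical, and the header values are not checked against BigQuery's column naming rules.

```python
import csv
import io
import json


def schema_from_header(csv_text: str, default_type: str = "STRING") -> list:
    """Build a BigQuery-style JSON schema from the first CSV row.

    Every column gets the same type and NULLABLE mode (an assumption made
    for this sketch; adjust per column as needed).
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    return [
        {"name": name.strip(), "type": default_type, "mode": "NULLABLE"}
        for name in header
    ]


# Hypothetical all-String sample, similar in spirit to the file in the question.
sample = "first_name,last_name,city\nAda,Lovelace,London\n"
schema = schema_from_header(sample)
print(json.dumps(schema, indent=2))
```

The printed JSON can be saved to a file and supplied as the schema when loading, for example as the last argument to something like `bq load --source_format=CSV --skip_leading_rows=1 mydataset.mytable gs://mybucket/data.csv ./schema.json`, where the dataset, table, bucket and file names are placeholders.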
