How to have Google BigQuery properly detect header names?


Question

I successfully created a new table from data I uploaded to Google Cloud Storage. The problem is that the header field names are always wrong when I use the "Automatically detect" setting and set "Header rows to skip" to 1. I just get generic names such as "string_field_0".

I know I can add field names manually under Schema, but that is not feasible for tables with many fields. Is there a way to fix the header names? It doesn't seem like it should be a big deal; pandas does this automatically all the time.

Thanks!

CSV file in Excel: (screenshot not included)

Recommended answer

The problem is that your file contains only String values, so BigQuery can't differentiate between the header and the actual data rows. If you had, say, another column with something other than a String (an Integer, for example), it would detect the column names. For example:

column1,column2,column3
foo,bar,1
cat,dog,2
fizz,buzz,3

This file loads correctly, with the column names detected, because the data contains something other than just Strings.
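The rule described in this answer can be sketched in a few lines of Python. This is only a simplified illustration of the idea, not BigQuery's actual detection code: the function names `infer_type` and `header_is_detectable` are made up for this example, the type set is reduced to INTEGER, FLOAT, and STRING, and the all-strings sample data is hypothetical.

```python
import csv
import io


def infer_type(value):
    """Classify one CSV cell as INTEGER, FLOAT, or STRING (a deliberately tiny type set)."""
    for type_name, parser in (("INTEGER", int), ("FLOAT", float)):
        try:
            parser(value)
            return type_name
        except ValueError:
            pass
    return "STRING"


def header_is_detectable(csv_text):
    """Return True when the first row can be told apart from the data rows.

    Simplified model (an assumption, not BigQuery's real implementation):
    the first row counts as a header only if it is all strings while at
    least one later cell is something other than a string.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return False
    first, rest = rows[0], rows[1:]
    first_all_strings = all(infer_type(cell) == "STRING" for cell in first)
    rest_has_non_string = any(
        infer_type(cell) != "STRING" for row in rest for cell in row
    )
    return first_all_strings and rest_has_non_string


# The example file from the answer: the third column holds integers.
mixed = "column1,column2,column3\nfoo,bar,1\ncat,dog,2\nfizz,buzz,3\n"

# A hypothetical file like the one in the question: every cell is a string.
all_strings = "name,city\nalice,paris\nbob,rome\n"

print(header_is_detectable(mixed))
print(header_is_detectable(all_strings))
```

Under this model the first file has a distinguishable header and the second does not, which matches the behavior described in the question.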

So either your data needs to contain something other than just Strings, or you need to specify the schema explicitly yourself.
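Specifying the schema yourself does not have to mean typing every field by hand. As a minimal sketch, assuming the first line of the CSV holds the column names and that every column should be a STRING, the schema can be generated from the header row. The function name `schema_from_header` and the sample data are hypothetical; the output uses the name/type/mode layout of a BigQuery JSON schema definition.

```python
import csv
import io
import json


def schema_from_header(csv_text, field_type="STRING"):
    """Build a BigQuery-style JSON schema from the first row of CSV text.

    Assumption: every column gets the same type, which fits the all-strings
    case discussed in this answer.
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    return [
        {"name": name.strip(), "type": field_type, "mode": "NULLABLE"}
        for name in header
    ]


# Hypothetical sample data standing in for the real file.
sample = "first_name,last_name,city\nAda,Lovelace,London\n"

schema = schema_from_header(sample)
print(json.dumps(schema, indent=2))
```

The printed JSON can be saved to a file and supplied as the schema when loading, together with skipping one header row, so the table keeps the real column names even when auto-detection cannot find them.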

Hint: you don't have to use the UI and click a load of buttons to define the schema. You can do it programmatically using the API or the command-line tooling (the bq tool that ships with the gcloud CLI).
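As one way of doing this through the API, here is a hedged sketch using the google-cloud-bigquery Python client. It assumes that package is installed and that credentials are already configured. The helper names `header_from_first_line` and `load_csv_as_strings` are made up for this example, and the URI and table ID in the usage comment are placeholders.

```python
import csv


def header_from_first_line(line):
    """Parse the first line of a CSV file into a list of column names."""
    return next(csv.reader([line]))


def load_csv_as_strings(gcs_uri, table_id, field_names):
    """Load a CSV from Cloud Storage with an explicit all-STRING schema.

    Sketch only: requires the google-cloud-bigquery package and working
    credentials, neither of which is checked here.
    """
    # Imported lazily so the helper above works without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # skip the header row; names come from the schema
        schema=[bigquery.SchemaField(name, "STRING") for name in field_names],
    )
    job = client.load_table_from_uri(gcs_uri, table_id, job_config=job_config)
    job.result()  # block until the load finishes; raises if it failed
    return client.get_table(table_id).schema


names = header_from_first_line("name,city,country\n")
print(names)

# Usage (placeholders, not real resources):
# load_csv_as_strings("gs://my-bucket/data.csv", "my-project.my_dataset.my_table", names)
```

Because the schema is passed explicitly, auto-detection is not involved at all, and the only remaining job of the header row is to be skipped.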
