BigQuery - remove unused column from schema
Question
I accidentally added a wrong column to my BigQuery table schema.
Instead of reloading the complete table (millions of rows), I would like to know if the following is possible:
- Remove the bad rows (rows that have values in the wrong column) by running a "select *" query on the table with some kind of filter, then saving the result to the same table.
- Remove the (now) unused column.
Is this functionality (or something similar) supported? Possibly the "save result to table" functionality could have a "compact schema" option.
Recommended answer
If your table does not contain record/repeated type fields, your simple option is:
1. Select the valid columns, while filtering out bad records, into a new temp table:

   SELECT < list of original columns >
   FROM YourTable
   WHERE < filter to remove bad entries here >

   Write the result of the query above to a temp table, YourTable_Temp.
2. Make a backup copy of the "broken" table, YourTable_Backup.
Please note: the cost of #1 above is exactly the same as the action in the first bullet of your question. The remaining actions (copies) are free.
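As a rough illustration of the query in #1, the sketch below fills in the template with concrete values. The helper name `build_cleanup_query`, the column names (`id`, `name`, `created_at`, `wrong_col`) and the filter are all hypothetical, chosen only for this example; the code just builds the query text and does not talk to BigQuery.

```python
from typing import Sequence


def build_cleanup_query(table: str, keep_columns: Sequence[str], keep_rows_where: str) -> str:
    """Build a SELECT that lists only the columns to keep and filters out bad rows.

    Hypothetical helper for illustration; it only assembles the query string.
    """
    if not keep_columns:
        raise ValueError("keep_columns must not be empty")
    column_list = ", ".join(keep_columns)
    return f"SELECT {column_list}\nFROM {table}\nWHERE {keep_rows_where}"


query = build_cleanup_query(
    table="YourTable",
    keep_columns=["id", "name", "created_at"],  # assumed original columns
    keep_rows_where="wrong_col IS NULL",        # assumed filter: keep rows that never used the wrong column
)
print(query)
```

The resulting text would then be run with YourTable_Temp as the destination table, as described in #1. Listing the columns explicitly, rather than using "select *", is what keeps the unwanted column out of the new table's schema.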
If you do have repeated/record fields, you can still execute the plan above, but in #1 you will need to use some BigQuery User-Defined Functions to get the proper schema in the output.
You can see below for examples. Of course, this will require some extra development, but if you are in a critical situation, this should work for you.
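The examples the answer refers to are not part of this text. As a stand-in, here is a plain-Python sketch of the kind of per-row reshaping such a function would have to perform: rebuilding each record, including nested and repeated parts, without the unwanted field. This is not a BigQuery UDF, and the function name `drop_field` and the sample row are assumptions made up for illustration.

```python
def drop_field(value, field_name):
    """Return a copy of `value` with every key named `field_name` removed.

    Dicts stand in for records and lists for repeated fields; both are
    walked recursively. The input is left unmodified.
    """
    if isinstance(value, dict):
        return {
            key: drop_field(child, field_name)
            for key, child in value.items()
            if key != field_name
        }
    if isinstance(value, list):
        return [drop_field(item, field_name) for item in value]
    return value


# Hypothetical row with a repeated record field named "items".
row = {
    "id": 1,
    "wrong_col": "oops",
    "items": [{"sku": "a", "wrong_col": "x"}, {"sku": "b"}],
}
cleaned = drop_field(row, "wrong_col")
print(cleaned)
```

An actual UDF would need to emit rows matching the declared output schema, so the same idea applies: construct the output record field by field and leave the unwanted one out.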
I hope that at some point the Google BigQuery team will add better support for cases like yours, where you need to manipulate and output repeated/record data, but for now this is the best workaround I have found, at least for myself.