Create a column of UUIDs in Google BigQuery
Question
Google BigQuery doesn't support UUID as a data type. So, which option is better for storing it:
- STRING: String with the format 8-4-4-4-12
- BYTES: Array of 16 bytes (128 bits)
Recommended answer
BigQuery now supports a function called GENERATE_UUID. This returns a STRING with 32 hexadecimal digits in five groups separated by hyphens in the form 8-4-4-4-12.
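To make the 8-4-4-4-12 layout concrete, here is a minimal Python sketch that checks whether a string has that shape. It runs outside BigQuery; `is_canonical_uuid` is a hypothetical helper name, and `uuid.uuid4()` merely stands in for a value that would come back from a table.

```python
import re
import uuid

# 32 hex digits in five hyphen-separated groups: 8-4-4-4-12.
UUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_canonical_uuid(value: str) -> bool:
    """Return True if value has the 8-4-4-4-12 hexadecimal layout."""
    return UUID_PATTERN.fullmatch(value) is not None


# Stand-in for a UUID string read back from a table (assumption for illustration).
sample = str(uuid.uuid4())
print(sample, is_canonical_uuid(sample))
```

A check like this can be useful when merging UUID strings from several sources, since it rejects values that lost their hyphens or were truncated along the way.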
Original content:
Some discussion of the trade-offs:
STRING:
- UUIDs are compatible with their representation in other systems, for example if you export to CSV and then want to merge with exports from elsewhere.
- UUIDs are compatible with BigQuery's probable UUID implementation. You will be able to generate UUIDs of this same form using a function (when the feature is implemented).
- If you decide to represent the UUIDs as BYTES later, you can potentially convert them using a UDF.
- Downside: Comparisons may not be as fast as with BYTES, depending on the operator, since string comparisons have to take UTF-8 encoding into account. (It sounds like this isn't an issue for you.)
- Downside: Storage costs are higher. (It sounds like this isn't an issue for you.)
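The storage difference and the STRING-to-BYTES conversion mentioned above can be sketched in Python. This is an illustration using the standard `uuid` module, not a BigQuery UDF; `uuid_string_to_bytes` is a hypothetical helper name and the sample UUID is made up.

```python
import uuid


def uuid_string_to_bytes(text: str) -> bytes:
    """Convert an 8-4-4-4-12 UUID string to its 16-byte form."""
    return uuid.UUID(text).bytes


# Made-up sample value, for illustration only.
text_form = "123e4567-e89b-12d3-a456-426614174000"
bytes_form = uuid_string_to_bytes(text_form)

# The text form takes 36 bytes in UTF-8; the binary form takes 16.
print(len(text_form.encode("utf-8")), len(bytes_form))
```

The text form is more than twice the size of the binary form, which is where the higher storage cost comes from.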
BYTES:
- UUIDs are stored more compactly; storage is cheaper and comparisons are faster.
- If you decide to represent the UUIDs as STRINGs later, you can potentially convert them using a UDF.
- Downside: UUIDs are not compatible with other systems after export, and will likely not be compatible with BigQuery's implementation either.
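The export downside can be illustrated with a short Python sketch. It assumes the exported BYTES value arrives as base64 text, which is a common convention for binary columns but is an assumption here and worth checking against the export format you actually use. `uuid_bytes_to_string` is a hypothetical helper name and the sample value is made up.

```python
import base64
import uuid


def uuid_bytes_to_string(raw: bytes) -> str:
    """Convert a 16-byte UUID to its 8-4-4-4-12 string form."""
    return str(uuid.UUID(bytes=raw))


# Made-up 16-byte UUID, for illustration only.
raw = bytes.fromhex("123e4567e89b12d3a456426614174000")

# Assumption: the export renders the raw bytes as base64 text.
exported = base64.b64encode(raw).decode("ascii")

# A consumer has to decode and reformat before the value matches
# the UUID strings used by other systems.
restored = uuid_bytes_to_string(base64.b64decode(exported))
print(exported, restored)
```

The exported text does not look like a UUID at all, so any system that merges on the canonical string form needs this extra conversion step.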