How to flatten a Parquet Array datatype when using IBM Cloud SQL Query
Question
I have to push Parquet file data, which I am reading via IBM Cloud SQL Query, to Db2 on Cloud.
My Parquet file has data in array format, and I want to push that to Db2 on Cloud as well.
Is there any way to push the array data of a Parquet file to Db2 on Cloud?
Answer
Have you checked out this advice in the documentation?

https://cloud.ibm.com/docs/services/sql-query?topic=sql-query-overview#limitations
If a JSON, ORC, or Parquet object contains a nested or arrayed structure, a query with CSV output using a wildcard (for example, SELECT * from cos://...) returns an error such as "Invalid CSV data type used: struct." Use one of the following workarounds:

- For a nested structure, use the FLATTEN table transformation function.
- Alternatively, you can specify the fully nested column names instead of the wildcard, for example, SELECT address.city, address.street, ... from cos://....
- For an array, use the Spark SQL explode() function, for example, select explode(contact_names) from cos://....
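Putting the workarounds together, a query along these lines should produce flat CSV output that can then be loaded into Db2 on Cloud. This is only a sketch: the bucket path and the column names (`contact_names` array, `address` struct) are illustrative, not from the question.

```sql
-- Select nested struct fields explicitly instead of using a wildcard,
-- and explode the array column so each element becomes its own row.
SELECT
  address.city,
  address.street,
  explode(contact_names) AS contact_name
FROM cos://us-geo/mybucket/mydata.parquet STORED AS PARQUET
INTO cos://us-geo/mybucket/results/ STORED AS CSV
```

Each row of the result repeats the scalar columns once per array element, which is the flat shape a Db2 table expects.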