S3 Query Exception (Fetch)
Problem description
I have unloaded data from Redshift to S3 in Parquet format and created the data catalog in Glue. I can query the table from Athena, but when I create the external schema in Redshift and query the same table, I get the error below:
ERROR: S3 Query Exception (Fetch)
DETAIL:
-----------------------------------------------
error: S3 Query Exception (Fetch)
code: 15001
context: Task failed due to an internal error. File 'https://s3-eu-west-1.amazonaws.com/bucket/folder/partition_key/filename.parquet_1 has an incompatible Parquet schema for column 's3://bucket/folder
query: 560922
location: dory_util.cpp:717
process: query1_118_560922 [pid=32409]
-----------------------------------------------
The same queries work fine in Athena.
Recommended answer
The error message points at the cause: the schema recorded for the table or partition in the Glue catalog and the schema inside the Parquet files differ too much for Redshift to reconcile. The easiest fix is to run a crawler over the data location with the "update each partition definition from table" option checked.
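If you would rather script this than click through the console, the following is a minimal sketch using boto3. It assumes a crawler already exists for the data location; the crawler name `parquet-export-crawler` is hypothetical, and the region is taken from the S3 URL in the error message. To my understanding, the console checkbox corresponds to the `InheritFromTable` value of `AddOrUpdateBehavior` in the crawler configuration, but verify that against the Glue documentation for your setup.

```python
import json


def inherit_from_table_config() -> str:
    """Build a crawler configuration in which partitions inherit
    their schema from the table definition."""
    return json.dumps(
        {
            "Version": 1.0,
            "CrawlerOutput": {
                "Partitions": {"AddOrUpdateBehavior": "InheritFromTable"}
            },
        }
    )


def recrawl(crawler_name: str, region: str = "eu-west-1") -> None:
    """Apply the configuration to an existing crawler and start it.

    `crawler_name` is whatever you called your crawler; nothing in the
    original question specifies it.
    """
    # Imported here so the config helper above has no AWS dependency.
    import boto3

    glue = boto3.client("glue", region_name=region)
    glue.update_crawler(
        Name=crawler_name,
        Configuration=inherit_from_table_config(),
    )
    glue.start_crawler(Name=crawler_name)


if __name__ == "__main__":
    # Print the configuration only; call recrawl("parquet-export-crawler")
    # (hypothetical name) once AWS credentials are configured.
    print(inherit_from_table_config())
```

Once the crawler run finishes, retry the query from Redshift against the external table.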