S3 Query Exception (Fetch)


Problem description

I have uploaded data from Redshift to S3 in Parquet format and created the data catalog in Glue. I can query the table from Athena, but when I create the external schema in Redshift and query the table, I get the error below:

ERROR:  S3 Query Exception (Fetch)
DETAIL:
  -----------------------------------------------
  error:  S3 Query Exception (Fetch)
  code:      15001
  context:   Task failed due to an internal error. File 'https://s3-eu-west-1.amazonaws.com/bucket/folder/partition_key/filename.parquet_1  has an incompatible Parquet schema for column 's3://bucket/folder
  query:     560922
  location:  dory_util.cpp:717
  process:   query1_118_560922 [pid=32409]
  -----------------------------------------------

The same queries work fine in Athena.

Recommended answer

The error more or less tells you what is wrong: the schema of the table/partition and the contents of the file differ too much. The easiest fix is to run a crawler over the data location with the "update each partition definition from table" option checked.
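If you would rather set that crawler option from code than from the console, a minimal sketch might look like the following. It assumes the checkbox corresponds to the `InheritFromTable` partition behavior in the Glue crawler configuration JSON, that the crawler already exists, and that boto3 and AWS credentials are available. The function names and the crawler name in the usage note are hypothetical and not taken from the question.

```python
import json


def build_crawler_configuration() -> str:
    """Return crawler Configuration JSON that makes partitions inherit the table schema."""
    config = {
        "Version": 1.0,
        "CrawlerOutput": {
            # Assumption: this is the setting behind the console checkbox
            # mentioned in the answer.
            "Partitions": {"AddOrUpdateBehavior": "InheritFromTable"},
        },
    }
    return json.dumps(config)


def apply_and_run(crawler_name: str, region_name: str) -> None:
    """Update an existing crawler with the configuration above and start it."""
    # Imported lazily so build_crawler_configuration() works without boto3.
    import boto3

    glue = boto3.client("glue", region_name=region_name)
    glue.update_crawler(
        Name=crawler_name,
        Configuration=build_crawler_configuration(),
    )
    glue.start_crawler(Name=crawler_name)
```

For example, `apply_and_run("my-parquet-crawler", "eu-west-1")` would update and start a crawler with that (made-up) name. The region is only a guess based on the S3 endpoint in the error message. Once the crawler finishes, rerun the Redshift query to check whether the partition schemas now match the files.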
