nvalid语法:使用初始sortkeys自动创建表sortkey [英] nvalid syntax: Create table sortkey auto with initial sortkeys
问题描述
我正在尝试使用 target-redshift
将数据推送到 aws-redshift
https://pypi.org/project/target-redshift/ >
我正在使用气流
来监控etl状态
这是错误日志,我不知道这是什么意思.几乎没有关于目标红移的在线文档.有什么办法可以解决这个错误?
[2021-03-07 15:04:11,841] {bash_operator.py:126}信息-错误写入记录的异常[2021-03-07 15:04:11,841] {bash_operator.py:126}信息-追溯(最近一次通话结束):[2021-03-07 15:04:11,841] {bash_operator.py:126}信息-文件"/usr/local/airflow/.virtualenvs/target-redshift/lib/python3.7/site-packages/target_postgres/postgres.py",第300行,位于write_batch中[2021-03-07 15:04:11,841] {bash_operator.py:126}信息-{'version':target_table_version})[2021-03-07 15:04:11,841] {bash_operator.py:126}信息-文件"/usr/local/airflow/.virtualenvs/target-redshift/lib/python3.7/site-packages/target_postgres/sql_base.py",第840行,位于write_batch_helper中[2021-03-07 15:04:11,841] {bash_operator.py:126}信息-元数据)[2021-03-07 15:04:11,841] {bash_operator.py:126}信息-文件"/usr/local/airflow/.virtualenvs/target-redshift/lib/python3.7/site-packages/target_postgres/postgres.py",第588行,位于write_table_batch中[2021-03-07 15:04:11,841] {bash_operator.py:126}信息-table = sql.Identifier(remote_schema ['name'])[2021-03-07 15:04:11,841] {bash_operator.py:126}信息-文件"/usr/local/airflow/.virtualenvs/target-redshift/lib/python3.7/site-packages/target_postgres/执行中的postgres.py",第65行[2021-03-07 15:04:11,841] {bash_operator.py:126}信息-返回super(_MillisLoggingCursor,self).execute(query,vars)[2021-03-07 15:04:11,841] {bash_operator.py:126}信息-文件"/usr/local/airflow/.virtualenvs/target-redshift/lib/python3.7/site-packages/psycopg2/第461行的"extras.py"[2021-03-07 15:04:11,841] {bash_operator.py:126}信息-返回super(LoggingCursor,self).execute(query,vars)[2021-03-07 15:04:11,842] {bash_operator.py:126}信息-psycopg2.errors.InternalError_:无效的语法:使用初始排序键自动创建表sortkey.[2021-03-07 15:04:11,842] {bash_operator.py:126}信息-[2021-03-07 15:04:11,842] {bash_operator.py:126}信息-严重(异常写入记录",InternalError_(无效语法:使用初始排序键自动创建表排序键.\ n'))
正在执行此形式的查询的任务下方.
创建表schema.tablename_copy(如schema.tablename)
这就是引发错误的原因无效的语法:自动创建带有初始sortkeys的表sortkey auto.
修复或删除SORTKEY&原始表的DISTSTYLE解决了该问题.例如.
ALTER TABLE schema.tablename ALTER SORTKEY NONE;ALTER TABLE schema.tablename ALTER DISTSTYLE EVEN;
根据表的实际需求(在排序和distkey方面),您可能会做其他事情.但是,很有可能这是一些加载表,并且没有SORTKEY等应该没什么大不了的.
其他方法如果上述方法失败,则可以使用多种方法来避免 Like
.由于此 LIKE
可能是加载/向上插入序列中可能的一部分.例如
- 制作目标表的临时副本(使用LIKE语句)
- 将新数据加载到临时表中
- 从目标表中删除ID在新数据中的行
- 将新数据插入目标表
- 清理临时表.
Airflow可能会在幕后为您执行此SQL查询序列.您可以将Airflow的步骤分解为单独的任务,并避免使用 Like
.
- 删除原始表(根据使用情况可能是可能的)-我发现重新创建表时LIKE问题消失了.
- 使用普通的CREATE STATEMENT显式创建重复表(而不是依赖于LIKE)
- 考虑直接对表进行INSERT/APPEND/COPY并稍后处理重复项
其他信息:此新错误位于 Redshift端上.不能完全确定为什么,但是 LIKE
语句正在尝试传输SORTKEY,但无法这样做.
如果检查原始表 schema.tablename
的定义,您可能会发现Redshifts对它的理解有些奇怪.
亚马逊提供了此视图,使您可以获取完整的表DDL语句 也许会显示一些有用的东西.在我的情况下,SORTKEY和DISTKEY都抛出错误:对我来说,这表明Redshift对键的内部理解存在缺陷.更重要的是-就我而言,我从未设置过这些键,因此我认为Redshift暗示了它们. I'm trying to use https://pypi.org/project/target-redshift/ I am using This is error log and i have no clue what it means. Online documentation hardly exists for target-redshift. Is there any way to go around this error ?
Underneath your task somewhere a query of this form is being run. This is what is throwing the error Fixing or removing the SORTKEY & DISTSTYLE of the origin table fixes the issue. eg. Depending on what your table actually needs (in terms of sort and distkeys) you might do something else. However more than likely this is some loading table and having no SORTKEY etc shouldn't be a big deal. Other Approaches
If the above fails any number of ways of avoiding the Airflow is probably doing this sequence of SQL queries behind the scenes for you. You could just dismantle the steps of the Airflow into separate tasks and avoid the use of Extra info:
This new error and it is on the Redshift side.
Not entirely sure why but the If you check the definition of the origin table The Amazon guys have provided this view to allow you to get full table DDL statements link Running against the view: Will maybe show something useful. In my case the SORTKEY and DISTKEYs were both throwing errors: which to me indicated that Redshift had a flawed internal understanding of the keys. What is more - in my case I never set these keys, so I think they were implied by Redshift. 这篇关于nvalid语法:使用初始sortkeys自动创建表sortkey的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋! SELECT * FROMWHERE schemaname ='schema'AND tablename ='tablename'
target-redshift
to push data to aws-redshift
airflow
to monitor etl status[2021-03-07 15:04:11,841] {bash_operator.py:126} INFO - ERROR Exception writing records
[2021-03-07 15:04:11,841] {bash_operator.py:126} INFO - Traceback (most recent call last):
[2021-03-07 15:04:11,841] {bash_operator.py:126} INFO - File "/usr/local/airflow/.virtualenvs/target-redshift/lib/python3.7/site-packages/target_postgres/postgres.py", line 300, in write_batch
[2021-03-07 15:04:11,841] {bash_operator.py:126} INFO - {'version': target_table_version})
[2021-03-07 15:04:11,841] {bash_operator.py:126} INFO - File "/usr/local/airflow/.virtualenvs/target-redshift/lib/python3.7/site-packages/target_postgres/sql_base.py", line 840, in write_batch_helper
[2021-03-07 15:04:11,841] {bash_operator.py:126} INFO - metadata)
[2021-03-07 15:04:11,841] {bash_operator.py:126} INFO - File "/usr/local/airflow/.virtualenvs/target-redshift/lib/python3.7/site-packages/target_postgres/postgres.py", line 588, in write_table_batch
[2021-03-07 15:04:11,841] {bash_operator.py:126} INFO - table=sql.Identifier(remote_schema['name'])
[2021-03-07 15:04:11,841] {bash_operator.py:126} INFO - File "/usr/local/airflow/.virtualenvs/target-redshift/lib/python3.7/site-packages/target_postgres/postgres.py", line 65, in execute
[2021-03-07 15:04:11,841] {bash_operator.py:126} INFO - return super(_MillisLoggingCursor, self).execute(query, vars)
[2021-03-07 15:04:11,841] {bash_operator.py:126} INFO - File "/usr/local/airflow/.virtualenvs/target-redshift/lib/python3.7/site-packages/psycopg2/extras.py", line 461, in execute
[2021-03-07 15:04:11,841] {bash_operator.py:126} INFO - return super(LoggingCursor, self).execute(query, vars)
[2021-03-07 15:04:11,842] {bash_operator.py:126} INFO - psycopg2.errors.InternalError_: Invalid syntax: Create table sortkey auto with initial sortkeys.
[2021-03-07 15:04:11,842] {bash_operator.py:126} INFO -
[2021-03-07 15:04:11,842] {bash_operator.py:126} INFO - CRITICAL ('Exception writing records', InternalError_('Invalid syntax: Create table sortkey auto with initial sortkeys.\n'))
CREATE TABLE schema.tablename_copy
(LIKE schema.tablename)
Invalid syntax: Create table sortkey auto with initial sortkeys.
ALTER TABLE schema.tablename ALTER SORTKEY NONE;
ALTER TABLE schema.tablename ALTER DISTSTYLE EVEN;
LIKE
will also work. Since this LIKE
is more than likely part of a load / upsert sequence. eg
LIKE
.
LIKE
statement is trying to transfer SORTKEYs and is unable to do so.schema.tablename
you'll perhaps find something odd with Redshifts understanding of it.SELECT * FROM
WHERE schemaname = 'schema' AND tablename = 'tablename'