Escaping delimiter in Amazon Redshift COPY command

Problem description

I'm pulling data from Amazon S3 into a table in Amazon Redshift. The table has several columns, and some of the column data may contain special characters.

The COPY command has a DELIMITER option that specifies the delimiter to use when loading the data into the table.

The problem is twofold:

When I export to S3 (with the UNLOAD command) using a delimiter, say ',', it works fine. But when I try to import into Redshift from S3, the problem appears: certain columns contain the ',' character, which the COPY command misinterprets as a delimiter, and it throws an error.

I tried various delimiters, but the data in my table always seems to contain one special character or another, which causes the same issue.

I even tried unloading with a multi-character delimiter, like #% or ~, but when loading from S3 with the COPY command, a multi-character delimiter is not supported.

Is there a solution?

I think the delimiter can be escaped with \, but for some reason that isn't working either, or maybe I'm not using the right syntax for escaping in the COPY command.

Recommended answer

The following example shows the contents of a text file with the field values separated by commas.

12,Shows,Musicals,Musical theatre
13,Shows,Plays,All "non-musical" theatre  
14,Shows,Opera,All opera, light, and "rock" opera
15,Concerts,Classical,All symphony, concerto, and choir concerts

If you load the file using the DELIMITER parameter to specify comma-delimited input, the COPY command will fail because some input fields contain commas. You can avoid that problem by using the CSV parameter and enclosing the fields that contain commas in quote characters. If the quote character appears within a quoted string, you need to escape it by doubling it. The default quote character is a double quotation mark, so you need to escape each double quotation mark with an additional double quotation mark. Your new input file will look something like this.

12,Shows,Musicals,Musical theatre
13,Shows,Plays,"All ""non-musical"" theatre"
14,Shows,Opera,"All opera, light, and ""rock"" opera"
15,Concerts,Classical,"All symphony, concerto, and choir concerts"
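If the file is produced outside Redshift, one way to get this quoting is to let a CSV library do it instead of hand-escaping. The following is a minimal sketch, assuming Python's standard csv module; the names rows and to_csv_text are made up for illustration, and uploading the result to S3 is left out.

```python
import csv
import io

# The same four records as in the example above, before any quoting is applied.
rows = [
    [12, "Shows", "Musicals", "Musical theatre"],
    [13, "Shows", "Plays", 'All "non-musical" theatre'],
    [14, "Shows", "Opera", 'All opera, light, and "rock" opera'],
    [15, "Concerts", "Classical", "All symphony, concerto, and choir concerts"],
]


def to_csv_text(records):
    """Serialize records as comma-delimited text with CSV-style quoting.

    QUOTE_MINIMAL quotes only the fields that contain the delimiter or the
    quote character, and the writer doubles any embedded double quotes.
    """
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter=",",
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    writer.writerows(records)
    return buf.getvalue()


csv_text = to_csv_text(rows)
print(csv_text, end="")

# Reading the text back splits each line into four fields again,
# with the embedded commas and quotes intact.
parsed = list(csv.reader(io.StringIO(csv_text)))
```

The printed text should match the quoted file shown above, so it is in the shape that the CSV parameter expects.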


Source: Load Quote from a CSV File

What I use:

COPY tablename FROM 'S3-Path' CREDENTIALS '' MANIFEST CSV QUOTE '\"' DELIMITER ',' TRUNCATECOLUMNS ACCEPTINVCHARS MAXERROR 2

If I've made a bad assumption, please comment and I'll refocus my answer.
