AWS Glue issue with double quotes and commas

Question

I have this CSV file:

reference,address
V7T452F4H9,"12410 W 62TH ST, AA D"

The following options are used in the table definition:

ROW FORMAT SERDE 
  'org.apache.hadoop.hive.serde2.OpenCSVSerde' 
WITH SERDEPROPERTIES ( 
  'quoteChar'='"', 
  'separatorChar'=',') 

But it still doesn't recognize the double quotes in the data, and the comma inside the double-quoted field is messing up the data. When I run the Athena query, the result looks like this:

reference     address
V7T452F4H9    "12410 W 62TH ST

How do I fix this issue?

Recommended answer

It looks like you also need to add escapeChar. The AWS Athena docs show this example:

CREATE EXTERNAL TABLE myopencsvtable (
   col1 string,
   col2 string,
   col3 string,
   col4 string
)
ROW FORMAT SERDE 
'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   'separatorChar' = ',',
   'quoteChar' = '"',
   'escapeChar' = '\\'
   )
STORED AS TEXTFILE
LOCATION 's3://location/of/csv/';
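
Applied to the two-column file from the question, the same pattern might look like the sketch below. The table name `addresses_csv` and the S3 location are placeholders, not values from the question, and the `skip.header.line.count` table property is an assumption added because the sample file starts with a header row.

```sql
-- Sketch only: addresses_csv and the S3 path are hypothetical placeholders.
-- skip.header.line.count is an assumption: it skips the "reference,address" header line.
CREATE EXTERNAL TABLE addresses_csv (
   reference string,
   address string
)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   'separatorChar' = ',',
   'quoteChar' = '"',
   'escapeChar' = '\\'
   )
STORED AS TEXTFILE
LOCATION 's3://location/of/csv/'
TBLPROPERTIES ('skip.header.line.count' = '1');
```

If the SerDe properties take effect, a query such as `SELECT reference, address FROM addresses_csv` should return the address as a single field with the comma inside it, rather than cutting it off at `"12410 W 62TH ST`. If the table is managed by a Glue crawler, the crawler may overwrite these settings on its next run, so it is worth checking the table definition again afterwards.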
