Presto (Athena) loading of a CSV file with quote-escaped commas

Views: 99  Published: 2021/4/3 18:39:01  Tags: csv amazon-athena presto


Question

Consider the following row in a CSV file:

1,0,True,"{""foo"":null,""bar"":null}",0,1
                       ▲

The highlighted comma (marked ▲ above) is part of a column. That is, the full text "{""foo"":null,""bar"":null}" is the value of a single column. However, AWS Athena interprets the highlighted comma as a column delimiter and incorrectly splits that text into multiple columns.
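To make the difference concrete, here is a minimal sketch in Python. It only illustrates the two ways of reading the row; it does not use Athena's own parser. Python's standard `csv` module stands in for a quote-aware reader, and the variable names are made up for this example.

```python
import csv
import io

# The sample row from the question, exactly as it appears in the file.
line = '1,0,True,"{""foo"":null,""bar"":null}",0,1'

# Naive reading: every comma is treated as a column delimiter,
# which is the behaviour described in the question.
naive_fields = line.split(",")

# Quote-aware reading: commas inside double quotes are data,
# and a doubled quote ("") stands for one literal quote character.
quoted_fields = next(csv.reader(io.StringIO(line)))

print(len(naive_fields))   # 7 fields: the JSON text is cut in two
print(len(quoted_fields))  # 6 fields: the JSON text stays in one column
print(quoted_fields[3])    # {"foo":null,"bar":null}
```

The naive split yields seven fields because the comma inside the quoted JSON text is counted as a delimiter, while the quote-aware reader yields the intended six.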

I know I could change the column delimiter to something else to avoid this problem. My question is: is this a bug in AWS Athena / Presto? How can I escape these commas?

Recommended answer

If your data is enclosed in double quotes, you need to use OpenCSVSerDe.

For the sample data

1,0,True,"{""foo"":null,""bar"":null}",0,1

the following table definition works:

CREATE EXTERNAL TABLE `extra_comma`(
  `a` string COMMENT 'from deserializer', 
  `b` string COMMENT 'from deserializer', 
  `c` string COMMENT 'from deserializer', 
  `d` string COMMENT 'from deserializer',
  `e` string COMMENT 'from deserializer',
  `f` string COMMENT 'from deserializer'
  )
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.serde2.OpenCSVSerde' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://aws-glue-stackoverflow/comma_in_data/'
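
As a rough check of what the six columns should contain once the row is read with quote-aware parsing, the sketch below maps the sample row onto the column names `a` to `f` from the table definition. It is an assumption-laden illustration: Python's `csv` module is used as a stand-in, on the assumption that the SerDe applies the same standard CSV quoting rules (double quote as the quote character, `""` as an escaped quote). All columns are declared as string in the definition, so the values are kept as text here.

```python
import csv
import io
import json

# Column names taken from the table definition above.
columns = ["a", "b", "c", "d", "e", "f"]

# The sample row, as it would appear in the CSV file.
sample = '1,0,True,"{""foo"":null,""bar"":null}",0,1\n'

# Read the row with standard CSV quoting rules and label each field.
row = next(csv.DictReader(io.StringIO(sample), fieldnames=columns))

# Column d should now hold the complete JSON text, so it parses cleanly.
payload = json.loads(row["d"])

print(row["d"])         # {"foo":null,"bar":null}
print(sorted(payload))  # ['bar', 'foo']
```

If column `d` comes back from a query as the full JSON text rather than a fragment, the quoted comma was handled as data.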
