Multi-line JSON file querying in Hive

Views: 137 | Published: 2019/11/26 19:44:09 | Tags: json hive amazon-athena


Question

I understand that the majority of JSON SerDe formats expect .json files to be stored with one record per line.

I have an S3 bucket with multi-line indented .json files (I don't control the source) that I'd like to query using Amazon Athena (though I suppose this applies just as well to Hive generally).

Is there a SerDe format out there that can parse multi-line indented .json files?
If there isn't a SerDe format to do this:
- Is there a best practice for dealing with files like this?
  - Should I plan on flattening these records out using a different tool like Python?

Sample file body:

[
  {
    "id": 1,
    "name": "ryan",
    "stuff": {
      "x": true,
      "y": [
        123,
        456
      ]
    }
  },
  ...
]
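
To make the "flatten with Python" option concrete, here is a minimal sketch of what that preprocessing step might look like. It assumes each file holds a single JSON array of records, as in the sample above, and rewrites it in the one-record-per-line layout that most JSON SerDes expect. The helper name `to_json_lines` is hypothetical, and reading from or writing back to S3 is deliberately left out.

```python
import json


def to_json_lines(text: str) -> str:
    """Convert an indented JSON document into one compact record per line.

    Hypothetical helper for illustration only. Assumes the document is
    either a JSON array of records or a single JSON object.
    """
    records = json.loads(text)
    if not isinstance(records, list):
        # Assumption: a lone top-level object is treated as one record.
        records = [records]
    # Compact separators keep every record on a single physical line.
    return "".join(
        json.dumps(record, separators=(",", ":")) + "\n" for record in records
    )


# A complete version of the sample record (the elided entries are omitted).
sample = """
[
  {
    "id": 1,
    "name": "ryan",
    "stuff": {
      "x": true,
      "y": [
        123,
        456
      ]
    }
  }
]
"""

if __name__ == "__main__":
    print(to_json_lines(sample), end="")
```

Nested values such as `stuff` stay nested; only the line layout changes, so the table definition can still map them to struct and array columns.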
