Convert parquet to JSON for DynamoDB import

Views: 94 Published: 2020/6/4 0:31:37 Tags: pyspark amazon-dynamodb

Problem description

I am using AWS Glue jobs to back up DynamoDB tables to S3 in Parquet format so that I can query them in Athena.

To restore a table in DynamoDB from these Parquet files in S3, this is what I am thinking: read each Parquet file, convert it to JSON, and then insert the JSON-formatted data into DynamoDB (using pyspark, along the lines below).

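from pyspark.sql import SparkSession

# get a Spark session (stands in for the undefined sqlContext)
spark = SparkSession.builder.getOrCreate()
parquetFile = spark.read.parquet(input_file)  # input_file: path to the Parquet backup
parquetFile.write.json(output_path)  # output_path: where the JSON Lines files are written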

using the dynamodb-json library: https://github.com/Alonreznik/dynamodb-json
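For context on why a conversion step is involved at all: DynamoDB's low-level API expects each attribute wrapped in a type descriptor (`S` for strings, `N` for numbers, `BOOL`, `NULL`, `L` for lists, `M` for maps), which is different from the plain JSON that Spark writes. The sketch below illustrates that wrapping using only the standard library. It is not the dynamodb-json library's API; the function names `to_dynamodb_json` and `item_to_dynamodb_json` are made up for illustration, and sets and binary values are left out.

```python
from decimal import Decimal


def to_dynamodb_json(value):
    """Wrap a plain Python value in DynamoDB's typed attribute-value format."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, float, Decimal)):
        return {"N": str(value)}  # DynamoDB transmits numbers as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, list):
        return {"L": [to_dynamodb_json(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_dynamodb_json(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def item_to_dynamodb_json(item):
    """Convert one top-level record (a dict) into a DynamoDB item."""
    return {name: to_dynamodb_json(v) for name, v in item.items()}


# Hypothetical record, shaped like one line of the JSON output
record = {"id": "user-1", "age": 30, "active": True, "tags": ["a", "b"]}
print(item_to_dynamodb_json(record))
```

Whether this step is needed depends on how the data is written back: the typed format matters for the low-level client API, while higher-level clients may do the wrapping themselves.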

Does this approach sound right? Are there any alternatives to it?
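As a rough illustration of the final insert step, here is a minimal sketch that assumes the JSON files are read line by line (Spark writes one JSON object per line) and that the target is a boto3 resource-level `Table`. The helper names `read_items` and `restore` are hypothetical. The resource-level `Table` accepts plain Python values and serializes them itself, but it rejects Python floats, so fractional numbers are parsed as `Decimal`.

```python
import json
from decimal import Decimal


def read_items(lines):
    """Parse newline-delimited JSON, yielding one dict per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            # boto3 rejects Python floats, so parse fractional numbers as Decimal
            yield json.loads(line, parse_float=Decimal)


def restore(table, lines):
    """Write every parsed record to `table` and return the number written."""
    count = 0
    # batch_writer() buffers puts into batched requests and resends unprocessed items
    with table.batch_writer() as batch:
        for item in read_items(lines):
            batch.put_item(Item=item)
            count += 1
    return count


# Usage sketch (not executed here; table and file names are placeholders):
# import boto3
# table = boto3.resource("dynamodb").Table("my-restored-table")
# with open("part-00000.json") as f:
#     print(restore(table, f))
```

Passing the table in as a parameter keeps the parsing logic testable without AWS access, since any object with a compatible `batch_writer()` can stand in for it.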
