AWS Data Pipeline S3 CSV to DynamoDB JSON Error

Question

I'm trying to insert several CSV files located in an S3 directory into DynamoDB with AWS Data Pipeline, but I'm getting this error.

    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: com.google.gson.stream.MalformedJsonException: Expected ':' at line 1 column 10
    at com.google.gson.stream.JsonReader.syntaxError(JsonReader.java:1505)
    at com.google.gson.stream.JsonReader.doPeek(JsonReader.java:519)
    at com.google.gson.stream.JsonReader.peek(JsonReader.java:414)
    at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:157)
    at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.read(TypeAdapterRuntimeTypeWrapper.java:40)
    at com.google.gson.internal.bind.MapTypeAdapterFactory$Adapter.read(MapTypeAdapterFactory.java:187)
    at com.google.gson.internal.bind.MapTypeAdapterFactory$Adapter.read(MapTypeAdapterFactory.java:145)
    at com.google.gson.Gson.fromJson(Gson.java:803)
    ... 15 more
Exception in thread "main" java.io.
errorStackTrace amazonaws.datapipeline.taskrunner.TaskExecutionException: Failed to complete EMR transform.
    at amazonaws.datapipeline.activity.EmrActivity.runActivity(EmrActivity.java:67)
    at amazonaws.datapipeline.objects.AbstractActivity.run(AbstractActivity.java:16)
    at amazonaws.datapipeline.taskrunner.TaskPoller.executeRemoteRunner(TaskPoller.java:136)
    at amazonaws.datapipeline.taskrunner.TaskPoller.executeTask(TaskPoller.java:105)
    at amazonaws.datapipeline.taskrunner.TaskPoller$1.run(TaskPoller.java:81)
    at private.com.amazonaws.services.datapipeline.poller.PollWorker.executeWork(PollWorker.java:76)
    at private.com.amazonaws.services.datapipeline.poller.PollWorker.run(PollWorker.java:53)
    at java.lang.Thread.run(Thread.java:748)
Caused by: amazonaws.datapipeline.taskrunner.TaskExecutionException:
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: com.google.gson.stream.MalformedJsonException: Expected ':' at line 1 column 10
    at com.google.gson.stream.JsonReader.syntaxError(JsonReader.java:1505)
    at com.google.gson.stream.JsonReader.doPeek(JsonReader.java:519)
    at com.google.gson.stream.JsonReader.peek(JsonReader.java:414)
    at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:157)
    at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.read(TypeAdapterRuntimeTypeWrapper.java:40)
    at com.google.gson.internal.bind.MapTypeAdapterFactory$Adapter.read(MapTypeAdapterFactory.java:187)
    at com.google.gson.internal.bind.MapTypeAdapterFactory$Adapter.read(MapTypeAdapterFactory.java:145)
    at com.google.gson.Gson.fromJson(Gson.java:803)
    ... 15 more
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:873)
    at org.apache.hadoop.dynamodb.tools.DynamoDBImport.run(DynamoDBImport.java:81)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.dynamodb.tools.DynamoDBImport.main(DynamoDBImport.java:43)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
    at amazonaws.datapipeline.cluster.EmrUtil.runSteps(EmrUtil.java:286)
    at amazonaws.datapipeline.activity.EmrActivity.runActivity(EmrActivity.java:63)
    ... 7 more

Recommended answer

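This solved my problem. The input files have to be in the format that AWS Data Pipeline uses: one JSON object per line, with each attribute value wrapped in a DynamoDB type descriptor such as "S" for a string, rather than raw CSV rows: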

```
{"Name": {"S":"Amazon push"},"Category": {"S":"Amazon Web Services"}}
{"Name": {"S":"Amazon S3"},"Category": {"S":"Amazon Web Services"}}
```
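If the source data only exists as CSV, it can be converted to this line-per-item layout before the pipeline runs. The following is a minimal Python sketch, not part of the original answer. It assumes the CSV has a header row and that every attribute should be stored as a string ("S"); the function name `csv_to_dynamodb_json_lines` and the sample data are made up for illustration.

```python
import csv
import io
import json


def csv_to_dynamodb_json_lines(csv_text):
    """Convert CSV text (with a header row) into one typed JSON item per line."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # Assumption: every attribute is a string, so each value is wrapped
        # as {"S": value}. Empty cells are skipped instead of being written.
        item = {name: {"S": value} for name, value in row.items() if name and value}
        lines.append(json.dumps(item))
    return "\n".join(lines)


# Hypothetical sample input mirroring the items shown above.
sample = "Name,Category\nAmazon push,Amazon Web Services\nAmazon S3,Amazon Web Services\n"
print(csv_to_dynamodb_json_lines(sample))
```

Numeric attributes would need a different type descriptor, so this sketch would have to be adapted for tables that are not all strings.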

References:

https://calorious.wordpress.com/2016/03/18/episode-4-importing-json-into-dynamodb/

https://medium.com/@ashleywnj/appsync-s3-data-pipeline-dynamodb-854f99d70b41
