How to recover from Cloud Dataflow job failed on com.google.api.client.googleapis.json.GoogleJsonResponseException: 410 Gone

Problem Description
My Cloud Dataflow job, after running for 4 hours, mysteriously failed because one worker threw this exception four times (in the span of an hour). The exception stack looks like this:
java.io.IOException: com.google.api.client.googleapis.json.GoogleJsonResponseException: 410 Gone { "code" : 500, "errors" : [ { "domain" : "global", "message" : "Backend Error", "reason" : "backendError" } ], "message" : "Backend Error" }
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.waitForCompletionAndThrowIfUploadFailed(AbstractGoogleAsyncWriteChannel.java:431)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.close(AbstractGoogleAsyncWriteChannel.java:289)
at com.google.cloud.dataflow.sdk.io.FileBasedSink$FileBasedWriter.close(FileBasedSink.java:516)
at com.google.cloud.dataflow.sdk.io.FileBasedSink$FileBasedWriter.close(FileBasedSink.java:419)
at com.google.cloud.dataflow.sdk.io.Write$Bound$2.finishBundle(Write.java:201)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 410 Gone { "code" : 500, "errors" : [ { "domain" : "global", "message" : "Backend Error", "reason" : "backendError" } ], "message" : "Backend Error" }
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:432)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:357)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
None of the classes in the stack trace come from my job directly, so I cannot even catch the exception and recover.
I checked my region, Cloud Storage bucket (owned by the same project), etc., and they are all OK. The other workers were also running fine. This looks like some kind of bug in Dataflow? If nothing else, I really would like to know how to recover from this: the job spent 30+ hours in total and has now produced a bunch of temp files whose completeness I cannot verify... If I re-run the job, I am concerned it would fail again.
The job ID is 2016-08-25_21_50_44-3818926540093331568, for the Google folks. Thanks!!
Solution

The solution was to specify withNumShards() on the output with a fixed value < 10000. This is a limitation that we hope to remove in the future.
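For reference, a minimal sketch of applying that fix with the Dataflow 1.x SDK that appears in the stack trace. The bucket paths, transform names, and the shard count of 1000 are placeholders, not values from the original job; the only requirement stated in the answer is that the fixed shard count stay below 10000:

```java
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.TextIO;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;

public class FixedShardsExample {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    p.apply(TextIO.Read.from("gs://my-bucket/input/*"))   // hypothetical input path
     .apply(TextIO.Write
         .to("gs://my-bucket/output/results")             // hypothetical output prefix
         // Pin the number of output shards to a fixed value < 10000 instead of
         // letting the runner choose, to work around the 410 Gone backend error.
         .withNumShards(1000));

    p.run();
  }
}
```

Pinning the shard count trades some write parallelism for a bounded number of files in the FileBasedSink, which is what keeps the output below the limit the answer describes.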