spark-1.4.1 saveAsTextFile to S3 is very slow on emr-4.0.0


Problem description

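I run Spark 1.4.1 on Amazon AWS EMR 4.0.0.

For some reason, Spark's saveAsTextFile is very slow on EMR 4.0.0 compared with EMR 3.8 (it took 5 seconds before and takes 95 seconds now).

saveAsTextFile actually reports that it finished in 4.356 seconds, but after that I see lots of INFO messages with 404 errors from the com.amazonaws.latency logger for the next 90 seconds.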
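The gap between the stage time Spark reports and the time the call really takes can be measured from the shell. The following is a minimal sketch, not part of the original post: `Timing` and `timed` are hypothetical names, and it assumes that saveAsTextFile returns only after the renames on the [main] thread have finished.

```scala
object Timing {
  /** Runs `body` once and returns its result with the elapsed wall-clock seconds. */
  def timed[A](body: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1e9)
  }
}

// Usage in spark-shell (needs a live SparkContext `sc`, so it is left commented out):
// val (_, seconds) = Timing.timed {
//   sc.parallelize(List.range(0, 1600000), 160)
//     .map(x => x + "\t" + "A" * 100)
//     .saveAsTextFile("s3n://foo-bar/tmp/test40_20")
// }
// println(f"saveAsTextFile returned after $seconds%.3f s")
```

Comparing this wall-clock figure with the "finished in" value from the DAGScheduler line shows how much time is spent after the tasks themselves are done.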

spark> sc.parallelize(List.range(0, 1600000),160).map(x => x + "\t" + "A"*100).saveAsTextFile("s3n://foo-bar/tmp/test40_20")

2015-09-01 21:16:17,637 INFO  [dag-scheduler-event-loop] scheduler.DAGScheduler (Logging.scala:logInfo(59)) - ResultStage 5 (saveAsTextFile at <console>:22) finished in 4.356 s
2015-09-01 21:16:17,637 INFO  [task-result-getter-2] cluster.YarnScheduler (Logging.scala:logInfo(59)) - Removed TaskSet 5.0, whose tasks have all completed, from pool
2015-09-01 21:16:17,637 INFO  [main] scheduler.DAGScheduler (Logging.scala:logInfo(59)) - Job 5 finished: saveAsTextFile at <console>:22, took 4.547829 s
2015-09-01 21:16:17,638 INFO  [main] s3n.S3NativeFileSystem (S3NativeFileSystem.java:listStatus(896)) - listStatus s3n://foo-bar/tmp/test40_20/_temporary/0 with recursive false
2015-09-01 21:16:17,651 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 3B2F06FD11682D22), S3 Extended Request ID: C8T3rXVSEIk3swlwkUWJJX3gWuQx3QKC3Yyfxuhs7y0HXn3sEI9+c1a0f7/QK8BZ], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[3B2F06FD11682D22], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[11.923], HttpRequestTime=[11.388], HttpClientReceiveResponseTime=[9.544], RequestSigningTime=[0.274], HttpClientSendRequestTime=[0.129],
2015-09-01 21:16:17,723 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[E5D513E52B20FF17], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[71.927], HttpRequestTime=[53.517], HttpClientReceiveResponseTime=[51.81], RequestSigningTime=[0.209], ResponseProcessingTime=[17.97], HttpClientSendRequestTime=[0.089],
2015-09-01 21:16:17,756 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 62C6B413965447FD), S3 Extended Request ID: 4w5rKMWCt9EdeEKzKBXZgWpTcBZCfDikzuRrRrBxmtHYxkZyS4GxQVyADdLkgtZf], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[62C6B413965447FD], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[11.044], HttpRequestTime=[10.543], HttpClientReceiveResponseTime=[8.743], RequestSigningTime=[0.271], HttpClientSendRequestTime=[0.138],
2015-09-01 21:16:17,774 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[F62B991825042889], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[16.724], HttpRequestTime=[16.292], HttpClientReceiveResponseTime=[14.728], RequestSigningTime=[0.148], ResponseProcessingTime=[0.155], HttpClientSendRequestTime=[0.068],
2015-09-01 21:16:17,786 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 4846575A1C373BB9), S3 Extended Request ID: aw/MMKxKPmuDuxTj4GKyDbp8hgpQbTjipJBzdjdTgbwPgt5NsZS4z+tRf2bk3I2E], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[4846575A1C373BB9], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[11.531], HttpRequestTime=[11.134], HttpClientReceiveResponseTime=[9.434], RequestSigningTime=[0.206], HttpClientSendRequestTime=[0.13],
2015-09-01 21:16:17,786 INFO  [main] s3n.S3NativeFileSystem (S3NativeFileSystem.java:listStatus(896)) - listStatus s3n://foo-bar/tmp/test40_20/_temporary/0/task_201509012116_0005_m_000000 with recursive false
2015-09-01 21:16:17,798 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 8A91D9A08CE3C1FE), S3 Extended Request ID: u5RLzX1OvlIHBMCggSs3AGR96raYgD/Xu8RmoJuN/B+qZchoF1ZkbWIHRcqbzPNN], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[8A91D9A08CE3C1FE], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[11.472], HttpRequestTime=[11.147], HttpClientReceiveResponseTime=[9.594], RequestSigningTime=[0.168], HttpClientSendRequestTime=[0.088],
2015-09-01 21:16:17,817 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[006EE9124BA77E28], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[19.185], HttpRequestTime=[16.691], HttpClientReceiveResponseTime=[15.039], RequestSigningTime=[0.17], ResponseProcessingTime=[2.141], HttpClientSendRequestTime=[0.11],
2015-09-01 21:16:17,830 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 62F097583E42AB48), S3 Extended Request ID: EoJ7XNxQzKAm6yanlrf7ukIJOxYrhr5m8xEROkLc1wjFpPRgjuwY/JzznCshredZ], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[62F097583E42AB48], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[12.004], HttpRequestTime=[11.57], HttpClientReceiveResponseTime=[9.879], RequestSigningTime=[0.218], HttpClientSendRequestTime=[0.089],
2015-09-01 21:16:17,844 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: A96FDB3E0E0E13FE), S3 Extended Request ID: Y1nnEJAd/wNtW+T2pFvg8HG5fzcjs+ztuLcXwFV3I6Bda4nKU+9rSdbTkoDtNwtu], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[A96FDB3E0E0E13FE], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[13.543], HttpRequestTime=[13.145], HttpClientReceiveResponseTime=[11.505], RequestSigningTime=[0.207], HttpClientSendRequestTime=[0.108],
2015-09-01 21:16:17,911 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[4C105174ADF12A0B], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[66.408], HttpRequestTime=[63.949], HttpClientReceiveResponseTime=[62.298], RequestSigningTime=[0.211], ResponseProcessingTime=[2.049], HttpClientSendRequestTime=[0.085],
2015-09-01 21:16:17,912 INFO  [main] s3n.S3NativeFileSystem (S3NativeFileSystem.java:rename(1182)) - rename s3n://foo-bar/tmp/test40_20/_temporary/0/task_201509012116_0005_m_000000/part-00000 s3n://foo-bar/tmp/test40_20/part-00000
2015-09-01 21:16:17,927 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 547162454610B1C3), S3 Extended Request ID: VgjjiHVtd/RutYxW3jPAZgos64j7JYfBmaMhkZvmyhkgD5ZuCAMSRMd/TrWQmTci], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[547162454610B1C3], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[15.214], HttpRequestTime=[14.764], HttpClientReceiveResponseTime=[13.047], RequestSigningTime=[0.243], HttpClientSendRequestTime=[0.124],
2015-09-01 21:16:18,037 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 6F10454BF138C69F), S3 Extended Request ID: HSt8mkimmo9fK5qqTaU6OBGKXTQ1wvyctgMZSBsoIgxEFY+Yu5eq/Bn8fOCSsk3B], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[6F10454BF138C69F], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[108.944], HttpRequestTime=[108.542], HttpClientReceiveResponseTime=[106.874], RequestSigningTime=[0.171], HttpClientSendRequestTime=[0.067],
2015-09-01 21:16:18,215 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[942D4DFF59A2B262], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[177.058], HttpRequestTime=[174.523], HttpClientReceiveResponseTime=[172.689], RequestSigningTime=[0.263], ResponseProcessingTime=[2.049], HttpClientSendRequestTime=[0.117],
2015-09-01 21:16:18,235 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 712A1FF2554DDD5D), S3 Extended Request ID: RZZDuIrkdE/cdhAFijZix2juyAfZHyj7Mw2xJuyrEaJR5He0HREB30LATWvMJX3A], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[712A1FF2554DDD5D], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[20.187], HttpRequestTime=[19.728], HttpClientReceiveResponseTime=[18.001], RequestSigningTime=[0.238], HttpClientSendRequestTime=[0.125],
2015-09-01 21:16:18,248 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[B386866C749DB8E0], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[11.628], HttpRequestTime=[11.091], HttpClientReceiveResponseTime=[9.513], RequestSigningTime=[0.24], ResponseProcessingTime=[0.139], HttpClientSendRequestTime=[0.079],
2015-09-01 21:16:18,365 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[2621F3858DF8245B], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[117.034], HttpRequestTime=[116.494], HttpClientReceiveResponseTime=[114.81], RequestSigningTime=[0.168], ResponseProcessingTime=[0.202], HttpClientSendRequestTime=[0.1],
2015-09-01 21:16:18,382 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 595CA0A458D41C97), S3 Extended Request ID: TP+Hh6CER+g31u6GqpWuLttrjUg2oTPCQ9SWVPsSgcD98MvI88eTqSTjIzrSYmu3], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[595CA0A458D41C97], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[16.308], HttpRequestTime=[15.715], HttpClientReceiveResponseTime=[13.752], RequestSigningTime=[0.276], HttpClientSendRequestTime=[0.164],
2015-09-01 21:16:18,647 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[7785739C9F12EB4A], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[264.11], HttpRequestTime=[261.533], HttpClientReceiveResponseTime=[259.67], RequestSigningTime=[0.309], ResponseProcessingTime=[2.05], HttpClientSendRequestTime=[0.131],
2015-09-01 21:16:18,674 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[204], ServiceName=[Amazon S3], AWSRequestID=[1F975359BBCA55FD], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[25.921], HttpRequestTime=[25.504], HttpClientReceiveResponseTime=[23.823], RequestSigningTime=[0.238], ResponseProcessingTime=[0.003], HttpClientSendRequestTime=[0.118],
2015-09-01 21:16:18,706 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[204], ServiceName=[Amazon S3], AWSRequestID=[144CA7E763BB12C6], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[31.69], HttpRequestTime=[31.444], HttpClientReceiveResponseTime=[29.976], RequestSigningTime=[0.139], ResponseProcessingTime=[0.002], HttpClientSendRequestTime=[0.07],
2015-09-01 21:16:18,718 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 102338387163D94E), S3 Extended Request ID: iFxuOYrjFEWmk/mCTxIa4OlgWqwAFOh3qE4YxlqkcVb3/oeVuW9usRPRS73w9CAg], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[102338387163D94E], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[11.867], HttpRequestTime=[11.606], HttpClientReceiveResponseTime=[10.146], RequestSigningTime=[0.12], HttpClientSendRequestTime=[0.072],
2015-09-01 21:16:18,732 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 7FF86B27A748C229), S3 Extended Request ID: tgQfRHB+cLoNpNf6lEWVF3v9LwVwheh+/0Gl0Q8JuQDnV/nkZWfxo29W3ZqUB9uA], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[7FF86B27A748C229], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[13.874], HttpRequestTime=[13.622], HttpClientReceiveResponseTime=[12.153], RequestSigningTime=[0.121], HttpClientSendRequestTime=[0.055],
2015-09-01 21:16:18,733 INFO  [main] s3n.S3NativeFileSystem (S3NativeFileSystem.java:listStatus(896)) - listStatus s3n://foo-bar/tmp/test40_20/_temporary/0/task_201509012116_0005_m_000001 with recursive false
2015-09-01 21:16:18,749 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: F850C0C2262580C7), S3 Extended Request ID: Sg4K3l/Q3pd1Cyhr5V6y9pH3nDeInGIxZoJdOi6QyTrgFWggw09+HLy82lm8C6sg], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[F850C0C2262580C7], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[15.981], HttpRequestTime=[15.697], HttpClientReceiveResponseTime=[14.223], RequestSigningTime=[0.145], HttpClientSendRequestTime=[0.076],
2015-09-01 21:16:18,784 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[33695DA390D1B8DF], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[34.601], HttpRequestTime=[32.989], HttpClientReceiveResponseTime=[31.53], RequestSigningTime=[0.126], ResponseProcessingTime=[1.354], HttpClientSendRequestTime=[0.056],
2015-09-01 21:16:18,801 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 61A128E7DA02A7B7), S3 Extended Request ID: Qc3EqsJl/PQ/E/MnNQrW7/pgqmPZ700D4hA5sZdo/nWolKm6oq5ZYnERIEEElsOP], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[61A128E7DA02A7B7], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[16.427], HttpRequestTime=[16.181], HttpClientReceiveResponseTime=[14.718], RequestSigningTime=[0.123], HttpClientSendRequestTime=[0.072],
2015-09-01 21:16:18,813 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: F45035D7D2C5B0C9), S3 Extended Request ID: fYLd2JtGOeI2BeltWzcpObGSQBh8VS92dedQuBSDkZVwjCUAVz4k+cv7k+bmLfGb], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[F45035D7D2C5B0C9], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[12.083], HttpRequestTime=[11.832], HttpClientReceiveResponseTime=[10.379], RequestSigningTime=[0.124], HttpClientSendRequestTime=[0.056],
2015-09-01 21:16:18,828 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[D5899A9BA4A95E07], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[15.137], HttpRequestTime=[13.767], HttpClientReceiveResponseTime=[12.305], RequestSigningTime=[0.123], ResponseProcessingTime=[1.128], HttpClientSendRequestTime=[0.081],
2015-09-01 21:16:18,829 INFO  [main] s3n.S3NativeFileSystem (S3NativeFileSystem.java:rename(1182)) - rename s3n://foo-bar/tmp/test40_20/_temporary/0/task_201509012116_0005_m_000001/part-00001 s3n://foo-bar/tmp/test40_20/part-00001
... skipped 3400 lines and 95 sec ...
2015-09-01 21:17:53,821 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[204], ServiceName=[Amazon S3], AWSRequestID=[CEDEF99979579E6E], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=20, ClientExecuteTime=[20.718], HttpRequestTime=[20.288], HttpClientReceiveResponseTime=[18.391], RequestSigningTime=[0.248], ResponseProcessingTime=[0.006], HttpClientSendRequestTime=[0.158],
2015-09-01 21:17:53,846 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[204], ServiceName=[Amazon S3], AWSRequestID=[80AD0657203B53A6], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=20, ClientExecuteTime=[24.782], HttpRequestTime=[24.353], HttpClientReceiveResponseTime=[22.444], RequestSigningTime=[0.236], ResponseProcessingTime=[0.006], HttpClientSendRequestTime=[0.113],
2015-09-01 21:17:53,859 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: E271C72B2B91FAE6), S3 Extended Request ID: jRwTxrz/DSmPZTWGscxLuhBzRHL5CcXeyPfzQ/urdL0Tyki2mJrl0x3SIS/yGpC5yOzSksZUuAc=], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[E271C72B2B91FAE6], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=20, ClientExecuteTime=[11.98], HttpRequestTime=[11.566], HttpClientReceiveResponseTime=[9.793], RequestSigningTime=[0.214], HttpClientSendRequestTime=[0.136],
2015-09-01 21:17:53,870 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 156B6DC4EE7BABA6), S3 Extended Request ID: F/rPjLYwwXHcxJnpsHwHdUoMQf7diS6r0SV66AvfwQ7mv0z4jigD2RpyXYBTvSvZFODW5E1K8q4=], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[156B6DC4EE7BABA6], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=20, ClientExecuteTime=[11.161], HttpRequestTime=[10.893], HttpClientReceiveResponseTime=[9.311], RequestSigningTime=[0.116], HttpClientSendRequestTime=[0.089],
2015-09-01 21:17:53,889 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[957AFF2AEC49DB6B], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=20, ClientExecuteTime=[17.906], HttpRequestTime=[15.035], HttpClientReceiveResponseTime=[13.306], RequestSigningTime=[0.151], ResponseProcessingTime=[2.521], HttpClientSendRequestTime=[0.125],
2015-09-01 21:17:53,912 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[7CAEE08C0A6B3D2B], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=20, ClientExecuteTime=[21.727], HttpRequestTime=[21.166], HttpClientReceiveResponseTime=[19.19], RequestSigningTime=[0.225], ResponseProcessingTime=[0.031], HttpClientSendRequestTime=[0.115],
2015-09-01 21:17:53,913 INFO  [main] s3n.Jets3tNativeFileSystemStore (Jets3tNativeFileSystemStore.java:storeFile(141)) - s3.putObject foo-bar tmp/test40_20/_SUCCESS 0
2015-09-01 21:17:53,926 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 2D8B08BCE0E24AE5), S3 Extended Request ID: f4gTZ9I05s5IzQnwvJP7QieN5eaO3SBgez5ZS9R+f70n9WWWFeTpcg7WoHPa5bf/cIB2U6hQueM=], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[2D8B08BCE0E24AE5], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=20, ClientExecuteTime=[13.082], HttpRequestTime=[12.543], HttpClientReceiveResponseTime=[10.591], RequestSigningTime=[0.265], HttpClientSendRequestTime=[0.14],
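
With several thousand such lines, a per-status tally is easier to read than the raw log. The sketch below is an illustration and not part of the original post: `LatencySummary` and `summarise` are hypothetical names, and it assumes every amazonaws.latency line carries `StatusCode=[...]` and `ClientExecuteTime=[...]` fields in the form shown above (the values look like milliseconds, judging from the timestamps).

```scala
object LatencySummary {
  // Match the "StatusCode=[404]" and "ClientExecuteTime=[11.923]" fields.
  private val Status = """StatusCode=\[(\d+)\]""".r
  private val ClientTime = """ClientExecuteTime=\[([0-9.]+)\]""".r

  /** Returns status code -> (request count, summed ClientExecuteTime). */
  def summarise(lines: Iterator[String]): Map[Int, (Int, Double)] = {
    // Keep only lines that contain both fields; other log lines are skipped.
    val parsed = lines.flatMap { line =>
      for {
        s <- Status.findFirstMatchIn(line)
        t <- ClientTime.findFirstMatchIn(line)
      } yield (s.group(1).toInt, t.group(1).toDouble)
    }
    parsed.foldLeft(Map.empty[Int, (Int, Double)]) { case (acc, (code, time)) =>
      val (count, total) = acc.getOrElse(code, (0, 0.0))
      acc.updated(code, (count + 1, total + time))
    }
  }
}

// Usage, assuming the driver log was saved to a hypothetical file "driver.log":
// val summary = LatencySummary.summarise(scala.io.Source.fromFile("driver.log").getLines())
// summary.foreach { case (code, (n, total)) => println(s"$code: $n requests, $total total") }
```

A high count of 404 responses relative to the number of output partitions would point at the per-file existence checks and renames rather than at the data transfer itself.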


Solution

To fix this issue, I added the following settings to mapred-site.xml, as suggested by Neil Jonkers on user@spark.apache.org:

<property>
  <name>mapred.output.direct.EmrFileSystem</name>
  <value>true</value>
</property>
<property>
  <name>mapred.output.direct.NativeS3FileSystem</name>
  <value>true</value>
</property>

This can be done by adding the following to the aws command:

classification=mapred-site,properties=[mapred.output.direct.EmrFileSystem=true,mapred.output.direct.NativeS3FileSystem=true]

or by adding the following to the configuration JSON file:

  {
    "Classification": "mapred-site",
    "Properties": {
      "mapred.output.direct.EmrFileSystem": "true",
      "mapred.output.direct.NativeS3FileSystem": "true"
    }
  }
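
The command-line and JSON forms above carry the same classification and the same two properties. As a rough sketch of that equivalence, the following renders both from one list; `MapredSiteConfig` and its members are hypothetical names, this is plain string formatting and not an EMR API, and it assumes keys and values need no JSON escaping.

```scala
object MapredSiteConfig {
  val classification = "mapred-site"

  // The two properties from the answer above.
  val properties: Seq[(String, String)] = Seq(
    "mapred.output.direct.EmrFileSystem" -> "true",
    "mapred.output.direct.NativeS3FileSystem" -> "true"
  )

  /** Shorthand form: classification=...,properties=[k=v,k=v] */
  def cliShorthand: String =
    properties
      .map { case (k, v) => k + "=" + v }
      .mkString("classification=" + classification + ",properties=[", ",", "]")

  /** JSON object form, one property per line. */
  def json: String = {
    val entries = properties.map { case (k, v) => "    \"" + k + "\": \"" + v + "\"" }
    val lines = Seq(
      "{",
      "  \"Classification\": \"" + classification + "\",",
      "  \"Properties\": {",
      entries.mkString(",\n"),
      "  }",
      "}"
    )
    lines.mkString("\n")
  }
}

// println(MapredSiteConfig.cliShorthand)
// println(MapredSiteConfig.json)
```

Generating both from a single list keeps the two forms from drifting apart if more properties are added later.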

I run Spark 1.4.1 on Amazon AWS EMR 4.0.0

For some reason Spark saveAsTextFile is very slow on EMR 4.0.0 in comparison to EMR 3.8 (was 5 sec, now 95 sec)

Actually saveAsTextFile says that it's done in 4.356 sec, but after that I see lots of INFO messages with 404 errors from the com.amazonaws.latency logger for the next 90 sec

spark> sc.parallelize(List.range(0, 1600000),160).map(x => x + "\t" + "A"*100).saveAsTextFile("s3n://foo-bar/tmp/test40_20")

2015-09-01 21:16:17,637 INFO  [dag-scheduler-event-loop] scheduler.DAGScheduler (Logging.scala:logInfo(59)) - ResultStage 5 (saveAsTextFile at <console>:22) finished in 4.356 s
2015-09-01 21:16:17,637 INFO  [task-result-getter-2] cluster.YarnScheduler (Logging.scala:logInfo(59)) - Removed TaskSet 5.0, whose tasks have all completed, from pool 
2015-09-01 21:16:17,637 INFO  [main] scheduler.DAGScheduler (Logging.scala:logInfo(59)) - Job 5 finished: saveAsTextFile at <console>:22, took 4.547829 s
2015-09-01 21:16:17,638 INFO  [main] s3n.S3NativeFileSystem (S3NativeFileSystem.java:listStatus(896)) - listStatus s3n://foo-bar/tmp/test40_20/_temporary/0 with recursive false
2015-09-01 21:16:17,651 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 3B2F06FD11682D22), S3 Extended Request ID: C8T3rXVSEIk3swlwkUWJJX3gWuQx3QKC3Yyfxuhs7y0HXn3sEI9+c1a0f7/QK8BZ], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[3B2F06FD11682D22], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[11.923], HttpRequestTime=[11.388], HttpClientReceiveResponseTime=[9.544], RequestSigningTime=[0.274], HttpClientSendRequestTime=[0.129], 
2015-09-01 21:16:17,723 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[E5D513E52B20FF17], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[71.927], HttpRequestTime=[53.517], HttpClientReceiveResponseTime=[51.81], RequestSigningTime=[0.209], ResponseProcessingTime=[17.97], HttpClientSendRequestTime=[0.089], 
2015-09-01 21:16:17,756 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 62C6B413965447FD), S3 Extended Request ID: 4w5rKMWCt9EdeEKzKBXZgWpTcBZCfDikzuRrRrBxmtHYxkZyS4GxQVyADdLkgtZf], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[62C6B413965447FD], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[11.044], HttpRequestTime=[10.543], HttpClientReceiveResponseTime=[8.743], RequestSigningTime=[0.271], HttpClientSendRequestTime=[0.138], 
2015-09-01 21:16:17,774 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[F62B991825042889], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[16.724], HttpRequestTime=[16.292], HttpClientReceiveResponseTime=[14.728], RequestSigningTime=[0.148], ResponseProcessingTime=[0.155], HttpClientSendRequestTime=[0.068], 
2015-09-01 21:16:17,786 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 4846575A1C373BB9), S3 Extended Request ID: aw/MMKxKPmuDuxTj4GKyDbp8hgpQbTjipJBzdjdTgbwPgt5NsZS4z+tRf2bk3I2E], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[4846575A1C373BB9], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[11.531], HttpRequestTime=[11.134], HttpClientReceiveResponseTime=[9.434], RequestSigningTime=[0.206], HttpClientSendRequestTime=[0.13], 
2015-09-01 21:16:17,786 INFO  [main] s3n.S3NativeFileSystem (S3NativeFileSystem.java:listStatus(896)) - listStatus s3n://foo-bar/tmp/test40_20/_temporary/0/task_201509012116_0005_m_000000 with recursive false
2015-09-01 21:16:17,798 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 8A91D9A08CE3C1FE), S3 Extended Request ID: u5RLzX1OvlIHBMCggSs3AGR96raYgD/Xu8RmoJuN/B+qZchoF1ZkbWIHRcqbzPNN], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[8A91D9A08CE3C1FE], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[11.472], HttpRequestTime=[11.147], HttpClientReceiveResponseTime=[9.594], RequestSigningTime=[0.168], HttpClientSendRequestTime=[0.088], 
2015-09-01 21:16:17,817 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[006EE9124BA77E28], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[19.185], HttpRequestTime=[16.691], HttpClientReceiveResponseTime=[15.039], RequestSigningTime=[0.17], ResponseProcessingTime=[2.141], HttpClientSendRequestTime=[0.11], 
2015-09-01 21:16:17,830 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 62F097583E42AB48), S3 Extended Request ID: EoJ7XNxQzKAm6yanlrf7ukIJOxYrhr5m8xEROkLc1wjFpPRgjuwY/JzznCshredZ], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[62F097583E42AB48], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[12.004], HttpRequestTime=[11.57], HttpClientReceiveResponseTime=[9.879], RequestSigningTime=[0.218], HttpClientSendRequestTime=[0.089], 
2015-09-01 21:16:17,844 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: A96FDB3E0E0E13FE), S3 Extended Request ID: Y1nnEJAd/wNtW+T2pFvg8HG5fzcjs+ztuLcXwFV3I6Bda4nKU+9rSdbTkoDtNwtu], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[A96FDB3E0E0E13FE], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[13.543], HttpRequestTime=[13.145], HttpClientReceiveResponseTime=[11.505], RequestSigningTime=[0.207], HttpClientSendRequestTime=[0.108], 
2015-09-01 21:16:17,911 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[4C105174ADF12A0B], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[66.408], HttpRequestTime=[63.949], HttpClientReceiveResponseTime=[62.298], RequestSigningTime=[0.211], ResponseProcessingTime=[2.049], HttpClientSendRequestTime=[0.085], 
2015-09-01 21:16:17,912 INFO  [main] s3n.S3NativeFileSystem (S3NativeFileSystem.java:rename(1182)) - rename s3n://foo-bar/tmp/test40_20/_temporary/0/task_201509012116_0005_m_000000/part-00000 s3n://foo-bar/tmp/test40_20/part-00000
2015-09-01 21:16:17,927 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 547162454610B1C3), S3 Extended Request ID: VgjjiHVtd/RutYxW3jPAZgos64j7JYfBmaMhkZvmyhkgD5ZuCAMSRMd/TrWQmTci], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[547162454610B1C3], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[15.214], HttpRequestTime=[14.764], HttpClientReceiveResponseTime=[13.047], RequestSigningTime=[0.243], HttpClientSendRequestTime=[0.124], 
2015-09-01 21:16:18,037 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 6F10454BF138C69F), S3 Extended Request ID: HSt8mkimmo9fK5qqTaU6OBGKXTQ1wvyctgMZSBsoIgxEFY+Yu5eq/Bn8fOCSsk3B], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[6F10454BF138C69F], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[108.944], HttpRequestTime=[108.542], HttpClientReceiveResponseTime=[106.874], RequestSigningTime=[0.171], HttpClientSendRequestTime=[0.067], 
2015-09-01 21:16:18,215 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[942D4DFF59A2B262], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[177.058], HttpRequestTime=[174.523], HttpClientReceiveResponseTime=[172.689], RequestSigningTime=[0.263], ResponseProcessingTime=[2.049], HttpClientSendRequestTime=[0.117], 
2015-09-01 21:16:18,235 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 712A1FF2554DDD5D), S3 Extended Request ID: RZZDuIrkdE/cdhAFijZix2juyAfZHyj7Mw2xJuyrEaJR5He0HREB30LATWvMJX3A], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[712A1FF2554DDD5D], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[20.187], HttpRequestTime=[19.728], HttpClientReceiveResponseTime=[18.001], RequestSigningTime=[0.238], HttpClientSendRequestTime=[0.125], 
2015-09-01 21:16:18,248 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[B386866C749DB8E0], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[11.628], HttpRequestTime=[11.091], HttpClientReceiveResponseTime=[9.513], RequestSigningTime=[0.24], ResponseProcessingTime=[0.139], HttpClientSendRequestTime=[0.079], 
2015-09-01 21:16:18,365 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[2621F3858DF8245B], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[117.034], HttpRequestTime=[116.494], HttpClientReceiveResponseTime=[114.81], RequestSigningTime=[0.168], ResponseProcessingTime=[0.202], HttpClientSendRequestTime=[0.1], 
2015-09-01 21:16:18,382 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 595CA0A458D41C97), S3 Extended Request ID: tP+Hh6CER+g31u6GqpWuLttrjUg2oTPCQ9SWVPsSgcD98MvI88eTqSTjIzrSYmu3], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[595CA0A458D41C97], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[16.308], HttpRequestTime=[15.715], HttpClientReceiveResponseTime=[13.752], RequestSigningTime=[0.276], HttpClientSendRequestTime=[0.164], 
2015-09-01 21:16:18,647 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[7785739C9F12EB4A], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[264.11], HttpRequestTime=[261.533], HttpClientReceiveResponseTime=[259.67], RequestSigningTime=[0.309], ResponseProcessingTime=[2.05], HttpClientSendRequestTime=[0.131], 
2015-09-01 21:16:18,674 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[204], ServiceName=[Amazon S3], AWSRequestID=[1F975359BBCA55FD], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[25.921], HttpRequestTime=[25.504], HttpClientReceiveResponseTime=[23.823], RequestSigningTime=[0.238], ResponseProcessingTime=[0.003], HttpClientSendRequestTime=[0.118], 
2015-09-01 21:16:18,706 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[204], ServiceName=[Amazon S3], AWSRequestID=[144CA7E763BB12C6], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[31.69], HttpRequestTime=[31.444], HttpClientReceiveResponseTime=[29.976], RequestSigningTime=[0.139], ResponseProcessingTime=[0.002], HttpClientSendRequestTime=[0.07], 
2015-09-01 21:16:18,718 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 102338387163D94E), S3 Extended Request ID: iFxuOYrjFEWmk/mCTxIa4OlgWqwAFOh3qE4YxlqkcVb3/oeVuW9usRPRS73w9CAg], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[102338387163D94E], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[11.867], HttpRequestTime=[11.606], HttpClientReceiveResponseTime=[10.146], RequestSigningTime=[0.12], HttpClientSendRequestTime=[0.072], 
2015-09-01 21:16:18,732 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 7FF86B27A748C229), S3 Extended Request ID: tgQfRHB+cLoNpNf6lEWVF3v9LwVwheh+/0Gl0Q8JuQDnV/nkZWfxo29W3ZqUB9uA], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[7FF86B27A748C229], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[13.874], HttpRequestTime=[13.622], HttpClientReceiveResponseTime=[12.153], RequestSigningTime=[0.121], HttpClientSendRequestTime=[0.055], 
2015-09-01 21:16:18,733 INFO  [main] s3n.S3NativeFileSystem (S3NativeFileSystem.java:listStatus(896)) - listStatus s3n://foo-bar/tmp/test40_20/_temporary/0/task_201509012116_0005_m_000001 with recursive false
2015-09-01 21:16:18,749 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: F850C0C2262580C7), S3 Extended Request ID: Sg4K3l/Q3pd1Cyhr5V6y9pH3nDeInGIxZoJdOi6QyTrgFWggw09+HLy82lm8C6sg], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[F850C0C2262580C7], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[15.981], HttpRequestTime=[15.697], HttpClientReceiveResponseTime=[14.223], RequestSigningTime=[0.145], HttpClientSendRequestTime=[0.076], 
2015-09-01 21:16:18,784 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[33695DA390D1B8DF], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[34.601], HttpRequestTime=[32.989], HttpClientReceiveResponseTime=[31.53], RequestSigningTime=[0.126], ResponseProcessingTime=[1.354], HttpClientSendRequestTime=[0.056], 
2015-09-01 21:16:18,801 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 61A128E7DA02A7B7), S3 Extended Request ID: Qc3EqsJl/Pq/e/MnNQrW7/pgqmPZ700D4hA5sZdo/nWolKm6oq5ZYnERIEEElsOP], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[61A128E7DA02A7B7], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[16.427], HttpRequestTime=[16.181], HttpClientReceiveResponseTime=[14.718], RequestSigningTime=[0.123], HttpClientSendRequestTime=[0.072], 
2015-09-01 21:16:18,813 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: F45035D7D2C5B0C9), S3 Extended Request ID: fYLd2JtGOeI2BeltWzcpObGSQBh8VS92dedQuBSDkZVwjCUAVz4k+cv7k+bmLfGb], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[F45035D7D2C5B0C9], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[12.083], HttpRequestTime=[11.832], HttpClientReceiveResponseTime=[10.379], RequestSigningTime=[0.124], HttpClientSendRequestTime=[0.056], 
2015-09-01 21:16:18,828 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[D5899A9BA4A95E07], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=1, ClientExecuteTime=[15.137], HttpRequestTime=[13.767], HttpClientReceiveResponseTime=[12.305], RequestSigningTime=[0.123], ResponseProcessingTime=[1.128], HttpClientSendRequestTime=[0.081], 
2015-09-01 21:16:18,829 INFO  [main] s3n.S3NativeFileSystem (S3NativeFileSystem.java:rename(1182)) - rename s3n://foo-bar/tmp/test40_20/_temporary/0/task_201509012116_0005_m_000001/part-00001 s3n://foo-bar/tmp/test40_20/part-00001
...skip 3400 rows and 95 sec...
2015-09-01 21:17:53,821 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[204], ServiceName=[Amazon S3], AWSRequestID=[CEDEF99979579E6E], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=20, ClientExecuteTime=[20.718], HttpRequestTime=[20.288], HttpClientReceiveResponseTime=[18.391], RequestSigningTime=[0.248], ResponseProcessingTime=[0.006], HttpClientSendRequestTime=[0.158], 
2015-09-01 21:17:53,846 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[204], ServiceName=[Amazon S3], AWSRequestID=[80AD0657203B53A6], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=20, ClientExecuteTime=[24.782], HttpRequestTime=[24.353], HttpClientReceiveResponseTime=[22.444], RequestSigningTime=[0.236], ResponseProcessingTime=[0.006], HttpClientSendRequestTime=[0.113], 
2015-09-01 21:17:53,859 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: E271C72B2B91FAE6), S3 Extended Request ID: jRwTxrz/DSmPZTWGscxLuhBzRHL5CcXeyPfzQ/urdL0Tyki2mJrl0x3SIS/yGpC5yOzSksZUuAc=], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[E271C72B2B91FAE6], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=20, ClientExecuteTime=[11.98], HttpRequestTime=[11.566], HttpClientReceiveResponseTime=[9.793], RequestSigningTime=[0.214], HttpClientSendRequestTime=[0.136], 
2015-09-01 21:17:53,870 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 156B6DC4EE7BABA6), S3 Extended Request ID: F/rPjLYwwXHcxJnpsHwHdUoMQf7diS6r0SV66AvfwQ7mv0z4jigD2RpyXYBTvSvZFODW5E1K8q4=], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[156B6DC4EE7BABA6], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=20, ClientExecuteTime=[11.161], HttpRequestTime=[10.893], HttpClientReceiveResponseTime=[9.311], RequestSigningTime=[0.116], HttpClientSendRequestTime=[0.089], 
2015-09-01 21:17:53,889 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[957AFF2AEC49DB6B], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=20, ClientExecuteTime=[17.906], HttpRequestTime=[15.035], HttpClientReceiveResponseTime=[13.306], RequestSigningTime=[0.151], ResponseProcessingTime=[2.521], HttpClientSendRequestTime=[0.125], 
2015-09-01 21:17:53,912 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[7CAEE08C0A6B3D2B], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=20, ClientExecuteTime=[21.727], HttpRequestTime=[21.166], HttpClientReceiveResponseTime=[19.19], RequestSigningTime=[0.225], ResponseProcessingTime=[0.031], HttpClientSendRequestTime=[0.115], 
2015-09-01 21:17:53,913 INFO  [main] s3n.Jets3tNativeFileSystemStore (Jets3tNativeFileSystemStore.java:storeFile(141)) - s3.putObject foo-bar tmp/test40_20/_SUCCESS 0
2015-09-01 21:17:53,926 INFO  [main] amazonaws.latency (AWSRequestMetricsFullSupport.java:log(203)) - StatusCode=[404], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 2D8B08BCE0E24AE5), S3 Extended Request ID: f4gTZ9I05s5IzQnwvJP7QieN5eaO3SBgez5ZS9R+f70n9WWWFeTpcg7WoHPa5bf/cIB2U6hQueM=], ServiceName=[Amazon S3], AWSErrorCode=[404 Not Found], AWSRequestID=[2D8B08BCE0E24AE5], ServiceEndpoint=[https://foo-bar.s3.amazonaws.com], Exception=1, HttpClientPoolLeasedCount=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=20, ClientExecuteTime=[13.082], HttpRequestTime=[12.543], HttpClientReceiveResponseTime=[10.591], RequestSigningTime=[0.265], HttpClientSendRequestTime=[0.14], 
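
To see how much of the extra time goes into these post-job S3 calls, a log like the one above can be summarized per HTTP status code. The following is a minimal sketch and not part of the original question: the function name `summarize_latency_log` is hypothetical, the sample lines are abbreviated copies of entries from the log above, and `ClientExecuteTime` is assumed to be in milliseconds (which matches the gaps between the timestamps).

```python
import re
from collections import Counter

# Patterns for the two fields of interest in an amazonaws.latency log line.
STATUS_RE = re.compile(r"StatusCode=\[(\d+)\]")
EXEC_TIME_RE = re.compile(r"ClientExecuteTime=\[([\d.]+)\]")


def summarize_latency_log(lines):
    """Count S3 requests per HTTP status and sum their ClientExecuteTime.

    Returns (counts, total_ms), where counts maps a status code string
    to the number of requests and total_ms is the summed execute time
    (assumed to be milliseconds).
    """
    counts = Counter()
    total_ms = 0.0
    for line in lines:
        if "amazonaws.latency" not in line:
            continue  # skip scheduler, listStatus and rename lines
        status = STATUS_RE.search(line)
        exec_time = EXEC_TIME_RE.search(line)
        if status is None or exec_time is None:
            continue  # line lacks one of the two fields
        counts[status.group(1)] += 1
        total_ms += float(exec_time.group(1))
    return counts, total_ms


# Abbreviated sample lines taken from the log above (most fields dropped).
sample = [
    "2015-09-01 21:16:17,830 INFO  [main] amazonaws.latency - StatusCode=[404], ClientExecuteTime=[12.004], ",
    "2015-09-01 21:16:17,911 INFO  [main] amazonaws.latency - StatusCode=[200], ClientExecuteTime=[66.408], ",
    "2015-09-01 21:16:17,912 INFO  [main] s3n.S3NativeFileSystem - rename s3n://foo-bar/tmp/test40_20/...",
]

counts, total_ms = summarize_latency_log(sample)
print(dict(counts), round(total_ms, 3))
```

Run over the full driver log, the per-status counts show how many of the roughly 3400 skipped lines are 404 lookups issued around each `rename`, and the summed time can be compared with the 95 seconds observed.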

Solution

To solve the problem, I added the following settings to mapred-site.xml, as suggested by Neil Jonkers on user@spark.apache.org:

<property>
  <name>mapred.output.direct.EmrFileSystem</name>
  <value>true</value>
</property>
<property>
  <name>mapred.output.direct.NativeS3FileSystem</name>
  <value>true</value>
</property>

This can be done when the cluster is created, by adding the following to the aws command:

classification=mapred-site,properties=[mapred.output.direct.EmrFileSystem=true,mapred.output.direct.NativeS3FileSystem=true]

Alternatively, add the following to the configuration JSON file:

  {
    "Classification": "mapred-site",
    "Properties": {
      "mapred.output.direct.EmrFileSystem": "true",
      "mapred.output.direct.NativeS3FileSystem": "true"
    }
  }
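
The configuration file that `aws emr create-cluster` accepts through `--configurations file://...` is a JSON array of such classification objects, so the object above needs to sit inside a list. A minimal sketch of generating that file content follows; the helper name `build_emr_configurations` is hypothetical, and only the two properties from the answer are included.

```python
import json


def build_emr_configurations():
    """Return the EMR configurations list with the two direct-output settings."""
    return [
        {
            "Classification": "mapred-site",
            "Properties": {
                # Property names and values are taken from the answer above.
                "mapred.output.direct.EmrFileSystem": "true",
                "mapred.output.direct.NativeS3FileSystem": "true",
            },
        }
    ]


# Serialize as a JSON array; save this text to a file and pass it
# to the cluster-creation command.
configurations_json = json.dumps(build_emr_configurations(), indent=2)
print(configurations_json)
```

Keeping the values as the string `"true"` rather than a JSON boolean mirrors the snippet in the answer.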
