NoSuchMethodError while running AWS S3 client on Spark while javap shows otherwise


Problem description


I have a runtime problem with a piece of code I'm running on top of Apache Spark. I depend on the AWS SDK to upload files to S3 - and this is erroring out with a NoSuchMethodError. It is worthwhile to note that I'm using an uber jar with the Spark dependencies bundled in. The error when running my code:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.http.impl.conn.DefaultClientConnectionOperator.<init>(Lorg/apache/http/conn/scheme/SchemeRegistry;Lorg/apache/http/conn/DnsResolver;)V
at org.apache.http.impl.conn.PoolingClientConnectionManager.createConnectionOperator(PoolingClientConnectionManager.java:140)
at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:114)
at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:99)
at com.amazonaws.http.ConnectionManagerFactory.createPoolingClientConnManager(ConnectionManagerFactory.java:29)
at com.amazonaws.http.HttpClientFactory.createHttpClient(HttpClientFactory.java:97)
at com.amazonaws.http.AmazonHttpClient.<init>(AmazonHttpClient.java:165)
at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:119)
at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:103)
at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:357)
at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:339)

However, when I inspect the jar for the method signature, I see it clearly listed:

vagrant@mesos:~/installs/spark-1.0.1-bin-hadoop2$ javap -classpath /tmp/rickshaw-spark-0.0.1-SNAPSHOT.jar org.apache.http.impl.conn.DefaultClientConnectionOperator
Compiled from "DefaultClientConnectionOperator.java"
public class org.apache.http.impl.conn.DefaultClientConnectionOperator implements org.apache.http.conn.ClientConnectionOperator {
protected final org.apache.http.conn.scheme.SchemeRegistry schemeRegistry;
protected final org.apache.http.conn.DnsResolver dnsResolver;
public org.apache.http.impl.conn.DefaultClientConnectionOperator(org.apache.http.conn.scheme.SchemeRegistry);
public org.apache.http.impl.conn.DefaultClientConnectionOperator(org.apache.http.conn.scheme.SchemeRegistry, org.apache.http.conn.DnsResolver); <-- it exists!
public org.apache.http.conn.OperatedClientConnection createConnection();
public void openConnection(org.apache.http.conn.OperatedClientConnection, org.apache.http.HttpHost, java.net.InetAddress, org.apache.http.protocol.HttpContext, org.apache.http.params.HttpParams) throws java.io.IOException;
public void updateSecureConnection(org.apache.http.conn.OperatedClientConnection, org.apache.http.HttpHost, org.apache.http.protocol.HttpContext, org.apache.http.params.HttpParams) throws java.io.IOException;
protected void prepareSocket(java.net.Socket, org.apache.http.protocol.HttpContext, org.apache.http.params.HttpParams) throws java.io.IOException;
protected java.net.InetAddress[] resolveHostname(java.lang.String) throws java.net.UnknownHostException;

}

I checked some of the other jars in the Spark distribution - they don't seem to have this particular method signature. So I'm left wondering what the Spark runtime is picking up that causes this issue. The jar is built from a Maven project where I lined up the dependencies to ensure the correct AWS Java SDK dependency was being picked up as well.
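One way to see what the runtime actually resolves (a diagnostic sketch, not part of the original question) is to ask for the exact constructor via reflection. The real probe would use the httpclient class names from the stack trace and has to run inside the Spark driver/executor JVM; the self-contained example below exercises the same check against stdlib classes:

```java
public class ConstructorCheck {
    // Returns true only if a constructor with exactly these parameter types
    // is visible on this JVM's classpath -- the same lookup the failing
    // call site performs when it is linked at runtime.
    static boolean hasConstructor(String className, Class<?>... paramTypes) {
        try {
            Class.forName(className).getConstructor(paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // On the cluster you would probe the signature from the stack trace:
        // hasConstructor("org.apache.http.impl.conn.DefaultClientConnectionOperator",
        //                SchemeRegistry.class, DnsResolver.class);
        System.out.println(hasConstructor("java.lang.String", String.class)); // true
        System.out.println(hasConstructor("java.lang.String", Thread.class)); // false
    }
}
```

If this prints false inside the Spark JVM while javap on your uber jar shows the constructor, an older copy of the class earlier on the classpath is shadowing yours.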

Solution

The Spark 1.0.x distribution already bundles an incompatible version of DefaultClientConnectionOperator, and there is no easy way to replace it.

The only workaround I've found is to include a custom implementation of PoolingClientConnectionManager that avoids the missing constructor.

Replacing:

return new DefaultClientConnectionOperator(schreg, this.dnsResolver);

with:

return new DefaultClientConnectionOperator(schreg);

You also need to make sure your custom class is the one that ends up in the uber jar, e.g. with this merge strategy in your sbt-assembly configuration:

case PathList("org", "apache", "http", "impl", xs @ _*) => MergeStrategy.first

Custom PoolingClientConnectionManager: https://gist.github.com/felixgborrego/568f3460d82d9c12e23c
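After rebuilding the assembly, a quick sanity check (a hypothetical stdlib-only helper, not part of the original answer) is to ask the JVM where it actually loaded a class from - that reveals which jar won the classpath race. On the cluster you would pass the httpclient class name; the runnable sketch below probes a stdlib class:

```java
public class WhichJar {
    // Returns the URL of the .class resource the JVM resolved for this
    // class name, i.e. the jar (or JDK image) the class was loaded from.
    static String locationOf(String className) throws ClassNotFoundException {
        Class<?> cls = Class.forName(className);
        java.net.URL url = cls.getResource('/' + className.replace('.', '/') + ".class");
        return url == null ? "(bootstrap / unknown)" : url.toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // On the cluster:
        // locationOf("org.apache.http.impl.conn.DefaultClientConnectionOperator")
        System.out.println(locationOf("java.util.ArrayList"));
    }
}
```

If the printed location is a Spark distribution jar rather than your uber jar, the merge strategy above has not taken effect.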
