NoSuchMethodError Sets.newConcurrentHashSet() 在使用 hadoop 运行 jar 时 [英] NoSuchMethodError Sets.newConcurrentHashSet() while running jar using hadoop

查看:44
本文介绍了NoSuchMethodError Sets.newConcurrentHashSet() 在使用 hadoop 运行 jar 时的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我使用 cassandra-all 2.0.7 api 和 hadoop 2.2.0.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"><modelVersion>4.0.0</modelVersion><groupId>zazzercode</groupId><artifactId>doctorhere-engine-writer</artifactId><version>1.0</version><包装>罐</包装><name>DoctorhereEngineWriter</name><属性><project.build.sourceEncoding>UTF-8</project.build.sourceEncoding><cassandra.version>2.0.7</cassandra.version><hector.version>1.0-2</hector.version><guava.version>15.0</guava.version><hadoop.version>2.2.0</hadoop.version></属性><构建><插件><插件><groupId>org.apache.maven.plugins</groupId><artifactId>maven-compiler-plugin</artifactId><version>2.3.2</version><配置><来源>1.6</来源><目标>1.6</目标></配置></插件><插件><artifactId>maven-assembly-plugin</artifactId><配置><存档><清单><mainClass>zazzercode.DiseaseCountJob</mainClass></清单></归档><descriptorRef>jar-with-dependencies</descriptorRef></descriptorRefs></配置></插件></插件></build><依赖项><依赖><groupId>junit</groupId><artifactId>junit</artifactId><version>3.8.1</version><范围>测试</范围></依赖><依赖><groupId>me.prettyprint</groupId><artifactId>hector-core</artifactId><version>${hector.version}</version><排除事项><排除><artifactId>org.apache.thrift</artifactId><groupId>libthrift</groupId></排除></排除项></依赖><依赖><groupId>org.apache.cassandra</groupId><artifactId>cassandra-all</artifactId><version>${cassandra.version}</version><排除事项><排除><artifactId>libthrift</artifactId><groupId>org.apache.thrift</groupId></排除></排除项></依赖><依赖><groupId>org.apache.cassandra</groupId><artifactId>cassandra-thrift</artifactId><version>${cassandra.version}</version><排除事项><排除><artifactId>libthrift</artifactId><groupId>org.apache.thrift</groupId></排除></排除项></依赖><依赖><groupId>org.apache.hadoop</groupId><artifactId>hadoop-client</artifactId><version>${hadoop.version}</version></依赖><依赖><groupId>org.apache.thrift</groupId><artifactId>libthrift</artifactId><version>0.7.0</version></依赖><依赖><groupId>com.google.guava</groupId><artifactId>番石榴</artifactId><version>${guava.version}</version></依赖><依赖><groupId>com.googlecode.concurrentlinkedhashmap</groupId><artifactId>concurrentlinkedhashmap-lru</artifactId><version>1.3</version></依赖></依赖项></项目>

当我从 hduser 启动 jar(在 mvn assembly:assembly 之后从普通用户 prayagupd 创建)如下时,

hduser@prayagupd$ hadoop jar target/doctorhere-engine-writer-1.0-jar-with-dependencies.jar/user/hduser/shakespeare

我在 cassandra api 上遇到以下番石榴收集错误,

14/11/23 17:51:04 WARN mapred.LocalJobRunner: job_local800673408_0001java.lang.NoSuchMethodError: com.google.common.collect.Sets.newConcurrentHashSet()Ljava/util/Set;在 org.apache.cassandra.config.Config.(Config.java:53)在 org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:105)在 org.apache.cassandra.hadoop.BulkRecordWriter.(BulkRecordWriter.java:105)在 org.apache.cassandra.hadoop.BulkRecordWriter.(BulkRecordWriter.java:90)在 org.apache.cassandra.hadoop.BulkOutputFormat.getRecordWriter(BulkOutputFormat.java:69)在 org.apache.cassandra.hadoop.BulkOutputFormat.getRecordWriter(BulkOutputFormat.java:29)在 org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:558)在 org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:632)在 org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:405)在 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:445)14/11/23 17:51:04 INFO mapreduce.Job:映射 100% 减少 0%

git clone --branch doctor-engine-writer https://github.com/prayagupd/doctorhereCD Doctorhere/doctorhere-engine-writer

参考文献

mapreduce 时的 Hadoop 库冲突

解决方案

您基本上遇到了版本冲突.问题是这样的,

  • hadoop 本地库和 cassandra 都使用谷歌番石榴.
  • 但是您的 hadoop 版本使用的是旧版本的 guava (11.xx),而您的 cassandra 正在更新并使用 guava 16.0.对于企业规模的 hadoop 设置来说,随着每个新版本更新他们的环境并不常见.
  • cassandra 配置加载器使用旧版本中不存在的 newConcurrentHashSet() 方法.
  • hadoop 使用的 jar 总是在任何第三方 jar 之前加载.因此,即使您的具有依赖项"jar 中存在正确版本的番石榴,但旧版本的番石榴 jar 正在从 hadoop 类路径加载并分发到您的映射器/减速器.

解决方案:

  • 在您的 Job 的 run 方法中将配置参数mapreduce.job.user.classpath.first"设置为 true :

    job.getConfiguration().set("mapreduce.job.user.classpath.first", "true");

  • 现在,在您的 bin/hadoop 中,添加语句

    导出 HADOOP_USER_CLASSPATH_FIRST=true这将告诉 hadoop 首先加载用户定义的库.

  • 确保在旧版本之前,您的 hadoop 类路径中存在最新版本的库.

I'm using cassandra-all 2.0.7 api with hadoop 2.2.0.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>zazzercode</groupId>
    <artifactId>doctorhere-engine-writer</artifactId>
    <version>1.0</version>
    <packaging>jar</packaging>

    <name>DoctorhereEngineWriter</name>

    <properties>
      <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
      <cassandra.version>2.0.7</cassandra.version>
      <hector.version>1.0-2</hector.version>
      <guava.version>15.0</guava.version>
      <hadoop.version>2.2.0</hadoop.version>
    </properties>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.3.2</version>
                <configuration>
                    <source>1.6</source>
                    <target>1.6</target>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <mainClass>zazzercode.DiseaseCountJob</mainClass>
                        </manifest>
                    </archive>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
            </plugin>
        </plugins>
    </build>

    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>me.prettyprint</groupId>
            <artifactId>hector-core</artifactId>
            <version>${hector.version}</version>
            <exclusions>
                <exclusion>
                    <artifactId>org.apache.thrift</artifactId>
                    <groupId>libthrift</groupId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.cassandra</groupId>
            <artifactId>cassandra-all</artifactId>
        <version>${cassandra.version}</version>
            <exclusions>
                <exclusion>
                    <artifactId>libthrift</artifactId>
                    <groupId>org.apache.thrift</groupId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.cassandra</groupId>
            <artifactId>cassandra-thrift</artifactId>
        <version>${cassandra.version}</version>
            <exclusions>
                <exclusion>
                    <artifactId>libthrift</artifactId>
                    <groupId>org.apache.thrift</groupId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.thrift</groupId>
            <artifactId>libthrift</artifactId>
            <version>0.7.0</version>
        </dependency>
      <dependency>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
        <version>${guava.version}</version>
      </dependency>

      <dependency>
        <groupId>com.googlecode.concurrentlinkedhashmap</groupId>
        <artifactId>concurrentlinkedhashmap-lru</artifactId>
        <version>1.3</version>
      </dependency>


    </dependencies>
</project>

When I start the jar (created after mvn assembly:assembly from normal user prayagupd) as below from hduser,

hduser@prayagupd$ hadoop jar target/doctorhere-engine-writer-1.0-jar-with-dependencies.jar /user/hduser/shakespeare

I get following guava collection error on cassandra api,

14/11/23 17:51:04 WARN mapred.LocalJobRunner: job_local800673408_0001
java.lang.NoSuchMethodError: com.google.common.collect.Sets.newConcurrentHashSet()Ljava/util/Set;
    at org.apache.cassandra.config.Config.<init>(Config.java:53)
    at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:105)
    at org.apache.cassandra.hadoop.BulkRecordWriter.<init>(BulkRecordWriter.java:105)
    at org.apache.cassandra.hadoop.BulkRecordWriter.<init>(BulkRecordWriter.java:90)
    at org.apache.cassandra.hadoop.BulkOutputFormat.getRecordWriter(BulkOutputFormat.java:69)
    at org.apache.cassandra.hadoop.BulkOutputFormat.getRecordWriter(BulkOutputFormat.java:29)
    at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:558)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:632)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:405)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:445)
14/11/23 17:51:04 INFO mapreduce.Job:  map 100% reduce 0%

Line#53 of cassandra api's Config.java has this code,

public Set<String> hinted_handoff_enabled_by_dc = Sets.newConcurrentHashSet();

Whereas, I find Sets class with the jar itself,

  hduser@prayagupd$ jar tvf target/doctorhere-engine-writer-1.0-jar-with-dependencies.jar | grep com/google/common/collect/Sets
  2358 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$1.class
  2019 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$2.class
  1705 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$3.class
  1327 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$CartesianSet$1.class
  4224 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$CartesianSet.class
  5677 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$DescendingSet.class
  4187 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$FilteredNavigableSet.class
  1567 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$FilteredSet.class
  2614 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$FilteredSortedSet.class
  1174 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$ImprovedAbstractSet.class
  1361 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$PowerSet$1.class
  3727 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$PowerSet.class
  1398 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$SetView.class
  1950 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$SubSet$1.class
  2058 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$SubSet.class
  4159 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets$UnmodifiableNavigableSet.class
 17349 Fri Sep 06 15:52:24 NPT 2013 com/google/common/collect/Sets.class

Also, there exists the method when I checked the jar as below,

hduser@prayagupd$ javap -classpath target/doctorhere-engine-writer-1.0-jar-with-dependencies.jar com.google.common.collect.Sets | grep newConcurrentHashSet
  public static <E extends java/lang/Object> java.util.Set<E> newConcurrentHashSet();
  public static <E extends java/lang/Object> java.util.Set<E> newConcurrentHashSet(java.lang.Iterable<? extends E>);

I see com.google.guava under /META/INF/maven library when I navigate the jar file,

I have following artifacts in ~/.m2 from outside of hdfs user,

$ ll ~/.m2/repository/com/google/guava/guava
total 20
drwxrwxr-x 5 prayagupd prayagupd 4096 Nov 23 20:05 ./
drwxrwxr-x 4 prayagupd prayagupd 4096 Nov 23 20:05 ../
drwxrwxr-x 2 prayagupd prayagupd 4096 Nov 23 20:05 11.0.2/
drwxrwxr-x 2 prayagupd prayagupd 4096 Nov 23 20:06 15.0/
drwxrwxr-x 2 prayagupd prayagupd 4096 Nov 23 20:05 r09/

And hadoop classpath is

$ hadoop classpath
/usr/local/hadoop-2.2.0/etc/hadoop:
/usr/local/hadoop2.2.0/share/hadoop/common/lib/*:
/usr/local/hadoop-2.2.0/share/hadoop/common/*:
/usr/local/hadoop-2.2.0/share/hadoop/hdfs:
/usr/local/hadoop-2.2.0/share/hadoop/hdfs/lib/*:
/usr/local/hadoop-2.2.0/share/hadoop/hdfs/*:
/usr/local/hadoop-2.2.0/share/hadoop/yarn/lib/*:
/usr/local/hadoop-2.2.0/share/hadoop/yarn/*:
/usr/local/hadoop-2.2.0/share/hadoop/mapreduce/lib/*:
/usr/local/hadoop-2.2.0/share/hadoop/mapreduce/*:
/usr/local/hadoop-2.2.0/contrib/capacity-scheduler/*.jar

dependency tree is as below where com.google.guava:guava:jar:r09:compile is used by me.prettyprint:hector-core:jar:1.0-2:compile, whereas guava-11.0.2.jar is used by hadoop-2.2.0 or hadoop-2.6.0 and cassandra-2.0.6 uses guava-15.0..jar

$ find /usr/local/apache-cassandra-2.0.6/ -name "guava*"
/usr/local/apache-cassandra-2.0.6/lib/guava-15.0.jar
/usr/local/apache-cassandra-2.0.6/lib/licenses/guava-15.0.txt


$ mvn dependency:tree
[INFO] Scanning for projects...
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building DoctorhereEngineWriter 1.0
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-dependency-plugin:2.1:tree (default-cli) @ doctorhere-engine-writer ---
[INFO] zazzercode:doctorhere-engine-writer:jar:1.0
[INFO] +- junit:junit:jar:3.8.1:test (scope not updated to compile)
[INFO] +- me.prettyprint:hector-core:jar:1.0-2:compile
[INFO] |  +- commons-lang:commons-lang:jar:2.4:compile
[INFO] |  +- commons-pool:commons-pool:jar:1.5.3:compile
[INFO] |  +- com.google.guava:guava:jar:r09:compile
[INFO] |  +- org.slf4j:slf4j-api:jar:1.6.1:compile
[INFO] |  +- com.github.stephenc.eaio-uuid:uuid:jar:3.2.0:compile
[INFO] |  - com.ecyrd.speed4j:speed4j:jar:0.9:compile
[INFO] +- org.apache.cassandra:cassandra-all:jar:2.0.7:compile
[INFO] |  +- org.xerial.snappy:snappy-java:jar:1.0.5:compile
[INFO] |  +- net.jpountz.lz4:lz4:jar:1.2.0:compile
[INFO] |  +- com.ning:compress-lzf:jar:0.8.4:compile
[INFO] |  +- commons-cli:commons-cli:jar:1.1:compile
[INFO] |  +- commons-codec:commons-codec:jar:1.2:compile
[INFO] |  +- org.apache.commons:commons-lang3:jar:3.1:compile
[INFO] |  +- com.googlecode.concurrentlinkedhashmap:concurrentlinkedhashmap-lru:jar:1.3:compile
[INFO] |  +- org.antlr:antlr:jar:3.2:compile
[INFO] |  |  - org.antlr:antlr-runtime:jar:3.2:compile
[INFO] |  |     - org.antlr:stringtemplate:jar:3.2:compile
[INFO] |  |        - antlr:antlr:jar:2.7.7:compile
[INFO] |  +- org.codehaus.jackson:jackson-core-asl:jar:1.9.2:compile
[INFO] |  +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.2:compile
[INFO] |  +- jline:jline:jar:1.0:compile
[INFO] |  +- com.googlecode.json-simple:json-simple:jar:1.1:compile
[INFO] |  +- com.github.stephenc.high-scale-lib:high-scale-lib:jar:1.1.2:compile
[INFO] |  +- org.yaml:snakeyaml:jar:1.11:compile
[INFO] |  +- edu.stanford.ppl:snaptree:jar:0.1:compile
[INFO] |  +- org.mindrot:jbcrypt:jar:0.3m:compile
[INFO] |  +- com.yammer.metrics:metrics-core:jar:2.2.0:compile
[INFO] |  +- com.addthis.metrics:reporter-config:jar:2.1.0:compile
[INFO] |  |  - org.hibernate:hibernate-validator:jar:4.3.0.Final:compile
[INFO] |  |     +- javax.validation:validation-api:jar:1.0.0.GA:compile
[INFO] |  |     - org.jboss.logging:jboss-logging:jar:3.1.0.CR2:compile
[INFO] |  +- com.thinkaurelius.thrift:thrift-server:jar:0.3.3:compile
[INFO] |  |  - com.lmax:disruptor:jar:3.0.1:compile
[INFO] |  +- net.sf.supercsv:super-csv:jar:2.1.0:compile
[INFO] |  +- log4j:log4j:jar:1.2.16:compile
[INFO] |  +- com.github.stephenc:jamm:jar:0.2.5:compile
[INFO] |  - io.netty:netty:jar:3.6.6.Final:compile
[INFO] +- org.apache.cassandra:cassandra-thrift:jar:2.0.7:compile
[INFO] +- org.apache.hadoop:hadoop-client:jar:2.2.0:compile
[INFO] |  +- org.apache.hadoop:hadoop-common:jar:2.2.0:compile
[INFO] |  |  +- org.apache.commons:commons-math:jar:2.1:compile
[INFO] |  |  +- xmlenc:xmlenc:jar:0.52:compile
[INFO] |  |  +- commons-httpclient:commons-httpclient:jar:3.1:compile
[INFO] |  |  +- commons-io:commons-io:jar:2.1:compile
[INFO] |  |  +- commons-net:commons-net:jar:3.1:compile
[INFO] |  |  +- commons-logging:commons-logging:jar:1.1.1:compile
[INFO] |  |  +- commons-configuration:commons-configuration:jar:1.6:compile
[INFO] |  |  |  +- commons-collections:commons-collections:jar:3.2.1:compile
[INFO] |  |  |  +- commons-digester:commons-digester:jar:1.8:compile
[INFO] |  |  |  |  - commons-beanutils:commons-beanutils:jar:1.7.0:compile
[INFO] |  |  |  - commons-beanutils:commons-beanutils-core:jar:1.8.0:compile
[INFO] |  |  +- org.apache.avro:avro:jar:1.7.4:compile
[INFO] |  |  |  - com.thoughtworks.paranamer:paranamer:jar:2.3:compile
[INFO] |  |  +- com.google.protobuf:protobuf-java:jar:2.5.0:compile
[INFO] |  |  +- org.apache.hadoop:hadoop-auth:jar:2.2.0:compile
[INFO] |  |  +- org.apache.zookeeper:zookeeper:jar:3.4.5:compile
[INFO] |  |  - org.apache.commons:commons-compress:jar:1.4.1:compile
[INFO] |  |     - org.tukaani:xz:jar:1.0:compile
[INFO] |  +- org.apache.hadoop:hadoop-hdfs:jar:2.2.0:compile
[INFO] |  |  - org.mortbay.jetty:jetty-util:jar:6.1.26:compile
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-app:jar:2.2.0:compile
[INFO] |  |  +- org.apache.hadoop:hadoop-mapreduce-client-common:jar:2.2.0:compile
[INFO] |  |  |  +- org.apache.hadoop:hadoop-yarn-client:jar:2.2.0:compile
[INFO] |  |  |  - org.apache.hadoop:hadoop-yarn-server-common:jar:2.2.0:compile
[INFO] |  |  - org.apache.hadoop:hadoop-mapreduce-client-shuffle:jar:2.2.0:compile
[INFO] |  +- org.apache.hadoop:hadoop-yarn-api:jar:2.2.0:compile
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-core:jar:2.2.0:compile
[INFO] |  |  - org.apache.hadoop:hadoop-yarn-common:jar:2.2.0:compile
[INFO] |  +- org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:2.2.0:compile
[INFO] |  - org.apache.hadoop:hadoop-annotations:jar:2.2.0:compile
[INFO] - org.apache.thrift:libthrift:jar:0.7.0:compile
[INFO]    +- javax.servlet:servlet-api:jar:2.5:compile
[INFO]    - org.apache.httpcomponents:httpclient:jar:4.0.1:compile
[INFO]       - org.apache.httpcomponents:httpcore:jar:4.0.1:compile
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 27.124s
[INFO] Finished at: Wed Mar 18 01:39:42 CDT 2015
[INFO] Final Memory: 15M/982M
[INFO] ------------------------------------------------------------------------

Here's hadoop script for hadoop 2.2.0,

$ cat /usr/local/hadoop-2.2.0/bin/hadoop
#!/usr/bin/env bash

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# This script runs the hadoop core commands. 

bin=`which $0`
bin=`dirname ${bin}`
bin=`cd "$bin"; pwd`

DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh

export HADOOP_USER_CLASSPATH_FIRST=true

function print_usage(){
  echo "Usage: hadoop [--config confdir] COMMAND"
  echo "       where COMMAND is one of:"
  echo "  fs                   run a generic filesystem user client"
  echo "  version              print the version"
  echo "  jar <jar>            run a jar file"
  echo "  checknative [-a|-h]  check native hadoop and compression libraries availability"
  echo "  distcp <srcurl> <desturl> copy file or directories recursively"
  echo "  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive"
  echo "  classpath            prints the class path needed to get the"
  echo "                       Hadoop jar and the required libraries"
  echo "  daemonlog            get/set the log level for each daemon"
  echo " or"
  echo "  CLASSNAME            run the class named CLASSNAME"
  echo ""
  echo "Most commands print help when invoked w/o parameters."
}

if [ $# = 0 ]; then
  print_usage
  exit
fi

COMMAND=$1
case $COMMAND in
  # usage flags
  --help|-help|-h)
    print_usage
    exit
    ;;

  #hdfs commands
  namenode|secondarynamenode|datanode|dfs|dfsadmin|fsck|balancer|fetchdt|oiv|dfsgroups|portmap|nfs3)
    echo "DEPRECATED: Use of this script to execute hdfs command is deprecated." 1>&2
    echo "Instead use the hdfs command for it." 1>&2
    echo "" 1>&2
    #try to locate hdfs and if present, delegate to it.  
    shift
    if [ -f "${HADOOP_HDFS_HOME}"/bin/hdfs ]; then
      exec "${HADOOP_HDFS_HOME}"/bin/hdfs ${COMMAND/dfsgroups/groups}  "$@"
    elif [ -f "${HADOOP_PREFIX}"/bin/hdfs ]; then
      exec "${HADOOP_PREFIX}"/bin/hdfs ${COMMAND/dfsgroups/groups} "$@"
    else
      echo "HADOOP_HDFS_HOME not found!"
      exit 1
    fi
    ;;

  #mapred commands for backwards compatibility
  pipes|job|queue|mrgroups|mradmin|jobtracker|tasktracker)
    echo "DEPRECATED: Use of this script to execute mapred command is deprecated." 1>&2
    echo "Instead use the mapred command for it." 1>&2
    echo "" 1>&2
    #try to locate mapred and if present, delegate to it.
    shift
    if [ -f "${HADOOP_MAPRED_HOME}"/bin/mapred ]; then
      exec "${HADOOP_MAPRED_HOME}"/bin/mapred ${COMMAND/mrgroups/groups} "$@"
    elif [ -f "${HADOOP_PREFIX}"/bin/mapred ]; then
      exec "${HADOOP_PREFIX}"/bin/mapred ${COMMAND/mrgroups/groups} "$@"
    else
      echo "HADOOP_MAPRED_HOME not found!"
      exit 1
    fi
    ;;

  classpath)
    echo $CLASSPATH
    exit
    ;;

  #core commands  
  *)
    # the core commands
    if [ "$COMMAND" = "fs" ] ; then
      CLASS=org.apache.hadoop.fs.FsShell
    elif [ "$COMMAND" = "version" ] ; then
      CLASS=org.apache.hadoop.util.VersionInfo
    elif [ "$COMMAND" = "jar" ] ; then
      CLASS=org.apache.hadoop.util.RunJar
    elif [ "$COMMAND" = "checknative" ] ; then
      CLASS=org.apache.hadoop.util.NativeLibraryChecker
    elif [ "$COMMAND" = "distcp" ] ; then
      CLASS=org.apache.hadoop.tools.DistCp
      CLASSPATH=${CLASSPATH}:${TOOL_PATH}
    elif [ "$COMMAND" = "daemonlog" ] ; then
      CLASS=org.apache.hadoop.log.LogLevel
    elif [ "$COMMAND" = "archive" ] ; then
      CLASS=org.apache.hadoop.tools.HadoopArchives
      CLASSPATH=${CLASSPATH}:${TOOL_PATH}
    elif [[ "$COMMAND" = -*  ]] ; then
        # class and package names cannot begin with a -
        echo "Error: No command named `$COMMAND' was found. Perhaps you meant `hadoop ${COMMAND#-}'"
        exit 1
    else
      CLASS=$COMMAND
    fi
    shift

    # Always respect HADOOP_OPTS and HADOOP_CLIENT_OPTS
    HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"

    #make sure security appender is turned off
    HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,NullAppender}"

    export CLASSPATH=$CLASSPATH
    exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
    ;;

esac

How could this google-collection issue be fixed?

Actual code here

git clone --branch doctor-engine-writer https://github.com/prayagupd/doctorhere
cd doctorhere/doctorhere-engine-writer

References

Hadoop library conflict at mapreduce time

解决方案

You are basically running into a version conflict. The problem goes like this,

  • Both hadoop native libraries and cassandra uses google guava.
  • But your hadoop version is using an older version of guava (11.xx) while your cassandra is update and uses guava 16.0. It is not very common for enterprise scale hadoop setups to update their environment with every new release.
  • cassandra config loader uses newConcurrentHashSet() method which is not present in your older version.
  • jars used by hadoop are always loaded before any third party jars. Hence even though a correct version of guava is present in your "with dependencies" jar, an older version of guava jar was being loaded from hadoop classpath and distributed to your mappers/reducers.

Solution:

  • Set the configuration parameter "mapreduce.job.user.classpath.first" to true in the run method of your Job :

    job.getConfiguration().set("mapreduce.job.user.classpath.first", "true");
    

  • Now, in your bin/hadoop , add the statement

    export HADOOP_USER_CLASSPATH_FIRST=true
    
    which will tell hadoop to load user defined libraries first. 
    

  • Make sure the latest version of your library is present in your hadoop classpath before the older one.

这篇关于NoSuchMethodError Sets.newConcurrentHashSet() 在使用 hadoop 运行 jar 时的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆