Spark + Cassandra on EMR LinkageError

Problem Description

I have Spark 1.6 deployed on EMR 4.4.0. I am connecting to DataStax Cassandra 2.2.5 deployed on EC2.

Saving data into Cassandra works with spark-cassandra-connector 1.4.2-s_2.10 (since it bundles Guava 14); however, reading data from Cassandra fails with the 1.4.2 version of the connector.
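For context, the write and read are plain connector calls; a minimal sketch of what I run in the shell, where the keyspace and table names are placeholders rather than my real schema:

import com.datastax.spark.connector._

// Writing succeeds with connector 1.4.2 (placeholder keyspace/table names):
sc.parallelize(Seq((1, "a"), (2, "b")))
  .saveToCassandra("test_ks", "kv", SomeColumns("key", "value"))

// Reading back through the same connector is what fails with the Guava mismatch:
val rows = sc.cassandraTable("test_ks", "kv")
rows.first()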

The recommended version combination is 1.5.x, so I started using 1.5.0. First I ran into the Guava problem, which I worked around with the userClassPathFirst settings:

spark-shell --conf spark.yarn.executor.memoryOverhead=2048 \
  --packages datastax:spark-cassandra-connector:1.5.0-s_2.10 \
  --conf spark.cassandra.connection.host=10.236.250.96 \
  --conf spark.executor.extraClassPath=/home/hadoop/lib/guava-16.0.1.jar:/etc/hadoop/conf:/etc/hive/conf:/usr/lib/hadoop-lzo/lib/*:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/* \
  --conf spark.driver.extraClassPath=/home/hadoop/lib/guava-16.0.1.jar:/etc/hadoop/conf:/etc/hive/conf:/usr/lib/hadoop-lzo/lib/*:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/* \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true

Now I get past the Guava 16 error; however, because I am using userClassPathFirst, I am facing another conflict, and I cannot find a way to resolve it.

Lost task 2.1 in stage 2.0 (TID 6, ip-10-187-78-197.ec2.internal): java.lang.LinkageError: 
loader constraint violation: loader (instance of org/apache/spark/util/ChildFirstURLClassLoader) previously initiated loading for a different type with name "org/slf4j/Logger"

I run into the same trouble when I repeat the steps using Java code instead of spark-shell (a rough sketch of that setup follows below). Is there a way to get past this, or a cleaner approach altogether?
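A rough sketch of the programmatic setup (Scala shown, since that is what spark-shell runs; my Java code sets the same properties, and the app name here is just a placeholder):

import org.apache.spark.{SparkConf, SparkContext}

// Same classpath entries, Guava jar path, and Cassandra host as in the
// spark-shell invocation above, carried over unchanged.
val emrClasspath = "/home/hadoop/lib/guava-16.0.1.jar:/etc/hadoop/conf:/etc/hive/conf:" +
  "/usr/lib/hadoop-lzo/lib/*:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:" +
  "/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*"

val conf = new SparkConf()
  .setAppName("cassandra-read")  // placeholder name
  .set("spark.yarn.executor.memoryOverhead", "2048")
  .set("spark.cassandra.connection.host", "10.236.250.96")
  .set("spark.executor.extraClassPath", emrClasspath)
  .set("spark.driver.extraClassPath", emrClasspath)
  .set("spark.driver.userClassPathFirst", "true")
  .set("spark.executor.userClassPathFirst", "true")

val sc = new SparkContext(conf)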

Thanks!

Solution

I got the same error when using the 'userClassPathFirst' flag.

Remove these two flags from the configuration, and just use the 'extraClassPath' parameter.

Detailed answer here: http://stackoverflow.com/a/40235289/3487888
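Concretely, that means keeping the Guava 16 jar first on both the driver and executor classpaths via extraClassPath and dropping the two userClassPathFirst flags; a sketch of the adjusted invocation, reusing the paths and host from the question:

spark-shell --conf spark.yarn.executor.memoryOverhead=2048 \
  --packages datastax:spark-cassandra-connector:1.5.0-s_2.10 \
  --conf spark.cassandra.connection.host=10.236.250.96 \
  --conf spark.executor.extraClassPath=/home/hadoop/lib/guava-16.0.1.jar:/etc/hadoop/conf:/etc/hive/conf:/usr/lib/hadoop-lzo/lib/*:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/* \
  --conf spark.driver.extraClassPath=/home/hadoop/lib/guava-16.0.1.jar:/etc/hadoop/conf:/etc/hive/conf:/usr/lib/hadoop-lzo/lib/*:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*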
