avro error on AWS EMR

Question

I'm using spark-redshift (https://github.com/databricks/spark-redshift), which uses avro for transfer. Reading from Redshift is OK, but while writing I'm getting:

Caused by: java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter

Tried using Amazon EMR 4.1.0 (Spark 1.5.0) and 4.0.0 (Spark 1.4.1). Cannot do
import org.apache.avro.generic.GenericData.createDatumWriter
either, just

import org.apache.avro.generic.GenericData
I'm using the scala shell. Tried downloading several other avro-mapred and avro jars, tried setting

{"classification":"mapred-site","properties":{"mapreduce.job.user.classpath.first":"true"}},{"classification":"spark-env","properties":{"spark.executor.userClassPathFirst":"true","spark.driver.userClassPathFirst":"true"}}

and adding those jars to the spark classpath. Possibly need to tune Hadoop (EMR) somehow. Does this ring a bell to anyone?
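For context, EMR applies classifications like the two above at cluster creation, where they must sit inside a JSON array. A minimal sketch that wraps the fragment in a file and sanity-checks it; the file name configurations.json is an assumption, not from the original post:

```shell
# Wrap the two classification objects in a JSON array, as the EMR
# --configurations option expects, and sanity-check the result.
# (The file name "configurations.json" is an assumption.)
cat > configurations.json <<'EOF'
[
  {"classification":"mapred-site",
   "properties":{"mapreduce.job.user.classpath.first":"true"}},
  {"classification":"spark-env",
   "properties":{"spark.executor.userClassPathFirst":"true",
                 "spark.driver.userClassPathFirst":"true"}}
]
EOF
python3 -m json.tool configurations.json > /dev/null && echo "configurations.json is valid JSON"
```

The validated file could then be handed to the cluster at launch, e.g. aws emr create-cluster --configurations file://configurations.json (remaining launch options omitted).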
Answer

Just for reference, a workaround by Alex Nastetsky.

Delete the jars from the master node:

find / -name "*avro*jar" 2> /dev/null -print0 | xargs -0 -I file sudo rm file
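Since that pipeline removes every matching jar on the whole node, a rehearsal in a scratch directory is a cheap way to see what it does first. A sketch with made-up file names, which also shows why -print0 and xargs -0 are used, since they delimit names with NUL so a file name containing a space survives the pipe intact:

```shell
# Rehearse the delete pipeline against throwaway files instead of /.
# "avro mapred.jar" (a name with a space) is still removed as one file
# thanks to -print0 / xargs -0.
tmp=$(mktemp -d)
touch "$tmp/avro-1.7.4.jar" "$tmp/avro mapred.jar" "$tmp/keep.txt"
find "$tmp" -name "*avro*jar" 2> /dev/null -print0 | xargs -0 -I file rm file
ls "$tmp"    # only keep.txt remains
```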
Delete the jars from the slave nodes:

yarn node -list | sed 's/ .*//g' | tail -n +3 | sed 's/:.*//g' | xargs -I node ssh node "find / -name \"*avro*jar\" 2> /dev/null -print0 | xargs -0 -I file sudo rm file"

Setting the configs correctly, as proposed by Jonathan, is worth a shot too.
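The front half of the slave-node one-liner just turns yarn node -list output into bare hostnames for ssh: sed 's/ .*//g' keeps only the first column, tail -n +3 drops the two header lines, and sed 's/:.*//g' strips the port from each Node-Id. A self-contained sketch with fabricated sample output (the node IDs are made up, and the sample only mimics the real layout of a totals line, a header line, then one row per node):

```shell
# Simulate extracting hostnames from `yarn node -list` output.
sample='Total Nodes:2
 Node-Id Node-State Node-Http-Address Number-of-Running-Containers
ip-10-0-0-1:8041 RUNNING ip-10-0-0-1:8042 0
ip-10-0-0-2:8041 RUNNING ip-10-0-0-2:8042 0'
echo "$sample" | sed 's/ .*//g' | tail -n +3 | sed 's/:.*//g'
# prints:
# ip-10-0-0-1
# ip-10-0-0-2
```

Each surviving line is then fed to ssh by xargs -I node, which runs the same find-and-delete command on that host.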