Adding postgresql jar through spark-submit on Amazon EMR
Problem description
I've tried spark-submit with --driver-class-path and with --jars, and I also tried this method: https://petz2000.wordpress.com/2015/08/18/get-blas-working-with-spark-on-amazon-emr/
Using SPARK_CLASSPATH on the command line, as in
SPARK_CLASSPATH=/home/hadoop/pg_jars/postgresql-9.4.1208.jre7.jar pyspark
I get this error:
Found both spark.executor.extraClassPath and SPARK_CLASSPATH. Use only the former.
But I'm not able to add the jar that way. How do I add the postgresql JDBC jar file so I can use it from pyspark? I'm using EMR release 4.2.
Thanks
Recommended answer
1) Unset the environment variable (SPARK_CLASSPATH is deprecated and conflicts with spark.executor.extraClassPath, which EMR sets for you):
unset SPARK_CLASSPATH
2) Use the --jars option to distribute the postgres driver over your cluster:
pyspark --jars=/home/hadoop/pg_jars/postgresql-9.4.1208.jre7.jar
//or
spark-submit --jars=/home/hadoop/pg_jars/postgresql-9.4.1208.jre7.jar <your py script or app jar>
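Once pyspark starts with the driver jar on the classpath, you can read from Postgres through the JDBC data source. A minimal sketch is below; the hostname, database, table, and credentials are placeholders you would replace with your own, and the `spark.read.jdbc` call is commented out since it needs a running cluster and a reachable database:

```python
# Hypothetical connection details -- substitute your own host, db, and credentials.
jdbc_url = "jdbc:postgresql://mydb.example.com:5432/mydb"
connection_props = {
    "user": "myuser",
    "password": "mypassword",
    # This class is provided by the postgresql jar passed via --jars;
    # without the jar on the classpath, Spark raises ClassNotFoundException.
    "driver": "org.postgresql.Driver",
}

# Inside a pyspark session started as shown above:
# df = spark.read.jdbc(url=jdbc_url, table="my_table", properties=connection_props)
# df.show()
```

The key point is that `--jars` ships the jar to both the driver and the executors, so the `org.postgresql.Driver` class is resolvable everywhere the read runs.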