Spark 2.0 CSV Error
Question

I am upgrading from Spark 1.6 to Spark 2 and am having an issue reading CSV files. In Spark 1.6 I would use something like this to read a CSV file:
val df = sqlContext.read.format("com.databricks.spark.csv")
  .option("header", "true")
  .load(fileName)
Now I use the following code, as given in the documentation:
val df = spark.read
  .option("header", "true")
  .csv(fileName)
This results in the following error at runtime:
Exception in thread "main" java.lang.RuntimeException: Multiple sources found for csv (org.apache.spark.sql.execution.datasources.csv.CSVFileFormat, com.databricks.spark.csv.DefaultSource15), please specify the fully qualified class name.
I assume this is because I still had the spark-csv dependency; however, I removed that dependency and rebuilt the application, and I still get the same error. How is the databricks dependency still being found once I have removed it?
Answer
The error message means you are passing the --packages com.databricks:spark-csv_2.11:1.5.0 option when you run spark-shell, or have those jars on your class path. Please check your class path and remove them.
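If the application is built with sbt, the leftover jar usually comes from a dependency line like the one below in build.sbt (the exact coordinates are an assumption based on the --packages option above). A minimal sketch of the fix, assuming an sbt build:

```scala
// build.sbt -- delete or comment out this line; Spark 2 ships its own
// built-in CSV data source, so the external package is no longer needed
// libraryDependencies += "com.databricks" %% "spark-csv" % "1.5.0"
```

After removing the line, run `sbt clean` before rebuilding so no stale copy of the jar survives in the build output.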
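As the exception text suggests, an alternative to cleaning the class path is to disambiguate the data source by its fully qualified class name. A minimal sketch, assuming a SparkSession named spark and Spark 2's built-in CSV reader:

```scala
// Explicitly name the built-in CSV source so Spark does not have to
// choose between it and the shadowing com.databricks.spark.csv jar
val df = spark.read
  .option("header", "true")
  .format("org.apache.spark.sql.execution.datasources.csv.CSVFileFormat")
  .load(fileName)
```

This is a workaround rather than a fix; removing the stale spark-csv jar from the class path remains the cleaner solution.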