Spark SQL "No input paths specified in job" when creating a DataFrame from a JSON file
Question
I am a beginner in Spark, and I am trying to create a DataFrame from the contents of a JSON file using PySpark by following this guide: http://spark.apache.org/docs/1.6.1/sql-programming-guide.html#overview
However, whenever I execute this command (with either a relative or an absolute path):
df = sqlContext.read.json("examples/src/main/resources/people.json")
it always gives me this error:
java.io.IOException: No input paths specified in job
What is the cause of this issue, or is there a Spark configuration that I have missed? I am using Spark 1.6.1 and Python 2.7.6.
Recommended answer
I ran into this problem too, and adding "file://" or "hdfs://" to the path worked for me. Thanks to Jessika for the answer!
In conclusion, if your JSON file is on your local file system, use:
df = sqlContext.read.json("file:///user/ABC/examples/src/main/resources/people.json")
Otherwise, if your JSON file is in HDFS, use:
df = sqlContext.read.json("hdfs://ip:port/user/ABC/examples/src/main/resources/people.json")
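A likely explanation, offered as an assumption rather than something confirmed in this thread, is that a path without a scheme is resolved against the cluster's default file system or the current working directory, so Spark finds no matching file. The sketch below builds an explicit "file://" URI from a local path before handing it to Spark. The helper name to_file_uri is hypothetical, and the code assumes a POSIX-style file system where absolute paths start with "/".

```python
import os


def to_file_uri(path):
    """Return an explicit file:// URI for a local path (hypothetical helper)."""
    # Resolve relative paths against the current working directory.
    absolute = os.path.abspath(path)
    # Fail early with a clear message instead of Spark's less direct error.
    if not os.path.exists(absolute):
        raise IOError("No such local file: " + absolute)
    # On POSIX the absolute path starts with "/", giving "file:///...".
    return "file://" + absolute


# Usage with the sqlContext from the question (not executed here):
# df = sqlContext.read.json(to_file_uri("examples/src/main/resources/people.json"))
```

Checking that the file exists first also helps tell a wrong working directory apart from a path that Spark is resolving on a different file system.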