DataFrame object is not showing any data


Problem description


I was trying to create a DataFrame from an HDFS file using the spark-csv library, as shown in this tutorial.

But when I tried to get the count of the DataFrame, it came back as 0.

Here is what my file looks like:

employee.csv:

empid,empname
1000,Tom
2000,Jerry

I loaded the above file using:

val empDf = sqlContext.read.format("com.databricks.spark.csv").option("header","true").option("delimiter",",").load("hdfs:///user/.../employee.csv");

When I call empDf.printSchema(), it gives the proper schema, with empid and empname as string fields, so I can see that the header and delimiter were read properly.

But when I try to display the DataFrame, empDf.show prints only the column header with no data, and empDf.count returns 0 records.

Please correct me if I missed a required step here.

Solution

Make sure the Scala version of the spark-csv artifact matches the Scala version your Spark distribution was built with.

For example, if your Spark distro is built with Scala 2.10 (the default Scala version for Databricks prebuilt Spark distros), you need spark-csv_2.10. The spark-csv_2.11 artifact (shown in the mentioned tutorial) will not work: it returns an empty DataFrame with only the column names. See my answer to this SO question for a similar case.
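
One way to check which artifact you need is to ask the running shell for its Scala version and derive the matching suffix from it. The following is only a minimal sketch: the object and helper names are hypothetical, and the release number 1.5.0 is an assumption used for illustration, so substitute whichever spark-csv release you actually use.

```scala
object SparkCsvCoordinate {
  // Reduce a full Scala version such as "2.10.5" to its binary version "2.10".
  def binaryVersion(fullVersion: String): String =
    fullVersion.split('.').take(2).mkString(".")

  // Build the Maven coordinate of the spark-csv artifact for that Scala version.
  // The default release "1.5.0" is an assumption; use the release you need.
  def csvArtifactFor(fullVersion: String, release: String = "1.5.0"): String =
    s"com.databricks:spark-csv_${binaryVersion(fullVersion)}:$release"

  def main(args: Array[String]): Unit = {
    // Evaluated inside spark-shell, this reports the Scala version on Spark's classpath.
    val running = scala.util.Properties.versionNumberString
    println(s"Scala version: $running")
    println(s"spark-shell --packages ${csvArtifactFor(running)}")
  }
}
```

Pasting the two helper definitions into spark-shell and calling csvArtifactFor(scala.util.Properties.versionNumberString) should give the coordinate to pass to --packages when you restart the shell. If the suffix differs from the one you loaded before, that mismatch is the likely cause of the empty DataFrame.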
