Read CSV as a DataFrame in Spark 1.6

Problem description

I am using Spark 1.6 and trying to read a CSV (or TSV) file as a DataFrame. Here are the steps I take in the Spark shell:

scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
scala> import sqlContext.implicits._
scala> val df = sqlContext.read
scala> .format("com.databricks.spark.csv")
scala> .option("header", "true")
scala> .option("inferSchema", "true")
scala> .load("data.csv")
scala> df.show()

Error:

<console>:35: error: value show is not a member of org.apache.spark.sql.DataFrameReader
       df.show()

The last command is supposed to show the first few rows of the DataFrame, but I get the error message above instead. Any help would be much appreciated.

Recommended answer

It looks like your function calls are not chained together properly. The line "val df = sqlContext.read" is a complete statement on its own, so the shell evaluates it immediately and df ends up as a reference to a DataFrameReader, not a DataFrame. Calling "show()" on it then fails, because DataFrameReader has no such method. I can reproduce your error by running the following:

val df = sqlContext.read
df.show()
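
To make the type mismatch concrete without needing a Spark installation, here is a minimal sketch in plain Scala. The classes Reader and Frame are hypothetical stand-ins for Spark's DataFrameReader and DataFrame, invented purely for illustration; they are not Spark's API and do not read any file.

```scala
// Hypothetical stand-in for a DataFrame: the only type that has show().
final case class Frame(rows: Seq[String]) {
  def show(): Unit = rows.foreach(println)
}

// Hypothetical stand-in for a DataFrameReader: a builder with no show().
final case class Reader(options: Map[String, String] = Map.empty) {
  def option(key: String, value: String): Reader =
    copy(options = options + (key -> value))

  // Only load() turns the builder into a Frame.
  def load(path: String): Frame =
    Frame(Seq(s"loaded $path with options $options"))
}

object ChainingDemo {
  def main(args: Array[String]): Unit = {
    // Like `val df = sqlContext.read`: the val is bound to the builder itself.
    val reader: Reader = Reader()
    // reader.show()  // would not compile: value show is not a member of Reader

    // The whole chain as one expression: the val is bound to the result of load().
    val frame: Frame = reader.option("header", "true").load("data.csv")
    frame.show()
  }
}
```

The point of the sketch is only that the type of the val depends on where the expression ends: stop before load() and you hold the builder, finish the chain and you hold the frame.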

If you restructure the code so that the whole chain is a single expression, it works:

val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load("data.csv")
df.show()
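
If the one-line form is hard to read, one possible alternative is to end each line with the dot instead of starting the next line with it. A trailing dot leaves the expression incomplete, so the shell keeps reading until the chain is finished. The sketch below also covers the TSV case from the question by setting spark-csv's "delimiter" option. It assumes you are inside spark-shell for Spark 1.6 with the spark-csv package on the classpath, that sqlContext is the SQLContext the shell provides, and that "data.tsv" is a placeholder file name.

```scala
// Assumptions: run inside spark-shell (Spark 1.6) with spark-csv available;
// `sqlContext` comes from the shell; "data.tsv" is a placeholder path.
// Each line ends with a dot, so the shell treats the chain as one expression.
// The "delimiter" option switches the separator from the default comma to a tab.
val tsvDf = sqlContext.read.
  format("com.databricks.spark.csv").
  option("header", "true").
  option("inferSchema", "true").
  option("delimiter", "\t").
  load("data.tsv")

tsvDf.show()
```

Another option in the shell is to enter :paste mode, paste the multi-line chain written with leading dots, and finish with Ctrl-D, so the block is compiled as a whole rather than line by line.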
