How to create Spark/Scala project in IntelliJ IDEA (fails to resolve dependencies in build.sbt)?


Question

I'm trying to build and run a Scala/Spark project in IntelliJ IDEA.

I have added org.apache.spark:spark-sql_2.11:2.0.0 in global libraries, and my build.sbt looks like below.

name := "test"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.0.0"
libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "2.0.0"

I still get the error message

unknown artifact. unable to resolve or indexed

for spark-sql.

When I tried to build the project, the error was

Error:(19, 26) not found: type sqlContext, val sqlContext = new sqlContext(sc)

I have no idea what the problem could be. How do I create a Spark/Scala project in IntelliJ IDEA?

Update

Following the suggestions, I updated the code to use SparkSession, but it is still unable to read a csv file. What am I doing wrong here? Thank you!

val spark = SparkSession
  .builder()
  .appName("Spark example")
  .config("spark.some.config.option", "some value")
  .getOrCreate()

import spark.implicits._

val testdf = spark.read.csv("/Users/H/Desktop/S_CR_IP_H.dat")
testdf.show()  // it doesn't show anything
//pdf.select("DATE_KEY").show()


Answer

sql should be in upper case letters, as below:

val sqlContext = new SQLContext(sc)
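For completeness, the class also needs to be imported; a minimal sketch, assuming sc is an existing SparkContext:

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)  // sc: an existing SparkContext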

SQLContext is deprecated in newer versions of Spark, so I would suggest you use SparkSession instead:

val spark = SparkSession.builder().appName("testings").getOrCreate 
val sqlContext = spark.sqlContext

If you want to set the master through your code instead of from the spark-submit command, then you can set .master as well (and other configs too):

val spark = SparkSession.builder().appName("testings").master("local").config("configuration key", "configuration value").getOrCreate 
val sqlContext = spark.sqlContext

Update

Looking at your sample data

DATE|PID|TYPE
8/03/2017|10199786|O

and testing your code

val testdf = spark.read.csv("/Users/H/Desktop/S_CR_IP_H.dat")
testdf.show()

my output was

+--------------------+
|                 _c0|
+--------------------+
|       DATE|PID|TYPE|
|8/03/2017|10199786|O|
+--------------------+

Now, adding .option for delimiter and header as

val testdf2 = spark.read.option("delimiter", "|").option("header", true).csv("/Users/H/Desktop/S_CR_IP_H.dat")
testdf2.show()

the output is

+---------+--------+----+
|     DATE|     PID|TYPE|
+---------+--------+----+
|8/03/2017|10199786|   O|
+---------+--------+----+

Note: I have used .master("local") to get the SparkSession object.
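As a follow-up to the commented-out column selection in the question: once the header is read, columns can be selected by name. A minimal sketch, assuming the column names from the sample data (DATE, PID, TYPE); the inferSchema option is optional and just lets Spark guess the column types:

val testdf2 = spark.read
  .option("delimiter", "|")
  .option("header", true)
  .option("inferSchema", true)  // optional: infer column types instead of reading everything as strings
  .csv("/Users/H/Desktop/S_CR_IP_H.dat")

testdf2.select("DATE").show()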
