Structured Streaming in IntelliJ not showing DataFrame to console


Problem description

I'm trying to load a Spark streaming DataFrame using Structured Streaming and cannot see any output in the console when running from IntelliJ IDEA.

My code:

import org.apache.spark.sql._

object SparkConsumerTest {

  def main(args: Array[String]): Unit = {

    System.setProperty("hadoop.home.dir", "C:\\hadoop\\")

    val spark = SparkSession
      .builder
      .appName("test_local")
      .config("spark.master", "local")
      .getOrCreate()

    val data_stream = spark.readStream.text("src/main/resources/data_string.txt")

    val result = data_stream.writeStream.format("console").start()
  }
}

Contents of my data_string.txt file:

structured streaming

Here is the console/run window in IntelliJ IDEA after I run the application:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/09/07 19:03:33 INFO SparkContext: Running Spark version 2.1.0
18/09/07 19:03:33 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/09/07 19:03:33 INFO SecurityManager: Changing view acls to: userID
18/09/07 19:03:33 INFO SecurityManager: Changing modify acls to: userID
18/09/07 19:03:33 INFO SecurityManager: Changing view acls groups to:
18/09/07 19:03:33 INFO SecurityManager: Changing modify acls groups to:
18/09/07 19:03:33 INFO SecurityManager: SecurityManager: authentication  disabled; ui acls disabled; users  with view permissions: Set(userID); groups with view permissions: Set(); users  with modify permissions: Set(userID); groups with modify permissions: Set()
18/09/07 19:03:34 INFO Utils: Successfully started service 'sparkDriver' on port 60845.
18/09/07 19:03:34 INFO SparkEnv: Registering MapOutputTracker
18/09/07 19:03:34 INFO SparkEnv: Registering BlockManagerMaster
18/09/07 19:03:34 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/09/07 19:03:34 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/09/07 19:03:34 INFO DiskBlockManager: Created local directory at C:\Users\userID\AppData\Local\Temp\etc...
18/09/07 19:03:34 INFO MemoryStore: MemoryStore started with capacity 1983.3 MB
18/09/07 19:03:34 INFO SparkEnv: Registering OutputCommitCoordinator
18/09/07 19:03:35 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/09/07 19:03:35 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at "http address"
18/09/07 19:03:35 INFO Executor: Starting executor ID driver on host localhost
18/09/07 19:03:35 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 60855.
18/09/07 19:03:35 INFO NettyBlockTransferService: Server created on "server address"
18/09/07 19:03:35 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
18/09/07 19:03:35 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, server address, 60855, None)
18/09/07 19:03:35 INFO BlockManagerMasterEndpoint: Registering block manager server address with 1983.3 MB RAM, BlockManagerId(driver, server address, 60855, None)
18/09/07 19:03:35 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, server address, 60855, None)
18/09/07 19:03:35 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, server address, 60855, None)
18/09/07 19:03:35 INFO SharedState: Warehouse path is 'file:/C:/Users/userid/Documents//SparkTestLocal/spark-warehouse/'.

Process finished with exit code 0

Recommended answer

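This is because you haven't called awaitTermination. start() launches the query in the background and returns immediately, so main reaches its end and the JVM exits before any batch is written to the console, which is why the run ends with "Process finished with exit code 0". You need to add the following,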

result.awaitTermination()

right after the line where you start the query,

val result = data_stream.writeStream.format("console").start()
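
Putting the two lines together, the program from the question would look roughly like the sketch below. It assumes the same Spark 2.1.0 setup, input path, and Windows hadoop.home.dir workaround as the question; the only functional change is the awaitTermination() call, and the explicit SparkSession import simply replaces the wildcard one.

```scala
import org.apache.spark.sql.SparkSession

object SparkConsumerTest {

  def main(args: Array[String]): Unit = {

    // Windows-only setting carried over from the question; adjust to your machine.
    System.setProperty("hadoop.home.dir", "C:\\hadoop\\")

    val spark = SparkSession
      .builder
      .appName("test_local")
      .config("spark.master", "local")
      .getOrCreate()

    // Same input path as in the question.
    val data_stream = spark.readStream.text("src/main/resources/data_string.txt")

    // start() returns a StreamingQuery right away; the query runs in the background.
    val result = data_stream.writeStream.format("console").start()

    // Block the main thread until the query is stopped or fails,
    // so the JVM stays alive long enough for batches to be printed.
    result.awaitTermination()
  }
}
```

With this change the application keeps running until you stop it from IntelliJ. If you would rather have it exit on its own, StreamingQuery also offers awaitTermination(timeoutMs), which waits at most the given number of milliseconds and returns whether the query has terminated.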

Hope this helps.
