How to redirect Scala Spark Dataset.show to a log4j logger

Question

The Spark API docs show how to pretty-print a snippet of a Dataset or DataFrame to stdout.

Can this output be directed to a log4j logger? Alternatively, can someone share code that produces output formatted like df.show()?

Is there a way to do this that still lets stdout go to the console both before and after the .show() output is pushed to the logger?

http://spark.apache.org/docs/latest/sql-programming-guide.htm

val df = spark.read.json("examples/src/main/resources/people.json")

// Displays the content of the DataFrame to stdout
df.show()
// +----+-------+
// | age|   name|
// +----+-------+
// |null|Michael|
// |  30|   Andy|
// |  19| Justin|
// +----+-------+

Recommended answer

The showString() function in teserecter's answer comes from the Spark source code (Dataset.scala).

You can't call that function from your own code because it is package-private. You can, however, place the following snippet in a file named DatasetShims.scala in your source tree and mix the trait into your classes to get access to it. This works because the file declares itself part of the org.apache.spark.sql package, which is the scope showString() is visible in.

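// Must live in this package: showString() is private to org.apache.spark.sql.
package org.apache.spark.sql

trait DatasetShims {
  // Adds toShowString to any Dataset (and therefore any DataFrame) in scope.
  implicit class DatasetHelper[T](ds: Dataset[T]) {
    // Returns the same table that show() prints, as a String.
    // The leading newline keeps the table header off the log prefix line.
    def toShowString(numRows: Int = 20, truncate: Int = 20, vertical: Boolean = false): String =
      "\n" + ds.showString(numRows, truncate, vertical)
  }
}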
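
As a rough sketch of how the trait might be used, the example below mixes DatasetShims into a driver object and sends the formatted table to a log4j logger, while ordinary println calls before and after still go to the console. The object name ShowToLog, the app name, and the local master setting are illustrative assumptions, and the code assumes the log4j 1.x API (org.apache.log4j.Logger) is on the classpath, either directly or through a compatibility bridge.

```scala
import org.apache.log4j.Logger
import org.apache.spark.sql.{DatasetShims, SparkSession}

// Hypothetical driver object; the name is an assumption for illustration.
// Extending DatasetShims brings the implicit DatasetHelper into scope.
object ShowToLog extends DatasetShims {
  private val log = Logger.getLogger(getClass)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("show-to-log")   // assumed app name
      .master("local[*]")       // assumed local run
      .getOrCreate()

    val df = spark.read.json("examples/src/main/resources/people.json")

    println("before the table") // stdout is untouched, so this reaches the console

    // The table is built as a String and handed to the logger instead of stdout.
    log.info(df.toShowString())

    println("after the table")  // also reaches the console

    spark.stop()
  }
}
```

Because toShowString only builds a String and never touches stdout, nothing needs to be redirected and restored around the call; where the table ends up depends entirely on the log4j appender configuration.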
