How to redirect Scala Spark Dataset.show to a log4j logger
Question
The Spark API docs show how to pretty-print a snippet of a Dataset or DataFrame to stdout.
Can this output be directed to a log4j logger? Alternatively, can someone share code that produces output formatted like df.show()?
Is there a way to do this that still lets stdout go to the console both before and after the .show() output is pushed to the logger?
http://spark.apache.org/docs/latest/sql-programming-guide.htm
val df = spark.read.json("examples/src/main/resources/people.json")
// Displays the content of the DataFrame to stdout
df.show()
// +----+-------+
// | age|   name|
// +----+-------+
// |null|Michael|
// |  30|   Andy|
// |  19| Justin|
// +----+-------+
Recommended answer
The showString() function from teserecter's answer comes from the Spark source code (Dataset.scala).
You can't call that function from your own code because it is package-private, but you can place the following snippet in a file named DatasetShims.scala in your source tree and mix the trait into your classes to access it.
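package org.apache.spark.sql

// Declared in Spark's own package so it can reach the package-private showString().
trait DatasetShims {
  implicit class DatasetHelper[T](ds: Dataset[T]) {
    // Returns the table that show() would print, prefixed with a newline.
    def toShowString(numRows: Int = 20, truncate: Int = 20, vertical: Boolean = false): String =
      "\n" + ds.showString(numRows, truncate, vertical)
  }
}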
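A minimal sketch of how the trait might be used follows. The object name ShowToLogger, the app name, and the local master setting are illustrative assumptions, not part of the original answer. It also assumes the log4j 1.x API (org.apache.log4j.Logger) is on the classpath, which is the case for Spark 2.x; newer Spark releases use Log4j 2, where this class is only available through a compatibility bridge, so the import may need adjusting.

```scala
import org.apache.log4j.Logger
import org.apache.spark.sql.{DatasetShims, SparkSession}

// Hypothetical driver object; mixing in DatasetShims brings toShowString into scope.
object ShowToLogger extends DatasetShims {
  private val log = Logger.getLogger(getClass)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would get its master from spark-submit.
    val spark = SparkSession.builder()
      .appName("show-to-logger")
      .master("local[*]")
      .getOrCreate()

    val df = spark.read.json("examples/src/main/resources/people.json")

    println("before")            // ordinary stdout output is unaffected
    log.info(df.toShowString())  // the formatted table goes to the logger instead of stdout
    println("after")             // stdout still works afterwards

    spark.stop()
  }
}
```

Because toShowString only builds a String and never touches stdout, nothing has to be redirected and restored around the call, so console output before and after it behaves as usual.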