How can I retrieve the alias for a DataFrame in Spark
Question
I'm using Spark 2.0.2. I have a DataFrame that has an alias on it, and I'd like to be able to retrieve that. A simplified example of why I'd want that is below.
def check(df: DataFrame) = {
  assert(df.count > 0, s"${df.getAlias} has zero rows!")
}
The above code of course fails because DataFrame has no getAlias function. Is there a way to do this?
Answer
You can try something like this, but I wouldn't go so far as to claim it is supported:

Spark &lt; 2.1:
import org.apache.spark.sql.catalyst.plans.logical.SubqueryAlias
import org.apache.spark.sql.Dataset

def getAlias(ds: Dataset[_]) = ds.queryExecution.analyzed match {
  case SubqueryAlias(alias, _) => Some(alias)
  case _ => None
}
Spark 2.1+:
def getAlias(ds: Dataset[_]) = ds.queryExecution.analyzed match {
  case SubqueryAlias(alias, _, _) => Some(alias)
  case _ => None
}
Example usage:
val plain = Seq((1, "foo")).toDF
getAlias(plain)
// Option[String] = None

val aliased = plain.alias("a dataset")
getAlias(aliased)
// Option[String] = Some(a dataset)
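Putting the pieces together, the `check` helper from the question could be written on top of `getAlias`. This is only a sketch, assuming a running Spark session and the Spark 2.1+ shape of `SubqueryAlias`; the `"unnamed dataset"` fallback label is a hypothetical choice, not part of the original question:

```scala
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.catalyst.plans.logical.SubqueryAlias

// Spark 2.1+ pattern; for Spark < 2.1, drop the third underscore.
def getAlias(ds: Dataset[_]): Option[String] = ds.queryExecution.analyzed match {
  case SubqueryAlias(alias, _, _) => Some(alias)
  case _                          => None
}

// Fail with the dataset's alias in the message when it has one,
// otherwise with a hypothetical placeholder name.
def check(ds: Dataset[_]): Unit = {
  val name = getAlias(ds).getOrElse("unnamed dataset")
  assert(ds.count > 0, s"$name has zero rows!")
}
```

Since `queryExecution.analyzed` is Spark's internal Catalyst plan rather than public API, the pattern match may need adjusting across Spark versions.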