How can I retrieve the alias for a DataFrame in Spark
Question
I'm using Spark 2.0.2. I have a DataFrame that has an alias on it, and I'd like to be able to retrieve that. A simplified example of why I'd want that is below.
def check(ds: DataFrame) = {
  assert(ds.count > 0, s"${ds.getAlias} has zero rows!")
}
The above code of course fails because DataFrame has no getAlias function. Is there a way to do this?
Answer
You can try something like this, but I wouldn't go so far as to claim it is supported:

Spark < 2.1:
import org.apache.spark.sql.catalyst.plans.logical.SubqueryAlias
import org.apache.spark.sql.Dataset
def getAlias(ds: Dataset[_]): Option[String] = ds.queryExecution.analyzed match {
  case SubqueryAlias(alias, _) => Some(alias)
  case _ => None
}
Spark 2.1+:
def getAlias(ds: Dataset[_]): Option[String] = ds.queryExecution.analyzed match {
  case SubqueryAlias(alias, _, _) => Some(alias)
  case _ => None
}
Example usage:

val plain = Seq((1, "foo")).toDF
getAlias(plain)
// Option[String] = None

val aliased = plain.alias("a dataset")
getAlias(aliased)
// Option[String] = Some(a dataset)
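Tying this back to the question, the check helper could use getAlias with a fallback for un-aliased datasets. This is a minimal sketch; it assumes the getAlias definition above (for the matching Spark version) is in scope, and the "unnamed dataset" placeholder is an illustrative choice:

```scala
import org.apache.spark.sql.Dataset

// Sketch: include the dataset's alias (or a placeholder) in the assertion
// message. Assumes getAlias from above is in scope.
def check(ds: Dataset[_]): Unit = {
  val name = getAlias(ds).getOrElse("unnamed dataset")
  assert(ds.count > 0, s"$name has zero rows!")
}
```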