Spark Couldn't Find Window Function

Question

Using the solution provided in https://stackoverflow.com/a/32407543/5379015, I tried to recreate the same query using the programmatic (SQL string) syntax instead of the DataFrame API, as follows:

import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

object HiveContextTest {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("HiveContextTest")
    val sc = new SparkContext(conf)
    val sqlContext = new HiveContext(sc)
    import sqlContext.implicits._

    val df = sc.parallelize(
      ("foo", 1) :: ("foo", 2) :: ("bar", 1) :: ("bar", 2) :: Nil
    ).toDF("k", "v")


    // using dataframe api works fine

    val w = Window.partitionBy($"k").orderBy($"v")
    df.select($"k",$"v", rowNumber().over(w).alias("rn")).show


    //using programmatic syntax doesn't work

    df.registerTempTable("df")
    val w2 = sqlContext.sql("select k,v,rowNumber() over (partition by k order by v) as rn from df")
    w2.show()

  }
}

The first query, df.select($"k",$"v", rowNumber().over(w).alias("rn")).show, works fine, but w2.show() results in

Exception in thread "main" org.apache.spark.sql.AnalysisException: Couldn't find window function rowNumber;

Does anyone have any idea how I can make this work with the programmatic syntax? Many thanks in advance.

Recommended answer

rowNumber is the name of the function in the DataFrame API; its SQL equivalent is row_number:

SELECT k, v, row_number() OVER (PARTITION BY k ORDER BY v) AS rn FROM df
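
For context, here is a minimal sketch of the question's program with only the query string changed. It assumes the same Spark 1.x HiveContext setup as in the question and that the job is launched through spark-submit. The object name RowNumberSqlExample and the query field are hypothetical names introduced for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical object name; setup mirrors the question (Spark 1.x, HiveContext).
object RowNumberSqlExample {
  // SQL uses row_number(), not the DataFrame API name rowNumber().
  val query: String =
    "SELECT k, v, row_number() OVER (PARTITION BY k ORDER BY v) AS rn FROM df"

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("RowNumberSqlExample")
    val sc = new SparkContext(conf)
    val sqlContext = new HiveContext(sc)
    import sqlContext.implicits._

    // Same sample data as in the question.
    val df = sc.parallelize(
      ("foo", 1) :: ("foo", 2) :: ("bar", 1) :: ("bar", 2) :: Nil
    ).toDF("k", "v")

    // Register the DataFrame so the SQL string can refer to it as "df".
    df.registerTempTable("df")

    // Number the rows within each k partition, ordered by v.
    sqlContext.sql(query).show()

    sc.stop()
  }
}
```

The only change from the failing version is the function name inside the SQL string; the DataFrame API call in the question can stay as it is.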
