Sorting by Month in Spark SQL


Question

I have a column named Month whose contents look like

(Jan2016,Feb2016,Mar2016,Jun2016)

I am trying to order it with

df.orderBy("Month")

but the column comes out ordered as

Feb2016,Jan2016

that is, in alphabetical order. How can I order it by month instead?

Recommended answer

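This is based on Antot's code. The idea is to map each month abbreviation to a numeric index, sort on that index, and then drop the helper column.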

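    import org.apache.spark.sql.functions.udf
    import session.implicits._  // session is the existing SparkSession; enables toDF and $"..."

    // Map each month abbreviation to its position in the year (Jan -> 0, ..., Dec -> 11)
    val monthWithIndex = Seq("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec").zipWithIndex.toMap

    // UDF that takes the three-letter month prefix and returns its index
    val monthSim = udf((mon: String) => monthWithIndex(mon.substring(0, 3)))

    val df = session.sparkContext.parallelize(Seq("Jan2016", "Feb2016", "Mar2016", "Jun2016")).toDF("Month")

    // Sort on the numeric helper column, then drop it before displaying
    df.withColumn("newMonth", monthSim($"Month")).orderBy("newMonth").drop("newMonth").show()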

If you want to order by year and month, you can add a year column in the same way and sort on the year first, then on the month index.
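The year-and-month ordering could be sketched roughly as follows. This is only an illustration under some assumptions: every value has the form of a three-letter English month abbreviation followed by a four-digit year (as in `Jan2016`), the sample data and the object name `SortByYearAndMonth` are made up for the example, and a local `SparkSession` is created just so the snippet runs on its own.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, substring, udf}

// Hypothetical wrapper object, used only to make the sketch self-contained.
object SortByYearAndMonth {
  // Month abbreviation -> 0-based index (Jan -> 0, ..., Dec -> 11).
  val monthIndex: Map[String, Int] =
    Seq("Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec").zipWithIndex.toMap

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for demonstration; reuse your own in practice.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("sort-by-year-month")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data spanning two years.
    val df = Seq("Feb2017", "Jan2016", "Jun2016", "Jan2017", "Mar2016").toDF("Month")

    // UDF that looks up the index of the three-letter month prefix.
    val monthOf = udf((s: String) => monthIndex(s.substring(0, 3)))

    val sorted = df
      // substring is 1-based in Spark SQL: characters 4..7 hold the year.
      .withColumn("year", substring(col("Month"), 4, 4).cast("int"))
      .withColumn("monthIdx", monthOf(col("Month")))
      // Sort by year first, then by month within each year.
      .orderBy("year", "monthIdx")
      .drop("year", "monthIdx")

    sorted.show()
    spark.stop()
  }
}
```

With this sample data the rows should come out as Jan2016, Mar2016, Jun2016, Jan2017, Feb2017. Values that do not start with one of the twelve abbreviations would make the lookup throw, so real data may need validation first.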
