Aggregation with Group By date in Spark SQL

Question

I have an RDD with a timestamp field named time, of type long:

root
 |-- id: string (nullable = true)
 |-- value1: string (nullable = true)
 |-- value2: string (nullable = true)
 |-- time: long (nullable = true)
 |-- type: string (nullable = true)

I am trying to group by value1, value2 and time formatted as YYYY-MM-DD. I tried to group by cast(time as Date), but I got the following error:

Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)
    at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.RuntimeException: [1.21] failure: ``DECIMAL'' expected but identifier Date found

Does that mean there is no way to group by a date? I even tried adding another level of casting to get it as a String:

cast(cast(time as Date) as String)

This returns the same error.

I've read that I could probably use aggregateByKey on the RDD, but I don't understand how to use it with several columns and how to convert that long to a YYYY-MM-DD String. How should I proceed?

Recommended answer

I solved the issue by adding this function:

def convert(time: Long): String = {
  // java.util.Date(Long) interprets its argument as milliseconds since the epoch
  val sdf = new java.text.SimpleDateFormat("yyyy-MM-dd")
  sdf.format(new java.util.Date(time))
}
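
One caveat about convert: SimpleDateFormat formats in the JVM's default time zone, so the same timestamp can land on different days depending on where the job runs. Below is a minimal sketch of a variant that pins the zone. It assumes time holds epoch milliseconds (as new java.util.Date(time) implies) and that UTC day boundaries are wanted. The object name DateKey and the choice of UTC are assumptions for illustration, and java.time requires Java 8 or later.

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter

object DateKey {
  // Assumption: UTC day boundaries; swap in another ZoneId if needed.
  private val dayFormat: DateTimeFormatter =
    DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC)

  // Assumption: `time` is milliseconds since the Unix epoch.
  def convert(time: Long): String =
    dayFormat.format(Instant.ofEpochMilli(time))
}
```

With these assumptions, DateKey.convert(0L) yields "1970-01-01" regardless of the machine's local time zone.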

And registering it with the sqlContext like this:

sqlContext.registerFunction("convert", convert _)

Then I could finally group by date:

select value1, value2, convert(time), count(*) from table group by value1, value2, convert(time)
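
As a rough illustration of what grouping on the converted date computes, here is a sketch using plain Scala collections rather than Spark, so it runs without a cluster. The Record case class mirrors the schema in the question. The GroupByDay object, the dayOf helper, the UTC time zone, and the choice of counting rows per group are all assumptions for illustration, not part of the original answer.

```scala
import java.text.SimpleDateFormat
import java.util.{Date, TimeZone}

object GroupByDay {
  // Mirrors the schema from the question.
  case class Record(id: String, value1: String, value2: String, time: Long, `type`: String)

  // Same idea as convert, with the zone pinned to UTC (assumption).
  def dayOf(time: Long): String = {
    val sdf = new SimpleDateFormat("yyyy-MM-dd")
    sdf.setTimeZone(TimeZone.getTimeZone("UTC"))
    sdf.format(new Date(time))
  }

  // Count rows per (value1, value2, day) key.
  def countPerDay(records: Seq[Record]): Map[(String, String, String), Int] =
    records
      .groupBy(r => (r.value1, r.value2, dayOf(r.time)))
      .map { case (key, rows) => key -> rows.size }
}
```

The tuple (value1, value2, day) is the composite key here. On an RDD, the same tuple could presumably serve as the key for a by-key aggregation, which is one way to handle several columns at once.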
