How to get the difference in months and years between two dates in Spark SQL


Problem description

I am getting this error:

org.apache.spark.sql.analysisexception: cannot resolve 'year'

My input data:

1,2012-07-21,2014-04-09

My code:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql._
import org.apache.spark.sql.functions._
case class c (id:Int,start:String,end:String)
val c1 = sc.textFile("date.txt")
val c2 = c1.map(_.split(",")).map(r=>(c(r(0).toInt,r(1).toString,r(2).toString)))
val c3 = c2.toDF();
c3.registerTempTable("c4")
val r = sqlContext.sql("select id,datediff(year,to_date(end), to_date(start)) AS date from c4")

What can I do to resolve the above error?

I have tried the following code, but it gives the output in days and I need it in years:

val r = sqlContext.sql("select id,datediff(to_date(end), to_date(start)) AS date from c4")

Please advise whether I can use any function like to_date to get the year difference.

Recommended answer

val r = sqlContext.sql("select id,datediff(year,to_date(end), to_date(start)) AS date from c4")

In the above code, "year" is not a column in the data frame, i.e. it is not a valid column in table "c4". Spark SQL's datediff takes only two arguments (the end date and the start date) and returns the difference in days, so the extra "year" argument is parsed as a column reference. The query cannot find a "year" column, which is why the analysis exception is thrown.

Use a Spark User Defined Function (UDF); that will be a more robust approach.
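
The UDF suggestion above could be sketched roughly as follows. This is only a minimal illustration, not the answerer's own code: the object name DateDiffSketch and the function names yearsBetween and monthsBetween are hypothetical, the dates are assumed to be yyyy-MM-dd strings as in the sample input, and java.time is assumed to be available (Java 8 or later). The date arithmetic is kept in plain Scala so it can be checked without a cluster; the comments show how it might then be registered on the sqlContext from the question.

```scala
import java.time.LocalDate
import java.time.temporal.ChronoUnit

// Hypothetical helper object; the names are illustrative only.
object DateDiffSketch {
  // Whole years elapsed from `start` to `end` (both yyyy-MM-dd strings).
  def yearsBetween(start: String, end: String): Long =
    ChronoUnit.YEARS.between(LocalDate.parse(start), LocalDate.parse(end))

  // Whole months elapsed from `start` to `end` (both yyyy-MM-dd strings).
  def monthsBetween(start: String, end: String): Long =
    ChronoUnit.MONTHS.between(LocalDate.parse(start), LocalDate.parse(end))

  def main(args: Array[String]): Unit = {
    // Sample row from the question: 1,2012-07-21,2014-04-09
    println(yearsBetween("2012-07-21", "2014-04-09"))  // prints 1
    println(monthsBetween("2012-07-21", "2014-04-09")) // prints 20
  }
}

// Assumed usage with the sqlContext and table "c4" from the question
// (the UDF name "years_between" is hypothetical):
//   sqlContext.udf.register("years_between", DateDiffSketch.yearsBetween _)
//   val r = sqlContext.sql("select id, years_between(start, end) AS years from c4")
```

Both functions count only completed units, so the sample row yields 1 year and 20 months rather than a fractional value. They also throw on strings that are not valid yyyy-MM-dd dates, so malformed rows would need to be filtered or handled before the UDF is applied.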
