Applying a function to all columns in Spark

Question

I have written the code below and have two questions. First, how can I cast all the columns in the dataset at the same time, except the timestamp column? Second, how can I apply the avg function to all columns, again except the timestamp column? Thanks a lot.

import org.apache.spark.sql.functions.{avg, col, unix_timestamp}

val df = spark.read.option("header", true).option("inferSchema", "true").csv("C:/Users/mhattabi/Desktop/dataTest.csv")
// Round each timestamp down to the start of its 5-minute (300 s) window
val result = df.withColumn("new_time", ((unix_timestamp(col("time")) / 300).cast("long") * 300).cast("timestamp"))
result("value").cast("float") // first question: this casts only "value", and the result is not used
val finalresult = result.groupBy("new_time").agg(avg("value")).sort("new_time") // second question: this averages only "value"
finalresult.coalesce(1).write.format("com.databricks.spark.csv").option("header", "true").save("C:/mydata.csv")

Recommended answer

This is quite easy to implement in PySpark, but I ran into trouble trying to rewrite it in Scala. I hope you can manage the translation.

from pyspark.sql.functions import avg, col

df = spark.createDataFrame([(100, "4.5", "5.6")], ["new_time", "col1", "col2"])
# Cast every column except new_time to float
columns = [col(c).cast('float') if c != 'new_time' else col(c) for c in df.columns]
# Build one avg() aggregate per column, again skipping new_time
aggs = [avg(c) for c in df.columns if c != 'new_time']
finalresult = df.select(columns).groupBy('new_time').agg(*aggs)
finalresult.explain()

*HashAggregate(keys=[new_time#0L], functions=[avg(cast(col1#14 as double)), avg(cast(col2#15 as double))])
+- Exchange hashpartitioning(new_time#0L, 200)
   +- *HashAggregate(keys=[new_time#0L], functions=[partial_avg(cast(col1#14 as double)), partial_avg(cast(col2#15 as double))])
      +- *Project [new_time#0L, cast(col1#1 as float) AS col1#14, cast(col2#2 as float) AS col2#15]
         +- Scan ExistingRDD[new_time#0L,col1#1,col2#2]
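
For readers working in Scala, as in the question, the same idea might be sketched roughly as follows. This is a sketch under stated assumptions, not a tested translation: the sample row and the column names new_time, col1 and col2 are taken from the PySpark example, the names valueCols and casted are made up for illustration, and a local SparkSession is created only so the snippet is self-contained (in spark-shell, `spark` already exists).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, col}

// Assumption: a local session, only to keep the sketch self-contained.
val spark = SparkSession.builder().master("local[*]").appName("cast-and-avg").getOrCreate()
import spark.implicits._

// Same toy data as in the PySpark example above.
val df = Seq((100L, "4.5", "5.6")).toDF("new_time", "col1", "col2")

// Every column except the time bucket (hypothetical helper name).
val valueCols = df.columns.filter(_ != "new_time")

// Cast all value columns to float, keeping new_time unchanged.
val casted = df.select(col("new_time") +: valueCols.map(c => col(c).cast("float").as(c)): _*)

// One avg() per value column; agg takes a first Column plus varargs,
// so this assumes there is at least one value column.
val aggs = valueCols.map(c => avg(col(c)).as(c))
val finalresult = casted.groupBy("new_time").agg(aggs.head, aggs.tail: _*).sort("new_time")

finalresult.show()
```

In the asker's pipeline, `df` would be replaced by the `result` DataFrame, and the original `time` column would presumably need to be dropped or excluded as well, since it is neither the grouping key nor a value to average.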
