How to tell whether an operation in Spark is a transformation or an action?


Question

I have recently started learning Spark and am confused about transformations and actions. I have read the Spark documentation and some books about Spark, and I know that an action causes a Spark job to be executed on the cluster while a transformation does not. But Spark's API documentation does not state, for each RDD operation it lists, whether that operation is a transformation or an action.

For example, reduce is an action, whereas reduceByKey is a transformation. Why is that?

Recommended answer

You can tell by looking at the return type. An action returns a non-RDD type (usually the type of the values stored in the RDD), whereas a transformation returns an RDD[Type], because its result is still only a representation of your computation.
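The return-type rule can be illustrated without a cluster. What follows is a minimal Python sketch and not Spark's API: `MiniRDD` and all of its methods are hypothetical stand-ins that only mimic the pattern, where a transformation returns another `MiniRDD` (a lazy description of the computation) and an action runs the computation and returns a plain value.

```python
from functools import reduce as _reduce


class MiniRDD:
    """Toy stand-in for an RDD (hypothetical, NOT Spark's API)."""

    def __init__(self, compute):
        # compute: zero-argument function that yields the data on demand
        self._compute = compute

    @staticmethod
    def parallelize(items):
        data = list(items)
        return MiniRDD(lambda: iter(data))

    # --- Transformations: return a new MiniRDD; nothing is computed yet ---
    def map(self, f):
        return MiniRDD(lambda: (f(x) for x in self._compute()))

    def reduce_by_key(self, f):
        def run():
            acc = {}
            for k, v in self._compute():
                acc[k] = f(acc[k], v) if k in acc else v
            return iter(acc.items())
        return MiniRDD(run)

    # --- Actions: run the computation and return a plain value ---
    def reduce(self, f):
        return _reduce(f, self._compute())

    def collect(self):
        return list(self._compute())


pairs = MiniRDD.parallelize([("a", 1), ("b", 2), ("a", 3)])

# Transformation: the result is still a MiniRDD (one value per key).
summed = pairs.reduce_by_key(lambda x, y: x + y)

# Action: the result is a single plain value.
total = pairs.map(lambda kv: kv[1]).reduce(lambda x, y: x + y)

print(type(summed).__name__, type(total).__name__)
```

This mirrors the example in the question: reduceByKey produces one result per key, which is still a collection of records and therefore comes back as an RDD, while reduce collapses the whole dataset into one value that is handed back to the caller.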
