What does the number after the RDD mean?

Question

What does the number in the brackets after the RDD mean?

Recommended answer

The number after the RDD is its identifier:

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.0
      /_/

Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_151)
Type in expressions to have them evaluated.
Type :help for more information.

scala> val rdd = sc.range(0, 42)
rdd: org.apache.spark.rdd.RDD[Long] = MapPartitionsRDD[1] at range at <console>:24

scala> rdd.id
res0: Int = 1

It is used to track the RDD across the session, for example for caching:

scala> rdd.cache
res1: rdd.type = MapPartitionsRDD[1] at range at <console>:24

scala> rdd.count
res2: Long = 42

scala> sc.getPersistentRDDs
res3: scala.collection.Map[Int,org.apache.spark.rdd.RDD[_]] = Map(1 -> MapPartitionsRDD[1] at range at <console>:24)
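The same lookup can be done from a script. The following is a minimal sketch, assuming Spark is on the classpath and a local master is acceptable; the application name is arbitrary. It uses the ID as the key into sc.getPersistentRDDs and then drops the cache entry again:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuses an existing context (for example the one in spark-shell) if there is one.
val conf = new SparkConf().setMaster("local[*]").setAppName("rdd-id-cache")
val sc = SparkContext.getOrCreate(conf)

val rdd = sc.range(0, 42)
rdd.cache()   // registers the RDD under its ID
rdd.count()   // an action, so the cached blocks are actually materialised

// The map is keyed by RDD ID, so the ID is enough to find the cached RDD.
val cached = sc.getPersistentRDDs.get(rdd.id)
println(s"RDD ${rdd.id} registered as persistent: ${cached.isDefined}")

// Unpersisting removes the entry from the map again.
rdd.unpersist()
val stillRegistered = sc.getPersistentRDDs.contains(rdd.id)
println(s"RDD ${rdd.id} still registered: $stillRegistered")

sc.stop()
```

The exact ID printed depends on how many RDDs the context has created before this point.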

The number is simply an incrementing integer (nextRddId is just an AtomicInteger):

private[spark] def newRddId(): Int = nextRddId.getAndIncrement()

It is generated when the RDD is constructed:

/** A unique ID for this RDD (within its SparkContext). */
val id: Int = sc.newRddId()
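To see why the numbers grow the way they do, here is a small model of that mechanism in plain Scala, with no Spark involved. MiniContext and MiniRdd are hypothetical stand-ins invented for this illustration, not Spark classes; only the counter logic mirrors the two snippets above.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for SparkContext: only the ID counter is modelled.
class MiniContext {
  private val nextId = new AtomicInteger(0)
  def newId(): Int = nextId.getAndIncrement()
}

// Hypothetical stand-in for an RDD: it takes an ID when it is constructed.
class MiniRdd(ctx: MiniContext, val name: String) {
  val id: Int = ctx.newId()
  // Each transformation builds a new object, so it consumes a new ID.
  def derive(newName: String): MiniRdd = new MiniRdd(ctx, newName)
  override def toString: String = s"$name[$id]"
}

val ctx = new MiniContext
val a = new MiniRdd(ctx, "A")      // takes ID 0
val b = a.derive("B")              // takes ID 1
val c = b.derive("C").derive("D")  // C takes 2, D takes 3
println(Seq(a, b, c).mkString(", "))  // prints A[0], B[1], D[3]
```

The intermediate object C is never bound to a name, yet it still consumed ID 2. Gaps in the IDs you can see from the shell arise the same way: any operation that builds intermediate RDDs internally uses up IDs.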

So if we continue with:

scala> val pairs1 = sc.parallelize(Seq((1, "foo")))
pairs1: org.apache.spark.rdd.RDD[(Int, String)] = ParallelCollectionRDD[2] at parallelize at <console>:24

scala> val pairs2 = sc.parallelize(Seq((1, "bar")))
pairs2: org.apache.spark.rdd.RDD[(Int, String)] = ParallelCollectionRDD[3] at parallelize at <console>:24


scala> pairs1.id
res5: Int = 2

scala> pairs2.id
res6: Int = 3

you'll see 2 and 3, and if you execute

scala> pairs1.join(pairs2).foreach(_ => ())

you'd expect 4, which can be confirmed by checking the Spark UI.

We can also see that join creates a few new RDDs under the covers (5 and 6).
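The IDs can also be inspected without the UI. The following is a minimal sketch, assuming Spark is on the classpath and a local master is acceptable; the application name is arbitrary. It prints the lineage of the joined RDD, where every RDD appears with its ID in brackets:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuses an existing context (for example the one in spark-shell) if there is one.
val conf = new SparkConf().setMaster("local[*]").setAppName("rdd-id-lineage")
val sc = SparkContext.getOrCreate(conf)

val pairs1 = sc.parallelize(Seq((1, "foo")))
val pairs2 = sc.parallelize(Seq((1, "bar")))
val joined = pairs1.join(pairs2)

// IDs are assigned at construction, so they exist before any action runs.
println(s"inputs: ${pairs1.id}, ${pairs2.id}; join result: ${joined.id}")

// The lineage lists every RDD involved, including the intermediate ones.
println(joined.toDebugString)

// IDs of the direct parents of the joined RDD.
val parentIds = joined.dependencies.map(_.rdd.id)
println(parentIds)

sc.stop()
```

The concrete numbers depend on how many RDDs the context created earlier, but the ID of the join result will always be larger than the IDs of its inputs, with the intermediate RDDs in between.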
