Would Spark unpersist the RDD itself when it realizes it won't be used anymore?

Problem Description

We can persist an RDD into memory and/or disk when we want to use it more than once. However, do we have to unpersist it ourselves later on, or does Spark do some kind of garbage collection and unpersist the RDD when it is no longer needed? I notice that if I call the unpersist function myself, I get slower performance.
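
For illustration, a minimal sketch of the persist-and-reuse pattern the question describes (assuming an existing SparkContext sc; the input path and storage level are placeholders):

import org.apache.spark.storage.StorageLevel

val lengths = sc.textFile("hdfs:///data/input") // placeholder path
  .map(_.length)
  .persist(StorageLevel.MEMORY_AND_DISK) // cache for reuse across actions

val total = lengths.sum()   // first action computes and caches the RDD
val count = lengths.count() // second action reads the cached blocks

lengths.unpersist() // optional: explicitly free the cached blocks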

Recommended Answer

Yes, Apache Spark will unpersist the RDD when it's garbage collected.

In RDD.persist (https://github.com/apache/spark/blob/v1.5.0/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L166) you can see:

sc.cleaner.foreach(_.registerRDDForCleanup(this))

This puts a WeakReference to the RDD in a ReferenceQueue, leading to ContextCleaner.doCleanupRDD (https://github.com/apache/spark/blob/v1.5.0/core/src/main/scala/org/apache/spark/ContextCleaner.scala#L189) when the RDD is garbage collected. And there:

sc.unpersistRDD(rddId, blocking)
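
To make the mechanism concrete, here is a simplified sketch of that weak-reference pattern (this is not Spark's actual code; MiniCleaner and CleanupRef are illustrative stand-ins for ContextCleaner and its CleanupTaskWeakReference):

import java.lang.ref.{ReferenceQueue, WeakReference}
import scala.collection.mutable

// A weak reference that remembers which RDD id it tracks
class CleanupRef(rdd: AnyRef, val rddId: Int, queue: ReferenceQueue[AnyRef])
  extends WeakReference[AnyRef](rdd, queue)

object MiniCleaner {
  private val queue = new ReferenceQueue[AnyRef]
  // Hold strong references to the weak refs themselves so they survive until enqueued
  private val buffer = mutable.Set.empty[CleanupRef]

  def registerRDDForCleanup(rdd: AnyRef, rddId: Int): Unit =
    buffer.synchronized { buffer += new CleanupRef(rdd, rddId, queue) }

  // Spark runs a loop like this on a daemon thread (ContextCleaner.keepCleaning)
  def keepCleaning(unpersist: Int => Unit): Unit =
    while (true) {
      queue.remove() match { // blocks until the GC enqueues a collected RDD's reference
        case ref: CleanupRef =>
          buffer.synchronized { buffer -= ref }
          unpersist(ref.rddId) // corresponds to sc.unpersistRDD(rddId, blocking)
        case _ => // ignore other reference types
      }
    }
}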

For more context see ContextCleaner in general and the commit that added it.
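
A possible explanation for the slowdown the asker observed: RDD.unpersist takes a blocking flag, which defaults to true in Spark 1.x, so the call waits until all cached blocks are removed. If explicit unpersisting is still wanted, the removal can be made asynchronous:

rdd.unpersist(blocking = false) // returns immediately instead of waiting for block removal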
