What are the drawbacks of caching RDDs?

Views: 94 Posted: 2021/6/24 20:44:13 Tags: apache-spark pyspark rdd

Question

We recently started caching RDDs that are reused multiple times, even if those RDDs don't take long to compute.

According to the docs, Spark will automatically evict unused cached data using an LRU strategy.
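To make the LRU (least recently used) idea concrete, here is a minimal toy sketch in plain Python. It is not Spark's implementation: the class `ToyLruCache`, its `capacity` parameter, and the keys `rdd_a`, `rdd_b`, `rdd_c` are all hypothetical names invented for illustration, and capacity is counted in entries here, whereas a real cache would account for memory size.

```python
from collections import OrderedDict


class ToyLruCache:
    """Toy LRU cache (hypothetical, for illustration only).

    Capacity is a number of entries, not bytes.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._blocks = OrderedDict()  # oldest entry first, newest last
        self.evicted = []             # keys dropped so far, in eviction order

    def get(self, key):
        if key not in self._blocks:
            return None
        # A read counts as a use, so move the entry to the "recent" end.
        self._blocks.move_to_end(key)
        return self._blocks[key]

    def put(self, key, value):
        if key in self._blocks:
            self._blocks.move_to_end(key)
        self._blocks[key] = value
        # Drop least recently used entries until we fit again.
        while len(self._blocks) > self.capacity:
            old_key, _ = self._blocks.popitem(last=False)
            self.evicted.append(old_key)

    def keys(self):
        return list(self._blocks)


cache = ToyLruCache(capacity=2)
cache.put("rdd_a", [1, 2, 3])
cache.put("rdd_b", [4, 5, 6])
cache.get("rdd_a")          # rdd_a becomes the most recently used entry
cache.put("rdd_c", [7, 8])  # over capacity: rdd_b is the one evicted

print(cache.keys())
print(cache.evicted)
```

The point of the sketch is that whichever entry was touched least recently is dropped first, so caching many rarely reused datasets can push out ones that are still useful.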

So is there any drawback to over-caching RDDs? I was thinking that keeping all that deserialized data in memory could put more pressure on the GC, but is this something we should worry about?