What is the difference between cache and persist?

Published: 2020/9/3 23:56:09 | Tags: apache-spark, distributed-computing, rdd


Question

In terms of RDD persistence, what are the differences between cache() and persist() in Spark?

Recommended answer

With cache(), you use only the default storage level:

MEMORY_ONLY for RDD
MEMORY_AND_DISK for Dataset
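The two defaults listed above can be checked directly. The following is a minimal sketch, not part of the original answer, assuming Spark 2.1 or later (for `Dataset.storageLevel`) running in local mode; the object name, application name, and sample data are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

// Hypothetical demo object: the name and the sample data are illustrative only.
object CacheDefaults {
  def main(args: Array[String]): Unit = {
    // Assumption: a local Spark session is enough for this check.
    val spark = SparkSession.builder()
      .appName("cache-defaults")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // cache() on an RDD marks it with the RDD default storage level.
    val rdd = spark.sparkContext.parallelize(1 to 1000)
    rdd.cache()
    println(s"RDD level is MEMORY_ONLY: ${rdd.getStorageLevel == StorageLevel.MEMORY_ONLY}")

    // cache() on a Dataset marks it with the Dataset default storage level.
    val ds = (1 to 1000).toDS()
    ds.cache()
    println(s"Dataset level is MEMORY_AND_DISK: ${ds.storageLevel == StorageLevel.MEMORY_AND_DISK}")

    // Release the cached data and shut down.
    rdd.unpersist()
    ds.unpersist()
    spark.stop()
  }
}
```

Note that cache() only marks the data for persistence; nothing is actually stored until an action (such as count()) runs on it.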

With persist(), you can specify which storage level you want, for both RDD and Dataset.

From the official documentation:

You can mark an RDD to be persisted using the persist() or cache() methods on it.

Each persisted RDD can be stored using a different storage level.

The cache() method is a shorthand for using the default storage level, which is StorageLevel.MEMORY_ONLY (store deserialized objects in memory).

Use persist() if you want to assign a storage level other than:

MEMORY_ONLY for an RDD
MEMORY_AND_DISK for a Dataset
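As an illustration of choosing a non-default level, here is a short sketch that persists an RDD with `StorageLevel.MEMORY_AND_DISK_SER`. It is not from the original answer: the object name, the data, and the choice of level are assumptions made for the example, and it assumes a local Spark session.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

// Hypothetical demo object: names, data, and the chosen level are illustrative only.
object PersistWithLevel {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("persist-with-level")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    val squares = sc.parallelize(1 to 100000).map(n => n.toLong * n)

    // Pick an explicit level instead of the MEMORY_ONLY default used by cache().
    squares.persist(StorageLevel.MEMORY_AND_DISK_SER)

    // The first action computes and stores the partitions;
    // the second action can reuse the persisted data.
    val total = squares.reduce(_ + _)
    val count = squares.count()
    println(s"sum=$total count=$count level=${squares.getStorageLevel}")

    // Drop the persisted data when it is no longer needed.
    squares.unpersist()
    spark.stop()
  }
}
```

One design point worth knowing: Spark does not allow changing the storage level of an RDD once one has been assigned, so calling persist() with a different level on an already persisted RDD fails unless it is unpersisted first.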

A useful section of the official documentation: "Which Storage Level to Choose?"
